Performance Optimization

Levython is designed for high performance with JIT compilation, optimized memory layout, and zero-overhead abstractions.

Benchmarks

Performance comparison with other interpreted languages (lower is better):

Benchmark	Levython	Python 3.11	Node.js 20	Ruby 3.2
Fibonacci (n=35)	0.45s	3.2s	0.8s	5.1s
List operations (1M)	0.12s	0.68s	0.25s	1.2s
String concat (100K)	0.08s	0.45s	0.15s	0.92s
Startup time	8ms	35ms	65ms	125ms

JIT Compilation

Levython uses Just-In-Time compilation for hot code paths:

How JIT Works

1. Bytecode Interpretation: Code starts out running in the bytecode interpreter.
2. Hot Path Detection: Frequently called functions are identified.
3. Native Compilation: The JIT compiles those functions to x86-64 machine code.
4. Execution: Subsequent calls run the native code directly.

# This function will be JIT compiled after ~100 calls
fun factorial(n)
    if n <= 1 then
        return 1
    end
    return n * factorial(n - 1)
end

# First 100 calls: interpreted (~2.5s total)
# Remaining calls: JIT compiled (~0.3s total)
for i in range(0, 10000) do
    factorial(20)
end
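
The four steps above can be modeled in a few lines. This is a minimal Python sketch, not Levython's actual implementation: the names `TieredFunction`, `HOT_THRESHOLD` and `fake_jit` are hypothetical, the threshold of 100 is taken from the "~100 calls" comment above, and the "native" version is simply a faster Python function standing in for generated machine code.

```python
HOT_THRESHOLD = 100  # assumed from the "~100 calls" comment; the real value may differ


class TieredFunction:
    """Counts calls to a function and switches to a faster version once it is hot."""

    def __init__(self, interpreted, compile_native):
        self.interpreted = interpreted        # stands in for the bytecode interpreter
        self.compile_native = compile_native  # hypothetical stand-in for the JIT compiler
        self.native = None                    # filled in once the function becomes hot
        self.calls = 0

    def __call__(self, *args):
        if self.native is not None:
            return self.native(*args)         # step 4: run the native version directly
        self.calls += 1                       # step 2: hot path detection
        if self.calls >= HOT_THRESHOLD:
            self.native = self.compile_native(self.interpreted)  # step 3: compile
        return self.interpreted(*args)        # step 1: interpreted execution


def slow_factorial(n):
    return 1 if n <= 1 else n * slow_factorial(n - 1)


def fake_jit(fn):
    # A real JIT would emit machine code; this sketch returns an iterative version.
    def native(n):
        result = 1
        for k in range(2, n + 1):
            result *= k
        return result
    return native


factorial = TieredFunction(slow_factorial, fake_jit)
results = [factorial(20) for _ in range(10000)]
```

Only the outer calls are counted here, because the recursive calls go to `slow_factorial` directly. A real runtime would more likely count every invocation, so a recursive function would become hot sooner.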

Optimization Techniques

1. Use Local Variables

Local variable access is faster than global lookups:

# Slower - looks up math.sqrt on every iteration
for i in range(0, 1000000) do
    result <- math.sqrt(i)
end

# Faster - cache in local variable
sqrt <- math.sqrt
for i in range(0, 1000000) do
    result <- sqrt(i)
end

2. Avoid Repeated Calculations

# Slower - len(items) is recomputed inside the loop body on every iteration
for i in range(0, len(items)) do
    process(items[i], len(items))
end

# Faster - compute the length once and reuse it
n <- len(items)
for i in range(0, n) do
    process(items[i], n)
end

3. Use List Comprehensions

# Slower - multiple append operations
result <- []
for i in range(0, 1000) do
    if i % 2 == 0 then
        append(result, i * 2)
    end
end

# Faster - optimized list building
result <- [i * 2 for i in range(0, 1000) if i % 2 == 0]

4. Preallocate Lists

# Slower - dynamic growth
data <- []
for i in range(0, 10000) do
    append(data, i)
end

# Faster - preallocate with known size
size <- 10000
data <- [0] * size
for i in range(0, size) do
    data[i] <- i
end

5. String Building

# Slower - creates new string each iteration
result <- ""
for i in range(0, 1000) do
    result <- result + str(i) + ","
end

# Faster - collect the pieces in a list and join once (no trailing comma)
parts <- []
for i in range(0, 1000) do
    append(parts, str(i))
end
result <- join(parts, ",")
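
To see why repeated concatenation is slower, it helps to count how many characters get copied. The Python sketch below is a simplified cost model, not a measurement of Levython: it assumes immutable strings, so each concatenation builds a new string of the full current length, and it assumes a join copies every character once. The function name `copy_costs` is hypothetical.

```python
def copy_costs(pieces):
    """Return (chars copied by repeated concatenation, chars copied by one join)."""
    concat_cost = 0
    length = 0
    for piece in pieces:
        length += len(piece)
        concat_cost += length  # a new string of the full current length is built
    join_cost = length         # join writes each character once into the result
    return concat_cost, join_cost


pieces = [str(i) + "," for i in range(1000)]
concat_cost, join_cost = copy_costs(pieces)
print(f"concatenation copies {concat_cost} chars, join copies {join_cost}")
```

Under this model the concatenation cost grows roughly with the square of the number of pieces, while the join cost grows linearly.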

Memory Efficiency

NaN-Boxing

Levython uses NaN-boxing to store all values in 64 bits:

Integers: 53-bit signed integers stored directly
Floats: Standard IEEE 754 double precision
Pointers: 48-bit pointers in NaN payload
Booleans: Special NaN patterns
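
The idea can be illustrated with plain bit manipulation. The Python sketch below covers only floats and pointers, and its bit layout is an assumption: the tag bit `TAG_POINTER` and the helper names are hypothetical, and Levython's actual encoding may differ. It relies on the fact that an IEEE 754 double with all exponent bits set and a non-zero mantissa is a NaN, which leaves the mantissa bits free to carry a payload.

```python
import struct

QNAN = 0x7FF8_0000_0000_0000          # exponent all ones plus the quiet-NaN bit
TAG_POINTER = 0x0001_0000_0000_0000   # hypothetical tag bit marking a pointer
PAYLOAD_MASK = 0x0000_FFFF_FFFF_FFFF  # low 48 bits hold the address


def box_float(value):
    """Reinterpret a double as its raw 64-bit pattern."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def unbox_float(bits):
    """Reinterpret a 64-bit pattern as a double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def box_pointer(address):
    """Store a 48-bit address in the payload of a quiet NaN."""
    if address & ~PAYLOAD_MASK:
        raise ValueError("address does not fit in 48 bits")
    return QNAN | TAG_POINTER | address


def is_boxed_pointer(bits):
    return (bits & (QNAN | TAG_POINTER)) == (QNAN | TAG_POINTER)


def unbox_pointer(bits):
    return bits & PAYLOAD_MASK


boxed = box_pointer(0x7FFF_DEAD_BEEF)
print(hex(boxed), is_boxed_pointer(boxed), hex(unbox_pointer(boxed)))
print(is_boxed_pointer(box_float(3.14)))
```

Ordinary doubles never have all exponent bits set unless they are infinities or NaNs, so a single mask-and-compare is enough to tell a boxed pointer from a number.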

List Memory Layout

Operation	Memory Cost	Notes
Create empty list	24 bytes	Header + small capacity
Append (within capacity)	0 bytes	No reallocation
Append (resize needed)	N * 1.5 * 8 bytes	Growth factor of 1.5x; N is the capacity before the resize
Per element	8 bytes	NaN-boxed value
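
The effect of the 1.5x growth factor can be worked through with a small simulation. This Python sketch is an illustration under stated assumptions rather than a description of the real allocator: the initial capacity of 4 is hypothetical (the table only says "small capacity"), and the function name `simulate_appends` is made up for this example. The 8 bytes per element come from the table.

```python
INITIAL_CAPACITY = 4   # hypothetical; the table only says "small capacity"
BYTES_PER_ELEMENT = 8  # from the table: one NaN-boxed value per slot


def simulate_appends(count, capacity=INITIAL_CAPACITY):
    """Return (reallocations, final_capacity, bytes_copied) for `count` appends."""
    length = 0
    reallocations = 0
    bytes_copied = 0
    for _ in range(count):
        if length == capacity:
            # Grow by 1.5x (at least one slot) and copy the existing elements.
            capacity = max(capacity + 1, capacity + capacity // 2)
            reallocations += 1
            bytes_copied += length * BYTES_PER_ELEMENT
        length += 1
    return reallocations, capacity, bytes_copied


grown = simulate_appends(10000)
preallocated = simulate_appends(10000, capacity=10000)
print("dynamic growth:", grown)
print("preallocated:  ", preallocated)
```

With these assumptions, appending one element at a time triggers a series of reallocations and copies, while a list created at its final size triggers none. That difference is what the "Preallocate Lists" technique exploits.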

Profiling

Measure execution time of code sections:

# Time a function
start <- time()
result <- fibonacci(30)
elapsed <- time() - start
say("Execution time: " + str(elapsed) + "s")

# Compare two approaches
start1 <- time()
method1()
time1 <- time() - start1

start2 <- time()
method2()
time2 <- time() - start2

say("Method 1: " + str(time1) + "s")
say("Method 2: " + str(time2) + "s")
say("Speedup: " + str(time1 / time2) + "x")
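
Single timings are noisy, and with a JIT the first calls may still run in the interpreter. One common pattern is to warm up first and then keep the fastest of several runs. The following Python sketch shows that pattern only. The helper `best_of` and its parameters are hypothetical, and the warm-up count of 100 is taken from the "~100 calls" figure mentioned earlier. The same structure could be written in Levython using `time()`.

```python
import time


def best_of(fn, repeats=5, warmup=100):
    """Run fn `warmup` times untimed, then return the fastest of `repeats` timed runs."""
    for _ in range(warmup):
        fn()  # give a JIT, if present, the chance to compile the hot path
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def method1():
    return sum([i * i for i in range(10000)])  # builds a temporary list


def method2():
    return sum(i * i for i in range(10000))    # streams values without a list


time1 = best_of(method1)
time2 = best_of(method2)
print(f"Method 1: {time1:.6f}s")
print(f"Method 2: {time2:.6f}s")
```

Taking the minimum rather than the mean reduces the influence of unrelated system activity on the result.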

Best Practices

✅ Use local variables instead of global lookups
✅ Cache function references in loops
✅ Preallocate lists with known sizes
✅ Use list comprehensions for filtering/mapping
✅ Join strings with join() instead of concatenation
✅ Avoid unnecessary type conversions
✅ Use range() instead of building full lists
❌ Don't optimize prematurely - profile first
❌ Don't use recursion for deep call stacks (>1000)
❌ Don't create temporary lists in hot loops

Compiler Optimizations

Levython automatically applies these optimizations:

Constant Folding

# Input code
x <- 10 * 5 + 3

# Optimized to
x <- 53
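
To make the transformation concrete, here is a minimal constant folder written against Python's own `ast` module. It is only an analogy: it operates on Python syntax (`x = ...` instead of `x <- ...`), it handles just `+`, `-` and `*` on numeric literals, and it says nothing about how Levython's compiler is actually structured. The names `ConstantFolder` and `fold` are hypothetical. `ast.unparse` requires Python 3.9 or later.

```python
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def _is_number(node):
    return isinstance(node, ast.Constant) and type(node.value) in (int, float)


class ConstantFolder(ast.NodeTransformer):
    """Replace arithmetic on numeric literals with the computed literal."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first (bottom-up)
        op = _OPS.get(type(node.op))
        if op is not None and _is_number(node.left) and _is_number(node.right):
            value = op(node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value=value), node)
        return node


def fold(source):
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(fold("x = 10 * 5 + 3"))
print(fold("y = n * (2 + 3)"))
```

The second call shows that only the fully constant subexpression is folded. The multiplication by `n` has to stay because its value is not known at compile time.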

Dead Code Elimination

# Input code
if false then
    say("Never executed")
end

# Optimized to
# (entire block removed)

Loop Invariant Code Motion

# Input code
for i in range(0, 1000) do
    limit <- len(items) * 2
    process(items[i], limit)
end

# Optimized to
limit <- len(items) * 2
for i in range(0, 1000) do
    process(items[i], limit)
end

Performance Tips by Use Case

Data Processing

Use list comprehensions for filtering
Batch operations instead of processing one at a time
Cache computed values in local variables

String Manipulation

Use join() for building long strings
Cache string functions (upper, lower, trim)
Avoid repeated concatenation in loops

Numerical Computing

Use integers when possible (faster than floats)
Express element-wise operations as list comprehensions instead of explicit loops
Minimize function call overhead in tight loops