Performance Optimization

Levython is designed for high performance with JIT compilation, optimized memory layout, and zero-overhead abstractions.

Benchmarks

Performance comparison with other interpreted languages:

Benchmark Levython Python 3.11 Node.js 20 Ruby 3.2
Fibonacci (n=35) 0.45s 3.2s 0.8s 5.1s
List operations (1M) 0.12s 0.68s 0.25s 1.2s
String concat (100K) 0.08s 0.45s 0.15s 0.92s
Startup time 8ms 35ms 65ms 125ms

JIT Compilation

Levython uses Just-In-Time compilation for hot code paths:

How JIT Works

  1. Bytecode Interpretation: Code starts in interpreted mode
  2. Hot Path Detection: Frequently called functions identified
  3. Native Compilation: JIT compiles to x86-64 machine code
  4. Execution: Subsequent calls run native code directly
# This function will be JIT compiled after ~100 calls
fun factorial(n)
    if n <= 1 then
        return 1
    end
    return n * factorial(n - 1)
end

# First 100 calls: interpreted (~2.5s total)
# Remaining calls: JIT compiled (~0.3s total)
for i in range(0, 10000) do
    factorial(20)
end

Optimization Techniques

1. Use Local Variables

Local variable access is faster than global lookups:

# Slower - global lookup each iteration
for i in range(0, 1000000) do
    result <- math.sqrt(i)
end

# Faster - cache in local variable
sqrt <- math.sqrt
for i in range(0, 1000000) do
    result <- sqrt(i)
end

2. Avoid Repeated Calculations

# Slower - recalculate every iteration
for i in range(0, len(items)) do
    process(items[i])
end

# Faster - calculate once
n <- len(items)
for i in range(0, n) do
    process(items[i])
end

3. Use List Comprehensions

# Slower - multiple append operations
result <- []
for i in range(0, 1000) do
    if i % 2 == 0 then
        append(result, i * 2)
    end
end

# Faster - optimized list building
result <- [i * 2 for i in range(0, 1000) if i % 2 == 0]

4. Preallocate Lists

# Slower - dynamic growth
data <- []
for i in range(0, 10000) do
    append(data, i)
end

# Faster - preallocate with known size
size <- 10000
data <- [0] * size
for i in range(0, size) do
    data[i] <- i
end

5. String Building

# Slower - creates new string each iteration
result <- ""
for i in range(0, 1000) do
    result <- result + str(i) + ","
end

# Faster - use list and join
parts <- []
for i in range(0, 1000) do
    append(parts, str(i))
end
result <- join(parts, ",")

Memory Efficiency

NaN-Boxing

Levython uses NaN-boxing to store all values in 64 bits:

  • Integers: 53-bit signed integers stored directly
  • Floats: Standard IEEE 754 double precision
  • Pointers: 48-bit pointers in NaN payload
  • Booleans: Special NaN patterns

List Memory Layout

Operation Memory Complexity Notes
Create empty list 24 bytes Header + small capacity
Append (within capacity) 0 bytes No reallocation
Append (resize needed) N * 1.5 * 8 bytes Growth factor of 1.5x
Per element 8 bytes NaN-boxed value

Profiling

Measure execution time of code sections:

# Time a function
start <- time()
result <- fibonacci(30)
elapsed <- time() - start
say("Execution time: " + str(elapsed) + "s")

# Compare two approaches
start1 <- time()
method1()
time1 <- time() - start1

start2 <- time()
method2()
time2 <- time() - start2

say("Method 1: " + str(time1) + "s")
say("Method 2: " + str(time2) + "s")
say("Speedup: " + str(time1 / time2) + "x")

Best Practices

  • ✅ Use local variables instead of global lookups
  • ✅ Cache function references in loops
  • ✅ Preallocate lists with known sizes
  • ✅ Use list comprehensions for filtering/mapping
  • ✅ Join strings with join() instead of concatenation
  • ✅ Avoid unnecessary type conversions
  • ✅ Use range() instead of building full lists
  • ❌ Don't optimize prematurely - profile first
  • ❌ Don't use recursion for deep call stacks (>1000)
  • ❌ Don't create temporary lists in hot loops

Compiler Optimizations

Levython automatically applies these optimizations:

Constant Folding

# Input code
x <- 10 * 5 + 3

# Optimized to
x <- 53

Dead Code Elimination

# Input code
if false then
    say("Never executed")
end

# Optimized to
# (entire block removed)

Loop Invariant Code Motion

# Input code
for i in range(0, 1000) do
    limit <- len(items) * 2
    process(items[i], limit)
end

# Optimized to
limit <- len(items) * 2
for i in range(0, 1000) do
    process(items[i], limit)
end

Performance Tips by Use Case

Data Processing

  • Use list comprehensions for filtering
  • Batch operations instead of processing one at a time
  • Cache computed values in local variables

String Manipulation

  • Use join() for building long strings
  • Cache string functions (upper, lower, trim)
  • Avoid repeated concatenation in loops

Numerical Computing

  • Use integers when possible (faster than floats)
  • Vectorize operations using list comprehensions
  • Minimize function call overhead in tight loops