Generally Slow Interpreter Performance #4

Open
ashn-dot-dev opened this issue Feb 21, 2025 · 0 comments
ashn-dot-dev commented Feb 21, 2025

Generally, the Lumpy interpreter tends to perform poorly. Lumpy takes more time to execute a given piece of code than similar scripting languages, and takes significantly more time to execute a given piece of code than languages compiling to native machine code via an optimizing compiler.

To measure this performance gap, I put together a quick benchmark that times the computation of the 25th Fibonacci number in a handful of languages, all using the same naive recursive Fibonacci function. To test native ahead-of-time compiled code produced by an optimizing compiler, I chose C (via Clang) and Sunder (with Clang as the backend C compiler). For scripting languages, I chose Lua (via lua.c), Python (via CPython), and Monkey (via my own monkey-python implementation, which uses a tree-walk interpreter similar to Lumpy's).

Fibonacci-25 in C:

#include <stdio.h>

static unsigned long
fibonacci(unsigned long n)
{
    if (n <= 1) {
        return n;
    }
    return fibonacci(n - 1) + fibonacci(n - 2);
}

int
main(void)
{
    printf("%lu\n", fibonacci(25));
}

Fibonacci-25 in Sunder:

import "std";

func fibonacci(n: usize) usize {
    if n <= 1 {
        return n;
    }
    return fibonacci(n - 1) + fibonacci(n - 2);
}

func main() void {
    var answer = fibonacci(25);
    std::print_format_line(std::out(), "{}", (:[]std::formatter)[std::formatter::init[[typeof(answer)]](&answer)]);
}

Fibonacci-25 in Lua:

function fibonacci(n)
    if n <= 1 then
        return n
    end
    return fibonacci(n - 1) + fibonacci(n - 2)
end

print(fibonacci(25));

Fibonacci-25 in Python:

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(25))

Fibonacci-25 in Monkey:

let fibonacci = fn(n) {
    if (n < 2) {
        return n;
    }
    return fibonacci(n - 1) + fibonacci(n - 2);
};

puts(fibonacci(25));

Fibonacci-25 in Lumpy:

let fibonacci = function(n) {
    if n <= 1 {
        return n;
    }
    return fibonacci(n - 1) + fibonacci(n - 2);
};

dumpln(fibonacci(25));
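
For a sense of scale, the naive recursion shared by all six programs makes far more calls than the size of the answer suggests. The following Python sketch is not part of the benchmark; `count_calls` is a hypothetical helper that returns the Fibonacci value together with the number of function invocations the naive recursion performs.

```python
def count_calls(n):
    """Return (fibonacci(n), number of calls made by the naive recursion)."""
    if n <= 1:
        # Base case: one call, no further recursion.
        return n, 1
    a, calls_a = count_calls(n - 1)
    b, calls_b = count_calls(n - 2)
    # This call itself plus every call made by the two sub-problems.
    return a + b, 1 + calls_a + calls_b


value, calls = count_calls(25)
print(value, calls)  # 75025 242785
```

Each timed run therefore performs 242,785 calls to `fibonacci`, so for the interpreters the timings largely reflect the cost of a function call plus a comparison and a few integer operations.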

The benchmark script and programs can be found here. The output/summary for this GitHub issue was generated the morning of 2025-02-21 on my M2 MacBook Air. Measurements were taken using Apple clang version 16.0.0 (clang-1600.0.26.6), Sunder release 2025.02.19, Lua 5.4.7 (Copyright (C) 1994-2024 Lua.org, PUC-Rio), CPython at version Python 3.13.2, monkey-python at commit 173d1852e8fee25476388a656551753e486fc7b0, and Lumpy at commit abf380050192f2b843b3340a6f3ebececfb03a40.

The benchmark script:

docmd() {
    echo "\$ ${1}"
    eval "${1}"
}

bench() {
    RUNS=5
    for i in $(seq "${RUNS}"); do
        docmd "time ${1}"
        if [ "${i}" -ne "${RUNS}" ]; then echo '----------------'; fi
    done
}

echo 'System Info'
echo '==========='
sysctl -n machdep.cpu.brand_string
uname -v
printf '\n\n'

trap '{ rm -f fibonacci-25; }' EXIT

echo 'C AOT compiled with optimization -O0'
echo '===================================='
docmd 'clang -O0 -o fibonacci-25 fibonacci-25.c'
bench './fibonacci-25'
printf '\n\n'

echo 'C AOT compiled with optimization -O2'
echo '===================================='
docmd 'clang -O2 -o fibonacci-25 fibonacci-25.c'
bench './fibonacci-25'
printf '\n\n'

echo 'Sunder AOT compiled with optimization -O0'
echo '========================================='
SUNDER_CC=clang SUNDER_CFLAGS='-O0' \
docmd 'sunder-compile -o fibonacci-25 fibonacci-25.sunder'
bench './fibonacci-25'
printf '\n\n'

echo 'Sunder AOT compiled with optimization -O2'
echo '========================================='
SUNDER_CC=clang SUNDER_CFLAGS='-O2' \
docmd 'sunder-compile -o fibonacci-25 fibonacci-25.sunder'
bench './fibonacci-25'
printf '\n\n'

echo 'Sunder compiled-and-run using sunder-run with optimization -O0'
echo '=============================================================='
SUNDER_CC=clang SUNDER_CFLAGS='-O0' \
bench "sunder-run fibonacci-25.sunder"
printf '\n\n'

echo 'Sunder compiled-and-run using sunder-run with optimization -O2'
echo '=============================================================='
SUNDER_CC=clang SUNDER_CFLAGS='-O2' \
bench "sunder-run fibonacci-25.sunder"
printf '\n\n'

echo 'Lua (Standalone lua.c)'
echo '======================'
bench 'lua fibonacci-25.lua'
printf '\n\n'

echo 'Python interpreted using CPython'
echo '================================'
bench 'python3 fibonacci-25.py'
printf '\n\n'

echo 'Monkey interpreted using monkey-python.py (CPython)'
echo '==================================================='
bench 'python3 ~/sources/monkey-python/monkey.py fibonacci-25.monkey'
printf '\n\n'

echo 'Lumpy interpreted using lumpy.py (CPython)'
echo '=========================================='
bench 'python3 $LUMPY_HOME/lumpy.py fibonacci-25.lumpy'
printf '\n\n'

echo 'Lumpy interpreted using lumpy (Nuitka-compiled lumpy.py)'
echo '========================================================'
bench 'lumpy fibonacci-25.lumpy'
printf '\n\n'

Benchmark output:

$ sh bench.sh >"$(date -u +%Y-%m-%d)-bench.output.txt" 2>&1 && cat "$(date -u +%Y-%m-%d)-bench.output.txt"
System Info
===========
Apple M2
Darwin Kernel Version 23.6.0: Mon Jul 29 21:16:46 PDT 2024; root:xnu-10063.141.2~1/RELEASE_ARM64_T8112


C AOT compiled with optimization -O0
====================================
$ clang -O0 -o fibonacci-25 fibonacci-25.c
$ time ./fibonacci-25
75025

real	0m0.162s
user	0m0.002s
sys	0m0.001s
----------------
$ time ./fibonacci-25
75025

real	0m0.004s
user	0m0.002s
sys	0m0.001s
----------------
$ time ./fibonacci-25
75025

real	0m0.002s
user	0m0.001s
sys	0m0.001s
----------------
$ time ./fibonacci-25
75025

real	0m0.002s
user	0m0.001s
sys	0m0.001s
----------------
$ time ./fibonacci-25
75025

real	0m0.002s
user	0m0.001s
sys	0m0.001s


C AOT compiled with optimization -O2
====================================
$ clang -O2 -o fibonacci-25 fibonacci-25.c
$ time ./fibonacci-25
75025

real	0m0.122s
user	0m0.000s
sys	0m0.001s
----------------
$ time ./fibonacci-25
75025

real	0m0.002s
user	0m0.000s
sys	0m0.001s
----------------
$ time ./fibonacci-25
75025

real	0m0.002s
user	0m0.001s
sys	0m0.001s
----------------
$ time ./fibonacci-25
75025

real	0m0.002s
user	0m0.001s
sys	0m0.001s
----------------
$ time ./fibonacci-25
75025

real	0m0.002s
user	0m0.000s
sys	0m0.001s


Sunder AOT compiled with optimization -O0
=========================================
$ sunder-compile -o fibonacci-25 fibonacci-25.sunder
$ time ./fibonacci-25
75025

real	0m0.185s
user	0m0.008s
sys	0m0.002s
----------------
$ time ./fibonacci-25
75025

real	0m0.007s
user	0m0.005s
sys	0m0.001s
----------------
$ time ./fibonacci-25
75025

real	0m0.006s
user	0m0.004s
sys	0m0.001s
----------------
$ time ./fibonacci-25
75025

real	0m0.005s
user	0m0.004s
sys	0m0.001s
----------------
$ time ./fibonacci-25
75025

real	0m0.005s
user	0m0.004s
sys	0m0.001s


Sunder AOT compiled with optimization -O2
=========================================
$ sunder-compile -o fibonacci-25 fibonacci-25.sunder
$ time ./fibonacci-25
75025

real	0m0.172s
user	0m0.002s
sys	0m0.002s
----------------
$ time ./fibonacci-25
75025

real	0m0.004s
user	0m0.002s
sys	0m0.001s
----------------
$ time ./fibonacci-25
75025

real	0m0.002s
user	0m0.001s
sys	0m0.001s
----------------
$ time ./fibonacci-25
75025

real	0m0.002s
user	0m0.001s
sys	0m0.001s
----------------
$ time ./fibonacci-25
75025

real	0m0.002s
user	0m0.001s
sys	0m0.001s


Sunder compiled-and-run using sunder-run with optimization -O0
==============================================================
$ time sunder-run fibonacci-25.sunder
75025

real	0m0.551s
user	0m0.362s
sys	0m0.033s
----------------
$ time sunder-run fibonacci-25.sunder
75025

real	0m0.553s
user	0m0.368s
sys	0m0.033s
----------------
$ time sunder-run fibonacci-25.sunder
75025

real	0m0.556s
user	0m0.369s
sys	0m0.033s
----------------
$ time sunder-run fibonacci-25.sunder
75025

real	0m0.552s
user	0m0.371s
sys	0m0.034s
----------------
$ time sunder-run fibonacci-25.sunder
75025

real	0m0.560s
user	0m0.374s
sys	0m0.035s


Sunder compiled-and-run using sunder-run with optimization -O2
==============================================================
$ time sunder-run fibonacci-25.sunder
75025

real	0m0.879s
user	0m0.697s
sys	0m0.038s
----------------
$ time sunder-run fibonacci-25.sunder
75025

real	0m0.887s
user	0m0.699s
sys	0m0.039s
----------------
$ time sunder-run fibonacci-25.sunder
75025

real	0m0.884s
user	0m0.699s
sys	0m0.039s
----------------
$ time sunder-run fibonacci-25.sunder
75025

real	0m0.886s
user	0m0.701s
sys	0m0.038s
----------------
$ time sunder-run fibonacci-25.sunder
75025

real	0m0.882s
user	0m0.697s
sys	0m0.040s


Lua (Standalone lua.c)
======================
$ time lua fibonacci-25.lua
75025

real	0m0.014s
user	0m0.011s
sys	0m0.002s
----------------
$ time lua fibonacci-25.lua
75025

real	0m0.009s
user	0m0.008s
sys	0m0.001s
----------------
$ time lua fibonacci-25.lua
75025

real	0m0.008s
user	0m0.007s
sys	0m0.001s
----------------
$ time lua fibonacci-25.lua
75025

real	0m0.008s
user	0m0.007s
sys	0m0.001s
----------------
$ time lua fibonacci-25.lua
75025

real	0m0.007s
user	0m0.006s
sys	0m0.001s


Python interpreted using CPython
================================
$ time python3 fibonacci-25.py
75025

real	0m0.031s
user	0m0.021s
sys	0m0.004s
----------------
$ time python3 fibonacci-25.py
75025

real	0m0.022s
user	0m0.019s
sys	0m0.003s
----------------
$ time python3 fibonacci-25.py
75025

real	0m0.022s
user	0m0.019s
sys	0m0.003s
----------------
$ time python3 fibonacci-25.py
75025

real	0m0.022s
user	0m0.019s
sys	0m0.003s
----------------
$ time python3 fibonacci-25.py
75025

real	0m0.022s
user	0m0.019s
sys	0m0.003s


Monkey interpreted using monkey-python.py (CPython)
===================================================
$ time python3 ~/sources/monkey-python/monkey.py fibonacci-25.monkey
75025

real	0m1.389s
user	0m1.308s
sys	0m0.078s
----------------
$ time python3 ~/sources/monkey-python/monkey.py fibonacci-25.monkey
75025

real	0m1.376s
user	0m1.309s
sys	0m0.066s
----------------
$ time python3 ~/sources/monkey-python/monkey.py fibonacci-25.monkey
75025

real	0m1.377s
user	0m1.316s
sys	0m0.060s
----------------
$ time python3 ~/sources/monkey-python/monkey.py fibonacci-25.monkey
75025

real	0m1.387s
user	0m1.314s
sys	0m0.072s
----------------
$ time python3 ~/sources/monkey-python/monkey.py fibonacci-25.monkey
75025

real	0m1.380s
user	0m1.319s
sys	0m0.059s


Lumpy interpreted using lumpy.py (CPython)
==========================================
$ time python3 $LUMPY_HOME/lumpy.py fibonacci-25.lumpy
75025

real	0m3.902s
user	0m2.512s
sys	0m1.387s
----------------
$ time python3 $LUMPY_HOME/lumpy.py fibonacci-25.lumpy
75025

real	0m3.943s
user	0m2.537s
sys	0m1.404s
----------------
$ time python3 $LUMPY_HOME/lumpy.py fibonacci-25.lumpy
75025

real	0m3.963s
user	0m2.548s
sys	0m1.413s
----------------
$ time python3 $LUMPY_HOME/lumpy.py fibonacci-25.lumpy
75025

real	0m3.930s
user	0m2.530s
sys	0m1.398s
----------------
$ time python3 $LUMPY_HOME/lumpy.py fibonacci-25.lumpy
75025

real	0m3.957s
user	0m2.547s
sys	0m1.408s


Lumpy interpreted using lumpy (Nuitka-compiled lumpy.py)
========================================================
$ time lumpy fibonacci-25.lumpy
75025

real	0m3.062s
user	0m3.031s
sys	0m0.022s
----------------
$ time lumpy fibonacci-25.lumpy
75025

real	0m3.044s
user	0m3.026s
sys	0m0.017s
----------------
$ time lumpy fibonacci-25.lumpy
75025

real	0m3.031s
user	0m3.013s
sys	0m0.014s
----------------
$ time lumpy fibonacci-25.lumpy
75025

real	0m3.038s
user	0m3.018s
sys	0m0.019s
----------------
$ time lumpy fibonacci-25.lumpy
75025

real	0m3.041s
user	0m3.027s
sys	0m0.013s
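
The summary that follows keeps just the section headings and the `real` lines of this output. The issue does not show how it was produced; as one possible approach, the Python sketch below groups `real` timings under the most recent `=`-underlined heading. `real_times_by_section` is a hypothetical helper, and the embedded sample is a shortened excerpt of the output above.

```python
def real_times_by_section(output):
    """Map each '='-underlined heading to the list of its `real` timings."""
    sections = {}
    current = None
    lines = output.splitlines()
    for index, line in enumerate(lines):
        following = lines[index + 1] if index + 1 < len(lines) else ""
        if line and following and set(following) == {"="}:
            # A non-empty line followed by a row of '=' starts a new section.
            current = line
            sections[current] = []
        elif current is not None and line.startswith("real"):
            # `time` prints "real<TAB>0m0.014s"; keep only the duration.
            sections[current].append(line.split()[1])
    return sections


# Shortened excerpt of the benchmark output (assumption: tab-separated fields).
SAMPLE = """\
Lua (Standalone lua.c)
======================
$ time lua fibonacci-25.lua
75025

real\t0m0.014s
user\t0m0.011s
sys\t0m0.002s
----------------
$ time lua fibonacci-25.lua
75025

real\t0m0.009s
user\t0m0.008s
sys\t0m0.001s
"""

for heading, times in real_times_by_section(SAMPLE).items():
    print(heading)
    print("=" * len(heading))
    print("\n".join(times))
```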

Benchmark summary:

$ cat 2025-02-21-bench.output.summary.txt 
System Info
===========
Apple M2
Darwin Kernel Version 23.6.0: Mon Jul 29 21:16:46 PDT 2024; root:xnu-10063.141.2~1/RELEASE_ARM64_T8112


C AOT compiled with optimization -O0
====================================
0m0.162s
0m0.004s
0m0.002s
0m0.002s
0m0.002s

C AOT compiled with optimization -O2
====================================
0m0.122s
0m0.002s
0m0.002s
0m0.002s
0m0.002s

Sunder AOT compiled with optimization -O0
=========================================
0m0.185s
0m0.007s
0m0.006s
0m0.005s
0m0.005s

Sunder AOT compiled with optimization -O2
=========================================
0m0.172s
0m0.004s
0m0.002s
0m0.002s
0m0.002s

Sunder compiled-and-run using sunder-run with optimization -O0
==============================================================
0m0.551s
0m0.553s
0m0.556s
0m0.552s
0m0.560s

Sunder compiled-and-run using sunder-run with optimization -O2
==============================================================
0m0.879s
0m0.887s
0m0.884s
0m0.886s
0m0.882s

Lua (Standalone lua.c)
======================
0m0.014s
0m0.009s
0m0.008s
0m0.008s
0m0.007s

Python interpreted using CPython
================================
0m0.031s
0m0.022s
0m0.022s
0m0.022s
0m0.022s

Monkey interpreted using monkey-python.py (CPython)
===================================================
0m1.389s
0m1.376s
0m1.377s
0m1.387s
0m1.380s

Lumpy interpreted using lumpy.py (CPython)
==========================================
0m3.902s
0m3.943s
0m3.963s
0m3.930s
0m3.957s

Lumpy interpreted using lumpy (Nuitka-compiled lumpy.py)
========================================================
0m3.062s
0m3.044s
0m3.031s
0m3.038s
0m3.041s
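
To condense the summary above into a single figure per configuration, one option is to take the median of the five `real` times, which discounts the slow first run of the natively compiled binaries. The Python sketch below is not part of the benchmark scripts; `parse_time`, `medians`, and the shortened labels are names chosen for illustration, and the timings are copied from the summary. Since `real` includes process startup, the ratios are only a rough guide.

```python
import re
from statistics import median

# Wall-clock ("real") times copied from the benchmark summary in this issue.
SUMMARY = {
    "C -O2": ["0m0.122s", "0m0.002s", "0m0.002s", "0m0.002s", "0m0.002s"],
    "Lua": ["0m0.014s", "0m0.009s", "0m0.008s", "0m0.008s", "0m0.007s"],
    "CPython": ["0m0.031s", "0m0.022s", "0m0.022s", "0m0.022s", "0m0.022s"],
    "Monkey (monkey-python)": ["0m1.389s", "0m1.376s", "0m1.377s", "0m1.387s", "0m1.380s"],
    "Lumpy (lumpy.py)": ["0m3.902s", "0m3.943s", "0m3.963s", "0m3.930s", "0m3.957s"],
    "Lumpy (Nuitka)": ["0m3.062s", "0m3.044s", "0m3.031s", "0m3.038s", "0m3.041s"],
}


def parse_time(text):
    """Convert a `time`-style duration such as '0m3.902s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def medians(summary):
    """Return the median wall-clock time in seconds for each configuration."""
    return {
        name: median(parse_time(t) for t in times)
        for name, times in summary.items()
    }


result = medians(SUMMARY)
baseline = result["CPython"]
for name, seconds in result.items():
    print(f"{name:24} {seconds:6.3f}s  {seconds / baseline:7.1f}x CPython")
```

With these numbers the median for `lumpy.py` is 3.943s against 1.380s for monkey-python and 0.022s for CPython, which is a bit under 3x slower than Monkey and roughly 180x slower than plain CPython on this workload.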