Generating Python bindings performance

Fedor Chelnokov edited this page Oct 13, 2024 · 4 revisions

All C++ functions and classes are automatically exposed to Python using MRBind.

Since the number of exposed entities is huge, this process is not instantaneous and proper build settings are crucial for best performance.

The measurements below were made on a computer with the following configuration:

OS: Windows 11
CPU: AMD Ryzen 9 3900X 12-Core Processor, 24 concurrent threads
Memory: 64 GB

Release configuration

This configuration takes much longer to build because of compiler optimizations. It is built with the command:

scripts\mrbind\generate_win.bat -B -j NumThreads NUM_FRAGMENTS=NumFragments
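
As a rough sketch, the command above can be assembled from a script, which makes it easier to try several settings in a row. The helper name `build_command` is hypothetical and not part of the repository; only the script path and its arguments are taken from the command above.

```python
def build_command(num_threads: int, num_fragments: int) -> list:
    """Return the argument list for the Release bindings build shown above.

    `build_command` is a hypothetical helper used for illustration only.
    """
    if num_threads < 1 or num_fragments < 1:
        raise ValueError("num_threads and num_fragments must be positive")
    return [
        r"scripts\mrbind\generate_win.bat",
        "-B",
        "-j", str(num_threads),
        f"NUM_FRAGMENTS={num_fragments}",
    ]


# Print the command line for 24 threads and 32 fragments.
print(" ".join(build_command(24, 32)))
```

Passing the returned list to `subprocess.run(..., check=True)` from the repository root on Windows would presumably start the build; that step is left out here so the sketch has no side effects.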

Single-threaded builds

Single-threaded builds are the best choice when memory is limited, and increasing NumFragments can reduce peak memory consumption even further.

NumThreads  NumFragments  Build Time     Comment
1           1             20 min 40 sec
1           2             16 min 46 sec
1           4             14 min 47 sec
1           5             14 min 45 sec
1           6             14 min 37 sec  Fastest
1           8             16 min 36 sec

For some reason, splitting the bindings into 6 fragments gives the best single-threaded performance, even though the system is not limited by memory.
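
To put the single-threaded table in relative terms, the build times can be converted to seconds and compared with the one-fragment baseline. This is only a small sketch over the numbers measured above; the helper `to_seconds` is introduced here for illustration.

```python
import re

# Single-threaded Release build times from the table above, keyed by NumFragments.
SINGLE_THREADED = {
    1: "20 min 40 sec",
    2: "16 min 46 sec",
    4: "14 min 47 sec",
    5: "14 min 45 sec",
    6: "14 min 37 sec",
    8: "16 min 36 sec",
}


def to_seconds(text: str) -> int:
    """Convert a string such as '20 min 40 sec' to a number of seconds."""
    match = re.fullmatch(r"(\d+) min (\d+) sec", text)
    if match is None:
        raise ValueError(f"unexpected time format: {text!r}")
    minutes, seconds = map(int, match.groups())
    return minutes * 60 + seconds


baseline = to_seconds(SINGLE_THREADED[1])
for fragments, text in SINGLE_THREADED.items():
    seconds = to_seconds(text)
    print(f"{fragments} fragment(s): {seconds} s, {baseline / seconds:.2f}x vs. 1 fragment")

# Fragment count with the shortest build time.
best = min(SINGLE_THREADED, key=lambda n: to_seconds(SINGLE_THREADED[n]))
```

With these numbers, 6 fragments take 877 s against 1240 s for a single fragment, roughly a 1.4x improvement.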

Multi-threaded builds

Multi-threaded builds are much faster, but peak memory consumption grows in proportion to the number of threads, compared with a single-threaded build using the same number of fragments.

NumThreads  NumFragments  Build Time    Comment
4           4             6 min 6 sec
6           6             4 min 10 sec
8           8             4 min 1 sec
24          24            3 min 20 sec
24          28            3 min 15 sec
24          32            3 min 3 sec   Fastest
24          36            3 min 6 sec
24          40            3 min 6 sec
24          48            3 min 10 sec
32          32            3 min 9 sec

The best build times are reached when the number of threads equals the CPU's concurrency (24 here) and the number of fragments is somewhat larger (32 here).
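
One way to turn this observation into a starting point for another machine is sketched below. It assumes that the ratio seen here (32 fragments for 24 threads, about 4/3) carries over to other CPUs, which the measurements above do not establish. `suggest_settings` is a hypothetical helper, and its result should be treated as a first guess to be tuned by measurement.

```python
import os


def suggest_settings(concurrency=None):
    """Return (num_threads, num_fragments) as a starting point for tuning.

    Assumption: the 24-thread / 32-fragment optimum measured above
    generalizes as fragments ~= 4/3 * threads. This is an extrapolation
    from a single machine, not a documented rule.
    """
    if concurrency is None:
        # os.cpu_count() may return None; fall back to a single thread.
        concurrency = os.cpu_count() or 1
    if concurrency < 1:
        raise ValueError("concurrency must be positive")
    num_threads = concurrency
    # Integer ceiling of concurrency * 4 / 3.
    num_fragments = (4 * concurrency + 2) // 3
    return num_threads, num_fragments


threads, fragments = suggest_settings(24)
print(f"-j {threads} NUM_FRAGMENTS={fragments}")
```

On a machine with limited memory, fewer threads may be the safer choice, since peak memory consumption grows with the thread count.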
