Generating Python bindings performance
All C++ functions and classes are automatically exposed to Python using MRBind (see more).
Since the number of exposed entities is huge, this process is not instantaneous and proper build settings are crucial for best performance.
The measurements below were made on a computer with:
- OS: Windows 11
- CPU: AMD Ryzen 9 3900X 12-Core Processor, 24 concurrent threads
- Memory: 64 GB
The measured build configuration enables compiler optimizations, so it takes much more time than an unoptimized build. It is built with the command:
`scripts\mrbind\generate_win.bat -B -j NumThreads NUM_FRAGMENTS=NumFragments`
Single-threaded builds are the best choice if you have a limited amount of memory, and increasing NumFragments can decrease peak memory consumption even more.
NumThreads | NumFragments | Build Time | Comment |
---|---|---|---|
1 | 1 | 20 min 40 sec | |
1 | 2 | 16 min 46 sec | |
1 | 4 | 14 min 47 sec | |
1 | 5 | 14 min 45 sec | |
1 | 6 | 14 min 37 sec | Fastest |
1 | 8 | 16 min 36 sec | |
For some reason, splitting the bindings into 6 fragments gives the best single-threaded performance, even though the system is not limited by memory.
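To make the differences in the single-threaded table easier to compare, the following sketch converts the measured times to seconds and reports each setting's speedup relative to the 1-fragment build. The timings are copied from the table; the helper names `to_seconds` and `relative_speedups` are illustrative and not part of the MeshLib scripts.

```python
import re

# Single-threaded build times copied from the table above: NumFragments -> time.
SINGLE_THREADED = {
    1: "20 min 40 sec",
    2: "16 min 46 sec",
    4: "14 min 47 sec",
    5: "14 min 45 sec",
    6: "14 min 37 sec",
    8: "16 min 36 sec",
}


def to_seconds(text):
    """Convert a string such as '14 min 37 sec' to a number of seconds."""
    match = re.fullmatch(r"(\d+) min (\d+) sec", text)
    if match is None:
        raise ValueError(f"unexpected time format: {text!r}")
    minutes, seconds = map(int, match.groups())
    return 60 * minutes + seconds


def relative_speedups(times):
    """Return NumFragments -> speedup relative to the 1-fragment build."""
    baseline = to_seconds(times[1])
    return {n: baseline / to_seconds(t) for n, t in times.items()}


if __name__ == "__main__":
    for fragments, speedup in relative_speedups(SINGLE_THREADED).items():
        print(f"{fragments} fragment(s): {speedup:.2f}x")
```

With these numbers, the 6-fragment build comes out roughly 1.4 times faster than the 1-fragment build, while 8 fragments falls back to about the 2-fragment level.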
Multi-threaded builds are much faster, but their peak memory consumption increases proportionally to the number of threads compared to single-threaded builds with the same number of fragments.
NumThreads | NumFragments | Build Time | Comment |
---|---|---|---|
4 | 4 | 6 min 6 sec | |
6 | 6 | 4 min 10 sec | |
8 | 8 | 4 min 1 sec | |
24 | 24 | 3 min 20 sec | |
24 | 28 | 3 min 15 sec | |
24 | 32 | 3 min 3 sec | Fastest |
24 | 36 | 3 min 6 sec | |
24 | 40 | 3 min 6 sec | |
24 | 48 | 3 min 10 sec | |
32 | 32 | 3 min 9 sec | |
The best build times are reached with the number of threads equal to the CPU's concurrency and a somewhat larger number of fragments.
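As a starting point on other machines, this rule of thumb can be sketched as a small helper that picks NumThreads from the CPU concurrency and scales NumFragments by the 32/24 ratio of the fastest measurement above. The 4/3 factor is an assumption extrapolated from that single data point, and the function names are hypothetical; only the command line itself comes from this page. The best values on other hardware may differ, so the result is worth re-measuring.

```python
import os


def suggest_settings(cpu_threads=None):
    """Return (num_threads, num_fragments) following the rule of thumb above."""
    if cpu_threads is None:
        # os.cpu_count() reports logical CPUs; it may return None.
        cpu_threads = os.cpu_count() or 1
    # Assumption: keep the 32/24 = 4/3 fragments-to-threads ratio of the
    # fastest measured build, rounded up with integer arithmetic.
    num_fragments = (4 * cpu_threads + 2) // 3
    return cpu_threads, num_fragments


def build_command(num_threads, num_fragments):
    """Assemble the generate_win.bat command line shown earlier on this page."""
    return [
        "scripts\\mrbind\\generate_win.bat",
        "-B",
        "-j",
        str(num_threads),
        f"NUM_FRAGMENTS={num_fragments}",
    ]


if __name__ == "__main__":
    threads, fragments = suggest_settings()
    # Only print the command; running it is left to the user.
    print(" ".join(build_command(threads, fragments)))
```

If memory is the limiting factor rather than build time, lowering the thread count while keeping the number of fragments high is the direction the measurements above suggest.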