Generating Python bindings performance
All C++ functions and classes are automatically exposed to Python using MRBind.
Since the number of exposed entities is huge, this process is not instantaneous and proper build settings are crucial for best performance.
The measurements below were made on a computer with the following configuration:
- OS: Windows 11
- CPU: AMD Ryzen 9 3900X 12-Core Processor, 24 concurrent threads
- Memory: 64 GB

Each table entry is the best timing of several attempts with the same parameters.
The Release configuration takes much more time to build because of compiler optimizations. It is built with the command
scripts\mrbind\generate_win.bat -B -j NumThreads NUM_FRAGMENTS=NumFragments
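To make the role of the two parameters concrete, here is a minimal Python sketch that assembles the command line shown above for given values of NumThreads and NumFragments. The function name `generate_command` and its `mode` argument are hypothetical helpers introduced only for illustration; the script path and the `-B`, `-j`, `NUM_FRAGMENTS=` and `MODE=` arguments are copied from the commands in this document, and nothing else about the script's interface is assumed.

```python
from typing import List, Optional


def generate_command(num_threads: int, num_fragments: int,
                     mode: Optional[str] = None) -> List[str]:
    """Assemble the generate_win.bat argument list used in this document.

    Hypothetical helper: it only formats the arguments, it does not run anything.
    """
    if num_threads < 1 or num_fragments < 1:
        raise ValueError("num_threads and num_fragments must be positive")
    cmd = [
        r"scripts\mrbind\generate_win.bat",
        "-B",
        "-j", str(num_threads),              # NumThreads
        f"NUM_FRAGMENTS={num_fragments}",    # NumFragments
    ]
    if mode is not None:
        # For example mode="none", as in the second command of this document.
        cmd.append(f"MODE={mode}")
    return cmd


if __name__ == "__main__":
    print(" ".join(generate_command(24, 32)))
    print(" ".join(generate_command(24, 24, mode="none")))
```

The returned list can be joined into a single string for display, as done here, or handed to whatever process launcher your build tooling already uses.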
Single-threaded builds are the best choice if you have a limited amount of memory, and increasing NumFragments
can decrease peak memory consumption even more.
NumThreads | NumFragments | Build Time | Comment |
---|---|---|---|
1 | 1 | 20 min 40 sec | |
1 | 2 | 16 min 46 sec | |
1 | 4 | 14 min 47 sec | |
1 | 5 | 14 min 45 sec | |
1 | 6 | 14 min 37 sec | Fastest |
1 | 8 | 16 min 36 sec | |
For some reason, splitting the bindings into 6 fragments gives the best single-threaded performance, even though the system is not limited by memory.
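The comparison in the single-threaded table can be reproduced with a short Python sketch that converts the "N min M sec" timings to seconds and picks the fastest fragment count. The timings are copied from the table above; the helper names `parse_build_time`, `to_seconds` and `fastest_fragments` are hypothetical and exist only for this illustration.

```python
import re
from typing import Dict

# Single-threaded Release timings, copied from the table above
# (key: NumFragments, value: build time as written in the table).
SINGLE_THREADED = {
    1: "20 min 40 sec",
    2: "16 min 46 sec",
    4: "14 min 47 sec",
    5: "14 min 45 sec",
    6: "14 min 37 sec",
    8: "16 min 36 sec",
}


def parse_build_time(text: str) -> int:
    """Convert a string like '14 min 37 sec' to a number of seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*min\s+(\d+)\s*sec\s*", text)
    if match is None:
        raise ValueError(f"unrecognized build time: {text!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


def to_seconds(table: Dict[int, str]) -> Dict[int, int]:
    """Map each fragment count to its build time in seconds."""
    return {fragments: parse_build_time(t) for fragments, t in table.items()}


def fastest_fragments(table: Dict[int, str]) -> int:
    """Return the fragment count with the shortest build time."""
    seconds = to_seconds(table)
    return min(seconds, key=seconds.get)


if __name__ == "__main__":
    seconds = to_seconds(SINGLE_THREADED)
    baseline = seconds[1]
    for fragments, secs in seconds.items():
        print(f"{fragments} fragments: {secs} s, "
              f"{baseline / secs:.2f}x relative to 1 fragment")
    print("fastest:", fastest_fragments(SINGLE_THREADED), "fragments")
```

With these numbers, 6 fragments take 877 seconds against 1240 seconds for a single fragment, which is roughly 1.4 times faster.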
Multi-threaded builds are much faster, but their peak memory consumption grows proportionally to the number of threads compared to single-threaded builds with the same number of fragments.
NumThreads | NumFragments | Build Time | Comment |
---|---|---|---|
4 | 4 | 6 min 6 sec | |
6 | 6 | 4 min 10 sec | |
8 | 8 | 4 min 1 sec | |
24 | 24 | 3 min 20 sec | |
24 | 28 | 3 min 15 sec | |
24 | 32 | 3 min 3 sec | Fastest |
24 | 36 | 3 min 6 sec | |
24 | 40 | 3 min 6 sec | |
24 | 48 | 3 min 10 sec | |
32 | 32 | 3 min 9 sec | |
The best build times are reached with the number of threads equal to the CPU concurrency (24 here) and a somewhat larger number of fragments (32 here).
The configuration built with MODE=none is much faster to build and can be used for monitoring in regular builds. The command is
scripts\mrbind\generate_win.bat -B -j NumThreads NUM_FRAGMENTS=NumFragments MODE=none
Time measurements:
NumThreads | NumFragments | Build Time | Comment |
---|---|---|---|
12 | 12 | 1 min 44 sec | |
16 | 16 | 1 min 41 sec | |
20 | 20 | 1 min 41 sec | |
24 | 24 | 1 min 36 sec | Fastest |
24 | 32 | 1 min 53 sec | |
26 | 26 | 1 min 41 sec | |
The best build times are reached with the number of threads equal to the CPU concurrency and the same number of fragments (unlike the Release configuration, where more fragments are preferable).
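The two conclusions above can be condensed into a small Python sketch that proposes starting values for NumThreads and NumFragments. It rests on assumptions that should be stated plainly: the function `suggest_build_parameters` is hypothetical, the 4/3 ratio is simply 32 fragments divided by 24 threads from the Release measurements on this one machine, and other hardware may well have a different optimum. It also ignores memory limits, so on a machine with little memory fewer threads may be needed.

```python
import os
from typing import Optional, Tuple


def suggest_build_parameters(mode_none: bool,
                             logical_cpus: Optional[int] = None) -> Tuple[int, int]:
    """Return (num_threads, num_fragments) as a starting point for tuning.

    Heuristic only, derived from the measurements in this document:
    - threads equal to the number of logical CPUs;
    - MODE=none: as many fragments as threads;
    - Release: about 4/3 as many fragments as threads (32 for 24 threads).
    """
    if logical_cpus is None:
        # os.cpu_count() may return None, so fall back to a single thread.
        logical_cpus = os.cpu_count() or 1
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be positive")

    threads = logical_cpus
    if mode_none:
        fragments = threads
    else:
        # Integer ceiling of threads * 4 / 3.
        fragments = (threads * 4 + 2) // 3
    return threads, fragments


if __name__ == "__main__":
    print("Release:  ", suggest_build_parameters(mode_none=False))
    print("MODE=none:", suggest_build_parameters(mode_none=True))
```

For the 24-thread machine used in the measurements, this returns 24 threads with 32 fragments for Release and 24 threads with 24 fragments for MODE=none, matching the fastest rows of the tables. Treat the result as a first guess and measure on your own hardware.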