-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modern multiprocessing for MCMC NEGFC, small fix to tutorial 05A #658
Conversation
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## master #658 +/- ##
==========================================
- Coverage 33.13% 23.77% -9.35%
==========================================
Files 87 87
Lines 16696 16720 +24
==========================================
- Hits 5531 3975 -1556
- Misses 11165 12745 +1580 ☔ View full report in Codecov by Sentry. |
Thanks for the changes @IainHammond. I think it'd still better for the user to be able to provide the number of CPUs to use for the MCMC. In many scenarios, either the user has to share the machine with others, or may want to run other jobs on their machine simultaneously, so would not want all available CPUs to be used. Can you adapt this? |
good call - provided nproc to Pool in the latest commit (8d0ab52) |
Thanks @IainHammond ! |
So emcee has been locked into version 2.2.1 for some time, which was release on 15 July 2016. The reason was likely to be due to the change in multiprocessing behaviour in version 3.0 -> with the removal of the
threads
argument in favour ofpool
inmultiprocessing
.From https://emcee.readthedocs.io/en/stable/user/upgrade/
"The
threads
keyword argument has been removed in favor of thepool
interface (see the Parallelization tutorial for more information). The old interface had issues with memory consumption and hanging processes when the sampler object wasn’t explicitly deleted. Thepool
interface has been supported since the first release of emcee and existing code should be easy to update following the Parallelization tutorial."This older version was indeed causing some occasional problems (at least for me) such as memory accumulating or the code seemingly stopping. To overcome this I've jumped us up to emcee 3.1.6 and made all of the recommended changes to the call for
emcee.EnsembleSampler
provided in the docs. It works, and it produces the same results. Is it faster? A bit. The user doesn't need to manually assign the number of processors anymore either as python automatically determines what is available. Themultiprocessing
start method is forkserver by default as it was running about 10% faster than fork. If it's unavailable, spawn is used instead so that Windows is supported (albeit spawn is about 20% slower than forkserver). Finally, a newmoves
keyword replaces thea
keyword, andsampler.lnprobability
is nowsampler.get_log_prob()
.The implication of this is that VIP technically needs emcee 3.0 -> from now on and requirements.txt has been updated to remove the version lock. In the long term this is much better, but will cause an error if a user doesnt update emcee following this pull request.
Random things: