-
Notifications
You must be signed in to change notification settings - Fork 6
Better cache control #259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I cannot comment deeply, because I have not looked into this in detail yet myself.
We found this necessary to clear up the memory. |
Regarding the bigdataviewer-playground/src/main/java/sc/fiji/bdvpg/scijava/services/SourceAndConverterService.java Lines 475 to 481 in da133be
However the imageloaders need to implement closeable, and this does not seem to be the case... I'll make a PR. |
Where does the |
PR done here: bigdataviewer/bigdataviewer-core#142 (Closeable is |
Note that several |
Yes, it's already taken care of in the lines before (bdv - pg stores which sources are attached to a spimdataset): bigdataviewer-playground/src/main/java/sc/fiji/bdvpg/scijava/services/SourceAndConverterService.java Lines 464 to 483 in da133be
|
OK, how would I use this in MoBIE? There we open the SACs ourselves, not via bdv-pl... |
You can with this function: bigdataviewer-playground/src/main/java/sc/fiji/bdvpg/scijava/services/SourceAndConverterService.java Lines 544 to 546 in 538ed29
|
There's now a global cache and globalcacheloader objects which are used by resampled sources and by every image loader which contains a All relevant classes are in the package Also : the cache is weighted by the memory thanks to this function: bigdataviewer-playground/src/main/java/sc/fiji/bdvpg/cache/AbstractGlobalCache.java Lines 119 to 146 in ddac2e1
|
@NicoKiaru Does this work now? |
What's your imageloader ? I can tell you if it will work or not ? If, in your image loader there is a field named If your image loader implements |
Thanks @NicoKiaru ! That would be the most relevant right now: https://github.com/mobie/mobie-io/blob/main/src/main/java/org/embl/mobie/io/ome/zarr/loaders/N5OMEZarrImageLoader.java But long term we may want to switch to opening images like this: https://github.com/mobie/mobie-io/blob/3fd0180fa013141dc5d8370dd365794904bfaf27/src/main/java/org/embl/mobie/io/ome/zarr/hackathon/MultiscaleImage.java#L154 |
Looks ok for the cache. But it does not implement Closeable. Since there already is a close function, just add This can directly use the GlobalCache I've implemented (see below) The issue I see with MultiscaleImage is that you use Here the cache mechanism is hidden into N5Utils. In its current form it won't be optimal because the cache size is handled at a per source level. Let's say you have 15 Gb of RAM, and a hundred sources of 10 Gb each: You can use unbounded soft refs everywhere:
You can use a bounded number of refs:
What you would want ideally, is to set a total amount of memory that can be used by the cache of all sources, and when the max is reached, you want to remove the least useful data first, to let some space for new things. And if the limit is global for all sources, you do not need to care whether one source occupies all the RAM, or if many sources are each using a fraction of the total memory. That's the problem I solved here: bigdataviewer-playground/src/main/java/sc/fiji/bdvpg/scijava/services/SourceAndConverterService.java Lines 307 to 343 in 698cccf
I'm replacing the cache field (through reflection) of SpimData with a The global loader cache is here: And it can use a BoundedHashMap or a Caffeine cache backend. Also importantly, it measures the memory size of each cell which is added in the cache, assuming all objects are instances of bigdataviewer-playground/src/main/java/sc/fiji/bdvpg/cache/AbstractGlobalCache.java Lines 119 to 146 in 698cccf
So if you have a source with little cells (32x32x32) and other sources with big cells (1024x1024x32), the cache will be know which memory size is occupied. There's one limitation though: it only works with read-only cache. If there was a way to specify the LoaderCache to use in N5Utils, then the problem could be solved this way. |
The goal
I'd like to improve Bdv-Playground and make it capable of handling really big data with low RAM.
Why it's not ideal currently
It's doing ok, but it sometimes fails because some cached sources have a
SoftRef
policy which is not clearing the memory early enough. When the memory is nearly full, the JVM then spends most of its time in GC calls, and this makes the GUI not responsive and me pretty annoyed. Soft references do not seem to be cleared fast enough in this case, and also, part of the problem comes from the absence of order guarantee when removing cached data : the 'cell' which has been accessed the latest is not necessarily the first removed (see extract from https://www.baeldung.com/java-soft-references):Where soft references are generated in bdv playground ?
I think the main issue in bdv playground comes from the class RAIHelper which has a softref policy. RAIHelper is used by default in bdv-playground for all cached sources which are not coming from a SpimDataset (Resampled sources, Fused sources).
How to solve the issue ?
1. Quick fix
I could probably just put a
BoundSoftRef
policy with a fixed amount of cells in RAIHelper, but one of the design problem I have is that I make a cache for each 3D RAI (one cache per resolution level and per timepoint). So if I have 50 timepoints and 5 resolution levels, I create 250 different caches. If I put a bound of 10 cells for each cache, I have in fact a max bound of 2500 cells.2. Better fix
2.a. A cache for each source
It would be better to have at least a single cache for each Source. Like in the AbstractSpimSource implementation, a key of the cache should contain all the information to access a cached cell (timepoint, level, gridlocation). But I'm a bit lost in all the cache API (cache loader, cache, access flags...)
2.b. A global cache for all SourceAndConverter registered in bdv-playground
I was wondering whether it was possible to implement a single cache that would contain all the cached cells from all bdv-playground sources. Basically a key of this cache would identify:
This would allow to invalidate all caches at once if necessary and programmatically. Also, a bounded soft ref policy could be applied all at once and act upon all sources.
There is a potential issue memory leak issue, but I think it should be ok: if a SourceAndConverter object is kept in memory because of its remaining cached cells, the order guarantee on BoundedSoftRef should sooner or later manage to get rid of the remaining cells. Also, when a Source is removed from Bdv-Playground, we could make sure to invalidate its cached cells from the global cache.
So : do you think 2.b. is a good option ? If yes, I'll probably need a bit of help to start it, because as I wrote, I'm a bit lost in the API.
Optionally:
ping @tischi @tpietzsch
The text was updated successfully, but these errors were encountered: