Limit the cluster cache by memory instead of number of clusters #956
@@ -43,10 +43,38 @@
 #include <cstddef>
 #include <stdexcept>
 #include <cassert>
+#include <iostream>

 namespace zim {

+struct UnitCostEstimation {
+  template<typename value_t>
+  static size_t cost(const value_t& value) {
+    return 1;
+  }
+};
+
+/**
+ * A lru cache where the cost of each item can be different from 1.
+ *
+ * Most lru caches are limited by the number of items they store.
+ * This implementation may associate a different "cost" with each item, so the current
+ * size of the lru is not the number of items but the sum of all the items' costs.
+ *
+ * The implementation used here is pretty simple (dumb) and has a few limitations:
+ * - We assume that the cost of an item does not change over time. In particular, the
+ *   cost of an item when we put it MUST be equal to the cost of the same item when we drop it.
+ * - Cache eviction is still Least Recently Used (LRU), so we drop the least recently used
+ *   item(s) until we have enough space. No other consideration is used to select which item to drop.
+ *
+ * This lru is parametrized by a CostEstimation type. The type must have a static method `cost`
+ * taking a reference to a `value_t` and returning its "cost". As already said, this method must
+ * always return the same cost for the same value.
+ *
+ * While the cost could be any kind of value, this implementation is intended to be used only with
+ * `UnitCostEstimation` (classic lru) and `FutureToValueCostEstimation<ClusterMemorySize>`.
+ */
-template<typename key_t, typename value_t>
+template<typename key_t, typename value_t, typename CostEstimation>
 class lru_cache {

Comment on lines +78 to 79:
This is a serious enhancement that is difficult to do right. LRU is about heuristics for keeping items in the cache based on the order in which they have been used. Introducing cost adds an important dimension to consider: when facing the choice of dropping an expensive old item or a cheap not-so-old one from the cache, which one is more justified? Or the other way around: it may be preferable to get rid of a heavy item that is not the first candidate for removal according to the LRU heuristics, since that will make room in the cache for multiple lightweight items.

 public: // types
  typedef typename std::pair<key_t, value_t> key_value_pair_t;
@@ -81,9 +109,10 @@ class lru_cache {
  };

 public: // functions
-  explicit lru_cache(size_t max_size) :
-    _max_size(max_size) {
-  }
+  explicit lru_cache(size_t max_cost) :
+    _max_cost(max_cost),
+    _current_cost(0)
+  {}

  // If 'key' is present in the cache, returns the associated value,
  // otherwise puts the given value into the cache (and returns it with
@@ -103,6 +132,8 @@ class lru_cache {
    auto it = _cache_items_map.find(key);
    if (it != _cache_items_map.end()) {
      _cache_items_list.splice(_cache_items_list.begin(), _cache_items_list, it->second);
+     decreaseCost(CostEstimation::cost(it->second->second));
+     increaseCost(CostEstimation::cost(value));
      it->second->second = value;
    } else {
      putMissing(key, value);
@@ -120,37 +151,72 @@ class lru_cache {
    }

    bool drop(const key_t& key) {
+     list_iterator_t list_it;
      try {
-       auto list_it = _cache_items_map.at(key);
-       _cache_items_list.erase(list_it);
-       _cache_items_map.erase(key);
-       return true;
+       list_it = _cache_items_map.at(key);
      } catch (std::out_of_range& e) {
        return false;
      }
+     decreaseCost(CostEstimation::cost(list_it->second));
+     _cache_items_list.erase(list_it);
+     _cache_items_map.erase(key);
+     return true;
    }

    bool exists(const key_t& key) const {
      return _cache_items_map.find(key) != _cache_items_map.end();
    }

-   size_t size() const {
-     return _cache_items_map.size();
+   size_t cost() const {
+     return _current_cost;
    }

-   size_t getMaxSize() const {
-     return _max_size;
+   size_t getMaxCost() const {
+     return _max_cost;
    }

-   void setMaxSize(size_t newSize) {
-     while (newSize < this->size()) {
+   void setMaxCost(size_t newMaxCost) {
+     while (newMaxCost < this->cost()) {
        dropLast();
      }
-     _max_size = newSize;
+     _max_cost = newMaxCost;
    }

  protected:

+   void increaseCost(size_t extra_cost) {
+     // increaseCost is called after we have added a value to the cache, to update
+     // the current total cost of the cache.
+     // We must ensure that we don't drop the value we just added.
+     // While it is technically ok to keep no value at all if the max cache cost is 0
+     // (or the memory limit is smaller than the size of one cluster), that would force
+     // us to recreate the value all the time.
+     // Let's be nice to our users and be tolerant of misconfiguration.
+     if (!extra_cost) {
+       // Don't try to remove an item if the extra cost is 0.
+       // This is the case when the concurrent cache adds a future without a value.
+       // We will handle the real cost increase when the concurrent cache calls us directly.
+       return;
+     }
+     _current_cost += extra_cost;
+     while (_current_cost > _max_cost && size() > 1) {
+       dropLast();
+     }
+   }
+
+   void decreaseCost(size_t costToRemove) {
+     if (costToRemove > _current_cost) {
+       std::cerr << "WARNING: We have detected inconsistent cache management, trying to remove " << costToRemove << " from a cache with size " << _current_cost << std::endl;
+       std::cerr << "Please open an issue on https://github.com/openzim/libzim/issues with this message and the zim file you use" << std::endl;
+       _current_cost = 0;
+     } else {
+       _current_cost -= costToRemove;
+     }
+   }
+
  private: // functions
    void dropLast() {
+     auto list_it = _cache_items_list.back();
+     decreaseCost(CostEstimation::cost(list_it.second));
      _cache_items_map.erase(_cache_items_list.back().first);
      _cache_items_list.pop_back();
    }
@@ -159,15 +225,19 @@ class lru_cache {
    assert(_cache_items_map.find(key) == _cache_items_map.end());
    _cache_items_list.push_front(key_value_pair_t(key, value));
    _cache_items_map[key] = _cache_items_list.begin();
-   if (_cache_items_map.size() > _max_size) {
-     dropLast();
-   }
+   increaseCost(CostEstimation::cost(value));
  }

+ size_t size() const {
+   return _cache_items_map.size();
+ }
+
 private: // data
  std::list<key_value_pair_t> _cache_items_list;
  std::map<key_t, list_iterator_t> _cache_items_map;
- size_t _max_size;
+ size_t _max_cost;
+ size_t _current_cost;
 };

} // namespace zim
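For illustration, here is a minimal standalone sketch of the CostEstimation contract described in the comment above. This is toy code, not part of libzim, and all names are invented: string values are charged by their length, and eviction walks the LRU tail until the budget is met while always keeping at least one entry, mirroring the size() > 1 guard in increaseCost().

#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <utility>

// Toy cost policy: charge a string by its length.
struct StringLengthCost {
  static size_t cost(const std::string& v) { return v.size(); }
};

// Toy cost-limited LRU (illustration only, not libzim's lru_cache).
template<typename key_t, typename value_t, typename CostEstimation>
class toy_cost_lru {
  typedef std::pair<key_t, value_t> pair_t;
  std::list<pair_t> items_;                                   // front = most recently used
  std::map<key_t, typename std::list<pair_t>::iterator> index_;
  size_t max_cost_;
  size_t current_cost_;
public:
  explicit toy_cost_lru(size_t max_cost) : max_cost_(max_cost), current_cost_(0) {}

  void put(const key_t& k, const value_t& v) {
    drop(k);                                                  // keep the sketch simple: re-insert on update
    items_.push_front(pair_t(k, v));
    index_[k] = items_.begin();
    current_cost_ += CostEstimation::cost(v);
    while (current_cost_ > max_cost_ && items_.size() > 1) {  // evict from the LRU tail
      current_cost_ -= CostEstimation::cost(items_.back().second);
      index_.erase(items_.back().first);
      items_.pop_back();
    }
  }

  bool drop(const key_t& k) {
    auto it = index_.find(k);
    if (it == index_.end()) return false;
    current_cost_ -= CostEstimation::cost(it->second->second);
    items_.erase(it->second);
    index_.erase(it);
    return true;
  }

  size_t cost() const { return current_cost_; }
  size_t size() const { return items_.size(); }
};

int main() {
  toy_cost_lru<int, std::string, StringLengthCost> cache(10); // budget of 10 "bytes"
  cache.put(1, "aaaa");                                       // total cost 4
  cache.put(2, "bbbb");                                       // total cost 8
  cache.put(3, "cccc");                                       // 12 > 10, so key 1 is evicted
  assert(cache.size() == 2 && cache.cost() == 8);
  return 0;
}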
I don't like such subtle pitfalls here and there (they may turn into time bombs). This makes me come up with the following question and proposal: is memory usage accounting going to expand beyond clusters? If not, then why don't we implement item cost tracking at the ConcurrentCache level, leaving lru_cache absolutely intact?
I don't know. The main purpose of all this is to avoid exaggerated memory consumption for a long-running kiwix-serve with a "lot" (not defined, relative to the system) of zim files to serve.
We have assumed it is the cluster cache that takes all this memory, mainly because we currently have a limit per zim file (so a lot of zim files make the cache mostly unlimited) and this cache should be more memory consuming than the dirent cache.
So there is no current plan to extend this limitation to anything else for now, but we may if we find our assumption was wrong.
Your idea seems good at first glance. However, thinking twice, here are a few limitations (not necessarily blockers): ConcurrentCache would need its put/putMissing and drop methods to drop the clusters from the cache itself and track the allocated memory. In the end, it would simply move everything about memory consumption a level up (into ConcurrentCache) and make it call lru_cache to drop an item even if its internal limit is not reached.
It would indeed leave lru_cache almost unmodified, but I'm not sure it would be worth it.
(And I don't see a reason why we would want to keep lru_cache clean of memory limitation but accept the modifications in ConcurrentCache.)
The main benefit would be avoiding design quirks. ConcurrentCache is a relatively lightweight wrapper around lru_cache (without any is-a relation between them), and its public API related to memory usage management consists of just three member functions. Maintaining memory limits entirely within that small set of functions can be achieved with fewer changes. It also simplifies the implementation, since the cost of an item is evaluated only once (on addition) and is stored with the item.
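For illustration, a hypothetical sketch of that last point, with the cost stored next to the value so it is computed exactly once at insertion time (the CostedValue name and shape are invented; the real ConcurrentCache API is not shown in this excerpt):

#include <cstddef>
#include <utility>

// Illustrative only: a wrapper-level cache could store CostedValue<Cluster, ...>
// instead of the bare value, so eviction and drop never re-run the cost estimation.
template<typename value_t, typename CostEstimation>
struct CostedValue {
  value_t value;
  size_t cost;   // evaluated exactly once, when the item is added

  explicit CostedValue(value_t v)
    : value(std::move(v)),
      cost(CostEstimation::cost(value)) {}
};

The inner cache could then keep treating every item as unit cost, while the wrapper sums the stored costs and enforces the memory limit.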
But ConcurrentCache is a thin wrapper to make lru_cache thread safe. Adding a memory limitation to it seems to me to be a quirk we should avoid.
I've started implementing your idea to see if I was wrong. Here are a few things I have faced: lru_cache::dropLast must return the dropped item. However, the current drop implementation returns false if the key is not found. As we want to return the value and can't use std::optional (we limit ourselves to C++14), we would have to throw std::out_of_range instead of returning false, and so all users of drop have to catch it.
Anyway, the last commit implements your idea. We can discuss the real implementation.
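A sketch of that dropLast point under the C++14 constraint (hypothetical code, not the actual patch): without std::optional, a dropLast() that must hand the evicted item back to the caller can only signal an empty cache by throwing.

#include <list>
#include <map>
#include <stdexcept>
#include <utility>

// Hypothetical sketch: a dropLast() that returns the evicted (key, value) pair
// so a wrapper can subtract its cost. With C++14 and no std::optional,
// "nothing to drop" has to be reported by throwing.
template<typename key_t, typename value_t>
class lru_list_sketch {
  typedef std::pair<key_t, value_t> pair_t;
  std::list<pair_t> items_;                                   // front = most recently used
  std::map<key_t, typename std::list<pair_t>::iterator> index_;
public:
  void push_front(const key_t& k, const value_t& v) {
    items_.push_front(pair_t(k, v));
    index_[k] = items_.begin();
  }

  pair_t dropLast() {
    if (items_.empty())
      throw std::out_of_range("dropLast() called on an empty cache");
    pair_t evicted = items_.back();
    index_.erase(evicted.first);
    items_.pop_back();
    return evicted;                                           // caller can inspect the evicted value
  }
};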
Indeed, it was more complicated than I thought, but it helped to surface a potential problem that was present with the old implementation too. I felt that something was wrong, but I thought that it was just an uncomfortable feeling caused by the arcane interplay between lru_cache and ConcurrentCache. Now, with the cost accounting fully done in a single class, the problem was easier to spot.
Consider working with two ZIM files, one on a (very) slow drive and the other on a fast drive. A cluster from the slow ZIM file is requested. A placeholder for it is created in the cache, and while the cluster is loading, a burst of activity on the fast ZIM file fills the cache with entries from the fast ZIM file and drops the placeholder entry of the slow ZIM file. Since the cost of that placeholder entry is not known, a fake value of 0 is used instead (so the removal of the entry has no effect and leaves no trace). Then loading of the slow cluster completes and the cache cost is increased by its actual memory usage, but that item is no longer in the cache! Repetition of that scenario can lead to the "fake" current cost of the cache being above the limit even though the cache is empty!
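A standalone numerical illustration of that drift (toy code with assumed semantics, not libzim's ConcurrentCache): the placeholder is recorded with cost 0, evicted before its real cost is known, and the later cost increase then charges the cache for an item it no longer holds.

#include <cassert>
#include <cstddef>
#include <iostream>
#include <map>

int main() {
  std::map<int, size_t> cache;       // cluster index -> cost recorded at insertion
  size_t current_cost = 0;

  cache[42] = 0;                     // placeholder (future) for the slow cluster: cost unknown, recorded as 0
  current_cost += cache[42];

  // A burst of activity on the fast ZIM file evicts the placeholder;
  // removing a cost-0 entry leaves no trace in the accounting.
  current_cost -= cache[42];
  cache.erase(42);

  // The slow cluster finally loads and its real cost is added...
  const size_t real_cost = 4096;
  current_cost += real_cost;         // ...but the item is no longer in the cache.

  assert(cache.empty());
  std::cout << "phantom cost carried by an empty cache: " << current_cost << "\n"; // prints 4096
  return 0;
}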
The same scenario can occur more naturally: a cache miss, followed by a large enough number of cache hits (so that the cache-miss entry is pushed to the end of the LRU queue), followed by another cache miss (occurring before the first cache miss is fully resolved), will result in the first cache miss being evicted from the cache before it was even properly recorded, and the cache memory usage balance will be off by the cost of that ill-fated cluster.
The last commit should fix this point. (I have removed the "WIP" commit.)
I think it doesn't.