-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IVF_PQ re-ranking #502
IVF_PQ re-ranking #502
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work on implementing this!
Can we also get the benchmark results for SIFT-1M or SIFT-10M and S3 storage? This will give us a good idea of whether we need more code optimizations on the reranking part of the code.
src/include/index/ivf_pq_index.h
Outdated
std::move(initial_distances), std::move(initial_ids)); | ||
} | ||
|
||
if (::num_vectors(feature_vectors_) == 0 && !group_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also load all feature_vectors_
for the infinite memory case during the constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep so that will happen (we hit that code in the second call to rerank_query()
in rerank()
.
But this was just to cover the case where you create an empty index and then try to query with re-ranking. But I realized we actually don't need to throw here, so updated by moving the check for group_
below and only reading from the URI if we have the group. Also added a test for this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry but I still don't understand where we load the full feature_vectors_
array. Shouldn't this be inside read_index_infinite
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, oops, I thought you meant loading feature_vectors_
during add()
. You are correct, we do not load the full feature_vectors_
anywhere except during add()
.
In the constructor we don't know if we're opening for infinite or finite queries, that is per-query. We could perhaps load the full feature_vectors_
during the call to query_infinite()
(which calls read_index_infinite()
), but then we will perhaps be loading way more vectors than we actually need to, and than we would if we just loaded them in rerank()
. The benefit though would be we don't need to do any load during the second call to query_infinite()
, whereas today we do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are multiple options here, so maybe add some query/constructor parameters to make sure that we can leverage all of them and understand what happens in each case. For example, for small arrays and infinite memory, it would make sense to load the full feature_vectors_
array once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked offline - will do this along with moving memory_budget
into the index constructor in a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a TODO or shortcut for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Started on it here! #518
Yep, will do, just opened up #514 which will make this comparison easy. Would prefer to not block this PR on that though, if it's okay with you? The |
As this is new functionality mostly relevant for larger datasets stored in object stores, I would prefer to see the performance of it to understand the current state. And since #514 is ready I think we should try to produce it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to perform more benchmarks/optimizations to make sure that IVF_PQ
is materially better than IVF_FLAT
for large objectstore-based datasets and OOC query execution. However, I am ok to merge this as a first version of reranking and follow up in a more specific PR.
What
Adds support for re-ranking to IVF_PQ. This is controlled by
k_factor
. We first query fork_factor * k
vector IDs and indices, then fetch the original (not PQ-encoded) vectors and re-rank those.Testing
Benchmarks
Here are the results with
local-benchmarks
SIFT 1 million. Note that this is with local benchmarks so doesn't include network latency. But for the everything is local case we get a nice accuracy improvement with a fairly small speed penality: