Update IVF_PQ to set memory_budget in constructor, support preload feature_vectors and metadata only modes #518

Merged
jparismorgan merged 20 commits into main from jparismorgan/ivf-pq-memory-budget-to-ctor
Sep 16, 2024

Conversation

jparismorgan

@jparismorgan jparismorgan commented Sep 6, 2024

What

Makes three changes to IVF_PQ:

  1. Update IVF_PQ to set memory_budget in the constructor instead of during query. This matches what Python does, and it means the first infinite RAM query is no longer slower than the rest.
  2. Support a new mode where we preload feature_vectors for use in re-ranking. This is useful for smaller indexes, where we can load all the data we need up front (like faiss does).
  3. Add a metadata-only open mode, so that if the user sets open_for_remote_query_execution in Python we load only metadata in C++, not data.
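
To illustrate the difference between loading at query time and loading in the constructor, here is a minimal toy sketch. It is not the IVF_PQ implementation: `LazyIndex`, `EagerIndex`, `slow_loader`, and the `metadata_only` flag are hypothetical names invented for this example, and the "index" is just a list of numbers.

```python
def slow_loader():
    # Hypothetical stand-in for an expensive read from remote storage.
    return [1.0, 4.0, 9.0]


class LazyIndex:
    """Loads data on the first query, so the first query is the slow one."""

    def __init__(self, loader):
        self._loader = loader
        self._data = None
        self.load_count = 0

    def query(self, x):
        if self._data is None:  # only the first query pays the loading cost
            self._data = self._loader()
            self.load_count += 1
        return min(self._data, key=lambda v: abs(v - x))


class EagerIndex:
    """Loads data in the constructor, so every query costs the same."""

    def __init__(self, loader, metadata_only=False):
        self._data = None
        self.load_count = 0
        if not metadata_only:  # metadata-only mode skips the data load entirely
            self._data = loader()
            self.load_count += 1

    def query(self, x):
        if self._data is None:
            raise RuntimeError("index was opened in metadata-only mode")
        return min(self._data, key=lambda v: abs(v - x))
```

With `EagerIndex`, the loading cost moves into construction, and a metadata-only instance never touches the data at all, which is the intent when queries will be executed remotely.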

Also updates ivf_pq_index_test because it is flaky (here is a PR without these changes failing: https://github.com/TileDB-Inc/TileDB-Vector-Search/actions/runs/10809242895/job/29983933038?pr=520).

Testing

  • Updates tests to create infinite RAM and finite RAM indexes and make sure the results are the same.
  • Existing tests pass.

Benchmarks

Before this, the first IVF_PQ query with query_infinite_ram() when the index URI is a remote AWS URI would be slower than all the rest (because we first had to load the data with read_index_infinite()):

[Figure: query_time_vs_accuracy]

After this change, we instead get this (notice that with k_factor=1 each query is the same speed, but for others it is not because we are fetching the feature_vectors for re-ranking on each query):

[Figure: query_time_vs_accuracy]
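
For readers unfamiliar with why k_factor > 1 needs the feature_vectors, here is a rough sketch of re-ranking. It assumes the common pattern of picking k * k_factor candidates by approximate distance and then re-sorting them by exact distance. The `quantize` function is a crude stand-in for PQ encoding, and all names here are hypothetical, not the library's API.

```python
import math


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(v, step=1.0):
    # Crude stand-in for PQ encoding: snap each coordinate to a grid.
    return [round(x / step) * step for x in v]


def search(query, vectors, k, k_factor=1.0, step=1.0):
    # A real index would store the compressed vectors; we recompute them here.
    approx = [quantize(v, step) for v in vectors]

    # Step 1: pick k * k_factor candidates using approximate distances only.
    num_candidates = min(len(vectors), math.ceil(k * k_factor))
    candidates = sorted(
        range(len(vectors)), key=lambda i: squared_l2(query, approx[i])
    )[:num_candidates]

    # Step 2: with k_factor > 1, re-rank candidates by exact distance.
    # This is the step that needs the original feature vectors.
    if k_factor > 1:
        candidates.sort(key=lambda i: squared_l2(query, vectors[i]))

    return candidates[:k]


vectors = [[0.6], [0.1], [3.0]]
query = [0.4]
no_rerank = search(query, vectors, k=1, k_factor=1)
with_rerank = search(query, vectors, k=1, k_factor=2)
```

In this toy data the approximate distances rank vector 1 first, while the true nearest neighbour is vector 0, so only the re-ranked search finds it. The cost is that step 2 has to read the original vectors, either on each query or once up front if they are preloaded.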

And if we use index = IVFPQIndex(index_uri, config, preload_k_factor_vectors=True), then we get this (notice that we have pre-fetched the feature_vectors, so there is no query time difference):

[Figure: query_time_vs_accuracy]

The code to run this is below:

def benchmark_ivf_pq():
    index_type = "IVF_PQ"
    timer = timer_manager.new_timer(index_type)

    k = 100
    queries = load_fvecs(SIFT_QUERIES_PATH)
    dimensions = queries.shape[1]
    gt_i, gt_d = get_groundtruth_ivec(SIFT_GROUNDTRUTH_PATH, k=k, nqueries=len(queries))

    for partitions in [200]:
        for num_subspaces in [dimensions // 4]:
            for k_factor in [1, 1.5, 2, 4, 8, 16]:
                tag = f"{index_type}_partitions={partitions}_num_subspaces={num_subspaces}_k_factor={k_factor}"
                logger.info(f"Running {tag}")

                index_uri = get_uri(tag)

                timer.start(tag, TimerMode.INGESTION)
                ingest(
                    index_type=index_type,
                    index_uri=index_uri,
                    source_uri=SIFT_BASE_PATH,
                    config=config,
                    partitions=partitions,
                    training_sampling_policy=TrainingSamplingPolicy.RANDOM,
                    num_subspaces=num_subspaces,
                )
                ingest_time = timer.stop(tag, TimerMode.INGESTION)

                # The index returned by ingest() automatically has memory_budget=1000000 set. Open
                # a fresh index so it's clear what config is being used.
                index = IVFPQIndex(index_uri, config)

                for nprobe in [10, 10, 15, 20, 25, 30]:
                    timer.start(tag, TimerMode.QUERY)
                    _, result = index.query(
                        queries, k=k, nprobe=nprobe, k_factor=k_factor
                    )
                    query_time = timer.stop(tag, TimerMode.QUERY)
                    acc = timer.accuracy(tag, accuracy(result, gt_i))
                    logger.info(
                        f"Finished {tag} with nprobe={nprobe}. Ingestion: {ingest_time:.4f}s. Query: {query_time:.4f}s. Accuracy: {acc:.4f}."
                    )

                cleanup_uri(index_uri)

    timer.save_and_print_results()
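
The benchmark scores each run with accuracy(result, gt_i), whose implementation is not shown in this PR. A minimal sketch, assuming it measures recall@k (the fraction of ground-truth neighbours that appear in the returned ids); `recall_at_k` is a hypothetical name and may not match the real helper:

```python
def recall_at_k(result_ids, groundtruth_ids, k):
    """Fraction of the true top-k neighbours found, averaged over all queries."""
    total_hits = 0
    for found, truth in zip(result_ids, groundtruth_ids):
        # Order within the top k does not matter, only membership.
        total_hits += len(set(found[:k]) & set(truth[:k]))
    return total_hits / (k * len(groundtruth_ids))


# Two queries, k=3: the first finds 2 of 3 true neighbours, the second all 3.
example_recall = recall_at_k([[1, 2, 3], [4, 5, 6]], [[1, 2, 9], [6, 5, 4]], k=3)
```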

@jparismorgan jparismorgan mentioned this pull request Sep 6, 2024
@jparismorgan jparismorgan marked this pull request as ready for review September 10, 2024 09:07
@jparismorgan

I'm going to force a merge here because the failures are unrelated to this PR and I'd like this PR in the upcoming release:

(209 durations < 0.005s hidden.  Use -vv to show these durations.)
=========================== short test summary info ============================
FAILED test/test_cloud.py::CloudTests::test_cloud_flat - ModuleNotFoundError: No module named 'tiledb.array'
FAILED test/test_cloud.py::CloudTests::test_cloud_ivf_flat - ModuleNotFoundError: No module named 'tiledb.array'
FAILED test/test_cloud.py::CloudTests::test_cloud_ivf_flat_random_sampling - ModuleNotFoundError: No module named 'tiledb.array'
=========== 3 failed, 94 passed, 657 warnings in 1233.55s (0:20:33) ============

@jparismorgan jparismorgan merged commit 35acf35 into main Sep 16, 2024
5 of 6 checks passed
@jparismorgan jparismorgan deleted the jparismorgan/ivf-pq-memory-budget-to-ctor branch September 16, 2024 14:04