The course provides a comprehensive guide to optimizing retrieval systems in large-scale RAG applications. It covers tokenization, vector quantization, and search optimization techniques to enhance search quality, reduce memory usage, and balance performance in vector search systems.


Welcome to the "Retrieval Optimization: From Tokenization to Vector Quantization" course! 🎓 The course teaches you how to optimize vector search in large-scale customer-facing RAG applications.

📘 Course Summary

In this course, you’ll dive deep into tokenization and vector quantization techniques for optimizing search in large-scale Retrieval-Augmented Generation (RAG) systems. You’ll learn how different tokenization methods impact search quality and apply practical techniques to tune vector search performance.

What You’ll Learn:

  1. 🧠 Embedding Models and Tokenization: Understand the inner workings of embedding models and how text is transformed into vectors.
  2. 🔍 Tokenization Techniques: Explore tokenizers such as Byte-Pair Encoding, WordPiece, Unigram, and SentencePiece, and see how each affects search relevance.
  3. 🚀 Search Optimization: Learn to tackle common challenges such as terminology mismatches and truncated chunks in embedding models.
  4. 📊 Search Quality Metrics: Measure the quality of your search using various metrics and optimize search performance.
  5. ⚙️ HNSW Algorithm Tuning: Adjust Hierarchical Navigable Small World (HNSW) parameters to balance speed and relevance in vector search.
  6. 💾 Vector Quantization: Experiment with major quantization methods (product, scalar, and binary) and understand their impact on memory usage and search quality.
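To make the tokenizer-training idea in items 1–2 concrete, here is a minimal sketch of the Byte-Pair Encoding merge loop in pure Python. The toy corpus and the number of merges are illustrative assumptions, not course material; real tokenizers (e.g. in the Hugging Face `tokenizers` library) add pre-tokenization, special tokens, and far larger vocabularies:

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across the (word -> frequency) corpus."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    # On ties, the first pair encountered wins (Counter preserves insertion order).
    return max(pairs, key=pairs.get)

def merge_pair(words, pair):
    """Replace every occurrence of the pair with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: word frequencies, each word pre-split into characters.
corpus = {("l", "o", "w"): 5, ("l", "o", "w", "e", "r"): 2,
          ("n", "e", "w", "e", "s", "t"): 6, ("w", "i", "d", "e", "s", "t"): 3}

merges = []
for _ in range(5):
    pair = most_frequent_pair(corpus)
    merges.append(pair)
    corpus = merge_pair(corpus, pair)

print(merges)  # first merges: ('e', 's'), then ('es', 't'), then ('l', 'o'), ...
```

Each learned merge becomes a vocabulary entry; at inference time the same merges are replayed greedily, which is why rare or domain-specific terms fragment into many subwords and can hurt retrieval.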
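Item 6 can be illustrated with a small scalar-quantization sketch: each float32 component is mapped to an int8 code, cutting vector memory roughly 4x at the cost of a bounded rounding error. The vector values below are made up for illustration:

```python
def quantize(vector):
    """Map float components to int8 codes via min-max scaling."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255.0  # one step per int8 code; assumes hi > lo
    codes = [round((x - lo) / scale) - 128 for x in vector]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Approximately reconstruct the original floats from int8 codes."""
    return [(c + 128) * scale + lo for c in codes]

vec = [0.12, -0.45, 0.98, -0.87, 0.33, 0.05]
codes, lo, scale = quantize(vec)
approx = dequantize(codes, lo, scale)

# Rounding error per component is at most scale / 2.
max_err = max(abs(a - b) for a, b in zip(vec, approx))

# int8 codes take 1 byte each vs 4 bytes per float32: ~4x memory reduction.
print(codes, round(max_err, 4))
```

Product quantization goes further by splitting the vector into sub-vectors and replacing each with a codebook index, while binary quantization keeps only the sign bit of each component; both trade additional accuracy for additional compression.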

🔑 Key Points

  • 🧩 Tokenization in Large Models: Learn how tokenization works in large language models and how it affects search quality.
  • 🛠️ Training Tokenizers: Explore how Byte-Pair Encoding, WordPiece, and Unigram are trained and function in vector search.
  • 🔄 Search Optimization: Understand how to adjust HNSW parameters and vector quantizations to optimize your retrieval systems.
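As a concrete illustration of these knobs, here is a configuration sketch using the `qdrant-client` Python API, since the instructor works at Qdrant. The parameter values are illustrative assumptions, not recommendations, and the exact API may differ between client versions, so check the current Qdrant documentation:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, VectorParams, HnswConfigDiff, SearchParams,
    ScalarQuantization, ScalarQuantizationConfig, ScalarType,
)

client = QdrantClient(":memory:")  # local in-process mode, handy for experiments

client.create_collection(
    collection_name="docs",  # hypothetical collection name
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    # m: graph connectivity per node; ef_construct: build-time candidate list size.
    # Larger values improve recall but increase index size and build time.
    hnsw_config=HnswConfigDiff(m=16, ef_construct=200),
    # int8 scalar quantization cuts vector memory roughly 4x.
    quantization_config=ScalarQuantization(
        scalar=ScalarQuantizationConfig(type=ScalarType.INT8, always_ram=True)
    ),
)

# At query time, hnsw_ef trades latency for recall: higher = slower but more accurate.
# client.search(collection_name="docs", query_vector=[...],
#               search_params=SearchParams(hnsw_ef=128), limit=10)
```

The course walks through measuring how such settings shift the speed/recall trade-off with search quality metrics rather than fixing them blindly.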

👨‍🏫 About the Instructor

  • 👨‍💻 Kacper Łukawski: Developer Relations Lead at Qdrant, Kacper brings expertise in vector search optimization and teaches practical techniques to enhance search efficiency in RAG applications.

🔗 To enroll or learn more, visit 📚 deeplearning.ai.
