Welcome to the "Retrieval Optimization: From Tokenization to Vector Quantization" course! 🎓 The course teaches you how to optimize vector search in large-scale, customer-facing Retrieval-Augmented Generation (RAG) applications.
In this course, you’ll dive deep into tokenization and vector quantization. You’ll learn how different tokenization methods affect search quality and explore techniques for improving vector search performance.
What You’ll Learn:
- 🧠 Embedding Models and Tokenization: Understand the inner workings of embedding models and how text is transformed into vectors.
- 🔍 Tokenization Techniques: Explore several tokenization approaches, such as Byte-Pair Encoding, WordPiece, Unigram, and SentencePiece, and how they affect search relevance.
- 🚀 Search Optimization: Learn to tackle common challenges such as terminology mismatches and truncated chunks in embedding models.
- 📊 Search Quality Metrics: Measure the quality of your search using various metrics and optimize search performance.
- ⚙️ HNSW Algorithm Tuning: Adjust Hierarchical Navigable Small World (HNSW) parameters to balance speed and relevance in vector search.
- 💾 Vector Quantization: Experiment with major quantization methods (product, scalar, and binary) and understand their impact on memory usage and search quality.
- 🧩 Tokenization in Large Models: Learn how tokenization works in large language models and how it affects search quality.
- 🛠️ Training Tokenizers: Explore how Byte-Pair Encoding, WordPiece, and Unigram are trained and function in vector search.
- 🔄 Search Optimization: Understand how to adjust HNSW parameters and vector quantization to optimize your retrieval systems.
Instructor:
- 👨‍💻 Kacper Łukawski: Developer Relations Lead at Qdrant, Kacper brings expertise in vector search optimization and teaches practical techniques to enhance search efficiency in RAG applications.
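To give a feel for the memory trade-off behind the vector quantization topic listed above, here is a minimal pure-Python sketch of scalar and binary quantization. It is an illustration only, not course material or any vector database's implementation: the function names, the 128-dimension vector, and the assumed value range of [-1, 1] are all hypothetical choices made for this example.

```python
import random

def scalar_quantize(vec, lo, hi):
    """Map each float in [lo, hi] to an integer code 0..255 (one byte per dimension)."""
    scale = (hi - lo) / 255
    return bytes(round((min(max(x, lo), hi) - lo) / scale) for x in vec)

def scalar_dequantize(codes, lo, hi):
    """Approximate the original floats from their one-byte codes."""
    scale = (hi - lo) / 255
    return [lo + c * scale for c in codes]

def binary_quantize(vec):
    """Keep only the sign of each dimension, packing 8 dimensions into one byte."""
    out = bytearray((len(vec) + 7) // 8)
    for i, x in enumerate(vec):
        if x > 0:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)

random.seed(0)
dim = 128  # assumed embedding size, chosen only for illustration
vector = [random.uniform(-1.0, 1.0) for _ in range(dim)]

float32_bytes = dim * 4                      # storage if kept as 32-bit floats
scalar_codes = scalar_quantize(vector, -1.0, 1.0)
binary_codes = binary_quantize(vector)

# Reconstruction error shows what scalar quantization gives up for the savings.
restored = scalar_dequantize(scalar_codes, -1.0, 1.0)
max_error = max(abs(a - b) for a, b in zip(vector, restored))

print(float32_bytes, len(scalar_codes), len(binary_codes))  # 512 128 16
```

In this sketch, scalar quantization cuts storage to a quarter of the float32 size and binary quantization to one thirty-second, at the cost of progressively coarser approximations of the original vector.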
🔗 To enroll or learn more, visit 📚 deeplearning.ai.