Contextual search with vector search: exploring your options with open source tools - Olena Kutsenko

Learn about vector search implementation using open-source tools like Postgres, OpenSearch & Redis. Discover key metrics, search approaches & best practices for similarity searches.

Key takeaways

Vector search helps find similarities between objects by converting data into vectors in multi-dimensional space using machine learning models
Popular open-source databases with vector search support include:
- Postgres with pgvector
- OpenSearch
- ClickHouse
- Redis
Key metrics for comparing vectors:
- L2 (Euclidean) distance
- Cosine similarity
- Inner product
- L1 (Manhattan) distance
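To make the four metrics concrete, here is a minimal sketch in plain Python. The function names and the sample vectors are illustrative assumptions, not taken from the talk; in practice the database computes these for you.

```python
import math

def l2_distance(a, b):
    # Euclidean distance: straight-line distance between the two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    # Cosine of the angle between the vectors; ignores their magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def inner_product(a, b):
    # Dot product; sensitive to both direction and magnitude.
    return sum(x * y for x, y in zip(a, b))

def l1_distance(a, b):
    # Manhattan distance: sum of absolute per-dimension differences.
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical sample vectors: b points in the same direction as a but is twice as long.
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]

print(l2_distance(a, b))
print(cosine_similarity(a, b))
print(inner_product(a, b))
print(l1_distance(a, b))
```

Because b is a scaled copy of a, the cosine similarity is 1 (up to floating-point rounding) even though the L2 and L1 distances are non-zero. That is the kind of difference that makes it worth testing more than one metric.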
Two main search approaches:
- KNN (K-Nearest Neighbors) - exact but slower
- ANN (Approximate Nearest Neighbors) - faster, but may miss some true neighbors
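As a rough illustration of why exact KNN is slow, the sketch below compares the query against every stored vector. The `knn_search` helper and the toy data are assumptions made for this example, not an API of any of the tools above.

```python
import heapq
import math

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_search(query, vectors, k):
    """Exact k-nearest-neighbour search: score every vector, keep the k closest."""
    scored = ((l2_distance(query, vec), idx) for idx, vec in enumerate(vectors))
    return [idx for _, idx in heapq.nsmallest(k, scored)]

# Hypothetical 2-dimensional vectors; real embeddings have hundreds of dimensions.
vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
nearest = knn_search([1.0, 1.0], vectors, k=2)
print(nearest)  # [1, 3]
```

The cost grows linearly with the number of stored vectors, which is what ANN indexes trade a little accuracy to avoid.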
Important considerations for vector search:
- Model selection should align with use case
- Data characteristics affect index choice
- Recall rate indicates result quality
- Pre/post filtering can improve performance
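One common way to measure recall is to compare the ids an approximate search returns against the ids an exact search returns for the same query. The sketch below assumes that definition; the function name and the id lists are made up for illustration.

```python
def recall_at_k(approximate_ids, exact_ids):
    """Fraction of the true nearest neighbours that the approximate search returned."""
    if not exact_ids:
        raise ValueError("exact_ids must not be empty")
    return len(set(approximate_ids) & set(exact_ids)) / len(exact_ids)

# Hypothetical results for one query with k = 5.
exact = [4, 8, 15, 16, 23]     # ground truth from an exact (KNN) search
approx = [4, 8, 15, 42, 99]    # what an ANN index returned

score = recall_at_k(approx, exact)
print(score)  # 0.6
```

In a real evaluation you would average this over many queries rather than rely on a single one.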
Common vector search applications:
- Semantic search
- Recommendation systems
- Image similarity
- Document retrieval
Best practices:
- Use batching for data ingestion
- Consider data update frequency when choosing index
- Combine vector search with traditional filtering
- Test different distance metrics for your use case
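The difference between pre-filtering and post-filtering when combining vector search with a traditional filter can be sketched as follows. The item layout, the `category` field, and the helper names are assumptions for this example; each database exposes filtering through its own query syntax.

```python
import heapq
import math

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(query, items, k):
    # Brute-force nearest neighbours over the given items.
    return heapq.nsmallest(k, items, key=lambda item: l2_distance(query, item["vector"]))

def pre_filter_search(query, items, k, predicate):
    # Apply the traditional filter first, then search only the matching items.
    candidates = [item for item in items if predicate(item)]
    return top_k(query, candidates, k)

def post_filter_search(query, items, k, predicate):
    # Search everything first, then drop results that fail the filter.
    return [item for item in top_k(query, items, k) if predicate(item)]

def is_shoe(item):
    return item["category"] == "shoes"

# Hypothetical catalogue with toy 2-dimensional vectors.
items = [
    {"id": "a", "category": "shoes", "vector": [0.0, 0.0]},
    {"id": "b", "category": "hats", "vector": [0.1, 0.0]},
    {"id": "c", "category": "hats", "vector": [0.2, 0.0]},
    {"id": "d", "category": "shoes", "vector": [3.0, 0.0]},
]

pre = [item["id"] for item in pre_filter_search([0.0, 0.0], items, 2, is_shoe)]
post = [item["id"] for item in post_filter_search([0.0, 0.0], items, 2, is_shoe)]
print(pre)   # ['a', 'd']
print(post)  # ['a']
```

Post-filtering can return fewer than k results when close neighbours fail the filter, while pre-filtering always searches within the matching subset. Which one is faster depends on how selective the filter is.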
RAG (Retrieval Augmented Generation) can be enhanced with vector search to provide context for Large Language Models
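A minimal sketch of the retrieval half of RAG: rank stored documents by similarity to the query vector and paste the best matches into the prompt. The document vectors here are hand-written stand-ins for real embeddings, the prompt wording is arbitrary, and the call to the language model itself is deliberately left out.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vector, documents, k):
    # Rank documents by similarity to the query and keep the top k.
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, doc["vector"]),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, context_docs):
    # Put the retrieved text in front of the question as context for the LLM.
    context = "\n".join(f"- {doc['text']}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Toy 3-dimensional "embeddings"; a real system would get these from an embedding model.
documents = [
    {"text": "pgvector adds a vector type to Postgres.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Redis is an in-memory data store.", "vector": [0.1, 0.9, 0.0]},
    {"text": "OpenSearch supports k-NN indexes.", "vector": [0.7, 0.2, 0.1]},
]
query_vector = [1.0, 0.0, 0.0]  # stand-in for the embedded question

prompt = build_prompt(
    "How can Postgres store vectors?",
    retrieve(query_vector, documents, k=2),
)
print(prompt)
```

In a real pipeline the `retrieve` step would be a vector query against the database, and `prompt` would be sent to the model of your choice.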
Vector dimensions typically range from 300 to 700, depending on the model used
Performance optimization through:
- Efficient indexing strategies
- Clustering similar vectors
- Proper distance metric selection
- Balance between precision and speed
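The idea of clustering similar vectors and trading precision for speed can be sketched with a toy inverted-file style index: vectors are grouped by their nearest centroid, and a query scans only the closest groups. The centroids are hard-coded here as an assumption (a real index would learn them, for example with k-means), and `nprobe` is a hypothetical parameter name for how many groups to scan.

```python
import math
from collections import defaultdict

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_index(vectors, centroids):
    """Assign each vector id to the bucket of its nearest centroid."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2_distance(vec, centroids[c]))
        buckets[nearest].append(idx)
    return buckets

def ann_search(query, vectors, centroids, buckets, k, nprobe=1):
    """Search only the nprobe buckets whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: l2_distance(query, centroids[c]))[:nprobe]
    candidates = [idx for c in probe for idx in buckets.get(c, [])]
    candidates.sort(key=lambda idx: l2_distance(query, vectors[idx]))
    return candidates[:k]

# Toy data: vector 2 sits near the boundary between the two clusters.
vectors = [[0.0, 0.0], [1.0, 0.0], [4.4, 0.0], [10.0, 0.0], [9.0, 0.0]]
centroids = [[0.0, 0.0], [10.0, 0.0]]  # assumed fixed, not learned
buckets = build_index(vectors, centroids)

query = [5.5, 0.0]
fast = ann_search(query, vectors, centroids, buckets, k=2, nprobe=1)
full = ann_search(query, vectors, centroids, buckets, k=2, nprobe=2)
print(fast)  # [4, 3]
print(full)  # [2, 4]
```

With `nprobe=1` the search touches only two vectors but misses vector 2, the true nearest neighbour, because it lives in the other bucket. Scanning both buckets recovers the exact answer at the cost of comparing against every vector.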
