Practical LLM Fine Tuning For Semantic Search | Dr. Roman Grebennikov

Learn strategies for fine-tuning LLMs in semantic search applications, exploring loss functions, data challenges, model selection, and practical deployment tips with Dr. Roman Grebennikov.

Key takeaways
  • Fine-tuning LLMs for semantic search yields significant improvements over non-fine-tuned models, with NDCG rising from roughly 12 to 50+

  • Multiple loss function options exist for training (see the sketch after this list), including:

    • Cosine similarity loss
    • Multiple negatives ranking loss
    • InfoNCE loss
    • Triplet loss
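
Below is a minimal sketch of how these loss options can be wired up with the sentence-transformers library; the model name, example texts, and hyperparameters are illustrative assumptions rather than settings from the talk.

```python
# A sketch of the loss options above using the sentence-transformers library.
# Model name, texts, and hyperparameters are illustrative, not from the talk.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("intfloat/e5-small-v2")

# Cosine similarity loss: (query, document) pairs with a 0-1 relevance label.
labeled_pairs = [
    InputExample(texts=["query: wireless headphones",
                        "passage: Bluetooth over-ear headphones"], label=1.0),
    InputExample(texts=["query: wireless headphones",
                        "passage: 25ft garden hose"], label=0.0),
]
cosine_loss = losses.CosineSimilarityLoss(model)

# Multiple negatives ranking loss (an in-batch InfoNCE-style loss): only
# positive pairs are needed; every other document in the batch serves as a
# negative, which is why larger batches generally train better.
positive_pairs = [
    InputExample(texts=["query: wireless headphones",
                        "passage: Bluetooth over-ear headphones"]),
    InputExample(texts=["query: pepperoni pizza",
                        "passage: Pepperoni Pizza 12 inch"]),
]
mnr_loss = losses.MultipleNegativesRankingLoss(model)

# Triplet loss: explicit (anchor, positive, negative) triples.
triplets = [
    InputExample(texts=["query: wireless headphones",
                        "passage: Bluetooth over-ear headphones",
                        "passage: 25ft garden hose"]),
]
triplet_loss = losses.TripletLoss(model)

# Train against one objective (here the MNR loss).
loader = DataLoader(positive_pairs, shuffle=True, batch_size=64)
model.fit(train_objectives=[(loader, mnr_loss)], epochs=1, warmup_steps=100)
```
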
  • Key challenges in semantic search implementation:

    • Finding quality training data with positive/negative examples
    • Balancing precision vs recall
    • Managing computational costs during inference
    • Handling brand searches and specific product names
  • Practical tips:

    • Start with pre-trained models from Hugging Face as a baseline
    • Use CPU quantization for cost efficiency when possible (both sketched below)
    • Combine semantic search with traditional lexical search for best results
    • Focus on data quality over model size/complexity
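
As a sketch of the first two tips, the following loads a pre-trained model from Hugging Face and applies PyTorch dynamic int8 quantization for CPU inference; the model choice is an example, not a recommendation from the talk.

```python
# A sketch, assuming the sentence-transformers and PyTorch dynamic-quantization
# APIs; the model name is an example, not a recommendation from the talk.
import torch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-small-v2", device="cpu")

# Dynamic int8 quantization of the transformer's linear layers: trades a
# small amount of accuracy for cheaper, faster CPU inference.
model[0].auto_model = torch.quantization.quantize_dynamic(
    model[0].auto_model, {torch.nn.Linear}, dtype=torch.qint8
)

embeddings = model.encode(["query: wireless headphones"])
print(embeddings.shape)
```
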
  • Data considerations:

    • Query-document pairs need relevance labels (0-1)
    • Can use customer behavior data as relevance signals (sketched after this list)
    • Batch size impacts training effectiveness (larger generally better)
    • Consider domain-specific needs (e.g., food delivery vs electronics)
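
A hypothetical sketch of the behavior-signal idea, turning click/purchase logs into labeled query-document pairs; the field names and label values are invented for illustration.

```python
# A hypothetical sketch of deriving 0-1 relevance labels from behavior logs;
# field names and label values are invented for illustration.
from sentence_transformers import InputExample

def label_from_behavior(event: dict) -> float:
    """Map implicit feedback to a relevance label (one possible scheme)."""
    if event["purchased"]:
        return 1.0
    if event["clicked"]:
        return 0.7  # clicked but not bought: weaker positive
    return 0.0      # shown but ignored: treat as a negative

events = [
    {"query": "pepperoni pizza", "title": "Pepperoni Pizza 12 inch",
     "clicked": True, "purchased": True},
    {"query": "pepperoni pizza", "title": "USB-C charging cable",
     "clicked": False, "purchased": False},
]

train_examples = [
    InputExample(texts=[f"query: {e['query']}", f"passage: {e['title']}"],
                 label=label_from_behavior(e))
    for e in events
]
```
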
  • Model selection guidance:

    • E5 models provide good baseline performance (example usage below)
    • Larger models give better results but are slower
    • Fine-tuned smaller models often outperform larger non-fine-tuned ones
    • Consider sentence transformers for practical implementations
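
A small example of using an E5 model as a baseline with sentence transformers; note that the intfloat/e5 model cards call for "query: " and "passage: " prefixes at encode time. The model name and texts are illustrative.

```python
# A small sketch of an E5 baseline; per the intfloat/e5 model cards, queries
# and documents are prefixed with "query: " and "passage: " at encode time.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("intfloat/e5-small-v2")  # small and fast baseline

query_emb = model.encode(["query: wireless headphones"],
                         normalize_embeddings=True)
passage_emb = model.encode(
    ["passage: Bluetooth over-ear headphones with noise cancelling",
     "passage: 25ft expandable garden hose"],
    normalize_embeddings=True,
)

print(util.cos_sim(query_emb, passage_emb))  # higher score = more relevant
```
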
  • Production deployment considerations:

    • Vector search databases needed for scale
    • Hybrid approaches combining semantic + lexical search recommended (see the fusion sketch below)
    • Regular retraining may be needed as data/patterns change
    • Monitor zero-result rates and conversion metrics
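
For the hybrid recommendation, one common fusion technique (generic, not specific to this talk) is reciprocal rank fusion over the lexical and semantic result lists.

```python
# A generic sketch of reciprocal rank fusion (RRF), one common way to merge
# lexical and semantic result lists; doc ids and the k constant are examples.
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked doc-id lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

lexical_hits = ["doc3", "doc1", "doc7"]   # e.g. BM25 / traditional search
semantic_hits = ["doc1", "doc5", "doc3"]  # e.g. vector search top-k
print(reciprocal_rank_fusion([lexical_hits, semantic_hits]))
# -> ['doc1', 'doc3', 'doc5', 'doc7']
```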