Sujit Pal - Building Learning to Rank models for search using LLMs | PyData Global 2023

Learn how to build Learning-to-Rank models using LLMs to generate training data. Compare different ranking approaches and explore feature engineering for search relevance.

Key takeaways
  • LLMs can be used to generate relevance judgments for training Learning-to-Rank (LTR) models, reducing the need for expensive human annotations (a judgment-prompt sketch follows this item)

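A minimal sketch of that judgment-generation step, assuming the anthropic Python SDK; the model name, prompt wording, and 0-4 grading scale here are illustrative, not the speaker's exact setup:

```python
# Ask an LLM to grade query-document relevance; a sketch, not the talk's exact prompt.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

PROMPT = """Rate how relevant the document is to the query on a 0-4 scale,
where 0 = not relevant and 4 = perfectly relevant. Answer with a single digit.

Query: {query}
Document: {document}"""

def judge_relevance(query: str, document: str) -> int:
    response = client.messages.create(
        model="claude-3-haiku-20240307",  # illustrative model choice
        max_tokens=5,
        messages=[{"role": "user", "content": PROMPT.format(query=query, document=document)}],
    )
    return int(response.content[0].text.strip())
```
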
  • The speaker implemented and compared four different LTR models (a RankNet loss sketch follows the list):

    • Point-wise regression
    • RankNet (pairwise)
    • LambdaRank
    • LambdaMart
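
Since RankNet came out ahead among the learned models, here is a minimal sketch of its pairwise objective in PyTorch (the framework and network shape are assumptions; only the 61-feature input width comes from the talk):

```python
# RankNet sketch: train a scorer so that sigmoid(s_i - s_j) predicts
# whether document i should outrank document j for the same query.
import torch
import torch.nn as nn

class Scorer(nn.Module):
    def __init__(self, n_features: int = 61):  # 61 features per query-document pair
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

def ranknet_loss(scorer: Scorer, x_i, x_j, label):
    # label is 1.0 where doc i is more relevant than doc j, else 0.0
    score_diff = scorer(x_i) - scorer(x_j)
    return nn.functional.binary_cross_entropy_with_logits(score_diff, label)
```
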
  • RankNet performed best of the four LTR models, achieving a precision@10 of 8.38, close to the 8.50 of the hand-tuned baseline

  • Key implementation details (an inference-pipeline sketch follows this list):

    • Used 61 features per query-document pair
    • Combined multiple data sources including lexical, vector, and knowledge graph features
    • Used Claude AI for generating relevance judgments
    • Implemented as both training and inference pipelines
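
A hedged sketch of what the inference side of such a pipeline might look like: featurize each (query, document) pair among the retrieved candidates and re-rank by the learned score. `featurize` and `model` are hypothetical stand-ins for the talk's 61-feature extractor and trained LTR model:

```python
# Re-rank top-k candidates by the learned model's score; a sketch only.
import numpy as np

def rerank(query, candidates, featurize, model, k=10):
    X = np.array([featurize(query, doc) for doc in candidates])  # shape (n_docs, 61)
    scores = model.predict(X)    # any scikit-style scorer works here
    order = np.argsort(-scores)  # descending by score
    return [candidates[i] for i in order[:k]]
```
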
  • LLM-generated judgments agreed with human expert judgments on roughly 70% of pairs, though the LLM tended to be more lenient (see the agreement sketch below)

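One simple way to quantify such an overlap, given aligned lists of LLM and human grades for the same pairs; Cohen's kappa is added here as an extra sanity check, not a number the talk reported:

```python
# Percent agreement and chance-corrected agreement between two judges.
import numpy as np
from sklearn.metrics import cohen_kappa_score

def agreement(llm_grades, human_grades):
    llm, human = np.asarray(llm_grades), np.asarray(human_grades)
    overlap = float((llm == human).mean())  # fraction of identical grades
    kappa = cohen_kappa_score(llm, human)   # agreement beyond chance
    return overlap, kappa
```
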
  • The approach requires far fewer labels: judging only 5-10% of query-document pairs suffices, rather than exhaustive labeling (see the sampling sketch below)

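A trivial sketch of that sparse labeling, assuming the candidate (query, document) pairs are materialized as a list; the fraction and seed are arbitrary:

```python
# Sample a small fraction of pairs to send for LLM judgment.
import random

def sample_pairs(pairs, frac=0.1, seed=42):
    rng = random.Random(seed)
    n = max(1, int(len(pairs) * frac))
    return rng.sample(pairs, n)
```
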
  • Feature engineering combined multiple approaches (a feature-computation sketch follows the list):

    • Term frequency and TF-IDF features
    • Concept and semantic group overlap
    • Vector similarities
    • Knowledge graph features
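
An illustrative sketch of two of these feature families, lexical TF-IDF similarity and dense-vector cosine similarity; `embed` is a hypothetical embedding function, and the TfidfVectorizer is assumed to be already fitted on the corpus. Concept-overlap and knowledge-graph features would be appended to the same vector:

```python
# Two example features: TF-IDF cosine and embedding cosine for a (query, doc) pair.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def lexical_and_vector_features(query, doc, tfidf: TfidfVectorizer, embed):
    q_tf, d_tf = tfidf.transform([query]), tfidf.transform([doc])
    tfidf_sim = float(cosine_similarity(q_tf, d_tf)[0, 0])
    q_vec, d_vec = embed(query), embed(doc)
    vec_sim = float(np.dot(q_vec, d_vec) / (np.linalg.norm(q_vec) * np.linalg.norm(d_vec)))
    return [tfidf_sim, vec_sim]
```
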
  • Primary advantages:

    • Reduces manual labeling effort
    • Achieves comparable performance to hand-tuned systems
    • Can be implemented with relatively little human intervention
    • Provides interpretable features compared to pure vector approaches
  • Main challenges identified:

    • LLMs sometimes make relevance leaps that humans wouldn’t
    • Performance degrades as relevance scores increase
    • Data distribution can be imbalanced
  • The approach is particularly valuable in domains that lack user feedback signals (e.g., healthcare), unlike e-commerce, where click data is abundant