Sujit Pal - Building Learning to Rank models for search using LLMs | PyData Global 2023
Learn how to build Learning-to-Rank models using LLMs to generate training data. Compare different ranking approaches and explore feature engineering for search relevance.
- LLMs can be used to generate relevance judgments for training Learning-to-Rank (LTR) models, reducing the need for expensive human annotations
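  A minimal sketch of how such judgments could be collected with the Anthropic Python SDK. The prompt wording, the 0-3 grade scale, and the model name are assumptions for illustration, not the speaker's exact setup:

  ```python
  # Sketch: asking an LLM for a graded relevance judgment.
  # Prompt, scale, and model name are hypothetical placeholders.
  import anthropic

  client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

  PROMPT = """Query: {query}

  Document: {document}

  On a scale of 0 (irrelevant) to 3 (highly relevant), how relevant is the
  document to the query? Answer with a single digit."""

  def judge_relevance(query: str, document: str) -> int:
      """Return an LLM-assigned relevance grade for one query-document pair."""
      message = client.messages.create(
          model="claude-3-haiku-20240307",  # placeholder; any Claude model works
          max_tokens=5,
          messages=[{"role": "user",
                     "content": PROMPT.format(query=query, document=document)}],
      )
      # Optimistic parse for a sketch; production code would validate the reply.
      return int(message.content[0].text.strip())
  ```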
- The speaker implemented and compared four different LTR models (a pairwise RankNet sketch follows this list):
  - Point-wise regression
  - RankNet (pairwise)
  - LambdaRank
  - LambdaMART
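  To make the pointwise/pairwise distinction concrete, here is a minimal PyTorch sketch of the standard RankNet objective: score each document, then train a cross-entropy loss on sigmoid(s_i - s_j) for pairs where document i is more relevant than j. The network shape is an assumption; only the 61-feature input width comes from the talk:

  ```python
  # Sketch of the RankNet pairwise objective (not the speaker's exact code).
  import torch
  import torch.nn as nn

  class RankNet(nn.Module):
      def __init__(self, n_features: int = 61):
          super().__init__()
          # Small scorer network; the real architecture is unknown.
          self.scorer = nn.Sequential(
              nn.Linear(n_features, 32), nn.ReLU(), nn.Linear(32, 1)
          )

      def forward(self, x_i, x_j):
          # Score each document independently, then compare the two scores.
          return self.scorer(x_i) - self.scorer(x_j)

  model = RankNet()
  loss_fn = nn.BCEWithLogitsLoss()  # sigmoid(s_i - s_j) vs. "i beats j" label

  # One toy batch of pairs: x_i is more relevant than x_j, so the target is 1.
  x_i, x_j = torch.randn(8, 61), torch.randn(8, 61)
  target = torch.ones(8, 1)
  loss = loss_fn(model(x_i, x_j), target)
  loss.backward()
  ```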
- RankNet performed best among the four LTR models, with a precision@10 of 8.38, approaching the 8.50 of the hand-tuned baseline
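  Scores above 1 suggest precision@10 is reported here as a count of relevant results in the top 10 (a 0-10 scale) rather than a 0-1 fraction; that reading is an assumption. A sketch under it:

  ```python
  # Sketch: precision@10 as a 0-10 count of relevant results in the top 10,
  # which would explain scores like 8.38 (an assumed interpretation).
  def precision_at_k(ranked_relevances: list[int], k: int = 10,
                     threshold: int = 1) -> float:
      """Count how many of the top-k results meet the relevance threshold."""
      return sum(1 for rel in ranked_relevances[:k] if rel >= threshold)

  # Average over queries to get a figure on the same scale as 8.38 / 8.50.
  queries = [[3, 2, 2, 1, 0, 2, 1, 1, 0, 3],
             [2, 2, 3, 1, 1, 0, 2, 3, 1, 1]]
  print(sum(precision_at_k(q) for q in queries) / len(queries))  # 8.5
  ```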
- Key implementation details:
  - Used 61 features per query-document pair
  - Combined multiple data sources, including lexical, vector, and knowledge graph features
  - Used Claude for generating relevance judgments
  - Implemented as both a training pipeline and an inference pipeline (see the inference sketch below)
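  A sketch of what the inference side of such a pipeline typically looks like: retrieve candidates with a first-stage searcher, featurize each query-document pair, and re-rank by model score. `search_index`, `extract_features`, and `model` are hypothetical stand-ins, not names from the talk:

  ```python
  # Sketch of an LTR inference pipeline: retrieve, featurize, re-rank.
  import numpy as np

  def rerank(query: str, search_index, extract_features, model,
             top_n: int = 100) -> list:
      candidates = search_index.search(query, size=top_n)  # first-stage retrieval
      # Build one feature vector per query-document pair (61-dim in the talk).
      feats = np.array([extract_features(query, doc) for doc in candidates])
      scores = model.predict(feats)        # trained LTR model scores
      order = np.argsort(-scores)          # highest score first
      return [candidates[i] for i in order]
  ```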
- LLM-generated judgments showed ~70% overlap with human expert judgments, though LLMs tended to be more lenient
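  Both figures are easy to measure once LLM and human grades exist for a shared sample of pairs; this small sketch (with made-up grades) shows one way to compute the exact-match overlap and a simple leniency rate:

  ```python
  # Sketch: measuring LLM/human agreement on shared judgments (toy data).
  def agreement(llm_grades: list[int],
                human_grades: list[int]) -> tuple[float, float]:
      """Return (exact-match rate, fraction where the LLM graded higher)."""
      pairs = list(zip(llm_grades, human_grades))
      match = sum(1 for l, h in pairs if l == h) / len(pairs)
      lenient = sum(1 for l, h in pairs if l > h) / len(pairs)
      return match, lenient

  match, lenient = agreement([3, 2, 2, 1, 3], [3, 2, 1, 1, 2])
  print(f"overlap={match:.0%}, LLM more lenient on {lenient:.0%} of pairs")
  ```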
- The approach needs far fewer training labels: relevance judgments for only 5-10% of query-document pairs rather than exhaustive labeling
- Feature engineering combined multiple approaches (see the sketch after this list):
  - Term frequency and TF-IDF features
  - Concept and semantic group overlap
  - Vector similarities
  - Knowledge graph features
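  A sketch of how a few such features might be computed for one query-document pair: TF-IDF cosine for the lexical signal, embedding cosine for vector similarity, and Jaccard overlap of concept sets for the knowledge-graph signal. The helper inputs are hypothetical, and these are only 3 of the 61 features:

  ```python
  # Sketch: mixing lexical, vector, and concept-overlap features for one pair.
  import numpy as np
  from sklearn.feature_extraction.text import TfidfVectorizer
  from sklearn.metrics.pairwise import cosine_similarity

  def pair_features(query: str, doc: str,
                    query_vec: np.ndarray, doc_vec: np.ndarray,
                    query_concepts: set, doc_concepts: set,
                    corpus: list[str]) -> np.ndarray:
      # Lexical: TF-IDF cosine (in practice, fit the vectorizer once, upfront).
      tfidf = TfidfVectorizer().fit(corpus)
      lexical = cosine_similarity(tfidf.transform([query]),
                                  tfidf.transform([doc]))[0, 0]
      # Vector: cosine similarity of precomputed embeddings.
      semantic = float(np.dot(query_vec, doc_vec) /
                       (np.linalg.norm(query_vec) * np.linalg.norm(doc_vec)))
      # Knowledge graph: Jaccard overlap of concept sets.
      overlap = (len(query_concepts & doc_concepts) /
                 max(len(query_concepts | doc_concepts), 1))
      return np.array([lexical, semantic, overlap])
  ```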
- Primary advantages:
  - Reduces manual labeling effort
  - Achieves comparable performance to hand-tuned systems
  - Can be implemented with relatively little human intervention
  - Provides interpretable features compared to pure vector approaches
- Main challenges identified:
  - LLMs sometimes make relevance leaps that humans wouldn't
  - Performance degrades at higher relevance grades
  - The data distribution can be imbalanced
- The approach is particularly valuable for domains that lack user feedback signals (such as healthcare), unlike e-commerce, where click data is abundant