Andrey Cheptsov - Leveraging open-source LLMs for production | PyData Global 2023

Learn how to leverage open-source LLMs in production with guidance on model selection, optimization techniques, fine-tuning approaches, and deployment considerations.

Key takeaways
  • Open-source LLMs offer full control over model behavior, data processing, and privacy compared to proprietary models

  • Key advantages of open-source LLMs:

    • Cost reduction potential through optimization
    • Customization flexibility
    • No vendor lock-in risk
    • Full privacy control
    • Community-driven improvements
  • Major open-source LLMs (a loading sketch follows this list):

    • LLaMA 2 (Meta/Microsoft) - available in 7B, 13B, and 70B parameter variants
    • Code LLaMA - specialized for code generation
    • Mistral - efficient 7B model with strong performance
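As a point of reference, here is a minimal sketch of loading one of these models with the Hugging Face transformers library; the checkpoint name and generation settings are illustrative assumptions, not taken from the talk:

```python
# Illustrative only: load an open-source checkpoint and generate text with
# the Hugging Face transformers library.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.1"  # assumed example checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit on a single GPU
    device_map="auto",          # let accelerate place layers on available devices
)

prompt = "Explain in one sentence why open-source LLMs matter in production."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```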
  • Technical optimization techniques:

    • LoRA (Low-Rank Adaptation) trains small low-rank adapter matrices instead of the full weights, reducing memory requirements for training
    • Quantization converts weights to lower-precision formats (e.g., int8, int4), shrinking the memory footprint
    • Combining LoRA with quantization can reduce memory needs by 10x (see the sketch after this list)
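A rough sketch of how LoRA and 4-bit quantization are commonly combined (a QLoRA-style setup) using transformers, bitsandbytes, and peft; the base model, rank, and target modules below are illustrative assumptions rather than values from the talk:

```python
# Illustrative only: load the base model in int4 and attach LoRA adapters,
# so only the small adapter matrices are trained.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # int4 weights
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16
    bnb_4bit_quant_type="nf4",
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",             # assumed base model
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                   # low-rank dimension
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],    # attention projections, a common choice
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()          # only the LoRA matrices are trainable
```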
  • Fine-tuning approaches:

    • Supervised fine-tuning for basic task adaptation
    • RLHF (Reinforcement Learning from Human Feedback) for instruction following
    • DPO (Direct Preference Optimization) as a simpler alternative to RLHF (see the sketch after this list)
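A hedged sketch of supervised fine-tuning and DPO using the TRL library (API as of the late-2023 releases); the model id, dataset files, and hyperparameters are placeholders, not details from the talk:

```python
# Illustrative only: both trainers come from TRL; newer TRL versions move
# some of these arguments into dedicated config objects.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer, SFTTrainer

model_id = "meta-llama/Llama-2-7b-hf"       # assumed base model
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token   # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_id)
args = TrainingArguments(output_dir="out", per_device_train_batch_size=2, num_train_epochs=1)

# 1) Supervised fine-tuning on plain prompt/response text (a "text" column).
sft_data = load_dataset("json", data_files="sft_train.jsonl", split="train")
SFTTrainer(model=model, tokenizer=tokenizer, args=args,
           train_dataset=sft_data, dataset_text_field="text").train()

# 2) DPO on preference pairs ("prompt", "chosen", "rejected" columns); it skips
#    the separate reward model and RL loop that full RLHF requires.
dpo_data = load_dataset("json", data_files="dpo_pairs.jsonl", split="train")
DPOTrainer(model=model, ref_model=None, beta=0.1, args=args,
           train_dataset=dpo_data, tokenizer=tokenizer).train()
```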
  • Production deployment considerations:

    • Memory requirements vary significantly by model size
    • Batching inference requests can improve throughput and, under load, latency (see the sketch after this list)
    • Multiple models can be served using shared resources
    • Commercial usage rights vary by model license
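A small sketch of batched inference with the vLLM serving engine, one common option for serving a shared open-source model; the checkpoint and sampling settings are assumptions for illustration:

```python
# Illustrative only: one engine instance serves many prompts as a batch.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.1")  # assumed checkpoint
params = SamplingParams(temperature=0.7, max_tokens=128)

prompts = [
    "Write a SQL query that counts orders per customer.",
    "Summarize the benefits of open-source LLMs in one sentence.",
]

# A single generate() call batches the prompts on the GPU rather than running
# them one by one, which raises throughput on shared hardware.
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```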
  • Benchmarks show open-source models approaching GPT-3.5/GPT-4 quality on specific tasks through fine-tuning

  • Common use cases:

    • Code generation
    • SQL query generation
    • Structured data extraction (illustrated after this list)
    • API automation
    • Synthetic dataset generation
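As one illustration of the structured data extraction use case, a hedged sketch that prompts a local open-source model for JSON and parses the result; the model id and prompt are assumptions, not examples from the talk:

```python
# Illustrative only: ask the model for JSON, then parse it with the standard library.
import json
from transformers import pipeline

generator = pipeline("text-generation",
                     model="mistralai/Mistral-7B-Instruct-v0.1",  # assumed checkpoint
                     device_map="auto")

prompt = (
    "Extract the fields as JSON with keys name, date, amount.\n"
    'Text: "Invoice from Acme Corp dated 2023-11-02 for $1,250."\nJSON:'
)
raw = generator(prompt, max_new_tokens=64, return_full_text=False)[0]["generated_text"]
record = json.loads(raw.strip())  # in practice, add validation and retry on parse errors
print(record)
```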