Dean Pleban - Customizing and Evaluating LLMs, an Ops Perspective | PyData Global 2023
Discover operational perspectives on customizing and evaluating Large Language Models, including strategies like prompt engineering, RAG, LoRA, and PEFT, as well as best practices for fine-tuning, retraining, and evaluating these powerful AI models.
- Customizing LLMs is essential for high-stakes applications, such as medical diagnosis and legal casework.
- There are various strategies for customization, including prompt engineering, reference- and preference-based methods, and RAG (retrieval-augmented generation).
- RAG can update the model's context more simply than updating the model itself (see the retrieval sketch after this list).
- LoRA (Low-Rank Adaptation) and PEFT (Parameter-Efficient Fine-Tuning) are often overlooked but powerful methods for customization (see the PEFT sketch after this list).
- Fine-tuning and retraining go further, comprehensively updating the model's weights on new data rather than only its inputs.
- Human judgment and feedback are essential for evaluating LLMs.
- RAG integrates well with other customization techniques.
- Customization is needed by both large industry teams and smaller startups.
- Metrics are essential for evaluating LLMs, whether they are customized via LoRA, PEFT, or RAG (see the evaluation sketch after this list).
- It is crucial to ensure that the evaluation data is representative, free from biases, and covers necessary dimensions.
- Output validation and collecting feedback from production are also important.
- Customizing LLMs requires expertise in prompting, reference- and preference-based methods, and RAG.
- Open-source libraries, such as those implementing LoRA and PEFT, can provide many metrics out of the box and scaffolding to help with customization.
- Tooling is essential for evaluating LLMs.
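
To make the RAG point above concrete, here is a minimal retrieval sketch in Python. It is not code from the talk: the toy bag-of-words "embedding", the in-memory document list, and the prompt template are all illustrative assumptions; a real system would use a dense encoder and a vector store, but the flow (retrieve, then prepend context to the prompt) is the same.

```python
# Minimal RAG sketch: retrieve the most relevant documents, then
# prepend them to the prompt so the model answers from that context.
# Embedding and retrieval here are toy placeholders, not the talk's setup.
from collections import Counter
import math

DOCUMENTS = [
    "LoRA adds small low-rank adapter matrices to a frozen base model.",
    "RAG retrieves external documents and injects them into the prompt.",
    "Evaluation data should be representative and free from bias.",
]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a dense encoder."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(f"- {doc}" for doc in retrieve(query))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

if __name__ == "__main__":
    # The resulting prompt would be sent to whatever LLM you use.
    print(build_prompt("How does RAG update the model's context?"))
```

This shows why RAG is "simpler than updating the model": new knowledge goes into the document list, not into the model's weights.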
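The LoRA/PEFT bullet can likewise be sketched with the open-source Hugging Face `peft` library. This is a minimal sketch rather than the speaker's setup: the base model (`gpt2`), rank, target modules, and other hyperparameters are assumptions chosen only to keep the example small.

```python
# Minimal LoRA fine-tuning setup with Hugging Face peft + transformers.
# Model choice, rank, and hyperparameters are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model_name = "gpt2"  # small model chosen only to keep the sketch cheap
model = AutoModelForCausalLM.from_pretrained(base_model_name)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# LoRA trains small low-rank adapter matrices while the base weights stay frozen,
# which is what makes it a parameter-efficient fine-tuning (PEFT) method.
lora_config = LoraConfig(
    r=8,                        # rank of the adapter matrices
    lora_alpha=16,              # scaling factor applied to the adapters
    target_modules=["c_attn"],  # attention projection in GPT-2; varies by architecture
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically a small fraction of the base model's parameters
# From here, training proceeds with a standard Trainer / training loop on your data.
```

The point of the sketch is the parameter count: only the adapter weights are trainable, so customization is far cheaper than full fine-tuning or retraining.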
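For the evaluation and output-validation bullets, here is a minimal sketch under assumed data: a tiny labeled evaluation set, an exact-match metric, and a simple output validator. The example set, metric choice, and validation rules are assumptions for illustration, not the talk's specific recommendations.

```python
# Minimal evaluation sketch: score model outputs against references and
# run simple output validation before counting an answer as correct.
import re

eval_set = [
    {"prompt": "What does PEFT stand for?", "reference": "parameter-efficient fine-tuning"},
    {"prompt": "What does RAG stand for?", "reference": "retrieval-augmented generation"},
]

def normalize(text: str) -> str:
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()

def exact_match(prediction: str, reference: str) -> bool:
    return normalize(prediction) == normalize(reference)

def validate_output(prediction: str) -> list[str]:
    """Cheap production-style checks before surfacing an answer."""
    problems = []
    if not prediction.strip():
        problems.append("empty output")
    if len(prediction) > 500:
        problems.append("output too long")
    return problems

def evaluate(generate) -> float:
    """`generate` is any callable prompt -> answer (your LLM wrapper)."""
    hits = 0
    for example in eval_set:
        prediction = generate(example["prompt"])
        if validate_output(prediction):
            continue  # a failed validation counts as a miss
        hits += exact_match(prediction, example["reference"])
    return hits / len(eval_set)

if __name__ == "__main__":
    # Stub "model" that always gives the same answer, just to show the flow.
    score = evaluate(lambda prompt: "retrieval-augmented generation")
    print(f"exact-match accuracy: {score:.2f}")
```

In practice the evaluation set should be representative and bias-checked as the bullets note, and automatic scores like this complement, rather than replace, human judgment and feedback from production.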