Dean Pleban - Customizing and Evaluating LLMs, an Ops Perspective | PyData Global 2023
Discover operational perspectives on customizing and evaluating Large Language Models, including strategies like prompt engineering, RAG, LoRA, and PEFT, as well as best practices for fine-tuning, retraining, and evaluating these powerful AI models.
- Customizing LLMs is essential for high-stakes applications, such as medical diagnoses and legal cases.
 - There are various strategies for customization, including prompt engineering, reference- and preference-based methods, and RAG (retrieval-augmented generation).
 - RAG can update the model's context more simply than updating the model itself (see the RAG sketch after this list).
 - LoRA (Low-Rank Adaptation) and PEFT (Parameter-Efficient Fine-Tuning) are often overlooked but powerful methods for customization (see the LoRA sketch after this list).
 - Fine-tuning and retraining go further, comprehensively updating the model's weights on new data.
 - Human judgment and feedback are essential for evaluating LLMs.
 - RAG integrates well with other customization techniques.
 - Customization is needed for both industry teams and smaller startups.
 - Quantitative metrics are essential for evaluating LLMs, whether they are customized with LoRA, PEFT, or RAG (see the evaluation sketch after this list).
 - It is crucial to ensure that the evaluation data is representative, free from biases, and covers necessary dimensions.
 - Output validation and collecting feedback from production are also important (see the validation sketch after this list).
 - Customizing LLMs requires expertise in prompting, reference- and preference-based methods, and RAG.
 - Open-source libraries, such as implementations of LoRA and PEFT, provide scaffolding to help with customization, and evaluation libraries can provide many metrics out of the box.
 - Tooling is essential for evaluating LLMs.
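
The RAG point above can be made concrete with a small sketch. This is a toy illustration, not code from the talk: the documents, the lexical retriever, and the `call_llm` stub are all assumptions standing in for a real vector store and model client.

```python
# Minimal RAG sketch: retrieve relevant context, then prepend it to the prompt
# instead of changing the model itself.
from difflib import SequenceMatcher

DOCUMENTS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday through Friday, 9am to 5pm UTC.",
    "Premium plans include priority support and a dedicated manager.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Toy lexical similarity; real systems use embeddings and a vector store.
    scored = sorted(
        docs,
        key=lambda d: SequenceMatcher(None, query.lower(), d.lower()).ratio(),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query, DOCUMENTS))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in: swap in your model client of choice.
    return "<model response>"

if __name__ == "__main__":
    print(call_llm(build_prompt("How long do I have to return a product?")))
```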
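Next, a minimal LoRA sketch of parameter-efficient fine-tuning using the Hugging Face `transformers` and `peft` libraries. The base model (`gpt2`), the target modules, and the hyperparameters are illustrative choices under stated assumptions, not values from the talk.

```python
# LoRA via PEFT: wrap a base model so that only small low-rank adapter
# matrices are trained, rather than the full set of weights.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # example base model

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["c_attn"],  # attention projection layers in GPT-2
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable
# `model` can now be trained with a standard training loop or Trainer,
# updating only the LoRA adapter weights.
```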
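The evaluation sketch below is a toy example of metric-based evaluation over a small, hand-built evaluation set. Exact match is used only to keep the sketch short; the dataset and the `predict` callable are assumptions, and in practice automated metrics are combined with the human judgment the talk emphasizes.

```python
# Toy metric-based evaluation: score a predictor against reference answers.
eval_set = [
    {"prompt": "Capital of France?", "reference": "Paris"},
    {"prompt": "2 + 2 = ?", "reference": "4"},
]

def exact_match(prediction: str, reference: str) -> bool:
    return prediction.strip().lower() == reference.strip().lower()

def evaluate(predict) -> float:
    # `predict` is any callable mapping a prompt to a model response.
    hits = sum(
        exact_match(predict(ex["prompt"]), ex["reference"]) for ex in eval_set
    )
    return hits / len(eval_set)

if __name__ == "__main__":
    print(evaluate(lambda prompt: "Paris" if "France" in prompt else "4"))
```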
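Finally, a validation sketch for checking outputs before accepting them: here the check is that the response parses as JSON with a couple of expected fields. The field names and the confidence range are illustrative assumptions, not a scheme from the talk.

```python
# Minimal output validation: reject malformed model responses early,
# then log and retry, feeding failures back into the evaluation set.
import json

REQUIRED_FIELDS = {"answer", "confidence"}

def validate_response(raw: str) -> dict:
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model output is not valid JSON: {exc}") from exc
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"Model output is missing fields: {sorted(missing)}")
    if not 0.0 <= float(data["confidence"]) <= 1.0:
        raise ValueError("Confidence must be between 0 and 1")
    return data

if __name__ == "__main__":
    print(validate_response('{"answer": "Paris", "confidence": 0.92}'))
```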