Structured output with large language models / Uri Goren (Argmax)

Explore structured output with large language models such as ChatGPT, and discover techniques like verbalizer constraint search and greedy search that overcome their limitations and improve efficiency and accuracy.

Key takeaways
  • The talk focuses on structured output with large language models, with particular attention to ChatGPT and its use cases.
  • Large language models like ChatGPT can be used for many applications, but are often limited by their inability to produce structured output.
  • One way to address this limitation is a technique called “verbalizer constraint search”, which forces the model to generate output in a specific format.
  • Another technique is to use a greedy search algorithm, which selects the most likely next token based on the model’s predictions.
  • OpenAI’s tokenizer (tiktoken) differs from Hugging Face’s tokenizers: both split text into tokens, but they can segment the same text differently.
  • Tokenization is an important step in NLP, as it turns raw text into the token units a model operates on.
  • The talk highlights the importance of scaling and efficiency in using large language models, particularly in real-time bidding applications.
  • ChatGPT can be used for various applications, including natural language processing and recommendation engines.
  • The talk also discusses the importance of considering the output format of a model, particularly in applications where structured output is required.
  • Beam search and greedy search algorithms can improve the efficiency and accuracy of model output.
  • The talk concludes by highlighting the limitations and challenges of large language models, and the need for continued research and innovation in this area.
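
The verbalizer idea above can be sketched in a few lines: restrict each decoding step to a small set of allowed label tokens, so the model is forced into the required format. This is a minimal toy, not the speaker's implementation; `toy_logits`, its scores, and the label set are all invented for illustration, standing in for a real LLM's logits.

```python
# Toy sketch of verbalizer-constrained greedy decoding.
# `toy_logits` is a hypothetical stand-in for an LLM's next-token
# scores; in practice you would mask the model's real logits.

VOCAB = ["positive", "negative", "neutral", "the", "movie", "great"]

def toy_logits(prefix):
    """Hypothetical next-token scores for a given prefix (invented numbers)."""
    scores = {tok: 0.0 for tok in VOCAB}
    scores["the"] = 3.0        # most likely token overall (unstructured)
    scores["positive"] = 2.0   # most likely *label* token
    scores["negative"] = 1.0
    scores["neutral"] = 0.5
    return scores

def constrained_greedy(prefix, allowed):
    """Greedy step restricted to the verbalizer's allowed label tokens."""
    scores = toy_logits(prefix)
    # Masking: only tokens in the allowed set may be selected.
    return max(allowed, key=lambda tok: scores[tok])

labels = {"positive", "negative", "neutral"}
# Unconstrained, the argmax would be "the"; constrained, it is a valid label.
print(constrained_greedy("Sentiment:", labels))  # -> positive
```

The key design point is that the constraint is applied at decoding time, so any model that exposes per-token scores can be forced into a fixed output vocabulary without retraining.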
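
To make the tokenization point concrete, here is a toy greedy longest-match tokenizer over a fixed vocabulary. Real tokenizers (OpenAI's tiktoken, Hugging Face's tokenizers) use learned BPE merges rather than this scheme, but the effect shown is the same: the same text can be split into different tokens depending on the vocabulary.

```python
# Toy greedy longest-match tokenizer (illustrative only, not BPE).

def tokenize(text, vocab):
    """Segment `text` by repeatedly taking the longest matching vocab entry."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry that matches at position i.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Fall back to a single character if nothing matches.
            tokens.append(text[i])
            i += 1
    return tokens

# Two hypothetical vocabularies segment the same word differently.
print(tokenize("structured", {"struct", "ured", "structured"}))  # -> ['structured']
print(tokenize("structured", {"struct", "ured"}))                # -> ['struct', 'ured']
```

This difference matters for structured output: constraints defined over tokens only work if you know exactly how the target tokenizer splits the strings you want to allow.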
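
The greedy-versus-beam contrast can also be shown on a toy model. The probability table below is invented so that the greedy choice at step one leads to a lower-probability sequence overall, which is exactly the failure mode beam search mitigates by keeping several candidates alive.

```python
# Toy comparison of greedy vs. beam search decoding.
# PROBS maps a prefix (tuple of tokens) to next-token probabilities;
# all numbers are hypothetical, chosen so greedy is suboptimal.

PROBS = {
    (): {"A": 0.6, "B": 0.4},
    ("A",): {"x": 0.55, "y": 0.45},
    ("B",): {"x": 0.9, "y": 0.1},
}

def greedy(length=2):
    """Pick the single most likely next token at every step."""
    seq = ()
    for _ in range(length):
        seq += (max(PROBS[seq], key=PROBS[seq].get),)
    return seq

def beam(length=2, width=2):
    """Keep the `width` most likely partial sequences at every step."""
    beams = [((), 1.0)]
    for _ in range(length):
        candidates = [
            (seq + (tok,), p * q)
            for seq, p in beams
            for tok, q in PROBS[seq].items()
        ]
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:width]
    return beams[0][0]

print(greedy())  # -> ('A', 'x'), joint probability 0.6 * 0.55 = 0.33
print(beam())    # -> ('B', 'x'), joint probability 0.4 * 0.9  = 0.36
```

Greedy commits to "A" (0.6) and ends at probability 0.33, while beam search keeps "B" alive and finds the higher-probability sequence (0.36); the trade-off is that beam search costs roughly `width` times more model evaluations per step.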