Developing Cloud-Native Java AI applications with DJL and LangChain4J by Sébastien Blanc, Alex Soto

Learn how to build production-ready AI apps in Java using DJL and LangChain4j. Covers model management, RAG, security, observability, and cloud-native best practices.

Key takeaways
  • Langchain4j provides Java developers with AI capabilities including model management, RAG (Retrieval Augmented Generation), tools/functions, and memory management

  • Deep Java Library (DJL) enables loading and inferencing AI models locally in Java applications without requiring Python dependencies

  • Vector databases like PGVector (Postgres extension) and Redis can be used to store embeddings for semantic search and RAG applications

  • Ollama allows running open source LLMs locally, similar to how Docker runs containers

  • AI applications often need stateful components like memory and conversation context which can be managed through databases or in-memory stores

  • Graph-based flows help orchestrate complex AI interactions and maintain application state between model calls

  • Tools/functions allow LLM models to call Java code directly, enabling integration with business logic

  • OpenTelemetry integration provides observability for AI applications including traces and metrics

  • Semantic caching improves performance by returning cached responses for semantically similar queries

  • Protection against prompt injection and other security concerns can be implemented through model validation and safety checks