Maarten Sukel - Jounai.nl: Playing with New Tech to Reinvent the News | PyData Amsterdam 2024
Explore how Maarten Sukel built Jounai.nl, an AI-driven news platform that uses LLMs for automated content generation, and learn about the technical challenges and solutions involved.
- Built Jounai.nl, an AI-driven news platform that uses LLMs to automatically generate news articles and podcasts from ~50 trusted sources, without human intervention
- Tech stack includes:
  - Backend: Java/Spring Boot
  - Frontend: Vue.js/Nuxt.js
  - Azure Container Apps for deployment
  - OpenAI APIs for content generation
  - Structured LLM output validation using Pandera
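As a rough illustration of what validating structured LLM output can look like, the sketch below parses a JSON response and rejects it unless it has the expected shape. The summary names Pandera for this step; the sketch deliberately swaps in the Python standard library so it runs without extra dependencies. The field names (`title`, `summary`, `sources`) and the `parse_article` helper are hypothetical and not taken from the talk.

```python
import json
from dataclasses import dataclass

# Hypothetical schema for one generated article; field names are illustrative.
REQUIRED_FIELDS = {"title": str, "summary": str, "sources": list}


@dataclass
class Article:
    title: str
    summary: str
    sources: list


def parse_article(raw: str) -> Article:
    """Parse an LLM response and reject it unless it matches the expected shape."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")

    # Collect every problem instead of stopping at the first one,
    # so a single log line shows everything wrong with the response.
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            errors.append(f"missing field: {name}")
        elif not isinstance(data[name], expected):
            errors.append(f"{name} must be {expected.__name__}")
    if not errors:
        if not data["title"].strip():
            errors.append("title is empty")
        if not data["sources"]:
            errors.append("at least one source is required")
    if errors:
        raise ValueError("; ".join(errors))
    return Article(data["title"].strip(), data["summary"].strip(), data["sources"])
```

In a pipeline that publishes without human review, a rejected response would typically be retried or dropped rather than published; which of the two the platform does is not stated in the summary.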
- Cost optimization:
  - Current operational cost ~$80/month on Azure
  - API costs under $2/week
  - Costs reduced by switching to smaller models
  - Image generation discontinued due to high API costs
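To put the two cost figures above on the same scale, here is a small back-of-the-envelope calculation. It assumes the Azure figure and the API figure are separate line items, which the summary does not state explicitly, and it treats "under $2/week" as an upper bound.

```python
WEEKS_PER_YEAR = 52
MONTHS_PER_YEAR = 12

azure_per_month = 80.0   # "~$80/month on Azure" from the summary (approximate)
api_per_week_max = 2.0   # "under $2/week", treated here as an upper bound

# Convert the weekly API bound to a monthly one via the yearly total.
api_per_month_max = api_per_week_max * WEEKS_PER_YEAR / MONTHS_PER_YEAR
total_per_month = azure_per_month + api_per_month_max
total_per_year = total_per_month * MONTHS_PER_YEAR

print(f"API: at most about ${api_per_month_max:.2f}/month")
print(f"Total: roughly ${total_per_month:.2f}/month, ${total_per_year:.0f}/year")
```

Under these assumptions, hosting rather than LLM usage would account for most of the monthly bill.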
- Key technical implementation details:
  - Uses Jaccard similarity for article deduplication
  - Server-side rendering implemented for SEO
  - Automated deployment via GitHub Actions
  - Simple clustering for related content
  - Multiple AI “personalities” for different content types
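The Jaccard-based deduplication mentioned above can be sketched as follows. Jaccard similarity is the size of the intersection of two sets divided by the size of their union. The word-level tokenizer, the `0.6` threshold, and the function names are assumptions chosen for illustration; the summary does not say how the platform tokenizes articles or which threshold it uses.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word set; a deliberately simple tokenizer."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """|A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def deduplicate(articles: list[str], threshold: float = 0.6) -> list[str]:
    """Keep an article only if it is not too similar to one already kept."""
    kept: list[tuple[str, set[str]]] = []
    for text in articles:
        words = tokens(text)
        # Compare against every article kept so far; O(n^2) overall,
        # which is fine for a few hundred items per run.
        if all(jaccard(words, seen) < threshold for _, seen in kept):
            kept.append((text, words))
    return [text for text, _ in kept]
```

For example, two headlines that differ by a single word score close to 1.0 and collapse into one entry, while unrelated headlines score near 0.0 and are both kept. The same score with a lower threshold could in principle be used to group related articles, though the summary does not describe how the clustering is done.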
- Challenges faced:
  - Cultural localization of AI-generated content
  - Legal considerations around content sourcing
  - Cost management with scaling
  - Quality control of AI outputs
  - Speech synthesis quality in Dutch
- Lessons learned:
  - Importance of data validation and testing
  - Benefits of experimenting with new technologies
  - Value of small-scale projects for learning
  - Need for careful prompt engineering
  - Importance of structured output validation for LLMs
- Future considerations:
  - Potential for local model deployment
  - Expectation of decreasing API costs
  - Need for better cost optimization at scale
  - Possibility of multilingual expansion
  - Focus on maintaining truthful and objective reporting