Patrick de Oude - Reinforcement learning for Food Waste Reduction within Albert Heijn

Learn how Albert Heijn applies reinforcement learning toward its target of reducing food waste by 50% by 2030. The talk covers Q-learning, the integration of online and offline learning, and the trade-off between exploration and exploitation.

Key takeaways

Applying reinforcement learning to food waste reduction
Focus on Albert Heijn’s target of reducing food waste by 50% by 2030
Use Q-learning with discrete, low-cardinality state spaces
Integration of online and offline learning
Exploration-exploitation trade-off for optimal decision-making
Measuring performance with off-policy evaluation and discounted cumulative rewards
Continuous improvement through reinforcement learning
Rule-based markdown decision process for ensuring consistency
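
As a rough illustration of the techniques named in the takeaways, here is a minimal Python sketch of tabular Q-learning with an epsilon-greedy policy and a discounted cumulative reward. It is not Albert Heijn's implementation. The function names, the reading of states as small integer indices (for example, buckets of days until expiry), and the reading of actions as markdown levels are all assumptions made for this example.

```python
import random


def discounted_return(rewards, gamma):
    """Discounted cumulative reward: sum over t of gamma**t * rewards[t]."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon (exploration),
    otherwise the action with the highest current Q-value (exploitation)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda action: q_row[action])


def q_update(q, state, action, reward, next_state, alpha, gamma, done):
    """Apply one tabular Q-learning update in place.

    q is a list of rows, one per discrete state (hypothetical layout);
    each row holds one value per action, e.g. one per markdown level.
    """
    # Bootstrap from the best next action unless the episode has ended.
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


# Hypothetical usage: 2 states, 2 actions, one observed transition.
q_table = [[0.0, 0.0], [0.0, 2.0]]
rng = random.Random(0)
action = epsilon_greedy(q_table[1], epsilon=0.0, rng=rng)  # pure exploitation
q_update(q_table, state=0, action=1, reward=1.0, next_state=1,
         alpha=0.5, gamma=0.9, done=False)
```

A table like this stays small only because the state space is discrete with low cardinality. The same discounted return can also be used to compare policies on logged data, although a proper off-policy evaluation needs a correction for the logging policy that this sketch leaves out.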
