Patrick de Oude - Reinforcement learning for Food Waste Reduction within Albert Heijn

Learn how Albert Heijn applies reinforcement learning in support of its target to reduce food waste by 50% by 2030. The talk covers Q-learning, the integration of online and offline learning, and the trade-off between exploration and exploitation.

Key takeaways
  • Applying reinforcement learning to food waste reduction
  • Focus on Albert Heijn’s target of reducing food waste by 50% by 2030
  • Q-learning over discrete, low-cardinality state spaces
  • Integration of online and offline learning
  • Exploration-exploitation trade-off for optimal decision-making
  • Measuring performance with off-policy evaluation and discounted cumulative rewards
  • Continuous improvement through reinforcement learning
  • Rule-based markdown decision process for ensuring consistency
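
To make the takeaways on Q-learning, the exploration-exploitation trade-off, and discounted cumulative rewards more concrete, here is a minimal sketch in Python. It is not the implementation described in the talk. The state space (days of shelf life left), the markdown levels in `DISCOUNTS`, the sale probabilities in `step`, and all hyperparameters are hypothetical values chosen only for illustration.

```python
import random

DISCOUNTS = [0.0, 0.25, 0.5]  # hypothetical markdown levels (the actions)
N_STATES = 3                  # hypothetical states: days of shelf life left (0 = last day)


def discounted_return(rewards, gamma):
    """Discounted cumulative reward: sum over t of gamma**t * rewards[t]."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def epsilon_greedy(q_row, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best-known action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_update(q, state, action, reward, next_state, alpha, gamma, done):
    """One tabular Q-learning step toward reward + gamma * max_a Q(next_state, a)."""
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def step(state, action, rng):
    """Hypothetical toy dynamics: a deeper discount sells more often but earns less."""
    sold = rng.random() < 0.3 + DISCOUNTS[action]
    if sold:
        return 1.0 - DISCOUNTS[action], state, True
    if state == 0:
        return -1.0, state, True  # unsold on the last day: the item is wasted
    return 0.0, state - 1, False  # unsold: one day less shelf life


def train(episodes=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table (N_STATES x len(DISCOUNTS)) on the toy problem."""
    rng = random.Random(seed)
    q = [[0.0] * len(DISCOUNTS) for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = N_STATES - 1, False
        while not done:
            action = epsilon_greedy(q[state], epsilon, rng)
            reward, next_state, done = step(state, action, rng)
            q_update(q, state, action, reward, next_state, alpha, gamma, done)
            state = next_state
    return q
```

Calling `train()` returns one row of action values per state; the greedy markdown for a state is the index of the largest value in its row. A table like this is only practical because the state space is discrete and small, which is the condition the takeaways name for Q-learning.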