Real-time transaction categorization w/ Bayesian feedback loop - Tijl Kindt | PyData Eindhoven 2021

"Learn how ING Netherlands is using a real-time transaction categorization system with a Bayesian feedback loop, trained on expert mappings and improved with customer feedback, to optimize financial services and improve customer experience."

Key takeaways
  • The talk describes a system for real-time transaction categorization using a Bayesian feedback loop.
  • The system uses a Dirichlet distribution to draw samples and generate probabilities for categorization.
  • The model is trained on expert mappings and then improved through feedback from customers.
  • The system handles both implicit and explicit feedback, and weighs them based on customer behavior.
  • Bayesian inference is used to update the category probabilities as customer feedback arrives.
  • The system uses Thompson sampling to balance exploration and exploitation.
  • The talk also mentions that customers can recategorize transactions, which is used as feedback for improving the model.
  • The system uses a streaming data analytics platform built on top of Apache Flink and Kafka.
  • The platform processes transactions in real-time and updates the model accordingly.
  • The model is calibrated using a Naive Bayes model.
  • The talk mentions that the system is currently in beta with 1% of ING Netherlands customers.
  • A launch in Belgium is planned for next year.
  • Customers can give feedback in the form of text, category changes, and behavior.
  • The system uses text mining to extract insights from transaction descriptions.
  • The talk also mentions the importance of using customer feedback to improve the model.
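
The takeaways on the Dirichlet distribution, Bayesian updating, Thompson sampling, and weighted feedback can be illustrated with a minimal sketch. This is not ING's implementation. The category names, the prior pseudo-counts, the feedback weights, and the `DirichletCategorizer` class are all assumptions made for illustration. The sketch keeps one vector of Dirichlet pseudo-counts per transaction key, seeds it from an expert mapping, draws a sample to choose a suggestion, and adds a weighted count whenever feedback arrives.

```python
import numpy as np

# Hypothetical category labels; the talk summary does not list the real ones.
CATEGORIES = ["groceries", "transport", "leisure"]

# Hypothetical weights: an explicit recategorization counts for more than
# implicit feedback such as leaving a suggested category unchanged.
EXPLICIT_WEIGHT = 1.0
IMPLICIT_WEIGHT = 0.1


class DirichletCategorizer:
    """Dirichlet pseudo-counts over categories for one transaction key."""

    def __init__(self, prior_counts):
        # Prior pseudo-counts, for example derived from an expert mapping.
        self.alpha = np.asarray(prior_counts, dtype=float)

    def posterior_mean(self):
        # Expected category probabilities under the current posterior.
        return self.alpha / self.alpha.sum()

    def suggest(self, rng):
        # Thompson sampling: draw one probability vector from the posterior
        # and suggest the category that is most likely under that draw.
        sample = rng.dirichlet(self.alpha)
        return int(np.argmax(sample))

    def update(self, category_index, weight):
        # Conjugate update: add the weighted observation to the pseudo-count.
        self.alpha[category_index] += weight


rng = np.random.default_rng(0)
model = DirichletCategorizer([8.0, 1.0, 1.0])  # expert prior favours "groceries"

suggestion = model.suggest(rng)
model.update(CATEGORIES.index("transport"), EXPLICIT_WEIGHT)  # customer recategorizes
model.update(CATEGORIES.index("groceries"), IMPLICIT_WEIGHT)  # customer leaves a suggestion as is

print(CATEGORIES[suggestion], model.posterior_mean())
```

Because the suggestion comes from a random draw and not from the posterior mean, categories with little evidence are still proposed occasionally (exploration), while well-supported categories win most draws (exploitation).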