Lessons learned from deploying Machine Learning in an old-fashioned heavy industry

Learn hard-won insights from deploying machine learning in traditional industry: data challenges, model selection, scaling issues, customer relationships, and infrastructure needs.

Key takeaways
  • Simple machine learning models (linear regression, random forest) often outperform complex ones for industrial applications. Avoid spending too much time on complex models.

  • Customer data is extremely challenging to work with: issues include poor data quality, calibration changes, and differing standards between facilities.

  • Machine Learning as a Service (MLaaS) is much harder to scale than Software as a Service (SaaS) because it requires more consulting, customer support, and domain expertise.

  • Proper model evaluation is critical: use temporal cross-validation instead of random cross-validation, so that each model is tested only on data recorded after its training data and time-based relationships are respected.

  • Most of the work is infrastructure, not ML: data pipelines, monitoring, configuration management, and customer-facing interfaces make up a larger part of the system than the models themselves.

  • Domain expertise and the ability to communicate with customers in their own language are essential; customer success personnel are often more important than ML engineers.

  • Regular retraining and monitoring of models are necessary because of machine recalibrations, seasonal changes, and other real-world factors.

  • When possible, control your own data collection rather than relying on customer data, to ensure quality and consistency.

  • Simple, interpretable models help gain customer trust and make troubleshooting easier in industrial settings.

  • Real-world industrial applications often have significant delays between a prediction and its ground truth (such as 28-day cement strength tests), and the ML system design must account for them.
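
The takeaways on temporal cross-validation and delayed ground truth can be illustrated with a short sketch. It is not the speaker's implementation: the helper name `temporal_splits`, the one-sample-per-day rate, and the 200-sample history are assumptions made for the example. Each fold trains on an expanding window of the earliest samples and tests on the block that follows, leaving a gap between them so that no training label would still have been pending when the test period began.

```python
from typing import Iterator, List, Tuple


def temporal_splits(
    n_samples: int, n_splits: int, gap: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for samples sorted by time.

    Hypothetical helper: each fold trains on an expanding window of the
    earliest samples and tests on the next block, skipping `gap` samples
    between the end of training and the start of testing.
    """
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    for k in range(n_splits):
        test_start = n_samples - (n_splits - k) * test_size
        train_end = test_start - gap  # exclusive end of the training window
        if train_end <= 0:
            raise ValueError("gap leaves no training data for the first fold")
        train_idx = list(range(train_end))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


# Assumed setup: one sample per day, ground truth arrives 28 days later.
folds = list(temporal_splits(n_samples=200, n_splits=3, gap=28))
for train_idx, test_idx in folds:
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

A random split would mix future samples into the training set and tend to overstate accuracy. For real projects, scikit-learn's `TimeSeriesSplit` provides a comparable expanding-window split with a `gap` parameter.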