Malte Tichy - Paradoxes in model training and evaluation under constraints | PyData Global 2023
Explore how capacity constraints affect ML demand forecasting accuracy. Learn methods to properly model and evaluate truncated data while avoiding bias in predictions.
- When demand is constrained (for example by limited inventory), sales data alone does not reflect true customer demand, because sales are capped at the capacity limit.
- Don't equate sales with demand unless you are certain capacity limits are never reached: censored/truncated data biases model training.
- Evaluate models by grouping on predictions rather than outcomes to avoid selection bias. Group by the predicted probability of hitting capacity, not by whether capacity was actually hit.
- Account for constraints explicitly in probability distributions and expected-value calculations rather than using simplified approximations.
- Distinguish between unconstrained demand (potential customer interest) and constrained demand (actual sales limited by capacity).
- Using constrained sales as an approximation for unconstrained demand leads to systematic underforecasting and increasing stockouts.
- Forward-looking evaluation (based on predictions) is more meaningful than backward-looking analysis of outcomes.
- Balancing stockouts against waste requires proper probabilistic modeling of demand under constraints.
- Tools like statsmodels can help with truncated distribution analysis, though more open source solutions are needed.
- Clean, controlled test cases should be evaluated before moving to complex real-world scenarios with additional complications.
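
Several of the points above can be reproduced in a small controlled simulation. The following is a minimal sketch, not the speaker's code: it assumes Poisson-distributed demand, a fixed capacity of 12 units, and a "perfect" forecaster that knows each item's true mean demand. The function names, the uniform range of mean demands, and the 0.5 threshold on the predicted hit probability are all illustrative assumptions.

```python
import numpy as np
from scipy import stats


def expected_constrained_sales(mean_demand, capacity):
    """Exact E[min(D, capacity)] for D ~ Poisson(mean_demand).

    Uses the tail-sum identity E[min(D, c)] = sum_{k=0}^{c-1} P(D > k).
    Always returns a 1-D array, one value per mean demand.
    """
    mean_demand = np.atleast_1d(np.asarray(mean_demand, dtype=float))
    k = np.arange(capacity)[:, None]           # shape (capacity, 1)
    return stats.poisson.sf(k, mean_demand[None, :]).sum(axis=0)


def run_experiment(n=100_000, capacity=12, seed=0):
    """Simulate capped sales and compare two ways of grouping the errors."""
    rng = np.random.default_rng(seed)

    # Assumption: true mean demand varies across items and is known exactly.
    mean_demand = rng.uniform(5.0, 15.0, size=n)
    demand = rng.poisson(mean_demand)           # unconstrained demand
    sales = np.minimum(demand, capacity)        # what is actually observed

    # Forecast of *constrained* sales and probability of hitting capacity.
    pred_sales = expected_constrained_sales(mean_demand, capacity)
    p_hit = stats.poisson.sf(capacity - 1, mean_demand)   # P(D >= capacity)

    error = pred_sales - sales
    hit = sales == capacity                     # backward-looking: outcome
    likely_hit = p_hit > 0.5                    # forward-looking: prediction

    return {
        "mean_demand": demand.mean(),
        "mean_sales": sales.mean(),
        "bias_overall": error.mean(),
        "bias_given_actual_hit": error[hit].mean(),
        "bias_given_actual_no_hit": error[~hit].mean(),
        "bias_given_predicted_hit": error[likely_hit].mean(),
        "bias_given_predicted_no_hit": error[~likely_hit].mean(),
    }


if __name__ == "__main__":
    for name, value in run_experiment().items():
        print(f"{name:>30s}: {value:+.3f}")
```

Under these assumptions, mean sales fall below mean demand, so a model trained on sales as if they were demand would underforecast. The forecast is unbiased overall, yet it appears to underforecast on the days capacity was hit and to overforecast on the other days, purely because the groups were selected on the outcome. Grouping by the predicted hit probability removes that apparent bias.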