Talks - Krishi Sharma: Trust Fall: Three Hidden Gems in MLFlow
Discover three lesser-known MLflow features: autologging across frameworks, Git commit tracking for reproducibility, and best practices for data preservation and backup.
- MLflow’s autolog feature automatically detects ML frameworks and tracks relevant metrics/parameters without manual configuration (sketch below)
- Git commit hash logging in MLflow provides traceability between code versions and model metrics, enabling reproducibility (sketch below)
- Regular code commits and database backups are critical: one project lost all metrics when the MLflow database was accidentally deleted
- MLflow organizes experiments hierarchically, with experiments containing individual runs, each with a unique hash ID for tracking (sketch below)
- The tool supports multiple ML frameworks, including PyTorch, TensorFlow, and newer LLM frameworks
- MLflow provides built-in visualization capabilities to compare different experiment runs and track metric changes over time (sketch below)
- The system includes artifact storage functionality that can integrate with S3 or local storage to track model files and data (sketch below)
- Custom metrics and parameters can be logged alongside framework-specific metrics for comprehensive experiment tracking (sketch below)
- MLflow can be run entirely locally with a SQLite database, though cloud backup is recommended (covered in the artifact-storage sketch below)
- The tool helps build trust in ML applications by maintaining clear documentation and providing reproducible results that can be audited
- Pre-commit hooks can be used to ensure code is committed before executing MLflow runs, maintaining version control integrity (sketch below)
- The model registry feature enables versioning and organizing models for deployment (sketch below)
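A minimal sketch of the autologging point above: calling mlflow.autolog() once lets MLflow detect the framework in use and record parameters, metrics, and the trained model without further configuration. The scikit-learn model and synthetic data here are illustrative assumptions, not taken from the talk.

```python
import mlflow
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# One call: MLflow detects supported frameworks (scikit-learn, PyTorch,
# TensorFlow, ...) and logs params, metrics, and the model automatically.
mlflow.autolog()

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

with mlflow.start_run(run_name="autolog-demo"):
    RandomForestClassifier(n_estimators=100).fit(X, y)
    # No explicit log_* calls needed; the fit above was captured by autolog.
```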
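For the git-commit point: MLflow records the source commit as the mlflow.source.git.commit tag in some execution modes (e.g. MLflow Projects); when running a plain script, the tag can be set explicitly. A rough sketch, assuming the script runs inside a git working tree:

```python
import subprocess

import mlflow

def current_git_commit() -> str:
    # Assumes git is installed and the script lives in a git repository.
    return subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip()

with mlflow.start_run():
    # Tagging the run with the exact commit ties every metric back to the
    # code that produced it, which is what makes results reproducible and auditable.
    mlflow.set_tag("mlflow.source.git.commit", current_git_commit())
```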
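To illustrate the experiment/run hierarchy: an experiment groups runs, and each run gets its own unique ID. The experiment name, parameter grid, and score formula below are placeholders.

```python
import mlflow

mlflow.set_experiment("churn-model")  # hypothetical experiment name

for n_estimators in (50, 100, 200):
    with mlflow.start_run(run_name=f"rf-{n_estimators}") as run:
        mlflow.log_param("n_estimators", n_estimators)
        # Stand-in for a real validation score.
        mlflow.log_metric("val_accuracy", 0.80 + n_estimators / 1000)
        print(run.info.run_id)  # the unique ID used to look this run up later
```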
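The run-comparison point is easiest to see in the MLflow UI, but the same comparison can be done programmatically with mlflow.search_runs. This assumes a recent MLflow version where search_runs accepts experiment_names, and it reuses the hypothetical "churn-model" experiment from the previous sketch.

```python
import mlflow

# Every run of the experiment comes back as one row of a pandas DataFrame,
# with logged params and metrics as columns.
runs = mlflow.search_runs(experiment_names=["churn-model"])
print(
    runs[["run_id", "params.n_estimators", "metrics.val_accuracy"]]
    .sort_values("metrics.val_accuracy", ascending=False)
)
```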
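For the artifact-storage and local-SQLite points: the tracking store (where metrics and params live) and the artifact store (where files live) are configured separately. The SQLite path and S3 bucket below are placeholder assumptions; writing to S3 additionally requires boto3 and AWS credentials, and the talk's warning about lost metrics is exactly why the SQLite file should be backed up.

```python
import mlflow

# Metrics and params go into a local SQLite file; back this file up,
# since deleting it loses every logged metric.
mlflow.set_tracking_uri("sqlite:///mlflow.db")

# Artifacts (models, plots, data snapshots) can live on local disk or in S3.
experiment_id = mlflow.create_experiment(
    "churn-model-s3",
    artifact_location="s3://my-mlflow-artifacts/churn",  # placeholder bucket
)

with mlflow.start_run(experiment_id=experiment_id):
    # Assumes this file was produced earlier in the script.
    mlflow.log_artifact("confusion_matrix.png")
```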
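Custom values can sit next to what autolog captures: anything the framework integration misses is logged explicitly in the same run. The data split and the extra param/metric below are illustrative.

```python
import mlflow
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

mlflow.autolog()  # framework-specific params/metrics come for free

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

with mlflow.start_run():
    model = GradientBoostingClassifier().fit(X_tr, y_tr)
    # Explicit logging for anything autolog does not know about.
    mlflow.log_param("data_snapshot", "2024-01-15")  # illustrative value
    mlflow.log_metric("holdout_f1", f1_score(y_te, model.predict(X_te)))
```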
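The talk's recommendation is hook-based; as a rough sketch of the same idea (an adjacent technique, not necessarily the speaker's exact setup), a training script can refuse to start a run while the working tree has uncommitted changes, so the logged commit hash always matches the code that actually ran.

```python
import subprocess

import mlflow

def assert_clean_working_tree() -> None:
    # `git status --porcelain` prints one line per modified or untracked file;
    # empty output means everything is committed.
    dirty = subprocess.check_output(["git", "status", "--porcelain"], text=True)
    if dirty.strip():
        raise RuntimeError(
            "Uncommitted changes detected; commit before starting an MLflow run."
        )

assert_clean_working_tree()  # fail fast before any tracking happens

with mlflow.start_run():
    mlflow.log_param("guarded_run", True)  # placeholder for the real training code
```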
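Finally, for the model registry: logging a model with registered_model_name both stores the artifact and creates a new version under that name, ready to be promoted toward deployment. This assumes a database-backed tracking store (e.g. the SQLite URI above), which the registry requires; the model and registry names are placeholders.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

mlflow.set_tracking_uri("sqlite:///mlflow.db")  # registry needs a DB-backed store

X, y = make_classification(n_samples=200, random_state=1)

with mlflow.start_run():
    model = LogisticRegression(max_iter=1000).fit(X, y)
    # Creates (or bumps the version of) "churn-classifier" in the registry.
    mlflow.sklearn.log_model(model, "model", registered_model_name="churn-classifier")
```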