Juan Luis Cano Rodríguez - Who needs ChatGPT? Rock solid AI pipelines with Hugging Face and Kedro

AI Automation

Learn how to build production-ready ML pipelines by combining Kedro's software engineering best practices with Hugging Face's state-of-the-art AI models and tools.

Key takeaways

Kedro is an open-source framework that applies software engineering best practices to data science and ML pipelines, helping teams move from experiments to production
The framework decouples I/O operations from computation by separating datasets (inputs/outputs) from nodes (computation steps), making pipelines more maintainable
Kedro projects follow a standardized template structure, with clean separation between configuration, data, notebooks and source code
The data catalog provides a declarative way to define datasets and their locations (local files, S3, databases, etc.), abstracting away data access details
Pipelines are defined as directed acyclic graphs (DAGs) of nodes, with clear dependencies between computation steps
Kedro integrates with major orchestration platforms such as Airflow, Argo, and Kubeflow while remaining orchestrator-agnostic
The framework supports experiment tracking and can connect with MLflow through plugins
Kedro is extensible through hooks and plugins, following patterns similar to tools like pytest
While not a full MLOps solution, Kedro focuses on providing solid foundations for building maintainable ML pipelines
The project is now part of the Linux Foundation AI & Data, with multiple stakeholders including McKinsey, Société Générale and others
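
To make the datasets/nodes split and the DAG idea above more concrete, here is a minimal sketch in plain Python. It deliberately does not use Kedro's actual API: `Node`, `run`, the dictionary used as a catalog, and the dataset names (`raw_texts`, `clean_texts`, `word_counts`) are all hypothetical and exist only to illustrate the pattern. Node functions are pure and never touch storage; the runner resolves their inputs and outputs by name and derives the execution order from the declared dependencies.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import Any, Callable


@dataclass
class Node:
    """Hypothetical stand-in for a pipeline node: a pure function plus dataset names."""
    func: Callable[..., Any]
    inputs: list[str]
    output: str


def clean(raw: list[str]) -> list[str]:
    # Pure computation: no file or network access in here.
    return [text.strip().lower() for text in raw if text.strip()]


def count_words(cleaned: list[str]) -> dict[str, int]:
    counts: dict[str, int] = {}
    for text in cleaned:
        for word in text.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def run(nodes: list[Node], catalog: dict[str, Any]) -> dict[str, Any]:
    """Run nodes in dependency order, reading and writing datasets by name."""
    producers = {n.output: n for n in nodes}
    # Map each output to the upstream outputs it depends on (the DAG edges).
    graph = {
        n.output: {name for name in n.inputs if name in producers}
        for n in nodes
    }
    for name in TopologicalSorter(graph).static_order():
        n = producers[name]
        catalog[name] = n.func(*(catalog[i] for i in n.inputs))
    return catalog


# Nodes are declared out of order on purpose: the DAG decides the order.
nodes = [
    Node(count_words, inputs=["clean_texts"], output="word_counts"),
    Node(clean, inputs=["raw_texts"], output="clean_texts"),
]

# Assumption: an in-memory dict plays the role of the data catalog here.
catalog = {"raw_texts": ["  Hello world ", "", "hello Kedro"]}
result = run(nodes, catalog)
print(result["word_counts"])
```

Because the functions only see in-memory objects, swapping where `raw_texts` comes from (a local file, S3, a database) would only change the catalog side, not the computation.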

