Ionut Barbu & Tim Brakenhoff - How to build production-ready data science pipelines with Kedro
Learn how to build production-ready data science pipelines using Kedro, a framework that provides structure, organization, and customization options for data projects.
- Conferences and workshops in data science can help improve code quality and organization.
- Kedro provides structure and organization for data science projects.
- Predefined dataset types for common formats such as CSV, Excel, and Parquet are built into Kedro (see the catalog sketch after this list).
- Kedro pipelines form a DAG (Directed Acyclic Graph), which can be visualized with Kedro-Viz.
- Kedro can be used to create a reproducible, maintainable, and modular data science pipeline.
- A Data Catalog is essential for declaring where datasets live and for tracking dataset versions.
- Kedro can integrate with MLflow for versioning and tracking machine learning models (see the tracking sketch after this list).
- YAML files hold configuration, such as the Data Catalog and pipeline parameters, keeping settings separate from code.
- Kedro supports modular pipelines that can be packaged and reused across projects.
- Kedro pipelines can be deployed in multiple ways, such as on AWS, on Azure, or manually.
- Adopting Kedro's structure and organization adds value to a project from the start.
- Kedro helps manage code changes and intermediate datasets.
- Kedro's Pipeline abstraction stitches multiple nodes together (see the pipeline sketch after this list).
- Kedro can be used to create a structured data catalog.
- Use Kedro to create high-quality code and useful data engineering pipelines.
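
A minimal sketch of a Kedro Data Catalog, assuming a recent Kedro version (older releases spell the types `CSVDataSet` etc.); the entry names and file paths are illustrative, not from the talk:

```yaml
# conf/base/catalog.yml -- declares where each dataset lives and how it is stored
companies:
  type: pandas.CSVDataset            # raw input read from CSV
  filepath: data/01_raw/companies.csv

shuttles:
  type: pandas.ExcelDataset          # raw input read from Excel
  filepath: data/01_raw/shuttles.xlsx

model_input_table:
  type: pandas.ParquetDataset        # intermediate result stored as Parquet
  filepath: data/03_primary/model_input_table.parquet
  versioned: true                    # keep a timestamped copy from every run
```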
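
Nodes are plain Python functions, and Kedro infers the DAG from the dataset names each node declares as inputs and outputs. A minimal sketch of stitching two nodes into a pipeline, with illustrative function and column names:

```python
# pipeline.py -- Kedro resolves the execution order from the input/output
# names, which refer to entries in the Data Catalog
import pandas as pd
from kedro.pipeline import Pipeline, node


def preprocess_companies(companies: pd.DataFrame) -> pd.DataFrame:
    """Clean the raw companies table."""
    companies["company_rating"] = (
        companies["company_rating"].str.replace("%", "").astype(float)
    )
    return companies


def create_model_input_table(
    companies: pd.DataFrame, shuttles: pd.DataFrame
) -> pd.DataFrame:
    """Join the cleaned tables into one model input table."""
    return shuttles.merge(companies, left_on="company_id", right_on="id")


def create_pipeline() -> Pipeline:
    return Pipeline(
        [
            node(preprocess_companies, inputs="companies", outputs="preprocessed_companies"),
            node(
                create_model_input_table,
                inputs=["preprocessed_companies", "shuttles"],
                outputs="model_input_table",
            ),
        ]
    )
```

Running `kedro viz` (or `kedro viz run` with newer versions of the plugin) renders this DAG interactively in the browser.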
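
For the MLflow integration, a plugin such as kedro-mlflow can wire tracking into the project automatically; the sketch below shows the underlying idea by logging directly from inside a node. The model, target column, and metric are illustrative assumptions:

```python
# train_node.py -- a Kedro node that fits a model and records the run in MLflow
import mlflow
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score


def train_and_track(model_input_table: pd.DataFrame) -> LinearRegression:
    """Fit a regression model and log its parameters and metrics to MLflow."""
    X = model_input_table.drop(columns=["price"])  # assumes numeric features
    y = model_input_table["price"]

    model = LinearRegression()
    model.fit(X, y)

    with mlflow.start_run():
        mlflow.log_param("model_type", "LinearRegression")
        mlflow.log_metric("r2", r2_score(y, model.predict(X)))
    return model
```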