Tam-Sanh Nguyen - Writing and Scaling Collaborative Data Pipelines with Kedro

Learn how to write and scale collaborative data pipelines with Kedro, an open-source Python framework that simplifies and standardizes pipeline code, making it more maintainable and scalable.

Key takeaways
  • Data pipelines start small but can grow complex and challenging to maintain.
  • Kedro is an open-source Python framework that simplifies and standardizes data pipelines, structuring them as small, reusable nodes composed into pipelines, which makes them more maintainable and scalable (see the pipeline sketch after this list).
  • Kedro separates configuration from code: data sources live in a data catalog and tunable values in parameter files, so pipeline behavior can be changed without editing the pipeline itself (see the configuration sketch after this list).
  • Pipelines can grow too complex to hold in your head; Kedro-Viz renders them as an interactive dependency graph (see the note after this list).
  • Data pipelines require a balance between data engineering and data science, and Kedro helps facilitate that balance.
  • Kedro makes pipelines easy to package, deploy, and share, and provides mechanisms for versioning datasets and tracking pipeline versions.
  • Kedro is designed to be extensible and interoperates with other tools and technologies: its Kedro-Viz front end is built with React, and plugins such as kedro-airflow deploy pipelines to Airflow (see the note after this list).
  • Data pipelines are often hard to maintain because they lack standardized organization and structure; Kedro addresses this directly.
  • Kedro provides standardized conventions for organizing and naming pipeline components, such as its layered data directories, making complex pipelines easier to understand and work with.
  • Writing data pipelines can be viewed as a form of “altruistic programming”: code written deliberately so that others can understand and extend it.
  • Kedro aims to promote a culture of collaboration and standardized practices in the field of data engineering and science.
  • Building data pipelines resembles “audio engineering”: just as an engineer mixes raw recordings into a polished track, a pipeline refines raw data into meaningful insights, and Kedro provides the tooling for that refinement.
  • Kedro is designed to be flexible and adaptable, allowing it to be used in a wide variety of contexts and applications.
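
The sketches below make these takeaways concrete. First, the node-and-pipeline structure: a minimal sketch using Kedro's public kedro.pipeline API, with the dataset and function names invented for illustration.

```python
from kedro.pipeline import Pipeline, node


def preprocess_companies(companies):
    # A node is a plain Python function: datasets in, datasets out.
    companies["iata_approved"] = companies["iata_approved"] == "t"
    return companies


def create_pipeline(**kwargs) -> Pipeline:
    # Nodes are wired together by dataset name; Kedro infers the
    # execution order from these input/output dependencies.
    return Pipeline(
        [
            node(
                func=preprocess_companies,
                inputs="companies",  # loaded via the data catalog
                outputs="preprocessed_companies",
                name="preprocess_companies_node",
            ),
        ]
    )
```

Because every node declares its inputs and outputs by name, a teammate can extend the pipeline without reading every function body, which is exactly the standardized, collaborative structure the talk emphasizes.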
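Second, the configuration system. Kedro declares datasets in a YAML data catalog under conf/, so I/O lives outside the code. The entries below follow the conventions in Kedro's documentation, though exact dataset type names vary across Kedro versions and the file paths here are illustrative.

```yaml
# conf/base/catalog.yml - declares where each named dataset lives
companies:
  type: pandas.CSVDataSet
  filepath: data/01_raw/companies.csv

preprocessed_companies:
  type: pandas.ParquetDataSet
  filepath: data/02_intermediate/preprocessed_companies.parquet
  versioned: true  # keep a timestamped copy of each run's output
```

Parameters work the same way: tunable values go in conf/base/parameters.yml and are injected into nodes as inputs (for example, params:test_size), so changing pipeline behavior never requires editing pipeline code. Note the numbered data directories (01_raw, 02_intermediate, and so on), part of the standardized layout mentioned above.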
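Finally, visualization and deployment. Running kedro viz inside a project starts Kedro-Viz, the React-based tool that draws the pipeline as an interactive graph of nodes and datasets. For deployment, plugins such as kedro-airflow can convert a Kedro project into an Airflow DAG, so the same standardized pipeline that runs locally during development can run on a production scheduler.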