Datta & Rodríguez - Building the composable Python data stack with Kedro & Ibis | PyData London 2024
Build a composable Python data stack with Kedro, a data pipeline framework, and Ibis, a portable dataframe library that compiles queries for many backends, for efficient data processing and flexible pipeline reuse.
- Kedro is a Python framework for building data pipelines that integrates with Ibis for querying data.
- The goal is to process data with Ibis and create a data pipeline with Kedro.
- Through Ibis, a Kedro pipeline can run against various backends, including DuckDB, Postgres, and more.
- A key feature of Kedro is that it separates data loading and saving from the data processing logic in the Python code, making the data pipeline easier to read and modify.
- The catalog is a central configuration file that declares the datasets a pipeline reads and writes, including how each one is loaded and saved.
- Kedro provides a way to create a model input table by consolidating multiple datasets.
- Ibis provides support for various operations, including filter, group by, aggregation, and sort.
- Ibis also supports joins and allows for chaining of operations.
- Kedro provides an easy way to create a pipeline with a sequence of nodes.
- The pipeline can be reused across multiple backends, making it a flexible solution for data processing.
- The catalog can be easily extended with more dataset entries without changing the pipeline code.
- Ibis provides advanced features like support for user-defined functions (UDFs) and data type conversion (casting).
- The speakers emphasize that Kedro is designed to be extensible and flexible, and that it can be used with various databases and data processing backends.