Juan Luis Cano Rodríguez - Who needs ChatGPT? Rock solid AI pipelines with Hugging Face and Kedro

Learn how to build rock-solid AI pipelines using Hugging Face and Kedro, a Python framework for data pipelines that integrates seamlessly with various data storage and processing systems.

Key takeaways
  • Kedro is a Python framework for data pipelines that can easily integrate with various data storage and processing systems.
  • Fill-mask is a task in which a masked language model predicts the tokens hidden behind a mask placeholder; the author builds on such a model to create a summarizer for a list of posts.
  • The summarizer is built with the Hugging Face transformers library together with tools from the PyData ecosystem.
  • Kedro has a wide range of integrations, including Airflow, Argo, Dask, Kubeflow, and Prefect, which make it easier to deploy and orchestrate pipelines on existing infrastructure.
  • The author scaffolds a pipeline with the Kedro CLI (`kedro pipeline create`) and defines a node wrapping the fill-mask model, which takes in a list of posts and returns a summarized version of them.
  • The author also defines a Kedro dataset in the data catalog and points it at a MinIO bucket, which Kedro reads through MinIO's S3-compatible API.
  • Kedro pipelines can handle large-scale data processing and, through these orchestrator integrations, scale horizontally, making Kedro suitable for production use cases.
  • The author stresses the importance of disciplined data processing and recaps the key steps to follow when building a data pipeline with Kedro.
  • Kedro has a strong focus on software engineering principles and is designed to make it easy to build and maintain large-scale data pipelines.
  • The author uses Kedro-Viz to visualize the pipeline and inspect how data flows through it.
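
Reading from MinIO in Kedro is usually done declaratively through the data catalog, since MinIO exposes an S3-compatible API that fsspec/s3fs can talk to. A hypothetical catalog entry (bucket, file, and credential names invented for illustration; the dataset class is `pandas.CSVDataSet` in older kedro-datasets releases) might look like:

```yaml
# conf/base/catalog.yml -- dataset backed by a MinIO bucket
posts:
  type: pandas.CSVDataset
  filepath: s3://my-bucket/posts.csv
  credentials: minio_credentials

# conf/local/credentials.yml -- forwarded to fsspec/s3fs
minio_credentials:
  key: my-access-key
  secret: my-secret-key
  client_kwargs:
    endpoint_url: http://localhost:9000
```

The `endpoint_url` override is what redirects S3 traffic from AWS to the local MinIO server.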
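
A Kedro node is just a plain Python function wired into a pipeline, so the summarizer node described above could be sketched roughly as follows. The function name, dataset names, and the injected `summarizer` callable are illustrative assumptions, not taken from the talk:

```python
from typing import Callable

def summarize_posts(posts: list[str], summarizer: Callable) -> list[str]:
    """Summarize each post with an injected model callable.

    `summarizer` stands in for a Hugging Face pipeline: a callable
    that maps one text to a list of dicts containing a
    "summary_text" key, mirroring transformers' summarization
    pipeline output shape.
    """
    return [summarizer(post)[0]["summary_text"] for post in posts]

# A stub standing in for a real model, so the node logic can be
# exercised without downloading any weights:
stub = lambda text: [{"summary_text": text[:20]}]
summaries = summarize_posts(["a long post about Kedro pipelines"], stub)
```

In a real project this function would be registered with something like `node(summarize_posts, inputs=[...], outputs=...)` in the pipeline definition; injecting the model as an input keeps the node testable.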