Juan Luis Cano Rodríguez - Who needs ChatGPT? Rock solid AI pipelines with Hugging Face and Kedro
Learn how to build rock-solid AI pipelines using Hugging Face and Kedro, a Python framework for data pipelines that integrates seamlessly with various data storage and processing systems.
- Kedro is a Python framework for data pipelines that can easily integrate with various data storage and processing systems.
- The author demonstrates a fill-mask model, which predicts masked (missing) tokens in text, and builds a summarizer that condenses a list of posts.
- The summarizer is built with the Hugging Face transformers library and tools from the PyData ecosystem.
- Kedro has a wide range of integrations, including Airflow, Argo, Dask, Kubeflow, and Prefect, which can make it easier to create and manage data pipelines.
- The author defines a Kedro pipeline with a model node that takes in a list of posts and returns a summarized version of them, and runs it with the Kedro CLI.
- The author also declares a Kedro dataset in the data catalog and populates it with data from a MinIO bucket.
- Kedro can be used to handle large-scale data processing and can be scaled horizontally, making it suitable for production use cases.
- The author highlights the importance of proper data processing and summarizes the key steps to follow when building a data pipeline with Kedro.
- Kedro has a strong focus on software engineering principles and is designed to make it easy to build and maintain large-scale data pipelines.
- The author uses Kedro-Viz to visualize the pipeline and inspect the flow of data through it.
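The MinIO-backed dataset would typically be declared in Kedro's `catalog.yml` rather than created imperatively. A sketch, in which the dataset name, bucket, file path, and credentials key are all illustrative assumptions:

```yaml
# conf/base/catalog.yml -- names and paths are illustrative
posts:
  type: pandas.CSVDataSet
  filepath: s3://my-bucket/posts.csv  # MinIO exposes an S3-compatible API
  credentials: minio_credentials      # resolved from conf/local/credentials.yml
```

Because MinIO speaks the S3 protocol, the same `s3://` dataset definition works against either AWS S3 or a local MinIO instance, with only the credentials (and endpoint) differing.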