Kubeflow for Machine Learning • Holden Karau & Adi Polak

Learn how Kubeflow simplifies machine learning model training with its pluggable architecture and user-focused API, designed to make data science tools more accessible and collaborative.

Key takeaways

Kubeflow is designed to simplify the process of machine learning model training
Kubeflow Pipelines does not have automatic filter pushdown and query pushdown, unlike Spark
Ray and Dask offer APIs closer to Spark's, but are built on a different underlying architecture
Kubeflow’s design principle is to provide a simple, pluggable architecture for data science tools
The primary focus of Kubeflow is on providing a simple API for data scientists, with a strong emphasis on collaboration
There is no automatic filter pushdown in Ray and Dask, unlike Spark
Ray and Dask share similarities with Spark in terms of their APIs, but have different architectures
Kubeflow Pipelines allows users to define and execute data pipelines, with a focus on simplicity and ease of use
Inspired by the concept of functional programming, Kubeflow aims to simplify the process of machine learning model training
Kubeflow Pipelines can be used to bridge the gap between Spark and frameworks such as TensorFlow and PyTorch
Kubeflow’s metadata tracking allows users to track and manage metadata for their models and pipelines
Kubeflow Pipelines provides a simple, pluggable architecture for data science tools, making it easy to integrate with existing tools
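
The takeaway about missing filter pushdown can be made concrete with a minimal sketch. It uses plain Python functions as stand-ins for pipeline steps, not the Kubeflow Pipelines API, and the sample data and function names (`load_all`, `filter_step`, `load_filtered`) are hypothetical. The assumption it illustrates: when no optimizer moves a downstream filter into the upstream loader, the pipeline author has to apply the predicate by hand in the loading step.

```python
import csv
import io

# Hypothetical sample data standing in for a large source table.
RAW_CSV = """user_id,country,amount
1,US,10.0
2,DE,25.5
3,US,7.25
4,FR,3.0
"""


def load_all(raw):
    """Loading step without pushdown: materializes every row."""
    return list(csv.DictReader(io.StringIO(raw)))


def filter_step(rows, country):
    """Downstream step: filters only after all rows were already loaded."""
    return [r for r in rows if r["country"] == country]


def load_filtered(raw, country):
    """Manual pushdown: the loader applies the predicate while reading,
    so only matching rows are materialized and handed to the next step."""
    reader = csv.DictReader(io.StringIO(raw))
    return [r for r in reader if r["country"] == country]


# Both variants produce the same result...
naive = filter_step(load_all(RAW_CSV), "US")
pushed = load_filtered(RAW_CSV, "US")

# ...but they differ in how many rows cross the step boundary.
rows_passed_naive = len(load_all(RAW_CSV))
rows_passed_pushed = len(pushed)
```

In an engine with automatic pushdown, the first variant could be rewritten into the second by the query optimizer. In this sketch, the rewrite is the author's responsibility.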