Scaling Machine Learning with Spark • Adi Polak & Holden Karau • GOTO 2023

Discover how to scale machine learning with Apache Spark, covering the infrastructure and engineering behind it, data-format translators, feature engineering, and scheduling for efficient and scalable solutions.

Key Takeaways

  • The importance of treating the deployment of machine learning models as an infrastructure and engineering problem, not just a modeling one
  • The need for a translator between data formats and training frameworks, such as getting Parquet data produced by Spark into PyTorch or TensorFlow (see the Parquet-to-PyTorch sketch after this list)
  • The value of doing feature engineering up front in Spark, and the trade-offs that involves (see the Spark feature-engineering sketch after this list)
  • The importance of leveraging existing tools and infrastructure, such as Spark, and integrating them with the rest of the machine learning stack
  • The role of scheduling when scaling machine learning workloads, and the need for a more efficient and scalable approach
  • The importance of weighing the pros and cons of different tools and frameworks, such as PyTorch versus TensorFlow
  • The need for a more streamlined and user-friendly approach to machine learning, including notebooks with inline explanations and feedback
  • The role of data infrastructure and the trade-offs it involves
  • The importance of feedback and technical review in improving the quality of the book
  • The value of having a conversational and approachable writing style
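The feature-engineering point is easiest to see with a small sketch. Everything below is illustrative rather than taken from the talk: the input path, column names (age, income, event_ts, label), and the specific transformations are assumptions. The idea it shows is keeping the heavy data preparation inside Spark, where the data already lives, and writing out plain numeric columns for the training side to consume.

```python
# Illustrative only: paths, column names, and transformations are assumptions,
# not taken from the talk. The point is keeping heavy data preparation in Spark.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("feature-engineering-sketch").getOrCreate()

raw = spark.read.parquet("/data/raw_events")  # hypothetical input path

features = (
    raw
    # Derive numeric features on the cluster, where the data already lives.
    .withColumn("log_income", F.log1p("income"))
    .withColumn("is_weekend", F.dayofweek("event_ts").isin(1, 7).cast("int"))
    .select("age", "log_income", "is_weekend", "label")
)

# Hand-off point: plain numeric Parquet columns that a training framework can read back.
features.write.mode("overwrite").parquet("/data/train")
```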
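Once Spark has written the engineered features out as Parquet, something still has to turn those files into PyTorch or TensorFlow tensors, which is the "translator" the takeaways refer to. The sketch below is one minimal, single-machine way to do that with pyarrow and a hand-rolled ParquetDataset class; it is not the approach presented in the talk (libraries such as Petastorm target the distributed version of this problem), and the paths and column names are the hypothetical ones from the previous sketch.

```python
# Illustrative only: ParquetDataset is a hand-rolled helper, not a library API,
# and it loads the whole dataset into memory, which only suits small data.
import pyarrow.parquet as pq
import torch
from torch.utils.data import Dataset, DataLoader

class ParquetDataset(Dataset):
    def __init__(self, path, feature_cols, label_col):
        # Read the Parquet files Spark wrote (a directory of part files is fine).
        df = pq.read_table(path, columns=feature_cols + [label_col]).to_pandas()
        self.features = torch.tensor(df[feature_cols].to_numpy(), dtype=torch.float32)
        self.labels = torch.tensor(df[label_col].to_numpy(), dtype=torch.float32)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]

# Columns match the hypothetical output of the feature-engineering sketch above.
dataset = ParquetDataset("/data/train", ["age", "log_income", "is_weekend"], "label")
loader = DataLoader(dataset, batch_size=64, shuffle=True)

for batch_features, batch_labels in loader:
    pass  # feed each batch to the model's training step
```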