Rust Vienna Jan 2024 - Serverless Data Pipelines in Rust by Michele Vigilante

Discover how Rust and DataFusion can revolutionize your data engineering workflows, enabling high-performance and scalable data pipelines for large-scale data processing.

Key takeaways
  • A data pipeline is a set of data processing steps that transfers data from one system to another, transforming and consolidating it along the way.
  • Rust is a great fit for data engineering due to its performance and scalability.
  • Apache Arrow is a columnar in-memory data format that is very fast and has a small at-rest size.
  • DataFusion is a query engine and data processing framework that was authored by Andy Grove and is now maintained by the Apache Arrow PMC.
  • DataFusion exposes all of the Arrow primitive types and integrates very well with Arrow.
  • DataFusion can read from local file systems, object stores, and databases.
  • DataFusion has very good error handling and safety guarantees.
  • The author has implemented data pipelines with DataFusion in Rust and uses it in his work.
  • DataFusion can be used for large-scale data processing and performs very well.
  • The author recommends using DataFusion for data engineering tasks that require high performance and scalability.
  • The author also recommends using Rust for data engineering tasks due to its performance and scalability.
  • The author has given talks on DataFusion and Rust and has written blog posts on the topic.
  • The author is a data engineer at Radency and has previously worked on automation software in C/C++.
  • The author has also worked on other projects such as Apache Arrow and Polars.
  • The author is a big fan of Rust and thinks it is a great language for data engineering.
  • The author also thinks that DataFusion is a great tool for data engineering and recommends it to others.
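
To make the pipeline definition in the takeaways concrete, here is a minimal sketch in Rust using only the standard library. It is not code from the talk and it does not use the Arrow or DataFusion APIs. The `Reading` and `Batch` types, the "sensor,celsius" input format, and the plausibility range are all hypothetical, chosen only to show the three steps (extract, transform, consolidate) and the idea of holding data column by column rather than row by row.

```rust
use std::collections::BTreeMap;

/// One raw record as it might arrive from a source system (hypothetical schema).
#[derive(Debug, Clone, PartialEq)]
struct Reading {
    sensor: String,
    celsius: f64,
}

/// Column-oriented batch: one Vec per field. This only mimics the columnar
/// idea; it is not the Arrow memory format.
#[derive(Debug, Default, PartialEq)]
struct Batch {
    sensor: Vec<String>,
    celsius: Vec<f64>,
}

/// Extract: parse "sensor,celsius" lines and skip lines that do not parse.
fn extract(input: &str) -> Vec<Reading> {
    input
        .lines()
        .filter_map(|line| {
            let (sensor, value) = line.split_once(',')?;
            let celsius = value.trim().parse::<f64>().ok()?;
            Some(Reading { sensor: sensor.trim().to_string(), celsius })
        })
        .collect()
}

/// Transform: drop readings outside an assumed plausible range and pivot
/// the remaining rows into columns.
fn transform(rows: Vec<Reading>) -> Batch {
    let mut batch = Batch::default();
    for row in rows.into_iter().filter(|r| r.celsius > -50.0 && r.celsius < 60.0) {
        batch.sensor.push(row.sensor);
        batch.celsius.push(row.celsius);
    }
    batch
}

/// Consolidate: mean temperature per sensor. A BTreeMap keeps the output
/// in a deterministic (sorted) order.
fn consolidate(batch: &Batch) -> BTreeMap<String, f64> {
    let mut sums: BTreeMap<&str, (f64, usize)> = BTreeMap::new();
    for (sensor, celsius) in batch.sensor.iter().zip(&batch.celsius) {
        let entry = sums.entry(sensor.as_str()).or_insert((0.0, 0));
        entry.0 += *celsius;
        entry.1 += 1;
    }
    sums.into_iter()
        .map(|(sensor, (sum, n))| (sensor.to_string(), sum / n as f64))
        .collect()
}

fn main() {
    // Hypothetical input: one malformed line and one out-of-range reading.
    let raw = "a,20.0\nb,18.5\nc,not_a_number\na,22.0\na,999\n";
    let means = consolidate(&transform(extract(raw)));
    for (sensor, mean) in &means {
        println!("{sensor}: {mean:.1}");
    }
}
```

In a real pipeline a query engine such as DataFusion would typically take over the transform and consolidate steps, and Arrow arrays would replace the hand-rolled `Batch`. The sketch only shows the shape of the data flow.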