Akshay Gupta - When is a compiled language like Rust beneficial for Data Scientists? | SciPy 2024
Explore when Rust's performance benefits outweigh its learning curve for data science tasks. Learn key considerations for adopting Rust vs Python and practical hybrid approaches.
- Rust shows promising performance improvements (50-200x speedups) for data science workloads, but comes with a significant learning curve and maintainability challenges
- Python remains the best default choice for data scientists due to its ecosystem, flexibility, and low barrier to entry
- Polars (a Python library written in Rust) offers major performance gains without requiring direct Rust knowledge, making it a good middle-ground solution
- Selective use of Rust for computationally intensive components, while keeping the main codebase in Python, may be optimal; full rewrites in Rust are usually not justified
- Distribution and cross-compilation of Rust code present significant challenges compared to Python
- The memory safety benefits of Rust (Microsoft attributes ~70% of its security vulnerabilities to memory safety issues) are valuable but must be weighed against development time costs
- Team/org adoption of Rust faces barriers like:
  - Limited number of developers who can maintain the code
  - Longer development cycles
  - Steep learning curve for data scientists
  - Compilation overhead impacting interactive development
- The Rust compiler provides excellent guidance, but compile times slow down the interactive development workflow data scientists prefer
- Benchmarking shows Rust outperforms Numba for recursive calculations, but vectorized operations benefit less
- Consider the actual business impact of performance improvements: a 35-minute runtime instead of a 45-minute one may not meaningfully change workflows
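
The gap between a large per-component speedup and a modest change in total runtime can be sketched with Amdahl's law. The Python below is a minimal illustration and was not shown in the talk: the function names and the hot-path fractions are hypothetical, and only the 45-minute runtime and the 50-200x range come from the summary above.

```python
def overall_speedup(hot_fraction: float, component_speedup: float) -> float:
    """Amdahl's law: whole-pipeline speedup when only `hot_fraction` of the
    original runtime is accelerated by a factor of `component_speedup`."""
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    if component_speedup <= 0:
        raise ValueError("component_speedup must be positive")
    # The untouched part keeps its cost; the hot part shrinks by the speedup.
    return 1.0 / ((1.0 - hot_fraction) + hot_fraction / component_speedup)


def new_runtime(total_minutes: float, hot_fraction: float,
                component_speedup: float) -> float:
    """Estimated total runtime after accelerating only the hot component."""
    return total_minutes / overall_speedup(hot_fraction, component_speedup)


if __name__ == "__main__":
    total = 45.0  # minutes, the baseline runtime mentioned in the summary
    # Hypothetical shares of the runtime spent in the rewritten component.
    for fraction in (0.25, 0.50, 0.90):
        # Speedup range quoted in the summary for Rust rewrites.
        for speedup in (50.0, 200.0):
            minutes = new_runtime(total, fraction, speedup)
            print(f"hot path {fraction:.0%}, {speedup:.0f}x faster "
                  f"-> about {minutes:.1f} min total")
```

Under this simplified model, a drop from 45 to 35 minutes corresponds to accelerating only about 22-23% of the original runtime by 50-200x. Profiling to find how much time the hot path actually takes is therefore worth doing before committing to a Rust rewrite.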