Akshay Gupta - When is a compiled language like Rust beneficial for Data Scientists? | SciPy 2024
Explore when Rust's performance benefits outweigh its learning curve for data science tasks. Learn key considerations for adopting Rust vs Python and practical hybrid approaches.
- Rust shows promising performance improvements (50-200x speedups) for data science workloads, but comes with a significant learning curve and maintainability challenges
- Python remains the best default choice for data scientists due to its ecosystem, flexibility, and low barrier to entry
- Polars (a Python library written in Rust) offers major performance gains without requiring direct Rust knowledge, making it a good middle-ground solution
- Selective use of Rust for computationally intensive components while keeping the main codebase in Python may be optimal; full rewrites in Rust are usually not justified
- Distribution and cross-compilation of Rust code present significant challenges compared to Python
- Memory safety benefits of Rust (Microsoft attributes roughly 70% of its security vulnerabilities to memory safety issues) are valuable but must be weighed against development time costs
- Team/org adoption of Rust faces barriers such as:
    - Limited number of developers who can maintain the code
    - Longer development cycles
    - Steep learning curve for data scientists
    - Compilation overhead impacting interactive development
- The Rust compiler provides excellent guidance, but compile times slow the interactive development workflow data scientists prefer
- Benchmarking shows Rust outperforms Numba for recursive calculations, but vectorized operations benefit much less
- Consider the actual business impact of performance improvements: a 45-minute versus a 35-minute runtime may not meaningfully change workflows
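
The trade-off between rewriting only the hot path and the resulting end-to-end runtime can be estimated with Amdahl's law. This framing is not taken from the talk itself; it is a minimal sketch in which the 25% hot-path share is a hypothetical figure and the 100x component speedup is simply picked from the 50-200x range quoted above. The function names are illustrative only.

```python
def overall_speedup(fraction_accelerated: float, component_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when only part of a job gets faster.

    fraction_accelerated: share of the original runtime spent in the code
        that is rewritten (0.0 to 1.0).
    component_speedup: how many times faster the rewritten part runs.
    """
    if not 0.0 <= fraction_accelerated <= 1.0:
        raise ValueError("fraction_accelerated must be between 0 and 1")
    if component_speedup <= 0:
        raise ValueError("component_speedup must be positive")
    untouched = 1.0 - fraction_accelerated
    return 1.0 / (untouched + fraction_accelerated / component_speedup)


def new_runtime_minutes(
    baseline_minutes: float,
    fraction_accelerated: float,
    component_speedup: float,
) -> float:
    """Estimated runtime after accelerating one component of the pipeline."""
    return baseline_minutes / overall_speedup(fraction_accelerated, component_speedup)


if __name__ == "__main__":
    # Hypothetical pipeline: 45 minutes in total, 25% of it in a hot loop
    # that is rewritten in Rust and becomes 100x faster.
    estimate = new_runtime_minutes(45.0, 0.25, 100.0)
    print(f"Estimated runtime: {estimate:.1f} minutes")
```

Under these assumed numbers the job drops from 45 to roughly 34 minutes, even though the rewritten component itself is 100x faster. The untouched 75% of the pipeline dominates, which is one way to see why a large component-level speedup may not change the workflow.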