Talks - Bradley Dice: Hacking `import` for speed: how we wrote a GPU accelerator for pandas
Learn how NVIDIA's cuDF accelerates pandas workflows with GPU computing for 10-100x speedups, using import hacks for zero-code-change migration and automatic CPU fallback.
- `cudf.pandas` is an NVIDIA-developed GPU accelerator mode for pandas that enables zero-code-change acceleration of existing workflows, offering 10-100x speedups (see the usage sketch after this list)
- Works best for medium to large datasets (roughly the 5-20 GB range), with GPU memory being the main limiting factor
- Covers roughly 60-75% of the pandas API, automatically falling back to the CPU for unsupported operations
- Integrates with the broader RAPIDS ecosystem, including Dask for scaling beyond single-GPU memory limits (see the Dask sketch below)
- Available through conda and pip packages, and recently added to Google Colab with GPU runtime support
- Uses custom import machinery to intercept pandas imports and proxy them to GPU-accelerated implementations (see the import-hook sketch below)
- Includes built-in profiling tools to track which operations run on the GPU versus the CPU (see the profiling example below)
- Maintains compatibility with the broader Python data science ecosystem (matplotlib, scikit-learn, etc.)
- Passes over 90% of the pandas test suite, supporting its use as a reliable drop-in replacement
- Particularly effective for joins, groupby aggregations, and filtering (see the workload example below)
- Handles memory movement between CPU and GPU automatically and transparently to users
- Primary limitations: GPU memory constraints, incomplete API coverage, and some performance overhead from CPU/GPU transfers when falling back
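For reference, enabling the accelerator in a script looks roughly like the sketch below. It uses the documented `cudf.pandas.install()` entry point and assumes a machine with a supported NVIDIA GPU; the pip command in the comment is the CUDA 12 variant from the RAPIDS install guide, so adjust it for your CUDA version.

```python
# Install (CUDA 12 wheels; see the RAPIDS install guide for other versions):
#   pip install cudf-cu12 --extra-index-url=https://pypi.nvidia.com
import cudf.pandas

cudf.pandas.install()  # must run before pandas is first imported

import pandas as pd  # now a GPU-backed proxy with automatic CPU fallback

df = pd.DataFrame({"key": [1, 2, 1], "value": [10.0, 20.0, 30.0]})
print(df.groupby("key")["value"].mean())  # runs on the GPU when supported
```

Equivalently, `python -m cudf.pandas script.py` runs an unmodified script under the accelerator, and `%load_ext cudf.pandas` does the same in Jupyter.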
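The "hacking `import`" part builds on Python's standard `sys.meta_path` hook. The finder below is a toy illustration of that mechanism, not cuDF's actual implementation: it intercepts `import pandas`, loads the real library out-of-band, and serves every attribute through a proxy module; a real accelerator would return GPU-dispatching wrapper types at that point instead of forwarding directly.

```python
import importlib
import importlib.abc
import importlib.machinery
import sys
import types


class PandasProxyFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Toy import hook: intercepts `import pandas` (run in a fresh interpreter)."""

    def find_spec(self, fullname, path, target=None):
        if fullname == "pandas":
            return importlib.machinery.ModuleSpec(fullname, self)
        return None  # defer everything else to the normal import system

    def create_module(self, spec):
        return types.ModuleType(spec.name)  # empty shell, filled in below

    def exec_module(self, module):
        # Load the real pandas out-of-band: temporarily remove this hook
        # and the half-initialized proxy entry so a normal import can run.
        sys.meta_path.remove(self)
        sys.modules.pop("pandas", None)
        try:
            real = importlib.import_module("pandas")
        finally:
            sys.modules["pandas"] = module
            sys.meta_path.insert(0, self)
        # PEP 562 module-level __getattr__: forward every lookup to the
        # real library. A GPU accelerator would instead return wrappers
        # that dispatch to GPU kernels and fall back to these CPU objects.
        module.__getattr__ = lambda name: getattr(real, name)


sys.meta_path.insert(0, PandasProxyFinder())

import pandas as pd  # resolved by the finder above

print(pd.DataFrame({"a": [1, 2, 3]}).sum())  # forwarded to real pandas
```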
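The profiling tools surface as Jupyter magics; per the cuDF docs, usage looks roughly like the following (two notebook cells shown in one block, with a made-up dataframe):

```python
# --- Cell 1: enable the accelerator, then use pandas as usual ---
%load_ext cudf.pandas
import pandas as pd

df = pd.DataFrame({"key": [1, 2, 1, 2], "value": [1.0, 2.0, 3.0, 4.0]})

# --- Cell 2: report, per operation, GPU execution vs. CPU fallback ---
%%cudf.pandas.profile
df.groupby("key")["value"].mean()
```

A line-level variant, `%%cudf.pandas.line_profile`, attributes GPU and CPU execution to individual source lines.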
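For datasets beyond a single GPU's memory, RAPIDS pairs cuDF with Dask. A minimal sketch using the `dask_cudf` package (the parquet glob and column names are hypothetical):

```python
import dask_cudf

# Each partition is a cuDF DataFrame on the GPU, so the collection as a
# whole can exceed what fits in any single device's memory.
ddf = dask_cudf.read_parquet("big_dataset/*.parquet")

result = ddf.groupby("key")["value"].mean().compute()
print(result.head())
```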
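To make the join/groupby/filter point concrete, the snippet below is ordinary pandas code of the shape that benefits most, run here on synthetic data; under `cudf.pandas` it executes unmodified on the GPU:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 10_000_000

left = pd.DataFrame({"key": rng.integers(0, 1_000, n),
                     "value": rng.random(n)})
right = pd.DataFrame({"key": np.arange(1_000),
                      "weight": rng.random(1_000)})

merged = left.merge(right, on="key")            # join
means = merged.groupby("key")["value"].mean()   # groupby aggregation
heavy = merged[merged["weight"] > 0.9]          # filter
print(means.head(), len(heavy))
```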