Talks - Alex Monahan, Gabor Szarnyas: Python and SQL: Better Together, Powered by DuckDB
Learn how DuckDB brings SQL and Python together, delivering 10-100x performance gains on analytical workloads through vectorized processing, in-place handling of common data formats, and execution that spills to disk when data exceeds memory.
- DuckDB is an analytical SQL database designed to integrate seamlessly with Python workflows; it runs in-process rather than as a client-server system
- Key features include vectorized processing, multi-core utilization, and the ability to handle datasets larger than RAM by gracefully degrading to disk-based processing
- Reads and writes multiple data formats (Parquet, CSV, JSON) directly in place, without format conversion, and integrates with pandas, Arrow, and other Python ecosystem tools
- Can achieve 10-100x better performance than traditional solutions on analytical workloads by avoiding network overhead and optimizing for modern CPUs
- Offers flexible API options, including native SQL, a pandas-like DataFrame API, an Ibis interface, and experimental PySpark compatibility
- Focuses on single-node performance rather than distributed computing, aiming to delay the need for a cluster deployment
- Provides ACID transaction support, crash recovery, and persistent storage while keeping SQLite-like simplicity of deployment
- Runs anywhere Python runs: laptops, servers, browsers (via WebAssembly), and edge devices, with minimal dependencies
- MIT-licensed open-source project with over 2M monthly downloads, created by database experts and targeting analytical Python workloads
- Best suited for data science and analytics workflows rather than high-concurrency transactional use cases