Faster Pandas: Make your code run faster and consume less memory | Miki Tebeke, CEO 353solutions.

Make Pandas code more efficient and scalable by optimizing memory usage, avoiding loops, and leveraging parallel processing, profiling tools, and specific data types.

Key takeaways

Measure before optimizing: Measure the performance of code before optimizing it, so that effort is not spent on changes that provide no significant gain.
Understand the Python VM: Understand how the Python virtual machine (VM) works to optimize code better.
Use profiling tools: Use profiling tools like cProfile and line_profiler to measure the performance of code.
Avoid for loops in Pandas: Python-level for loops over rows are slow; use vectorized operations instead.
Use dtypes: Use specific dtypes when loading data from CSV to reduce memory usage.
Monitor memory usage: Monitor memory usage to detect anomalies and optimize code accordingly.
Use parallel processing: Use parallel processing libraries like Dask and PySpark to process large datasets.
Optimize for specific use cases: Optimize code for specific use cases and requirements.
Understand the data: Understand the data and its characteristics to optimize code accordingly.
Use NaN-aware operations: Use NaN-aware operations in Pandas to handle missing values efficiently.
Avoid guessing: Avoid guessing the performance of code and instead use profiling tools to measure it.
Know when to optimize: Know when to optimize code and when to consider alternative solutions.
Use timing and profiling: Use timing and profiling tools to measure the performance of code and identify bottlenecks.
Optimize for business value: Optimize code for business value and metrics rather than just performance.
Test and measure: Test and measure the performance of code before and after optimization.
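
Several of the takeaways above (measure first, prefer vectorized operations, pass dtypes when loading CSV, rely on NaN-aware operations) can be illustrated with a minimal sketch. This is not code from the talk: the data is synthetic, and the column names (city, price, qty), the helper functions, and the chosen dtypes are all hypothetical. Actual timings and memory savings depend on the data and the machine, so treat the numbers it prints as indicative only.

```python
import io
import math
import timeit

import numpy as np
import pandas as pd

# Synthetic, hypothetical data; column names are illustrative only.
rng = np.random.default_rng(0)
n = 50_000
df = pd.DataFrame({
    "city": rng.choice(["Haifa", "Oslo", "Lima"], size=n),
    "price": rng.uniform(1.0, 100.0, size=n),
    "qty": rng.integers(1, 10, size=n),  # values 1..9
})


def total_loop(frame):
    """Sum price * qty with a Python-level loop over rows."""
    total = 0.0
    for _, row in frame.iterrows():
        total += row["price"] * row["qty"]
    return total


def total_vectorized(frame):
    """Same sum, computed with whole-column (vectorized) operations."""
    return float((frame["price"] * frame["qty"]).sum())


# 1. Measure instead of guessing: time both versions on the same sample.
sample = df.head(5_000)
t_loop = timeit.timeit(lambda: total_loop(sample), number=1)
t_vec = timeit.timeit(lambda: total_vectorized(sample), number=1)
loop_total = total_loop(sample)
vec_total = total_vectorized(sample)
print(f"loop: {t_loop:.4f}s  vectorized: {t_vec:.4f}s")

# 2. Pass explicit dtypes when loading CSV and compare memory usage.
csv_text = df.to_csv(index=False)
default_df = pd.read_csv(io.StringIO(csv_text))
typed_df = pd.read_csv(
    io.StringIO(csv_text),
    # Assumed to suit this data: few distinct cities, small quantities.
    dtype={"city": "category", "price": "float32", "qty": "int8"},
)
default_bytes = int(default_df.memory_usage(deep=True).sum())
typed_bytes = int(typed_df.memory_usage(deep=True).sum())
print(f"default dtypes: {default_bytes:,} bytes  explicit dtypes: {typed_bytes:,} bytes")

# 3. NaN-aware operations: Series.sum() skips missing values by default.
s = pd.Series([1.0, np.nan, 3.0])
nan_sum = s.sum()
```

For profiling beyond a single timing, the standard library's cProfile can be run on the same functions, for example with python -m cProfile -s cumtime script.py, where script.py stands in for your own file.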
