Ian Ozsvald - Making Pandas Fly
Optimize your Pandas workflow with these high-performance tips: use categorical data, adopt nullable data types, reduce memory usage, and more, all explored further in the speaker's book and online resources.
- Use the categorical data type instead of strings, which are expensive in RAM and slow to operate on.
- Use nullable data types like `Int64` and `boolean` for better performance and to reduce data size.
- Installing the `bottleneck` library can improve the performance of certain operations.
- Using `Dask` can improve performance by splitting data into smaller chunks and processing them in parallel.
- Using `Modin` can extend the pandas idea in different ways.
- Think about using `float32` instead of `float64` for numeric operations.
- Consider installing `numexpr` for faster computations.
- Use `string` instead of `object` for text columns.
- The `category` type is not a magic string type; use it correctly.
- Use `nbytes` to check memory usage.
- The speaker’s book, “High Performance Python”, provides more information on these topics.
- Follow the speaker’s blog for more updates and courses.
- The speaker’s community, IPython, has many resources available.
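
Several of the dtype tips in this list can be tried together on a small frame. The following is a minimal sketch and is not taken from the talk: the column names (`city`, `visits`, `score`) and the random data are hypothetical, chosen only to give one low-cardinality text column, one integer column with a missing value, and one float column.

```python
import numpy as np
import pandas as pd

# Hypothetical example data (not from the talk).
n = 100_000
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "city": rng.choice(["London", "Paris", "Berlin"], size=n),
    "visits": rng.integers(0, 100, size=n).astype("float64"),
    "score": rng.random(n),
})
# With the default dtypes, an integer column holding a missing value
# has to be stored as float64 with NaN.
df.loc[0, "visits"] = np.nan

before = df.memory_usage(deep=True)

slim = df.assign(
    # Few distinct values: store small integer codes plus a lookup table.
    city=df["city"].astype("category"),
    # Nullable integer: values stay integers, the gap becomes pd.NA.
    visits=df["visits"].astype("Int64"),
    # Half the width of float64, at the cost of precision.
    score=df["score"].astype("float32"),
)
after = slim.memory_usage(deep=True)

# Per-column memory in bytes, before and after the conversions.
print(pd.DataFrame({"before": before, "after": after}))

# nbytes reports the size of the underlying data buffer for one column.
print(df["score"].nbytes, slim["score"].nbytes)
```

Whether `float32` is acceptable depends on how much precision the analysis needs, and `category` tends to pay off only when a column has few distinct values relative to its length, so it is worth measuring on real data before converting.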