Shaurya Agarwal - All Them Data Engines: Data Munging with Python circa 2023 | PyData Global 2023
Learn effective data munging techniques in Python using NumPy, pandas, and defaultdict, with benefits such as improved readability, lower memory overhead, and efficient calculations.
- Use Python for data munging with NumPy and pandas.
- Python code is more readable and easier to maintain with pandas.
- Use list comprehension to create a list of unique tags, and sort them.
- The GroupBy object in pandas is useful for aggregating data.
- Avoid plain Python lists for large amounts of data; use NumPy arrays or pandas DataFrames instead.
- pandas has lower memory overhead than plain Python lists because it stores data in NumPy arrays under the hood.
- Use defaultdict (from the collections module) to supply default values for missing keys when handling data.
- NumPy enforces strict data types (each array has a single dtype), which can make it easier to work with large datasets.
- Use eager evaluation in Python for simplicity and performance.
- Grouping data by year or genre in pandas is easy and straightforward.
- Use NumPy arrays to do calculations and aggregations efficiently.
- Python’s typing module allows for type annotations, which can be used to improve code readability and maintainability.
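
The points above can be sketched in one short script. This is a minimal illustration, not code from the talk: the records, the column names, and the variable names are all hypothetical, and a set comprehension stands in for the list comprehension mentioned above so that duplicates are dropped before sorting.

```python
from collections import defaultdict

import numpy as np
import pandas as pd

# Hypothetical records (title, year, genre tags, rating); not data from the talk.
records: list[tuple[str, int, list[str], float]] = [
    ("A", 2021, ["drama", "crime"], 7.5),
    ("B", 2021, ["comedy"], 6.0),
    ("C", 2022, ["drama"], 8.5),
    ("D", 2022, ["comedy", "crime"], 7.0),
]

# Unique tags: a comprehension over every tag, deduplicated and sorted.
unique_tags: list[str] = sorted({tag for _, _, tags, _ in records for tag in tags})

# defaultdict: a missing key starts as an empty list, so no existence checks.
ratings_by_tag: dict[str, list[float]] = defaultdict(list)
for _, _, tags, rating in records:
    for tag in tags:
        ratings_by_tag[tag].append(rating)

# NumPy: convert each group to a float64 array and aggregate it.
mean_by_tag: dict[str, float] = {
    tag: float(np.array(values, dtype=np.float64).mean())
    for tag, values in ratings_by_tag.items()
}

# pandas: the same aggregations through the GroupBy object.
df = pd.DataFrame(records, columns=["title", "year", "tags", "rating"])
mean_by_year = df.groupby("year")["rating"].mean()

# explode() gives each tag its own row so that genres can be grouped directly.
mean_by_genre = df.explode("tags").groupby("tags")["rating"].mean()

print(unique_tags)
print(mean_by_tag)
print(mean_by_year)
print(mean_by_genre)
```

The defaultdict and pandas versions compute the same per-genre means. The pandas version is shorter and keeps the data in typed columns, while the defaultdict version needs only the standard library for the grouping step.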