Itamar Turner-Trauring - Optimize first, parallelize second: a better path to faster data processing

Optimize your software for speed and efficiency before you parallelize it, and weigh the environmental impact of your processing choices, in this talk by Itamar Turner-Trauring.

Key takeaways
  • Optimize software first, then parallelize
  • Parallelism does not reduce compute costs: it spreads the same work, plus coordination overhead, across more cores, so total cost stays the same or rises
  • Focus on optimizing single-core performance before considering parallelism
  • Prioritize processing efficiency over adding more computing power
  • Many algorithms cannot be parallelized, and some may have limited parallelization potential
  • Modern CPUs have many cores, but it’s often not possible to use all of them effectively
  • Software performance can be improved through various means, including:
    • Optimizing code for the specific problem and data
    • Using the right algorithms and data structures
    • Using modern CPU features such as SIMD and instruction-level parallelism
    • Compiling to machine code with tools like Numba
    • Using just-in-time compilation and caching
  • Cloud computing and distributed computing can reduce costs, but may not be necessary for every scenario
  • Weigh factors like CO2 emissions and environmental impact when choosing computing resources
  • It’s often not possible to make an algorithm faster without rearchitecting the software or using a different algorithm
  • Optimizing software for modern CPUs can result in significant speed improvements
  • Profilers can help identify performance bottlenecks and areas for optimization
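
As a rough illustration of the "optimize first" takeaways above, here is a minimal Python sketch. It is not taken from the talk: the task (counting pairs of equal values) and the function names `count_duplicate_pairs_naive`, `count_duplicate_pairs_fast`, and `time_call` are hypothetical, chosen only to show how switching to a better algorithm and data structure on a single core can be compared against the best case that perfect parallelism could offer.

```python
import os
import time
from collections import Counter


def count_duplicate_pairs_naive(values):
    """Count pairs (i, j) with i < j and equal values, using O(n^2) comparisons."""
    count = 0
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            if values[i] == values[j]:
                count += 1
    return count


def count_duplicate_pairs_fast(values):
    """Same result in O(n): a value seen k times contributes k*(k-1)/2 pairs."""
    return sum(k * (k - 1) // 2 for k in Counter(values).values())


def time_call(func, *args):
    """Return (result, elapsed_seconds) for a single call to func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Illustrative data only: 3000 items drawn from 500 distinct values.
    data = [i % 500 for i in range(3000)]

    slow_result, slow_time = time_call(count_duplicate_pairs_naive, data)
    fast_result, fast_time = time_call(count_duplicate_pairs_fast, data)
    assert slow_result == fast_result

    cores = os.cpu_count() or 1
    print(f"naive:  {slow_time:.4f} s")
    print(f"faster: {fast_time:.4f} s")
    # Even perfectly parallelized, the naive version could be at most
    # `cores` times faster, and it would still use the same total CPU time.
    print(f"best-case parallel time for naive: {slow_time / cores:.4f} s on {cores} cores")
```

Running the script prints the measured times for both versions next to the naive version's time divided by the number of cores, which is the ideal parallel case. Actual numbers depend on the machine, so none are quoted here.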