Out-of-order execution - what can it do for me? - Patrick Schittekat - NDC TechTown 2023

Discover the power of out-of-order execution (OOE) and how it can improve performance by overlapping long-latency operations with shorter ones, but also learn about its limitations and the techniques CPUs use to recover from incorrect predictions.

Key takeaways
  • Out-of-order execution (OOE) is a hardware feature that allows instructions to be executed in a non-sequential order.
  • OOE can improve performance by overlapping long-latency operations, such as loading data from memory, with independent shorter-latency operations, such as arithmetic instructions.
  • The reorder buffer is a key component of OOE: it tracks in-flight instructions in program order so that, although they may execute out of order, their results are committed (retired) in order.
  • Load hoisting is a technique in which the CPU predicts that a load does not depend on an earlier store whose address is not yet known and, if so, executes the load ahead of that store.
  • Branch prediction is another important aspect of OOE, as it allows the CPU to predict the outcome of a conditional branch instruction and execute subsequent instructions accordingly.
  • OOE results in bad speculation when a prediction is incorrect: the speculatively executed work is discarded and the CPU has to restart from the point of the mistake.
  • To limit and recover from this, CPUs use branch prediction tables to make predictions more accurate and machine clears (pipeline flushes) to discard the incorrect work and restart.
  • OOE has been shown to improve performance in various benchmarks, such as clang, and can provide a significant speedup in certain scenarios.
  • However, OOE also has limitations: it depends on complex prediction algorithms, and the cost of recovering from a wrong prediction can reduce performance.
  • Newer CPUs, including those based on the ARM architecture, also implement OOE, which can provide improved performance and efficiency.
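
To make the branch prediction and bad speculation takeaways more concrete, here is a minimal sketch of a 2-bit saturating counter, a textbook prediction scheme. It is an illustration only, not something taken from the talk: real predictors are far more sophisticated, and the function name, initial state, and branch patterns below are assumptions chosen for the example.

```python
def count_mispredictions(outcomes, initial_state=2):
    """Simulate one 2-bit saturating counter on a sequence of branch outcomes.

    States 0-1 predict "not taken", states 2-3 predict "taken".
    Returns the number of wrong predictions.
    (Hypothetical helper, written for this illustration only.)
    """
    state = initial_state
    mispredictions = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            mispredictions += 1
        # Move the counter toward the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return mispredictions


# A loop-style branch: taken 9 times, then not taken once, repeated 10 times.
loop_pattern = ([True] * 9 + [False]) * 10
# A branch that strictly alternates between taken and not taken.
alternating_pattern = [True, False] * 50

loop_misses = count_mispredictions(loop_pattern)
alternating_misses = count_mispredictions(alternating_pattern)

print(f"loop branch:        {loop_misses} / {len(loop_pattern)} mispredicted")
print(f"alternating branch: {alternating_misses} / {len(alternating_pattern)} mispredicted")
```

With these inputs the loop-style branch is mispredicted 10 times out of 100 and the alternating branch 50 times out of 100. On real hardware each misprediction means discarding speculative work and restarting, so under this simple scheme a branch whose outcome keeps flipping costs far more than one that mostly goes the same way.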