Features of Modern Processors 2.
Out-of-Order Execution (Dynamic Execution)
As described before, instructions operate on operands that are held in registers and loaded from memory. However, because memory is still slow compared to the processor, the operands of an instruction may not be available “on time”. For this reason, out-of-order execution can execute instructions that appear later in the instruction stream but whose operands are already available. This improves instruction throughput and makes it easier for compilers to arrange machine code for optimal performance [Hager][1].
The basic idea is to keep track of the program order in which instructions entered the pipeline and to maintain temporary register storage for each of these instructions [Balasubramonian][6]. This way, results obtained previously remain available if a later operation needs them.
On static (in-order) processors, the usual steps are:
- Instruction fetch.
- If input operands are available (in registers for instance), the instruction is dispatched to the appropriate functional unit. If one or more operands are unavailable during the current clock cycle (generally because they are being fetched from memory), the processor stalls until they are available.
- The instruction is executed by the appropriate functional unit.
- The functional unit writes the results back to the register file.
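The in-order steps above can be sketched as a small simulation. This is only a minimal model under stated assumptions: the `Instr` class, the `run_in_order` function, the register names and the latencies are all hypothetical, and the model issues at most one instruction per cycle. It is meant to show where the stall appears, not to describe any real processor.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    dest: str       # register written by the instruction
    srcs: tuple     # registers read by the instruction
    latency: int    # cycles until the result is available (made-up values)


def run_in_order(program):
    """Issue at most one instruction per cycle, strictly in program order."""
    ready_at = {}     # register -> first cycle at which its value can be read
    issue_cycle = {}  # instruction name -> cycle at which it was issued
    cycle = 0
    for ins in program:
        # Stall until every source operand is available.
        operands_ready = max((ready_at.get(r, 0) for r in ins.srcs), default=0)
        cycle = max(cycle, operands_ready)
        issue_cycle[ins.name] = cycle
        ready_at[ins.dest] = cycle + ins.latency
        cycle += 1  # the next instruction issues one cycle later at the earliest
    finish = max(ready_at.values())
    return issue_cycle, finish


program = [
    Instr("load", "r1", ("r0",), 4),       # slow memory access
    Instr("add", "r2", ("r1",), 1),        # needs the loaded value
    Instr("mul", "r3", ("r4", "r5"), 1),   # independent of the load
]
issue_cycle, finish = run_in_order(program)
print(issue_cycle, finish)
```

In this model the `mul` instruction does not depend on the load, yet it cannot issue until the stalled `add` ahead of it has issued.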
Using dynamic processors, the steps are:
- Instruction fetch.
- Instruction dispatch to an instruction queue (also called instruction buffer or reservation stations).
- The instruction waits in the queue until its input operands are available.
- The instruction is then allowed to leave the queue before earlier, older instructions.
- The instruction is issued to the appropriate functional unit and executed by that unit.
- The results are queued.
- The result is written back to the register file only after all older instructions have had their results written back. This is called the graduation or retire stage [Wikipedia].
Avoiding processor stalls improves performance considerably. This graph helps show why:
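The dynamic steps above can be sketched in the same spirit. This is a minimal sketch, not a model of a real design: `Instr`, `producers` and `run_out_of_order` are hypothetical names, the latencies are made up, at most one instruction issues per cycle, and dependencies are tracked only through the most recent older writer of each source register.

```python
from dataclasses import dataclass

INF = float("inf")


@dataclass
class Instr:
    name: str
    dest: str       # register written by the instruction
    srcs: tuple     # registers read by the instruction
    latency: int    # cycles until the result is available (made-up values)


def producers(program):
    """For each instruction, list the older instructions that produce its sources."""
    last_writer, deps = {}, []
    for i, ins in enumerate(program):
        deps.append([last_writer[r] for r in ins.srcs if r in last_writer])
        last_writer[ins.dest] = i
    return deps


def run_out_of_order(program):
    deps = producers(program)
    done_at = {}                          # index -> cycle at which the result is ready
    waiting = list(range(len(program)))   # instruction queue, in program order
    issue_order, retire_order = [], []
    next_to_retire = 0
    cycle = 0
    while next_to_retire < len(program):
        # Issue the oldest queued instruction whose operands are available,
        # even if older instructions are still waiting in the queue.
        for i in waiting:
            if all(done_at.get(p, INF) <= cycle for p in deps[i]):
                waiting.remove(i)
                done_at[i] = cycle + program[i].latency
                issue_order.append(program[i].name)
                break
        cycle += 1
        # Retire strictly in program order: a finished result waits in the
        # results queue until every older instruction has retired.
        while next_to_retire < len(program) and done_at.get(next_to_retire, INF) <= cycle:
            retire_order.append(program[next_to_retire].name)
            next_to_retire += 1
    return issue_order, retire_order


program = [
    Instr("load", "r1", ("r0",), 4),       # slow memory access
    Instr("add", "r2", ("r1",), 1),        # needs the loaded value
    Instr("mul", "r3", ("r4", "r5"), 1),   # independent of the load
]
issue_order, retire_order = run_out_of_order(program)
print(issue_order, retire_order)
```

Here `mul` is issued before the older `add`, which is still waiting for the loaded value, while the retire order still matches the program order.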
Taken from: [wright][8]
For larger operations, the performance improvement would be greater. Current out-of-order designs can keep hundreds of instructions in flight at any time, using a reorder buffer that stores instructions until they become eligible for execution [1].