mywiki:hw:mips:barrier_fence

Differences

This shows you the differences between two versions of the page.

mywiki:hw:mips:barrier_fence [2014/07/28 20:50] shaoguoh
mywiki:hw:mips:barrier_fence [2022/04/02 17:29] (current) – external edit 127.0.0.1
Line 1: Line 1:
 **MIPS Pipeline Hazards, Memory Alignment and Barrier/Fences**
 ====== Pipeline Hazards/Branch Delay ======
 +{{:mywiki:hw:mips:800px-mips_architecture_pipelined_.svg.png?600|}}
 MIPS exposes its pipeline hazards to the programmer: **the instruction immediately following a branch or jump instruction will always be executed** (this instruction slot is referred to as the "**branch delay slot**"). If your code were really assembled exactly as you wrote it:
 <file> <file>
Line 48: Line 49:
 CPU cores contain multiple execution units. For example, a modern Intel CPU contains 6 execution units, each of which can handle some combination of arithmetic, conditional logic, and memory manipulation. These units operate in parallel, so several instructions can be in execution at the same time. As a result, when execution is observed from another CPU, operations may appear to happen in an order different from program order.
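The effect of parallel execution units on completion order can be sketched with a toy scheduler. This is a minimal illustration under stated assumptions, not a model of any real Intel or MIPS core: the names `toy_insn`, `producer_of`, and `toy_schedule` are hypothetical, only read-after-write dependencies are tracked, and `units` is treated as the number of instructions that may start per cycle.

```c
#include <assert.h>
#include <stddef.h>

#define NO_REG    (-1)  /* marks an unused register operand */
#define MAX_INSNS 16    /* hypothetical upper bound for this toy model */

/* One instruction: a destination register, two sources, and a latency. */
typedef struct {
    int dst;      /* register written, or NO_REG */
    int src1;     /* first register read, or NO_REG */
    int src2;     /* second register read, or NO_REG */
    int latency;  /* cycles from start to completion */
} toy_insn;

/* Index of the latest instruction before i that writes reg, or -1 if none. */
static int producer_of(const toy_insn *prog, size_t i, int reg) {
    if (reg == NO_REG) return -1;
    for (size_t j = i; j-- > 0; )
        if (prog[j].dst == reg) return (int)j;
    return -1;
}

/* Greedy schedule: each cycle, scan in program order and start every
 * instruction whose producers have completed, up to `units` starts per
 * cycle. finish[i] receives the cycle in which instruction i completes. */
static void toy_schedule(const toy_insn *prog, size_t n, int units, int *finish) {
    assert(n <= MAX_INSNS && units > 0);
    int started[MAX_INSNS] = {0};
    size_t remaining = n;
    for (int cycle = 0; remaining > 0; cycle++) {
        int free_units = units;
        for (size_t i = 0; i < n && free_units > 0; i++) {
            if (started[i]) continue;
            int p1 = producer_of(prog, i, prog[i].src1);
            int p2 = producer_of(prog, i, prog[i].src2);
            int ready = (p1 < 0 || (started[p1] && finish[p1] <= cycle)) &&
                        (p2 < 0 || (started[p2] && finish[p2] <= cycle));
            if (!ready) continue;
            started[i] = 1;
            finish[i] = cycle + prog[i].latency;
            free_units--;
            remaining--;
        }
    }
}
```

For a three-instruction program (a 3-cycle load into r1, an add that reads r1, and an independent add), calling `toy_schedule` with two units lets the independent add complete before either of the two instructions that precede it in program order, which is the kind of reordering an outside observer could see.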
  
-{{ {{:mywiki:hw:mips:cpu.png?400|}} |}}
+{{:mywiki:hw:mips:cpu.png|}}
  
 Loads and stores to the caches and main memory are buffered and re-ordered using the load, store, and write-combining buffers. These buffers are associative queues that allow fast lookup. This lookup is necessary when a later load needs to read the value of a previous store that has not yet reached the cache. Figure 1 above depicts a simplified view of a modern multi-core CPU. It shows how the execution units can use the local registers and buffers to hold data while it is being transferred to and from the cache sub-system.
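The store-buffer lookup described above can be sketched as a small simulation. This is a toy model under stated assumptions rather than a description of real hardware: the type `toy_core`, the functions `core_store`, `core_load`, `other_core_load`, and `sb_drain_one`, and the sizes `SB_CAPACITY` and `MEM_WORDS` are all hypothetical, and the "cache" is a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SB_CAPACITY 8   /* hypothetical store-buffer depth */
#define MEM_WORDS   16  /* hypothetical size of the toy cache */

typedef struct {
    size_t   addr;
    uint32_t value;
} sb_entry;

typedef struct {
    sb_entry entries[SB_CAPACITY];  /* pending stores, oldest first */
    size_t   count;
    uint32_t cache[MEM_WORDS];      /* state visible to other cores */
} toy_core;

/* Retire the oldest pending store into the cache. */
static void sb_drain_one(toy_core *c) {
    if (c->count == 0) return;
    c->cache[c->entries[0].addr] = c->entries[0].value;
    for (size_t i = 1; i < c->count; i++)
        c->entries[i - 1] = c->entries[i];
    c->count--;
}

/* A store goes into the buffer first, not directly into the cache. */
static void core_store(toy_core *c, size_t addr, uint32_t value) {
    assert(addr < MEM_WORDS);
    if (c->count == SB_CAPACITY) sb_drain_one(c);  /* make room */
    c->entries[c->count++] = (sb_entry){addr, value};
}

/* A load on the same core searches the buffer, newest entry first,
 * and falls back to the cache only when no pending store matches. */
static uint32_t core_load(const toy_core *c, size_t addr) {
    assert(addr < MEM_WORDS);
    for (size_t i = c->count; i-- > 0; )
        if (c->entries[i].addr == addr) return c->entries[i].value;
    return c->cache[addr];
}

/* Another core cannot see the buffer; it observes the cache only. */
static uint32_t other_core_load(const toy_core *c, size_t addr) {
    assert(addr < MEM_WORDS);
    return c->cache[addr];
}
```

In this sketch, after `core_store(&c, 3, 42)` the storing core reads 42 from address 3 through `core_load`, while `other_core_load` still returns the old cache contents until `sb_drain_one` retires the store. That window, in which the two views disagree, is the situation memory barriers are meant to control.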