mywiki:hw:mips:barrier_fence
Differences
This shows you the differences between two versions of the page.
| mywiki:hw:mips:barrier_fence [2014/07/28 13:51] – shaoguoh | mywiki:hw:mips:barrier_fence [2022/04/02 17:29] (current) – external edit 127.0.0.1 | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | **MIPS Pipeline Hazards | + | **MIPS Pipeline Hazards, Memory |
| ====== Pipeline Hazards/ | ====== Pipeline Hazards/ | ||
| + | {{: | ||
| MIPS has explicit pipeline hazards; **the instruction immediately following a branch or jump instruction will always be executed** (this instruction is sometimes referred to as the " | MIPS has explicit pipeline hazards; **the instruction immediately following a branch or jump instruction will always be executed** (this instruction is sometimes referred to as the " | ||
| + | < | ||
| + | __start: | ||
| + | addi $a0, $0, 100 | ||
| + | addi $a1, $0, 200 | ||
| + | jal test | ||
| + | |||
| + | test: | ||
| + | add $v0, $a0, $a1 #this instruction will be executed twice | ||
| + | jr $ra | ||
| + | </ | ||
| + | |||
| + | Then the **add** instruction would be executed **twice** around the time that the jal happens: once in the delay slot, and once on the following cycle when the program counter change has actually taken effect. | ||
| + | ===== Branch Delay Solution ===== | ||
| + | |||
| + | By default, the GNU assembler reorders instructions for you: it is clear that the second addi must always be executed, so it can be swapped with the jal instruction, | ||
| + | |||
| + | If you don't want it to do this reordering for you, add the directive: | ||
| + | .set noreorder | ||
| + | | ||
| + | In this case, you must deal with the hazards yourself, as shown below: | ||
| + | < | ||
| + | .set noreorder | ||
| + | |||
| + | __start: | ||
| + | addi $a0, $0, 100 | ||
| + | jal test | ||
| + | addi $a1, $0, 200 #must be placed after the jal, in its delay slot | ||
| + | |||
| + | test: | ||
| + | add $v0, $a0, $a1 | ||
| + | jr $ra | ||
| + | nop #delay slot of jr; must be filled by hand under noreorder | ||
| + | </ | ||
| + | |||
| + | ====== Memory Alignment ====== | ||
| + | Most RISC processors will generate **an alignment fault** when a load or store instruction accesses a **misaligned address**. This allows **the operating system to emulate the misaligned access** using other instructions. For example, the alignment fault handler might use byte loads or stores (which are always aligned) to emulate a larger load or store instruction. | ||
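The byte-load emulation described above can be sketched in C. This is only an illustration, not code from any real fault handler: the function name `emulate_load32_le` is hypothetical, and a little-endian 32-bit load is assumed.

```c
#include <stdint.h>

/* Hypothetical helper: emulate a 32-bit little-endian load from an
 * arbitrarily aligned address using only byte loads, which can never
 * raise an alignment fault. */
uint32_t emulate_load32_le(const uint8_t *addr)
{
    return (uint32_t)addr[0]
         | ((uint32_t)addr[1] << 8)
         | ((uint32_t)addr[2] << 16)
         | ((uint32_t)addr[3] << 24);
}
```

A real handler would also have to decode the faulting instruction and write the result back to the destination register; that part is left out here.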
| + | |||
| + | Some architectures, such as MIPS, have special unaligned load and store instructions (lwl/lwr and swl/swr on MIPS). One unaligned load instruction gets the bytes from the memory word with the lowest byte address and another gets the bytes from the memory word with the highest byte address. Similarly, store-high and store-low instructions store the appropriate bytes in the higher and lower memory words respectively. | ||
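The idea of building one unaligned value from the two aligned memory words that contain it can be modelled in plain C. This is a rough software model under assumptions, not the exact semantics of the MIPS instructions: a little-endian layout is assumed, and the names `read_word_le` and `load32_from_two_words` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one aligned 32-bit word, interpreted as little-endian. */
static uint32_t read_word_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical model of a two-part unaligned load: take the wanted bytes
 * from the lower aligned word, then the rest from the higher aligned word. */
uint32_t load32_from_two_words(const uint8_t *base, size_t offset)
{
    size_t   aligned = offset & ~(size_t)3;        /* start of the lower word */
    unsigned shift   = (unsigned)(offset & 3) * 8; /* misalignment in bits */
    uint32_t lo      = read_word_le(base + aligned);

    if (shift == 0)
        return lo;                                 /* already aligned */

    uint32_t hi = read_word_le(base + aligned + 4);
    return (lo >> shift) | (hi << (32 - shift));
}
```

The higher word is only read when the address is actually misaligned, so an aligned access never touches memory past its own word.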
| ====== Memory Barrier/ | ====== Memory Barrier/ | ||
| Line 10: | Line 49: | ||
| CPU cores contain multiple execution units. | CPU cores contain multiple execution units. | ||
| - | {{ ::cpu.png |}} | + | {{:mywiki: |
| Loads and stores to the caches and main memory are buffered and re-ordered using the load, store, and write-combining buffers. | Loads and stores to the caches and main memory are buffered and re-ordered using the load, store, and write-combining buffers. | ||
| Line 20: | Line 59: | ||
| Memory barriers are a complex subject. | Memory barriers are a complex subject. | ||
| - | ====== Store Barrier | + | ===== Store Barrier ===== |
| A store barrier, the “sfence” instruction on x86, **forces all store instructions prior to the barrier to happen before the barrier and flushes the store buffers to cache for the CPU on which it is issued.** | A store barrier, the “sfence” instruction on x86, **forces all store instructions prior to the barrier to happen before the barrier and flushes the store buffers to cache for the CPU on which it is issued.** | ||
| Line 57: | Line 96: | ||
| </ | </ | ||
| - | ====== Load Barrier | + | ===== Load Barrier ===== |
| A load barrier, the “lfence” instruction on x86, **forces all load instructions after the barrier to happen after the barrier and then waits on the load buffer to drain for that CPU.** | A load barrier, the “lfence” instruction on x86, **forces all load instructions after the barrier to happen after the barrier and then waits on the load buffer to drain for that CPU.** | ||
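The usual pairing of a store-side and a load-side barrier can be sketched with portable C11 fences. This is an assumption-laden illustration: `publish`, `try_consume`, `payload`, and `ready` are hypothetical names, and C11 release/acquire fences are used in place of raw sfence/lfence. Which machine instruction is emitted, if any, is up to the compiler and the target.

```c
#include <stdatomic.h>

static int        payload;  /* ordinary data being handed over */
static atomic_int ready;    /* flag: nonzero once payload is valid */

/* Writer side: store the data, fence, then set the flag. The release
 * fence keeps the payload store from being ordered after the flag store. */
void publish(int value)
{
    payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Reader side: check the flag, fence, then read the data. The acquire
 * fence keeps the payload load from being ordered before the flag load.
 * Returns 1 and fills *out if data was available, otherwise 0. */
int try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return 0;
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return 1;
}
```

In real use `publish` and `try_consume` would run on different threads; the reader either sees no data yet or sees the complete payload.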
| - | ====== Full Barrier | + | ===== Full Barrier ===== |
| A full barrier, " | A full barrier, " | ||
| - | ====== Java Memory Model ====== | + | ===== Java Memory Model ===== |
| In the Java Memory Model a **volatile** field has a **store barrier inserted after a write** to it and **a load barrier inserted before a read of it**. Qualified final fields of a class have a store barrier inserted after their initialization to ensure these fields are visible once the constructor completes when a reference to the object is available. | In the Java Memory Model a **volatile** field has a **store barrier inserted after a write** to it and **a load barrier inserted before a read of it**. Qualified final fields of a class have a store barrier inserted after their initialization to ensure these fields are visible once the constructor completes when a reference to the object is available. | ||
| - | ====== Atomic Instructions and Software Locks ====== | + | ===== Atomic Instructions and Software Locks ===== |
| Atomic instructions, | Atomic instructions, | ||
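As a rough illustration of how a software lock can be built on an atomic instruction, here is a minimal test-and-set spinlock using C11 `atomic_flag`. The type `spinlock_t` and the `spin_*` functions are hypothetical names for this sketch, not an API taken from the page or from any particular kernel.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spinlock type built on a single atomic flag. */
typedef struct { atomic_flag held; } spinlock_t;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once: atomically set the flag and report whether it was clear.
 * Acquire ordering keeps the critical section after the lock. */
bool spin_trylock(spinlock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the flag is obtained. */
void spin_lock(spinlock_t *l)
{
    while (!spin_trylock(l))
        ;  /* spin */
}

/* Release ordering keeps the critical section before the unlock. */
void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The acquire and release orderings play the role of the barriers described in the sections above, so no separate fence is needed around the critical section in this sketch.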
| - | ====== Performance Impact of Memory Barriers | + | ===== Performance Impact of Memory Barriers ===== |
| Memory barriers prevent a CPU from applying many of the techniques it uses to hide memory latency, so they carry a significant performance cost that must be considered. | Memory barriers prevent a CPU from applying many of the techniques it uses to hide memory latency, so they carry a significant performance cost that must be considered. | ||
| + | |||
| + | ====== MIPS Linux barrier ====== | ||
| + | Refer to: Linux/ | ||
| + | < | ||
| + | #define __sync() | ||
| + | | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | : /* no output */ \ | ||
| + | : /* no input */ \ | ||
| + | : " | ||
| + | |||
| + | |||
| + | #define __fast_iob() | ||
| + | | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | : /* no output */ \ | ||
| + | : " | ||
| + | : " | ||
| + | |||
| + | # define fast_wmb() | ||
| + | # define fast_rmb() | ||
| + | # define fast_mb() | ||
| + | |||
| + | #define wmb() | ||
| + | #define rmb() | ||
| + | #define mb() wbflush() | ||
| + | #define iob() | ||
| + | |||
| + | #define set_mb(var, value) ... | ||
| + | #define smp_llsc_mb() ... | ||
| + | |||
| + | </ | ||
