AEMB Threading Optimisation

The results of a recent research collaboration between Universiti Teknologi PETRONAS and us have been published and indexed in IEEE.

This paper explains the architecture of the AEMB and optimizes its threading model. The AEMB is an open-source soft-core processor designed for FPGA implementation. It uses a fine-grained threading model that interleaves threads one instruction at a time, with a separate register set for each thread. The chosen optimization is to change the current threading model to a coarse-grained one that switches threads on branch instructions. The advantage of this approach is that control hazards are avoided in most cases, so the pipeline stalls far less often while waiting for branch targets to be calculated. This is quite an improvement over the previous core, where the processor stalls for one cycle on every branch instruction it encounters. The disadvantage of the coarse-grained threading model is that data hazards that cannot be resolved by forwarding can now stall the processor for up to three cycles in the worst case, compared with only one stall in the old model. As for area consumption on the FPGA, synthesis showed that the modified core uses double the number of LUTs that the original AEMB needs, but there was no significant increase in the number of registers. Further quantitative analysis, running suitable benchmarks on both versions of the processor, is necessary to determine the total gain in performance.
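To make the difference between the two thread-switching policies concrete, here is a minimal Python sketch of the instruction issue order under each one. It is only an illustration, not the AEMB's actual logic: the function names, the two-thread example, and the "br" mnemonic for a branch are all assumptions, and the model ignores pipeline stages, stalls, and hazards entirely.

```python
def fine_grained(threads):
    """Issue one instruction per thread in turn (round-robin interleaving)."""
    pcs = [0] * len(threads)          # hypothetical per-thread program counters
    order, t = [], 0
    while any(pc < len(th) for pc, th in zip(pcs, threads)):
        if pcs[t] < len(threads[t]):
            order.append((t, threads[t][pcs[t]]))
            pcs[t] += 1
        t = (t + 1) % len(threads)    # always move on to the next thread
    return order


def coarse_grained(threads):
    """Stay on one thread until it issues a branch ("br"), then switch."""
    pcs = [0] * len(threads)
    order, t = [], 0
    while any(pc < len(th) for pc, th in zip(pcs, threads)):
        if pcs[t] >= len(threads[t]):  # this thread has finished; skip it
            t = (t + 1) % len(threads)
            continue
        op = threads[t][pcs[t]]
        order.append((t, op))
        pcs[t] += 1
        if op == "br":                 # switch threads only on a branch
            t = (t + 1) % len(threads)
    return order


# Two toy instruction streams (assumed mnemonics, not real AEMB code).
threads = [["add", "br", "sub"], ["ld", "br", "st"]]
print(fine_grained(threads))
print(coarse_grained(threads))
```

In the coarse-grained order, the instruction issued right after a branch always comes from the other thread, which is why the branch target can be calculated without stalling. Consecutive instructions from the same thread now run back to back, which is where the longer data-hazard stalls described above come from.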

We will not pursue further research on the AEMB itself, since we have stopped developing it. However, the lessons learned will be applied to the development of the new processor, which will feature a similar multi-threading model.