AnyCPU http://anycpu.org/forum/ |
Readings on high-performance CPU designs http://anycpu.org/forum/viewtopic.php?f=15&t=556 |
Author: | BigEd [ Sun Dec 16, 2018 6:12 pm ] |
Post subject: | Readings on high-performance CPU designs |
I thought it might be good to have a thread of resources and examples of complex CPUs, even though most homebrew CPUs fall firmly in the category of simple CPUs. (For myself, I'd be happy to invent or implement any working CPU, let alone one which overlaps fetch and execute!)

First off, some resources courtesy of robfinch, from this thread.

A Superscalar Out-of-Order x86 Soft Processor for FPGA (Thesis by Henry Ting-Hei Wong, PDF, 275 pages)

Quote:
14.2 Contributions
This thesis presents the design of a superscalar out-of-order x86 FPGA soft processor. In this thesis, we make the following contributions:
• Design a microarchitecture that correctly implements the x86 instruction set sufficiently well to boot modern operating systems (Chapters 4 and 5)
• Design FPGA circuits for key processor components that implement the out-of-order microarchitecture (Chapters 6 to 12)
• Evaluate the performance and costs of an out-of-order soft processor compared to a commercial in-order soft processor and x86 hard processors (Chapter 13)

There's a great deal in here and it's very well written. I've only scratched the surface.

Quote:
Contents
1 Introduction
1.1 Why a Faster Single-Threaded Soft Processor?
1.2 Why Out-of-Order?
1.3 Why x86?
1.4 Contributions
1.5 Organization
2 Background
3 Comparing FPGA vs. Custom CMOS Circuits
4 CPU Design Methodology
5 Proposed Processor Microarchitecture
6 Instruction Fetch and Decode
7 Register Renaming
8 Reorder Buffer and Instruction Commit
9 Instruction Scheduling
10 Register Files and Instruction Execution
11 Out-of-order Memory Execution Schemes
12 Memory and Cache System
13 Processor Performance
14 Conclusions
A Floating-Point Emulation Mechanism
B Integer Division Algorithm and Circuit
C Test System Configuration Details
D List of Micro-Ops and Logical Registers
E Comparison with Intel Haswell

Rob also reminds us that the Usenet group comp.arch is still alive and sometimes very interesting. For example:

Linked from that thread, there's a series of university lectures by Prof. Ajit Pal here:

There's also much interesting information about IBM's ill-fated and mismanaged ACS supercomputer from the early 60s. IBM invented and then forgot about Dynamic Instruction Scheduling. By not understanding how the ACS worked, they thought it was not worth producing.

Maybe start here: https://people.cs.clemson.edu/~mark/acs_technical.html

Quote: Quote: I built a Precursor that ran at a 10ns cycle. 5 levels of logic. Each chip could dissipate 3 watts and with 625 chips we had to have coolant. We used FC78 as a liquid coolant. The precursor was a path through a 24 bit adder.

Or maybe start here https://people.cs.clemson.edu/~mark/acs.html with particular reference to Lynn Conway's historical PDFs. Her much more recent paper reflecting on ACS and how it all went wrong is also very interesting: |
Author: | robfinch [ Thu Jan 10, 2019 9:42 am ] |
Post subject: | Re: Readings on high-performance CPU designs |
Found this thesis about a "braided" architecture: https://repositories.lib.utexas.edu/bit ... f71786.pdf It relies on software to identify "braids", which are subsets of basic blocks. The workload is distributed among eight braid execution units (BEUs) in the core. Each BEU has its own register file, and there is a global register file as well, so there are two levels of register files. This offers the potential for smaller processor structures while still maintaining high performance. |
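To make "braids which are subsets of basic blocks" a bit more concrete, here is a minimal sketch that splits one basic block into groups of instructions connected by register dataflow. It assumes a braid can be approximated as a connected component of the block's dataflow graph, which is my reading of the idea rather than the thesis's exact algorithm. The function name `find_braids` and the `(dest, sources)` instruction format are made up for illustration.

```python
from collections import defaultdict


def find_braids(block):
    """Group a basic block's instructions into dataflow-connected sets.

    `block` is a list of (dest, sources) tuples in program order, where
    `dest` is a register name (or None) and `sources` is a list of
    register names. This format is hypothetical, chosen for the sketch.
    Returns a sorted list of lists of instruction indices.
    """
    parent = list(range(len(block)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    last_writer = {}  # register name -> index of its latest producer in this block
    for i, (dest, sources) in enumerate(block):
        for src in sources:
            # Only values produced inside the block link instructions;
            # live-in registers are treated as external inputs.
            if src in last_writer:
                union(i, last_writer[src])
        if dest is not None:
            last_writer[dest] = i

    groups = defaultdict(list)
    for i in range(len(block)):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Two independent dataflow chains interleaved in one basic block.
example_block = [
    ("r1", ["r10"]),        # 0: r1 = load [r10]
    ("r2", ["r1", "r11"]),  # 1: r2 = r1 + r11
    ("r3", ["r12"]),        # 2: r3 = load [r12]
    ("r4", ["r3"]),         # 3: r4 = r3 * r3
    ("r5", ["r2"]),         # 4: r5 = r2 + 1
]
braids = find_braids(example_block)
```

In this example, instructions 0, 1 and 4 form one group and 2 and 3 form another. Each group could then be handed to its own execution unit with its intermediate values kept in that unit's local register file; how the real design assigns braids to BEUs is not covered by this sketch.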
Author: | BigEd [ Thu Jan 10, 2019 10:13 am ] |
Post subject: | Re: Readings on high-performance CPU designs |
Thanks for the link - that's a very promising title: "Braids: Out-of-Order Performance with Almost In-Order Complexity" |
All times are UTC |