


 Readings on high-performance CPU designs 

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1257
I thought it might be good to have a thread of resources and examples of complex CPUs, even though most homebrew CPUs fall firmly in the category of simple CPUs. (For myself, I'd be happy to invent or implement any working CPU, let alone one which overlaps fetch and execute!)

First off, some resources courtesy of robfinch, from this thread.

A Superscalar Out-of-Order x86 Soft Processor for FPGA (Thesis by Henry Ting-Hei Wong, PDF, 275 pages)

Quote:
14.2 Contributions
This thesis presents the design of a superscalar out-of-order x86 FPGA soft processor. In this thesis, we make the following contributions:
    • Study how the FPGA substrate affects circuits and out-of-order processor microarchitecture (Chapter 3)
    • Design a microarchitecture that correctly implements the x86 instruction set sufficiently well to boot modern operating systems (Chapters 4 and 5)
    • Design FPGA circuits for key processor components that implement the out-of-order microarchitecture (Chapters 6 to 12)
    • Evaluate the performance and costs of an out-of-order soft processor compared to a commercial in-order soft processor and x86 hard processors (Chapter 13)


There's a great deal in here and it's very well written. I've only scratched the surface.

Quote:
Contents
1 Introduction
1.1 Why a Faster Single-Threaded Soft Processor?
1.2 Why Out-of-Order?
1.3 Why x86?
1.4 Contributions
1.5 Organization
2 Background
3 Comparing FPGA vs. Custom CMOS Circuits
4 CPU Design Methodology
5 Proposed Processor Microarchitecture
6 Instruction Fetch and Decode
7 Register Renaming
8 Reorder Buffer and Instruction Commit
9 Instruction Scheduling
10 Register Files and Instruction Execution
11 Out-of-order Memory Execution Schemes
12 Memory and Cache System
13 Processor Performance
14 Conclusions
A Floating-Point Emulation Mechanism
B Integer Division Algorithm and Circuit
C Test System Configuration Details
D List of Micro-Ops and Logical Registers
E Comparison with Intel Haswell


Rob also reminds us that the Usenet group comp.arch is still alive and sometimes very interesting. For example:

Linked from that thread, there's a series of university lectures by Prof. Ajit Pal here:

There's also much interesting information about IBM's ill-fated and mismanaged ACS supercomputer project of the 1960s. IBM invented and then forgot about dynamic instruction scheduling. Not understanding how the ACS worked, IBM decided it was not worth producing. Maybe start here:
https://people.cs.clemson.edu/~mark/acs_technical.html
Quote:
I built a Precursor that ran at a 10ns cycle. 5 levels of logic. Each chip could dissipate 3 watts and with 625 chips we had to have coolant. We used FC78 as a liquid coolant. The precursor was a path through a 24 bit adder.
-- Bill Mooney, personal correspondence, describing a test of ACS circuitry in 1968


Or maybe start here:
https://people.cs.clemson.edu/~mark/acs.html
with particular reference to Lynn Conway's historical PDFs. Her much more recent paper reflecting on ACS and how it all went wrong is also very interesting:


Sun Dec 16, 2018 6:12 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 920
Location: Canada
Found this thesis about a "braided" architecture:

https://repositories.lib.utexas.edu/bit ... f71786.pdf

It relies on software to identify "braids", which are subsets of basic blocks. The workload is distributed among eight braid execution units (BEUs) in the core.

Each BEU has its own register file, and there is a global register file as well, so there are two levels of register files. There's the potential for a smaller footprint for processor structures while still maintaining high performance.
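
To make the idea of braids a bit more concrete, here is a rough Python sketch. It is not taken from the thesis: it assumes a braid can be approximated as a set of instructions in one basic block that are linked by values produced inside that block, and the names find_braids and assign_to_beus, the instruction tuples and the round-robin placement are all made up for illustration. Registers that are only read from outside the block (r10 and r11 here) do not join braids together, which loosely corresponds to values that would live in the global register file.

```python
from collections import defaultdict


def find_braids(block):
    """Group the instructions of one basic block into dataflow-connected subsets.

    block: list of (dest, sources) tuples in program order (hypothetical format).
    Returns a list of braids, each a list of instruction indices.
    Assumption: a braid is a connected component of the intra-block dataflow graph.
    """
    parent = list(range(len(block)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    last_writer = {}  # register name -> index of its most recent writer in this block
    for i, (dest, sources) in enumerate(block):
        for src in sources:
            # Only values produced inside the block link instructions together.
            if src in last_writer:
                union(i, last_writer[src])
        if dest is not None:
            last_writer[dest] = i

    groups = defaultdict(list)
    for i in range(len(block)):
        groups[find(i)].append(i)
    return sorted(groups.values())


def assign_to_beus(braids, num_beus=8):
    """Place braids on BEUs round-robin (a hypothetical policy, not the thesis's)."""
    return {tuple(b): n % num_beus for n, b in enumerate(braids)}


# A made-up basic block: (destination register, source registers).
block = [
    ("r1", ["r10"]),        # 0: r1 = load [r10]   (r10 is live-in)
    ("r2", ["r1"]),         # 1: r2 = r1 + 1
    ("r3", ["r11"]),        # 2: r3 = load [r11]   (r11 is live-in)
    ("r4", ["r3"]),         # 3: r4 = r3 << 2
    (None, ["r2", "r10"]),  # 4: store r2 -> [r10]
    ("r5", ["r4", "r3"]),   # 5: r5 = r4 + r3
]

braids = find_braids(block)
print(braids)                  # [[0, 1, 4], [2, 3, 5]]
print(assign_to_beus(braids))  # {(0, 1, 4): 0, (2, 3, 5): 1}
```

The two braids share no internally produced values, so in this toy model each could run on its own BEU using only its local register file plus reads of the global one.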

_________________
Robert Finch http://www.finitron.ca


Thu Jan 10, 2019 9:42 am

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1257
Thanks for the link - that's a very promising title: "Braids: Out-of-Order Performance with Almost In-Order Complexity"


Thu Jan 10, 2019 10:13 am