 CPU Control - Microcode vs Finite State Machine 

Joined: Sat Jun 16, 2018 2:51 am
Posts: 50
What general thoughts do people have on these two methods of controlling the CPU's decision making: 1) microcode and 2) finite state machines (FSM)? Also, are there other methods available?

An example of a microcoded CPU is the one taught by Ben Eater in this YouTube series: link. And an example of an FSM CPU is the One Page CPU project from this forum: link.

Here's what I've gathered from studying the two CPUs above:

  • Microcode is more flexible. Since the decision making code exists in memory, it can easily be changed/extended by updating the memory. For this reason also, if the memory gets corrupted, so does your CPU's decision making...whereas in FSM, the decision making is hardwired
  • FSM allows for variable length instructions. I.e. each instruction is only as long as it needs to be. Whereas in microcode (at least what I saw in Ben Eater's CPU), the number of cycles needed to execute an instruction is fixed with micro-NOPs used to pad out short instructions
  • The speed of a microcoded CPU is limited to the speed of the memory storing the microinstructions
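To make the padding point above concrete, here is a minimal sketch. The instruction names, control-signal strings, and the five-step limit are made up for illustration and are not taken from either CPU. It assumes a microcode sequencer whose step counter always runs to a fixed length, compared with control logic that returns to fetch as soon as the useful work is done. The padding is a property of that simple fixed-counter design, not of microcode in general: a microcoded sequencer can also finish early if a microinstruction is allowed to reset the step counter.

```python
# Hypothetical micro-operations, written as strings for readability.
FETCH = ["MAR<-PC", "IR<-MEM; PC<-PC+1"]

# Fixed-length microcode: every instruction occupies exactly 5 steps,
# so short instructions are padded with micro-NOPs.
MICROCODE = {
    "LDA": FETCH + ["MAR<-IR.operand", "A<-MEM", "NOP"],
    "ADD": FETCH + ["MAR<-IR.operand", "B<-MEM", "A<-A+B"],
    "OUT": FETCH + ["OUT<-A", "NOP", "NOP"],
}


def fixed_length_cycles(opcode):
    """Step counter always runs to the end of the row, NOPs included."""
    return len(MICROCODE[opcode])


def variable_length_cycles(opcode):
    """Control returns to fetch once the last useful step has executed."""
    steps = MICROCODE[opcode]
    while steps and steps[-1] == "NOP":
        steps = steps[:-1]  # slicing copies, so the table is not modified
    return len(steps)


for op in MICROCODE:
    print(op, fixed_length_cycles(op), variable_length_cycles(op))
```

With this table, `OUT` costs 5 cycles with the fixed counter and 3 cycles when the sequence can end early.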

What is the "common" method used in "common" processors such as Intel and AVR? What's historically been used (I would assume FSM as memory was not cheap/reliable)? What is the "if-constraints-did-not-exist" method?

What is your preferred method and why?


Mon Jan 07, 2019 12:08 am

Joined: Wed Apr 24, 2013 9:40 pm
Posts: 213
Location: Huntsville, AL
Microcode is just another way of implementing an FSM. Like all design techniques, there are pros and cons to either approach. Certainly memory speed is a driving factor when using a microcoded FSM versus an FSM implemented using random logic.

In modern programmable devices, the complexity of the random/discrete logic equations can often be used to gauge whether a microcoded FSM is more appropriate. As the number of states increases, the number of inputs in the two-level AND-OR logic equations will determine the number of logic levels needed to implement the equations. The number of logic levels is directly related to the logic delay. When the logic delay of a discrete logic implementation equals, or nearly equals, the access time of the available memory, a microcoded FSM may be more appropriate than a discrete logic FSM.

In the case of a modern FPGA, the built-in block RAMs can be used to implement the microcode memories. The average access time of these devices is on the order of 2.5 ns. The LUT and interconnect delays of a random logic implementation average 0.3 ns to 0.5 ns per logic level. Therefore, when the logic equations require between 5 and 8 levels of logic to implement, a microcoded FSM implementation may often result in a faster implementation.
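The "5 to 8 levels" figure follows directly from the numbers quoted above. A quick sketch of the arithmetic, using only the 2.5 ns access time and the 0.3 ns to 0.5 ns per-level delay from the post (the function and variable names are just for illustration):

```python
BRAM_ACCESS_NS = 2.5            # block RAM access time quoted above
LEVEL_DELAY_NS = (0.3, 0.5)     # LUT + interconnect delay per logic level


def break_even_levels(memory_ns, level_ns):
    """Number of logic levels whose total delay equals one memory access."""
    return memory_ns / level_ns


fast_level, slow_level = LEVEL_DELAY_NS
levels_at_slow = break_even_levels(BRAM_ACCESS_NS, slow_level)
levels_at_fast = break_even_levels(BRAM_ACCESS_NS, fast_level)

print(f"break-even at {levels_at_slow:.1f} to {levels_at_fast:.1f} logic levels")
```

So once the next-state and output equations need roughly 5 levels (slow routing) to a little over 8 levels (fast routing), one block RAM lookup is about as fast as the random logic it would replace.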

The efficiency of the implementation of a microcoded FSM is primarily limited by the imagination of the designer. There are any number of approaches that can be taken to implement a microcoded FSM which can result in very fast transition from one state to another. There is a balance that must be struck between the complexity of the microcode sequencer and that of a corresponding discrete logic FSM. That balance will be determined by the designer and the implementation approach that the designer prefers to use.

There are numerous approaches that can be used to implement microcoded FSMs. One approach is to use a microprogram sequencer. The complexity of the sequencer is again only limited by the imagination of the designer. I have successfully used this approach using a microprogram sequencer based on the Fairchild 9408 device. More complicated devices, like the AMD 2910, are available, but I've found that I don't generally need some of the advanced capabilities that AMD 2910 provided, and when I do need the built-in counters of that device, I can easily add those facilities when needed as part of the support logic for the microcoded FSM. Another approach that I've used successfully for microcoded general purpose HW is based on a simple priority encoder and an embedded next address field.

The sequencer approach simplifies the generation of the microcode because the next address is supplied by a loadable counter, i.e. a microprogram counter. The approach using an embedded next address field is very easy to implement, uses a minimal amount of logic, and can be mapped directly to a simple FSM. The microcoded FSMs provide a way to more easily time multiplex common control structures and improve the overall logic utilization efficiency.
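The two sequencing styles described above can be sketched roughly as follows. This is a generic illustration, not the design used in any of the repositories mentioned in this thread: the microword fields, output names, and the trick of ORing a tested condition into the low bit of the next-address field are all assumptions chosen to keep the example small.

```python
# Style 1: microprogram counter. The address advances by one unless the
# current word requests a jump (the counter is "loadable").
COUNTER_ROM = [
    {"out": "fetch",   "jump": None},
    {"out": "decode",  "jump": None},
    {"out": "execute", "jump": 0},      # load the counter with address 0
]


def run_counter(rom, cycles):
    upc, trace = 0, []
    for _ in range(cycles):
        word = rom[upc]
        trace.append(word["out"])
        upc = word["jump"] if word["jump"] is not None else upc + 1
    return trace


# Style 2: embedded next-address field. Every word names its successor.
# If the word selects a test input and that input is set, a 1 is ORed into
# the low address bit, giving a two-way branch without a counter.
NEXT_ROM = {
    0: {"out": "idle",  "next": 0, "test": "start"},  # stay at 0, or go to 1
    1: {"out": "load",  "next": 2, "test": None},
    2: {"out": "shift", "next": 2, "test": "done"},   # stay at 2, or go to 3
    3: {"out": "store", "next": 0, "test": None},
}


def run_next_field(rom, inputs_per_cycle):
    addr, trace = 0, []
    for inputs in inputs_per_cycle:
        word = rom[addr]
        trace.append(word["out"])
        taken = 1 if word["test"] and inputs.get(word["test"], False) else 0
        addr = word["next"] | taken
    return trace


print(run_counter(COUNTER_ROM, 4))
print(run_next_field(NEXT_ROM, [{}, {"start": True}, {}, {}, {"done": True}, {}]))
```

In the second style the ROM contents are the state transition table, which is why it maps so directly onto an FSM; the cost is a wider microword, since every word carries an address.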

If you're interested, you can see examples of systems that I have implemented with variations of the microprogram sequencer approach in these three GitHub repositories: (1) M65C02, (2) M65C02A, and (3) MiniCPU. An incomplete description of the approach used for the first two of these microcoded processor designs can be found here.

An example of the approach using a priority encoder and embedded next address field can also be found in my GitHub account: RTFIFO. A more complete description of this particular implementation can be found in this article I wrote for fpgarelated.com: Use Microprogramming to Save Resources and Increase Functionality.

As the MiniCPU and RTFIFO projects demonstrate, there are a variety of approaches that can be used, and using LUT memory instead of Block RAM is an option. I might have improved the overall speed of the M65C02 / M65C02A projects by widening the microprogram memory beyond the 32 / 36 bit widths that I used in these two projects, but I had set for myself the limit that I would only use two block RAMs (each) for these projects: microprogram sequence and instruction decode.

Use the approach that makes for an efficient, easy to maintain project. Invariably, once a project is complete, there will be few if any changes that need to be made over the life cycle if sufficient planning was performed during the project inception phase. I will sometimes resort to using the microcoded approach first as I can more easily map the states to the problem. The availability to me of a simple microprogram assembler that I can use to develop the FSM state transition table in text form makes this decision fairly easy to justify.
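To give a feel for what "a state transition table in text form" can look like, here is a toy two-pass assembler. The source syntax (`label: outputs -> next_label`), the signal names, and the ROM layout are invented for this sketch and are not the format of the assembler mentioned above.

```python
# Hypothetical microprogram source: one state per line.
SOURCE = """
idle:  none   -> idle
load:  ld_reg -> shift
shift: sh_en  -> store
store: wr_en  -> idle
"""


def assemble(text):
    """Turn the text table into (labels, rom); rom[i] = (outputs, next_address)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]

    # Pass 1: each line gets the next free address; record its label.
    labels = {ln.split(":", 1)[0].strip(): addr for addr, ln in enumerate(lines)}

    # Pass 2: resolve the symbolic next-state names to addresses.
    rom = []
    for ln in lines:
        _, rest = ln.split(":", 1)
        outputs, target = (part.strip() for part in rest.split("->"))
        rom.append((outputs, labels[target]))
    return labels, rom


labels, rom = assemble(SOURCE)
for addr, (outputs, nxt) in enumerate(rom):
    print(addr, outputs, nxt)
```

The point is only that states are edited by name in a text file and the tool does the address bookkeeping, which is much of what makes changing a microcoded FSM cheap.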

For example, after struggling to implement a simple, discrete logic state machine for the EnDat 2.2 communications protocol, I easily implemented it from start to finish with a microcoded FSM in just a few hours; simulation included. I have directed the use of a microcoded FSM to implement UDP/IP, ARP, and ICMP. The result was a fast, efficient, and reliable Ethernet interface using the embedded Tri-mode MACs in the Virtex-5 FPGAs: none of that crap associated with endless updates to the LWIP SW-based stack for the MicroBlaze as can be seen in the support stream for that project.

In the end, the choice of microcoded or discrete logic FSMs is application-specific. I find microcoded FSMs are easier to manage, but only if you have a tool for programming them efficiently. Microcoded FSMs can be abused, just as discrete logic FSMs can, and can also lead to wildly incorrect assumptions about the nature of the HW.

_________________
Michael A.


Mon Jan 07, 2019 2:28 am

Joined: Sat Jun 16, 2018 2:51 am
Posts: 50
Thanks for the detailed reply and links :)
I'm currently reading your article and will explore the repositories shared (probably starting with the MiniCPU as it seems to be the easiest of the group to wrap my head around ;) ).


Tue Jan 08, 2019 2:00 am

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1780
It's an interesting question that you pose, quadrant. I wonder if I can add anything useful to MichaelM's response.

We might note that microcode is structured: it's a list of instructions, of a fixed format. An FSM is a very general idea and the state transition diagram could look like something very ordered or something very like spaghetti. As a consequence, maintaining and debugging microcode might be easier, although it might need more by way of background knowledge. Microcode might even be programmed in a higher level representation, again making it easier to think about, explain, debug, update.

We might also note that a structured design can allow a structured implementation. If a microprocessor, or discrete CPU, has microcode in a ROM, or an EPROM, then we get something easy to manufacture, cheap to buy, easy to update. A microcoded microprocessor with bugs might be cheaper to fix, either with a manufacturing change or even with a field update.

We can also note that other kinds of logic can have structured implementations, with the same advantages: PALs and GALs on circuit boards, PLAs in chips. The 68000 has lots of microcode but also several PLAs, which the designers could think of as magic boxes and which could be updated cheaply during design iterations. If there were fields of random logic (unstructured implementation) then it would take more time and effort to re-implement, and as the size and shape might change it might be difficult to fit the new revision into the old hole.

But, all of these considerations apply more to complex CPUs than to simple ones. If microcode is in some sense inefficient, for all its advantages, it might not be a good implementation technique for something very minimal. If there are relatively few instructions and addressing modes, there's less leverage in using microcode for the machine. If every instruction has only two or three states to progress through, there's barely any sense in which we need a program counter.

You will see microcode in modern complex CPUs, but mainly to perform costly and rare operations. Sometimes (I believe) instructions on x86 will have a fast path for the common case and will abort into a microcoded alternative if a difficult uncommon case turns up.

I'm a bit surprised to read that the original ARM1 is described as having microcode, but see here:
http://www.righto.com/2016/02/reverse-e ... ssors.html
It's tiny and it's simple, but it is apparently a microsequencer. (I shouldn't be surprised, in the sense that I've certainly read this article before.) I'm not at all surprised that the ARM1 had a very regular implementation: it was designed and built "with no people, for no money."


Tue Jan 08, 2019 9:59 am

Joined: Wed Jan 02, 2019 4:10 pm
Posts: 14
I think it's also worth distinguishing between microcode implemented in ROM on-chip, and microcode in RAM that can be customised by the user (or vendor).

In that sense, the 6502 is also a microcoded CPU, as a significant amount of die area is devoted to a microcode ROM - but this ROM, and the instruction set it implements, was carefully designed to be as cheap to implement as possible (which resulted in the "undocumented" instructions which exercised the microcode in strange ways).

CPUs with user-customisable microcode have traditionally been rare. However, modern CPUs tend to have vendor-upgradable microcode which has been used to fix or work around bugs post-release. The microcode patches are usually integrated into BIOS updates for the m/board, but can also be loaded through the OS kernel, and are cryptographically signed to prevent them becoming a security hazard in themselves.

I think CPUs based purely on FSMs would tend to be among the simplest ones, such as One Instruction Computers. Very old machines such as the Harwell WITCH and the decidedly unimpressive series of drum-memory computers of the early 60s might also fall into this category.


Last edited by Chromatix on Tue Jan 08, 2019 9:54 pm, edited 1 time in total.



Tue Jan 08, 2019 9:50 pm

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1780
(The 6502 decode matrix isn't really a ROM - there's no address decode - but it is a structured implementation.)
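A rough way to picture the difference between a decode matrix and a ROM: a ROM fully decodes the address and selects exactly one word, whereas a PLA-style matrix is a set of pattern matchers with don't-care bits, and any number of them can fire for a given opcode. The bit patterns and signal names below are invented for illustration and are not the real 6502 decode terms.

```python
# Each hypothetical product term is (mask, value, signal). The term fires
# when (opcode & mask) == value; bits cleared in the mask are don't-cares.
PLA_TERMS = [
    (0b1110_0000, 0b1010_0000, "load"),
    (0b0000_0001, 0b0000_0001, "use_accumulator"),
    (0b0000_0010, 0b0000_0010, "use_index"),
]


def pla_decode(opcode):
    """Return every control signal whose pattern matches the opcode."""
    return {signal for mask, value, signal in PLA_TERMS
            if (opcode & mask) == value}


print(sorted(pla_decode(0b1010_0001)))  # an opcode the designer planned for
print(sorted(pla_decode(0b1010_0011)))  # an unplanned opcode: both low-bit terms fire
```

Because nothing forces the terms to be mutually exclusive, an opcode the designer never assigned can still assert a combination of signals, which is one plausible way to think about how unplanned opcodes end up doing something rather than nothing.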


Tue Jan 08, 2019 9:54 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
My preferred method to use is a simple FSM. A couple of the RISC style processors I’ve worked on that have overlapped pipeline stages only end up with a few states (run, memory, cache load) without a lot of complex transitions, so it’s not worth the effort to micro-code the state machine. For a pipelined RISC machine most of the operations (Ifetch, decode, execute, and writeback) occur in a single RUN state.
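A minimal sketch of a controller with only those three states. The state names come from the post above; the transition conditions (`icache_miss`, `mem_request`, `mem_done`, `line_filled`) are assumptions added for illustration and do not describe any particular processor.

```python
from enum import Enum, auto


class State(Enum):
    RUN = auto()         # fetch, decode, execute, writeback all overlap here
    MEMORY = auto()      # waiting on a data memory access
    CACHE_LOAD = auto()  # waiting on an instruction cache line fill


def next_state(state, *, icache_miss=False, mem_request=False,
               mem_done=False, line_filled=False):
    """Hypothetical next-state function for the three-state controller."""
    if state is State.RUN:
        if icache_miss:
            return State.CACHE_LOAD
        if mem_request:
            return State.MEMORY
        return State.RUN
    if state is State.MEMORY:
        return State.RUN if mem_done else State.MEMORY
    # Remaining case: State.CACHE_LOAD
    return State.RUN if line_filled else State.CACHE_LOAD


s = State.RUN
for event in ({}, {"mem_request": True}, {}, {"mem_done": True}):
    s = next_state(s, **event)
    print(s.name)
```

With this few states the whole next-state function fits in a dozen lines, which is the argument for not bothering with a microcode ROM here.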
I find the FSM approach simple to implement, however the number of states needs to be limited.
I find that if the FSM starts to get complex maybe it’s better to implement things with software. Software originated to handle complex state transitions.
I confess I’ve not micro-coded anything of my own, so my view is a bit biased. I’ve studied it some and I think it has its place, especially in some of the more complex processors. I keep toying with the idea of trying to develop a micro-coded machine to handle a processor with a segmented, protected memory system. I think a micro-coded machine requires a little more thought be put into the design up-front, which is maybe not a bad thing. It may also require additional files, toolset (micro-code assembler) and skillset during development.
For a one-man hobbyist team things have to be kept simple or achieving any result will get drowned out by the LOC and bugs.
Hmm, micro-coding things might reduce the LOC.

_________________
Robert Finch http://www.finitron.ca


Wed Jan 09, 2019 6:46 am

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1780
Just a thought: while the OPC series are simple but designed around a state machine, the Gigatron might be worth a look for an example of something simple but microcoded. Simple in hardware organisation, anyway - there might well be complexity in the microcode program.

Edit: oops, no, it seems the Gigatron is a super-simple RISC Harvard machine. It then (generally) executes an interpreter which acts as a less simple von Neumann machine. Perhaps if you look at that outer level you can think of it as microcoded. Or perhaps I am trying to dig myself out of a hole.


Quote:
GCL is a compiled low-level language for writing simple games and
applications for the Gigatron TTL microcomputer, without bothering
the programmer with the harsh timing requirements of the hardware
platform.

vCPU is the interpreted 16-bit machine language running in the dead
time of the video/sound loop.

The two are closely tied together, and we sometimes mix them up.
Technically still, "GCL" is the source language or notation, while
"vCPU" is the virtual CPU, or interpreter, that is executing compiled
GCL instructions.

vCPU's advantages over native 8-bit Gigatron code are:
1. you don't need to think about video timing with everything you do
2. operations are 16-bits, and
3. programs can run from RAM.
Its disadvantage is that vCPU is slower than native code.

The main menu, `Snake', `Mandelbrot', `Tiny BASIC v2' and `WozMon'
are pure GCL programs.


(As for the original question, I could only guess at whether the small 8 bit microcontrollers are built with or without microcode.)


Wed Jan 09, 2019 5:34 pm

Joined: Sat Jun 16, 2018 2:51 am
Posts: 50
Thank you everyone for the input! While quite a bit of it is over my head, I'll keep revisiting your responses till it sinks in ;) !

Quote:
I'm a bit surprised to read that the original ARM1 is described as having microcode, but see here:
http://www.righto.com/2016/02/reverse-e ... ssors.html

Neat link, thanks!

Quote:
In that sense, the 6502 is also a microcoded CPU, as a significant amount of die area is devoted to a microcode ROM - but this ROM, and the instruction set it implements, was carefully designed to be as cheap to implement as possible (which resulted in the "undocumented" instructions which exercised the microcode in strange ways).

I'm not familiar with the 6502 architecture, definitely something to look into. Never heard of the Harwell WITCH, from the images/videos I am coming across it looks to be quite the beast!

Quote:
I find the FSM approach simple to implement, however the number of states needs to be limited.
Ah ok, good to know!

Quote:
it seems the Gigatron is a super-simple RISC Harvard machine. It then (generally) executes an interpreter which acts as a less simple von Neumann machine.
Thanks for this link. There's a lot to pick up from it...once you get over the initial overwhelm-ent.

---

With regards to microcode and flexibility, the Xerox Alto that Curious Marc and friends repaired comes to mind. I am not sure about the details, but in one of those videos it's mentioned.


Fri Jan 11, 2019 2:06 am