 "Homebrew CPUs: Messing around with a J1" 

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1780
Interesting article over on FPGArelated about the small and simple J1 CPU (good for Forth, apparently very clean Verilog, and only just over 100 lines of it), with some illustrations of how easy it is to make changes to it.

The original J1 is on GitHub.

The article also recommends "the must-read book by Philip Koopman, Stack Computers: The New Wave. If you haven't read it, you really should."


Fri Oct 30, 2015 10:53 pm

Joined: Mon Oct 07, 2019 2:41 am
Posts: 585
If you are using an FPGA, why not go to 24 bits? Memory is cheap compared to 1974.


Mon Oct 07, 2019 7:33 am

Joined: Sun Aug 04, 2013 2:19 am
Posts: 8
oldben wrote:
If you are using an FPGA, why not go to 24 bits? Memory is cheap compared to 1974.


A bit stale, but it is worth considering:

FPGA resources are surprisingly limiting:

* On-board block RAM is at a premium, as going off-board is much slower, noisier and more error-prone. Block RAM never comes in 24-bit widths. To build 24-bit memories we will use up a lot of interconnects and extra decoding and selection logic.

* Off-board SRAMs require custom boards, as modern FPGA devboards come with dynamic RAM. Dynamic RAM is much slower for random access and requires complex memory controllers, either proprietary or DIY projects more complex than a small CPU.

* If you ever do decide to go off-board, where do you get 24-bit memory?

* Interconnects are limited, and the wider you go, the more quickly the faster paths get used up; this compromises other circuitry on the FPGA as well.

* ALUs get slower, as ripple-carry propagation has a finite per-bit cost. Building more sophisticated carry circuits makes for undue complexity.

* These problems are multiplied many times over, as typical MISC CPUs perform many ALU operations on TOS and NOS simultaneously while the opcode is being decoded; the decoder then muxes the appropriate result. So we don't have a single ALU; we have a separate unit for each possible operation: add, subtract, and, or, shift, etc. Each unit brings more of the above-mentioned interconnect and carry issues.

* Wide muxes get very expensive quickly and require more layers of logic (especially on cheaper, older FPGAs). This also requires more interconnects and compromises maximum speed.

I could probably keep going but you get the idea. The whole point of these devices is to be very simple and very fast.
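
To make the "many units, then a mux" point from the list above concrete, here is a minimal Python model of it. It is only a sketch: the operation set, the opcode numbering and the helper names (`alu_units`, `alu`, `mux_input_bits`) are made up for illustration and are not the J1's actual encoding.

```python
# Model of "compute everything, then select": every unit produces a result
# from TOS and NOS, and the opcode only picks one of them.
# Operation set and opcode numbers are illustrative assumptions.

def alu_units(tos, nos, width=16):
    """Return the outputs of all units, as hardware would compute them in parallel."""
    mask = (1 << width) - 1
    return [
        (tos + nos) & mask,   # 0: add
        (nos - tos) & mask,   # 1: subtract
        tos & nos,            # 2: and
        tos | nos,            # 3: or
        tos ^ nos,            # 4: xor
        ~tos & mask,          # 5: invert
        (tos << 1) & mask,    # 6: shift left by one
        (tos & mask) >> 1,    # 7: shift right by one
    ]

def alu(opcode, tos, nos, width=16):
    """The 'decoder' step: select one of the already-computed results."""
    return alu_units(tos, nos, width)[opcode]

def mux_input_bits(width, n_units=8):
    """Bits the result mux must choose between: one full-width bus per unit."""
    return width * n_units

print(alu(0, 3, 4))         # 7
print(mux_input_bits(16))   # 128
print(mux_input_bits(24))   # 192
```

In this model, going from 16 to 24 bits widens every one of the eight result buses, so the mux sees 192 input bits instead of 128. How that translates into routing and logic levels depends on the device.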


Mon May 24, 2021 4:13 am

Joined: Mon Oct 07, 2019 2:41 am
Posts: 585
The only slowdown that makes sense to me is that the ALU has 8 more bits of ripple carry. I assume that the FPGA has ample interconnects and internal RAM/ROM. Why do I get the feeling that we are still in the '80s, when 16K DRAMs were standard memory and there was no memory past 32K? Forth is designed as a control language, so if you need data a bit larger than 16 bits, going to a wider word width makes more sense than a 32-bit Forth.
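
The "8 bits more ripple carry" cost can be modeled bit by bit. This is a rough Python sketch of a plain ripple-carry chain, not a timing model of any particular FPGA; the function name and the returned stage count are illustrative assumptions.

```python
def ripple_carry_add(a, b, width):
    """Add two width-bit values one full-adder cell at a time.

    Returns (sum, carry_out, stages), where stages is the number of cells
    the carry chain is built from, i.e. the worst-case carry path length.
    """
    carry = 0
    total = 0
    stages = 0
    for i in range(width):
        bit_a = (a >> i) & 1
        bit_b = (b >> i) & 1
        total |= (bit_a ^ bit_b ^ carry) << i                 # sum bit of this cell
        carry = (bit_a & bit_b) | (carry & (bit_a ^ bit_b))   # carry into next cell
        stages += 1
    return total, carry, stages

# Worst case: a carry generated at bit 0 ripples through every cell.
print(ripple_carry_add(0xFFFF, 1, 16))    # (0, 1, 16)
print(ripple_carry_add(0xFFFFFF, 1, 24))  # (0, 1, 24)
```

The chain grows from 16 to 24 cells, so in this simple model the worst-case carry path is 50% longer.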


Mon May 24, 2021 7:05 pm

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1780
Even the difference between wiring up some, or all, of the onboard 64k in an FPGA can make a significant difference in clock speed. Wiring is not free.


Mon May 24, 2021 9:09 pm

Joined: Sun Aug 04, 2013 2:19 am
Posts: 8
oldben: do not assume anything (except that there are people who know more than you do)... Instead, read about low-level FPGA interconnects, if you can find the information. Then try tracing interconnects using vendor tools. Read about routing blocks and try to figure out how they work and what the limitations are. Spend a few years building circuits and solving floorplanning issues and sudden unexplained slowdowns due to extra routing. Then your opinions and assumptions may be correct and worthy of publishing.

In the meantime, if you ask a question and someone bothered to put together a cogent answer, read it carefully before replying with your feelings of how the universe works.


Tue May 25, 2021 10:38 pm

Joined: Mon Oct 07, 2019 2:41 am
Posts: 585
The main reason I suggested a larger word size is that the 16-bit CPU only has a 13-bit PC.
I guess that just leaves going to a 32-bit design if you want more code space.
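
For reference, the arithmetic behind the 13-bit PC limit, as a small Python sketch. The 24-bit row is purely hypothetical: it assumes a 24-bit design would keep the same three non-address bits in a jump or call and so have 21 bits left for the target, which nothing in this thread confirms.

```python
def code_space(pc_bits, word_bits):
    """Return (addressable words, size in bytes) for a word-addressed PC."""
    words = 1 << pc_bits
    return words, words * word_bits // 8

print(code_space(13, 16))  # (8192, 16384): 8K words, 16 KiB
print(code_space(21, 24))  # (2097152, 6291456): hypothetical 24-bit variant
```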


Wed May 26, 2021 12:10 am

Joined: Sun Aug 04, 2013 2:19 am
Posts: 8
That is true, if you need more code space. However, the question is: why would you?

The design parameters for this CPU were selected to fit into a very small FPGA with minimal block RAM, primarily to drive a small bitmap display in a low-cost commercial product, Gameduino. 8K words was deemed sufficient by the author.

Have you written much assembly code yourself? 8K words is an awful lot of code if you don't use 'modern tools'. J1 code is smaller than you would expect. It operates on the data stack, so there are no parameters to shuffle, no stack frames to maintain, and no entry or exit code. Remember that every instruction is only one word long, including calls and jumps. And returns are often free. This makes small, reusable functions very attractive, leading to very dense code.
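
A rough Python sketch of why calls, jumps and returns are so cheap. The field positions below are written from memory of the published J1 description and should be treated as assumptions to check against the Verilog: a set top bit marks a 15-bit literal, the next class bits select jump, conditional jump, call or ALU, jumps and calls carry a 13-bit target, and ALU instructions carry a "return" bit so that a return can ride along with the last operation of a word.

```python
def decode(insn):
    """Classify one 16-bit instruction word (assumed J1-style layout)."""
    insn &= 0xFFFF
    if insn & 0x8000:                      # top bit set: literal
        return ("literal", insn & 0x7FFF)
    kind = (insn >> 13) & 0x3              # two class bits below the top bit
    target = insn & 0x1FFF                 # 13-bit target, same width as the PC
    if kind == 0:
        return ("jump", target)
    if kind == 1:
        return ("cjump", target)           # conditional jump
    if kind == 2:
        return ("call", target)
    return ("alu", {
        "return": bool(insn & 0x1000),     # assumed position of the return bit
        "op": (insn >> 8) & 0xF,           # assumed position of the ALU op field
    })

print(decode(0x8005))  # ('literal', 5)
print(decode(0x4123))  # ('call', 291)
print(decode(0x7000))  # ('alu', {'return': True, 'op': 0})
```

With a layout like this, a call to any address in the 8K-word space fits in one word, and a word whose last instruction is an ALU operation needs no separate return instruction.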


Wed May 26, 2021 12:44 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Quote:
That is true, if you need more code space. However, the question is: why would you?

It is a lot easier to develop an app in a high-level language using compiled or interpreted code than it is to use assembler. Most high-level languages and OSes will easily blow the 64kB limit of a 16-bit machine. How big is a JVM? Then add the app. Then add the OS for support. Probably >128kB. Something like Linux may require >512kB. Rather than fiddle with bank switching and overlay loaders, it is more cost-effective to use a 32-bit machine. Today's software and apps are making use of 64-bit hardware. 32-bit is becoming retro.
FPGAs, even low-cost ones these days, have enough capacity for 32-bit processing. Why not use it? A 32-bit RISC-V machine with a few peripherals fits easily into an xc7a15, the smallest device. A few years ago, the challenge was to fit a CPU into a device, the devices being much smaller than what is available today.

_________________
Robert Finch http://www.finitron.ca


Mon May 31, 2021 3:46 pm

Joined: Mon Oct 07, 2019 2:41 am
Posts: 585
One other factor is that before the 1990s, programs were more thrifty with memory and file I/O than they are today. 64 KB was ample for most 1970s Unix programs, as the REAL OS handled I/O and the shell in the background.
Rather than saying a program uses ### KB of memory, that needs to be redefined as # K of instructions, # K of static memory (pointer size), and # K of array data (int and floating point). This ignores the heap and factors out the device's native word length.


Mon May 31, 2021 7:10 pm

Joined: Mon Jan 22, 2018 2:49 pm
Posts: 19
I like the J1 a lot, and its Novix/RTX200x predecessors.

The source code nudged me into learning Verilog, and I quit using VHDL for new designs.


Tue Jun 01, 2021 4:06 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Quote:
The source code nudged me into learning Verilog, and I quit using VHDL for new designs.

I found SystemVerilog to be even easier to use than Verilog. Two big features I use are structures and passing arrays to modules.

_________________
Robert Finch http://www.finitron.ca


Thu Jun 10, 2021 2:35 am