 Kobold - an innovative 16 bit TTL Retrocomputer 

Joined: Mon Aug 14, 2017 8:23 am
Posts: 83
Hi All,

Over on hackaday.io, Roelh has been busy with his latest TTL project:

https://hackaday.io/project/164897-kobo ... l-computer

It's a 16-bit machine, entirely in 74xx TTL, with a 20 bit address space.

The instruction set is inspired by PDP-11 and 68000.

The CPU is 5.4" x 4" and contains 34 TTL ICs and a microcode ROM. It is a daughterboard that plugs into a motherboard that holds memory, video generation hardware and peripherals.

The ISA appears to be fully architected and a prototype PCB has been designed. There are some clever tricks in this design, and Roelh has obviously put a lot of thought into this project.

I'm looking forward to further updates.



Ken


Fri Aug 23, 2019 5:10 pm

Joined: Fri Mar 22, 2019 8:03 am
Posts: 184
Location: Girona-Catalonia
That's a great find. Thanks for sharing !!


Sat Aug 24, 2019 3:06 pm

Joined: Mon Oct 07, 2019 1:26 pm
Posts: 6
Hi all,

I made a huge change to the Kobold design; the new design is named Kobold K2.

You can find it here: https://hackaday.io/project/167605-kobo ... l-computer

Some characteristics stayed the same:
- 16 bit processor
- 20 bit address

But new characteristics are:
- no microcode. RISC instead of CISC.
- full 16 bit data path and 16 bit ALU
- 2 cycles per instruction (a Fetch cycle and an Execute cycle)
- von Neumann type, single memory space
- four 16-bit data registers D0-D3
- four 16 bit address registers A0-A3 with four 4-bit page registers. This includes PC (A0).

The performance will improve a lot, mainly due to the RISC strategy, fast access to 4 data registers and 4 address registers, and to having everything 16 bit wide.

Focus is still on a minimum amount of parts. No programmable components and no 74181. The CPU will be smaller than a Eurocard.

This is not a load/store architecture. Most instructions have a general operand and a register operand.
The general operand can be:

- (An) // register indirect
- (An+N) // register indirect with displacement (0-15)
- #nnnn // immediate, equivalent to (A0+N) (that addresses a nearby location in the program).
- label // absolute addressing, also called Zero-page addressing. 64 available locations.
- A0-A3 // one of the address registers

The register operand can be D0-D3 or A0-A3.

There are not many instructions, and for several actions you need one instruction more than usual. But the most frequently used instructions are included. The ALU only knows the operations ADD and NOR.

- By setting one of the ALU operands to zero, ADD becomes MOV (load) and NOR becomes MOVC (move complement value)
- Subtract can be done by MOVC followed by ADD with carry set
- a data register can be complemented by NOR #0,Dn
- AND and OR can be done by a combination of NOR and complement
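
The derivations in this list can be checked with a small model. This is only a minimal Python sketch, not the actual Kobold K2 datapath: the function names and the 16-bit masking are assumptions made for illustration, and only `add` and `nor` are treated as primitive operations.

```python
MASK = 0xFFFF  # 16-bit data path

def add(a, b, carry_in=0):
    """Primitive ADD: returns (16-bit result, carry out)."""
    total = (a & MASK) + (b & MASK) + carry_in
    return total & MASK, total >> 16

def nor(a, b):
    """Primitive NOR on 16-bit words."""
    return ~(a | b) & MASK

def mov(x):
    # ADD with the other operand forced to zero is a plain move.
    return add(x, 0)[0]

def movc(x):
    # NOR with the other operand forced to zero is a complementing move.
    return nor(x, 0)

def sub(a, b):
    # a - b = a + ~b + 1: MOVC followed by ADD with carry set.
    return add(a, movc(b), carry_in=1)

def or_(a, b):
    # OR is the complement of NOR.
    return movc(nor(a, b))

def and_(a, b):
    # AND is the NOR of the two complements (De Morgan).
    return nor(movc(a), movc(b))
```

In this model the carry out of `sub(a, b)` is 1 exactly when `a >= b` (unsigned), which is the kind of condition a carry-only branch can test.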

Branching is only possible on a carry/no-carry condition, so checking for equality needs one more instruction than usual. Both branching and instruction sequencing work in an unusual way; please refer to the project for details.
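
One possible way to test equality with only a carry flag is sketched below. This is an assumption for illustration, not necessarily the sequence used in the project: subtract first, then add 0xFFFF to the difference, which produces a carry for every value except zero. The helper names are hypothetical.

```python
MASK = 0xFFFF  # 16-bit data path

def add(a, b, carry_in=0):
    """16-bit ADD returning (result, carry out)."""
    total = (a & MASK) + (b & MASK) + carry_in
    return total & MASK, total >> 16

def is_equal(a, b):
    # Step 1: diff = a - b, formed as a + ~b + 1.
    diff, _ = add(a, ~b & MASK, carry_in=1)
    # Step 2 (the extra instruction): adding 0xFFFF carries unless diff is 0.
    _, carry = add(diff, MASK)
    # A branch-on-no-carry would be taken when the operands are equal.
    return carry == 0
```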


Mon Oct 07, 2019 2:09 pm

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1277
Interesting update - I will read what you've written about the new machine.


Mon Oct 07, 2019 9:31 pm

Joined: Fri Mar 22, 2019 8:03 am
Posts: 184
Location: Girona-Catalonia
Thanks for sharing this. It's all very instructive. I have a few questions though.

- I see that the mechanics of the instruction set are very special, particularly the use of instruction "slots" and the non-linear execution of instructions. I understand it's designed this way to help fit all the encodings in 16 bits while still allowing direct decoding. Is this correct?

- What programs or code do you plan to run on the processor, and how does this fit with your 1Mb addressing space? Do you plan to implement a compiler for it?

- What are the thick lines on the pcb layouts?

Quoted from hackaday: "[VGA Support] will be similar to the first Kobold. But interrupts are difficult in the new design, so I plan to use a DMA system where the video system stops the CPU to obtain access to the shared RAM. So the CPU will only run during blanking time."

- Is this correct? Will the CPU only run during the blanking intervals, just like the Gigatron? And won't this heavily penalise performance?

- I understand that your original Kobold design used the blanking intervals to update the display memory through interrupts, thus leaving all the remaining time for other CPU tasks. So basically the opposite of what you are proposing now. Why did you abandon the previous concept?

Thanks


Tue Oct 08, 2019 10:33 am

Joined: Mon Oct 07, 2019 1:26 pm
Posts: 6
Hi Joan,

Quote:
- I see that the mechanics of the instruction set are very special, particularly the use of instruction "slots" and the non-linear execution of instructions. I understand it's designed this way to help fit all the encodings in 16 bits while still allowing direct decoding. Is this correct?


That's not quite correct. Without the instruction slots there would be much more room to encode instructions, and/or decoding could be simpler.

The program counter is always problematic in a minimal parts CPU. You could use 4 hc161, then you also need two 8-bit buffers to place the PC on the address bus (unless you use Harvard). Harvard was discarded because it complicates (takes several parts) executing from RAM and loading programs in there, and accessing constants in tables, and using immediates.

In Kobold-one I used only a 4 bit PC that took the place of the lower 4 bits in the 'full' pc that resides in the HC670's. Then at every 16th instruction you need a jump to the next section.

But in Kobold K2 I had some room left in the 16-bit instruction (normally there is never enough room, but with a 2-instruction ALU, and without instructions that include 8-bit immediates, there were spare bits). I took inspiration from prehistoric computers that executed instructions from a magnetic drum, where every instruction contains the address of the following instruction. Now, if 3 bits in the instruction point to the next instruction, there is no need for a program counter. And by appending these bits to a 'normal' register, the total address space increases! The PC is now just a register that points to the current block of eight instructions. It is a counter that cannot, and does not, count. Every 8th instruction is an instruction that uses the ALU to increment the PC (just as if it were any other register).
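
The idea of instructions that name their own successor can be illustrated with a toy model. This is a rough Python sketch only: the `run_block` helper, the dictionary layout and the plain-text instruction labels are all hypothetical, and the real encoding is described on the project page.

```python
def run_block(block, start_slot=0, max_steps=16):
    """Follow the next-slot links inside one 8-instruction block.

    `block` maps a slot number (0-7) to (label, next_slot). A next_slot
    of None stands in for an instruction that leaves the block, such as
    the one that increments the PC register.
    """
    order = []
    slot = start_slot
    for _ in range(max_steps):
        label, next_slot = block[slot]
        order.append(label)
        if next_slot is None:
            break
        slot = next_slot & 0b111  # only 3 bits select the next slot
    return order

# The slots are deliberately out of order: execution follows the links,
# not the slot numbers.
block = {
    0: ("load", 5),
    5: ("add", 2),
    2: ("store", 7),
    7: ("increment A0", None),
}

print(run_block(block))
```

Because no counter is involved, the order of instructions within the block is free, which is what makes the "arbitrary sequence" tricks possible.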

Now this brings new possibilities/features.

1) Big advantage is, that there is no longer a barrier of 32K instructions. The instruction address space is now 32Kword x 8 = 256K instructions. If you have a jump, the immediate operand provides the 16 program counter bits (bit0 always 0) and 3 bits in the instruction extend the address. Same for Calls.

2) 16-bit immediates can be addressed as PC+displacement. There is no need for logic to skip the next instruction because it is an immediate. Just jump to another slot number.

3) Within the 8-instruction block, the instructions can be placed in arbitrary sequence. This gives freedom to do some tricks.

4) For return addresses, you only have to store the 16 bit PC, but you can return anywhere within the 256K program. But per group of 8 instructions only one call is possible (or things will get more difficult).

5) Every ADD instruction can have an integrated branch (BRC / BRNC ) when the branch is within current instruction block.

6) Unconditional jumps within the current block can be integrated into every instruction.
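
The 256K figure in point 1 follows from simple counting, sketched here in Python. The constants restate the numbers given above; the `instruction_index` helper and its linear ordering are purely hypothetical and exist only to count distinct (PC, slot) pairs.

```python
PC_BITS = 16    # width of the PC register
SLOT_BITS = 3   # next-instruction bits carried in each instruction

blocks = 2 ** (PC_BITS - 1)   # bit 0 of the PC is always 0
slots = 2 ** SLOT_BITS
instructions = blocks * slots

def instruction_index(pc, slot):
    """Number an (even PC, slot) pair; the ordering is an assumption."""
    assert pc & 1 == 0 and 0 <= pc <= 0xFFFF and 0 <= slot < slots
    return (pc >> 1) * slots + slot

print(blocks, instructions)
```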

To ease programming, the assembler language will have quite normal BRC / BRNC instructions, and the assembler will either make them a JMP, or optimize them away because the jump is within the block.

There are also some problems, though. Tables in the program might have a hole in the address bits, or must be placed on different pages to distribute them evenly over the memory.

But I don't design for commercial use, and I like to do things a little bit different.

Quote:
- What programs or code do you plan to run on the processor, and how does this fit with your 1Mb addressing space? Do you plan to implement a compiler for it?


I have started with an assembler in JavaScript, modifying the assembler http://www.enscope.nl/rrca/ of the RISC Relay CPU https://hackaday.io/project/11012-risc-relay-cpu. After that, there might come a C compiler. The C compiler might even run on the device itself, but that will be a long way to go. Programming some simple games will be nice.

Planned are two 256Kw RAM's (one of them used for video) and some flash memory. Programs can execute from flash or from RAM.

Quote:
- What are the thick lines on the pcb layouts?


Those are the power and ground lines. I made them quite wide to have a low ground and power impedance (perhaps overdone).

Quote:
Quoted from hackaday: "[VGA Support] will be similar to the first Kobold. But interrupts are difficult in the new design, so I plan to use a DMA system where the video system stops the CPU to obtain access to the shared RAM. So the CPU will only run during blanking time."

- Is this correct? Will the CPU only run during the blanking intervals, just like the Gigatron? And won't this heavily penalise performance?


I started with Kobold One.

Well I thought, if programming a game, the CPU will mostly be writing to the screen and that can only be done in blanking time (unless there are two independent frame buffers). So the CPU might spend most of the non-blanking time waiting. And there must be a system for the CPU to check if blanking time has come.
By letting it run only during blanking, the cpu can always write to the screen when it wants. And the RAM can be shared without having buffers or multiplexers in data and address bus.

And the next thought was, if it only runs during blanking time, then it better run fast. So I added data registers, made the ALU 16 bit, and changed to RISC.
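
To put a rough number on "only runs during blanking time", the sketch below computes the blanking share of a frame. It assumes standard 640x480 at 60 Hz VGA timing (800 pixel clocks per line, 525 lines per frame); the thread does not say which video mode Kobold K2 will use, so the result is indicative only.

```python
# Assumed timing: standard 640x480 @ 60 Hz VGA. The actual Kobold K2
# video mode is not stated in this thread.
H_VISIBLE, H_TOTAL = 640, 800   # pixel clocks per line
V_VISIBLE, V_TOTAL = 480, 525   # lines per frame

def blanking_fraction(h_visible, h_total, v_visible, v_total):
    """Fraction of the frame time spent in horizontal or vertical blanking."""
    visible = h_visible * v_visible
    total = h_total * v_total
    return 1 - visible / total

frac = blanking_fraction(H_VISIBLE, H_TOTAL, V_VISIBLE, V_TOTAL)
print(f"{frac:.1%} of the frame time is blanking")
```

Under that assumption a bit more than a quarter of the time is available to the CPU, which is why a faster data path matters.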

Note that 'only run during blanking time' is not integrated into the CPU itself. You could run it all the time and have a video system that does double buffering. Perhaps the video system will also be on a card that is connected to the main board, making several video versions possible.

Quote:
- I understand that your original Kobold design used the blanking intervals to update the display memory through interrupts, thus leaving all the remaining time for other CPU tasks. So basically the opposite of what you are proposing now. Why did you abandon the previous concept?


If a CPU uses microcode, it is not difficult to check the interrupt signal at the end of each instruction. If the interrupt occurs, the microcode can do all needed actions for saving PC and other program state.

In the RISC design this is harder to do. You will need an extra hardware register as temporary PC storage, and logic to handle that. The Kobold K2 has a part of the PC in a register, and 3 bits in the instruction register. That really complicates saving the state.

Perhaps it would be easier if interrupts were only allowed when there is a jump to another instruction block. But that would increase interrupt response time, and especially in a horizontal sync interrupt there is not much time. There may not be enough time left to make the HSYNC time useful.

I simply chose not to support interrupts, keeping the component count low.


However the video system will need to be smarter now because there are no interrupts that can do smart things. I'm working on it.


I hope this answers your questions !


Tue Oct 08, 2019 12:16 pm

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1277
That non-incrementing PC is quite something!


Tue Oct 08, 2019 4:47 pm

Joined: Fri Mar 22, 2019 8:03 am
Posts: 184
Location: Girona-Catalonia
roelh wrote:
Hi Joan,
I hope this answers your questions !

Yes, thanks, that answers them. In some ways I still regard your original design as a good one, though. I kind of liked your approach to VGA, which shouldn't affect performance much (provided there was enough time to keep the screen updated).


Tue Oct 08, 2019 10:11 pm

Joined: Mon Oct 07, 2019 1:26 pm
Posts: 6
Done some small changes today. Shift left should be possible by adding to itself, but only a memory location or address register can be added to a data register, so a multiple-shift would require two instructions per shift. Now changed, the 299's will do the shift in the same way as right shift. The POLL option had to go for this. Schematics and descriptions on Hackaday were updated.
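
For reference, the add-to-itself form of a left shift looks like this in a minimal Python sketch. The function name is hypothetical, and it models only the arithmetic, not the 299-based shifter that the design now uses.

```python
MASK = 0xFFFF  # 16-bit data path

def shift_left_by_add(x):
    """Shift left one bit by computing x + x; returns (result, carry out)."""
    total = (x & MASK) + (x & MASK)
    # The carry out is the bit that was shifted out of the top.
    return total & MASK, total >> 16
```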

Also entered the schematic in Logisim. The first few instructions were successfully simulated.


Fri Oct 11, 2019 1:47 pm

Joined: Mon Aug 14, 2017 8:23 am
Posts: 83
Roelh,

Thanks for the update.

I'm planning on using a 74xx194 universal 4-bit shift register as the ALU/accumulator register for my 4-bit wide bitslice design.

I definitely like your use of the (still available) 74xx670 for the address and data registers - but I am still working my way through your instruction slot concept, and the object model of memory addressing - both of which I believe could be very powerful techniques.

I wonder whether the LogiSim .dig file format is compatible with "Digital". https://github.com/hneemann/Digital

It would be good if there was a common file format with which we can share ideas.

Best,


Ken


Fri Oct 11, 2019 2:18 pm

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1277
Just got to the bit where you OR the displacement and the bank nibble. Again, that's quite something! (If I understand correctly, your 64 addressable 16 byte structures are scattered through memory, as the displacement nibble is prepended at the high end of the address.)

Do I take it that one of the four data registers is hard coded to be zero? Or do I misunderstand the annotations?


Fri Oct 11, 2019 2:20 pm

Joined: Mon Oct 07, 2019 1:26 pm
Posts: 6
Monsonite, Logisim is open source now and it has offspring. Don't know if Digital is one of them, but if it is, the files might be compatible.
I'll put the Logisim file on Hackaday soon. Just thinking of one more little change. But you could also just install Logisim to read my file...

Ed, the memory could perhaps better be seen as two-dimensional:
- 1st dimension is address 16 bit
- 2nd dimension is 4-bit displacement or page number, or 3-bit instruction slot.

I think for normal use, in the 16-bit space you should decide which sections you use with a page register, and which sections you use with displacement, and then stick to that, unless you want to do tricks.

In Kobold 1 and earlier designs (RISC Relay CPU), I always OR'ed the displacement value with the lower address bits (A4-A1 for a mixed word/byte machine). For Kobold, I needed page registers to go above 64K.

In Kobold K2, I started by completely separating the displacement/slot from the address, which allows addressing more than 64K. I once made a programming language, a bit like LISP, that had 8 fields per object (where LISP has 2). This was an almost exact match.

Later on, I figured that for several applications it might be clumsy that you can only reach the full memory by using displacements that are fixed in instructions. Think of things like loading programs, copying memory sections or writing to VRAM. So I felt the need for a register that could take the place of the displacement. And that register came as an HC670, a private page register for each address register.
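
The two-dimensional view of memory can be sketched as follows. This is an interpretation pieced together from the posts above, not the confirmed hardware behaviour: it assumes the 4-bit page register and the 4-bit displacement are OR'ed together and placed above the 16-bit address to form the 20-bit address. The function name is hypothetical.

```python
def physical_address(addr16, page=0, displacement=0):
    """Form a 20-bit address from a 16-bit address and a 4-bit high part.

    Assumption: page and displacement are OR'ed into one nibble that is
    prepended at the high end of the address.
    """
    high = (page | displacement) & 0xF
    return (high << 16) | (addr16 & 0xFFFF)

# Consecutive fields of one "object" at address 0x0040 end up 64K words apart.
fields = [physical_address(0x0040, displacement=d) for d in range(3)]
print([hex(a) for a in fields])
```

This is why structures addressed by displacement appear scattered through memory, and why a page register is handier for bulk operations such as copying a memory section.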

Quote:
Do I take it that one of the four data registers is hard coded to be zero? Or do I misunderstand the annotations?

I don't know where you got this from. None of the data registers has to be zero. But in the first concept of Kobold K2, there was indeed a zero data register. You can still see that in my hand-drawn picture in the first HAD log, where the 2-bit data register address becomes 00 when the IR_L bit is active. Then later I connected the IR_L bit to the output enable of the data registers, and used pulldown resistors. This seems possible because the signals don't go to the CPU connector; they only drive the NOR and ADD section in the ALU, so the load is quite low.
One of the problems with the always-zero register is that you have to initialize it, otherwise there will be no LOAD instructions (because the ALU will always ADD or NOR the operand with a data register). And how do you initialize it (in a 670) when there is no LOAD instruction, and the 670 has no reset?


Fri Oct 11, 2019 3:17 pm

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1277
It was this diagram and those nearby which have a Zero annotation:
Image


Fri Oct 11, 2019 3:26 pm

Joined: Mon Oct 07, 2019 1:26 pm
Posts: 6
I'm sorry... was a leftover from the first Kobold K2 version. I'll remove the 'Zero'.


Fri Oct 11, 2019 3:43 pm

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1277
Ah, right! In the OPC machines, we have more registers, and we do keep R0 as zero, but we do that by intercepting the reads, not by controlling the content of the register.
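
The difference between the two approaches to a zero register can be shown with a small model. This is a generic Python sketch of "intercepting the reads", not the OPC implementation; the class name and the register count of 16 are assumptions.

```python
class RegisterFile:
    """Register file where reads of R0 are forced to zero."""

    def __init__(self, count=16):
        self._regs = [0] * count

    def write(self, index, value):
        # Writes are not special-cased; R0's storage may hold anything.
        self._regs[index] = value & 0xFFFF

    def read(self, index):
        # The read path overrides R0, so it never needs initializing.
        return 0 if index == 0 else self._regs[index]

rf = RegisterFile()
rf.write(0, 0x1234)
rf.write(3, 7)
print(rf.read(0), rf.read(3))
```

Since the stored content of R0 is never observed, the initialization problem of an always-zero register does not arise.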


Fri Oct 11, 2019 4:27 pm