Hi Joan,
Quote:
- I see that the instruction set working mechanics is very special, particularly considering the use of instruction "slots" and the non-lineal execution of instructions. I understand it's designed in this way to help fitting all the encodings in 16 bits while still allowing direct decoding. Is this correct?.
That's not quite correct. Without the instruction slots there would be much more room to encode instructions, and/or decoding could be simpler.
The program counter is always problematic in a minimal-parts CPU. You could use four HC161s, but then you also need two 8-bit buffers to place the PC on the address bus (unless you use Harvard). Harvard was discarded because it complicates (takes several extra parts) executing from RAM, loading programs there, accessing constants in tables, and using immediates.
In Kobold-one I used only a 4 bit PC that took the place of the lower 4 bits in the 'full' pc that resides in the HC670's. Then at every 16th instruction you need a jump to the next section.
But in Kobold K2 I had some room left in the 16-bit instruction (normally there is never enough room, but with a 2-instruction ALU, and not having instructions with 8-bit immediates included, there were spare bits). I took inspiration from prehistoric computers that executed instructions from a magnetized barrel, where every instruction contains the address of the following instruction. Now, if 3 bits in the instruction point to the next instruction, there is no need for a program counter. And by appending these bits to a 'normal' register, the total address space increases! The PC is now just a register that points to the current block of eight instructions. It is a counter that cannot, and does not, count. Every 8th instruction is an instruction that uses the ALU to increment the PC (just as if it were any other register).
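As a rough illustration of the idea above, here is a minimal Python sketch of fetching without a counting PC. The linear mapping (block number times 8 plus slot), the tuple layout and the names `address`, `run` and `INC_PC` are assumptions made for illustration; they are not the actual K2 encoding or address wiring.

```python
# Hypothetical model: every instruction names the slot (0-7) of its successor.
# The mapping block * 8 + slot is an assumption, not the real K2 address layout.
SLOTS_PER_BLOCK = 8

def address(block, slot):
    """Memory index: the block register with the 3 slot bits appended."""
    return block * SLOTS_PER_BLOCK + slot

def run(memory, block=0, slot=0, max_steps=100):
    """Follow the next-slot links and return the names of executed instructions."""
    trace = []
    for _ in range(max_steps):
        name, next_slot = memory[address(block, slot)]
        trace.append(name)
        if name == "HALT":
            break
        if name == "INC_PC":
            # The block register only changes when an instruction
            # explicitly increments it through the ALU.
            block += 1
        slot = next_slot
    return trace

# Instructions inside block 0 are placed in arbitrary slot order.
program = {
    address(0, 0): ("A", 5),
    address(0, 5): ("B", 2),
    address(0, 2): ("INC_PC", 0),
    address(1, 0): ("HALT", 0),
}

print(run(program))
```

In this sketch the execution order is determined entirely by the slot links, so the physical order of instructions within a block does not matter.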
Now this brings new possibilities/features.
1) A big advantage is that there is no longer a barrier of 32K instructions. The instruction address space is now 32Kword x 8 = 256K instructions. If you have a jump, the immediate operand provides the 16 program counter bits (bit0 always 0) and 3 bits in the instruction extend the address. The same goes for calls.
2) 16-bit immediates can be addressed as PC+displacement. There is no need for logic to skip the next instruction because it is an immediate. Just jump to another slot number.
3) Within the 8-instruction block, the instructions can be placed in arbitrary sequence. This gives freedom to do some tricks.
4) For return addresses, you only have to store the 16 bit PC, but you can return anywhere within the 256K program. But per group of 8 instructions only one call is possible (or things will get more difficult).
5) Every ADD instruction can have an integrated branch (BRC / BRNC ) when the branch is within current instruction block.
6) Unconditional jumps within the current block can be integrated into every instruction.
To ease programming, the assembler language will have quite normal BRC / BRNC instructions, and the assembler will either make them a JMP, or optimize them away because the jump is within the block.
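The decision the assembler has to make for a branch can be sketched roughly like this. This is only an illustration under assumptions: the `(block, slot)` location model and the names `resolve_branch` and `total_instructions` are hypothetical, and the real assembler may work differently.

```python
# Hypothetical assembler helper; not the actual Kobold K2 assembler.
SLOTS_PER_BLOCK = 8
BLOCKS = 1 << 15  # 16-bit PC with bit 0 always 0 leaves 15 usable bits

def total_instructions():
    """Size of the instruction address space: blocks times slots per block."""
    return BLOCKS * SLOTS_PER_BLOCK

def resolve_branch(source, target):
    """Decide how a branch from source to target could be emitted.

    source and target are (block, slot) pairs.
    """
    src_block, _ = source
    dst_block, dst_slot = target
    if src_block == dst_block:
        # Same block: only the 3 next-slot bits have to point at the target,
        # so no separate jump instruction is needed.
        return ("in_block", dst_slot)
    # Different block: a JMP whose immediate selects the block and whose
    # slot bits select the instruction inside that block.
    return ("jmp", dst_block, dst_slot)

print(total_instructions())
print(resolve_branch((3, 1), (3, 6)))
print(resolve_branch((3, 1), (4, 0)))
```

With these assumptions, `total_instructions()` gives 262144, which matches the 256K figure in point 1.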
There are also some problems, though. Tables in the program might have a hole in the address bits, or must be placed on different pages to distribute them evenly over the memory.
But I don't design for commercial use, and I like to do things a little bit different.
Quote:
- What programs or code do you plan to run on the processor, and how does this fit with your 1Mb addressing space? Do you plan to implement a compiler for it?
I have started with an assembler in JavaScript, modifying the assembler (http://www.enscope.nl/rrca/) of the RISC Relay CPU (https://hackaday.io/project/11012-risc-relay-cpu). After that, there might come a C compiler. The C compiler might even run on the device itself, but that is still a long way off. Programming some simple games will be nice.
Planned are two 256Kw RAM's (one of them used for video) and some flash memory. Programs can execute from flash or from RAM.
Quote:
- What are the thick lines on the pcb layouts?
Those are the power and ground lines. I made them quite wide to have a low ground and power impedance (perhaps overdone).
Quote:
Quoted from hackaday: "[VGA Support] will be similar to the first Kobold. But interrupts are difficult in the new design, so I plan to use a DMA system where the video system stops the CPU to obtain access to the shared RAM. So the CPU will only run during blanking time."
- Is this correct?, will the cpu only run during the blanking intervals just like the Gigatron?. And won't this heavily penalise performance?.
I started with Kobold One.
Well I thought, if programming a game, the CPU will mostly be writing to the screen and that can only be done in blanking time (unless there are two independent frame buffers). So the CPU might spend most of the non-blanking time waiting. And there must be a system for the CPU to check if blanking time has come.
By letting it run only during blanking, the cpu can always write to the screen when it wants. And the RAM can be shared without having buffers or multiplexers in data and address bus.
And the next thought was, if it only runs during blanking time, then it better run fast. So I added data registers, made the ALU 16 bit, and changed to RISC.
Note that 'only run during blanking time' is not integrated into the CPU itself. You could run it all the time and have a video system that does double buffering. Perhaps the video system will also be on a card that is connected to the main board, making several video versions possible.
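To get a feeling for how much time 'only during blanking' leaves, here is a small back-of-the-envelope sketch. It assumes the standard 640x480 @ 60 Hz VGA timing (800 pixel clocks per line, 525 lines per frame); whether the K2 video system will use exactly this mode is an assumption, so treat the result as an estimate only.

```python
# Standard 640x480 @ 60 Hz VGA timing.
# That the K2 video system uses this mode is an assumption.
H_VISIBLE, H_TOTAL = 640, 800  # pixel clocks per line
V_VISIBLE, V_TOTAL = 480, 525  # lines per frame

def blanking_fraction():
    """Fraction of each frame spent in horizontal or vertical blanking."""
    visible = H_VISIBLE * V_VISIBLE
    total = H_TOTAL * V_TOTAL
    return 1 - visible / total

print(f"{blanking_fraction():.1%}")
```

Under that assumption, roughly a quarter of the frame time would be available to the CPU.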
Quote:
- I understand that your original Kobolt design used the blanking intervals to update the display memory through interrupts, thus leaving all the remaining time for other cpu tasks. So basically the opposite than what you are proposing now. Why did you abandon the previous concept?
If a CPU uses microcode, it is not difficult to check the interrupt signal at the end of each instruction. If the interrupt occurs, the microcode can do all needed actions for saving PC and other program state.
In the RISC design this is harder to do. You will need an extra hardware register as temporary PC storage, and logic to handle that. The Kobold K2 has a part of the PC in a register, and 3 bits in the instruction register. That really complicates saving the state.
Perhaps it would be easier if interrupts were only allowed when there is a jump to another instruction block. But that would increase interrupt response time, and especially in a horizontal sync interrupt there is not much time. There may not be enough time left to make the HSYNC time useful.
I simply chose not to support interrupts, which keeps the component count low.
However, the video system will need to be smarter now because there are no interrupts that can do smart things. I'm working on it.
I hope this answers your questions !