 rf68000 - 68k similar core 

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2405
Location: Canada
I haven't done much more on this project. Instead I've taken up writing a sci-fi story which is taking up more of my time.


I had a brief look at floating point support. I've written some FP primitives but they have limited pipelining so they use a slow clock. I really should go back and add more pipelining to increase the fmax.


I'm backtracking on the mailbox implementation. It's questionable to connect every mailbox to every sender in parallel. I think it would be slow because of all the routing. So it's back to having a message queue in the router component.

_________________
Robert Finch http://www.finitron.ca


Tue Jun 26, 2018 5:18 am WWW

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1841
> SF story
Interesting - hope we get a chance to read it one day. Does it involve a rogue CPU?


Tue Jun 26, 2018 8:22 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2405
Location: Canada
Quote:
Interesting - hope we get a chance to read it one day. Does it involve a rogue CPU?

The story involves an immortal woman escaping from a planet about to be fried by its star. Sorry, no rogue CPU in the story. There is only limited AI in it. It takes place at about a 1990 level of technology, with one or two exceptions.

I’ve written about 11,000 words over the last two weeks. It’ll likely be a year or so before the story is done.

_________________
Robert Finch http://www.finitron.ca


Wed Jun 27, 2018 1:15 am WWW

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1841
It's a huge amount of work, I know that - but not from personal experience. Good luck with the endeavour!


Wed Jun 27, 2018 7:12 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2405
Location: Canada
Added states to support floating point to FT68000x16.v. While the 68000 supports several different operand types (single, double, long, word, byte, etc.), the FT68k will only support doubles (64 bits) to begin with. There is some finagling of the datapath necessary as the FT68k previously didn’t support data over 32 bits.

Used an unused mode, reg combination of the 68k (mode 7, reg 6) to implement reading of 64-bit immediate constants. Immediate constants are handled slightly differently in FT68k. Rather than look at the op size, there is a separate mode, reg combination to support encoding 16, 32, or 64-bit constants. This allows longer operations to make use of shorter constant encodings. It does make the required instruction encodings not 100% binary compatible, but very close. Since FT68k is little endian, constants are encoded differently than on the 68k anyway.

_________________
Robert Finch http://www.finitron.ca


Fri Jun 29, 2018 4:11 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2405
Location: Canada
Now that I have selected multiple mode,reg combinations to represent immediates, I've decided that was a bad way to do things, so I'm going to change it all around and use just a single mode,reg combo. This came from a desire to support extended precision (96-bit) arithmetic. The problem is that so many different sizes of constants are required; I want to be able to support constants of 128 bits and smaller. Instead, immediate values are going to be processed as 16-bit parcels, with bit 15 of each parcel indicating whether more bits are to be included in the constant (0 means include the next parcel). Each parcel therefore carries 15 bits of the constant, so 128 bits take nine parcels (9 x 15 = 135 bits). A small constant (less than 16 bits) will have bit 15 set to indicate that it is the last parcel required. I'm going to modify the core so that addresses and displacements work the same way.
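To make the parcel scheme concrete, here is a minimal Python sketch of how such a constant might be split up and reassembled. The post does not say whether the low-order or high-order parcel comes first, nor how signed values are handled, so this sketch assumes low-order bits first and unsigned constants only. The names `encode_constant` and `decode_constant` are hypothetical, not taken from the core or the assembler.

```python
PAYLOAD_BITS = 15
PAYLOAD_MASK = (1 << PAYLOAD_BITS) - 1
LAST_FLAG = 1 << 15  # bit 15 set marks the final parcel (0 means another follows)


def encode_constant(value):
    """Split a non-negative integer into 16-bit parcels.

    Assumption: low-order 15 bits go in the first parcel.
    """
    if value < 0:
        raise ValueError("this sketch handles unsigned constants only")
    parcels = []
    while True:
        payload = value & PAYLOAD_MASK
        value >>= PAYLOAD_BITS
        if value == 0:
            # Nothing left to encode, so flag this parcel as the last one.
            parcels.append(payload | LAST_FLAG)
            return parcels
        parcels.append(payload)


def decode_constant(parcels):
    """Rebuild the constant; return (value, number of parcels consumed)."""
    value = 0
    for count, parcel in enumerate(parcels, start=1):
        value |= (parcel & PAYLOAD_MASK) << (PAYLOAD_BITS * (count - 1))
        if parcel & LAST_FLAG:
            return value, count
    raise ValueError("parcel stream ended without a terminating parcel")
```

With these assumptions a small constant such as 5 takes a single parcel (0x8005), and a full 128-bit value takes nine parcels, matching the 9 x 15 figure above.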

I came up with an extended precision (96-bit) format for the FPGA, making use of 18x18 multipliers.

Attachment:
DXPres.png



_________________
Robert Finch http://www.finitron.ca


Sat Jul 07, 2018 6:01 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2405
Location: Canada
Got back to this project thinking I may reuse it in a new project.

Modified FT68000 and created a new version without the task support. The original FT68000 supported multiple task registers in hardware. This made task switching fairly painless for up to 512 tasks. The processor also automatically switched to exception processing tasks instead of vectoring to exception processing routines. The new version dropped all this support and works more like the original 68000. One key difference is the lack of two different operating modes. The new FT68000 version has only a single mode, single stack pointer. Multiple levels of operating modes are going to be supported by using a multi-core processor. Each core will have a distinct operating level.
For the target FPGA device, at least a quad-core will be developed: one core for supervisor mode, handling external exceptions, and three cores for processing user tasks, which operate in an environment free of external exceptions. On the user cores, task switching will be triggered through OS calls.
Added to the core are two prefix instructions (RSV:, CRSV:) that cause the next load or store operation to reserve the load/store address or clear a reservation on the store address. If the reservation clear prefix is used, the store will not complete unless the address is reserved. The status of the store operation is returned in the carry bit of the status register. Address reservation is a powerful mechanism inherited from RISC designs, and is highly useful for creating semaphores and other objects required for multi-core operation.
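As an illustration of how a semaphore could be built on the reservation prefixes, here is a toy Python model. It is only a behavioural sketch, not the core's logic: it assumes a single reservation shared by all cores, assumes an ordinary store to a reserved address breaks the reservation, and uses a boolean in place of the carry bit because the polarity of the carry status is not stated. The class and function names are hypothetical.

```python
class ReservedMemory:
    """Toy single-reservation memory model; not the actual implementation."""

    def __init__(self):
        self.cells = {}
        self.reservation = None  # (core_id, address) or None

    def load_reserved(self, core_id, address):
        """Model of a load prefixed with RSV: read and reserve the address."""
        self.reservation = (core_id, address)
        return self.cells.get(address, 0)

    def store(self, address, value):
        """Ordinary store; assumed to break a reservation on that address."""
        if self.reservation is not None and self.reservation[1] == address:
            self.reservation = None
        self.cells[address] = value

    def store_conditional(self, core_id, address, value):
        """Model of a store prefixed with CRSV: completes only if reserved.

        Returns True on success; the boolean stands in for the carry bit.
        """
        if self.reservation != (core_id, address):
            return False
        self.cells[address] = value
        self.reservation = None
        return True


def try_acquire(mem, core_id, lock_address):
    """One attempt at taking a semaphore: 0 means free, 1 means held."""
    if mem.load_reserved(core_id, lock_address) != 0:
        return False
    return mem.store_conditional(core_id, lock_address, 1)
```

If another core takes the reservation between the reserved load and the conditional store, the store fails and the caller simply retries, which is the usual load-reserved/store-conditional pattern.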

Core size is about 6748 LUTs or 10,800 LC’s.

_________________
Robert Finch http://www.finitron.ca


Mon Aug 24, 2020 4:23 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2405
Location: Canada
Set up multiple register sets / contexts. A data register file that contains eight registers is no more compact than one that contains 64 registers. So, the core goes with 64-entry data and address register files, of which only eight registers of each may be selected at any one time. Each group of eight registers is referred to as a register context. Which register context is active is determined by the AXC register. The AXC, standing for Active Execution Context, is a three-bit register reflected as the upper nybble of the program counter. Placing the AXC in the program counter allows it to be saved and restored during exception processing. This reduces the possible size of a program from 4GB down to 256MB. However, a code segment register has also been added to the core, which makes losing the upper bits of the program counter relatively insignificant. Code modules are seldom more than a few megabytes in size. The code segment register expands the addressing range to 40 bits.
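The register context arithmetic can be sketched in a few lines of Python. The post gives a three-bit AXC living in the upper nybble of a 32-bit program counter; exactly which three of those four bits it occupies is not stated, so the sketch assumes it sits directly above the 28-bit code offset. All names here are hypothetical.

```python
REGS_PER_CONTEXT = 8
NUM_CONTEXTS = 8        # 64 registers / 8 per context, hence a 3-bit AXC
PC_OFFSET_BITS = 28     # the low 28 bits address 256MB of code
PC_OFFSET_MASK = (1 << PC_OFFSET_BITS) - 1


def physical_register(axc, reg):
    """Index into the 64-entry file for register `reg` of context `axc`."""
    if not 0 <= axc < NUM_CONTEXTS or not 0 <= reg < REGS_PER_CONTEXT:
        raise ValueError("axc and reg must each be in 0..7")
    return axc * REGS_PER_CONTEXT + reg


def pack_pc(axc, offset):
    """Place the AXC in the upper nybble of a 32-bit PC (bit position assumed)."""
    if not 0 <= axc < NUM_CONTEXTS:
        raise ValueError("axc must be in 0..7")
    return (axc << PC_OFFSET_BITS) | (offset & PC_OFFSET_MASK)


def unpack_pc(pc):
    """Split a 32-bit PC back into (axc, code offset)."""
    return (pc >> PC_OFFSET_BITS) & 0x7, pc & PC_OFFSET_MASK
```

For example, `physical_register(2, 3)` selects entry 19 of the file, and `pack_pc(3, 0x1234)` gives 0x30001234, so saving the PC on an exception saves the active context along with it.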

Requiring that the stack be long-word (32-bit) aligned greatly simplifies the state machine for a number of instructions such as RTS or RTE, which then don’t have to worry about unaligned data spanning stack words. I think I will force the stack to always be long-word aligned, but this breaks compatibility with the original 68k.

_________________
Robert Finch http://www.finitron.ca


Tue Aug 25, 2020 4:11 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2405
Location: Canada
Fixed up the cross-assembler to accept a notation used by the VBCC compiler. The assembler croaked on the following line:

l19 reg d2/d3/d4

It’s a label l19 followed by the ‘reg’ keyword followed by a register list. The purpose is to define the label as having the value of the register list. It’s similar to an equate. It’s useful to define the register list as a label when the register list isn’t known at compile time. It allows writing a movem.l as movem.l l19,-(a7), for instance, where l19 is really a register list. There were some hoops to jump through to get the assembler to recognize this syntax.
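For anyone curious what value such a label ends up holding, here is a small Python sketch that turns a register list into a 16-bit MOVEM mask. It uses the standard 68000 mask layout (bit 0 = d0 up to bit 15 = a7, reversed for the predecrement mode); that the FT68000 cross-assembler keeps the same layout is an assumption, and the function names are hypothetical.

```python
def register_number(name):
    """Map d0-d7 to 0-7 and a0-a7 to 8-15 (standard 68000 MOVEM numbering)."""
    name = name.strip().lower()
    if len(name) != 2 or name[0] not in "da" or name[1] not in "01234567":
        raise ValueError(f"bad register name: {name!r}")
    return int(name[1]) + (8 if name[0] == "a" else 0)


def reglist_mask(reglist, predecrement=False):
    """Turn a list such as 'd2/d3/d4' or 'd2-d4/a6' into a 16-bit mask."""
    mask = 0
    for part in reglist.split("/"):
        first, _, last = part.partition("-")
        lo = register_number(first)
        hi = register_number(last) if last else lo
        if hi < lo:
            raise ValueError(f"bad register range: {part!r}")
        for n in range(lo, hi + 1):
            # The -(An) form of MOVEM uses the mask in reversed bit order.
            bit = 15 - n if predecrement else n
            mask |= 1 << bit
    return mask
```

With this layout the list d2/d3/d4 gives 0x001C normally and 0x3800 for a movem.l to -(a7), which is why the assembler has to know how the label is being used before it can emit the mask.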

The FT68000 is pretty much source code compatible with a 68000. I think the VBCC compiler can be used to compile ‘C’ code for this with relatively few changes required. The cross assembler is required however as the object code differs; the byte order is different for constants as an example.

_________________
Robert Finch http://www.finitron.ca


Wed Aug 26, 2020 4:15 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2405
Location: Canada
Renamed this core rf68000 for the 2022 version. Yes. Started working on it a little bit again.

Made a configuration option to use big endian byte order. This makes it more compatible with vasm.
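The difference between the two byte orders is easy to see with a 32-bit constant. The snippet below is only an illustration in Python using the standard `struct` module; `encode_long` is a hypothetical helper, not part of the assembler or the core.

```python
import struct


def encode_long(value, big_endian):
    """Return the four bytes of a 32-bit constant in the chosen byte order."""
    fmt = ">I" if big_endian else "<I"
    return struct.pack(fmt, value & 0xFFFFFFFF)
```

`encode_long(0x12345678, True)` yields the bytes 12 34 56 78, as a stock 68k assembler such as vasm would emit, while `encode_long(0x12345678, False)` yields 78 56 34 12.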

The test system has eight cores set up to run at almost 80 MHz, networked together in a ring topology. The multiply and divide instructions were omitted from the instruction set to improve the core timing. Each core has only about 64kB available to it. 128kB is shared between two cores in a node.

_________________
Robert Finch http://www.finitron.ca


Wed Nov 02, 2022 4:00 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2405
Location: Canada
Got a processor test program off the web, and used it to verify the operation of the processor. Found lots of bugs so far. I thought I had the processor pretty much working, but I guess not. I had left out some of the less frequently used instructions, so I have added them in for compatibility and to get the test program to pass.
Multiply was added back in, as it meets timing at 80 MHz.

_________________
Robert Finch http://www.finitron.ca


Fri Nov 04, 2022 11:42 am WWW

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1841
Always good to see a testsuite, especially when it uncovers bugs! And good to see multiply back in too. (Is that a very few cycles using hardware multipliers?)


Fri Nov 04, 2022 12:45 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2405
Location: Canada
Quote:
(Is that a very few cycles using hardware multipliers?)

Yes, it is using the hardware multiplier. It is fast because it is only 16x16 which can be done in a single cycle. Overall, the core should be faster than a stock 68k, because I think many instructions execute faster than they would for a 68k. The Wishbone bus is a minimum of two clocks versus four for the 68000. However, the block RAM has about four cycles of latency plus a cycle to select between bus masters. So, it is about five or six clocks or more per memory access. There is minimal pipelining in the core. Some flag updates take place during the next i-fetch, but that is about it. Lack of pipelining gives it a higher clock rate, and the core runs at 80 MHz. Eight (or more) cores fit into the FPGA. I am going to have at least one dedicated to running in supervisor mode all the time to service interrupts. One thing left out of the core is separate user and system stack pointers. With a multi-core architecture different cores can be used to support supervisor functions so the dichotomy is not needed in the core.

Another difference from the m68k is that hardware interrupt routines need to be 16-byte aligned, as the lower four bits of the vector are being used to indicate the hardware thread to use to service the interrupt. Not sure about this feature yet.
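A quick Python sketch of how a vector could carry both pieces of information. It assumes the vector is simply the 16-byte aligned handler address with the thread number in the low four bits; the function names are hypothetical.

```python
THREAD_BITS = 4
THREAD_MASK = (1 << THREAD_BITS) - 1


def make_vector(handler_address, thread):
    """Combine a 16-byte aligned handler address with a thread number."""
    if handler_address & THREAD_MASK:
        raise ValueError("handler must be 16-byte aligned")
    if not 0 <= thread <= THREAD_MASK:
        raise ValueError("thread must be in 0..15")
    return handler_address | thread


def split_vector(vector):
    """Return (handler address, hardware thread) from a vector entry."""
    return vector & ~THREAD_MASK, vector & THREAD_MASK
```

So a handler at 0x1000 serviced by thread 3 would be stored as 0x1003, and the core would mask off the low four bits before fetching from 0x1000.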

Some bugs fixed:

PEA was not fully implemented. CHK was not implemented. MOVEM was decoding the wrong bit for the size of the operation. Silly me. MOVEM of a word to a data register was not sign extending.
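For reference, the MOVEM.W behaviour on a 68000 is that each word loaded into a register is sign extended to the full 32 bits. A minimal Python sketch of that extension (the function name is hypothetical):

```python
def sign_extend_word(word):
    """Sign extend a 16-bit value to 32 bits, as MOVEM.W does on register loads."""
    word &= 0xFFFF
    if word & 0x8000:
        return word | 0xFFFF0000
    return word
```

A loaded word of 0x8000 becomes 0xFFFF8000 in the register, while 0x7FFF stays 0x00007FFF.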

I have worked my way through about ¾ of the processor test fixing bugs along the way. One interesting bug was one which modified the executing program in such a way that it continued to the end of the test program. For a minute I thought it was successful until I took a closer look.

_________________
Robert Finch http://www.finitron.ca


Sat Nov 05, 2022 9:31 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2405
Location: Canada
IIRC the test suite is the same one used to verify Easy68k. It is about 6,500 lines long and quite thorough. Great for uncovering bugs.

_________________
Robert Finch http://www.finitron.ca


Sat Nov 05, 2022 9:33 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2405
Location: Canada
Milestone:
Got the 3 second startup delay routine that flashes the LEDs to work in the FPGA. It shows that several instructions work.
Added code for a divider.

Latest fixes:
For some reason the Sxx instructions were setting their result with inverted logic.
Shift opcodes were not decoded properly. Memory shift ops were not setting the count to one. Selecting constant or register counts for register shift operations was reversed.
Forgot to wire up the response part of the network. This resulted in the processor hanging waiting for a read response.

The network on chip has two circular paths, a request path and a response path. This is so that responses are still possible if requests saturate the network.

_________________
Robert Finch http://www.finitron.ca


Sun Nov 06, 2022 8:14 am WWW