Author |
Message |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2215 Location: Canada
|
Another project has been started that’s based around the 68k’s instruction set. It’s hoped to be able to use slightly modified tools for the 68k. The instruction set is almost identical, but the core is little endian rather than big endian. It also supports high-speed task switching and vectoring to tasks rather than jumping to interrupt processing routines. One of the issues with the 68k is its slow task switch / interrupt response time. This core makes it fast, taking only a handful of clock cycles to switch tasks. It’s estimated to run at up to about 85MHz in a -1 Artix7 FPGA. It also typically uses fewer clock cycles than the 68k to execute instructions. The core is a hard-coded state machine and not micro-coded. http://github.com/robfinch/Cores/blob/master/FT68000/trunk/rtl/verilog/FT68000.v
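For readers less familiar with the byte-order point, the difference can be sketched in C. This is only an illustration and not code from the core; the function names `store32_le` and `store32_be` are made up for the example.

```c
#include <stdint.h>

/* Little-endian store: least significant byte at the lowest address,
   which is the order the post describes for FT68000. */
void store32_le(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v);
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Big-endian store: most significant byte at the lowest address,
   which is the order the original 68k uses. */
void store32_be(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)(v);
}
```

Any tool that emits multi-byte constants or reads object code for the 68k would need the equivalent of this swap to target a little-endian variant.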
_________________Robert Finch http://www.finitron.ca
Last edited by robfinch on Sun Jan 15, 2023 3:54 pm, edited 1 time in total.
|
Tue Jan 31, 2017 11:44 am |
|
|
Dr Jefyll
Joined: Tue Jan 15, 2013 5:43 am Posts: 189
|
A little-endian 68K?!! I love it!! So much more sensible.
|
Tue Jan 31, 2017 2:34 pm |
|
|
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1806
|
Wow - a CISC! That's going to be a lot of work, surely?
|
Tue Jan 31, 2017 4:19 pm |
|
|
robfinch
|
2017/01/31 Yes, a lot of work (maybe too much). But if it can be made to work well enough for its intended purpose then that’s okay. It has to be able to move 32 bit values around to memory mapped I/O. FT68000 has a simpler to debug non-overlapped pipeline, but the instruction decode is a lot more complex than that of a RISC machine.
It was shelved in the past for fear of the amount of effort to validate everything. It was first coded about 2008, so it’s a project that’s been resurrected. The hope is that existing tools for the 68k can be used to validate and make use of the core.
A 32 bit processor (32 bit data bus) with a Windows based toolset available was desired. I’d like to put together a music machine with multiple cores (MCMM – multi-core music machine) and sound generators for each core. A couple of 32 bit machines were reviewed which were on the order of 3,000 LUTs. The larger project may end up using something like RISC-V cores or DSD7. But for now using the FT68000 is being explored.
The biggest issue is getting software for the core to work. A 68k ‘C’ compiler that outputs assembler code should work.
|
Tue Jan 31, 2017 7:33 pm |
|
|
BigEd
|
Certainly sounds interesting! Having a validation suite would be a big help.
|
Tue Jan 31, 2017 7:38 pm |
|
|
Dr Jefyll
|
robfinch wrote: The hope is that existing tools for the 68k can be used to validate and make use of the core. [...] A 68k ‘C’ compiler that outputs assembler code should work. So, if you supply your own assembler then you're OK. But besides the compiler are there other tools you could adapt to your purpose? BTW have you considered targeting the Coldfire (68K subset) instruction set instead? Betcha it'd save you a lot of work, and have minimal downside. https://en.wikipedia.org/wiki/Freescale_ColdFire
|
Wed Feb 01, 2017 2:01 am |
|
|
robfinch
|
I probably will end up working with a subset of the core's capabilities, which would probably happen to be something like the ColdFire core. It's NXP's ColdFire now. It's a good idea, but I wrote the core already to support the full instruction set. It's too bad one can't purchase IP cores for single use like one can purchase chips.
2017/02/01 A 68k assembler was modified to work with FT68k. An eight core system was set up with an on-chip network. The following test program has virtually no chance of working. It turns on the onboard LEDs. This is complicated by the fact that the core is working through a network controller. Anyway, I'm going to try it as a first test.
Code:
NOCC          EQU   0xFFD80000
NOCC_PKTLO    EQU   0x00
NOCC_PKTMID   EQU   0x04
NOCC_PKTHI    EQU   0x08
NOCC_TXPULSE  EQU   0x18
NOCC_STAT     EQU   0x1C
    code
    org     0xFFFC0000
cold_start:
    move.l  #0x3FFC,A7          ; setup stack pointer
    move.l  #0x0000001F,d2      ; select write cycle to main system
    move.l  #0xFFDC0600,d1      ; LEDs address
    moveq.l #127,d0             ; LEDs data
    jsr     xmitPacket
cs1:
    bra.s   cs1
;---------------------------------------------------------------------------
;---------------------------------------------------------------------------
xmitPacket:
    move.l  d7,-(a7)
    move.l  a0,-(a7)
    ; first wait until the transmitter isn't busy
    move.l  #NOCC,a0
xmtP1:
    move.l  NOCC_STAT(a0),d7
    and.w   #0x8000,d7          ; bit 15 is xmit status
    bne.s   xmtP1
    ; Now transmit packet
    move.l  d2,NOCC_PKTHI(a0)   ; set high order packet word (control)
    move.l  d1,NOCC_PKTMID(a0)  ; set middle packet word (address)
    move.l  d0,NOCC_PKTLO(a0)   ; and set low order packet word (data)
    clr.l   NOCC_TXPULSE(a0)    ; and send the packet
    move.l  (a7)+,a0
    move.l  (a7)+,d7
    rts
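The sequence that the comments in xmitPacket describe (wait for status bit 15 to clear, write the three packet words, then write the transmit pulse register) can be sketched in C as follows. This is an illustration only, under assumptions: the `nocc_regs` struct and `xmit_packet` name are hypothetical, the field offsets simply mirror the EQU values in the post, and nothing is assumed about the registers at 0x0C to 0x14.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C view of the network controller registers,
   laid out at the offsets given by the EQUs in the post. */
typedef struct {
    volatile uint32_t pktlo;        /* 0x00: low packet word (data)     */
    volatile uint32_t pktmid;       /* 0x04: middle word (address)      */
    volatile uint32_t pkthi;        /* 0x08: high word (control)        */
    volatile uint32_t unknown[3];   /* 0x0C..0x14: not described        */
    volatile uint32_t txpulse;      /* 0x18: write to send the packet   */
    volatile uint32_t stat;         /* 0x1C: bit 15 = transmitter busy  */
} nocc_regs;

#define NOCC_XMIT_BUSY 0x8000u

/* Wait for the transmitter, load the packet, then trigger the send. */
void xmit_packet(nocc_regs *nocc, uint32_t ctrl, uint32_t addr, uint32_t data) {
    while (nocc->stat & NOCC_XMIT_BUSY) {
        /* spin until bit 15 clears */
    }
    nocc->pkthi  = ctrl;
    nocc->pktmid = addr;
    nocc->pktlo  = data;
    nocc->txpulse = 0;              /* mirrors the clr.l in the listing */
}
```

On the real system the pointer would be the controller's base address (0xFFD80000 in the post), and the LED test would correspond to `xmit_packet(nocc, 0x1F, 0xFFDC0600, 127)`.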
|
Thu Feb 02, 2017 7:08 am |
|
|
robfinch
|
2017/02/03 Well it didn’t work at all. So I’ve decided to use the DSD core instead, something that’s at least partially working.
|
Sun Feb 05, 2017 12:47 am |
|
|
robfinch
|
This project is going again as it's being used to test the audio/video controller core. A version with a 16 bit data bus was created to be compatible with the test system. It boots up, displays a couple of LED statuses (that's further than it got before) and hangs when it goes to return from a subroutine. The strange thing is the 68k core hangs in the same spot. Two processors with slightly different object code both hang in the same spot. That makes me think it's a system problem, not a processor problem.
|
Wed Dec 13, 2017 11:23 am |
|
|
robfinch
|
I started taking another look at this core after realizing how well the 68k compares to other architectures; this core being really just a variant of the 68k.
I've been doing some reading on comp.arch and it seems that the 68k has some advantages over other architectures in performance, power consumption and code size. That includes comparisons to RISC processors like ARM, and to the Intel cores. Given the charts that people present, it’s a wonder that there aren’t more 68k systems; the 68k seems to compare fairly well to other processing cores.
Compared to the FT68k core, the FT64 core is an ideal compiler target. FT68k is not as easy a machine to develop a reasonable compiler for. But FT68k is probably hard to beat for assembly language level programs.
So onward I work.
|
Mon Jun 04, 2018 1:40 am |
|
|
robfinch
|
Deliberating whether or not to modify all the bus access code that looks like this:
Code:
IFETCH:
  if (!cyc_o) begin
    < ... start bus cycle ... >
  end
  else if (ack_i) begin
    < ... term. bus cycle ... >
    state <= DECODE;
  end
To be like this:
Code:
IFETCH:
  if (!cyc_o) begin
    if (!ack_i) begin
      < ... start bus cycle ... >
    end
  end
  else if (ack_i) begin
    < ... term. bus cycle ... >
    state <= DECODE;
  end
The benefit is higher reliability because one bus cycle is guaranteed to be over before the next one will start. The drawback is that it's code bloat and takes more hardware. It should not really be necessary to wait for !ack_i because ack_i is supposed to automatically go away at the end of a bus cycle. In the case of the current system that's within one clock cycle. Higher reliability comes from the fact that if ack_i misses timing by a clock cycle, the next bus cycle won't start. Most of the WISHBONE cores I've seen don't bother with this sort of thing and just assume ack_i will be negated after a clock cycle.
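The effect of the extra !ack_i test can be seen in a small cycle-by-cycle model in C. This is a deliberately simplified sketch and not a model of the actual core: the `master` struct and function names are hypothetical, the master restarts a bus cycle immediately instead of passing through DECODE, and ack is assumed to linger long enough to still be high on the clock after the next cycle starts.

```c
#include <stdbool.h>

/* Minimal model of the bus master side of the handshake. */
typedef struct {
    bool cyc_o;      /* bus cycle in progress         */
    int  started;    /* number of bus cycles started  */
    int  completed;  /* number terminated on ack_i    */
} master;

/* One clock of the fetch state. With `guarded` set, a new cycle is
   refused while ack_i is still asserted (the second form in the post). */
void master_clock(master *m, bool ack_i, bool guarded) {
    if (!m->cyc_o) {
        if (!guarded || !ack_i) {
            m->cyc_o = true;       /* start bus cycle */
            m->started++;
        }
    } else if (ack_i) {
        m->cyc_o = false;          /* terminate bus cycle */
        m->completed++;
    }
}

/* Run the model over a sampled ack_i waveform and report how many
   bus cycles the master believes were completed. */
int run_master(const bool *ack, int n, bool guarded) {
    master m = { false, 0, 0 };
    for (int i = 0; i < n; i++) {
        master_clock(&m, ack[i], guarded);
    }
    return m.completed;
}
```

With an ack waveform of low, high, high, high, low, low (one real acknowledge that is released two clocks late), the unguarded form counts two completed cycles because the stale ack terminates the second one, while the guarded form counts one and only restarts once ack has dropped.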
|
Mon Jun 04, 2018 4:27 am |
|
|
Garth
Joined: Tue Dec 11, 2012 8:03 am Posts: 285 Location: California
|
That's similar to I²C where you might want to make sure the clock line is up before pulling it down again. If it's still down, it could be because of bus capacitance not being charged up yet by the pullup resistor, or by a device asking for more time by holding the clock line down. I've never implemented that overhead though. I know Samuel Falvo liked four-phase bus transactions.
_________________http://WilsonMinesCo.com/ lots of 6502 resources
|
Mon Jun 04, 2018 8:38 pm |
|
|
robfinch
|
Opcodes 4E7B to 4E7F were used up to add CSR read / write instructions. Most processors have configuration and status registers and instructions for manipulating them. The 68k only operates on the status register. CSR instructions are needed by FT68000. Currently only the tick and task registers are supported, but the instruction format allows for up to 1024 CSR registers.
While the 68k has dual stack pointers (one for each of user mode and supervisor mode) the FT68k doesn’t. It has only a single stack pointer. The reason being that an FT68k requiring multiple privilege levels will use multiple processor cores to implement them. Each core has a complete set of registers.
I’m pondering the use of hardware mailboxes and messaging to pass arguments to the operating system or device drivers.
|
Fri Jun 08, 2018 10:29 pm |
|
|
MichaelM
Joined: Wed Apr 24, 2013 9:40 pm Posts: 213 Location: Huntsville, AL
|
I like the idea of mailboxes / message passing. I have used those concepts in a number of systems I've built in the past, and the concepts have helped avoid many shared memory access faults that may have been a concern otherwise. I thought a variation on your NOC ideas would be particularly applicable to a multi-processor like what you appear to be working toward.
_________________ Michael A.
|
Sat Jun 09, 2018 4:26 pm |
|
|
robfinch
|
Just looking at the OS message structure, the messages are small enough that a wide circular buffer could be used to store message info. Each message element could be loaded into a latch, then all the latches loaded in a single clock cycle to the circular buffer. The buffer would have an eight-to-one mux on it to be able to load from up to eight different cores. Using a single cycle load means not having to worry about locking access to the mailbox for multiple clock cycles. I suppose a diagram would be helpful.
I want to avoid putting the mailbox on a shared memory or network bus, hence eight input ports to the mailbox. The mailbox needs only a single output port to the core servicing the mailbox. The idea is the mailbox looks like a simple peripheral. It'll be a lot of wires and interconnect to implement eight mailboxes, one for each core.
Code:
typedef struct tagMSG {
  unsigned __int16 link;
  unsigned __int16 retadr;   // return address
  unsigned __int16 tgtadr;   // target address
  unsigned __int16 type;
  unsigned int d1;           // payload data 1
  unsigned int d2;           // payload data 2
  unsigned int d3;           // payload data 3
} MSG;
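As a software stand-in for the idea, the circular buffer could be modelled in C roughly as below. This is a sketch under assumptions and not the hardware design: the depth of 16, the `mailbox` type and the `mbx_post` / `mbx_fetch` names are hypothetical, `unsigned int` is taken to be 32 bits, and the eight-to-one input mux is not modelled. The point it shows is that a whole message enters the buffer in one step, so a reader never sees a half-written message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same fields as the MSG struct in the post, with fixed-width types. */
typedef struct {
    uint16_t link;
    uint16_t retadr;   /* return address */
    uint16_t tgtadr;   /* target address */
    uint16_t type;
    uint32_t d1, d2, d3;   /* payload words */
} msg_t;

#define MBX_DEPTH 16u      /* hypothetical buffer depth */

typedef struct {
    msg_t    slot[MBX_DEPTH];
    unsigned head;     /* next slot to read  */
    unsigned tail;     /* next slot to write */
    unsigned count;    /* messages currently queued */
} mailbox;

/* Post a complete message in a single step; fails if the buffer is full. */
bool mbx_post(mailbox *mb, const msg_t *m) {
    if (mb->count == MBX_DEPTH) {
        return false;
    }
    mb->slot[mb->tail] = *m;
    mb->tail = (mb->tail + 1u) % MBX_DEPTH;
    mb->count++;
    return true;
}

/* Fetch the oldest message; fails if the mailbox is empty. */
bool mbx_fetch(mailbox *mb, msg_t *out) {
    if (mb->count == 0u) {
        return false;
    }
    *out = mb->slot[mb->head];
    mb->head = (mb->head + 1u) % MBX_DEPTH;
    mb->count--;
    return true;
}
```

In hardware the struct copy corresponds to loading all the latches into the buffer in one clock, and the single `mbx_fetch` path corresponds to the one output port to the servicing core.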
|
Sun Jun 10, 2018 2:12 am |
|