


 rf68000 - 68k similar core 
Author Message

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Quote:
I believe the guy who maintains WinUAE (Toni Wilen - mangled spelling probably) has some of the most extensive tests for 680x0 compatibility. The 68000 implementation is supposedly clock cycle perfect even for esoteric happenings.
Thanks, I will check it out. But I was not trying for cycle accuracy so a port of test code may involve a lot of changes (time).

The following mod turned out not to work because the implementation of the system could not place all of the required clock gates.
Added clock gates and a clock enable to the cores. The clock to a core can now be turned on or off using the CLOCK command. The primary core’s clock cannot be turned off. CLOCK 0 will turn off the clocks to all of the other cores. Otherwise, the CLOCK command accepts a bitmask where each bit corresponds to one of the cores. A ‘1’ in the bitmask will turn on a core’s clock, a ‘0’ will turn it off.
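
As a rough illustration of the bitmask semantics described above, here is a small Python model. It is only a sketch: the core count of eight, the choice of core 0 as the primary core, and the mapping of bit n to core n are assumptions, not details confirmed in the post.

```python
NUM_CORES = 8      # assumption: eight cores in the system
PRIMARY_CORE = 0   # assumption: the primary core is core 0, mapped to bit 0


def clock_enables(mask: int) -> list[bool]:
    """Return the per-core clock state after a CLOCK <mask> command."""
    # Bit n of the mask controls core n: 1 = clock on, 0 = clock off.
    enables = [bool((mask >> core) & 1) for core in range(NUM_CORES)]
    # The primary core's clock cannot be turned off, whatever the mask says.
    enables[PRIMARY_CORE] = True
    return enables


print(clock_enables(0b00000000))  # CLOCK 0: only the primary core keeps running
print(clock_enables(0b00100100))  # cores 2 and 5 on, plus the primary core
```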

Started working on Femtiki OS for multi-core operation. Planning on having a separate copy of the OS on each node. The nodes will communicate using messages stored in a common memory. Each core will be responsible for a small number of threads (16 or 32) due to memory limitations; thread info being stored in local block RAM.

Still have not got the DRAM to work.

_________________
Robert Finch http://www.finitron.ca


Tue Nov 29, 2022 4:38 am

Joined: Wed Apr 24, 2013 9:40 pm
Posts: 213
Location: Huntsville, AL
Rob:

Following your 68k adventure here. Always enjoy reading about your processors.

Many years ago I experimented with a clock switching primitive built into the Spartan 3A and Virtex 5 families. It provided glitch-less switching between two unrelated clocks. I used it for inserting wait states into my multiphase microprogrammed 65C02 soft core. I stopped using it because it was not a synthesizable element and I wanted my cores to be fully synthesizable. I am pretty sure that there is at least one of these elements in each clock generation tile of current Xilinx FPGAs. You’ll have to look in the libraries guide for the FPGA family that you’re using for the instantiation template.

I think this built-in primitive is just what you need to stop, or substantially lower the speed of, all of your processor cores, including the primary core.

_______________________

BUFGMUX

BUFGCE

BUFGCE_1

BUFGCTRL

_________________
Michael A.


Wed Nov 30, 2022 2:53 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Quote:
BUFGCE
This is the primitive I was trying to use. But there is a routing issue with the number of them located in the same area of the chip. Anyway, I switched to disabling the clock on the entire node rather than individual processing cores because there are fewer nodes, and then it could route the clock enables. The system really should send an advance signal that the clock is being shut down in case there is state that needs to be saved. I think this will be done with a level 7 interrupt, then 2000 cycles later the clock will shut off. The 2000 cycles should be enough to save state, then a STOP instruction would be executed.

I found out the CPU was executing the bus error routine during dram access.
There were a couple of issues with DRAM access. 1) The bus error circuit was configured to generate an error after 150 clock cycles had passed. This was probably too short an interval. If the frame buffer or another device that uses long burst accesses is occupying the DRAM controller, it might well take 150 clock cycles before the CPU is selected for DRAM access. So, the bus error timeout was increased to 250 clock cycles. 2) The channel id was not being specified for the DRAM request. The controller uses this id to determine where to place results. As an unspecified value it would likely default to zero, or possibly all ones (15 in this case). The CPU is on channel 7, so it would never receive a response. This was also easy to fix by defining the id as seven for the CPU.

I had to reduce the number of cores present in the system from eight down to six. With all the fixes the toolset could no longer place and route the design; it ran out of possibilities for LUT arrangements. The design almost filled the FPGA. Removing one pair of cores made the design a chunk smaller. So now it may be possible to fit other goodies in like the sprite controller and graphics accelerator.

With a couple of DRAM issues fixed, the DRAM works, but unreliably. Values read are not the same as values written. But at least it no longer hangs the machine. It appears that the last access to DRAM is being held indefinitely. DRAM access is controlled by the output of a fifo, and it seems that if the fifo’s output does not change, the DRAM will be continually accessed. Some more logic is required here.
DRAM signals were being set before the control values from the fifo were available. This led to incorrect settings causing bad read and write operations. The fix was to register the control signals from the fifo at a later stage, after they were valid.
Still tweaking the multi-port memory controller. It works a little better now, but it is updating the wrong address with the right data. I think it updates the previous address, so the write enable signal probably needs to be delayed by one more cycle.

_________________
Robert Finch http://www.finitron.ca


Wed Nov 30, 2022 4:16 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Took me a bit to realize that there was an issue with the system cache. Memory cells located in the cache were being updated. I was assuming that there was a main memory issue. Reads were coming from the cache not main memory.
One issue with the cache is that the write port was active for reads as well as writes. Another issue was that all the ways were being updated from way #0 causing valid data to be overwritten.

After a few fixes the cache is working much better. Testing by filling memory with a value reveals, however, that occasionally the wrong bytes get updated. Sometimes the first byte of the previous row of memory, sometimes the first byte of the next row of memory gets updated in addition to the updated row. Since it does not happen consistently, I am looking for places where there could be tight timing.

The cache update takes place a line at a time. Update is handled using read-modify-write cycles, so individual bytes are not updated. Yet that is exactly what is happening. Somehow the odd byte gets updated.
I extended the time on the byte lane select and the cache line address. Both now appear one full cycle before the update. That should give plenty of setup time.

The system cache is 64kB, 4-way associative, with up to eight read ports.
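
For reference, the stated geometry works out to 16kB per way. The Python sketch below shows how an address could be split into tag, set index and byte offset for a cache of this size. The line size is not given in the post, so the 16-byte `LINE_BYTES` value is purely an assumption, and the field layout is the textbook one rather than anything confirmed about the actual controller.

```python
CACHE_BYTES = 64 * 1024   # from the post
WAYS = 4                  # from the post
LINE_BYTES = 16           # assumption: the line size is not stated

WAY_BYTES = CACHE_BYTES // WAYS    # bytes held by one way
SETS = WAY_BYTES // LINE_BYTES     # number of lines in each way


def split_address(addr: int) -> tuple[int, int, int]:
    """Split a byte address into (tag, set index, offset within the line)."""
    offset = addr % LINE_BYTES
    index = (addr // LINE_BYTES) % SETS
    tag = addr // WAY_BYTES
    return tag, index, offset


print(WAY_BYTES, SETS)
print(split_address(0x00004010))   # hypothetical example address
```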

The extra bytes are no longer being updated; whatever I did seems to have fixed the issue. I thought I had the controller working after testing by filling memory with values and dumping it, so I updated it on anycores.org. Then I found out after writing a ram test routine and running it that the controller still does not work. If a fill command is used every fourth line of memory gets updated. The other lines are showing as zero. So xxxxxx30 to xxxxxx3F gets updated correctly while xxxxxx00 to xxxxxx2F does not. It is difficult to discern what is happening as reads come through the cache. So, either the cache does not update properly on writes, or main memory is not getting updated properly.

Further examination reveals that values are not being read from main memory. A value of zero is being read. So, values may not be getting written to main memory correctly.

_________________
Robert Finch http://www.finitron.ca


Thu Dec 01, 2022 4:44 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
I believe I have the dram working now. However, it finds errors during the memory test. It came back with about two screens worth of errors out of 512MB. Not quite perfect yet.

_________________
Robert Finch http://www.finitron.ca


Thu Dec 01, 2022 8:11 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Still tweaking the multi-port memory controller.

_________________
Robert Finch http://www.finitron.ca


Fri Dec 02, 2022 5:51 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Just thinking about testing memory tonight.

The ram test writes ‘aaaa5555’ to all 512 MB of RAM then it reads it all back and checks that the read-back value is ‘aaaa5555’. Next the value ‘5555aaaa’ is written to all 512 MB of ram, then read-back and checked for ‘5555aaaa’.
The first 16kB of the first readback has errors in it. The error addresses all end in a ‘4’. As in $20001004. Not every address that ends in ‘4’ is reported. However, the second readback of the RAM is successful without any errors. 16kB is a suspicious size because it is the size of one way of the cache.
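
The two-pattern test described above can be modelled in a few lines of Python. This is only a sketch of the procedure, not the actual test routine: the `Memory` class, its `stuck` fault injection, and the base address of 0x20000000 are all hypothetical and exist just to make the example self-contained.

```python
class Memory:
    """Word-addressed model of RAM. 'stuck' maps a word index to a value
    that is always read back there (a hypothetical injected fault)."""

    def __init__(self, words, stuck=None):
        self.cells = [0] * words
        self.stuck = stuck or {}

    def write(self, index, value):
        self.cells[index] = value & 0xFFFFFFFF

    def read(self, index):
        return self.stuck.get(index, self.cells[index])


def ram_test(mem, words, base=0x20000000, patterns=(0xAAAA5555, 0x5555AAAA)):
    """Fill memory with each pattern, read it all back, and return a list of
    (address, expected, actual) tuples for every mismatch."""
    errors = []
    for pattern in patterns:
        for i in range(words):          # write pass
            mem.write(i, pattern)
        for i in range(words):          # read-back pass
            actual = mem.read(i)
            if actual != pattern:
                errors.append((base + 4 * i, pattern, actual))
    return errors


good = ram_test(Memory(1024), 1024)
bad = ram_test(Memory(1024, stuck={1: 0}), 1024)
print(len(good), [hex(addr) for addr, _, _ in bad])
```

Reporting the failing address together with the expected and actual values makes patterns like "every error address ends in 4" easier to spot.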

During the first half of the first phase only writes occur, meaning the cache would not be updated. In the second half of the first phase, where read-back is taking place, values from memory would be loaded into the cache. When a read operation takes place, the cache is given a certain number of cycles to respond. If the data is not found in the cache within that number of cycles, the read request is queued in the memory request fifo. A request in the fifo forces the cache to be loaded from memory.

Added some regs in the data readback path.

_________________
Robert Finch http://www.finitron.ca


Fri Dec 02, 2022 8:25 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Latest Mods
Added bit pair manipulation instructions by repurposing the bit number value. Only the low order bits of the value used to specify the bit number are significant for the 68000. That means the high order bits can be used to specify other things. In this case if bit 7 of the value is set, then the instruction is treated as a bit-pair instruction. The bit number value then specifies a bit pair instead of an individual bit. The status of the bit pair can be recorded neatly in the four ccr flags. Zf is set if the value is 00, cf is set if the value is 01, nf is set if the value is 10 and vf is set if the value is 11.
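
The flag encoding described above can be sketched in Python as follows. The use of bit 7 as the selector and the mapping of pair values to flags come from the post; the numbering of pairs (pair n taken as bits 2n+1 and 2n) and the field masks in `decode_bit_number` are assumptions made for the sake of the example.

```python
def decode_bit_number(field: int) -> tuple[bool, int]:
    """Split the bit-number value into (is_bit_pair, number)."""
    is_pair = bool(field & 0x80)   # bit 7 set selects the bit-pair form
    # Assumption: 16 pairs or 32 single bits in a 32-bit operand.
    number = field & 0x0F if is_pair else field & 0x1F
    return is_pair, number


def bit_pair_flags(value: int, pair: int) -> dict[str, bool]:
    """Return the four ccr flags for the given bit pair of 'value'."""
    bits = (value >> (2 * pair)) & 0b11   # assumption: pair n is bits 2n+1..2n
    return {
        "zf": bits == 0b00,
        "cf": bits == 0b01,
        "nf": bits == 0b10,
        "vf": bits == 0b11,
    }


print(decode_bit_number(0x83))
print(bit_pair_flags(0b1001, 1))
```

Because exactly one of the four flags is set, a single test instruction can be followed by a conditional branch on any of the four pair values.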

Tentatively added a BIN2BCD instruction which converts a 32-bit binary number to an eight-digit BCD number. The double-dabble module is used which processes two bits per clock so the instruction takes about 20 clock cycles to complete. I used an instruction from the $Axxx range of instructions which I hope will not conflict with anything else. A BCD2BIN instruction will likely be coming in the future, but it may use a lot of multipliers and adders.
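
For readers unfamiliar with double dabble, here is a minimal software model in Python. It handles one bit per loop iteration, whereas the module described above processes two bits per clock, so it illustrates the algorithm only and says nothing about the hardware structure. Note that a 32-bit value can need up to ten decimal digits; raising an error when the value does not fit in eight digits is a choice made for this sketch, and how the instruction itself behaves in that case is not stated.

```python
def bin2bcd(value: int, bits: int = 32, digits: int = 8) -> int:
    """Convert a binary number to packed BCD with shift-and-add-3."""
    if value < 0 or value >= 10 ** digits:
        raise ValueError("value does not fit in the requested BCD digits")
    bcd = 0
    for i in range(bits - 1, -1, -1):
        # Before each shift, add 3 to every BCD digit that is 5 or more,
        # so that doubling it carries correctly into the next digit.
        for d in range(digits):
            if ((bcd >> (4 * d)) & 0xF) >= 5:
                bcd += 3 << (4 * d)
        # Shift left and bring in the next binary bit, MSB first.
        bcd = (bcd << 1) | ((value >> i) & 1)
    return bcd


print(hex(bin2bcd(12345678)))
print(hex(bin2bcd(255, bits=8, digits=3)))
```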

I made use of the BIN2BCD instruction to display the memory page being processed by the ramtest routine. It reports in MB processed as a decimal number rather than as a hex number. It is the same module used to process the BCD instructions. Thinking about adding a decimal mode flag to the processor like the 6502.

_________________
Robert Finch http://www.finitron.ca


Sat Dec 03, 2022 7:36 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Added the BCD2BIN conversion instruction. It uses a lot of adders to perform the conversion.
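
One way to see where the adders come from is to write the conversion as repeated multiply-by-ten-and-add, with the multiply by ten expressed as two shifts and an add. The Python sketch below does that for an eight-digit packed BCD value. It is an assumption about the general approach, not a description of how the instruction is actually built.

```python
def bcd2bin(bcd: int, digits: int = 8) -> int:
    """Convert a packed BCD value to binary, most significant digit first."""
    result = 0
    for d in range(digits - 1, -1, -1):
        digit = (bcd >> (4 * d)) & 0xF
        if digit > 9:
            raise ValueError("invalid BCD digit")
        # result * 10 computed as (result * 8) + (result * 2), then add the digit.
        result = (result << 3) + (result << 1) + digit
    return result


print(bcd2bin(0x12345678))
```

Unrolled in hardware, each digit position costs two additions, which is consistent with the conversion needing a lot of adders.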

Added an I/O permissions table which supports 64 address spaces. A bit in the table must be set to access the corresponding I/O device.

Added an MMU between the network and the DRAM / IO. It is a simple address mapper. The MMU supports 64 address spaces of 32MB each, with a 64kB page size.
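
With 32MB per address space and 64kB pages, each space has 512 pages, so a flat mapping table would need 64 x 512 entries. The Python sketch below models a simple address mapper with those dimensions. The class name, the table organisation, and the behaviour on an unmapped page are all hypothetical; only the sizes come from the post.

```python
PAGE_BITS = 16                                    # 64 kB pages
SPACE_BITS = 25                                   # 32 MB per address space
NUM_SPACES = 64
PAGES_PER_SPACE = 1 << (SPACE_BITS - PAGE_BITS)   # 512 pages per space


class SimpleMMU:
    """Flat page-mapping table indexed by (address space, virtual page)."""

    def __init__(self):
        # None marks an unmapped page (hypothetical convention).
        self.table = [[None] * PAGES_PER_SPACE for _ in range(NUM_SPACES)]

    def map_page(self, space: int, vpage: int, ppage: int) -> None:
        self.table[space][vpage] = ppage

    def translate(self, space: int, vaddr: int) -> int:
        vaddr &= (1 << SPACE_BITS) - 1            # keep the 25-bit space offset
        vpage = vaddr >> PAGE_BITS
        offset = vaddr & ((1 << PAGE_BITS) - 1)
        ppage = self.table[space][vpage]
        if ppage is None:
            raise LookupError("page not mapped")
        return (ppage << PAGE_BITS) | offset


mmu = SimpleMMU()
mmu.map_page(space=3, vpage=2, ppage=0x2000)
print(hex(mmu.translate(3, 0x21234)))
```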

I have been working on the OS, Femtiki, and tried to get the timer IRQ working, without luck so far.

_________________
Robert Finch http://www.finitron.ca


Sun Dec 04, 2022 9:46 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Working on getting things working through the MMU. The I/O addresses are now virtual addresses. They needed to be within range of a virtual address so they all changed. The I/O address range selection is now controlled by a register in the CPU. The only thing with a physical address selection is the MMU itself, and the address of the MMU is configured by a processor register.

Backed out the changes for the MMU. Actually, added some ‘if’ statements to build conditionally for it. The screen display was not updated properly, every fourth character was corrupt, and the system was pretty much hung, when the MMU was present. One thing that worked was the timer ISR. The IRQ live indicator on the screen was updating as expected. I suspect there are issues with the additional clock cycle required for the MMU address translation.

_________________
Robert Finch http://www.finitron.ca


Tue Dec 06, 2022 6:54 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Milestone: IO permissions bitmap working

Latest Fixes

The SET opcodes with a data register as a target were performing a decrement-and-branch operation instead. The decoder was missing a bit. This led to a crash during startup. The test suite did not test setting a data register. The test suite was updated to check for this.

_________________
Robert Finch http://www.finitron.ca


Thu Dec 08, 2022 5:38 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Latest Additions
Added 128-bit densely packed decimal floating-point instructions FADD, FSUB, FMUL, FDIV, FCMP, FMOVE using the opcodes reserved for packed decimal. FMOVE does integer to float and float to integer conversions. Decimal floating point is available only on the primary core.

Latest Mods
Modified the monitor program to test FP by adding two numbers and displaying the sum. Does not work yet.

Latest Fixes
Missed a decode bit in the FP instructions. This caused them not to work, resulting in an unimplemented instruction error.
Ported a routine over from C to asm that should be able to print a decimal floating-point number in a reasonable fashion. The routine is about 500 LOC, so it blew my budget for the boot ROM. The size of the boot ROM was increased by 2kB.

Plans
I would like to incorporate the decimal floating point into TinyBasic. Each decimal float takes 16B of storage instead of the 4B for integers. So, I may increase the amount of memory available to TinyBasic. Also, the values need to be 16B aligned, meaning the variable name and value cannot be stored together unless 16B are allowed for the variable name. Tempting, but I think I will stick to TinyBasic’s 26 variables.

_________________
Robert Finch http://www.finitron.ca


Fri Dec 09, 2022 4:03 am

Joined: Mon Nov 28, 2022 2:51 pm
Posts: 4
robfinch wrote:
Added 128-bit densely packed decimal floating-point

Is the FPU working with 128 or with 152 bits?


Fri Dec 09, 2022 9:33 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Quote:
Is the FPU working with 128 or with 152 bits?
At least 152 bits. There are a couple of extra digits added for rounding. And depending on the operation, more bits may be used; multiply, for instance, uses double the number of bits.

I switched it to 96-bit decimal floating-point to reduce the hardware requirements. So that is 116+ bits. Only about 20 digits or so are needed for COBOL.
96-bit DFP provides 25 digits.

_________________
Robert Finch http://www.finitron.ca


Sat Dec 10, 2022 12:59 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Latest Fixes
The clock enable to the float multiplier was not wired correctly. This led to a crash when performing a multiply operation.

Float branches were reading a second opcode word before the displacement, causing an incorrect branch displacement to be read and a crash. This occurred because they were lumped in with the rest of the FP instructions, when they should have been treated separately.

At least FADD seems to work. Dumping values using the monitor’s D command reveals that a sum is calculated by the monitor's TFP command, and it looks like it could be right. Displaying the result as a string does not work yet. The conversion hangs trying to extract digits.

_________________
Robert Finch http://www.finitron.ca


Sun Dec 11, 2022 5:45 am