


 Thor Core / FT64 

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Started working out of a new folder, “Thor2022”. Got most things ported for Thor2022. Took an initial stab at getting the compiler and assembler to work with the new instruction formats. Had to modify the compiler to split constants larger than 64 bits into multiple operations, as the assembler cannot handle 128-bit constants yet. It should be possible to at least load a 128-bit constant piecemeal using shift and OR operations.
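
A minimal sketch of the piecemeal load, written in Python rather than Thor2022 assembly. The helper names `split_128` and `load_128` are hypothetical, and the two-halves split is an assumption about how the compiler might break the constant up.

```python
MASK64 = (1 << 64) - 1

def split_128(value):
    """Split a 128-bit constant into (high, low) 64-bit halves."""
    value &= (1 << 128) - 1
    return value >> 64, value & MASK64

def load_128(high, low):
    """Rebuild the constant the way a load / shift / OR sequence would."""
    reg = high        # load the upper 64 bits as an ordinary 64-bit constant
    reg = reg << 64   # shift them into the upper half
    reg = reg | low   # OR in the lower 64 bits
    return reg

const = 0x0123456789ABCDEF_FEDCBA9876543210
high, low = split_128(const)
rebuilt = load_128(high, low)
```

Each half fits in a 64-bit constant, so the assembler never sees anything wider than it can handle.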

_________________
Robert Finch http://www.finitron.ca


Fri Mar 04, 2022 6:33 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Running Thor22 in SIM. Lots of small bugs worked out. Finally got to ‘AA’ LED display.

_________________
Robert Finch http://www.finitron.ca


Sat Mar 05, 2022 7:12 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Trying to get a system built through to running it in FPGA hardware.

Ran into a nasty bug in the BIU. A combinational loop. But I cannot seem to identify what the cause of it is. On the schematic a bus signal is being fed back to itself when the signal is replicated through multiple LUTs. As far as I can tell there is no loop in the code. I have run into this before and it took me about a week to find the loop. If I recall correctly last time it had to do with a bad signal name.

Played with the MMU logic some more. It now supports both page tables and inverted page tables. The low-order bit of the page table base register selects between the two.

_________________
Robert Finch http://www.finitron.ca


Mon Mar 07, 2022 11:13 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Breaking the BIU up into more and more modules to try and localize the combinational loop. It seems to move around all over the place involving different paths between registers.

The combinational loop is gone now and I am not sure why. I must have changed the right line of code. The biggest change I made was to the synthesis settings to turn off retiming.

Forgot to update the micro-code for Thor2022 to correspond to the new instruction formats, which resulted in a crash when the ENTER instruction was executed.

_________________
Robert Finch http://www.finitron.ca


Tue Mar 08, 2022 4:01 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Poured some more work into the compiler. In theory it now supports 128-bit integers. Got the parameters for the 128-bit divide backwards. This led to a zero result which did not match the 64-bit result. I built a 64-bit vs 128-bit integer compare for the whole expression tree to identify where things were amiss. In most cases the 64-bit and 128-bit integers should agree, unless of course the integer is larger than 64 bits.
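
A rough sketch of that kind of width comparison, in Python. It is not the compiler's code: the tuple-based expression tree, the operator set, and the names `wrap`, `evaluate` and `mismatches` are all assumptions made for illustration.

```python
def wrap(value, bits):
    """Wrap value to a signed two's-complement integer of the given width."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return value

def trunc_div(a, b):
    """Integer division truncating toward zero (b is assumed non-zero)."""
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": trunc_div,
}

def evaluate(node, bits):
    """Evaluate a tree of ints and (op, left, right) tuples at one width."""
    if isinstance(node, int):
        return wrap(node, bits)
    op, left, right = node
    return wrap(OPS[op](evaluate(left, bits), evaluate(right, bits)), bits)

def mismatches(node, found=None):
    """Collect, children first, every subtree whose 64-bit and 128-bit values differ."""
    found = [] if found is None else found
    if not isinstance(node, int):
        _, left, right = node
        mismatches(left, found)
        mismatches(right, found)
    if evaluate(node, 64) != evaluate(node, 128):
        found.append(node)
    return found

small = ("/", ("+", 1000, 24), 7)             # fits in 64 bits, widths agree
big = ("+", ("*", 1 << 40, 1 << 40), 1)       # 2**80 overflows 64 bits
```

Because children are visited first, the first entry in the returned list is the deepest subtree where the two widths diverge, which is where to start looking.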

Modified the format of the PTE to allow for a 12-bit ASID. The ASID is tied to the process ID, and on my Windows machine there are more than 256 processes running. Also expanded the virtual address range to 48 bits recognized in the TLB. Basically, the aim is to use up as many bits of TLB memory as possible without incurring additional block RAMs. The PTE is 90 bits in size now. Five of them will fit into a 512-bit cache line.

The page table group, PTG, is 512 bits wide. The inverted page table uses open addressing with quadratic probing for collisions. I have it generating a page fault after 12 PTGs have been searched for translations with no match. 12x5 is 60 colliding translations, probably not very likely.
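
A minimal Python sketch of the lookup described above, assuming the probe offsets are 0, 1, 4, 9, ... from the hashed group. The actual hash, the probe formula, and the PTE field names (`asid`, `vpn`, `ppn`) are not given in the post and are hypothetical here.

```python
PTES_PER_PTG = 5   # five 90-bit PTEs per 512-bit group
MAX_PROBES = 12    # give up and fault after 12 groups

class PageFault(Exception):
    pass

def probe_sequence(start, num_groups, max_probes=MAX_PROBES):
    """Yield PTG indices start, start+1, start+4, start+9, ... modulo the table size."""
    for i in range(max_probes):
        yield (start + i * i) % num_groups

def lookup(table, asid, vpn, hash_fn):
    """Return the physical page number, or raise PageFault after MAX_PROBES groups."""
    start = hash_fn(asid, vpn) % len(table)
    for group_index in probe_sequence(start, len(table)):
        for pte in table[group_index]:
            if pte is not None and pte["asid"] == asid and pte["vpn"] == vpn:
                return pte["ppn"]
    raise PageFault(f"no translation for asid={asid} vpn={vpn:#x}")

# Usage: 64 groups, with one translation that collided twice and landed in group 7.
table = [[None] * PTES_PER_PTG for _ in range(64)]
table[7][2] = {"asid": 1, "vpn": 0x40, "ppn": 0x99}
ppn = lookup(table, 1, 0x40, lambda asid, vpn: 3)
```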

I am wondering about keeping track of empty PTEs in PTGs during the search for a translation. Ideally the first empty PTE should be used to store a translation on a miss. So, there needs to be a history record of the search. This has not been built yet. The record needs to hold the PTG and PTE entry number. The search could also be made to stop when it finds an empty PTG. Care must be taken because deleted entries could clear the PTG while there are still translations left in following PTGs.
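
One way the history record could look, sketched in Python. This is an illustration of the idea only, since the post says the feature has not been built. The function names and the `key` and `ppn` fields are hypothetical, and this version deliberately does not stop at an empty PTG, which avoids the deleted-entry hazard at the cost of a longer search.

```python
def search_with_history(groups, probe_order, key):
    """Search the PTGs in probe order.

    Returns (match, first_free); each is a (ptg, pte) pair or None.
    first_free records the first empty PTE seen before the search ended.
    """
    first_free = None
    for ptg in probe_order:
        for slot, pte in enumerate(groups[ptg]):
            if pte is None:
                if first_free is None:
                    first_free = (ptg, slot)
            elif pte["key"] == key:
                return (ptg, slot), first_free
    return None, first_free

def insert_on_miss(groups, probe_order, key, ppn):
    """On a miss, store the translation in the first empty PTE found."""
    match, first_free = search_with_history(groups, probe_order, key)
    if match is None and first_free is not None:
        ptg, slot = first_free
        groups[ptg][slot] = {"key": key, "ppn": ppn}
        return first_free
    return match

# Group 1 was emptied by deletions, yet a translation still lives in group 2.
groups = [
    [{"key": "a", "ppn": 1}, None],
    [None, None],
    [{"key": "b", "ppn": 2}, None],
]
found = search_with_history(groups, [0, 1, 2], "b")
placed = insert_on_miss(groups, [0, 1, 2], "c", 3)
```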

Also wondering if the table could be packed once a PTG is cleared. If all the entries in the PTG are clear it might be worthwhile to take collision entries and move them from following groups to the empty group. This would reduce search times.

Modified the data cache to use odd/even cache lines to handle unaligned data. The size of the data cache had to be doubled to support this, so it is now a whopping 64kB. Two sets of tags, odd and even, were needed.
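
A small Python sketch of why odd and even banks help with unaligned data. An access that straddles a line boundary always touches one even-numbered and one odd-numbered line, so both can be read at once. The 64-byte line size is an assumption taken from the 512-bit cache line mentioned earlier, and `bank_accesses` is a hypothetical name.

```python
LINE_BYTES = 64  # assumed: 512-bit cache line

def bank_accesses(addr, size):
    """Return which line each bank must supply for an access of `size` bytes."""
    first = addr // LINE_BYTES
    last = (addr + size - 1) // LINE_BYTES
    if last - first > 1:
        raise ValueError("access spans more than two cache lines")
    banks = {"even": None, "odd": None}
    for line in (first, last):
        banks["odd" if line & 1 else "even"] = line
    return banks

aligned = bank_accesses(0, 8)        # stays inside line 0
straddle = bank_accesses(60, 8)      # crosses from line 0 into line 1
straddle2 = bank_accesses(124, 8)    # crosses from line 1 into line 2
```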

_________________
Robert Finch http://www.finitron.ca


Wed Mar 09, 2022 5:47 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
More work on the MMU / BIU. It now supports either software or hardware managed TLB in addition to supporting hash or hierarchical page tables. It is just a matter of flipping the correct bits.

_________________
Robert Finch http://www.finitron.ca


Fri Mar 11, 2022 5:04 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Yet more work, this time on the TLB. The TLB has been updated to support least-recently-used updates in addition to random or fixed replacement. Which of the three algorithms to use is specified by the TLBRW instruction.
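
A sketch of the three replacement choices in Python. How TLBRW encodes the choice is not described in the post, so the policy strings here are placeholders, and tracking LRU with last-use timestamps is an assumption rather than the hardware's method.

```python
import random

def choose_victim(policy, last_used, fixed_way=0, rng=random):
    """Pick the way to replace.

    last_used holds one last-use timestamp per way (larger means more recent).
    """
    if policy == "fixed":
        return fixed_way
    if policy == "random":
        return rng.randrange(len(last_used))
    if policy == "lru":
        return min(range(len(last_used)), key=lambda way: last_used[way])
    raise ValueError(f"unknown policy: {policy}")

last_used = [5, 2, 9, 7]
lru_way = choose_victim("lru", last_used)        # way 1 has the oldest timestamp
fixed_way = choose_victim("fixed", last_used, fixed_way=3)
random_way = choose_victim("random", last_used)
```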

Spent more time designing features of the MMU. Created an entity called the access rights table, ART, to hold access rights for a page. The first time a translation is looked up the access rights are loaded from the ART. After that, future translations do not access the ART.
The design is still in flux. I am considering making the access counter part of the TLB entry.

Here are the layouts of the PTE and ARTE.
Attachment: PTE.png (page table entry)

Attachment: ARTE.png (access rights table entry format)


And the TLBE
Attachment: TLBE.png (TLB entry)

_________________
Robert Finch http://www.finitron.ca


Sat Mar 12, 2022 4:22 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Added auto-aging to the TLB entries. Periodically the access counter is shifted right by a bit.
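
The aging step amounts to halving every counter, which can be sketched in a few lines of Python. The 8-bit saturating counter width and the names `touch` and `age` are assumptions for illustration.

```python
COUNTER_BITS = 8  # assumed width of the access counter

def touch(counter):
    """Count an access, saturating at the maximum value."""
    return min(counter + 1, (1 << COUNTER_BITS) - 1)

def age(counters):
    """Periodic aging: shift every access counter right by one bit."""
    return [c >> 1 for c in counters]

counters = [8, 3, 1, 0]
aged = age(counters)
```

Entries that stop being used decay toward zero, so old activity counts for less than recent activity.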

_________________
Robert Finch http://www.finitron.ca


Sun Mar 13, 2022 5:00 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Decided to shelve the hierarchical page tables for now. Not satisfied with the need for 128-bit PTEs. The hashed page table does not care about power-of-two address alignment and uses less memory. It is probably faster as well.

Increased the size of a PTG so that eight PTEs would fit. This means the PTG is not an evenly sized page of memory, but it turns out not to matter for the hash table. It is still an even multiple of 128 bits though, as this is the size of a memory access.

Made the reading of the PTG quit as soon as a matching PTE is found. The search takes place at the same time the PTG is being loaded. This can be done because the valid bit in the PTE is zero until it is loaded from memory. Also modified the PTG update to write only the last half of the PTE where the accessed bit is stored. So, there is only one memory access required.
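
A Python sketch of searching while loading. The buffer starts with every valid bit clear, so slots that have not arrived yet can never match, and the load stops at the first hit. Loading one whole PTE per step and the `valid` and `key` field names are simplifying assumptions.

```python
def load_and_search(memory_ptg, key):
    """Load a PTG one PTE at a time, searching the buffer after each arrival.

    Returns (slot, reads): the matching slot or None, and the number of
    PTEs actually fetched from memory.
    """
    buffer = [{"valid": False, "key": None} for _ in memory_ptg]
    reads = 0
    for slot, pte in enumerate(memory_ptg):
        buffer[slot] = dict(pte)  # one PTE arrives from memory
        reads += 1
        # Slots not yet loaded still have valid == False, so they cannot match.
        for index, entry in enumerate(buffer):
            if entry["valid"] and entry["key"] == key:
                return index, reads
    return None, reads

ptg = [{"valid": True, "key": k} for k in (10, 11, 12, 13, 14, 15, 16, 17)]
hit = load_and_search(ptg, 12)
miss = load_and_search(ptg, 99)
```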

_________________
Robert Finch http://www.finitron.ca


Mon Mar 14, 2022 5:41 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Lost several hours of work. Somehow the file I was working on got reverted to an older version. I’m guessing I messed up, copying the file in the wrong direction when updating version control.

Made the region table updateable. Added an instruction to update the region table. There needs to be a separate access rights table, ART, for each region. The address of this table is stored in the region table. I am now calling the ART table the PMT standing for page management table.

Added the ability to bypass levels in the hierarchical page table.

_________________
Robert Finch http://www.finitron.ca


Tue Mar 15, 2022 3:29 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
I have updated the MMU to absorb 10 bits at a time of the address. PTEs will have to be in clusters of four pages. This is not much different than just increasing the size of a page, except that the smaller pages get to be kept. This is only for hierarchical tables.
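
A sketch of what consuming 10 bits per level means for a table walk, in Python. The page offset width and the number of levels are parameters here because the post does not state them; `split_indices` is a hypothetical helper.

```python
def split_indices(vaddr, offset_bits, levels, index_bits=10):
    """Split a virtual address into per-level table indices plus a page offset.

    Returns (indices, offset) with indices ordered from the top level down.
    """
    offset = vaddr & ((1 << offset_bits) - 1)
    vpn = vaddr >> offset_bits
    indices = []
    for _ in range(levels):
        indices.append(vpn & ((1 << index_bits) - 1))
        vpn >>= index_bits
    return list(reversed(indices)), offset

# Example with an assumed 12-bit offset and two levels.
vaddr = (3 << 22) | (5 << 12) | 0x123
indices, offset = split_indices(vaddr, offset_bits=12, levels=2)
```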

Added garbage collection cards to the TLB entries. The card bits are set in a similar fashion to the dirty or modified bit. The TLB entries can then be scanned by the garbage collector to see where pointer stores occurred.
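
A Python sketch of card marking at the TLB-entry level. The page size, the card size, and how the hardware recognizes a pointer store are not given in the post, so `PAGE_BITS`, `CARD_BITS` and the `is_pointer` flag are assumptions.

```python
PAGE_BITS = 12  # assumed page size of 4 kB
CARD_BITS = 8   # assumed card size of 256 bytes, giving 16 cards per page

class TlbEntry:
    def __init__(self, vpn):
        self.vpn = vpn
        self.modified = False
        self.cards = 0  # one bit per card in the page

def store(entry, vaddr, is_pointer):
    """Record a store: always set modified, and mark the card for pointer stores."""
    entry.modified = True
    if is_pointer:
        card = (vaddr & ((1 << PAGE_BITS) - 1)) >> CARD_BITS
        entry.cards |= 1 << card

def dirty_cards(entries):
    """Yield (vpn, card) for every card the collector needs to scan."""
    for entry in entries:
        for card in range(1 << (PAGE_BITS - CARD_BITS)):
            if (entry.cards >> card) & 1:
                yield entry.vpn, card

entry = TlbEntry(vpn=7)
store(entry, 0x7000 + 0x234, is_pointer=True)    # lands in card 2
store(entry, 0x7000 + 0x900, is_pointer=False)   # plain data, no card set
to_scan = list(dirty_cards([entry]))
```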

_________________
Robert Finch http://www.finitron.ca


Wed Mar 16, 2022 3:09 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Now using a 64kB page size after doing some research and finding a 4kB page probably too small. Also came up with a way to use 1kB sections out of the 64kB. So, the 64kB page can be divided up.
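
The arithmetic works out to 64 sections per page. A short Python sketch of the address breakdown follows; the `decompose` helper is hypothetical, and how the sections are actually used is not described in the post.

```python
PAGE_BITS = 16     # 64 kB page
SECTION_BITS = 10  # 1 kB section
SECTIONS_PER_PAGE = 1 << (PAGE_BITS - SECTION_BITS)

def decompose(addr):
    """Split an address into (page number, section within page, offset within section)."""
    page = addr >> PAGE_BITS
    section = (addr >> SECTION_BITS) & (SECTIONS_PER_PAGE - 1)
    offset = addr & ((1 << SECTION_BITS) - 1)
    return page, section, offset

page, section, offset = decompose(0x12345678)
```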

Added MMU caching of PDE lookups. The cache is really small, eight entries, but fully associative.
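
A Python model of a small fully associative cache like this one. The replacement policy is not stated in the post, so round-robin is an assumption, as are the class name `PdeCache` and the tag format.

```python
class PdeCache:
    """Eight-entry fully associative cache; any tag may occupy any slot."""

    def __init__(self, entries=8):
        self.slots = [None] * entries  # each slot is None or a (tag, pde) pair
        self.next_victim = 0           # assumed round-robin replacement

    def lookup(self, tag):
        # Hardware would compare every slot in parallel; a loop models that here.
        for slot in self.slots:
            if slot is not None and slot[0] == tag:
                return slot[1]
        return None

    def fill(self, tag, pde):
        for index, slot in enumerate(self.slots):
            if slot is not None and slot[0] == tag:
                self.slots[index] = (tag, pde)  # refresh an existing entry
                return
        self.slots[self.next_victim] = (tag, pde)
        self.next_victim = (self.next_victim + 1) % len(self.slots)

cache = PdeCache()
for tag in range(9):            # the ninth fill evicts the first entry
    cache.fill(tag, tag * 100)
```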

_________________
Robert Finch http://www.finitron.ca


Thu Mar 17, 2022 3:56 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Mostly more work on the BIU. Added MMU caching of PTGs in addition to PTEs. Realized I have got the hierarchical page table placed in the physical address space, and I am thinking maybe it should be in the virtual space, but that could lead to nested TLB misses. For the hash table I am assuming the entire table will be present in physical memory all the time. It is also located at a physical address.
Got to the stage of LED output in simulation once again.

_________________
Robert Finch http://www.finitron.ca


Fri Mar 18, 2022 4:28 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Latest Fixes: The assembler was not encoding register fields for any R1 type instruction. This was found from the PTGHASH instruction which evaluated to zero all the time. Other R1 type instructions had not been used yet.

The MOV instruction was not encoded correctly; it was using the Thor2021 opcode. This caused the value zero to be moved into registers.

Several instructions got missed in the instruction length decode. This led to various crashes in sim.

Latest Mods: modified the PUSH and POP instructions to push or pop up to four registers. Previously it could handle only three, but with extra bits available in the opcode due to smaller register spec fields a fourth register could be added. Was also able to free up two now redundant opcodes.

The hash page table is now implemented entirely in block RAM. This takes up about 1/3 of the available block RAMs. It uses a 256-bit wide port on the CPU side for loading and storing and a 2048-bit wide port on the memory side to allow an entire page group to be read in a single cycle. The hash table now has the same performance as a cache. It is just as fast at address lookup as the TLB, so the TLB has been removed.

Got to the LED lighting up again in sim, this time using hash page table lookups for virtual addressing.

_________________
Robert Finch http://www.finitron.ca


Sun Mar 20, 2022 3:15 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Decided to get rid of one bit of the register type bits. The register type field was a two-bit field that specified if the register spec was for a register, vector register or a constant. The constant capability is redundant most of the time as there are other instruction formats where an immediate value can be specified. The type field has been reduced to specifying either a vector or a scalar register.

13-bit constants were not encoded properly by the assembler; two bits were trimmed off and the constant was encoded as 11 bits. This led to loops not working correctly. The size of the constant field for Thor2022 increased by two bits and the assembler was only partially updated to account for this.
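
A Python illustration of this kind of truncation bug. It assumes the two lost bits were the upper two and that the field is sign-extended on decode; neither detail is stated in the post, and the helper names are hypothetical.

```python
def encode_field(value, bits):
    """Keep only the low `bits` bits of value, as an instruction field would."""
    return value & ((1 << bits) - 1)

def decode_signed(field, bits):
    """Sign-extend a `bits`-wide field back to an integer."""
    sign = 1 << (bits - 1)
    return (field ^ sign) - sign

value = 3000  # fits in a signed 13-bit field (-4096..4095)
good = decode_signed(encode_field(value, 13), 13)  # assembler and decoder agree
bad = decode_signed(encode_field(value, 11), 13)   # assembler trimmed to 11 bits
```

Small constants survive the trimmed encoding unchanged, which may be why the bug only showed up once a larger loop constant was used.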

Up to the clear-screen point again. Flashy LEDs worked. And the three second delay worked all running in a virtual address space using a hash table. Text output is close to working.

_________________
Robert Finch http://www.finitron.ca


Mon Mar 21, 2022 3:03 am