AnyCPU - View topic - Thor Core / FT64


robfinch Joined: Sat Feb 02, 2013 9:40 am Posts: 2095 Location: Canada	Re: Thor Core / FT64 Did a lot of work on the Thor2024 specifications document. Changed from 16-bit instruction parcels to 32-bit parcels. Also changed the alignment of instructions to byte alignment from 16-bit alignment. Stuck on Thor2023 in simulation. A data bus is not being loaded properly. The load is delayed by several cycles with ‘X’s prior to the load. There are no ‘X’s as inputs AFAIK. I do not know what is causing the delay. But it causes the wrong data to be output during a store operation. _________________ Robert Finch http://www.finitron.ca
Tue May 16, 2023 6:28 am

robfinch Joined: Sat Feb 02, 2013 9:40 am Posts: 2095 Location: Canada	Re: Thor Core / FT64 Worked on the design of Thor2024 some more. Up to 300 pages of specs now, with more yet to include. Just added the float exception trigger, enable, disable, and clear instructions. These fit in with the generic IRQ-generating instruction. Branch instructions may be either 32 or 64 bits in size. 32-bit branches only support comparing two registers and branching to a 12-bit target displacement. 64-bit branches add the option of storing a return address in a link register, branching to an address in a target register, and 40-bit branch target displacements. The larger displacement may be handy for randomizing the address of code in a large virtual address space. Also supported is a three-way branch, BGL, for less than, greater than, or equal. The three-way branch has two 20-bit displacement fields for the less-than and greater-than targets. If the operands are equal, execution continues with the next instruction. While 32-bit instruction parcels are in use, code may be byte aligned. The jump and branch instructions support a byte-aligned target. _________________ Robert Finch http://www.finitron.ca
Wed May 17, 2023 5:31 am
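For readers following along, the three-way branch described above can be modelled roughly as below. This is only a sketch: the function names, the 8-byte instruction size, and the choice to make displacements relative to the branch's own address are assumptions for illustration, not taken from the Thor2024 spec.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    mask = (1 << bits) - 1
    value &= mask
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def bgl_next_pc(pc, a, b, disp_lt, disp_gt, insn_bytes=8):
    """Next PC for a hypothetical three-way branch.

    disp_lt / disp_gt are raw 20-bit displacement fields.
    Assumption: targets are relative to the branch's own address,
    and the branch occupies insn_bytes bytes.
    """
    if a < b:
        return pc + sign_extend(disp_lt, 20)
    if a > b:
        return pc + sign_extend(disp_gt, 20)
    # Operands equal: fall through to the next instruction.
    return pc + insn_bytes
```

One compare feeds both targets, so a sort or search loop that needs all three outcomes can use a single instruction where it would otherwise need two branches.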

robfinch Joined: Sat Feb 02, 2013 9:40 am Posts: 2095 Location: Canada	Re: Thor Core / FT64 Shelving Thor again as it requires a larger FPGA to do it justice. I may be able to obtain a larger FPGA. A "free" toolset for larger FPGAs was pointed out to me; it is on GitHub. https://github.com/openXC7 _________________ Robert Finch http://www.finitron.ca
Tue May 23, 2023 4:26 am

robfinch Joined: Sat Feb 02, 2013 9:40 am Posts: 2095 Location: Canada	Re: Thor Core / FT64 Decided I was crazy to shelve Thor and start yet another project once I started looking at developing software for rfx32. Instead, a trimmed-down version of Thor is being implemented, based on copying most of the rfx32 code, but using the latest Thor ISA. Timing for Thor is somewhat slower since 64 bits are being used instead of 32. Tools indicate the max clock rate is about 45.5 MHz, so the system is being built to run from a 40 MHz clock. According to the tools the path through the divider is the longest one, which strikes me as a bit strange since the divider is a simple sequentially clocked radix-2 divider. I would have expected the 64-bit multiplier to be the slowest path. I suppose I could try breaking up the divide into finer stages, but it would then take more clock cycles. I also suspect it will be challenging to get a 64-bit machine working beyond 50 MHz in the FPGA. _________________ Robert Finch http://www.finitron.ca
Thu May 25, 2023 3:44 am

robfinch Joined: Sat Feb 02, 2013 9:40 am Posts: 2095 Location: Canada	Re: Thor Core / FT64 Managed to edge past 50 MHz timing by pipelining the multiplier so it now takes four clock cycles, and adding states to the divider, which now takes about 140 clocks. The trick will be to maintain the timing as the core is improved. _________________ Robert Finch http://www.finitron.ca
Thu May 25, 2023 7:56 am
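As a reference for what a sequential radix-2 divider does per step, here is a minimal software model of an unsigned restoring divider. It is a sketch only: `radix2_divide` and its `width` parameter are made-up names, and it does not model the extra states that bring the hardware divider to about 140 clocks.

```python
def radix2_divide(dividend, divisor, width=64):
    """Unsigned restoring division, one quotient bit per iteration.

    Returns (quotient, remainder, steps). Each loop iteration
    corresponds to one shift/compare/subtract step in hardware.
    """
    if divisor == 0:
        raise ZeroDivisionError("divide by zero")
    mask = (1 << width) - 1
    remainder = 0
    quotient = dividend & mask
    steps = 0
    for _ in range(width):
        # Shift the remainder:quotient pair left by one bit.
        remainder = (remainder << 1) | (quotient >> (width - 1))
        quotient = (quotient << 1) & mask
        # Trial subtract; keep the result only if it does not go negative.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
        steps += 1
    return quotient, remainder, steps
```

The compare and subtract in each step are full-width operations, which may be part of why a 64-bit divider shows up as a long path even though it produces only one bit per clock.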

oldben Joined: Mon Oct 07, 2019 2:41 am Posts: 593	Re: Thor Core / FT64 Does the multiplier match memory timing well for indexing like foo[a,b]?
Fri May 26, 2023 3:55 am

robfinch Joined: Sat Feb 02, 2013 9:40 am Posts: 2095 Location: Canada	Re: Thor Core / FT64 Quote: Does the multiplier match memory timing well for indexing like foo[a,b]? It would be better if the multiply were faster, as the memory reference cannot complete until after the multiply is done. If possible, the compiler will convert multiplies into shifts, which are single cycle. Multiply should take about the same length of time as a cache access. _________________ Robert Finch http://www.finitron.ca
Fri May 26, 2023 4:16 am
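To illustrate the multiply-to-shift conversion mentioned above for an access like foo[a,b], here is a small sketch. The function names and the row-major layout are assumptions for the example; the actual compiler may well do this differently.

```python
def index_with_multiply(base, a, b, ncols, elem_size):
    """Row-major address of foo[a,b] using multiplies."""
    return base + (a * ncols + b) * elem_size


def index_with_shifts(base, a, b, log2_ncols, log2_elem):
    """Same address when ncols and elem_size are powers of two."""
    return base + (((a << log2_ncols) + b) << log2_elem)


def times_ten(a):
    """A non-power-of-two constant split into shifts and an add."""
    return (a << 3) + (a << 1)
```

With power-of-two dimensions the whole index calculation becomes shifts and adds, so the memory reference no longer waits on the four-cycle multiplier.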

robfinch Joined: Sat Feb 02, 2013 9:40 am Posts: 2095 Location: Canada	Re: Thor Core / FT64 Moved the inline ALU code out to its own module. The code was duplicated twice in the top module, now it is two instances of the same module. This should make it easier to manage in the future. Made some of the code more generic in nature, accepting a parameter for the number of queue entries. Added predicated execution of instructions where applicable. Decided to not support predicated instruction execution for flow control operations. Predication has use for vector operations and for very short sequences of instructions; otherwise, it is better to branch. It is possible to branch around flow control instructions based on the value of a predicate register if needed. Predicating branches would cost too many branch displacement bits. _________________ Robert Finch http://www.finitron.ca
Fri May 26, 2023 4:19 am

robfinch Joined: Sat Feb 02, 2013 9:40 am Posts: 2095 Location: Canada	Re: Thor Core / FT64 Added the ATOM modifier, which has somewhat dubious operation due to the need to apply the mask immediately at the fetch stage. ATOM is automatic interrupt control over a range of instructions. The ATOM modifier sets the minimum interrupt level for the next eight instructions. The interrupt level can be set separately for each instruction. The master interrupt mask still applies. For instance, setting the interrupt level to ‘7’ for instructions will ensure that only non-maskable interrupts are recognized. However, due to the current implementation, the first instruction after the ATOM always has interrupts masked to level 7. A bitmask from the ATOM instruction is stored in a buffer which shifts as instructions are queued. Postfix instructions do not count as instructions. Since it would be possible to disable interrupts for an extended period of time if a long sequence of postfix instructions were coded, an exception will occur if more than four postfix instructions in a row are used. Thinking about getting rid of the PRED modifier. It sounds simple on paper, but implementing it is challenging, and most instructions already have a predicate register spec field. Unlike the ATOM modifier, it is critical that it be applied to the correct instructions. For instance, if the ATOM modifier is off by an instruction then interrupts may be disabled for an extra clock cycle. This is probably non-critical. If the PRED modifier is off by an instruction then an instruction may be executed or elided that should not be. It may be necessary to surround the instructions covered by a PRED with NOP ramps. The PRED modifier is detected at instruction queue time after decode, but it affects which predicate register is read for the instruction in the instruction fetch stage. Used register code #63 to specify that a postfix immediate is used for the operand instead of a register value.
So, there are now only 63 general purpose registers. Even with 63 registers the register file is looking cramped. Some of the argument registers are shared with predicate registers. Squeezed a rounding mode field into FP instructions. More bits than needed were being used to specify the FP function, so some were traded off to allow a rounding mode field. _________________ Robert Finch http://www.finitron.ca
Sat May 27, 2023 2:42 am
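A rough software model of the ATOM shift buffer described above, for anyone trying to picture it. Several details here are guesses made for the sketch and are not from the post: three bits of level per instruction (24 bits total), the first instruction's level in the low bits, a postfix seeing the level at the head of the buffer without consuming it, and the names `AtomBuffer` and `pack_levels`.

```python
def pack_levels(levels):
    """Pack up to eight 3-bit interrupt levels, first instruction in the low bits."""
    bits = 0
    for i, level in enumerate(levels[:8]):
        bits |= (level & 7) << (3 * i)
    return bits


class AtomBuffer:
    MAX_POSTFIX_RUN = 4  # more than four postfixes in a row raises an exception

    def __init__(self):
        self.bits = 0
        self.postfix_run = 0

    def load(self, mask):
        """Latch the mask carried by an ATOM modifier."""
        self.bits = mask & 0xFFFFFF

    def queue(self, is_postfix=False):
        """Return the minimum interrupt level for the instruction being queued."""
        level = self.bits & 7
        if is_postfix:
            # Postfixes do not count as instructions, so the buffer does not shift.
            self.postfix_run += 1
            if self.postfix_run > self.MAX_POSTFIX_RUN:
                raise RuntimeError("too many postfix instructions in a row")
        else:
            self.postfix_run = 0
            self.bits >>= 3
        return level
```

Once eight real instructions have been queued the buffer is all zeros, so the ATOM level stops raising the threshold and only the master interrupt mask applies.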

robfinch Joined: Sat Feb 02, 2013 9:40 am Posts: 2095 Location: Canada	Re: Thor Core / FT64 Spent part of today coding hardware table walkers for both hierarchical and hash page tables. The hash page table is really fast as it is made from block RAM rather than going out to main memory. Set the page size at 64kB so the block RAM usage can be limited to about 1/6 the FPGA memory. The hash table uses wide memory to allow searching eight entries in parallel. It is also simple enough that it is clocked at double the CPU clock rate. The hierarchical table walker is a little more complex. It acts as a bus master, triggered by a TLB miss. _________________ Robert Finch http://www.finitron.ca
Sun May 28, 2023 9:51 am
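The hash page table lookup described above might be modelled like this. It is a sketch under assumptions: the hash function, the bucket count, and the entry layout are invented for the example; only the 64 kB page size and the eight-entry search come from the post. In hardware the eight entries are compared in parallel from one wide block RAM read, where the loop below checks them one at a time.

```python
PAGE_SHIFT = 16   # 64 kB pages
WAYS = 8          # entries searched together per bucket


class HashPageTable:
    def __init__(self, num_buckets=1024):
        self.num_buckets = num_buckets
        # Each bucket models one wide memory word holding eight entries.
        self.buckets = [[None] * WAYS for _ in range(num_buckets)]

    def _index(self, vpn):
        # Hypothetical hash: fold the upper VPN bits into the lower ones.
        return (vpn ^ (vpn >> 10)) % self.num_buckets

    def insert(self, vpn, ppn):
        """Store a mapping; returns False if the bucket is full."""
        bucket = self.buckets[self._index(vpn)]
        for i, entry in enumerate(bucket):
            if entry is None or entry[0] == vpn:
                bucket[i] = (vpn, ppn)
                return True
        return False

    def translate(self, vaddr):
        """Return the physical address, or None on a miss."""
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        for entry in self.buckets[self._index(vpn)]:
            if entry is not None and entry[0] == vpn:
                return (entry[1] << PAGE_SHIFT) | offset
        return None
```

A full bucket (insert returning False) would need some overflow policy, such as probing the next bucket; the post does not say how that case is handled.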

robfinch Joined: Sat Feb 02, 2013 9:40 am Posts: 2095 Location: Canada	Re: Thor Core / FT64 Milestone: Executed the first instruction for Thor2024 in simulation today. Just a NOP. Milestone: got LED output in simulation. First pass at the assembler written. Using Fibonacci again to test. Had to put code in to back up the PC by two instructions if there was a cache miss. Since the core is pipelined, it increments the PC by two instructions before it knows there was a cache miss. I should maybe try registering the miss address rather than using a subtractor. I wonder which has better timing? _________________ Robert Finch http://www.finitron.ca
Mon May 29, 2023 3:55 am

robfinch Joined: Sat Feb 02, 2013 9:40 am Posts: 2095 Location: Canada	Re: Thor Core / FT64 Dealing with a complicated pipelining issue today. At the fetch stage instructions are copied to a fetch buffer. Copying the instructions to the buffer occurs a clock cycle after the cache is accessed. Accessing the cache takes a clock cycle. The program counter is one ahead of the fetch copy. So, when there is a miss there may still be valid data in the pipeline that needs to be copied to the fetch buffer. For a miss, the PC has already incremented to the next address, so the PC needs to be backed up to the miss address. It starts to get complicated when the fetch buffer cannot be loaded yet because the previous instructions have not queued. The data has to be held in the pipeline until the fetch buffer is ready to be loaded. Add to that a branch miss occurring at the same time and it seems to turn into a real mess. I have not hit the right combination of logic yet. Either the same instructions are queued multiple times, or instructions are skipped over and not queued. I think I have this solved now, except that instructions in the branch shadow are being executed when they should not be. The core now runs in sim until it hits the first branch. _________________ Robert Finch http://www.finitron.ca
Tue May 30, 2023 7:36 am

robfinch Joined: Sat Feb 02, 2013 9:40 am Posts: 2095 Location: Canada	Re: Thor Core / FT64 Got the branch shadow execution of instructions fixed. Relatively easy, by tracking the last instruction or two that was queued at branch time and stomping on them if there is a branch. Issues executing the first loop to store to the screen. Sometimes the same store is happening twice and other times a store operation is dropped. Not sure if this is just how instructions are executing in general or if it’s the store operation.
Code:
02:000000000000001E 9303000000          23:   mov t3,r0
02:0000000000000023 0403000002          24:   ldi t2,16384
                                        25: .st1:
02:0000000000000028 57023800007C0000    26:   sto t0,txtscreen[r0+t3]
02:0000000000000030 00FD
02:0000000000000032 84E3400000          27:   add t3,t3,8
02:0000000000000037 28E830F8FF          28:   blt t3,t2,.st1
                                        29:
02:000000000000003C 9303000000          30:   mov t3,r0
02:0000000000000041 0403400100          31:   ldi t2,40
Instructions at 3Ch and 41h fetched after the branch due to pipelining are being correctly stomped on and not executed. The branch does loop backwards and the register in the register file can be seen incrementing by eight due to the add instruction. _________________ Robert Finch http://www.finitron.ca
Wed May 31, 2023 12:49 pm

robfinch Joined: Sat Feb 02, 2013 9:40 am Posts: 2095 Location: Canada	Re: Thor Core / FT64 Playing with CORDIC, used to calculate sine and cosine. Cannot get it quite to work. It looks like the inverse gain calculation may be wrong. If an angle of 0 degrees is input, the sine output is zero, which is correct, but the cos output is nuts. For various angles the ratio of sine to cos looks close to correct, but the values are nuts. _________________ Robert Finch http://www.finitron.ca
Fri Jun 02, 2023 9:06 am
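For comparison, a floating-point reference model of rotation-mode CORDIC is sketched below, showing where the inverse gain enters. This is not the module from the post; `cordic_sin_cos` and its `iterations` parameter are names chosen for the example. A correct sin/cos ratio with wrong magnitudes is what one would expect if only the 1/K starting value were off, since the gain scales x and y equally.

```python
import math


def cordic_sin_cos(angle, iterations=32):
    """Rotation-mode CORDIC; angle in radians, roughly within +/- pi/2.

    Returns (sin, cos).
    """
    atans = [math.atan(2.0 ** -i) for i in range(iterations)]
    # The micro-rotations stretch the vector by K = prod(sqrt(1 + 2^-2i)).
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    # Starting at x = 1/K cancels the gain so the result has unit length.
    x, y, z = 1.0 / gain, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * atans[i]
    return y, x
```

Note that an input of 0 still goes through every micro-rotation, swinging back and forth around zero, so the cosine output at 0 degrees depends entirely on the 1/K value being right. That makes 0 degrees a handy first test for the gain constant.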

robfinch Joined: Sat Feb 02, 2013 9:40 am Posts: 2095 Location: Canada	Re: Thor Core / FT64 Got it working; now to make an FP sin / cos module. _________________ Robert Finch http://www.finitron.ca
Fri Jun 02, 2023 11:52 am