 Thor Core / FT64 

Joined: Mon Oct 07, 2019 2:41 am
Posts: 592
How well does the segment model handle a complex heap? Can garbage collection be done?
Ben.


Sun Oct 17, 2021 9:42 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Quote:
How well does the segment model handle a complex heap? Can garbage collection be done?
I will have to think on this question some more. By a complex heap I am assuming you mean one that uses multiple segments? It should be just as easy to handle as it is for an x86. There are more segment registers than there are with an x86, so that might ease the implementation of some routines. With a 64-bit address space an app can use a flat model and manage memory without switching segments. I do not see an app exceeding a 64-bit address space anytime soon. I think there are two levels of garbage collection, one done by the system and one done by apps. The MMU is going to support a physical address space greater than 64 bits. This is mainly academic for now. But with the advent of quantum computers and DNA processors maybe the limit of the 64-bit address space will be approached. What will computing look like 50 years from now?

Added a bunch of neural network accelerator instructions. They are mostly for configuring neurons. The network is an array of 16 neurons in a single layer with up to 1024 inputs for each neuron. A multi-layer network can be formed by reprogramming the weights and using previous output activation levels as inputs for the next layer. Multiple layers may also be supported by partitioning the arrays, for example using 256 inputs for each layer. The neurons perform calculations using saturating 16.16 fixed point arithmetic.
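To make the saturating 16.16 arithmetic concrete, here is a minimal C sketch of the weighted sum one neuron might compute. The names (q16_16, q_mul, neuron_sum) are made up for illustration, the activation function is left out because the post does not specify one, and rounding toward zero is an assumption rather than something the hardware is known to do.

```c
#include <stdint.h>
#include <stddef.h>

typedef int32_t q16_16;   /* 16 integer bits, 16 fraction bits */

/* Clamp a wide intermediate result into the 32-bit Q16.16 range. */
q16_16 q_saturate(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (q16_16)v;
}

/* Saturating multiply: the 64-bit product carries 32 fraction bits,
   so divide by 2^16 (rounding toward zero) before clamping. */
q16_16 q_mul(q16_16 a, q16_16 b)
{
    return q_saturate(((int64_t)a * (int64_t)b) / 65536);
}

/* Saturating add. */
q16_16 q_add(q16_16 a, q16_16 b)
{
    return q_saturate((int64_t)a + (int64_t)b);
}

/* Weighted sum of n inputs plus a bias, saturating at every step. */
q16_16 neuron_sum(const q16_16 *w, const q16_16 *x, size_t n, q16_16 bias)
{
    q16_16 acc = bias;
    for (size_t i = 0; i < n; i++)
        acc = q_add(acc, q_mul(w[i], x[i]));
    return acc;
}
```

In this encoding 1.0 is 0x10000, so q_mul(0x18000, 0x20000) (1.5 times 2.0) gives 0x30000 (3.0), and anything that would overflow sticks at INT32_MAX or INT32_MIN instead of wrapping.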

_________________
Robert Finch http://www.finitron.ca


Mon Oct 18, 2021 3:56 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Added AES encryption and decryption instructions. I am borrowing a lot from the RISC-V design as I do not know a lot about encryption. I have studied the functions, some of which are multiple rotate-rights and XORs. It is tempting to make a generic function to do this, but the instruction format would be less compact.
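As an illustration of the "multiple rotate rights and xors" pattern, here is a small C sketch. In the RISC-V scalar cryptography extension the functions that are purely rotates and XORs are the SHA-2 sum functions (sha256sum0 and sha256sum1), so those are used as the example; whether Thor's instructions look like this is an assumption. The generic helper rotxor3 is a made-up name showing what a generic version would need: three rotate amounts, which is where the extra instruction bits would go.

```c
#include <stdint.h>

/* Rotate a 32-bit word right by n bits. */
uint32_t ror32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Generic form: XOR of three right-rotations of the same word.
   A generic instruction would have to encode a, b and c (15 bits). */
uint32_t rotxor3(uint32_t x, unsigned a, unsigned b, unsigned c)
{
    return ror32(x, a) ^ ror32(x, b) ^ ror32(x, c);
}

/* SHA-256 Sigma0 and Sigma1 (FIPS 180-4), the fixed-function forms. */
uint32_t sha256_sum0(uint32_t x) { return rotxor3(x, 2, 13, 22); }
uint32_t sha256_sum1(uint32_t x) { return rotxor3(x, 6, 11, 25); }
```

The fixed-function forms need no rotate amounts in the encoding at all, which is the compactness trade-off mentioned above.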
Got the source valid logic coded. Source valid logic indicates if a source operand is automatically valid and does not need to be read from a register. In other words, it is an immediate value or is not present in the instruction.

_________________
Robert Finch http://www.finitron.ca


Tue Oct 19, 2021 8:03 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Moved the Thor files to their own repository.

_________________
Robert Finch http://www.finitron.ca


Tue Oct 19, 2021 8:49 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Modified the instruction formats so that a fixed instruction size could be used: 41 bits, three per 128-bit bundle. The instructions executed most often were turning out to be 32 or 40 bits. For instance, branches, representing about 20% of instructions, were 40 or 56 bits in size.
Modified the branch format to increase the number of bits available for a target. There were only 10 bits specifiable for the target address, and that is not enough to cover all cases. The target field was increased to 16 bits. The Tb2 field was removed as it does not make sense to use a vector register in the comparison. The Cn2 field was shifted to a single opcode bit, since the second bit was unused. The Lk2 field was reduced to a single bit, allowing the selection of only a single link register. And finally, the instruction format was made one bit wider. Unconditional branches have a 27-bit target address field.
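For the three-per-bundle format, three 41-bit slots use 123 of the 128 bits, leaving 5 spare. A rough C sketch of packing and unpacking such a bundle follows. The slot order (slot 0 in the lowest bits) and the position of the spare bits (at the top) are assumptions made for the example, as are the names bundle128, bundle_pack and bundle_slot.

```c
#include <stdint.h>

#define SLOT_BITS 41
#define SLOT_MASK ((UINT64_C(1) << SLOT_BITS) - 1)

/* A 128-bit bundle held as two 64-bit halves. */
typedef struct { uint64_t lo, hi; } bundle128;

/* Pack three 41-bit instructions; bits 123..127 are left as zero. */
bundle128 bundle_pack(uint64_t s0, uint64_t s1, uint64_t s2)
{
    bundle128 b;
    s0 &= SLOT_MASK; s1 &= SLOT_MASK; s2 &= SLOT_MASK;
    b.lo = s0 | (s1 << 41);          /* low 23 bits of slot 1 go in bits 41..63 */
    b.hi = (s1 >> 23) | (s2 << 18);  /* high 18 bits of slot 1, then slot 2 */
    return b;
}

/* Extract slot 0, 1 or 2; slot 1 straddles the 64-bit boundary. */
uint64_t bundle_slot(bundle128 b, unsigned slot)
{
    switch (slot) {
    case 0:  return b.lo & SLOT_MASK;
    case 1:  return ((b.lo >> 41) | (b.hi << 23)) & SLOT_MASK;
    case 2:  return (b.hi >> 18) & SLOT_MASK;
    default: return 0;               /* no such slot */
    }
}
```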
Just busy updating the documentation. There are a lot of tables to update.

_________________
Robert Finch http://www.finitron.ca


Thu Oct 21, 2021 12:43 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Switched back to variable-sized instructions (16, 32, 48, or 64 bits) to make software development easier.

Spent the day coding the cpu.c, cpu.h, and cpu_errors.h files for the vasm assembler. Got enough of the coding done to cover the most popular instructions. Not ready to build yet, but getting there.

_________________
Robert Finch http://www.finitron.ca


Tue Oct 26, 2021 4:06 am

Joined: Mon Oct 07, 2019 2:41 am
Posts: 592
Quote:
Sperry (Unisys) 1100/90 Opcodes :

AGB Add GarBage
BBL Branch on Burned out Light
BAH Branch And Hang
BLI Branch and Loop Infinite
BPB Branch on Program Bug
BPO Branch if Power Off
CPB Create Program Bug
CRN Convert to Roman Numerals
DAO Divide And Overflow
ERS Erase Read-only Storage
HCF Halt and Catch Fire
IAD Illogical And
IOR Illogical Or
MDB Move and Drop Bits
MWK Multiply WorK
PAS Print And Smear
RBT Read and Break Tape
RPM Read Programmer's Mind
RRT Record and Rip Tape
RSD Read and Scramble Data
RWD ReWind Disk
TPR Tear PapeR
WED Write and Erase Data
WID Write Invalid Data
XIO Execute Invalid Opcode
XOR Execute OperatoR
XPR Execute ProgrammeR


Some less popular instructions if you missed them the first time.
Good to see you moving forward.
Ben.
Note BPO is only used once, with IRQ service routines,
except for HAL 9000.


Wed Oct 27, 2021 12:15 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
There are about 300 pages of opcode descriptions for Thor2021, so lots of opportunity to invent new meanings.

I got the first build of the assembler done. This is really a trial build to ensure that I am working in the right direction. Much is left out, but the general gist of what needs to be done is there. I have written a cpu.c file for the vasm assembler, and if things work okay it should be usable with the vlink linker. There is an issue within an ‘if’ statement. The following expression is evaluating to true when it should not be, AFAIK.
Code:
 ((mnemo->operand_type[i]) & OP_IMM) && (op.type == OP_IMM)

It is beginning to look like either an issue with the workstation's execution of the program or possibly a compiler issue that needs to be resolved. The makefile runs the cl compiler for Visual Studio, but it looks like it may be trying to use the Win32 (32-bit) tools on a 64-bit machine. I dumped all the values with a printf() and the expression should not be true.
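One way to narrow down a condition like this is to evaluate each half on its own and print both results. The sketch below does that in plain C. The OP_* values are hypothetical stand-ins (the real ones live in the author's cpu.h), and imm_operand_matches is a made-up helper, not part of vasm.

```c
#include <stdio.h>

/* Hypothetical operand-class bits, for illustration only. */
#define OP_REG 0x01
#define OP_IMM 0x02
#define OP_MEM 0x04

/* Nonzero when the template slot accepts an immediate AND the parsed
   operand is exactly an immediate. */
int imm_operand_matches(int template_type, int parsed_type)
{
    int slot_accepts_imm = (template_type & OP_IMM) != 0;  /* bit test */
    int parsed_is_imm    = (parsed_type == OP_IMM);        /* exact compare */

    /* Report each half separately so a surprising result can be traced. */
    fprintf(stderr, "template=0x%x parsed=0x%x accepts=%d is_imm=%d\n",
            template_type, parsed_type, slot_accepts_imm, parsed_is_imm);

    return slot_accepts_imm && parsed_is_imm;
}
```

One thing worth checking with output like this: if OP_IMM were a multi-bit mask rather than a single bit, the `&` test would succeed whenever any one of those bits is set, which can look like a spurious match.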

Another issue is trying to make use of the seven-bit immediate field selected in the register field of the instructions. The assembler cannot distinguish at parse time that the immediate field is only seven bits in size. This causes difficulties with pattern matching. It might turn out that the field needs to be larger than seven bits once it is linked. I may have to concoct a means to indicate to the assembler the immediate is only seven bits. This type of thing has been done in the past to represent lower and upper halves of a value for instance, using ‘<’ and ‘>’.
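Once a value is known, deciding whether it fits the seven-bit field is a simple range test. The C sketch below assumes the field is a signed two's complement value (range -64 to +63); the post does not say whether it is signed, and the function names are invented for the example.

```c
#include <stdint.h>

/* Assumption: the 7-bit field holds a signed two's complement value. */
int fits_simm7(int64_t v)
{
    return v >= -64 && v <= 63;
}

/* Encode into the low 7 bits; the caller should check fits_simm7 first. */
uint32_t encode_simm7(int64_t v)
{
    return (uint32_t)v & 0x7F;
}

/* Decode by sign-extending bit 6 of the field. */
int64_t decode_simm7(uint32_t field)
{
    field &= 0x7F;
    return (field & 0x40) ? (int64_t)field - 128 : (int64_t)field;
}
```

The difficulty described above is that this test can only run once the value is known, which for an external symbol is link time, after the assembler has already chosen an instruction form.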

_________________
Robert Finch http://www.finitron.ca


Thu Oct 28, 2021 4:22 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
More work on the assembler today. Added many more instructions. Got enough instruction encodings done to assemble a Fibonacci program and see what the result is like.
Did some more work on the documentation, updating the instruction formats. The document is not completely up to date with current instruction formats, but it is improved.
About time to put more work into the RTL code.

_________________
Robert Finch http://www.finitron.ca


Fri Oct 29, 2021 3:58 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Got a good chunk of RTL coding done today. I decided to go for a “simple” in-order scalar pipelined core. I figure this is about all that will easily fit onto the FPGA. It is also probably the most straightforward way to get something working. Once it is working then I will improve it. A lot of tools are yet to be built.

_________________
Robert Finch http://www.finitron.ca


Sat Oct 30, 2021 5:54 am

Joined: Mon Oct 07, 2019 2:41 am
Posts: 592
Once you get the RTL working, you might consider just what instructions are being used
and where. 32 bit floating point and short ints may be the exception today compared with 30 years
ago. Do you have a case instruction for switch cases in c? Any instructions for CoRoutines?
Both are weird but useful jump instructions. Can you handle PASCAL or ALGOL? Both seem to compile
into P code. Forth may need the weird jmp next.
Ben.


Sat Oct 30, 2021 8:15 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Added time facility to the core.

Quote:
Once you get the RTL working, you might consider just what instructions are being used
and where. 32 bit floating point and short ints may be the exception today compared with 30 years
ago. Do you have a case instruction for switch cases in c? Any instructions for CoRoutines?
Both are weird but useful jump instructions. Can you handle PASCAL or ALGOL? Both seem to compile
into P code. Forth may need the weird jmp next.

I am reminded of the lowest common denominator when choosing instructions to support. Simple instructions can be used to implement more complex ones.
I think the instructions need to be selected before the RTL code is complete.
The core will process all floats as 64-bit double precision. There are conversion functions to and from 32-bit.
No fancy instructions in this architecture yet.
There is no case instruction. It is possible to jump to a value in a register, so table-based or computed switches may be set up. Branches can compare directly against a small immediate value in the range -64 to +63, which is handy when testing switch cases.
Just thinking: With vectors supported a nifty case instruction could be implemented. One vector register could be loaded with case values, another vector register with case target addresses for case code. Then if the case value matched in the first vector register, code pointed to by the second would be executed.
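A scalar C model of that vector case idea might look like the following. The loop stands in for what hardware would do as a parallel compare across all lanes. VLEN, vector_case and the use of plain 64-bit integers for target addresses are all assumptions for the sketch, not part of the Thor design.

```c
#include <stdint.h>

#define VLEN 8   /* hypothetical number of elements per vector register */

/* case_vals models the vector register of case values, targets the
   vector register of code addresses. Returns the target paired with
   the first matching case value, or default_target if none match. */
uint64_t vector_case(const int64_t case_vals[VLEN],
                     const uint64_t targets[VLEN],
                     unsigned active,          /* lanes actually in use */
                     int64_t selector,
                     uint64_t default_target)
{
    for (unsigned i = 0; i < active && i < VLEN; i++) {
        if (case_vals[i] == selector)
            return targets[i];
    }
    return default_target;
}
```

A switch with more cases than lanes would need several such lookups chained through the default target.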
Do co-routines, PASCAL and ALGOL need special instructions? Can you point out an example?

_________________
Robert Finch http://www.finitron.ca


Mon Nov 01, 2021 5:26 am

Joined: Mon Oct 07, 2019 2:41 am
Posts: 592
Quote:
Do co-routines, PASCAL and ALGOL need special instructions? Can you point out an example?

CoRoutine:
PC exchanged with R
SP exchanged with R+1
FP exchanged with R+2

See Knuth for details
PASCAL or ALGOL P CODE

table1: // stack operation add
*S = *S++ + *S
JMP table[(unsigned byte) *P++] // get next instruction

table2: // stack operand add # byte
*S = *S + (signed byte) *P++
JMP table[(unsigned byte) *P++] // get next instruction

Not complex but messes up the cache.
Can a cache [256?] entries be pre assigned to a register pointer like a stack
or jump table?
Ben.
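The P-code dispatch in the post above (fetch a byte through P, jump through a table indexed by it) can be sketched in portable C with a table of function pointers. The four opcodes, the pvm struct and pcode_run are invented for the example. There is no stack bounds checking, and the cast of a byte above 127 to int8_t relies on the usual two's complement behaviour.

```c
#include <stdint.h>

typedef struct {
    const uint8_t *p;   /* P: P-code instruction pointer */
    int32_t *s;         /* S: points at the top-of-stack element */
    int running;
} pvm;

typedef void (*handler)(pvm *);

static void op_halt(pvm *vm) { vm->running = 0; }
static void op_push(pvm *vm) { *++vm->s = (int8_t)*vm->p++; }   /* push signed byte */
static void op_add(pvm *vm)  { int32_t top = *vm->s--; *vm->s += top; }
static void op_addi(pvm *vm) { *vm->s += (int8_t)*vm->p++; }    /* add signed byte */

/* Opcode byte -> handler; the equivalent of "JMP table[*P++]". */
static const handler table[4] = { op_halt, op_push, op_add, op_addi };

/* Run until the halt opcode (0) and return the top of stack. */
int32_t pcode_run(const uint8_t *code)
{
    int32_t stack[64] = { 0 };      /* stack[0] is a sentinel below the first push */
    pvm vm = { code, stack, 1 };
    while (vm.running)
        table[*vm.p++ & 3](&vm);    /* mask keeps the index inside the table */
    return *vm.s;
}
```

Every handler ends by returning to the same indirect call, so the branch target depends only on the next opcode byte. That data-dependent jump is the hard part for prediction, regardless of how the table itself is cached.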


Mon Nov 01, 2021 5:47 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Thanks Ben, you have got me studying co-routines, which I have not really used. I have done something similar with threads, though.
Thinking out loud here:
For co-routines the desire is to vary the target address within a coroutine according to its state. If the state is 1 then jump to point A of the target coroutine. If the state is 2 then jump to point B of the target coroutine, and so on. I have seen examples of this implemented with switch statements based on a state var. The locations of the target addresses in the coroutines are marked by the coroutine return instruction. I think this can be done using jump tables. Some support may need to be built into the compiler, but I do not think special instructions need be used. However it is implemented, it should be supported by the compiler.
A coroutine return (or yield) statement could be treated as part of an enumeration, each return getting an enumerated value. A jump table could be built containing the proper coroutine addresses.
At the entrance point of the coroutine a table-based jump would be done. This may be a slow way of doing things, however.
Code:
Corout1:
   ldw      %r1,state                  ; get the current state of this routine
   ldo      %r60,jmptabl[%r1*8]   ; fetch target address from table
   csrrw   %r0,#CA4OFFS,%r60      ; load target address into code address register
   jmp      [ca4]                        ; jump to it
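The switch-on-state form mentioned above can be written in standard C as follows. This is the well-known trick of placing a case label inside the loop body so that re-entry resumes just after the previous return. The names counter_co and counter_next are made up, and locals that must survive a yield live in the struct rather than on the stack.

```c
/* Context for one coroutine instance. */
typedef struct {
    int state;   /* 0 = start, 1 = resume after the yield, 2 = finished */
    int i;       /* loop variable, kept here so it survives a yield */
} counter_co;

/* Yields 0, 1, 2 on successive calls, then -1 forever. */
int counter_next(counter_co *co)
{
    switch (co->state) {          /* typically compiled to a jump table */
    case 0:
        for (co->i = 0; co->i < 3; co->i++) {
            co->state = 1;        /* remember where to resume */
            return co->i;         /* yield */
    case 1:;                      /* re-entry point, inside the loop */
        }
    }
    co->state = 2;                /* no matching case: always falls through */
    return -1;
}
```

Starting from a zeroed counter_co, five calls return 0, 1, 2, -1, -1.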

Coroutines may need their own stacks.
I like Wikipedia’s grammar for coroutines. The keyword ‘coroutine’ is used to indicate code for a coroutine, and the ‘yield’ keyword transfers flow between coroutines.
Code:
coroutine(stack1) CoRout1(int a)
{
   int x;
   int y;

   forever {
      yield CoRout2(y);
   }
}

coroutine(stack2) CoRout2(int a)
{
   int x;
   int y;

   forever {
      yield CoRout1(x);
   }
}


I guess the table may be eliminated by using the address after the yield as the state var.
Code:
Corout1:
   ldo      %r1,state                  ; get the current "state" of this routine
   csrrw   %r0,#CA4OFFS,%r1      ; load target address into code address register
   jmp      [ca4]                        ; jump to it
   ...
   ; yield point
   ldi      %r1,next_addr            ; get the next address to continue from
   sto      %r1,state                  ; store it in state variable
   jmp      Corout2                     ; jump to Corout2
next_addr:
   ...

With global optimizations, the state variable could be kept in a register.
The above is overly simple, as the stack and frame pointers also need to be changed to those of the target coroutine, since there may be multiple coroutines.
Code:
Corout1:
   ldo      %r1,state                  ; get the current "state" of this routine
   csrrw   %r0,#CA4OFFS,%r1      ; load target address into code address register
   jmp      [ca4]                        ; jump to it
   ...
   ; yield point
   ldi      %r1,next_addr            ; get the next address to continue from
   sto      %r1,state                  ; store it in state variable
   sto      %r63,sp_save               ; save stack pointer
   sto      %r62,fp_save               ; and frame pointer
   jmp      Corout2                     ; jump to Corout2
next_addr:
   ldo      %r63,sp_save            ; restore stack pointer
   ldo      %r62,fp_save            ; restore frame pointer
   ...

So, coroutines need a context area in the global address space.
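A portable C model of "the address after the yield is the state" uses a function pointer for the resume address and a struct for the context area. Standard C cannot save and restore the stack and frame pointers, so those are not modelled; anything that must survive a yield goes in the struct instead. All names here (co_ctx, ping_first, ping_after_yield, co_resume) are invented for the sketch.

```c
struct co_ctx;
typedef void (*co_step)(struct co_ctx *);

/* Per-coroutine context area, living in global or heap storage. */
struct co_ctx {
    co_step resume;   /* where to continue: plays the role of 'state' */
    int     counter;  /* example of data that must survive a yield */
};

void ping_after_yield(struct co_ctx *c);

/* Code from the entry point up to the first yield. */
void ping_first(struct co_ctx *c)
{
    c->counter = 1;
    c->resume = ping_after_yield;   /* like: ldi next_addr / sto state */
}

/* Code following the yield point. */
void ping_after_yield(struct co_ctx *c)
{
    c->counter += 10;
    c->resume = ping_first;         /* start over on the next entry */
}

/* Entry stub: load the saved address and jump to it. */
void co_resume(struct co_ctx *c)
{
    c->resume(c);
}
```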

_________________
Robert Finch http://www.finitron.ca


Tue Nov 02, 2021 7:13 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
It looks like a lot of modern languages support coroutines, so I decided to add support for them to the CC64 compiler. I think I figured out the code that the compiler needs to generate for coroutines. The yield statement can accept an expression that evaluates to a function address, making it possible to use pointers.

Code:
coroutine A()
{
   forever {
      yield B();
   }
}

coroutine B()
{
   forever {
      yield A();
   }
}

void main()
{
   A();
}

; Compiler declares the following global variables in the data segment
_data_start:
Co_A_target      dco   Co_A_first
Co_A_orig_sp   dco   0
Co_A_orig_fp   dco   0
Co_A_orig_lr   dco   0
Co_A_sp_save   dco   0
Co_A_fp_save   dco   0

Co_B_target      dco   Co_B_first
Co_B_orig_sp   dco   0
Co_B_orig_fp   dco   0
Co_B_orig_lr   dco   0
Co_B_sp_save   dco   0
Co_B_fp_save   dco   0

[code]
; %r61 is the global pointer
; %r62 is the frame pointer
; %r63 is the stack pointer
;
Co_A_yield:
   ldi      %r61,_data_start
   ldo      %r3,Co_A_target[%r61]         ; get the current target of this routine
   csrrw   %r0,#CA4OFFS,%r3      ; load target address into code address register
   jmp      [ca4]                        ; jump to it
   ; On first entry save off current stack and frame pointer
Co_A_first:
   sto      %r63,Co_A_orig_sp[%r61]
   sto      %r62,Co_A_orig_fp[%r61]
   sto      %r1,Co_A_orig_lr[%r61]
   < allocate a stack>
   ldi      %r63,stack+stacksz
   ...
   ;--------------------------------------
   ; yield point
   ;--------------------------------------
   ldi      %r3,next_addr            ; get the next address to continue from
   sto      %r3,Co_A_target[%r61]         ; store it in target variable
   < save register context as if calling another subroutine >
   sto      %r62,Co_A_fp_save[%r61]   ; save frame pointer
   sto      %r63,Co_A_sp_save[%r61]   ; and stack pointer
   jmp      Co_B_yield                     ; jump to Corout B run entry point
next_addr:
   ldo      %r62,Co_A_fp_save[%r61]   ; restore frame pointer
   ldo      %r63,Co_A_sp_save[%r61]   ; restore stack pointer
   < restore register context>
   ...
   ;---------------------------------------
   ; return point
   ;---------------------------------------
   ldi      %r3,Co_A_first               ; setup entry point in case called again
   sto      %r3,Co_A_target[%r61]
   ldo      %r1,Co_A_orig_lr[%r61]
   ldo      %r63,Co_A_orig_sp[%r61]   ; get back original stack
   ldo      %r62,Co_A_orig_fp[%r61]   ; and frame pointer
   rts
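The per-coroutine globals above form a fixed six-entry record. A C sketch of that layout is below, assuming each dco reserves one 64-bit value and that the records for A and B are laid out back to back from _data_start; both are assumptions about the assembler and the compiler output, and co_record and co_field_offset are illustrative names.

```c
#include <stddef.h>
#include <stdint.h>

/* One record per coroutine, mirroring the Co_X_* variables. */
typedef struct {
    uint64_t target;    /* Co_X_target: address to resume at */
    uint64_t orig_sp;   /* caller's stack pointer at first entry */
    uint64_t orig_fp;   /* caller's frame pointer at first entry */
    uint64_t orig_lr;   /* caller's link register at first entry */
    uint64_t sp_save;   /* coroutine's own stack pointer at a yield */
    uint64_t fp_save;   /* coroutine's own frame pointer at a yield */
} co_record;

/* Byte offset from _data_start of a field in the n-th record. */
size_t co_field_offset(size_t n, size_t field_offset)
{
    return n * sizeof(co_record) + field_offset;
}
```

Under those assumptions each record is 48 bytes, so Co_B_target would sit at offset 48 and Co_A_sp_save at offset 32 from the global pointer.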

_________________
Robert Finch http://www.finitron.ca


Wed Nov 03, 2021 4:26 am