 RTF64 processor 

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Latest Fixes:
<software> The operand size field of the TST instruction was placed incorrectly by the assembler, which led to an unimplemented instruction being encoded. The size field for single-register ops was also being defaulted incorrectly to byte size. This affected the TST, COM, NEG, and NOT instructions.
<hardware> Several of the variations of MUL, DIV, and REM were not decoded at the decode stage, leading to unimplemented instruction errors.

Latest Mods:
<hardware> The NEG and COM instructions now fully support byte, wyde, tetra, octa, and parallel operations. Previously only octa operations were supported. The PUSH, PUSHC, LINK and UNLINK opcodes were reassigned out of the memory opcode space to an unused space.

_________________
Robert Finch http://www.finitron.ca


Sun Nov 29, 2020 4:13 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Latest Fixes:
<software> Routines to get the keyboard status and scan code needed to be provided for the debug keyboard routines; the assembler portion of the keyboard driver was not included in the build. The compiler was generating code to allocate and remove one word too many on the stack for the return block, because the return block size included the return address. However, the return address is already allocated by a CALL, so the size should have been one less. This affected the location of parameters passed on the stack, which were then off by one word. There were multiple copies of some files included in the assembly build. This resulted in phase errors and branch-out-of-range errors due to duplicate labels.

Latest Mods:
<hardware> SIMD forms of ADD and SUB were added. A 64-bit register is treated as one 64-bit value, two 32-bit values, four 16-bit values or eight 8-bit values. Also added were byte, wyde, and tetra versions.
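To illustrate what treating one 64-bit register as several narrower lanes means for ADD, here is a minimal software sketch in portable C. It is not the RTF64 hardware implementation; the function names are made up for this example, and wrap-around (non-saturating) lane arithmetic is an assumption.

```c
#include <stdint.h>

/* Add eight unsigned 8-bit lanes packed in 64-bit words, wrapping per lane.
   The top bit of each lane is handled separately so that no carry crosses
   a lane boundary. */
static uint64_t add8x8(uint64_t a, uint64_t b)
{
    const uint64_t H = 0x8080808080808080ULL;   /* top bit of every lane */
    uint64_t low = (a & ~H) + (b & ~H);         /* add the low 7 bits of each lane */
    return low ^ ((a ^ b) & H);                 /* fold in the top bits without carry-out */
}

/* Same idea for four 16-bit lanes. */
static uint64_t add16x4(uint64_t a, uint64_t b)
{
    const uint64_t H = 0x8000800080008000ULL;   /* top bit of every lane */
    uint64_t low = (a & ~H) + (b & ~H);
    return low ^ ((a ^ b) & H);
}
```

With a plain 64-bit add, 0xFF + 0x01 in the lowest byte would carry into the next byte; with the lane-wise version the lowest lane wraps to 0 and its neighbour is untouched.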

<software> FMTK device drivers are being modified to use messaging primitives and mailboxes rather than function pointers. A device driver now looks like:
Code:
int serial_driver()
{
  p = DeviceTable[5];
  forever {
    WaitMsg(p->hSendMsg, &d1, &d2, &d3, -1);
    switch(d1 & 0xff) {
    case DVC_GetUnit:
      val = serial_get(handle);
      SendMsg(p->hRcvMbx, E_Ok, val, 0);
      break;
< … parse other messages … >
    case DVC_Shutdown:
      serial_flushi(handle);
      DBGDisplayAsciiStringCRLF(B"Serial shutdown");
      SendMsg(p->hRcvMbx, E_Ok, 1, 0);
      return (E_Ok);
    default:
      SendMsg(p->hRcvMbx, E_BadDevOp, 0, 0);
      break;
    }
  }
}

The driver waits forever for messages, parses the message request, executes methods if needed, and sends a response back. Previously each function was invoked via a function pointer. The new way makes it easier to queue programs waiting for devices. It also blocks programs automatically, since each request waits for its response. Each device now has a pair of mailboxes, one for sending and one for receiving. The OS open(), close(), read(), and write() functions were rewritten to use messaging.
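The request/response flow through a pair of mailboxes can be modelled in portable C roughly as follows. This is only a sketch and not FMTK code: the Mailbox type, mbx_send(), mbx_recv(), driver_step() and the numeric values of the request and status codes are all assumptions made for this example, and the driver polls instead of blocking so that it can run single-threaded.

```c
#include <stdint.h>

enum { DVC_GetUnit = 1, DVC_Shutdown = 2 };   /* hypothetical request codes */
enum { E_Ok = 0, E_BadDevOp = 1 };            /* hypothetical status codes */

typedef struct { int64_t d1, d2, d3; } Msg;

/* A tiny fixed-size FIFO standing in for an OS mailbox. */
#define MBX_SIZE 8
typedef struct { Msg q[MBX_SIZE]; unsigned head, count; } Mailbox;

static int mbx_send(Mailbox *m, int64_t d1, int64_t d2, int64_t d3)
{
    if (m->count == MBX_SIZE) return -1;          /* mailbox full */
    Msg *slot = &m->q[(m->head + m->count) % MBX_SIZE];
    slot->d1 = d1; slot->d2 = d2; slot->d3 = d3;
    m->count++;
    return 0;
}

static int mbx_recv(Mailbox *m, Msg *out)
{
    if (m->count == 0) return -1;                 /* nothing queued */
    *out = m->q[m->head];
    m->head = (m->head + 1) % MBX_SIZE;
    m->count--;
    return 0;
}

/* Handle one queued request; returns 0 to keep running, 1 after shutdown.
   A real driver would block waiting for a message instead of returning
   when the request mailbox is empty. */
static int driver_step(Mailbox *requests, Mailbox *replies)
{
    Msg m;
    if (mbx_recv(requests, &m) != 0) return 0;
    switch (m.d1 & 0xff) {
    case DVC_GetUnit:
        mbx_send(replies, E_Ok, 42, 0);   /* 42 stands in for a value read from the device */
        return 0;
    case DVC_Shutdown:
        mbx_send(replies, E_Ok, 1, 0);
        return 1;
    default:
        mbx_send(replies, E_BadDevOp, 0, 0);
        return 0;
    }
}
```

Because every request is answered through the reply mailbox, several callers can queue requests for the same device and each one simply waits on its response.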

_________________
Robert Finch http://www.finitron.ca


Mon Nov 30, 2020 4:01 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Latest Fixes:
<hardware> The LEA instruction was updating the execute result bus even though the value is calculated during the memory stage. It should have been updating the memory result bus. This led to strings not displaying properly, among other things. The FSTO instruction was only partially implemented in hardware, leading to an illegal instruction trap.
<software> The compiler croaked on an undefined symbol. Solution: the symbol is now defined as an int and an error message is output so that the compile can continue. The assembler did not assemble the short form of the FSTO instruction properly.

Latest Mods:
<software> The assembler now supports the short form of the LEA instruction. The short form is used when the stack pointer or frame pointer is referenced in the instruction.
<hardware> The text blitter was modified to support color, which makes it almost a tile graphics blitter. This is to support the emoticons in Unicode. Text cells up to 32x32 may now be colored in 4, 8, 12, 16, 20, or 32 bits per pixel. Color depth is an attribute of the font. The text blitter also accepts a 20-bit character code instead of a 16-bit one. This should make it possible to support the Unicode characters.

Latest Musing:
Since there are over 140,000 Unicode characters, I think a 24-bit value might be appropriate for supporting Unicode. It allows a few extra bits for expansion of Unicode. This affects characters and strings in the compiler. It also affects load / store operations in hardware. It might be an idea to support 24-bit loads and stores. Using 24 bits rather than 32 bits saves 25% on memory space. Alternatively, three 21-bit values fit into 64 bits, saving even more space, which is another thought.
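The three-per-word idea can be sketched in portable C as below. The slot order (slot 0 in the least significant bits) and the names u21_pack() and u21_get() are assumptions made for this example, not something taken from the RTF64 design.

```c
#include <stdint.h>

#define U21_MASK 0x1FFFFFULL   /* 21 one bits */

/* Pack three code points into one word: slot 0 in bits 0..20,
   slot 1 in bits 21..41, slot 2 in bits 42..62. Bit 63 is unused.
   This slot order is an assumption for the sketch. */
static uint64_t u21_pack(uint32_t c0, uint32_t c1, uint32_t c2)
{
    return  ((uint64_t)c0 & U21_MASK)
         | (((uint64_t)c1 & U21_MASK) << 21)
         | (((uint64_t)c2 & U21_MASK) << 42);
}

/* Extract the code point in slot 0, 1 or 2. */
static uint32_t u21_get(uint64_t word, unsigned slot)
{
    return (uint32_t)((word >> (21u * slot)) & U21_MASK);
}
```

At three characters per 8 bytes this costs about 2.67 bytes per character, compared with 3 bytes for 24-bit codes and 4 bytes for 32-bit codes.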

_________________
Robert Finch http://www.finitron.ca


Tue Dec 01, 2020 3:52 am

Joined: Mon Oct 07, 2019 2:41 am
Posts: 585
But how many Unicode characters are really needed?
For bootstrapping I could see 4 code pages loaded from ROM,
unless you want to boot up in, say, Japanese.


Tue Dec 01, 2020 5:53 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Quote:
But how many Unicode characters are really needed?

I was thinking along the same lines: how many does one really need? They could be paged, and a 16-bit char code used. Some of the pictographic char sets have thousands of characters in them, so at least 16 bits is probably needed. Does one really need them all simultaneously available? The thought I had was that even 140,000 characters is only about 1MB of 8x8 monochrome images, which is not very much in terms of memory available today. For a text display of 100x50 (5,000 chars) only about 16k of RAM is required for 24-bit codes. This is not the 80's. I may modify the text controller to display Unicode characters. There’s room for a 20-bit char code if the colors are trimmed back a little bit.
Quote:
For bootstrapping I could see 4 code pages loaded from ROM,
unless you want to boot up in, say, Japanese.

I may not want the system to boot up in Japanese, but I do not know what someone else might want to do. If one desires the system to be portable then maybe supporting a full Unicode char set is a possibility. Right now, the compiler uses 16-bit codes to represent strings; maybe that should be changed to 24 bits or 21 bits. Strings are searched and compared a word at a time (64 bits) for performance reasons. It would not be too difficult to add a 21-bit search instruction like the BYTNDX and WYDNDX instructions.

[sci-fi]An alien race is encountered and they use a 16-million-character set (to support many languages in the galaxy). The only system supporting that many characters is one you've designed.
[/sci-fi]

_________________
Robert Finch http://www.finitron.ca


Tue Dec 01, 2020 1:18 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Supporting Unicode
Milestone: first text string relative to global pointer displayed

Latest Fixes:
<software> The assembler was calculating global pointer relative addresses according to the current output segment when it should have been the symbol’s segment. This led to incorrect values for symbol locations resulting in a hang on a bad address.

Latest Mods:
<software> The compiler was modified to be able to store character strings in UTF21 encoding. Three characters are stored per 64-bit word. The library routines that process strings still need to be updated. It will be challenging since a char pointer cannot simply be incremented. Some work was done on this.

<hardware> Decodes for the BIT instruction were removed as BIT is no longer part of the ISA. The text controller was modified to support a 24-bit character code. The color selection was reduced by a bit to RGB666 to provide room for more char codes.
A UTF21 search instruction (U21NDX) was added to the instruction set. It searches a word for the first UTF21 character matching a given character and returns the index of that character in the word.
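In software terms, the search described above behaves roughly like the following C model. The function name u21ndx_model(), the field layout (index 0 in the least significant 21 bits) and the -1 result for "not found" are assumptions made for this sketch; the actual instruction may differ.

```c
#include <stdint.h>

/* Software model of a U21NDX-style search: return the index (0..2) of the
   first 21-bit field in `word` equal to `ch`, or -1 if none matches.
   Field layout and the -1 result are assumptions for this sketch. */
static int u21ndx_model(uint64_t word, uint32_t ch)
{
    for (int i = 0; i < 3; ++i) {
        uint32_t field = (uint32_t)((word >> (21 * i)) & 0x1FFFFF);
        if (field == (ch & 0x1FFFFF))
            return i;
    }
    return -1;
}
```

Searching for character code 0 finds a string terminator within a word, which is what makes a word-at-a-time string length routine possible.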

Latest Musing:
I am wondering how to support the display of a full Unicode character set using only a 32kB bitmap memory. My first thought is to implement it as a cache, and have the bitmaps stored in main memory. The cache would need to be multi-way associative in case two characters share the same low order bits of the char code. My second thought is to limit the number of characters to a smaller value say 12-bit and allow multiple fonts on the screen at one time. So, the input char code would contain a font id plus the 12-bit character code. With a small number (eight) different fonts allowed at the same time multiple languages could be displayed on screen, or one language in multiple fonts.
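The first idea, a multi-way associative glyph cache, could look something like this two-way sketch in C. The sizes, the GlyphCache type and glyph_lookup() are hypothetical and chosen only to show how two characters sharing the same low-order bits can both stay resident.

```c
#include <stdint.h>
#include <stdbool.h>

#define SETS 256                 /* hypothetical: 256 sets x 2 ways = 512 cached glyphs */
#define WAYS 2

typedef struct {
    bool     valid[SETS][WAYS];
    uint32_t tag[SETS][WAYS];    /* upper bits of the char code */
    uint8_t  lru[SETS];          /* way to replace next */
} GlyphCache;

/* Returns the bitmap slot (set * WAYS + way) holding the glyph for `code`.
   *miss is set when the caller must first copy the bitmap from main memory. */
static unsigned glyph_lookup(GlyphCache *c, uint32_t code, bool *miss)
{
    unsigned set = code % SETS;          /* low-order bits pick the set */
    uint32_t tag = code / SETS;          /* remaining bits identify the glyph */
    for (unsigned w = 0; w < WAYS; ++w) {
        if (c->valid[set][w] && c->tag[set][w] == tag) {
            c->lru[set] = (uint8_t)(w ^ 1);  /* the other way becomes the victim */
            *miss = false;
            return set * WAYS + w;
        }
    }
    unsigned victim = c->lru[set];       /* replace the least recently used way */
    c->valid[set][victim] = true;
    c->tag[set][victim] = tag;
    c->lru[set] = (uint8_t)(victim ^ 1);
    *miss = true;
    return set * WAYS + victim;
}
```

With two ways, char codes 0x0041 and 0x0141 can be cached at the same time even though they map to the same set; a third code mapping to that set evicts whichever of the two was used least recently.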
Mulling over the idea of having a UTF21 pointer increment instruction. The LSBs of the pointer would go 0,1,2,0,1,2,0,1,2… as it increments. It might be useful to have other increment patterns as well. For instance, five 12-bit pixels fit into 64 bits. It would be convenient if the pointer LSBs went 0 to 4. There ought to be a way to support oddball pointer increments or additions. This can be done with a small subroutine, possibly inline.
Code:
 private void IncUTF21(byte** p)
{
  (*p)++;
  if ((*p & 7) > 2) {
    *p &= -8;
    (*p) += 8;
  }
}

Some functions are easy to implement and also fast, for example strlen():
Code:
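size_t (utf21_strlen)(const byte *s)
{   // find length of s[] in UTF21 characters (three per 64-bit word)
  int n, k;

  // k starts at -3 because the loop increment (k += 3) still runs once
  // after the word containing the terminator has been found.
  for (n = -1, k = -3; n < 0; k += 3, s += 8)
    n = __u21ndx(*(int *)s,0);  // index of the first zero char, negative if none
  k += n;
  return (k);
}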

_________________
Robert Finch http://www.finitron.ca


Wed Dec 02, 2020 4:02 am

Joined: Mon Oct 07, 2019 2:41 am
Posts: 585
Since you have hardware to unpack stuff,
this web site has some ideas on packing floating point data
http://www.quadibloc.com/comp/compint.htm
(You need dig for it)
Ben.


Wed Dec 02, 2020 9:40 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Latest Milestone:
The keyboard initialization routine may or may not be working. The LEDs on the keyboard light up. It is supposed to spit out an error message if the keyboard cannot be initialized, and no error message appears, so that may be good. It can take up to 10 to 15 seconds to run the initialization routine, but it is usually pretty quick (1s). The ‘DBG>’ prompt appears on screen, indicating the debugger program is running.

Latest Fixes:
<software> Encoding of the BBS and BBC instructions by the assembler was completely messed up. They were encoded according to an outdated instruction format, as three bytes when they should have been four. This caused an unintended branch into the middle of another function. The FLDO instruction short form was encoded as four bytes when it should have been three. This led to a hang.

Latest Mods:
<software> The compiler was modified to use temporary registers that do not need to be saved and restored to the stack in place of register variables for leaf functions. This decreases the size of leaf functions because there is no register stack code and improves performance. This conserved about 1kB of code space in the boot rom.
<hardware> A PTRINC instruction was added that does the equivalent of the following in a single clock cycle (increment a pointer according to a mod value):
Code:
 void _ptrinc(byte** p, int mod)
{
  (*p)++;
  if ((*p & 7) > mod) {
    *p &= -8;
    (*p) += 8;
  }
}

The following function does a field extract using the pointer managed by _ptrinc().
Code:
 int:32 _subword(int* p, int width)
{
  int wd;
  int:32 ch;
  int n;

  wd = *p;                   // word holding the element
  n = __mulf(p & 7,width);   // Fast single cycle multiply intrinsic: bit offset of the element
  ch = wd[n+width-1:n];      // Field extract
  return (ch);
}

It allows incrementing a pointer by the size of an element in a word instead of by a byte. The low order bits of the pointer can then be used with a field extract to select out the element.
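For readers without the CC64 extensions, the same two steps can be modelled in portable C. This is a sketch only: pointers are represented as plain byte offsets into an array of 64-bit words, and the names ptrinc_model() and subword_model() are made up for the example.

```c
#include <stdint.h>

/* Model of the pointer increment: `p` is a byte address whose low three bits
   hold the element index within the 64-bit word. `mod` is the highest valid
   index (2 for three 21-bit chars, 4 for five 12-bit pixels). */
static uint64_t ptrinc_model(uint64_t p, unsigned mod)
{
    p++;
    if ((p & 7) > mod)
        p = (p & ~7ULL) + 8;     /* step to element 0 of the next word */
    return p;
}

/* Extract the `width`-bit element selected by `p` from `mem`, where `mem`
   is an array of 64-bit words and `p` is a byte offset into it.
   Assumes width is less than 64. */
static uint32_t subword_model(const uint64_t *mem, uint64_t p, unsigned width)
{
    uint64_t word = mem[p >> 3];               /* aligned word holding the element */
    unsigned bit  = (unsigned)(p & 7) * width; /* bit offset of the element */
    return (uint32_t)((word >> bit) & ((1ULL << width) - 1));
}
```

Walking a UTF21 string then amounts to calling subword_model(mem, p, 21) followed by p = ptrinc_model(p, 2), so the low bits of p cycle 0, 1, 2 while the upper bits advance one word at a time.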

Future:
Having learned a little bit more, I have begun work on an Open PowerPC compatible core - nPower. It will be a super-pipelined core like the RTF64. As a first step only the minimal number of instructions necessary to run software will be implemented. A lot of the PowerPC architecture is going to be left out.

<bugs> There is an issue with the return from the memsetO() function: it returns to a garbage address. Usually I can spot the issue right away, but this one has me stumped. The code generated by the compiler is:
Code:
 public code _memsetO:
  gcsub    $sp,$sp,#16
  sto      $fp,[$sp]
  mov      $fp,$sp
  gcsub    $sp,$sp,#16
  ldo      $t0,40[$fp]
  ldo      $t2,24[$fp]
;    const unsigned int:64 uc = c;
  ldo      $t3,32[$fp]
;    su = (unsigned int:64 *)s;
  mov      $t1,$t2
;    for (; n > 0; ++su, --n)
  sle      $cr0,$t0,$x0
  bt       $cr0,memsetO_13
memsetO_12:
;       *su = uc;
  sto      $t3,[$t1]
  add      $t1,$t1,#8
  sub      $t0,$t0,#1
  sgt      $cr0,$t0,$x0
  bt.      $cr0,memsetO_12
memsetO_13:
;    return (s);
  mov      $a0,$t2
memsetO_8:
memsetO_11:
  mov      $sp,$fp
  ldo      $fp,[$sp]
  add      $sp,$sp,#16
  ret      #24
endpublic

The CC64 source code is:
Code:
 #include <string.h>

void *(memsetO)(void *s, int c, size_t n)
   {   /* store c throughout unsigned char s[n] */
   const unsigned int:64 uc = c;
   unsigned int:64 *su;

   su = (unsigned int:64 *)s;
   for (; n > 0; ++su, --n)
      *su = uc;
   return (s);
   }

_________________
Robert Finch http://www.finitron.ca


Thu Dec 03, 2020 3:55 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
<bugs> There was an extra opening ‘{’ in one of the FMTK functions. This cost me several hours of debugging. Variables in the following functions were then not being found. The location of the error message did not correspond to the location of the bug. I thought something was amiss with symbol lookups, which have been working rather well for a while now. I put all kinds of debugging code into the searches and was eventually able to track down the issue. The compiler was considering variables in a hierarchical fashion, which almost worked. The search function begins with the current function and searches outwards, since the function may be a class method. In this case it looked like functions were defined within the scope of another function, so it almost worked that way. Shades of Pascal.

Latest Mods:
Got rid of the multiple register sets. There is now just a single register set, but a separate stack pointer for each operating mode. There are two extra stack pointers as that makes eight for a nice even number. There are six operating modes: user, supervisor, hypervisor, machine, interrupt and debug. The OS code has been worked on and is coming along slowly.

Work was disrupted by a blown UPS power supply. It went 'click' and powered everything off on me, right in the middle of a build and while I was editing files.

_________________
Robert Finch http://www.finitron.ca


Fri Dec 04, 2020 4:31 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Now that a large chunk of the core has been debugged, it is going to be shelved temporarily while work on nPower is done. The RTF64 core is too large for present purposes; nPower should be much smaller.

_________________
Robert Finch http://www.finitron.ca


Sat Dec 05, 2020 5:12 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
A couple of modules have been created for shift and R1 type ops. Making the core more modular will hopefully reduce the build times.

_________________
Robert Finch http://www.finitron.ca


Sun Dec 06, 2020 4:36 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Latest Fixes:
<software> Merge operations for SETxx instructions were encoded incorrectly. They were shifted a bit from the correct position.

<documentation> In the documentation the encodings for the merge operations were in the wrong order. The rtl code was changed to match the documentation as that was easier to modify.

_________________
Robert Finch http://www.finitron.ca


Mon Dec 07, 2020 3:46 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Latest Mods:
A store immediate instruction was added to the instruction set. This stores a 14-bit value as an octa, highly useful for subroutine arguments. This saved 4.5kB of memory out of 98.7kB over a load immediate then store approach, improving overall code density by about 4.5%.
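An assembler or compiler deciding whether a constant can use the store immediate form needs a range check along these lines. The post does not say whether the 14-bit value is sign-extended or zero-extended when stored as an octa; this C sketch assumes sign extension, and the helper names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* True when `v` can be represented as a signed 14-bit immediate.
   Signedness is an assumption; the post does not state it. */
static bool fits_imm14(int64_t v)
{
    return v >= -8192 && v <= 8191;
}

/* Widen a 14-bit immediate field to the 64-bit value that would be stored. */
static int64_t sext14(uint32_t field)
{
    int64_t v = (int64_t)(field & 0x3FFF);
    return (v & 0x2000) ? v - 0x4000 : v;
}
```

Constants that fail the check would still go through the load immediate then store sequence.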

_________________
Robert Finch http://www.finitron.ca


Tue Dec 08, 2020 3:57 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Experimented with loop invariant code motion (LICM) in the compiler. After several tries that produced incorrect results, this feature was stubbed out. It is tricky to implement. It is necessary to check that the instruction does not have side effects within the loop. Only expressions that are known to be unchanging in the loop can be hoisted. There are many expressions, especially involving memory operations, where the compiler cannot know whether or not the expression is constant in the loop. That means that most expressions cannot be hoisted out of the loop. Some of the simpler expressions that might have been hoisted out of loops end up being eliminated completely by the substitution of register variables for the expression in other optimizations.
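A small C example shows why memory operations make hoisting risky. The two functions below are made up for illustration; the second is what a naive LICM pass would produce from the first, and it is only equivalent when the pointer k does not alias any element of dst[].

```c
#include <stddef.h>

/* Straightforward loop: *k is re-read on every iteration. */
static void add_k(int *dst, const int *k, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] += *k * 2;
}

/* Manually hoisted version: valid only if no store in the loop can change *k. */
static void add_k_hoisted(int *dst, const int *k, size_t n)
{
    const int t = *k * 2;   /* loop-invariant only when k does not alias dst[] */
    for (size_t i = 0; i < n; ++i)
        dst[i] += t;
}
```

If k points at dst[0], the first store changes *k, so starting from {1, 1, 1} the original loop yields {3, 7, 7} while the hoisted loop yields {3, 3, 3}. Unless the compiler can prove the two do not overlap, it has to leave the load inside the loop.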

Latest Fixes:
<hardware> Most stores were using an outdated instruction format. The instruction format was changed to match the load format a while ago, but the rtl code never got updated. It is amazing that software worked as well as it did. When the store immediate instruction was added, adding it to the select logic got missed. This resulted in selects not being active during the write cycle, meaning memory did not get updated.
Write enable signals in the TLB unit were using the wrong signal name due to letters being transposed. Also, in the TLB unit the input address signal name was incorrect.

<software> A missing return value led to null pointer issues in the compiler.

Latest Mods:
Got rid of the conditional move instruction cmovenz. It represented only about 0.04% of instructions. It was not implemented yet in the rtl code.

_________________
Robert Finch http://www.finitron.ca


Wed Dec 09, 2020 3:56 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
RTF64 Pipeline Diagram
[Attachment: RTF64PipelineDiagram.png, RTF64 Pipeline Diagram, 46.97 KiB]

_________________
Robert Finch http://www.finitron.ca


Wed Dec 09, 2020 5:24 am