CC64 / ARPL Compiler

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
The compiler is now about 90% compatible with the test suite. Out of 219 test cases there are only about a half dozen with issues.

Updated FPP, the preprocessor, to version 2.0. The big modification was to expression evaluation: the ternary conditional operator “? :” was added. This fixes a failure on test #T00075.c.

Improved the compiler some more. Added some limit logic on several recursive routines to prevent stack overflows when something in the compiler fails. Also added iteration counts in several places.
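
A minimal sketch of the kind of limit logic described above, in plain C. The names (Node, walk_limited, MAX_DEPTH) and the limit of 64 are hypothetical and not taken from the CC64 source; the point is only that a recursive routine carries a depth counter and gives up instead of overflowing the stack when the data structure is malformed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 64   /* hypothetical limit, not a CC64 value */

typedef struct Node {
    struct Node *left;
    struct Node *right;
} Node;

/* Visit every node reachable from n, counting visits in *visited.
   Returns false as soon as the recursion would go deeper than
   MAX_DEPTH, which also stops runaway recursion on a corrupted
   (for example cyclic) tree. */
bool walk_limited(const Node *n, int depth, int *visited)
{
    assert(visited != NULL);
    if (n == NULL)
        return true;
    if (depth >= MAX_DEPTH)
        return false;               /* bail out instead of overflowing */
    ++*visited;
    return walk_limited(n->left, depth + 1, visited)
        && walk_limited(n->right, depth + 1, visited);
}
```

The caller can treat a false return as an internal compiler error and report it rather than crash.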

_________________
Robert Finch http://www.finitron.ca


Tue Jul 20, 2021 6:37 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Modified the compiler to output local labels. When the compiler was originally written, local labels were not popular and were unknown to the author. To prevent name clashes, the compiler output the name of the translation unit as part of each label name, so labels were effectively global.

Modified the CC64 language to support power-series enums. This allows each enumerator to be a power of two, for instance. Writing “enum (2^) { a, b, c};” will assign a=1, b=2 and c=4. The enum type is limited to 16-bit values, however. This may work well with case values, which could then make use of the BBS / BCC instructions for better code density.
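
For comparison, a sketch of what the same enumeration looks like when spelled out by hand in standard C. The names Flags, FLAG_A and has_flag are made up for illustration; only the values 1, 2 and 4 come from the description above. Testing one such value is a single-bit test, which is presumably where bit-branch instructions would help.

```c
#include <assert.h>
#include <stdbool.h>

/* Hand-written equivalent of "enum (2^) { a, b, c };" in standard C. */
enum Flags {
    FLAG_A = 1 << 0,   /* 1 */
    FLAG_B = 1 << 1,   /* 2 */
    FLAG_C = 1 << 2    /* 4 */
};

/* True if the single bit named by flag is set in value. */
bool has_flag(unsigned value, enum Flags flag)
{
    /* flag must be exactly one bit */
    assert(flag != 0 && (flag & (flag - 1)) == 0);
    return (value & (unsigned)flag) != 0;
}
```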

The compiler also now makes use of ENTER and LEAVE instructions.

_________________
Robert Finch http://www.finitron.ca


Wed Jul 21, 2021 3:53 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Worked on exception handling in the compiler. When exception handling is enabled every function has a default exception handler associated with it. The default handler merely returns to the caller’s exception handler. To override the action of the default handler some grammar rules were added. It is possible to specify a default handler as in:
Code:
try int main(int argc)
{
   printf ("In main");
}
catch(char *str)
{
   printf("error is %s", str);
}

This places the try block outside of the function declaration. All functions are effectively tried by default, so the ‘try’ keyword can be omitted.

_________________
Robert Finch http://www.finitron.ca


Thu Jul 22, 2021 4:42 am

Joined: Mon Oct 07, 2019 2:41 am
Posts: 581
How do you select an exception?


Thu Jul 22, 2021 11:52 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Good question.

Exceptions can be caused by the program using the throw() keyword. For example, throw(“hello world”); will cause an exception with the type char * to be caught by a char * handler or the default catch handler. Exceptions can also be processed if caused by the system or hardware. There are potentially 256 hardware exception causes. There will be a bitmap in the application’s control block indicating which of those exceptions it wants to be notified of. The OS generally takes care of hardware exceptions, but in some cases, such as divide by zero, they are ignored. It is more of a notification that gets sent to the app if a hardware exception occurs; usually the OS will do its own processing on it. The type of a hardware exception is “exception”, which is a built-in compiler type, so a catch(exception e) is needed.
When an exception occurs, the OS must check the app’s bitmap to see if it should notify the app. If the exception is wanted, the OS digs into the app stack to find the latest exception handler; this address is at 16[$FP] in the stack. The OS then sets the app to return to the exception handler the next time the app is active.
Hopefully, the way it is set up, it will not be possible to crash the system by having a bad hardware exception handler.

Code:
try int main(int argc)
{
   int x, y;
   
   try {
      printf("In try");
      try {
         printf("try again");
      }
      catch (char ch) {
         printf("caught char");
      }
      printf("after throw");
   }
   catch (int erc) {
      printf("catch int");
   }
   catch (char ch) {
      printf("%c", ch);
   }
   catch (...) {
      printf("catch all");
   }
   try {
      printf("try 2");
      x = x + 1;
      if (y == 0)
         throw ("Divide by zero");
      x = x / y;
   }
   catch(char *str) {
      printf("%s", str);
   }
   return (x + y);
}
catch (...)
{
   printf("In default catch.");
}
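
The notification bitmap mentioned above could look something like the following in C. This is only a sketch under assumptions: the structure name AppControlBlock, the field name wanted and the helper names are hypothetical; the only detail taken from the post is that there are up to 256 hardware exception causes and one bit per cause.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HW_EXCEPTION_CAUSES 256

/* Hypothetical slice of an application control block:
   one bit per hardware exception cause. */
typedef struct {
    uint64_t wanted[HW_EXCEPTION_CAUSES / 64];
} AppControlBlock;

/* The app asks to be notified of a given cause. */
void acb_request(AppControlBlock *acb, unsigned cause)
{
    assert(cause < HW_EXCEPTION_CAUSES);
    acb->wanted[cause / 64] |= (uint64_t)1 << (cause % 64);
}

/* The OS checks this before looking for the app's handler. */
bool acb_wants(const AppControlBlock *acb, unsigned cause)
{
    assert(cause < HW_EXCEPTION_CAUSES);
    return ((acb->wanted[cause / 64] >> (cause % 64)) & 1u) != 0;
}
```

If acb_wants() returns false the OS just does its own processing and the app never sees the exception.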

_________________
Robert Finch http://www.finitron.ca


Sat Jul 24, 2021 4:05 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
It looks like I got the frame pointer offset wrong. The exception handler address is stored at 8[$FP]. When the RTL code was written 8[$FP] was used, but when I worked on the compiler I used 16[$FP] for some reason.

The ENTER instruction now zeros out the exception handler address, so that if an exception occurs right during function entry the OS won’t try to pass it off to a bad address. This assumes the OS will check that the pointer is non-null.
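
A rough model of that behaviour in C, as a sketch only. The Frame layout is an assumption: the handler slot is placed second so that it lands at offset 8 on a 64-bit target, and the first slot is assumed to hold the saved frame pointer. The function names are hypothetical and stand in for what ENTER, the compiled try block and the OS would each do.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*ExceptionHandler)(void);

/* Assumed frame layout: saved FP at 0[$FP], handler at 8[$FP]. */
typedef struct Frame {
    struct Frame    *saved_fp;
    ExceptionHandler handler;
} Frame;

/* Models ENTER: link the frame and zero the handler slot. */
void enter_frame(Frame *frame, Frame *caller)
{
    assert(frame != NULL);
    frame->saved_fp = caller;
    frame->handler  = NULL;
}

/* Models the code a try block emits to register its handler. */
void install_handler(Frame *frame, ExceptionHandler handler)
{
    assert(frame != NULL);
    frame->handler = handler;
}

/* Models the OS side: returns NULL when there is nothing safe to call. */
ExceptionHandler os_pick_handler(const Frame *frame)
{
    return (frame != NULL) ? frame->handler : NULL;
}

/* Placeholder handler used only to exercise the sketch. */
void sample_handler(void) { }
```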

_________________
Robert Finch http://www.finitron.ca


Sat Jul 24, 2021 6:58 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
The compiler has been updated to include support for RISCV. Shown below is the output for Fibonacci.

Code:
;{++
 
   bss
   align   8

.global _nums:

   fill.b   240,0x00                   
 
   code
   align   16

   .global _main
  ;====================================================
; Basic Block 0
;====================================================
_main:
  subi     $sp,$sp,32
  sd       $fp,[$sp]
  mv       $fp,$sp
  subi     $sp,$sp,32
  la       $gp,__data_start
  sd       $s0,0[$sp]
  sd       $s1,8[$sp]
  sd       $s2,16[$sp]
  sd       $s3,24[$sp]
; c1 = 0;
  mov      $s3,$x0
; c2 = 1;
  li       $s1,1
; for (n = 0; n < 23; n = n + 1) {
  mov      $s0,$x0
  li       $t5,23
  bge      $s0,$t5,.C00017
.C00016:
; if (n < 1) {
  li       $t5,1
  bge      $s0,$t5,.C00019
; nums[0] = 1;
  li       $t5,1
  sd       $t5,_nums[$gp]
; c = 1;
  li       $s2,1
  bra      .C00020
.C00019:
; nums[n] = c;
  slli     $t5,$s0,3
  la       $t6,_nums[$gp]
  add      $t5,$t5,$t6
  sd       $s2,0[$t5]
; c = c1 + c2;
  add      $s2,$s3,$s1
; c1 = c2;
  mov      $s3,$s1
; c2 = c;
  mov      $s1,$s2
.C00020:
  addi     $s0,$s0,1
  li       $t5,23
  blt      $s0,$t5,.C00016
.C00017:
.C00015:
  ld       $s0,0[$sp]
  ld       $s1,8[$sp]
  ld       $s2,16[$sp]
  ld       $s3,24[$sp]
  mv       $sp,$fp
  ld       $fp,[$sp]
  ret   


   rodata
   align   16

;--}
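
For readers following the listing, here is the C source as it can be pieced together from the comments embedded in the assembly. This is a reconstruction, not the original file: the element type (64-bit, matching the sd/ld instructions), the array length of 30 (from the 240-byte fill) and the wrapper name fill_nums are assumptions. In the original the loop appears to be the body of main.

```c
#include <assert.h>
#include <stdint.h>

#define COUNT 23

int64_t nums[30];   /* 240 bytes, as reserved in the bss section */

/* Reconstructed loop; variable names follow the listing's comments. */
void fill_nums(void)
{
    int64_t c = 0, c1, c2, n;

    assert(COUNT <= (int)(sizeof nums / sizeof nums[0]));
    c1 = 0;
    c2 = 1;
    for (n = 0; n < COUNT; n = n + 1) {
        if (n < 1) {
            nums[0] = 1;
            c = 1;
        }
        else {
            nums[n] = c;
            c = c1 + c2;
            c1 = c2;
            c2 = c;
        }
    }
}
```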

_________________
Robert Finch http://www.finitron.ca


Sat Jul 24, 2021 1:29 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Just comparing the compiler output for riscv and any1. The riscv code is almost twice the size. 37 instructions vs 21 for any1. A lot of the difference is in the prolog and epilog sequences where any1 makes use of load-store multiple and enter and leave instructions. For a larger, more realistic routine there would not be as much of a difference. I am tempted to compile the boot rom for riscv and compare.

_________________
Robert Finch http://www.finitron.ca


Sat Jul 24, 2021 5:28 pm

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1780
That's a good result!


Sat Jul 24, 2021 8:14 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Been modifying the compiler for Thor2023.

Modified the compiler to allow constants to be placed in registers. Previously the compiler did not allow this because most constants could be directly encoded in an instruction. It actually increases code size and lowers performance to place constants in registers because registers need to be loaded and stored to memory.
However, branches need to encode constants in an additional postfix, and this can be removed if frequently used constants are placed in registers for branches. So, now the compiler assigns a low priority to placing constants in registers. If the constant is used eight or ten times for instance, it may end up being placed in a register.

An issue is that common subexpression elimination takes place before it is known what the instructions are. Ideally only constants associated with branches would be placed in registers. But there is no way to know a branch instruction is going to be emitted. It could be guessed at best, so the simpler solution of low priority constants is used.
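
One way to picture the low-priority rule is as a weighting on the usage count when candidates compete for registers. The sketch below is not the compiler's actual heuristic: Candidate, allocation_priority and the divisor of 8 are hypothetical, chosen only so that a constant needs roughly eight uses before it scores as high as a variable used once.

```c
#include <assert.h>
#include <stdbool.h>

#define CONSTANT_PENALTY 8   /* hypothetical weighting */

/* A value the optimizer might keep in a register. */
typedef struct {
    int  uses;          /* how often the value is referenced */
    bool is_constant;   /* true for literal constants */
} Candidate;

/* Higher result means more deserving of a register.
   Constants are scaled down so they only win when used often. */
int allocation_priority(Candidate c)
{
    assert(c.uses >= 0);
    return c.is_constant ? c.uses / CONSTANT_PENALTY : c.uses;
}
```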

_________________
Robert Finch http://www.finitron.ca


Tue Apr 04, 2023 9:47 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
I have been busy compiling the standard C library and reviewing the output for mistakes. Code generation is looking pretty good. I have been refactoring things so there are quite a few changes to the source, but not really to the operation.
Lots of hiccups on lines like:
Code:
# *(*(short * *)(((px->ap) += (sizeof (short *) + (1U) & ~(1U))) - (sizeof (short *) + (1U) & ~(1U)))) = px->nchar;
  ldh      t1,0[s0+s8*]
  add      t1,t1,s8
  sub      t0,t1,s8
  ldh      t0,[t0]
  ldt      t0,[t0]
  ldh      t1,0[s0+s9*]
  stt      t1,[t0]


I tried comparing the code size for the sieve() program between the m68k and Thor2023.
Thor: 230 bytes
m68k: 94 bytes

Thor emits one or two fewer instructions than the 68k code, but its instructions are generally 2.5x wider: 40 bits versus 16 bits.

_________________
Robert Finch http://www.finitron.ca


Thu Apr 06, 2023 5:40 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Removed float triple support from the compiler.

Found and fixed some instructions that were not supported in the same manner as Thor2022. These were largely SETxx instructions. Broke up the code generation for function calls into smaller methods. Made some of the methods part of the generic code generation and others specific to Thor2023.

Improved the assembler’s support of instructions, fixed some encoding issues.

_________________
Robert Finch http://www.finitron.ca


Fri Apr 07, 2023 4:50 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Added compiler support for auto-inlining of functions. If a function is shorter than a threshold then it will be converted to an inline function. The default threshold is five instructions. The threshold may be set with the -finline command line option. The function is still output from the compiler even though it is also inlined, in case the function is called via a pointer, or it is externally linked.

Added float power series to enumerations. It is now possible to use a float constant to define the power increment. The enumeration will calculate the appropriate constants using float arithmetic, then assign them to integer constants. Example:
Code:
enum { X = 2 } x;
enum (*1.5) { a, b, c, g, h} Y;

int
main()
{
   return X * c;
}

The value 4 is returned: c is 1.5 * 1.5 = 2.25, which is stored as the integer 2, so X * c is 2 * 2.
Code:
_main:
  ldi      a0,4
.00011:
  rts   

Enumerations are always signed 16-bit integer values.
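
A sketch of how such a float power series might be evaluated, written in plain C. Two things are assumed here because the post does not state them: the series starts at 1.0 for the first enumerator, and the float result is truncated toward zero when it is stored in the 16-bit integer. The function name power_series_value is made up. Both assumptions are consistent with the example returning 4.

```c
#include <assert.h>
#include <stdint.h>

/* Value of the enumerator at position index (0-based) in a series
   that starts at 1.0 and multiplies by factor at each step. */
int16_t power_series_value(double factor, unsigned index)
{
    double value = 1.0;
    for (unsigned i = 0; i < index; ++i)
        value *= factor;                 /* float arithmetic */
    /* enumerations are signed 16-bit, so the result must fit */
    assert(value >= INT16_MIN && value <= INT16_MAX);
    return (int16_t)value;               /* truncates toward zero */
}
```

With a factor of 1.5 the five enumerators a, b, c, g, h come out as 1, 1, 2, 3, 5 under these assumptions.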

Added literal compression of quad precision to double or single precision values for use in float operations. If the value can be represented at a lower precision without losing precision then it will be. This is for immediate constants placed in the code. Still to do: add support for compression to half precision.
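
The usual way to test whether a literal survives narrowing is a round trip: convert down, convert back up, and compare. The sketch below shows the idea for double to single precision, since standard C has no portable quad type; the function names are hypothetical and this is not the compiler's code. NaN and out-of-range values (including infinities) are simply treated as not compressible here.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* True if value can be stored as a float with no loss. */
bool fits_in_single(double value)
{
    if (value != value)
        return false;                         /* NaN */
    if (value > FLT_MAX || value < -FLT_MAX)
        return false;                         /* out of float range */
    volatile float narrowed = (float)value;   /* force a real narrowing */
    return (double)narrowed == value;         /* exact round trip? */
}

/* Narrow a literal that is known to fit. */
float compress_to_single(double value)
{
    assert(fits_in_single(value));
    return (float)value;
}
```

A value like 0.5 or 2.25 passes, while 0.1 does not because its double and single approximations differ.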

_________________
Robert Finch http://www.finitron.ca


Tue Apr 11, 2023 3:24 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
More work on the compiler, mainly on aggregate assignments. They work much better now. I managed to get the dot (.) and brackets [] selection of aggregates working. The generated code looks much closer to working.

Got the compiler to use program counter relative addressing to read constant data. This avoids having to use a register to hold an address pointing to constant data. Unfortunately, since memory pages are 16kB in size, the displacement field of the typical load or store instruction is not large enough to reach constant data. This means a 32-bit constant postfix instruction will be output most of the time to reach this data.

_________________
Robert Finch http://www.finitron.ca


Wed Apr 12, 2023 7:38 am
Profile WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
CC64 now supports nested functions! Functions can be defined within other functions to a depth of 30. This is a feature of some languages like Pascal and C#.

Modified the inline() attribute to accept a parameter specifying the degree of inlining. If the parameter is zero the method will never be inlined. Otherwise, it will be inlined if the number of instructions in the method is less than the threshold. So, “inline(500) sieve();” will inline the sieve function if it contains fewer than 500 instructions. The count is only approximate as it includes labels and other non-instructions. A lot of code may be created if nested inline functions are used.
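
The decision rule can be summed up in a few lines of C. This is a sketch, not the compiler's code: should_inline and NO_INLINE_ATTRIBUTE are made-up names, and the fallback of five instructions is the auto-inlining default mentioned earlier in the thread (adjustable with -finline).

```c
#include <assert.h>
#include <stdbool.h>

#define DEFAULT_INLINE_THRESHOLD 5     /* auto-inlining default */
#define NO_INLINE_ATTRIBUTE      (-1)  /* hypothetical "no inline(n)" marker */

/* instruction_count is the approximate size of the function.
   inline_param is the n in inline(n), or NO_INLINE_ATTRIBUTE. */
bool should_inline(int instruction_count, int inline_param)
{
    assert(instruction_count >= 0);
    if (inline_param == 0)
        return false;                       /* inline(0): never inline */
    int threshold = (inline_param == NO_INLINE_ATTRIBUTE)
                        ? DEFAULT_INLINE_THRESHOLD
                        : inline_param;
    return instruction_count < threshold;   /* strictly fewer */
}
```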

The “bit” type and bit arrays are busted. I may drop these from the compiler.
Code:
int
main()
{
   inline(20) int foo1() {
      return 43;
   }
   int sub1(int a, int b) {
      int sub2(int c, int d) {
         return (c*d);
      }
      return (a+b);
   }
   printf("%d", foo());
   printf("%d", foo1()*8);
   sieve();
}

_________________
Robert Finch http://www.finitron.ca


Thu Apr 13, 2023 4:38 am