View unanswered posts | View active topics It is currently Thu Mar 28, 2024 12:49 pm



Reply to topic  [ 82 posts ]  Go to page Previous  1, 2, 3, 4, 5, 6  Next
 CC64 / ARPL Compiler 
Author Message

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
The compiler continues to evolve. Color graphing still doesn’t work yet but a lot of code is written. To do are the insertions of spill and load code. I’m not sure I entirely understand parts of it so there was some guesswork involved. I’ve found the pseudo-code in the Briggs thesis helpful. Just detailed enough to get the idea.

Declare() is a monster function comprised of several hundred LOC used in declaration parsing. It does a wide variety of miscellaneous things required for parsing the declaration. So, I’m trying to break it up into several smaller functions to make it more manageable.
Been rearranging (refactoring?) the code somewhat today. Organizing parts of the parsing, optimization, and code generation to be underneath other classes. I’m not sure this is a wise choice. For instance rather than have a file called ParseEverything, parsing is broken up and falls under expression processing (ENODE.cpp), statement (Statement.cpp) processing and declaration parsing. This is based on the primary object that the parsing method is working with. It’s a matter of what’s the most convenient to work with. It was convenient for instance to have all the optimization code in one file.

_________________
Robert Finch http://www.finitron.ca


Tue Aug 28, 2018 3:40 am
Profile WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Global variables are slowly being moved so they are class members. Currently the compiler optimizes and generates code for functions one at a time as it’s parsing declarations. An eventual goal is to make it possible to do whole program optimizations. There are a lot of global variables that could really just be members of the ‘Function’ class or other classes.
Right now, the compiler uses a ton of memory (approx. 20MB) just to compile in it’s function by function basis. But with a machine with gigabytes of ram perhaps some more state could be retained in the compiler for whole program optimizations.

_________________
Robert Finch http://www.finitron.ca


Wed Aug 29, 2018 5:08 am
Profile WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Played around with some of the CC64 language constructs. CC64 allows use of the ‘then’ keyword in an if statement. It now accepts a syntax where the brackets around the if expression are not required if a ‘then’ is used.
A do-once loop construct was invented by allowing the ‘while’ at the end of a do loop to be omitted. In that case the body of the loop will be executed only a single time, unless… there’s a continue statement within the body of the do statement. Continue will cause a loop back to the top. Inserting a break statement causes the do to be broken as per usual operation. All statements are effectively ‘do’ (done) statements except the ‘do’ is missing.

So the following code prints “hello” only one time.
Code:
do {
        printf(“Hello”);
} // no while

OR

do printf(“Hello”);

Using a ‘do’ increases the loop depth tracked by the compiler. It may be used to help prioritize register allocations, giving a boost in priority to operations performed within the ‘do’.
Note that following a do-once loop with the while loop is confusing. The ‘while’ keyword will associate with an immediately previous ‘do’ statement because it looks like a ‘do-while’ statement to the compiler. So, to follow a do-once with a while loop an extra statement separator (’;’) is required.

I've been having fun debugging the register allocator. Something isn't working quite right yet. I suspect it has to do with the interference graph. There isn't enough interferences so the allocator uses too few registers.

_________________
Robert Finch http://www.finitron.ca


Sat Sep 01, 2018 3:49 am
Profile WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Slowly ironing out the bugs in the color graphing register allocator.
A problem that held me up for quite a while was iterating backwards through the basic blocks, a loop termination condition wasn’t correct so only a single instruction was examined out of the block. It was a single character typo difficult to spot by inspection. It was also difficult to find by stepping through code.
I finally got the register allocator to assign different colors to different live ranges. Another problem was clearing the set of used colors was being done inside a loop and it should have been outside.
Yet another problem was reconsidering already colored ranges in subsequent passes of the colorizer. Too much state was effectively reset.
I didn’t implement the renumber stage of allocator. Instead of going by a set of values, the virtual registers numbers are used directly. This takes significantly more memory in the interference graph which then has a lot of unused space. However, the amount of memory in use (20MB) is still small compared to the amount of memory in the workstation. I’ve found in recent times that I have a tendency to use memory hungry algorithms that are simplified versions of final ones.

It’s almost working. I still have to polish off the spill and fill code.
I’ve noticed the allocator does a ton of stores and loads in larger routines. I’m not sure it’s worth the extra memory overhead. There likely is something wrong with the code yet. Compiling the code also seems sluggish compared to the simpler allocator. Still it only takes less than a second to compile a file.
I have an ulterior motive for getting color graphing working. If it works well enough maybe a machine with only 16 regs could be in the works.

_________________
Robert Finch http://www.finitron.ca


Sun Sep 02, 2018 2:49 am
Profile WWW

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1780
Interesting - it feels like register allocation was a very hot topic at one point, although maybe the best known approaches have stabilised now. As you say, with improved compiler technology, maybe a simpler/cheaper/smaller CPU will suffice.


Sun Sep 02, 2018 5:33 am
Profile

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
I have come. I have conquered. The register allocator was one thing I wanted to understand, and I think I have a pretty good handle on it now. I await for a GC book to arrive in the mail. I follow comp.arch. One of the latest topics there is superconducting computers. I wonder if they could be implemented using room-temperature super-conductors.

_________________
Robert Finch http://www.finitron.ca


Sun Sep 02, 2018 7:34 am
Profile WWW

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1780
I realise there's another point lurking in your adventure: if you make a CPU with too many resources for the compiler to use, you may have something more complex/large/costly than you should have. At least, until the compiler catches up.


Sun Sep 02, 2018 7:36 am
Profile

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Labels on the mind tonight.
Creating the control flow graph has to find labels. In my simple implementation it calls a routine called ‘FindLabel’ which searches the entire peep list of instructions for the label. Well that’s inefficient and slow. So, I switched it to looking up the label in a table of labels which uses the label number as it’s index. This is just directly reading a single word without having to search. The cost is the label table has to be built, but it turns out there’s a place to do this when the optimizer is checking for unreferenced labels.
Upgraded the O(n^2) algorithm for removal of unreferenced labels to O(n). When I initially wrote it I didn’t worry too much about it as functions are generally small <100 LOC.
I’ve made several improvements to the compiler but I’ve yet to notice much difference in speed. It’s likely limited by loading of the file to be compiled.

_________________
Robert Finch http://www.finitron.ca


Mon Sep 03, 2018 3:19 am
Profile WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Tonight’s issue was that I created a do-once construct but didn’t update the optimizer to process it. This left a block of un-optimized code in the output. It took a bit of tracing to track down.

_________________
Robert Finch http://www.finitron.ca


Fri Sep 07, 2018 3:19 am
Profile WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Changed the expression parser so that it places immediate operands as the second operand (swaps order if necessary) for multiply and add operations. This allows the compiler to use immediate mode rather than loading a value into a register first. This is a very simple optimization which is done at parse time and so available even if optimization is turned off. In theory the optimizer already does this swapping around but it seems to sometimes miss opportunities depending on the order of expression nodes.

I put some time into the eventual creation of an LLVM version.

_________________
Robert Finch http://www.finitron.ca


Sat Sep 08, 2018 2:17 am
Profile WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Spent much of the last two days (excepting a barbeque get-together, and some other mudane tasks) trying to get cmake to recognize commands. This is needed to build LLVM. I started working on table definitions for FT64. LLVM seems fairly reasonable to learn.

Continuing on with CC64: I loaded the instruction table into Excel, sorted it, then copied it back to the source code. I found several duplicated opcodes as a result. Sorting the opcode table is in preparation for being able to parse assembly language text. Looking up the mnemonics is going to use a simple binary search as opposed to a trie search since the mnemonics are already available to be used for output.

_________________
Robert Finch http://www.finitron.ca


Sun Sep 09, 2018 9:12 pm
Profile WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Ported the compiler for RISCV compatible output.

The compiler now detects constants larger than 32 bits and outputs them to a literal array in read-only memory space, for loading with a load instruction into a temp register. Previously the compiler just output the 64-bit constant as an immediate value and left it up to the assembler to determine what to do. The assembler couldn’t handle constants over 32 bits. RISCV doesn’t have a direct means to use large constants, so there are several instructions and registers involved to load and shift the constant into place. It’s probably just as fast to load the constant from a table. As constants over 32 bits are rare the table is limited to the small size of 100 entries per module.

_________________
Robert Finch http://www.finitron.ca


Sun Feb 16, 2020 3:31 am
Profile WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Ported the compiler over for RTF64. It is being fixed up based on a test suite found here: https://github.com/c-testsuite/c-testsuite

It passes about 75% of the tests. Most of the test that fail are due to things not being supported or not available yet. For instance the stdint.h file is needed by some of the tests and CC64 does not yet have a version of this file. Another thing not supported by the compiler is the _Generic() operation that is newer in C.

The compiler does not handle function pointers very well. It is easy to come up with cases that cause it to spew out error messages. Since function pointers are rarely used I do not know if I will fix this or not. I believe the compiler does handle the original C style function pointers.

Just eyeballing the compiler generated code because the RTF64 processor isn't working in an FPGA yet.

_________________
Robert Finch http://www.finitron.ca


Wed Nov 11, 2020 3:30 am
Profile WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Spent too much of the day trying to fix the following. For f1 the compiler is associating (c and b) as the parameters not a, b. This causes an undefined symbol error on 'a' which leads to a bunch of other errors. I am not sure I want to try and fix many of the things that are legal to do in C that nobody ever does.


Code:
     1   int
     2   f2(int c, int b)
     3   {
     4   return c - b;
     5   }
     6   


*** local symbol table ***




     7   int (*
     8   f1(int a, int b))(int c, int b)
     9   {
    10   if (a != b)
 *** error 20: E Expression expected
 *** error 6: E Bad punctuation
 *** error 6: E Bad punctuation
 *** error 20: E Expression expected
 *** error 6: E Bad punctuation
 *** error 20: E Expression expected
 *** error 4: E Undefined symbol
    11   return f2;
 *** error 6: E Bad punctuation
    12   return 0;
    13   }
    14   


*** local symbol table ***




    15   int
    16   main()
    17   {
    18   int (* (*p)(int a, int b))(int c, int d) = f1;
 *** error 0: E Syntax error
    19   
    20   
    21   return (*(*p)(0, 2))(2, 2);
    22   }
    23   

_________________
Robert Finch http://www.finitron.ca


Sat Jul 17, 2021 4:04 am
Profile WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
This is the current work in progress, getting the compiler to correctly compile test case #T00124 . It is a little further along than a couple of days ago. What is missing is the generation of the function f1. The compiler says “aha” a function pointer so it does not expect a function body to follow, and skips spitting it out. It is interesting because of the way the pointer is declared. Since f1 is just a pointer to a func. an anonymous function is needed to set f1 to point to. In main() the f1 pointer is used to initialized another pointer. The code generated looks like it could be right even though a dereference error is reported.

Code:
     1   int
     2   f2(int c, int b)
     3   {
     4   return c - b;
     5   }
     6   


*** local symbol table ***




     7   int (*
     8   f1(int a, int b))(int c, int b)
     9   {
    10   if (a != b)
    11   return f2;
    12   return 0;
    13   }
    14   
    15   int
    16   main()
    17   {
    18   int (* (*p)(int a, int b))(int c, int d) = f1;
    19   
    20   
    21   return (*(*p)(0, 2))(2, 2);
 *** error 18: E Dereference
    22   }
    23   


*** local symbol table ***

13 _p         =fffffff8   -    Auto        Pointer to Long
13 _d         =fffffff8   -    Auto        Pointer to Function returning Pointer to Long



 

 *** global scope typedef symbol table ***

29 _main      =000018   -    Global      Function returning Long
      Parameters:
         Type array:
   
Stack Space:
      Argbot: 0
      Tmpbot: -8
      Stkspc: 8
      28 __new      =000000   -   
28 __delete   =000000   -   
13 _p         =fffffff8   -    Auto        Pointer to Function returning Pointer to Long
28 __autonew  =000000   -   
7 _b         =000010   -    Global      Long
7 _c         =000008   -    Global      Long
13 _f1        =000000   -    Global      Pointer to Function returning Long
29 _f2        =000000   -    Global      Function returning Long
      Parameters:
         Type array:
   007 007
Stack Space:
      Argbot: 0
      Tmpbot: 0
      Stkspc: 0
      
 *** structures and unions ***


Generated code:
Code:
   code
   align   16
;====================================================
; Basic Block 0
;====================================================
public code _f2:
  link     #32
  ldo      $t1,16[$fp]
  ldo      $t2,24[$fp]
  sub      $a0,$t1,$t2
T00124_8:
  unlink
  add      $sp,$sp,#16
  ret   
endpublic



       bss
   align   8
public bss _f1:
   fill.b   8,0x00                   

endpublic
    align   2
public bss _c:
   fill.b   8,0x00                   

endpublic
    align   2
public bss _b:
   fill.b   8,0x00                   

endpublic
    code
   align   16
          code
   align   16
;====================================================
; Basic Block 0
;====================================================
public code _main:
  sub      $sp,$sp,#40
  lea      $gp,__data_start
  sto      $s0,0[$sp]
; int (* (*p)(int a, int b))(int c, int d) = f1;
  ldo      $s0,_f1[$gp]
; return (*(*p)(0, 2))(2, 2);
  ldo      $t0,[$s0]
  push     $x0
  ldi      $t1,#2
  push     $t1
  jal      $ra,[$t0]
  mov      $t0,$a0
  ldi      $t1,#2
  push     $t1
  ldi      $t1,#2
  push     $t1
  jal      $ra,[$t0]
  mov      $t0,$a0
  mov      $a0,$t0
T00124_13:
T00124_16:
  ldo      $s0,0[$sp]
  add      $sp,$sp,#40
  ret   
endpublic



   rodata
   align   16
;   global   _main
;   global   _b
;   global   _c
;   global   _f1
;   global   _f2

_________________
Robert Finch http://www.finitron.ca


Sun Jul 18, 2021 6:16 am
Profile WWW
Display posts from previous:  Sort by  
Reply to topic   [ 82 posts ]  Go to page Previous  1, 2, 3, 4, 5, 6  Next

Who is online

Users browsing this forum: No registered users and 1 guest


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
cron
Powered by phpBB® Forum Software © phpBB Group
Designed by ST Software