Author |
Message |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2205 Location: Canada
|
The C64 compiler is a 'C' derivative that should be able to compile almost any C program. It has a few extra features not found in standard C. It has evolved over a few years to support different processing cores, most recently the FT64 core. The most recent improvement is that the compiler now recognizes when it can use r1 and r2 as temporaries in a leaf function. This shortens a number of smaller routines that use r1 to return values. An example is the abs() function, which has just a single line of code (r18 is the i parameter):
Code:
_abs:
; return ((i < 0) ? -i : i);
    bge r18,r0,stdlib_3
    neg r1,r18
    bra stdlib_4
stdlib_3:
    mov r1,r18
stdlib_4:
stdlib_5:
    ret #8
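To illustrate the leaf-function test described in this post, here is a minimal sketch in C. The opcode enum and both function names are hypothetical and are not taken from the C64 sources; the only idea borrowed from the post is that a function which makes no calls can safely reuse r1 and r2 as temporaries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcode set for illustration only; not from the C64 sources. */
typedef enum { OP_MOV, OP_NEG, OP_BRANCH, OP_CALL, OP_RET } Opcode;

/* A function is a leaf if none of its instructions is a call. */
bool is_leaf_function(const Opcode *code, size_t count)
{
    assert(code != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (code[i] == OP_CALL)
            return false;
    }
    return true;
}

/* Assumed policy, following the post: in a leaf function no callee can
   overwrite r1/r2, so they may double as temporaries. */
bool may_use_return_regs_as_temps(const Opcode *code, size_t count)
{
    return is_leaf_function(code, count);
}
```

A real compiler would run a check like this over its own intermediate representation rather than a flat opcode array.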
_________________Robert Finch http://www.finitron.ca
Last edited by robfinch on Fri Jan 26, 2024 3:43 pm, edited 2 times in total.
|
Thu Jul 20, 2017 7:25 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2205 Location: Canada
|
Updated FPP to search the ‘FPPINC’ directory for files before the normal ‘INCLUDE’ directory. The problem was unintentionally reading MS include files when building the system software. A private include directory path was required. FPP is the pre-processor for the compiler which handles all the '#' directives.
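The search order can be sketched as follows. This is not FPP's actual code: the function name is hypothetical, and it assumes that 'FPPINC' and 'INCLUDE' each resolve to a single directory string (for example, read from environment variables of those names).

```c
#include <assert.h>
#include <stddef.h>

/* Fills out[] with the directories to search, in order: the private FPPINC
   directory first, then the normal INCLUDE directory. NULL or empty entries
   are skipped. Returns the number of directories stored (0 to 2). */
size_t include_search_order(const char *fppinc, const char *include,
                            const char *out[2])
{
    size_t n = 0;
    assert(out != NULL);
    if (fppinc != NULL && fppinc[0] != '\0')
        out[n++] = fppinc;
    if (include != NULL && include[0] != '\0')
        out[n++] = include;
    return n;
}
```

A caller would try to open the requested header in each returned directory in turn and stop at the first hit, so a private copy shadows a same-named file in the MS include directory.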
_________________Robert Finch http://www.finitron.ca
|
Fri Jul 21, 2017 3:17 am |
|
|
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1803
|
Hi Rob,
I take it that's this code: https://github.com/robfinch/Cores/tree/ ... ftware/C64
but it looks like you have several versions of both C64 and FPP in your repo - one for each core? Looks like C64 is in Visual C++ and FPP is in C. I suppose A64 is the assembler for each core - is that needed by your C compiler? And I see E64 in some cases, but I'm not sure what it might be.
Interestingly, we've just added a preprocessor to our OPC series, and we've used filepp which is a simple preprocessor written in perl. Since picking that, we've noticed gpp which is a rather general preprocessor capable of acting somewhat like cpp but with many different optional behaviours.
|
Fri Jul 21, 2017 3:33 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2205 Location: Canada
|
Quote:
but it looks like you have several versions of both C64 and FPP in your repo - one for each core? Looks like C64 is in Visual C++ and FPP is in C. I suppose A64 is the assembler for each core - is that needed by your C compiler?
Yes, there is a separate version for each core. At one time I was trying to maintain a single version with support for multiple cores, but it's a lot of work. The most recent set of tools is under the FT64 folder. They only work properly for FT64, however. A64 might still work with other cores. I keep the backend of the assembler in a separate file for each core. I used to do the same for the compiler, but I found there were switch statements all over the place to accommodate different cores. I'm using replication in part as a means of backing up the tools, borrowing an idea from nature. I don't want to destroy the existing working software, so I replicate the toolset for a new core and then modify it to suit, without going back and updating older software. But it results in a lot of duplicates and outdated software.
A64 is needed to assemble the output of the compiler. E64 is a software emulator for FT64 (not working yet). L64 is a simple linker (but it hasn't been used in a while and is seriously out of date).
I studied how gcc works with its backends for different processors, but I'm not sure that it'd be any less work to develop for a new core than simply replicating an existing compiler. I'm not fond of the giant master control program idea. gcc does have other benefits.
Quote:
Interestingly, we've just added a preprocessor to our OPC series, and we've used filepp which is a simple preprocessor written in perl. Since picking that, we've noticed gpp which is a rather general preprocessor capable of acting somewhat like cpp but with many different optional behaviours.
To each his own. It's good to be able to reuse existing software. I haven't been able to get a Unix-like interface working reliably on my Windows workstation, which is one reason I've avoided working with some toolsets.
_________________Robert Finch http://www.finitron.ca
|
Fri Jul 21, 2017 10:14 am |
|
|
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1803
|
Thanks for the detail. I don't think I'd spotted E64. I do see the problem with trying to support multiple core designs. In fact this is exactly why we've felt the need for a preprocessor, but time will tell whether we have a maintainable approach (we have three somewhat similar cores at present.)
|
Fri Jul 21, 2017 10:29 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2205 Location: Canada
|
The compiler has been modified to accept branch hints in 'if' statements. The 'if' expression can now optionally take a second constant expression to specify the branch prediction. So an 'if' statement with a statically predicted taken branch would look like:
Code:
if (a < 10; 1)
    ...
<other code>
1 is predicted taken, 0 is predicted not-taken. Leaving the second expression out results in a dynamically determined branch prediction.
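For comparison, standard C has no such syntax. A roughly similar static hint can be written with the GCC/Clang extension __builtin_expect, as in the sketch below. The macro and function names are made up for illustration, and this is an analogy rather than a description of how C64 implements its hint.

```c
/* __builtin_expect is a GCC/Clang extension; elsewhere the macros reduce to
   the plain condition and no hint is given. */
#if defined(__GNUC__) || defined(__clang__)
#define PREDICT_TAKEN(cond)     __builtin_expect(!!(cond), 1)
#define PREDICT_NOT_TAKEN(cond) __builtin_expect(!!(cond), 0)
#else
#define PREDICT_TAKEN(cond)     (cond)
#define PREDICT_NOT_TAKEN(cond) (cond)
#endif

/* Roughly the counterpart of C64's "if (a < 10; 1)": the condition is
   expected to be true most of the time. */
int clamp_small(int a)
{
    if (PREDICT_TAKEN(a < 10))
        return a;
    return 10;
}
```

The hint only influences code layout and prediction; the result of the function is the same with or without it.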
_________________Robert Finch http://www.finitron.ca
|
Fri Jul 28, 2017 4:17 pm |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2205 Location: Canada
|
A C64 compiler for OPC5 is in the works. It is stored under the software/C64 - OPC5 folder on my GitHub account.
_________________Robert Finch http://www.finitron.ca
|
Sun Jul 30, 2017 12:05 am |
|
|
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1803
|
Very exciting! And thanks of course for finding OPC5 sufficiently intriguing.
|
Sun Jul 30, 2017 5:55 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2205 Location: Canada
|
I just wrote a long post and it timed out when I went to post. Compiler output for aggregate assignments (not quite working yet):
Code:
typedef struct A { int a,b; } A;
int*f1(){
  A x[2]={{1,2},{3,4}};
  return g(&x[1].a); // { dg-warning "returns address of local variable" }
}
Compiler outputs:
Code:
public code _f1:
    sub r14,r0,4
    sto r13,r14,0
    sto r12,r14,2
    mov r12,r14,0
    sub r14,r0,12
# A x[2]={{1,2},{3,4}};
    lea r5,r12,-12
    mov r6,r5
    mov r7,r0,3
    sto r7,r6,0
# return g(&x[1].a); // { dg-warning "returns address of local variable" }
    mov r5,r0,6
    lea r6,r12,-12
    add r5,r6,0
    sub r14,r0,2
    sto r5,r14,0
    mov r13,r15,2
    mov r15,r0,_g
    add r14,r0,2
addrtmp2_4:
    mov r14,r12,0
    ld r13,r14,0
    ld r12,r14,2
    add r14,r0,4
    mov r15,r13,0
endpublic
_________________Robert Finch http://www.finitron.ca
|
Mon Jul 31, 2017 7:39 am |
|
|
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1803
|
Hi Rob,
Amazing to see compiler output for OPC5!
I don't think I've fully digested the code, but I notice a few things:
- our preference, or convention, is to use r1-r4 for three purposes: passing parameters in, as scratch during a routine, and returning results. With this convention, simpler subroutines would be able to avoid any stack allocation.
- I see 'lea' which we don't have - probably this is a 'mov'?
- I think the trailing zero can be omitted, and depending on the assembler it might be that you need to do that, to get the one-word form which will be smaller and maybe faster.
- If you move up to the OPC6 instruction set, some or all of your 'sub' will become 'dec' which again may be smaller and faster.
- I think possibly your stack adjustments are assuming a byte addressed stack? If so, they should be halved.
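The last point amounts to a unit conversion, sketched below. The sketch assumes a word-addressed stack with 16-bit (2-byte) words, which is consistent with the suggestion to halve the adjustments; the constant and function name are illustrative.

```c
/* Assumption: the stack is word addressed and a word is 2 bytes wide. */
enum { BYTES_PER_WORD = 2 };

/* Convert a frame size computed in bytes to a size in words, rounding up so
   that an odd byte count still gets a whole word. */
unsigned bytes_to_words(unsigned byte_count)
{
    return (byte_count + BYTES_PER_WORD - 1) / BYTES_PER_WORD;
}
```

Applied to the listing above, the 4-byte save area becomes 2 words and the 12-byte local area becomes 6 words.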
|
Mon Jul 31, 2017 4:05 pm |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2205 Location: Canada
|
I've fixed a number of things since the last post. I changed the 'lea' to an add/mov operation. I also fixed the stack adjustments for a word-oriented machine. As for the trailing zeros, I was assuming the assembler might not like it if they were omitted. Aggregate assignments work much better now.
Code:
_f1:
    sub r14,r0,2
    sto r13,r14,0
    sto r12,r14,1
    mov r12,r14,0
    sub r14,r0,6
# A x[2]={{c*8,2},{3,4*c}};
    mov r5,r0,0
    add r5,r12,-6
    mov r6,r5
    ld r7,r12,2
    sto r7,r12,2
    ldb r7,r12,2
    mov r8,r0,8
    mov r1,r7,0
    mov r2,r8,0
    mov r13,r15,2
    mov r15,r0,_mul
    sto r1,r6,0
    mov r7,r0,2
    sto r7,r6,1
    mov r7,r0,3
    sto r7,r6,3
    ld r7,r12,2
    sto r7,r12,2
    ldb r7,r12,2
    mov r8,r0,4
    mov r1,r7,0
    mov r2,r8,0
    mov r13,r15,2
    mov r15,r0,_mul
    sto r1,r6,4
# return g(&x[1].a); // { dg-warning "returns address of local variable" }
    mov r5,r0,3
    mov r6,r0,0
    add r6,r12,-6
    add r5,r6,0
    sub r14,r0,1
    sto r5,r14,0
    mov r13,r15,2
    mov r15,r0,_g
    add r14,r0,1
addrtmp2_4:
    mov r14,r12,0
    ld r13,r14,0
    ld r12,r14,1
    add r14,r0,2
    mov r15,r13,0
Still a couple of bugs, but closer. The compiler croaked on another gcc test for a complicated assignment, but I'm not going to worry about it because the MSC compiler also croaked in the same way.
_________________Robert Finch http://www.finitron.ca
|
Tue Aug 01, 2017 4:04 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2205 Location: Canada
|
Adding a vector type to C64 (but not for OPC5). Vectors are assumed to be 512 bytes in size (64 elements of 64 bits each), so they eat up memory space really fast. A variable can be declared as an 'int vector' or 'float vector', and the compiler will use vector registers and operations with the variable.
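The size arithmetic can be checked with a plain C model of such a vector. This is only a sketch in standard C11: the struct and function names are hypothetical, and it does not use the C64 'int vector' syntax or real vector registers.

```c
#include <stddef.h>
#include <stdint.h>

enum { VECTOR_ELEMENTS = 64 };

/* Plain C stand-in for an 'int vector': 64 elements of 64 bits each. */
typedef struct {
    int64_t elem[VECTOR_ELEMENTS];
} IntVector;

/* 64 elements * 8 bytes = 512 bytes, matching the size stated in the post. */
_Static_assert(sizeof(IntVector) == 512, "IntVector should occupy 512 bytes");

/* Element-wise add, written as a scalar loop; a vector machine would do this
   with a single vector operation. */
IntVector int_vector_add(const IntVector *a, const IntVector *b)
{
    IntVector r;
    for (size_t i = 0; i < VECTOR_ELEMENTS; i++)
        r.elem[i] = a->elem[i] + b->elem[i];
    return r;
}
```

At 512 bytes each, just eight such variables already take 4 KB, which is why they use up memory quickly.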
_________________Robert Finch http://www.finitron.ca
|
Tue Aug 01, 2017 4:08 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2205 Location: Canada
|
The last example used stack space for variables. This example shows that the compiler doesn't allocate and link the stack if it doesn't need to. Note that in order to have parameters passed in registers, the register keyword has to be used.
Code:
int abs(register int a) { return a < 0 ? -a : a; }
int min(register int a, register int b) { return a < b ? a : b; }
int max(register int a, register int b) { return a > b ? a : b; }
unsigned int minu(register unsigned int a, register unsigned int b) { return a < b ? a : b; }
Code:
    code
_abs:
# return a < 0 ? -a : a;
    cmp r8,r0,0
    pl.mov r15,r0,TestAbs_4
TestAbs_6:
    not r1,r8,0
    add r1,r0,1
    mov r15,r0,TestAbs_5
TestAbs_4:
    mov r1,r8
TestAbs_5:
TestAbs_7:
    mov r15,r13,0
_min:
# return a < b ? a : b;
    cmp r8,r9,0
    pl.mov r15,r0,TestAbs_11
TestAbs_13:
    mov r1,r8
    mov r15,r0,TestAbs_12
TestAbs_11:
    mov r1,r9
TestAbs_12:
TestAbs_14:
    mov r15,r13,0
_max:
# return a > b ? a : b;
    cmp r8,r9,0
    mi.mov r15,r0,TestAbs_18
    z.mov r15,r0,TestAbs_18
    mov r1,r8
    mov r15,r0,TestAbs_19
TestAbs_18:
    mov r1,r9
TestAbs_19:
TestAbs_20:
    mov r15,r13,0
_minu:
# return a < b ? a : b;
    cmp r8,r9,0
    nc.mov r15,r0,TestAbs_24
TestAbs_26:
    mov r1,r8
    mov r15,r0,TestAbs_25
TestAbs_24:
    mov r1,r9
TestAbs_25:
TestAbs_27:
    mov r15,r13,0
I still have to figure out how to use r1-r4 as parameters, temporaries and return values at the same time. If a register is used as a parameter it has to be flagged as not available as a temporary because the value might be needed later in the function. It's easy to code by hand in assembler but not so simple for the compiler.
_________________Robert Finch http://www.finitron.ca
|
Tue Aug 01, 2017 4:57 am |
|
|
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1803
|
robfinch wrote:
I still have to figure out how to use r1-r4 as parameters, temporaries and return values at the same time. If a register is used as a parameter it has to be flagged as not available as a temporary because the value might be needed later in the function. It's easy to code by hand in assembler but not so simple for the compiler.
Ah, of course - an interesting one - I know almost nothing of the innards of a compiler. Perhaps condensing the three uses into fewer registers is something which can be done after the function is fully captured in a suitable data structure.
|
Tue Aug 01, 2017 5:31 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2205 Location: Canada
|
Quote:
- our preference, or convention, is to use r1-r4 for three purposes: passing parameters in, as scratch during a routine, and returning results. With this convention, simpler subroutines would be able to avoid any stack allocation.
I think it's going to be too difficult for the compiler to manage the use of r1 to r4. If one looks at something like RISC-V, they have the use of registers as return values, function arguments and temporaries as separate register ranges. From the RISC-V spec:
Quote:
Register  ABI name  Description         Saver
x16–17    v0–1      Return values       Caller
x18–25    a0–7      Function arguments  Caller
x26–30    t0–4      Temporaries         Caller
I don't know if gcc can handle it. It's still possible for the compiler to avoid stack allocations by using registers. I'd suggest using only r1 and r2 as return values or temporaries (for compiled code) and using two other ranges of registers for parameters and temporaries. Right now C64 uses r8 to r10 as parameters and r5 to r7 (plus r1 and r2 sometimes) as temporaries. It might be desirable to leave a couple of registers unassigned and unused by the compiler (r3 and r4 for scratch use).
Quote:
Ah, of course - an interesting one - I know almost nothing of the innards of a compiler. Perhaps condensing the three uses into fewer registers is something which can be done after the function is fully captured in a suitable data structure.
It probably could be done, but then one wouldn't know for sure which registers the compiler chose to optimize. Suppose it can't optimize the use of a parameter register. Then which register is a parameter register would depend on the function called. That'd make it difficult to interface to hand-written assembler code.
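The fixed assignment suggested in this post can be written down as a small lookup, which is what makes it easy to interface with hand-written assembler. The enum and function below are a hypothetical sketch of that suggestion for a 16-register machine; registers the post does not assign (r0 and r11 to r15) are simply reported as "other".

```c
#include <assert.h>

typedef enum {
    ROLE_OTHER,              /* not assigned a role in the post */
    ROLE_RETURN_OR_TEMP,     /* r1, r2 */
    ROLE_SCRATCH_UNASSIGNED, /* r3, r4: left unused by the compiler */
    ROLE_TEMP,               /* r5 to r7 */
    ROLE_PARAM               /* r8 to r10 */
} RegRole;

/* Map a register number (0 to 15) to its role under the suggested convention. */
RegRole reg_role(int r)
{
    assert(r >= 0 && r <= 15);
    if (r == 1 || r == 2)
        return ROLE_RETURN_OR_TEMP;
    if (r == 3 || r == 4)
        return ROLE_SCRATCH_UNASSIGNED;
    if (r >= 5 && r <= 7)
        return ROLE_TEMP;
    if (r >= 8 && r <= 10)
        return ROLE_PARAM;
    return ROLE_OTHER;
}
```

Because the mapping does not depend on the function being compiled, an assembler routine can rely on its first parameter always arriving in r8.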
_________________Robert Finch http://www.finitron.ca
|
Tue Aug 01, 2017 5:48 am |
|