


 [ 77 posts ]  Go to page Previous  1, 2, 3, 4, 5, 6  Next
 Porting the C64 compiler to target OPC5 (or OPC6) 
Author Message

Joined: Tue Jan 15, 2013 10:11 am
Posts: 114
Location: Norway/Japan
robfinch wrote:
'C' does not guarantee the order in which the auto-inc and auto-dec operations take place. Different compilers support them differently.
If one does something like p[ndx++] the index may be updated either before or after it's used to index the element depending on the compiler.
This is not correct, as far as I know. N++ and ++N are well defined.
Code:
if (a++) /* Test, then increment */
if (++a) /* Increment, then test */

If the sequence were undetermined it would be impossible to do things like the above, or this snippet from BSD strlcpy:
Code:
        while (--n != 0){
            if ((*d++ = *s++) == '\0')
                break;
        }


Last edited by Tor on Fri Aug 11, 2017 8:56 am, edited 1 time in total.



Fri Aug 11, 2017 8:52 am

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1796
robfinch wrote:
If one does something like p[ndx++] the index may be updated either before or after it's used to index the element depending on the compiler.

Are you absolutely sure? Although there are difficulties with the semantics of pre and post increment and decrement, I don't think this is one of them.

Edit: oops, I see I failed to notice the reply from Tor.


Fri Aug 11, 2017 8:53 am

Joined: Tue Feb 10, 2015 7:07 am
Posts: 52
robfinch wrote:
Okay, I have updated the compiler. I got a little side-tracked trying to add a graph-coloring register allocator. It's just a bit too complex for me to understand :) So I've been researching and scratching my head.

Excellent, I'll continue testing the Pi program this morning.

Current code is here:
https://github.com/hoglet67/opc/tree/c_ ... i-spigot-c

Are you happy for me to continue to report issues here?
robfinch wrote:
'C' does not guarantee the order in which the auto-inc and auto-dec operations take place. Different compilers support them differently.
If one does something like p[ndx++] the index may be updated either before or after it's used to index the element depending on the compiler. I think not even brackets are respected.
When I found this out I started writing all the inc/dec operations as separate instructions with a semi-colon. So I write
p[ndx]; ndx++;
to guarantee the order and it's portable between compilers. (Yes I got burned on this issue before).

My understanding is the opposite.

*ptr++ is evaluated as *(ptr++) because the precedence of the postfix increment operator is higher than the dereference operator.

ptr++ should be evaluated as ptr, and the increment should happen afterwards.

This is what section 6.5.2.4 of the ISO C standard (C99) says:
Quote:
2 The result of the postfix ++ operator is the value of the operand. After the result is
obtained, the value of the operand is incremented. (That is, the value 1 of the appropriate
type is added to it.) See the discussions of additive operators and compound assignment
for information on constraints, types, and conversions and the effects of operations on
pointers. The side effect of updating the stored value of the operand shall occur between
the previous and the next sequence point.

http://www.open-std.org/jtc1/sc22/wg14/ ... df#page=87

This is consistent with the C89 version:
http://port70.net/~nsz/c/c89/c89-draft.html#3.3.2.4

Section 2.8 of K&R (2nd Edition) (page 46) says the same. As does the 1st Edition (archive.org).

There may be some compilers that get this wrong, but it seems pretty clear the compiler would be at fault, rather than the spec being ambiguous.
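As a rough illustration of the semantics quoted above, here is a minimal sketch in C. The function names copy_string and read_and_advance are made up for this example and are not from the Pi program or the compiler. It only shows that a postfix ++ yields the old value, with the update landing before the next sequence point.

```c
#include <assert.h>
#include <stddef.h>

/* Copy src into dst with the classic *d++ = *s++ idiom.
 * Each postfix ++ yields the old pointer, which is then dereferenced.
 * Returns the number of characters copied, not counting the terminator.
 * Assumes dst is large enough (illustration only). */
size_t copy_string(char *dst, const char *src)
{
    const char *start = src;

    assert(dst != NULL && src != NULL);
    while ((*dst++ = *src++) != '\0')
        ;                               /* all the work is in the condition */
    return (size_t)(src - start) - 1;   /* src has stepped past the '\0' */
}

/* p[ndx++]: the element at the OLD index is read, then the index is bumped. */
int read_and_advance(const int *p, int *ndx)
{
    assert(p != NULL && ndx != NULL);
    return p[(*ndx)++];
}
```

If a compiler under test returns the element at the new index from read_and_advance, that points at a code generation bug rather than at an ambiguity in the language.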

Dave


Fri Aug 11, 2017 9:28 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2153
Location: Canada
More on #4.

outch(*ptr++);

Currently gets evaluated like: outch (*(ptr++));

If I move the ++ later it gets evaluated like:

outch((*ptr)++);

which increments the character, probably not what is desired.

However I have to admit that most C compilers do what the person programming expects.

I can probably put a hack in to see if there's an identifier directly before the ++.
I'm not sure what'll happen with a more complex expression though.

Suppose it's: outch(*structvar.member++);

The ++ would increment the struct member then, if it's just a simple identifier check.
More head scratching. I think I can fix it. Some experimentation required.
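For reference, a small sketch of how the three forms mentioned above behave in standard C. The struct cursor type and the function names are hypothetical, invented only for this illustration. Postfix operators (++, ., ->) bind tighter than unary *, so the ++ always applies to the whole postfix expression to its left.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, only here to mirror "structvar.member". */
struct cursor {
    const char *member;
};

/* *ptr++ is *(ptr++): read the current character, then advance the pointer.
 * *advanced receives how far the pointer moved. */
char read_then_advance(const char *text, int *advanced)
{
    const char *ptr = text;
    char c;

    assert(text != NULL && advanced != NULL);
    c = *ptr++;
    *advanced = (int)(ptr - text);
    return c;
}

/* (*ptr)++ increments the character itself; the pointer does not move.
 * Returns the old character value. */
char bump_character(char *text)
{
    char *ptr = text;

    assert(text != NULL);
    return (*ptr)++;
}

/* *structvar.member++ is *((structvar.member)++): the member pointer
 * is advanced, and the character it used to point at is returned. */
char read_member(struct cursor *cur)
{
    struct cursor structvar;
    char c;

    assert(cur != NULL && cur->member != NULL);
    structvar = *cur;
    c = *structvar.member++;
    *cur = structvar;
    return c;
}
```

So a check for "identifier directly before the ++" would not be enough in general; the operand of ++ is the entire postfix expression, for example structvar.member.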

_________________
Robert Finch http://www.finitron.ca


Fri Aug 11, 2017 9:57 am WWW

Joined: Tue Feb 10, 2015 7:07 am
Posts: 52
Hi Rob,

I can understand how postfix increment might be a pain to implement. As long as we know, it can easily be avoided.

I'm continuing debugging the Pi program. Here's the next issue:

11. Array read not dereferenced properly
Code:
         x = 10 * pi[i]+ q * i;

Code:
018d  10c5 fea8                            mov     r5,r12,-344    # r5 is now the address of pi[0]
018f  0037                                 mov     r7,r3          # r3 is i
0190  0457                                 add     r7,r5          # r7 is now the address of pi[i]
0191  1005 000a                            mov     r5,r0,10
0193  0071                                 mov     r1,r7          # This should be ld r1, r7 as r7 is the address of pi[i]
0194  0052                                 mov     r2,r5
0195  190d 02e9                            jsr     r13,r0,__mul
0197  0016                                 mov     r6,r1
0198  17c5 fff9                            ld      r5,r12,-7
019a  0051                                 mov     r1,r5
019b  0032                                 mov     r2,r3
019c  190d 02e9                            jsr     r13,r0,__mul
019e  0017                                 mov     r7,r1
019f  0065                                 mov     r5,r6
01a0  0475                                 add     r5,r7
01a1  0054                                 mov     r4,r5

Dave


Fri Aug 11, 2017 10:17 am

Joined: Tue Feb 10, 2015 7:07 am
Posts: 52
And the next one.

12. Multiply by 2 not correct
Code:
int bug12() {
   int a = 123;
   int b = a * 2;
   return b;
}

Code:
_bug12:
               push    r13,r14
               push    r12,r14
               mov     r12,r14
               dec     r14,2
               mov     r5,r0,123
               sto     r5,r12,-1
               ld      r5,r12,-1
               add     r5,r0          # should be add r5, r5
               sto     r5,r12,-2
               ld      r5,r12,-2
               mov     r1,r5
               mov     r14,r12
               pop     r12,r14
               pop     r13,r14
               mov     r15,r13


Fri Aug 11, 2017 10:42 am

Joined: Tue Feb 10, 2015 7:07 am
Posts: 52
And after manually patching the generated code for (11) and (12) we get:
Code:
*go 100
@ABCD
1234
5678
3.141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117067
*

:D :D :D
Dave


Fri Aug 11, 2017 10:55 am

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1796
Hurrah - a milestone! Congrats to everyone involved!


Fri Aug 11, 2017 10:57 am

Joined: Tue Feb 10, 2015 7:07 am
Posts: 52
Here's 250 digits:
Code:
*go 100
@ABCD
1234
5678
3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348 [line breaks]
253421170679821480865132823066470938446095505822317253594081284811174502841027019385211055 [added by mod]
596446229489549303819644288109756659334461284756482337867831652712019091
*


13. Compiler crashes:

I'm hitting a couple of compiler crashes if I change the integer type from "int" to "unsigned int" or "long int":

Using "unsigned int":
Code:
./c64 pi.c
GenerateTempRegPop()/2


Using "long int":
Code:
dmb@quadhog:~/atom/opc/examples/pi-spigot-c$ ./c64 pi.c
Segmentation fault (core dumped)


And the latter in gdb:
Code:
dmb@quadhog:~/atom/opc/examples/pi-spigot-c$ gdb ./c64 core
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./c64...done.
[New LWP 6714]
Core was generated by `./c64 pi.c'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0804d02d in GenerateBinary (node=0x9154e2c, flags=33807, size=1, op=0) at CodeGenerator.cpp:906
906      ap3->isAddress = ap1->isAddress | ap2->isAddress;
(gdb) bt
#0  0x0804d02d in GenerateBinary (node=0x9154e2c, flags=33807, size=1, op=0) at CodeGenerator.cpp:906
#1  0x0805085a in GenerateExpression (node=0x9154e2c, flags=33807, size=1) at CodeGenerator.cpp:2113
#2  0x080561d2 in Statement::Generate (this=0x9154d74) at GenerateStatement.cpp:742
#3  0x08056058 in Statement::GenerateCompound (this=0x9154d14) at GenerateStatement.cpp:677
#4  0x0805612a in Statement::Generate (this=0x9154d14) at GenerateStatement.cpp:713
#5  0x08054b07 in Statement::GenerateFor (this=0x9154c5c) at GenerateStatement.cpp:254
#6  0x08056260 in Statement::Generate (this=0x9154aec) at GenerateStatement.cpp:767
#7  0x08056058 in Statement::GenerateCompound (this=0x914fa94) at GenerateStatement.cpp:677
#8  0x0805612a in Statement::Generate (this=0x914fa94) at GenerateStatement.cpp:713
#9  0x0805dec3 in GenerateFunction (sym=0x809864c <compiler+43308>) at OPC6.cpp:560
#10 0x080709d1 in ParseFunctionBody (sp=0x809864c <compiler+43308>) at ParseFunction.cpp:468
#11 0x080703a7 in ParseFunction (sp=0x809864c <compiler+43308>) at ParseFunction.cpp:336
#12 0x080665fb in Declaration::declare (parent=0x0, table=0x808c000 <gsyms>, al=2, ilc=0, ztype=28) at ParseDeclarations.cpp:1385
#13 0x08066bb6 in GlobalDeclaration::Parse (this=0x9149a44) at ParseDeclarations.cpp:1527
#14 0x08051e43 in Compiler::compile (this=0x808dd20 <compiler>) at Compiler.cpp:105
#15 0x0804a850 in main (argc=1, argv=0xbfdf53d8) at Cmain.cpp:82
(gdb) print ap1
$1 = (AMODE *) 0x915f954
(gdb) print ap2
$2 = (AMODE *) 0x0
(gdb) print ap3
$3 = (AMODE *) 0x915f93c
(gdb)
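The gdb session shows ap2 is a null pointer when line 906 dereferences it. As a sketch only, written in C rather than the compiler's C++, a guard of the following shape would turn the segfault into a reportable error. The AMODE type here is a hypothetical stand-in with a single field; the real structure and the real fix in CodeGenerator.cpp may look quite different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the compiler's AMODE; only isAddress is modeled. */
typedef struct amode {
    int isAddress;
} AMODE;

/* Combine the isAddress flags of two operands into the result operand.
 * Returns 0 on success, or -1 if any operand is missing, so the caller
 * can report an internal error instead of crashing. */
int combine_is_address(AMODE *ap3, const AMODE *ap1, const AMODE *ap2)
{
    if (ap1 == NULL || ap2 == NULL || ap3 == NULL)
        return -1;
    ap3->isAddress = ap1->isAddress | ap2->isAddress;
    return 0;
}
```

This only hides the symptom; the interesting question is still why no operand was generated for the "long int" expression.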


Dave


Fri Aug 11, 2017 11:37 am

Joined: Tue Dec 11, 2012 8:03 am
Posts: 285
Location: California
Tor wrote:
robfinch wrote:
'C' does not guarantee the order in which the auto-inc and auto-dec operations take place. Different compilers support them differently.
If one does something like p[ndx++] the index may be updated either before or after it's used to index the element depending on the compiler.
This is not correct, as far as I know. N++ and ++N are well defined.

This is something I ran into in the K&R C book, 2nd Edition, a few months ago. P.53-54 say,

    Function calls, nested assignment statements, and increment and decrement operators cause "side effects"—some variable is changed as a by-product of the evaluation of an expression. In any expression involving side effects, there can be subtle dependencies on the order in which variables taking part in the expression are updated. One unhappy situation is typified by the statement
    Code:
         a[i] = i++;

    The question is whether the subscript is the old value of i or the new. Compilers can interpret this in different ways, and generate different answers depending on their interpretation. The standard intentionally leaves most such matters unspecified. When side effects (assignment to variables) take place within an expression is left to the discretion of the compiler, since the best order depends strongly on machine architecture.

P.90 says,

    If you examine the expansion of max, you will notice some pitfalls. The expressions are evaluated twice; this is bad if they involve side effects like increment operators or input and output. For instance,
    Code:
         max(i++, j++)     /* wrong */

    will increment the larger value twice.
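To make the max pitfall concrete, here is a short sketch. The macro is the one from K&R; the wrapper functions and max_int are hypothetical names added for the comparison. The macro case is well defined, because ?: has a sequence point after its first operand, but the selected argument is evaluated a second time.

```c
#include <assert.h>
#include <stddef.h>

#define max(A, B) ((A) > (B) ? (A) : (B))   /* macro as given in K&R */

/* Expands to (((*i)++) > ((*j)++) ? ((*i)++) : ((*j)++)):
 * both are incremented for the comparison, then the larger one again. */
int max_with_macro(int *i, int *j)
{
    assert(i != NULL && j != NULL && i != j);
    return max((*i)++, (*j)++);
}

/* A real function evaluates each argument exactly once. */
int max_int(int a, int b)
{
    return a > b ? a : b;
}

int max_with_function(int *i, int *j)
{
    assert(i != NULL && j != NULL && i != j);
    return max_int((*i)++, (*j)++);
}
```

Starting from i = 1 and j = 5, the macro version returns 6 and leaves i = 2, j = 7, while the function version returns 5 and leaves i = 2, j = 6.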

Reading the book (and part of another on C) is the extent of my expertise in C. You guys know the language much better than I; but I thought I'd bring this in, from the horse's mouth.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources


Fri Aug 11, 2017 5:22 pm WWW

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1796
It's true, there are pitfalls, but I think they are not difficult to avoid. If you were really nervous you could just avoid using increment and decrement in complex expressions.

But I think, other than Rob, we believe the simple cases of func(a++) or even func(*a++) are not pitfall territory - they are well-defined.


Fri Aug 11, 2017 5:30 pm

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1796
(I'm not sure if this will help anyone, but there's a thing called sequence points which define when C actually makes changes to values. Or something.)
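A sketch of what sequence points mean in practice, using hypothetical helper names. The statement a[i] = i++; reads and modifies i with no sequence point in between, so its behaviour is undefined. Splitting it at a semicolon (which is a sequence point) pins down whichever meaning was intended.

```c
#include <assert.h>
#include <stddef.h>

/* Intended meaning 1: store the old i at the old index, then increment. */
void store_then_increment(int *a, int *i)
{
    assert(a != NULL && i != NULL);
    a[*i] = *i;     /* the full expression ends here: sequence point */
    (*i)++;
}

/* Intended meaning 2: increment first, then store the old i at the new index. */
void increment_then_store(int *a, int *i)
{
    int old;

    assert(a != NULL && i != NULL);
    old = *i;
    (*i)++;
    a[*i] = old;
}
```

By contrast, func(a++) needs no such rewrite: there is a sequence point before the call, and the argument value is simply the old a.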


Fri Aug 11, 2017 6:54 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2153
Location: Canada
Okay, the latest fix for the compiler is available. An address mode spec of am_ind needed to be added to the GenerateIndex() function to fix the array reference problem.
Shifts should be fixed too.

I tried several combinations of unsigned int / int but couldn't get the compiler to crash. Could I please have a sample program?

The compiler sometimes generates better code if n++ is used rather than n = n + 1.

_________________
Robert Finch http://www.finitron.ca


Fri Aug 11, 2017 9:40 pm WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2153
Location: Canada
Okay, the latest fix for the compiler is available.
Fixes crashes having to do with long variables.

_________________
Robert Finch http://www.finitron.ca


Fri Aug 11, 2017 11:12 pm WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2153
Location: Canada
Another fix for the compiler. It should fix references to global variables.

The compiler as it stands outputs variable declarations as it encounters them in the program. In 'C', variables must be declared before they are referenced in the program. This means the compiler will output space for global variables *before* code is output. This is the opposite of the order expected when the program is loaded. I could possibly have the compiler output a jump instruction first to jump around the global variables. Or it might be easier to output a couple of special labels "__begin_data" "__end_data" so that the variable space can be cut from the source file and pasted after the code. Could this be done by the assembler?

Code:
#   bss
_pi:
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0
   WORD   0

#   code
_main:
   # int pi[20];
               push    r13,r14
               push    r12,r14
               mov     r12,r14
   #    int x, i;
               dec     r14,2
   #    x = 10 * pi[i];
               ld      r7,r12,-2
               mov     r6,r7
               ld      r6,r6,_pi
               mov     r7,r0,10
               mov     r1,r6
               mov     r2,r7
               jsr     r13,r0,__mul
               mov     r5,r1
               sto     r5,r12,-1
               mov     r14,r12
               pop     r12,r14
               pop     r13,r14
               mov     r15,r13


#   rodata
#   global   _main
#   global   _pi
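If the marker idea were used, the cut-and-paste step could also be done by a small filter between the compiler and the assembler. The sketch below is one possible shape for it, not an existing tool: the function name move_data_after_code is hypothetical, it assumes the markers appear at the start of their own lines, that every line ends with a newline, and that the data section fits in a fixed 4096-byte buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy assembly text from in to out, moving every line found between a
 * "__begin_data" line and an "__end_data" line to the end of the output.
 * The marker lines themselves are dropped.
 * Returns 0 on success, -1 if a buffer is too small. */
int move_data_after_code(const char *in, char *out, size_t out_size)
{
    char data[4096];                 /* holds the relocated data lines */
    size_t out_len = 0, data_len = 0;
    int in_data = 0;
    const char *line = in;

    assert(in != NULL && out != NULL);
    while (*line != '\0') {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) + 1 : strlen(line);

        if (strncmp(line, "__begin_data", 12) == 0) {
            in_data = 1;
        } else if (strncmp(line, "__end_data", 10) == 0) {
            in_data = 0;
        } else if (in_data) {
            if (data_len + len >= sizeof data)
                return -1;
            memcpy(data + data_len, line, len);
            data_len += len;
        } else {
            if (out_len + len >= out_size)
                return -1;
            memcpy(out + out_len, line, len);
            out_len += len;
        }
        line += len;
    }
    if (out_len + data_len >= out_size)
        return -1;
    memcpy(out + out_len, data, data_len);
    out[out_len + data_len] = '\0';
    return 0;
}
```

Whether this is better than an origin or section directive in the assembler is an open question; the filter has the advantage of needing no assembler changes.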

_________________
Robert Finch http://www.finitron.ca


Sat Aug 12, 2017 5:07 am WWW