Author |
Message |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2231 Location: Canada
|
After several hardware fixes and several software work-arounds, the demo can at least clear the screen. Indexed addressing using pointers does not seem to compile properly: the compiler leaves out the scaling of the index register and the index register load. Code like the following does not work.
Code:
for (n = 0; n < 56 * 31; n = n + 1)
    pScreen[n] = DBGAttr|' ';
Changing the code to use pointer incrementation seems to work:
Code:
pScreen = 0xFFD00010;
for (n = 0; n < 56*31; n = n + 1)
    *pScreen++ = my_rand();
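To make the missing step concrete, here is a minimal sketch of what the indexed store has to compute compared with the pointer-increment form. The 32-bit cell type and the function names fill_indexed and fill_pointer are assumptions for illustration; the post does not give the declared type of pScreen.
Code:
```c
#include <stddef.h>
#include <stdint.h>

/* Indexed form: the generated code must load the index register and
   scale it by the element size before adding it to the base.
   The 4-byte cell width is an assumption, not taken from the post. */
void fill_indexed(uint32_t *base, size_t count, uint32_t value)
{
    for (size_t n = 0; n < count; n = n + 1) {
        /* Spelled out: byte address = base + n * 4 for 4-byte cells. */
        uint32_t *slot = (uint32_t *)((uint8_t *)base + n * sizeof *base);
        *slot = value;
    }
}

/* Pointer-increment form: no index register and no scaling needed,
   the pointer simply advances by one element per iteration. */
void fill_pointer(uint32_t *base, size_t count, uint32_t value)
{
    uint32_t *p = base;
    for (size_t n = 0; n < count; n = n + 1)
        *p++ = value;
}
```
Both functions should leave the buffer in the same state; if the scaling or the index load is dropped, the indexed version writes to the wrong addresses while the pointer version keeps working.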
_________________Robert Finch http://www.finitron.ca
|
Sun Sep 19, 2021 4:59 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2231 Location: Canada
|
I started working on my own PowerPC compiler, and it is coming along. It does not yet generate code good enough to execute, but it is getting there. The compiler is being modified to output a syntax that vasm and vlink can digest; while writing the compiler, the plan is to use the existing assembler and linker from vbcc.
I’ve been playing with the sprite controller lately.
_________________Robert Finch http://www.finitron.ca
|
Fri Sep 24, 2021 4:44 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2231 Location: Canada
|
The compiler code output looks almost good enough to try executing. While code generation appears good, there is still an issue with generation of variables in the data, bss and rodata segments. There is also an issue with code generation quitting too soon unless optimization is turned on, which seems a bit strange to me. With optimization on, the entire code is generated; with it off, generation quits partway through. Following is a simple example of code generation.
Code:
int my_abs(register int a)
{
    if (a < 0)
        a = -a;
    return (a);
}
Code:
    .text
    .align 4
    .global _my_abs
    .align 4
#====================================================
# Basic Block 0
#====================================================
_my_abs:
    cmpwi cr0,r3,0
    bge cr0,.C00013
    neg r3,r3
.C00013:
.C00012:
    blr
    .type _my_abs,@function
    .size _my_abs,$-_my_abs
# stacksize=48
    .set ___stack_my_abs,48
_________________Robert Finch http://www.finitron.ca
|
Sat Sep 25, 2021 4:53 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2231 Location: Canada
|
Latest Fixes:
Hardware: The target and source registers were swapped around for the SRAWI instruction, which led to an infinite loop as the wrong register got updated. For this version of the PowerPC, the Ra source register and Rt target register fields are swapped for some instructions. This has led to some confusion in the past.
The mask begin and mask end fields were processed in the wrong bit order. PowerPC encodes the most significant bit as bit 0, the least significant as bit 31. This led to incorrect masking of shifts. Mask generation was also incorrect for some cases. Shift and rotate operations were using the wrong pipeline ir to determine the shift and mask amounts.
Software: Loading and storing the link register needed to be done indirectly through another register, as a direct load or store of the link register is not supported on PowerPC. This led to a hang on a return from a non-leaf routine.
Additions: Pipeline loop mode was added. Up to seven instructions in the pipeline may be circulated without instruction fetches.
Modifications: Renamed the project rfPower.
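As a reference for the mask begin / mask end fix, the following is a small C model of how a rotate-and-mask instruction such as rlwinm builds its mask under PowerPC bit numbering (bit 0 is the most significant bit). The function name ppc_mask is made up for this sketch; it is not taken from the rfPower sources.
Code:
```c
#include <stdint.h>

/* Build the 32-bit mask selected by mask-begin (mb) and mask-end (me),
   both given in PowerPC numbering where bit 0 is the MSB.
   Ones run from bit mb through bit me; if mb > me the run wraps
   around through bit 31 and bit 0. */
uint32_t ppc_mask(unsigned mb, unsigned me)
{
    mb &= 31u;
    me &= 31u;
    uint32_t from_mb = 0xFFFFFFFFu >> mb;        /* ones in bits mb..31 */
    uint32_t to_me   = 0xFFFFFFFFu << (31u - me); /* ones in bits 0..me  */
    return (mb <= me) ? (from_mb & to_me) : (from_mb | to_me);
}
```
For example, mb=24, me=31 selects the low byte (0x000000FF), and mb=0, me=7 selects the high byte (0xFF000000). Processing the fields as if bit 0 were the least significant bit swaps those two results, which matches the kind of incorrect masking described above.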
_________________Robert Finch http://www.finitron.ca
|
Wed Sep 29, 2021 5:13 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2231 Location: Canada
|
Issues with the link register bypass network were resolved. The stack is messed up. The return address is stored at $FFFCFF8C but readback of the address is done at $FFFCFF84. I cannot find where the difference of 8 is coming from. Function entry and exit code looks good. This is an issue because it is causing a two-up return to the program exit point. This happens with non-leaf functions. I am getting the compiler to insert a couple of extra NOP instructions around the load and store of the link register. My thinking is that it may be a link register forwarding issue and the extra NOPs should bypass the forwarding.
The compiler was not outputting any storage directive for .bss variables. This led to all the variables being stacked up at address zero causing software issues. It really needs an .lcomm directive but for now it just outputs a .byte which works with a warning message spit out by the linker.
_________________Robert Finch http://www.finitron.ca
|
Thu Sep 30, 2021 4:01 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2231 Location: Canada
|
Had lots of fun getting switch cases to work properly. They were outputting multiple copies of code for cases. This led to thousands of lines of code being output for a simple switch. Hardware fixes: the bits in the condition register were reversed in order. Software worked only because decoding was also in the reverse order. The bits have now been set to the order of the PowerPC. Software fixes: the compiler was always outputting an ADDI instruction for small constants when it should have been an ADDI with a negative constant to do a subtract. This led to the sprite demo always moving sprites down towards the right instead of in a variety of directions.
Conditional logic in if statements does not appear to be working correctly. The sprites bounced back and forth about the same location as the direction toggled, due to an if always evaluating to true; the branch condition is always false. Conditional logic for ‘for’ loops seems to work, so it is looking like a compile error. For loops are working, but they compare a register against an immediate value, whereas if statements compare a value loaded from memory against an immediate. There could be an issue with forwarding logic for the value from memory. I have studied both the compiler and Verilog rtl code for potential issues. Nothing obvious appears. So, I put a couple of NOP instructions before the CMPW, removing the need for bypassing. If that causes things to work, then it is known to be a bypass issue.
_________________Robert Finch http://www.finitron.ca
|
Sun Oct 03, 2021 5:32 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2231 Location: Canada
|
Found and fixed a software issue that was causing a two-up return from subroutine. The compiler was outputting the stack unlink code in the wrong order. After the fix subroutine call and return seems to work fine. I am looking for a fast way to emulate the SET instructions using PowerPC code. A simple set equal to zero takes about five instructions. A sequence of instructions like the following is required:
Code:
cmpwi cr0,r11,0
mfcr r11
srwi r11,r11,29
andi. r11,r11,1
xori r11,r11,1
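A C model can help check what such a sequence produces. The sketch below models the five-instruction sequence assuming the standard CR0 layout (LT, GT, EQ, SO in the top four bits of the mfcr result), and also models a commonly used shorter PowerPC idiom for set-if-zero, cntlzw followed by srwi by 5. The function names are hypothetical, and SO is assumed to be clear.
Code:
```c
#include <stdint.h>

/* Model of cntlzw: count leading zeros of a 32-bit word (32 for zero). */
unsigned cntlzw_model(uint32_t x)
{
    unsigned n = 0;
    while (n < 32 && (x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Model of: cntlzw rT,rA ; srwi rT,rT,5
   Only a zero input has 32 leading zeros, so the shift yields 1 for
   zero and 0 for everything else. */
uint32_t set_if_zero(uint32_t a)
{
    return cntlzw_model(a) >> 5;
}

/* Model of: cmpwi cr0,rA,0 ; mfcr ; srwi 29 ; andi. 1 ; xori 1
   Assumes the standard CR0 layout and SO = 0. */
uint32_t five_insn_sequence(int32_t a)
{
    uint32_t cr = (a < 0) ? 0x80000000u   /* LT */
                : (a > 0) ? 0x40000000u   /* GT */
                :           0x20000000u;  /* EQ */
    uint32_t eq = (cr >> 29) & 1u;        /* srwi 29 ; andi. 1 */
    return eq ^ 1u;                       /* xori 1 inverts EQ */
}
```
As modeled here, the trailing xori makes the five-instruction sequence return 1 for a nonzero input, so it behaves as a set-if-not-zero; dropping the xori would give set-if-zero. Whether that matches the intent depends on which SET variant was being emulated.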
_________________Robert Finch http://www.finitron.ca
|
Tue Oct 05, 2021 5:20 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2231 Location: Canada
|
The core works well enough now to display single characters with the DBGDisplayChar() routine. Displaying strings does not work yet. There is an issue with getting the correct endianness for stored data: the assembler outputs data in big-endian, while the default operation of the core is little-endian. PowerPC is great because the endianness is controlled by a bit in the machine state register. When operating in big-endian mode it takes an extra clock cycle on a load to swap the bytes around.
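For illustration, the byte swap that the load path performs in big-endian mode can be modeled in C as below. The names swap32 and load_be32 are made up for this sketch, and it only models the data transformation, not the extra clock cycle.
Code:
```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit word. */
uint32_t swap32(uint32_t w)
{
    return (w >> 24)
         | ((w >> 8) & 0x0000FF00u)
         | ((w << 8) & 0x00FF0000u)
         | (w << 24);
}

/* Read a big-endian 32-bit word from a byte buffer, as the assembler
   would have laid it out, independent of the host's own endianness. */
uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}
```
A string stored by a big-endian assembler and read one word at a time by a little-endian core would come out with its characters reversed within each word, which is one way the mismatch could show up when displaying strings.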
_________________Robert Finch http://www.finitron.ca
|
Wed Oct 06, 2021 8:07 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2231 Location: Canada
|
Found yet another software bug. When passing arguments to a function, to push the argument the stack pointer was being decremented using a ‘sub r1,r1,4’ instruction. A couple of issues with this. The PowerPC assembler does not automatically assume immediate values. So, it thought it was a register-register subtract. The instruction needed to be ‘addi r1,r1,-4’. A quick fix to the compiler was made.
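The reason a subtract of a constant becomes an addi with a negative constant is that addi carries a 16-bit signed immediate. A minimal sketch of the standard D-form encoding follows (primary opcode 14, then RT, RA and SIMM); the helper names encode_addi and decode_simm are invented for this example.
Code:
```c
#include <stdint.h>

/* Encode addi rt,ra,simm: opcode 14 in the top 6 bits, then the
   5-bit RT and RA fields, then the 16-bit signed immediate. */
uint32_t encode_addi(unsigned rt, unsigned ra, int16_t simm)
{
    return (14u << 26)
         | ((rt & 31u) << 21)
         | ((ra & 31u) << 16)
         | (uint32_t)(uint16_t)simm;
}

/* Recover the sign-extended immediate from an encoded instruction. */
int32_t decode_simm(uint32_t insn)
{
    int32_t low = (int32_t)(insn & 0xFFFFu);
    return (low ^ 0x8000) - 0x8000;  /* sign-extend 16 -> 32 bits */
}
```
With this encoding, addi r1,r1,-4 comes out as 0x3821FFFC, and the decoder returns -4 for that word.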
Using arguments passed in registers is a bit overrated. It turns out that if the function with register arguments needs to call another function then the arguments need to be saved on the stack before the call and restored afterwards. In other words, register based arguments end up being stored on the stack anyway, so why not just use stack based arguments? The only case where register arguments are better occurs when calling a leaf routine.
_________________Robert Finch http://www.finitron.ca
|
Thu Oct 07, 2021 4:50 am |
|
|
oldben
Joined: Mon Oct 07, 2019 2:41 am Posts: 702
|
Most leaf functions I can think of are like a block move. 5 registers: 2 pointer registers, 1 return register, 1 counter register and a temp register.
char * move(d,s,c)
char *d,*s; int c;
{
    char *r;
    r=d;
    while(c--) *d++=*s++;
    return r;
}
Anything bigger is a real function that needs a full stack. What I can see as more useful is bound check registers for stacks and memory functions as a software check.
big recursive program(stuff)
...
if (stack >= toobig) clean up and exit
else big recursive program(more-stuff)
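The software check in the pseudocode above could look roughly like this in C. A depth counter stands in for the stack bound here, since comparing real stack addresses is not portable; MAX_DEPTH and bounded_sum are hypothetical names chosen for the sketch.
Code:
```c
/* Hypothetical recursion limit standing in for "toobig". */
#define MAX_DEPTH 64

/* Recursively sum 1..n, but give up with -1 once the depth bound is
   reached instead of letting the stack grow without limit. */
long bounded_sum(long n, unsigned depth)
{
    if (depth >= MAX_DEPTH)
        return -1;                 /* clean up and exit */
    if (n <= 0)
        return 0;
    long rest = bounded_sum(n - 1, depth + 1);
    return (rest < 0) ? -1 : n + rest;  /* propagate the failure */
}
```
A hardware bound register would do the same comparison on the stack pointer itself, without the extra argument.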
|
Sun Oct 10, 2021 2:06 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2231 Location: Canada
|
Kinda stuck on this project at the moment. The core or software is not managing the link register correctly, causing an infinite loop due to a return to the wrong point in a program. This has led to a continuous stream of 'A0' characters while trying to dump values to the screen. Things almost, but do not quite, work fully yet. I am unable to run tests in SIM as LLVM keeps running out of memory. So I decided to move onto the Thor2021 project for now.
_________________Robert Finch http://www.finitron.ca
|
Tue Oct 12, 2021 7:17 am |
|
|
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1808
|
I remember one time at work the commercial synthesis program had some bug where memory usage blew up - we commissioned a very large machine, possibly a 1024G machine, so the program would at least run to completion so we could get support on a test case...
|
Thu Oct 14, 2021 3:53 pm |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2231 Location: Canada
|
Another case where the ‘just get a bigger machine’ paradigm worked. Something I have thought about. 1024GB? That’s 40+ bit addressing.
I am not sure that LLVM is running out of memory. It says it is, but SIM can run a larger more complex project just fine. But it dies on a project ¼ the size. So it smells like something else is wrong. I have a machine with 16G but it does not seem to use more than 8GB for Vivado. I wonder if there is an 8GB limit? I have thought about upgrading the memory, but if there is a limit as to how much memory Vivado will use it may not do any good.
_________________Robert Finch http://www.finitron.ca
|
Fri Oct 15, 2021 5:52 am |
|
|
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1808
|
Hmm, yes, 40 bits. This was in the days of x86 and Xeon, dual or quad socket computers, as far as I recall. I think PAE might be the term whereby x86/Pentium got beyond 32 bit addressing? But then AMD64 came along, and this must have been past that point, so perhaps PAE was a distant memory.
Sometimes of course you can just add enormous amounts of swap space and the program can then allocate what it likes - it might not actually use it heavily.
|
Fri Oct 15, 2021 8:00 am |
|
|
oldben
Joined: Mon Oct 07, 2019 2:41 am Posts: 702
|
Video memory could be a hidden factor. 40 bits virtual memory is more memory than all the C-64's ever sold. Ben.
|
Sun Oct 17, 2021 4:33 am |
|