 nPower 

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Started work on nPower. nPower is going to be a PowerPC-compatible core. The plan is to use the LLVM toolchain for software. The initial core will be 32-bit and super-pipelined, with two parallel pipelines executing in order.

Got the basic core coded with a simplified instruction set. When run through synthesis, however, most of the design is trimmed out, leaving only 175 LUTs. I haven’t been able to figure out from the synthesis report where most of the trimming is occurring; everything it reports as trimmed seems reasonable, so I am picking off warning messages one by one.
The following instructions are supported in the initial version. They should be enough to run some simple programs.
Code:
# Currently Supported Instructions

ADD   ADDI  ADDIS   LBZ   LBZU  LBZX  LBZUX
SUBF                LHZ   LHZU  LHZX  LHZUX
CMP   CMPI  EXTB    LWZ   LWZU  LWZX  LWZUX
CMPL  CMPLI EXTH    STB   STBU  STBX  STBUX
AND   ANDI  ANDIS   STH   STHU  STHX  STHUX
OR    ORI   ORIS    STW   STWU  STWX  STWUX
XOR   XORI  XORIS   B     BC    BCCTR BCLR
SLW   NEG
SRW
SRAW  SRAWI

_________________
Robert Finch http://www.finitron.ca


Sat Dec 05, 2020 5:10 am

Joined: Fri Mar 22, 2019 8:03 am
Posts: 328
Location: Girona-Catalonia
Implementing a processor compatible with a well-known architecture should certainly help with the toolchain. The LLVM compiler should work great!


Sat Dec 05, 2020 12:39 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Quote:
Implementing a processor compatible with a well-known architecture should certainly help with the toolchain. The LLVM compiler should work great!
This is what I hope. Building LLVM will have its own challenges. A trimmed down compiler is needed, so modifications are likely.

Figured out what caused most of the logic to be trimmed. A pipeline valid signal (rval) was not being propagated, resulting in a zero being assigned to it. With one pipeline stage flagged as always invalid, the entire pipeline was trimmed out. Anyway, the core size is now coming back as 40,000 LUTs, which is much more realistic. Many more instructions are now supported. Virtual memory is not going to be supported. The system is envisioned to contain multiple PowerPC cores, each with a dedicated main memory.
Code:
 # Currently Supported Instructions

ADD   ADDI  ADDIS   LBZ   LBZU  LBZX  LBZUX   MFCR    MTCRF
SUBF                LHZ   LHZU  LHZX  LHZUX   MFSPR   MTSPR
CMP   CMPI  EXTB    LWZ   LWZU  LWZX  LWZUX   MFLR    MTLR
CMPL  CMPLI EXTH    STB   STBU  STBX  STBUX   MFCTR   MTCTR
AND   ANDI  ANDIS   STH   STHU  STHX  STHUX   MFXER   MTXER
OR    ORI   ORIS    STW   STWU  STWX  STWUX   MCRXR
XOR   XORI  XORIS   B     BC    BCCTR BCLR
ANDC  ORC           CRAND   CROR    CRXOR
NAND  NOR           CRNAND  CRNOR   CREQV
EQV                 CRANDC  CRORC
MULLI MULLW RLWIMI
SLW   NEG   RLWINM
SRW         RLWNM
SRAW  SRAWI

# Features Not Supported
- segment registers
- tlb

<software>The first issue with building LLVM was the Windows SDK version. The build tools for Visual Studio 2017 had to be installed, as it will not build with Visual Studio 2019. Then I scrapped that build and started a build using LLVM 11.0, which works with Visual Studio 2019. The build went well until it was about 90% complete, at which point the machine ran out of memory and the build had to be aborted. This is on a machine with 16 GB of RAM plus a 4 GB swap file. Closed everything down, rebooted the machine and restarted the build. Then the build hung trying to run configure on things rather than completing the aborted build. So, now clean has been run and the build restarted again.

PowerPC 32-bit code is what (nano) nPower uses. In LLVM there is a project for 64-bit PowerPC code. I am wondering if it can build 32-bit binary files. What is needed is a raw binary file that can be placed in ROM. The reset address needs to be specifiable. Digging into the compiler options is another task to be done.
In the meantime, some more work was done on the RTF64 project.

A simple NOP testbench was created and run in simulation. The core could at least fetch and execute the NOP instructions. But something more sophisticated is required now.

_________________
Robert Finch http://www.finitron.ca


Sun Dec 06, 2020 4:17 am

Joined: Fri Mar 22, 2019 8:03 am
Posts: 328
Location: Girona-Catalonia
Hi Rob,

I can't really help with installing LLVM on Windows, as I did it on a Mac using CMake and Xcode (not without a lot of pain). The LLVM community seems to use something called "Ninja", but I became familiar with Xcode long ago and I guess I'm too old to start learning yet another tool now, even though LLVM only partially supports Xcode. I guess that claimed compatibility with a number of platforms is typical of open-source projects, and it's generally true, but not without a lot of problems and nuances.

Regarding PowerPC, you can get the supported CPUs by typing the following command in the terminal window:

Code:
clang --target=powerpc -print-supported-cpus


For LLVM V10 I get the following output:

Code:
clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4786e1f8362e35817ac6bbf5359bfe19f7253592)
Target: powerpc
Thread model: posix
InstalledDir: /Users/joan/LLVM-10/llvm-project/build/Debug/bin
Available CPUs for this target:

   440
   450
   601
   602
   603
   603e
   603ev
   604
   604e
   620
   7400
   7450
   750
   970
   a2
   a2q
   e500
   e500mc
   e5500
   g3
   g4
   g4+
   g5
   generic
   ppc
   ppc32
   ppc64
   ppc64le
   pwr3
   pwr4
   pwr5
   pwr5x
   pwr6
   pwr6x
   pwr7
   pwr8
   pwr9

Use -mcpu or -mtune to specify the target's processor.
For example, clang --target=aarch64-unknown-linux-gui -mcpu=cortex-a35


So, it looks to me that there's quite a lot of choice and it should be possible to start with the subset of instructions supported by one of the earlier cpus.

In my experience, it is easier to use LLVM to output only assembly code, rather than binary or executable files.

The problem with binaries is that you need to know their format and they require a 'linker' pass before they are ready for execution. I do not know your experience with these things, including link file and executable file formats, but mine is null, so that's why I avoided that and chose a slightly different route.

With assembly files, you can create an assembler that suits your own needs, and generate your executable files from that. I found this approach easier and more flexible.

If you want to follow this route, in the case of PowerPC, you can invoke the following command:

Code:
clang --target=powerpc -mcpu=ppc32 -emit-llvm -S -Os queens.c


The example here is the "queens.c" program that we discussed in the compilers thread.

This command creates a human readable "queens.ll" file with the LLVM intermediate representation, which you don't need to really understand but it's useful to get a taste of the target independent optimisations that are performed. "-Os" is the optimisation specification that balances speed with code size. I found this is ok to avoid very long executables with all the loops unnecessarily unrolled and frankly totally unreadable assemblies.

Then you can use the *.ll file with 'llc' to produce the "*.s" assembly output, like this:

Code:
llc queens.ll


I ran the above commands and got the following PowerPC assembly output from the "queens.c" code:

queens.s
Code:
   .text
   .file   "queens.c"
   .globl   print_board             # -- Begin function print_board
   .p2align   2
   .type   print_board,@function
print_board:                            # @print_board
.Lfunc_begin0:
# %bb.0:                                # %entry
   lis 5, 0
   li 4, 0
   li 6, 8
   li 7, 81
   li 8, 46
   ori 5, 5, 65535
   li 9, 10
   mr 10, 3
.LBB0_1:                                # %for.cond1.preheader
                                        # =>This Loop Header: Depth=1
                                        #     Child Loop BB0_2 Depth 2
   slwi 11, 4, 5
   add 11, 3, 11
   addi 11, 11, -4
   mtctr 6
.LBB0_2:                                # %for.body3
                                        #   Parent Loop BB0_1 Depth=1
                                        # =>  This Inner Loop Header: Depth=2
   lwzu 12, 4(11)
   cmplwi   12, 0
   bc 12, 2, .LBB0_4
# %bb.3:                                # %for.body3
                                        #   in Loop: Header=BB0_2 Depth=2
   ori 12, 7, 0
   b .LBB0_5
.LBB0_4:                                # %for.body3
                                        #   in Loop: Header=BB0_2 Depth=2
   addi 12, 8, 0
.LBB0_5:                                # %for.body3
                                        #   in Loop: Header=BB0_2 Depth=2
   stb 12, 0(5)
   bdnz .LBB0_2
# %bb.6:                                # %for.end
                                        #   in Loop: Header=BB0_1 Depth=1
   addi 4, 4, 1
   cmplwi   4, 8
   addi 10, 10, 32
   stb 9, 0(5)
   bne   0, .LBB0_1
# %bb.7:                                # %for.end7
   li 3, 10
   stb 3, 0(5)
   stb 3, 0(5)
   blr
.Lfunc_end0:
   .size   print_board, .Lfunc_end0-.Lfunc_begin0
                                        # -- End function
   .globl   conflict                # -- Begin function conflict
   .p2align   2
   .type   conflict,@function
conflict:                               # @conflict
.Lfunc_begin1:
# %bb.0:                                # %entry
   cmpwi   4, 1
   blt   0, .LBB1_8
# %bb.1:                                # %for.body.preheader
   subf 7, 4, 5
   add 6, 5, 4
   slwi 5, 5, 2
   addi 8, 3, -32
   slwi 9, 7, 2
   add 5, 8, 5
   add 8, 9, 8
   slwi 9, 6, 2
   add 3, 9, 3
   addi 8, 8, -4
   addi 9, 3, -28
   mtctr 4
.LBB1_2:                                # %for.body
                                        # =>This Inner Loop Header: Depth=1
   lwzu 3, 32(5)
   cmplwi   3, 0
   li 3, 1
   bnelr 0
# %bb.3:                                # %if.end
                                        #   in Loop: Header=BB1_2 Depth=1
   cmpwi   7, 0
   addi 8, 8, 36
   blt   0, .LBB1_5
# %bb.4:                                # %if.then4
                                        #   in Loop: Header=BB1_2 Depth=1
   lwz 4, 0(8)
   cmplwi   4, 0
   bnelr 0
.LBB1_5:                                # %if.end11
                                        #   in Loop: Header=BB1_2 Depth=1
   cmpwi   6, 7
   addi 9, 9, 28
   bgt   0, .LBB1_7
# %bb.6:                                # %if.then14
                                        #   in Loop: Header=BB1_2 Depth=1
   lwz 4, 0(9)
   cmplwi   4, 0
   bnelr 0
.LBB1_7:                                # %for.inc.critedge
                                        #   in Loop: Header=BB1_2 Depth=1
   addi 6, 6, -1
   addi 7, 7, 1
   bdnz .LBB1_2
.LBB1_8:
   li 3, 0
   blr
.Lfunc_end1:
   .size   conflict, .Lfunc_end1-.Lfunc_begin1
                                        # -- End function
   .globl   solve                   # -- Begin function solve
   .p2align   2
   .type   solve,@function
solve:                                  # @solve
.Lfunc_begin2:
# %bb.0:                                # %entry
   mflr 0
   stw 0, 4(1)
   stwu 1, -48(1)
   stw 30, 40(1)                   # 4-byte Folded Spill
   cmplwi   4, 8
   mr 30, 3
   stw 23, 12(1)                   # 4-byte Folded Spill
   stw 24, 16(1)                   # 4-byte Folded Spill
   stw 25, 20(1)                   # 4-byte Folded Spill
   stw 26, 24(1)                   # 4-byte Folded Spill
   stw 27, 28(1)                   # 4-byte Folded Spill
   stw 28, 32(1)                   # 4-byte Folded Spill
   stw 29, 36(1)                   # 4-byte Folded Spill
   bne   0, .LBB2_9
# %bb.1:                                # %for.cond1.preheader.i.preheader
   lis 4, 0
   li 3, 0
   li 5, 8
   li 6, 81
   li 7, 46
   ori 4, 4, 65535
   li 8, 10
   mr 9, 30
.LBB2_2:                                # %for.cond1.preheader.i
                                        # =>This Loop Header: Depth=1
                                        #     Child Loop BB2_3 Depth 2
   slwi 10, 3, 5
   add 10, 30, 10
   addi 10, 10, -4
   mtctr 5
.LBB2_3:                                # %for.body3.i
                                        #   Parent Loop BB2_2 Depth=1
                                        # =>  This Inner Loop Header: Depth=2
   lwzu 11, 4(10)
   cmplwi   11, 0
   bc 12, 2, .LBB2_5
# %bb.4:                                # %for.body3.i
                                        #   in Loop: Header=BB2_3 Depth=2
   ori 11, 6, 0
   b .LBB2_6
.LBB2_5:                                # %for.body3.i
                                        #   in Loop: Header=BB2_3 Depth=2
   addi 11, 7, 0
.LBB2_6:                                # %for.body3.i
                                        #   in Loop: Header=BB2_3 Depth=2
   stb 11, 0(4)
   bdnz .LBB2_3
# %bb.7:                                # %for.end.i
                                        #   in Loop: Header=BB2_2 Depth=1
   addi 3, 3, 1
   cmplwi   3, 8
   addi 9, 9, 32
   stb 8, 0(4)
   bne   0, .LBB2_2
# %bb.8:                                # %print_board.exit
   li 3, 10
   stb 3, 0(4)
   stb 3, 0(4)
   b .LBB2_13
.LBB2_9:                                # %for.cond.preheader
   slwi 26, 4, 5
   add 3, 30, 26
   mr 29, 4
   addi 28, 4, 1
   li 25, 0
   li 24, 1
   addi 23, 3, -4
   li 27, 0
.LBB2_10:                               # %for.body
                                        # =>This Inner Loop Header: Depth=1
   mr 3, 30
   mr 4, 29
   mr 5, 27
   addi 23, 23, 4
   bl conflict
   cmplwi   3, 0
   bne   0, .LBB2_12
# %bb.11:                               # %if.then3
                                        #   in Loop: Header=BB2_10 Depth=1
   mr 3, 30
   mr 4, 28
   stw 24, 0(23)
   bl solve
   stw 25, 0(23)
.LBB2_12:                               # %for.inc
                                        #   in Loop: Header=BB2_10 Depth=1
   addi 27, 27, 1
   cmplwi   27, 8
   addi 26, 26, 4
   bne   0, .LBB2_10
.LBB2_13:                               # %return
   lwz 30, 40(1)                   # 4-byte Folded Reload
   lwz 29, 36(1)                   # 4-byte Folded Reload
   lwz 28, 32(1)                   # 4-byte Folded Reload
   lwz 27, 28(1)                   # 4-byte Folded Reload
   lwz 26, 24(1)                   # 4-byte Folded Reload
   lwz 25, 20(1)                   # 4-byte Folded Reload
   lwz 24, 16(1)                   # 4-byte Folded Reload
   lwz 23, 12(1)                   # 4-byte Folded Reload
   lwz 0, 52(1)
   addi 1, 1, 48
   mtlr 0
   blr
.Lfunc_end2:
   .size   solve, .Lfunc_end2-.Lfunc_begin2
                                        # -- End function
   .globl   main                    # -- Begin function main
   .p2align   2
   .type   main,@function
main:                                   # @main
.Lfunc_begin3:
# %bb.0:                                # %entry
   mflr 0
   stw 0, 4(1)
   stwu 1, -16(1)
   lis 3, board@ha
   stw 30, 8(1)                    # 4-byte Folded Spill
   la 30, board@l(3)
   li 4, 0
   mr 3, 30
   li 5, 256
   bl memset@PLT
   mr 3, 30
   li 4, 0
   bl solve
   li 3, 0
   lwz 30, 8(1)                    # 4-byte Folded Reload
   lwz 0, 20(1)
   addi 1, 1, 16
   mtlr 0
   blr
.Lfunc_end3:
   .size   main, .Lfunc_end3-.Lfunc_begin3
                                        # -- End function
   .type   board,@object           # @board
   .comm   board,256,4
   .ident   "clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4786e1f8362e35817ac6bbf5359bfe19f7253592)"
   .section   ".note.GNU-stack","",@progbits


I don't know much about the PowerPC, but the assembly file looks ok to me. This file of course must be assembled now to produce the final executable.

This procedure worked fine for me, except that I implemented my own "CPU74" backend instead of using an existing one.

Now, if you choose to follow a similar route and create your own assembler, then you don't even need to be byte compatible with the PowerPC, as long as your assembler understands all the supported PowerPC assembly mnemonics and patterns.
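
To give an idea of how little is needed to get started, here is a minimal Python sketch of such an assembler. It only handles a few of the D-form instructions that appear in queens.s, written in the same operand syntax. The names `encode_d_form`, `assemble_line` and the mnemonic table are hypothetical, made up for this illustration; only the field layout (6-bit opcode, two 5-bit register fields, 16-bit immediate) and the primary opcodes are taken from the PowerPC architecture.

```python
import re

# Primary opcodes for a few D-form instructions that appear in queens.s.
D_FORM_OPCODES = {"addi": 14, "addis": 15, "lwz": 32, "stw": 36, "stb": 38}

# Matches a displacement(base) operand such as "4(1)" or "-8(11)".
MEM_OPERAND = re.compile(r"^(-?\d+)\((\d+)\)$")


def encode_d_form(opcode, rt, ra, imm):
    """Pack opcode, two register fields and a signed 16-bit immediate."""
    if not (0 <= rt < 32 and 0 <= ra < 32):
        raise ValueError("register number out of range")
    if not (-0x8000 <= imm <= 0x7FFF):
        raise ValueError("immediate does not fit in 16 signed bits")
    return (opcode << 26) | (rt << 21) | (ra << 16) | (imm & 0xFFFF)


def assemble_line(line):
    """Assemble one source line, e.g. 'addi 11, 11, -4' or 'stw 0, 4(1)'."""
    text = line.split("#", 1)[0].strip()          # drop trailing comments
    parts = text.split(None, 1)
    mnemonic = parts[0]
    operands = [op.strip() for op in parts[1].split(",")] if len(parts) > 1 else []
    if mnemonic == "li":                          # li rt, imm == addi rt, 0, imm
        return encode_d_form(14, int(operands[0]), 0, int(operands[1]))
    if mnemonic == "lis":                         # lis rt, imm == addis rt, 0, imm
        return encode_d_form(15, int(operands[0]), 0, int(operands[1]))
    opcode = D_FORM_OPCODES[mnemonic]             # KeyError for anything unsupported
    if mnemonic in ("addi", "addis"):
        rt, ra, imm = (int(op) for op in operands)
        return encode_d_form(opcode, rt, ra, imm)
    match = MEM_OPERAND.match(operands[1])
    if match is None:
        raise ValueError("expected a displacement(base) operand")
    return encode_d_form(opcode, int(operands[0]),
                         int(match.group(2)), int(match.group(1)))
```

For example, `assemble_line("stw 0, 4(1)")` yields the word 0x90010004. A real assembler would of course also need labels, branches, relocations and the remaining instruction forms; this only shows the basic shape.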

I hope this helps
Joan


Sun Dec 06, 2020 11:41 am

Joined: Wed Nov 20, 2019 12:56 pm
Posts: 92
This is where I sound like a broken record and recommend VBCC again (provided you don't mind the weird license). Vasm, too, if you need an assembler.

It already has a PowerPC backend, it's relatively easy to modify, can be rebuilt in seconds after making a change, and doesn't take the form of a 30-foot-high teetering stack of semi-relevant layers, frameworks and build-tools that may or may not build successfully if your distro is too old or too new. (Wasted a lot of time this weekend trying to get a GCC-based ESP8266 toolchain up and running. I'm slightly annoyed by it - can you tell? :D )


Sun Dec 06, 2020 4:46 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Quote:
Regarding PowerPC, you can get the supported CPUs by typing the following command in the terminal window:

Thanks, I see there seems to be a lot of support for the PowerPC.

The build environment settings in Visual Studio needed to be modified to reduce the number of parallel build operations that occur. It was set up by default to be maxed out for my workstation, causing it to launch tools (cl, link) until the machine ran out of memory. I’ve had to restart the build several times as it runs into memory issues. Hopefully, once everything is built, building the specific tools won't be so troublesome.
I did manage to get LLVM and Clang built and compiled a “Hello World!” program for the PowerPC. Then I got stuck on how to convert it to a ROM image. It is a toss-up as to whether to proceed with LLVM or VBCC.
Quote:
This is where I sound like a broken record and recommend VBCC again (provided you don't mind the weird license). Vasm, too, if you need an assembler.

It already has a PowerPC backend, it's relatively easy to modify, can be rebuilt in seconds after making a change, and doesn't take the form of a 30-foot-high teetering stack of semi-relevant layers, frameworks and build-tools that may or may not build successfully if your distro is too old or too new. (Wasted a lot of time this weekend trying to get a GCC-based ESP8266 toolchain up and running. I'm slightly annoyed by it - can you tell? )
All very good points, making VBCC worth looking into. I downloaded and fiddled around with VBCC yesterday until I got it to generate PowerPC code. But the download link for vasm does not work. It times out from my workstation.
I want to modify the compiler to support features of the cc64 language, so VBCC may be easier to modify.

_________________
Robert Finch http://www.finitron.ca


Mon Dec 07, 2020 3:45 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
I had to modify the makefile for VBCC to get it to build using nmake. The commands that shell out did not like the forward slashes in file paths, so they all had to be converted to backslashes to work. Also, tools in bin had to be referenced as .\bin when executed, otherwise they were not found. So, there were some minor mods to the makefile.

Still cannot download VASM. I left an email at the site.

_________________
Robert Finch http://www.finitron.ca


Tue Dec 08, 2020 3:55 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Latest Milestone:
A compiled and assembled ROM test image.

<hardware> The reset address was modified to $FFFC0000 from $FFFFFFFC. This allows a smaller rom image to be used.

<software> The bytes of each instruction in the binary file output by vasm are in the wrong order. The simplest fix was to provide an option in the processor core to reverse the order of the bytes.

A small utility bin2ver was written to convert binary files to Verilog memory assignment statements.
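
The bin2ver source is not shown here, so the following is only a rough Python sketch of what such a converter could look like, not the actual utility. The function name `bin_to_verilog`, the memory name `rommem` and the `swap_bytes` option are all assumptions made for the illustration. The `swap_bytes` flag shows the alternative of reversing the byte order of each 32-bit word in the tool rather than in the core.

```python
def bin_to_verilog(data, mem_name="rommem", swap_bytes=False):
    """Return one Verilog assignment statement per 32-bit word of `data`.

    `mem_name` and `swap_bytes` are hypothetical parameters for this sketch.
    """
    padded = data + b"\x00" * (-len(data) % 4)    # pad to a whole number of words
    lines = []
    for offset in range(0, len(padded), 4):
        word = padded[offset:offset + 4]
        if swap_bytes:
            word = word[::-1]                     # reverse the four bytes of the word
        value = int.from_bytes(word, "big")
        lines.append(f"{mem_name}[{offset // 4}] = 32'h{value:08X};")
    return lines


def convert_file(bin_path, out_path, **options):
    """Read a raw binary file and write the assignment statements as text."""
    with open(bin_path, "rb") as source:
        statements = bin_to_verilog(source.read(), **options)
    with open(out_path, "w") as target:
        target.write("\n".join(statements) + "\n")
```

The generated statements would typically be pasted or included into an initial block that loads the ROM array.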

_________________
Robert Finch http://www.finitron.ca


Thu Dec 10, 2020 4:03 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Just wading through the debug of nPower today. Many, many small fixes. Running in the simulator, the first few lines of code execute okay, then there is a forwarding issue of some sort. I am mulling over the idea of changing the design to a single pipeline, as the results forwarding for the dual pipeline is pretty complex, and it is complicated further by the load/store with update instructions. There are two sets of values to forward, the results bus and the effective address bus, resulting in about 16:1 multiplexers for forwarding.

This is the current ROM test program:
Code:
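int main()
{
   int DBGAttr;
   /* Memory-mapped I/O: cast the addresses explicitly and mark them volatile */
   volatile int* pLEDS = (volatile int*)0xFFDC0600;
   volatile int* pScreen = (volatile int*)0xFFD00000;
   
   *pLEDS = 0xAA;
   DBGAttr = 0x4FF0F000;   /* attribute bits, OR'd with each character code */
   pScreen[0] = DBGAttr|'A';
   pScreen[1] = DBGAttr|'A';
   pScreen[2] = DBGAttr|'A';
   pScreen[3] = DBGAttr|'A';
}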

Which stops working correctly here:
Code:
   .file   "rom_bios.c"
   .text
   .align   2
   .sdreg   r13
   .align   4
   .global   _main
_main:
   stwu   r1,-32(r1)
   lis   r11,-36
   addi   r11,r11,1536
   stw   r11,12(r1)
   lis   r11,-48
   stw   r11,16(r1)
   li   r11,170
   lwz   r12,12(r1)
   stw   r11,0(r12)      # <- this fails
   lis   r11,20465
   addi   r11,r11,-4096
   stw   r11,8(r1)
   lwz   r11,8(r1)
   ori   r0,r11,65
   lwz   r11,16(r1)
   stw   r0,0(r11)
   lwz   r11,8(r1)
   ori   r10,r11,65
   lwz   r9,16(r1)
   stw   r10,4(r9)
   lwz   r11,8(r1)
   ori   r10,r11,65
   lwz   r9,16(r1)
   stw   r10,8(r9)
   lwz   r11,8(r1)
   ori   r10,r11,65
   lwz   r9,16(r1)
   stw   r10,12(r9)
l1:
   addi   r1,r1,32
   blr
   .type   _main,@function
   .size   _main,$-_main
# stacksize=32
   .set   ___stack_main,32

_________________
Robert Finch http://www.finitron.ca


Fri Dec 11, 2020 5:00 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
I have not been able to resolve the forwarding issue in nPower yet, so it is going to be shelved for a few days. It does not forward an address from a previous instruction properly, causing the address zero to be used instead. One would think that, knowing roughly what the issue is, it should be solvable. But it is complicated, occurring when two instructions are paired and a pipeline bubble needs to be inserted with a delay of the second pipeline. The bubble can be seen to be inserted by noting the advancement of the instruction register to the next stage; however, the operands are not advanced properly and appear to get left behind. It needs a double bubble in this case because it is the result of a load operation that needs to be bypassed. AFAICT it should be working. The logic is in place but I have missed something.

_________________
Robert Finch http://www.finitron.ca


Sat Dec 12, 2020 4:20 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Tentatively solved the forwarding issue, but another issue cropped up. This time in the instruction fetch / icache. It is missing a pair of instructions on a cache load.

_________________
Robert Finch http://www.finitron.ca


Sun Dec 13, 2020 5:48 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Latest Milestone: test program run in simulation.

Latest Fixes:
Much to my disbelief, the register fields for logic and shift operations are swapped for Rt and Ra compared to the rest of the instruction set. I have a book that indicates this, but I had thought it was a transcription issue after reading the Open PowerPC docs. In newer versions of the processor the register fields are not swapped around.
After fixing this, the test program runs in simulation!
Now to move on to more complex tests.
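
To make the field swap concrete, here is a small Python sketch that encodes one arithmetic and one logical immediate instruction with the same destination register. The helper names are hypothetical; the layouts are the usual D-form ones, with the first register field in bits 6-10 and the second in bits 11-15 (IBM bit numbering, bit 0 being the most significant).

```python
def encode_addi(rt, ra, si):
    """addi rt,ra,si: destination RT in bits 6-10, source RA in bits 11-15."""
    return (14 << 26) | (rt << 21) | (ra << 16) | (si & 0xFFFF)


def encode_ori(ra, rs, ui):
    """ori ra,rs,ui: source RS in bits 6-10, destination RA in bits 11-15."""
    return (24 << 26) | (rs << 21) | (ra << 16) | (ui & 0xFFFF)


def register_fields(word):
    """Return the raw (bits 6-10, bits 11-15) register fields of a word."""
    return (word >> 21) & 31, (word >> 16) & 31


addi_word = encode_addi(12, 8, 0)   # addi 12, 8, 0
ori_word = encode_ori(12, 7, 0)     # ori 12, 7, 0

print(hex(addi_word), register_fields(addi_word))
print(hex(ori_word), register_fields(ori_word))
```

Both instructions write r12, but for addi the 12 sits in the first register field, while for ori it sits in the second. A decoder that always takes the destination from bits 6-10 would therefore write the wrong register for the logical and shift instructions.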

_________________
Robert Finch http://www.finitron.ca


Sun Dec 13, 2020 7:43 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Latest Milestone: now running Sieve of Eratosthenes

Many bug fixes later:
It looks like the IPC is turning out to be about 0.1. The best IPC that can be achieved with this design is 0.6666. The core stalls a lot of the time. The stall conditions are maybe too stringent in the interest of simplicity. The conditions are a dozen lines of combo logic.

_________________
Robert Finch http://www.finitron.ca


Mon Dec 14, 2020 4:42 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Latest Mods:
<hardware> Added dual perceptron branch predictors, one for each pipe. With branch predictors and a couple of other minor tweaks, the IPC is now about 0.12, or almost 20% better. Further tweaking of the core has got it to 0.203 IPC for the Sieve program.
Support for even more of the instruction set has been added.
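
For readers who have not met perceptron predictors, the following Python sketch shows the textbook form of the scheme (as published by Jiménez and Lin). It is not the nPower implementation. The table size, history length, weight limit and indexing by the low bits of the PC are assumptions chosen for the example.

```python
class PerceptronPredictor:
    """Textbook perceptron branch predictor with a global history register."""

    def __init__(self, table_size=64, history_length=8, weight_limit=127):
        self.weight_limit = weight_limit
        # Training threshold suggested in the original paper.
        self.threshold = int(1.93 * history_length + 14)
        # One row per table entry: a bias weight plus one weight per history bit.
        self.table = [[0] * (history_length + 1) for _ in range(table_size)]
        # History is stored as +1 (taken) or -1 (not taken), oldest first.
        self.history = [-1] * history_length

    def _row(self, pc):
        return self.table[(pc >> 2) % len(self.table)]

    def _output(self, pc):
        row = self._row(pc)
        return row[0] + sum(w * h for w, h in zip(row[1:], self.history))

    def predict(self, pc):
        """Predict taken when the weighted sum is non-negative."""
        return self._output(pc) >= 0

    def update(self, pc, taken):
        """Train on the real outcome, then shift it into the history."""
        output = self._output(pc)
        target = 1 if taken else -1
        if (output >= 0) != taken or abs(output) <= self.threshold:
            row = self._row(pc)
            inputs = [1] + self.history           # the bias input is always 1
            for i, x in enumerate(inputs):
                adjusted = row[i] + target * x
                row[i] = max(-self.weight_limit, min(self.weight_limit, adjusted))
        self.history = self.history[1:] + [target]
```

In hardware the dot product becomes an adder tree over small signed weights, which is the main cost compared to a simple two-bit counter table.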

Testing:
A trivial test program was created containing a looping sequence of about 100 add instructions without any loads or stores. The core was able to achieve 1.75 instructions per machine cycle (a machine cycle being three clocks).
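
The figures quoted so far can be tied together with a little arithmetic. The sketch below assumes that the IPC numbers are per clock, and that the best case of 0.6666 corresponds to both pipelines completing one instruction every three-clock machine cycle; both are inferences from the posts rather than stated facts, and the names are made up for the example.

```python
CLOCKS_PER_MACHINE_CYCLE = 3   # stated above: a machine cycle is three clocks
PIPELINES = 2                  # two parallel in-order pipelines


def per_clock(instructions_per_machine_cycle):
    """Convert instructions per machine cycle to instructions per clock."""
    return instructions_per_machine_cycle / CLOCKS_PER_MACHINE_CYCLE


def relative_gain(old_ipc, new_ipc):
    """Fractional improvement of new_ipc over old_ipc."""
    return (new_ipc - old_ipc) / old_ipc


peak_ipc = per_clock(PIPELINES)       # both pipes busy every machine cycle
add_loop_ipc = per_clock(1.75)        # the add-only test loop

print(f"peak     {peak_ipc:.4f} per clock")
print(f"add loop {add_loop_ipc:.4f} per clock")
print(f"0.10 -> 0.12  is {relative_gain(0.10, 0.12):.0%} better")
print(f"0.10 -> 0.203 is {relative_gain(0.10, 0.203):.0%} better")
```

Under those assumptions the add-only loop runs at about 0.58 instructions per clock, close to the 0.67 ceiling, while the Sieve program at 0.203 is still at roughly a third of it.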

Latest Fixes:
<hardware> The value zero was not being returned if register zero was specified for Ra in the ADDI instruction. For loads and stores and ADDI if Ra is specified as zero then the value zero should be used. Otherwise r0 is just another general-purpose register.
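
The rule can be modelled in a few lines of Python. This is only a behavioural sketch with hypothetical function names, not the core's logic: the operand selection for ADDI and for load/store address generation substitutes the constant zero when the Ra field is 0, whereas an ordinary ADD reads r0 like any other register.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of `value` as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def ra_or_zero(regs, ra):
    """Operand select for ADDI and load/store: Ra field 0 means the value 0."""
    return 0 if ra == 0 else regs[ra]


def exec_addi(regs, rt, ra, si):
    regs[rt] = (ra_or_zero(regs, ra) + sign_extend16(si)) & MASK32


def effective_address(regs, ra, displacement):
    """Address for D-form loads and stores such as lwz and stw."""
    return (ra_or_zero(regs, ra) + sign_extend16(displacement)) & MASK32


def exec_add(regs, rt, ra, rb):
    """Plain ADD: r0 is an ordinary register here."""
    regs[rt] = (regs[ra] + regs[rb]) & MASK32


regs = [0] * 32
regs[0] = 0x1234                 # r0 holds a nonzero value
exec_addi(regs, 3, 0, 10)        # li 3, 10  ->  r3 = 10, r0 is ignored
exec_add(regs, 4, 0, 0)          # add 4, 0, 0  ->  r4 = r0 + r0
```

This is what makes the `li` and `lis` simplified mnemonics work: they are ADDI and ADDIS with the Ra field set to zero.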

_________________
Robert Finch http://www.finitron.ca


Tue Dec 15, 2020 5:16 am

Joined: Sun Dec 20, 2020 1:54 pm
Posts: 74
robfinch wrote:
Code:
# Currently Supported Instructions

ADD   ADDI  ADDIS   LBZ   LBZU  LBZX  LBZUX
SUBF                LHZ   LHZU  LHZX  LHZUX
CMP   CMPI  EXTB    LWZ   LWZU  LWZX  LWZUX
CMPL  CMPLI EXTH    STB   STBU  STBX  STBUX
AND   ANDI  ANDIS   STH   STHU  STHX  STHUX
OR    ORI   ORIS    STW   STWU  STWX  STWUX
XOR   XORI  XORIS   B     BC    BCCTR BCLR
SLW   NEG
SRW
SRAW  SRAWI



hi,
very nice project! I have been playing with PPC4xx embedded cores as a hobby since 2005, and I'd love to replace the physical chip with an FPGA because nowadays these chips are difficult to find, as are eval boards in good working condition.

I just wonder: is your subset the minimal working set that you have distilled? Or are you going to extend it as soon as your code reaches a stable milestone?

Thanks for sharing :D


Sun Dec 20, 2020 2:03 pm