 Started 6809 core 

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Added support for 128-bit decimal floating-point, along with a ‘G’ accumulator and associated instructions. The floating-point is not very fast, given that each operand is eleven bytes and must be loaded from memory. It probably takes around 100 clocks just to load the data, then another 30 to 40 clocks to perform the operation.

The 200+ bit wide BCD adder in the multiplier blew the timing budget by quite a bit. The tools reported that 68ns was required, limiting the clock frequency to about 14 MHz. So, some more pipelining is required in the BCD addition; the thing should be running at 40 MHz at least. The obvious thing to do is break up the carry chain so that less propagation time is required. A set of pipeline registers was added to the output of the adder, and a retiming option was specified to let the tools decide what to do. This increases the clock count required to perform an addition.

The multiplier uses repeated addition, taking an average of five clock cycles per digit processed. 68 digits are processed, meaning a multiply could take about 350 clock cycles on average. Adding a 12-stage pipeline register to improve the frequency makes this time 12x, or approximately 4,200 clock cycles. 350 is bad, and 4,200 is probably not good enough for interrupt latency. That means making the multiply and divide operations interruptible. To support this, a done flag was added to the CCR. Multiply / divide now just returns to the IFETCH stage without incrementing the PC if it is not done; once done, the PC is incremented.
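To make the cycle estimate concrete, here is a minimal Python sketch of a digit-serial multiply by repeated addition. It is only a software model, not the core's actual logic: the function names, the use of Python integers in place of a BCD accumulator, and the simple product formula in `estimate_cycles` are all assumptions made for illustration.

```python
def multiply_by_repeated_addition(multiplicand, multiplier, digits):
    """Multiply two non-negative integers one decimal digit at a time.

    Each multiplier digit d costs d additions of the (shifted) multiplicand.
    Returns (product, number_of_additions).
    """
    product = 0
    adds = 0
    shifted = multiplicand              # multiplicand * 10**position
    for _ in range(digits):
        digit = multiplier % 10         # current multiplier digit
        multiplier //= 10
        for _ in range(digit):          # repeated addition, 0..9 times
            product += shifted
            adds += 1
        shifted *= 10                   # move on to the next digit position
    return product, adds


def estimate_cycles(digits, cycles_per_digit, adder_latency=1):
    """Hypothetical estimate: digits * average cycles per digit * adder latency."""
    return digits * cycles_per_digit * adder_latency


product, adds = multiply_by_repeated_addition(123, 456, 3)
print(product, adds)
print(estimate_cycles(68, 5), estimate_cycles(68, 5, 12))
```

With 68 digits at five cycles each, this simple model gives 340 cycles, and 4,080 with a 12-cycle adder, in line with the rounded figures of about 350 and 4,200 quoted above.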

Had to modify the BCD adder to include a clocked carry chain. The tools did not pipeline the adder as hoped. I am wondering now if a population count of the number of set carries can be taken and the add limited to that number of clock cycles. In other words, if adding 60 digits and there are only three carries, is it safe to perform the add in only four clock cycles? If the only carry is in digit 40, then no more than 28 clocks should be needed.
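One way to explore the question is with a small software model. The Python sketch below assumes a clocked carry chain in which every digit is added in the first clock and each pending carry then moves up exactly one digit per clock; that is a guess at the behaviour, not the actual adder, and `clocked_bcd_add` is a hypothetical name.

```python
def clocked_bcd_add(a, b):
    """Add two equal-length decimal digit lists (least significant digit first).

    Model: clock 1 adds every digit pair independently and latches the carries;
    each later clock adds the latched carries into the next digit up.
    Returns (sum_digits, carries_after_first_clock, clocks_used).
    """
    a = list(a) + [0]                    # extra top digit absorbs the final carry
    b = list(b) + [0]
    sums = [x + y for x, y in zip(a, b)]
    digits = [s % 10 for s in sums]
    carries = [s // 10 for s in sums]    # carry out of each digit position
    initial_carries = sum(carries)
    clocks = 1
    while any(carries):
        incoming = [0] + carries[:-1]    # each carry moves up one digit
        sums = [d + c for d, c in zip(digits, incoming)]
        digits = [s % 10 for s in sums]
        carries = [s // 10 for s in sums]
        clocks += 1
    return digits, initial_carries, clocks


# 5 + 5: one carry, settled after two clocks
print(clocked_bcd_add([5, 0, 0, 0], [5, 0, 0, 0]))
# 99999999 + 1: also one initial carry, but it ripples through every digit
print(clocked_bcd_add([9] * 8, [1] + [0] * 7))
```

In this model both additions start with a single set carry, yet the second needs nine clocks, so the carry count on its own is not a safe bound. The position of the lowest carry, as in the digit 40 example, does bound the number of clocks.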

_________________
Robert Finch http://www.finitron.ca


Sat Feb 19, 2022 5:15 am

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1782
Have you looked into carry-save adders or any kind of redundant representation? If you add two nibbles into 5-bit digits you can propagate the carries afterwards, and most of the time they won't cause any ripple effects, so two or three reductions will settle out the results.
https://en.wikipedia.org/wiki/Carry-save_adder
http://www.quadibloc.com/comp/cp0202.htm

(Or add 8 bit inputs into 9 bit temporaries, or 16 bits into 17, depending on your desired clock speed.)


Sat Feb 19, 2022 7:30 am

Joined: Mon Oct 07, 2019 2:41 am
Posts: 592
Some good stuff on BCD adders here.
http://6502.org/users/dieter/bcd2/bcd2_0.htm
BCD adjustment can be done using logic gates, so correction can be done
before and after the general-purpose adder.
I use the 100181 ALU-style algorithm: add 6 to the B input for addition, complement B for
subtraction. On a carry from the adder, leave the ALU output unchanged; on no carry, subtract 6 from the ALU digit.
A barrel shifter (8..1) lets me shift BCD digits quickly (4-bit BCD number or flags).
Double dabble and reverse double dabble shift adjustment are also handy.
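As a rough illustration of the +6 / -6 scheme described above, here is a Python sketch that works one 4-bit digit at a time. It is one reading of the description, not a model of the 100181 itself; the function names are hypothetical, and for subtraction the carry-in is assumed to be 1 (meaning "no borrow").

```python
def bcd_digit_op(a, b, carry_in, subtract=False):
    """One BCD digit through a 4-bit binary adder with pre/post correction.

    Addition: feed B + 6 into the adder. Subtraction: feed the 4-bit
    complement of B. If the adder produces a carry, the nibble is already
    correct; otherwise subtract 6 from it.
    """
    b_in = (~b & 0xF) if subtract else b + 6
    raw = a + b_in + carry_in
    carry = raw >> 4
    nibble = raw & 0xF
    if not carry:
        nibble = (nibble - 6) & 0xF      # post-correction when no carry
    return nibble, carry


def bcd_op(a_digits, b_digits, subtract=False):
    """Ripple the digit operation across digit lists (least significant first)."""
    carry = 1 if subtract else 0         # subtraction starts with "no borrow"
    out = []
    for a, b in zip(a_digits, b_digits):
        digit, carry = bcd_digit_op(a, b, carry, subtract)
        out.append(digit)
    return out, carry


print(bcd_op([5, 7, 2], [8, 4, 3]))                  # 275 + 348
print(bcd_op([3, 2, 6], [8, 4, 3], subtract=True))   # 623 - 348
```

For subtraction, a final carry of 1 means no borrow occurred, so the result is non-negative.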
Ben.


Sat Feb 19, 2022 5:26 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
I believe the last approach BigEd mentioned is what is in use now. Got a much faster add now. The add is performed two digits (eight bits) at a time, ignoring the carries out of each eight-bit group, which are captured in a register. Then the carries are added to the previous result during the next clock cycle. If there are still more carries, they are added to the result in the following clock cycle. I think there cannot be more than two sets of carries, so the operation is done in three clocks. It is much faster now: it misses timing for 40MHz by 3ns instead of 68ns. The difference is an operating frequency of 37MHz instead of 40MHz, so I tried running it at the target of 40MHz and it seems to work.
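The grouped add can be sketched in Python as follows. This is a simplified model under stated assumptions: each group holds two decimal digits as a value from 0 to 99, carries out of a group are latched and added to the next group up on the following pass, and passes repeat until no carries remain. The name `grouped_add` and the stopping rule are illustrative, and the real hardware may treat carries differently.

```python
def grouped_add(a_groups, b_groups):
    """Add two-digit groups (values 0..99, least significant group first).

    Pass 1 adds every group independently and latches the carries.
    Each later pass adds the latched carries into the next group up.
    Returns (result_groups, carry_out, passes).
    """
    sums = [x + y for x, y in zip(a_groups, b_groups)]
    result = [s % 100 for s in sums]
    carries = [s // 100 for s in sums]
    carry_out = carries[-1]              # carry out of the top group
    passes = 1
    while any(carries[:-1]):
        incoming = [0] + carries[:-1]    # carries move up one group
        sums = [r + c for r, c in zip(result, incoming)]
        result = [s % 100 for s in sums]
        carries = [s // 100 for s in sums]
        carry_out |= carries[-1]
        passes += 1
    return result, carry_out, passes


print(grouped_add([34, 12], [78, 56]))              # 1234 + 5678
print(grouped_add([99, 99, 99, 99], [1, 0, 0, 0]))  # 99999999 + 1
```

Typical operands settle in two or three passes in this model, but a run of 99 groups (the second call) takes one pass per group, so that pattern is worth including in any test set used to check the three-clock assumption.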

I have found that it is a smaller footprint to just use a look-up table for the BCD correction. If it is coded with a ‘>’ comparison it will be larger and probably slower than using a table.
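The two styles of correction can be compared in a short Python sketch. This shows only the logic, not the hardware cost; the 20-entry table indexed by the raw digit sum is an assumed layout, and the names are hypothetical.

```python
# Table indexed by the raw sum of two digits plus carry-in (0..19).
# Each entry holds (corrected digit, carry out), like a small ROM.
BCD_FIX = [(s % 10, s // 10) for s in range(20)]


def add_digit_compare(a, b, carry_in):
    """Correction coded with a '>' comparison."""
    s = a + b + carry_in
    if s > 9:
        return s - 10, 1
    return s, 0


def add_digit_table(a, b, carry_in):
    """Correction done by a single table look-up."""
    return BCD_FIX[a + b + carry_in]


print(add_digit_compare(7, 8, 1), add_digit_table(7, 8, 1))
```

Both functions return the same digit and carry for every input, so the choice between them is purely about size and speed in the target logic.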

Since divide uses repeated subtraction, it can take approximately 68*9*3=1836 cycles at most. The average is about half as many clock cycles. It is still an interruptible operation though. Add / subtract is fast enough not to need to be interruptible; it takes about 35 clocks. Other operations (NEG, CLR, CMP) are much faster, requiring only 2 to 4 clocks. I still need to work on conversions to and from integers.
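The worst-case figure follows from at most nine subtractions per quotient digit, with three clocks per subtraction. A minimal Python sketch of digit-serial division by repeated subtraction, using ordinary integers in place of BCD registers and hypothetical function names, is:

```python
def divide_by_repeated_subtraction(dividend, divisor, digits):
    """Digit-serial division: one quotient digit per dividend digit.

    Each quotient digit is found by subtracting the divisor from the running
    remainder until it no longer fits (at most 9 times per digit).
    Returns (quotient, remainder, number_of_subtractions).
    """
    quotient = 0
    remainder = 0
    subtractions = 0
    for i in reversed(range(digits)):
        # bring down the next dividend digit, most significant first
        remainder = remainder * 10 + (dividend // 10**i) % 10
        q = 0
        while remainder >= divisor:
            remainder -= divisor
            q += 1
            subtractions += 1
        quotient = quotient * 10 + q
    return quotient, remainder, subtractions


def worst_case_cycles(digits, max_subs_per_digit=9, clocks_per_sub=3):
    """Upper bound used in the estimate above."""
    return digits * max_subs_per_digit * clocks_per_sub


print(divide_by_repeated_subtraction(56088, 123, 5))
print(worst_case_cycles(68))
```

A hardware divider may also need a trial subtraction and restore per digit, which this sketch does not model.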

Validating that the operations are correct is going to be a challenge. I need to find a source of 128-bit decimal floating-point tests to compare against.
I also need to polish up some I/O routines.

_________________
Robert Finch http://www.finitron.ca


Sun Feb 20, 2022 4:30 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Got decimal-float <-> integer conversion routines done. They are part of the FT816Float library.
Not sure that they work 100%. Tried a few simple values, integer to decimal and back again, and it seems to work.

_________________
Robert Finch http://www.finitron.ca


Mon Feb 21, 2022 5:34 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Found and fixed an issue in the conversion routines. The exponent bias was incorrectly set at the value $3FFF used for non-decimal arithmetic. It should have been $17FF. This was found while verifying multiply.
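To show the effect of the wrong bias, here is a small Python sketch. It assumes the usual biased-exponent convention (stored field = true exponent + bias) and says nothing about how the exponent field is packed in the actual format; the constant and function names are hypothetical.

```python
DECIMAL_BIAS = 0x17FF   # 6143, the corrected bias from the post
BINARY_BIAS = 0x3FFF    # 16383, the bias used for non-decimal arithmetic


def encode_exponent(true_exponent, bias):
    """Stored exponent field = true exponent + bias."""
    return true_exponent + bias


def decode_exponent(stored, bias):
    """True exponent = stored field - bias."""
    return stored - bias


# Encoding with the binary bias and decoding with the decimal one
# shifts every exponent by the difference between the two biases.
stored = encode_exponent(0, BINARY_BIAS)
print(decode_exponent(stored, DECIMAL_BIAS))   # 10240 instead of 0
```

An error of this size would make a converted value wildly out of range as soon as another routine decoded it with the decimal bias, which fits with the problem surfacing during multiply verification.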

Worked on decimal float arithmetic today. Posted some code for it on opencores.org.

_________________
Robert Finch http://www.finitron.ca


Tue Feb 22, 2022 3:14 am