LALU Computer: Lookup Arithmetic Logic Unit
Ken KD5ZXG
Joined: Sat Sep 03, 2022 3:04 am Posts: 51
|
Can't fathom how you work some of those instructions without read accessible pointers, maybe the new stacks. Anyways, your A*B table. Does that function return the H or L byte of the answer? And why not offer both? A+A or B+B can also perform left logical shift, so not like you have no table to spare for the other byte...
|
Mon Sep 19, 2022 7:18 pm |
|
|
mmruzek
Joined: Sun Dec 19, 2021 1:36 pm Posts: 79 Location: Michigan USA
|
The ALU function for the Multiply in LALU is absolutely ghastly. The lookup table reports the 8 bit result of a multiplication of A and B so long as it does not overflow. If the operation does overflow it reports back hexadecimal FF. ( 8*8=64 OK, 17*17=FF for overflow). I'm interested in any ideas to improve that function, or ideas to replace the function entirely with something more useful.
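For reference, the behaviour described above (8-bit product, hex FF on overflow) can be sketched as a 64K-entry table. This is only an illustration in Python; the name `MUL_SAT` and the addressing (A as the high address byte, B as the low) are assumptions, not the actual LALU ROM layout.

```python
# Hypothetical sketch of the saturating multiply table described above.
# Addressing (A = high address byte, B = low address byte) is an assumption.
MUL_SAT = bytes(min(a * b, 0xFF) for a in range(256) for b in range(256))

def mul_lookup(a: int, b: int) -> int:
    """Return the table entry for A*B; 0xFF whenever the product does not fit in 8 bits."""
    return MUL_SAT[(a << 8) | b]

print(hex(mul_lookup(8, 8)))    # 0x40 (64 decimal)
print(hex(mul_lookup(17, 17)))  # 0xff (overflow marker)
```

One consequence of this scheme: a product that is exactly 255 (15*17, for instance) cannot be told apart from an overflow.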
Also, here is kind of an odd thing. Since the ALU can generate the various flags as listed, I could program the lookup controller to make any instruction conditional on the flags. For example I have an instruction MAB (Move A to B). The controller could be programmed to make MAB execution conditional on the flags... for example, MAB only If A>B.
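A rough model of what a flag-conditional MAB might look like, written as a Python sketch. Only the opcode name MAB comes from the post; the name `MAB_GT` and the way the flag is consulted are invented for illustration.

```python
# Hypothetical model of a flag-conditional instruction.
# MAB is from the post; MAB_GT and the flag wiring are assumptions.
def flag_a_gt_b(a: int, b: int) -> bool:
    """Stand-in for the A>B flag that the ALU lookup would report."""
    return a > b

def step(opcode: str, a: int, b: int) -> tuple[int, int]:
    """Return the new (A, B) register pair after one instruction."""
    if opcode == "MAB":        # unconditional Move A to B
        return a, a
    if opcode == "MAB_GT":     # Move A to B only if A > B
        return (a, a) if flag_a_gt_b(a, b) else (a, b)
    raise ValueError(f"unknown opcode {opcode!r}")
```

With something like this, loading each new value into A and issuing the conditional move would leave a running maximum in B without any branch.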
Here's another thing: I have space in the controller for 3 more instructions. (Hex D, E and F). I'd be interested to know if anyone has ideas for additional useful instructions.
Michael
|
Mon Sep 19, 2022 9:23 pm |
|
|
Ken KD5ZXG
Joined: Sat Sep 03, 2022 3:04 am Posts: 51
|
What I suggest are fixed point output bytes "HL.XYZ" Each output byte gets a separate lookup table in LALU. Go deep or shallow as you find table space to spare.
Right now we are just talking about multiply. Nothing messy when you simply offer the high byte instead of an error.
MPH(A*B)=H_.___ Multiply
MPL(A*B)=_L.___
DVL(A/B)=_L.___ Divide
DVX(A/B)=__.X__
DVY(A/B)=__._Y_
DVZ(A/B)=__.__Z
MDL(A/B)=_L.___ Modulo or Division's Remainder
CSX(AB)=__.X__ Cosine of a 16bit quadrant
CSY(AB)=__._Y_
CSZ(AB)=__.__Z
QRL(AB)=_L.___ Square root
QRX(AB)=__.X__
QRY(AB)=__._Y_
QRZ(AB)=__.__Z
None of these desperately need to throw flags, so the full width of every byte is available. Divide By Zero or COS(0)=1 could flag, but worthwhile or just making things complicated? Rarely used ALU functions may not merit a unique instruction, but instead access as memory.
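To make the byte-per-table idea concrete, here is a small Python sketch of a few of the functions listed above. Each function stands in for what would be its own 64K lookup table. Two things are assumptions on my part: the fractional bytes X, Y are treated as base-256 digits after the point, and division by zero simply returns hex FF.

```python
# Sketch of byte-sliced ALU functions; in hardware each would be a separate 64K table.
# Assumptions: fractional bytes are base-256 digits; divide-by-zero yields 0xFF.
def mph(a, b):
    return (a * b) >> 8                            # high byte of the 16-bit product

def mpl(a, b):
    return (a * b) & 0xFF                          # low byte of the 16-bit product

def dvl(a, b):
    return a // b if b else 0xFF                   # integer part of A/B

def dvx(a, b):
    return ((a << 8) // b) & 0xFF if b else 0xFF   # first fractional byte

def dvy(a, b):
    return ((a << 16) // b) & 0xFF if b else 0xFF  # second fractional byte

def mdl(a, b):
    return a % b if b else 0xFF                    # remainder

# 17*17 = 289 = hex 0121, so no error value is needed
print(hex(mph(17, 17)), hex(mpl(17, 17)))          # 0x1 0x21
```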
-edit- What about COS(AB+1)? One extra angle lets us hit 45 degrees and all halvings exactly. COS(0)=1 should be "off the table" anyway, as a CSL table wastes 64KBytes just to store one bit. Better off with a slightly smaller quadrant, integer divisible to all the most common angles? I had such a number written down but lost it. A gazillion old notes like this to consolidate... It was: 2*2*2*2*2*3*3*5*5*7 = 50,400 (+1 for zero). That is, each of the 90 degrees subdivided into 560 steps.
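A quick way to check which angles land exactly on a step of a 50,400-step quadrant, sketched in Python. The helper name `step_index` is made up for this illustration.

```python
# 50,400 steps per 90-degree quadrant, using the factorisation given above.
STEPS = 2**5 * 3**2 * 5**2 * 7
STEPS_PER_DEGREE = STEPS // 90      # 560

def step_index(deg_num: int, deg_den: int = 1):
    """Table index of the angle deg_num/deg_den degrees, or None if it is not exact."""
    quotient, remainder = divmod(deg_num * STEPS, 90 * deg_den)
    return None if remainder else quotient

print(step_index(45))       # 25200
print(step_index(30))       # 16800
print(step_index(45, 2))    # 12600 (22.5 degrees)
print(step_index(1, 64))    # None (1/64 degree is not on the grid)
```

By comparison, a full 65,536-step quadrant cannot represent 30 degrees exactly, since 65,536 is not divisible by 3.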
Last edited by Ken KD5ZXG on Wed Sep 21, 2022 7:37 am, edited 14 times in total.
|
Tue Sep 20, 2022 2:02 am |
|
|
oldben
Joined: Mon Oct 07, 2019 2:41 am Posts: 632
|
mmruzek wrote: The ALU function for the Multiply in LALU is absolutely ghastly. Here's another thing: I have space in the controller for 3 more instructions. (Hex D, E and F). I'd be interested to know if anyone has ideas for additional useful instructions. Michael
Decimal add with carry and Decimal subtract with carry for D & E. They work with the ASCII characters 0..9. You then can have a 4 digit calculator. (F has to halt and catch on fire.)
Ben.
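A minimal Python sketch of what a decimal add with carry on ASCII digits could compute. The function names are hypothetical, and nothing here reflects how LALU would actually encode the carry.

```python
# Hypothetical decimal add-with-carry on the ASCII characters '0'..'9'.
def dec_adc(a: str, b: str, carry: int = 0) -> tuple[str, int]:
    """Add two ASCII digits plus carry-in; return (ASCII digit, carry-out)."""
    total = (ord(a) - ord("0")) + (ord(b) - ord("0")) + carry
    return chr(ord("0") + total % 10), total // 10

def add_digits(x: str, y: str) -> tuple[str, int]:
    """Add two equal-length digit strings, rippling the carry from the right."""
    out, carry = [], 0
    for a, b in zip(reversed(x), reversed(y)):
        digit, carry = dec_adc(a, b, carry)
        out.append(digit)
    return "".join(reversed(out)), carry

print(add_digits("1234", "0987"))   # ('2221', 0)
```

Since each step takes two digits and a carry, the whole operation fits an A vs B table addressed by the input flag, which is what makes it a candidate for a lookup ALU.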
|
Tue Sep 20, 2022 2:49 am |
|
|
Ken KD5ZXG
Joined: Sat Sep 03, 2022 3:04 am Posts: 51
|
I don't offer BCD arithmetic tables, but three conversion tables under "Single Argument Functions", where all or part of B further specifies a sub-function.
BCD to ABS _L.___ Coded Decimal to Absolute
ABS to CDL _L.___ Absolute to Coded Decimal Low (00~99)
ABS to CDH H_.___ Absolute to Coded Decimal High (Hundreds)
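The three conversions could be generated along these lines. This Python sketch assumes packed BCD (two decimal digits per byte) and fills invalid BCD inputs with hex FF; both choices are assumptions, as the post does not spell them out.

```python
# Sketch of the three 256-entry conversion tables.
# Assumptions: packed BCD (two digits per byte); invalid BCD inputs map to 0xFF.
def _bcd_to_abs(v: int) -> int:
    hi, lo = v >> 4, v & 0x0F
    return hi * 10 + lo if hi < 10 and lo < 10 else 0xFF

BCD_TO_ABS = bytes(_bcd_to_abs(v) for v in range(256))
ABS_TO_CDL = bytes((((v % 100) // 10) << 4) | (v % 10) for v in range(256))  # 00~99
ABS_TO_CDH = bytes(v // 100 for v in range(256))                             # hundreds

# 255 decimal splits into hundreds = 2 and low digits = hex 55
print(ABS_TO_CDH[255], hex(ABS_TO_CDL[255]))   # 2 0x55
```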
ABS being my "absolute" or unsigned plain positive binary data type. I also have conversion tables to/from ascii and seven segment... Perhaps relevant to the purpose of a calculator.
As for binary addition:
ADD(A+B+0) Addressed when the input flag is false
ADD(A+B+1) Addressed when the input flag is true
Neither of these gives simultaneous output of any flag. Fortunately we don't always care to update flags. It is even convenient to avoid a mandatory overwrite.
Flag outputs (when we do care) will require a separate lookup. Conversely, a branch suffers no requirement to ever see the main result.
FLG(A,B,0) Addressed when the input flag is false
FLG(A,B,1) Addressed when the input flag is true
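Here is a Python sketch of the split between result tables and flag tables, used to chain a 16-bit add out of 8-bit lookups. The list-of-lists layout and the helper `add16` are illustrative assumptions; only the idea of separate ADD and FLG lookups selected by the input flag comes from the post.

```python
# Sketch: the sum byte and the carry flag live in separate tables,
# and the input flag (0 or 1) selects which table is addressed.
ADD = [[[(a + b + c) & 0xFF for b in range(256)] for a in range(256)] for c in (0, 1)]
FLG = [[[(a + b + c) >> 8 for b in range(256)] for a in range(256)] for c in (0, 1)]

def add16(x: int, y: int) -> tuple[int, int]:
    """16-bit add from four 8-bit lookups; returns (sum, carry-out)."""
    xl, yl, xh, yh = x & 0xFF, y & 0xFF, x >> 8, y >> 8
    lo = ADD[0][xl][yl]          # result lookup, flag not touched
    c = FLG[0][xl][yl]           # separate lookup when the carry is wanted
    hi = ADD[c][xh][yh]          # carry selects the A+B+1 table
    return (hi << 8) | lo, FLG[c][xh][yh]

print([hex(v) for v in add16(0x1234, 0x0FCD)])   # ['0x2201', '0x0']
```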
Each bit of flag and its "flag when flagged" evil twin represent a function. Obviously A+B's carry, A-B and B-A's borrow and zero. One flag pair further breaks down per the single argument sub-functions. Complicated and messy to describe in detail. Some other time, not tonight... I have an old neglected blog at Hackaday. Don't take my plans there as fully baked.
The organization of MMruzek's LALU reserved no empty space for new functions. To complicate addition, his Carry manifests as LSB. If just one A vs B table could be freed, single argument conversions occupy only 1/256th of that same space. For example: Rotations operating only on A don't need to occupy a full A vs B table. New functions (like BCD conversion) may now squeeze in edgewise, selected by B. Even without new functions, some workaround for BCD Addition w Carry might be microcoded.
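The "selected by B" idea can be modelled as one table in which B picks a single-argument function of A, so each function costs 256 entries out of 65,536. In this Python sketch the sub-function numbering is invented purely for illustration.

```python
# Sketch: one A-vs-B table shared by up to 256 single-argument functions.
# The numbering (B=0 rotate left, B=1 rotate right, B=2 to packed BCD) is invented.
SUBFUNCTIONS = {
    0: lambda a: ((a << 1) | (a >> 7)) & 0xFF,                           # rotate left
    1: lambda a: ((a >> 1) | (a << 7)) & 0xFF,                           # rotate right
    2: lambda a: (((a // 10) << 4) | (a % 10)) if a < 100 else 0xFF,     # 0..99 to BCD
}

def single_arg(a: int, b: int) -> int:
    """Look up sub-function B applied to A; unassigned rows read as zero here."""
    f = SUBFUNCTIONS.get(b)
    return f(a) if f else 0x00

print(hex(single_arg(0x81, 0)), hex(single_arg(0x81, 1)))   # 0x3 0xc0
```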
Suffice to say I have way more ALU function tables than one should assign opcodes. I don't have any opcodes or instructions yet, only an absurd excess of ALU functions. Unfortunately too many moons thinking only how ALU tables might be optimized. Almost totally neglected how to burn the tables or make a workable CPU of the thing. Thinking too hard now about all those neglected aspects.
|
Tue Sep 20, 2022 6:58 am |
|
|
mmruzek
Joined: Sun Dec 19, 2021 1:36 pm Posts: 79 Location: Michigan USA
|
Here is a photo of the stack cards for LALU. There is a Data stack, Keyboard stack and Return stack. The Data stack is used for computations. The Keyboard stack holds the character string sent to the language interpreter. The Return stack is used for holding addresses for return jumps and nesting. Presently the stacks are 256 bytes deep, because an 8 bit counter is used. The instructions sent to the stacks are microcoded, which is why there is a 4-16 decoder on each board. The same microcode also controls the LCD Display card, and the Keyboard I/O card. There is a good book about Stack Computers by Koopman available on the web. Here is a link to it: http://users.ece.cmu.edu/~koopman/stack ... index.html
I am now in the process of writing an interpretive language for the computer using the assembly language previously described. I am trying to make the language easy to use like Basic, but powerful like Forth. More on that later!
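As a software model of one stack card, a 256-byte memory addressed by an 8-bit counter behaves roughly like the Python sketch below. The class name, the post-increment push, and the silent wraparound are all assumptions; the real cards may handle overflow differently.

```python
class Stack8:
    """Hypothetical model of a stack card: 256 bytes addressed by an 8-bit counter."""

    def __init__(self):
        self.mem = bytearray(256)
        self.ptr = 0                        # models the 8-bit up/down counter

    def push(self, value: int) -> None:
        self.mem[self.ptr] = value & 0xFF
        self.ptr = (self.ptr + 1) & 0xFF    # counter wraps after 256 pushes

    def pop(self) -> int:
        self.ptr = (self.ptr - 1) & 0xFF
        return self.mem[self.ptr]

s = Stack8()
s.push(0x12)
s.push(0x34)
print(hex(s.pop()), hex(s.pop()))   # 0x34 0x12
```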
|
Sun Oct 09, 2022 1:11 pm |
|
|
mmruzek
Joined: Sun Dec 19, 2021 1:36 pm Posts: 79 Location: Michigan USA
|
Hi, It has been about a month since my last update on the LALU Computer. Previously, I published a listing of the Assembly Language for LALU. Since then I have written a higher level language for the computer called 'LANG'. My goal is to develop the language to the point of doing some performance benchmarking, notably running a program to calculate prime numbers.
LANG is an interpretive computer language with an emphasis on stack manipulation using Reverse Polish Notation math. The parser generally uses the 1st and/or 2nd ASCII character of an instruction to identify the operation to be performed. All numbers and variables are unsigned positive 16 bit integers. This required quite a bit of coding because the LALU computer is native 8 bit data manipulation.
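To illustrate the general shape of such an interpreter, here is a toy RPN evaluator in Python with unsigned 16-bit wraparound and dispatch on the first character of each token. The tokens used (+, -, *) and the blank-delimited input are placeholders of my own, not LANG's actual instruction set.

```python
# Toy RPN evaluator with 16-bit unsigned arithmetic.
# The operator tokens are placeholders, not LANG's real instructions.
MASK = 0xFFFF
OPS = {
    "+": lambda x, y: (x + y) & MASK,
    "-": lambda x, y: (x - y) & MASK,
    "*": lambda x, y: (x * y) & MASK,
}

def rpn(line: str) -> list[int]:
    """Evaluate a blank-delimited RPN line and return the final stack."""
    stack = []
    for token in line.split():
        if token[0].isdigit():
            stack.append(int(token) & MASK)
        else:
            y, x = stack.pop(), stack.pop()
            stack.append(OPS[token[0]](x, y))   # dispatch on the first character
    return stack

print(rpn("2 3 + 4 *"))    # [20]
print(rpn("65535 1 +"))    # [0], wraps like a 16-bit register pair
```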
LANG is extremely compact and terse. I've attempted to make the language slightly easier to read than FORTH, but somewhat less intuitive than BASIC. The language specification and an example program are attached. This project is obviously a work in progress, and I value any input or comments. Regards, Michael
|
Sat Nov 12, 2022 12:10 pm |
|
|
mmruzek
Joined: Sun Dec 19, 2021 1:36 pm Posts: 79 Location: Michigan USA
|
The LALU computer is now running the LANG language OK. I wrote a small program to identify the prime numbers up to 1000, loosely based on the benchmark given by Tom Fox in the June 1980 issue of Interface Age. I attached a screenshot of the original article. This benchmark is relatively simple, and is often run in the original language of BASIC. My goal was to get a rough idea of where I am in the design of LALU, which is an 8 bit machine running at 1 MHz.
A program listing for the LANG version of the program is attached. It is to be noted that I am not printing out the array of numbers, as in the original program. (Printing out an array of 1000 numbers on the LCD display takes about 20 seconds.)
Running the program as shown, the time to complete using LANG is 360 seconds. By combining lines of the program I can get this down to 240 seconds. This is not an apples to apples comparison, but I at least find it encouraging.
Another factor is that I am using 6 microsteps per instruction on the ring counter, but my instructions only use 4 of the microsteps, so there is some opportunity to speed things up there as well.
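As a back-of-envelope estimate of that headroom, the Python sketch below assumes one microstep per clock cycle and a run time that scales purely with microsteps per instruction. Neither assumption is confirmed in the thread, so treat the figures as an upper bound on the gain.

```python
# Rough estimate only. Assumptions: one microstep per 1 MHz clock cycle,
# and run time proportional to microsteps per instruction.
CLOCK_HZ = 1_000_000

def instructions_per_second(microsteps: int) -> float:
    return CLOCK_HZ / microsteps

def projected_seconds(seconds: float, old_steps: int = 6, new_steps: int = 4) -> float:
    return seconds * new_steps / old_steps

print(round(instructions_per_second(6)))   # 166667
print(round(instructions_per_second(4)))   # 250000
print(projected_seconds(240))              # 160.0
```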
|
Sun Dec 04, 2022 3:39 pm |
|
|
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1796
|
(Interesting little benchmark - posted about it here.)
|
Mon Dec 05, 2022 5:26 pm |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2157 Location: Canada
|
I got curious as to how well my 40MHz rf68000 would perform on this test. 122 seconds running TinyBasic. I had to modify the program to something equivalent, I think, because TinyBasic did not like the early exits from the FOR loop. Also, TinyBasic only supports integers so the remainder had to be computed. TinyBasic does not tokenize so it has to parse everything, and it is calling a subroutine.
Code:
NEW
130 PRINT "Starting."
140 FOR N = 1 TO 1000
150 GOSUB 300
240 NEXT N
250 PRINT "Finished."
260 END
300 FOR K = 2 TO 500
310 M=N/K
312 J=N-M*K
320 IF K=N GOTO 380
330 IF M=0 RETURN
340 IF M=1 GOTO 370
350 IF J>0 GOTO 370
360 IF J=0 RETURN
370 NEXT K
380 PRINT N
390 RETURN
RUN <CON
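For anyone who wants to check the control flow without a TinyBasic at hand, the subroutine at line 300 can be transliterated into Python as below. This is a line-by-line port made for illustration (the function name is mine), not something that was run on any of the machines in this thread.

```python
def is_printed(n: int) -> bool:
    """Mirror of the subroutine at line 300: True if line 380 (PRINT N) is reached."""
    for k in range(2, 501):      # 300 FOR K = 2 TO 500
        m = n // k               # 310 integer division
        j = n - m * k            # 312 remainder
        if k == n:               # 320
            return True
        if m == 0:               # 330
            return False
        if m == 1:               # 340
            continue
        if j > 0:                # 350
            continue
        return False             # 360: J = 0, so K divides N
    return True                  # loop exhausted, falls through to 380

printed = [n for n in range(1, 1001) if is_printed(n)]
print(len(printed), printed[:5], printed[-1])   # 168 [2, 3, 5, 7, 11] 997
```

Note that this integer-only variant does not print 1, because N = 1 returns at line 330 on the first pass.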
_________________
Robert Finch http://www.finitron.ca
|
Tue Dec 06, 2022 2:22 pm |
|
|
oldben
Joined: Mon Oct 07, 2019 2:41 am Posts: 632
|
Count me out, as I don't have BASIC on my machine. Benchmarks really don't count unless normalized for the same memory speeds, in my view. I do have a compiled language, but not C, for my machine. I still run at 1980 speeds: 1.5 MHz.
|
Tue Dec 06, 2022 7:23 pm |
|
|
alrj
Joined: Thu Feb 25, 2021 8:27 am Posts: 38 Location: Belgium
|
Ooh, that's funny, I like it!
I've just run it with my port of the Palo-Alto Tiny Basic on BB-88, my 8088 (actually V20) breadboard computer. I had to use the remainder method provided above, but PATB does support early exits from FOR loops, so no need to use a subroutine in my case. The benchmark ran in 630 seconds. That's not too bad! Then, just to get an idea, I booted up in MS-DOS and ran the original benchmark in QBasic: 547 seconds. Oh well...
|
Wed Dec 07, 2022 10:57 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2157 Location: Canada
|
Contest!
630 secs for a breadboard computer is pretty decent.
The rf68k is only about as good on the benchmark as the 6MHz HEX29 2900, mainly because of the memory access. Rf68k uses about five or six cycles per memory access, lots of wait states. So, <8 MHz memory. So, it sorta makes sense it's about the same time. No cache on this machine. Thankfully it is local RAM being used. Global RAM access is like 20+ cycles.
BTW the LALU's 360 secs for a 1 MHz machine is awesome! I have been following the LALU with interest.
|
Wed Dec 07, 2022 1:53 pm |
|
|
mmruzek
Joined: Sun Dec 19, 2021 1:36 pm Posts: 79 Location: Michigan USA
|
Those comparison tests on the machines are interesting, and thanks for the link to your post about the benchmark. What is the clock speed on the 8088 breadboard, and could you post a photo of your setup(s)?
I've trimmed the code in assembly language routines for LANG, and eliminated a time delay loop I had inserted for the LCD display (duh!), so things are really humming along now. I'm kind of at a loss for how to make things faster in software. Right now the largest delay is caused by the Go To statements. Those require loading the Keyboard Stack with the relevant line of text for parsing. My parser is super-simple, it literally uses the first ASCII character of an instruction as a memory pointer to jump to the ROM code location. (Delimiters are blank spaces.) Also, of course a bunch of time is wasted doing 16 bits on an 8 bit machine.
Here are a few photos of the complete LALU computer. I think you might find the keyboard kind of comical. I took a Dell PS/2 keyboard and use it with a translator PIC previously described. What's funny is I used a black sharpie to blank any key that is not translated by the PIC and slapped a "LALU" tag over the Dell.
|
Wed Dec 07, 2022 9:43 pm |
|
|
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1796
|
Splendid! Thanks for the photos.
|
Wed Dec 07, 2022 9:46 pm |
|