 Favorite size of byte 

What is your favorite byte size?
Size (bits)  Share  Votes
5    0%   [ 0 ]
6    0%   [ 0 ]
7    0%   [ 0 ]
8    88%  [ 7 ]
9    0%   [ 0 ]
10   13%  [ 1 ]
11   0%   [ 0 ]
12   0%   [ 0 ]
16   0%   [ 0 ]
Total votes : 8


Joined: Sat Feb 02, 2013 9:40 am
Posts: 1531
Location: Canada
This poll was triggered by Mike's comment here: http://anycpu.org/forum/viewtopic.php?f=23&t=300#p1955

8-bit bytes win for me because they're a standard that everybody uses, kind of a lowest common denominator. But 10-bit bytes were a close runner-up. Once you add error correction into the system (which you'll want if you've ever used a flaky FPGA processing core), 10 bits + 5 bits of error correction fits nicely into 16 bits (at least better than 8 + 5 bits). 10-bit bytes also have a nice progression of processing sizes: 10, 20, 40, and 80 bits. It's a good size for FP operations too.
It's not as crazy as it sounds. MS Basic uses 40-bit floating point, I think, and the 8087 FP chip used 80-bit floats. I used 40-bit opcodes in the Table888 core. The Itanium also uses 41-bit instructions.
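
The bit budget above can be checked with a little arithmetic. This is only a sketch: it assumes the 5 check bits are a Hamming code plus one overall parity bit (SECDED), which the post does not actually specify, and the function names are made up for illustration.

```python
def hamming_check_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r


def secded_total_bits(data_bits: int) -> int:
    """Data bits + Hamming check bits + one overall parity bit."""
    return data_bits + hamming_check_bits(data_bits) + 1


for width in (8, 10):
    total = secded_total_bits(width)
    print(f"{width} data bits -> {total} bits stored, {16 - total} of 16 unused")
```

Under that assumption both widths need 5 check bits, so a 10-bit byte uses 15 of the 16 bits while an 8-bit byte uses only 13.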

12-bit bytes are also incredibly popular in some places; they're very convenient with 4,4,4 RGB.
That leads to 24- and 48-bit processing, which isn't quite as nice a progression as with 8- or 10-bit byte systems.

_________________
Robert Finch http://www.finitron.ca


Tue Aug 09, 2016 5:49 am

Joined: Tue Jan 15, 2013 10:11 am
Posts: 114
Location: Norway/Japan
None of the above. I was writing a response to Mike's thread about it, but for some reason my browser dropped me out of the message, so it got lost. But it boiled down to the fact that 8-bit bytes are needed for only two reasons:
1) Because character strings are composed of 8-bit bytes, and
2) Saving memory by having a byte-addressable architecture and byte operations.

If characters had been 32 bits from the start, we would never have had to go through the mess of DOS code pages, or ISO character sets, or the other means of forcing character encodings into a format that doesn't have enough space (limited to 256 code points, or much less in practice, which is ridiculous).

So, 1) could have been avoided by going to 32 bits per character. If that had been done from the start (ignoring memory economy issues), we would have been accustomed to dealing with 32-bit characters and seen no need for 8 bits. Mostly. I can handle 8-bit bytes as bits or hex or octal in my head; that won't work with 32 bits.
2) is just economy, and can thus be ignored.

TLDR; my favourite byte size is 32 bits. I used to support 8-bit bytes until recently, but this summer the issue has been churning inside my mind and in the end I concluded that it's only character sets that are holding us back, and that's just an artifact of history. I've worked with 16-bit word-addressable machines, and there were special 8-bit instructions there, for handling chars only. And that was a kludge, and not needed for anything else, it was only needed for string handling. Use UTF-32 and it all goes away.
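
As a rough illustration of the fixed-width argument, here is a small sketch using Python's built-in codecs. The sample string and the helper `char_at` are made up for the example, and "character" here means a Unicode code point, which is not always what a reader would call one character (combining marks still exist).

```python
text = "Tokyo 東京, Tromsø"

utf32 = text.encode("utf-32-le")   # fixed width: 4 bytes per code point
utf8 = text.encode("utf-8")        # variable width: 1 to 4 bytes per code point


def char_at(buf: bytes, i: int) -> str:
    """With a fixed width, code point i always starts at byte offset 4 * i."""
    return buf[4 * i:4 * i + 4].decode("utf-32-le")


print(len(text), len(utf32), len(utf8))
print(char_at(utf32, 6))

# An 8-bit character set simply has no room for this text.
try:
    text.encode("latin-1")
except UnicodeEncodeError as err:
    print("latin-1 cannot represent:", err.object[err.start:err.end])
```

Indexing into the UTF-32 buffer is a multiply and a slice; doing the same on the UTF-8 buffer would need a scan from the start.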


Tue Aug 09, 2016 8:16 am

Joined: Tue Dec 11, 2012 8:03 am
Posts: 285
Location: California
I don't have any easy answers either, but I can comment on widths from experience. I do want to avoid going back to octal like the front-panel minis of the 1960's used. It has to go in groups of 4 bits, IMO.

My HP-41 calculator's Nut processor has 56-bit registers and a bit-serial (1-bit) data bus. The architecture was apparently optimized for floating-point performance and battery life with the technology of the day. (The initial version of the '41 was introduced in 1979.) In RAM, each 56-bit register is arranged as seven 8-bit bytes. Instructions and operands in user-language programs go in bytes. The machine language, however, goes in 10-bit bytes, in a separate memory space. The 41cx version has a text editor, which is quite slow since the architecture does not lend itself well to heavily text-based operations. There is a Forth module, but a look at its manual tells me the architecture is very poor for that too.

The HP-71 was a successor. Its processor is the Saturn, which was also used in some other HP handheld computers and calculators. It has 64-, 20-, and 12-bit registers, and a 4-bit data bus. Addresses are 20-bit, i.e., 5 nybbles. This is the machine I originally learned Forth on. It was a bit strange, with 20-bit cells, 2-nybble bytes, and nybble addressing (with a 1MNybble address space having no segment, page, or bank boundaries). It's weird, but it works well (including far better than the '41 for handling text), and this machine was way ahead of its time in many ways.
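
To make the nybble-addressed layout concrete, here is a toy model. It is not the Saturn's actual bus behaviour: the class name, the low-nybble-first storage order, and the wrap-around at the top of the address space are all assumptions made for the sketch.

```python
class NybbleMemory:
    """Toy memory addressed in 4-bit units with a flat 20-bit address space."""

    ADDRESS_BITS = 20                      # 20-bit address = 5 nybbles

    def __init__(self) -> None:
        # One nybble per entry: 2**20 entries is the 1MNybble space.
        self.cells = bytearray(1 << self.ADDRESS_BITS)

    def write(self, addr: int, value: int, nybbles: int) -> None:
        """Store `value` across `nybbles` cells, low nybble first (assumed order)."""
        for i in range(nybbles):
            self.cells[(addr + i) % len(self.cells)] = (value >> (4 * i)) & 0xF

    def read(self, addr: int, nybbles: int) -> int:
        """Reassemble `nybbles` cells starting at `addr` into one integer."""
        value = 0
        for i in range(nybbles):
            value |= self.cells[(addr + i) % len(self.cells)] << (4 * i)
        return value


mem = NybbleMemory()
mem.write(0x12345, 0x41, 2)        # a 2-nybble byte at an odd nybble address
mem.write(0x12347, 0xABCDE, 5)     # a 20-bit cell right behind it, no padding
print(hex(mem.read(0x12345, 2)), hex(mem.read(0x12347, 5)))
```

Because every address is a nybble address, bytes and 20-bit cells can sit back to back with no alignment gaps.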

On 6502.org, we've discussed wider, more-capable versions of a 65816. There doesn't seem to be any magic number. 32-bit cells would be nice for Forth, with 32-bit non-multiplexed addresses and data and offset registers. It is totally unrealistic that any single programmer would overflow a program space of over 16MB/MW/etc. (2^24 8-bit, 16-bit, or 32-bit memory locations) with any semi-efficient programming language, so the extra is for data, huge tables, arrays, memory-resident files, etc. What percentage of the operands would need to be more than 23 or 24 bits? I don't know, but I suspect Mike's idea to merge the shorter operand with the instruction in a 32-bit word, with a provision to take a second 32-bit word if the operand is longer, is a good one.
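
One way to picture the merged-operand idea is sketched below. The field layout is purely hypothetical and is not Mike's actual design: it assumes an 8-bit opcode in the top byte, bit 23 as a "long operand follows" flag, and an unsigned 23-bit short operand in the remaining bits.

```python
from typing import List, Tuple

SHORT_BITS = 23
LONG_FLAG = 1 << SHORT_BITS          # bit 23 set: operand lives in the next word


def encode(opcode: int, operand: int) -> List[int]:
    """Return one 32-bit word, or two when the operand needs more than 23 bits."""
    if not 0 <= opcode < 256:
        raise ValueError("opcode must fit in 8 bits")
    if not 0 <= operand < (1 << 32):
        raise ValueError("operand must fit in 32 bits")
    if operand < LONG_FLAG:
        return [(opcode << 24) | operand]
    return [(opcode << 24) | LONG_FLAG, operand]


def decode(words: List[int], pc: int = 0) -> Tuple[int, int, int]:
    """Return (opcode, operand, next_pc) for the instruction at words[pc]."""
    word = words[pc]
    opcode = word >> 24
    if word & LONG_FLAG:
        return opcode, words[pc + 1], pc + 2
    return opcode, word & (LONG_FLAG - 1), pc + 1


program = encode(0xA9, 0x1234) + encode(0x8D, 0x12345678)
print([hex(w) for w in program])
```

Spending one bit on the flag is why the short operand is 23 bits rather than 24; a design that reserved particular opcodes for the long form could keep all 24.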

Does anything need to be 8-bit? I doubt it, but I don't have the programming experience to know in advance. On the 6502, I used a 4-bit RTC IC in the late 1980's in a design for a commercial product, the Saronix RTC58321 IIRC. There was no problem interfacing the 4-bit RTC to the 6502's 8-bit bus. You just AND-out the high nybble upon reading. When you store, the high 4 bits don't matter, because nothing is listening to them. One thing I'm not sure how you'd handle, though, is when you want to test input bytes by using the Negative or oVerflow flags like the 65xx does with the BIT instruction (which copies bits 7 and 6 of the operand into N and V). If you had 32-bit everything except I/O, would you have more flag bits to see if bit 7, 15, 23, and 31 are set, and be able to branch on those?
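
The masking described above, and a software version of the bit 7/15/23/31 test, might look like the sketch below. The function names are invented for illustration, and it leaves open the real question of whether the hardware should provide extra flags for this.

```python
def read_rtc_digit(bus_value: int) -> int:
    """Keep the low nybble; a 4-bit device leaves the high nybble undefined."""
    return bus_value & 0x0F


def lane_top_bits(word: int) -> tuple:
    """Top bit of each 8-bit lane of a 32-bit word: bits 7, 15, 23, 31."""
    return tuple(bool(word & (1 << bit)) for bit in (7, 15, 23, 31))


print(read_rtc_digit(0xF7))
print(lane_top_bits(0x80000080))
```

Without dedicated flags, each lane test costs an AND with a constant followed by a branch on zero, where the 6502's BIT gets bit 7 and bit 6 in one instruction.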

I have wondered about some of this. If Mike gets his processor into an FPGA, what things will come to light as we start developing real programs for real hardware? Our topic on 6502.org that developed into the 65Org32 made me want to make it first in a microcontroller, to experiment with programming matters. The performance would be terrible, but performance would not be the point initially. The point would be to see what programming problems might come up.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources


Tue Aug 09, 2016 9:08 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 1531
Location: Canada
I generally regard 16 bit+ values as characters and not bytes, so my poll is a little biased in the size of bytes. I wasn't thinking 32 bits would be a byte but I guess it could be.
I use 16-bit characters for text in software as a compromise between storage efficiency and supporting multiple languages. In my toolsets, if you define a character (e.g. a string) it's automatically 16-bit. With 16-bit chars, especially when they're automatically generated, I have less use for bytes. But byte quantities still show up in data structures (database records). It's a good way to represent some constants (50+% fit into a byte). For instance, suppose you need to store just the day of a month: is it going to be stored as a 32-bit int or an eight-bit byte? There are many cases where the options to be selected fit in eight bits or less.

32 bits for characters seems like overkill, but I suppose if one day data is needed from a galactic database maybe a larger character size would be required. Characters could also be represented directly as bitmaps (eg 16x16). Hmm, maybe I should switch to (32x32) bitmap storage. Less ambiguity between characters.

I am using 5-bit nybbles as a compact way of storing type information in a compiler, because it can be compressed into alphabetic characters which are readable. Three nybbles fit into a 16-bit character. But it's an unusual application.
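
For anyone wanting to try the packing, a minimal sketch follows. The field order (first value in the low bits) and the function names are assumptions for the example, not the layout the compiler actually uses.

```python
def pack3(a: int, b: int, c: int) -> int:
    """Pack three 5-bit values into the low 15 bits of a 16-bit character."""
    for value in (a, b, c):
        if not 0 <= value < 32:
            raise ValueError("each field must fit in 5 bits")
    return (c << 10) | (b << 5) | a


def unpack3(ch: int) -> tuple:
    """Recover the three 5-bit fields in the order they were packed."""
    return ch & 0x1F, (ch >> 5) & 0x1F, (ch >> 10) & 0x1F


packed = pack3(1, 17, 31)
print(hex(packed), unpack3(packed))
```

Three fields use 15 bits, so the top bit of each 16-bit character stays free.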

_________________
Robert Finch http://www.finitron.ca


Tue Aug 09, 2016 4:03 pm

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1647
I'm interested in computer history, and there have historically been lots of widths, and machines which are word-based and machines which are not. Packing 10 6-bit chars into a 60 bit word did make sense at one point.

But today, we find RAMs supplied as x8 or x16 widths, or x9 inside an FPGA. So for me, x8, x16 or x32 are the most interesting possibilities. I think a word-based machine has a lot of merit and is worth exploring. It's obvious that some masking and shifting can allow access to any kind of packed data when we really need to - you just lose some performance and code density as a tradeoff for data density.
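
The masking and shifting is short enough to sketch, using the ten-6-bit-chars-in-a-60-bit-word case as the example. The ordering (first character in the top bits) and the names are assumptions, and the codes are plain 6-bit numbers rather than any particular historical character set.

```python
CHAR_BITS = 6
CHARS_PER_WORD = 10


def pack_word(codes: list) -> int:
    """Pack ten 6-bit codes into one 60-bit word, first code in the top bits."""
    if len(codes) != CHARS_PER_WORD:
        raise ValueError("need exactly ten codes")
    word = 0
    for code in codes:
        if not 0 <= code < (1 << CHAR_BITS):
            raise ValueError("each code must fit in 6 bits")
        word = (word << CHAR_BITS) | code
    return word


def extract_char(word: int, index: int) -> int:
    """Shift the wanted field down to bit 0, then mask off everything above it."""
    shift = CHAR_BITS * (CHARS_PER_WORD - 1 - index)
    return (word >> shift) & ((1 << CHAR_BITS) - 1)


word = pack_word(list(range(1, 11)))
print([extract_char(word, i) for i in range(CHARS_PER_WORD)])
```

Each access costs a shift and an AND on top of the word load, which is the performance and code density being traded for data density.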


Sun Aug 14, 2016 7:35 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 1531
Location: Canada
BigEd wrote:
I think a word-based machine has a lot of merit and is worth exploring. It's obvious that some masking and shifting can allow access to any kind of packed data when we really need to - you just lose some performance and code density as a tradeoff for data density.

A problem with word oriented machines is that store cycles must perform read-modify-write operations for anything less than a word in size. It turns stores into loads as well as stores. It's really nice to have store instructions that can handle 8, 16, 32 bit data with a single bus cycle. Maybe an asymmetrical instruction set is due.
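
To show where the extra cycle comes from, here is a toy word-addressed memory that counts bus cycles. It is a sketch with invented names, assuming 32-bit words and 8-bit lanes numbered from the low end.

```python
class WordMemory:
    """32-bit word-addressed memory that counts bus cycles."""

    def __init__(self, words: int) -> None:
        self.data = [0] * words
        self.cycles = 0

    def load_word(self, addr: int) -> int:
        self.cycles += 1
        return self.data[addr]

    def store_word(self, addr: int, value: int) -> None:
        self.cycles += 1
        self.data[addr] = value & 0xFFFFFFFF

    def store_byte(self, addr: int, lane: int, value: int) -> None:
        """Replace one 8-bit lane (0 = lowest) of a word: a load plus a store."""
        shift = 8 * lane
        old = self.load_word(addr)                         # read
        merged = (old & ~(0xFF << shift)) | ((value & 0xFF) << shift)  # modify
        self.store_word(addr, merged)                      # write


mem = WordMemory(4)
mem.store_word(0, 0x11223344)
before = mem.cycles
mem.store_byte(0, 1, 0xAB)
print(hex(mem.data[0]), "byte store cost", mem.cycles - before, "bus cycles")
```

With byte-enable lines on the bus, the same byte store would be a single write cycle.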

_________________
Robert Finch http://www.finitron.ca


Sun Aug 14, 2016 4:17 pm

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1647
I don't think a word-based machine need necessarily support accesses other than aligned word accesses - leave it to software to do the read-modify-write, which we presume to be relatively rare and not performance critical.

There's a whole range of choices available, from wide machines which support a range of narrow and unaligned accesses to machines which support only a single width. Indeed, the 6502 is fairly resolutely(*) a byte-wide machine. It supports BCD for add and subtract, but if you want to pack, unpack or update nibbles then you will need to do the work yourself.

(*) OK, indirection does use pairs of bytes in zero page to act as 16 bit wide addresses.


Sun Aug 14, 2016 4:22 pm

Joined: Tue Dec 11, 2012 8:03 am
Posts: 285
Location: California
robfinch wrote:
A problem with word oriented machines is that store cycles must perform read-modify-write operations for anything less than a word in size. It turns stores into loads as well as stores. It's really nice to have store instructions that can handle 8, 16, 32 bit data with a single bus cycle. Maybe an asymmetrical instruction set is due.

This is something we talked about regarding the 65Org32: that there would not really be anything less than 32 bits. You cannot store just 8 bits, because each memory address holds 32 bits. That's wasteful for straight text, but memory is cheap today, and it's the price you pay if you really want to avoid the extra R-M-W time. So what about something like storing to an 8-bit port? The high 24 bits simply don't matter. There's no reason to read 32 bits, modify only the low 8 bits, and store it all back, because there's nothing listening to the high 24 bits. Just store the whole thing, without first reading, and the high 24 bits are lost in space, which is fine. For reading I/O, however, it might sometimes be good to have instructions able to read the low 8 bits and fill the rest in with 0's.
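
A small model of that port behaviour, as a sketch only: the class and function names are invented, and the random value standing in for the undriven upper bus lines is just one way to show that software must not rely on them.

```python
import random


class Port8:
    """8-bit latch attached to the low byte of a 32-bit data bus."""

    def __init__(self) -> None:
        self.latch = 0

    def bus_write(self, word32: int) -> None:
        # One write cycle, no read first: the device only sees bits 0..7.
        self.latch = word32 & 0xFF

    def bus_read(self) -> int:
        # Nothing drives bits 8..31, so model them as garbage.
        floating = random.getrandbits(24) << 8
        return floating | self.latch


def load_low_byte(port: Port8) -> int:
    """What a 'read low 8 bits, zero-fill the rest' instruction would return."""
    return port.bus_read() & 0xFF


port = Port8()
port.bus_write(0xDEADBE42)
print(hex(port.latch), hex(load_low_byte(port)))
```

Without such an instruction, the zero-fill is simply an explicit AND with 0xFF after every port read.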

_________________
http://WilsonMinesCo.com/ lots of 6502 resources


Sun Aug 14, 2016 7:20 pm

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1647
Something which is easily missed when discussing what features would be nice to have: every extra feature has a chance of causing some drop in clock speed. There may be a point where it really isn't worth adding a feature: doing the job in software will result in a faster machine overall because the clock will be faster. This is of course the reason behind RISC, but it's much more general than that. And there's another cost: every extra feature is a challenge to verification, or an opportunity to misbehave, depending on how you look at it. And then, at bottom, more features means more work!


Sun Aug 14, 2016 7:31 pm