 Byte Addressable Memory 

Joined: Sat Jun 16, 2018 2:51 am
Posts: 50
Hi! I'm looking for some insight into how systems that use an address bus that is larger than 8 bits in size achieve byte addressing.

For example, consider the case of a 32-bit system that implements the x86 architecture (and uses 32-bit internal registers). From what I've gathered by searching the web, even though you specify memory addresses in bytes when using x86 assembly, the system does not fetch single bytes from memory. Instead it fetches 32-bit words from memory (four bytes at a time).

Consider the code below. It loops through the byte string "helloWorld" and prints each character on its own line. Register ebx is set to point to the location of the string in memory. Suppose that the string resides at byte address 100.

Register ecx is loaded with the current character (that ebx is pointing to) so that we can compare it against zero as a way to check when we have reached the end of the string. Specifically, the lower byte of register ecx (named cl) is loaded with the character.

Is the following diagram how the string's bytes are loaded from memory? If so, how does cl (the lowest byte of register ecx) get the appropriate character when the character's index modulo 4 does not equal zero? I.e. how do the ?? below end up with the correct character? Is some byte shifting performed behind the scenes by the system?

Code:
desired    | byte address  |  actual address  |  value loaded  |  value loaded
character  | requested     |  sent to memory: |  into ecx      |  into cl
           |               |  floor( a / 4 )  |  register      |  (lowest byte)
-----------|---------------|------------------|----------------|---------------
h          |     100       |         25       |  (l, l, e, h)  |   h
e          |     101       |         25       |  (l, l, e, h)  |   ??
l          |     102       |         25       |  (l, l, e, h)  |   ??
l          |     103       |         25       |  (l, l, e, h)  |   ??
o          |     104       |         26       |  (r, o, W, o)  |   o
W          |     105       |         26       |  (r, o, W, o)  |   ??
o          |     106       |         26       |  (r, o, W, o)  |   ??
r          |     107       |         26       |  (r, o, W, o)  |   ??
l          |     108       |         27       |  (., 0, d, l)  |   l
d          |     109       |         27       |  (., 0, d, l)  |   ??
0x0        |     110       |         27       |  (., 0, d, l)  |   ??



Full assembly code for reference:
Code:
global _start

SYS_EXIT  equ 60
EXIT_CODE equ 0
SYS_WRITE equ 1
STD_OUT   equ 1

section .data

   newline db 0xA
   msg     db "helloWorld", 0x0

section .text

   ; Based on https://stackoverflow.com/a/7614431
   _start:

      mov ebx, msg            ; pointer to string

   mLoop:

      mov cl, [ ebx ]         ; Read the next byte from memory

      cmp cl, 0               ; Compare the byte to null (the terminator)
      je exit                 ; If the byte is null, jump out of the loop

      ; Output the character
      mov eax, SYS_WRITE
      mov edi, STD_OUT
      mov esi, ebx            ; pointer to character...
      mov edx, 1
      syscall

      ; Output new line
      mov eax, SYS_WRITE
      mov edi, STD_OUT
      mov esi, newline
      mov edx, 1
      syscall

      ; Move to the next byte in the string
      add ebx, 1
      jmp mLoop


   exit:

      mov eax, SYS_EXIT
      mov edi, EXIT_CODE
      syscall


*Disclaimer: the code might not work as is. It was originally written for and tested in 64-bit mode. I swapped the registers to their 32-bit names for brevity (for example, rcx to ecx).


Tue Mar 26, 2019 12:20 am

Joined: Tue Dec 31, 2013 2:01 am
Posts: 116
Location: Sacramento, CA, United States
Quote:
Hi! I'm looking for some insight into how systems that use an address bus that is larger than 8 bits in size achieve byte addressing.
No, I am relatively certain that the number of bits in the address bus dictates the number of addressable memory cells, not the width of those cells. The most popular cell width is still 8 bits, at least in the 65xx, 68xx, x86, 80xx, and Z80 crowds ...

Quote:
From what I've gathered by searching the web, even though you specify memory addresses in bytes when using x86 assembly, the system does not fetch single bytes from memory. Instead it fetches 32-bit words from memory (four bytes at a time).
I am not an x86 programmer, but that statement does not agree with my understanding of how the x86 works. AIUI, the destination register 'cl' in the mov instruction is only 8 bits wide, so the x86 should effectively only be reading one byte from RAM and depositing it in cl, at least from a programmer's point of view. If you had specified 'cx' or 'ecx' or 'rcx' as a destination register, the x86 would recognize the width of the register and read the appropriate number of bytes from RAM, starting at the byte address specified in the effective address. The byte at [ea] would be deposited into cl, [ea+1] into ch, [ea+2] and [ea+3] into bits 16:31 of ecx, etc. From a hardware standpoint, there are issues of "aligned addresses" which can allow some versions of x86 to greatly improve performance by copying more than one byte simultaneously, but I'll leave that discussion to those more experienced in the subject.
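
A minimal sketch of that programmer's-eye view, written in Python so it can be run anywhere. The `memory` array, the `load` helper, and the placement of the string at byte address 100 are assumptions made for illustration. It models only the little-endian byte ordering described above, not what the bus actually does.

```python
# Hypothetical byte-addressable memory: one slot per byte address.
memory = bytearray(128)
memory[100:111] = b"helloWorld\x00"   # string assumed to live at address 100


def load(ea, width):
    """Read `width` bytes starting at effective address `ea`, little-endian."""
    return int.from_bytes(memory[ea:ea + width], "little")


cl = load(101, 1)     # like: mov cl,  [ebx]  with ebx = 101
ecx = load(101, 4)    # like: mov ecx, [ebx]  with ebx = 101

print(chr(cl))                    # e
print(hex(ecx))                   # 0x6f6c6c65
print(chr(ecx & 0xFF))            # e  (byte at [ea] is the cl part of ecx)
print(chr((ecx >> 8) & 0xFF))     # l  (byte at [ea+1] is the ch part of ecx)
```

The same address gives a different amount of data depending only on the width requested, which is the point being made about the destination register.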

Mike B.


Last edited by barrym95838 on Tue Mar 26, 2019 6:08 am, edited 1 time in total.



Tue Mar 26, 2019 5:37 am

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1780
It's a good question, quadrant. (I'm sure you meant to speak of a wide data bus rather than mention the width of the address bus.)

[My answer is coming out a little different from yours Mike, which is not to say I think you're wrong!]

I think I'm right in saying that there are several ways for an architecture to offer byte access. In your example, you are loading a wide register at a byte address which isn't aligned with the size of the register. That's called an unaligned access.
- some architectures will forbid an unaligned access, so you get an exception
- some will barrel-roll the databus: the word containing the byte in question is rotated by 8, 16 or 24 bit shifts: for address 0x0101 'ELLH' or rather 'H', 'L', 'L', 'E' with 'E' in the least significant byte
- some will access the word containing the byte of interest and also the next word, and shift the pair to give you the byte you want and the three bytes after it: for address 0x0101, 'ELLO' or rather O, L, L, E, again with 'E' in the least significant byte
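
The three behaviours in the list can be sketched in Python against a word-organized memory. This is an illustration under assumptions, not a model of any particular CPU: the function names, the `AlignmentFault` exception, and the string sitting at byte address 100 are all hypothetical, and words are taken to be 32-bit little-endian.

```python
WORD_MASK = 0xFFFFFFFF

# Assumed layout: "helloWorld" at byte address 100, held as 32-bit words.
raw = bytearray(128)
raw[100:111] = b"helloWorld\x00"
words = [int.from_bytes(raw[i:i + 4], "little") for i in range(0, len(raw), 4)]


class AlignmentFault(Exception):
    """Hypothetical exception standing in for a hardware alignment trap."""


def load_trap(addr):
    # Option 1: forbid unaligned word loads.
    if addr % 4 != 0:
        raise AlignmentFault(f"unaligned word load at {addr}")
    return words[addr // 4]


def load_rotate(addr):
    # Option 2: fetch one word and rotate it right by 8, 16 or 24 bits.
    word = words[addr // 4]
    shift = 8 * (addr % 4)
    return ((word >> shift) | (word << (32 - shift))) & WORD_MASK


def load_two_words(addr):
    # Option 3: fetch this word and the next, then shift the 64-bit pair.
    low = words[addr // 4]
    high = words[addr // 4 + 1]
    shift = 8 * (addr % 4)
    return (((high << 32) | low) >> shift) & WORD_MASK


# Bytes printed from least significant to most significant.
print(load_rotate(101).to_bytes(4, "little"))      # b'ellh'
print(load_two_words(101).to_bytes(4, "little"))   # b'ello'
```

In both non-trapping cases 'e' ends up in the least significant byte. Only the two-word version returns the four bytes that actually follow the requested address.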

Getting an unaligned word is going to cost two accesses. In a machine with a cache, the penalty might be less if the two words are in the same cache line.

Writes are even worse, because the machine needs to read the affected word, modify it, and then write it back, unless the memory system provides byte enables for writes, which it might do.
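
A rough Python sketch of the two write paths, assuming 32-bit little-endian words. The function names and the 4-bit `enables` mask are hypothetical. In the byte-enable version the merge is done here in software, but it stands for what the memory array does internally, so the processor issues a single write and no prior read.

```python
def store_byte_rmw(words, addr, value):
    """Read-modify-write: the processor does a read and then a write."""
    index, lane = divmod(addr, 4)
    shift = 8 * lane
    word = words[index]                           # read the whole word
    word &= ~(0xFF << shift)                      # clear the target byte
    word |= (value & 0xFF) << shift               # insert the new byte
    words[index] = word                           # write the whole word back


def store_with_byte_enables(words, index, data, enables):
    """Single write: only lanes whose enable bit is set are updated."""
    word = words[index]
    for lane in range(4):
        if enables & (1 << lane):
            lane_mask = 0xFF << (8 * lane)
            word = (word & ~lane_mask) | (data & lane_mask)
    words[index] = word


a = [0x6C6C6568]                                  # bytes 'h', 'e', 'l', 'l'
store_byte_rmw(a, 1, ord("E"))

b = [0x6C6C6568]
store_with_byte_enables(b, 0, 0x45454545, 0b0010)  # enable lane 1 only

print(hex(a[0]), hex(b[0]))                       # 0x6c6c4568 0x6c6c4568
```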

But note: you could have chosen to load only a byte-wide register at an unaligned address, instead of loading a word-wide register. The same analysis applies, but there's no longer a need to read two words.
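
For the byte-wide case, here is a small Python sketch of how the low address bits can pick the byte lane out of a single fetched word, which is one way the `??` entries in the table get filled in. The layout (string at byte address 100, 32-bit little-endian words) and the `load_byte` helper are assumptions for illustration.

```python
# Assumed layout: "helloWorld" at byte address 100, held as 32-bit words.
raw = bytearray(128)
raw[100:111] = b"helloWorld\x00"
words = [int.from_bytes(raw[i:i + 4], "little") for i in range(0, len(raw), 4)]


def load_byte(addr):
    # Upper address bits select the word, the low two bits select the lane.
    word_address, lane = divmod(addr, 4)
    return (words[word_address] >> (8 * lane)) & 0xFF


# Same loop as the assembly: walk the string until the null terminator.
addr = 100
chars = []
while load_byte(addr) != 0:
    chars.append(chr(load_byte(addr)))
    addr += 1

print("".join(chars))   # helloWorld
```

Each byte load touches exactly one word, whatever the value of the address modulo 4.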


Tue Mar 26, 2019 6:06 am

Joined: Fri May 05, 2017 7:39 pm
Posts: 22
You can find a detailed description of how the 32-bit 68020 handles this in the User Manual (ch. 5.2).


Tue Mar 26, 2019 1:32 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
The processor / I/O may also reflect narrow data multiple times on a wider data bus. Rather than just shifting an eight-bit value, the same eight-bit value will be output on all byte lanes. Which lane gets written is usually controlled by a select signal (sel_o for WISHBONE bus).
Getting or setting I/O values has a similar issue when the bus widths mismatch. Suppose there's an eight-bit keyboard controller connected to a 64-bit bus. To write to the I/O device, the CPU reflects the same data byte across all eight byte lanes when outputting data. Even though it may be trying to write byte #3, byte #0 has the same contents, so the keyboard data bus needs to be connected only to bits 0 to 7. The keyboard must check the low-order address bits or the byte lane select signals to determine which register to write. On readback, the keyboard's data bits must be copied to all byte lanes of the CPU. Following is code for feeding a 64-bit data bus from 8-, 32-, and 64-bit peripherals.
Code:
always @(posedge cpu_clk)
casez({rnd_ack,cs_led,kbd_ack,aud_ack})
4'b1???:   br2_dati <= {2{rnd_dato}};   // 32 bits reflected twice
4'b01??:   br2_dati <= {2{led_dato}};   
4'b001?:   br2_dati <= {8{kbd_dato}};   // 8 bits reflect 8 times
4'b0001:   br2_dati <= aud_cdato;         // 64 bit peripheral
default:   br2_dati <= br2_dati;
endcase
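
The lane reflection can also be modelled in a few lines of Python. The `replicate` and `cpu_read_lane` helpers are hypothetical names, and this is only a software sketch of the idea, not a description of the Verilog above beyond the `{2{...}}` and `{8{...}}` replication it performs.

```python
def replicate(value, width_bits, bus_bits=64):
    """Copy a narrow value onto every lane of a wider bus, like {N{value}}."""
    mask = (1 << width_bits) - 1
    bus = 0
    for i in range(bus_bits // width_bits):
        bus |= (value & mask) << (i * width_bits)
    return bus


def cpu_read_lane(bus_value, addr):
    """The CPU picks the byte lane named by the low three address bits."""
    lane = addr % 8
    return (bus_value >> (8 * lane)) & 0xFF


kbd = replicate(0x41, 8)              # 8-bit peripheral, reflected 8 times
rnd = replicate(0xDEADBEEF, 32)       # 32-bit peripheral, reflected twice

print(hex(kbd))                       # 0x4141414141414141
print(hex(rnd))                       # 0xdeadbeefdeadbeef
print(hex(cpu_read_lane(kbd, 3)))     # 0x41
```

Because every lane carries the same byte, the CPU reads the right value whichever lane its address selects, and no shifter is needed on the peripheral side.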

_________________
Robert Finch http://www.finitron.ca


Tue Mar 26, 2019 7:27 pm

Joined: Sat Jun 16, 2018 2:51 am
Posts: 50
barrym95838 wrote:
No, I am relatively certain that the number of bits in the address bus dictates the number of addressable memory cells, not the width of those cells. The most popular cell width is still 8 bits, at least in the 65xx, 68xx, x86, 80xx, and Z80 crowds ...
Ohhh. I was under the impression that memory was 32/64 bits wide. I didn't realize it was actually 8-bit.

@BigEd Thanks for the insight! Do you know which of the three is the most popular approach? It seems forbidding unaligned access (and throwing an error) would be the simplest...

GaBuZoMeu wrote:
You can find a detailed description of how the 32-bit 68020 handles this in the User Manual (ch. 5.2).
Thank you, will give it a read.


Tue Mar 26, 2019 8:08 pm

Joined: Sat Jun 16, 2018 2:51 am
Posts: 50
robfinch wrote:
Rather than just shifting an eight-bit value, the same eight-bit value will be output on all byte lanes. Which lane gets written is usually controlled by a select signal (sel_o for WISHBONE bus).
Oh, that's neat!


Tue Mar 26, 2019 8:12 pm

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1780
I think forbidding unaligned accesses is the kind of thing a fresh-start RISC machine will do. Whereas a machine with layers of legacy, like x86, has to allow them.

Here's a doc on what they are and how they can arise and how to avoid them in C:
https://www.kernel.org/doc/Documentatio ... access.txt
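
One common way unaligned accesses arise is a packed record layout, where a multi-byte field follows a single byte with no padding. A small Python sketch using the standard `struct` module shows the difference between packed and natively aligned layouts. The one-byte tag followed by a 32-bit value is a made-up record for illustration, and the native size depends on the platform.

```python
import struct

# Packed little-endian layout: 1-byte tag, then a 32-bit value at offset 1.
packed = struct.pack("<BI", 0x01, 0x11223344)
packed_size = struct.calcsize("<BI")

# Native layout: the compiler-style rules may pad so the 32-bit field is aligned.
native_size = struct.calcsize("@BI")

# Reading the 32-bit field from the packed buffer means a load at offset 1,
# which is not a multiple of 4.
value = struct.unpack_from("<I", packed, 1)[0]

print(packed_size)    # 5
print(native_size)    # platform dependent, commonly 8
print(hex(value))     # 0x11223344
```

Python hides the cost, but the equivalent load in C on a strict-alignment machine is exactly the case that needs special handling.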

Some nice details in this slide set:
https://www.cl.cam.ac.uk/teaching/2004/ ... ynotes.pdf
(Edit: start at page 65 for unaligned accesses, but for a big lecture course on comparing architectures, read it all!)


Tue Mar 26, 2019 8:53 pm

Joined: Fri May 05, 2017 7:39 pm
Posts: 22
IIRC the 68000 does not allow odd values for the system stack pointer.


Wed Mar 27, 2019 12:16 am