 Sprite / Cursor Controllers 

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2232
Location: Canada
This post is a bit off topic, but it’s related to the test system.

Optional mixing of audio channels 0 and 1, or 2 and 3, was added as per Cray Ze’s suggestion. The two channels are summed and the result is divided by two to keep it in range. Amplitude and frequency modulation options were also added to the audio channels: channel 0 can modulate channel 1, channel 1 can modulate channel 2, and channel 2 can modulate channel 3.
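
The add-then-halve mixing can be sketched roughly as follows. This is only an illustration, assuming signed 16-bit samples; the module and port names (audio_mix2, ch_a, ch_b, mix_en) are hypothetical and not taken from the actual core.

```verilog
// Hypothetical sketch: average two signed 16-bit audio channels.
// Sample width and all names are assumptions, not the real core.
module audio_mix2 (
    input  wire signed [15:0] ch_a,
    input  wire signed [15:0] ch_b,
    input  wire               mix_en,  // 1 = output the mix, 0 = pass ch_a through
    output wire signed [15:0] out
);
    // Widen by one bit so the sum cannot overflow.
    wire signed [16:0] sum = ch_a + ch_b;

    // Dropping the low bit divides by two (rounding toward minus infinity),
    // which always brings the result back into the 16-bit range.
    assign out = mix_en ? sum[16:1] : ch_a;
endmodule
```

Widening the sum before halving is the important part; halving each channel first would also stay in range but loses one bit of precision per channel.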

Added a simple paged memory management unit to the system. The lowest page of memory, where the interrupt vectors are, and the upper quarter of the memory space are unmapped. The mmu doesn’t page fault on missing pages and doesn’t raise exceptions for protection violations. It does allow mapping of addresses and write-protecting pages. It also supports two page sizes, 4kB and 4MB. The mmu uses a fully associative 32 entry cache for translated addresses. Translations take place without incurring additional clock cycles, so the mmu may end up lengthening the clock cycle time. If there is a translation miss it takes about 20 or more clock cycles to walk the page tables and get the translation. The exact count depends on the memory accesses, each of which takes several clock cycles.

_________________
Robert Finch http://www.finitron.ca


Mon Dec 04, 2017 9:59 pm WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2232
Location: Canada
Added sixteen sprites to the core. The sprites are up to 32 x 32 in size with three colors. They only require four clock cycles to load the shift register. It’s estimated there are enough clock cycles during the horizontal blank period (336) to support up to 76 sprites per line. Just how many to support is a question. The cascaded logic that determines the pixel color output is probably what sets the limit. TI’s 99x8 chip supports eight per line and more than that onscreen at the same time. The C64 and Amiga support eight sprites. I’ve tried up to 32 before with another controller. Up to three sprites may be linked together to increase the color selection to 63 colors. Each linked sprite contributes two bits for selection. The sprites have to be positioned at the same coordinates for the link to work correctly.
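
To make the linking arithmetic concrete: three sprites at two bits each give a six-bit index, which is 64 combinations, or 63 colors if the all-zero value is treated as transparent. The sketch below assumes that transparency convention and an arbitrary bit order; the names (sprite_link3, pix0..pix2, color_ndx) are hypothetical.

```verilog
// Hypothetical sketch: combine the 2-bit pixels of three linked sprites
// into one 6-bit color index. Bit order and names are assumptions.
module sprite_link3 (
    input  wire [1:0] pix0,       // pixel bits from the first linked sprite
    input  wire [1:0] pix1,       // pixel bits from the second linked sprite
    input  wire [1:0] pix2,       // pixel bits from the third linked sprite
    output wire [5:0] color_ndx,  // 0 = transparent (assumed), 1..63 = colors
    output wire       visible
);
    // All three sprites sit at the same coordinates, so their pixel
    // bits line up and can simply be concatenated.
    assign color_ndx = {pix2, pix1, pix0};

    // Index zero is assumed to mean "no sprite pixel here".
    assign visible = |color_ndx;
endmodule
```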

Cascade logic (priority encoder) sample:
Code:
always @(posedge clk)
   rgb <= blank        ? 15'h0000 :
          border       ? borderColor :
          cursor       ? (cursor_color0[15] ? rgb_i[14:0] ^ 15'h7FFF : cursor_color0) :
          cursor_on[0] ? cursorColor[0] :
          cursor_on[1] ? cursorColor[1] :
          cursor_on[2] ? cursorColor[2] :
          cursor_on[3] ? cursorColor[3] :
          cursor_on[4] ? cursorColor[4] :
          cursor_on[5] ? cursorColor[5] :
          cursor_on[6] ? cursorColor[6] :
          cursor_on[7] ? cursorColor[7] :
          rgb_i[14:0];

Calling the mmu initialization routine causes the machine to hang. However, it’s not hanging in the initialization routine; it hangs later, after it returns from the routine. This is strange because it still hangs even if the mmu isn’t turned on. However, removing the call to the initialization routine allows the machine to work. It’s strange because all the initialization does is set up page tables in memory. There isn’t any hardware device being accessed.

_________________
Robert Finch http://www.finitron.ca


Tue Dec 05, 2017 7:21 am WWW

Joined: Fri May 08, 2015 6:22 pm
Posts: 61
Looking at your code, it occurred to me that there is a direct correlation between the input bits and the output bits of your priority encoder.
It should be easy to work out the bits for the cursorColor entry in parallel.

This is not even pseudocode, but more of a thought dump to explain the pattern I see.

EDIT: Now I think I'm seeing nothing as everything seems to cancel out, I'll leave it here just in case it shows something other than crazy thoughts.

Code:
             cursor_on[0] ? cursorColor[000] :


   if 1 not 0 then bit 0 = 1
             cursor_on[1] ? cursorColor[001] :


   if 2 to 3 not 0 then bit 1 = 1
             cursor_on[2] ? cursorColor[010] :
             cursor_on[3] ? cursorColor[011] :


   if 4 to 7 not 0 then bit 2 = 1
             cursor_on[4] ? cursorColor[100] :
             cursor_on[5] ? cursorColor[101] :
             cursor_on[6] ? cursorColor[110] :
             cursor_on[7] ? cursorColor[111] :


Tue Dec 05, 2017 9:35 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2232
Location: Canada
Quote:
Looking at your code, it occurred to me that there is a direct correlation between the input bits and the output bits of your priority encoder.

Oops. I re-wrote the code so it works differently now (there's still a priority encoding though). One thought I had was to split the priority encoding across multiple clock cycles. I added more cursor attributes to the color. The cursor now has flash, flash rate, reverse video, and alpha blending attributes, all coming from a color palette. Each cursor gets its own group of colors from the palette, but colors are shared if the cursors are linked, an idea borrowed from the Amiga. I broke the code up somewhat; it was just too much to do in a single line.
Cursor color selection is coded like this:
Code:
wire [22:0] cursorColorOut =
                cursor_color[
                cursor_on[0] ? cursorColorNdx[0] :
                cursor_on[1] ? cursorColorNdx[1] :
                cursor_on[2] ? cursorColorNdx[2] :
                cursor_on[3] ? cursorColorNdx[3] :
                cursor_on[4] ? cursorColorNdx[4] :
                cursor_on[5] ? cursorColorNdx[5] :
                cursor_on[6] ? cursorColorNdx[6] :
                cursor_on[7] ? cursorColorNdx[7] :
                cursor_on[8] ? cursorColorNdx[8] :
                cursor_on[9] ? cursorColorNdx[9] :
                cursor_on[10] ? cursorColorNdx[10] :
                cursor_on[11] ? cursorColorNdx[11] :
                cursor_on[12] ? cursorColorNdx[12] :
                cursor_on[13] ? cursorColorNdx[13] :
                cursor_on[14] ? cursorColorNdx[14] :
                cursorColorNdx[15]];


With all the attributes and a large number of cursors, the number of logic levels is bound to be high. It may not work.
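
One way the idea of splitting the priority encoding across clock cycles might look is a two-stage pipeline: resolve each group of four sprites first, then pick the first group with a hit. This is only a sketch under assumptions; the module name, the group size of four, and the two-clock latency are all hypothetical, and the rest of the pixel path would need to be delayed by the same two clocks.

```verilog
// Hypothetical sketch: 16-way priority select split over two clock stages.
// on[0] has the highest priority, as in the chain above.
module sprite_sel16_pipe (
    input  wire        clk,
    input  wire [15:0] on,
    output reg  [3:0]  ndx,   // winning sprite number, two clocks after 'on'
    output reg         any    // 1 if any sprite was on
);
    reg [3:0] grp_hit;          // one hit flag per group of four
    reg [1:0] grp_ndx [0:3];    // winner within each group
    integer g;

    // Stage 1: each group of four is resolved independently,
    // so the chain here is only four deep.
    always @(posedge clk)
        for (g = 0; g < 4; g = g + 1) begin
            grp_hit[g] <= |on[4*g +: 4];
            grp_ndx[g] <= on[4*g]   ? 2'd0 :
                          on[4*g+1] ? 2'd1 :
                          on[4*g+2] ? 2'd2 : 2'd3;
        end

    // Stage 2: pick the first group that reported a hit.
    // With no hits at all this falls through to index 15,
    // matching the default arm of the single-cycle chain.
    always @(posedge clk) begin
        any <= |grp_hit;
        ndx <= grp_hit[0] ? {2'd0, grp_ndx[0]} :
               grp_hit[1] ? {2'd1, grp_ndx[1]} :
               grp_hit[2] ? {2'd2, grp_ndx[2]} :
                            {2'd3, grp_ndx[3]};
    end
endmodule
```

The palette lookup would then be something like cursor_color[cursorColorNdx[ndx]], gated by any.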

_________________
Robert Finch http://www.finitron.ca


Tue Dec 05, 2017 4:16 pm WWW

Joined: Fri May 08, 2015 6:22 pm
Posts: 61
If I expand my thought posted above properly and break the group of four into two, and then the groups of two into single checks,
I get something very similar in operation to the 74HC148 8 to 3 Line Priority Encoder. Perhaps not such a crazy thought.

There's a nice 74HC148 schematic on the following page, along with an example of expansion to more inputs.
http://www.learnabout-electronics.org/Digital/dig44.php
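
The halving idea can be written out as a small tree, sketched below for eight inputs. Note the assumptions: this version gives priority to the lowest-numbered input (to match the cursor_on chain earlier in the thread) and uses active-high signals, whereas the 74HC148 is active-low and gives priority to the highest-numbered input. The module and signal names are hypothetical.

```verilog
// Hypothetical sketch: 8-to-3 priority encoder built by repeated halving.
// on[0] has the highest priority.
module prienc8 (
    input  wire [7:0] on,
    output wire [2:0] ndx,   // index of the lowest set bit (7 if none set)
    output wire       any    // 1 if any input is set
);
    // Level 1: is there a hit in the low four? If not, look at the high four.
    wire       lo4  = |on[3:0];
    wire [3:0] sel4 = lo4 ? on[3:0] : on[7:4];

    // Level 2: same question for the chosen group of four.
    wire       lo2  = |sel4[1:0];
    wire [1:0] sel2 = lo2 ? sel4[1:0] : sel4[3:2];

    // Level 3: same question for the chosen pair.
    wire       lo1  = sel2[0];

    // Each index bit is 1 when the hit was NOT in the lower half at that level.
    assign ndx = {~lo4, ~lo2, ~lo1};
    assign any = |on;
endmodule
```

The encoded index could then select the color directly, for example cursorColor[ndx], with any deciding whether a cursor pixel is shown at all.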


Tue Dec 05, 2017 5:10 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2232
Location: Canada
'148 encoder is a good little chip. I used it to encode interrupts for the 68k.

Spent some time studying Amiga playfields. I was wondering about sprite-background collision detection and whether it was worth it to add to the controller. In fact it seems that software collision detection rules, so hardware detection is of limited value. I was also wondering about sprite-background display priorities and how to implement this in the controller. The Amiga can set up two playfields so that some of the display appears in front of sprites.

Rather than using playfields I’m contemplating using the idea of a z-buffer associated with the display in addition to the regular screen buffer. After all, each displayed pixel can be only one color, and rendering multiple screen buffers to get multiple layers of graphics would take too much memory bandwidth.
The z-buffer would provide display priority information for each pixel of the display. A two bit z-buffer could supply one of four display priorities for each screen pixel. Sprites could also have a z layer associated with them. So a display pixel with a z value of 3 would be behind everything. A sprite with a z value of 3 would appear behind pixels with a z value < 3. A two bit z buffer could supply the same effect as four playfields.
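
The priority rule just described can be sketched as a single comparison. The sketch assumes that smaller z means nearer the viewer and that a sprite wins a tie (which follows from a z = 3 pixel being behind everything); the module name, port names, and the 15-bit color width are hypothetical.

```verilog
// Hypothetical sketch: choose between a sprite pixel and a display pixel
// using 2-bit z values. Smaller z is assumed to be nearer the viewer.
module z_priority (
    input  wire        sprite_on,   // sprite has an opaque pixel here
    input  wire [1:0]  sprite_z,    // z layer assigned to the sprite
    input  wire [1:0]  pixel_z,     // z value read from the z-buffer
    input  wire [14:0] sprite_rgb,
    input  wire [14:0] pixel_rgb,
    output wire [14:0] rgb_o
);
    // A sprite is hidden only by display pixels with a strictly smaller z,
    // so a sprite with z = 3 is behind pixels with z < 3 but still in front
    // of pixels with z = 3.
    wire sprite_in_front = sprite_on && (sprite_z <= pixel_z);

    assign rgb_o = sprite_in_front ? sprite_rgb : pixel_rgb;
endmodule
```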

A problem is how to make the z-buffer available to the blitter. One option is to make it part of the display memory by reducing the available color selection. I’m not particularly fond of that idea, as it would make it difficult to use existing graphics, which wouldn’t understand the z-buffer bits.
Well, the z-buffer is implemented as a separate memory not accessible to the blitter at the moment.

_________________
Robert Finch http://www.finitron.ca


Wed Dec 06, 2017 10:54 am WWW

Joined: Fri May 08, 2015 6:22 pm
Posts: 61
A z-buffer makes sense, how are you handling the 2-bit width for storage and blitting? Weird sized blockRAM config, bit packing, something else?


Wed Dec 06, 2017 12:16 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2232
Location: Canada
Quote:
A z-buffer makes sense, how are you handling the 2-bit width for storage and blitting? Weird sized blockRAM config, bit packing, something else?

The block ram is triple ported: two read ports and one write port. At the moment (it may change) on the cpu/blitter side the ram looks like it's 32 bits wide read/write and holds either eight 4-bit or sixteen 2-bit values packed into the long word. On the display side the ram is 8 bits wide (the smallest port that was allowed) but there's a mux on the eight bit output to select either 4 bit z values or two bit z values. Because the z buffer is packed on the blitter side of the ram, the blitter can move z values around only in groups of 4 or 8.
I think the blitter has to be changed to support moving values smaller than 16 bits around. It would need to be able to operate on 1, 2, 4, or 16 bit values. Right now the blitter can read single bit values, which are promoted to a 16 bit black or white color, but it doesn't have a shift register to allow storing a bit value back to memory. This lets it use two value bitmaps for masking but it can't manipulate them.
I've been thinking of making the z-buffer ram a simple 32 bits wide with only two bits implemented. The reason is that some graphics operations use a z-buffer value and it might be a place to store the z-buffer for the graphics operation. Right now there isn't enough ram left in the FPGA to support a 32 bit z buffer.
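
For illustration, the display-side mux on the eight bit output might look something like the sketch below. It assumes the lowest-numbered pixel sits in the least significant bits of the byte and that the byte address comes from the higher pixel-position bits; the module and port names are hypothetical.

```verilog
// Hypothetical sketch: pick one z value out of a byte that holds either
// two 4-bit or four 2-bit z values. Packing order is an assumption.
module z_unpack (
    input  wire [7:0] zbyte,    // one byte from the display-side ram port
    input  wire [1:0] pix_sel,  // low bits of the pixel x position
    input  wire       z4,       // 1 = 4-bit z values, 0 = 2-bit z values
    output wire [3:0] z
);
    // 4-bit mode: two values per byte, selected by pix_sel[0].
    wire [3:0] z4v = pix_sel[0] ? zbyte[7:4] : zbyte[3:0];

    // 2-bit mode: four values per byte, selected by both bits of pix_sel.
    wire [1:0] z2v = zbyte[2*pix_sel +: 2];

    // 2-bit values are zero-extended to the common 4-bit output.
    assign z = z4 ? z4v : {2'b00, z2v};
endmodule
```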

_________________
Robert Finch http://www.finitron.ca


Wed Dec 06, 2017 7:41 pm WWW

Joined: Fri May 08, 2015 6:22 pm
Posts: 61
I had an idea you were getting close to your blockRAM capacity. Do you have any ROM in the FPGA that could be copied out to main RAM at boot time, freeing up some blockRAM in the FPGA for other things? The idea is to use preloaded RAM to hold the ROM data at boot; the ROM image is then copied out to system RAM (you could then write protect the range to be more ROM like), after which the blockRAM can be reassigned to a more useful function, like a z-buffer.

Sort of a moot point if you don't have the RAM, but when not using the full 32 bits of z-buffer for a certain mode, the extra bits could store the ID of the last sprite to collide with that particular pixel.
This could be a more useful hardware collision method than is used in many old systems.


Wed Dec 06, 2017 11:20 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2232
Location: Canada
It would be nice to be able to use the DDR ram for display purposes but it would be a bit more complicated to use. It might be fast enough if it were used at 128 bits wide, so multiple pixels could be fetched with a slower clock. But it would require completely redesigning the controller. Using the block ram, display access is really simple; it doesn’t need any fifos. Timing of the DDR ram is unpredictable, making bus master switching on a cycle by cycle basis not possible.

Something strange is happening now and I’ve yet to figure out the problem. Display output is doubled in size, but every other scanline and every other pixel column is blank. If I didn’t know better I’d say it was a problem with the monitor. I added the ability to set a lower resolution mode, so the supported modes are now 200x150, 400x300 and 800x600. I also added the capability to set the playfield bitmap width wider than the screen (viewport). I added the lower resolution to be able to test playfield width settings and scrolling. 200x150 mode only takes 60k ram, so about 16 screens can fit in the 1M display ram.

z buffer is now available to the blitter. A buffer switch flag was added to enable the same address logic to be used to address the z buffer as the regular buffer. So display address $0 refers to either the normal display buffer or the z buffer depending on the flag setting. Drawing commands know about the z buffer. The z buffer setting can be set by drawing lines or plotting points in addition to the cpu access to the z buffer. It does mean that to draw a line with a specific z buffer setting the line draw has to be done twice. Once to set the pixels and a second time after the flag is set to set the z buffer values.

Shift register capability was added to the blitter to allow processing of data less than 16 bits in size. So in theory the blitter should now be able to handle moving z buffer data around.
Quote:
...do you have any ROM in the FPGA that might be able to be copied...

Yes, there is a 64k bootrom that might possibly be swapped. There is also a smaller scratchpad ram used for bootstrapping stack. The bootrom could be made much smaller if I could get the system to load from an SD card or serial port, I haven't had much luck with either of those.

Quote:
...the extra bits could store the ID of the last sprite to collide with that particular pixel.
This could be a more useful hardware collision method than is used in many old systems.

Wouldn't that require searching memory for collisions ?

_________________
Robert Finch http://www.finitron.ca


Thu Dec 07, 2017 8:20 am WWW

Joined: Fri May 08, 2015 6:22 pm
Posts: 61
robfinch wrote:
It would be nice to be able to use the DDR ram for display purposes but it would be a bit more complicated to use. It might be fast enough if it were used at 128 bits wide, so multiple pixels could be fetched with a slower clock. But it would require completely redesigning the controller. Using the block ram, display access is really simple; it doesn’t need any fifos. Timing of the DDR ram is unpredictable, making bus master switching on a cycle by cycle basis not possible.

It's a shame they seem to have stopped putting SRAM on development boards, SDRAM/DDR RAM increases design complexity to the next level.
Quote:

Something strange is happening now and I’ve yet to figure out the problem. Display output is doubled in size, but every other scanline and every other pixel column is blank. If I didn’t know better I’d say it was a problem with the monitor. I added the ability to set a lower resolution mode, so the supported modes are now 200x150, 400x300 and 800x600. I also added the capability to set the playfield bitmap width wider than the screen (viewport). I added the lower resolution to be able to test playfield width settings and scrolling. 200x150 mode only takes 60k ram, so about 16 screens can fit in the 1M display ram.

Is it possible that a register delay found its way into the system on both the horizontal and vertical functions of your new resolution modes?
Quote:

Yes, there is a 64k bootrom that might possibly be swapped. There is also a smaller scratchpad ram used for bootstrapping stack. The bootrom could be made much smaller if I could get the system to load from an SD card or serial port, I haven't had much luck with either of those.

The bootrom could also be made smaller if its job was solely to initialize the CPU, and copy the REAL 64K bootrom to external RAM from an image preloaded into displayRAM or some other blockRAM area by the design tool.
Quote:

Quote:
...the extra bits could store the ID of the last sprite to collide with that particular pixel.
This could be a more useful hardware collision method than is used in many old systems.

Wouldn't that require searching memory for collisions ?

I don't think so. When drawing a sprite, you still know the horizontal and vertical timer values and therefore, what screen locations are under the sprite.
Writing the sprite number to the underlying unused z-buffer bits during sprite drawing would store the last collided sprite ID associated with that pixel, this could be cleared by the user after reading.


Thu Dec 07, 2017 11:03 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2232
Location: Canada
Quote:
It's a shame they seem to have stopped putting SRAM on development boards, S

I guess they figure with newer parts the amount of SRAM in the FPGA itself is enough. It might be less expensive for them to include a larger FPGA rather than an SRAM. SRAM probably isn't a good way to implement the video frame buffer because it's too small. 800x600 x 16bpp takes 1M ram, higher resolutions more.

Quote:
Is it possible that a register delay found its way into the system on both the horizontal and vertical functions of your new resolution modes?

I've checked and double checked. I can't see how the controller could be creating the display. Even if there was a register delay problem there should still be color on screen, not blank lines.

The sprites aren't appearing anymore. I'm trying to figure out why.

_________________
Robert Finch http://www.finitron.ca


Thu Dec 07, 2017 7:18 pm WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2232
Location: Canada
Figured out the strange problem after determining there was a ram addressing problem. The chip ram was set to 32 bits wide but only 16 bit updates were occurring. I forgot to change the width back to 16 bits. This resulted in every other pixel being written as far as display was concerned.
The ram addressing problem made it apparent that occasionally the incorrect ram address would be updated. So the core has switched back to using a synchronous clock for ram updates, rather than running the bram at 200 MHz. It’s a few cycles slower but it’s more reliable.

The number of sprites the core supports has been increased to 32; in theory the core could support even more sprites. There are “extra” cycles available during the horizontal blanking time. I’m wondering how else they could be put to use. That’s on the read-only side of the ram.

The sprite demo/test almost works correctly. There are a few display glitches that I think are due to the design not meeting timing in some circumstances. I should check the timing reports. The sprite shapes are defined to be ‘X’s or ‘O’s. This works for all but two sprites. One of the sprites shows up as kind of a box; the other sprite has a continuously changing shape. The continuously changing shape has to be because the sprite’s image pointer is changing somehow. One or two of the sprites don’t track in the arena properly. They bounce too soon, instead of off the arena walls. This is a problem with sprite coordinates. It’s difficult to see how these problems could be software or hdl related, as all the sprites are coded in a uniform manner.

_________________
Robert Finch http://www.finitron.ca


Sat Dec 09, 2017 4:20 am WWW

Joined: Fri May 08, 2015 6:22 pm
Posts: 61
Could the sprite strangeness be related to the length of the priority encoder? I imagine it's a fairly long mux chain.


Sat Dec 09, 2017 9:56 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2232
Location: Canada
Quote:
Could the sprite strangeness be related to the length of the priority encoder? I imagine it's a fairly long mux chain.

Perhaps some of the display problems. The color's consistent in the sprite that changes shape. I think the tools will break up the muxes into a minimal chain. I have it coded as a for loop now (it's still a priority chain) so the number of sprites could vary. I'm going to try reducing the number of sprites now.
Another problem is the screen goes black during the sprite demo. There's no code to do that. But the screen bitmap address register is located right after the sprite control registers for the last sprite. I'm wondering if somehow the bitmap address is changing. A routine that fills all the memory with color bands at startup might help indicate if this is true.

Added clock enable control to the ram to reduce power consumption; previously the ram was enabled all the time for bootstrapping simplicity. On the display side the ram isn’t clocked during the vertical blanking interval. On the update side the ram clock is enabled only if an access is going to take place. If the machine’s sitting in the idle state no ram clocking takes place.
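
A clock-enabled ram is commonly described along the lines of the sketch below. It is a single-port simplification (the ram in the core is triple ported), and the module name, parameters, and port names are hypothetical.

```verilog
// Hypothetical sketch: single-port synchronous ram with a clock enable.
// When ce is low nothing is read or written and dout holds its last value.
module ram_ce #(parameter AW = 10, DW = 16) (
    input  wire          clk,
    input  wire          ce,    // clock enable: assert only when an access is due
    input  wire          we,    // write enable, honored only while ce is high
    input  wire [AW-1:0] adr,
    input  wire [DW-1:0] din,
    output reg  [DW-1:0] dout
);
    reg [DW-1:0] mem [0:(1<<AW)-1];

    always @(posedge clk)
        if (ce) begin
            if (we)
                mem[adr] <= din;
            // On a write cycle this returns the old contents (read-first).
            dout <= mem[adr];
        end
endmodule
```

On the display side, ce could be driven from something like the inverse of the vertical blanking signal; on the update side, from the access request itself.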

Added a capabilities register to indicate the number of supported sprites and other things. This should help insulate software from changes to the core.

_________________
Robert Finch http://www.finitron.ca


Sat Dec 09, 2017 1:05 pm WWW