


 Sprite / Cursor Controllers 

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Triangle drawing was fixed by placing limits for lines where the inverse slope was infinite.
A five deep state stack was added so that graphics primitives could “call” hardware routines rather than using a mix of in-lining everything and spaghetti state management.

Some experimentation with anti-aliasing has taken place. The result was interesting. It’s apparent that the color of the pixels needs to be combined with the underlying color in the drawing area. What the Wikipedia article doesn’t mention is that something similar to alpha blending is required to mix the colors. Without blending the colors the perimeter of the triangles appears black. Reading the underlying color for the two pixels being set takes about six clock cycles, bringing the total to about eight clocks for every pixel plotted. This is 4x slower than Bresenham’s line draw. Fortunately the anti-aliasing only needs to be applied around the perimeter of the shape.
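
A minimal software sketch of the blending step, written in Python as a model rather than as hardware. It assumes 8-bit-per-channel RGB tuples and an 8-bit coverage weight (0 to 255) coming from the anti-aliased line algorithm; `blend_pixel` and its parameters are hypothetical names, not taken from the core.

```python
def blend_pixel(fg, bg, coverage):
    """Mix a foreground colour over the underlying colour.

    fg, bg   : (r, g, b) tuples, 0..255 per channel (assumed format)
    coverage : 0..255 weight of the foreground (assumed 8-bit)
    """
    if not 0 <= coverage <= 255:
        raise ValueError("coverage must be in 0..255")
    # Weighted average per channel, rounded to nearest.
    return tuple(
        (f * coverage + b * (255 - coverage) + 127) // 255
        for f, b in zip(fg, bg)
    )
```

Note that the background has to be read before the write, which is where the extra clock cycles go. If the underlying colour is treated as black instead of being read back, partially covered edge pixels come out dark, which would be consistent with the black perimeter described above.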

The anti-aliasing at the moment actually worsens the appearance. The problem is that the anti-aliased lines don’t match up exactly with the lines drawn to fill the triangle. This results in more noticeable jaggies.

_________________
Robert Finch http://www.finitron.ca


Mon Dec 18, 2017 6:06 am

Joined: Fri May 08, 2015 6:22 pm
Posts: 61
robfinch wrote:
Triangle drawing was fixed by placing limits for lines where the inverse slope was infinite.
A five deep state stack was added so that graphics primitives could “call” hardware routines rather than using a mix of in-lining everything and spaghetti state management.

That sounds interestingly expandable.

Quote:
Some experimentation with anti-aliasing has taken place. The result was interesting. It’s apparent that the color of the pixels needs to be combined with the underlying color in the drawing area. What the Wikipedia article doesn’t mention is that something similar to alpha blending is required to mix the colors. Without blending the colors the perimeter of the triangles appears black. Reading the underlying color for the two pixels being set takes about six clock cycles, bringing the total to about eight clocks for every pixel plotted. This is 4x slower than Bresenham’s line draw. Fortunately the anti-aliasing only needs to be applied around the perimeter of the shape.

I hadn't thought about the underlying colours, most examples cheat by using a solid black or white background.
I don't think being 4x slower than Bresenham’s is a real problem. If the alternative were software anti-aliasing, it would be far slower.

Quote:
The anti-aliasing at the moment actually worsens the appearance. The problem is that the anti-aliased lines don’t match up exactly with the lines drawn to fill the triangle. This results in more noticeable jaggies.

A different output for each algorithm, ouch.
It might be possible to use a Wu-derived algorithm that works in a similar fashion to the Bresenham-based filled triangle algorithm.


Mon Dec 18, 2017 7:27 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Whenever I encounter an intractable problem I try and work on something else for a while. So now it's flood filling.

Added flood filling to the core. It uses a recursive flood fill for simplicity. The flood fill stack is 4096 entries, so if there are more than 4094 pixels in a row to be filled without returning, the stack will overflow and the operation will be aborted. It could happen since the bitmap may be up to 65536x65536. Practically it shouldn’t happen because there is only 1MB of graphics memory. The target bitmap is <= 800x600.
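
A rough Python model of a flood fill with a bounded stack that aborts on overflow. This is a sketch only: it swaps true recursion for an explicit stack of pending coordinates, so its stack growth will not match the hardware's exactly, and `flood_fill`, `max_stack` and the list-of-rows bitmap are all assumptions made for illustration.

```python
def flood_fill(bitmap, x, y, new_color, max_stack=4096):
    """4-way flood fill using a bounded explicit stack.

    bitmap is a list of rows of colour values (assumed layout).
    Returns True when the fill completes, False if it was aborted
    because the stack limit was reached.
    """
    height, width = len(bitmap), len(bitmap[0])
    target = bitmap[y][x]
    if target == new_color:
        return True  # nothing to do, and avoids looping forever
    stack = [(x, y)]
    while stack:
        cx, cy = stack.pop()
        if not (0 <= cx < width and 0 <= cy < height):
            continue  # off the bitmap
        if bitmap[cy][cx] != target:
            continue  # boundary, or already filled
        bitmap[cy][cx] = new_color
        for neighbour in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
            if len(stack) >= max_stack:
                return False  # stack overflow: abort the operation
            stack.append(neighbour)
    return True
```

A model like this is also handy as a reference when the hardware stops early: if the pixel read returns the new colour when it should return the target colour, the `bitmap[cy][cx] != target` test ends the fill almost immediately.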

As a test the entire screen is to be flood filled. But when run, only two pixels in the middle of the screen are set. This has me baffled. I wasn’t expecting it to fail in that manner. I figured it would scan at least to the end of the scan-line. The only thing I can think of right now is that it’s reading the wrong pixel from the screen and it thinks it’s finished because it’s reading a pixel that’s set already.
I put an additional delay in from the time the pixel coordinate is calculated to the generation of memory address and that helped a bit. Now there are about sixteen scanlines of fill happening instead of just two pixels. This should not have made a difference but it did, as there were no timing errors reported.

Another problem with the system is that the cursor loses its shape. I observed that cursoring around on the screen changes the cursor shape. It looks like the cursor image address register is updated whenever the cursor’s vertical position changes. There’s got to be a bad logic bit somewhere. I tried to fix this in software by re-writing the cursor image address register after a change in position but it didn’t work.
Anyways I’ve re-written the flood fill because it just might be failing at a 4096 stack overflow due to a programming problem. The system is building at this very moment.

_________________
Robert Finch http://www.finitron.ca


Tue Dec 19, 2017 7:34 am

Joined: Fri May 08, 2015 6:22 pm
Posts: 61
robfinch wrote:
Whenever I encounter an intractable problem I try and work on something else for a while. So now it's flood filling.

Sounds like a good move. Often a solution to one problem can suddenly pop into mind while working on a totally different problem.
Quote:
Added flood filling to the core. It uses a recursive flood fill for simplicity. The flood fill stack is 4096 entries, so if there are more than 4094 pixels in a row to be filled without returning, the stack will overflow and the operation will be aborted. It could happen since the bitmap may be up to 65536x65536. Practically it shouldn’t happen because there is only 1MB of graphics memory. The target bitmap is <= 800x600.

As a test the entire screen is to be flood filled. But when run, only two pixels in the middle of the screen are set. This has me baffled. I wasn’t expecting it to fail in that manner. I figured it would scan at least to the end of the scan-line. The only thing I can think of right now is that it’s reading the wrong pixel from the screen and it thinks it’s finished because it’s reading a pixel that’s set already.
I put an additional delay in from the time the pixel coordinate is calculated to the generation of memory address and that helped a bit. Now there are about sixteen scanlines of fill happening instead of just two pixels. This should not have made a difference but it did, as there were no timing errors reported.

Might be time for another video debug hack.
If you disable writes and run the fill in a loop, you'll be able to flag the result of your 'read test for set pixel' function and feed it directly into the video out section of the design to see what's going on.
If you also set a flag based on when the 'disabled' writes would have occurred, you can show it in a different colour and see if it lines up with the detected edge pixels.

Quote:
Another problem with the system is that the cursor loses its shape. I observed that cursoring around on the screen changes the cursor shape. It looks like the cursor image address register is updated whenever the cursor’s vertical position changes. There’s got to be a bad logic bit somewhere. I tried to fix this in software by re-writing the cursor image address register after a change in position but it didn’t work.
Anyways I’ve re-written the flood fill because it just might be failing at a 4096 stack overflow due to a programming problem. The system is building at this very moment.

A fix might be as simple as only updating the cursor position during the vertical blanking interval.
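
As a sketch of that idea, here is a small Python model of a shadowed position register: writes land in a pending copy and only reach the copy the display logic uses while vertical blanking is active. The class and method names are made up for illustration, and whether this would actually cure the shape corruption is untested.

```python
class CursorPosition:
    """Model of a double-buffered cursor position register (hypothetical)."""

    def __init__(self):
        self.pending = (0, 0)  # last value written by software
        self.active = (0, 0)   # value the display logic actually uses

    def write(self, x, y):
        # Software may write at any time; the display is not affected yet.
        self.pending = (x, y)

    def tick(self, in_vblank):
        # Copy pending to active only during vertical blanking.
        if in_vblank:
            self.active = self.pending
        return self.active
```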


Tue Dec 19, 2017 9:41 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
The last couple of days have been spent working on yet another audio/video controller core (AVIC128), this time using DDR RAM for the frame and audio buffers. According to my calculations the interface to the RAM should be just fast enough to support an 800x600x16bpp display along with sprites and audio. The controller uses a 128-bit wide bus to interface to the memory. Much of the code for AVIC128 has just been copied from the previous core. The biggest difference is in the RAM bus interfacing.
It was desired to use the block ram to support a multi-core operation rather than as a video frame buffer for the system.
Sprites have changed slightly because 128 bits are read at once. So that bits aren't wasted, sprites are now 16 color and 32 pixels wide, rather than 4 color. They can only be linked to one other sprite rather than doubly linked.
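
The arithmetic behind the new sprite format, as a short Python sketch: 16 colors need 4 bits per pixel, and 128 bits / 4 bits = 32 pixels per memory read. The nibble order (pixel 0 in the least significant nibble) is purely an assumption for the example, as is the function name.

```python
BITS_PER_READ = 128
BITS_PER_PIXEL = 4  # 16 colors
PIXELS_PER_READ = BITS_PER_READ // BITS_PER_PIXEL  # 32 pixels, no bits left over


def unpack_sprite_row(word):
    """Split one 128-bit read into 32 four-bit color indexes.

    Assumes pixel 0 sits in the least significant nibble.
    """
    return [(word >> (i * BITS_PER_PIXEL)) & 0xF for i in range(PIXELS_PER_READ)]
```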

The main processor for the system has been switched to FT64 from TG68. A sequential version of the FT64 processing core was written. FT64 makes use of a 128-bit wide data bus. FT64 is a superscalar processor but it's too big to fit in the FPGA, hence a non-superscalar sequential version that executes the same instruction set was born. The FT64 instruction set may be "good enough" for me to use as a base for future projects. I've been bouncing around with the instruction sets learning all the ins and outs.

_________________
Robert Finch http://www.finitron.ca


Fri Dec 22, 2017 6:58 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
I'm kinda dead in the water on this project right now as synthesis quits with 'Abnormal Program Termination' and I haven't been able to figure out yet what synthesis doesn't like. I was trying to add blitter logic when this error struck. I have posted a dump file with the message on the Xilinx forum.

After some experimentation it looks like 800x600 mode may not support more than three sprites. I switched the system to 400x300 mode, and that seems to allow 17 sprites. Accessing memory isn't nearly as fast as block ram access. This may put limits on what can be done. The command queue seems not to work.

Numerous software issues with the FT64 assembler were ironed out. Apparently I only half-ported it from another project so a number of opcodes were wrong.

_________________
Robert Finch http://www.finitron.ca


Sat Dec 30, 2017 1:08 am

Joined: Fri May 08, 2015 6:22 pm
Posts: 61
Just in case you haven't found this.

AR# 55854
Vivado - What can I do to resolve a Vivado crash, exception, or abnormal program termination?
https://www.xilinx.com/support/answers/55854.html


Modern RAM strikes again :(
Does the dev board have enough IO to externally connect some fast SRAM?


Sat Dec 30, 2017 3:25 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Got past the abnormal program termination by modifying enough code that synthesis worked on it differently. Also many warnings spit out by synthesis were fixed.

Quote:
Does the dev board have enough IO to externally connect some fast SRAM?

Unfortunately the dev board doesn't have any sram. There are some I/O's (34 differential pairs) on an FMC connector but I think maybe not enough unless they could be used as single-ended I/Os. It would need quite a bit of sram for a decent video frame buffer anyways. (1366x768x3 bytes per pixel) * 2 pages = 6MB. 8MB or more would probably be better considering that sprite and texture data would also be stored there. It would also probably need to be 32 or more bits wide.
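
The frame buffer estimate worked through in Python, just to show the arithmetic. The function name is made up for the example.

```python
def frame_buffer_bytes(width, height, bytes_per_pixel, pages):
    """Bytes needed for a multi-page frame buffer."""
    return width * height * bytes_per_pixel * pages


size = frame_buffer_bytes(1366, 768, 3, 2)
size_mib = size / 2**20  # just over 6 MiB
```

Here `size` comes to 6,294,528 bytes, so two pages fit in an 8MB part with roughly 2MB left for sprite and texture data.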

Using ddr ram for the display is probably the way to go for high capacity memory. But it's too bad the interfacing isn't simpler / lower latency.

Text blitting is just about working. Some random characters were displayed onscreen, except they came out flipped horizontally.

_________________
Robert Finch http://www.finitron.ca


Sat Dec 30, 2017 2:52 pm

Joined: Fri May 08, 2015 6:22 pm
Posts: 61
robfinch wrote:
Got past the abnormal program termination by modifying enough code that synthesis worked on it differently. Also many warnings spit out by synthesis were fixed.

Quote:
Does the dev board have enough IO to externally connect some fast SRAM?

Unfortunately the dev board doesn't have any sram. There are some I/O's (34 differential pairs) on an FMC connector but I think maybe not enough unless they could be used as single-ended I/Os. It would need quite a bit of sram for a decent video frame buffer anyways. (1366x768x3 bytes per pixel) * 2 pages = 6MB. 8MB or more would probably be better considering that sprite and texture data would also be stored there. It would also probably need to be 32 or more bits wide.

Using ddr ram for the display is probably the way to go for high capacity memory. But it's too bad the interfacing isn't simpler / lower latency.

While each line of a differential pair will (should) be length matched, some dev boards don't care so much about keeping the lengths of the different pairs matched to each other. This could cause issues with a high-speed parallel bus of the kind used to connect RAM. FMC connector, hmmm, not the friendliest thing to interface to for a hobby project, and a bit pricey.

MiSTer uses standard SDRAM on an addon board to avoid latency problems with the DDR3 on the Terasic DE10-nano.
https://github.com/MiSTer-devel/Main_MiSTer/wiki

I spotted a couple GS8662Q36E-250I chips (72Mb SigmaQuad-II Burst of 2 SRAM) on a piece of scrap board I picked up on eBay, I love the spec, but they're a bit on the expensive side and only exist as BGA.
They would make very nice display RAM though :)
http://pdf1.alldatasheet.com/datasheet- ... asheet.pdf


Quote:
Text blitting is just about working. Some random characters were displayed onscreen, except they came out flipped horizontally.

The NOT in this snippet from my VHDL code is responsible for un-flipping X.
Code:
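         -- NOT on the 3-bit column index reverses the bit order (index 0 reads bit 7),
         -- which un-flips the glyph horizontally.
         IF (charData(to_integer(NOT pixelh(2 DOWNTO 0))) xor (Cursor AND CursorValid)) = '1' THEN
            Pixel_Colour <= CharCol;
         END IF;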
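
For anyone who wants to check the trick in software, a small Python model: inverting a 3-bit index is the same as computing 7 minus the index, so column 0 reads bit 7 of the glyph row. It assumes the font stores the leftmost pixel in bit 7, and it leaves out the cursor XOR; `char_pixel` and `unflip` are names invented for this sketch.

```python
def char_pixel(char_data, pixel_h, unflip=True):
    """Return the glyph bit (0 or 1) for horizontal pixel position pixel_h.

    char_data : one 8-bit row of glyph data (leftmost pixel in bit 7, assumed)
    pixel_h   : horizontal pixel counter; only the low 3 bits are used
    """
    index = pixel_h & 0b111
    if unflip:
        index = ~index & 0b111  # 3-bit NOT, equal to 7 - index
    return (char_data >> index) & 1
```

With `unflip=False` the same row comes out mirrored, which matches the flipped text described in the quote.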


Sat Dec 30, 2017 8:14 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Worked on the sprite controller version from about 3 years ago. There are two basic approaches I have used to implement sprite controllers. One is to use block rams as a cache for image data; the screen is fed from block rams which are loaded via DMA. The second approach is beam racing: loading the sprite data a scanline before it is needed, and having only a scan-line buffer rather than a complete image cache.
The first version, from three years ago, uses block rams to hold the sprite image data instead of DMA’ing it directly when needed. Using block rams gives sprites more memory bandwidth to support higher color resolutions. Up to 32 block rams may be accessed in parallel for display data which can be 32-bits per pixel color. That is 128 bytes per pixel location of bandwidth. Getting all the data for pixels during a scan time would not be possible for a beam racing approach.
Added the capability to automatically perform DMA during vertical sync. Otherwise sprite image data load is triggered programmatically. Removed the programmatic access to the sprite image cache to simplify the interface to the cache. Sprite data must be set up in main memory then loaded with DMA to the image cache. The image data is manipulated entirely in main memory now.

_________________
Robert Finch http://www.finitron.ca


Thu Sep 23, 2021 7:41 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Added animation capability to the sprite controller. By cycling through frames in the image buffer simple animations can be done.

Red box testing of the sprite controller showed that pixels were displaying in the wrong order. Pixels were displaying: 3,2,1,0 and should have been 0,1,2,3. The order was swapped across 64 bits which is the amount of data read by the sprite controller in one cycle. So, some of the address lines to the BRAM were inverted to correct the display order. This could also have been corrected by modifying the order data is written out by the sprite editor, but that would make it impossible to edit using other tools.
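
A tiny Python illustration of why inverting address lines reverses the order: XOR-ing the low index bits with all ones maps 0,1,2,3 to 3,2,1,0. Which address lines were actually inverted, and how pixels map onto them, is not stated above, so the two-bit group here is an assumption.

```python
def invert_low_bits(index, bits=2):
    """Invert the low `bits` address lines of an index (assumed group size)."""
    return index ^ ((1 << bits) - 1)


# Data as it was being displayed: slot 0 holds pixel 3, and so on.
stored = ["p3", "p2", "p1", "p0"]
displayed = [stored[invert_low_bits(i)] for i in range(4)]
```

After the inversion `displayed` reads p0, p1, p2, p3, and the data written by the sprite editor stays untouched.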

Made a small sprite controller test bench. It accesses the sprite controller causing sprites to move around randomly on screen. The test bench shows that the register interface to the sprite controller is working.

Red box image:
Attachment: redbox.png (file comment: red box screen) [ 7.3 MiB ]

_________________
Robert Finch http://www.finitron.ca


Sun Oct 16, 2022 4:02 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Got the sprite image data loading from the DRAM. The result is the sprite looks like a random grid of colored dots.

I can see a burst of 64 128-bit packets of data coming from the DRAM using the logic analyzer. Each packet is transferred at 100 MHz. That is 1600 MB/s, not bad.
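
The bandwidth figure worked through in Python, assuming one 128-bit packet per 100 MHz clock as seen on the logic analyzer. The function and key names are invented for the sketch, and the figure is a peak rate within a burst, not sustained throughput.

```python
def burst_stats(packets=64, bits_per_packet=128, clock_hz=100_000_000):
    """Peak transfer figures for one burst, one packet per clock (assumed)."""
    bytes_per_packet = bits_per_packet // 8
    return {
        "peak_mb_per_s": bytes_per_packet * clock_hz // 1_000_000,
        "burst_bytes": packets * bytes_per_packet,
        "burst_ns": packets * 1_000_000_000 // clock_hz,
    }
```

For the numbers above that gives 1600 MB/s peak, with each burst moving 1024 bytes in 640 ns.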

Trying to get the frame buffer working now. The display is rolling lines ATM. The multi-port memory controller, mpmc10, is a new version so the bugs need to be worked out. Display of sprites is rock-solid, so I am thinking there are issues with the frame buffer component.

The animations of butterflies and squares filled with random dots remind me of the fish in the doctor’s office.

_________________
Robert Finch http://www.finitron.ca


Mon Oct 17, 2022 7:20 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Got the classic display of fish swimming to demonstrate the operation of a sprite controller. Tested the output display plane of the sprite controller and text controller; they seem to work. I have been editing fish sprites so they can work with the sprite controller. The sprite controller was directly connected to block ROMs containing the images to display. That works fine. Next I set up a simple state machine module to load the DRAM with sprite images from ROM. And I set up a second state machine to program the sprite controller’s registers to display the sprite images from DRAM. It seems to be working, proving that the multi-port memory controller #10 is working for writes and streamed access. Next will be to get the controller working for cached accesses.

Always learning something new. Found a new way to do round-robin arbitration on Stack Overflow using a subtractor carry-chain rather than left/right rotates of data. It should make the arbiter smaller and faster.
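
A Python model of one widely circulated subtractor-based formulation: double the request vector, subtract a one-hot base, and AND with the inverse. It is an assumption that this is the same trick as the one found on Stack Overflow; the names `rr_grant` and `next_base` are made up for this sketch.

```python
def rr_grant(req, base, width):
    """One-hot grant for the first requester at or above `base`, with wrap.

    req  : bit mask of active requests
    base : one-hot mask marking the highest-priority position
    """
    low_mask = (1 << width) - 1
    double_req = (req << width) | req
    # The subtract borrows up through the zeros to the first request bit at
    # or above base; AND with the inverse isolates exactly that bit.
    double_grant = double_req & ~(double_req - base)
    # Fold the upper copy back down to handle wrap-around.
    return (double_grant | (double_grant >> width)) & low_mask


def next_base(grant, base, width):
    """Rotate priority to just past the winner; hold it if nobody won."""
    if grant == 0:
        return base
    return ((grant << 1) | (grant >> (width - 1))) & ((1 << width) - 1)
```

In hardware the subtract maps onto the FPGA's carry chain, which is presumably where the size and speed advantage over rotate-based arbiters comes from.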

Sprite display is rock-solid when image data is coming from block ROMs. However, there are glitches when the image data is coming from DRAM. I am pretty sure writes to the DRAM through the memory controller are working. The sprite image data is set up only once at initialization and the sprites sometimes display properly without glitches. The sprite image caches are not large enough to hold all the data for all the different sprites and animation. Therefore, they must be being successfully reloaded from the DRAM. This means the auto-reload feature of the sprite controller is working.

_________________
Robert Finch http://www.finitron.ca


Fri Oct 21, 2022 4:26 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
There are what I assume are timing issues in the sprite controller due to the animation logic. These are apparent when a large sprite is displayed; for some reason smaller sprites were not affected. The display of the sprite was starting at the wrong location in the image buffer, resulting in it being shifted downwards on-screen. The code with timing issues turns out to be the frame multiplier, a line of code like: sprAddress = sprImageOffset + sprCurFrame * sprFrameSize
Even though the current frame was zero the multiply was returning a non-zero value (it was always off by a fixed amount, but only for large sprites).

The multiply has now been replaced with an accumulating add operation. This should improve the timing. It works great now.
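
A Python model of the accumulating-add idea, as a sketch only: instead of multiplying the frame number by the frame size every time, keep a running address, add the frame size on each frame advance, and reload the image offset when the animation wraps. The class and field names are hypothetical and not taken from the controller.

```python
class FrameAddress:
    """Running sprite frame address without a multiplier (hypothetical model)."""

    def __init__(self, image_offset, frame_size, frame_count):
        self.image_offset = image_offset
        self.frame_size = frame_size
        self.frame_count = frame_count
        self.frame = 0
        self.address = image_offset  # frame 0 starts at the image offset

    def next_frame(self):
        self.frame += 1
        if self.frame == self.frame_count:
            # Wrap: back to the first frame, reload the base address.
            self.frame = 0
            self.address = self.image_offset
        else:
            self.address += self.frame_size  # one add per frame advance
        return self.address
```

The running address always equals image offset + current frame * frame size, but the only arithmetic left in the path is a single adder.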

_________________
Robert Finch http://www.finitron.ca


Sat Oct 22, 2022 2:48 am