View unanswered posts | View active topics It is currently Thu Nov 14, 2019 3:09 am



Reply to topic  [ 483 posts ]  Go to page Previous  1 ... 23, 24, 25, 26, 27, 28, 29 ... 33  Next
 Thor Core / FT64 
Author Message

Joined: Sat Feb 02, 2013 9:40 am
Posts: 936
Location: Canada
I continue to work on both v8 and v7 versions of the core. Switched the DPO branch addressing in v7 back to regular displacement addressing. The DPO addressing wasn’t a bad idea, but it’s also not that good of an idea either. The DPO addressing made it impossible to use compressed branches. Switching back to regular displacements allowed compressed branches, which affected about 5% in compression. Segmentation was also removed from the v7 core, although I keep a copy locally of the segmented core. Paging can adequately provide a virtual address space instead of using segmentation. Privileged access to pages is via keys. The keys are only 10-bit as that’s all the block-ram storage space I was willing to allocate to each page. I’ve seen it recommended that keys be at least 18-bit. There is a keys CSR register that contains up to six ten-bit keys for the app. Every memory access compares all six of the keys in parallel held by the app to the memory page’s key looking for a match. Having the proper key to a page is what allows shared memory. The inverted page table for the system is stored entirely in block-ram memory to allow the fastest processing.
At the moment I’m trying to get a block of pre-initialized data for the standard C library routines copied to an area of ram reserved for the purpose. Many C standard library routines failed to work because duh, the initialization data for the routines wasn’t being setup properly. But the routine hangs at a store word operation.

_________________
Robert Finch http://www.finitron.ca


Thu Jan 10, 2019 7:20 am
Profile WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 936
Location: Canada
Received a book on garbage collection a couple of days ago and I’m just reading through it. There’s more to it than I would’ve thought and I’m getting the picture that one solution doesn’t work for everything. So, if GC is included as part of the OS with some hardware support it’ll mean picking and choosing possibly multiple algorithms.

The audio controller in the FT64v7 SoC was generating ack pulses on all writes, not just for itself. This didn’t matter most of the time except for when it did matter.

The IsMem() decoder was missing a default statement to set the output to false, causing some instructions to be interpreted as memory instructions. This caused the core to hang as per the previous post. Another hang fixed.

_________________
Robert Finch http://www.finitron.ca


Fri Jan 18, 2019 6:39 am
Profile WWW

Joined: Tue Dec 11, 2012 8:03 am
Posts: 271
Location: California
robfinch wrote:
Received a book on garbage collection a couple of days ago and I’m just reading through it. There’s more to it than I would’ve thought and I’m getting the picture that one solution doesn’t work for everything. So, if GC is included as part of the OS with some hardware support it’ll mean picking and choosing possibly multiple algorithms.

I'll be anxious to hear more—if I can even understand it.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources


Fri Jan 18, 2019 8:07 pm
Profile WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 936
Location: Canada
Interrupts were going to be used to trigger and end a garbage collection cycles, with garbage collection taking place in an incremental fashion. But now I see that it maybe can’t be easily done that way. An issue is that some memory operations need to be done atomically with respect to the garbage collector. Having an interrupt occur in the middle of what should be an atomic update would not be good. For instance, the collector scans the stack and a stack frame could be in the process of being setup when the interrupt occurs. This could lead to false pointer detections.
The book suggests using a poll operation to determine when to perform GC while code is running. Some sort of instruction that monitors the GC cycle signal line and jumps to the GC routine if it’s active would be useful. Polling instructions would need to be inserted into code in quite a few places leading to code bloat. But that is part of the cost of a GC system. Alternately GC interrupts could be disabled and enabled around the section of code needing atomic access. It’s more code either way.
In several places the book mentions the usefulness of having more processor registers available to support GC operations. With three or four more registers statically assigned to support garbage collection it makes me wonder if it would be better to go with a 64-register design rather than 32. There’s a couple of other places where a couple of statically assigned registers would be useful too (exception processing, task management). A 32-register design is already pretty crowded for registers before considering GC register uses. About a 40 register design would be good.

Current core ft64v7 is busted back to the clear-screen level.

_________________
Robert Finch http://www.finitron.ca


Thu Jan 24, 2019 1:54 am
Profile WWW

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1299
robfinch wrote:
Received a book on garbage collection a couple of days ago and I’m just reading through it.


If you find the book good enough to recommend, please do let us know which book it is... or indeed if you find it not good, that might be useful to know too!

I think GC is a bit like cache - if you could predict the future, you'd be perfect, but the best you can do is approximate what the likely future needs are, preferably without too much complexity.


Sun Jan 27, 2019 4:52 pm
Profile

Joined: Wed Jan 02, 2019 4:10 pm
Posts: 14
My opinion on the presently ubiquitous forms of GC is unprintable. The Go-lang team at Google claim to have made considerable progress in eliminating big latency stalls, but I get increasingly suspicious about stuff I can't understand well enough to implement myself, and Go-lang's GC has gone multiple levels beyond that.

There's a very fast and simple approach to GC that works for most allocations: reference counting. Every object maintains a count of pointer references to itself, and every manipulation of pointers includes code to maintain those counters as necessary. When the counter falls to zero, the object is deallocated (and recursively checked for inner pointer decrements). You can do various clever things to reduce the amount of counter manipulation you need to do in practice, and defer the point at which you need to handle the recursive deallocations, but you can implement those incrementally for the most part.

The main pitfall is that circular data structures (which, annoyingly, includes doubly-linked lists) can prevent the counters from reaching zero even if the structure as a whole is orphaned, at which point most people promptly give up and implement a mark-and-sweep instead. What *I* suggest is to use reference counting to sharply reduce the work a mark-and-sweep GC needs to do. You can also treat a complete data structure with internal references as a single object, and have those internal references just be indexes rather than full-blown pointers. This requires that your language is capable of recognising complex data structures as objects in their own right, which is a deeper level of introspection than C or C++ manage.


Sat Feb 09, 2019 8:19 pm
Profile

Joined: Sat Feb 02, 2013 9:40 am
Posts: 936
Location: Canada
I’m about ¾ the way through reading the book, I’ve been putting off commenting until I finish the book. I’m looking at this from the perspective of what would be good hardware to have present in a cpu to support garbage collection. It’s looking to me like a stock machine would do an adequate job.
There’s a good section in the book on reference counting and it’s pros and cons.
Quote:
I get increasingly suspicious about stuff I can't understand well enough to implement myself,
I know the feeling. At some point one just has to jump in and assume that the code that’s been developed elsewhere works.

_________________
Robert Finch http://www.finitron.ca


Sun Feb 10, 2019 11:27 pm
Profile WWW

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1299
robfinch wrote:
I’m about ¾ the way through reading the book, I’ve been putting off commenting until I finish the book.

Sounds like a good plan!


Mon Feb 11, 2019 6:34 am
Profile

Joined: Sat Feb 02, 2013 9:40 am
Posts: 936
Location: Canada
I’ve been snowed in for a couple of days, so I managed to finish reading the book: “The Garbage Collection Handbook – The Art of Automatic Memory Management” by Jones, Hosking, and Moss.
The book covers about 50 years of garbage collection techniques and has tons of references for further reading. It outlines in pseudo-code with lots of diagrams various algorithms and deals with concepts like atomic memory updates and thread safety. It makes one aware of what goes on behind the scenes in managed environments. It’s an excellent book for any programmer to have on the bookshelf.
A garbage collection safe-point instruction (GCS) would be a handy modification to the instruction set. The safe-point instruction indicates when it is okay to run the garbage collector. It reminds me of co-operative multi-tasking. Also handy to have in hardware might be pointers to read and write barriers with their return addresses (takes four registers).

_________________
Robert Finch http://www.finitron.ca


Fri Feb 15, 2019 4:19 am
Profile WWW

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1299
Thanks - sounds great. I see it's a bit pricey, as academic books sometimes are. The author page is here:
https://www.cs.kent.ac.uk/people/staff/rej/gc.html


Fri Feb 15, 2019 8:18 am
Profile

Joined: Sat Feb 02, 2013 9:40 am
Posts: 936
Location: Canada
The GCS instruction is evolving into a more generic thread-switch point (TSP) instruction. The TSP instruction is treated as a NOP unless there is a thread-switch pending in which case it acts like a trap. This is similar in concept to a trap-on-overflow. There needs to be a signal in the processor that can be raised to indicate a GC cycle or a thread switch. There also probably needs to be a vector register for the TSP instruction so that going through the normal break vector isn’t required. Pondering the value of using a single instruction to detect both GC cycles and thread-switches. Testing for both at the same time would save code bloat. But I’m wondering if there may be cases where the GC cycle should be detected, and a thread-switch ignored. Maybe options to the instruction. One would likely want to place the GC detect and thread-switch detect instructions at the same point in code.

_________________
Robert Finch http://www.finitron.ca


Sat Feb 16, 2019 4:48 am
Profile WWW

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1299
Is it a sort of yield instruction, which indicates quiescence? In which case it might well be seen to serve one of several purposes.


Sat Feb 16, 2019 9:46 am
Profile

Joined: Fri May 05, 2017 7:39 pm
Posts: 19
Thanks Rob for the brief recommendation,
and thanks BigEd for the link.


Sat Feb 16, 2019 5:18 pm
Profile

Joined: Sat Feb 02, 2013 9:40 am
Posts: 936
Location: Canada
Yes, it is a sort of yield instruction which is maybe a better moniker. That helped me recall the WFI (wait for interrupt) instruction.

I was going to write a long post containing my thoughts but I eventually realized all that’s needed is to use the wait-for-interrupt instruction (WFI). Raising the interrupt could be done by a store to the interrupt controller to trigger an interrupt. It’s easy enough to define a macro to hide the stores to the interrupt controller so “mRaise #GC” would cause a GC interrupt to occur. The one gotcha is that the interrupt would need to be processed only by a WFI instruction and not immediately taken. My thought on this is to simply run the system with interrupts disabled, so that they are only detected by the WFI instruction. Otherwise some mechanism would be needed to tell the difference between interrupts that need immediate processing and deferred interrupts.
Given a modern processor which can execute hundreds of instructions per memory cycle, polling for interrupts every few instructions isn’t really a lot of overhead.

_________________
Robert Finch http://www.finitron.ca


Sun Feb 17, 2019 2:58 am
Profile WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 936
Location: Canada
Realized that wait for interrupt (WFI) isn’t quite the right instruction because it’s desired for the program to continue if no interrupt is present rather than wait. So, it’s really a poll for interrupt (PFI) instruction that’s desired.
Just wondering how to implement the PFI instruction in hardware. Poll for interrupt is similar to wait for interrupt (WFI) except that it doesn’t sit and wait for an interrupt. PFI takes the interrupt if it’s available otherwise it acts like a NOP instruction. The tricky part is recognizing the interrupt state when the instruction is executing. Because there are several stages to the processor before the execute stage, to implement PFI the interrupt status must be recorded at time of fetch and then propagated forwards to the execute stage. It looks to me like an extra interrupt flag bit has to be associated with each instruction. Also factoring into the hardware is regular interrupt processing. It’s likely desirable for the normal interrupt processing to occur if interrupts are enabled. Meaning the interrupt enable flag factors into things somehow.
I’m still fiddling with the interrupt hardware.

_________________
Robert Finch http://www.finitron.ca


Tue Feb 26, 2019 4:37 am
Profile WWW
Display posts from previous:  Sort by  
Reply to topic   [ 483 posts ]  Go to page Previous  1 ... 23, 24, 25, 26, 27, 28, 29 ... 33  Next

Who is online

Users browsing this forum: No registered users and 3 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
cron
Powered by phpBB® Forum Software © phpBB Group
Designed by ST Software