AnyCPU
http://anycpu.org/forum/

Thor Core / FT64
http://anycpu.org/forum/viewtopic.php?f=23&t=251
Page 26 of 33

Author:  robfinch [ Thu Jan 10, 2019 7:20 am ]
Post subject:  Re: Thor Core / FT64

I continue to work on both v8 and v7 versions of the core. Switched the DPO branch addressing in v7 back to regular displacement addressing. The DPO addressing wasn’t a bad idea, but it’s also not that good of an idea either. The DPO addressing made it impossible to use compressed branches. Switching back to regular displacements allowed compressed branches, which affected about 5% in compression. Segmentation was also removed from the v7 core, although I keep a copy locally of the segmented core. Paging can adequately provide a virtual address space instead of using segmentation. Privileged access to pages is via keys. The keys are only 10-bit as that’s all the block-ram storage space I was willing to allocate to each page. I’ve seen it recommended that keys be at least 18-bit. There is a keys CSR register that contains up to six ten-bit keys for the app. Every memory access compares all six of the keys in parallel held by the app to the memory page’s key looking for a match. Having the proper key to a page is what allows shared memory. The inverted page table for the system is stored entirely in block-ram memory to allow the fastest processing.
At the moment I’m trying to get a block of pre-initialized data for the standard C library routines copied to an area of ram reserved for the purpose. Many C standard library routines failed to work because duh, the initialization data for the routines wasn’t being setup properly. But the routine hangs at a store word operation.

Author:  robfinch [ Fri Jan 18, 2019 6:39 am ]
Post subject:  Re: Thor Core / FT64

Received a book on garbage collection a couple of days ago and I’m just reading through it. There’s more to it than I would’ve thought and I’m getting the picture that one solution doesn’t work for everything. So, if GC is included as part of the OS with some hardware support it’ll mean picking and choosing possibly multiple algorithms.

The audio controller in the FT64v7 SoC was generating ack pulses on all writes, not just for itself. This didn’t matter most of the time except for when it did matter.

The IsMem() decoder was missing a default statement to set the output to false, causing some instructions to be interpreted as memory instructions. This caused the core to hang as per the previous post. Another hang fixed.

Author:  Garth [ Fri Jan 18, 2019 8:07 pm ]
Post subject:  Re: Thor Core / FT64

robfinch wrote:
Received a book on garbage collection a couple of days ago and I’m just reading through it. There’s more to it than I would’ve thought and I’m getting the picture that one solution doesn’t work for everything. So, if GC is included as part of the OS with some hardware support it’ll mean picking and choosing possibly multiple algorithms.

I'll be anxious to hear more—if I can even understand it.

Author:  robfinch [ Thu Jan 24, 2019 1:54 am ]
Post subject:  Re: Thor Core / FT64

Interrupts were going to be used to trigger and end a garbage collection cycles, with garbage collection taking place in an incremental fashion. But now I see that it maybe can’t be easily done that way. An issue is that some memory operations need to be done atomically with respect to the garbage collector. Having an interrupt occur in the middle of what should be an atomic update would not be good. For instance, the collector scans the stack and a stack frame could be in the process of being setup when the interrupt occurs. This could lead to false pointer detections.
The book suggests using a poll operation to determine when to perform GC while code is running. Some sort of instruction that monitors the GC cycle signal line and jumps to the GC routine if it’s active would be useful. Polling instructions would need to be inserted into code in quite a few places leading to code bloat. But that is part of the cost of a GC system. Alternately GC interrupts could be disabled and enabled around the section of code needing atomic access. It’s more code either way.
In several places the book mentions the usefulness of having more processor registers available to support GC operations. With three or four more registers statically assigned to support garbage collection it makes me wonder if it would be better to go with a 64-register design rather than 32. There’s a couple of other places where a couple of statically assigned registers would be useful too (exception processing, task management). A 32-register design is already pretty crowded for registers before considering GC register uses. About a 40 register design would be good.

Current core ft64v7 is busted back to the clear-screen level.

Author:  BigEd [ Sun Jan 27, 2019 4:52 pm ]
Post subject:  Re: Thor Core / FT64

robfinch wrote:
Received a book on garbage collection a couple of days ago and I’m just reading through it.


If you find the book good enough to recommend, please do let us know which book it is... or indeed if you find it not good, that might be useful to know too!

I think GC is a bit like cache - if you could predict the future, you'd be perfect, but the best you can do is approximate what the likely future needs are, preferably without too much complexity.

Author:  Chromatix [ Sat Feb 09, 2019 8:19 pm ]
Post subject:  Re: Thor Core / FT64

My opinion on the presently ubiquitous forms of GC is unprintable. The Go-lang team at Google claim to have made considerable progress in eliminating big latency stalls, but I get increasingly suspicious about stuff I can't understand well enough to implement myself, and Go-lang's GC has gone multiple levels beyond that.

There's a very fast and simple approach to GC that works for most allocations: reference counting. Every object maintains a count of pointer references to itself, and every manipulation of pointers includes code to maintain those counters as necessary. When the counter falls to zero, the object is deallocated (and recursively checked for inner pointer decrements). You can do various clever things to reduce the amount of counter manipulation you need to do in practice, and defer the point at which you need to handle the recursive deallocations, but you can implement those incrementally for the most part.

The main pitfall is that circular data structures (which, annoyingly, includes doubly-linked lists) can prevent the counters from reaching zero even if the structure as a whole is orphaned, at which point most people promptly give up and implement a mark-and-sweep instead. What *I* suggest is to use reference counting to sharply reduce the work a mark-and-sweep GC needs to do. You can also treat a complete data structure with internal references as a single object, and have those internal references just be indexes rather than full-blown pointers. This requires that your language is capable of recognising complex data structures as objects in their own right, which is a deeper level of introspection than C or C++ manage.

Author:  robfinch [ Sun Feb 10, 2019 11:27 pm ]
Post subject:  Re: Thor Core / FT64

I’m about ¾ the way through reading the book, I’ve been putting off commenting until I finish the book. I’m looking at this from the perspective of what would be good hardware to have present in a cpu to support garbage collection. It’s looking to me like a stock machine would do an adequate job.
There’s a good section in the book on reference counting and it’s pros and cons.
Quote:
I get increasingly suspicious about stuff I can't understand well enough to implement myself,
I know the feeling. At some point one just has to jump in and assume that the code that’s been developed elsewhere works.

Author:  BigEd [ Mon Feb 11, 2019 6:34 am ]
Post subject:  Re: Thor Core / FT64

robfinch wrote:
I’m about ¾ the way through reading the book, I’ve been putting off commenting until I finish the book.

Sounds like a good plan!

Author:  robfinch [ Fri Feb 15, 2019 4:19 am ]
Post subject:  Re: Thor Core / FT64

I’ve been snowed in for a couple of days, so I managed to finish reading the book: “The Garbage Collection Handbook – The Art of Automatic Memory Management” by Jones, Hosking, and Moss.
The book covers about 50 years of garbage collection techniques and has tons of references for further reading. It outlines in pseudo-code with lots of diagrams various algorithms and deals with concepts like atomic memory updates and thread safety. It makes one aware of what goes on behind the scenes in managed environments. It’s an excellent book for any programmer to have on the bookshelf.
A garbage collection safe-point instruction (GCS) would be a handy modification to the instruction set. The safe-point instruction indicates when it is okay to run the garbage collector. It reminds me of co-operative multi-tasking. Also handy to have in hardware might be pointers to read and write barriers with their return addresses (takes four registers).

Author:  BigEd [ Fri Feb 15, 2019 8:18 am ]
Post subject:  Re: Thor Core / FT64

Thanks - sounds great. I see it's a bit pricey, as academic books sometimes are. The author page is here:
https://www.cs.kent.ac.uk/people/staff/rej/gc.html

Author:  robfinch [ Sat Feb 16, 2019 4:48 am ]
Post subject:  Re: Thor Core / FT64

The GCS instruction is evolving into a more generic thread-switch point (TSP) instruction. The TSP instruction is treated as a NOP unless there is a thread-switch pending in which case it acts like a trap. This is similar in concept to a trap-on-overflow. There needs to be a signal in the processor that can be raised to indicate a GC cycle or a thread switch. There also probably needs to be a vector register for the TSP instruction so that going through the normal break vector isn’t required. Pondering the value of using a single instruction to detect both GC cycles and thread-switches. Testing for both at the same time would save code bloat. But I’m wondering if there may be cases where the GC cycle should be detected, and a thread-switch ignored. Maybe options to the instruction. One would likely want to place the GC detect and thread-switch detect instructions at the same point in code.

Author:  BigEd [ Sat Feb 16, 2019 9:46 am ]
Post subject:  Re: Thor Core / FT64

Is it a sort of yield instruction, which indicates quiescence? In which case it might well be seen to serve one of several purposes.

Author:  GaBuZoMeu [ Sat Feb 16, 2019 5:18 pm ]
Post subject:  Re: Thor Core / FT64

Thanks Rob for the brief recommendation,
and thanks BigEd for the link.

Author:  robfinch [ Sun Feb 17, 2019 2:58 am ]
Post subject:  Re: Thor Core / FT64

Yes, it is a sort of yield instruction which is maybe a better moniker. That helped me recall the WFI (wait for interrupt) instruction.

I was going to write a long post containing my thoughts but I eventually realized all that’s needed is to use the wait-for-interrupt instruction (WFI). Raising the interrupt could be done by a store to the interrupt controller to trigger an interrupt. It’s easy enough to define a macro to hide the stores to the interrupt controller so “mRaise #GC” would cause a GC interrupt to occur. The one gotcha is that the interrupt would need to be processed only by a WFI instruction and not immediately taken. My thought on this is to simply run the system with interrupts disabled, so that they are only detected by the WFI instruction. Otherwise some mechanism would be needed to tell the difference between interrupts that need immediate processing and deferred interrupts.
Given a modern processor which can execute hundreds of instructions per memory cycle, polling for interrupts every few instructions isn’t really a lot of overhead.

Author:  robfinch [ Tue Feb 26, 2019 4:37 am ]
Post subject:  Re: Thor Core / FT64

Realized that wait for interrupt (WFI) isn’t quite the right instruction because it’s desired for the program to continue if no interrupt is present rather than wait. So, it’s really a poll for interrupt (PFI) instruction that’s desired.
Just wondering how to implement the PFI instruction in hardware. Poll for interrupt is similar to wait for interrupt (WFI) except that it doesn’t sit and wait for an interrupt. PFI takes the interrupt if it’s available otherwise it acts like a NOP instruction. The tricky part is recognizing the interrupt state when the instruction is executing. Because there are several stages to the processor before the execute stage, to implement PFI the interrupt status must be recorded at time of fetch and then propagated forwards to the execute stage. It looks to me like an extra interrupt flag bit has to be associated with each instruction. Also factoring into the hardware is regular interrupt processing. It’s likely desirable for the normal interrupt processing to occur if interrupts are enabled. Meaning the interrupt enable flag factors into things somehow.
I’m still fiddling with the interrupt hardware.

Page 26 of 33 All times are UTC
Powered by phpBB® Forum Software © phpBB Group
http://www.phpbb.com/