 CS01 

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Improved the operation of critical regions in the OS. Critical regions are now protected with a semaphore in the memory system. They are also processed with interrupts turned off.
The assembler was busted for callee save registers over $s9. $s10 and $s11 were not encoding correctly. The OS uses $s11 to maintain the interrupt status.

_________________
Robert Finch http://www.finitron.ca


Sat Aug 21, 2021 3:33 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Something is amiss, as serial output now works for only one character. After the SerialPutChar() routine executes, the stack is messed up on the function return, causing a return to an invalid address and a halt. It is not as straightforward to diagnose as one might think. The operating system I/O function (FMTK_IO()) is called using an ‘ecall’ instruction. This causes the register set, including the stack pointer, to be changed to one for the ecall function. When the corresponding ‘eret’ instruction is executed, the register set is switched back to the original one from before the ecall. There are two dispatchers on the way to an I/O routine: the general OS dispatcher, which selects FMTK_IO(), and the FMTK_IO() dispatcher, which chooses the I/O function. Both of these are table driven.
Calling SerialPutChar() directly, bypassing all the processing of the ‘ecall’ and I/O dispatch, works great. However, it loses all the benefits of going through the OS. One feature the OS supports is locked access during critical regions. Unless running in the OS, these functions cause an illegal instruction trap. Another important feature provided by the OS is that other tasks are not blocked when one task blocks on I/O. These features cannot be provided from user mode code.
For serial output, if the transmitter fifo is full the output routine retries up to ten times to place a character in the fifo. If a character still cannot be placed, the operating system Sleep() function is called to put the task to sleep for a tick. Once awoken, the output routine again tries up to ten times to place a character in the fifo. After a total of 100 tries (10 rounds of 10 tries, with a Sleep() after each round), the function is aborted and an error code of E_NoDev (no device) is recorded. In theory it is supposed to work, but in practice it does not quite work yet. When put to sleep, the task gets lost by the OS, which starts looping around endlessly.
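The retry scheme described above can be sketched roughly as follows. This is only a model in Python, not the actual assembler routine: `try_put` and `sleep` are hypothetical callbacks standing in for the fifo write and the OS Sleep() call, and the numeric value given to E_NODEV is made up.

```python
E_OK = 0
E_NODEV = 1  # stand-in for E_NoDev; the numeric value is an assumption


def serial_put_char(ch, try_put, sleep, tries_per_round=10, rounds=10):
    """Try to queue ch, sleeping one tick after each failed round of tries.

    try_put(ch) -> bool : hypothetical, True if the transmit fifo took ch
    sleep(ticks)        : hypothetical stand-in for the OS Sleep() call
    """
    for _ in range(rounds):
        for _ in range(tries_per_round):
            if try_put(ch):
                return E_OK
        sleep(1)  # fifo still full after ten tries: give up the CPU for a tick
    return E_NODEV  # 10 rounds x 10 tries = 100 tries without success
```

With a fifo that never drains, this makes 100 attempts and 10 sleeps before returning the error code.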

It may seem like I am switching projects, but it is really all one mega-project. Getting the Femtiki operating system working can be applied to other sub-projects. Much of the guts of the OS can be ported to other processors even though written in assembler.

_________________
Robert Finch http://www.finitron.ca


Sun Aug 22, 2021 4:04 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
What’s this? The instruction trace queue dump displayed only the first 1024 instructions encountered. I had expected the fifo to roll over on overflow and end up continuously recording the last 1024 instructions. So, hardware needed to be added to pop fifo entries once the fifo was nearly full, so that new entries could be added. Dumping the trace fifo is a great debugging aid. It can identify exactly which instruction the processor crapped out on.
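A small software model of the intended trace behaviour, for illustration only. The real thing is hardware; the class and method names here are made up, and this model pops the oldest entry when the queue is completely full rather than "nearly full".

```python
from collections import deque

TRACE_DEPTH = 1024  # depth of the trace fifo described above


class TraceQueue:
    """Keeps only the most recent `depth` entries (hypothetical model)."""

    def __init__(self, depth=TRACE_DEPTH):
        self.depth = depth
        self.entries = deque()

    def record(self, pc):
        # Pop the oldest entry first so there is always room for the new one.
        if len(self.entries) >= self.depth:
            self.entries.popleft()
        self.entries.append(pc)

    def dump(self):
        # Oldest first; the last element is the most recent instruction.
        return list(self.entries)
```

Without the pop, the queue simply stops accepting entries once full, which is why the dump showed the first 1024 instructions instead of the last 1024.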
It turns out the apps were all configured to use the same register set, register set #4. I am not sure why it was hard-coded that way, except maybe for debugging. This naturally caused issues on a task switch, which relies on different apps using different register sets. There is a field in the application start record (ASR) that allows the register set to be specified. It should be different for each app.

Debugging is a bit further along now. The entire PAM can be dumped using the OS-based Putch() routine. Putting the task to sleep for one tick seems to work.
Tasks are switched several times, then the system crashes.

_________________
Robert Finch http://www.finitron.ca


Mon Aug 23, 2021 3:22 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Created the ViaTime() function, which returns the number of milliseconds since reset. It should be accurate to the millisecond. It uses the milliseconds time-slice variable, which is updated every 30ms, and then takes into consideration the current down count of the time-slice counter. The time-slice counter decrements at a 40MHz rate. So, a proportion of the current down count is added to the milliseconds variable to get a time accurate to 1ms. The ViaTime() function is used for time accounting in the OS, which previously just used the tick count, which was only accurate to 30ms. An issue that got fixed is that a task may run for only part of a time-slice if it calls the Sleep() function. It is tempting to use an even smaller increment for time accounting, say 100us. At 40MHz there are 4,000 clock cycles in each 100us interval. Computing the time would take substantially fewer cycles than that. A 10us resolution would be a lot more questionable.
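The arithmetic behind this can be sketched as below. It is a Python model rather than the OS code, and it assumes the time-slice counter is reloaded with a full 30ms worth of counts (1,200,000 at 40MHz) and counts down toward zero; the function and parameter names are invented for the example.

```python
CLOCK_HZ = 40_000_000                     # time-slice counter clock
SLICE_MS = 30                             # ms variable is bumped once per slice
COUNTS_PER_MS = CLOCK_HZ // 1000          # 40,000 counts per millisecond
SLICE_COUNTS = SLICE_MS * COUNTS_PER_MS   # assumed reload value: 1,200,000


def via_time(slice_ms_total, down_count):
    """Milliseconds since reset, refined with the current down count.

    slice_ms_total : ms variable, updated only at each 30ms time-slice
    down_count     : current value of the down counter (assumed to start
                     at SLICE_COUNTS and decrement toward zero)
    """
    elapsed_counts = SLICE_COUNTS - down_count
    return slice_ms_total + elapsed_counts // COUNTS_PER_MS


# Clock cycles available in one 100us interval
CYCLES_PER_100US = CLOCK_HZ // 10_000     # 4,000
```

For example, with the ms variable at 90 and the counter 7ms into the slice, via_time(90, SLICE_COUNTS - 7 * COUNTS_PER_MS) gives 97.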

The stack pointer was screwed up after an ecall. I dumped the sp value to a memory location just before the ecall, then again just after the ecall, and the two values were different. Since the ecall is not supposed to affect any of the registers except the return value registers $a0 and $a1, this is an issue. Thinking that perhaps the register set was not being swapped correctly, I decided to dump the register set selection before the ecall and again afterwards. Unfortunately for debugging, this fixed the stack pointer issue, and I do not have a clue as to why it would. The dump indicates the same register set before and after the ecall. The ecall itself uses a different register set and its own stack. I decided to put some extra FF registers in the register read address path of the core to align the register set number with the register number. It turns out the register set number was available a cycle earlier than the register number. This would cause a dud read from the wrong place in the register file, but it should be corrected in the next cycle. Aligning the two seems to make things work, although it should not make a difference.

The SLT instruction with an immediate constant larger than 12 bits was not being encoded correctly by the assembler. This caused a loop to fail.

_________________
Robert Finch http://www.finitron.ca


Tue Aug 24, 2021 3:49 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Removed an excess block RAM which was being used to implement a register file read port. When the core was coded the register set portion was coded as if it were in an overlapped pipeline. But the current core is not an overlapped pipeline, so the register read and write port can be shared because they are accessed in different states.

Found a bug in the via6522 component. The interrupt flag was being set if counts were zero. The flag should be set only on the transition to zero, not the value zero. Accessing the timer registers would clear the flag, however the flag would be set again in the next cycle if the count was still zero. This caused an almost continuous interrupt situation.
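To illustrate the difference between the two behaviours, here is a cycle-by-cycle model in Python. It is not the via6522 source; both functions are invented for the example and simply take a list of counter values, one per clock.

```python
def level_flag(counts):
    """Buggy behaviour: flag asserted on every cycle where the count is zero."""
    return [c == 0 for c in counts]


def edge_flag(counts):
    """Intended behaviour: flag set only on the nonzero -> zero transition."""
    flags = []
    prev = None  # no previous sample on the first cycle
    for c in counts:
        flags.append(c == 0 and prev is not None and prev != 0)
        prev = c
    return flags
```

With counts of 3, 2, 1, 0, 0, 0 the level version raises the flag on each of the three zero cycles, so clearing it by accessing the timer registers has no lasting effect. The edge version raises it once.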

_________________
Robert Finch http://www.finitron.ca


Wed Aug 25, 2021 7:23 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Made part of the RAM TMR (triple modular redundant). It looked like bit errors were occurring occasionally. This was wreaking havoc on the operation of the OS. When a bad app id or bad task id is present it can cause memory access issues, causing the OS to halt with a fault.
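For anyone unfamiliar with TMR, the usual approach is to keep three copies of each value and take a bitwise majority vote on read. A minimal sketch, assuming that is how the redundancy is applied here (the post does not spell out the details):

```python
def vote(a, b, c):
    """Bitwise majority of three copies: each result bit is the value
    held by at least two of the three inputs."""
    return (a & b) | (a & c) | (b & c)
```

A single flipped bit in one copy is outvoted by the other two, so vote(0x5000, 0x5001, 0x5000) returns 0x5000. It does not help if the same bit goes bad in two copies.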

Also wrote a simple memory test routine. The memory test does not find any errors.

The system ram is now accessible at two address ranges, one in the mapped memory space ($000xxxxx) and one in the unmapped memory space ($FF0xxxxx). The system has only 512kB ram so this did not cause too many issues.

_________________
Robert Finch http://www.finitron.ca


Thu Aug 26, 2021 5:03 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Having trouble with the external RAM. Somewhere along the path of a write cycle it is failing. So, I have now turned write cycles into write-readback cycles. The RAM is read back during the latter half of a write cycle and the value is compared to what was written. If they differ, the write cycle is redone up to 10 times, after which a badram signal is asserted. The PAMFindRun() function insists on returning to address $55555555. This happens to be the value written to RAM during the ramtest routine. Other than the function prolog where registers are saved, the PAMFindRun() function does not do any stores to memory. So, I have no idea yet why $55555555 is loaded as the return address. I tried dumping the return address on entry to the function and it looks correct.
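The write-readback idea, modelled in Python for illustration. The controller itself is hardware; `FlakyRam` is a toy model invented for this example, and the sketch reads "redone up to 10 times" as one initial write plus up to ten retries, which is an assumption.

```python
MAX_RETRIES = 10


class FlakyRam:
    """Toy RAM model that silently drops the first `drops` writes."""

    def __init__(self, drops):
        self.cells = {}
        self.drops = drops
        self.writes = 0

    def write(self, addr, value):
        self.writes += 1
        if self.drops > 0:
            self.drops -= 1   # this write is lost
            return
        self.cells[addr] = value

    def read(self, addr):
        return self.cells.get(addr, 0)


def write_with_readback(ram, addr, value, max_retries=MAX_RETRIES):
    """Write, read back, compare; retry on mismatch.

    Returns True if the value never stuck (the badram condition).
    """
    for _ in range(1 + max_retries):   # initial write plus retries
        ram.write(addr, value)
        if ram.read(addr) == value:
            return False
    return True
```

Note that this only catches a write that fails immediately. A cell that reads back correctly and then changes later would pass.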

RAM bits appear to be drifting high. For example, $FFFC0CFC is what is written and $FFFC0DFC is read back.
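A quick way to check the direction of the errors is to XOR the written and read-back values and split the difference into bits that went 0 to 1 and bits that went 1 to 0. A small helper, written just for this illustration:

```python
def bit_changes(written, read_back):
    """Return (went_high, went_low) masks for the bits that differ."""
    diff = written ^ read_back
    went_high = diff & read_back   # 0 when written, 1 when read
    went_low = diff & written      # 1 when written, 0 when read
    return went_high, went_low


high, low = bit_changes(0xFFFC0CFC, 0xFFFC0DFC)
print(hex(high), hex(low))
```

For the pair above, the only difference is bit 8 ($100), and it went from 0 to 1.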

_________________
Robert Finch http://www.finitron.ca


Fri Aug 27, 2021 4:31 am

Joined: Wed Nov 20, 2019 12:56 pm
Posts: 92
What kind of RAM are you using?


Fri Aug 27, 2021 7:50 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
It is a 3.3V 8ns CMOS static RAM. Does not require refresh or clock signal. It is the sram on a CmodA7-15T board.
https://www.issi.com/WW/pdf/61-64WV5128Axx-Bxx.pdf
The bits appear to be drifting high.
Dumping the TCB pointers gives:
$5000 for the first dump (a valid pointer)
$5001 the second time the pointer is dumped (an invalid pointer)
$5003 the third time the pointer is dumped
and
$500F the fourth time the pointer is dumped.
I have seen other bits that have gone high.

The vendor supplied RAM test passes.
And my simple checkerboard RAM test also passes.
But when it comes to trying to run the OS, issues abound.
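For reference, a checkerboard test of the kind mentioned looks roughly like this. It is a generic sketch in Python over a list of words, not the actual ramtest routine; the $55555555 pattern matches the value mentioned earlier in the thread, while the use of the inverse pattern and a second pass are assumptions.

```python
def checkerboard_test(mem, n_words):
    """Write alternating patterns, read them back, then repeat inverted.

    Returns a list of (index, expected, actual) mismatches.
    """
    errors = []
    for base in (0x55555555, 0xAAAAAAAA):
        inverse = base ^ 0xFFFFFFFF
        # Fill pass: even words get the base pattern, odd words its inverse.
        for i in range(n_words):
            mem[i] = base if i % 2 == 0 else inverse
        # Verify pass.
        for i in range(n_words):
            expected = base if i % 2 == 0 else inverse
            if mem[i] != expected:
                errors.append((i, expected, mem[i]))
    return errors
```

A test like this writes everything in one sweep and reads it back in another, so it may well pass on a part that misbehaves only under the mixed read/write traffic the OS generates.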

I do not know whether or not to get another board and see if the same thing happens.

_________________
Robert Finch http://www.finitron.ca


Sun Aug 29, 2021 4:53 am

Joined: Wed Nov 20, 2019 12:56 pm
Posts: 92
robfinch wrote:
It is a 3.3V 8ns CMOS static RAM. Does not require refresh or clock signal. It is the sram on a CmodA7-15T board.
https://www.issi.com/WW/pdf/61-64WV5128Axx-Bxx.pdf
The bits appear to be drifting high.
Dumping the TCB pointers gives:
$5000 for the first dump (a valid pointer)
$5001 the second time the pointer is dumped (an invalid pointer)
$5003 the third time the pointer is dumped
and
$500F the fourth time the pointer is dumped.
I have seen other bits that have gone high.


Had those memory locations been re-written in between reads?

What are you using as a memory controller? When writing, are you definitely allowing sufficient time for the chip to go High Z before putting your own data on the bus, keeping it valid long enough after de-asserting WE/CS, and then setting it high-Z in plenty of time before the chip starts driving it again?


Sun Aug 29, 2021 9:21 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
Quote:
Had those memory locations been re-written in between reads?
No.
Quote:
What are you using as a memory controller?
A home grown memory controller is in use.
I believe the RAM specs are being met. A 120MHz base timing clock is being used. And the read/write pulse width is two clock cycles or 16.67ns. The RAM is thus being used at less than ½ its potential rate.
The controller is here:
[url]Cores/cs01memInterfaceTMR.sv at master · robfinch/Cores · GitHub[/url]
The basic process is:
Write cycles:
Drive CE low, drive WE low, wait two clocks (16.67ns), drive WE high to latch the data. Wait one clock, drive OE low, wait a clock and compare data in to data out. If they are not the same, start the write cycle over again. Drive CE high.
Read cycles:
Drive CE low, drive OE low, wait two clocks (16.67ns), latch the read data, drive OE high.

It is a little more complex than described, as byte, wyde and tetra access is supported, so there is an address increment taking place and loops to handle more than byte-wide data.
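The timing numbers for the simplified sequences above work out as follows. This is just arithmetic on the steps as described, in Python; the step tables are a paraphrase of the description and not taken from the controller source, and they ignore the address increment and looping for wider accesses.

```python
CLOCK_HZ = 120_000_000
CLOCK_NS = 1e9 / CLOCK_HZ      # about 8.33ns per clock
RAM_ACCESS_NS = 8.0            # speed grade of the sram

# (step description, clocks waited after the step)
WRITE_STEPS = [
    ("CE low, WE low", 2),
    ("WE high, data latched", 1),
    ("OE low", 1),
    ("compare read-back, CE high", 0),
]
READ_STEPS = [
    ("CE low, OE low", 2),
    ("latch read data, OE high", 0),
]


def cycle_ns(steps):
    """Total time spent waiting across the listed steps, in ns."""
    return sum(clocks for _, clocks in steps) * CLOCK_NS


pulse_ns = 2 * CLOCK_NS        # width of the read/write pulse
print(round(pulse_ns, 2), round(cycle_ns(WRITE_STEPS), 2), round(cycle_ns(READ_STEPS), 2))
```

The two-clock pulse comes to 16.67ns against an 8ns part, so the ratio is about 0.48. A byte write with readback takes four clocks, about 33.33ns, on this reading.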

_________________
Robert Finch http://www.finitron.ca


Sun Aug 29, 2021 2:52 pm

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1782
Feels like a timing issue to me - missing hold time constraints perhaps?


Sun Aug 29, 2021 3:40 pm

Joined: Wed Nov 20, 2019 12:56 pm
Posts: 92
robfinch wrote:
The controller is here:
[url]Cores/cs01memInterfaceTMR.sv at master · robfinch/Cores · GitHub[/url]


The URL got a bit screwed up there, but I think this is the relevant file?
https://github.com/robfinch/Cores/blob/master/CS01/rtl/cs01memInterfaceTMR.sv

If I'm reading it correctly, MemT is an output enable, used to tristate the bus, but I'm not sure which state corresponds to high-z and which corresponds to driving. I think MemT low = driving, but I'm not sure.

RamWEn is asserted at state WR0, and released at WR1. MemT is driven low at the start of a write cycle - i.e. before RamWEn - but rises at WR1, so the same moment as releasing RamWEn.

You shouldn't be driving the bus until tHZWE after WE is asserted, so if MemT low = low-Z then I think you're driving too soon and risking bus contention? Also it rising at the same time as WE gives you zero hold time - which according to the datasheet is OK, but if there's any skew between the various RAM signals, you could end up with the output-enable leading WE, which could give you negative hold. Are all the RAM-related signals in IOBs?

If MemT high = driving, then it needs to rise before WE, not at the same time. (I think MemT high = high-z, though?)

(SRAM on Xilinx is the exact opposite of what I'm used to, which is SDRAM on Altera/Intel chips - but I hope these thoughts are some help, nonetheless.)


Sun Aug 29, 2021 9:39 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
On a roll today. Got character input from the serial port to work, using the OS I/O routines. Receiving a character is interrupt driven.
The monitor program works well enough to dump memory and a few other things with a few hiccups.

Quote:
Feels like a timing issue to me - missing hold time constraints perhaps?

I do not think it is a timing issue as there is a whole clock cycle dedicated to setup/hold. I have specified an output delay from the clock of a max of 2ns. I looked at a couple of other projects using the sram and they did not specify anything other than the pin numbers.

Quote:
If I'm reading it correctly, MemT is an output enable,
memT is the output enable, LOW = driving to the sram. I thought there was an extra clock cycle between memT tri-stating and OE going low. As I understand it, a critical thing to get right with asynch ram is the rising edge of the write signal and the data setup time. There may be some contention with driving memT at the same time, but there is a full two clocks to stabilize and meet the setup time. I guess the core should really be test benched.

Shamefully, I have not yet created a test bench specifically for the memory controller. I just wrote it and plugged it in, and it seemed to work. Maybe it's time to create a real test bench with a traffic generator.
One issue is that it seems to completely miss write cycles occasionally. I think this may be due to the fact the controller is operating at 3x the processor clock rate. If the write signal from the processor is skewed at all from the chip select / strobe then maybe it does a read cycle instead. I put a fix in to account for possible skew. There is a check for the write signal during a read operation.

_________________
Robert Finch http://www.finitron.ca


Sun Aug 29, 2021 10:04 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
I dumped the RAM control signals in a logic analyzer and made one minor fix. The RamOEn and MemT signals were active all the time because they were not qualified with the circuit select (cs). This should not have affected anything as the RamCEn signal acts as a qualifier.
Read Cycle:
The read cycle is for a word read. The changing data is due to the address changing. It looks like maybe another cycle could be added, as the data is latched at the same time the address is updated. I was relying on the data hold time to be able to latch correct data.
[Attachment: WordReadCycle.png (CS01 word read cycle)]


Write Cycle:
The write cycle is for a word write.
[Attachment: WordWriteCycle.png (CS01 word write cycle)]

_________________
Robert Finch http://www.finitron.ca


Sun Aug 29, 2021 10:59 pm