 8086 hardware virtualization with NMI on I/O access 

Joined: Sun Jan 10, 2021 9:12 pm
Posts: 2
For the past three years I've been involved in hacking the 8088 card for the Commodore CBM-II computer.

Basically, it's an add-on card for a 6502-based computer which adds a second CPU (8088), allowing it to run a specially compiled version of MS-DOS 1.25. It was an attempt by Commodore to join the "MS-DOS compatible" club, and like all the other such attempts, it failed in the market because it was not IBM PC compatible and no PC applications could run on it. In fact, this card is about as incompatible as it gets: the 8088 CPU has no access whatsoever to any I/O or video, instead passing all requests via an IPC mechanism to the 6502 CPU, which handles everything and passes the results back [it's actually a really interesting configuration, but that's a story for another time].

Anyway, my goal is to make this hardware setup able to run PC applications by making it as PC compatible as possible using software emulation. Interestingly, it is possible to achieve a very high degree of compatibility with a software-only solution, by (1) reimplementing a set of PC BIOS interrupts as wrappers for appropriate IPC calls to the 6502 and (2) writing a routine that periodically copies screen data from the B000 segment to the 6502 video memory. I am able to run FreeDOS, Turbo Pascal, Norton Commander etc. with this setup.
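To illustrate step (2), here is a minimal sketch of a periodic screen refresh, written in Python for readability rather than as real 8088 code. It assumes an 80x25 MDA-style text buffer with interleaved character and attribute bytes, and it forwards only the cells that changed since the last pass. The change tracking, the `shadow` buffer and the `send_cell` callback (standing in for the IPC call to the 6502) are hypothetical and are not taken from the actual routine.

```python
# Sketch of a "copy only what changed" screen refresh.
# Assumptions (not from the original post): an 80x25 MDA-style text buffer
# with interleaved character/attribute bytes, and a hypothetical
# send_cell(cell, char) callback standing in for the IPC call to the 6502.

COLS, ROWS = 80, 25
CELLS = COLS * ROWS

def refresh_screen(text_buffer, shadow, send_cell):
    """Forward changed character cells; return how many were sent."""
    sent = 0
    for cell in range(CELLS):
        char = text_buffer[cell * 2]      # even bytes hold characters
        if shadow[cell] != char:          # odd bytes (attributes) are ignored
            shadow[cell] = char           # remember what the 6502 side has
            send_cell(cell, char)
            sent += 1
    return sent
```

A real routine would probably also have to translate character codes for the target display, which this sketch leaves out.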

But there are a lot of PC applications that access the PC hardware directly, so emulating the PC BIOS is not enough to make them work. So I wanted to introduce a higher degree of compatibility, by emulating PC hardware chips... in software too of course :)

To that end, I added a little circuitry to the board, which generates an NMI signal whenever an I/O device is accessed. There are only two instructions that can be used to access I/O: IN and OUT, and they always operate on the accumulator (AL for byte accesses, AX for word accesses). So the theory is: when one of these instructions is used, an NMI is generated and the interrupt handler routine performs the appropriate emulation tasks using the value in AL (in the case of OUT) or stuffing AL with an appropriate value on return (in the case of IN).
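The dispatch step of such a handler could look roughly like the toy model below (Python, not real handler code). It assumes the handler can somehow recover the port number and the direction of the access; the `regs` dictionary, the two handler tables and the 0xFF default for unmapped ports are all hypothetical choices made for the sketch.

```python
# Toy model of the NMI handler's dispatch step.
# Assumptions: the port number and access direction are available to the
# handler; the handler tables and the regs dict are hypothetical stand-ins
# for real 8088 state.

def handle_io_nmi(regs, port, is_write, out_handlers, in_handlers):
    """Emulate one trapped IN/OUT by calling a per-port handler."""
    if is_write:
        handler = out_handlers.get(port)
        if handler is not None:
            handler(regs["al"])            # OUT: consume the value in AL
    else:
        handler = in_handlers.get(port)
        # IN: stuff AL with the emulated value; unmapped ports read 0xFF here
        regs["al"] = handler() & 0xFF if handler is not None else 0xFF
    return regs
```

For example, `handle_io_nmi(regs, 0x61, True, {0x61: speaker_write}, {})` would pass AL to a hypothetical `speaker_write` function.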

Well, in theory there is no difference between theory and practice. In practice, however, it turned out that the Intel 8086 datasheet is misleading. It says that when a rising edge is detected on the NMI line, the processor will finish the current instruction and handle the interrupt. This is not true - after a rising edge on the NMI line, the processor finishes the current instruction, executes the next one and only then handles the interrupt (I have tested this on both 8088 and V20).

Of course this poses serious problems for emulation, because if the interrupt occurs after the second instruction, that instruction may very well change the value in the AL register. This is not such a big deal with OUT, as a simple latch can hold the original value that was written to the I/O space. It is not so easy with IN, though; code such as this (which is very common) will not work:

IN AL, someport
TEST AL, somemask
JZ somewhere

The interrupt is handled only after the TEST instruction is executed, and it is too late - the (bogus) value in AL has already been checked.

The solution I came up with is a kludge, but it works in most reasonable scenarios: the interrupt routine gets the return address from the stack, checks if two bytes earlier there is an instruction that operates on AL (currently I am checking for TEST, OR, AND, CMP and MOV) and if so, modifies the return address on the stack so that the interrupt returns to this instruction, which is executed again, this time with a correct value in AL.
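The check can be sketched as follows, again in Python rather than 8088 assembly. The post does not list the exact encodings it looks for, so this sketch assumes only the two-byte accumulator/immediate forms plus the register form of MOV with AL as the source; `code` is a hypothetical bytes-like view of the code segment indexed by IP.

```python
# Sketch of the return-address rewind described above.
# Assumptions: only two-byte encodings are recognised (the post does not
# list the exact opcodes it checks), and `code` is a bytes-like view of
# the code segment indexed by IP.

AL_IMM8_OPCODES = {
    0xA8: "TEST AL, imm8",
    0x0C: "OR AL, imm8",
    0x24: "AND AL, imm8",
    0x3C: "CMP AL, imm8",
}

def is_mov_from_al(opcode, modrm):
    # MOV r/m8, r8 (opcode 88h), register-direct mode, AL as the source
    return opcode == 0x88 and (modrm & 0xC0) == 0xC0 and (modrm & 0x38) == 0x00

def rewind_return_ip(code, return_ip):
    """Return the IP to resume at so an AL-consuming instruction reruns."""
    if return_ip < 2:
        return return_ip
    opcode, operand = code[return_ip - 2], code[return_ip - 1]
    if opcode in AL_IMM8_OPCODES or is_mov_from_al(opcode, operand):
        return return_ip - 2               # re-execute it with the real AL
    return return_ip                       # leave the return address alone
```

One obvious weakness of this kind of check is that the two bytes before the return address may be the tail of a longer instruction that merely looks like one of these opcodes.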

There are, of course, a hundred ways it can go wrong, but generally it works for realistic code scenarios. This way I was able to make the Turbo Pascal sound() function work, by intercepting accesses to I/O ports 42h and 61h and using the values written there to drive the SID chip on the 6502 side. Still, it remains what it is: a kludge, which may or may not work depending on many factors.
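As a rough illustration of the port 42h / 61h translation, the Python sketch below collects the timer divisor and converts it to a SID frequency register value using the SID relation Fout = Fn * Fclk / 2^24. It assumes the divisor arrives low byte first, treats a zero divisor as silence for simplicity, and uses `sid_clock_hz` as a placeholder: the actual clock feeding the SID on the CBM-II has to be substituted. The class and method names are hypothetical.

```python
# Sketch: turn trapped writes to the PC speaker ports into a SID value.
# Assumptions: PIT channel 2 is loaded low byte first, then high byte;
# sid_clock_hz is a placeholder parameter, not the real CBM-II value.

PIT_CLOCK_HZ = 1_193_182          # PC timer input clock

class SpeakerEmulator:
    def __init__(self, sid_clock_hz=1_000_000):
        self.sid_clock_hz = sid_clock_hz
        self.divisor = 0
        self.expect_high = False
        self.enabled = False

    def out_42h(self, value):
        """Collect the 16-bit PIT divisor, one byte per write."""
        value &= 0xFF
        if self.expect_high:
            self.divisor = (self.divisor & 0x00FF) | (value << 8)
        else:
            self.divisor = (self.divisor & 0xFF00) | value
        self.expect_high = not self.expect_high

    def out_61h(self, value):
        """Bits 0 and 1 gate the timer and connect it to the speaker."""
        self.enabled = (value & 0x03) == 0x03

    def sid_frequency_register(self):
        """16-bit SID value Fn, from Fout = Fn * Fclk / 2**24."""
        if not self.enabled or self.divisor == 0:
            return 0                       # simplification: treat as silence
        fn = (PIT_CLOCK_HZ << 24) // (self.divisor * self.sid_clock_hz)
        return min(fn, 0xFFFF)
```

With a divisor of 1193 (about 1 kHz) and the placeholder 1 MHz clock, the result lands near 16780.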

So I guess what I am really asking is: is this NMI behavior normal on the 8088? And is there anything that could be done to prevent or circumvent it?

Mon Jan 11, 2021 9:21 am

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1632
Welcome, Michau, and a really nice project with an ingenious ingredient. But I can't answer the crucial question, it being well outside my territory...

Mon Jan 11, 2021 9:51 am

Joined: Sat Feb 02, 2013 9:40 am
Posts: 1482
Location: Canada
I suspect it may be normal behaviour for x86.

The 8088 has a prefetch queue of four bytes (six bytes for an 8086), IIRC. Could it be that it is emptying out the queue before executing the NMI? Or rather, queuing the NMI in order with the other instructions in the queue? If that is the case, then there may be more than one additional instruction executed before the NMI is processed. I think you are using a very clever trick.

Robert Finch

Mon Jan 11, 2021 2:15 pm

Joined: Sun Jan 10, 2021 9:12 pm
Posts: 2
Well yes, there is a prefetch queue. But it is largely independent of the execution unit, and simply fetches bytes from memory whenever there is nothing else to do on the bus. The execution unit can use these prefetched bytes, but it will also happily discard them, for example when branching. So I would rather presume that the same thing happens when an interrupt occurs: the prefetch queue should be cleared, otherwise the interrupt latency would become horrible. Anyway, if the prefetch queue were to blame, the delay in NMI handling would vary with the state of the queue. But it is always one instruction, regardless of the code being executed (which would influence the queue), and also of the processor type (8088 vs V20 - on the 8088 the queue should fill faster as the execution unit is slower).

I have found another nugget in the Intel datasheet, though. It says that the NMI line should be held active for at least two clock cycles. Not only does this not make any sense for an edge-triggered interrupt, but it's also completely false - I checked that NMI pulses of one clock cycle, or even half a clock cycle, work perfectly. But could it perhaps mean that after an NMI occurs, it takes the CPU two cycles to prepare for its execution, and by that time the IN / OUT instruction has already finished and another one has started?

Mon Jan 11, 2021 5:52 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 1482
Location: Canada
When I implement edge detection for NMI, it just checks for a low-to-high difference on the NMI line in consecutive clock cycles, which means NMI should be active for more than one clock to guarantee detection. I suspect the 80x86 designers did the same thing. The edge itself does not actually trigger any action. It is not really an edge-triggered action, but an edge-*detected* triggered action.
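A small Python model of that sampled scheme shows why a short pulse can be missed. It assumes time is counted in half clock periods, that the line is sampled only on even ticks, and that `nmi_line` is a list of 0/1 levels; it models the approach described here, not the actual 8088 circuit.

```python
# Sketch of a clock-sampled rising-edge detector.
# Assumptions: one list entry per half clock period, sampling on even
# ticks only; this is a model of the described scheme, not of the 8088.

def sampled_edge_detected(nmi_line):
    """True if two consecutive clock samples show a low-to-high change."""
    samples = nmi_line[0::2]            # one sample per full clock
    return any(prev == 0 and cur == 1
               for prev, cur in zip(samples, samples[1:]))
```

A pulse that rises and falls between two sampling points, such as `[0, 1, 0, 0]`, is invisible to this detector, while one that spans a sampling point, such as `[0, 0, 1, 1, 0, 0]`, is caught.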

Edge-triggered signals still need to be synchronized to the processor clock. If, as you say, pulses of ½ clock period or less are being recognized, the edge likely sets an SR latch which is recognized at the next clock edge. It is probably spec’d as two cycles to account for the cases where the NMI edge does not quite meet the clock setup time, which could give the appearance of it being delayed a cycle. So that engineers can estimate the timing, they just say ‘minimum of two clocks’.
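The latch idea can be modelled in a few lines of Python. The sketch assumes a timeline of 0/1 levels with one entry per half clock period and clock edges on even ticks; the function name and return convention are hypothetical, and this is a guess at the mechanism, not a description of the real silicon.

```python
# Sketch of latch-based capture: an asynchronous rising edge sets a latch,
# and the latch is examined on the next clock edge.
# Assumptions: one list entry per half clock period, clock edges on even
# ticks; the function name and return convention are hypothetical.

def latched_edge_recognised_at(nmi_line):
    """Return the tick at which the latched edge is seen, or None."""
    latch = 0
    previous = 0
    for tick, level in enumerate(nmi_line):
        if tick % 2 == 0 and latch:        # clock edge: latch is recognised
            return tick
        if previous == 0 and level == 1:   # any rising edge sets the latch
            latch = 1
        previous = level
    return None
```

In this model even a half-period pulse is caught, and an edge that arrives exactly on a clock edge is only seen one full clock later, which is consistent with the setup-time argument.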

There may also be a one- or two-stage synchronizer to the clock, to guard against metastability.

Robert Finch

Mon Jan 11, 2021 7:04 pm