In conversation with Jeff, I realised that some of the techniques used in Acorn's BBC Micro could be used in any system which can do byte-wide I/O - which is pretty much any system. The same might well be true of techniques used in other machines - please contribute descriptions, pointers and ideas!
I wrote a concise summary of the BBC Micro's capabilities here:
Acorn's BBC micro - some resources

The two ideas which sprang to mind are:
- The Tube, a general coprocessor interface (pdf) with 4 byte-wide FIFOs in each direction, used to implement Acorn's OS (read, write, signals, general settings, complex calls, block transfers) such that the machine on the far side needs only a small boot ROM and a small I/O window in addition to a fast CPU and lots of RAM. It's an 8-address device which implements the full OS. The same idea has also been used with other communication channels: two back-to-back parallel ports, or a serial connection.
- The 1MHz bus (pdf), which is a 256-byte I/O space meant to be shared between devices, themselves controlled through specific allocated addresses in an adjacent 256-byte space. Using just 512 bytes of I/O space, one can drive large RAM disks, IDE disks, SCSI storage, solid-state storage, a wavetable synthesiser (or two), teletext receiver, IEEE 488 interface, and so on. As the name implies, this bus runs at 1MHz even though the host machine runs faster, which makes life easier for the peripherals. (Devices which don't need block transfers only need to occupy addresses in the second space - see here for a map of usage seen in the wild.)
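To make the 512-byte point a bit more concrete, here is a rough sketch in Python of how a shared 256-byte window plus a couple of control addresses can reach a much larger store using only byte-wide accesses. This is a toy model, not Acorn's specification: the class `PagedDevice`, the two-byte page latch, the control register numbers and the helpers `host_write` and `host_read` are all made up for illustration.

```python
WINDOW_SIZE = 256  # bytes of device storage visible to the host at one time


class PagedDevice:
    """Toy peripheral: a large store seen through a 256-byte window.

    Assumed (hypothetical) control layout: register 0 holds the low byte
    of the page number, register 1 holds the high byte.
    """

    def __init__(self, num_pages):
        self.num_pages = num_pages
        self.storage = bytearray(num_pages * WINDOW_SIZE)
        self.page_lo = 0
        self.page_hi = 0

    def control_write(self, reg, value):
        """Byte write into the control space: selects the visible page."""
        if reg == 0:
            self.page_lo = value & 0xFF
        elif reg == 1:
            self.page_hi = value & 0xFF
        else:
            raise ValueError("unknown control register")

    def _base(self):
        # Start of the currently selected page within the device store.
        page = (self.page_hi << 8) | self.page_lo
        return (page % self.num_pages) * WINDOW_SIZE

    def window_read(self, offset):
        """Byte read from the shared window."""
        return self.storage[self._base() + (offset & 0xFF)]

    def window_write(self, offset, value):
        """Byte write into the shared window."""
        self.storage[self._base() + (offset & 0xFF)] = value & 0xFF


def _select(dev, address):
    """Point the window at the page holding `address`; return the offset."""
    page, offset = divmod(address, WINDOW_SIZE)
    dev.control_write(0, page & 0xFF)
    dev.control_write(1, page >> 8)
    return offset


def host_write(dev, address, data):
    """Copy `data` to a linear device address, one byte access at a time."""
    for i, byte in enumerate(data):
        dev.window_write(_select(dev, address + i), byte)


def host_read(dev, address, length):
    """Read `length` bytes from a linear device address."""
    return bytes(dev.window_read(_select(dev, address + i))
                 for i in range(length))
```

For example, `PagedDevice(1024)` models a 256 KiB store, and `host_write(dev, 0x12FE, b"hello")` crosses a page boundary while the host only ever touches two control bytes and the window. A real driver would of course avoid rewriting the page latch for every byte; the sketch does so only to keep the logic obvious.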
Two other ideas which came up relatively recently on the 6502 forum, which are more generally applicable:
- memory-mapped FPU
- memory-mapped Life accelerator
- also mentioned in passing, a CRC accelerator
- not specifically mentioned, a 4x4 matrix ops accelerator for 3D purposes
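As a rough idea of what the host's view of such an accelerator might look like, here is a small Python model of a memory-mapped CRC unit driven purely by byte-wide reads and writes. Nothing here comes from the forum threads: the register map (`REG_CONTROL`, `REG_DATA`, `REG_CRC_LO`, `REG_CRC_HI`), the class `CrcAccelerator` and the helper `crc_via_device` are hypothetical, and the choice of CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF) is just an assumption to have something concrete.

```python
# Hypothetical register map: four consecutive byte-wide addresses.
REG_CONTROL = 0  # write any value: reset the running CRC
REG_DATA = 1     # write: feed one byte into the CRC
REG_CRC_LO = 2   # read: low byte of the running CRC
REG_CRC_HI = 3   # read: high byte of the running CRC


class CrcAccelerator:
    """Toy model of a CRC-16/CCITT-FALSE engine behind four registers."""

    def __init__(self):
        self.crc = 0xFFFF

    def write(self, reg, value):
        if reg == REG_CONTROL:
            self.crc = 0xFFFF
        elif reg == REG_DATA:
            # The "hardware" does the eight shift/XOR steps for the host.
            self.crc ^= (value & 0xFF) << 8
            for _ in range(8):
                if self.crc & 0x8000:
                    self.crc = ((self.crc << 1) ^ 0x1021) & 0xFFFF
                else:
                    self.crc = (self.crc << 1) & 0xFFFF
        else:
            raise ValueError("register is read-only")

    def read(self, reg):
        if reg == REG_CRC_LO:
            return self.crc & 0xFF
        if reg == REG_CRC_HI:
            return self.crc >> 8
        raise ValueError("register is write-only")


def crc_via_device(dev, data):
    """Host side: one write per input byte, then two reads for the result."""
    dev.write(REG_CONTROL, 0)
    for byte in data:
        dev.write(REG_DATA, byte)
    return (dev.read(REG_CRC_HI) << 8) | dev.read(REG_CRC_LO)
```

With this model, `crc_via_device(CrcAccelerator(), b"123456789")` gives 0x29B1, the usual check value for that CRC variant. On a 6502-style host each `write` or `read` would be a single store or load to the device's address, so the per-byte cost drops from a software bit loop to one instruction.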