Low-latency, extremely accurate input timing is by far the hardest thing to handle in an emulator.
Generally the emulator will complete emulation of an entire video frame in a couple of ms, which means it then idles for 10+ ms. All the events that arrive during that idle time get delivered 'simultaneously' to the emulation. The emulation can delay them by the time it would take to receive them over e.g. a serial port, but they still arrive in big chunky clumps.
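To make the clumping concrete, here is a minimal sketch. It assumes a 50 Hz display (20 ms frames) and an emulator that only polls for host events at the start of each frame; `delivery_time_ms` and the sample arrival times are made up for illustration.

```python
import math

FRAME_MS = 20.0  # assumption: 50 Hz display, input polled once per frame


def delivery_time_ms(arrival_ms, frame_ms=FRAME_MS):
    """Host time at which the emulation first sees an event.

    Events are only picked up at a frame boundary, so the arrival time
    is rounded up to the next multiple of the frame period.
    """
    return math.ceil(arrival_ms / frame_ms) * frame_ms


# Hypothetical host-side arrival times, in ms.
arrivals = [3.0, 7.5, 12.25, 19.0, 21.0]
delivered = [delivery_time_ms(t) for t in arrivals]
added_latency = [d - a for a, d in zip(arrivals, delivered)]
```

The first four events, spread across 16 ms of real time, all land on the same 20 ms boundary, and the added latency varies from 1 ms to 17 ms depending on where in the frame each one arrived.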
Solving this is very hard. You could try a fine-granularity wait (e.g. on every video line), but that is subject to timing glitches whenever you don't get your timeslice, and on any non-realtime OS that happens moderately frequently. Alternatively you can buffer all input a frame ahead, including timing information, and replay it accurately, but that pumps latency into the system. And every solution requires some work to be going on at a very fine granularity, which is a right pain for power consumption, free cycles for the rest of the system and the like; even modern operating systems are only just starting to think about 1 ms timeslices, let alone 64 µs or similar.
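The buffer-and-replay option could look roughly like the sketch below. It is an illustration under stated assumptions, not how any particular emulator does it: 20 ms frames, 64 µs video lines, host timestamps in microseconds, and hypothetical helper names (`schedule`, `replay_frame`).

```python
FRAME_US = 20_000                      # assumption: 50 Hz frame
LINE_US = 64                           # assumption: 64 us per video line
LINES_PER_FRAME = FRAME_US // LINE_US  # 312; the leftover 32 us is ignored here


def schedule(event_time_us):
    """Map a host timestamp to the (frame, line) at which to replay it.

    The event keeps its position within the frame but is pushed one
    whole frame later, which is where the extra latency comes from.
    """
    frame, offset = divmod(event_time_us, FRAME_US)
    line = min(offset // LINE_US, LINES_PER_FRAME - 1)
    return frame + 1, line


def replay_frame(frame, pending):
    """Return the (line, payload) pairs due in `frame`, in line order.

    `pending` is a list of (timestamp_us, payload) tuples.
    """
    due = []
    for timestamp_us, payload in pending:
        target_frame, line = schedule(timestamp_us)
        if target_frame == frame:
            due.append((line, payload))
    return sorted(due)
```

For example, `replay_frame(1, [(6400, "b"), (640, "a"), (25000, "c")])` gives `[(10, "a"), (100, "b")]`: the two events from frame 0 are injected at the lines they originally arrived on, one frame late, and the third waits for frame 2.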
It's generally not a large problem for one-way communication, but bidirectional communication runs hard into it. Take using a network to emulate serial: two emulators connected via the network will really struggle to run any handshaking protocol at a decent speed, because an emulator that sends and then listens will often see a 10 ms delay before the response, whereas at 19200 baud a real link should always be able to see a response in under 1 ms. I ran into this trying to get serial-via-network working for Populous, Stunt Car Racer and the like.
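Some back-of-envelope arithmetic shows how badly a 10 ms stall hurts. This sketch assumes 8N1 framing (10 bits on the wire per byte) and a hypothetical handshake of one byte sent followed by one byte received; real protocols will differ.

```python
def byte_time_ms(baud, bits_per_byte=10):
    """Time to clock one byte over the wire; 10 bits assumes 8N1 framing."""
    return 1000.0 * bits_per_byte / baud


def exchanges_per_second(baud, stall_ms=0.0):
    """Handshake round trips per second.

    One exchange is modelled as one byte out, one byte back, plus any
    extra stall before the reply is seen (0 for a real cable).
    """
    return 1000.0 / (2 * byte_time_ms(baud) + stall_ms)


real_link = exchanges_per_second(19200)              # no added stall
emulated_link = exchanges_per_second(19200, 10.0)    # 10 ms stall per reply
```

At 19200 baud a byte takes about 0.52 ms, so a real link manages about 960 such exchanges per second. Add a 10 ms stall to each reply and that drops to roughly 90, about a tenth of the speed.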