(Continuation of this post)
After about an hour of sending thousands of web requests, bingo, the mystery character appeared, and the logic analyser breakpoint I set had triggered. I was pleased to see a clean and clear write of 0xFF to 0x20481 – the exact value to the location I suspected was being trashed.
After scrolling through reams of bus transactions and correlating it to a dissassembly, which I used Watcom’s ‘wdis’ tool to create, the problem was becoming apparent.
Although there was only one problem that caused the original fault, I spotted another while I was at it:
1 – I’d forgotten to push the AX register to the stack in the ISR. A bit of a bummer for any interrupted code that was using it.
2 – The second problem had me feeling like I’d failed computer science 101. I’m using the W5100’s interrupt so the application doesn’t have to waste time polling the SPI for stuff to do. Generally speaking the interrupt handler adhered to good practise – Doing no real work, just flagging stuff for the main routine to do.
The oops? I was still reading from the SPI in the ISR, and the main routine code was too, so one thread of w1500_read() would be interrupted by another copy of w5100_read(), stuffing the whole thing up, because the SPI controller and the W5100 are a shared resource, and I wasn’t disabling interrupts for the execution of the non interrupt context version.
In the end I moved the W5100 flags ‘read’ out of the ISR, and into the main loop, instead just flagging the interrupt pin in the ISR, which is safe.
Did I really need a logic analyser for this? Not really, but heck, once I had a capture, it helped me find both problems in minutes.