Apollo3 STIMER Silicon Bug

My second test was designed to prove that the timer double-counts. It routes the XTAL clock to a CLKOUT pin while a tight software loop counts the transitions on CLKOUT, then compares that manual count to the count in the timer. Since the timer and the CLKOUT pin are driven by the same internal XTAL clock source, the two counts should never diverge, regardless of the method used to count the XTAL clocks. But they do diverge. I proved to myself that the XTAL oscillator is OK and that the signal on the CLKOUT pin is correct: I connected both a scope and a frequency counter to the CLKOUT pin and set them to trigger on out-of-spec CLKOUT timing. Both instruments report perfect CLKOUT timing at the moments when the timer counter register skips a count. With the clock source ruled out, the only remaining explanation is that the timer count is wrong: the timer is double-incrementing. As mentioned earlier, Ambiq admits that their timers double-increment, but they never said what triggers the issue or how often it might happen. All they said is that it is “rare”. Maybe Ambiq and I just disagree on what “rare” means.
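
The reasoning behind that comparison can be sketched as a toy simulation. This is a Python model, not the firmware that ran on the chip: the function name `run_comparison` and the fault probability `p_double` are made up for illustration, and the real double-increment rate is not known.

```python
import random

def run_comparison(n_ticks, p_double, seed=0):
    """Count the same simulated clock two ways and report any divergence.

    p_double is a hypothetical per-tick chance that the timer model
    increments twice; it is an illustration value, not a measured rate.
    """
    rng = random.Random(seed)
    manual_count = 0        # what the software loop counts on CLKOUT
    timer_count = 0         # what the timer register reports
    first_divergence = None
    for tick in range(1, n_ticks + 1):
        manual_count += 1   # one clock edge seen by the software loop
        timer_count += 1    # the same edge seen by the timer
        if rng.random() < p_double:
            timer_count += 1    # simulated double increment
        if first_divergence is None and timer_count != manual_count:
            first_divergence = tick
    return manual_count, timer_count, first_divergence
```

With `p_double` set to 0.0 the two counts stay equal no matter how many ticks are simulated. Any nonzero value lets the timer count pull ahead of the manual count, which is the symptom seen on the hardware.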

I have shown that all you need to do to trigger the bug is to read the current value of any timer count register, STIMER or CTIMER0 through CTIMER7. Each time you read any one of those counters, there is a chance that the STIMER will silently double-increment. This is especially annoying because there is some other bug in the chip whose official workaround in the Ambiq HAL is to read the timer count value three times to determine the actual timer count. That workaround makes it roughly three times as likely that your STIMER count will get corrupted each time you read any timer counter using the HAL!
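
The "three times" estimate can be checked with a little arithmetic. The sketch below assumes each register read has the same small, independent chance `p_per_read` of triggering a double increment. Both that independence and the example value 0.001 are assumptions made for illustration; the actual per-read rate has not been published.

```python
def corruption_chance(p_per_read, reads_per_call):
    """Chance that at least one of several independent reads corrupts the count."""
    return 1.0 - (1.0 - p_per_read) ** reads_per_call

def expected_extra_counts(p_per_read, reads_per_call, calls):
    """Average number of spurious increments after a number of HAL calls."""
    return p_per_read * reads_per_call * calls

# Hypothetical per-read chance, chosen only to show the scaling.
single = corruption_chance(0.001, 1)
triple = corruption_chance(0.001, 3)
ratio = triple / single   # just under 3 for small p_per_read
```

For small per-read chances the ratio sits just under three, and the expected number of spurious increments scales exactly with the number of reads, so a triple-read workaround triples the average corruption per HAL call.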

As for how this affects people, well, that’s an open question. At a minimum, it is certainly something to be aware of. Imagine if the Apollo3 had a bug where adding two numbers together might occasionally report a result 1 larger than it should be, with no mechanism to detect the corruption. You could still write a large class of programs that would function just fine under those circumstances. Likewise, a timer that skips a value once in a while means that from time to time, 1/32768 of a second will appear to pass when it shouldn’t have. A large class of programs will never notice. However, some will! I found the bug precisely because I needed the timer to make precision measurements of the passage of time, and I noticed that the time reported by the timer was advancing faster than real time. That really tossed a wrench into my plans, so the bug can have severe side effects for some applications.
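
To put a number on the drift: each spurious increment adds 1/32768 of a second, about 30.5 microseconds, to the time the timer reports. The helper names below are hypothetical, and the skip counts passed to them are example inputs, not measurements.

```python
TICK_HZ = 32768  # timer clock rate, from the 1/32768 s tick described above

def drift_seconds(extra_counts, tick_hz=TICK_HZ):
    """Time the timer has gained after a number of spurious increments."""
    return extra_counts / tick_hz

def drift_ppm(extra_counts, elapsed_seconds, tick_hz=TICK_HZ):
    """Timer error in parts per million over a real elapsed interval."""
    true_ticks = elapsed_seconds * tick_hz
    return extra_counts / true_ticks * 1e6
```

As an example, a hypothetical rate of one spurious increment per second would make the timer run about 30.5 ppm fast, and 32768 spurious increments add up to a full second of phantom time.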