Difference between pulling RESET and power cycling

I have turned a Blueboard B/O board (LPC1768) into an autopilot. It talks over 2 serial ports and uses 4 I2C devices. Everything was going pretty well until recently. Sometimes it will just stop running: no more talking, no more flashing LED, and the current draw drops by a factor of ~2. A couple of things:

  1. I added a lot more logic, which could be leading to some kind of memory leak or other software badness.

  2. I was doing some flight testing where the power setup had changed. The input voltage was a little higher than the recommended input, but nowhere near the spec for the onboard regulator. Maybe I damaged the regulator? It does get kind of warm.

Here is the clue that I’m hoping will point somebody in the know the right way!

When it has this failure, I have to cycle power to get it working normally again. If I use the reset button instead, it seems to get stuck somewhere in the first couple of seconds of the code, before it gets into the while(1) loop. It prints a message to the UART, flashes the LED a couple of times on an interrupt, and then stops. It may be hanging on one of the I2C devices, either during initialization or on the first poll in the while loop.
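One way to find out whether an I2C device is the culprit is to put a bound on every wait, so a stalled transfer returns an error instead of hanging the board. Here is a minimal sketch of that idea. The names wait_with_timeout and ready_after_n are made up for illustration, and the "ready" callback is a stand-in for whatever status check the real I2C driver provides.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback type: returns true once the transfer has finished.
 * On real hardware this would read the I2C peripheral's status. */
typedef bool (*ready_fn)(void *ctx);

/* Poll `ready` at most `max_polls` times. Returns false on timeout so the
 * caller can report which device stalled instead of hanging forever. */
static bool wait_with_timeout(ready_fn ready, void *ctx, uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; ++i) {
        if (ready(ctx)) {
            return true;
        }
    }
    return false;
}

/* Stand-in for a real status check: reports ready once the counter that
 * `ctx` points to has been decremented to zero. */
static bool ready_after_n(void *ctx)
{
    uint32_t *remaining = ctx;
    if (*remaining == 0) {
        return true;
    }
    --*remaining;
    return false;
}
```

With something like this around each I2C wait, printing the device address to the UART on a false return would show which sensor (if any) stops responding after a reset.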

OK, just typing this helped me think out loud. The failure is random, but pretty reliable. I think I will just start taking I2C devices out of the software 1 by 1. It will take some work to unsolder them from the bus, but I guess I should do that also.

Arghhh but I was running these sensors and polling for raw data for hours on end before! Actually, I’m going back to USB power right now - not sure that I had any faults in that configuration.

I seem to have narrowed it down to the Kalman filter function. I’m running an extended test now to confirm.

What’s news to me is that even after a reset, some values still seem to be stored in memory. From reading up on what a reset does: the code starts back at the beginning, and the pins return to their defaults. It should then run back through all of the mallocs and static initializations. Hmmm, garbage in, garbage out. Now I just need to find my garbage.
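Since malloc does not zero the memory it hands back, and RAM generally keeps its contents across a reset while power stays on, any state that is not explicitly initialized can carry over stale values. A minimal sketch of setting the filter state to known values at startup, assuming a hypothetical 3-state filter (filter_state, N_STATES and filter_state_init are illustrative names, not from the actual code):

```c
#include <string.h>

#define N_STATES 3  /* hypothetical state dimension */

typedef struct {
    float x[N_STATES];            /* state estimate */
    float P[N_STATES][N_STATES];  /* estimate covariance */
} filter_state;

/* Put the filter into a known starting state regardless of what the
 * memory held before: zero everything, then set the diagonal of P. */
static void filter_state_init(filter_state *s, float initial_variance)
{
    memset(s, 0, sizeof *s);  /* all-zero bytes are 0.0f for IEEE 754 floats */
    for (int i = 0; i < N_STATES; ++i) {
        s->P[i][i] = initial_variance;
    }
}
```

Calling this once before the while(1) loop means a button reset and a power cycle start from the same numbers.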

Please disregard. It seems to boil down to the poor treatment of a singular matrix. It is a numerical methods / Kalman filtering problem, not a micro problem.
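For anyone who lands here with the same symptom, this is roughly the kind of guard I mean: check the matrix before inverting it and skip the update if it is singular or close to it. This is only a sketch for the 2x2 case, with an assumed relative tolerance; invert_2x2 and rel_tol are illustrative names, and the real filter may need a different size and a different test.

```c
#include <stdbool.h>

static float absf(float v)
{
    return v < 0.0f ? -v : v;
}

/* Invert a 2x2 matrix. Returns false (and leaves `out` untouched) when the
 * determinant is too small relative to the size of the terms that form it,
 * or when it is NaN. */
static bool invert_2x2(float m[2][2], float out[2][2], float rel_tol)
{
    float det = m[0][0] * m[1][1] - m[0][1] * m[1][0];
    float scale = absf(m[0][0] * m[1][1]) + absf(m[0][1] * m[1][0]);

    /* Written as !(a > b) so that a NaN determinant is also rejected. */
    if (!(absf(det) > rel_tol * scale)) {
        return false;
    }

    float inv = 1.0f / det;
    out[0][0] =  m[1][1] * inv;
    out[0][1] = -m[0][1] * inv;
    out[1][0] = -m[1][0] * inv;
    out[1][1] =  m[0][0] * inv;
    return true;
}
```

When it returns false, the filter can skip that measurement update and keep the previous estimate rather than pushing NaNs or huge values into the state.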

Use the watchdog timer if you want a true software reset.
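A rough sketch of what that could look like on the LPC17xx. The register layout and bit positions below are my assumption from memory of the user manual (UM10360), so verify them before relying on this; wdt_regs and wdt_force_reset are illustrative names. The registers are passed in as a pointer so the logic can be tried on a PC against a fake struct.

```c
#include <stdint.h>

/* Assumed layout of the first three LPC17xx watchdog registers
 * (verify against UM10360 before use). */
typedef struct {
    volatile uint32_t MOD;   /* assumed: bit 0 = WDEN, bit 1 = WDRESET */
    volatile uint32_t TC;    /* timeout reload value */
    volatile uint32_t FEED;  /* feed sequence: 0xAA then 0x55 */
} wdt_regs;

#define WDT_MOD_WDEN    (1u << 0)
#define WDT_MOD_WDRESET (1u << 1)

/* Arm the watchdog in reset mode and start it with one feed sequence.
 * On real hardware the caller would then stop feeding (for example by
 * spinning in a loop) and the chip resets when the timer expires. */
static void wdt_force_reset(wdt_regs *wdt, uint32_t timeout_ticks)
{
    wdt->TC = timeout_ticks;
    wdt->MOD = WDT_MOD_WDEN | WDT_MOD_WDRESET;
    wdt->FEED = 0xAA;
    wdt->FEED = 0x55;
}
```

On the target, the pointer would be the watchdog base address cast to wdt_regs * (I believe 0x40000000 on the LPC1768, but check the manual), and the two feed writes should not be interrupted by other code touching the watchdog.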