When talking about the watchdog, we usually believe that it's more than enough to reset the timer on every call of OSSched().
My experience is that in some cases this is far from the truth: the kernel can be running and OSSched() can be called frequently, but at the same time the tasks can be locked and not working!
I had a case that is somehow connected with the last problem I found in SALVO, the bank problem (now solved). I was very surprised to see that the system stopped responding completely while the kernel was still running. The watchdog was ON, but because the watchdog is cleared from OSSched(), no watchdog reset occurred.
My investigation led me to the source of the problem: the interrupt that is responsible for calling OSTimer() had somehow been disabled.
In short: if you use the time-related features of SALVO (those connected with OSTimer(), such as OS_Delay and timeouts), resetting the watchdog only in the kernel is not enough to guarantee that the system is not locked or hung. To guarantee that the system is really OK we need some other, more complicated approach: we should also check whether OSTimer() is being called frequently enough (in cases where OSTimer() is used).
I found this potential problem just now and haven't had time to work out a solution. I always believed that resetting the watchdog in the kernel is enough... wrong. Just try disabling the interrupt that calls OSTimer() and you'll see the system hang. This can be done with one simple instruction like
code:
T0IE=0; // and the system hangs...
......
or GIE = 0; // this can easily happen: user code often disables and re-enables GIE
Look at the problem from a different angle: one wrong instruction, one unintentional clearing of a single bit... and BANG! I mean, here is one of the weak places of SALVO... be aware!
Well, I know that everybody tries to avoid this, but I have witnessed how one unexpected change of this bit makes the system hang, even though the kernel is still running.
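One direction that might work, only as an untested sketch and not a finished solution: let the timer interrupt leave a mark next to OSTimer(), and clear the watchdog from the main loop only while that mark keeps arriving. Every name below (tick_isr_hook, wdt_service, wdt_clear, MAX_PASSES_WITHOUT_TICK) is made up for illustration and is not part of SALVO. wdt_clear() is a stand-in for the real watchdog clear instruction so that the logic can be tried on a PC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed limit: how many main-loop passes may go by without a timer
   tick before we stop clearing the watchdog. Tune it per system. */
#define MAX_PASSES_WITHOUT_TICK 100u

static volatile bool tick_seen;   /* set by the ISR, consumed by the main loop */
static uint16_t passes_no_tick;   /* main-loop passes since the last tick */
unsigned wdt_clears;              /* stand-in: counts watchdog clears */

/* Stand-in for the real watchdog clear (e.g. CLRWDT on a PIC). */
static void wdt_clear(void)
{
    wdt_clears++;
}

/* Hypothetical hook: call from the timer ISR, right next to OSTimer(). */
void tick_isr_hook(void)
{
    tick_seen = true;
}

/* Hypothetical hook: call once per main-loop pass, next to OSSched(),
   instead of clearing the watchdog unconditionally. */
void wdt_service(void)
{
    if (tick_seen) {
        tick_seen = false;        /* tick arrived: timer interrupt is alive */
        passes_no_tick = 0;
    } else if (passes_no_tick < MAX_PASSES_WITHOUT_TICK) {
        passes_no_tick++;         /* saturating count of tick-less passes */
    }

    /* Clear the watchdog only while ticks are still arriving. Once the
       limit is reached the watchdog is left alone and will expire. */
    if (passes_no_tick < MAX_PASSES_WITHOUT_TICK) {
        wdt_clear();
    }
}
```

With this, both T0IE = 0 and GIE = 0 should end in a watchdog reset, because in both cases the ticks stop arriving. The limit has to be larger than the number of main-loop passes that fit into one normal tick period (plus the longest section where interrupts are disabled on purpose), otherwise a healthy system would be reset too.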
Any suggestions on how this could be avoided are welcome.
Best regards
Luben