Pumpkin, Inc.

Pumpkin User Forums

Watchdog Interrupt - Modifying Task Status and Skips



Post by jc_hsfl » Mon Jun 23, 2014 8:35 pm

I'm looking at coding a 'watchdog' style interrupt which will periodically check for tasks which are 'stuck'. I'd like to be able to decide selectively what to do with a particular task if it is stuck. For example, in the worst case it would perform a total reset; if the task was unimportant, I'd like the opportunity to kill it and clean up its resources.

So far I'm able to identify which task is running from the interrupt. Is it possible to make it safe to use OSDestroyTask (or similar) and force a context switch from the interrupt? This assumes the task is stuck where it cannot perform a context switch itself.

Re: Watchdog Interrupt - Modifying Task Status and Skips

Post by aek » Wed Jun 25, 2014 1:20 am

Salvo does not context-switch at the interrupt level (though it may appear to do that), hence what you asked ("Is it possible to make it safe to use OSDestroyTask (or similar) and force a context switch from the interrupt?") is not possible.

1. A WDT needs to function whether or not a task is "stuck" ... so the mechanism you use for WDT resets needs to be independent of stuck tasks.

2. Non-OS_ API calls can usually be called from anywhere. However, several have built-in "checks" that prevent their use on the current task (due to restrictions on what Salvo can and cannot do). Therefore, architecturally, the best approach is probably to have a dedicated task that manages discrete task stops and restarts. However, that won't work if you're stuck in a single task ... which leads to:

The best practices that I am aware of for WDTs can be traced to the use of external WDTs, which force a complete (power-on-like) reset in the event of a watchdog timeout. The key here is that external WDTs can _only_ forcibly reset the entire MCU (assuming they are connected to the MCU's -RESET pin). So while what you are trying to do (fine-grained control of stuck tasks) might be desirable in some circumstances, the bottom line is that a WDT reset event is a serious issue. My advice is to fix your code so that it won't happen and, if there is a problem, to reset the whole thing and log why it happened.

Re: Watchdog Interrupt - Modifying Task Status and Skips

Post by jc_hsfl » Wed Jul 02, 2014 1:21 am

Thanks for the thorough response. I took your advice and started using the watchdog interrupt to log which task is stuck and the program counter (PC) it is stuck at. It worked wonders when I didn't know which task was stuck.

For others who might want to figure out which task you're getting stuck in, here's what I did:

1. Create a task which resets a 'last watchdog task activity' counter every second.

2. Create an interrupt triggered by any timer at around a 1s interval. The ISR increments and checks the 'last watchdog task activity' counter on each interrupt.
- When the ISR detects that the counter has not been reset for more than x seconds, it prints a debug message via UART and saves a couple of key variables to 'noload' attribute variables, which are preserved across software resets and PIC24 WDT timeout resets. One of the more difficult values to get at was the program counter. The method I used was an assembly routine that loads a C variable with the program counter.

3. Create an assembly routine to 'attempt to' grab the program counter (PC) of the running task and put it into an XC16 'noload' variable that will be preserved across software/WDT-triggered resets.

Code:
/* Example File Name: "PIC24_debug_int_get_pc.h" */
/* C header for assembly code to grab PC of current task from an interrupt */

#include <stdint.h>   /* uint32_t */

void PIC24_debug_int_get_pc(uint32_t *);

Code:
/* Example File Name: "PIC24_debug_int_get_pc.s" */
; Assembly Code to grab PC of current task from an interrupt (assumes non-nested interrupts)
; Save as a .s file, and add it to your MPLAB X project, along with the above code in a C header

   .global   _PIC24_debug_int_get_pc
_PIC24_debug_int_get_pc:
    ; Because this is run from an interrupt, preserve used (W0-W5) registers
    push.d W0  ; Note that register W0 contains the first formal parameter (uint32_t *)
    push.d W2
    push.d W4

    ; Grab 'Stuck' Task PC
    ; NOTE: the 0x1A offset depends on what the ISR's entry code pushes before it sets up W14;
    ;       check it against the disassembly of your own ISR
    mov     W14, W2     ; Copy current frame pointer (W14) to register W2
    sub     #0x1A, W2   ; Subtract 0x1A from copied frame pointer to get location of the saved (return) PC
    mov.d   [W2], W4    ; Double-Word (32-bit) copy from address in W2 (hopefully task PC) into W4:W5
    mov.d   W4, [W0]    ; Double-Word (32-bit) copy of W4:W5 to return variable location (address in W0)

    ; Restore W registers
    pop.d W4
    pop.d W2
    pop.d W0

    ; done
    return

Code:
/* Example File Name: "noload-vars.h" */
/* global declaration of no-load variables */

#include <stdint.h>   /* uint32_t */

extern volatile uint32_t last_os_pc __attribute__((noload));

Code:
/* Example File Name: "noload-vars.c" */
/* .c definition of global no-load variable on the XC16 compiler */
/* Note: last_os_pc will be preserved across SW/WDT resets, and uninitialized on power-up/programming */

#include <stdint.h>

volatile uint32_t last_os_pc __attribute__((noload));


Code:
/* Example File Name: "timer_interrupt.c" */
/* Assuming this is run in an interrupt, the following grabs current task PC position and puts it into last_os_pc variable */

#include "PIC24_debug_int_get_pc.h"
#include "noload-vars.h"

/* Place the following line within your preferred ISR, under your preferred conditions, to grab the PC of the current task */

PIC24_debug_int_get_pc((uint32_t *)&last_os_pc);

/* To log which task is stuck, use the following line to grab the task number.  I use a uint16_t type to store the task ID number, and this can be done from the same interrupt: */

<your_noload_variable> = OStID(OScTcbP, OSTASKS);

   


To break out of the stuck task, I've been using the PIC24F WDT set to about 17 seconds. When the PIC24 restarts, I have the boot-up sequence display the saved program counter via UART (and also make it available via a command). This lets you pinpoint the exact instruction address(es) that a task is stuck at (especially if it's in a simple 'forever loop'). You can cross-reference the address with the MPLAB X 'program' instruction dump, or narrow it down to a function with the .map file associated with your latest hex file (for MPLAB X, located by default at: <project folder>/dist/default/production/).
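
Since the noload variable is uninitialized after power-up or programming, the boot-up code probably wants to check the reset cause before printing the saved PC. Below is a minimal sketch of that check, not the code used above. The function name is hypothetical, and the bit positions are what I understand the PIC24F RCON register to use (POR = bit 0, BOR = bit 1, WDTO = bit 4, SWR = bit 6); treat them as assumptions and confirm them in your device's data sheet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed PIC24F RCON bit positions; verify against the device data sheet. */
#define RCON_POR_BIT   (1u << 0)   /* power-on reset occurred        */
#define RCON_BOR_BIT   (1u << 1)   /* brown-out reset occurred       */
#define RCON_WDTO_BIT  (1u << 4)   /* watchdog timer timed out       */
#define RCON_SWR_BIT   (1u << 6)   /* software RESET instruction ran */

/* Hypothetical helper: decide whether a 'noload' variable such as
 * last_os_pc can be trusted, given a copy of RCON taken at boot.
 * Takes the register value as a parameter so it can be tested off-target. */
bool saved_pc_is_trustworthy(uint16_t rcon)
{
    /* After power-on or brown-out, RAM contents are undefined. */
    if (rcon & (RCON_POR_BIT | RCON_BOR_BIT)) {
        return false;
    }
    /* Only WDT and software resets are expected to follow a logged hang. */
    return (rcon & (RCON_WDTO_BIT | RCON_SWR_BIT)) != 0;
}
```

On the target this would be called as saved_pc_is_trustworthy(RCON) early in the boot-up sequence. The RCON flags stay set until software clears them, so clear them after reading; otherwise a stale POR flag will make every later WDT reset look untrustworthy.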

Re: Watchdog Interrupt - Modifying Task Status and Skips

Post by aek » Wed Jul 02, 2014 2:24 am

Cool!

What does each task have to do to be compatible with this scheme?

Re: Watchdog Interrupt - Modifying Task Status and Skips

Post by jc_hsfl » Wed Jul 02, 2014 11:11 am

So far, I've found that no changes are necessary to the tasks.
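
The tasks need no changes because only the dedicated watchdog task and the timer ISR touch the counter. Steps 1 and 2 from the earlier post can be sketched roughly as follows. This is a minimal illustration and not the code actually used: the function names, the variable name, and the 5-second limit are all hypothetical, and the Salvo task body and timer setup are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold: the "x seconds" from step 2. */
#define STUCK_LIMIT_S 5u

/* Seconds since the watchdog task last ran. Shared between the task
 * and the ISR, hence volatile. */
static volatile uint16_t secs_since_task_ran;

/* Step 1: the dedicated watchdog task calls this once per second.
 * If any task stops yielding, this call stops happening. */
void heartbeat_task_tick(void)
{
    secs_since_task_ran = 0;
}

/* Step 2: the ~1 s timer ISR calls this on every interrupt.
 * Returns true once the counter has gone unreset for more than
 * STUCK_LIMIT_S seconds, which is the cue to log the PC and task ID. */
bool heartbeat_isr_tick(void)
{
    if (secs_since_task_ran < UINT16_MAX) {
        secs_since_task_ran++;          /* saturate instead of wrapping */
    }
    return secs_since_task_ran > STUCK_LIMIT_S;
}
```

In the ISR, a true return value is where the debug message, the PIC24_debug_int_get_pc() call, and the OStID() capture would go. The hardware WDT still performs the actual reset later, so this logic only has to record what happened.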

To expand on the general concept of the assembly routine:

Assuming nested interrupts are disabled, this ISR always returns to mainline code or a task (with a RETFIE instruction). When the interrupt fires, the CPU automatically pushes the interrupted code's program counter onto the stack; the ISR's entry code then saves registers and sets up its own frame pointer (W14). As a result, the ISR can find the mainline/task program counter at a predictable offset (0x1A bytes in my build) below the ISR's frame pointer.
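
One caveat when reading the logged value: as far as I understand the PIC24 interrupt stack frame, the two words the CPU pushes hold more than the address. The low word holds PC<15:0> (bit 0 is not part of the address), and the high word holds PC<22:16> in its low 7 bits with status-register bits above them. Under that assumption, a small helper like the hypothetical one below can strip the extra bits before the value is compared with the .map file. Check the layout against your device's family reference manual.

```c
#include <stdint.h>

/* Assumed layout of the 32-bit value copied from the stack:
 *   bits 15..1  : PC<15:1>
 *   bit  0      : not part of the address
 *   bits 22..16 : PC<22:16>
 *   bits 31..23 : status bits saved alongside the PC
 * Keep only the address bits. */
#define PIC24_PC_MASK 0x007FFFFEul

/* Hypothetical helper: convert the raw value stored in last_os_pc
 * into a program memory address. */
uint32_t pic24_pc_from_raw(uint32_t raw)
{
    return raw & PIC24_PC_MASK;
}
```

For example, a raw value of 0x2481ABCD would decode to address 0x01ABCC under this layout.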

