Pumpkin, Inc.

Pumpkin User Forums

OS_WaitXYZ() behaviour

Postby luben » Mon Jul 09, 2001 8:39 am

Hello,

I have a question about the OS_WaitXYZ().

Let's say it's OS_WaitBinSem(BINSEM1,...), which waits for the binary semaphore BINSEM1.

If the semaphore is already set before OS_WaitXYZ() runs, what will occur:
1. context switch first, and then analyze the event
2. analyze the event and, if it is set, don't context switch - just return control.

In my opinion it should be the first case, because the second one could cause many problems. On the other hand, the second case could speed up the program, so there is some reason to choose it too.

Luben

luben
 
Posts: 324
Joined: Sun Nov 19, 2000 12:00 am
Location: Sofia, Bulgaria

Re: OS_WaitXYZ() behaviour

Postby Salvo Tech Support » Tue Jul 10, 2001 8:13 am

Hi Luben.

Case #2 is the correct one for the behavior of a semaphore, and is how Salvo works. Also, in case #2, the binsem is returned to zero.

Remember, there's no need to context switch if the binsem is available.
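
As a rough illustration of case #2, here is a minimal sketch in plain C. It is only a model of the behaviour described above: the names `model_binsem_t`, `model_wait_binsem`, `WAIT_CONTINUED` and `WAIT_YIELDED` are hypothetical and are not taken from Salvo's API or source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of case #2; none of these names come from Salvo. */
typedef enum { WAIT_CONTINUED, WAIT_YIELDED } wait_result_t;

typedef struct {
    bool is_set; /* true when the binary semaphore is available */
} model_binsem_t;

/* Test the event first; only an unavailable event leads to a context switch. */
wait_result_t model_wait_binsem(model_binsem_t *sem)
{
    assert(sem != NULL);
    if (sem->is_set) {
        sem->is_set = false;   /* the binsem is returned to zero */
        return WAIT_CONTINUED; /* no context switch: the task keeps running */
    }
    return WAIT_YIELDED;       /* the task would wait and yield to the scheduler */
}
```

Calling `model_wait_binsem()` on a set semaphore clears it and reports that the task continues; calling it again right away reports that the task would yield.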

--------
Salvo Technical Support
Please request all tech support through the Forums.
Salvo Tech Support
 
Posts: 173
Joined: Sun Nov 19, 2000 12:00 am

Re: OS_WaitXYZ() behaviour

Postby luben » Tue Jul 10, 2001 9:40 am

Hello,

I think that this has some potential problems. If I make a loop and rely on OS_WaitXYZ() to context switch, the switch might never happen; I mean the task will not return control to the scheduler if the event occurs too frequently.

At least this should be noted in the manual: if the event already exists, there is no context switch.

Luben

luben
 
Posts: 324
Joined: Sun Nov 19, 2000 12:00 am
Location: Sofia, Bulgaria

Re: OS_WaitXYZ() behaviour

Postby luben » Fri Jul 13, 2001 7:54 am

Hello,

You wrote that OS_WaitXYZ() doesn't have an exception and that its behaviour is described in the manual. That is half of the truth; I mean it's formally true. In fact you are trying to bring the law of exceptions into the manual. Well, that is something that can help users avoid the hidden problems of exceptions. Because events come asynchronously, the user can't rely on the rule that if the event exists there will be no context switch, and vice versa. The user DOESN'T KNOW exactly when an event will come, or knows it only in a limited number of cases. That's why I say that OS_WaitXYZ() has an exception: not because you didn't describe the behaviour of the services, but because of the nature of the events, which makes the behaviour unpredictable... like throwing dice.

You see that there are cases (let's exclude the wrong behaviour of OS_WaitXYZ() with OSTimedOut()) where the system can be expected to lock up because of that small exception: sometimes there is a context switch, sometimes not (in fact you tried to describe in the manual exactly when it happens and when it doesn't).

It's a really tough situation. If you context switch first and then analyze the event, everything is OK except the speed. But speed is very important, so it may be better not to touch anything. And it's very hard to explain in the manual all the possible "bugs" a user can expect if he's not very careful with these context-switching exceptions (I'll bring up more and more over time). I'm sure there are more cases where the user can lock the system. Take the example I gave you: a task with a simple context switch that is bombed with events from an ISR will never context switch until the "attack" goes away. Imagine that this lasts 20-30 ms, enough time for the watchdog to reset the processor. Scary, huh? And note that this is almost unpredictable and unfortunately could happen quite often; it's not a sophisticated game of my mind.

I have some ideas that could protect OS_WaitXYZ() from such problems. Imagine you have one flag for all tasks, let's say contex_keeper. After every call of OSSched() you clear it (the flag is just one bit, no matter how many tasks you have).
Every call of OS_WaitXYZ() will do the following. If the event already exists, first check the flag: if it is 0, return to the task without context switching, but set the flag to 1. If the flag is 1, context switch first. If the event doesn't exist, don't touch the flag. This simple, easy-to-implement algorithm will protect you from the "bombing with events" deadlock. It consumes only 1 additional bit and maybe up to 5 machine cycles, and it adds that cost only when the "dangerous" case occurs. On the other hand, the "dangerous" case exists just to speed up the process, so 5 cycles will not change anything.
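
The flag idea above can be sketched in plain C roughly as follows. This is a model under assumptions, not Salvo code: only the flag name contex_keeper comes from the post, while `model_scheduler_ran`, `model_guarded_wait` and the result codes are hypothetical, and the event is assumed to behave like a binary semaphore that is cleared when taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the proposal; only the flag name comes from the post. */
typedef enum { WAIT_CONTINUED, WAIT_SWITCH_FIRST, WAIT_YIELDED } wait_result_t;

static bool contex_keeper; /* one bit shared by all tasks */

/* Stands in for "after every call of OSSched() you clear it". */
void model_scheduler_ran(void)
{
    contex_keeper = false;
}

wait_result_t model_guarded_wait(bool *event_set)
{
    assert(event_set != NULL);
    if (!*event_set) {
        return WAIT_YIELDED;   /* no event: normal wait, flag untouched */
    }
    if (!contex_keeper) {
        contex_keeper = true;  /* remember that one switch was skipped */
        *event_set = false;    /* assumption: taking the event clears it */
        return WAIT_CONTINUED; /* return to the task without switching */
    }
    return WAIT_SWITCH_FIRST;  /* second skip in a row: switch first, event stays set */
}
```

In this model the first wait on a pending event continues without a switch, a second wait on a pending event before the scheduler has run reports that a switch must come first, and the event is left pending so that it can be taken after the switch.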

I'm sure that you'll find even better ideas to prevent such "bombing" deadlocks.

About the naming of OS_WaitXYZ(). To be more exact, you should draw users' attention to the fact that sometimes it does not context switch. Because all other services prefixed with OS_ have a "hard" context switch, OS_WaitXYZ() should be written differently. It's a small detail, but a very meaningful one. Don't forget that if you add to SALVO the flag that prevents the "bombing" deadlock, the explanation of the behaviour of OS_WaitXYZ() will become very foggy and complicated.

You know, it's a so-called "dilemma": whatever you do, it's not good. But it is certainly good to prevent the "bombing" deadlock.


Regards
Luben

luben
 
Posts: 324
Joined: Sun Nov 19, 2000 12:00 am
Location: Sofia, Bulgaria

Re: OS_WaitXYZ() behaviour

Postby aek » Fri Jul 13, 2001 10:53 am

This is a very interesting point, and requires further study. If an event is continuously "bombarded" by calls to OSSignalXyz(), and a relatively low-priority task is running in a tight loop, waiting on that event, then it will never return to the scheduler and the system will lock up, stuck in that task forever, and other, higher-priority tasks will never run because the low-priority task never yields to the scheduler.

The obvious fix is quite easy (see salvo.h), but we need to analyze the system-wide impact of such a change before we implement it.

-------
aek
aek
 
Posts: 1888
Joined: Sat Aug 26, 2000 11:00 pm

Re: OS_WaitXYZ() behaviour

Postby aek » Sat Jul 14, 2001 7:15 am

Setting aside the issue of nomenclature (i.e. "OS_" suggests context switch) for a moment, there are two issues at hand.

In a preemptive RTOS, if an event is not available, then the task is made to wait (i.e. it changes state). If it is available, then execution continues.

In a cooperative RTOS, if an event is not available, then the task is made to wait (i.e. it changes state). If it is available, then execution continues.

The same, right? Except that there's also the issue of the context switch. In the preemptive RTOS, it's "hidden" / implicit. In the cooperative RTOS, it's not hidden / explicit. That's why Salvo's OS_WaitXyz() are written as they are -- for efficiency, we explicitly "throw in the context switch for free" when the event is not available. And if you don't want the context switch, then you can use OS_TestXyz().

So what you consider "unpredictable" behavior is what we consider to be correct behavior -- namely, that if an event is not available, then the task is made to wait AND it yields to the context switcher, so that other tasks may run. When it is available, then it simply continues. Since Salvo is event-based, one should expect interaction with the scheduler when waiting on events if the event is not available.

But as you have pointed out, this "conditional context switch" within OS_WaitXyz() has a detrimental effect on system responsiveness, i.e. if an event is bombarded then the task will run as long as the event is non-zero, which could easily cause a WDT reset. So this is where we're focusing our efforts for future releases and upgrades -- we'll probably have to trade some performance for overall system ruggedness and responsiveness.
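
To make the bombardment effect concrete, here is a small stand-alone C model. Everything in it is hypothetical (`longest_run_without_yield`, `watchdog_would_expire`, and the tick figures passed to them), and it assumes that the watchdog is serviced only when control returns to the scheduler, which the thread does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: event_ready[i] says whether the event is already
 * available when the task reaches its wait call on loop iteration i.
 * Returns the longest run of iterations without a return to the scheduler. */
unsigned longest_run_without_yield(const bool *event_ready, unsigned n)
{
    unsigned run = 0;
    unsigned longest = 0;
    assert(event_ready != NULL || n == 0);
    for (unsigned i = 0; i < n; i++) {
        if (event_ready[i]) {
            run++;                 /* event available: the task keeps running */
            if (run > longest) {
                longest = run;
            }
        } else {
            run = 0;               /* event unavailable: the task waits and yields */
        }
    }
    return longest;
}

/* Assumption: the watchdog is serviced only when the scheduler runs. */
bool watchdog_would_expire(unsigned longest_run,
                           unsigned ticks_per_iteration,
                           unsigned wdt_timeout_ticks)
{
    return longest_run * ticks_per_iteration >= wdt_timeout_ticks;
}
```

With a burst of four consecutive pending events, the model reports a run of four iterations without yielding; whether that trips the watchdog depends entirely on the assumed cost per iteration and timeout.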

quote:
The user DOESN'T KNOW exactly when an event will come ...

This is the nature of event-based and interrupt-driven coding -- the user must code for the fact that there is no indication when an event can happen. Salvo supports this very well.

[This message has been edited by aek (edited July 14, 2001).]

-------
aek
aek
 
Posts: 1888
Joined: Sat Aug 26, 2000 11:00 pm

Re: OS_WaitXYZ() behaviour

Postby luben » Sat Jul 14, 2001 7:50 am

Hello,

Think about using a flag that is cleared after every context switch and set whenever OS_WaitXYZ() doesn't context switch (event already exists). If the program reaches OS_WaitXYZ() with "event already exists" and the flag set, then it context switches first and then analyzes the event. This prevents two consecutive cases where OS_WaitXYZ() doesn't context switch. The result: if OS_WaitXYZ() doesn't switch the first time, the second time (event already present) it will go to OSSched().
As I told you, this will consume only one bit of information for all event services and 5-8 machine cycles... acceptable, huh?
I think that the "wrong behaviour of OSTimedOut()" and the "problem with the bombarded task" should even be fixed in the same place, because they have a similar cause: "OS_WaitXYZ() doesn't context switch hard, but depends on the events and when they appear". You have to agree that this small unpredictable behaviour of OS_WaitXYZ() (unpredictable because of the random timing of the events) has caused 2 mistakes in SALVO up to now. I'm sure that for you it's "a piece of cake" to fix them. And by the way, these are not real mistakes in SALVO. I'm sure that SALVO now does everything you wanted; whatever you wished for, you implemented in SALVO. But at the moment of creating SALVO you didn't keep in mind that sometimes the random appearance of events can produce wrong behaviour in SALVO.

And yes, cooperative multitasking can cause many problems if the user doesn't take care of the whole structure of the program. And by the way, the problem with "bombarded tasks" could be solved with a single OS_Yield() before the OS_WaitXYZ() (I can't say the same for the wrong behaviour of OSTimedOut()). So, formally speaking, there is no problem in SALVO: it could be the user's job to take care of such situations and to solve them in different ways. And if he has the sources, he can even write new functions, fix any possible problems, etc.
On the other hand, SALVO can solve such problems at a higher level, more elegantly.
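
The effect of an unconditional yield before the wait can be sketched with a small C model. It does not call Salvo: `loop_stats_t` and `model_run_loop` are hypothetical names, and the `yield_first` parameter stands in for placing an OS_Yield() before the OS_WaitXYZ() in the task's loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters for one task running its loop n times. */
typedef struct {
    unsigned scheduler_returns; /* times control went back to the scheduler */
    unsigned events_handled;    /* times the wait found the event available */
} loop_stats_t;

/* event_ready[i] says whether the event is pending on iteration i. */
loop_stats_t model_run_loop(const bool *event_ready, unsigned n, bool yield_first)
{
    loop_stats_t stats = { 0, 0 };
    assert(event_ready != NULL || n == 0);
    for (unsigned i = 0; i < n; i++) {
        if (yield_first) {
            stats.scheduler_returns++; /* unconditional yield before the wait */
        }
        if (event_ready[i]) {
            stats.events_handled++;    /* event pending: wait continues, no switch */
        } else {
            stats.scheduler_returns++; /* event not pending: wait yields */
        }
    }
    return stats;
}
```

For a burst of five pending events, the model without the extra yield never returns to the scheduler, while the model with it returns once per iteration and still handles all five events, at the cost of one extra switch per loop.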

Regards
Luben


luben
 
Posts: 324
Joined: Sun Nov 19, 2000 12:00 am
Location: Sofia, Bulgaria

Re: OS_WaitXYZ() behaviour

Postby luben » Sat Jul 14, 2001 7:57 am

Hello,

About the "unpredictable behaviour" of OS_WaitXYZ(). I should explain more, to be more exact: in fact the behaviour of OS_WaitXYZ() is very well determined and described. If something is unpredictable, it's the arrival time of the events.
So the user can't calculate exactly when he will context switch, just because he doesn't know when the event will come. And because OS_WaitXYZ() and events are one whole, the behaviour of the whole is unpredictable.

Luben

luben
 
Posts: 324
Joined: Sun Nov 19, 2000 12:00 am
Location: Sofia, Bulgaria

Re: OS_WaitXYZ() behaviour

Postby aek » Sun Jul 15, 2001 10:33 am

quote:
So the user can't calculate exactly when he will context switch ...

Keep in mind that even if the user had an exact cycle count of the time between context switches in a task, this would be of little or no use, because interrupts would wreak havoc on the timing. So in an overall sense, if the user is concerned with the system responsiveness (i.e. the worst-case time for a task to be "put off" before it can run), then he/she must take into account not only the time between context switches from one task to another, but all of the interrupt- and other-related delays. The same is of course true with preemptive systems. By fixing the issue you brought to our attention below, Salvo's responsiveness will be improved. The challenge is for us to implement this change in an efficient manner.

I have found that many users worry about these issues because they are still thinking in terms of "superloop" or "interrupt-driven" software architectures. The priority-based, event-driven architecture that Salvo (and other RTOSes) employ is altogether different, and requires just some worst-case figures for context-switching and interrupt latency times, plus the knowledge that tasks are priority-based.

Practically speaking, most soft real-time systems do not need to worry about these timing issues. That's why we recommend other texts (e.g. uC/OS-II's textbook) for a discussion of timing issues. Salvo's latencies, etc. can be obtained fairly easily through the test programs.

I'm going to go ahead and close this thread, as we've covered the important issues. We hope to have efficiently resolved the "bombarding event" case in a future release.

-------
aek
aek
 
Posts: 1888
Joined: Sat Aug 26, 2000 11:00 pm

