A preemptive scheduler introduces too much overhead
A general assumption concerning preemptive schedulers is that they introduce too much overhead, both in processing time and in memory. Is this really so? To say something meaningful about it, a comparison has to be made with the alternative: a cooperative scheduler.

Concerning RAM usage, the two types of scheduler do differ. A preemptive scheduler consumes more RAM because every thread has its own stack, while a cooperative scheduler uses a single stack. These stacks are generally considered overhead and therefore a waste of RAM. Whether this is entirely true is discussed in “All these stacks consume too much memory”.

Now for processing overhead. A preemptive scheduler is said to introduce processing overhead because it has to switch threads. When switching from one active thread to another, the state of one thread is saved while the state of the other is restored. It is true this consumes CPU cycles. AVIX, for example, takes ~90 cycles from the point the scheduler is started to the point where the new thread starts running. For the entire switch, one must add to this figure the cycles spent in the code that determines a new thread has to be activated. Taking a mutex as an example, AVIX consumes ~200 cycles from the point a mutex is released to the point where the thread waiting for it is restarted. On a 40 MIPS controller this amounts to ~5 µs.

The question is whether a cooperative scheduler gives you this type of functionality for free. A cooperative scheduler repeatedly calls all threads, and the threads themselves determine whether or not they have something to do. Often they do not: a thread is called just to find out that nothing has to happen, which is a waste of CPU cycles. These wasted calls easily add up to far more than the ~200 cycles mentioned for the preemptive scheduler.
Of course, for a fair comparison the frequency with which these cycles are ‘wasted’ has to be taken into account, but since this depends very much on the dynamics of the system, no general statement can be made here. In general, however, it is justified to state that the overhead of a cooperative scheduler equals or exceeds that of a preemptive scheduler.
All these stacks consume too much memory
True, a preemptive scheduler allocates a stack for every thread while a cooperative scheduler uses a single stack. But is all of this overhead? A stack is used for three purposes: first, for return addresses when calling functions; second, to save the state of a thread when it is preempted; and third, to store local variables.

Return addresses are also stored on the stack when using a cooperative scheduler. In fact, a cooperative scheduler uses a purely function-call-based model in which the nesting of function calls is typically much deeper than with a preemptive scheduler, so it is reasonable to assume the stack space used in both models is comparable.

Next, the stack space used for local variables: this is no waste of memory. Variables exist because your program depends on them; they are part of the program’s functionality and exist regardless of the type of scheduler used.

That leaves the stack space needed to save the state of a thread. True, this can be considered overhead, but as should be clear by now, it is much less than it might seem at first sight.
Operations on basic variables are implicitly thread safe
In virtually every design, threads communicate, and often this communication is done through variables of basic types like int or char. Updating the content of such a variable requires three steps:
First, the content of the memory location where the variable resides is read into a processor-internal register.
Second, the value is changed, which actually means the copy in the processor-internal register is changed.
Third, the updated value is written back to the memory location it came from.
Only after all three steps have completed is the variable updated. This sequence is widely known as the read-modify-write sequence, and looking at code generated by a compiler, this is often the way the code is generated: for a statement like i = i + 1; the corresponding machine code consists of three instructions.

Using constructs like this in a multithreaded environment is the cause of many problems. When multiple threads execute the same increment statement, the final value of the variable will often not equal the number of times the statement was executed. This happens when a thread updating the variable is preempted just after it has read the content of the variable; at that moment the variable itself still holds the ‘old’ value. If another thread now goes through a complete update cycle, the preempted thread, when it continues, still updates the ‘old’ value. The result is that although two threads executed an update, effectively only one update occurred. This problem is assumed to be well known.

Many processors, among which the PIC24 and dsPIC controllers, offer atomic increment and decrement instructions. Although in essence these also perform a read-modify-write cycle, the controller guarantees the sequence cannot be interrupted. Many people now think that in a language like ‘C’, constructs like i++; or i--; translate to these atomic inc or dec instructions and are therefore implicitly thread safe. In fact they are not: there is no guarantee which assembler instructions a high-level language construct will be translated to; this is entirely compiler dependent. The only way to be sure atomic inc or dec instructions are used is to program in the controller’s assembly language. In all other circumstances the programmer has to make the update atomic, for example by enclosing it in a critical section protected by a mutex.