AVIX, like a number of competing products, offers a system stack for use by Interrupt Service Routines. This mechanism is meant to save RAM. With most RTOSes, however, the benefit is small: even though a system stack is offered, Interrupt Service Routines still place a high load on the stacks of the individual threads. Thanks to its unique system stack mechanism, AVIX places a substantially lower load on the controller's RAM. The graphs below illustrate this. They plot RAM usage as a function of the number of threads and are based on:
- A controller offering 32KB of RAM
- Each thread consuming 300 bytes for context saving, bookkeeping and a basic stack for local variables and function calls
- Each interrupt consuming a minimum of 104 bytes (this figure is explained below)
The left graph shows the amount of RAM left to the application when six unique interrupt priority levels are used; the right graph shows the case where only three are used. As you can see, with AVIX-32 a substantially larger amount of RAM remains for use by your application. With 16 threads and six interrupt priority levels, AVIX-32 leaves ~27KB of RAM for the application, while some competing products leave only ~18KB.
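The figures quoted above can be reproduced with a few lines of arithmetic. The sketch below is a simplified model, not the tool used to draw the graphs: it assumes that the only RAM consumers are the 300 bytes per thread plus one interrupt context per nesting level on every thread stack, and it ignores the system stack and any kernel data. The function name ram_left is chosen here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Figures taken from the text above. */
enum {
    TOTAL_RAM_BYTES   = 32 * 1024, /* controller with 32KB of RAM */
    THREAD_BASE_BYTES = 300        /* context, bookkeeping and basic stack */
};

/* RAM left for the application in this simplified model.
 * isr_bytes_on_thread_stack is the part of one interrupt context that lands
 * on a thread stack: 104 for the conventional scheme described in this
 * article, 0 for AVIX-32 according to the text. */
size_t ram_left(size_t threads, size_t nesting_levels,
                size_t isr_bytes_on_thread_stack)
{
    size_t per_thread = THREAD_BASE_BYTES
                      + nesting_levels * isr_bytes_on_thread_stack;
    size_t used = threads * per_thread;

    assert(used <= (size_t)TOTAL_RAM_BYTES); /* model only valid if it fits */
    return (size_t)TOTAL_RAM_BYTES - used;
}
```

Calling ram_left(16, 6, 104) gives 17984 bytes and ram_left(16, 6, 0) gives 27968 bytes, which matches the ~18KB and ~27KB mentioned above.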
AVIX offers a unique system stack mechanism that lowers the load Interrupt Service Routines place on thread stacks to zero for its 32-bit version and almost zero for its 16-bit version. Below you can read how this is implemented and how it saves many kilobytes of RAM compared with competing products.
An RTOS allows multiple loosely coupled programs to run concurrently. Each of these programs (threads) has its own stack. This allows local variables to be used in the thread's code without worrying about access to those variables by other threads. When the RTOS preempts a thread and activates another, the controller's stack pointer is set to refer to the stack of the newly activated thread. Furthermore, the thread's stack holds the return addresses when the thread makes function calls. It all comes down to providing the thread with a local context, making it a truly independent piece of software.
A hardware interrupt is dealt with at the level of the controller's hardware. When an interrupt occurs, the context of the running software is saved on the stack and the interrupt handler is activated. This mechanism is built into the hardware of the controller. The problem is that the controller has no knowledge of an RTOS being used. From the controller's point of view, an RTOS is just software. The controller does not know that this software changes the value of the stack pointer on a thread context switch, so when an interrupt occurs, the current value of the controller's stack pointer is used to save the context of the interrupted software. Since interrupts occur at unpredictable moments, each thread stack has to reserve room to save this context. With, let's say, ten threads, the RAM for interrupt context saving has to be present ten times, compared to once when no RTOS is used. This consumes precious RAM, and the more threads there are, the more RAM is consumed. From an application point of view, this RAM is simply wasted. It is not available to the application program.
To make things worse, micro-controllers offer nested interrupts. Each interrupt has a priority, and a higher priority interrupt handler can be activated while a lower priority interrupt handler is running. This higher priority interrupt handler again saves the context on top of the earlier saved context of the lower priority interrupt handler. Interrupts can be nested seven levels deep, meaning that in a worst-case situation seven contexts are saved on the stack, and again each thread's stack has to have enough room to deal with this.
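The effect described above can be made concrete with a small host-side simulation. This is only a sketch under assumptions: the types sim_thread and sim_cpu, the stack size of 512 bytes and all function names are invented here, and the 104-byte context is borrowed from the 32-bit example later in this article. The point it shows is that the simulated hardware pushes the context at whatever address the stack pointer holds, so the bytes land on the stack of whichever thread happens to be running.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_THREADS   3
#define STACK_BYTES   512
#define CONTEXT_BYTES 104  /* context size from the 32-bit example */

typedef struct {
    uint8_t  stack[STACK_BYTES];
    uint8_t *saved_sp;        /* stack pointer while the thread is parked */
} sim_thread;

typedef struct {
    sim_thread threads[NUM_THREADS];
    int        current;       /* index of the running thread */
    uint8_t   *sp;            /* simulated hardware stack pointer */
} sim_cpu;

void sim_init(sim_cpu *cpu)
{
    for (int i = 0; i < NUM_THREADS; i++) {
        /* Stacks grow downward: an empty stack points one past the end. */
        cpu->threads[i].saved_sp = cpu->threads[i].stack + STACK_BYTES;
    }
    cpu->current = 0;
    cpu->sp = cpu->threads[0].saved_sp;
}

/* RTOS context switch: park the old thread's SP and load the new one. */
void sim_switch_to(sim_cpu *cpu, int next)
{
    assert(next >= 0 && next < NUM_THREADS);
    cpu->threads[cpu->current].saved_sp = cpu->sp;
    cpu->current = next;
    cpu->sp = cpu->threads[next].saved_sp;
}

/* Interrupt entry: the hardware knows nothing about threads and simply
 * pushes the context at the current stack pointer. */
void sim_interrupt_entry(sim_cpu *cpu) { cpu->sp -= CONTEXT_BYTES; }
void sim_interrupt_exit(sim_cpu *cpu)  { cpu->sp += CONTEXT_BYTES; }

/* Bytes currently in use on the stack of thread t. */
size_t sim_stack_used(const sim_cpu *cpu, int t)
{
    const uint8_t *sp = (t == cpu->current) ? cpu->sp
                                            : cpu->threads[t].saved_sp;
    return (size_t)(cpu->threads[t].stack + STACK_BYTES - sp);
}
```

After sim_switch_to(&cpu, 2) followed by sim_interrupt_entry(&cpu), thread 2 shows 104 bytes in use while the other threads show none. Calling sim_interrupt_entry twice in a row models a nested interrupt and stacks two contexts on the same thread.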
This section explains how a system stack works and how the AVIX implementation differs from that of competing products.
First, an explanation of the system stack mechanism offered by competing products:
The figure to the left shows three threads (the blue lines in the lower left corner). At moment 1, an interrupt occurs. The horizontal red line depicts the first part of the Interrupt Service Routine, which is code generated by the compiler. The RTOS is not yet in control, so this part saves the controller's context on the stack of the thread that happens to be interrupted. This is depicted by the red box in the Thread Stacks. At moment 2, the RTOS takes control and switches to the system stack. Only from this moment on is the remaining stack requirement of the Interrupt Service Routine placed on the system stack. Since the interrupt can occur at a random moment, each of the three threads in this figure has to reserve stack space for it. The total amount of RAM that is potentially needed is shown in the bar to the right.
The AVIX implementation is different, as explained here:
The figure to the right shows how AVIX uses a system stack. The AVIX mechanism does not use compiler-generated Interrupt Service Routine code. For this reason, when the interrupt occurs at moment 1, AVIX immediately takes control and switches to the system stack. The entire Interrupt Service Routine stack requirement is therefore placed on the system stack, and the thread stacks do not need to reserve room for the Interrupt Service Routine. As a result the total RAM requirement is much lower, as illustrated by the bar to the right of the figure.
To illustrate the severity of the problem, here is the situation with nested interrupts as implemented by competing products:
The figure to the left illustrates what happens with nested interrupts. Again the interrupt occurs at moment 1, leading to the described usage of the stack of the interrupted thread. If another interrupt occurs (2) before the switch to the system stack is made (interrupts occur at unpredictable moments), an even larger amount of thread stack is used. Only at moment 3 is the switch to the system stack made, so that the remaining stack usage is placed on the system stack. At moment 4, the second interrupt returns and the first interrupt picks up where it was itself interrupted. At moment 5, the first interrupt in turn switches to the system stack. As you can see in the bar to the right of the figure, RAM usage rises linearly with the interrupt nesting level and, of course, with the number of threads.
And again the AVIX implementation:
For AVIX the situation remains the same as in the non-nested case. No use is made of compiler-generated interrupt code. When the first interrupt occurs at moment 1, AVIX again immediately takes control and switches to the system stack. Again no use is made of the stack of the interrupted thread. When the first interrupt is itself interrupted at moment 2, AVIX takes control, sees that the system stack is already active, and places the stack requirement of the second interrupt on top of that of the first. As shown in the bar to the right of the figure, this again results in substantially lower RAM usage than in the case presented before.
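The behaviour described above, switching to the system stack only when it is not yet active, is often implemented with a nesting counter. The sketch below models that idea on a host machine. It is an assumption-laden illustration, not AVIX source code: the text does not say how AVIX detects that the system stack is already active, and the type isr_model, the 1024-byte system stack and the function names are invented here. The model also assumes a controller where interrupt entry itself pushes nothing on the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SYSTEM_STACK_BYTES 1024

typedef struct {
    uint8_t  system_stack[SYSTEM_STACK_BYTES];
    uint8_t *sp;              /* simulated stack pointer */
    uint8_t *saved_thread_sp; /* thread SP parked while handlers run */
    unsigned nesting;         /* 0 means thread code is running */
} isr_model;

/* Interrupt entry: only the outermost interrupt switches stacks. */
void isr_enter(isr_model *m, size_t frame_bytes)
{
    if (m->nesting == 0) {
        m->saved_thread_sp = m->sp;
        /* Empty system stack, growing downward from its end. */
        m->sp = m->system_stack + SYSTEM_STACK_BYTES;
    }
    m->nesting++;
    assert((size_t)(m->sp - m->system_stack) >= frame_bytes);
    m->sp -= frame_bytes;     /* the whole frame lands on the system stack */
}

/* Interrupt exit: the outermost interrupt returns to the thread stack. */
void isr_exit(isr_model *m, size_t frame_bytes)
{
    assert(m->nesting > 0);
    m->sp += frame_bytes;
    if (--m->nesting == 0) {
        m->sp = m->saved_thread_sp;
    }
}

/* Bytes in use on the system stack; zero while thread code runs. */
size_t system_stack_used(const isr_model *m)
{
    if (m->nesting == 0) {
        return 0;
    }
    return (size_t)(m->system_stack + SYSTEM_STACK_BYTES - m->sp);
}
```

Two nested calls to isr_enter with 104-byte frames leave 208 bytes on the system stack and none on the thread stack. After both calls to isr_exit, the stack pointer is back at the exact position the thread left it.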
As illustrated in the previous section, although competing products do offer a system stack mechanism, a high load is still placed on the stacks of the individual threads. This section illustrates how big this load is. Note that the penalty presented here is absent when using AVIX-32 and substantially smaller when using AVIX-16.
First the 32-bit controllers. Below, part of the assembly code is shown for an interrupt handler written in C:
|
void __ISR(_TIMER_3_VECTOR, ipl6) T3ISR(void)
{
9D0003B4  415DE800  rdpgpr  sp,sp
9D0003B8  401A6800  mfc0    k0,Cause
9D0003BC  401B7000  mfc0    k1,EPC
9D0003C0  001AD282  srl     k0,k0,0xa
9D0003C4  27BDFF98  addiu   sp,sp,-104
9D0003C8  AFBB0064  sw      k1,100(sp)
9D0003CC  401B6000  mfc0    k1,Status
9D0003D0  AFBB0060  sw      k1,96(sp)
...
|
|
The instruction addiu sp,sp,-104 shows the number of bytes reserved on the stack to save the context of the interrupted program: 104 bytes. These 104 bytes must be present on the stack of every thread in the application. When using nested interrupts (and since this is one of the controller's most powerful features, they are likely to be used, if not now then in a future upgrade of your software), the worst-case load on the stack to save the context for six levels of nested interrupts is 6 times 104, totaling 624 bytes! And since multiple threads are used, the stack of each of these threads has to reserve this space. In an average system it is not unlikely to have 16 threads, meaning 16 times 624 or ~10KB has to be reserved for this. This implies that on a 32-bit controller with 32KB of RAM, ~30% of this RAM is wasted!
For the 16-bit controllers the numbers are different. Again, part of the assembly code for an interrupt handler written in C is shown below:
|
void __attribute__((__interrupt__, auto_psv)) _T3Interrupt(void)
{
03536  F80036  push.w  0x0036
03538  BE9F80  mov.d   0x0000,[0x001e++]
0353A  BE9F82  mov.d   0x0004,[0x001e++]
0353C  BE9F84  mov.d   0x0008,[0x001e++]
0353E  BE9F86  mov.d   0x000c,[0x001e++]
03540  F80034  push.w  0x0034
...
|
|
When an interrupt occurs, four bytes are first pushed on the stack. In the interrupt handler a total of 10 registers of two bytes each are saved on the stack. Next, a function call is made, pushing four more bytes on the stack before the switch to the system stack is made. This adds up to 28 bytes per interrupt (4 + 20 + 4), and with worst-case interrupt nesting to a total of 6 times 28, or 168 bytes. Again, in a system with 16 threads, a total of 16 times 168 or ~2.7KB has to be reserved for this. This implies that on a 16-bit controller with 16KB of RAM, ~16% of this RAM is wasted!
Although these figures are already substantial, in reality one can expect them to be even worse when the interrupt handler contains local variables or makes nested function calls.
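The 16-bit figures above can be checked at compile time. The sketch below simply restates the numbers from the text as constants; the constant names are chosen here for illustration, and the nesting depth of six and the 16 threads are the values used in the calculation above, not properties of any particular controller.

```c
#include <assert.h>

enum {
    HW_PUSH_BYTES   = 4,  /* pushed when the interrupt occurs */
    SAVED_REGISTERS = 10, /* registers saved by the handler prologue */
    REGISTER_BYTES  = 2,  /* one 16-bit register */
    CALL_BYTES      = 4,  /* function call made before the stack switch */
    NESTING_LEVELS  = 6,  /* worst-case nesting used in the text */
    THREADS         = 16, /* thread count used in the text */
    RAM_BYTES       = 16 * 1024
};

enum {
    FRAME_BYTES      = HW_PUSH_BYTES
                     + SAVED_REGISTERS * REGISTER_BYTES
                     + CALL_BYTES,
    PER_THREAD_BYTES = NESTING_LEVELS * FRAME_BYTES,
    SYSTEM_BYTES     = THREADS * PER_THREAD_BYTES,
    PERCENT_OF_RAM   = SYSTEM_BYTES * 100 / RAM_BYTES /* rounded down */
};

/* Compile-time checks against the figures quoted in the text. */
static_assert(FRAME_BYTES == 28, "28 bytes per interrupt");
static_assert(PER_THREAD_BYTES == 168, "168 bytes per thread stack");
static_assert(SYSTEM_BYTES == 2688, "about 2.7KB for 16 threads");
static_assert(PERCENT_OF_RAM == 16, "about 16% of 16KB");
```

Changing SAVED_REGISTERS or NESTING_LEVELS shows how quickly the reservation grows when a handler saves more registers or a deeper nesting level is allowed.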
For the 32-bit version, the story is simple. When every interrupt handler is defined with the AVIX-supplied interrupt definition macros, interrupts place no load on the stacks of the individual threads. So for the average application presented in the previous section, instead of ~10KB being wasted on interrupt stacks, with AVIX 0KB is wasted.
For the 16-bit version, it is impossible to solve the problem entirely. The hardware of the controller uses four bytes of stack when an interrupt occurs. On top of this AVIX needs two bytes. So every interrupt consumes six bytes, no more and no less. With worst-case interrupt nesting of seven levels, this places 42 bytes on the stack of each thread, leading to 672 bytes for the whole system of 16 threads. This replaces the ~2.7KB presented in the previous section, a gain of about 2KB compared with the system stack mechanisms offered by competing products.
AVIX offers a system stack implementation that provides the benefits one might expect from such a mechanism. The saving in RAM usage is substantial and an important factor to consider when selecting an RTOS. Furthermore, the AVIX mechanism leads to deterministic stack usage. When interrupt handlers are changed in a way that increases their stack usage, the thread stacks do not have to be revised; only the single system-wide stack needs to become a little larger. This also helps in preventing errors.
Besides the large decrease in RAM usage offered by the AVIX system stack mechanism, there are further advantages.
First, usage is as simple as declaring an ISR with an AVIX-supplied macro. For a 16-bit controller, instead of writing:
|
void __attribute__((__interrupt__, auto_psv)) _T3Interrupt(void)
|
|
You write:
|
avixDeclareISR(_T3Interrupt)
|
|
Second, there are no limitations on what can be done in an interrupt handler. An interrupt handler looks like any regular C function. Local variables can be used and functions can be called without any restriction. Of course it is still advisable to keep an interrupt handler as small as possible, to prevent other interrupts of the same or lower priority from being blocked too long.
The AVIX-supplied mechanism does take a few (~5) instructions to switch to the system stack and thereby has a very small effect on interrupt latency. Should this be a problem, the standard mechanism to define interrupt handlers can be used for those interrupts where latency is of the highest importance. Interrupt handlers based on the AVIX-supplied mechanism and interrupt handlers based on the standard definition mechanism can be mixed, and a handler can be changed from one mechanism to the other whenever needed. It is by no means required to base all interrupt handlers on the AVIX mechanism.
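As an illustration, a handler declared with the macro might look like the sketch below. Several things here are assumptions: the #define is a host-only stand-in so the example compiles without the AVIX headers (the real macro is supplied by AVIX and also arranges the switch to the system stack), and the sample buffer, the simulated timer value and the helper read_timer are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* HOST-ONLY STAND-IN (assumption): replaces the real AVIX macro so this
 * sketch builds off-target. It only produces a plain function. */
#ifndef avixDeclareISR
#define avixDeclareISR(name) void name(void)
#endif

#define SAMPLE_SLOTS 8u
static_assert((SAMPLE_SLOTS & (SAMPLE_SLOTS - 1u)) == 0,
              "slot count must be a power of two for the index mask");

static volatile uint16_t samples[SAMPLE_SLOTS]; /* hypothetical buffer */
static volatile uint16_t head;                  /* total samples stored */
static uint16_t sim_timer_value = 100;          /* stands in for a timer register */

/* Ordinary helper function called from the handler. */
static uint16_t read_timer(void)
{
    return sim_timer_value++;
}

avixDeclareISR(_T3Interrupt)
{
    /* Local variables and function calls are used as in any C function. */
    uint16_t value = read_timer();
    uint16_t slot  = (uint16_t)(head & (SAMPLE_SLOTS - 1u));

    samples[slot] = value;
    head = (uint16_t)(head + 1u);
}
```

On the target, the handler body would stay the same; only the stand-in macro and the simulated timer would be replaced by the AVIX header and the real peripheral register.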