It’s basically a rite of passage that all firmware developers go through sooner or later: you turn on optimizations, and your device stops working. You may try to debug it, but that is not easy, because optimizations can reorder and remove code, which makes the program harder to follow in a debugger.
Often, the symptom is a condition that seems permanently stuck at its initial value, as if the variable doesn’t exist.
See the following code as an example, where the LED doesn’t blink when you turn on the optimizations.
int msElapsed = 0;

void sysTimerIrq(void){ //called from SysTick_Handler, the ISR of the SysTick timer
    msElapsed++;
}

void appMainProcess(void){
    static int msCounter = 0;
    if(msElapsed){
        msElapsed--;
        msCounter++;
        if(msCounter >= 1000){
            msCounter = 0;
        }
        if(msCounter == 0){
            BSP_LED_On(LED_RED);
        }else if(msCounter == 499){
            BSP_LED_Off(LED_RED);
        }
    }
}
And the variable disappearing is exactly what happened! You don’t even have to look at the assembly code: the .map file already shows that the condition variable, msElapsed, is gone.
Why is that?
Well, that’s the compiler doing its job. The problem arises when a variable is written in one context (like an ISR) but read in another (like your main loop). I use the term context here because it can be another thread, an ISR, or even a peripheral register changed by a hardware event. One of the optimizations the compiler performs is to track where a variable is read and written, and to optimize around that. The problem is that, in our case, sysTimerIrq() is called from the ISR of a timer, and no code ever calls an ISR explicitly: the interrupt literally interrupts the flow of the program, basically by replacing the content of the Program Counter with the address of the ISR. So the compiler thinks that sysTimerIrq() is never called.
Because the compiler can’t see any calls to sysTimerIrq(), it makes a logical assumption: msElapsed is only ever set to 0 and never changes. This, in turn, means that the condition if(msElapsed) will always be false. As a result, the optimizer considers the entire if block to be unreachable and removes it entirely. In this simple example, the result is that the LED is not turned on and off.
Enter the volatile keyword.
The volatile keyword in C is used as a qualifier for a variable. The book “The C Programming Language” uses only a few lines to describe it, because its meaning is very simple: “This variable can change in unexpected ways, so don’t optimize it”.
By just declaring msElapsed as volatile, we tell the compiler to read that variable from memory every time it is used, without making any assumption about its value. If we do that, the program works as expected and the variable is still present in the .map file.
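As a minimal sketch of the fix, here is the same example with msElapsed declared volatile. To keep it buildable on a PC, BSP_LED_On() and BSP_LED_Off() are replaced by hypothetical stand-ins (ledOn(), ledOff() and the ledIsOn flag are not part of the original code), and the interrupt has to be simulated by calling sysTimerIrq() by hand.

```c
#include <stdbool.h>

/* volatile: the compiler must perform every read and write of msElapsed
 * that appears in the source, instead of assuming it stays 0. */
volatile int msElapsed = 0;

/* Hypothetical stand-ins for BSP_LED_On()/BSP_LED_Off(), so the sketch
 * builds without the board support package. */
bool ledIsOn = false;
void ledOn(void)  { ledIsOn = true;  }
void ledOff(void) { ledIsOn = false; }

/* On the target this is called from the SysTick ISR, once per millisecond. */
void sysTimerIrq(void) {
    msElapsed++;
}

void appMainProcess(void) {
    static int msCounter = 0;
    if (msElapsed) {                 /* read from memory on every call */
        msElapsed--;
        msCounter++;
        if (msCounter >= 1000) {
            msCounter = 0;
        }
        if (msCounter == 0) {
            ledOn();
        } else if (msCounter == 499) {
            ledOff();
        }
    }
}
```

On a host build, calling sysTimerIrq() followed by appMainProcess() 1000 times leaves the LED stand-in on, and 499 more iterations turn it off again.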
But beware! The volatile keyword doesn’t make the variable safe to use in a multithreaded environment. It doesn’t make it atomic. A read-modify-write operation (like msElapsed++) can still be interrupted between the read and the write, leading to a race condition. But that’s a story for another time.
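To see why volatile alone is not enough, here is a hedged sketch that spells out msElapsed-- as the three steps a load/store CPU performs, and simulates the interrupt firing between the read and the write. The function decrementInterrupted() and its irqFires parameter are made up for illustration; on real hardware the interrupt arrives on its own.

```c
volatile int msElapsed = 0;

/* Same ISR callback as in the example above. */
void sysTimerIrq(void) {
    msElapsed++;
}

/* Hypothetical helper: msElapsed-- written out step by step.
 * irqFires != 0 simulates the timer interrupt arriving mid-operation. */
void decrementInterrupted(int irqFires) {
    int tmp = msElapsed;   /* 1. read the current value */
    if (irqFires) {
        sysTimerIrq();     /* the ISR increments msElapsed in between */
    }
    tmp = tmp - 1;         /* 2. modify the stale copy */
    msElapsed = tmp;       /* 3. write back, overwriting the ISR's increment */
}
```

Starting from msElapsed == 1, one interrupted decrement leaves 0 where 1 was expected (1 - 1 + 1): a millisecond tick is lost even though every access is volatile.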
It really just boils down to “do not optimize, it may change unexpectedly”.
In fact, the most common place where you can find it is in the structs that define the peripherals of a microcontroller. The various peripherals have their registers mapped to fixed addresses in memory. Some of these registers are used to control the peripheral, and some are used to get information from it, but in any case their value depends on a circuit, and thus may change without intervention from the software.
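As a rough sketch of what this means in practice, the function below polls a status register through a pointer to volatile. The names waitForFlag(), reg, mask and maxPolls are hypothetical and not taken from any vendor header; on a real device the pointer would refer to a memory-mapped register rather than an ordinary variable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Poll a register until any bit in mask is set, or give up after maxPolls reads.
 * Because reg points to a volatile object, each loop iteration performs
 * a real read instead of reusing a previously loaded value. */
bool waitForFlag(const volatile uint32_t *reg, uint32_t mask, uint32_t maxPolls) {
    for (uint32_t i = 0; i < maxPolls; i++) {
        if ((*reg & mask) != 0u) {
            return true;   /* the hardware (or a test) set the flag */
        }
    }
    return false;          /* flag never showed up within maxPolls reads */
}
```

Without the qualifier, the compiler would be allowed to read *reg once and reuse that value for the whole loop, which is the same failure mode as the msElapsed example.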
For example, on the STM32L552, the low power timer struct is defined in this way:
typedef struct
{
  __IO uint32_t ISR;      /*!< LPTIM Interrupt and Status register */
  __IO uint32_t ICR;      /*!< LPTIM Interrupt Clear register */
  __IO uint32_t IER;      /*!< LPTIM Interrupt Enable register */
  __IO uint32_t CFGR;     /*!< LPTIM Configuration register */
  __IO uint32_t CR;       /*!< LPTIM Control register */
  __IO uint32_t CMP;      /*!< LPTIM Compare register */
  __IO uint32_t ARR;      /*!< LPTIM Autoreload register */
  __IO uint32_t CNT;      /*!< LPTIM Counter register */
  __IO uint32_t OR;       /*!< LPTIM Option register */
  __IO uint32_t RESERVED; /*!< Reserved */
  __IO uint32_t RCR;      /*!< LPTIM Repetition counter register */
} LPTIM_TypeDef;
and, in core_cm33.h, we have __IO defined as
#define __IO volatile /*!< Defines 'read / write' permissions */
Ultimately, the rules are simple. If a variable can change in ways the compiler can’t predict, because it’s modified by hardware (a peripheral register) or by a separate context (an ISR or another thread), you must declare it volatile. This tells the optimizer to back off and ensures your code works as you expect.
Just remember: volatile only solves the optimization problem. Ensuring atomicity is a challenge for another day.
