I’m using the ldrex and strex instructions to implement mutexes in an RTOS running on a Cortex-M3 processor (specifically a Luminary LM3S1968). Very occasionally (perhaps once in several million tries) the code that sets a mutex to zero fails.
When this happens, the processor is running with BASEPRI set to 0x40, which is a higher priority than any interrupt source on this system, so an interrupt handler cannot be preempting the code that’s trying to clear the mutex.
The following Thumb-2 assembly code is called by the mutex code when it tries to clear a mutex. The memory location containing the mutex value to clear is passed in r0, and the result is returned in r0. A return value of zero indicates the code successfully set the mutex to 0 and a return value of 1 indicates it failed.
        .thumb_func
clrmutex:
        ldrex   r1,[r0]         // initiate exclusive access
        cbz     r1,1f           // branch if mutex is already 0
        mov     r2,#0
        strex   r1,r2,[r0]      // try setting location to 0
1:
        mov     r0,r1           // set return value = 0 if clr succeeded, 1 if clr failed
        bx      lr              // return
On very rare occasions, this code will return 1, which indicates that the strex instruction failed to clear the memory location associated with the mutex.
What can cause this to happen? When this code runs, the interrupt priority is raised high enough that no interrupt-triggered code (device interrupts or the SysTick timer) can preempt it, so nothing should prevent it from clearing the mutex.
Has anyone seen this issue? Is there perhaps a timing condition whereby the strex fails on extremely rare occasions?
why not
x = mutex binary
disable interrupts
y = ++mutex binary
enable interrupts
if x == y you got it
or some such. Maybe you want to make the comparison with interrupts disabled.
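Here is one way to read the suggestion above, as a minimal C sketch: do the test and the set inside a section where interrupts are masked. The names irq_disable, irq_enable, try_lock and unlock are made up for illustration, and the two interrupt hooks are host-side stubs. On the Cortex-M3 they would be replaced by whatever masking mechanism the RTOS already uses (for example raising BASEPRI).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interrupt-masking hooks. These are stubs so the sketch
 * builds on a host; a real port would mask interrupts here instead. */
static int irq_mask_depth;
static void irq_disable(void) { irq_mask_depth++; }
static void irq_enable(void)  { irq_mask_depth--; }

/* Try to take the mutex: 0 means free, 1 means held.
 * Returns true if this call acquired it. */
static bool try_lock(volatile uint32_t *m)
{
    irq_disable();
    bool got_it = (*m == 0);    /* test ... */
    if (got_it) {
        *m = 1;                 /* ... and set, with nothing able to preempt in between */
    }
    irq_enable();
    return got_it;
}

/* Release the mutex. */
static void unlock(volatile uint32_t *m)
{
    irq_disable();
    *m = 0;
    irq_enable();
}
```

This only protects against preemption on a single core, which is the situation described in this thread.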
Interrupts are effectively disabled when this code runs, since the processor is running with BASEPRI=0x40 and all interrupt sources run at priority 0x80.
My code is perhaps overkill, since the exclusive access instructions are really only needed on a multiprocessor system, but I’d still like to understand why the code sometimes fails to clear the mutex even though the code isn’t preemptable.
Anyway, the Cortex-M3 can buffer writes to memory in a kind of small cache called the write buffer. So if you just disable interrupts and write to the mutex variable, there is no guarantee that the write becomes visible to other threads before they try to acquire the lock themselves, which more or less makes the mutex useless.
I think ldrex and strex do the right thing (they do not buffer writes), but I don’t know why they sometimes seem not to work.
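One possibility that fits the symptoms: as far as I understand, an exclusive store is allowed to fail even when nobody else touched the location, for example if the local exclusive monitor was cleared between the ldrex and the strex. The usual pattern is therefore to treat a failed strex as "try again" rather than as an error. Below is a minimal sketch of that retry pattern in portable C11, not the original clrmutex routine. The name clr_mutex_retry is made up, and the weak compare-exchange stands in for the ldrex/strex pair (compilers typically lower it to those instructions on ARMv7-M, but that is an assumption about the toolchain).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Clear the mutex word (0 = free, nonzero = held), retrying on failure.
 * Returns true if this call cleared it, false if it was already 0. */
static bool clr_mutex_retry(_Atomic uint32_t *m)
{
    uint32_t seen = atomic_load_explicit(m, memory_order_relaxed);

    while (seen != 0) {
        /* The weak form may fail spuriously, much like strex.
         * On failure 'seen' is refreshed with the current value
         * and the loop simply tries again. */
        if (atomic_compare_exchange_weak_explicit(m, &seen, 0,
                                                  memory_order_release,
                                                  memory_order_relaxed)) {
            return true;
        }
    }
    return false;   /* mutex was already clear */
}
```

The same idea in the assembly version would be to branch back to the ldrex when strex writes 1 to its status register, instead of returning the failure to the caller.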
That shows how little I know about the Cortex.
Is there a cache-bank-flush instruction?
Yes, fence instructions are available. There are even three of them: Data Memory Barrier (DMB), Data Synchronization Barrier (DSB) and Instruction Synchronization Barrier (ISB).
The first prevents memory accesses from being reordered across it, the second stalls the CPU until outstanding memory accesses (including buffered writes) have completed, and the third flushes the instruction pipeline so that the following instructions are refetched (useful for self-modifying code).
Anyway, I’m still a beginner with this architecture. I haven’t written any code for it yet, but I got interested since it seems to be the successor to the ARM7, which is what I use now.
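To make the barrier idea concrete, here is a minimal sketch of a mutex release that puts a fence in front of the store that frees the lock. It uses the portable C11 atomic_thread_fence rather than the ARM instructions themselves, so it can be built on any host. The function name mutex_release is made up, and the expectation that a Cortex-M3 toolchain turns the fence into a dmb is an assumption about the compiler. In CMSIS-based code the __DMB(), __DSB() and __ISB() intrinsics give direct access to the three instructions.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Release a mutex word (0 = free, nonzero = held).
 * The fence orders every earlier load and store in this thread
 * before the store that makes the mutex available again. */
static void mutex_release(_Atomic uint32_t *m)
{
    atomic_thread_fence(memory_order_release);          /* barrier before unlocking */
    atomic_store_explicit(m, 0, memory_order_relaxed);  /* mark the mutex free */
}
```

Whether a barrier is actually required here depends on who else can observe the mutex word. On a single core with no DMA involved, it may well be unnecessary.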