Presently, RIOT just emits a warning when a stack overflow is
encountered but still resumes execution. In my view, execution should be
aborted as the detection of a stack overflows via the heuristic provided
by the scheduler is an unrecoverable error.
I ran into this while performing automated tests of a RIOT application
where a stack overflow occurred but I only noticed this after inspecting
the application output more closely.
Similar to SSP failures, I added crash_code for stack overflows.
An `1 << x` with `x >= 15` is undefined behavior on 8-bit / 16-bit
machines (which typically have `sizeof(int) == 2`).
Using `1UL << x` is safe for `x <= 31`, which is large enough to make
use of the full 32 bits in `runqueue_bitcache`.
In addition, a `static_assert()` is added to enforce that
`SCHED_PRIO_LEVELS` is never set to anything larger than 32.
- activate THREAD_CREATE_STACKTEST also if test_utils_print_stack_usage
is used
- make thread_measure_stack_free() available unconditionally
- if DEVELHELP is active, call test_utils_print_stack_usage() on any
thread exit
- if DEVELHELP is active, call test_utils_print_stack_usage() after main
for the idle thread, if that is used
Allocate and initialize a thread-local block for each thread at the
top of the stack.
Set the tls base when switching to a new thread.
Add tdata/tbss linker instructions to cortex_m and risc-v scripts.
Signed-off-by: Keith Packard <keithp@keithp.com>
---
v2:
Squash fixes
v3:
Replace tabs with spaces
v4:
Add tbss to fe310 linker script
This commit reverses the runqueue_cache bit order when the architecture
has a CLZ (count leading zeros) instruction. When the architecture
supports CLZ, it is faster to determine the most significant set bit of
a word than to determine the least significant bit set. Unfortunately
when the instruction is not available, it is more efficient to determine
the least significant bit set.
Reversing the bit order shaves off another 4 cycles on the same54-xpro.
From 147 to 143 ticks when testing with tests/bench_sched_nop.
Architectures where no CLZ instruction is available are not affected.
Replace accesses to `sched_active_thread`, `sched_active_pid`, and
`sched_threads` with `thread_get_active()`, `thread_get_active_pid()`, and
`thread_get_unchecked()` where sensible.
An interrupt serviced during the idle sleep can re-request a context
switch while the scheduler is already going to switch contexts after the
idle sleep. Thi sched_context_switch_request should thus be cleared
after the idle sleep and not before where it could be modified during
the idle sleep and get out of sync.
In the case that the no_thread_idle feature is active, the
runqueue_bitcache is checked twice in the case no thread is available to
schedule. This changes the inner while loop to a do-while loop to save
one check from the initial loop iteration, saving a cycle or so in the
idle case.
From the ARMv7-M ARM section B3.5.3:
Where there is an overlap between two regions, the register with
the highest region number takes priority.
We want to make sure the mpu_noexec_ram region has the lowest
priority to allow the mpu_stack_guard region to overwrite the first N
bytes of it.
This change fixes using mpu_noexec_ram and mpu_stack_guard together.
- add init_schedstatistics function to be called after
auto_init, that way xtimer_is init is called before
the first call to xtimer_now
- register schedstatics code as a callback to be executed
on each sched_run()
- Introduced enum type `thread_state_t` to replace preprocessor macros
- Moved thread states to `sched.h` for two reasons:
a) Because of the interdependencies of `sched.h` and `thread.h` keeping it in
`thread.h` would result in ugly code.
b) Theses thread states are defined from the schedulers point of view, so it
actually makes senses to have it defined there