Allow using `event_loop_multi()` to handle event queues of multiple priorities
in an single thread. In the extreme case, all three event queues are handled
by a single thread (thus saving two stacks). This comes for the price of
increased worst case latency, as already running event handlers will no longer
be preempted by higher priority events.
With this, all three event queue priorities are always provided. Using modules,
the old behavior of one thread per event queue can be restored for better worst
case latency at the expense of additional thread size.
When the expected ROM overhead of a function is bigger than the actual function,
it is better to provide the function as static inline function in the header.
The value of `queue->waiter` at the time the event was queued (with IRQs
disabled) was backed up to the stack-variable `waiter`. Thus, the test later on
for `waiter` checks if the queue was already claimed at the time the event
was queued. Therefore, there is no race.
Added `event_wait_multi()` that takes an array of event queues rather than
a single event queues. The queue with the lowest index will have the highest
priority.
With #10970 only existing *.c files will be added to SRC when using
the SUBMODULES mechanism, so SUBMODULES_NOFORCE (used to filter out
non existing source files) is now redundant so remove the usage.