Replace use of C11 atomics with atomic utils. This fixes
> error: address argument to atomic operation must be a pointer to a
> trivially-copyable type ('_Atomic(int) *' invalid)
error when compiling on AVR with LLVM.
This adds a simple macro to check (at C level) whether a given
expression is proven to be compile time constant and suitable for
constant folding. This allows writing code like this:
```C
int gpio_read(gpio_t pin) {
if (IS_CT_CONSTANT(pin)) {
/* this implementation should even be able to use the port and
* pin number as immediate in inline assembly */
}
else {
/* less efficient implementation that cannot use port and pin
* number as immediate in inline assembly */
}
}
```
This provides the same functionality as `static_assert()` provided by
C11 and has no advantages compared to it. Hence, encourage users to use
standard C functionality instead.
For the caller there should be no difference if there is no message
in the queue and if there can't be a message in the queue.
The current API works as one would expect if there is a message queue,
but once called from a thread that does not have a message queue
configured, code that does
while (msg_avail())
will end up in an infinite loop.
Remove this foot-gun from the API by making the return value of
msg_avail() independend of the availability of a message queue.
`WITHOUT_PEDANTIC(expr)` disables `-Wpedantic` for `expr`, but switches
back to the previous diagnostic settings afterwards. This helps defining
macros that are not strictly ISO compliant without having to drop the
`-Wpedantic` flag entirely.
`DECLARE_CONSTANT(identifier, const_expr)` declares an anonymous `enum`
constant named `identifier` and assigns it the value `const_expr`. Here,
`const_expr` has to be a compile time constant, but is not needed to be
an integer constant expression. It basically is a tool to magically
convert a non-integer constant expression into a integer constant
expression.
Calculate the size of the element based on the array given, not based
on the element pointer.
The element might as well be given as a `void *` via a callback.
In that case, if the user forgets to cast the `void *` to the array
element type, the calculation returns false values.
Disarm this foot gun by basing the element size off the given array.
This prevents gcc from figuring out that an XFA that has been
initialized in the same file is technically empty when the compilation
unit is seen by itself. This happened with gcc 10.1.0 on msp430-elf.
Due to limited compatibility with C, we cannot use the inline mutex_trylock
implementation for C++. Instead, we provide a mutex_trylock_ffi() intended for
foreign function interfaces. This should also benefit rust users.
Add a version of `mutex_lock()` that can be canceled with the obvious name
`mutex_lock_cancelable()`. This function returns `0` on success, and
`-ECANCELED` when the calling thread was unblocked via a call to
`mutex_cancel()` (and hence without obtaining the mutex).
This is intended to simplify the implementation of `xtimer_mutex_lock_timeout()`
and to implement `ztimer_mutex_lock_timeout()`.
- Split out handling of the blocking code path of mutex_lock() into a static
`_block()` function. This improves readability a bit and will ease review of
a follow up PR.
- Return `void` instead of `int`.
- Use static inline function for `mutex_try_lock()`
- The implementation is trivial enough with the inline-able IRQ API to just
always be inline-ed
- Rename `_mutex_lock()` to `mutex_lock()` and drop the blocking parameter
- This was possible to the stand-alone `mutex_try_lock()` implementation
- This yields a measurable performance bump
Currently it is not possible to check if a message was sent over a bus
or if it was send the usual way using `msg_send()`.
This adds a flag to the `sender_pid` if the message was sent over a bus.
`MAXTHREADS` is currently set to 32, so there is still plenty of room in
the PID space. (`kernel_pid_t` is `int16_t`)
The message type for bus message type is already accessed through a getter
function, so it's just consistent to do the same for sender_pid.
Verified that each warning generated by -Wcast-align is indeed a false positive
and used an (intermediate) cast to `uintptr_t` to silence the warnings.
container_of() is safe to use in regard to alignment requirements, when used
correctly. Using `uintptr_t` instead of `char *` for applying the offset results
in -Wcast-align not complaining.
Separate thread names from DEVELHELP so thread names can be
enabled in non-development/debug builds when required/desired.
THREAD_NAMES will be enabled by default then DEVELHELP is set to 1.
Allocate and initialize a thread-local block for each thread at the
top of the stack.
Set the tls base when switching to a new thread.
Add tdata/tbss linker instructions to cortex_m and risc-v scripts.
Signed-off-by: Keith Packard <keithp@keithp.com>
---
v2:
Squash fixes
v3:
Replace tabs with spaces
v4:
Add tbss to fe310 linker script
- Add `byteorder_bebuftohll()` to read an 64 bit value from a big endian buffer
- Add `byteorder_htobebufll()` to write an 64 bit value into a big endian buffer
This commit reverses the runqueue_cache bit order when the architecture
has a CLZ (count leading zeros) instruction. When the architecture
supports CLZ, it is faster to determine the most significant set bit of
a word than to determine the least significant bit set. Unfortunately
when the instruction is not available, it is more efficient to determine
the least significant bit set.
Reversing the bit order shaves off another 4 cycles on the same54-xpro.
From 147 to 143 ticks when testing with tests/bench_sched_nop.
Architectures where no CLZ instruction is available are not affected.
Apparently clang doesn't like static variables / functions being accessed called
from inline function (-Wstatic-in-inline). This commit results in the same
binary being generated while making clang happy.
Replace accesses to `sched_active_thread`, `sched_active_pid`, and
`sched_threads` with `thread_get_active()`, `thread_get_active_pid()`, and
`thread_get_unchecked()` where sensible.
- Add `thread_get_active()` to access the TCB
- Add `thread_get_unchecked()` as fast alternative to `thread_get()`
- Drop `volatile` qualifier in `thread_get()`
- Right now every caller of this function does this. It is better to
contain this undefined behavior to at least one place in code
The `clz` instruction pretty much implements getting the most significant bit
in hardware, so use it instead of the software implementation.
This reults in both a reduction in code size as in a speedup:
master:
text data bss dec hex filename
14816 136 2424 17376 43e0 tests/bitarithm_timings/bin/same54-xpro/tests_bitarithm_timings.elf
+ bitarithm_msb: 3529411 iterations per second
this patch:
text data bss dec hex filename
14768 136 2424 17328 43b0 tests/bitarithm_timings/bin/same54-xpro/tests_bitarithm_timings.elf
+ bitarithm_msb: 9230761 iterations per second