It is often desiderable to sync on multiple threads, e.g. there can be a controller
thread that waits for `n` worker threads to finish their job.
An inverse semaphore provides an easy primitive to implement this pattern.
After being initialized with a value `n` (in counter mode), a call to `sema_inv_wait()`
will block until each of the `n` threads has called `sema_inv_post()` exactly once.
There are situations where workers might post an event more than once
(unless additional state is introduced).
For this case, the alternative mask mode is provided.
Here the inverse semaphore is initialized with a bit mask, each worker can clear one
or multiple bits with `sema_inv_post_mask()`. A worker can clear it's bit multiple times.