zephyr/kernel
Andy Ross b173e4353f kernel/queue: Fix spurious NULL exit condition when using timeouts
The queue loop when CONFIG_POLL is in use has an inherent race
between the return of k_poll() and the inspection of the list where no
lock can be held.  Other contending readers of the same queue can
sneak in and steal the item out of the list before the current thread
gets to the sys_sflist_get() call, and the current loop will (if it
has a timeout) spuriously return NULL before the timeout expires.

It's not even a hard race to exercise.  Consider three threads at
different priorities: High (which can be an ISR too), Mid, and Low:

1. Mid and Low both enter k_queue_get() and sleep inside k_poll() on
   an empty queue.

2. High comes along and calls k_queue_insert().  The queue code then
   wakes up Mid, and reschedules, but because High is still running Mid
   doesn't get to run yet.

3. High inserts a SECOND item.  The queue then unpends the next thread
   in the list (Low), and readies it to run.  But as before, it won't
   be scheduled yet.

4. Now High sleeps (or if it's an interrupt, exits), and Mid gets to
   run.  It dequeues and returns the item it was delivered normally.

5. But Mid is still running!  So it re-enters the loop it's sitting in
   and calls k_queue_get() again, which sees and returns the second
   item in the queue synchronously.  Then it calls it a third time and
   goes to sleep because the queue is empty.

6. Finally, Low wakes up to find an empty queue, and returns NULL
   despite the fact that the timeout hadn't expired.

The fix is simple enough: check the timeout expiration inside the loop
so we don't return early.
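
As a rough illustration of the idea (not the actual queue.c change), the
sketch below models the race with a fake clock and a one-slot queue.  All
names here (sim_queue, sim_poll, sim_take, get_once, get_until_timeout) are
hypothetical stand-ins, not Zephyr APIs, and the K_FOREVER case is left out
for brevity.  It assumes the pre-fix code waits once and then looks at the
list, while the fixed code keeps waiting for the remaining time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a queue plus a fake clock; not the Zephyr API. */
struct sim_queue {
	void *item;          /* single slot, NULL when empty */
	void *pending;       /* item a producer will add later */
	uint32_t arrives_ms; /* simulated time at which `pending` is added */
	uint32_t now_ms;     /* simulated uptime */
	int stolen_wakeups;  /* wakeups where another reader took the item */
};

/* Stand-in for k_poll(): returns 0 when woken, -1 when the wait timed out. */
static int sim_poll(struct sim_queue *q, uint32_t wait_ms)
{
	if (q->stolen_wakeups > 0) {
		/* Woken for an item that a contending reader already took. */
		q->stolen_wakeups--;
		q->now_ms += 1;
		return 0;
	}
	if (q->pending != NULL && q->arrives_ms <= q->now_ms + wait_ms) {
		/* The producer delivers within the wait window. */
		if (q->arrives_ms > q->now_ms) {
			q->now_ms = q->arrives_ms;
		}
		q->item = q->pending;
		q->pending = NULL;
		return 0;
	}
	q->now_ms += wait_ms; /* nothing arrived: the full wait elapses */
	return -1;
}

/* Stand-in for pulling the head of the list; NULL when empty. */
static void *sim_take(struct sim_queue *q)
{
	void *val = q->item;

	q->item = NULL;
	return val;
}

/* Pre-fix shape: one wait, one look.  An empty list means NULL. */
static void *get_once(struct sim_queue *q, uint32_t timeout_ms)
{
	assert(q != NULL);
	(void)sim_poll(q, timeout_ms);
	return sim_take(q);
}

/* Post-fix shape: keep waiting for the remaining time until an item
 * shows up or the timeout has really expired.
 */
static void *get_until_timeout(struct sim_queue *q, uint32_t timeout_ms)
{
	uint32_t start, elapsed = 0;
	void *val;

	assert(q != NULL);
	start = q->now_ms;

	do {
		(void)sim_poll(q, timeout_ms - elapsed);
		val = sim_take(q);
		elapsed = q->now_ms - start;
	} while (val == NULL && elapsed < timeout_ms);

	return val;
}
```

With one stolen wakeup and an item arriving at 50 ms, get_once() with a
100 ms timeout returns NULL after 1 ms of simulated time, while
get_until_timeout() waits again and returns the item.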

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-06-11 17:11:51 -04:00
include kernel: move thread monitor init to common code 2018-06-06 14:26:45 -04:00
alert.c
atomic_c.c
CMakeLists.txt
compiler_stack_protect.c
device.c
errno.c
idle.c
init.c kernel: Use IS-specific entropy function when available 2018-05-24 15:13:13 -07:00
int_latency_bench.c
Kconfig kernel: work_q: Document implications of default sys work_q priority 2018-06-11 14:40:07 -04:00
Kconfig.event_logger
Kconfig.power_mgmt
mailbox.c
mem_domain.c
mem_slab.c
mempool.c kernel/mempool: Handle transient failure condition 2018-05-27 09:55:04 -04:00
msg_q.c kernel: Wait queues aren't dlists anymore 2018-05-19 07:00:55 +03:00
mutex.c
pipes.c kernel: Wait queues aren't dlists anymore 2018-05-19 07:00:55 +03:00
poll.c
queue.c kernel/queue: Fix spurious NULL exit condition when using timeouts 2018-06-11 17:11:51 -04:00
sched.c kernel: sched: use _is_thread_ready() in should_preempt() 2018-06-04 08:21:47 -04:00
sem.c
smp.c
stack.c kernel: Wait queues aren't dlists anymore 2018-05-19 07:00:55 +03:00
sys_clock.c
system_work_q.c
thread_abort.c
thread.c kernel: move thread monitor init to common code 2018-06-06 14:26:45 -04:00
timer.c
userspace_handler.c
userspace.c
version.c
work_q.c