
Commit dff6b71

Andy Ross authored and andrewboie committed
kernel/sched: More nonatomic swap fixes
Nonatomic swap strikes again. These issues are all longstanding, but were unmasked by the dlist work in commit d40b8ce ("sys: dlist: Add sys_dnode_is_linked") where list node pointers become nulls on removal.

The previous fix was for a specific case where a timeslicing interrupt would try to slice out the "wrong" current thread because the thread has "just" pended itself. That was incomplete, because the parallel code in k_sleep() didn't flag itself the same way.

And beyond that, it turns out to be basically impossible (now that I'm thinking about it correctly) to prevent interrupt code from calling into the scheduler to suspend a "just pended but not quite" current and/or preempt away to another thread. In any of these cases, the scheduler modifications to the state bits remain correct, but the queue nodes may be corrupt because the thread was already removed from the ready queue. So we have to test and correct this at the lowest level, where a thread is being removed from a priq: check that it's (1) the ready queue and not a waitq, (2) the current thread, and (3) already marked suspended and thus not in the queue.

There are lots of existing issues filed in the last few months all pointing to odd instability on ARM platforms. I'm reasonably certain this is the root cause for most or all of them.

Signed-off-by: Andy Ross <[email protected]>
1 parent 775993a commit dff6b71
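The guard added to each priq removal path below is the same three-part test described above. As a minimal sketch of that test pulled into one place (the helper name is illustrative and assumed, not part of the commit, which open-codes the check inside _priq_dumb_remove(), _priq_rb_remove() and _priq_mq_remove() under the matching CONFIG_SWAP_NONATOMIC / CONFIG_SCHED_* guards):

/* Illustrative sketch only -- not part of the commit.  Shown for the
 * dumb-dlist backend; the rb and multiq variants compare their own pq
 * pointer types against &_kernel.ready_q.runq in the same way.  Assumes
 * the kernel-internal context of kernel/sched.c, where _kernel, _current
 * and _is_thread_prevented_from_running() are visible.
 */
static inline bool already_unqueued(sys_dlist_t *pq, struct k_thread *thread)
{
	return pq == &_kernel.ready_q.runq &&              /* (1) ready queue, not a waitq */
	       thread == _current &&                       /* (2) the current thread */
	       _is_thread_prevented_from_running(thread);  /* (3) already marked not runnable */
}

If all three hold, the thread's queue node was already unlinked when the scheduler pulled it out of the ready queue, so the removal functions simply return rather than touching a node whose pointers are now NULL.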

File tree

kernel/sched.c

1 file changed: +22, -0 lines


kernel/sched.c

Lines changed: 22 additions & 0 deletions
@@ -619,6 +619,13 @@ ALWAYS_INLINE void _priq_dumb_add(sys_dlist_t *pq, struct k_thread *thread)
 
 void _priq_dumb_remove(sys_dlist_t *pq, struct k_thread *thread)
 {
+#if defined(CONFIG_SWAP_NONATOMIC) && defined(CONFIG_SCHED_DUMB)
+	if (pq == &_kernel.ready_q.runq && thread == _current &&
+	    _is_thread_prevented_from_running(thread)) {
+		return;
+	}
+#endif
+
 	__ASSERT_NO_MSG(!_is_idle(thread));
 
 	sys_dlist_remove(&thread->base.qnode_dlist);
@@ -676,6 +683,12 @@ void _priq_rb_add(struct _priq_rb *pq, struct k_thread *thread)
 
 void _priq_rb_remove(struct _priq_rb *pq, struct k_thread *thread)
 {
+#if defined(CONFIG_SWAP_NONATOMIC) && defined(CONFIG_SCHED_SCALABLE)
+	if (pq == &_kernel.ready_q.runq && thread == _current &&
+	    _is_thread_prevented_from_running(thread)) {
+		return;
+	}
+#endif
 	__ASSERT_NO_MSG(!_is_idle(thread));
 
 	rb_remove(&pq->tree, &thread->base.qnode_rb);
@@ -712,6 +725,12 @@ ALWAYS_INLINE void _priq_mq_add(struct _priq_mq *pq, struct k_thread *thread)
 
 ALWAYS_INLINE void _priq_mq_remove(struct _priq_mq *pq, struct k_thread *thread)
 {
+#if defined(CONFIG_SWAP_NONATOMIC) && defined(CONFIG_SCHED_MULTIQ)
+	if (pq == &_kernel.ready_q.runq && thread == _current &&
+	    _is_thread_prevented_from_running(thread)) {
+		return;
+	}
+#endif
 	int priority_bit = thread->base.prio - K_HIGHEST_THREAD_PRIO;
 
 	sys_dlist_remove(&thread->base.qnode_dlist);
@@ -893,6 +912,9 @@ s32_t _impl_k_sleep(s32_t duration)
 	struct k_spinlock local_lock = {};
 	k_spinlock_key_t key = k_spin_lock(&local_lock);
 
+#if defined(CONFIG_TIMESLICING) && defined(CONFIG_SWAP_NONATOMIC)
+	pending_current = _current;
+#endif
 	_remove_thread_from_ready_q(_current);
 	_add_thread_timeout(_current, ticks);
 

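The k_sleep() hunk mirrors the earlier timeslicing fix: the thread flags itself in pending_current just before it leaves the ready queue, so a timeslice interrupt that lands in the nonatomic window will not try to reschedule a thread whose queue node is already unlinked. A hedged sketch of the consuming side (the function name and surrounding code are illustrative assumptions, not quoted from this tree):

/* Illustrative assumption, not part of this commit: how a timeslice
 * interrupt handler is expected to honor pending_current under
 * CONFIG_SWAP_NONATOMIC.  If the interrupted thread already flagged
 * itself (in k_sleep() above, or in the earlier pend-path fix), its
 * ready-queue node is gone and it must be left alone here.
 */
static void time_slice_tick_sketch(void)
{
#if defined(CONFIG_TIMESLICING) && defined(CONFIG_SWAP_NONATOMIC)
	if (pending_current == _current) {
		return;                 /* mid-swap; do not slice this thread out */
	}
	pending_current = NULL;         /* stale flag from a completed swap */
#endif
	/* ... normal timeslice expiry handling would continue here ... */
}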
0 commit comments
