Skip to content

Commit 03bd4e1

Browse files
Wanpeng LiIngo Molnar
authored andcommitted
sched: Fix unreleased llc_shared_mask bit during CPU hotplug
The following bug can be triggered by hot adding and removing a large number of xen domain0's vcpus repeatedly: BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [..] find_busiest_group PGD 5a9d5067 PUD 13067 PMD 0 Oops: 0000 [#3] SMP [...] Call Trace: load_balance ? _raw_spin_unlock_irqrestore idle_balance __schedule schedule schedule_timeout ? lock_timer_base schedule_timeout_uninterruptible msleep lock_device_hotplug_sysfs online_store dev_attr_store sysfs_write_file vfs_write SyS_write system_call_fastpath Last level cache shared mask is built during CPU up and the build_sched_domain() routine takes advantage of it to setup the sched domain CPU topology. However, llc_shared_mask is not released during CPU disable, which leads to an invalid sched domainCPU topology. This patch fix it by releasing the llc_shared_mask correctly during CPU disable. Yasuaki also reported that this can happen on real hardware: https://lkml.org/lkml/2014/7/22/1018 His case is here: == Here is an example on my system. My system has 4 sockets and each socket has 15 cores and HT is enabled. In this case, each core of sockes is numbered as follows: | CPU# Socket#0 | 0-14 , 60-74 Socket#1 | 15-29, 75-89 Socket#2 | 30-44, 90-104 Socket#3 | 45-59, 105-119 Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000. It means that last level cache of Socket#2 is shared with CPU#30-44 and 90-104. When hot-removing socket#2 and #3, each core of sockets is numbered as follows: | CPU# Socket#0 | 0-14 , 60-74 Socket#1 | 15-29, 75-89 But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30 remains having 0x3fff80000001fffc0000000. After that, when hot-adding socket#2 and #3, each core of sockets is numbered as follows: | CPU# Socket#0 | 0-14 , 60-74 Socket#1 | 15-29, 75-89 Socket#2 | 30-59 Socket#3 | 90-119 Then llc_shared_mask of CPU#30 becomes 0x3fff8000fffffffc0000000. It means that last level cache of Socket#2 is shared with CPU#30-59 and 90-104. So the mask has the wrong value. Signed-off-by: Wanpeng Li <[email protected]> Tested-by: Linn Crosetto <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Reviewed-by: Toshi Kani <[email protected]> Reviewed-by: Yasuaki Ishimatsu <[email protected]> Cc: <[email protected]> Cc: David Rientjes <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
1 parent 6a40281 commit 03bd4e1

File tree

1 file changed

+3
-0
lines changed

1 file changed

+3
-0
lines changed

arch/x86/kernel/smpboot.c

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1284,6 +1284,9 @@ static void remove_siblinginfo(int cpu)
12841284

12851285
for_each_cpu(sibling, cpu_sibling_mask(cpu))
12861286
cpumask_clear_cpu(cpu, cpu_sibling_mask(sibling));
1287+
for_each_cpu(sibling, cpu_llc_shared_mask(cpu))
1288+
cpumask_clear_cpu(cpu, cpu_llc_shared_mask(sibling));
1289+
cpumask_clear(cpu_llc_shared_mask(cpu));
12871290
cpumask_clear(cpu_sibling_mask(cpu));
12881291
cpumask_clear(cpu_core_mask(cpu));
12891292
c->phys_proc_id = 0;

0 commit comments

Comments
 (0)