|
1 | 1 | ---
|
2 | 2 | title: xc_domain_node_setaffinity()
|
3 |
| -description: Set a Xen domain's NUMA node affinity |
| 3 | +description: Set a Xen domain's NUMA node affinity for memory allocations |
| 4 | +mermaid: |
| 5 | + force: true |
4 | 6 | ---
|
5 | 7 |
|
6 |
| -`xc_domain_node_setaffinity()` controls the NUMA node affinity of a domain. |
| 8 | +`xc_domain_node_setaffinity()` controls the NUMA node affinity of a domain, |
| 9 | +but it only updates the Xen hypervisor domain's `d->node_affinity` mask. |
| 10 | +This mask is read by the Xen memory allocator as the 2nd preference for the |
| 11 | +NUMA node to allocate memory from for this domain. |
7 | 12 |
|
8 |
| -By default, Xen enables the `auto_node_affinity` feature flag, |
9 |
| -where setting the vCPU affinity also sets the NUMA node affinity for |
10 |
| -memory allocations to be aligned with the vCPU affinity of the domain. |
| 13 | +> [!info] Preferences of the Xen memory allocator: |
| 14 | +> 1. A NUMA node passed to the allocator directly takes precedence, if present. |
| 15 | +> 2. Then, if the allocation is for a domain, its `node_affinity` mask is tried. |
| 16 | +> 3. Finally, it falls back to spreading the pages over all remaining NUMA nodes. |
| 17 | +
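The order of preference above can be sketched as a small, self-contained C model. This is not Xen's allocator code: `pick_node()`, `first_node_with_memory()`, `NO_NODE`, and the 32-bit `model_nodemask_t` are hypothetical names chosen for illustration, and the model only picks one node for a single allocation instead of spreading pages.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical marker: the caller passed no explicit NUMA node. */
#define NO_NODE (-1)

/* Hypothetical model type: bit n stands for NUMA node n. */
typedef uint32_t model_nodemask_t;

/* Return the lowest node in `mask` that still has free pages, or NO_NODE. */
static int first_node_with_memory(model_nodemask_t mask,
                                  const unsigned free_pages[], int nr_nodes)
{
    for (int n = 0; n < nr_nodes; n++)
        if ((mask & (1u << n)) && free_pages[n] > 0)
            return n;
    return NO_NODE;
}

/* Pick a node for one allocation, following the three preferences. */
static int pick_node(int requested_node, model_nodemask_t node_affinity,
                     model_nodemask_t online,
                     const unsigned free_pages[], int nr_nodes)
{
    assert(nr_nodes > 0 && nr_nodes <= 32);
    assert(requested_node == NO_NODE ||
           (requested_node >= 0 && requested_node < nr_nodes));

    /* 1. A node passed directly takes precedence, if it has memory. */
    if (requested_node != NO_NODE && free_pages[requested_node] > 0)
        return requested_node;

    /* 2. Otherwise, try the nodes in the domain's node_affinity mask. */
    int n = first_node_with_memory(node_affinity & online,
                                   free_pages, nr_nodes);
    if (n != NO_NODE)
        return n;

    /* 3. Finally, fall back to the remaining online nodes. */
    return first_node_with_memory(online & ~node_affinity,
                                  free_pages, nr_nodes);
}
```

In this model, a domain whose `node_affinity` only contains an exhausted node still gets memory, but from a remaining node, which is why the mask acts as a preference and not as a hard constraint.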
|
| 18 | +As this call has no practical effect on the Xen scheduler, vCPU affinities |
| 19 | +need to be set separately anyway. |
| 20 | + |
| 21 | +The domain's `auto_node_affinity` flag is enabled by default by Xen. This means |
| 22 | +that when setting vCPU affinities, Xen updates the `d->node_affinity` mask |
| 23 | +to consist of the NUMA nodes to which its vCPUs have affinity. |
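
As a rough illustration of this derivation, the following C sketch computes a node mask from per-vCPU CPU affinities. It is a simplified model, not Xen code: `nodes_of_cpus()`, `auto_node_affinity_mask()`, the `cpu_to_node` table, and the fixed-width bitmasks are assumptions made for this example, and it does not distinguish hard from soft affinity.

```c
#include <assert.h>
#include <stdint.h>

/* Map a bitmask of pCPUs to the bitmask of NUMA nodes they belong to.
 * cpu_to_node[cpu] is a hypothetical topology table for this sketch. */
static uint32_t nodes_of_cpus(uint64_t cpu_affinity,
                              const int cpu_to_node[], int nr_cpus)
{
    uint32_t nodes = 0;

    assert(nr_cpus > 0 && nr_cpus <= 64);
    for (int cpu = 0; cpu < nr_cpus; cpu++)
        if (cpu_affinity & (UINT64_C(1) << cpu))
            nodes |= UINT32_C(1) << cpu_to_node[cpu];
    return nodes;
}

/* Union of the nodes of all vCPU affinities: the mask that the
 * automatic mode would store in the domain's node_affinity. */
static uint32_t auto_node_affinity_mask(const uint64_t vcpu_affinity[],
                                        int nr_vcpus,
                                        const int cpu_to_node[], int nr_cpus)
{
    uint32_t nodes = 0;

    for (int v = 0; v < nr_vcpus; v++)
        nodes |= nodes_of_cpus(vcpu_affinity[v], cpu_to_node, nr_cpus);
    return nodes;
}
```

With four pCPUs split over two nodes, two vCPUs pinned to pCPUs of node 0 yield a mask with only node 0 set, while pinning one of them to a pCPU of node 1 adds node 1 to the mask.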
| 24 | + |
| 25 | +See [xc_vcpu_setaffinity()](xc_vcpu_setaffinity) for more information |
| 26 | +on how `d->auto_node_affinity` is used to set the NUMA node affinity. |
| 27 | + |
| 28 | +Thus, so far, there is no obvious need to call `xc_domain_node_setaffinity()` |
| 29 | +when building a domain. |
11 | 30 |
|
12 | 31 | Setting the NUMA node affinity using this call can be used,
|
13 | 32 | for example, when there might not be enough memory on the
|
@@ -63,18 +82,57 @@ https://github.com/xen-project/xen/blob/master/xen/common/domain.c#L943-L970"
|
63 | 82 | This function implements the functionality of `xc_domain_node_setaffinity`
|
64 | 83 | to set the NUMA affinity of a domain as described above.
|
65 | 84 | If the new_affinity does not intersect the `node_online_map`,
|
66 |
| -it returns `-EINVAL`, otherwise on success `0`. |
| 85 | +it returns `-EINVAL`. Otherwise, it succeeds and returns `0`. |
67 | 86 |
|
68 | 87 | When the `new_affinity` is a specific set of NUMA nodes, it updates the NUMA
|
69 |
| -`node_affinity` of the domain to these nodes and disables `auto_node_affinity` |
70 |
| -for this domain. It also notifies the Xen scheduler of the change. |
| 88 | +`node_affinity` of the domain to these nodes and disables `d->auto_node_affinity` |
| 89 | +for this domain. With `d->auto_node_affinity` disabled, |
| 90 | +[xc_vcpu_setaffinity()](xc_vcpu_setaffinity) no longer updates the NUMA affinity |
| 91 | +of this domain. |
| 92 | + |
| 93 | +If `new_affinity` has all bits set, it re-enables the `d->auto_node_affinity` |
| 94 | +for this domain and calls |
| 95 | +[domain_update_node_aff()](https://github.com/xen-project/xen/blob/e16acd80/xen/common/sched/core.c#L1809-L1876) |
| 96 | +to reset the domain's `node_affinity` mask to the NUMA nodes of the current |
| 97 | +hard and soft affinity of the domain's online vCPUs. |
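
The behaviour described in this section can be modelled in a few lines of C. This is a hedged sketch and not the hypervisor's implementation: `struct model_domain`, `model_set_node_affinity()`, `ALL_NODES`, and the `vcpu_nodes` field (standing in for the nodes that would be derived from the vCPU affinities) are hypothetical, and locking is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "all bits set" mask for a 32-bit node mask. */
#define ALL_NODES UINT32_MAX

/* Hypothetical, reduced stand-in for the relevant domain state. */
struct model_domain {
    uint32_t node_affinity;      /* models d->node_affinity */
    bool     auto_node_affinity; /* models d->auto_node_affinity */
    uint32_t vcpu_nodes;         /* nodes covered by the vCPU affinities */
};

static int model_set_node_affinity(struct model_domain *d,
                                   uint32_t new_affinity,
                                   uint32_t node_online_map)
{
    assert(d != 0);

    /* A mask that does not intersect the online nodes is rejected. */
    if ((new_affinity & node_online_map) == 0)
        return -EINVAL;

    if (new_affinity == ALL_NODES) {
        /* All bits set: re-enable the automatic mode and recompute
         * the mask from the vCPU affinities. */
        d->auto_node_affinity = true;
        d->node_affinity = d->vcpu_nodes & node_online_map;
        return 0;
    }

    /* A specific set of nodes: store it and disable the automatic mode. */
    d->auto_node_affinity = false;
    d->node_affinity = new_affinity;
    return 0;
}
```

In this model, a rejected call leaves the domain state untouched, and once a specific mask has been set, only a call with all bits set returns the domain to the automatic mode.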
| 98 | + |
| 99 | +### Flowchart in relation to xc_vcpu_setaffinity()
| 100 | + |
| 101 | +The effect of `domain_set_node_affinity()` can be seen more clearly in this |
| 102 | +flowchart, which shows how `xc_vcpu_setaffinity()` is currently used to set |
| 103 | +the NUMA affinity of a new domain and how `domain_set_node_affinity()` |
| 104 | +relates to it: |
71 | 105 |
|
72 |
| -This sets the preference the memory allocator to the new NUMA nodes, |
73 |
| -and in theory, it could also alter the behaviour of the scheduler. |
74 |
| -This of course depends on the scheduler and its configuration. |
| 106 | +{{% include "xc_vcpu_setaffinity-xenopsd-notes.md" %}} |
| 107 | +{{% include "xc_vcpu_setaffinity-xenopsd.md" %}} |
| 108 | + |
| 109 | +`xc_domain_node_setaffinity()` can be used to set the domain's `node_affinity` |
| 110 | +(which is normally set by `xc_vcpu_setaffinity()`) to different NUMA nodes. |
| 111 | + |
| 112 | +#### No effect on the Xen scheduler |
| 113 | + |
| 114 | +Currently, the node affinity does not affect the Xen scheduler: |
| 115 | +If `d->node_affinity` is set before vCPU creation, the initial pCPU |
| 116 | +of the new vCPU is the first pCPU of the first NUMA node in the domain's |
| 117 | +`node_affinity`. This is further changed when one or more `cpupools` are set up. |
| 118 | +As this is only the initial pCPU of the vCPU, this alone does not change the |
| 119 | +scheduling of the Xen Credit scheduler, as it reschedules the vCPUs to other pCPUs. |
75 | 120 |
|
76 | 121 | ## Notes on future design improvements
|
77 | 122 |
|
| 123 | +### It may be possible to call it before vCPUs are created |
| 124 | + |
| 125 | +When done early, before vCPU creation, some domain-related data structures |
| 126 | +could be allocated using the domain's `d->node_affinity` NUMA node mask. |
| 127 | + |
| 128 | +With further changes in Xen and `xenopsd`, Xen could allocate the vCPU structs |
| 129 | +on the affine NUMA nodes of the domain. |
| 130 | + |
| 131 | +For this, `xenopsd` would have to call `xc_domain_node_setaffinity()` |
| 132 | +before vCPU creation, after having decided the domain's NUMA placement, |
| 133 | +preferably including claiming the required memory for the domain to ensure |
| 134 | +that the domain will be populated from the same NUMA node(s). |
| 135 | + |
78 | 136 | This call cannot influence the past: The `xenopsd`
|
79 | 137 | [VM_create](../../xenopsd/walkthroughs/VM.start.md#2-create-a-xen-domain)
|
80 | 138 | micro-ops calls `Xenctrl.domain_create`. It currently creates
|
|