Commit dafcaab

(doc) Describe how xc_domain_claim_pages() is used to claim pages (#6343)
2 parents b9c8154 + e900040 commit dafcaab

1 file changed

+157
-0
lines changed

@@ -0,0 +1,157 @@
1+
---
2+
title: xc_domain_claim_pages()
3+
description: Stake a claim for further memory for a domain, and release it too.
4+
---
5+
6+
## Purpose
7+
8+
The purpose of `xc_domain_claim_pages()` is to attempt to
9+
stake a claim on an amount of memory for a given domain which guarantees that
10+
memory allocations for the claimed amount will be successful.
11+
12+
The domain can still attempt to allocate beyond the claim, but those allocations
are not guaranteed to succeed and will fail if the domain's memory reaches its
`max_mem` value.
15+
16+
Each domain can only have one claim, and the domid is the key of the claim.
Killing the domain also releases the claim.
18+
19+
Depending on the given size argument, the remaining stake of the domain
can be set initially, updated to the given amount, or reset to no claim (0).
21+
22+
## Management of claims
23+
24+
- The stake is centrally managed by the Xen hypervisor using a
25+
[Hypercall](https://wiki.xenproject.org/wiki/Hypercall).
26+
- Claims are not reflected in the amount of free memory reported by Xen.
27+
28+
## Reporting of claims
29+
30+
- `xl claims` reports the outstanding claims of the domains:
31+
> [!info] Sample output of `xl claims`:
32+
> ```js
33+
> Name ID Mem VCPUs State Time(s) Claimed
34+
> Domain-0 0 2656 8 r----- 957418.2 0
35+
> ```
36+
- `xl info` reports the host-wide outstanding claims:
37+
> [!info] Sample output from `xl info | grep outstanding`:
38+
> ```js
39+
> outstanding_claims : 0
40+
> ```
41+
42+
## Tracking of claims
43+
44+
Xen only tracks:
45+
- the outstanding claims of each domain and
46+
- the outstanding host-wide claims.
47+
48+
Claiming zero pages effectively cancels the domain's outstanding claim
49+
and is always successful.
50+
51+
> [!info]
52+
> - Allocations for outstanding claims are expected to always be successful.
> - But each such allocation reduces the outstanding claim of the domain.
> - Freeing memory of the domain increases the domain's claim again:
>   - But, when a domain consumes its claim, it is reset.
>   - When the claim is reset, freed memory is no longer moved to the outstanding claims!
>   - It would have to get a new claim on memory to have spare memory again.
58+
59+
> [!warning] The domain's `max_mem` value is used to deny memory allocation
60+
> If an allocation would cause the domain to exceed its `max_mem`
61+
> value, it will always fail.
62+
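The accounting described in this section can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Xen's implementation: `host_model`, `domain_model`, `model_claim_pages()` and `model_alloc_pages()` are hypothetical names, the claim argument is treated as the new stake of the domain, and the behaviour on freeing memory is left out.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical host-wide state: free pages and the sum of all claims. */
struct host_model {
    unsigned long free_pages;
    unsigned long outstanding_claims;
};

/* Hypothetical per-domain state; max_pages plays the role of max_mem. */
struct domain_model {
    unsigned long tot_pages;
    unsigned long max_pages;
    unsigned long outstanding_claim;
};

/* Pages this domain may take: free pages minus the claims of other domains. */
unsigned long pages_available_to(const struct host_model *h,
                                 const struct domain_model *d)
{
    assert(d->outstanding_claim <= h->outstanding_claims);
    unsigned long others = h->outstanding_claims - d->outstanding_claim;
    return h->free_pages > others ? h->free_pages - others : 0;
}

/* Set, update or (with nr_pages == 0) cancel the claim of the domain. */
int model_claim_pages(struct host_model *h, struct domain_model *d,
                      unsigned long nr_pages)
{
    if (nr_pages > pages_available_to(h, d))
        return -ENOMEM;
    /* Replace the old stake with the new one, host-wide and per domain. */
    h->outstanding_claims -= d->outstanding_claim;
    h->outstanding_claims += nr_pages;
    d->outstanding_claim = nr_pages;
    return 0;
}

/* Allocate pages for the domain, consuming its claim first. */
int model_alloc_pages(struct host_model *h, struct domain_model *d,
                      unsigned long nr_pages)
{
    if (d->tot_pages + nr_pages > d->max_pages)
        return -ENOMEM;               /* would exceed max_mem: always fails */
    if (nr_pages > pages_available_to(h, d))
        return -ENOMEM;               /* remaining memory is claimed by others */
    unsigned long from_claim =
        nr_pages < d->outstanding_claim ? nr_pages : d->outstanding_claim;
    d->outstanding_claim -= from_claim;
    h->outstanding_claims -= from_claim;
    h->free_pages -= nr_pages;
    d->tot_pages += nr_pages;
    return 0;
}
```

In this model, a claim of zero pages always succeeds because zero can never exceed the available amount, and an allocation within the claim cannot be starved by other domains because their allocations are limited to unclaimed memory.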
63+
64+
## Implementation
65+
66+
Shape of the libXenCtrl call that issues the Xen hypercall:
67+
68+
```c
69+
long xc_memory_op(libxc_handle, XENMEM_claim_pages, struct xen_memory_reservation *)
70+
```
71+
72+
The `struct xen_memory_reservation` argument is initialised as follows:

```c
struct xen_memory_reservation reservation = {
    .nr_extents   = nr_pages, /* number of pages to claim */
    .extent_order = 0,        /* an order 0 means: 4k pages, only 0 is allowed */
    .mem_flags    = 0,        /* no flags, only 0 is allowed (at the moment) */
    .domid        = domid     /* numerical domain ID of the domain */
};
```
82+
83+
### Concurrency
84+
85+
Xen protects the consistency of the stake of the domain
86+
using the domain's `page_alloc_lock` and the global `heap_lock` of Xen.
87+
These spin-locks prevent any "time-of-check-time-of-use" races.
88+
As the hypercall needs to take those spin-locks, it cannot be preempted.
89+
90+
### Return value
91+
92+
The call returns 0 if the hypercall successfully claimed the requested amount
93+
of memory, else it returns non-zero.
94+
95+
## Current users
96+
97+
### <tt>libxl</tt> and the <tt>xl</tt> CLI
98+
99+
When `libxl` passes a `struct xc_dom_image` with its `claim_enabled` field set
to the
[libxenguest](https://github.com/xen-project/xen/tree/master/tools/libs/guest)
functions
[meminit_hvm()](https://github.com/xen-project/xen/blob/de0254b9/tools/libs/guest/xg_dom_x86.c#L1348-L1649)
and
[meminit_pv()](https://github.com/xen-project/xen/blob/de0254b9/tools/libs/guest/xg_dom_x86.c#L1183-L1333),
these functions first attempt to claim the to-be-allocated memory
using a call to `xc_domain_claim_pages()`.
Only then do they allocate the domain's system memory using the allocation function
[xc_populate_physmap()](https://github.com/xen-project/xen/blob/de0254b9/xen/common/memory.c#L159-L314),
which calls the hypercall to allocate and populate the domain's main system memory.

If the claim fails, they do not attempt to continue and return the error code
of `xc_domain_claim_pages()`.
112+
113+
Both functions also (unconditionally) reset the claim upon return.
114+
115+
The `xl` CLI uses this functionality (unless disabled in `xl.conf`)
to make domain builds fail early instead of running out of memory inside
the `meminit_hvm` and `meminit_pv` calls:
when the claim fails, these functions immediately return an error.
119+
120+
This means that in case the claim fails, `xl` avoids:
121+
- The effort of allocating the memory, thereby not blocking it for other domains.
122+
- The effort of potentially needing to scrub the memory after the build failure.
123+
124+
### xenguest
125+
126+
While [xenguest](../../../xenopsd/walkthroughs/VM.build/xenguest) calls the
127+
[libxenguest](https://github.com/xen-project/xen/tree/master/tools/libs/guest)
128+
functions
129+
[meminit_hvm()](https://github.com/xen-project/xen/blob/de0254b9/tools/libs/guest/xg_dom_x86.c#L1348-L1649)
130+
and
131+
[meminit_pv()](https://github.com/xen-project/xen/blob/de0254b9/tools/libs/guest/xg_dom_x86.c#L1183-L1333)
132+
like `libxl` does, it does not set
133+
[struct xc_dom_image.claim_enabled](https://github.com/xen-project/xen/blob/de0254b9/tools/include/xenguest.h#L186),
134+
so it does not enable the first call to `xc_domain_claim_pages()`
135+
which would claim the amount of memory that these functions will
136+
attempt to allocate and populate for the domain.
137+
138+
#### Future design ideas for improved NUMA support
139+
140+
For improved support for [NUMA](../../../toolstack/features/NUMA/), `xenopsd`
may want to call an updated version of this function for the domain, so it has
a stake on the NUMA node's memory before `xenguest` allocates memory for the
domain and before a NUMA node is assigned to the new domain.
144+
145+
Further, as PV drivers `unmap` and `free` memory for grant tables to Xen and
146+
then re-allocate memory for those grant tables, `xenopsd` may want to try to
147+
stake a very small claim for the domain on the NUMA node of the domain so that
148+
Xen can increase this claim when the PV drivers `free` this memory and re-use
149+
the resulting claimed amount for allocating the grant tables. This would ensure
150+
that the grant tables are then allocated on the local NUMA node of the domain,
151+
avoiding remote memory accesses when accessing the grant tables from inside
152+
the domain.
153+
154+
Note: In case the corresponding backend process in Dom0 is running on another
155+
NUMA node, it would access the domain's grant tables from a remote NUMA node,
156+
but this would enable a future improvement for Dom0, where it could prefer to
157+
run the corresponding backend process on the same or a neighbouring NUMA node.
