
Commit 8355506

Hugo docs: Add dedicated walk-throughs for VM.build and xenguest (#6296)
Add walk-throughs for VM_build (split into 3 pages for improved focus).

The existing walk-through for VM_build has been used as the basis and extended: the parts that are currently in [Chapter 3 of the VM.start workflow](https://xapi-project.github.io/new-docs/xenopsd/walkthroughs/VM.start/index.html#3-build-the-domain) have been extracted, moved to new files, and extended. In particular, the deeply nested lists that I initially used are split into separate chapters and flattened. The new walk-through contains flowcharts that show the execution flow and a call graph that visualises the content of the walk-through.

This PR has 3 commits. The 2nd commit is the big one; the 1st and 3rd are small, minor improvements:

1. [doc/hugo.toml: Use the theme font for mermaid diagrams too](66e950)
   A cosmetic change that aligns the font of the Mermaid diagrams with the font of the Hugo Relearn theme, so the diagrams look consistent with the more modern theme of the site.
2. [docs: Add dedicated walk-throughs for VM.build and xenguest](f58c32)
   Add walk-throughs for VM_build (split into 3 pages for improved focus).
3. [xenopsd docs: Add Walk-through descriptions, show them on the index page](27de477)
   Show a short one-line summary of each workflow on the walk-through index page.
2 parents 07a4ae4 + 27de477 commit 8355506


9 files changed

+445
-108
lines changed

Lines changed: 137 additions & 0 deletions
@@ -0,0 +1,137 @@
1+
---
2+
title: Domain.build
3+
description:
4+
"Prepare the build of a VM: Wait for scrubbing, do NUMA placement, run xenguest."
5+
---
6+
7+
## Overview
8+
9+
```mermaid
10+
flowchart LR
11+
subgraph xenopsd VM_build[
12+
xenopsd thread pool with two VM_build micro#8209;ops:
13+
During parallel VM_start, many threads run this in parallel!
14+
]
15+
direction LR
16+
build_domain_exn[
17+
VM.build_domain_exn
18+
from thread pool Thread #1
19+
] --> Domain.build
20+
Domain.build --> build_pre
21+
build_pre --> wait_xen_free_mem
22+
build_pre -->|if NUMA/Best_effort| numa_placement
23+
Domain.build --> xenguest[Invoke xenguest]
24+
click Domain.build "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L1111-L1210" _blank
25+
click build_domain_exn "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/xenops_server_xen.ml#L2222-L2225" _blank
26+
click wait_xen_free_mem "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L236-L272" _blank
27+
click numa_placement "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L862-L897" _blank
28+
click build_pre "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L899-L964" _blank
29+
click xenguest "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L1139-L1146" _blank
30+
31+
build_domain_exn2[
32+
VM.build_domain_exn
33+
from thread pool Thread #2] --> Domain.build2[Domain.build]
34+
Domain.build2 --> build_pre2[build_pre]
35+
build_pre2 --> wait_xen_free_mem2[wait_xen_free_mem]
36+
build_pre2 -->|if NUMA/Best_effort| numa_placement2[numa_placement]
37+
Domain.build2 --> xenguest2[Invoke xenguest]
38+
click Domain.build2 "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L1111-L1210" _blank
39+
click build_domain_exn2 "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/xenops_server_xen.ml#L2222-L2225" _blank
40+
click wait_xen_free_mem2 "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L236-L272" _blank
41+
click numa_placement2 "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L862-L897" _blank
42+
click build_pre2 "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L899-L964" _blank
43+
click xenguest2 "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L1139-L1146" _blank
44+
end
45+
```
46+
47+
[`VM.build_domain_exn`](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/xenops_server_xen.ml#L2024-L2248)
48+
[calls](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/xenops_server_xen.ml#L2222-L2225)
49+
[`Domain.build`](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L1111-L1210)
50+
to call:
51+
- `build_pre` to prepare the build of a VM:
  - Wait, if necessary, for the Xen memory scrubber to free enough memory.
  - If the `xe` config `numa_placement` is set to `Best_effort`, invoke the NUMA placement algorithm.
- `xenguest` to invoke the [xenguest](xenguest) program to set up the domain's system memory.
55+
56+
## Domain Build Preparation using build_pre
57+
58+
[`Domain.build`](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L1111-L1210)
59+
[calls](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L1137)
60+
the [function `build_pre`](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L899-L964)
61+
(which is also used for VM restore). It must:
62+
63+
1. [Call](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L902-L911)
64+
[wait_xen_free_mem](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L236-L272)
65+
to wait, if necessary, for the Xen memory scrubber to catch up reclaiming memory (CA-39743)
66+
2. Call the hypercall to set the timer mode
67+
3. Call the hypercall to set the number of vCPUs
68+
4. As described in the [NUMA feature description](../../toolstack/features/NUMA),
69+
when the `xe` configuration option `numa_placement` is set to `Best_effort`,
70+
except when the VM has a hard affinity set, invoke the `numa_placement` function:
71+
72+
```ml
match !Xenops_server.numa_placement with
| Any ->
    ()
| Best_effort ->
    log_reraise (Printf.sprintf "NUMA placement") (fun () ->
        if has_hard_affinity then
          D.debug "VM has hard affinity set, skipping NUMA optimization"
        else
          numa_placement domid ~vcpus
            ~memory:(Int64.mul memory.xen_max_mib 1048576L)
    )
```
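
The decision made by this excerpt can be modelled in isolation. The following is a minimal sketch, not the xenopsd implementation: the types `numa_policy` and `action` and the functions `bytes_of_mib` and `decide` are hypothetical names introduced only for illustration. It shows the three possible outcomes and the MiB-to-bytes conversion applied to `xen_max_mib`.

```ocaml
(* Hypothetical stand-alone model of the decision in the excerpt above.
   None of these names are part of xenopsd. *)
type numa_policy = Any | Best_effort

type action =
  | Skip_policy_any (* numa_placement is not Best_effort *)
  | Skip_hard_affinity (* the VM pins its vCPUs explicitly *)
  | Place of {vcpus: int; memory_bytes: int64}

(* xen_max_mib is given in MiB; the placement request uses bytes. *)
let bytes_of_mib mib = Int64.mul mib 1048576L

let decide ~policy ~has_hard_affinity ~vcpus ~xen_max_mib =
  match policy with
  | Any ->
      Skip_policy_any
  | Best_effort ->
      if has_hard_affinity then
        Skip_hard_affinity
      else
        Place {vcpus; memory_bytes= bytes_of_mib xen_max_mib}
```

For example, a 4-vCPU VM with `xen_max_mib = 1024L` under `Best_effort` and without hard affinity yields a placement request for 1073741824 bytes.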
85+
86+
## NUMA placement
87+
88+
`build_pre` passes the `domid`, the number of `vCPUs` and `xen_max_mib` to the
89+
[numa_placement](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L862-L897)
90+
function to run the algorithm to find the best NUMA placement.
91+
92+
When it returns a NUMA node to use, it calls the Xen hypercalls
93+
to set the vCPU affinity to this NUMA node:
94+
95+
```ml
let vm = NUMARequest.make ~memory ~vcpus in
let nodea =
  match !numa_resources with
  | None ->
      Array.of_list nodes
  | Some a ->
      Array.map2 NUMAResource.min_memory (Array.of_list nodes) a
in
numa_resources := Some nodea ;
Softaffinity.plan ~vm host nodea
```
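
The `match` on `numa_resources` in this excerpt merges a fresh per-node reading with the array remembered from the previous call. A minimal sketch of that pattern follows, assuming that a `min_memory`-style merge keeps the smaller of the two memory values. That behaviour is an assumption made for illustration, not something confirmed from the `NUMAResource` module, and the `node` record, `min_memory` and `merge` below are hypothetical stand-ins.

```ocaml
(* Hypothetical stand-in for a NUMA node's resources; not xenopsd's type. *)
type node = {id: int; free_mib: int64}

(* Assumed merge rule: keep the smaller of the two free-memory values. *)
let min_memory fresh previous =
  {fresh with free_mib= min fresh.free_mib previous.free_mib}

(* Mirrors the shape of the match in the excerpt: the first call uses the
   fresh reading as-is, later calls merge it with the remembered array.
   Array.map2 raises Invalid_argument if the two arrays differ in length. *)
let merge remembered fresh_nodes =
  let fresh = Array.of_list fresh_nodes in
  match remembered with
  | None ->
      fresh
  | Some previous ->
      Array.map2 min_memory fresh previous
```

Under this assumption the tracked value for a node can only shrink between calls, which would keep the estimate conservative when several domains are being built in parallel.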
107+
108+
By using the default `auto_node_affinity` feature of Xen,
109+
setting the vCPU affinity causes the Xen hypervisor to activate
110+
NUMA node affinity for memory allocations to be aligned with
111+
the vCPU affinity of the domain.
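
To make "setting the vCPU affinity to a NUMA node" concrete, the sketch below derives a per-CPU mask from a chosen node. It is an illustration under stated assumptions: the `cpu_to_node` array (mapping each physical CPU index to its NUMA node) and the `affinity_mask` function are hypothetical, and a boolean-per-CPU mask is only an assumed representation of the affinity.

```ocaml
(* Hypothetical host description: cpu_to_node.(c) is the NUMA node that
   physical CPU c belongs to. *)
let affinity_mask ~cpu_to_node ~node =
  (* A CPU is allowed exactly when it belongs to the chosen node. *)
  Array.map (fun n -> n = node) cpu_to_node
```

On a host with four CPUs split evenly across two nodes, choosing node 1 allows only the last two CPUs. With `auto_node_affinity` active, memory allocations for the domain would then follow that choice.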
112+
113+
Note: See the Xen domain's
114+
[auto_node_affinity](https://wiki.xenproject.org/wiki/NUMA_node_affinity_in_the_Xen_hypervisor)
115+
feature flag, which controls this behaviour and can be overridden in the
116+
Xen hypervisor if needed for specific VMs.
117+
118+
This can be used, for example, when there might not be enough memory
119+
on the preferred NUMA node, but there are other NUMA nodes that have
120+
enough free memory among which the memory allocations can be done.
121+
122+
In terms of future NUMA design, it might be even more favourable to
123+
have a strategy in `xenguest` where in such cases, the superpages
124+
of the preferred node are used first and a fallback to neighbouring
125+
NUMA nodes only happens to the extent necessary.
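
The proposed "preferred node first, fall back only as far as necessary" strategy can be sketched as a small planner. This is purely illustrative: no such function exists in `xenguest`, the name `plan_allocation` and its parameters are hypothetical, sizes are plain integers in MiB, and visiting the remaining nodes in ascending index order is a stand-in for a real notion of neighbouring nodes.

```ocaml
(* Hypothetical planner: free_mib.(n) is the free memory on node n.
   Returns the (node, MiB) allocations in the order they are taken,
   or None when the host as a whole cannot satisfy the request.
   preferred must be a valid index into free_mib. *)
let plan_allocation ~preferred ~free_mib ~request_mib =
  let others =
    List.filter
      (fun n -> n <> preferred)
      (List.init (Array.length free_mib) Fun.id)
  in
  let rec go remaining acc = function
    | _ when remaining = 0 ->
        Some (List.rev acc)
    | [] ->
        None
    | n :: rest ->
        (* Take as much as this node can give, but no more than needed. *)
        let take = min remaining free_mib.(n) in
        let acc = if take > 0 then (n, take) :: acc else acc in
        go (remaining - take) acc rest
  in
  go request_mib [] (preferred :: others)
```

With `free_mib = [|4096; 1024; 2048|]` and node 1 preferred, a 3000 MiB request takes all 1024 MiB from node 1 and only the remaining 1976 MiB from node 0.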
126+
127+
Likely, the future allocation strategy should be passed to `xenguest`
128+
using Xenstore like the other platform parameters for the VM.
129+
130+
Summary: This passes the information to the hypervisor that memory
131+
allocation for this domain should preferably be done from this NUMA node.
132+
133+
## Invoke the xenguest program
134+
135+
With the preparation in `build_pre` completed, `Domain.build`
136+
[calls](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/domain.ml#L1127-L1155)
137+
the `xenguest` function to invoke the [xenguest](xenguest) program to build the domain.
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
1+
---
2+
title: VM_build micro-op
3+
linkTitle: VM_build μ-op
4+
description: Overview of the VM_build μ-op (runs after the VM_create μ-op created the domain).
5+
weight: 10
6+
---
7+
8+
## Overview
9+
10+
On Xen, `Xenctrl.domain_create` creates an empty domain and
11+
returns the domain ID (`domid`) of the new domain to `xenopsd`.
12+
13+
In the `build` phase, the `xenguest` program is called to create
14+
the system memory layout of the domain, set vCPU affinity and a
15+
lot more.
16+
17+
The [VM_build](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/lib/xenops_server.ml#L2255-L2271)
18+
micro-op collects the VM build parameters and calls
19+
[VM.build](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/xenops_server_xen.ml#L2290-L2291),
20+
which calls
21+
[VM.build_domain](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/xenops_server_xen.ml#L2250-L2288),
22+
which calls
23+
[VM.build_domain_exn](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/xenops_server_xen.ml#L2024-L2248)
24+
which calls [Domain.build](Domain.build):
25+
26+
```mermaid
27+
flowchart
28+
subgraph xenopsd VM_build[xenopsd VM_build micro#8209;op]
29+
direction LR
30+
VM_build --> VM.build
31+
VM.build --> VM.build_domain
32+
VM.build_domain --> VM.build_domain_exn
33+
VM.build_domain_exn --> Domain.build
34+
click VM_build "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/lib/xenops_server.ml#L2255-L2271" _blank
35+
click VM.build "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/xenops_server_xen.ml#L2290-L2291" _blank
36+
click VM.build_domain "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/xenops_server_xen.ml#L2250-L2288" _blank
37+
click VM.build_domain_exn "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/xenops_server_xen.ml#L2024-L2248" _blank
38+
click Domain.build "../Domain.build/index.html"
39+
end
40+
```
41+
42+
The function
43+
[VM.build_domain_exn](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/xenops_server_xen.ml#L2024)
44+
must:
45+
46+
1. Run pygrub (or eliloader) to extract the kernel and initrd, if necessary
47+
2. [Call](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/xenops_server_xen.ml#L2222-L2225)
48+
[Domain.build](Domain.build)
49+
to:
50+
- optionally run NUMA placement and
51+
- invoke [xenguest](VM.build/xenguest) to set up the domain memory.
52+
53+
See the walk-through on [VM.build](VM.build) for more details on this phase.
54+
3. Apply the `cpuid` configuration
55+
4. Store the current domain configuration on disk -- it's important to know
56+
the difference between the configuration you started with and the configuration
57+
you would use after a reboot because some properties (such as maximum memory
58+
and vCPUs) are fixed on create.
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
1+
---
2+
title: Building a VM
3+
description: After VM_create, VM_build builds the core of the domain (vCPUs, memory)
4+
weight: 20
5+
---
6+
7+
Walk-through documents for the `VM_build` phase:
8+
9+
```mermaid
10+
flowchart
11+
subgraph xenopsd VM_build[xenopsd VM_build micro#8209;op]
12+
direction LR
13+
VM_build --> VM.build
14+
VM.build --> VM.build_domain
15+
VM.build_domain --> VM.build_domain_exn
16+
VM.build_domain_exn --> Domain.build
17+
click VM_build "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/lib/xenops_server.ml#L2255-L2271" _blank
18+
click VM.build "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/xenops_server_xen.ml#L2290-L2291" _blank
19+
click VM.build_domain "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/xenops_server_xen.ml#L2250-L2288" _blank
20+
click VM.build_domain_exn "https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/xenops_server_xen.ml#L2024-L2248" _blank
21+
end
22+
```
23+
24+
{{% children description=true %}}
