
Commit e8a34e9

(docs) VM.migrate.md: Rephrase and simplify, improve readability (#6307)
(docs) Update the walkthrough `VM.migrate`:

- Include the live migration flowchart for reference and as an overview.
- Fix the chapter structure and chapter headings: improved table of contents (1st button in the top bar).
- Add chapter links to the mentioned atomic operations.
- Convert a long sentence describing a list of parameters into a simple list of parameters.
- Clarify ambiguities, e.g. "if we are already at the right place" -> "if the command is already at the destination host".
- Remove the use of the word "will" when things are already happening or are done just at that point.
- Remove sentences that just filled space by mentioning what the next chapter headline says.
- Improve the chapter on the final step with more references and a link to a helpful explanation.
- Update the links to the currently used URLs for a better understanding.
2 parents ec3b62e + 6d0fef3 commit e8a34e9


5 files changed

+115
-99
lines changed


doc/content/squeezed/architecture/index.md

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,9 @@
11
+++
2-
title = "Architecture"
2+
title = "Squeezed Architecture"
3+
linkTitle = "Architecture"
34
+++
45

5-
Squeezed is responsible for managing the memory on a single host. Squeezed
6+
Squeezed is the XAPI Toolstack’s host memory ballooning daemon. It
67
"balances" memory between VMs according to a policy written to Xenstore.
78

89
The following diagram shows the internals of Squeezed:

doc/content/xenopsd/architecture/_index.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
+++
2-
title = "Architecture"
2+
title = "Xenopsd Architecture"
3+
linkTitle = "Architecture"
34
+++
45

56
Xenopsd instances run on a host and manage VMs on behalf of clients. This

doc/content/xenopsd/walkthroughs/VM.migrate.md

Lines changed: 96 additions & 82 deletions
Original file line numberDiff line numberDiff line change
@@ -3,38 +3,42 @@ title: 'Walkthrough: Migrating a VM'
33
linktitle: 'Migrating a VM'
44
description: Walkthrough of migrating a VM from one host to another.
55
weight: 50
6+
mermaid:
7+
force: true
68
---
9+
At the end of this walkthrough, a sequence diagram of the overall process is included.
710

8-
A XenAPI client wishes to migrate a VM from one host to another within
9-
the same pool.
11+
## Invocation
1012

11-
The client will issue a command to migrate the VM and it will be dispatched
13+
The command to migrate the VM is dispatched
1214
by the autogenerated `dispatch_call` function from **xapi/server.ml**. For
1315
more information about the generated functions, have a look at the
1416
[XAPI IDL model](https://github.com/xapi-project/xen-api/tree/master/ocaml/idl/ocaml_backend).
1517

16-
The command will trigger the operation
18+
The command triggers the operation
1719
[VM_migrate](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/lib/xenops_server.ml#L2572)
18-
that has low level operations performed by the backend. These atomics operations
19-
that we will describe in the documentation are:
20-
21-
- VM.restore
22-
- VM.rename
23-
- VBD.set_active
24-
- VBD.plug
25-
- VIF.set_active
26-
- VGPU.set_active
27-
- VM.create_device_model
28-
- PCI.plug
29-
- VM.set_domain_action_request
30-
31-
The command has several parameters such as: Should it be started asynchronously,
32-
should it be forwarded to another host, how arguments should be marshalled and
33-
so on. A new thread is created by [xapi/server_helpers.ml](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xapi/server_helpers.ml#L55)
34-
to handle the command asynchronously. At this point the helper also check if
20+
that uses many low-level atomic operations. These are:
21+
22+
- [VM.restore](#VM-restore)
23+
- [VM.rename](#VM-rename)
24+
- [VBD.set_active](#restoring-devices)
25+
- [VBD.plug](#restoring-devices)
26+
- [VIF.set_active](#restoring-devices)
27+
- [VGPU.set_active](#restoring-devices)
28+
- [VM.create_device_model](#creating-the-device-model)
29+
- [PCI.plug](#pci-plug)
30+
31+
The migrate command has several parameters such as:
32+
33+
- Should it be started asynchronously,
34+
- Should it be forwarded to another host,
35+
- How arguments should be marshalled, and so on.
36+
37+
A new thread is created by [xapi/server_helpers.ml](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xapi/server_helpers.ml#L55)
38+
to handle the command asynchronously. The helper thread checks if
3539
the command should be passed to the [message forwarding](https://github.com/xapi-project/xen-api/blob/master/ocaml/xapi/message_forwarding.ml)
36-
layer in order to be executed on another host (the destination) or locally if
37-
we are already at the right place.
40+
layer in order to be executed on another host (the destination) or locally (if
41+
it is already at the destination host).
3842

3943
It will finally reach [xapi/api_server.ml](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xapi/api_server.ml#L242) that will take the action
4044
of posting a command to the message broker [message switch](https://github.com/xapi-project/xen-api/tree/master/ocaml/message-switch).
@@ -43,34 +47,38 @@ XAPI daemons. In the case of the migration this message sends by **XAPI** will b
4347
consumed by the [xenopsd](https://github.com/xapi-project/xen-api/tree/master/ocaml/xenopsd)
4448
daemon that will do the job of migrating the VM.
4549

46-
# The migration of the VM
50+
## Overview
4751

4852
The migration is an asynchronous task and a thread is created to handle this task.
49-
The tasks's reference is returned to the client, which can then check
53+
The task reference is returned to the client, which can then check
5054
its status until completion.
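
The client-side polling described above can be sketched as follows. This is a minimal illustration, not XAPI client code: `wait_for_task` and the `get_status` callable are hypothetical names, and the status strings are assumed to follow the XenAPI task states (`pending`, `success`, `failure`, ...).

```python
import time
from typing import Callable


def wait_for_task(get_status: Callable[[], str],
                  interval: float = 0.5,
                  timeout: float = 60.0) -> str:
    """Poll a task until it leaves the 'pending' state or the timeout expires.

    `get_status` is a hypothetical callable that returns the current task
    status as a string; a real client would query the task reference here.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status != "pending":
            # The task finished (successfully or not): report the final state.
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("task still pending after %.1f s" % timeout)
        time.sleep(interval)


# Usage with a fake task that completes on the third poll.
fake_statuses = iter(["pending", "pending", "success"])
print(wait_for_task(lambda: next(fake_statuses), interval=0.0))
```

Polling with a deadline keeps the client from blocking forever if the migration task never completes.
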
5155

52-
As we see in the introduction the [xenopsd](https://github.com/xapi-project/xen-api/tree/master/ocaml/xenopsd)
53-
daemon will pop the operation
56+
As shown in the introduction, [xenopsd](https://github.com/xapi-project/xen-api/tree/master/ocaml/xenopsd)
57+
fetches the
5458
[VM_migrate](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/lib/xenops_server.ml#L2572)
55-
from the message broker.
59+
operation from the message broker.
5660

57-
Only one backend is know available that interacts with libxc, libxenguest
58-
and xenstore. It is the [xc backend](https://github.com/xapi-project/xen-api/tree/master/ocaml/xenopsd/xc).
61+
All tasks specific to [libxenctrl](../../lib/xenctrl),
62+
[xenguest](VM.build/xenguest) and [Xenstore](https://wiki.xenproject.org/wiki/XenStore)
63+
are handled by the xenopsd
64+
[xc backend](https://github.com/xapi-project/xen-api/tree/master/ocaml/xenopsd/xc).
5965

6066
The entities that need to be migrated are: *VDI*, *VIF*, *VGPU* and *PCI* components.
6167

62-
During the migration process the destination domain will be built with the same
63-
uuid than the original VM but the last part of the UUID will be
68+
During the migration process, the destination domain will be built with the same
69+
UUID as the original VM, except that the last part of the UUID will be
6470
`XXXXXXXX-XXXX-XXXX-XXXX-000000000001`. The original domain will be removed using
6571
`XXXXXXXX-XXXX-XXXX-XXXX-000000000000`.
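
The temporary UUID scheme described above can be illustrated with a short sketch. The helper name `temporary_uuid` and the sample UUID are made up for this example; xenopsd itself implements this in OCaml.

```python
def temporary_uuid(vm_uuid: str, suffix: str) -> str:
    """Replace the last group of a canonical 8-4-4-4-12 UUID with `suffix`."""
    groups = vm_uuid.split("-")
    if len(groups) != 5 or len(suffix) != 12:
        raise ValueError("expected a canonical UUID and a 12-character suffix")
    return "-".join(groups[:4] + [suffix])


# Hypothetical VM UUID, used only for illustration.
original = "6b9d2c1e-3f4a-4d5b-8c7e-1a2b3c4d5e6f"

# Temporary UUID under which the destination domain is built.
print(temporary_uuid(original, "000000000001"))
# Temporary UUID under which the original domain is removed.
print(temporary_uuid(original, "000000000000"))
```

Because only the last group changes, both temporary UUIDs remain visibly related to the original VM.
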
6672

67-
There are some points called *hooks* at which `xenopsd` can execute some script.
68-
Before starting a migration a command is send to the original domain to execute
69-
a pre migrate script if it exists.
73+
## Preparing VM migration
7074

71-
Before starting the migration a command is sent to Qemu using the Qemu Machine Protocol (QMP)
75+
At specific places, `xenopsd` can execute *hooks* to run scripts.
76+
In case a pre-migrate script is in place, a command to run this script
77+
is sent to the original domain.
78+
79+
Likewise, a command is sent to Qemu using the Qemu Machine Protocol (QMP)
7280
to check that the domain can be suspended (see [xenopsd/xc/device_common.ml](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/device_common.ml)).
73-
After checking with Qemu that the VM is suspendable we can start the migration.
81+
After checking with Qemu that the VM can be suspended, the migration can begin.
7482

7583
## Importing metadata
7684

@@ -82,38 +90,34 @@ Once imported, it will give us a reference id and will allow building the new do
8290
on the destination using the temporary VM uuid `XXXXXXXX-XXXX-XXXX-XXXX-000000000001`
8391
where `XXX...` is the reference id of the original VM.
8492

85-
## Setting memory
93+
## Memory setup
8694

87-
One of the first thing to do is to set up the memory. The backend will check that there
88-
is no ballooning operation in progress. At this point the migration can fail if a
89-
ballooning operation is in progress and takes too much time.
95+
One of the first steps is the setup of the VM's memory: the backend checks that there
96+
is no ballooning operation in progress. If one is still in progress and takes too long, the migration can fail.
9097

9198
Once memory has been checked, the daemon will get the state of the VM (running, halted, ...) and
92-
information about the VM is retrieved by the backend like the maximum memory the domain
93-
can consume but also information about quotas for example.
94-
The backend retrieves this information from the Xenstore.
99+
the backend retrieves the domain's platform data (memory, vCPUs, etc.) from Xenstore.
95100

96101
Once this is complete, we can restore VIF and create the domain.
97102

98103
The synchronisation of the memory is the first point of synchronisation and everything
99104
is ready for VM migration.
100105

101-
## VM Migration
106+
## Destination VM setup
102107

103108
After receiving memory we can set up the destination domain. If we have a vGPU we need to kick
104-
off its migration process. We will need to wait the acknowledge that indicates that the entry
105-
for the GPU has been well initialized. before starting the main VM migration.
109+
off its migration process. We will need to wait for the acknowledgement that the
110+
GPU entry has been successfully initialized before starting the main VM migration.
106111

107-
Their is a mechanism of handshake for synchronizing between the source and the
108-
destination. Using the handshake protocol the receiver inform the sender of the
109-
request that everything has been setup and ready to save/restore.
112+
The receiver informs the sender using a handshake protocol
113+
that everything is set up and ready for save/restore.
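
The handshake can be pictured with a small sketch in which the receiver signals readiness and the sender waits for that signal before proceeding. This is only an analogy using an in-process queue and a thread; the message name `"ready"` and all function names are assumptions, and the real protocol runs between two hosts.

```python
import queue
import threading


def receiver(to_sender: queue.Queue, log: list) -> None:
    """Set up the destination side, then tell the sender it may proceed."""
    log.append("receiver: destination domain set up")
    to_sender.put("ready")


def sender(from_receiver: queue.Queue, log: list) -> None:
    """Block until the receiver reports readiness, then continue."""
    message = from_receiver.get(timeout=5)
    if message != "ready":
        raise RuntimeError("unexpected handshake message: %r" % message)
    log.append("sender: starting save/restore")


def run_handshake() -> list:
    channel: queue.Queue = queue.Queue()
    log: list = []
    worker = threading.Thread(target=receiver, args=(channel, log))
    worker.start()
    sender(channel, log)
    worker.join()
    return log


for line in run_handshake():
    print(line)
```

The sender cannot log its step before the receiver has posted its message, which mirrors the ordering guarantee the handshake provides.
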
110114

111-
### VM restore
115+
## Destination VM restore
112116

113117
VM restore is a low level atomic operation [VM.restore](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L2684).
114118
This operation is represented by a function call to [backend](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/domain.ml#L1540).
115119
It uses **Xenguest**, a low-level utility from XAPI toolstack, to interact with the Xen hypervisor
116-
and libxc for sending a request of migration to the **emu-manager**.
120+
and `libxc` for sending a migration request to the **emu-manager**.
117121

118122
After sending the request results coming from **emu-manager** are collected
119123
by the main thread. It blocks until results are received.
@@ -123,16 +127,14 @@ transitions for the devices and handling the message passing for the VM as
123127
it's moved between hosts. This includes making sure that the state of the
124128
VM's virtual devices, like disks or network interfaces, is correctly moved over.
125129

126-
### VM renaming
130+
## Destination VM rename
127131

128-
Once all operations are done we can rename the VM on the target from its temporary
129-
name to its real UUID. This operation is another low level atomic one
132+
Once all operations are done, `xenopsd` renames the target VM from its temporary
133+
name to its real UUID. This operation is a low-level atomic
130134
[VM.rename](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L1667)
131-
that will take care of updating the xenstore on the destination.
132-
133-
The next step is the restauration of devices and unpause the domain.
135+
which takes care of updating the Xenstore on the destination host.
134136

135-
### Restoring remaining devices
137+
## Restoring devices
136138

137139
Restoring devices starts by activating VBD using the low level atomic operation
138140
[VBD.set_active](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3674). It is an update of Xenstore. VBDs that are read-write must
@@ -143,39 +145,51 @@ is called. VDI are attached and activate.
143145
Next devices are VIFs that are set as active [VIF.set_active](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L4296) and plug [VIF.plug](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L4394).
144146
If there are VGPUs we will set them as active now using the atomic [VGPU.set_active](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3490).
145147

146-
We are almost done. The next step is to create the device model
147-
148-
#### create device model
148+
### Creating the device model
149149

150-
Create device model is done by using the atomic operation [VM.create_device_model](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L2375). This
151-
will configure **qemu-dm** and started. This allows to manage PCI devices.
150+
[create_device_model](https://github.com/xapi-project/xen-api/blob/ec3b62ee/ocaml/xenopsd/xc/xenops_server_xen.ml#L2293-L2349)
151+
configures **qemu-dm** and starts it. This makes it possible to manage PCI devices.
152152

153-
#### PCI plug
153+
### PCI plug
154154

155155
[PCI.plug](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3399)
156156
is executed by the backend. It plugs a PCI device and advertises it to QEMU if this option is set. It is
157-
the case for NVIDIA SR-IOV vGPUS.
157+
the case for NVIDIA SR-IOV vGPUs.
158158

159-
At this point devices have been restored. The new domain is considered survivable. We can
160-
unpause the domain and performs last actions
159+
## Unpause
161160

162-
### Unpause and done
161+
The libxenctrl call
162+
[xc_domain_unpause()](https://github.com/xen-project/xen/blob/414dde3/tools/libs/ctrl/xc_domain.c#L76)
163+
unpauses the domain, and it starts running.
163164

164-
Unpause is done by managing the state of the domain using bindings to [xenctrl](https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libs/ctrl/xc_domain.c;h=f2d9d14b4d9f24553fa766c5dcb289f88d684bb0;hb=HEAD#l76).
165-
Once hypervisor has unpaused the domain some actions can be requested using [VM.set_domain_action_request](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3172).
166-
It is a path in xenstore. By default no action is done but a reboot can be for example
167-
initiated.
165+
## Cleanup
168166

169-
Previously we spoke about some points called *hooks* at which `xenopsd` can execute some script. There
170-
is also a hook to run a post migrate script. After the execution of the script if there is one
171-
the migration is almost done. The last step is a handshake to seal the success of the migration
167+
1. [VM_set_domain_action_request](https://github.com/xapi-project/xen-api/blob/ec3b62ee/ocaml/xenopsd/lib/xenops_server.ml#L3004)
168+
marks the domain as alive: In case `xenopsd` restarts, it no longer reboots the VM.
169+
See the chapter on [marking domains as alive](VM.start#11-mark-the-domain-as-alive)
170+
for more information.
171+
172+
2. If a post-migrate script is in place, it is executed by the
173+
[Xenops_hooks.VM_post_migrate](https://github.com/xapi-project/xen-api/blob/ec3b62ee/ocaml/xenopsd/lib/xenops_server.ml#L3005-L3009)
174+
hook.
175+
176+
3. The final step is a handshake to seal the success of the migration
172177
and the old VM can now be cleaned up.
173178

174-
# Links
179+
[Synchronisation point 4](https://github.com/xapi-project/xen-api/blob/ec3b62ee/ocaml/xenopsd/lib/xenops_server.ml#L3014)
180+
has been reached, the migration is complete.
181+
182+
## Live migration flowchart
183+
184+
This flowchart gives a visual representation of the VM migration workflow:
185+
186+
{{% include live-migration %}}
187+
188+
## References
175189

176-
Some links are old but even if many changes occurred, they are relevant for a global understanding
177-
of the XAPI toolstack.
190+
These pages might help for a better understanding of the XAPI toolstack:
178191

179-
- [XAPI architecture](https://xapi-project.github.io/xapi/architecture.html)
180-
- [XAPI dispatcher](https://wiki.xenproject.org/wiki/XAPI_Dispatch)
181-
- [Xenopsd architecture](https://xapi-project.github.io/xenopsd/architecture.html)
192+
- See the [XAPI architecture](../../xapi/_index) for the overall architecture of Xapi
193+
- See the [XAPI dispatcher](https://wiki.xenproject.org/wiki/XAPI_Dispatch) for service dispatch and message forwarding
194+
- See the [Xenopsd architecture](../architecture/_index) for the overall architecture of Xenopsd
195+
- See [How Xen suspend and resume works](https://mirage.io/docs/xen-suspend) for very similar operations in more detail.

doc/content/xenopsd/walkthroughs/VM.start.md

Lines changed: 9 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -135,17 +135,15 @@ When the Task has completed successfully, then calls to *.stat will show:
135135
- a valid start time
136136
- valid "targets" for memory and vCPU
137137

138-
Note: before a Task completes, calls to *.stat will show partial updates e.g.
139-
the power state may be Paused but none of the disks may have become plugged.
138+
Note: before a Task completes, calls to *.stat will show partial updates. E.g.
139+
the power state may be paused, but no disk may have been plugged.
140140
UI clients must choose whether they are happy displaying this in-between state
141141
or whether they wish to hide it and pretend the whole operation has happened
142-
transactionally. If a particular client wishes to perform side-effects in
143-
response to Xenopsd state changes -- for example to clean up an external resource
144-
when a VIF becomes unplugged -- then it must be very careful to avoid responding
145-
to these in-between states. Generally it is safest to passively report these
146-
values without driving things directly from them. Think of them as status lights
147-
on the front panel of a PC: fine to look at but it's not a good idea to wire
148-
them up to actuators which actually do things.
142+
transactionally. In particular, when a client wishes to perform side effects in
143+
response to `xenopsd` state changes (for example, to clean up an external resource
144+
when a VIF becomes unplugged), it must be very careful to avoid responding
145+
to these in-between states. Generally, it is safest to passively report these
146+
values without driving things directly from them.
149147

150148
Note: the Xenopsd implementation guarantees that, if it is restarted at any point
151149
during the start operation, on restart the VM state shall be "fixed" by either
@@ -304,7 +302,7 @@ calls bracket plug/unplug. If the "active" flag was set before the unplug
304302
attempt then as soon as the frontend/backend connection is removed clients
305303
would see the VBD as completely dissociated from the VM -- this would be misleading
306304
because Xenopsd will not have had time to use the storage API to release locks
307-
on the disks. By doing all the cleanup before setting "active" to false, clients
305+
on the disks. By cleaning up before setting "active" to false, clients
308306
can be assured that the disks are now free to be reassigned.
309307

310308
## 5. handle non-persistent disks
@@ -370,7 +368,7 @@ to be the order the nodes were created so this means that (i) xenstored must
370368
continue to store directories as ordered lists rather than maps (which would
371369
be more efficient); and (ii) Xenopsd must make sure to plug the vifs in
372370
the same order. Note that relying on ethX device numbering has always been a
373-
bad idea but is still common. I bet if you change this lots of tests will
371+
bad idea but is still common. I bet if you change this, many tests will
374372
suddenly start to fail!
375373

376374
The function

doc/content/xenopsd/walkthroughs/live-migration.md

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -2,9 +2,12 @@
22
title = "Live Migration Sequence Diagram"
33
linkTitle = "Live Migration"
44
description = "Sequence diagram of the process of Live Migration."
5+
# Note: This page is included by VM.migrate.md to provide a complete overview
6+
# of the most important parts of live migration. Do not add text as that would
7+
# break the mermaid diagram inclusion.
58
+++
69

7-
{{<mermaid align="left">}}
10+
```mermaid
811
sequenceDiagram
912
autonumber
1013
participant tx as sender
@@ -44,5 +47,4 @@ deactivate rx1
4447
4548
tx->>tx: VM_shutdown<br/>VM_remove
4649
deactivate tx
47-
48-
{{< /mermaid >}}
50+
```
