|
| 1 | +# Backing Guest Memory by Huge Pages |
| 2 | + |
| 3 | +> \[!WARNING\] |
| 4 | +> |
| 5 | +> Support is currently in **developer preview**. See |
| 6 | +> [this section](RELEASE_POLICY.md#developer-preview-features) for more info. |
| 7 | + |
| 8 | +Firecracker supports backing the guest memory of a VM by 2MB hugetlbfs pages. |
| 9 | +This can be enabled by setting the `huge_pages` field of `PUT` or `PATCH` |
| 10 | +requests to the `/machine-config` endpoint to `2M`. |
| 11 | + |
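As a rough illustration of the request described above, the sketch below builds a `/machine-config` payload with `huge_pages` set to `2M` and sends it over Firecracker's API socket using only the Python standard library. The socket path, the helper names, and the `vcpu_count` / `mem_size_mib` values are assumptions chosen for illustration; adapt them to your setup.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(self.socket_path)
        self.sock = sock


def machine_config_body(vcpu_count, mem_size_mib):
    """Build a /machine-config payload that requests 2M huge pages."""
    return {
        "vcpu_count": vcpu_count,
        "mem_size_mib": mem_size_mib,
        "huge_pages": "2M",
    }


def put_machine_config(socket_path, body):
    """PUT the payload to the API socket and return the HTTP status code."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request(
            "PUT",
            "/machine-config",
            body=json.dumps(body),
            headers={"Content-Type": "application/json"},
        )
        response = conn.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return response.status
    finally:
        conn.close()
```

With a Firecracker process listening on a (hypothetical) socket at `/tmp/firecracker.socket`, this could be called as `put_machine_config("/tmp/firecracker.socket", machine_config_body(2, 1024))` before the microVM is started.
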
| 12 | +Backing guest memory by huge pages can bring performance improvements for |
| 13 | +specific workloads, due to less TLB contention and less overhead during |
| 14 | +virtual->physical address resolution. It can also help reduce the number of |
| 15 | +KVM exits required to rebuild extended page tables after snapshot restore, as |
| 16 | +well as improve boot times (by up to 50% as measured by Firecracker's |
| 17 | +[boot time performance tests](../tests/integration_tests/performance/test_boottime.py)). |
| 18 | + |
| 19 | +Using hugetlbfs requires the host running Firecracker to have a pre-allocated |
| 20 | +pool of 2M pages. Should this pool be too small, Firecracker may behave |
| 21 | +erratically or receive the `SIGBUS` signal. This is because Firecracker uses the |
| 22 | +`MAP_NORESERVE` flag when mapping guest memory. This flag means the kernel will |
| 23 | +not try to reserve sufficient hugetlbfs pages at the time of the `mmap` call, |
| 24 | +and instead claims them from the pool on demand. For details on how to manage this |
| 25 | +pool, please refer to the [Linux Documentation][hugetlbfs_docs]. |
| 26 | + |
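Because pages are claimed on demand rather than reserved at `mmap` time, it can help to compare the pool against the guest size before starting a microVM. The sketch below is one possible pre-flight check, not something Firecracker itself performs: it reads the kernel's sysfs counters for the 2048 kB pool and subtracts pages already reserved by other mappings. The helper names are hypothetical, and the result is only a snapshot in time, since other processes can drain the pool afterwards.

```python
from pathlib import Path

HUGEPAGE_SIZE_KIB = 2048
# Standard sysfs location of the per-size hugetlbfs pool counters on Linux.
SYSFS_POOL_DIR = Path("/sys/kernel/mm/hugepages/hugepages-2048kB")


def pages_needed(mem_size_mib):
    """Number of 2M pages needed to back a guest of the given size (rounded up)."""
    mem_kib = mem_size_mib * 1024
    return -(-mem_kib // HUGEPAGE_SIZE_KIB)  # ceiling division


def available_pages(pool_dir=SYSFS_POOL_DIR):
    """Free pages minus pages already promised to other mappings."""
    pool_dir = Path(pool_dir)
    free = int((pool_dir / "free_hugepages").read_text().strip())
    reserved = int((pool_dir / "resv_hugepages").read_text().strip())
    return free - reserved


def pool_can_back_guest(mem_size_mib, pool_dir=SYSFS_POOL_DIR):
    """True if the pool currently holds enough unreserved 2M pages."""
    return available_pages(pool_dir) >= pages_needed(mem_size_mib)
```

For example, a guest with 1024 MiB of memory needs 512 pages of 2M each, so `pool_can_back_guest(1024)` returns `False` on a host whose pool has fewer than 512 unreserved pages.
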
| 27 | +## Huge Pages and Snapshotting |
| 28 | + |
| 29 | +Restoring a Firecracker snapshot of a microVM backed by huge pages will also use |
| 30 | +huge pages to back the restored guest. There is no option to flip between |
| 31 | +regular 4K pages and huge pages at restore time. Furthermore, snapshots of |
| 32 | +microVMs backed with huge pages can only be restored via UFFD. Lastly, note that |
| 33 | +even for guests backed by huge pages, differential snapshots will always track |
| 34 | +write accesses to guest memory at 4K granularity. |
| 35 | + |
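To make the UFFD requirement concrete, the sketch below builds a request body for the snapshot load endpoint that selects the `Uffd` memory backend. The field names follow the snapshot API as I understand it (`snapshot_path`, `mem_backend`, `resume_vm`); treat them as assumptions and check them against the API specification of your Firecracker version. The paths and the helper name are hypothetical. The constant at the end spells out the granularity point: dirty tracking at 4K means one 2M huge page spans 512 tracked units.

```python
HUGE_PAGE_BYTES = 2 * 1024 * 1024
DIRTY_TRACKING_BYTES = 4 * 1024

# Number of 4K dirty-tracking units covered by a single 2M huge page.
DIRTY_UNITS_PER_HUGE_PAGE = HUGE_PAGE_BYTES // DIRTY_TRACKING_BYTES


def snapshot_load_body(snapshot_path, uffd_socket_path, resume_vm=False):
    """Build a snapshot load payload that restores guest memory via UFFD.

    uffd_socket_path is the Unix socket of a userfaultfd handler process
    that serves page faults for the restored guest (assumed to exist).
    """
    return {
        "snapshot_path": snapshot_path,
        "mem_backend": {
            "backend_type": "Uffd",
            "backend_path": uffd_socket_path,
        },
        "resume_vm": resume_vm,
    }
```

A body such as `snapshot_load_body("/tmp/vm.snap", "/tmp/uffd.sock")` would then be sent as JSON in the snapshot load request.
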
| 36 | +## Known Limitations |
| 37 | + |
| 38 | +Currently, hugetlbfs support is mutually exclusive with the following |
| 39 | +Firecracker features: |
| 40 | + |
| 41 | +- Memory Ballooning via the [Balloon Device](./ballooning.md) |
| 42 | +- Initrd |
| 43 | + |
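The restrictions listed above can be checked before a configuration is handed to Firecracker. The sketch below is a hypothetical helper, not part of Firecracker; it assumes a JSON-style configuration with `machine-config`, `balloon`, and `boot-source` sections and an `initrd_path` key, and it only mirrors the two conflicts named in this section.

```python
def hugetlbfs_conflicts(config):
    """Return the configured features that conflict with 2M huge pages.

    Returns an empty list when huge pages are not enabled or when no
    conflicting feature is configured.
    """
    machine_config = config.get("machine-config", {})
    if machine_config.get("huge_pages", "None") != "2M":
        return []

    conflicts = []
    # A balloon section means the balloon device would be attached.
    if config.get("balloon") is not None:
        conflicts.append("balloon device")
    # An initrd path in the boot source means an initrd would be loaded.
    if config.get("boot-source", {}).get("initrd_path"):
        conflicts.append("initrd")
    return conflicts
```

A caller could refuse to start the microVM whenever the returned list is non-empty and report the conflicting features to the user.
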
| 44 | +## FAQ |
| 45 | + |
| 46 | +### Why does Firecracker not offer a transparent huge pages (THP) setting? |
| 47 | + |
| 48 | +Firecracker's guest memory is memfd-based. Linux (as of 6.1) does not offer a |
| 49 | +way to dynamically enable THP for such memory regions. Additionally, UFFD does |
| 50 | +not integrate with THP (no transparent huge pages will be allocated during |
| 51 | +userfaulting). Please refer to the [Linux Documentation][thp_docs] for more |
| 52 | +information. |
| 53 | + |
| 54 | +[hugetlbfs_docs]: https://docs.kernel.org/admin-guide/mm/hugetlbpage.html |
| 55 | +[thp_docs]: https://www.kernel.org/doc/html/next/admin-guide/mm/transhuge.html#hugepages-in-tmpfs-shmem |