It would be great if we could get support for BTRFS_IOC_DEFRAG_RANGE in python-btrfs. It would give developers easier access to defragmenting files from their programs or scripts.
By targeting defragmentation at specific portions of files that contain many small extents, instead of defragmenting the entire file, we spend far less I/O for a large gain (similar to btrfs-balance-least-used), and we avoid breaking reflinks in portions of the file that weren't touched.
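To make the request concrete, here is a minimal ctypes sketch of what a wrapper could look like. The struct layout and ioctl number come from linux/btrfs.h; the helper name defrag_range() is hypothetical and not part of python-btrfs today.

```python
import ctypes
import fcntl

# _IOW(0x94, 16, struct btrfs_ioctl_defrag_range_args), from linux/btrfs.h
BTRFS_IOC_DEFRAG_RANGE = 0x40309410

class btrfs_ioctl_defrag_range_args(ctypes.Structure):
    _fields_ = [
        ("start", ctypes.c_uint64),          # first byte to defragment
        ("len", ctypes.c_uint64),            # number of bytes ((u64)-1 means to end of file)
        ("flags", ctypes.c_uint64),          # e.g. BTRFS_DEFRAG_RANGE_START_IO
        ("extent_thresh", ctypes.c_uint32),  # skip extents larger than this size
        ("compress_type", ctypes.c_uint32),  # compression to apply while rewriting
        ("unused", ctypes.c_uint32 * 4),
    ]

def defrag_range(fd, start, length, extent_thresh=0, flags=0, compress_type=0):
    """Ask the kernel to defragment [start, start+length) of an open file.

    Hypothetical helper for illustration; needs an fd on a btrfs filesystem.
    """
    args = btrfs_ioctl_defrag_range_args(
        start=start, len=length, flags=flags,
        extent_thresh=extent_thresh, compress_type=compress_type)
    fcntl.ioctl(fd, BTRFS_IOC_DEFRAG_RANGE, args)
```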
Here is a 32GiB VM image that I tested this approach on. The file's length is mapped over the x-axis in buckets/bars, with roughly 550MiB per bucket. The amplitude represents the count of extents that begin within the bucket's byte range. It is clear from the histogram that a majority of the extents are concentrated in a relatively small part of the file.
Histogram: bytes 0..32GiB in steps of 546MiB
Extents: 102,044
Peak bar represents: 33,572 extents
▁▁▁▁▁▁▁▃▁▁█▃▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
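The bucketing behind a histogram like this can be sketched as follows, assuming we already have a list of extent start offsets (e.g. from FIEMAP or a btrfs tree search; the function name is illustrative):

```python
def extent_histogram(extent_starts, file_size, n_buckets=60):
    """Count how many extents begin in each equal-sized byte bucket."""
    bucket_size = -(-file_size // n_buckets)  # ceiling division
    counts = [0] * n_buckets
    for start in extent_starts:
        # Clamp to the last bucket in case start == file_size
        counts[min(start // bucket_size, n_buckets - 1)] += 1
    return counts

# e.g. three extents in a 6000-byte file, three buckets of 2000 bytes each:
# extent_histogram([0, 100, 5000], 6000, n_buckets=3) -> [2, 0, 1]
```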
With DEFRAG_RANGE we could target the bucket range that has the 33,000 extents, resulting in a substantial reduction in extent count with only a portion of the file's data rewritten. (Note that the amplitude scale isn't equal between the two histograms.)
Histogram: bytes 0..32GiB in steps of 546MiB
Extents: 69,639
Peak bar represents: 10,083 extents
▂▃▃▁▁▁▃▇▂▃▁█▇▄▂▁▃▂▁▁▃▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
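Turning the per-bucket counts into byte ranges to pass to DEFRAG_RANGE could look like this hypothetical helper, which merges adjacent buckets above a chosen extent-count threshold:

```python
def hot_ranges(counts, bucket_size, threshold):
    """Return (start, length) byte ranges covering runs of buckets
    whose extent count exceeds threshold."""
    ranges = []
    run_start = None
    for i, count in enumerate(counts):
        if count > threshold:
            if run_start is None:
                run_start = i  # a hot run begins here
        elif run_start is not None:
            ranges.append((run_start * bucket_size, (i - run_start) * bucket_size))
            run_start = None
    if run_start is not None:  # run extends to the end of the file
        ranges.append((run_start * bucket_size, (len(counts) - run_start) * bucket_size))
    return ranges

# Buckets 1-2 and bucket 4 exceed the threshold of 2:
# hot_ranges([1, 5, 7, 0, 9], 100, 2) -> [(100, 200), (400, 100)]
```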
Here is a 2.1GiB archive that was written out in one go, so extents are much larger and evenly spread out.
Histogram: bytes 0..2.1GiB in steps of 35.7MiB
Extents: 20
Peak bar represents: 3 extents
▃▁▁▃▁▁▁▃▁▁▃▃▁▁▃▁▁▁▃▁▁▁▃▁▁▃▁▁▁▃▁▁▃▁▁▁▃▁▁▃▁▁▁▃▁▁▁▃▁█▁▁▁▃▁▁▁▃▁▁
Expanding on this idea, it would be possible to do a reflink-aware defragmentation, where we check extents for their shared status and only issue defrag on ranges containing unshared extents.
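The selection step of that idea can be sketched as below, assuming the extents have already been annotated with their shared status (FIEMAP reports this via the FIEMAP_EXTENT_SHARED flag):

```python
def unshared_ranges(extents):
    """extents: iterable of (start, length, shared) tuples, sorted by start.

    Return merged (start, length) ranges that contain no shared extents,
    so defragmenting them cannot break any reflinks.
    """
    ranges = []
    for start, length, shared in extents:
        if shared:
            continue  # never defrag over a shared extent
        if ranges and ranges[-1][0] + ranges[-1][1] == start:
            # Contiguous with the previous unshared range: extend it
            prev_start, prev_len = ranges[-1]
            ranges[-1] = (prev_start, prev_len + length)
        else:
            ranges.append((start, length))
    return ranges
```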
How does range defrag differ from making a (physical, non-reflink) copy of the range and then deduping the fragmented file against the new file? There's no guarantee that the new extents will be written in fewer fragments, but defrag doesn't offer any guarantees either. There's an extra copy from the kernel to userspace and back, but that should be cheap compared to getting the data from a pathologically fragmented file.
Range defrag is subject to some internal heuristic checks which prevent the defrag. The kernel call is intended to implement btrfs fi defrag and has some anti-features for other use cases.
Range defrag creates the copy and replaces the original extent refs in a single atomic operation. Copy + dedupe can be broken into three steps: copy, check the new layout, and, if the layout is OK, dedupe; if the copy turned out too fragmented, delete it and keep the better original extents. (If that happens often, the filesystem may be too full to avoid fragmentation.)
With copy+dedupe, plus some digging into the extent tree with LOGICAL_INO, the copy can be reused to replace all of the references to the original data. Range defrag doesn't provide that capability in any form, which is why range defrag unshares everything it deigns to copy.
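To make the dedupe half of that alternative concrete: after writing a fresh copy of a range, FIDEDUPERANGE atomically replaces a file's extents with references to the copy, but only if the byte contents still match. The struct layout and ioctl number below come from linux/fs.h; the helper is an untested sketch, not an existing python-btrfs API.

```python
import ctypes
import fcntl

# _IOWR(0x94, 54, struct file_dedupe_range), from linux/fs.h
FIDEDUPERANGE = 0xC0189436

class file_dedupe_range_info(ctypes.Structure):
    _fields_ = [
        ("dest_fd", ctypes.c_int64),
        ("dest_offset", ctypes.c_uint64),
        ("bytes_deduped", ctypes.c_uint64),  # filled in by the kernel
        ("status", ctypes.c_int32),          # FILE_DEDUPE_RANGE_SAME on success
        ("reserved", ctypes.c_uint32),
    ]

class file_dedupe_range(ctypes.Structure):
    _fields_ = [
        ("src_offset", ctypes.c_uint64),
        ("src_length", ctypes.c_uint64),
        ("dest_count", ctypes.c_uint16),
        ("reserved1", ctypes.c_uint16),
        ("reserved2", ctypes.c_uint32),
        ("info", file_dedupe_range_info * 1),  # one destination range
    ]

def dedupe_one(src_fd, src_offset, length, dest_fd, dest_offset):
    """Dedupe one range of src_fd against dest_fd; returns bytes deduped.

    Hypothetical helper; both fds must be on the same (btrfs) filesystem.
    """
    args = file_dedupe_range(src_offset=src_offset, src_length=length, dest_count=1)
    args.info[0].dest_fd = dest_fd
    args.info[0].dest_offset = dest_offset
    fcntl.ioctl(src_fd, FIDEDUPERANGE, args)
    return args.info[0].bytes_deduped
```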