
Commit f2ab5b8

amotin authored and ixhamza committed
Fix metaslab group fragmentation math (openzfs#17037)
Since we are calculating free space fragmentation, we should weight metaslabs by the amount of their free space, not by their full size. Fragmentation of full metaslabs may not matter in the presence of empty ones. The old algorithm did not differentiate metaslabs having only one free 4KB block from metaslabs having 50% of their space free in 4KB blocks, and so reported higher fragmentation.

While there, move the metaslab_group_alloc_update() call after setting mg_fragmentation, otherwise the effect may be delayed by one TXG.

Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Tony Nguyen <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
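
The difference between the two averages can be illustrated with a small standalone sketch. This is not the kernel code: `toy_ms_t`, `frag_simple()` and `frag_weighted()` are hypothetical names, `FRAG_INVALID` and `MINBLOCK` are stand-ins for `ZFS_FRAG_INVALID` and `SPA_MINBLOCKSIZE` (assumed here to be `UINT64_MAX` and 512), and the per-metaslab fragmentation values are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	FRAG_INVALID	UINT64_MAX	/* stand-in for ZFS_FRAG_INVALID */
#define	MINBLOCK	512ULL		/* stand-in for SPA_MINBLOCKSIZE */

/* Hypothetical, simplified metaslab: only the fields the math needs. */
typedef struct {
	uint64_t size;		/* total bytes in the metaslab */
	uint64_t allocated;	/* bytes currently allocated */
	uint64_t frag;		/* fragmentation metric, 0..100 */
} toy_ms_t;

/* Old behaviour: every metaslab counts equally. */
static uint64_t
frag_simple(const toy_ms_t *ms, size_t n)
{
	uint64_t sum = 0;

	if (n == 0)
		return (FRAG_INVALID);
	for (size_t i = 0; i < n; i++)
		sum += ms[i].frag;
	return (sum / n);
}

/* New behaviour: each metaslab is weighted by its free space. */
static uint64_t
frag_weighted(const toy_ms_t *ms, size_t n)
{
	uint64_t sum = 0, total_free = 0;

	for (size_t i = 0; i < n; i++) {
		/* Count free space in MINBLOCK units to keep products small. */
		uint64_t free_units =
		    (ms[i].size - ms[i].allocated) / MINBLOCK;

		total_free += free_units;
		sum += ms[i].frag * free_units;
	}
	if (total_free == 0)
		return (FRAG_INVALID);	/* nothing free, nothing to weigh */

	sum /= total_free;
	assert(sum <= 100);
	return (sum);
}
```

With one nearly full metaslab (a single free 4KB block, fragmentation 90) next to an empty one, the simple average is 45 while the weighted one rounds down to 0. Replacing the nearly full metaslab with one that is 50% free at the same fragmentation leaves the simple average at 45 but raises the weighted result to 30, which is the distinction the commit message describes.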
1 parent 1bdce04 commit f2ab5b8


1 file changed, +14 -10 lines changed

module/zfs/metaslab.c

Lines changed: 14 additions & 10 deletions
@@ -1176,9 +1176,8 @@ metaslab_group_sort(metaslab_group_t *mg, metaslab_t *msp, uint64_t weight)
 }
 
 /*
- * Calculate the fragmentation for a given metaslab group. We can use
- * a simple average here since all metaslabs within the group must have
- * the same size. The return value will be a value between 0 and 100
+ * Calculate the fragmentation for a given metaslab group. Weight metaslabs
+ * on the amount of free space. The return value will be between 0 and 100
  * (inclusive), or ZFS_FRAG_INVALID if less than half of the metaslab in this
  * group have a fragmentation metric.
  */
@@ -1187,24 +1186,29 @@ metaslab_group_fragmentation(metaslab_group_t *mg)
 {
 	vdev_t *vd = mg->mg_vd;
 	uint64_t fragmentation = 0;
-	uint64_t valid_ms = 0;
+	uint64_t valid_ms = 0, total_ms = 0;
+	uint64_t free, total_free = 0;
 
 	for (int m = 0; m < vd->vdev_ms_count; m++) {
 		metaslab_t *msp = vd->vdev_ms[m];
 
-		if (msp->ms_fragmentation == ZFS_FRAG_INVALID)
-			continue;
 		if (msp->ms_group != mg)
 			continue;
+		total_ms++;
+		if (msp->ms_fragmentation == ZFS_FRAG_INVALID)
+			continue;
 
 		valid_ms++;
-		fragmentation += msp->ms_fragmentation;
+		free = (msp->ms_size - metaslab_allocated_space(msp)) /
+		    SPA_MINBLOCKSIZE;	/* To prevent overflows. */
+		total_free += free;
+		fragmentation += msp->ms_fragmentation * free;
 	}
 
-	if (valid_ms <= mg->mg_vd->vdev_ms_count / 2)
+	if (valid_ms < (total_ms + 1) / 2 || total_free == 0)
 		return (ZFS_FRAG_INVALID);
 
-	fragmentation /= valid_ms;
+	fragmentation /= total_free;
 	ASSERT3U(fragmentation, <=, 100);
 	return (fragmentation);
 }
@@ -4469,8 +4473,8 @@ metaslab_sync_reassess(metaslab_group_t *mg)
 	spa_t *spa = mg->mg_class->mc_spa;
 
 	spa_config_enter(spa, SCL_ALLOC, FTAG, RW_READER);
-	metaslab_group_alloc_update(mg);
 	mg->mg_fragmentation = metaslab_group_fragmentation(mg);
+	metaslab_group_alloc_update(mg);
 
 	/*
 	 * Preload the next potential metaslabs but only on active
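
Besides the weighting, the diff also changes when the group result is considered valid. The sketch below isolates just that condition so the two checks can be compared side by side. `old_is_valid()` and `new_is_valid()` are hypothetical helper names, not functions in metaslab.c, and the arguments mirror the counters used in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Old check: compares against every metaslab on the vdev and requires
 * strictly more than half of them to have a fragmentation metric.
 */
static bool
old_is_valid(uint64_t valid_ms, uint64_t vdev_ms_count)
{
	return (!(valid_ms <= vdev_ms_count / 2));
}

/*
 * New check: compares against the metaslabs that belong to this group,
 * accepts exactly half (rounded up), and rejects a group with no free
 * space, which also avoids dividing by zero later.
 */
static bool
new_is_valid(uint64_t valid_ms, uint64_t total_ms, uint64_t total_free)
{
	assert(valid_ms <= total_ms);	/* valid ones are a subset */
	return (!(valid_ms < (total_ms + 1) / 2 || total_free == 0));
}
```

For a group of four metaslabs with two valid metrics, the old check reports invalid while the new one accepts the result, which appears to match the "less than half" wording of the function comment. With five metaslabs, both checks need three valid metrics.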
