Description
One of my machines (FC40) recently received two updates simultaneously:
- the kernel, from kernel-6.8.11-300.fc40.x86_64 to kernel-6.10.12-200.fc40.x86_64, and
- the ZFS+DKMS+Dracut stack, following the master lineage, from commit 02c5aa9 to ca0141f.

This took place earlier today. The pool was healthy, in use, and had recently been scrubbed multiple times, with no errors anywhere: none in the kernel log, none in the journal.
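(For reference, the kernel and module versions above come from the usual sources; something like:)

# Confirm the running kernel and the loaded ZFS module version
uname -r
zfs version                     # prints userland and kernel-module versions
cat /sys/module/zfs/version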
Mere minutes after I rebooted to the new kernel and ZFS, my Prometheus setup alerted me to 30 checksum errors, several write errors, and 4 data errors. Upon inspection:
[root@penny ~]# zpool status -v
  pool: chest
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
...abbreviated...
        NAME        STATE     READ WRITE CKSUM
...abbreviated...
errors: Permanent errors have been detected in the following files:

        <metadata>:<0x16>
        <metadata>:<0x3c>
        <metadata>:<0x44>
        <metadata>:<0x594>
The kernel ring buffer likewise showed no hardware errors.
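(These are the kinds of checks I mean; standard commands, nothing ZFS-specific:)

# Look for hardware or block-layer errors around the time of the incident
journalctl -k -p err --since today
dmesg --level=err,warn | tail -n 50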
I rebooted to the older kernel and ZFS module and started a scrub. It is still ongoing, but so far it has found no problems and produced no WRITE or CKSUM errors.
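(The scrub itself is the usual invocation, monitored via zpool status:)

zpool scrub chest
zpool status chest          # re-run (or `watch`) to follow scan progress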
Interestingly, neither the cache device nor the ZIL device showed these errors. The boot drive also seemed unaffected.
This, to me, indicates a software issue, probably in the LUKS write path (we have seen issues there before) or in mirroring: only the pool with the mirrored drives was hit, while the single boot drive, otherwise configured identically, was not.
The affected LUKS2 devices are all whole-disk formatted with a 4K sector size, and the pool is ashift=12. (The unaffected root pool, as well as the cache and ZIL devices of the affected pool, are not formatted with the 4K LUKS sector size.)
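To double-check the sector-size layout, the LUKS2 data segment and the pool ashift can be inspected; a sketch, with /dev/sda and the mapping name chest-a as stand-ins for the real devices:

# LUKS2 data-segment sector size (4096 bytes on the affected disks)
cryptsetup luksDump /dev/sda | grep -A4 'Data segments'
# Logical/physical sector sizes as exposed by the dm-crypt mapping
blockdev --getss --getpbsz /dev/mapper/chest-a
# Pool allocation shift (12 => 4096-byte allocation units)
zpool get ashift chest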
In the interest of seeing if it makes a difference, the affected LUKS devices are tuned with the following persistent flags:
Flags: allow-discards same-cpu-crypt submit-from-crypt-cpus no-read-workqueue no-write-workqueue
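For anyone trying to reproduce this, flags like these are typically persisted into the LUKS2 header with cryptsetup's --perf-* options on an open device (the mapping name chest-a is again a stand-in):

# Persist the performance flags so they apply on every subsequent open
cryptsetup refresh chest-a \
    --allow-discards \
    --perf-same_cpu_crypt \
    --perf-submit_from_crypt_cpus \
    --perf-no_read_workqueue \
    --perf-no_write_workqueue \
    --persistent
# The Flags line of the header dump should then match the list above
cryptsetup luksDump /dev/sda | grep Flags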
The pool has the following properties and features (from zpool get all chest):
NAME   PROPERTY                       VALUE                  SOURCE
chest  size                           10.9T                  -
chest  capacity                       67%                    -
chest  altroot                        -                      default
chest  health                         ONLINE                 -
chest  guid                           2537396116593781450    -
chest  version                        -                      default
chest  bootfs                         -                      default
chest  delegation                     on                     default
chest  autoreplace                    off                    default
chest  cachefile                      -                      default
chest  failmode                       wait                   default
chest  listsnapshots                  off                    default
chest  autoexpand                     off                    default
chest  dedupratio                     1.00x                  -
chest  free                           3.55T                  -
chest  allocated                      7.34T                  -
chest  readonly                       off                    -
chest  ashift                         12                     local
chest  comment                        -                      default
chest  expandsize                     -                      -
chest  freeing                        0                      -
chest  fragmentation                  9%                     -
chest  leaked                         0                      -
chest  multihost                      off                    default
chest  checkpoint                     -                      -
chest  load_guid                      16604087848420727134   -
chest  autotrim                       off                    default
chest  compatibility                  off                    default
chest  bcloneused                     0                      -
chest  bclonesaved                    0                      -
chest  bcloneratio                    1.00x                  -
chest  feature@async_destroy          enabled                local
chest  feature@empty_bpobj            active                 local
chest  feature@lz4_compress           active                 local
chest  feature@multi_vdev_crash_dump  enabled                local
chest  feature@spacemap_histogram     active                 local
chest  feature@enabled_txg            active                 local
chest  feature@hole_birth             active                 local
chest  feature@extensible_dataset     active                 local
chest  feature@embedded_data          active                 local
chest  feature@bookmarks              enabled                local
chest  feature@filesystem_limits      enabled                local
chest  feature@large_blocks           enabled                local
chest  feature@large_dnode            enabled                local
chest  feature@sha512                 enabled                local
chest  feature@skein                  enabled                local
chest  feature@edonr                  enabled                local
chest  feature@userobj_accounting     active                 local
chest  feature@encryption             enabled                local
chest  feature@project_quota          active                 local
chest  feature@device_removal         enabled                local
chest  feature@obsolete_counts        enabled                local
chest  feature@zpool_checkpoint       enabled                local
chest  feature@spacemap_v2            active                 local
chest  feature@allocation_classes     enabled                local
chest  feature@resilver_defer         enabled                local
chest  feature@bookmark_v2            enabled                local
chest  feature@redaction_bookmarks    enabled                local
chest  feature@redacted_datasets      enabled                local
chest  feature@bookmark_written       enabled                local
chest  feature@log_spacemap           active                 local
chest  feature@livelist               enabled                local
chest  feature@device_rebuild         enabled                local
chest  feature@zstd_compress          enabled                local
chest  feature@draid                  enabled                local
chest  feature@zilsaxattr             disabled               local
chest  feature@head_errlog            disabled               local
chest  feature@blake3                 disabled               local
chest  feature@block_cloning          disabled               local
chest  feature@vdev_zaps_v2           disabled               local
chest  feature@redaction_list_spill   disabled               local
chest  feature@raidz_expansion        disabled               local
This is where my pool currently sits:
[root@penny ~]# zpool status
  pool: chest
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub in progress since Wed Oct 9 22:37:54 2024
        900G / 7.35T scanned at 950M/s, 136G / 7.35T issued at 143M/s
        0B repaired, 1.80% done, 14:39:54 to go
config:

        NAME                         STATE     READ WRITE CKSUM
        chest                        ONLINE       0     0     0
          mirror-0                   ONLINE       0     0     0
            dm-uuid-CRYPT-LUKS2-sda  ONLINE       0     0     0
            dm-uuid-CRYPT-LUKS2-sdb  ONLINE       0     0     0
          mirror-3                   ONLINE       0     0     0
            dm-uuid-CRYPT-LUKS2-sdc  ONLINE       0     0     0
            dm-uuid-CRYPT-LUKS2-sdd  ONLINE       0     0     0
        logs
          dm-uuid-CRYPT-LUKS2-sde    ONLINE       0     0     0
        cache
          dm-uuid-CRYPT-LUKS2-sdf    ONLINE       0     0     0

errors: 4 data errors, use '-v' for a list
Update: good news! After reverting to the prior kernel + commit mentioned above, I am very happy to report that the scrub completed without finding any errors, and the data errors listed previously simply disappeared. So not a single bit of data was lost!
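(With the scrub clean, the stale error counters are the only thing left to deal with; the standard sequence:)

# Clear the error counters left over from the bad boot, then re-verify
zpool clear chest
zpool status -v chest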
The less good news: this strongly indicates that, under this configuration, there is a software defect in OpenZFS.