Linux: Fix zfs_prune panics v2 #17121

Merged: 1 commit merged into openzfs:master on Mar 25, 2025

Conversation

@snajpa snajpa (Contributor) commented Mar 6, 2025

Motivation and Context

The approach I took in #16770 was insufficient, per @youzhongyang's report here.

Description

It turns out that the approach taken in the original version of the patch was wrong. Now we do it in line with how the kernel actually does it: when an sb is being torn down, access to it is serialized via the sb->s_umount rwsem, and only while that lock is held is it okay to inspect s_flags. The other mistake I made was trying to make an SB_ACTIVE check work; the kernel actually checks the negative variant, i.e. that SB_DYING is not set (and that SB_BORN is set).

This way it finally seems to work without issues.
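For reference, this is the same trylock pattern the kernel itself uses around its superblock shrinker callbacks. Below is a minimal sketch of the guarded prune callback, assuming the zpl_prune_sb()/zfs_prune() names from the existing Linux ZPL code; it illustrates the approach described above, not the verbatim patch:

```c
/* Sketch only: mirrors the kernel's super_trylock_shared()/trylock_super(). */
static int
zpl_prune_sb(uint64_t nr_to_scan, void *arg)
{
	struct super_block *sb = arg;
	int objects = 0;

	/*
	 * s_flags may only be inspected while holding sb->s_umount.
	 * If the trylock fails, the sb is busy (possibly mid-teardown),
	 * so skip pruning rather than risk a use-after-free.
	 */
	if (down_read_trylock(&sb->s_umount)) {
		if (!(sb->s_flags & SB_DYING) && sb->s_root &&
		    (sb->s_flags & SB_BORN))
			(void) zfs_prune(sb, nr_to_scan, &objects);
		up_read(&sb->s_umount);
	}

	return (objects);
}
```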

How Has This Been Tested?

Confirmed by @youzhongyang

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@github-actions github-actions bot added the Status: Work in Progress Not yet ready for general review label Mar 6, 2025
It turns out that the approach taken in the original version of the
patch was wrong. Now we do it in line with how the kernel actually
does it: when an sb is being torn down, access to it is serialized
via the sb->s_umount rwsem, and only while that lock is held is it
okay to inspect s_flags. The other mistake was trying to make an
SB_ACTIVE check work; the kernel actually checks the negative
variant, i.e. that SB_DYING is not set (and that SB_BORN is set).

Pre-6.6 kernels don't have SB_DYING; on those we check whether the
sb is still hashed instead.

Signed-off-by: Pavel Snajdr <[email protected]>
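The pre-6.6 fallback mentioned in the commit message could be expressed as a small compatibility helper. The macro name below is hypothetical, and an out-of-tree module like ZFS would typically gate this with a configure check rather than a bare #ifdef; this is a sketch of the idea only:

```c
/*
 * Hypothetical helper: kernels >= 6.6 define SB_DYING; older kernels
 * instead unhash a dying sb from its s_instances list during teardown
 * (see trylock_super() in older fs/super.c), so test that instead.
 */
#ifdef SB_DYING
#define	zpl_sb_dying(sb)	((sb)->s_flags & SB_DYING)
#else
#define	zpl_sb_dying(sb)	hlist_unhashed(&(sb)->s_instances)
#endif
```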
@snajpa snajpa (Contributor, Author) commented Mar 9, 2025

That Alma thing doesn't seem to be my fault, and I haven't touched any code shared with FreeBSD this time, so... I think it's ready for review.

@snajpa snajpa marked this pull request as ready for review March 9, 2025 15:17
@github-actions github-actions bot added Status: Code Review Needed Ready for review and testing and removed Status: Work in Progress Not yet ready for general review labels Mar 9, 2025
@amotin amotin requested a review from behlendorf March 10, 2025 13:41
@tonyhutter tonyhutter (Contributor) left a comment

I've never worked with zpl_super.c, but I don't see any surface-level issues.

@behlendorf behlendorf (Contributor) left a comment

Makes sense, thanks for making another pass at this.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Mar 25, 2025
@tonyhutter tonyhutter merged commit a0e6271 into openzfs:master Mar 25, 2025
37 of 45 checks passed
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Apr 3, 2025
fuporovvStack pushed a commit to fuporovvStack/zfs that referenced this pull request Apr 11, 2025
fuporovvStack pushed a commit to fuporovvStack/zfs that referenced this pull request Apr 11, 2025
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Apr 16, 2025
robn pushed a commit to robn/zfs that referenced this pull request May 12, 2025
robn pushed a commit to robn/zfs that referenced this pull request May 12, 2025
@robn robn mentioned this pull request May 12, 2025
tonyhutter pushed a commit that referenced this pull request May 27, 2025
snajpa added a commit to vpsfreecz/zfs that referenced this pull request May 30, 2025
aither64 added a commit to vpsfreecz/zfs that referenced this pull request Jun 4, 2025