Redfs-6.8 with dlm_lock opcode for fuse #5

Open · wants to merge 2 commits into base: redfs-6.8

Conversation

hbirth
Collaborator

@hbirth hbirth commented Jun 14, 2025

Added a DLM_LOCK opcode to FUSE for signaling to the FUSE server that the kernel has mapped a page of a file into the page cache (only active when writeback cache is enabled).

@hbirth hbirth requested a review from bsbernd June 19, 2025 09:53
- area->offset = MIN(current_area_start_offset, lock_start_offset);
- area->size = MAX(current_area_end_offset, lock_end_offset) - area->offset;
+ area->offset = min(current_area_start_offset, lock_start_offset);
+ area->size = max(current_area_end_offset, lock_end_offset) - area->offset;
Collaborator

Why the backport? Linux master already has min()/max().

Collaborator

I.e. I think this can just go as it is into the previous patch and can also be used for the upstream patch

@@ -8,6 +8,8 @@

#include "fuse_i.h"

#include "linux/list.h"
#include "linux/spinlock.h"
Collaborator

file.c already uses lists and spinlocks - unneeded includes.

@@ -632,6 +632,7 @@ enum fuse_opcode {
FUSE_SYNCFS = 50,
FUSE_TMPFILE = 51,
FUSE_STATX = 52,
FUSE_DLM_LOCK = 53,
Collaborator

Right now this is only for write-back cache, right? FUSE_DLM_WB_LOCK?

@@ -182,6 +199,7 @@ static void fuse_evict_inode(struct inode *inode)
if (S_ISREG(inode->i_mode) && !fuse_is_bad(inode)) {
WARN_ON(!list_empty(&fi->write_files));
WARN_ON(!list_empty(&fi->queued_writes));
fuse_forget_all_dlm_lock_areas(fi);
Collaborator

The condition on S_ISREG is giving me a bit of a headache, at least as long as there is no check for write-back. The issue is that dir inodes can, and probably will, have a page cache for readdir.
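
Not something from the patch, just a rough sketch of one possible direction, assuming the existing fc->writeback_cache flag is the right gate here:

	/* sketch: gate the cleanup on regular files with writeback cache enabled;
	 * fc would come from get_fuse_conn(inode), the placement is an assumption */
	struct fuse_conn *fc = get_fuse_conn(inode);

	if (S_ISREG(inode->i_mode) && fc->writeback_cache)
		fuse_forget_all_dlm_lock_areas(fi);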

struct dlm_locked_area *area, *tmp;
list_for_each_entry_safe(area, tmp, &fi->dlm_locked_areas, list) {
list_del(&area->list);
printk("free area %lld size: %ld\n", area->offset, area->size);
Collaborator

The debug printk should be removed. Btw, printk is outdated; the recent way is pr_info(). For debugging it may be changed to pr_debug() for now, which can be enabled at run time.
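
For example, roughly (same message, just switched to pr_debug() so it can be toggled via dynamic debug):

	/* sketch: pr_debug() is compiled out unless DEBUG / dynamic debug enables it */
	pr_debug("free area %lld size: %ld\n", area->offset, area->size);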

@@ -58,6 +58,7 @@
EM( FUSE_SYNCFS, "FUSE_SYNCFS") \
EM( FUSE_TMPFILE, "FUSE_TMPFILE") \
EM( FUSE_STATX, "FUSE_STATX") \
EM( FUSE_DLM_LOCK, "FUSE_DLM_LOCK") \
Collaborator

FUSE_DLM_LOCK alignment doesn't look right. Spaces instead of tabs?

@@ -142,6 +153,9 @@ struct fuse_inode {

/* List of writepage requestst (pending or sent) */
struct rb_root writepages;

/** List of dlm locked areas we have sent lock requests for */
struct list_head dlm_locked_areas;
Collaborator

Same here, dir-inodes also have a cache


spin_lock(&fi->lock);
/* iterate through the areas */
list_for_each_entry(area, &fi->dlm_locked_areas, list) {
Collaborator

I guess a list will be too slow - this probably needs to be an xarray, rbtree, or maple tree.

/* check if the locked areas are completely distinct, then we should continue */
if (current_area_end_offset < lock_start_offset
|| current_area_start_offset > lock_end_offset)
continue;
Collaborator

The problem with lists is that random writes to large files will be really slow, I think.
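
Not part of the patch, just a rough sketch of what an rbtree lookup keyed by the area start offset could look like (analogous to fi->writepages; the struct layout and names here are made up for illustration):

#include <linux/rbtree.h>

struct dlm_locked_area {
	struct rb_node node;	/* instead of struct list_head list */
	loff_t offset;
	size_t size;
};

/* find the area containing 'offset', assuming tracked areas never overlap */
static struct dlm_locked_area *dlm_area_lookup(struct rb_root *root, loff_t offset)
{
	struct rb_node *n = root->rb_node;

	while (n) {
		struct dlm_locked_area *area = rb_entry(n, struct dlm_locked_area, node);

		if (offset < area->offset)
			n = n->rb_left;
		else if (offset >= area->offset + area->size)
			n = n->rb_right;
		else
			return area;
	}
	return NULL;
}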

}

/* create a new locked area */
area = kmalloc(sizeof(struct dlm_locked_area), GFP_KERNEL);
Collaborator

The kmalloc() return value needs error checking.
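
I.e. something like (sketch; assumes the caller can return an errno here):

	area = kmalloc(sizeof(*area), GFP_KERNEL);
	if (!area)
		return -ENOMEM;	/* sketch: propagate or otherwise handle the allocation failure */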

return;

/* add the lock to the list of locked areas */
if (outarg.locksize) {
Collaborator

A check is missing whether the server locked at least the requested size.
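
Roughly along these lines (sketch; using 'inarg.locksize' as the requested size is an assumption about the request structure, and locksize is assumed to be a u64):

	/* sketch: bail out (or shrink the tracked area) if the server granted less */
	if (outarg.locksize < inarg.locksize) {
		pr_debug("dlm lock: granted %llu of %llu bytes\n",
			 outarg.locksize, inarg.locksize);
		return;
	}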
