Skip to content

Commit c697f28

Browse files
committed
fs/ufs: change default locking protocol
The fs/ufs component by default disabled all file locking before read/write operations (except for NFS file systems). This was based on the assumption, that the file system itself performs the required locking operation and hence we don't have to add to it. This assumption is incorrect when using data sieving. In data sieving, the code 'ignore' small gaps when we write to a file, and perform instead a read-modify-write sequence ourselves for performance reasons. The problem is however that even within a collective operation not all aggregators might want to use data sieving. Hence, enabling locking just for the data-sieving routines is insufficient, all processes have to perform the locking. Therefore, our two options are: a) either disable write data-sieving by default, or b) enable range-locking by default. After some testing, I think enabling range-locking be default is the safer and better approach. It doesn't seem to show any significant performance impact on my test systems. Note, that on Lustre file systems, we can keep the default to no-locking as far as I can see, since the collective algorithm used by Lustre is unlikely to produce this pattern. I did add in however an mca parameter that allows us to control the locking algorithm used by the Lustre component as well, in case we need to change that for a particular use-case or platform. Fixes Issue #12718 Signed-off-by: Edgar Gabriel <[email protected]>
1 parent fefff31 commit c697f28

File tree

4 files changed

+35
-7
lines changed

4 files changed

+35
-7
lines changed

ompi/mca/fs/lustre/fs_lustre.h

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@
1313
* Copyright (c) 2015-2018 Research Organization for Information Science
1414
* and Technology (RIST). All rights reserved.
1515
* Copyright (c) 2016-2017 IBM Corporation. All rights reserved.
16+
* Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserverd.
1617
* $COPYRIGHT$
1718
*
1819
* Additional copyrights may follow
@@ -31,6 +32,12 @@
3132
extern int mca_fs_lustre_priority;
3233
extern int mca_fs_lustre_stripe_size;
3334
extern int mca_fs_lustre_stripe_width;
35+
extern int mca_fs_lustre_lock_algorithm;
36+
37+
#define FS_LUSTRE_LOCK_AUTO 0
38+
#define FS_LUSTRE_LOCK_NEVER 1
39+
#define FS_LUSTRE_LOCK_ENTIRE_FILE 2
40+
#define FS_LUSTRE_LOCK_RANGES 3
3441

3542
BEGIN_C_DECLS
3643

ompi/mca/fs/lustre/fs_lustre_component.c

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@
1313
* Copyright (c) 2008-2011 University of Houston. All rights reserved.
1414
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
1515
* reserved.
16+
* Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserverd.
1617
* $COPYRIGHT$
1718
*
1819
* Additional copyrights may follow
@@ -45,6 +46,7 @@ int mca_fs_lustre_priority = 20;
4546
runtime also*/
4647
int mca_fs_lustre_stripe_size = 0;
4748
int mca_fs_lustre_stripe_width = 0;
49+
int mca_fs_lustre_lock_algorithm = 0; /* auto */
4850
/*
4951
* Instantiate the public struct with all of our public information
5052
* and pointers to our public functions in it
@@ -93,6 +95,15 @@ lustre_register(void)
9395
MCA_BASE_VAR_TYPE_INT, NULL, 0, 0,
9496
OPAL_INFO_LVL_9,
9597
MCA_BASE_VAR_SCOPE_READONLY, &mca_fs_lustre_stripe_width);
98+
mca_fs_lustre_lock_algorithm = 0;
99+
(void) mca_base_component_var_register(&mca_fs_lustre_component.fsm_version,
100+
"lock_algorithm", "Locking algorithm used by the fs ufs component. "
101+
" 0: auto (default), 1: skip locking, 2: always lock entire file, "
102+
"3: lock only specific ranges",
103+
MCA_BASE_VAR_TYPE_INT, NULL, 0, 0,
104+
OPAL_INFO_LVL_9,
105+
MCA_BASE_VAR_SCOPE_READONLY,
106+
&mca_fs_lustre_lock_algorithm );
96107

97108
return OMPI_SUCCESS;
98109
}

ompi/mca/fs/lustre/fs_lustre_file_open.c

Lines changed: 16 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@
1313
* Copyright (c) 2015-2020 Research Organization for Information Science
1414
* and Technology (RIST). All rights reserved.
1515
* Copyright (c) 2016-2017 IBM Corporation. All rights reserved.
16+
* Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserverd.
1617
* $COPYRIGHT$
1718
*
1819
* Additional copyrights may follow
@@ -175,8 +176,22 @@ mca_fs_lustre_file_open (struct ompi_communicator_t *comm,
175176
fh->f_stripe_size = lump->lmm_stripe_size;
176177
fh->f_stripe_count = lump->lmm_stripe_count;
177178
fh->f_fs_block_size = lump->lmm_stripe_size;
178-
fh->f_flags |= OMPIO_LOCK_NEVER;
179179
free(lump);
180180

181+
if (FS_LUSTRE_LOCK_AUTO == mca_fs_lustre_lock_algorithm ||
182+
FS_LUSTRE_LOCK_NEVER == mca_fs_lustre_lock_algorithm ) {
183+
fh->f_flags |= OMPIO_LOCK_NEVER;
184+
}
185+
else if (FS_LUSTRE_LOCK_ENTIRE_FILE == mca_fs_lustre_lock_algorithm) {
186+
fh->f_flags |= OMPIO_LOCK_ENTIRE_FILE;
187+
}
188+
else if (FS_LUSTRE_LOCK_RANGES == mca_fs_lustre_lock_algorithm) {
189+
/* Nothing to be done. This is what the posix fbtl component would do
190+
anyway without additional information . */
191+
}
192+
else {
193+
opal_output ( 1, "Invalid value for mca_fs_lustre_lock_algorithm %d", mca_fs_lustre_lock_algorithm );
194+
}
195+
181196
return OMPI_SUCCESS;
182197
}

ompi/mca/fs/ufs/fs_ufs_file_open.c

Lines changed: 1 addition & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@
1313
* Copyright (c) 2015-2018 Research Organization for Information Science
1414
* and Technology (RIST). All rights reserved.
1515
* Copyright (c) 2016-2017 IBM Corporation. All rights reserved.
16+
* Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserverd.
1617
* $COPYRIGHT$
1718
*
1819
* Additional copyrights may follow
@@ -105,12 +106,6 @@ mca_fs_ufs_file_open (struct ompi_communicator_t *comm,
105106
component. */
106107
fh->f_flags |= OMPIO_LOCK_ENTIRE_FILE;
107108
}
108-
else {
109-
fh->f_flags |= OMPIO_LOCK_NEVER;
110-
}
111-
}
112-
else {
113-
fh->f_flags |= OMPIO_LOCK_NEVER;
114109
}
115110
free (fstype);
116111
}

0 commit comments

Comments
 (0)