-
Notifications
You must be signed in to change notification settings - Fork 5k
Increase buffer size for readir_r and add support for enumerating filenames > 255 bytes #116619
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR increases the internal buffer size used by readdir_r to support filenames longer than 255 bytes and introduces NTFS-specific tests for file enumeration on Linux. It also adjusts the file system decoding buffers and removes an unused helper to streamline filename processing.
- Adjust readdir_r buffer size in System.Native to avoid truncation errors
- Add tests and setup files for NTFS on Linux to cover long filenames
- Update file system entry decoding buffer and remove an obsolete helper function
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
src/native/libs/System.Native/pal_io.c | Updates buffer size calculation for readdir_r based on d_name offset and adds a 1024-byte extension on non-SunOS platforms |
src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/ManualTests/System.IO.FileSystem.Manual.Tests.csproj | Adds new NTFS test files |
src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/ManualTests/NtfsOnLinuxTests.cs | Introduces tests covering file enumeration for filenames longer than 255 bytes |
src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/ManualTests/NtfsOnLinuxSetup.cs | Provides NTFS loopback device setup and teardown for tests |
src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/Directory/EnumerableTests.cs | Adds a parallel thread safety test for file enumeration |
src/libraries/System.Private.CoreLib/src/System/TimeZoneInfo.Unix.NonAndroid.cs | Removes an unused directory entry helper method |
src/libraries/System.Private.CoreLib/src/System/IO/Enumeration/FileSystemEntry.Unix.cs | Updates the fixed buffer size for file name decoding using a new constant |
src/libraries/Common/src/Interop/Unix/System.Native/Interop.ReadDir.cs | Improves UTF8 decoding logic in the directory entry retrieval |
process.Start(); | ||
process.BeginOutputReadLine(); | ||
process.BeginErrorReadLine(); | ||
process.WaitForExit(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider checking the process exit code immediately after WaitForExit() to verify that the shell command executed successfully.
process.WaitForExit(); | |
process.WaitForExit(); | |
if (process.ExitCode != 0) | |
{ | |
throw new InvalidOperationException($"Shell command failed with exit code {process.ExitCode}. Command: {command}"); | |
} |
Copilot uses AI. Check for mistakes.
Tagging subscribers to this area: @dotnet/area-system-io |
Nit: I am not sure what's misleading about this. POSIX defines
This was a concern for a low-risk servicing change done on a short notice. I do not think it is concern for main. We should switch to using readdir exclusively in main unless somebody points to a real problem with doing that.
You can ask folks who maintain these platforms here. cc @am11 Are you aware of any issues with switching all platforms to readdir? |
Locks are not a problem. We do have our own lock around the calls to readdir/readdir_r. The question is whether readdir uses process global static variable for storing state between invocations. |
readdir_r is the re-entrant version of readdir; it works by calling readdir with a lock. On success it copies the next entry into a user buffer with memcpy(). It is prone to truncation errors if the following is true:
dirent*
) is allocated with ad_name
field too small to hold the filename. POSIX is misleading as it suggests thatd_name
"shall contain a filename of at most {NAME_MAX} + 1," hence why some libraries, including dotnet, assumed filenames do not exceed 256 bytes.I noticed trucation doesn't occur on macOS because they use a 1024 size for d_name, I'm borrowing their approach here.
Alternative
An alternative approach is to get rid of readdir_r entirely, but I was hesitant to implement it for the following reasons:
I will still list the reasons I believe support it:
cc @AustinWise @NattyNarwhal