-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix readFile API doing blocking I/O with a non-blocking handle #6368
Conversation
When a block size is passed to the readFile function or a file has no size, the read is forced to be blocking, even if the handle is opened as non-blocking. The opposite can happen too, a blocking handle is opened but since a block size of 0 is passed, and the file size is not 0, the file is read with non-blocking I/O. This change bases the decision of doing blocking or non-blocking I/O mainly on the "blocking" parameter of the readFile function and the file being a special file or not. If a handle is opened in non-blocking mode but the file is special, the handle is reopened as blocking. Also give a different name to the overload that provides a way to do a read file check via readFile.
I’d like to test this on files in proc before merging. |
I believe this situation is kind of tested already with the That been said, do you want me to put a specific test for this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some feedback for review. Acceptable to merge without a requirement to change due to feedback.
// mode | ||
mode &= ~PF_NONBLOCK; | ||
blocking_io = true; | ||
fd = std::make_unique<PlatformFile>(path, mode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some reason we open the file twice, vs performing fd = std::make_unique<PlatformFile>(path, mode);
once after this if-block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem here is that isSpecialFile, which is used in the if, is a function of PlatformFile, which needs an instance of it to work (especially on Windows where there's actually a handle to open), so we need to open it in a mode and then change it later if needed.
} | ||
} | ||
|
||
PlatformTime times; | ||
handle.fd->getFileTimes(times); | ||
|
||
off_t total_bytes = 0; | ||
if (file_size == 0 || block_size > 0) { | ||
if (handle.blocking_io) { | ||
// Reset block size to a sane minimum. | ||
block_size = (block_size < 4096) ? 4096 : block_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect any block size less than 1MB will be inefficient from an IO perspective for reading files. This is from experiments I ran back in 2006. This link has some data that may explain when 4KB block sizes are preferred over 1MB block sizes and cases where 1MB is preferred: https://medium.com/@duhroach/the-impact-of-blocksize-on-persistent-disk-performance-7e50a85b2647
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it highly depends on the files, the disk, the filesystem etc.
This is just a sane minimum, one is free to use a higher value when using the readFile API with files that are large enough where this makes sense.
Although I could get behind maybe changing it here:
osquery/osquery/filesystem/filesystem.cpp
Lines 187 to 189 in a977045
return readFile(path, | |
size, | |
4096, |
That been said, I feel like this is a different matter that needs its own issue and PR :P.
… to master * commit '8c13dd6bd206f2909a4baea5bcfbc91d5e3f502b': (159 commits) release: updating changelog for 4.3.0 release (osquery#6387) Build hvci_status table with CMake (osquery#6378) Change calls to debug log to verbose (osquery#6369) iokit: Fix race when accessing port_ (osquery#6380) Check extensions are registered with osquery core (osquery#6374) First steps to remove the Buck build system (osquery#6361) Return error detaching table, only use primary database (osquery#6373) Copy the parent environment when launching worker Change process table log errors to info and fix typo (osquery#6370) Ensure the extension uuid is never 0 (osquery#6377) Remove errors when converting empty numeric rows (osquery#6371) Do not force a specific path to install osquery on Windows (osquery#6379) Fix readFile API doing blocking I/O with a non-blocking handle (osquery#6368) magic: Check return from magic_file (osquery#6363) macos: Use -1 for missing ppid in process_events (osquery#6339) Update OpenSSL to version 1.1.1f and fix build (osquery#6359) Simplify how third party libraries formula work (osquery#6303) Add socket_events table for socket auditing in MacOS (osquery#6028) Extend the fields of curl_certificate table (osquery#6176) add status column to deb_packages table (osquery#6341) ...
When a block size is passed to the readFile function
or a file has no size, the read is forced to be blocking,
even if the handle is opened as non-blocking.
The opposite can happen too, a blocking handle is opened
but since a block size of 0 is passed, and the file size is not 0,
the file is read with non-blocking I/O.
This change bases the decision of doing blocking
or non-blocking I/O mainly on the "blocking" parameter
of the readFile function and the file being a special file or not.
If a handle is opened in non-blocking mode but the file is special,
the handle is reopened as blocking.
Also give a different name to the overload that provides
a way to do a read file check via readFile.
This should address the most concerning issues found in #5661