Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: on Windows use UTF-8 strings instead of system default locale strings #6190

Merged
merged 18 commits into from
Feb 28, 2020
Merged

fix: on Windows use UTF-8 strings instead of system default locale strings #6190

merged 18 commits into from
Feb 28, 2020

Conversation

farfella
Copy link
Contributor

The *A version of Windows API functions (e.g., CreateFileA, CreateEventA, etc.) operate on system locale strings and not UTF-8 strings. Furthermore, projects generated with cmake use multi-byte character set. So, CreateFile is defined as CreateFileA vs CreateFileW. This causes weird display and input issues observable by the end-user and reported as bugs.

In this PR, all these cases in the code (only osquery code and not externs) now call the *W versions of the Windows API functions, e.g., CreateFileW, CreateEventW, respectively, and then narrow them to UTF-8 using osquery's wstringToString.

In some cases, we can safely call the *A functions, so these were kept as is (that is, osquery is choosing the input and this input is correct with the *A functions).

Verified all test-cases passed locally.

@farfella
Copy link
Contributor Author

Thanks @Smjert for your help with the Chrome extensions test! :)

@directionless directionless added this to the 4.2.0 milestone Jan 21, 2020
theopolis
theopolis previously approved these changes Jan 25, 2020
@Smjert Smjert added the do not merge Do not merge PR as it's pending on some discussion or external factor. Reviewer should have context. label Jan 25, 2020
Copy link
Member

@Smjert Smjert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for this effort!

osquery/events/windows/ntfs_event_publisher.cpp Outdated Show resolved Hide resolved
osquery/filesystem/windows/fileops.cpp Outdated Show resolved Hide resolved
osquery/filesystem/windows/fileops.cpp Outdated Show resolved Hide resolved
osquery/tables/system/windows/registry.cpp Outdated Show resolved Hide resolved
osquery/tables/system/windows/registry.cpp Outdated Show resolved Hide resolved
osquery/tables/system/windows/registry.cpp Outdated Show resolved Hide resolved
osquery/utils/system/windows/env.cpp Outdated Show resolved Hide resolved
osquery/utils/system/windows/env.cpp Outdated Show resolved Hide resolved
osquery/utils/system/windows/env.cpp Show resolved Hide resolved
@Smjert Smjert removed the do not merge Do not merge PR as it's pending on some discussion or external factor. Reviewer should have context. label Jan 25, 2020
Copy link
Member

@Smjert Smjert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last review round then I think it's good to go!
Please also remember to format the code :).

osquery/utils/system/windows/env.cpp Outdated Show resolved Hide resolved
osquery/utils/system/windows/env.cpp Outdated Show resolved Hide resolved
osquery/utils/system/windows/env.cpp Outdated Show resolved Hide resolved
@farfella
Copy link
Contributor Author

Last review round then I think it's good to go!
Please also remember to format the code :).

Haha! Thanks. Yeah... I just need to set up style-check validator locally. Relying on IDE works 95% of the time.

@Smjert
Copy link
Member

Smjert commented Jan 27, 2020

Last review round then I think it's good to go!
Please also remember to format the code :).

Haha! Thanks. Yeah... I just need to set up style-check validator locally. Relying on IDE works 95% of the time.

We also have two CMake targets for this: format_check, which checks the code after everything has been committed and format, to actually format the code, but it has to be either in the stage area or unstaged.

Smjert
Smjert previously approved these changes Jan 27, 2020
@directionless directionless modified the milestones: 4.2.0, 4.2.1 Jan 31, 2020

return wstringToString(buf.data());

} // namespace osquery
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick, this comment looks like the result of a stray copy-paste

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Updated.

@sroache
Copy link

sroache commented Feb 21, 2020

I was testing with this branch to see if it would fix some unicode issues encountered and it fixed most of them!
However when I queried the file table for a path that contained Chinese characters I did not get the expected path returned
SELECT * FROM fiile WHERE path LIKE "C:\%\cmd.exe
Where I have copied cmd.exe into C:\Test and C:\新建文件夹 only returned the copy in C:\Test.
(Where "新建文件夹" was the "new folder" text after creating a folder on a Chinese machine)

My current guess is that it has something to do with converting from unicode to UTF-8 before checking if the path exists in WindowsFindFile (fileops.cpp). I think we would also probably need to keep the path as a unicode string when checking the constraint/regex.

@farfella
Copy link
Contributor Author

Thanks for validating @sroache! I will take a look. It should call FindFirstFileW if it isn’t since FindFirstFileA is locale specific— I will check your case.

@farfella
Copy link
Contributor Author

I confirm what @sroache is seeing: I'd prefer to fix this issue in another PR after this is merged in vs adding more changes to this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants