-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
System locale #5081
System locale #5081
Conversation
Cargo.toml
Outdated
@@ -66,6 +66,7 @@ members = [ | |||
"utils/litemap", | |||
"utils/pattern", | |||
"utils/resb", | |||
"utils/system_locale", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name suggests that the crate only retrieves "system" "locale". My understanding is that your crate aims to retrieve environment information from the runtime. I'd use a term like "environment", "preferences" etc.
My concern about the "system" is that it indicates that what you get is "OS Locale" - but you may not. Increasingly OSes sandbox applications, so OS may be in English, but the application may only see that the language of its own environment is French.
My concern about "locale" is that it indicates that all you get is a Unicode Locale - while what you actually get are things beyond that. For example, short date pattern is not part of a Locale but should be retrievable by this crate.
Maybe "utils/env_preferences"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are still a lot of unaddressed comments from my first review
List of changes made:
|
utils/env_preferences/src/apple.rs
Outdated
// that the `lang_ptr` is not NULL thus making it safe to call | ||
let length = unsafe { CFStringGetLength(ptr) as usize }; | ||
|
||
let mut c_str_buf = vec![0; length * 4]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: why is this length * 4
? I think this should be length + 1
: https://developer.apple.com/documentation/corefoundation/1542721-cfstringgetcstring#parameters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*4 is an overkill 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's underkill and maybe UB if the string is empty.
// Skipping "C" and those ending with "UTF-8", as they cannot be converted | ||
// into the locale |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't they be converted into a Locale
? Can they be converted into something else? Can we check that the strings are in a form that we expect, even if not a BCP-47 Locale?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried, doing the following:
let loc_str = "C";
let locale: Locale = loc.parse.unwrap();
It fails
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converting to locale is a whole other can of worms, there will be strings like C
, de_CH.UTF-8
, etc. Let's not scope-creep this PR and stick with strings for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've been clear about the line I draw given that this is unsafe code:
- All code paths must be covered
- All bytes resulting from calls to unsafe functions must be read
- The shape of the output should be asserted
I am relying on @robertbastian and @zbraniecki to verify condition 1 (coverage).
I guess condition 2 is covered in std::ffi::CStr::to_str
.
For condition 3, if the strings are not valid BCP-47, then let's assert what shape they are. The bare minimum would be an assertion that they are ASCII, for example. But condition 3 wouldn't rise to the level of blocking the PR from landing in experimental.
utils/env_preferences/tests/test.rs
Outdated
for calendar in calendar_res { | ||
assert!(!calendar.0.is_empty(), "Couldn't retreive calendar locale"); | ||
assert!(!calendar.1.is_empty(), "Couldn't retreive calendar"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Testing nonempty is not sufficient. Please actually read the strings. Assert something about their shape. Maybe have a list of valid values? At the very least check that they are ASCII-alphabetic, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue with a list is, we are not sure about the people who are contributing to the icu4x, they could have some different locale and it would display failing test.
Check for ascii can be done
Testing is better; still room for improvement
I removed my blocking review; I will let @robertbastian or @zbraniecki land the PR. |
Mentors: @robertbastian @zbraniecki
Total tasks:
String
intoicu4x
componentsOptional: