-
Notifications
You must be signed in to change notification settings - Fork 757
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sclang: encode all filesystem paths as UTF-8 internally #2861
sclang: encode all filesystem paths as UTF-8 internally #2861
Conversation
So how should we go about with this? I had a look, built it on Windows and Mac. It looks very promising (the speed is amazing ;) ), but it's obviously not yet in a "testable" state. So should we just be patient, or can we do something to help? |
@bagong Thanks for looking into it!
|
I am sorry I haven't been able to work on it this week, by the way. Development requires me to boot Windows from a hard drive and it takes about an hour or two of use for VS to speed up to a good clip. I hope to set aside a few hours for it tomorrow! |
Getting closer... fixed almost all the issues with globbing, and wrote a header for codepage conversions, I think that removed a couple TODOs. So I think most of the bigger architectural issues are solved. summary TODO:
|
Unfortunately a build error on Windows:
With more detail:
|
Oops! I know what I did. Funny that VS 2013's static analysis (or compiler) didn't catch that. |
Yea, I'm building with VS2015 and hope to switch to 2017 when Qt 5.9 is out, that will allow to use vcpkg, and has inbuilt cmake support. |
@bagong It should build now, but I haven't confirmed on Windows. Also removed all the old standalone code, that felt good :) |
Builds and works, and Help-generation is back again. This is starting to feel exciting! I don't have time to inspect further during the next hours, hopefully tonight!! But wait: the help system doesn't work yet, I was mislead because the "front-page" now comes up. But the links (except to undocumented classes) are still all broken... Will try to give more exact observations tonight. |
I haven't gone down that route yet, but I was vaguely aware that would need fixing too. Thanks for the heads up :) |
Thanks to TravisCI, I've been fixing a few build errors. The issue @bagong mentioned above with the help system will take some analysis. From what I've seen, the help browser isn't thinking about opening files; it looks to me like the datapath abstracts everything as URIs, which is a problem because I don't want to accidentally convert all URIs to a different encoding; http:// URLs can remain unchanged. So I think at some point along the datapath I'll have to insert a check to convert URIs starting with |
Hm, one of the problems is that QWebView functions are being called almost directly. I think the simplest solution may be to add some SuperCollider primitives to convert strings between UTF-8 and UTF-16, and call only when |
OK. I thought about it a bit and I can see two ways of doing this:
I'll give 2) a try and see how it feels. If I run into complications or it starts getting messy, I'll try the other approach. |
Actually, this is (fingers crossed) turning out to be easier than I expected, and I can do things even closer to the filesystem call than I thought I could. The last commit (6fa61fb) shows where I think those conversions should go. |
Ok, now globbing and file opening are fixed, and it seems like links are working in the help browser again. This is almost ready for testing @bagong, I just need to add some conversions for sound file primitives and those two other spots I noted above. |
Oh, beautiful! I actually didn't need to fix those spots in the IDE code after all, they work already. @bagong (and @llloret ?) we can start testing now! As things settle I'll remove the extra debugging code and also tackle the sound-file primitives, but the main issues are now out of the way AFAICT! |
A full list of primitives that need to be tested: File.delete
File.mtime
File.exists
File.realpath
File.mkdir
File.copy
File.type
File.fileSize
File.getcwd
File.open
Pipe.open
SoundFile.openRead
SoundFile.openWrite
// the remainder of File/Pipe/SoundFile primitives operate on existing *FILE objects and so don't need to be tested
Platform.userHomeDir
Platform.systemAppSupportDir
Platform.userAppSupportDir
Platform.systemExtensionDir
Platform.userExtensionDir
Platform.userConfigDir
Platform.resourceDir
Platform.myDocumentsDir
Platform.ideName
LanguageConfig.getIncludePaths
LanguageConfig.getExcludePaths
LanguageConfig.currentPath
LanguageConfig.store
LanguageConfig.postInlineWarnings (setter and getter)
~string.pathMatch
~string.parseYAMLFile
~string.standardizePath Sorry, I know that's a lot, but each should really only take one or two tests with a string like "我.txt". |
wow, libsndfile has a Windows open method, great: http://www.mega-nerd.com/libsndfile/api.html#open |
Great @brianlheim , I hope I can start tomorrow afternoon! |
@bagong Awesome! I just finished cleaning up a bit and adding encoding conversions to the SoundFile primitives. I'll try to put together a comprehensive script we can use to test this; I've already made a start on it for my own purposes. The majority of it can be done without creating a test user account, but some of it will require that as well. For some functions like pathMatch and standardizePath, we should compare behavior between old and new as those were completely rewritten. I'll note that as well in the script. @snappizz asked what to test on Linux vs macOS. I think the only thing necessary to test is the default directories given by On macOS specifically, we will need to make sure that behavior has been preserved re: aliases, specifically with class library compilation. |
Almost complete test script: https://gist.github.com/brianlheim/7735a484e54b59dce5dd524651586d2a |
I've run my test script on Windows and there appear to be no problems. All file primitives work as expected. I'm going to create a non-ASCII user next and test with that; hopefully I can at least get it off the ground. |
Initial feedback, hopefully more tomorrow. Looks good! Help etc, are back. So I thought I'd give it a challenge and created an account for my friend ابو عزيز. Results are mixed. On first start help is generated but not displayed. On second startup it displays, and the links work. Yey! So writing and reading rendered helpfiles from userAppSupportDir seems to work. Help-sources residing in the extensions folder are a different story, don't work yet (undocumented class is displayed rather than the respective help-file). Maybe a few screen shots explain better: Plugins in Extensions load fine. Not sure if they did before. PS: unrelated but worth a mention: Quarks don't work in this account (I included that other PR in the build): |
Change to a single if statement, make comment into a doxygen comment
For a reference to cerr
Remove todos based on obsolete standalone logic
Previous bad commit made files appear to be directories when they weren't.
Don't print from within SC_Filesystem::resolveIfAlias Fix printing in pyrlexer to use pre-resolution path
Only print an error message. This allows the help system to run even if there's a bad symlink
Added explanatory comment
just rebased on master. ( @patrickdupuis ) |
When through the tests quickly. Everything seemed fine except for the symlink part. I'll run them again when I am a bit more rested and able to concentrate. |
@patrickdupuis thanks—i have updated the test script with a note about the known change in the behavior of compilation with symlinks. |
Thanks for the note addition. That error message was the only issue I noticed the first time. Everything looks fine on a second go around 👍 |
@patrickdupuis Great! Thanks for looking at it again. So, does anyone oppose merging this now? I would love to get started on a 3.9 release. I think any deeper/subtler bugs that I've introduced will be more efficiently found in beta testing or everyday use, and I can promise "long-term support" for the changes here. The caveat is that the only Windows toolchain with guaranteed support after this will be VS 2013 / msvc 12. I know that is a bit of a let-down, but unfortunately nobody else is developing using those toolchains, so we have no dev power going to maintenance there. And of course, that locks out supernova and readline support as well for a Windows package. (If necessary, we can break this topic out into a separate issue. This PR already has way too comments even though they're all on-topic.) I've been very happy developing this on Windows in VS, and I would love to tackle some of these other support issues after this. It seems like they are pretty manageable once we have moved past Qt 5.5 and can bring in VS 2017. Once this merges, I intend to put time into @scztt's webengine branch and get us on Qt 5.9 ASAP. I tried it out last weekend and got it building in VS 2017 with one minor fix. I got a slow start on Windows dev because I felt obliged to learn 3 toolchains and an OS at the same time; the documentation is quite messy and confusing, somewhat necessarily, because no toolchain has been deemed "official" and each has significant versioning and capability issues. Windows is not a special case; we could try to add support for the Intel and IBM compilers or gcc on macOS, but Xcode/clang is what we choose to promote because it works and it's generally more comfortable (in my limited experience) to develop with that toolchain on macOS. If we could commit to one toolchain and advertise that, I think that would help with future contributions, and it would make me much more confident that I'm spending my time fruitfully. |
this sounds very reasonable. Thanks a lot! |
i will be looking over this today. |
ping @snappizz |
🎉 thanks again to everyone for their help with this |
thanks!! |
Fixes #2441 and fixes #2444.
Issue
The Windows NT filesystem encodes paths using UTF-16. POSIX systems use UTF-8. SCIDE/SCDoc encodes all strings as UTF-8 internally. This causes a problem where scide miscodes strings passed by sclang and fails to find files when a path contains a codepoint on which these encodings differ (specifically, any character that is not in the 7-bit ASCII set).
Proposed fix
Use the UTF-8 encoding for paths whenever they are passed to sclang, and use the native encoding whenever passing to the OS filesystem. This eliminates the encoding-agnostic behavior of sclang and also eliminates the encoding barrier between scide and sclang.
This fix also performs a couple other functions:
boost::filesystem
namespace functions where possible. Usingrecursive_directory_iterator
has led to an observed speedup in class library compilation of about 1.5x-2x.SC_DirUtils
translation unit with a new one,SC_Filesystem
, that performs similar functions usingboost::filesystem::path
instead ofchar*
. I changed the name because it's now a singleton class with state rather than a collection of utilities.SC_LanguageConfig
to useboost::filesystem::path
as its path representation type (instead ofstd::string
).Further reading:
http://utf8everywhere.org/
Testing tools:
Full test script to exercise all changed routines and components:
https://gist.github.com/brianlheim/ddfbf90c2bd8e178677d93f680c0f062