Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix volume letter parsing in Windows Paths that contain forward slashes. #885

Closed
wants to merge 1 commit into from

Conversation

ajalt
Copy link
Contributor

@ajalt ajalt commented Jan 12, 2021

#882 introduced a regression in Windows Path parsing. Windows allows forward slashes in path names, replacing them with backslashes as part of normalization (Here it's mentioned in the Win32 docs for the lpFileName parameter, and here it's mentioned for .NET). C:/foo and C:/foo\bar are both valid Windows paths.

This PR removes the requirement for a backslash to be present for volume letters to be parsed.

Note that this doesn't fix the case of UNC paths that use forward slashes. A test like the following still fails:

@Test  fun windowsUncAbsolutePathForwardSlashes() {
  val path = "//server/project/notes.txt".toPath()
  assertEquals("//server/project/notes.txt", path.toString())
  assertEquals("//server/project".toPath(), path.parent)
  assertNull(path.volumeLetter)
  assertEquals("notes.txt", path.name)
  assertTrue(path.isAbsolute)
  assertFalse(path.isRoot)
}

Since //foo is valid as both a UNC path and a POSIX path, I don't believe paths like that can be parsed unambiguously without the notion of a system file separator or similar.

This PR removes the requirement for a backslash to be present for volume letters to be parsed.

Note that this doesn't fix the case of UNC paths that use forward slashes. A test like the following still fails:

```kotlin
@test  fun windowsUncAbsolutePathForwardSlashes() {
  val path = "//server/project/notes.txt".toPath()
  assertEquals("//server/project/notes.txt", path.toString())
  assertEquals("//server/project".toPath(), path.parent)
  assertNull(path.volumeLetter)
  assertEquals("notes.txt", path.name)
  assertTrue(path.isAbsolute)
  assertFalse(path.isRoot)
}
```

Since `//foo` is valid as both a UNC path and a POSIX path, I don't believe paths like that can be parsed unambiguously without the notion of a system file separator or similar.

[1]: https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilea#parameters
[2]: https://docs.microsoft.com/en-us/dotnet/standard/io/file-path-formats#canonicalize-separators
@Test
fun windowsAbsolutePathWithVolumeLetterMixedSlashesSlashFirst() {
val path = "C:/Windows\\notepad.exe".toPath()
assertEquals("C:/Windows/notepad.exe", path.toString())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Paths separators are currently normalized based on the first slash in the path. This isn't necessarily wrong, but it is different from how JVM and most other systems behave.

@swankjesse
Copy link
Collaborator

It was my (misguided?) intention to use the path character to imply which rules the caller wanted to follow.

Consider:

cd /tmp
mkdir -p C:/Windows
cd C:/Windows

On a UNIX system, C: is not special, and so /tmp/C:/Windows is a valid path.

I was thinking we’d look at the slash character to determine whether C: is special.

@ajalt
Copy link
Contributor Author

ajalt commented Jan 12, 2021

If the library uses the presence of a forward slash to dictate parsing behavior, I think that would make some common use cases more difficult. For example, if I wanted to take a path from user input and call volumeLetter on it, I would have to manually check the current operating system and replace forward slashes with backslashes on windows before calling toPath. If someone doesn't know about that behavior and doesn't pre-normalize the path strings, it would be a source of bugs.

@swankjesse
Copy link
Collaborator

Good point!

I guess it comes to this decision:

  • should Path behave the same regardless of the host OS, or
  • should Path behave the same as the host OS

@ajalt
Copy link
Contributor Author

ajalt commented Jan 12, 2021

Option 2 (behavior based on the current OS) is the choice taken by all the APIs I can think of off the top of my head. I expect most developers would be surprised by option 1. Option 1 would also make it difficult to write e.g. a cross-platform CLI app for the reasons I mentioned above.

My personal vote would be to base the behavior on the current OS, and possibly allow forcing a certain OS style when configuring the FileSystem.

@swankjesse
Copy link
Collaborator

We discussed offline! Thank you!

Decisions:

  • path resolution rules to follow the host OS, not the slash
  • make FakeFileSystem follow whatever OS its emulating (“emulateWindows()”)
  • add an API to support this: "C:".toPath().resolve("\\tmp", windows=true). FakeFileSystem will use this for the above. Anybody else could use if they’re trying to be portable and don’t care what the host OS is.

@ajalt ajalt closed this Jan 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants