module lang::paths::Windows
Defines the syntax of file system and network drive paths on DOS and Windows systems.
Usage
import lang::paths::Windows;
Dependencies
import IO;
import util::SystemAPI;
import ParseTree;
Description
This syntax definition of Windows file paths and file names formalizes open-source parsers written by hand in Java, C++ and C#. Those parsers handle the Windows syntax for file and directory names, as well as shares on local networks (UNC notation). The definition also draws on openly available documentation for Windows and the .NET platform, which was used for confirmation and for test examples.
The main function of this module, parseWindowsPath:
- faithfully maps any syntactically correct Windows path to a syntactically correct loc value.
- throws a ParseError if the path does not comply. Typically, file names ending in spaces do not comply.
- ensures that if the file exists on system A, then the loc representation resolves to the same file on system A via any IO function.
- and nothing more. No normalization, no interpretation of "." and "..", no changing of case. This is left to downstream processors of loc values, if necessary. The current transformation is purely syntactic and tries to preserve the semantics of the path as much as possible.
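The ParseError behavior described above can be probed with a small wrapper. This is a minimal sketch: isValidWindowsPath is a hypothetical helper, not part of this module, and it assumes that the ParseError constructor is made visible by importing the standard Exception module.

```rascal
import lang::paths::Windows;
import Exception; // assumed to provide the ParseError constructor

// Hypothetical helper, not part of lang::paths::Windows:
// reports whether the input is accepted by the WindowsPath grammar.
bool isValidWindowsPath(str input) {
    try {
        parseWindowsPath(input); // the resulting loc is discarded
        return true;
    }
    catch ParseError(_): {
        return false;
    }
}
```

Under these assumptions, a call such as isValidWindowsPath("C:\\Temp\\notes.txt ") (note the trailing space) would be expected to return false, while the same path without the trailing space would be accepted.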
Pitfalls
- Length limitations are not implemented by this parser. This means that overly long names will lead to IO exceptions when they are finally used.
- The names of drives, files and devices are mapped as-is, without normalization. This means that the resulting loc value may not be a canonical representation of the identified resource. Normalization of loc values is left to a different function, TBD.
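The second pitfall can be observed by printing a few variants that typically identify the same directory on a Windows machine. This is only an illustrative sketch: showNoNormalization is a hypothetical name, and the exact printed URIs are deliberately not stated here.

```rascal
import lang::paths::Windows;
import IO;

// Hypothetical demonstration, not part of lang::paths::Windows.
void showNoNormalization() {
    // Windows file systems are typically case-insensitive, yet the
    // case of each name is carried over into the loc as written.
    println(parseWindowsPath("C:\\Windows\\System32"));
    println(parseWindowsPath("c:\\WINDOWS\\system32"));

    // "." and ".." segments are parsed but not interpreted by this module.
    println(parseWindowsPath("C:\\Windows\\..\\Windows\\.\\System32"));
}
```

Code that needs to compare locations for identity should therefore not rely on equality of the loc values produced here.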
syntax WindowsPath
lexical WindowsPath
= unc : Slash Slash Slashes? PathChar* \ "." Slashes PathChar* Slashes WindowsFilePath
| uncDOSDrive : Slash Slash Slashes? DOSDevice Slashes Drive ":" OptionalWindowsFilePath
| uncDOSPath : Slash Slash Slashes? DOSDevice Slashes PathChar* Slashes WindowsFilePath
| absolute : Drive ":" Slashes WindowsFilePath
| driveRelative : Drive ":" WindowsFilePath
| directoryRelative: Slash WindowsFilePath
| relative : WindowsFilePath
;
syntax OptionalWindowsFilePath
lexical OptionalWindowsFilePath
= ()
| Slashes WindowsFilePath
;
syntax DOSDevice
lexical DOSDevice = [.?];
syntax PathChar
lexical PathChar = !([\a00-\a20\< \> : \" | ? * \\ /] - [\ ]);
syntax PathSegment
lexical PathSegment
= current: "."
| parent : ".."
| name : PathChar+ \ ".." \ "."
;
syntax Drive
lexical Drive = [A-Za-z];
syntax Slashes
lexical Slashes = Slash+ !>> [\\/];
syntax Slash
lexical Slash = [\\/];
syntax WindowsFilePath
lexical WindowsFilePath = {PathSegment Slashes}* segments Slashes? [\ .] !<< ();
function parseWindowsPath
Convert a Windows path literal to a source location URI.
loc parseWindowsPath(str input, loc src=|unknown:///|)
- parses the path using the grammar for WindowsPath
- takes the literal name components using string interpolation "<segment>". This means no decoding/encoding happens at all while extracting hostname, share name and path segment names. Also all superfluous path separators are skipped.
- uses loc + str path concatenation with its builtin character encoding to construct the URI. Also the right path separators are introduced.
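The effect of skipping superfluous separators can be sketched as a comparison of three spellings of one path. The first two inputs are taken from the tests of this module; the third, with repeated separators inside the path, is an assumption based on the Slashes rule of the grammar. The function name separatorsAreSkipped is hypothetical.

```rascal
import lang::paths::Windows;

// Hypothetical check, not part of lang::paths::Windows.
bool separatorsAreSkipped() {
    loc plain   = parseWindowsPath("C:\\Program Files\\Rascal");
    loc mixed   = parseWindowsPath("C:\\Program Files/Rascal");
    // repeated separators after the drive and between segments
    loc doubled = parseWindowsPath("C:\\\\Program Files\\\\\\Rascal");
    return plain == mixed && plain == doubled;
}
```

If the separators are handled as described, all three calls should yield the same loc, with the space in "Program Files" encoded by the loc + str concatenation rather than by the parser.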
function mapPathToLoc
UNC.
loc mapPathToLoc((WindowsPath) `<Slash _><Slash _><Slashes? _><PathChar* hostName><Slashes _><PathChar* shareName><Slashes _><WindowsFilePath path>`)
function mapPathToLoc
DOS UNC Device Drive.
loc mapPathToLoc((WindowsPath) `<Slash _><Slash _><Slashes? _><DOSDevice dq><Slashes _><Drive drive>:<OptionalWindowsFilePath path>`)
function mapPathToLoc
DOS UNC Device Path.
loc mapPathToLoc((WindowsPath) `<Slash _><Slash _><Slashes? _><DOSDevice dq><Slashes _><PathChar* deviceName><Slashes _><WindowsFilePath path>`)
function deviceIndicator
str deviceIndicator((DOSDevice) `?`)
str deviceIndicator((DOSDevice) `.`)
function mapPathToLoc
DOS UNCPath.
loc mapPathToLoc((WindowsPath) `<Slash _><Slash _><Slashes? _>?<Slashes _><PathChar* shareName><Slashes _><WindowsFilePath path>`)
function mapPathToLoc
Absolute: given the drive and relative to its root.
loc mapPathToLoc((WindowsPath) `<Drive drive>:<Slashes _><WindowsFilePath path>`)
function mapPathToLoc
Drive relative: relative to the current working directory on the given drive.
loc mapPathToLoc((WindowsPath) `<Drive drive>:<WindowsFilePath path>`)
function mapPathToLoc
Directory relative: relative to the root of the current drive.
loc mapPathToLoc((WindowsPath) `<Slash _><WindowsFilePath path>`)
function mapPathToLoc
Relative to the current working directory on the current drive.
loc mapPathToLoc((WindowsPath) `<WindowsFilePath path>`)
function appendPath
loc appendPath(loc root, WindowsFilePath path)
loc appendPath(loc root, (OptionalWindowsFilePath) ``)
loc appendPath(loc root, (OptionalWindowsFilePath) `<Slashes _><WindowsFilePath path>`)
Tests
test uncSharePath
test bool uncSharePath()
= parseWindowsPath("\\\\Server2\\Share\\Test\\Foo.txt")
== |unc://Server2/Share/Test/Foo.txt|;
test uncDrivePath
test bool uncDrivePath()
= parseWindowsPath("\\\\system07\\C$\\")
== |unc://system07/C$|;
test uncDOSDevicePathLocalFileQuestion
test bool uncDOSDevicePathLocalFileQuestion() {
loc l = parseWindowsPath("\\\\?\\c:\\windows\\system32\\cmd.exe");
if (IS_WINDOWS) {
assert exists(l);
}
return l == |unc://%3F/c:/windows/system32/cmd.exe|;
}
test uncDOSDevicePathLocalFileDot
test bool uncDOSDevicePathLocalFileDot() {
loc l = parseWindowsPath("\\\\.\\C:\\Test\\Foo.txt");
return l == |unc://./C:/Test/Foo.txt|;
}
test uncDOSDeviceUNCSharePath
test bool uncDOSDeviceUNCSharePath() {
// the entire UNC namespace is looped back into the DOS Device UNC encoding via
// the reserved name "UNC":
loc m1 = parseWindowsPath("\\\\?\\UNC\\Server\\Share\\Test\\Foo.txt");
loc m2 = parseWindowsPath("\\\\.\\UNC\\Server\\Share\\Test\\Foo.txt");
return m1 == |unc://%3F/UNC/Server/Share/Test/Foo.txt|
&& m2 == |unc://./UNC/Server/Share/Test/Foo.txt|;
}
test uncDOSDeviceVolumeGUIDReference
test bool uncDOSDeviceVolumeGUIDReference() {
loc l = parseWindowsPath("\\\\.\\Volume{b75e2c83-0000-0000-0000-602f00000000}\\Test\\Foo.txt");
return l == |unc://./Volume%7Bb75e2c83-0000-0000-0000-602f00000000%7D/Test/Foo.txt|;
}
test uncDOSDeviceBootPartition
test bool uncDOSDeviceBootPartition() {
loc l = parseWindowsPath("\\\\.\\BootPartition\\");
return l == |unc://./BootPartition|;
}
test simpleDrivePathC
test bool simpleDrivePathC()
= parseWindowsPath("C:\\Program Files\\Rascal")
== |file:///C:/Program%20Files/Rascal|;
test mixedSlashesDrivePathC
test bool mixedSlashesDrivePathC()
= parseWindowsPath("C:\\Program Files/Rascal")
== |file:///C:/Program%20Files/Rascal|;
test trailingSlashesDrivePathC
test bool trailingSlashesDrivePathC()
= parseWindowsPath("C:\\Program Files\\Rascal\\\\")
== |file:///C:/Program%20Files/Rascal|;
test simpleDrivePathD
test bool simpleDrivePathD()
= parseWindowsPath("D:\\Program Files\\Rascal")
== |file:///D:/Program%20Files/Rascal|;
test uncNetworkShareOk
test bool uncNetworkShareOk() {
loc l = parseWindowsPath("\\\\localhost\\ADMIN$\\System32\\cmd.exe");
if (IS_WINDOWS) {
return exists(l);
}
else {
return |unc://localhost/ADMIN$/System32/cmd.exe| == l;
}
}