Skip to content

Add os::split_paths #14544

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Add os::split_paths #14544

wants to merge 1 commit into from

Conversation

aturon
Copy link
Member

@aturon aturon commented May 30, 2014

Adds a platform-specific function, split_paths to the os module. This
function can be used to parse PATH-like environment variables according to
local platform conventions.

Closes #14352.

#[cfg(unix)]
/// Parse the `path` environment variable according to the platform's conventions.
pub fn path_env() -> Vec<Path> {
let paths = getenv("PATH").unwrap_or("".to_owned());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_PATH_DEFPATH in <paths.h> defines the default path to use when $PATH is unset to be /usr/bin:bin. Perhaps this should follow suite?

If you don't want to provide a default then I think Option<Vec<Path>> should be considered.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea; I'll change the default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Option may be the right approach here because what's the default to use for Windows? I have no idea.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to that, it changes based on the Windows version. Vista has the path in that answer, one comment confirms that the PowerShell path was added in Windows 7, and it's plausible that Windows 8 added something else.

But that's actually beside the point. That's the path that %PATH% defaults to, which isn't what we want here. My /usr/bin:bin suggestion was based on the PATH that execvp() uses if $PATH is unset. That's not actually the path that's used by default on a fresh system (that usually includes at least /usr/local/bin).

@lilyball
Copy link
Contributor

It might be better to rework this to have a function that parses a PATH-formatted input, and a wrapper function that calls it with getenv("PATH"). Besides providing PATH-parsing functionality for anything that needs to use it on values other than $PATH, it would also allow you to test it without changing the environment variables of the test runner.

@aturon
Copy link
Member Author

aturon commented May 31, 2014

@kballard Thanks for the feedback on this. I've switched to use an Option type, as you suggested, and changed the tests to reset the PATH variable when complete.

To address your other concerns:

  • I agree that it's troubling to add parsing for Windows paths without a clear spec, but I tried fairly hard to track one down and failed, so I think it's worth taking a best-effort approach here. See windows: PATH=";.;" causes uv_spawn to infinitely loop joyent/libuv#909 for details on what libuv does, if you're curious.
  • I had considered breaking out the parsing as a separate API, but it doesn't seem worth adding to the surface area of OS for parsing $PATH-formatted data that's not actually coming from $PATH, which I suspect would be a relatively rare use-case.

@lilyball
Copy link
Contributor

changed the tests to reset the PATH variable when complete

That's only going to work if the tests succeed. If a test fails, it won't reset the PATH.

I think you should break it out regardless. You could leave it as a private function for now, which the tests could use, and then we could consider making it public if someone has a compelling use-case.


for s in oldpath.iter() {
setenv("PATH", s.as_slice())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a good style for dealing with Option. It makes oldpath look like it's a sequence. It also won't un-set PATH if it wasn't set in the first place.

match oldpath {
    Some(p) => setenv("PATH", p.as_slice())
    None => unsetenv("PATH")
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I was trying to follow the style of the existing tests, e.g. homedir...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can consider those tests to be written in a bad style, then. They should be updated (though that's out of scope of this PR).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll clean those tests up in a separate PR once we settle on the right approach to dealing with environment variables in these. (I do agree regarding the style, and especially about missing the unset case.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kballard, think you're a bit forceful saying that this isn't good style for Option. I have seen no consensus of this, I know some people don't like it, but others do. (In any case, I agree that this specific use of Option.iter is poor.)

@aturon
Copy link
Member Author

aturon commented May 31, 2014

I can break out the parse function privately, as you suggest, but presumably we'll want some test to ensure that the public API is actually reading from the $PATH variable. As you point out, this will fail to reset $PATH on failure, but this is already the case with other tests in the module, like homedir.

@@ -1754,5 +1817,78 @@ mod tests {
fs::unlink(&path).unwrap();
}

#[test]
#[cfg(windows)]
fn path_env_windows() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to put these tests in a run-pass test instead of in the standard library. At least on windows the value of PATH affects how dynamic libraries are opened up, so this test executing at the same time as those tests may cause badness to arise.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do that; what do you think of @kballard's suggestion to instead break out the parsing into a private function that does not actually read $PATH, and test that instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(My only worry is: this fails to test that the public API successfully reads $PATH.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way is fine by me

@lilyball
Copy link
Contributor

but presumably we'll want some test to ensure that the public API is actually reading from the $PATH variable

You can write such a test and have it reset PATH correctly even on failure:

#[test]
fn test_read_from_path() {
    let oldpath = getenv("PATH");
    setenv("PATH", "foo:bar:baz");
    let result = path_env();
    match oldpath {
        Some(p) => setenv("PATH", p.as_slice()),
        None => unsetenv("PATH")
    }
    assert_eq!(result, vec!["foo".to_string(), "bar".to_string(), "baz".to_string()]);
}

If you use a PATH value that you already tested the parsing function with then you should be reasonably certain that you won't get task failure either during the parsing.

@aturon
Copy link
Member Author

aturon commented May 31, 2014

@kballard @alexcrichton thanks both for the feedback, will adjust, repush, and ping.

@lilyball
Copy link
Contributor

After considering @alexcrichton's point about PATH affecting dylib tests, I think the right course is to split into two functions, test the inner one (that does the parsing) with #[test], and test the outer one (that reads PATH and calls the inner one) in a run-pass. This approach should probably be used for any function that interprets environment variables in some fashion, so that way the env vars are never mutated by the tests. A single run-pass test can use to validate that all of them read the environment variable.

@aturon
Copy link
Member Author

aturon commented May 31, 2014

@kballard Agreed, that's what I'm doing. Once we're happy with this patch, I can clean up the other tests in this module to follow suit.

/// conventions, dropping empty paths.
/// Returns `None` if `PATH` is not set.
pub fn path_env() -> Option<Vec<Path>> {
getenv("PATH").map(|unparsed| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should probably use getenv_as_bytes, since it's possible to have non-UTF8 paths.

(Also, btw, using just .split(':') will be faster for strings.)

@liigo
Copy link
Contributor

liigo commented May 31, 2014

I would like path_env() to handle "CLASS_PATH" and many more similar
path-related envs, instead of limited "PATH" only. Please design a more
user-friendly API. Thank you.

@aturon
Copy link
Member Author

aturon commented May 31, 2014

@liigo Something like this was also suggested by @kballard. My worry is that the parsing rules for PATH on Windows may not apply to other environment variables, but I suppose there's no harm as long as we carefully warn about this.

@aturon
Copy link
Member Author

aturon commented May 31, 2014

I've pushed a new version which exposes the API as split_paths, a generic parsing function that can take any BytesContainer data, to be non-utf8 aware and workable with arbitrary PATH-like data.

@aturon aturon changed the title Add os::path_env Add os::split_paths May 31, 2014
@lilyball
Copy link
Contributor

LGTM, but the test functions need to be renamed.

@liigo
Copy link
Contributor

liigo commented May 31, 2014

Good work, I like the new split_paths().

@aturon
Copy link
Member Author

aturon commented May 31, 2014

Pushed new version with renamed test functions, and more precise doc comments not using "$PATH".

["c:\\", "c:\\Program Files\\"]));
assert!(check_parse("c:\\;c:\\\"foo\"\\", ["c:\\", "c:\\foo\\"]));
assert!(check_parse("c:\\;c:\\\"foo;bar\"\\;c:\\baz",
["c:\\", "c:\\foo;bar\\", "c:\\baz"]));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these paths would be easier to read using raw strings.

assert!(check_parse(r#"c:\;c:\"foo;bar"\;c:\baz"#, [r"c:\", r"c:\foo;bar\", r"c:\baz"]));

@lilyball
Copy link
Contributor

r=me with the one comment about the windows test paths.

Adds a platform-specific function, `split_paths` to the `os` module. This
function can be used to parse PATH-like environment variables according to
local platform conventions.

Closes rust-lang#14352.
@aturon
Copy link
Member Author

aturon commented May 31, 2014

Changed Windows test paths to raw string literals.

bors added a commit that referenced this pull request May 31, 2014
Adds a platform-specific function, `split_paths` to the `os` module. This
function can be used to parse PATH-like environment variables according to
local platform conventions.

Closes #14352.
@bors bors closed this May 31, 2014
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

os: add get_path_env function
6 participants