Path Exploration

A short stroll along filesystem::path.

The “experimental” Filesystem TS has been with us for a few years living in the std::experiment namespace. With C++17 it will finally be merged into std.

This short post is not intended to be a full introduction to filesystem nor does it attempt to be exhaustive. It only examines some important methods of std::filesystem::path.
I write this mostly as a future quick reference for myself, with the hope that it may be helpful to others as well.

Caveats
I only tested this on Visual Studio “15”. Hence the results may be Windows/MSVC oriented. The versions of gcc and clang that I tried did not include the <experimental/filesystem> implementation headers.

Path Pebbles

std::filesystem::path provides several methods for decomposing a path into tokens.

Objects of type path represent paths on a filesystem. Only syntactic aspects of paths are handled: the pathname may represent a non-existing path or even one that is not allowed to exist on the current file system or OS.

Given a path created from the string C:\folder\number\1\abc.123.txt, it can be decomposed as follows:
A few things to note:

  • ext() returns the last token following the last .;
  • ext() includes the last .;
  • stem() returns everything from the last folder separator (e.g. \) up to the last .
  • stem() does not include either the \ nor the .
  • parent_path() does not include the last folder separator.

std::filesystem::path also provides several methods for extracting root-info of absolute paths:

Interestingly, if our initial path is set to C:\folder\number\1\abc.123.txt\ (with a trailing \) the path is assumed to be a folder name and the results are different: In this case:

  • filename() is set to ., the special “file” name designating the folder itself;
  • ext() is empty;
  • stem() is set to .

This might seem inconsistent since suddenly stem() includes the . and ext() doesn’t, but in this case the dot is the special folder file-name, not a character within the file-name designating the beginning of the file extension.
This approach also (trivially) maintains the invariant that filename() == stem() + ext() (this is actually pseudo-code since there is no path::operator+() - see below).

Building Paths

We can build up paths with folder separators using the (IMO cleverly named) / operator:

namespace fs = std::experimental::filesystem;

fs::path p("folder");
cout << p / "foo.txt" << '\n';
p /= "bar.txt";
cout << p << '\n';             

Which gives:

folder\foo.txt
folder\bar.txt	

We can also concatenate paths without the folder separator using the += operator:

fs::path p("folder");
p += "_foo";
cout << p << '\n';             
p += "/bar.txt"; // string includes a /
cout << p << '\n';   

Which gives:

folder_foo
folder_foo/bar.txt

If the concatenated string contains folder separators they will be treated properly:

cout << p               << '\n';         
cout << p.filename()    << '\n';         
cout << p.parent_path() << '\n';      

Produces:

folder_foo/bar.txt
bar.txt
folder_foo

as expected.

Strangely, unlike the operator/, there is no non-mutating operator+.
This might be due to the fact that paths are convertible to strings which do support such an operator.

More Quirks

Single char literals

At least on the implementation I tested, all the path API calls that accept only other path arguments work with strings, but do not work with single character literals:

fs::path p;

// p /= 'x';  // ERROR
// p /  'x';  // ERROR
// p != '.';  // ERROR
p += 'x';     // OK! operator+=() accepts string types too

p /= "x";
p /  "x";
p != ".";
p += "x";

remove_filename()

As seen in the figure above, although filename() does not include the leading \, remove_filename() does remove it:

fs::path path("folder/foo.bar");  
cout << path << '\n';            // folder\foo.bar
path.remove_filename();
cout << path << '\n';            // folder

replace_extension()

We had seen that ext() includes a leading .. Consistently, when using replace_extension() make sure to include the . in the new string when you intend to keep the postfix as an extension.

fs::path p = "foo/bar.text";
cout << p << '\n';            // foo\bar.text
p.replace_extension(".txt");
cout << p << '\n';            // foo\bar.txt  

Iteration

When iterating over the elements of a path:

for (auto e : path)
    cout << '[' << e << ']';
cout << '\n';

Remember the following sequence (quoting):

  1. root-name (if any)
  2. root-directory (if any)
  3. sequence of file-names, omitting any directory separators
  4. If there is a directory separator after the last file-name in the path, the last element before the end iterator is a fictitious dot file name.

So, C:\folder\number\1\abc.123.txt gives: [C:][\][folder][number][1][abc.123.txt].

C:\folder\number\1\abc.123.txt\ gives: [C:][\][folder][number][1][abc.123.txt][.].
Note the last “fictitious” dot that does not appear in the original string.

If you found any errors or misconceptions here, do let me know in the comments, Twitter or Reddit.


comments powered by Disqus