Marcos Castilho

Ruby File IO Primer - Part 3 - The Standard Library

June 25, 2011

Following our series on Ruby File IO, we will now delve on the two Standard Library components that complements the core ruby IO API, the FileUtils module and the Pathname class.

The FileUtils Module

The FileUtils module brings an interesting approach to file manipulation, by emulating a ton of file related Unix commands, and its most commonly used flags. The effects of commands like rm -rf and ln -s are done by calling the FileUtils.rb_rf and FileUtils.ln_s methods.

Since they follow a syntax known by most ruby developers, the FileUtils methods are very straightforward to understand, making a list of FileUtils method calls really feels like a bash session. The following sample illustrates that similarity.

  require "fileutils"

  FileUtils.touch(["some_file.rb", "another_file.rb"])
  FileUtils.mkdir("code")
  FileUtils.mv(["another_file.rb", "../other_file.rb"], "code")

  Dir["code/*"]
  >> ["code/some_file.rb", "code/another_file.rb"]

  FileUtils.cp_r("code", "bkp")
  FileUtils.rm_r("code") 

  Dir["code/*"]
  >> []
  Dir["bkp/*"]
  >> ["bkp/some_file.rb", "bkp/another_file.rb"]

Like Unix commands, most FileUtils methods knows how to deal with multiple files by receiving arrays as parameters,like the FileUtils.cp method. They also accepts flags to change its behavior.

  require "fileutils"

  FileUtils.rm("a_file.rb") # removes this file
  FileUtils.rm(Dir["bkp_*"]) # remove all files that start with bkp
  FileUtils.rm(Dir["bkp_*"], :verbose => true) #print the equivalent stmt and remove the bkp files 

The Pathname Class

The Pathname class represents a pathname, the location of a file on the underlying filesystem, and provides facilities for querying and manipulating filepath data.

Although not universally useful as the FileUtils module, the Pathname class can bring in a lot of niceties when you need to do heavy filesystem transversing, and it is used on gems like Sprockets and Carrierwave.

You create a Pathname object by passing a string with the filesystem path to the class constructor. You can also get Pathnames by joining two paths with Pathname.join or Pathname.+. Also, most Pathname methods return Pathname objects.

  require "pathname"

  path = Pathname.new("/home/marcos/projects")
  >> Pathname:"/home/marcos/projects"

  other = path.join("bragi")
  >> Pathname:"/home/marcos/projects/bragi"

A Pathname object can provide a lot of information about its underlying filepath, like its parent path, dirname, whether it is a absolute or relative path, whether it represents a file or not, and so on.

  require "pathname"

  path = Pathname.new("/home/marcos/projects")
  path.dirname   # Pathname: "/home/marcos/"
  path.basename  # Pathname: "/home/marcos/projects"
  path.parent    # Pathname: "/home/marcos/"
  path.file?     # false
  path.absolute? # true
  path.relative? # false

Pathname.ascend iterates and yields a Pathname object for each element in a given path on an ascending order. Pathname.children returns an array with all the siblings of a given path, and Pathname.each_child iterates over then.

  require "pathname"

  path = Pathname.new("/home/marcos/projects")
  path.ascend { |x| puts x }
  >> Pathname: "/home/marcos/projects"
  >> Pathname: "/home/marcos/"
  >> Pathname: "/home/"
  >> Pathname: "/"

  path.each_child { |x| puts x }
  >> Pathname: "/home/marcos/projects/bragi"
  >> Pathname: "/home/marcos/projects/mimir"
  >> Pathname: "/home/marcos/projects/guard-clone"

Pathname also offers facades for many methods found on the File and Dir classes, allowing you to write cleaner code in some cases.

One gotcha is that most Pathname methods, including its constructor, are just wrappers around string manipulations, and will happily accept any string you throw at it, but some methods, like Pathname.children and Pathname.realpath access the filesystem and will raise errors if the Pathname object does not represent an actual filesystem path.