The elements of Rust error handling

Rust's error handling using algebraic types, i.e. Result, is a win for the language, especially when compared to clumsier mechanisms like exceptions (Python, C++, JavaScript) or simpler returned types (C, Go). The one thing that sold me on Rust initially was the fact that my first non-trivial program ran correctly immediately after compilation, even though it contained three nested state machines and I had been going crazy trying to make it robust in Python before.

There is a lot of material on basic error handling, but it rarely goes further, to where things may be a little less obvious. I have picked up a few bits of wisdom here and there, which I think are worth repeating:

Use `thiserror`

There was a time that featured a competition between different error handling crates, but I am of the firm opinion that that contest has concluded: thiserror is the most common denominator these days¹, for good reason. The crate offers the most bang for your dependency buck and does its one job very well, namely obviating the need to write repetitive boilerplate code for error types. #[derive(Error)] is all the magic an error type usually needs.
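To make the saved boilerplate concrete, here is roughly what has to be written by hand for a single-variant error when the derive is not used. This is a std-only sketch, not the code thiserror actually generates; the ConfigError type and the parse_port helper are made-up names for illustration.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error type, used only for illustration.
#[derive(Debug)]
enum ConfigError {
    InvalidPort(ParseIntError),
}

// With thiserror, an `#[error("invalid port number")]` attribute on the
// variant would take the place of this impl.
impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::InvalidPort(_) => f.write_str("invalid port number"),
        }
    }
}

// With thiserror, a `#[source]` attribute on the field would take the
// place of this impl.
impl Error for ConfigError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ConfigError::InvalidPort(err) => Some(err),
        }
    }
}

// Hypothetical helper that produces the error above.
fn parse_port(raw: &str) -> Result<u16, ConfigError> {
    raw.trim().parse::<u16>().map_err(ConfigError::InvalidPort)
}
```

Every additional variant adds another arm to both impls, which is exactly the repetitive part the derive takes over.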

Avoid `From`

One thing I am not a fan of, due to overuse, is the From trait: it tends to obscure things and often sports a terrible trade-off between convenience and maintainability. Unfortunately, thiserror makes it easy to go overboard:

#[derive(Debug, Error)]
enum SaveSettingsError {
    #[error("could not serialize settings")]
    SerializationFailure(#[from] serde_json::Error),
    // Not great: Generic `Io` error variant.
    #[error("could not write settings")]
    Io(#[from] io::Error),
}

While this code looks innocuous, looking at the following example function shows the problem:

fn save_settings(settings: &Settings) -> Result<(), SaveSettingsError> {
    let serialized = serde_json::to_string_pretty(settings)?;
    let cfg_dir = PathBuf::from("myapp/settings");
    if !cfg_dir.exists() {
        // This will use `From<io::Error>`.
        fs::create_dir_all(&cfg_dir)?;
    }

    // So will this.
    fs::write(cfg_dir.join("settings.json"), serialized)?;

    Ok(())
}

If there is an issue creating the directory because the parent directory is not writable, the error returned will just be labelled Io:

Io(Os { code: 13, kind: PermissionDenied, message: "Permission denied" })

However, if instead the file itself is marked read only, the exact same error will be returned. This sort of issue tends to get worse as functions get more high level and contain multiple std::io calls that all get collapsed into a single error variant.

The answer to this problem is to not use From<io::Error>, but instead use roughly one error variant per call, and replace #[from] with #[source]²:

#[derive(Debug, Error)]
enum SaveSettingsError {
    #[error("could not serialize settings")]
    SerializationFailure(#[from] serde_json::Error),
    #[error("could not create configuration directory")]
    FailedToCreateConfigDir(#[source] io::Error),
    #[error("could not write settings file")]
    FailedToWriteSettingsFile(#[source] io::Error),
}

Now our errors will contain useful information about where the problem occurred:

FailedToWriteSettingsFile(Os { code: 13, kind: PermissionDenied, message: "Permission denied" })

The code will be a bit more involved, but since every variant of an enum is a constructor for that type we can rewrite the code using map_err:

fs::write(cfg_dir.join("settings.json"), serialized)
    .map_err(SaveSettingsError::FailedToWriteSettingsFile)?;

This is not a hard and fast rule; for example, I am tempted to leave in the SerializationFailure using a #[from] since there is only a single place where it can ever occur. But it is usually better to err on the side of fewer From<_> implementations, not more.

While std::io::Error is a great example because it is so ubiquitous, other potentially repeated error types benefit from breaking this pattern as well. In other words, do not tie the site where an error occurred to the error type without need.
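Putting the per-call variants and map_err together, a complete function might look like the sketch below. To keep it free of external crates, the error impls are written by hand instead of derived, the serialization step is left out, and cfg_dir is passed in as a parameter; all three are my simplifications, not part of the original example.

```rust
use std::error::Error;
use std::fmt;
use std::fs;
use std::io;
use std::path::Path;

#[derive(Debug)]
enum SaveSettingsError {
    FailedToCreateConfigDir(io::Error),
    FailedToWriteSettingsFile(io::Error),
}

impl fmt::Display for SaveSettingsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            Self::FailedToCreateConfigDir(_) => "could not create configuration directory",
            Self::FailedToWriteSettingsFile(_) => "could not write settings file",
        })
    }
}

impl Error for SaveSettingsError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::FailedToCreateConfigDir(err) | Self::FailedToWriteSettingsFile(err) => Some(err),
        }
    }
}

// `serialized` stands in for the output of the serialization step.
fn save_settings(cfg_dir: &Path, serialized: &str) -> Result<(), SaveSettingsError> {
    // There is no `From<io::Error>`, so each call site has to name its variant.
    fs::create_dir_all(cfg_dir).map_err(SaveSettingsError::FailedToCreateConfigDir)?;
    fs::write(cfg_dir.join("settings.json"), serialized)
        .map_err(SaveSettingsError::FailedToWriteSettingsFile)?;
    Ok(())
}
```

Forgetting a map_err is now a compile error rather than a silently mislabelled variant, which is the point of dropping the From impl.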

Do not print `source` inside `Display` impls

Every now and then I run across code like this:

#[derive(Debug, Error)]
enum SaveSettingsError {
    // Not great: Printing the source in the `Display` impl.
    #[error("could not serialize settings: {0}")]
    SerializationFailure(#[from] serde_json::Error),
    // ..
}

This essentially puts the “error message” (the std::fmt::Display implementation) from the inner error into the one from the outer error³. This is unnecessary in any place where errors are printed using Debug, as it typically includes all nested errors by default; thus all expect, unwrap, panic! or fn main() -> Result<_, _> call sites are already detailed.

This may seem like an issue when an error is printed using Display, as

if let Err(err) = save_settings(&settings) {
    println!("{}", err);
}

prints just

could not write settings file

which is not as helpful as it could be. The solution is to print using a function or wrapper that takes an error's source into account. Here’s an ad hoc example:

struct WithSource<E>(E);

impl<E: error::Error> Display for WithSource<E> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut cur: &dyn error::Error = &self.0;

        loop {
            Display::fmt(cur, f)?;
            if let Some(source) = cur.source() {
                f.write_str(": ")?;
                cur = source;
            } else {
                return Ok(());
            }
        }
    }
}

With this in place, println!("{}", WithSource(err)) produces the desired output. The formatting is left to the formatter - others could colorize output, use multiple lines or render in a different context like a webpage using full-blown HTML.
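As one example of such an alternative formatter, the sketch below puts every error of the chain on its own line. The render_chain helper and the WriteFailed type are hypothetical names I made up for this illustration.

```rust
use std::error::Error;
use std::fmt;
use std::io;

// Renders `err` and all of its sources, one per line, indented by depth.
fn render_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut cur = err.source();
    let mut depth = 1;
    while let Some(source) = cur {
        out.push('\n');
        out.push_str(&"  ".repeat(depth));
        out.push_str("caused by: ");
        out.push_str(&source.to_string());
        cur = source.source();
        depth += 1;
    }
    out
}

// Hypothetical outer error wrapping an `io::Error` as its source.
#[derive(Debug)]
struct WriteFailed(io::Error);

impl fmt::Display for WriteFailed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The source is deliberately not printed here.
        f.write_str("could not write settings file")
    }
}

impl Error for WriteFailed {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.0)
    }
}
```

Because the error types only expose source(), the same chain can be rendered on one line, on several lines, or as HTML without touching them.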

On the other hand, if the recursion on source() is already baked into the Display implementation, a proper printer will give very long and potentially confusing output. A good rule to follow is to not print fields of errors in the Display impl if said field is marked as a #[source].
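The duplication is easy to reproduce. In the following std-only sketch, Noisy embeds its source in Display and also reports it via source(), so a chain-walking printer emits the inner message twice. The names chain_to_string, Inner and Noisy are hypothetical.

```rust
use std::error::Error;
use std::fmt;

// Joins an error and all of its sources with ": ", like `WithSource` above.
fn chain_to_string(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut cur = err.source();
    while let Some(source) = cur {
        out.push_str(": ");
        out.push_str(&source.to_string());
        cur = source.source();
    }
    out
}

#[derive(Debug)]
struct Inner;

impl fmt::Display for Inner {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("unexpected end of input")
    }
}

impl Error for Inner {}

// Anti-pattern: `Display` already embeds the message of the source.
#[derive(Debug)]
struct Noisy(Inner);

impl fmt::Display for Noisy {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "could not parse settings: {}", self.0)
    }
}

impl Error for Noisy {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.0)
    }
}
```

Here chain_to_string(&Noisy(Inner)) yields "could not parse settings: unexpected end of input: unexpected end of input", and each further level of nesting repeats the tail once more.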

Stringification as a last resort

It may be undesirable or extremely inconvenient to expose the inner error type in a higher level error enum, which sometimes leads to constructs like this:

#[derive(Debug, Error)]
enum SaveSettingsError {
    // Not great: A stringified error.
    #[error("could not serialize settings: {0}")]
    SerializationFailure(String),
    // ...
}

This is problematic as it puts the source, which is now not even marked as a source⁴, into the Display implementation, making it inaccessible using the Error::source method. If any of the inner errors is built without the source-inside-Display anti-pattern mentioned above, its details are now lost as well. Heap memory usage is also higher, as a full error message must be stored for every error.

This can be solved in a better way since Rust has essentially a “catch all” error type through dynamic dispatch: Box<dyn std::error::Error> (add + Send if you need to send it across threads):

#[error("could not serialize settings")]
SerializationFailure(#[source] Box<dyn std::error::Error + Send>),

let serialized = serde_json::to_string_pretty(settings)
    .map_err(|err| SaveSettingsError::SerializationFailure(Box::new(err)))?;

This alternative behaves just like a regular error, takes no more stack space than the string version (a boxed trait object is two pointers wide, a String three) and has a potentially much smaller heap allocation. It is also a good candidate for an #[error(transparent)] annotation if it is on an “other” error variant where no additional context is required.
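One practical consequence is that the concrete inner error remains reachable for callers who care about it. The following std-only sketch uses a hand-written wrapper named ParseFailure and a helper named parse_version, both made up for illustration, and recovers the original ParseIntError through source() and downcast_ref.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error that hides the concrete type of its source.
#[derive(Debug)]
struct ParseFailure(Box<dyn Error + Send>);

impl fmt::Display for ParseFailure {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("could not parse version")
    }
}

impl Error for ParseFailure {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        // Only the `Send` marker is dropped; the boxed error is unchanged.
        let source: &(dyn Error + 'static) = &*self.0;
        Some(source)
    }
}

// Hypothetical helper whose signature does not mention `ParseIntError`.
fn parse_version(raw: &str) -> Result<u32, ParseFailure> {
    raw.parse::<u32>()
        .map_err(|err| ParseFailure(Box::new(err)))
}

// A caller that knows what to look for can still get at the details.
fn inner_parse_error(err: &ParseFailure) -> Option<&ParseIntError> {
    err.source()?.downcast_ref::<ParseIntError>()
}
```

With the stringified variant, inner_parse_error could not be written at all, since only the rendered message survives.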

Another option that sometimes is available is anyhow:

`anyhow` for apps, sparingly in libraries

The anyhow crate can be seen as somewhat of a companion to thiserror or as a Box<dyn Error> on steroids. It allows using a single catch-all error type, namely anyhow::Error, that can be constructed using anything that implements std::error::Error.

The drawback is that it introduces a dependency on anyhow, which in many cases is undesirable. Typically one wants to avoid using anyhow in library code, although it can be handy during prototyping by deferring proper design of error types until later:

#[derive(Debug, Error)]
enum SaveSettingsError {
    // Do not use this in library code.
    #[error("serialization failed")]
    SerializationFailure(#[source] anyhow::Error),
    // ...
}

let serialized = serde_json::to_string_pretty(settings)
    .map_err(anyhow::Error::from)
    .map_err(SaveSettingsError::SerializationFailure)?;

It really shines in functions returning anyhow::Result though: since anyhow::Error implements From<E> for any E: std::error::Error + Send + Sync + 'static, it is essentially a catch-all dynamic error handler that can be annotated using Context::context:

fn save_settings(settings: &Settings) -> anyhow::Result<()> {
    let serialized = serde_json::to_string_pretty(settings)?;
    let cfg_dir = PathBuf::from("myapp/settings");
    if !cfg_dir.exists() {
        fs::create_dir_all(&cfg_dir).context("create config dir")?;
    }

    fs::write(cfg_dir.join("settings.json"), &serialized).context("write config file")?;

    Ok(())
}

Thus it is best used in high-level application functions:

`expect` at the top level

Unless you are developing under extreme memory constraints, unwrap can (and should) always be replaced by a more helpful expect. Usually it is preferable to just pass Results upwards though, especially in (but not limited to) library code.

Essentially this means avoiding error handling through panics. A trick to do so without getting slowed down initially is to use anyhow::Result liberally. fn main() can return a Result, thus writing a main function returning an anyhow::Result allows liberal use of the ?-operator:

fn main() -> anyhow::Result<()> {
    // ...
    save_settings(&settings)?;

    Ok(())
}

Of course, Box<dyn Error> can be substituted with only a marginal amount of additional boilerplate.
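For reference, here is a std-only sketch of the Box<dyn Error> route. The read_counter function is a hypothetical example; the point is that ? converts two unrelated error types into the boxed one without any map_err.

```rust
use std::error::Error;
use std::fs;
use std::path::Path;

// `?` converts both `io::Error` and `ParseIntError` into `Box<dyn Error>`
// through the standard library's `From` impl for boxed errors.
fn read_counter(path: &Path) -> Result<u64, Box<dyn Error>> {
    let raw = fs::read_to_string(path)?;
    let value = raw.trim().parse::<u64>()?;
    Ok(value)
}
```

A main function declared as fn main() -> Result<(), Box<dyn Error>> can call read_counter with ? in the same way. What is missing compared to anyhow is mostly the context method, which is where the extra boilerplate would go.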

Panics are okay, sometimes and if documented

While panics (panic!, unwrap, expect) should be avoided in production code, they do have a place even in library functions, typically when invariants that are not expressible through the type system are violated by the caller, as in this example⁵:

/// Subtracts two durations.
///
/// # Panics
///
/// Will panic if `end` is smaller than, i.e. before, `start`.
fn time_diff(start: Instant, end: Instant) -> Duration {
    if end < start {
        panic!("end must not be before start");
    }

    end - start
}

They should always be accompanied by a # Panics section in the documentation, as they are part of the function's interface. Reasons to use panics are inability to change the return type to include an error, not wanting to pollute the interface for a rare edge case that the caller should take care of, or inability to express a constraint using the type system.

These sorts of checks are often even better expressed as an assertion

assert!(end >= start, "end must be equal to or later than start");

or an expect

// `secs` is a placeholder for the caller-supplied f64 value.
Duration::try_from_secs_f64(secs).expect("function must be called with an in-range f64 value")

So when are panics not okay? When they are undocumented and/or used to handle an actual error that could be propagated upwards instead.
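When the failing condition is something the caller may reasonably run into, the non-panicking counterpart is often a one-liner. The sketch below uses the standard library's Instant::checked_duration_since and Duration::try_from_secs_f64; the two wrapper functions are hypothetical names chosen for this example.

```rust
use std::time::{Duration, Instant, TryFromFloatSecsError};

/// Returns the time elapsed between `start` and `end`, or `None` if `end`
/// is before `start`. Does not panic.
fn checked_time_diff(start: Instant, end: Instant) -> Option<Duration> {
    end.checked_duration_since(start)
}

/// Converts a number of seconds into a `Duration`, handing negative,
/// non-finite or overflowing input back to the caller as an error.
fn timeout_from_secs(secs: f64) -> Result<Duration, TryFromFloatSecsError> {
    Duration::try_from_secs_f64(secs)
}
```

Neither function needs a # Panics section, because the edge case is now part of the return type.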

Let the caller add context

Rust errors are usually minimal in that they contain only the information from inside a failing function, not information passed in from the outside:

let filename = path::PathBuf::from("/my-filename");
fs::write(&filename, b"").expect("failed to write to file");

The error message does not include the file name:

failed to write to file: Os { code: 13, kind: PermissionDenied, message: "Permission denied" }

This has some merit, as it avoids cluttering the error message with the same file name multiple times, should nested calls occur. It also allows the caller to decide whether or not they want that kind of context attached.

In a pinch, anyhow::Context can fill in:

fs::write(&filename, b"")
    .context(filename.display().to_string())
    .expect("failed to write to file");

Usually only the top level error should carry this information though, not a function that performs operations on a single passed in filename. If this becomes an issue, it is typically a symptom of overloading a single Io error variant (see “Avoid From” above).

A good rule is to treat attaching context like handling or expecting an error: the higher the level at which it is done, the better. Naturally there are some exceptions to this, but in many cases a bespoke context to attach is a feasible option. For example:

use std::{error::Error, fmt, path, fs};

// We are not deriving `thiserror::Error` since we want to print both
// `file_path` and `error` in `Display::fmt`, but otherwise use
// `FileError::error` transparently.
#[derive(Debug)]
struct FileError<E: Error> {
    file_path: path::PathBuf,
    error: E,
}

impl<E: Error> fmt::Display for FileError<E> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}: ", self.file_path.display())?;
        fmt::Display::fmt(&self.error, f)
    }
}

impl<E: Error + 'static> Error for FileError<E> {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.error.source()
    }
}

// An "Ext"-style trait makes using the context much more convenient:
trait FileErrorExt<T, E>
where
    E: Error,
{
    fn with_context<P: Into<path::PathBuf>>(self, p: P) -> Result<T, FileError<E>>;
}

impl<T, E: Error> FileErrorExt<T, E> for Result<T, E> {
    fn with_context<P: Into<path::PathBuf>>(self, p: P) -> Result<T, FileError<E>> {
        self.map_err(|error| FileError { file_path: p.into(), error })
    }
}

used in

let filename = path::PathBuf::from("/my-filename");
    
if let Err(err) = fs::write(&filename, b"").with_context(filename) {
    // Note: Properly descending `source()` is not implemented here.
    println!("{}", err);
}

will print

/my-filename: Permission denied (os error 13)

While there are crates out there that magically attach filenames to errors, they tend to make the attached error invisible, thus hiding it from those handling errors above, and attach it to the innermost error. This burdens callers doing things like collecting multiple results in parallel, a case where annotation with a filename may potentially be worthwhile, and forces them to carry now redundant information around anyway.
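For the multiple-results case, the caller can do the pairing itself. In the std-only sketch below, the low-level load helper returns a bare io::Error and only load_all, which actually deals with several files, attaches the path. Both function names are hypothetical, and the loop is sequential for brevity.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// The low-level helper knows nothing about reporting and returns the
// plain `io::Error`.
fn load(path: &Path) -> io::Result<String> {
    fs::read_to_string(path)
}

// The caller handles many files, so this is the level at which pairing
// each error with its path pays off.
fn load_all(paths: &[PathBuf]) -> (Vec<String>, Vec<(PathBuf, io::Error)>) {
    let mut loaded = Vec::new();
    let mut failed = Vec::new();
    for path in paths {
        match load(path) {
            Ok(contents) => loaded.push(contents),
            Err(err) => failed.push((path.clone(), err)),
        }
    }
    (loaded, failed)
}
```

The path is stored once per failure and exactly where it is needed, while load stays usable for callers that already know which file they passed in.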

`io::ErrorKind::Other` is okay, sometimes

Sometimes, when writing code that is heavy on IO handling, it can be useful to just bring out the io::Error::other("some error description"). I’ll leave it at that :).
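A typical situation is a function that already returns io::Result and runs into a failure that is not an OS error. The sketch below assumes a made-up line-based protocol and a hypothetical read_greeting function.

```rust
use std::io::{self, BufRead};

// Hypothetical protocol: the first line sent by the peer must be "HELLO".
fn read_greeting<R: BufRead>(mut reader: R) -> io::Result<()> {
    let mut line = String::new();
    reader.read_line(&mut line)?;
    if line.trim_end() != "HELLO" {
        // Not an OS-level failure, but it fits the `io::Result` signature
        // without introducing a new error type.
        return Err(io::Error::other("peer did not send greeting"));
    }
    Ok(())
}
```

The resulting error reports io::ErrorKind::Other, so callers that match on kind() can still tell it apart from genuine IO failures.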

Conclusion

While Rust has many best practices around error handling, there is still a lot of design space that is up to the individual developer. Good, maintainable programs need both in their error handling: discipline and the application of creative license where needed.

¹ There was a time when there were other contenders, e.g. error-chain, but they have fallen far behind thiserror in terms of popularity and activity. failure, for example, has officially been deprecated in thiserror’s favor.
² Of course it would work even without #[source], but the benefits of properly passing information about the error chain will hopefully become apparent in the remainder of this post.
³ If you try hard enough you can make this much worse by putting in {0:?} and mixing Debug and Display implementations!
⁴ std::error::Error is even explicitly not implemented for &str!
⁵ This example is particularly contrived, since in older Rust versions end - start would already panic on its own (more recent versions saturate to zero instead).
