This writeup is part of the prologue to a series of articles that talk about the “CockroachDB errors library”, which is really a general-purpose, open source replacement for Go’s standard errors package.

So, what are we talking about here?

The basic Go error API: errors are values

The Go ecosystem has some extremely popular, but also extremely basic writeups on the topic:

What can we learn in these articles?

  • Go provides a pre-defined interface type called error, defined as follows:

    // an "error" is an object with an `Error()` method
    // which describes the situation that occurred.
    type error interface {
         Error() string
    }
    
  • The idiomatic way to write Go functions/methods is to have them return an error value alongside their regular return value, and to check it at every call site:

    func div(x, y int) (int, error) {
        if y == 0 {
           return 0, fmt.Errorf("boo")
        }
        return x / y, nil
    }
    
    func main() {
        r, err := div(3, 2)
        if err != nil {
           fmt.Printf("woops: %v", err)
           return
        }
        fmt.Println("result:", r)
    }
    
  • As demonstrated in the example above, fmt.Printf automatically knows how to call the Error() method to display the text of the error. It also does this if the error is printed via %s, %q, %x / %X.
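To make the points above concrete, here is a minimal sketch of a custom type that satisfies the error interface, printed with several formatting verbs. The names divError and describe are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// divError is a hypothetical custom error type, used only for illustration.
type divError struct {
	dividend int
}

// Error implements the built-in error interface.
func (e *divError) Error() string {
	return fmt.Sprintf("cannot divide %d by zero", e.dividend)
}

// div returns an error value alongside its regular result.
func div(x, y int) (int, error) {
	if y == 0 {
		return 0, &divError{dividend: x}
	}
	return x / y, nil
}

// describe formats an error with the %v, %s and %q verbs.
// Each of them obtains the text by calling Error().
func describe(err error) (v, s, q string) {
	return fmt.Sprintf("%v", err), fmt.Sprintf("%s", err), fmt.Sprintf("%q", err)
}

func main() {
	_, err := div(3, 0)
	v, s, q := describe(err)
	fmt.Println(v) // cannot divide 3 by zero
	fmt.Println(s) // cannot divide 3 by zero
	fmt.Println(q) // "cannot divide 3 by zero"
}
```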

Errors are also linked lists

If you do not yet know who Dave Cheney is, now is the time to get acquainted with this extremely prolific Go programmer.

In 2015, Dave created the pkg/errors package (source, docs), then subsequently presented it at the GoCon spring conference in Tokyo, in 2016. Here’s the article that explains the story in prose:

Dave Cheney: Don’t just check errors, handle them gracefully.

Here are the main innovations that Dave brought to the table:

  • Go error objects are constructed as linked lists, preferably immutable.
  • The err error reference points at all times to the head of the list.
  • At the very point an error occurs for the first time, an atomic or “leaf” error object is constructed, which will remain the tail of the list.
  • As an error gets returned and communicated through a call stack and software components, it is augmented with more “layers”: new list elements, or “wrappers”, are pushed at the head of the existing error list.

What does this buy us in practice? The main use was to add message prefixes to error objects, to give more context about “where an error has been”. For example:

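package main

import (
    "fmt"
    "math/rand"

    "github.com/pkg/errors"
)

// foo constructs the leaf error.
func foo() error {
    return fmt.Errorf("boo")
}

// bar and baz each add their own prefix to the error from foo.
func bar() error {
    return errors.Wrap(foo(), "bar")
}

func baz() error {
    return errors.Wrap(foo(), "baz")
}

// rollDice is an illustrative helper: it returns a random
// number between 1 and 6.
func rollDice() int {
    return rand.Intn(6) + 1
}

func main() {
    r := rollDice()
    var err error
    if r < 4 {
        err = bar()
    } else {
        err = baz()
    }
    fmt.Println(err)
}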

Thanks to errors.Wrap(), which adds a prefix to the message, the main function can report bar: boo or baz: boo, and the (human) reader of the error message can know after the fact which function was called. Without errors.Wrap(), the call path that led to the error would be undiscoverable.

How this works, in practice, looks a little bit like this:

// errorString represents a leaf error. This
// is what gets constructed by e.g. fmt.Errorf().
type errorString struct {
     msg string
}

// Error implements the error interface.
func (e *errorString) Error() string { return e.msg }

// msgWrap represents a wrapper which adds a prefix
// to an error. This is what gets constructed
// by e.g. pkg/errors.Wrap().
type msgWrap struct {
     cause error
     msg string
}

// instances of msgWrap are also instances of the error
// interface, by implementing the Error() method.
func (e *msgWrap) Error() string {
     return fmt.Sprintf("%s: %v", e.msg, e.cause)
}
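The following sketch puts these two types to work. It assumes a hypothetical wrap() helper standing in for pkg/errors.Wrap(), and a hypothetical depth() helper that counts the layers; neither is part of any real library.

```go
package main

import "fmt"

// errorString is the leaf error type from the text above.
type errorString struct {
	msg string
}

func (e *errorString) Error() string { return e.msg }

// msgWrap is the prefix-adding wrapper type from the text above.
type msgWrap struct {
	cause error
	msg   string
}

func (e *msgWrap) Error() string {
	return fmt.Sprintf("%s: %v", e.msg, e.cause)
}

// wrap is a hypothetical stand-in for pkg/errors.Wrap(): it pushes a new
// element at the head of the list and leaves the existing chain untouched.
func wrap(err error, msg string) error {
	if err == nil {
		return nil
	}
	return &msgWrap{cause: err, msg: msg}
}

// depth counts the layers of the list by following the cause pointers.
func depth(err error) int {
	n := 0
	for err != nil {
		n++
		w, ok := err.(*msgWrap)
		if !ok {
			break // reached the leaf
		}
		err = w.cause
	}
	return n
}

func main() {
	leaf := &errorString{msg: "boo"}
	err := wrap(wrap(leaf, "bar"), "main")
	fmt.Println(err)        // main: bar: boo
	fmt.Println(depth(err)) // 3
}
```

Each call to wrap() allocates a new head and never modifies the layers below it, which is what keeps the list immutable.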

Error message, wrapper annotations and cause discovery

The foundational wisdom from Dave Cheney is this:

“The Error method on the error interface exists for humans, not code.” [*]

In other words, program code should never inspect or compare the result of the Error() method.

From there, Dave goes on to denounce two patterns of Go programming which he found dangerous / distasteful back then, and which are still frowned upon today:

  • the notion of “sentinel errors”, which are reference error instances that the code can compare against. For example if err == ErrNotExists.

    The main problem with comparing against a reference instance is that if there’s a linked list, the sentinel may sit at the tail of the list while there is something else at the head (for example, a message prefix). The comparison then fails even though the sentinel is part of the error.

    Another, more practical problem with sentinels is that to be able to perform the comparison, the package where the comparison occurs must import the package where the sentinel is defined. This creates a dependency. This type of hard dependency makes software composition more difficult.

  • the notion of reference “error types” (or error wrapper types), which code can check with a type assertion, for example if e, ok := err.(SomeType); ok.

    The problems here are the same as above: it doesn’t work (well) if there is a linked list, and also forces package dependencies.
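A small sketch of the first problem, assuming a hypothetical sentinel ErrNotFound and a hypothetical prefixed wrapper type: as soon as one layer is added, a direct comparison against the sentinel no longer matches.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical sentinel error.
var ErrNotFound = errors.New("not found")

// prefixed is a hypothetical wrapper that adds a message prefix.
type prefixed struct {
	cause error
	msg   string
}

func (p *prefixed) Error() string { return p.msg + ": " + p.cause.Error() }

// lookup returns the bare sentinel.
func lookup() error { return ErrNotFound }

// load returns the same sentinel with one wrapper layer on top.
func load() error { return &prefixed{cause: lookup(), msg: "load"} }

func main() {
	fmt.Println(lookup() == ErrNotFound) // true
	fmt.Println(load() == ErrNotFound)   // false: the head is a *prefixed
	fmt.Println(load())                  // load: not found
}
```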

Instead, Dave recommends two things:

  • define interfaces for the properties of error objects that are interesting to callers. For example, whether an error is recoverable could be defined by the presence of an IsRecoverable() method.

    It is then possible to assert the implementation of this interface from any package, without a dependency: in Go, interface assertions are based on structural equality, not named equality.

  • be mindful of the linked list structure of errors, and properly iterate over the chain of layers when inspecting an error object.

To enable this last point, Dave Cheney introduced the causer interface in pkg/errors, which enabled the following reusable code pattern:

// NB: causer is not exported by pkg/errors; instead
// any package can redefine it as needed
type causer interface { Cause() error }

...
if err != nil {
   for {
       if _, ok := err.(SomeInterfaceWithProperty); ok {
          // ... do something ...
       }

       // Peel one layer, if wrapped.
       if c, ok := err.(causer); ok {
          err = c.Cause()
          continue
       }
       break
   }
}

This pattern was even captured in the function (not method) errors.Cause(), which does the above unwrapping until there’s no cause left, to always access the “leaf” or linked list tail of the error object.
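Here is a self-contained sketch of both recommendations together. The types timeoutError and withMsg, the recoverable interface, and the helpers isRecoverable and rootCause are all hypothetical; rootCause merely mimics the loop that errors.Cause() performs.

```go
package main

import "fmt"

// causer mirrors the unexported interface used by pkg/errors.
type causer interface{ Cause() error }

// recoverable is a hypothetical property interface. It is declared
// locally: no import of the package defining the error type is needed.
type recoverable interface{ IsRecoverable() bool }

// timeoutError is a hypothetical leaf error carrying the property.
type timeoutError struct{}

func (timeoutError) Error() string       { return "timeout" }
func (timeoutError) IsRecoverable() bool { return true }

// withMsg is a hypothetical wrapper exposing its cause.
type withMsg struct {
	cause error
	msg   string
}

func (w *withMsg) Error() string { return w.msg + ": " + w.cause.Error() }
func (w *withMsg) Cause() error  { return w.cause }

// isRecoverable walks the list from head to tail and reports the
// property of the first layer that implements it.
func isRecoverable(err error) bool {
	for err != nil {
		if r, ok := err.(recoverable); ok {
			return r.IsRecoverable()
		}
		c, ok := err.(causer)
		if !ok {
			break
		}
		err = c.Cause()
	}
	return false
}

// rootCause peels every wrapper and returns the tail of the list.
func rootCause(err error) error {
	for err != nil {
		c, ok := err.(causer)
		if !ok {
			break
		}
		err = c.Cause()
	}
	return err
}

func main() {
	err := &withMsg{cause: &withMsg{cause: timeoutError{}, msg: "fetch"}, msg: "sync"}
	fmt.Println(err)                                     // sync: fetch: timeout
	fmt.Println(isRecoverable(err))                      // true
	fmt.Println(rootCause(err) == error(timeoutError{})) // true
}
```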

Embedded stack traces in errors

An underrated feature of pkg/errors is that it automatically preserves a copy of the stack trace every time an error leaf or wrapper is constructed.

This is important because it makes it possible to analyze “where an error has been” while troubleshooting problems: oftentimes, the error is only visible to the developer or only becomes problematic a long while after it has been instantiated, somewhere in the callers. This difficulty is compounded by various Go concurrency patterns where error objects are transported “sideways” from one goroutine to the next via a channel. It is thus not sufficient to just look “one line up” in the source code to find where an error comes from.

To achieve this, pkg/errors uses an extremely lightweight and rather clever mechanism to preserve a copy of the call stack upon every error construction.

This stack trace does not appear in the result of the Error() method; instead, it appears when the error object is printed via the %+v verb in Printf (this is the most common case, e.g. during debugging), or by checking the existence of a StackTrace() method on some of the layers of the error linked list (e.g. to integrate with Sentry.io).

What is particularly clever about this mechanism is that all the details of the stack trace, including function/package names, are not stored in the error object directly; instead they are retrieved only when the stack trace is printed. This saves time and memory in the common case, where the errors occur but may be innocuous.
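The idea can be sketched with the standard runtime package: store only raw program counters at construction time, and resolve them to function names, files and lines when the trace is printed. This is an illustration under that assumption, not the actual pkg/errors source; the names stack, capture, tracedError, newTraced and verbose are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// stack holds raw program counters only: cheap to capture and to store.
type stack []uintptr

// capture records the program counters of the current call stack.
// skip=3 omits runtime.Callers, capture itself and the error constructor.
func capture() stack {
	const maxDepth = 32
	var pcs [maxDepth]uintptr
	n := runtime.Callers(3, pcs[:])
	return stack(pcs[:n])
}

// format resolves the program counters to symbolic frames. This costly
// step runs only when the trace is actually printed.
func (s stack) format() string {
	var b strings.Builder
	frames := runtime.CallersFrames(s)
	for {
		f, more := frames.Next()
		fmt.Fprintf(&b, "%s\n\t%s:%d\n", f.Function, f.File, f.Line)
		if !more {
			break
		}
	}
	return b.String()
}

// tracedError is a hypothetical leaf error that remembers where it was made.
type tracedError struct {
	msg string
	st  stack
}

func newTraced(msg string) *tracedError {
	return &tracedError{msg: msg, st: capture()}
}

// Error returns the message only, without the stack trace.
func (e *tracedError) Error() string { return e.msg }

// Format prints the stack trace only for the %+v verb.
func (e *tracedError) Format(f fmt.State, verb rune) {
	switch {
	case verb == 'v' && f.Flag('+'):
		fmt.Fprintf(f, "%s\n%s", e.msg, e.st.format())
	case verb == 'q':
		fmt.Fprintf(f, "%q", e.msg)
	default:
		fmt.Fprint(f, e.msg)
	}
}

// verbose renders an error with the %+v verb.
func verbose(err error) string { return fmt.Sprintf("%+v", err) }

func main() {
	err := newTraced("boo")
	fmt.Println(err)          // message only
	fmt.Println(verbose(err)) // message followed by the call stack
}
```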

Promotion in Go 1.13 and API schism

It is hard to convey how immensely important and foundational the pkg/errors package was. Today, it is a direct dependency of more than 50,000 public Go projects worldwide, and of countless more private Go repositories.

The designers of the Go language recognized this and in 2019 its semantics were integrated in the Go standard library, starting in Go 1.13, albeit with small variations:

  • Go 1.13 errors are linked lists too.

  • Go 1.13 does not provide errors.Wrap(). Instead, fmt.Errorf was extended to achieve the same result: if the formatting verb %w is present, it constructs a wrapper error and keeps the original error object available for inspection further down the linked list:

    • in pkg/errors: errors.Wrapf(err, "hello %s", "world")
    • in Go 1.13: fmt.Errorf("hello %s: %w", "world", err)
  • Go 1.13 simplifies the task of testing a property on every intermediate level of an error linked list, with the following “fresh” APIs:

    • errors.Is(err1, err2) checks whether any layer in err1 is equal to err2 (test for sentinels, recursively).

      This can be used to recognize many of the standard library’s sentinels, for example errors.Is(err, os.ErrNotExist) to check whether an error was caused by a file/directory not being found.

    • errors.As(err1, &target) checks whether any layer in err1 can be converted to the type of target (either an interface or a concrete type) and, if so, stores that layer in target and returns true.

      This can be used to assert error properties, in the way that Dave Cheney was recommending back in 2015.
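A short sketch of these APIs using only the standard library. The sentinel errNotFound, the retryable interface and the busyError type are hypothetical names chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel error.
var errNotFound = errors.New("not found")

// retryable is a hypothetical property interface.
type retryable interface{ Retryable() bool }

// busyError is a hypothetical leaf error implementing retryable.
type busyError struct{}

func (busyError) Error() string   { return "resource busy" }
func (busyError) Retryable() bool { return true }

// find wraps the sentinel with %w, so it stays reachable in the chain.
func find() error { return fmt.Errorf("find %q: %w", "key", errNotFound) }

// lock wraps a busyError with %w.
func lock() error { return fmt.Errorf("lock: %w", busyError{}) }

// isNotFound tests for the sentinel on every layer of the chain.
func isNotFound(err error) bool { return errors.Is(err, errNotFound) }

// canRetry looks for any layer that implements retryable.
func canRetry(err error) bool {
	var r retryable
	if errors.As(err, &r) {
		return r.Retryable()
	}
	return false
}

func main() {
	fmt.Println(find())                // find "key": not found
	fmt.Println(find() == errNotFound) // false: the head is a wrapper
	fmt.Println(isNotFound(find()))    // true
	fmt.Println(canRetry(lock()))      // true
	fmt.Println(canRetry(find()))      // false
}
```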

It is not all rosy though, as Go 1.13 also caused an API schism in the community:

  • The unwrapping method on error objects is called Unwrap(), not Cause().

    I personally resent the Go team for choosing a separate method name, which directly breaks compatibility with all the packages built upon pkg/errors, for no good discernable reason.

  • Go 1.13 does not provide an “unwrap everything” function like errors.Cause() in pkg/errors.

    Also, sadly, since Go 1.13 does not define a Cause() method, it is not possible to use errors.Cause() from pkg/errors to unwrap mixed error objects from a Go 1.13 project and a project designed for the pkg/errors API.

  • Extremely sadly, Go 1.13 does not provide a facility to capture stack traces, like pkg/errors can. And because of the aforementioned API incompatibility, it is not possible to mix-and-match pkg/errors with Go 1.13-specific code to obtain this behavior back.
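The incompatibility, and one possible way to bridge it, can be sketched as follows. The legacyWrap type and the helpers unwrapOnce, unwrapAll and mixedChain are hypothetical and written for this illustration; they are not a claim about how any particular library implements the bridge.

```go
package main

import (
	"errors"
	"fmt"
)

// legacyWrap is a hypothetical pkg/errors-style wrapper: it exposes its
// cause via Cause() and has no Unwrap() method.
type legacyWrap struct {
	cause error
	msg   string
}

func (w *legacyWrap) Error() string { return w.msg + ": " + w.cause.Error() }
func (w *legacyWrap) Cause() error  { return w.cause }

// unwrapOnce peels one layer, whichever convention the layer follows.
// It returns nil when err is a leaf (or nil).
func unwrapOnce(err error) error {
	switch e := err.(type) {
	case interface{ Unwrap() error }:
		return e.Unwrap()
	case interface{ Cause() error }:
		return e.Cause()
	}
	return nil
}

// unwrapAll peels every layer and returns the tail of the list.
func unwrapAll(err error) error {
	for {
		next := unwrapOnce(err)
		if next == nil {
			return err
		}
		err = next
	}
}

// mixedChain builds a chain mixing both conventions:
// legacyWrap (Cause) -> fmt.Errorf with %w (Unwrap) -> leaf.
func mixedChain() (head, leaf error) {
	leaf = errors.New("boo")
	head = &legacyWrap{cause: fmt.Errorf("std: %w", leaf), msg: "legacy"}
	return head, leaf
}

func main() {
	head, leaf := mixedChain()
	fmt.Println(head)                       // legacy: std: boo
	fmt.Println(errors.Unwrap(head) == nil) // true: Cause() is invisible to Go 1.13
	fmt.Println(errors.Is(head, leaf))      // false: the walk stops at the head
	fmt.Println(unwrapAll(head) == leaf)    // true
}
```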

In summary:

Feature                                               Go’s <1.13 errors   github.com/pkg/errors   Go 1.13 errors
leaf error constructors (New, Errorf etc)             yes                 yes                     yes
abstraction: errors are linked lists                  -                   yes                     yes
error causes via Cause()                              -                   yes                     -
error causes via Unwrap()                             -                   -                       yes
best practice: test interfaces, not values/types      -                   yes                     (partial)
errors.As(), errors.Is()                              -                   -                       yes
errors.Wrap()                                         -                   yes                     -
automatic error wrap when format ends with : %w       -                   -                       yes
standard wrappers with efficient stack trace capture  -                   yes                     -

This schism is real and sad. The reason why it occurred, perhaps surprisingly, is that the Go team could not settle on a good way to standardize how to print errors. We will see why in a subsequent article in this series.

Nevertheless, the community of pkg/errors users cannot simply hop onto the Go 1.13 bandwagon. There is a gap here, and a demand for a cross-compatible library that bridges it.

This is, among other things, what the CockroachDB errors library achieves. You can use it as a drop-in replacement for both pkg/errors and Go 1.13’s own errors package.
