This writeup is part of the prologue to a series of articles that talk
about the “CockroachDB errors library”, which is really a
general-purpose, open source replacement for Go’s standard errors
package.
Consider for example the following piece of code:
package main

import "fmt"

type T struct {
	x int
}

func main() {
	v := T{123}
	fmt.Println(v)
}
This program prints {123}, even though we haven’t taught Go how to print our type T. How does it do this?
Equivalence of printers
The logic in the fmt package is shared between all the printers, such that the following calls are all guaranteed to be equivalent:
fmt.Print(x)
fmt.Printf("%v", x)
os.Stdout.Write([]byte(fmt.Sprint(x)))
os.Stdout.Write([]byte(fmt.Sprintf("%v", x)))
In other words, the logic for fmt.Print is always the same as using Printf with the verb %v: both functions hand each argument to the same internal printing routine, with v as the verb.
Likewise, fmt.Println uses the same logic as fmt.Print, and thus the %v verb under the hood, and ditto for fmt.Sprintln and fmt.Sprint.
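A quick way to convince yourself of this equivalence is to compare the strings produced by the S-variants of the printers. The sketch below reuses the T type from the opening example; the helper name equivalentOutputs is made up for this illustration.

```go
package main

import "fmt"

type T struct {
	x int
}

// equivalentOutputs reports whether Sprint, Sprintf("%v") and Sprintln
// (minus its trailing newline) all render v the same way.
func equivalentOutputs(v any) bool {
	a := fmt.Sprint(v)
	b := fmt.Sprintf("%v", v)
	c := fmt.Sprintln(v)
	return a == b && c == a+"\n"
}

func main() {
	fmt.Println(equivalentOutputs(T{123}))  // true
	fmt.Println(equivalentOutputs("hello")) // true
}
```

The comparison is made with a single operand on purpose: with several operands, Print and Println also decide where to insert spaces, which is separate from how each value is rendered.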
fmt.Stringer and fmt.Formatter
Now, add the following at the bottom of the code above:
func (t T) String() string { return "boo" }
And run the program again. What happens? It prints boo. The value 123 is nowhere to be seen.
What is happening here is that a method String() returning string implements the standard interface fmt.Stringer, and the functions in fmt try to use that if they can find it.
Separately, try removing the String() method definition above, and replace it with this:
func (t T) Format(s fmt.State, _ rune) {
	fmt.Fprint(s, "baa")
}
What happens then? The program now prints baa. Again the value 123 is nowhere to be seen.
What is happening here is that a method Format(fmt.State, rune) implements the fmt.Formatter interface, and the functions in fmt try to use that if they can find it.
What if both methods are available?
The program then prints baa: fmt.Formatter is preferred over fmt.Stringer if both are available.
And when neither method is available, the fmt logic “falls back” on its own internal display code, which does a best effort at representing the value.
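The four situations described above can be put side by side in one program. This is a minimal sketch: the type names Plain, OnlyString, OnlyFormat and Both, and the helper render, are invented for the illustration.

```go
package main

import "fmt"

// Plain has no custom printing methods.
type Plain struct{ x int }

// OnlyString implements fmt.Stringer.
type OnlyString struct{ x int }

func (OnlyString) String() string { return "boo" }

// OnlyFormat implements fmt.Formatter.
type OnlyFormat struct{ x int }

func (OnlyFormat) Format(s fmt.State, _ rune) { fmt.Fprint(s, "baa") }

// Both implements fmt.Stringer and fmt.Formatter.
type Both struct{ x int }

func (Both) String() string                 { return "boo" }
func (Both) Format(s fmt.State, _ rune) { fmt.Fprint(s, "baa") }

// render returns what fmt.Print would write for v.
func render(v any) string { return fmt.Sprint(v) }

func main() {
	fmt.Println(render(Plain{123}))      // {123}
	fmt.Println(render(OnlyString{123})) // boo
	fmt.Println(render(OnlyFormat{123})) // baa
	fmt.Println(render(Both{123}))       // baa
}
```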
What fmt knows about error
Go’s standard error interface provides just an Error() method returning a string, and nothing else.
The fmt logic knows about error, and knows how to use its Error() method, by extending the preference rule explained above:
- fmt.Formatter is preferred in all cases if present.
- if fmt.Formatter is not present, but error is, then Error() is preferred.
- otherwise fmt.Stringer is used if present.
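The preference order in this list can be observed with two small types. This is a sketch; ErrAndString, AllThree and render are names chosen for the example.

```go
package main

import "fmt"

// ErrAndString implements both error and fmt.Stringer.
type ErrAndString struct{}

func (ErrAndString) Error() string  { return "from Error" }
func (ErrAndString) String() string { return "from String" }

// AllThree additionally implements fmt.Formatter.
type AllThree struct{}

func (AllThree) Error() string                  { return "from Error" }
func (AllThree) String() string                 { return "from String" }
func (AllThree) Format(s fmt.State, _ rune) { fmt.Fprint(s, "from Format") }

// render returns what fmt.Print would write for v.
func render(v any) string { return fmt.Sprint(v) }

func main() {
	fmt.Println(render(ErrAndString{})) // from Error
	fmt.Println(render(AllThree{}))     // from Format
}
```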
Relationship between %s, %v, %q and %x / %X
So far we’ve seen how the fmt logic can optionally use fmt.Stringer, error or fmt.Formatter under the hood for %v.
Yet perhaps the more common verb used in Go code is %s. How does %s relate to %v?
Generally, %s uses more or less the same logic as %v: if either fmt.Stringer, error or fmt.Formatter is present, it will use that with the same preference.
The difference appears when the object implements neither String(), Error() nor Format(). In this case, %v has some predefined representation (e.g. {123} in the example above), whereas %s complains that the argument has the wrong type: for the struct above it emits the marker {%!s(int=123)} instead of a useful representation.
This is why, unless the code is manipulating values with the specific type string, the Go idiom is to reach for %v in the general case instead of %s.
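The difference between the two verbs is easy to reproduce with the T type from the opening example. In this sketch the helper format is an invented name; it builds the format string at run time, which also keeps go vet from rejecting the deliberately mismatched %s.

```go
package main

import "fmt"

type T struct {
	x int
}

// format renders v with a single verb such as "v" or "s".
func format(verb string, v any) string {
	return fmt.Sprintf("%"+verb, v)
}

func main() {
	fmt.Println(format("v", T{123})) // {123}
	fmt.Println(format("s", T{123})) // {%!s(int=123)}
	fmt.Println(format("s", "text")) // text
}
```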
The additional verbs %q and %x / %X are variants of %s (with the same restrictions when neither String(), Error() nor Format() is available):
- %q quotes the resulting string, so that fmt.Printf("%q", `he said "hi"`) prints "he said \"hi\"", surrounding double quotes included.
- %x / %X show a hexadecimal representation of the bytes in the string. I personally found this was extremely rarely used in practice (in contrast to using it for integer types, which is relatively common).
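As a small illustration of these verbs, the sketch below applies them to a plain string and to a type that only implements fmt.Stringer. The type Name and the helper format are invented for the example.

```go
package main

import "fmt"

// Name implements fmt.Stringer only.
type Name struct{}

func (Name) String() string { return "boo" }

// format renders v with a single verb such as "q" or "x".
func format(verb string, v any) string {
	return fmt.Sprintf("%"+verb, v)
}

func main() {
	fmt.Println(format("q", `he said "hi"`)) // "he said \"hi\""
	fmt.Println(format("x", "hello"))        // 68656c6c6f
	fmt.Println(format("X", "hello"))        // 68656C6C6F
	fmt.Println(format("q", Name{}))         // "boo"
	fmt.Println(format("x", Name{}))         // 626f6f
}
```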
By-value printing and by-reference methods
Now consider the program above, and the following combination of implementations (beware of the receiver types):
func (t T) String() string { return "boo" }
func (t *T) Format(s fmt.State, _ rune) { fmt.Fprint(s, "baa") }
This, now, prints boo again. What is happening? The code above passes the T instance by value. At that level, only the String() method is available, so the fmt logic prefers that. But now check out this:
func (t *T) String() string { return "boo" }
func (t *T) Format(s fmt.State, _ rune) { fmt.Fprint(s, "baa") }
What gives? The program now prints {123} again. Neither method is visible to the fmt logic.
Hence the following rule: if an object is printed by value, only its by-value methods are considered.
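Both cases can be checked in one program by giving each combination its own type. This is a sketch: Mixed, PtrOnly and render are names made up for the illustration, standing in for the two versions of T shown above.

```go
package main

import "fmt"

// Mixed has a by-value String() and a by-reference Format().
type Mixed struct{ x int }

func (Mixed) String() string                   { return "boo" }
func (*Mixed) Format(s fmt.State, _ rune) { fmt.Fprint(s, "baa") }

// PtrOnly has both methods implemented by-reference.
type PtrOnly struct{ x int }

func (*PtrOnly) String() string                  { return "boo" }
func (*PtrOnly) Format(s fmt.State, _ rune) { fmt.Fprint(s, "baa") }

// render returns what fmt.Print would write for v.
func render(v any) string { return fmt.Sprint(v) }

func main() {
	// Both values are passed by value, not through a pointer.
	fmt.Println(render(Mixed{123}))   // boo
	fmt.Println(render(PtrOnly{123})) // {123}
}
```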
By-reference printing and by-value methods
Now let us switch things over, with the following main program instead:
func main() {
	v := &T{123}
	fmt.Println(v)
}
Now consider the following program variants:
Variant A:
func (t T) String() string { return "boo" }
func (t T) Format(s fmt.State, _ rune) { fmt.Fprint(s, "baa") }
Variant B:
func (t T) String() string { return "boo" }
func (t *T) Format(s fmt.State, _ rune) { fmt.Fprint(s, "baa") }
Variant C:
func (t *T) String() string { return "boo" }
func (t T) Format(s fmt.State, _ rune) { fmt.Fprint(s, "baa") }
Variant D:
func (t *T) String() string { return "boo" }
func (t *T) Format(s fmt.State, _ rune) { fmt.Fprint(s, "baa") }
What is printed in each case?
- All four variants A, B, C and D print baa.
What is going on? The answer is in Go’s method set rules. The method set of the pointer type *T contains the methods declared with receiver *T and also those declared with receiver T. When the argument has type *T, the fmt logic therefore sees both String() and Format() in every variant, whichever receiver type each declaration uses, and it prefers fmt.Formatter as explained above.
Compare this with the previous section, where T was printed by value: the method set of T contains only the methods declared with receiver T, which is why by-reference methods were invisible there.
Hence the following general rules:
- if an object is printed by-reference, both its by-reference and its by-value methods are considered, with the usual preference between interfaces.
- custom printer methods implemented by-value get picked up in both cases, whereas methods implemented by-reference only get picked up when a pointer is printed. If you implement your printer methods by-reference, make sure the objects are always printed through a pointer.
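The Go specification states that the method set of *T contains the methods declared with receiver *T as well as those declared with receiver T. The sketch below encodes the four variants as separate types (named A to D here for the illustration, in place of the single T used in the text) so that they can be compared in one program.

```go
package main

import "fmt"

type A struct{ x int }

func (A) String() string                 { return "boo" }
func (A) Format(s fmt.State, _ rune) { fmt.Fprint(s, "baa") }

type B struct{ x int }

func (B) String() string                  { return "boo" }
func (*B) Format(s fmt.State, _ rune) { fmt.Fprint(s, "baa") }

type C struct{ x int }

func (*C) String() string                { return "boo" }
func (C) Format(s fmt.State, _ rune) { fmt.Fprint(s, "baa") }

type D struct{ x int }

func (*D) String() string                 { return "boo" }
func (*D) Format(s fmt.State, _ rune) { fmt.Fprint(s, "baa") }

// render returns what fmt.Print would write for v.
func render(v any) string { return fmt.Sprint(v) }

func main() {
	// Every value is printed through a pointer.
	for _, v := range []any{&A{123}, &B{123}, &C{123}, &D{123}} {
		fmt.Println(render(v)) // baa, four times
	}
}
```

Through a pointer, fmt can see both methods in every variant, so fmt.Formatter takes precedence each time.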
Verbose printing with %+v
The + flag for numeric types forces the display of a plus sign for positive values, so that the sign is always shown.
In combination with v however, it triggers “verbose printing”. With the default fmt logic, this adds the name of fields to structs.
If just fmt.Stringer is implemented, + does not change anything; however if fmt.Formatter is implemented, then by convention the code in the Format() method includes more details in the output than when + is not specified.
The Go library does not prescribe how this should be achieved: different packages tend to do this in different ways. The lack of specification is not an issue however; in either case the output is intended for use by human eyes and so minor display inconsistencies are not (yet) considered consequential.
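The default behavior of the + flag can be seen with a struct that has no custom methods and one that only implements fmt.Stringer. This is a sketch; Plain, Named, simple and verbose are names invented for the example.

```go
package main

import "fmt"

// Plain has no custom printing methods.
type Plain struct{ x int }

// Named implements fmt.Stringer only.
type Named struct{ x int }

func (Named) String() string { return "boo" }

func simple(v any) string  { return fmt.Sprintf("%v", v) }
func verbose(v any) string { return fmt.Sprintf("%+v", v) }

func main() {
	fmt.Println(simple(Plain{123}), verbose(Plain{123})) // {123} {x:123}
	fmt.Println(simple(Named{123}), verbose(Named{123})) // boo boo
	// On a numeric verb, + only forces the sign.
	fmt.Println(fmt.Sprintf("%d", 42), fmt.Sprintf("%+d", 42)) // 42 +42
}
```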
Go representation and the %#v verb
Finally, change the original main program to use the %#v verb instead:
func main() {
	v := T{123}
	fmt.Printf("%#v\n", v)
}
What does this print?
- if a String() method is available, it is ignored.
- if a Format() method is available, that is used.
- otherwise, if a GoString() method is available (from the fmt.GoStringer interface), that is used.
- otherwise, a printout of the structure using Go syntax is produced.
What is happening here is that the %#v specifier intends to print out the “Go representation” of the value, not its “human representation.” The fmt logic knows how to do this, but a custom type can customize this behavior with the fmt.Formatter or fmt.GoStringer interfaces.
Note that I include this explanation of fmt.GoStringer for completeness; I have found in practice that it is only rarely used.
I also personally recommend the facility at https://github.com/kr/pretty, which is able to print Go representations much more clearly than Go’s standard library; for example: fmt.Printf("%# v", pretty.Formatter(x)).
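To see the standard library’s %#v rules at work, the sketch below compares a plain struct, a type with only String(), and a type with GoString(). The type names, the helper goRepr and the NewCustom(...) text are all invented for the illustration.

```go
package main

import "fmt"

// Plain has no custom printing methods.
type Plain struct{ x int }

// Named implements fmt.Stringer, which %#v ignores.
type Named struct{ x int }

func (Named) String() string { return "boo" }

// Custom implements fmt.GoStringer, which %#v uses.
type Custom struct{ x int }

func (c Custom) GoString() string { return fmt.Sprintf("NewCustom(%d)", c.x) }

// goRepr returns the %#v rendering of v.
func goRepr(v any) string { return fmt.Sprintf("%#v", v) }

func main() {
	fmt.Println(goRepr(Plain{123}))  // main.Plain{x:123}
	fmt.Println(goRepr(Named{123}))  // main.Named{x:123}
	fmt.Println(goRepr(Custom{123})) // NewCustom(123)
}
```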
Formatting verbs, flags and modifiers
We have seen so far how %v differs from %s in intent and purpose, and how e.g. %v differs from %+v.
What if we wanted to define our own customization with a different result for each of them?
The reliable customization mechanism, for all three cases, is the fmt.Formatter interface:
package fmt

// Formatter can be implemented by your custom types.
type Formatter interface {
	Format(s State, verb rune)
}

// An object of type State is provided by the fmt
// logic to your custom Format() method.
type State interface {
	io.Writer // provides the Write() method
	Flag(int) bool
	Width() (int, bool)
	Precision() (int, bool)
}
What interests us most is:
- the verb argument passed directly to our own custom Format() method. This indicates the main “formatting verb”: for %v, verb == 'v'. For %#v, verb == 'v' also. For %s, the verb is s, and so on.
- the Flag() method on the fmt.State passed as argument to the Format() method. Flag() returns true iff the corresponding formatting flag has been set. For example, for %v, Flag('#') == false, whereas for %#v, Flag('#') == true.
- the fact that fmt.State also implements io.Writer. This makes it possible to e.g. pass the State variable directly as first argument to another call to fmt.Fprint to further simplify the implementation of custom Format() methods.
The Width() and Precision() methods on fmt.State are also interesting as they give access to the additional numeric parameters, or modifiers, in a formatting string. For example, in %3.2f, we have width 3 and precision 2. However, I found that these were used less often in practice.
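One way to explore what a Format() method receives is to write one that simply reports its inputs. The sketch below does that; Probe and probe are hypothetical names, not part of the fmt package.

```go
package main

import "fmt"

// Probe is a hypothetical type whose Format() reports what it was given.
type Probe struct{}

func (Probe) Format(s fmt.State, verb rune) {
	// s is an io.Writer, so it can be passed straight to Fprintf.
	fmt.Fprintf(s, "verb=%c plus=%t sharp=%t", verb, s.Flag('+'), s.Flag('#'))
	if w, ok := s.Width(); ok {
		fmt.Fprintf(s, " width=%d", w)
	}
	if p, ok := s.Precision(); ok {
		fmt.Fprintf(s, " prec=%d", p)
	}
}

// probe formats a Probe value with the given format string.
func probe(format string) string { return fmt.Sprintf(format, Probe{}) }

func main() {
	fmt.Println(probe("%v"))    // verb=v plus=false sharp=false
	fmt.Println(probe("%+v"))   // verb=v plus=true sharp=false
	fmt.Println(probe("%#v"))   // verb=v plus=false sharp=true
	fmt.Println(probe("%8.3s")) // verb=s plus=false sharp=false width=8 prec=3
}
```

Note that the last line is not padded to 8 characters: once a type implements Format(), applying the width is entirely up to that method.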
Here is a rather idiomatic example:
type Response struct {
	code int
	msg  string
}

func (r *Response) Format(s fmt.State, verb rune) {
	switch verb {
	case 'v':
		if s.Flag('+') {
			// With %+v, we print both the message and the code.
			fmt.Fprintf(s, "%s (%d)", r.msg, r.code)
			// Return here, otherwise the fallthrough below
			// would print the message a second time.
			return
		}
		fallthrough
	case 's':
		// For %s, or %v without +, we just print the message.
		fmt.Fprint(s, r.msg)
	}
}

// String is provided for convenience.
func (r *Response) String() string { return fmt.Sprint(r) }
What is going on here?
- the main representation function for type *Response is fmt.Formatter. When used with %+v, it prints the message followed by the code between parentheses. With just %v / %s, it prints just the message.
- to make the type compatible with the fmt.Stringer interface, for use in other places where a String() method is required, String() is implemented by calling into fmt.Sprint. This is discussed further below.
An interesting aspect of this code is that it does not handle %q / %x / %X. For these verbs, it outputs nothing. fmt is OK with that.
Neither does it support other flags to %v than +; for example it treats %#v and %v the same.
In fact, the Go API does not make it easy to implement a custom Format() that is as general and powerful as its own internal logic, and Go packages “in the wild” often contain incomplete implementations like the one above.
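These gaps are easy to observe by formatting a Response with each verb. The sketch below is a self-contained version of the type, written so that %+v prints the message only once; the helper format and the sample values are invented for the illustration.

```go
package main

import "fmt"

type Response struct {
	code int
	msg  string
}

func (r *Response) Format(s fmt.State, verb rune) {
	switch verb {
	case 'v':
		if s.Flag('+') {
			fmt.Fprintf(s, "%s (%d)", r.msg, r.code)
			return
		}
		fallthrough
	case 's':
		fmt.Fprint(s, r.msg)
	}
	// Any other verb falls out of the switch and writes nothing.
}

// format renders v with the format string f.
func format(f string, v any) string { return fmt.Sprintf(f, v) }

func main() {
	r := &Response{code: 404, msg: "not found"}
	fmt.Printf("[%s]\n", format("%v", r))  // [not found]
	fmt.Printf("[%s]\n", format("%+v", r)) // [not found (404)]
	fmt.Printf("[%s]\n", format("%#v", r)) // [not found]
	fmt.Printf("[%s]\n", format("%q", r))  // []
	fmt.Printf("[%s]\n", format("%x", r))  // []
}
```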
Custom formatters in practice
I have found in practice that the following properties hold well across packages in the ecosystem:
- custom Format() methods always do something valid and useful for the v verb, regardless of the flags provided.
- the behavior of Format() with verb v and no flags (i.e. a simple %v) is most often kept consistent with the behavior of String(), if it is also available.
- if a custom formatter has both a “simple” and a “verbose” mode, it commonly recognizes + as the flag to access the verbose mode.
- if both %s and %v (without flags) are recognized, they usually emit the same thing.
- it’s uncommon to see %q, %x and %X handled properly in custom Format() methods, if at all.
- custom formatters for non-numeric types nearly never handle the width and precision modifiers.
This last point in particular is the reason why code that cares about fixed-width string formatting should spell out the printing in two steps, as follows:
s := fmt.Sprint(v)
fmt.Printf("%30s", s) // instead of printing v directly
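The difference between the two approaches shows up as soon as the type has a Format() method that ignores the width. In this sketch, Tag, direct and twoStep are hypothetical names.

```go
package main

import "fmt"

// Tag is a hypothetical type whose Format() ignores width and
// precision, like most custom formatters.
type Tag struct{}

func (Tag) Format(s fmt.State, _ rune) { fmt.Fprint(s, "baa") }

// direct passes the width to the custom Format(), which ignores it.
func direct(v any) string { return fmt.Sprintf("%6v", v) }

// twoStep renders v first, then pads the resulting string.
func twoStep(v any) string { return fmt.Sprintf("%6s", fmt.Sprint(v)) }

func main() {
	fmt.Printf("[%s]\n", direct(Tag{}))  // [baa]
	fmt.Printf("[%s]\n", twoStep(Tag{})) // [   baa]
}
```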
Code reuse between fmt.Stringer, fmt.Formatter and error
An example above was implementing String() by calling fmt.Sprint, which in turn uses the Format() method on the same type. To simplify:
type T struct{ msg string }

func (r *T) Format(s fmt.State, _ rune) {
	fmt.Fprint(s, r.msg)
}

func (r *T) String() string {
	// This causes fmt to call Format() above and ultimately
	// print r.msg.
	return fmt.Sprint(r)
}
Why would one choose to implement String() via return fmt.Sprint(r) instead of return r.msg in this case?
This is an instance of DRY: if later the logic needs to change to “print more stuff”, only the Format() method needs to be modified; the String() method automatically benefits from it.
This pattern is relatively common; but so is the following:
type T struct{ msg string }

func (r *T) String() string {
	return r.msg
}

func (r *T) Format(s fmt.State, _ rune) {
	fmt.Fprint(s, r.String()) // or: s.Write([]byte(r.String()))
}
Again, one method is implemented “using the other”, so that one only needs to change either of them to get the same behavior in both.
Likewise, if the error interface is involved, we see all combinations of reuses in practice:
type T struct { msg string }
func (r *T) Error() string { return r.msg }
func (r *T) String() string { return r.Error() }
func (r *T) Format(s fmt.State, _ rune) { fmt.Fprint(s, r.Error()) }
type U struct { msg string }
func (r *U) String() string { return r.msg }
func (r *U) Error() string { return r.String() }
func (r *U) Format(s fmt.State, _ rune) { fmt.Fprint(s, r.String()) }
type V struct { msg string }
func (r *V) String() string { return fmt.Sprint(r) }
func (r *V) Error() string { return fmt.Sprint(r) }
func (r *V) Format(s fmt.State, _ rune) { fmt.Fprint(s, r.msg) }
Why do we see so much diversity?
I am not exactly sure, but I blame the lack of prescription in the Go library documentation. Also see the two answers below.
Does it matter since we get the same result in every case?
From a functional perspective these examples are all equivalent. From a performance perspective, one should consider which of the variants is used more often in the program. If the String() method is commonly used, more so than printing out the object, then having String() contain the simplest implementation may yield better performance. This is because the logic in the fmt package is a little heavyweight. However note that in practice I have not found this to be often the case, so I would say it does not matter much.
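The functional equivalence can be checked by collecting the three ways of obtaining the text for two of the styles shown earlier. This sketch reuses the T and V types; the helper views is an invented name.

```go
package main

import "fmt"

// T derives String() and Format() from Error().
type T struct{ msg string }

func (r *T) Error() string                  { return r.msg }
func (r *T) String() string                 { return r.Error() }
func (r *T) Format(s fmt.State, _ rune) { fmt.Fprint(s, r.Error()) }

// V derives String() and Error() from Format().
type V struct{ msg string }

func (r *V) String() string                 { return fmt.Sprint(r) }
func (r *V) Error() string                  { return fmt.Sprint(r) }
func (r *V) Format(s fmt.State, _ rune) { fmt.Fprint(s, r.msg) }

// views returns the text obtained via Error(), String() and fmt.Sprint.
func views(v interface {
	error
	fmt.Stringer
}) [3]string {
	return [3]string{v.Error(), v.String(), fmt.Sprint(v)}
}

func main() {
	fmt.Println(views(&T{"oops"})) // [oops oops oops]
	fmt.Println(views(&V{"oops"})) // [oops oops oops]
}
```

The V style only terminates because Format() exists: fmt.Sprint(r) prefers fmt.Formatter, so it never calls back into String() or Error().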
I am implementing my own custom type. What pattern should I aim for?
If your type has just one representation, then you can reach for String() directly (or Error() if you are implementing an error type), and omit Format() entirely.
If you need to distinguish between “simple” and “verbose” displays, then implement Format() first, then derive String() (and/or Error()) from it.
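As a rough sketch of the second pattern applied to an error type, the code below implements Format() first and derives Error() from it. The NotFound type, its fields, the describe helper and the exact wording of the messages are all assumptions made for the illustration.

```go
package main

import "fmt"

// NotFound is a hypothetical error type with a simple and a verbose display.
type NotFound struct {
	path string
	hint string
}

// Format holds the only copy of the display logic.
func (e *NotFound) Format(s fmt.State, verb rune) {
	switch {
	case verb == 'v' && s.Flag('+'):
		fmt.Fprintf(s, "%s: not found (hint: %s)", e.path, e.hint)
	case verb == 'q':
		fmt.Fprintf(s, "%q", e.path+": not found")
	default:
		fmt.Fprintf(s, "%s: not found", e.path)
	}
}

// Error is derived from Format(): fmt.Sprint uses the v verb without flags.
func (e *NotFound) Error() string { return fmt.Sprint(e) }

// describe returns the simple and the verbose display of err.
func describe(err error) (simple, verbose string) {
	return err.Error(), fmt.Sprintf("%+v", err)
}

func main() {
	simple, verbose := describe(&NotFound{path: "/tmp/x", hint: "check the mount"})
	fmt.Println(simple)  // /tmp/x: not found
	fmt.Println(verbose) // /tmp/x: not found (hint: check the mount)
}
```

Unlike the Response example, this sketch chooses to give every verb some output through the default branch; whether that is desirable depends on the type.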
Summary and takeaways
Go provides a general-purpose formatting API in its standard fmt package.
All the functions in that API are powered by common logic, which is the logic used under the hood by Printf / Sprintf: each object is displayed in the context of some formatting “verb”.
The most common and reliable verb is v (tip: it is “v” like “value”), also used under the hood by Print() and Println(). It can print pretty much anything and is not picky about whether the value is nil or implements a particular interface.
Meanwhile, when implementing your own type, you can customize the behavior of fmt by implementing certain interfaces:
- fmt.Stringer, a simple String() string method.
- error, a simple Error() string method.
- fmt.Formatter, a Format() method. This can be used to display different things when used via %v vs. %+v and other combinations of verbs and flags.
In practice, we see packages that provide both String() and Format() methods side-by-side, or Error() and Format(). One is often implemented by calling the other, to avoid code duplication. All combinations of reuse are allowed by Go’s standard library, and we actually can find all variants in the ecosystem.