Note
The latest version of this document can be found online at https://dr-knz.net/go-executable-size-visualization-with-d3-2021.html.
Note
After a lively discussion on Hacker News and input from Russ Cox, the conclusions in the analysis below were reworded to avoid the notion of “non-useful bytes”. The bytes have a purpose.
Introduction
Two years ago, my article “Why are my Go executable files so large?” showed how to utilize D3 and a tree map visualization to explore the size of executable files produced by the Go compiler.
A few things have changed since, and so an update is in order.
Tooling updates
As presented the first time around, we are using a data pipeline that looks as follows:
- build a Go executable.
- use go tool nm -size and apply c++filt on the output.
- transform the output into a tree using a custom-designed Python script tab2dict.py.
- transform the tree into a valid input for the D3 tree map visualization using a custom-designed Python script simplify.py.
The source code for the Python scripts is public on GitHub: https://github.com/knz/go-binsize-viz
Since 2019, the Go and C++ compilers produce a larger diversity of symbols; the regular expressions used in tab2dict.py have been adjusted accordingly.
Separately, there was a usability shortcoming in the original implementation: if a Go package contained both some source files (e.g. sql/create.go) and sub-packages (e.g. sql/sem/tree/eval.go), the “own” size of the package and that of its sub-packages were appearing side-by-side in the visualization, instead of “inside” each other. This was confusing because the (human) explorer naturally expects a hierarchical view between these two values.
This shortcoming has also been corrected.
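One way to obtain such a nested view is to record a package's own symbols under a dedicated child node, so that the package's box encloses both its own files and its sub-packages. The following is only a minimal sketch, assuming the symbols have already been reduced to (package path, size) pairs. The function names and the "(own)" label are hypothetical and are not taken from tab2dict.py or simplify.py.

```python
def build_tree(symbols):
    """Accumulate (package_path, size) pairs into a nested package tree."""
    root = {"name": "", "children": {}, "own": 0}
    for path, size in symbols:
        node = root
        for part in path.split("/"):
            node = node["children"].setdefault(
                part, {"name": part, "children": {}, "own": 0})
        node["own"] += size  # bytes defined directly in this package
    return root


def to_treemap(node):
    """Convert to a name/children/value hierarchy.

    A package with sub-packages gets its own bytes as an "(own)" leaf,
    so both appear inside the same parent box.
    """
    children = [to_treemap(child) for child in node["children"].values()]
    if not children:
        return {"name": node["name"], "value": node["own"]}
    if node["own"]:
        children.insert(0, {"name": "(own)", "value": node["own"]})
    return {"name": node["name"], "children": children}


def total(tm):
    """Total bytes under a converted node."""
    return tm.get("value", 0) + sum(total(c) for c in tm.get("children", []))
```

With this shape, the size of sql is the sum of its "(own)" leaf and everything below sql/sem, which is the hierarchical reading a human explorer expects.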
Example visualization
Here is a visualization for CockroachDB v20.2.7, the latest stable release at the time of this writing:
Surprising finding: “dark” file usage
The sum of the sizes reported by go tool nm
does not add up to the final size of the Go executable.
For example, in the CockroachDB 20.2.7 binary:
- the file occupies 211694984 bytes (202MiB) on disk;
- however, the sum of symbol sizes adds up to 118928245 bytes (113MiB).
- there is a gap of 92766739 bytes (88MiB) missing, or ~44% unaccounted for.
At first I suspected that this size was occupied by the symbol table
itself, or the debugging information. To check this, we can use strip
to remove the symbol table and debugging info and observe the difference. Alas:
- the stripped executable size is 190680384 bytes (182MiB) on disk;
- so there is still a gap of ~68MiB, or ~34% non-symtable data that is unaccounted for.
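For readers who want to reproduce the sum, here is a minimal sketch of how the sizes could be added up from the textual output of go tool nm -size. It assumes each defined symbol is printed as "address size type name", with the address in hexadecimal and the size in decimal, and that undefined symbols lack an address; the function name is hypothetical.

```python
def sum_symbol_sizes(nm_output):
    """Sum the advertised sizes in `go tool nm -size` style output.

    Assumed line layout: "<hex address> <decimal size> <type> <name>".
    Lines that do not fit (e.g. undefined symbols) are skipped.
    """
    total = 0
    for line in nm_output.splitlines():
        # Symbol names may contain spaces, so split at most three times.
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue
        addr, size = fields[0], fields[1]
        try:
            int(addr, 16)       # validates the address column
            total += int(size)  # the size column is decimal
        except ValueError:
            continue
    return total
```

Subtracting this sum from the size of the file on disk, or from the size of the stripped file, gives the gaps discussed above.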
At this time, I do not have an explanation for this “dark” file usage.
(The word “dark” here reflects the idea that the bytes are not enlightened by the symbol table. It is also inspired by the concept of dark silicon invented/discovered around 2011. It is not a statement of moral value about the nature of these bytes.)
Note
After originally stating the above, Russ Cox from the Go team explained that the “dark” bytes are metadata for garbage collection and reflection, which are not present in the symbol table because 1) they do not need to be, and 2) accounting for them in the symbol table would make the binary even larger.
We can see how this dark file usage has evolved throughout the growth of CockroachDB:
CockroachDB version | Go | Exec. size | (MiB) | Stripped | Sum nm -size | Symtable sz. | Dark bytes | % dark bytes |
---|---|---|---|---|---|---|---|---|
v1.0.0 | 1.8 | 39830792 | 38.0 | 39799624 | 32216984 | 31168 | 7582640 | 19.0% |
v1.0.7 | 1.8 | 39799624 | 38.0 | 39830792 | 32371395 | 0 | 7459397 | 18.7% |
v1.1.0 | 1.8 | 43447496 | 41.4 | 43447496 | 35602483 | 0 | 7845013 | 18.1% |
v1.1.9 | 1.8 | 46300200 | 44.2 | 46300200 | 37849642 | 0 | 8450558 | 18.2% |
v2.0.0 | 1.10 | 54384568 | 51.9 | 54384576 | 44463267 | 0 | 9921309 | 18.2% |
v2.0.7 | 1.10 | 56432824 | 53.8 | 56432832 | 45969263 | 0 | 10463569 | 18.5% |
v2.1.0 | 1.10 | 135223352 | 129.0 | 68835904 | 55212282 | 66387448 | 13623622 | 10.1% |
v2.1.11 | 1.10 | 136101520 | 129.8 | 69429056 | 54714649 | 66672464 | 14714407 | 10.8% |
v19.1.0 | 1.11 | 124365384 | 118.6 | 111470120 | 71166968 | 12895264 | 40303152 | 32.4% |
v19.1.11 | 1.11 | 124588560 | 118.8 | 111688008 | 71257435 | 12900552 | 40430573 | 32.4% |
v19.2.0 | 1.12 | 163978120 | 156.4 | 145398096 | 92535059 | 18580024 | 52863037 | 32.2% |
v19.2.12 | 1.12 | 165974336 | 158.3 | 147303432 | 93850056 | 18670904 | 53453376 | 32.2% |
v20.1.0 | 1.13 | 135223352 | 129.0 | 147594448 | 93789751 | 0 | 53804697 | 39.8% |
v20.1.13 | 1.13 | 167269624 | 159.5 | 148256120 | 94208103 | 19013504 | 54048017 | 32.3% |
v20.2.0 | 1.13 | 209352968 | 199.7 | 188618784 | 117667098 | 20734184 | 70951686 | 33.9% |
v20.2.7 | 1.13 | 211694984 | 201.9 | 190680384 | 118928245 | 21014600 | 71752139 | 33.9% |
v21.1-alpha-geb1aa69bc4 | 1.15 | 183075488 | 174.6 | 135352792 | 107792826 | 47722696 | 27559966 | 15.1% |
In this table:
- “Exec. size” is the raw size of the executable file, in bytes.
- “Stripped” is the size of the executable after the strip command was applied, i.e. after the symbol table and debugging information are removed.
- “Sum nm -size” is the sum of the advertised sizes of the entries in the symbol table.
- “Symtable sz.” is the estimated size of the symbol table itself, as deduced by taking the difference between the first two sizes. We can see that the v1.0.7 to v2.0.7 executables, as well as v20.1.0, were released pre-stripped.
- “Dark bytes” is the gap between the raw file size and the combined sum of the advertised symbol sizes and the symbol table’s size and debugging information, in bytes.
- “% dark bytes” is the percentage of the dark bytes relative to the raw file size.
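The derived columns follow from three measurements per binary. As a sketch (the function name is mine, and clamping the symbol table size to zero when the stripped file is not smaller is my reading of the rows that show 0):

```python
def dark_bytes_row(exec_size, stripped_size, nm_sum):
    """Derive (symtable size, dark bytes, % dark bytes) for one table row."""
    # Pre-stripped binaries can come out a few bytes larger after `strip`;
    # the table reports 0 in that case.
    symtable = max(exec_size - stripped_size, 0)
    # What remains of the stripped file once all advertised symbols
    # have been accounted for.
    dark = stripped_size - nm_sum
    pct_dark = 100.0 * dark / exec_size
    return symtable, dark, pct_dark
```

For the v20.2.7 row, dark_bytes_row(211694984, 190680384, 118928245) yields a symbol table of 21014600 bytes and 71752139 dark bytes, about 33.9% of the file.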
We can see that the dark size percentage was lower than 20% prior to CockroachDB v19.1, and has then been oscillating around 33% of the file size until v21.1. With the upcoming v21.1 release, using Go 1.15, the dark size is reduced to 15% again.
Note
Not all the changes from row to row in the previous table are attributable to the Go compiler. Obviously, the CockroachDB software has evolved as well.
The evolution of pclntab
Up to Go 1.14
As explained in the previous analysis, up to and including Go 1.15 the compiler would generate a special table called runtime.pclntab inside the executable.
The purpose of this data structure is to enable the Go runtime system to produce descriptive stack traces upon a crash or upon internal requests via the runtime.Stack API.
We can see how this table grows across Go versions until v1.13, and then decreases in v1.15:
CockroachDB version | Go | Exec. size | (MiB) | pclntab sz | (MiB) | % pclntab |
---|---|---|---|---|---|---|
v1.0.0 | 1.8 | 39830792 | 38.0 | 7316726 | 7.0 | 18.4% |
v1.0.7 | 1.8 | 39799624 | 38.0 | 7318030 | 7.0 | 18.4% |
v1.1.0 | 1.8 | 43447496 | 41.4 | 8193397 | 7.8 | 18.9% |
v1.1.9 | 1.8 | 46300200 | 44.2 | 9103318 | 8.7 | 19.7% |
v2.0.0 | 1.10 | 54384568 | 51.9 | 10745419 | 10.2 | 19.8% |
v2.0.7 | 1.10 | 56432824 | 53.8 | 11205818 | 10.7 | 19.9% |
v2.1.0 | 1.10 | 135223352 | 129.0 | 14364564 | 13.7 | 10.6% |
v2.1.11 | 1.10 | 136101520 | 129.8 | 14445353 | 13.8 | 10.6% |
v19.1.0 | 1.11 | 124365384 | 118.6 | 25055403 | 23.9 | 20.1% |
v19.1.11 | 1.11 | 124588560 | 118.8 | 25095079 | 23.9 | 20.1% |
v19.2.0 | 1.12 | 163978120 | 156.4 | 33619081 | 32.1 | 20.5% |
v19.2.12 | 1.12 | 165974336 | 158.3 | 34010910 | 32.4 | 20.5% |
v20.1.0 | 1.13 | 135223352 | 129.0 | 29927833 | 28.5 | 22.1% |
v20.1.13 | 1.13 | 167269624 | 159.5 | 30073122 | 28.7 | 18.0% |
v20.2.0 | 1.13 | 209352968 | 199.7 | 36139876 | 34.5 | 17.3% |
v20.2.7 | 1.13 | 211694984 | 201.9 | 36467961 | 34.8 | 17.2% |
v21.1-alpha-geb1aa69bc4 | 1.15 | 183075488 | 174.6 | 30763345 | 29.3 | 16.8% |
The large size of the pclntab was due to a choice by the Go team to store the mapping of program counters to function names uncompressed.
To paraphrase:
- prior to 1.2, the Go linker was emitting a compressed line table, and the program would decompress it upon initialization at run-time.
- in Go 1.2, a decision was made to pre-expand the line table in the executable file into its final format suitable for direct use at run-time, without an additional decompression step.
In other words, the Go team decided to make executable files larger to save up on initialization time and run-time memory usage.
As we discussed back then, this choice was not well warranted for network servers like CockroachDB, which are started rarely and where the size of the program on disk matters more than the start-up time.
Go 1.15 and beyond
The publication of my article in 2019, together with the community outcry that it triggered, was actually noticed by the Go team.
The Go team subsequently decided to change course and start working on compressing pclntab again.
We can see this change in the table above:
- starting in Go 1.15, the pclntab is compressed again.
- starting in Go 1.16, the pclntab is not embedded directly in the binary (or, at least, its advertised size in the symbol table is zero), and certain parts of it are re-computed from other data in the executable file at run time.
What exactly does this change?
Transition from Go 1.15 to 1.16
Using the source code for CockroachDB v21.1-alpha-geb1aa69bc4, we can produce custom builds across Linux and FreeBSD, with both the 1.15 and 1.16 compilers.
Platform | Go | Build mode | Exec. sz. | pclntab sz. | Dark sz. |
---|---|---|---|---|---|
amd64-linux | 1.15 | release | 183075488 | 30763345 (17%) | 27559966 (15%) |
amd64-freebsd | 1.15 | release (no geos) | 305452856 | 30824594 (10%) | 27052709 (9%) |
amd64-freebsd | 1.16 | release (no geos) | 289463288 | 0 | 64733620 (22%) |
amd64-linux | 1.15 | dev | 182679320 | 30811805 (17%) | 27445431 (15%) |
amd64-freebsd | 1.15 | dev (no geos) | 305452912 | 30824594 (10%) | 27052769 (9%) |
amd64-freebsd | 1.16 | dev (no geos) | 289463280 | 0 | 64733616 (22%) |
What do we see here?
From Go 1.15 to 1.16, the overall size of the executable file has decreased. This reflects improvements in the Go toolchain.
Meanwhile, the bytes previously occupied by pclntab are now part of the dark bytes, which are not in the symbol table.
Note
After the above was published, Russ Cox explained:
One thing that did change from Go 1.15 to Go 1.16 is that we broke up the pclntab into a few different pieces. Again, it’s all in the section headers. But the pieces are not in the actual binary’s symbol table anymore, because they don’t need to be. And since the format is different, we would have removed the old “runtime.pclntab” symbol entirely, except some old tools got mad if the symbol was missing. So we left the old symbol table entry present, with a zero length.
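Since the pieces are said to be “all in the section headers”, one way to look for them is to list section sizes instead of symbol sizes. The sketch below parses the section header table of a little-endian ELF64 file (the format used on amd64 Linux and FreeBSD) with nothing but the standard library. It is an illustration under assumptions, not a replacement for readelf or objdump; in particular, the expectation that the line table lives in a section named .gopclntab is my assumption and is not stated above.

```python
import struct

# Layouts of the ELF64 file header and section header (little-endian).
ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
SECTION_HEADER = struct.Struct("<IIQQQQIIQQ")
SHT_NOBITS = 8  # sections such as .bss that occupy no bytes in the file


def elf64_section_sizes(data):
    """Map section name -> bytes occupied in the file, for an ELF64 image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    header = ELF_HEADER.unpack_from(data, 0)
    shoff, shentsize, shnum, shstrndx = (
        header[6], header[11], header[12], header[13])
    sections = [SECTION_HEADER.unpack_from(data, shoff + i * shentsize)
                for i in range(shnum)]
    # The section-name string table is itself one of the sections.
    _, _, _, _, str_off, str_size, *_ = sections[shstrndx]
    names = data[str_off:str_off + str_size]
    sizes = {}
    for name_off, sh_type, _, _, _, sh_size, *_ in sections:
        name = names[name_off:names.index(b"\0", name_off)].decode()
        sizes[name] = 0 if sh_type == SHT_NOBITS else sh_size
    return sizes
```

Calling elf64_section_sizes on the bytes of an executable and sorting the result by size gives a view of the file that does not depend on the symbol table at all, which is where the “dark” bytes should become visible.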
So much data! For what exactly?
An interesting way to think about the results above is that we now have two parts of a Go executable file that do not really contribute to making a program “work”:
- Optional parts, which can be deleted via strip:
  - the symbol table itself,
  - the debugging information.
- Parts which are neither code nor data; we’ll call them “Go internal data”:
  - The pclntab, when generated. This is needed to generate stack traces upon errors and other debugging-related runtime features of Go programs.
  - The “dark bytes”, which is byte usage in the raw executable file not accounted for in the symbol table.
We can derive the numbers from the tables above:
Go | Raw size | Stripped | Optional | Code+data | Go internal |
---|---|---|---|---|---|
1.8 | 39830792 | 39799624 | 31168 | 24900258 | 14899366 |
1.8 | 39799624 | 39830792 | 0 | 25053365 | 14777427 |
1.8 | 43447496 | 43447496 | 0 | 27409086 | 16038410 |
1.8 | 46300200 | 46300200 | 0 | 28746324 | 17553876 |
1.10 | 54384568 | 54384576 | 0 | 33717848 | 20666728 |
1.10 | 56432824 | 56432832 | 0 | 34763445 | 21669387 |
1.10 | 135223352 | 68835904 | 66387448 | 40847718 | 27988186 |
1.10 | 136101520 | 69429056 | 66672464 | 40269296 | 29159760 |
1.11 | 124365384 | 111470120 | 12895264 | 46111565 | 65358555 |
1.11 | 124588560 | 111688008 | 12900552 | 46162356 | 65525652 |
1.12 | 163978120 | 145398096 | 18580024 | 58915978 | 86482118 |
1.12 | 165974336 | 147303432 | 18670904 | 59839146 | 87464286 |
1.13 | 135223352 | 147594448 | 0 | 63861918 | 83732530 |
1.13 | 167269624 | 148256120 | 19013504 | 64134981 | 84121139 |
1.13 | 209352968 | 188618784 | 20734184 | 81527222 | 107091562 |
1.13 | 211694984 | 190680384 | 21014600 | 82460284 | 108220100 |
1.15 | 183075488 | 135352792 | 47722696 | 77029481 | 58323311 |
1.15 | 182679320 | 135270120 | 47409200 | 77012884 | 58257236 |
1.15 | 305452856 | 133460408 | 171992448 | 75583105 | 57877303 |
1.15 | 305452912 | 133460472 | 171992440 | 75583109 | 57877363 |
1.16 | 289463288 | 140513816 | 148949472 | 75780196 | 64733620 |
1.16 | 289463280 | 140513816 | 148949464 | 75780200 | 64733616 |
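The three groups can be recomputed from the measurements in the earlier tables. A sketch of the arithmetic, as I understand it from the rows (the function name is hypothetical):

```python
def size_groups(raw, stripped, nm_sum, pclntab):
    """Split an executable into (optional, code+data, Go internal) bytes."""
    # Removable by `strip`: symbol table and debugging information.
    optional = max(raw - stripped, 0)
    # Advertised symbols, minus the pclntab symbol which is Go internal.
    code_data = nm_sum - pclntab
    # Bytes left in the stripped file that no symbol accounts for.
    dark = stripped - nm_sum
    go_internal = pclntab + dark
    return optional, code_data, go_internal
```

For v20.2.7 (raw 211694984, stripped 190680384, symbol sum 118928245, pclntab 36467961) this gives 21014600 optional bytes, 82460284 bytes of code+data and 108220100 Go internal bytes, matching the corresponding Go 1.13 row.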
Note
Not all the changes from row to row in the previous table are attributable to the Go compiler. Obviously, the CockroachDB software has evolved as well.
Note
The exception to this is the last 4 rows in the table, which are produced with the same version of CockroachDB (v21.1-alpha, see previous section).
Note
After the above was published, user ‘zeebo’ from the site Lobste.rs did the work to compile the same version of CockroachDB (v20.2.0) with different Go versions. The results are published here.
External perspective
We can look at the raw numbers above from an “external” perspective: how much of the payload on disk is each group of data responsible for?
Here are the rows in the table above where the executable was pre-stripped:
Go | Code+data bytes | Go internal bytes |
---|---|---|
1.8 | 62.5% | 37.4% |
1.8 | 62.9% | 37.1% |
1.8 | 63.1% | 36.9% |
1.8 | 62.1% | 37.9% |
1.10 | 62.0% | 38.0% |
1.10 | 61.6% | 38.4% |
1.13 | 47.2% | 61.9% |
These percentages are the ratio of the corresponding size in the table above to the total raw size of the executable file.
The ratios are somewhat consistent throughout this sequence, except for the last row, which is an outlier. I do not have an explanation for that, but I find it possible that the corresponding binary, CockroachDB v20.1.0, being the first in its release series, was produced with different compiler flags.
Here are the rows where the optional bytes were still present:
Go | Optional bytes | Code+data bytes | Go internal bytes |
---|---|---|---|
1.10 | 49.1% | 30.2% | 20.7% |
1.10 | 49.0% | 29.6% | 21.4% |
1.11 | 10.4% | 37.1% | 52.6% |
1.11 | 10.4% | 37.1% | 52.6% |
1.12 | 11.3% | 35.9% | 52.7% |
1.12 | 11.2% | 36.1% | 52.7% |
1.13 | 11.4% | 38.3% | 50.3% |
1.13 | 9.9% | 38.9% | 51.2% |
1.13 | 9.9% | 39.0% | 51.1% |
1.15 | 26.1% | 42.1% | 31.9% |
1.15 | 26.0% | 42.2% | 31.9% |
1.15 | 56.3% | 24.7% | 18.9% |
1.16 | 51.5% | 26.2% | 22.4% |
What we see from this data:
- The “share” of Go-internal data as % of the total executable size was growing up to Go 1.13, and then decreased markedly after that.
- However, this decrease was accompanied by a strong increase in the optional bytes.
My interpretation of this shift:
- Some debugging data was previously stored as Go objects and was moved to a DWARF (debugging data) representation, which is now strippable.
- The encoding of pclntab and/or other data structures in the Go runtime has become more efficient.
I am still surprised by the uptick in the % of Go internal data in Go 1.16 relative to Go 1.15, which remains to be investigated. (Also, as noted above, this % increase is still accompanied by a general decrease in absolute executable size, for the same source code.)
Internal perspective
Here, we are going to try and check whether the “Go internal data” is derived from the program code+data as a constant factor overhead.
For this, we take the same raw numbers as above but re-express the Go internal data as percentage of the size of code+data on disk (i.e. excluding the optional data).
Taking percentages also allows us to abstract from changes in the program’s source code.
This looks as follows:
Go | pclntab | Dark bytes | Total |
---|---|---|---|
1.8 | 29.4% | 30.5% | 59.8% |
1.8 | 29.2% | 29.8% | 59.0% |
1.8 | 29.9% | 28.6% | 58.5% |
1.8 | 31.7% | 29.4% | 61.1% |
1.10 | 31.9% | 29.4% | 61.3% |
1.10 | 32.2% | 30.1% | 62.3% |
1.10 | 35.2% | 33.4% | 68.5% |
1.10 | 35.9% | 36.5% | 72.4% |
1.11 | 54.3% | 87.4% | 141.7% |
1.11 | 54.4% | 87.6% | 141.9% |
1.12 | 57.1% | 89.7% | 146.8% |
1.12 | 56.8% | 89.3% | 146.2% |
1.13 | 46.9% | 84.3% | 131.1% |
1.13 | 46.9% | 84.3% | 131.2% |
1.13 | 44.3% | 87.0% | 131.4% |
1.13 | 44.2% | 87.0% | 131.2% |
1.15 | 39.9% | 35.8% | 75.7% |
1.15 | 40.0% | 35.6% | 75.6% |
1.15 | 40.8% | 35.8% | 76.6% |
1.15 | 40.8% | 35.8% | 76.6% |
1.16 | 0.0% | 85.4% | 85.4% |
1.16 | 0.0% | 85.4% | 85.4% |
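Each row of this table is a set of ratios against the code+data size. A small sketch of that computation, with a hypothetical function name:

```python
def internal_overhead(code_data, pclntab, dark):
    """Express Go internal bytes as percentages of code+data bytes."""
    def pct(n):
        return 100.0 * n / code_data
    return pct(pclntab), pct(dark), pct(pclntab + dark)
```

Using the Go 1.15 Linux release build (code+data 77029481, pclntab 30763345, dark 27559966) gives roughly 39.9%, 35.8% and 75.7%. With the Go 1.16 FreeBSD build (code+data 75780196, pclntab 0, dark 64733620) the whole 85.4% lands in the dark column.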
What we see in this table:
- In Go 1.16, the pclntab data is not represented any more in the symbol table, but it is still there in the overall Go internal bytes; it simply has become “dark”.
- The Go internal data had become non-linearly larger than code+data bytes across Go versions until Go 1.13. Moreover, up to and including Go 1.12, the function was not a constant factor of the compiled size, meaning that the expansion was dependent on the type of program in the source code or some other factor.
- From Go 1.13 onward, the factor became about constant for a given Go version, as revealed by the Go 1.13-1.16 data points. This means that we can deem it reasonable to assume that the Go internal data is now sized as a function of the compiled program code+data, and not any more some other factor.
- The overall percentage of Go internal data has increased again in Go 1.16. This was unexpected and I do not have an explanation for this yet. (Note: this is not a problem per se as the overall executable size has decreased at the same time.)
In any case, we are still looking at an executable file where nearly as much space is occupied by Go internal data as by code+data (with a mere 15% difference in Go 1.16).
Summary and conclusions
In our original analysis in 2019, we looked at the output of go tool nm -size and drew a tree map representation for it. This helped us detect an anomaly in the size of a special data structure called runtime.pclntab, which was growing excessively large.
In files generated by a Go 1.16 compiler, this data structure appears to be absent from the symbol table. Was it removed?
This year, we revisited the analysis and discovered that the symbol table is not complete. There are many bytes in the binary executable that are not accounted for, neither by the announced size of objects in the symbol table, nor by the size of the symbol table itself.
We can call this the “dark file usage” of Go binaries. It is “dark” not because it is “bad” or “unknown” but because it is not enlightened by the symbol table.
In fact, the removal of the advertised bytes for pclntab in Go 1.16 was accompanied by a corresponding increase to the “dark” bytes. The data is still there, it is just not accounted for. As explained by Russ Cox from the Go team, it is not accounted for because it does not need to be.
These bytes of non-code, non-data objects have been made necessary by internal algorithmic choices by the Go team, which are not made in the same way in other languages, and we can thus say they are “Go internal bytes”.
They are nearly as large as the code+data bytes in the file: for every 6 bytes of code+data, there are 5 accompanying “Go internal” bytes (approximately, in the latest version).
The proportion of these “Go internal bytes” relative to program code+data has grown in CockroachDB over the course of Go versions up to Go 1.13, and then reduced in Go 1.15, and has grown again in Go 1.16. However, meanwhile, improvements in the Go toolchain have reduced the absolute size of compiled code+data to a larger extent, so that the corresponding small increase in the size of Go internal bytes is not reflected in total file size.
Separately and independently, the Go compiler produces extensive debugging information in the DWARF format. This data, together with the symbol tables, grows in proportion to the number and size of items in the compiled code. It is optional and can be removed (“stripped”) without change in functionality. However, it is also fairly large and has grown recently in absolute size: more than 50% of total executable size in the latest versions. Moreover, Go application deployments in the wild do not often strip this data out.
So in conclusion, if we combine the Go internal bytes (both “dark” and non-“dark”) with the optional symbol tables and debugging information, we see a growing proportion of executable files encoded as non-code, non-data, currently over 70% for the CockroachDB executable (51.5% optional plus 22.4% Go internal in the Go 1.16 build above).