Note

The latest version of this document can be found online at https://dr-knz.net/go-executable-size-visualization-with-d3-2021.html.

Note

After a lively discussion on Hacker News and input from Russ Cox, the conclusions in the analysis below were reworded to avoid the notion of “non-useful bytes”. The bytes have a purpose.

Introduction

Two years ago, my article “Why are my Go executable files so large?” showed how to utilize D3 and a tree map visualization to explore the size of executable files produced by the Go compiler.

A few things have changed since, and so an update is in order.

Tooling updates

As presented the first time around, we are using a data pipeline that looks as follows:

  1. build a Go executable.
  2. use go tool nm -size and apply c++filt on the output.
  3. transform the output into a tree using a custom-designed Python script tab2dict.py.
  4. transform the tree into a valid input for the D3 tree map visualization using a custom-designed Python script simplify.py.

The source code for the Python scripts is public on GitHub: https://github.com/knz/go-binsize-viz
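As a rough illustration of steps 2 and 3, here is a minimal Python sketch that sums symbol sizes into a tree keyed by package path. It is not the actual tab2dict.py: the function names are made up for this example, it assumes each line of the go tool nm -size output has the form "address size type name", and it splits symbol names far more naively than the real regular expressions do.

```python
def parse_nm_line(line):
    """Return (size, name) for one line of nm output, or None if unusable.

    Assumed layout: hex address, decimal size, one-letter type, symbol name.
    """
    fields = line.split(None, 3)  # maxsplit=3 keeps spaces inside the name
    if len(fields) != 4:
        return None  # e.g. undefined symbols printed without an address
    _addr, size, _kind, name = fields
    if not size.isdigit():
        return None
    return int(size), name


def split_symbol(name):
    """Split 'a/b/pkg.Func' into ['a', 'b', 'pkg', 'Func'] (simplified)."""
    head, slash, tail = name.rpartition("/")
    pkg, dot, sym = tail.partition(".")
    parts = head.split("/") if slash else []
    parts.append(pkg)
    if dot:
        parts.append(sym)
    return parts


def build_tree(lines):
    """Accumulate symbol sizes along each path component."""
    root = {"size": 0, "children": {}}
    for line in lines:
        parsed = parse_nm_line(line)
        if parsed is None:
            continue
        size, name = parsed
        node = root
        node["size"] += size
        for part in split_symbol(name):
            node = node["children"].setdefault(
                part, {"size": 0, "children": {}})
            node["size"] += size
    return root
```

Feeding it the lines of the nm output (for example, read from a text file) yields a nested dictionary whose root size is the "sum of symbol sizes" used later in this article.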

Since 2019, the Go and C++ compilers produce a larger diversity of symbols; the regular expressions used in tab2dict.py have been adjusted accordingly.

Separately, there was a usability shortcoming in the original implementation: if a Go package contained both some source files (e.g. sql/create.go) and sub-packages (e.g. sql/sem/tree/eval.go), the “own” size of the package and that of its sub-packages were appearing side-by-side in the visualization, instead of “inside” each other. This was confusing because the (human) explorer naturally expects a hierarchical view between these two values.

This shortcoming has also been corrected.
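One possible way to obtain this nesting, sketched in Python below, is to turn a package's own bytes into a synthetic child node so that they are drawn inside the package rectangle next to its sub-packages. This is only an illustration and not necessarily what simplify.py does; the "name"/"value"/"children" keys follow the usual D3 hierarchy convention, and the "(own code)" label and function names are invented for this example.

```python
def nest_own_size(name, own_size, subpackages):
    """Build a D3-style node whose own bytes appear as a child entry.

    subpackages is a list of nodes already built by this function.
    """
    if not subpackages:
        # Leaf package: its own size is the whole rectangle.
        return {"name": name, "value": own_size}
    children = list(subpackages)
    if own_size > 0:
        children.insert(0, {"name": "(own code)", "value": own_size})
    return {"name": name, "children": children}


def total(node):
    """Total bytes under a node, counting own code and sub-packages."""
    if "children" in node:
        return sum(total(child) for child in node["children"])
    return node["value"]


# sql has 300 bytes of its own code and a sub-package tree sql/sem/tree.
tree_pkg = nest_own_size("tree", 200, [])
sem_pkg = nest_own_size("sem", 0, [tree_pkg])
sql_pkg = nest_own_size("sql", 300, [sem_pkg])
```

With this layout, the explorer sees one rectangle for sql that contains both its own code and sem, instead of two sibling rectangles.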

Example visualization

Here is a visualization for CockroachDB v20.2.7, the latest stable release at the time of this writing:

Surprising finding: “dark” file usage

The sum of the sizes reported by go tool nm does not add up to the final size of the Go executable.

For example, in the CockroachDB 20.2.7 binary:

  • the file occupies 211694984 bytes (202MiB) on disk;
  • however, the sum of symbol sizes adds up to 118928245 bytes (113MiB);
  • there is a gap of 92766739 bytes (88MiB), or ~44% of the file, that is unaccounted for.

At first I suspected that this size was occupied by the symbol table itself, or the debugging information. To check this, we can use strip to remove the symbol table and debugging info and observe the difference. Alas:

  • the stripped executable size is 190680384 bytes (182MiB) on disk;
  • so there is still a gap of ~68MiB, or ~34% non-symtable data that is unaccounted for.

At this time, I do not have an explanation for this “dark” file usage.

(The word “dark” here reflects the idea that the bytes are not illuminated by the symbol table. It is also inspired by the concept of dark silicon invented/discovered around 2011. It is not a statement of moral value about the nature of these bytes.)

Note

After originally stating the above, Russ Cox from the Go team explained that the “dark” bytes are metadata for garbage collection and reflection, which are not present in the symbol table because 1) they do not need to be 2) accounting for them in the symbol table would make the binary even larger.

We can see how this dark file usage has evolved throughout the growth of CockroachDB:

CockroachDB version Go Exec. size (MiB) Stripped Sum nm -size Symtable sz. Dark bytes % dark bytes
v1.0.0 1.8 39830792 38.0 39799624 32216984 31168 7582640 19.0%
v1.0.7 1.8 39799624 38.0 39830792 32371395 0 7459397 18.7%
v1.1.0 1.8 43447496 41.4 43447496 35602483 0 7845013 18.1%
v1.1.9 1.8 46300200 44.2 46300200 37849642 0 8450558 18.2%
v2.0.0 1.10 54384568 51.9 54384576 44463267 0 9921309 18.2%
v2.0.7 1.10 56432824 53.8 56432832 45969263 0 10463569 18.5%
v2.1.0 1.10 135223352 129.0 68835904 55212282 66387448 13623622 10.1%
v2.1.11 1.10 136101520 129.8 69429056 54714649 66672464 14714407 10.8%
v19.1.0 1.11 124365384 118.6 111470120 71166968 12895264 40303152 32.4%
v19.1.11 1.11 124588560 118.8 111688008 71257435 12900552 40430573 32.4%
v19.2.0 1.12 163978120 156.4 145398096 92535059 18580024 52863037 32.2%
v19.2.12 1.12 165974336 158.3 147303432 93850056 18670904 53453376 32.2%
v20.1.0 1.13 135223352 129.0 147594448 93789751 0 53804697 39.8%
v20.1.13 1.13 167269624 159.5 148256120 94208103 19013504 54048017 32.3%
v20.2.0 1.13 209352968 199.7 188618784 117667098 20734184 70951686 33.9%
v20.2.7 1.13 211694984 201.9 190680384 118928245 21014600 71752139 33.9%
v21.1-alpha-geb1aa69bc4 1.15 183075488 174.6 135352792 107792826 47722696 27559966 15.1%

In this table:

  • “Exec. size” is the raw size of the executable file, in bytes.
  • “Stripped” is the size of the executable after the strip command was applied; i.e. after the symbol table and debugging information are removed.
  • “Sum nm -size” is the sum of the advertised sizes of the entries in the symbol table.
  • “Symtable sz.” is the estimated size of the symbol table and debugging information, as deduced by taking the difference between the first two sizes (“Exec. size” minus “Stripped”). We can see that the v1.0.7 to v2.0.7 executables, as well as v20.1.0, were released pre-stripped.
  • “Dark bytes” is the gap between the raw file size and the combined sum of the advertised symbol sizes and the symbol table’s size and debugging information, in bytes.
  • “% dark bytes” is the percentage of the dark bytes relative to the raw file size.
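The column definitions above can be written down as a small Python sketch. The function name is made up for this example; clamping the symbol table size at zero is an assumption that matches the rows where the stripped file is not smaller than the original one.

```python
def derive_row(exec_size, stripped_size, nm_sum):
    """Derive the last three columns of the table from the measured sizes.

    exec_size:     raw size of the executable, in bytes
    stripped_size: size after running strip, in bytes
    nm_sum:        sum of the sizes advertised by go tool nm -size
    """
    symtable = max(exec_size - stripped_size, 0)
    dark = stripped_size - nm_sum
    pct_dark = round(100.0 * dark / exec_size, 1)
    return symtable, dark, pct_dark


# The v2.1.0 row of the table.
print(derive_row(135223352, 68835904, 55212282))
```

For the v2.1.0 row this yields a symbol table of 66387448 bytes, 13623622 dark bytes and 10.1% dark bytes.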

We can see that the dark size percentage was lower than 20% prior to CockroachDB v19.1, and has then been oscillating around 33% of the file size until v21.1. With the upcoming v21.1 release, using Go 1.15, the dark size is reduced to 15% again.

Note

Not all the changes from row to row in the previous table are attributable to the Go compiler. Obviously, the CockroachDB software has evolved as well.

The evolution of pclntab

Up to Go 1.14

As explained in the previous analysis, up to and including Go 1.15 the compiler would generate a special table called runtime.pclntab inside the executable.

The purpose of this data structure is to enable the Go runtime system to produce descriptive stack traces upon a crash or upon internal requests via the runtime.Stack API.

We can see how this table grows across Go versions until v1.13, and then decreases in v1.15:

CockroachDB version Go Exec. size (MiB) pclntab sz (MiB) % pclntab
v1.0.0 1.8 39830792 38.0 7316726 7.0 18.4%
v1.0.7 1.8 39799624 38.0 7318030 7.0 18.4%
v1.1.0 1.8 43447496 41.4 8193397 7.8 18.9%
v1.1.9 1.8 46300200 44.2 9103318 8.7 19.7%
v2.0.0 1.10 54384568 51.9 10745419 10.2 19.8%
v2.0.7 1.10 56432824 53.8 11205818 10.7 19.9%
v2.1.0 1.10 135223352 129.0 14364564 13.7 10.6%
v2.1.11 1.10 136101520 129.8 14445353 13.8 10.6%
v19.1.0 1.11 124365384 118.6 25055403 23.9 20.1%
v19.1.11 1.11 124588560 118.8 25095079 23.9 20.1%
v19.2.0 1.12 163978120 156.4 33619081 32.1 20.5%
v19.2.12 1.12 165974336 158.3 34010910 32.4 20.5%
v20.1.0 1.13 135223352 129.0 29927833 28.5 22.1%
v20.1.13 1.13 167269624 159.5 30073122 28.7 18.0%
v20.2.0 1.13 209352968 199.7 36139876 34.5 17.3%
v20.2.7 1.13 211694984 201.9 36467961 34.8 17.2%
v21.1-alpha-geb1aa69bc4 1.15 183075488 174.6 30763345 29.3 16.8%

The large size of the pclntab was due to a choice by the Go team to store the mapping of program counters to function names uncompressed.

To paraphrase:

  • prior to 1.2, the Go linker was emitting a compressed line table, and the program would decompress it upon initialization at run-time.
  • in Go 1.2, a decision was made to pre-expand the line table in the executable file into its final format suitable for direct use at run-time, without an additional decompression step.

In other words, the Go team decided to make executable files larger to save up on initialization time and run-time memory usage.
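To make the trade-off concrete, here is a toy Python sketch that stores a table of (program counter, line number) pairs either as variable-length deltas, which must be decoded before use, or as fixed-width entries that can be used directly. This is not Go's actual pclntab format: the encoding, the assumed 16 bytes per expanded entry and all function names are chosen purely for illustration.

```python
def encode_varint(n):
    """Encode a non-negative integer using 7 bits per byte."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def zigzag(n):
    """Map signed integers to unsigned so small negative deltas stay small."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def unzigzag(u):
    return (u >> 1) if u % 2 == 0 else -((u + 1) >> 1)


def compress(table):
    """Encode (pc, line) pairs, sorted by pc, as varint deltas."""
    out = bytearray()
    prev_pc, prev_line = 0, 0
    for pc, line in table:
        out += encode_varint(pc - prev_pc)
        out += encode_varint(zigzag(line - prev_line))
        prev_pc, prev_line = pc, line
    return bytes(out)


def expand(blob):
    """Decode the compressed form back into a directly usable table."""
    table, pc, line, pos = [], 0, 0, 0

    def read():
        nonlocal pos
        shift = value = 0
        while True:
            byte = blob[pos]
            pos += 1
            value |= (byte & 0x7F) << shift
            shift += 7
            if not byte & 0x80:
                return value

    while pos < len(blob):
        pc += read()
        line += unzigzag(read())
        table.append((pc, line))
    return table


def expanded_size(table):
    """Size if every entry used two 64-bit integers (an assumption)."""
    return 16 * len(table)
```

The compressed form is smaller on disk but requires the expand step, and memory for its result, before any lookup; the expanded form costs nothing at start-up but takes more room in the file.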

As we discussed back then, this choice was not well warranted for network servers like CockroachDB, which are started rarely and for which the size of the program on disk matters more than the start-up time.

Go 1.15 and beyond

The publication of my article in 2019, together with the community outcry that it triggered, was actually noticed by the Go team.

The Go team subsequently decided to change course and start working on compressing pclntab again.

We can see this change in the table above:

  • starting in Go 1.15, the pclntab is compressed again.
  • starting in Go 1.16, the pclntab is not embedded directly in the binary (or, at least, its advertised size in the symbol table is zero), and certain parts of it are re-computed from other data in the executable file at run time.

What does this change exactly?

Transition from Go 1.15 to 1.16

Using the source code for CockroachDB v21.1-alpha-geb1aa69bc4, we can produce custom builds across Linux and FreeBSD, with both the 1.15 and 1.16 compilers.

Platform Go Build mode Exec. sz. pclntab sz. Dark sz.
amd64-linux 1.15 release 183075488 30763345 (17%) 27559966 (15%)
amd64-freebsd 1.15 release (no geos) 305452856 30824594 (10%) 27052709 (9%)
amd64-freebsd 1.16 release (no geos) 289463288 0 64733620 (22%)
amd64-linux 1.15 dev 182679320 30811805 (17%) 27445431 (15%)
amd64-freebsd 1.15 dev (no geos) 305452912 30824594 (10%) 27052769 (9%)
amd64-freebsd 1.16 dev (no geos) 289463280 0 64733616 (22%)

What do we see here?

From Go 1.15 to 1.16, the overall size of the executable file has decreased. This reflects improvements in the Go toolchain.

Meanwhile, the bytes previously occupied by pclntab are now part of the dark bytes, which are not in the symbol table.

Note

After the above was published, Russ Cox explained:

One thing that did change from Go 1.15 to Go 1.16 is that we broke up the pclntab into a few different pieces. Again, it’s all in the section headers. But the pieces are not in the actual binary’s symbol table anymore, because they don’t need to be. And since the format is different, we would have removed the old “runtime.pclntab” symbol entirely, except some old tools got mad if the symbol was missing. So we left the old symbol table entry present, with a zero length.

So much data! For what exactly?

An interesting way to think about the results above is that we now have two parts of a Go executable file that do not really contribute to making a program “work”:

  • Optional parts, which can be deleted via strip:
    • the symbol table itself,
    • the debugging information.
  • Parts which are neither code nor data; we’ll call them “Go internal data”:
    • The pclntab, when generated. This is needed to generate stack traces upon errors and other debugging-related runtime features of Go programs.
    • The “dark bytes”, which is byte usage in the raw executable file not accounted for in the symbol table.

We can derive the numbers from the tables above:

Go Raw size Stripped Optional Code+data Go internal
1.8 39830792 39799624 31168 24900258 14899366
1.8 39799624 39830792 0 25053365 14777427
1.8 43447496 43447496 0 27409086 16038410
1.8 46300200 46300200 0 28746324 17553876
1.10 54384568 54384576 0 33717848 20666728
1.10 56432824 56432832 0 34763445 21669387
1.10 135223352 68835904 66387448 40847718 27988186
1.10 136101520 69429056 66672464 40269296 29159760
1.11 124365384 111470120 12895264 46111565 65358555
1.11 124588560 111688008 12900552 46162356 65525652
1.12 163978120 145398096 18580024 58915978 86482118
1.12 165974336 147303432 18670904 59839146 87464286
1.13 135223352 147594448 0 63861918 83732530
1.13 167269624 148256120 19013504 64134981 84121139
1.13 209352968 188618784 20734184 81527222 107091562
1.13 211694984 190680384 21014600 82460284 108220100
1.15 183075488 135352792 47722696 77029481 58323311
1.15 182679320 135270120 47409200 77012884 58257236
1.15 305452856 133460408 171992448 75583105 57877303
1.15 305452912 133460472 171992440 75583109 57877363
1.16 289463288 140513816 148949472 75780196 64733620
1.16 289463280 140513816 148949464 75780200 64733616
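The columns of this table follow from the earlier measurements by simple arithmetic, sketched here in Python. The function name is invented for this example; the formulas are my reading of the definitions in the list above (code+data is what nm advertises minus pclntab, and Go internal is everything else that survives strip).

```python
def split_bytes(raw, stripped, nm_sum, pclntab):
    """Split an executable into optional, code+data and Go internal bytes.

    raw:      size of the executable file
    stripped: size after running strip
    nm_sum:   sum of the sizes advertised by go tool nm -size
    pclntab:  advertised size of runtime.pclntab
    """
    optional = max(raw - stripped, 0)   # symbol table + debugging info
    code_data = nm_sum - pclntab        # advertised symbols minus pclntab
    go_internal = stripped - code_data  # pclntab plus the dark bytes
    return optional, code_data, go_internal


# CockroachDB v20.2.7, built with Go 1.13.
print(split_bytes(211694984, 190680384, 118928245, 36467961))
```

For v20.2.7 this gives 21014600 optional bytes, 82460284 bytes of code+data and 108220100 Go internal bytes, matching the corresponding Go 1.13 row.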

Note

Not all the changes from row to row in the previous table are attributable to the Go compiler. Obviously, the CockroachDB software has evolved as well.

Note

The exception to this is the last 4 rows in the table, which are produced with the same version of CockroachDB (v21.1-alpha, see previous section).

Note

After the above was published, user ‘zeebo’ from the site Lobste.rs did the work to compile the same version of CockroachDB (v20.2.0) with different Go versions. The results are published here.

External perspective

We can look at the raw numbers above from an “external” perspective: how much does each group of data contribute to the payload on disk?

Here are the rows in the table above where the executable was pre-stripped:

Go Code+data bytes Go internal bytes
1.8 62.5% 37.4%
1.8 62.9% 37.1%
1.8 63.1% 36.9%
1.8 62.1% 37.9%
1.10 62.0% 38.0%
1.10 61.6% 38.4%
1.13 47.2% 61.9%

These percentages are the ratio of the corresponding size in the table above to the total raw size of the executable file.

The ratios are somewhat consistent throughout this sequence, except for the last row, which is an outlier. I do not have an explanation for that, but I find it possible that the corresponding binary, CockroachDB v20.1.0 and thus the first in its release series, was produced with different compiler flags.

Here are the rows where the optional bytes were still present:

Go Optional bytes Code+data bytes Go internal bytes
1.10 49.1% 30.2% 20.7%
1.10 49.0% 29.6% 21.4%
1.11 10.4% 37.1% 52.6%
1.11 10.4% 37.1% 52.6%
1.12 11.3% 35.9% 52.7%
1.12 11.2% 36.1% 52.7%
1.13 11.4% 38.3% 50.3%
1.13 9.9% 38.9% 51.2%
1.13 9.9% 39.0% 51.1%
1.15 26.1% 42.1% 31.9%
1.15 26.0% 42.2% 31.9%
1.15 56.3% 24.7% 18.9%
1.16 51.5% 26.2% 22.4%

What we see from this data:

  • The “share” of Go-internal data as % of the total executable size was growing up to Go 1.13, and then decreased markedly after that.
  • However, this decrease was accompanied by a strong increase in the optional bytes.

My interpretation of this shift:

  • some debugging data was previously stored as Go objects and was moved to a DWARF (debugging data) representation, which is now strippable.
  • The encoding of pclntab and/or other data structures in the Go runtime has become more efficient.

I am still surprised by the uptick in the % of Go internal data in v1.16 relative to v1.15, which remains to be investigated. (Also, as noted above, this % increase is still accompanied by a general decrease in absolute executable size, for the same source code.)

Internal perspective

Here, we are going to try and check whether the “Go internal data” is derived from the program code+data as a constant factor overhead.

For this, we take the same raw numbers as above but re-express the Go internal data as percentage of the size of code+data on disk (i.e. excluding the optional data).

Taking percentages also allows us to abstract from changes in the program’s source code.

This looks as follows:

Go pclntab Dark bytes Total
1.8 29.4% 30.5% 59.8%
1.8 29.2% 29.8% 59.0%
1.8 29.9% 28.6% 58.5%
1.8 31.7% 29.4% 61.1%
1.10 31.9% 29.4% 61.3%
1.10 32.2% 30.1% 62.3%
1.10 35.2% 33.4% 68.5%
1.10 35.9% 36.5% 72.4%
1.11 54.3% 87.4% 141.7%
1.11 54.4% 87.6% 141.9%
1.12 57.1% 89.7% 146.8%
1.12 56.8% 89.3% 146.2%
1.13 46.9% 84.3% 131.1%
1.13 46.9% 84.3% 131.2%
1.13 44.3% 87.0% 131.4%
1.13 44.2% 87.0% 131.2%
1.15 39.9% 35.8% 75.7%
1.15 40.0% 35.6% 75.6%
1.15 40.8% 35.8% 76.6%
1.15 40.8% 35.8% 76.6%
1.16 0.0% 85.4% 85.4%
1.16 0.0% 85.4% 85.4%
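For reference, each row of this table can be recomputed with the following Python sketch. The function name is made up for this example; the total is rounded after summing the two parts, which is why it can differ by 0.1 from the sum of the two rounded columns.

```python
def internal_overhead(code_data, pclntab, dark):
    """Express pclntab, dark bytes and their sum as a % of code+data."""
    def pct(part):
        return round(100.0 * part / code_data, 1)

    return pct(pclntab), pct(dark), pct(pclntab + dark)


# CockroachDB v20.2.7 (Go 1.13) and the Go 1.16 FreeBSD release build.
print(internal_overhead(82460284, 36467961, 71752139))
print(internal_overhead(75780196, 0, 64733620))
```

The first call gives 44.2%, 87.0% and 131.2%; the second gives 0.0%, 85.4% and 85.4%, as in the table.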

What we see in this table:

  • In Go 1.16, the pclntab data is not represented any more in the symbol table, but it is still there in the overall Go internal bytes; it simply has become “dark”.
  • The Go internal data had become non-linearly larger than code+data bytes across Go versions until Go 1.13. Moreover, up to and including Go 1.12, the function was not a constant factor of the compiled size, meaning that the expansion was dependent on the type of program in the source code or some other factor.
  • From Go 1.13 onward, the factor became about constant for a given Go version, as revealed by the Go 1.13-1.16 data points. This means that we can deem it reasonable to assume that the Go internal data is now sized as a function of the compiled program code+data, and not any more some other factor.
  • The overall percentage of Go internal data has increased again in Go 1.16. This was unexpected and I do not have an explanation for this yet. (Note: this is not a problem per se as the overall executable size has decreased at the same time.)

In any case, we are still looking at an executable file where nearly as much space is occupied by Go internal data as by code+data (with a mere 15% difference in Go 1.16).

Summary and conclusions

In our original analysis in 2019, we looked at the output of go tool nm -size and drew a tree map representation for it. This helped us detect an anomaly, in the size of a special data structure called runtime.pclntab, which was growing excessively large.

In files generated by a Go 1.16 compiler, this data structure appears to be absent from the symbol table. Was it removed?

This year, we revisited the analysis and discovered that the symbol table is not complete. There are many bytes in the binary executable that are not accounted for, neither by the announced size of objects in the symbol table, nor by the size of the symbol table itself.

We can call this the “dark file usage” of Go binaries. It is “dark” not because it is “bad” or “unknown” but because it is not illuminated by the symbol table.

In fact, the removal of the advertised bytes for pclntab in Go 1.16 was accompanied by a corresponding increase to the “dark” bytes. The data is still there, it is just not accounted for. As explained by Russ Cox from the Go team, it is not accounted for because it does not need to be.

These bytes of non-code, non-data objects have been made necessary by internal algorithmic choices by the Go team, which are not made in the same way in other languages, and we can thus say they are “Go internal bytes”.

They are nearly as large as the code+data bytes in the file: for every 6 bytes of code+data, there are 5 accompanying “Go internal” bytes (approximately, in the latest version).

The proportion of these “Go internal bytes” relative to program code+data has grown in CockroachDB over the course of Go versions up to Go 1.13, and then reduced in Go 1.15, and has grown again in Go 1.16. However, meanwhile, improvements in the Go toolchain have reduced the absolute size of compiled code+data to a larger extent, so that the corresponding small increase in the size of Go internal bytes is not reflected in total file size.

Separately and independently, the Go compiler produces extensive debugging information in the DWARF format. This data, together with the symbol tables, grows in proportion to the number and size of items in the compiled code. It is optional and can be removed (“stripped”) without change in functionality. However, it is also fairly large and has grown recently in absolute size: more than 50% of total executable size in the latest versions. Moreover, Go application deployments in the wild do not often strip this data out.

So in conclusion, if we combine the Go internal bytes (both “dark” and non-“dark”) with the optional symbol tables and debugging information, we see a growing proportion of executable files encoded as non-code, non-data, currently at 70%, for the CockroachDB executable.

Raphael ‘kena’ Poss is a computer scientist and software engineer specialized in compiler construction, computer architecture, operating systems and databases.