A while ago, we wrote a post on how we helped a client initially integrate the Testwell CTC++ code coverage tool from Verifysoft into their Bazel build.
Since then, some circumstances have changed, and we were recently challenged to see if we could improve the CTC++/Bazel integration to the point where CTC++ coverage builds could enjoy the same benefits of Bazel caching and incremental rebuilds as regular (non-coverage) builds. Our objective was to make it feasible for developers to do coverage builds with CTC++ locally, rather than having them use different coverage tools or delay coverage testing altogether. Thus we could enable the client to focus their efforts on improving overall test coverage with CTC++ as their only coverage tool.
In this sequel to the initial integration, we, as a team, have come up with a more involved scheme for making CTC++ meet Bazel's expectations of hermetic and reproducible build actions. There is considerable extra complexity needed to make this work, but the result is a typical speedup of 5-10 times on most coverage builds. The kind of speedup that not only makes your CI faster, but allows developers to work in a different and more efficient way altogether.
More generally, we hope this blog post can serve as a good example (or maybe a cautionary tale 😉) of how to take a tool that does not play well with Bazel’s idea of a well-behaved build step, and force it into a shape where we can still leverage Bazel’s strengths.
The status quo
You can read our previous blog post for more details, but here we’ll quickly summarize the relevant bits of the situation after our initial integration of CTC++ coverage builds with Bazel:
- CTC++ works by wrapping the compiler invocation with its `ctc` tool, and adding coverage instrumentation between the preprocessing and compiling steps.
- In addition to instrumenting the source code itself, `ctc` also writes instrumentation data in a custom text format (aka. symbol data) to a separate output file, typically called `MON.sym` (aka. the symbol file).
- At runtime, the instrumented unit tests will collect coverage statistics and write these (in binary form) to another separate output file: `MON.dat`.
- As far as Bazel is concerned, both the `MON.sym` and `MON.dat` files are untracked side effects of the respective compilation and testing steps. As such, we had to poke a hole in the Bazel sandbox and arrange for these files to be written to a persistent location without otherwise being tracked or managed by Bazel.
- More importantly, these side effects mean that we have to disable all caching and re-run the entire build and all tests from scratch every single time. Otherwise, we would end up with incomplete `MON.sym` and `MON.dat` files.
Another consideration - not emphasized in our previous post since we had to disable caching of intermediate outputs in any case - is that the outputs from `ctc` are not hermetic and reproducible. Both the instrumentation that is added to the source code, as well as the symbol file that is written separately by `ctc`, contain the following information that is collected at compile time:

- Absolute paths to source code files: Even though Bazel passes relative paths on the command line, `ctc` will still resolve these into absolute paths and record them in its outputs. Since all these build steps run inside the Bazel sandbox, the recorded paths vary arbitrarily from build to build. Even worse: the paths become invalid as soon as the sandbox is removed, when the compilation step is done.
- Timestamps: `ctc` will also record timestamps in the instrumented source code and the symbol file. As far as we know, these might have been part of some internal consistency check in previous versions of CTC++, but currently they are simply copied into the final report, and displayed as a property of the associated symbol data on which the HTML report is based. Since our coverage reports are already tied to known Git commits in the code base, these timestamps have no additional value for us.
- Fingerprints: `ctc` calculates a 32-bit fingerprint based on the symbol data, and records this fingerprint in both the instrumented source and the symbol file. Since the symbol data already contains absolute path names as detailed above, the resulting fingerprint will also vary accordingly, and thus not be reproducible from one build to the next, even when all other inputs remain unchanged.
Outlining the problems to be solved
If we are to make CTC++ coverage builds quicker by leveraging the Bazel cache, we must answer these two questions:
- Can we make `ctc`'s outputs reproducible? Without this, re-enabling the Bazel cache for these builds is a non-starter, as each re-evaluation of an intermediate build step will have never-before-seen action inputs, and none of the cached outputs from previous builds will ever get reused.
- Can we somehow capture the extra `MON.sym` output written by `ctc` at build time, and appropriately include it in Bazel's build graph?1 We need Bazel to cache and reuse the symbol data associated with a compilation unit in exactly the same way that it would cache and reuse the object file associated with the same compilation unit.
Solving both of these would allow us to achieve a correct coverage report assembled from cached object files and symbol data from previously-built and unchanged source code, together with newly-built object files and symbol data from recently-changed source code (in addition to the coverage statistics collected from re-running all tests).
Achieving reproducibility
Let’s tackle the problem of making `ctc`’s outputs reproducible first. We start by observing that `ctc` allows us to configure hook scripts that will be invoked at various points while `ctc` is running. We are specifically interested in:

- `RUN_AFTER_CPP`, which allows access to the preprocessed source before the instrumentation step, and
- `RUN_AFTER_INSTR`, which allows access to the instrumented source before it’s passed on to the underlying compiler.
From our existing work, we of course also have our own wrapper script around `ctc`, which allows us to access the outputs of each `ctc` invocation before they are handed back to Bazel. We also know, from our previous work, that we can instruct `ctc` to write a separate symbol file per compilation unit, rather than have all compilation units append to the same `MON.sym` file.

Together, this allows us to rewrite the outputs from `ctc` in such a way as to make them reproducible. What we want to rewrite has already been outlined above:
- Absolute paths into the sandbox: We could rewrite these into corresponding absolute paths to the original source tree instead, but we can just as well take it one step further and simply strip the sandbox root directory prefix from all absolute paths. This turns them into relative paths that happen to resolve correctly, whether they’re taken relative to the sandbox directory at compile time, or relative to the root of the source tree afterwards.
- Timestamps: This one is relatively easy: we just need to decide on a static timestamp that does not change across builds. For some reason the CTC++ report tooling did not like us passing the ultimate default timestamp, aka. the Unix Epoch, so we instead settled for midnight on January 1 2024.2
- Fingerprints: Here we need to calculate a 32-bit value that reflects the complete source code in this compilation unit (but, importantly, with transient sandbox paths excluded). We don’t have direct access to the in-progress symbol data that `ctc` uses to calculate its own fingerprint, so instead we settle on calculating a CRC32 checksum across the entire preprocessed source code (before `ctc` adds its own instrumentation).3
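As a concrete illustration, the fingerprint step can be sketched with the POSIX `cksum` tool, which computes a 32-bit CRC. This is a hypothetical stand-in for our actual hook script: the file name and contents are invented, and the exact CRC32 variant does not matter for reproducibility, only that the same input always yields the same value.

```shell
# Hypothetical sketch: compute a 32-bit fingerprint over the preprocessed
# source, before any instrumentation is added. File name and contents are
# illustrative, not CTC++'s own.
preprocessed=$(mktemp)
printf 'int main(void) { return 0; }\n' > "$preprocessed"
# POSIX cksum prints "<crc> <byte count> <file>"; keep just the CRC.
fingerprint=$(cksum "$preprocessed" | awk '{ print $1 }')
rm -f "$preprocessed"
echo "fingerprint=$fingerprint"
```

Since the input here is fixed, the computed fingerprint is identical on every run, which is exactly the property we need from the real hook.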
Once we’ve figured out what to rewrite, we can move on to the how:
- Using the `RUN_AFTER_CPP` option to `ctc`, we can pass in a small script that calculates our new fingerprint by running the preprocessed source code through CRC32.
- Using the `RUN_AFTER_INSTR` option to `ctc`, we can pass in a script that processes the instrumented source, line by line:
  - rewriting any absolute paths that point into the Bazel sandbox,
  - rewriting the timestamp recorded by `ctc` into our static timestamp, and
  - rewriting the fingerprint to the one calculated in step 1.
- In our script that wraps the `ctc` invocation, we can insert the above two options on the `ctc` command line. We can also instruct `ctc` to write a separate `.sym` file for this compilation unit inside the sandbox.
- In the same wrapper script, after `ctc` is done producing the object file and symbol file for a compilation unit, we can now rewrite the symbol file that `ctc` produced. The rewrites are essentially the same as those performed in step 2, although the syntax of the symbol file differs from that of the instrumented source.
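To make the rewrites concrete, here is a minimal sketch applied with `sed` to a mock "instrumented" file. The sandbox path, the line format, and the fingerprint value are all invented for illustration; CTC++'s actual instrumented output has its own syntax. The static timestamp 1704067200 is midnight, January 1 2024 UTC.

```shell
# Minimal sketch of the three rewrites, applied to a mock instrumented file.
# SANDBOX_ROOT, FINGERPRINT, and the "key: value" line format are invented;
# the real ctc output looks different.
SANDBOX_ROOT="/tmp/bazel-sandbox/1234/execroot"
FINGERPRINT="3735928559"   # stand-in for the CRC32 computed in step 1
instrumented=$(mktemp)
printf '%s\n' \
  "source: $SANDBOX_ROOT/src/foo.cc" \
  "timestamp: 1718000000" \
  "fingerprint: 1111111111" > "$instrumented"
sed -i \
  -e "s|$SANDBOX_ROOT/||g" \
  -e "s|^timestamp: .*|timestamp: 1704067200|" \
  -e "s|^fingerprint: .*|fingerprint: $FINGERPRINT|" \
  "$instrumented"
cat "$instrumented"
```

After the rewrite, the path is relative (resolving correctly both inside and outside the sandbox), and the timestamp and fingerprint are stable across builds.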
At this point, we have managed to make `ctc`’s outputs reproducible, and we can move on to the second problem from above: properly capturing and maintaining the symbol data generated by `ctc`. However, we have changed the nature of the symbol data somewhat: instead of having multiple compilation units write to the same `MON.sym` file outside of the sandbox, we now have one `.sym` file per compilation unit, written inside the sandbox. These files are not yet known to Bazel, and would be removed together with the rest of the sandbox as soon as the compilation step is finished.
Enabling correct cache/reuse of symbol data
What we want to achieve here is for the symbol data associated with a compilation unit to closely accompany the corresponding object file from the same compilation unit: If the object file is cached and later reused by Bazel, we want the symbol file to be treated the same. And when the object file is linked into an executable or a shared library, we want the symbol file to automatically become part of any coverage report that is later created based on running code from that executable or library.
I suspect there are other ways we could handle this, for example using Bazel aspects, or similar, but since we’re already knee-deep in compiler wrappers and rewriting outputs…
In for a penny, in for a pound…
Given that we want the symbol file to be as closely associated with the object file as possible, let’s take that to the ultimate conclusion and make it a stowaway inside the object file. After all, the object file is “just” an ELF file, and it does not take too much squinting to regard the ELF format as a generic container of sections, where a section really can be any piece of data you like.
The `objcopy` tool, part of the GNU binutils tool suite, also comes to our aid with options like `--add-section` and `--dump-section` to help us embed and extract such sections from any ELF file.
With this in hand, we can design the following scheme:
- In our wrapper script, after `ctc` has generated an object file with an accompanying symbol file, we run `objcopy --add-section ctc_sym=$SYMBOL_FILE $OBJECT_FILE` to embed the symbol file as a new `ctc_sym` section inside the object file.
- We make no other changes to our Bazel build. We merely expect Bazel to collect, cache, and reuse the object files as it would with any intermediate build output. The symbol data is just along for the ride.
- In the linking phase (which is already intercepted by `ctc` and our wrapper script) we can forward the symbol data from the linker inputs (ELF object files) into the linker output (a shared library or executable, also in the ELF format), like this: extract the `ctc_sym` section from each object file passed as input (`objcopy --dump-section ctc_sym=$SYMBOL_FILE $OBJECT_FILE /dev/null`), then concatenate these symbol files together, and finally embed the result into the ELF output file from the linker.4
- At test run time, in addition to running the tests (which together produce `MON.dat` as a side effect), we can iterate over the test executables and their shared library dependencies, and extract any `ctc_sym` sections that we come across. These are then split into separate symbol files and placed next to `MON.dat`.
- Finally, we can pass `MON.dat` and all the `.sym` files on to the `ctcreport` report generator to generate the final HTML report.5
Results
With all of the above in place, we can run coverage builds with and without our changes, while testing various build scenarios, to see what we have achieved.
Let’s look at some sample build times for generating CTC++ coverage reports. All times below are taken from the best of three runs, all on the same machine.
Status quo
Starting with the situation as of our previous blog post:
| Scope of coverage build + tests | `bazel build/test` | `ctcreport` | Total |
| --- | --- | --- | --- |
| Entire source tree | 38m46s | 2m06s | 44m26s |
| One large application | 13m59s | 43s | 15m30s |
| One small application | 21s | 1s | 35s |
Since caching is intentionally disabled and there is no reuse between these coverage builds, these are the kinds of numbers you will get, no matter the size of your changes since the last coverage build.
Let’s look at the situation after we made the changes outlined above.
Worst case after our changes: No cache to be reused
First, for a new coverage build from scratch (i.e. a situation in which there is nothing that can be reused from the cache):
| Scope of coverage build + tests | `bazel build/test` | `ctcreport` | Total | Speedup |
| --- | --- | --- | --- | --- |
| Entire source tree | 38m48s | 1m59s | 43m03s | 1.0x |
| One large application | 13m04s | 43s | 14m26s | 1.1x |
| One small application | 19s | 1s | 22s | 1.6x |
As expected, these numbers are very similar to the status quo. After all, we are doing the same amount of work, and this is not the scenario we sought to improve in any case.
There is maybe a marginal improvement in the overhead (i.e. the time spent between/around `bazel` and `ctcreport`), but it’s pretty much lost in the noise, and certainly nothing worth writing a blog post about.
Best case after our changes: Rebuild with no changes
This is the situation where we are now able to reuse already-instrumented intermediate build outputs. In fact, in this case there are no changes whatsoever, and Bazel can reuse the test executables from the previous build directly, no (re-)building necessary. However, as discussed above, we do need to re-run all tests and then re-generate the coverage report:
| Scope of coverage build + tests | `bazel build/test` | `ctcreport` | Total | Speedup |
| --- | --- | --- | --- | --- |
| Entire source tree | 3m24s | 1m58s | 6m55s | 6.4x |
| One large application | 1m31s | 42s | 2m49s | 5.5x |
| One small application | 1s | 1s | 4s | 8.8x |
Common case after our changes: Rebuild with limited change set
This last table is in many ways the most interesting (but least accurate), as it tries to reflect the common case that most developers are interested in:
“I’ve made a few changes to the source code, how long will I have to wait to see the updated coverage numbers?”
Of course, as with a regular build, it depends on the size of your changes, and the extent to which they cause misses in Bazel’s build cache. Here, I’ve made some small source code changes that cause rebuilds in a handful of compilation units:
| Scope of coverage build + tests | `bazel build/test` | `ctcreport` | Total | Speedup |
| --- | --- | --- | --- | --- |
| Entire source tree | 3m23s | 1m57s | 6m54s | 6.4x |
| One large application | 1m34s | 42s | 2m52s | 5.4x |
| One small application | 4s | 1s | 6s | 5.8x |
The expectation here would be that the total time needed is the sum of how long it takes to do a regular build of your changes, plus the numbers from the no-op case above. And this seems to largely hold true, especially for the single-application case, where we expect your changes to affect the application’s unit tests, and therefore the build phase must strictly precede the test runs.

In the full source tree scenario, it seems that Bazel can start running other (unrelated) tests concurrently with building your changes; as long as your changes, and the tests that depend on them, are not among the slowest tests to run, those other, slower tests will “hide” the marginal build time cost imposed by your changes.
Conclusion
We have achieved what we set out to do: to leverage the Bazel cache to avoid unnecessary re-building of coverage-instrumented source code. It involves a fair amount of added complexity in the build process, in order to make CTC++’s outputs reproducible, and thus reusable by Bazel, but the end result, in the common case - a developer making a small source code change relative to a previous coverage build - is a 5-10x speedup of the total time needed to build and test with coverage instrumentation, including the generation of the final coverage report.
Future work
A natural extension of the above scheme is to apply a similar treatment to the generation of the coverage statistics at test runtime: Bazel allows for test runs to be cached, so that later build/test runs can reuse the results and logs from earlier test runs, rather than having to re-run tests that haven’t changed.
However, in much the same way as for symbol data at build time, we would need to make sure that coverage statistics (`.dat` files) were saved and reused along with the corresponding test run results/logs.
One could imagine each test creating a separate `.dat` file when run, and then having Bazel cache this together with the test logs. The report generation phase would then need to collect the `.dat` files from both the reused/cached and the new/uncached test runs, and pass them all to the `ctcreport` tool.
Failure to do so correctly would cause coverage statistics to be lost, and the
resulting coverage report would be misleading.
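Under the assumption that each test run (cached or fresh) leaves one `.dat` file in its own log directory, the collection step might look like the sketch below. The directory layout and file names are invented, and the actual `ctcreport` invocation is deliberately omitted.

```shell
# Hypothetical collection step: gather per-test .dat files from both cached
# and fresh test runs into one place for the report phase. Layout and names
# are invented; the final ctcreport invocation is omitted.
testlogs=$(mktemp -d)
report_inputs=$(mktemp -d)
mkdir -p "$testlogs/app_test" "$testlogs/lib_test"
printf 'stats A\n' > "$testlogs/app_test/coverage.dat"
printf 'stats B\n' > "$testlogs/lib_test/coverage.dat"
# Copy each .dat under a unique name, keyed by its test's directory, so no
# run's statistics are silently overwritten.
find "$testlogs" -name '*.dat' | while read -r dat; do
  name=$(basename "$(dirname "$dat")")
  cp "$dat" "$report_inputs/$name.dat"
done
ls "$report_inputs"
```

The key property is completeness: every test's statistics must reach the report phase, whether the run was cached or not.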
With all this in place, we could then enable caching of test results (in practice, removing the `--nocache_test_results` flag that we currently pass), and enjoy yet another speedup courtesy of Bazel’s cache.
That said, we are entering the realm of diminishing returns: unit tests - once they are built - typically run quickly, and there is certainly less time to be saved here than what is saved by reusing cached build results. Looking at the above numbers: even if we were able to fully eliminate the time used by `bazel test`, we would still only achieve another 2x speedup, theoretically.
For now, we can live with re-running all tests from scratch in order to create a complete `MON.dat` file, every time.
And that is where I believe it stops: extending this even further to incrementally generate the coverage report itself, in effect to re-generate parts of the report based on a few changed inputs, is - as far as I can see - not possible with the existing tools.
Finally, I want to commend Verifysoft for their understanding and cooperation. I can only imagine that for someone not used to working with Bazel, our initial questions must have seemed very eccentric. They were, however, eager to understand our situation and find a way to make CTC++ work for us. They have even hinted at including a feature in a future version of CTC++ to allow shortening/mapping paths at instrumentation time. Using such a feature to remove the sandbox paths would also have the nice side effect of making CTC++’s own fingerprint logic reproducible, as far as we can see. Together, this would enable us to stop rewriting paths and fingerprints on our own.
Thanks to Mark Karpov for being my main co-conspirator in coming up with this scheme, and helping to work out all the side quests and kinks along the way.
Also thanks to Christopher Harrison, Joseph Neeman, and Malte Poll for their reviews of this article.
1. For now, we ignore the non-hermetic writing of `MON.dat` files. See the section on future work for how tackling this properly is in many ways similar (and similarly complex) to what we’re doing for the CTC++ symbol data in the rest of this article.↩
2. On reconsideration, we should probably have used the somewhat standardized `$SOURCE_DATE_EPOCH` environment variable here rather than coming up with our own static date. In practice, it should not matter.↩
3. In later talks with Verifysoft, we have been given the OK that this fingerprint scheme should be sufficient for our purpose, at least until a new version of CTC++ that allows for more reproducible fingerprints is available.↩
4. It seems that - by default - the linker does almost exactly what we want: the `ctc_sym` sections from the linker inputs are indeed automatically concatenated into the linker output. However, the linker appears to discard sections from inputs that are completely optimized away at link time. But we do in fact want these symbol data sections to be retained; otherwise the final coverage report would omit the corresponding source files rather than showing them as lacking test coverage. Hence we resort to maintaining the `ctc_sym` section ourselves at link time.↩
5. As an extra sanity check, `ctcreport` will verify that the fingerprints inside the given `.sym` files match the corresponding fingerprints recorded alongside the coverage statistics in the `MON.dat` file. Thus we can discover if we’ve messed up somewhere along the way.↩
About the author
Johan is a Developer Productivity Engineer at Tweag. Originally from Western Norway, he is currently based in Delft, NL, and enjoys this opportunity to discover the Netherlands and the rest of continental Europe. Johan has almost twenty years of industry experience, mostly working with Linux and open source software within the embedded realm. He has a passion for designing and implementing elegant and useful solutions to challenging problems, and is always looking for underlying root causes to the problems that face software developers today. Outside of work, he enjoys playing jazz piano and cycling.
If you enjoyed this article, you might be interested in joining the Tweag team.