A while ago, we wrote a post on how we helped a client initially integrate the Testwell CTC++ code coverage tool from Verifysoft into their Bazel build.
Since then, some circumstances have changed, and we were recently challenged to see if we could improve the CTC++/Bazel integration to the point where CTC++ coverage builds could enjoy the same benefits of Bazel caching and incremental rebuilds as regular (non-coverage) builds. Our objective was to make it feasible for developers to do coverage builds with CTC++ locally, rather than having them use different coverage tools or delay coverage testing altogether. Thus we could enable the client to focus their efforts on improving overall test coverage with CTC++ as their only coverage tool.
In this sequel to the initial integration, we, as a team, have come up with a more involved scheme for making CTC++ meet Bazel's expectations of hermetic and reproducible build actions. There is considerable extra complexity needed to make this work, but the result is a typical speedup of 5-10 times on most coverage builds. The kind of speedup that not only makes your CI faster, but allows developers to work in a different and more efficient way altogether.
More generally, we hope this blog post can serve as a good example (or maybe a cautionary tale 😉) of how to take a tool that does not play well with Bazel’s idea of a well-behaved build step, and force it into a shape where we can still leverage Bazel’s strengths.
The status quo
You can read our previous blog post for more details, but here we’ll quickly summarize the relevant bits of the situation after our initial integration of CTC++ coverage builds with Bazel:
- CTC++ works by wrapping the compiler invocation with its `ctc` tool, and adding coverage instrumentation between the preprocessing and compiling steps.
- In addition to instrumenting the source code itself, `ctc` also writes instrumentation data in a custom text format (aka. symbol data) to a separate output file, typically called `MON.sym` (aka. the symbol file).
- At runtime, the instrumented unit tests will collect coverage statistics and write these (in binary form) to another separate output file: `MON.dat`.
- As far as Bazel is concerned, both the `MON.sym` and `MON.dat` files are untracked side effects of the respective compilation and testing steps. As such, we had to poke a hole in the Bazel sandbox and arrange for these files to be written to a persistent location without otherwise being tracked or managed by Bazel.
- More importantly, these side effects mean that we have to disable all caching and re-run the entire build and all tests from scratch every single time. Otherwise, we would end up with incomplete `MON.sym` and `MON.dat` files.
Another consideration - not emphasized in our previous post since we had to disable caching of intermediate outputs in any case - is that the outputs from `ctc` are not hermetic and reproducible. Both the instrumentation that is added to the source code, as well as the symbol file that is written separately by `ctc`, contain the following information that is collected at compile time:

- Absolute paths to source code files: Even though Bazel passes relative paths on the command line, `ctc` will still resolve these into absolute paths and record them in its outputs. Since all these build steps run inside the Bazel sandbox, the recorded paths vary arbitrarily from build to build. Even worse: the paths become invalid as soon as the sandbox is removed, when the compilation step is done.
- Timestamps: `ctc` will also record timestamps in the instrumented source code and the symbol file. As far as we know, these might have been part of some internal consistency check in previous versions of CTC++, but currently they are simply copied into the final report, and displayed as a property of the associated symbol data on which the HTML report is based. Since our coverage reports are already tied to known Git commits in the code base, these timestamps have no additional value for us.
- Fingerprints: `ctc` calculates a 32-bit fingerprint based on the symbol data, and records this fingerprint in both the instrumented source and the symbol file. Since the symbol data already contains absolute path names as detailed above, the resulting fingerprint will also vary accordingly, and thus not be reproducible from one build to the next, even when all other inputs remain unchanged.
Outlining the problems to be solved
If we are to make CTC++ coverage builds quicker by leveraging the Bazel cache, we must answer these two questions:
- Can we make `ctc`'s outputs reproducible? Without this, re-enabling the Bazel cache for these builds is a non-starter, as each re-evaluation of an intermediate build step will have never-before-seen action inputs, and none of the cached outputs from previous builds will ever get reused.
- Can we somehow capture the extra `MON.sym` output written by `ctc` at build time, and appropriately include it in Bazel's build graph?1 We need Bazel to cache and reuse the symbol data associated with a compilation unit in exactly the same way that it would cache and reuse the object file associated with the same compilation unit.
Solving both of these would allow us to achieve a correct coverage report assembled from cached object files and symbol data from previously-built and unchanged source code, together with newly-built object files and symbol data from recently-changed source code (in addition to the coverage statistics collected from re-running all tests).
Achieving reproducibility
Let’s tackle the problem of making `ctc`’s outputs reproducible first. We start by observing that `ctc` allows us to configure hook scripts that will be invoked at various points while `ctc` is running. We are specifically interested in:

- `RUN_AFTER_CPP`, which allows access to the preprocessed source before the instrumentation step, and
- `RUN_AFTER_INSTR`, which allows access to the instrumented source before it’s passed on to the underlying compiler.
From our existing work, we of course also have our own wrapper script around `ctc`, which allows us to access the outputs of each `ctc` invocation before they are handed back to Bazel. We also know, from our previous work, that we can instruct `ctc` to write a separate symbol file per compilation unit, rather than have all compilation units append to the same `MON.sym` file.

Together, this allows us to rewrite the outputs from `ctc` in such a way as to make them reproducible. What we want to rewrite has already been outlined above:
- Absolute paths into the sandbox: We could rewrite these into corresponding absolute paths to the original source tree instead, but we can just as well take it one step further and simply strip the sandbox root directory prefix from all absolute paths. This turns them into relative paths that happen to resolve correctly, whether they’re taken relative to the sandbox directory at compile time, or relative to the root of the source tree afterwards.
- Timestamps: This one is relatively easy: we just need to decide on a static timestamp that does not change across builds. For some reason the CTC++ report tooling did not like us passing the ultimate default timestamp, aka. the Unix Epoch, so we instead settled for midnight on January 1 2024.2
- Fingerprints: Here we need to calculate a 32-bit value that reflects the complete source code in this compilation unit (but, importantly, with transient sandbox paths excluded). We don’t have direct access to the in-progress symbol data that `ctc` uses to calculate its own fingerprint, so instead we settle on calculating a CRC32 checksum across the entire preprocessed source code (before `ctc` adds its own instrumentation).3
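As a concrete illustration, the fingerprint step can be sketched with the POSIX `cksum` tool, which computes a 32-bit CRC. This is a hypothetical stand-in for our actual hook script: the file name and contents are invented, and the exact CRC32 variant does not matter for reproducibility, only that the same input always yields the same value.

```shell
# Hypothetical sketch: compute a 32-bit fingerprint over the preprocessed
# source, before any instrumentation is added. File name and contents are
# illustrative, not CTC++'s own.
preprocessed=$(mktemp)
printf 'int main(void) { return 0; }\n' > "$preprocessed"
# POSIX cksum prints "<crc> <byte count> <file>"; keep just the CRC.
fingerprint=$(cksum "$preprocessed" | awk '{ print $1 }')
rm -f "$preprocessed"
echo "fingerprint=$fingerprint"
```

Since the input here is fixed, the computed fingerprint is identical on every run, which is exactly the property we need from the real hook.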
Once we’ve figured out what to rewrite, we can move on to the how:
- Using the `RUN_AFTER_CPP` option to `ctc`, we can pass in a small script that calculates our new fingerprint by running the preprocessed source code through CRC32.
- Using the `RUN_AFTER_INSTR` option to `ctc`, we can pass in a script that processes the instrumented source, line by line:
  - rewriting any absolute paths that point into the Bazel sandbox,
  - rewriting the timestamp recorded by `ctc` into our static timestamp, and
  - rewriting the fingerprint to the one calculated in step 1.
- In our script that wraps the `ctc` invocation, we can insert the above two options on the `ctc` command line. We can also instruct `ctc` to write a separate `.sym` file for this compilation unit inside the sandbox.
- In the same wrapper script, after `ctc` is done producing the object file and symbol file for a compilation unit, we can now rewrite the symbol file that `ctc` produced. The rewrites are essentially the same as those performed in step 2, although the syntax of the symbol file differs from that of the instrumented source.
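To make the rewrites concrete, here is a minimal sketch applied with `sed` to a mock "instrumented" file. The sandbox path, the line format, and the fingerprint value are all invented for illustration; CTC++'s actual instrumented output has its own syntax. The static timestamp 1704067200 is midnight, January 1 2024 UTC.

```shell
# Minimal sketch of the three rewrites, applied to a mock instrumented file.
# SANDBOX_ROOT, FINGERPRINT, and the "key: value" line format are invented;
# the real ctc output looks different.
SANDBOX_ROOT="/tmp/bazel-sandbox/1234/execroot"
FINGERPRINT="3735928559"   # stand-in for the CRC32 computed in step 1
instrumented=$(mktemp)
printf '%s\n' \
  "source: $SANDBOX_ROOT/src/foo.cc" \
  "timestamp: 1718000000" \
  "fingerprint: 1111111111" > "$instrumented"
sed -i \
  -e "s|$SANDBOX_ROOT/||g" \
  -e "s|^timestamp: .*|timestamp: 1704067200|" \
  -e "s|^fingerprint: .*|fingerprint: $FINGERPRINT|" \
  "$instrumented"
cat "$instrumented"
```

After the rewrite, the path is relative (resolving correctly both inside and outside the sandbox), and the timestamp and fingerprint are stable across builds.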
At this point, we have managed to make `ctc`’s outputs reproducible, and we can move on to the second problem from above: properly capturing and maintaining the symbol data generated by `ctc`. However, we have changed the nature of the symbol data somewhat: instead of having multiple compilation units write to the same `MON.sym` file outside of the sandbox, we now have one `.sym` file per compilation unit, written inside the sandbox. These files are not yet known to Bazel, and would be removed together with the rest of the sandbox as soon as the compilation step is finished.
Enabling correct cache/reuse of symbol data
What we want to achieve here is for the symbol data associated with a compilation unit to closely accompany the corresponding object file from the same compilation unit: If the object file is cached and later reused by Bazel, we want the symbol file to be treated the same. And when the object file is linked into an executable or a shared library, we want the symbol file to automatically become part of any coverage report that is later created based on running code from that executable or library.
I suspect there are other ways we could handle this, for example using Bazel aspects, or similar, but since we’re already knee-deep in compiler wrappers and rewriting outputs…
In for a penny, in for a pound…
Given that we want the symbol file to be as closely associated with the object file as possible, let’s take that to the ultimate conclusion and make it a stowaway inside the object file. After all, the object file is “just” an ELF file, and it does not take too much squinting to regard the ELF format as a generic container of sections, where a section really can be any piece of data you like.
The `objcopy` tool, part of the GNU binutils tool suite, also comes to our aid with options like `--add-section` and `--dump-section` to help us embed and extract such sections from any ELF file.
With this in hand, we can design the following scheme:
- In our wrapper script, after `ctc` has generated an object file with an accompanying symbol file, we run `objcopy --add-section ctc_sym=$SYMBOL_FILE $OBJECT_FILE` to embed the symbol file as a new `ctc_sym` section inside the object file.
- We make no other changes to our Bazel build. We merely expect Bazel to collect, cache, and reuse the object files as it would with any intermediate build output. The symbol data is just along for the ride.
- In the linking phase (which is already intercepted by `ctc` and our wrapper script) we can forward the symbol data from the linker inputs (ELF object files) into the linker output (a shared library or executable, also in the ELF format), like this: extract the `ctc_sym` section from each object file passed as input (`objcopy --dump-section ctc_sym=$SYMBOL_FILE $OBJECT_FILE /dev/null`), then concatenate these symbol files together, and finally embed the result into the ELF output file from the linker.4
- At test run time, in addition to running the tests (which together produce `MON.dat` as a side effect), we can iterate over the test executables and their shared library dependencies, and extract any `ctc_sym` sections that we come across. These are then split into separate symbol files and placed next to `MON.dat`.
- Finally, we can pass `MON.dat` and all the `.sym` files on to the `ctcreport` report generator to generate the final HTML report.5
Results
With all of the above in place, we can run coverage builds with and without our changes, while testing various build scenarios, to see what we have achieved.
Let’s look at some sample build times for generating CTC++ coverage reports. All times below are taken from the best of three runs, all on the same machine.
Status quo
Starting with the situation as of our previous blog post:
| Scope of coverage build + tests | `bazel build/test` | `ctcreport` | Total |
| --- | --- | --- | --- |
| Entire source tree | 38m46s | 2m06s | 44m26s |
| One large application | 13m59s | 43s | 15m30s |
| One small application | 21s | 1s | 35s |
Since caching is intentionally disabled and there is no reuse between these coverage builds, these are the kinds of numbers you will get, no matter the size of your changes since the last coverage build.
Let’s look at the situation after we made the changes outlined above.
Worst case after our changes: No cache to be reused
First, for a new coverage build from scratch (i.e. a situation in which there is nothing that can be reused from the cache):
| Scope of coverage build + tests | `bazel build/test` | `ctcreport` | Total | Speedup |
| --- | --- | --- | --- | --- |
| Entire source tree | 38m48s | 1m59s | 43m03s | 1.0x |
| One large application | 13m04s | 43s | 14m26s | 1.1x |
| One small application | 19s | 1s | 22s | 1.6x |
As expected, these numbers are very similar to the status quo. After all, we are doing the same amount of work, and this is not the scenario we sought to improve in any case.
There is maybe a marginal improvement in the overhead (i.e. the time spent between/around `bazel` and `ctcreport`), but it’s pretty much lost in the noise, and certainly nothing worth writing a blog post about.
Best case after our changes: Rebuild with no changes
This is the situation where we are now able to reuse already-instrumented intermediate build outputs. In fact, in this case there are no changes whatsoever, and Bazel can reuse the test executables from the previous build directly, no (re-)building necessary. However, as discussed above, we do need to re-run all tests and then re-generate the coverage report:
| Scope of coverage build + tests | `bazel build/test` | `ctcreport` | Total | Speedup |
| --- | --- | --- | --- | --- |
| Entire source tree | 3m24s | 1m58s | 6m55s | 6.4x |
| One large application | 1m31s | 42s | 2m49s | 5.5x |
| One small application | 1s | 1s | 4s | 8.8x |
Common case after our changes: Rebuild with limited change set
This last table is in many ways the most interesting (but least accurate), as it tries to reflect the common case that most developers are interested in:
“I’ve made a few changes to the source code, how long will I have to wait to see the updated coverage numbers?”
Of course, as with a regular build, it depends on the size of your changes, and the extent to which they cause misses in Bazel’s build cache. Here, I’ve made some small source code changes that cause rebuilds in a handful of compilation units:
| Scope of coverage build + tests | `bazel build/test` | `ctcreport` | Total | Speedup |
| --- | --- | --- | --- | --- |
| Entire source tree | 3m23s | 1m57s | 6m54s | 6.4x |
| One large application | 1m34s | 42s | 2m52s | 5.4x |
| One small application | 4s | 1s | 6s | 5.8x |
The expectation here would be that the total time needed is the sum of how long it takes to do a regular build of your changes, plus the numbers from the no-op case above. And this seems to largely hold true, especially for the single-application case, where we expect your changes to affect the application’s unit tests, and therefore the build phase must strictly precede the test runs.

In the full source tree scenario, it seems that Bazel can start running other (unrelated) tests concurrently with building your changes; as long as your changes, and the tests that depend on them, are not among the slowest tests to run, those other, slower tests will “hide” the marginal build time cost imposed by your changes.
Conclusion
We have achieved what we set out to do: to leverage the Bazel cache to avoid unnecessary re-building of coverage-instrumented source code. It involves a fair amount of added complexity in the build process, in order to make CTC++’s outputs reproducible, and thus reusable by Bazel, but the end result, in the common case - a developer making a small source code change relative to a previous coverage build - is a 5-10x speedup of the total time needed to build and test with coverage instrumentation, including the generation of the final coverage report.
Future work
A natural extension of the above scheme is to apply a similar treatment to the generation of the coverage statistics at test runtime: Bazel allows for test runs to be cached, so that later build/test runs can reuse the results and logs from earlier test runs, rather than having to re-run tests that haven’t changed.
However, in much the same way as for symbol data at build time, we would need to make sure that coverage statistics (`.dat` files) were saved and reused along with the corresponding test run results/logs.
One could imagine each test creating a separate `.dat` file when run, and then having Bazel cache this together with the test logs. The report generation phase would then need to collect the `.dat` files from both the reused/cached and the new/uncached test runs, and pass them all to the `ctcreport` tool.
Failure to do so correctly would cause coverage statistics to be lost, and the
resulting coverage report would be misleading.
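Under the assumption that each test run (cached or fresh) leaves one `.dat` file in its own log directory, the collection step might look like the sketch below. The directory layout and file names are invented, and the actual `ctcreport` invocation is deliberately omitted.

```shell
# Hypothetical collection step: gather per-test .dat files from both cached
# and fresh test runs into one place for the report phase. Layout and names
# are invented; the final ctcreport invocation is omitted.
testlogs=$(mktemp -d)
report_inputs=$(mktemp -d)
mkdir -p "$testlogs/app_test" "$testlogs/lib_test"
printf 'stats A\n' > "$testlogs/app_test/coverage.dat"
printf 'stats B\n' > "$testlogs/lib_test/coverage.dat"
# Copy each .dat under a unique name, keyed by its test's directory, so no
# run's statistics are silently overwritten.
find "$testlogs" -name '*.dat' | while read -r dat; do
  name=$(basename "$(dirname "$dat")")
  cp "$dat" "$report_inputs/$name.dat"
done
ls "$report_inputs"
```

The key property is completeness: every test's statistics must reach the report phase, whether the run was cached or not.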
With all this in place, we could then enable caching of test results (in practice, removing the `--nocache_test_results` flag that we currently pass), and enjoy yet another speedup courtesy of Bazel’s cache.
That said, we are entering the realm of diminishing returns: unit tests - once they are built - typically run quickly, and there is certainly less time to be saved here than what is saved by reusing cached build results. Looking at the above numbers: even if we were able to fully eliminate the time used by `bazel test`, we would still only achieve another 2x speedup, theoretically.
For now, we can live with re-running all tests from scratch in order to create a complete `MON.dat` file, every time.
And that is where I believe it stops: extending this even further to incrementally generate the coverage report itself, in effect to re-generate parts of the report based on a few changed inputs, is - as far as I can see - not possible with the existing tools.
Finally, I want to commend Verifysoft for their understanding and cooperation. I can only imagine that for someone not used to working with Bazel, our initial questions must have seemed very eccentric. They were, however, eager to understand our situation and find a way to make CTC++ work for us. They have even hinted at including a feature in a future version of CTC++ to allow shortening/mapping paths at instrumentation time. Using such a feature to remove the sandbox paths would also have the nice side effect of making CTC++’s own fingerprint logic reproducible, as far as we can see. Together, this would enable us to stop rewriting paths and fingerprints on our own.
Thanks to Mark Karpov for being my main co-conspirator in coming up with this scheme, and helping to work out all the side quests and kinks along the way.
Also thanks to Christopher Harrison, Joseph Neeman, and Malte Poll for their reviews of this article.
1. For now, we ignore the non-hermetic writing of `MON.dat` files. See the section on future work for how tackling this properly is in many ways similar (and similarly complex) to what we’re doing for the CTC++ symbol data in the rest of this article.↩
2. On reconsideration, we should probably have used the somewhat standardized `$SOURCE_DATE_EPOCH` environment variable here rather than coming up with our own static date. In practice, it should not matter.↩
3. In later talks with Verifysoft, we have been given the OK that this fingerprint scheme should be sufficient for our purpose, at least until a new version of CTC++ that allows for more reproducible fingerprints is available.↩
4. It seems that - by default - the linker does almost exactly what we want: the `ctc_sym` sections from the linker inputs are indeed automatically concatenated into the linker output. However, the linker appears to discard sections from inputs that are completely optimized away at link time. But we do in fact want these symbol data sections to be retained; otherwise the final coverage report would omit the corresponding source files rather than showing them as lacking test coverage. Hence we resort to maintaining the `ctc_sym` section ourselves at link time.↩
5. As an extra sanity check, `ctcreport` will verify that the fingerprints inside the given `.sym` files match the corresponding fingerprints recorded alongside the coverage statistics in the `MON.dat` file. Thus we can discover if we’ve messed up somewhere along the way.↩
About the author
Johan is a Developer Productivity Engineer at Tweag. Originally from Western Norway, he is currently based in Delft, NL, and enjoys this opportunity to discover the Netherlands and the rest of continental Europe. Johan has almost twenty years of industry experience, mostly working with Linux and open source software within the embedded realm. He has a passion for designing and implementing elegant and useful solutions to challenging problems, and is always looking for underlying root causes to the problems that face software developers today. Outside of work, he enjoys playing jazz piano and cycling.
If you enjoyed this article, you might be interested in joining the Tweag team.