Tweag developed rules_nixpkgs to give Bazel users access to Nix’s reproducible builds and its extensive package registry. The ruleset has proven especially useful in projects that require complex dependency management and consistent build environments.
However, rules_nixpkgs is incompatible with remote execution. This is a major limitation, given that remote execution is possibly the main reason people switch to Bazel, and that rules_nixpkgs provides a great way to configure hermetic toolchains, which are an important ingredient for reliable remote execution. There is no trivial fix, as can be seen in the related, long-standing open issue. At Tweag we investigated a promising solution presented at Bazel eXchange 2022 (recording), but these ideas were never implemented in a public proof of concept.
In this post, we present our new remote execution infrastructure repo and walk you through the steps required to understand and reproduce how it achieves remote execution with rules_nixpkgs.
The remote execution limitation
When we use rules_nixpkgs, we instruct Bazel to use packages from nixpkgs rather than those from the host system. This means that when we build a C++ project, Bazel won’t use the gcc compiler typically found under /usr/bin, but will instead use the compiler specified by rules_nixpkgs and provided by Nix, typically stored under some /nix/store/<unique_hash>-gcc/bin directory.
Bazel distinguishes actions that import external dependencies from regular build actions. The former are always executed locally[1], while the latter can be distributed using remote execution. rules_nixpkgs falls into the former category and invokes Nix to download and install the required /nix/store/<unique_hash>-gcc path locally on your machine.
This works fine when we build locally. However, when we enable remote execution, rules_nixpkgs still installs dependencies locally, while the build happens on another machine that does not have those paths available, so it inevitably fails.
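The failure mode can be illustrated with a small shell sketch. The helper name tool_available is hypothetical, and the store path is the same placeholder used above rather than a real hash. The only point is that a path resolved on the client must also exist on whichever machine runs the action.

```shell
# Hypothetical helper: report whether a toolchain binary is present and
# executable on the machine where this snippet runs.
tool_available() {
  if [ -x "$1" ]; then
    echo "available"
  else
    echo "missing"
  fi
}

# On the client, after rules_nixpkgs has fetched the toolchain, a check like
# this would report "available" for the real store path. On an executor
# without /nix/store it reports "missing". The hash below is a placeholder.
tool_available "/nix/store/<unique_hash>-gcc/bin/gcc"
```

Running the same check on both the client and an executor is a quick way to see the mismatch that causes remote builds to fail.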
Initial setup with remote execution
For our proof of concept, we decided to use Buildbarn to provide the remote execution endpoint and infrastructure. Buildbarn provides Kubernetes manifests that we can use to deploy all the Buildbarn components needed for remote execution. We’ll use the examples from the bb-deployments repository to test our setup, and modify them to make use of rules_nixpkgs.
To replicate our implementation you’ll need a working Buildbarn infrastructure, which in this case would be a Kubernetes cluster. You can use our guide to set up a cluster on AWS.
Test remote execution without rules_nixpkgs
To make sure that everything is working as expected, we’ll use the @abseil-hello Bazel target, which is available in the Buildbarn deployments repo. This example does not use rules_nixpkgs yet. You can clone the bb-deployments repository if you want to follow along.
- Get the service endpoint of the Buildbarn executor service (frontend). If you’re deploying on a cloud provider, this will be a load balancer.
$ kubectl get services -n buildbarn
NAME        TYPE           CLUSTER-IP      EXTERNAL-IP                         PORT(S)                      AGE
browser     ClusterIP      172.20.22.171   <none>                              7984/TCP                     8d
frontend    LoadBalancer   172.20.126.97   xxxxx.us-east-1.elb.amazonaws.com   8980:31657/TCP               8d
scheduler   ClusterIP      172.20.83.110   <none>                              8982/TCP,8983/TCP,7982/TCP   8d
storage     ClusterIP      None            <none>                              8981/TCP                     8d
- Update .bazelrc to use the remote executor endpoint of our environment:
...
build:remote-exec --remote_executor=grpc://[endpoint-from-previous-step]
...
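If you prefer to script these two steps, the following is a minimal sketch. The helper names frontend_host and remote_executor_line are hypothetical. It assumes kubectl is configured for the cluster, that the load balancer reports a hostname (as on AWS; other providers may report an IP instead), and that the endpoint is reached on port 8980 as shown in the service listing.

```shell
# Hypothetical helper: look up the external host name of the "frontend"
# service in the "buildbarn" namespace. Requires access to the cluster.
frontend_host() {
  kubectl get service frontend -n buildbarn \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}

# Hypothetical helper: format the .bazelrc line for a given host and port.
remote_executor_line() {
  printf 'build:remote-exec --remote_executor=grpc://%s:%s\n' "$1" "$2"
}

# Example (requires cluster access); port 8980 is taken from the listing:
#   remote_executor_line "$(frontend_host)" 8980
```

Keeping the lookup and the formatting separate makes it easy to paste the generated line into .bazelrc by hand after checking it.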
Now we can try building the @abseil-hello target using the remote execution infrastructure. Note that we’ll be using a custom toolchain specific to the default executors created by Buildbarn.
bazel build --config=remote-ubuntu-22-04 @abseil-hello//:hello_main
Test remote execution with rules_nixpkgs
Once we have validated that our setup works, we can create a new target that uses rules_nixpkgs.
Update .bazelversion to use 6.4, which is a version supported by rules_nixpkgs (any other 6.x version should work as well).
Update the WORKSPACE file with the following:
http_archive(
    name = "io_tweag_rules_nixpkgs",
    strip_prefix = "rules_nixpkgs-244ae504d3f25534f6d3877ede4ee50e744a5234",
    urls = ["https://github.com/tweag/rules_nixpkgs/archive/244ae504d3f25534f6d3877ede4ee50e744a5234.tar.gz"],
)

load("@io_tweag_rules_nixpkgs//nixpkgs:repositories.bzl", "rules_nixpkgs_dependencies")
rules_nixpkgs_dependencies()

load("@io_tweag_rules_nixpkgs//nixpkgs:nixpkgs.bzl", "nixpkgs_git_repository", "nixpkgs_package", "nixpkgs_cc_configure")
load("@io_tweag_rules_nixpkgs//nixpkgs:toolchains/go.bzl", "nixpkgs_go_configure") # optional

nixpkgs_git_repository(
    name = "nixpkgs",
    revision = "23.11",
)

nixpkgs_cc_configure(
    repository = "@nixpkgs",
    name = "nixpkgs_config_cc",
    attribute_path = "clang",
)
This is the standard boilerplate to install rules_nixpkgs in our Bazel workspace. We’re also creating a reference to the nixpkgs repository, and a C++ toolchain using clang.
Next, we create a new cc_binary target in BUILD.bazel with a simple hello-world program.
$ cat BUILD.bazel
...
cc_binary(
    name = "hello-world",
    srcs = ["hello-world.cc"],
)

$ cat hello-world.cc
#include <iostream>

int main(int argc, char** argv) {
    std::cout << "Hello world!" << std::endl;
    return 0;
}
Now we need to update the custom Buildbarn toolchain used by the executors to reference @nixpkgs_config_cc. Update the file tools/remote-toolchains/BUILD.bazel and replace the instances of @remote_config_cc with @nixpkgs_config_cc.
We can now try building the application using the C++ toolchain we defined with rules_nixpkgs. We expect this to fail because the executors are not Nix-aware yet.
$ bazel build --config=remote-ubuntu-22-04 @abseil-hello//:hello_main
...
ERROR: /home/user/.cache/bazel/_bazel_user/5ce2ca33a49034ed7557e24d70204ce5/external/com_google_absl/absl/base/BUILD.bazel:324:11: Compiling absl/base/internal/throw_delegate.cc failed: (Exit 34): Remote Execution Failure:
Invalid Argument: Failed to run command: Failed to start process: fork/exec /nix/store/n37gxbg343hxin3wdryx092mz2dkafy8-clang-wrapper-16.0.6/bin/cc: no such file or directory
...
Because the executors don’t have /nix/store available, they cannot resolve the compiler path, which is generated locally on our machine when we invoke bazel build.
Now let’s see how we can solve this problem by configuring the executors to access a shared /nix/store via NFS.
NFS-based solution
Our solution involves a Nix server that bridges this gap. This server manages and synchronizes the Nix dependencies across the Bazel build environment.
Here’s how it works:
- During bazel build, the rules_nixpkgs repository rules will build and copy any Nix derivation to the remote Nix server.
- The Nix server will export the /nix/store directory tree to the executors via a read-only NFS share.
- When a build is triggered, all necessary dependencies are already available on the executors, allowing the build process to continue.
Implementation-wise, we’ll need to make the following changes to the Buildbarn infrastructure:
- A Nix server. This could be a VM with Nix installed that exports the /nix/store directory as a read-only NFS share over the private network. We’ll need SSH access to that server from the machine that invokes bazel build.
- Kubernetes executors with the exported NFS share mounted.
For a detailed setup guide and implementation specifics, refer to our infrastructure repository.
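As a rough illustration of the two pieces, here is a sketch using standard Linux NFS tooling. It is not the configuration from the infrastructure repository. The helper name nfs_export_line, the 10.0.0.0/16 network, and the nix-server.internal host name are placeholders chosen for this example.

```shell
# Hypothetical helper: format an /etc/exports entry that shares a directory
# read-only with the given client network.
nfs_export_line() {
  printf '%s %s(ro,sync,no_subtree_check)\n' "$1" "$2"
}

# On the Nix server (as root); 10.0.0.0/16 stands in for the private network:
#   nfs_export_line /nix/store 10.0.0.0/16 >> /etc/exports
#   exportfs -ra
#
# On a plain Linux client, the equivalent read-only mount would be:
#   mount -t nfs -o ro nix-server.internal:/nix/store /nix/store
# Kubernetes executors would instead declare an NFS volume in their pod spec.
nfs_export_line /nix/store 10.0.0.0/16
```

Exporting the store read-only keeps the executors from modifying paths that only the Nix server should manage.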
To instruct rules_nixpkgs to copy the Nix derivations to the server, we’ll need to create an entry in our SSH config (typically found under ~/.ssh/config) for the remote server and then set the environment variable BAZEL_NIX_REMOTE to the name of that entry.
# SSH Configuration
$ cat ~/.ssh/config
Host nix-server
    Hostname [public-ip]
    IdentityFile [ssh-private-key]
    Port [ssh-port]
    User [ssh-user]
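Before running a build, it can help to confirm that the variable is set and that the SSH entry resolves. The sketch below is one possible pre-flight check. The helper name nix_remote_preflight is hypothetical, and the commented connectivity checks assume OpenSSH on the client and Nix on the server's non-interactive PATH.

```shell
# Hypothetical helper: fail early if BAZEL_NIX_REMOTE is unset or empty.
nix_remote_preflight() {
  if [ -z "${BAZEL_NIX_REMOTE:-}" ]; then
    echo "BAZEL_NIX_REMOTE is not set" >&2
    return 1
  fi
  echo "using Nix remote: ${BAZEL_NIX_REMOTE}"
}

# Optional connectivity checks (require access to the server):
#   ssh -G "$BAZEL_NIX_REMOTE" | grep -E '^(hostname|port|user) '
#   ssh "$BAZEL_NIX_REMOTE" nix-store --version
```

The first optional command prints the settings SSH resolves for the entry, and the second confirms that Nix is reachable over that connection.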
Testing out remote execution again
With the new setup, we can try building the project again.
$ export BAZEL_NIX_REMOTE=nix-server
$ bazel clean --expunge # To refetch the Nix derivations
$ bazel build --config=remote-ubuntu-22-04 @abseil-hello//:hello_main
You should now see lines like the following, confirming communication with the Nix server:
...
Analyzing: target @abseil-hello//:hello_main (0 packages loaded, 0 targets configured)
Fetching repository @nixpkgs_config_cc_info; Remote-building Nix derivation 9s
...
And the build should be successful.
Conclusion
In this post, we explored the challenges of integrating rules_nixpkgs with remote execution in Bazel, and presented our solution. Of course, this solution is not perfect, and it comes with some shortcomings that end users should be aware of.
- The first issue is cache eviction. Caching all the Nix paths over the long term is not practical from a storage standpoint. That’s why we need a way to mark the required paths and garbage collect the others. A Nix path should be available as long as a client may trigger a remote build that uses it. However, there’s no way to determine when a client no longer needs a specific path. A simple solution would be to evict the least used paths, but that would require tighter integration with the Bazel APIs in order to track Nix path usage.
- The second issue relates to NFS performance, which depends on the infrastructure and workloads in operation. At a minimum, we want to tune NFS synchronization so that the paths are available before any build begins. Slow synchronization between the NFS server and client can lead to failed builds.
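To make the eviction idea concrete, here is a minimal sketch of selecting stale paths. It assumes a usage log with one "<epoch_seconds> <store_path>" line per recorded use. That log format and the helper name stale_paths are hypothetical, since the post notes that tracking usage would first require integration with the Bazel APIs. The sketch selects by last-use time, which is one possible reading of "least used".

```shell
# Hypothetical helper: read "<epoch_seconds> <store_path>" lines on stdin and
# print the store paths whose most recent use is older than the cutoff ($1).
stale_paths() {
  awk -v cutoff="$1" '
    {
      t = $1 + 0                                   # force numeric comparison
      if (!($2 in last) || t > last[$2]) last[$2] = t
    }
    END {
      for (p in last) if (last[p] < cutoff + 0) print p
    }
  ' | sort
}

# Example with made-up entries; only the clang path is older than the cutoff.
printf '%s\n' \
  '100 /nix/store/aaa-gcc' \
  '300 /nix/store/aaa-gcc' \
  '150 /nix/store/bbb-clang' | stale_paths 200
```

The selected paths could then be handed to Nix's own garbage collection tooling on the server. How that interacts with builds already in flight is left open here.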
[1] Bazel has an experimental feature that enables remotable repository rule actions. However, its capabilities are too limited to support the rules_nixpkgs use case.
About the authors
An SRE/DevOps engineer with a keen interest in networking, infrastructure and build systems.
Guillaume has a background in computer science, engineering and applied mathematics. Regarding software systems, his main concern is correctness, reliability, and trustworthiness. He is passionate about understanding complex systems and untangling intricate issues.
If you enjoyed this article, you might be interested in joining the Tweag team.