This post describes how to use OBazl to build OCaml projects with Bazel. Compared to the first OCaml ruleset published in 2016, which contains only a few basic rules, OBazl is under active development and aims to provide highly granular, low-level control over the build, and eventually automatically generate build files. It is therefore a practical choice to enjoy Bazel’s incremental, parallel, cacheable and remote building capacities. However, it is in its early stages, requiring manual work for some tasks that will be automated in the future.
To exemplify its usage, I will show how to convert the build of the uri package to Bazel. Since OCaml and Bazel is not a frequent combination, I will give an introduction to Bazel, as well as some aspects of compiling OCaml modules. The final product of this endeavour can be examined on GitHub.
Bazel Basics
The uri
project is suitable for this introduction because it is small, but
incorporates a number of interesting build requirements.
It contains three libraries (uri
, uri-re
and uri-sexp
in the
subdirectories lib
, lib_re
and lib_sexp
), a test suite in the subdirectory
lib_test
and two code generation helpers for the tests.
The goal is to be able to run these command lines to build and test all libraries contained in the project:
bazel build //lib:lib-uri
bazel test //lib_test:test
These commands specify targets, which use this syntax:
@reponame//sub/dir:targetname
//
denotes the project root directory, called workspace.
It must contain the file WORKSPACE.bazel
, in which dependencies and global
aspects of the project are configured.
The part before :
is the path of the subdirectory that contains the
target, called a package, and can be omitted to reference the root directory.
Each package must contain the file BUILD.bazel
.
The part after :
is the name of the target, which is assigned in a
BUILD.bazel
.
Multiple targets may be declared in a single package.
The whole specification can be prefixed with @some-name
to reference an
external repository configured in WORKSPACE.bazel
. Using the .bazel
extension for these files is optional.
Additionally, custom rule definitions and helper functions may be placed in
arbitrary files with the .bzl
extension. In this project, the subdirectory
bzl
hosts all the .bzl
files.
OBazl Configuration
Let’s start configuring our build for OBazl by writing WORKSPACE.bazel
.
First, we initialize the workspace with a project name.
workspace(name = "uri")
Next, we import the http_archive
function.
Importing functions is done by the load
function, which takes one argument denoting a file, with the target syntax
from above, and a list of function names to
be imported into the current scope.
@bazel_tools
is a native workspace distributed with Bazel.
That is, the following line imports the function http_archive
from
the file http.bzl
in the tools/build_defs_repo
package of the @bazel_tools
workspace.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
Using http_archive
, we fetch the OBazl tarball. This gives us a new
workspace, @obazl_rules_ocaml
, that we can reference.
http_archive(
name = "obazl_rules_ocaml",
strip_prefix = "rules_ocaml-9a700d54e86ffce2da0e2bd425aebe2c79f5167c",
urls = [
"https://github.com/obazl/rules_ocaml/archive/9a700d54e86ffce2da0e2bd425aebe2c79f5167c.tar.gz"
],
sha256 = "3ef69cca2c7829b3bc839c31eef8c02f498f8319ca33b4a27c740f8cbff24f51"
)
The next snippet imports OBazl’s main function, ocaml_configure
, as
well as the opam
configuration dictionary that we will define in the
file bzl/opam.bzl
below.
load("@obazl_rules_ocaml//ocaml:bootstrap.bzl", "ocaml_configure")
load("//:bzl/opam.bzl", "opam")
Finally, we initialize the OBazl configuration:
ocaml_configure(opam = opam, build = "4.12")
Here, build = "4.12"
specifies the compiler version for the project, which
will be configured in the next step.
Opam Configuration
Opam uses build environment specifications called switches which provide
a compiler and package dependency sets.
We configure it in bzl/opam.bzl
:
load("@obazl_rules_ocaml//ocaml:providers.bzl", "OpamConfig", "BuildConfig")
opam_pkgs = {
"stringext": ["1.4.0"],
. . .
}
opam = OpamConfig(
version = "2.0",
builds = {
"4.12": BuildConfig(
default = True,
switch = "4.12",
compiler = "4.12",
packages = opam_pkgs,
install = True,
),
}
)
This defines a single compiler configuration named 4.12
.
The packages
attribute offers a glimpse into the future of OBazl – the feature
that would automatically fetch Opam packages isn’t fully implemented yet.
Consequently, dependencies have to be installed manually, assuming opam
is
present in the system:
$ opam install ocamlfind stringext angstrom re sexplib ppx_sexp_conv ppx_deriving ounit
Building a Library
The uri
project contains three libraries, each composed of one or more OCaml
modules.
To build those with Bazel, we place a file named BUILD.bazel
in each
library’s subdirectory.
In that file, we first import the needed OBazl functions:
load(
"@obazl_rules_ocaml//ocaml:rules.bzl",
"ocaml_signature",
"ocaml_module",
"ocaml_library",
)
The lib_re
package contains two modules.
For each module, we have to create a target for the signature and the
implementation:
ocaml_signature(
name = "uri_re_sig",
src = "uri_re.mli",
deps_opam = ["stringext", "re"],
)
ocaml_module(
name = "uri_re",
struct = "uri_re.ml",
sig = "uri_re_sig",
deps_opam = ["stringext", "re"],
)
The name
attribute is the identifier of the target.
In this case, uri_re_sig
is assigned to the interface file uri_re.mli
and
referenced by the sig
attribute of the ocaml_module
rule.
Since the signature is in the same package as the module, we can simply refer
to the signature target by its name, but it would also be possible to use the extended target
syntax: //lib_re:uri_re_sig
.
The struct
argument denotes the file containing the implementation of the
module.
The signature target compiles to a .cmi
, while the module target compiles to
a .cmx
and combines the implementation and signature into a .o
file.
Building the module target with bazel build //lib_re:uri_re
(referencing
subdirectory:targetname
) produces the .o
at
bazel-bin/lib_re/__obazl/Uri_re.o
.
Applying the same rules to the other module, uri_legacy.ml
, allows us to
create a library target from those two modules:
ocaml_signature(
name = "uri_legacy_sig",
src = "uri_legacy.mli",
deps_opam = ["stringext", "re"],
)
ocaml_module(
name = "uri_legacy",
struct = "uri_legacy.ml",
sig = "uri_legacy_sig",
deps_opam = ["stringext", "re"],
)
ocaml_library(
name = "lib-uri-re",
modules = ["uri_re", "uri_legacy"],
)
This library target does not compile anything, its purpose is merely to group
the set of modules for reference in other targets (to build a cmxa
archive, OBazl has an ocaml_archive
rule).
Testing
The subdirectory lib_test
contains two test suites for the core library
uri
.
Aside from the library, it also has a dependency on test fixtures, located in
the etc
subdirectory.
Building those is a little more involved since they are generated from data
files.
The pipeline for this task starts with the subdirectory config
, which builds
an executable that performs the code generation.
This program converts the file etc/services.short
to OCaml, which is combined
with some extra code in etc/uri_services_raw.ml
into a module named
uri_services.ml
.
Building the executable in config/BUILD.bazel
is straightforward:
load("@obazl_rules_ocaml//ocaml:rules.bzl", "ocaml_module", "ocaml_executable")
ocaml_module(name = "gen_services", struct = "gen_services.ml", deps_opam = ["stringext"])
ocaml_executable(name = "exe", main = "gen_services")
The executable, //config:exe
, can then be used as a dependency in the etc
package to generate uri_services.ml
:
genrule(
name = "gen_uri_services",
srcs = ["services.short", "uri_services_raw.ml"],
outs = ["uri_services.ml"],
cmd = "$(execpath //config:exe) $(execpath services.short) > $@ && cat
$(execpath uri_services_raw.ml) >> $@",
tools = ["//config:exe"],
)
The function genrule
allows us to execute an arbitrary shell command using
other build products of our project.
The rule depends on the executable, which can be used in the cmd
attribute by
calling the auxiliary function execpath
which is provided by Bazel and
produces the path to the referenced target.
By the same mechanism, we get the paths to our data files.
$@
is expanded to the paths in the outs
attribute, containing the generated
module’s file name, which will be written to
bazel-bin/etc/uri_services.ml
.
So far, so good – but now the module has to be built.
This comes with a significant caveat:
The module has an associated interface file, and it is statically located at
etc/uri_services.mli
.
The naive approach would be a build definition referencing both signature and
implementation by their target names:
ocaml_signature(name = "uri_services_sig", src = "uri_services.mli")
ocaml_module(name = "uri_services", struct = "uri_services.ml", sig = "uri_services_sig")
This compiles just fine, but when we use this module as a dependency of the test executable, the linker tells us:
Error: Files bazel-bin/lib_test/__obazl/Test_runner.cmx
and bazel-bin/etc/__obazl/Uri_services.cmx
make inconsistent assumptions over interface Uri_services
What is happening is that the OCaml compiler expects the .mli
file to be in the
same directory as the .ml
file; but, since we generated the .mli
file, it is located
in the bazel-out
directory, while the interface is in the source
tree. Therefore, OCaml defaults to compiling Uri_services
as if
there were no Uri_services.mli
, and exposes all of its internals.
The solution to this is to copy the interface file to the build artifact
directory using the copy_file
function from @bazel_skylib
,
Bazel’s standard library. However, there is yet another hurdle: in
Bazel, the name of an output file cannot be the name of an input file;
but at the same time, the .mli
file and the .ml
file must have
matching names. To address this, we copy uri_services.mli
to the
generated/
subdirectory of the build artifact directory. The
generated/
directory doesn’t exist in the sources, which guarantees
that there is no name clash between inputs and
outputs. Correspondingly uri_services.ml
is generated in the
generated/
directory.
load("@bazel_skylib//rules:copy_file.bzl", "copy_file")
copy_file(
name = "uri_services_mli",
src = "uri_services.mli",
out = "generated/uri_services.mli",
)
genrule(
name = "gen_uri_services",
srcs = ["services.short", "uri_services_raw.ml"],
outs = ["generated/uri_services.ml"],
cmd = "$(execpath //config:exe) $(execpath services.short) > $@ && cat
$(execpath uri_services_raw.ml) >> $@",
tools = ["//config:exe"],
)
We then create a library named //etc:services_short
using the usual rules and
are equipped to write the definition of the test suite in lib_test/BUILD.bazel
:
load(
"@obazl_rules_ocaml//ocaml:rules.bzl",
"ocaml_module",
"ocaml_test",
)
ocaml_module(
name = "test_runner",
struct = "test_runner.ml",
deps_opam = ["oUnit"],
deps = ["//lib:lib-uri", "//etc:services_short"],
)
ocaml_test(name = "test", main = "test_runner")
The test_runner
depends on the test framework oUnit
, the main library using
its target label //lib:lib-uri
, and the generated services.
The rule ocaml_test
is similar to ocaml_executable
, but it defines a test
target that allows us to run the test command:
bazel test //lib_test:test
INFO: Found 1 test target...
Target //lib_test:test up-to-date:
bazel-bin/lib_test/test
INFO: Build completed successfully, 6 total actions
//lib_test:test PASSED in 0.3s
Executed 1 out of 1 test: 1 test passes.
Preprocessors
The lib_sexp
directory, containing the uri-sexp
library, has an additional
requirement:
a preprocessor from the opam package ppx_sexp_conv
needs to be executed on
the module before building it.
The module uri_sexp.ml
, contains the preprocessor annotation @@deriving sexp
:
type component = [
| `Scheme
. . .
] [@@deriving sexp]
OBazl provides additional tools for this purpose.
The OCaml preprocessor core library
ppxlib provides an entry point for
standalone executables that perform the rewrite routines of any linked
preprocessor library.
By linking ppx_sexp_conv
, we can create a tool that can be used by OBazl:
write_file(
name = "driver_ml",
out = "driver.ml",
content = ["let () = Ppxlib.Driver.standalone ()"],
)
ppx_module(
name = "driver",
struct = ":driver.ml",
deps_opam = ["ppxlib"],
)
ppx_executable(
name = "sexp_preprocessor",
main = ":driver",
opts = ["-linkall"],
deps_opam = ["ppx_sexp_conv"],
)
In order to apply the preprocessor to a module, sexp_preprocessor
has to be passed as the ppx
attribute of the ppx_module
rule, along with the output encoding in ppx_print
:
ppx_module(
name = "uri_sexp",
struct = "uri_sexp.ml",
sig = ":uri_sexp_sig",
deps_opam = ["sexplib", "ppx_sexp_conv"],
deps = ["//lib:lib-uri"],
ppx = ":sexp_preprocessor",
ppx_print = "@ppx//print:text",
)
The intermediate file at bazel-bin/lib_sexp/__obazl/Uri_sexp.ml
shows that
preprocessing was successful:
type component = [
| `Scheme
. . .
] [@@deriving sexp]
include
struct
let rec __component_of_sexp__ =
(let _tp_loc = "lib_sexp/uri_sexp.ml.Derived.component" in
. . .
Now the generated file has to be compiled.
Since the module file is now located in the build directory while the interface
is in the source directory, the same treatment as for the uri_services
module
is necessary: we have to copy the interface file to the build directory.
In this case, the generator rule is not directly under our control, so we’ll
have to adapt to the ppx
rules’ output path, which is
__obazl/Uri_sexp.ml
:
load("@obazl_rules_ocaml//ocaml:rules.bzl", "ppx_library")
copy_file(
name = "uri_sexp_mli",
src = "uri_sexp.mli",
out = "__obazl/uri_sexp.mli",
)
ocaml_signature(
name = "uri_sexp_sig",
src = "uri_sexp_mli",
deps_opam = ["sexplib"],
deps = ["//lib:lib-uri"],
)
Using ppx_library
instead of ocaml_library
isn’t strictly necessary here; it
is a convention to clearly mark the involvement of a preprocessor through Bazel
providers.
Conclusion
The current state of OBazl is quite functional, albeit in early stages. You will find that you sometimes need rather substantial efforts to compensate for missing features. For instance, when I started working with OBazl, it didn’t work on NixOS (we like NixOS here at Tweag). So I had to implement NixOS support myself.
However, development is in full swing, with promising improvements on the horizon, including automatic build file generation, dependency handling, and even support for Coq builds.
About the author
Torsten is a Haskell architect with a passion for creating complex tools and libraries with cutting-edge technologies, seamless user experience, and detailed, accessible documentation.
If you enjoyed this article, you might be interested in joining the Tweag team.