Have you ever needed to execute in another computer a binary that you have built, only to find that it doesn’t work because it depends on shared libraries that aren’t installed? Or even worse, you don’t have admin rights on the machine, so you cannot install the correct dependencies?
In this post we present clodl
, a simple tool to build
self-contained
closures of dynamically linked executables and shared libraries.
The closure is a file that can be copied elsewhere and executed
or loaded as a library without depending on any pre-installed library.
Deploying binaries
While copying executables is the simplest way to deploy a binary, it depends strongly on a correctly configured computing environment. Other options, such as copying all the dependencies to a Docker container, or building instead a fully static binary executable alleviate this problem. However, there are cases where these methods are not appropriate.
One such case is feeding native code to the Java Virtual Machine (JVM), since it can only load shared libraries. If we are to deploy a hybrid application requiring both the JVM and native code, we need to be sure that all the native code dependencies are shipped or the JVM won’t be able to load it.
A possibility is creating a Docker image with the JVM and all the libraries. This can work as long as we have control over how the application is launched, which was not the case in one of our projects.
In this project, we wanted to integrate Haskell programs with Spark, a system for running distributed computations. Among other things, Spark interfaces with systems which manage resources in a cluster and orchestrates the creation of JVM instances to run distributed computations. Computations are submitted to Spark as self-contained JAR files containing the code to execute. In this scenario, we must be able to encapsulate dependencies.
Another situation where clodl
has been useful for us is when we
need a self-contained (static-like) executable, but we are using a
compiler that doesn’t build fully static executables by default (e.g. GHC).
Depending on the application, doing so might not be trivial.
The clodl
approach offers a stop-gap solution, or even a permanent
one
if fully static builds don’t work (some dependencies often make this difficult).
What is a closure?
In essence, the closure of a shared library or executable is a file
containing this initial library or executable and its dependencies.
In the case of clodl
this takes the form of a zip archive,
though any container format could be used.
In addition, the closure must have some provisions to execute or load the code contained within. For example, a Java application could uncompress the closure and load the initial shared library. However, this can cause problems, since there is the danger that the dependencies would be searched at other locations where incompatible libraries could be found. Unix-like systems have a flexible mechanism for finding the dependencies of a library, but this mechanism is also hard to override when we want to force the library to use a specific set of dependencies.
Because of this, the closure has a wrapper library created solely for the purpose of loading it. This wrapper library is the result of linking dynamically the initial library and all of its dependencies, and therefore all the libraries in the closure are direct dependencies of it. Moreover, the runtime library path of this wrapper library instructs the dynamic linker to search for dependencies at the same location where the wrapper is.
If we want the closure to be executed rather than loaded, we create a wrapper executable instead of a wrapper library, which is linked in the same way.
Creating closures with clodl
clodl
is implemented with a mix of bash scripts and Bazel
rules. Bazel is a build system which is able to express dependencies
between source files in various languages (e.g. Haskell and Java).
While we have been using clodl
with it, it would be possible to
implement the same approach in other tools or even as a standalone
shell script.
In order to build a closure, one has to write rules in a BUILD
file
which can be fed to Bazel. For instance
# bring into scope the library_closure rule
load(
"@io_tweag_clodl//clodl:clodl.bzl",
"library_closure",
)
# create a C library
cc_library(
name = "lib-cc",
srcs = ["lib.c"],
)
# create a shared library from the C library
cc_binary(
name = "libhello-cc.so",
# include a main function if you intend to make
# the closure executable
srcs = ["main.c"],
deps = ["lib-cc"],
linkshared = 1,
)
# create the closure of the shared library
library_closure(
name = "clotest-cc",
srcs = ["libhello-cc.so"],
)
The library_closure
rule here produces a file clotest-cc.zip
containing all the shared libraries needed to load libhello-cc.so
.
clodl
finds the dependencies using whatever version of ldd
is on the
PATH when Bazel runs the rules.
We can make the the closure executable by replacing the
library_closure
rule with
binary_closure(
name = "clotestbin-cc",
src = "libhello-cc.so",
)
This will produce a file clotestbin-cc.sh
that has the closure zip
file prepended with a bash script that invokes the main function in
main.c
. The clodl
repository has full examples.
At the time of this writing, we only have tested clodl
on Linux.
Tools for deployment
clodl
is not the first tool on Linux designed to deploy binaries. There
is Flatpack and Snap, for instance. These are build
tools that start from the source code to build cross-linux packages.
Another such tool is nix-bundle, which requires the application
to be packaged with nix in order to provide a binary which
bundles all of its dependencies.
These tools require specifying the dependencies, the source
code and any build settings. clodl
, in contrast, starts from the
executable binary or shared library, written in any language, built by
any compiler.
However, there are tools that follow this approach as well. One of
these is a commercial tool called ermine, which produces
a self-contained executable from a dynamically-linked application.
Unfortunately, it provides no clues on the structure of the final
executable. clodl
produces a zip archive, which is convenient to
uncompress and have its contents loaded in the JVM. The choice of
the archiving format is not crucial for the approach as long as the
target machine has the ability to extract it.
Finally, there is a tool called exodus which produces a tar
archive from a dynamically linked executable, its dependencies and
the dynamic linker. Most notably, it also includes a launcher program
that will invoke the linker with command line flags that constrain the
directories in which to search for libraries. This approach is rather
frugal. Unlike clodl
, it doesn’t need to build a wrapper executable.
However, exodus
will not work for building closures of libraries.
When we load libraries in the JVM, we assume no control over the flags
passed to the dynamic linker that the JVM is using.
Summary
In this post we have presented clodl
, a tool to build closures of
dynamically linked executables and shared libraries. These closures
are easy to copy and load or execute in other hosts.
We expect these capabilities to be useful to various projects.
You are welcome to get in touch if you would like to contribute or have
clodl
features enhanced.
About the authors
Mathieu is the CEO and founder of Tweag.
If you enjoyed this article, you might be interested in joining the Tweag team.