Clodl: build, copy, and run

20 February 2019 — by Facundo Dominguez, Mathieu Boespflug

Have you ever needed to execute in another computer a binary that you have built, only to find that it doesn’t work because it depends on shared libraries that aren’t installed? Or even worse, you don’t have admin rights on the machine, so you cannot install the correct dependencies?

In this post we present clodl, a simple tool to build self-contained closures of dynamically linked executables and shared libraries. The closure is a file that can be copied elsewhere and executed or loaded as a library without depending on any pre-installed library.

Deploying binaries

While copying executables is the simplest way to deploy a binary, it depends strongly on a correctly configured computing environment. Other options, such as copying all the dependencies to a Docker container, or building instead a fully static binary executable alleviate this problem. However, there are cases where these methods are not appropriate.

One such case is feeding native code to the Java Virtual Machine (JVM), since it can only load shared libraries. If we are to deploy a hybrid application requiring both the JVM and native code, we need to be sure that all the native code dependencies are shipped or the JVM won’t be able to load it.

A possibility is creating a Docker image with the JVM and all the libraries. This can work as long as we have control over how the application is launched, which was not the case in one of our projects.

In this project, we wanted to integrate Haskell programs with Spark, a system for running distributed computations. Among other things, Spark interfaces with systems which manage resources in a cluster and orchestrates the creation of JVM instances to run distributed computations. Computations are submitted to Spark as self-contained JAR files containing the code to execute. In this scenario, we must be able to encapsulate dependencies.

Another situation where clodl has been useful for us is when we need a self-contained (static-like) executable, but we are using a compiler that doesn’t build fully static executables by default (e.g. GHC). Depending on the application, doing so might not be trivial. The clodl approach offers a stop-gap solution, or even a permanent one if fully static builds don’t work (some dependencies often make this difficult).

What is a closure?

In essence, the closure of a shared library or executable is a file containing this initial library or executable and its dependencies. In the case of clodl this takes the form of a zip archive, though any container format could be used.

In addition, the closure must have some provisions to execute or load the code contained within. For example, a Java application could uncompress the closure and load the initial shared library. However, this can cause problems, since there is the danger that the dependencies would be searched at other locations where incompatible libraries could be found. Unix-like systems have a flexible mechanism for finding the dependencies of a library, but this mechanism is also hard to override when we want to force the library to use a specific set of dependencies.

Because of this, the closure has a wrapper library created solely for the purpose of loading it. This wrapper library is the result of linking dynamically the initial library and all of its dependencies, and therefore all the libraries in the closure are direct dependencies of it. Moreover, the runtime library path of this wrapper library instructs the dynamic linker to search for dependencies at the same location where the wrapper is.

If we want the closure to be executed rather than loaded, we create a wrapper executable instead of a wrapper library, which is linked in the same way.

Creating closures with clodl

clodl is implemented with a mix of bash scripts and Bazel rules. Bazel is a build system which is able to express dependencies between source files in various languages (e.g. Haskell and Java). While we have been using clodl with it, it would be possible to implement the same approach in other tools or even as a standalone shell script.

In order to build a closure, one has to write rules in a BUILD file which can be fed to Bazel. For instance

# bring into scope the library_closure rule
load(
    "@io_tweag_clodl//clodl:clodl.bzl",
    "library_closure",
)

# create a C library
cc_library(
    name = "lib-cc",
    srcs = ["lib.c"],
)

# create a shared library from the C library
cc_binary(
    name = "libhello-cc.so",
	# include a main function if you intend to make
	# the closure executable
    srcs = ["main.c"],
    deps = ["lib-cc"],
    linkshared = 1,
)

# create the closure of the shared library
library_closure(
    name = "clotest-cc",
    srcs = ["libhello-cc.so"],
)

The library_closure rule here produces a file clotest-cc.zip containing all the shared libraries needed to load libhello-cc.so. clodl finds the dependencies using whatever version of ldd is on the PATH when Bazel runs the rules.

We can make the the closure executable by replacing the library_closure rule with

binary_closure(
    name = "clotestbin-cc",
    src = "libhello-cc.so",
)

This will produce a file clotestbin-cc.sh that has the closure zip file prepended with a bash script that invokes the main function in main.c. The clodl repository has full examples. At the time of this writing, we only have tested clodl on Linux.

Tools for deployment

clodl is not the first tool on Linux designed to deploy binaries. There is Flatpack and Snap, for instance. These are build tools that start from the source code to build cross-linux packages. Another such tool is nix-bundle, which requires the application to be packaged with nix in order to provide a binary which bundles all of its dependencies.

These tools require specifying the dependencies, the source code and any build settings. clodl, in contrast, starts from the executable binary or shared library, written in any language, built by any compiler.

However, there are tools that follow this approach as well. One of these is a commercial tool called ermine, which produces a self-contained executable from a dynamically-linked application. Unfortunately, it provides no clues on the structure of the final executable. clodl produces a zip archive, which is convenient to uncompress and have its contents loaded in the JVM. The choice of the archiving format is not crucial for the approach as long as the target machine has the ability to extract it.

Finally, there is a tool called exodus which produces a tar archive from a dynamically linked executable, its dependencies and the dynamic linker. Most notably, it also includes a launcher program that will invoke the linker with command line flags that constrain the directories in which to search for libraries. This approach is rather frugal. Unlike clodl, it doesn’t need to build a wrapper executable. However, exodus will not work for building closures of libraries. When we load libraries in the JVM, we assume no control over the flags passed to the dynamic linker that the JVM is using.

Summary

In this post we have presented clodl, a tool to build closures of dynamically linked executables and shared libraries. These closures are easy to copy and load or execute in other hosts. We expect these capabilities to be useful to various projects. You are welcome to get in touch if you would like to contribute or have clodl features enhanced.

Behind the scenes

Facundo Dominguez

Mathieu Boespflug

Mathieu is the CEO and founder of Tweag.

Tech Group

Scalable Builds

Correct, efficient, and reliable builds are critical for developers to work and collaborate effectively.

If you enjoyed this article, you might be interested in joining the Tweag team.

This article is licensed under a Creative Commons Attribution 4.0 International license.

← The types got you JupyterWith: Declarative, Reproducible Notebook Environments →