kotatsuyaki’s site

Yes, Nixpkgs 23.05 still works* on CentOS 6

In recent years, I’ve run into HPC systems stuck on CentOS 6 or even CentOS 5 more often than I should have: one in a corporation and three in academic settings. Installing software on them as a normal user without root has always been a pain because of how ancient these systems are, and it doesn’t help that some of them are kept offline for security reasons.

It occurred to me that maybe I could stretch Nixpkgs’s support to these old, offline systems, and it turned out to be easier than I imagined. It did require some thought and wrestling, but in the end I succeeded in bootstrapping Nixpkgs for the x86_64-unknown-linux-musl system up to a fully functional Emacs 28.2 on an offline CentOS 6 machine without root.

“Why musl, and why bootstrap from source?” you might ask. To begin with, I have three special constraints imposed here:

  1. CentOS 6 comes with kernel 2.6.32.
  2. HPC users usually don’t have root access.
  3. Some HPC systems are offline.

Consequently, through the logic of this flow chart, I arrived at the conclusion that the easiest path is to bootstrap Nixpkgs for native musl libc on the offline machine.

Some explanations:

The rest of this post documents how I bootstrapped Nixpkgs for localSystem = "x86_64-unknown-linux-musl" up to the emacs package on an offline machine running CentOS 6 as an unprivileged user with a custom store location at /home/kotatsu/.nix/. From now on, I’ll refer to the target system as “native musl” to avoid spelling out the host triple (quadruple?) repeatedly.

Prefetch the source closure for offline bootstrap

The information on how to prefetch source tarballs for offline build is pretty sparse on the Internet, and all I could find was this thread from 2021 asking for “source closure”, hence the heading of this section. The gist is to list all the derivations required for building a particular derivation (which was Emacs in my case), filter the list to retain all the fixed-output derivations, and finally build all the fixed-output derivations from a machine with access to the Internet.

In my case, there’s another problem: I can’t take advantage of the binary cache for fetching the source closure, because

  1. I’m targeting a custom store location, and also because
  2. I’m targeting native musl,

neither of which is covered by the official binary cache. Trying to fetch anything that requires pkgs.fetchgit or even just pkgs.fetchurl from Nixpkgs results in bootstrapping the whole stdenv up to git and curl so that the fetchers can be run. I do not want to do that on my local systems, because bootstrapping takes far longer there than on the HPC system.

My approach to this problem is a handful of quick-and-dirty Python scripts that perform the steps below. Effectively, the scripts fetch the fixed-output derivations using the conventional glibc Nixpkgs (x86_64-unknown-linux-gnu), which is backed by the official binary cache, and then transfer the resulting store paths from the /nix store to the custom store.

  1. Evaluate the installable I want to build (e.g. Emacs) and get the closure of all its dependencies with nix path-info --recursive --derivation <installable>. This step is done on the custom store with native musl, but no bootstrapping is required because I only evaluate the derivations, not build them.
  2. Get the JSON representation of each derivation with nix derivation show /path/to/the/derivation.drv^* to determine whether it’s a fixed-output derivation that should be prefetched. If a derivation is fixed-output, record it. This step is also done on the custom store.
  3. On the /nix store, recreate and build the fixed-output derivations. Because this is done on the /nix store with glibc, even the ones that require bootstrapping many things, such as buildCargoTarball, which requires building Cargo, can be built with ease thanks to the binary cache.
  4. Copy each store path from the /nix store back to the custom store with nix store add-file or nix store add-path, depending on whether the output hash mode of the fixed-output derivation is flat or recursive, respectively.
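Steps 1 and 2 can be sketched roughly as follows. This is not the author's actual script, only a minimal illustration. It assumes that the JSON printed by nix derivation show marks a fixed-output derivation with a hash field on its out output, and that recursive mode shows up either as an r: prefix on hashAlgo or as a method field; the exact layout differs between Nix versions, so check it against the Nix you use. The function names are made up for this example.

```python
import json
import subprocess
from typing import Dict, Optional


def output_hash_mode(drv: dict) -> Optional[str]:
    """Return "flat" or "recursive" for a fixed-output derivation, else None.

    Assumption: a fixed-output derivation has a "hash" field on its "out"
    output. Both JSON layouts handled here are assumptions about Nix versions.
    """
    out = drv.get("outputs", {}).get("out", {})
    if "hash" not in out:
        return None  # input-addressed derivation: nothing to prefetch
    algo = out.get("hashAlgo", "")
    method = out.get("method", "")
    if algo.startswith("r:") or method in ("nar", "recursive"):
        return "recursive"
    return "flat"


def fixed_output_drvs(derivations: Dict[str, dict]) -> Dict[str, str]:
    """Map each fixed-output .drv path to its output hash mode."""
    modes = {}
    for drv_path, drv in derivations.items():
        mode = output_hash_mode(drv)
        if mode is not None:
            modes[drv_path] = mode
    return modes


def load_closure(installable: str) -> Dict[str, dict]:
    """Evaluate (not build) the installable and load every .drv as JSON."""
    listing = subprocess.run(
        ["nix", "path-info", "--recursive", "--derivation", installable],
        check=True, capture_output=True, text=True,
    ).stdout.split()
    derivations: Dict[str, dict] = {}
    for path in listing:
        if not path.endswith(".drv"):
            continue  # the closure of a .drv also contains plain source files
        shown = subprocess.run(
            ["nix", "derivation", "show", path + "^*"],
            check=True, capture_output=True, text=True,
        ).stdout
        derivations.update(json.loads(shown))
    return derivations
```

With these helpers, fixed_output_drvs(load_closure(installable)) would give the list of derivations to prefetch together with the hash mode needed in step 4.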

The store paths copied into the custom store in the final step can then be copied to a local binary cache with nix copy --to file:///tmp/binary-cache <store paths> and transferred to the offline machine.
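As a rough illustration of step 4 and of the copy into the binary cache, the commands could be assembled like this. The helper names are hypothetical. The optional --name override is an assumption on my part about what is needed to keep the original store path name, since by default the name is taken from the base name of the path being added.

```python
from typing import List, Optional, Sequence


def add_to_store_cmd(path: str, hash_mode: str,
                     name: Optional[str] = None) -> List[str]:
    """Build the command that imports one prefetched path into the store.

    flat      -> nix store add-file
    recursive -> nix store add-path
    """
    subcommand = {"flat": "add-file", "recursive": "add-path"}[hash_mode]
    cmd = ["nix", "store", subcommand]
    if name is not None:
        # Assumption: override the name component of the resulting store path.
        cmd += ["--name", name]
    return cmd + [path]


def copy_to_cache_cmd(store_paths: Sequence[str],
                      cache_url: str = "file:///tmp/binary-cache") -> List[str]:
    """Build the nix copy command that fills the local binary cache."""
    return ["nix", "copy", "--to", cache_url, *store_paths]
```

Each returned list can be passed to subprocess.run(cmd, check=True) on the machine that owns the custom store.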

In step 2 above, how do I preserve the information needed to recreate the fixed-output derivations on a different instance of Nixpkgs targeting glibc? Well, I modified Nixpkgs to preserve the function arguments passed to each fetch* fetcher function, and my Python scripts read those arguments from the JSON representation of each derivation. For example, in the fetchgit function I added “metadata attributes” named fod-fetch-variant and fod-fetch-inputs.

{ url, rev ? "HEAD", md5 ? "", sha256 ? "",
  # More function inputs omitted here...
}:

stdenvNoCC.mkDerivation {
  # Preserve function name (fetchgit) and inputs for the script
  # to recreate the derivation.
  fod-fetch-variant = "fetchgit";
  fod-fetch-inputs = builtins.toJSON {
    inherit url rev sha256 hash
      # All other inputs preserved here...
      ;
  };
}

I’m not sure if this method of injecting extra attributes into the derivations is the best solution, but it got me as far as building Emacs without problems.
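To give an idea of how a script might consume these attributes, here is a hedged sketch. It assumes that the two attributes appear under the env key of the derivation JSON, and it regenerates the fetcher call by passing the saved inputs through builtins.fromJSON. The generated expression, the inputsFile argument, and the function name are all invented for this example; the real scripts may work quite differently.

```python
import json
import re
from typing import Tuple

# Hypothetical template: call the recorded fetcher with the recorded inputs.
EXPR_TEMPLATE = """{{ inputsFile }}:
with import <nixpkgs> {{ }};
{variant} (builtins.fromJSON (builtins.readFile inputsFile))
"""


def recreate_fetch(drv: dict) -> Tuple[str, str]:
    """Return (nix_expression, inputs_json) for one fixed-output derivation.

    Assumption: mkDerivation attributes show up under "env" in the JSON.
    """
    env = drv["env"]
    variant = env["fod-fetch-variant"]
    # Refuse anything that is not a plain Nix identifier such as fetchgit.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_'-]*", variant):
        raise ValueError("unexpected fetcher name: %r" % variant)
    inputs = json.loads(env["fod-fetch-inputs"])
    return EXPR_TEMPLATE.format(variant=variant), json.dumps(inputs, indent=2)
```

The two strings could be written to, say, fetch.nix and inputs.json and built on the glibc side with nix-build fetch.nix --arg inputsFile ./inputs.json.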

Fix the build failures for native musl on kernel 2.6

After prefetching and transferring the source closure to the offline HPC system, I built the derivation for musl with nix build. To build Emacs, I had to fix or work around the following build issues. In practice, this task is no different from packaging software with Nix for the conventional glibc platform.

And that’s all I did. Ignoring the failed attempts, the whole build process completes in a few hours on my test VM with 6 cores of an i7-8700 allocated.

Is this practical?

Yes, the setup is practical. Here’s a screenshot of the Emacs 28.2 bootstrapped from source.

This copy of Emacs is built with all the optional dependencies, such as freetype, harfbuzz, and libgccjit for native compilation, and I didn’t need to learn how to configure and build each dependency and its transitive dependencies. Before this, I used to “bootstrap” software on HPC systems from scratch by following the BLFS book (such as this page for Emacs), but software installed in this ad-hoc way is a mess to upgrade. Using Nix makes bootstrapping and patching software much less tedious, and it comes with the benefits of being reproducible and easily transferable.

Other alternatives to my approach

There are some alternatives that I tried, discarded, or overlooked while working on this post. The alternatives for the general task of “installing new software on old, offline HPC” are listed here. If you went down any of the rabbit holes in the list, please write about your experience and let me know!


Caveats

These are some caveats that didn’t end up in the main prose.

Modern glibc on kernel 2.6?

What happens when glibc is built on kernel 2.6, anyways? Well, the dynamic linker fails to load libc.so:

mkdir -p -- /home/kotatsu/.nix/store/dg8d4rfn2ykhnf32a00hjgkw8j95db87-glibc-2.38-23/lib/locale
C.UTF-8.../tmp/nix-build-glibc-2.38-23.drv-0/build/locale/localedef: error while loading shared libraries: libc.so.
make[2]: *** [Makefile:475: install-archive-C.UTF-8/UTF-8] Error 127
make[2]: Leaving directory '/tmp/nix-build-glibc-2.38-23.drv-0/glibc-2.38/localedata'
make[1]: *** [Makefile:749: localedata/install-locales] Error 2
make[1]: Leaving directory '/tmp/nix-build-glibc-2.38-23.drv-0/glibc-2.38'
make: *** [Makefile:9: localedata/install-locales] Error 2

Running strace /path/to/localedef reveals the cause. The dynamic linker from modern glibc calls newfstatat with AT_EMPTY_PATH, a flag that was only added in Linux 2.6.39 and is therefore not available on kernel 2.6.32. Glibc really meant it when they dropped support for kernels older than 3.2 ;)

newfstatat(3, "", 0x7fffffffa780, AT_EMPTY_PATH) = -1 EINVAL (Invalid argument)

I only learned about the Nix HPC channel at #hpc:nixos.org after going through the trouble of new glibc on an old kernel, and there was a report of this exact EINVAL problem from November 2022. Oops.
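A quick way to spot this situation before starting a long build is to compare the running kernel against the 3.2 minimum mentioned above. The sketch below is only a convenience check; the function names are invented here, and the CentOS 6 release string in the usage note is an illustrative example, not taken from my machines.

```python
import platform
import re
from typing import Optional, Tuple

# Minimum kernel that modern glibc supports, as mentioned in the post.
GLIBC_MIN_KERNEL = (3, 2)


def kernel_version(release: str) -> Tuple[int, ...]:
    """Parse the leading numeric part of a kernel release string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part) for part in match.groups(default="0"))


def modern_glibc_supported(release: Optional[str] = None) -> bool:
    """Return True if the kernel is at least GLIBC_MIN_KERNEL."""
    if release is None:
        release = platform.release()  # e.g. the output of uname -r
    return kernel_version(release)[:2] >= GLIBC_MIN_KERNEL
```

For a release string such as 2.6.32-754.el6.x86_64 the check returns False, which is the cue to go with musl instead.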

nix-shell builds glibc on native musl

nix-shell does not work by default: it wants bash from <nixpkgs>, which is the glibc version by default, so it tries to build the glibc stdenv.

To make nix-shell work, specify a musl-based shell using the NIX_BUILD_SHELL environment variable. I haven’t found a way to do the same thing for nix develop, but I think it can be done by patching Nixpkgs to prefer musl over glibc when running on Linux.
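For scripted use, the variable can be set when launching nix-shell from Python, roughly as below. The path to the musl-built bash is a placeholder that you need to fill in with a real store path, and the helper names are made up for this sketch.

```python
import os
import subprocess
from typing import Dict, Mapping, Optional, Sequence


def nix_shell_env(musl_bash: str,
                  base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and point NIX_BUILD_SHELL at a musl-built bash."""
    env = dict(os.environ if base_env is None else base_env)
    env["NIX_BUILD_SHELL"] = musl_bash
    return env


def run_nix_shell(musl_bash: str, args: Sequence[str]) -> int:
    """Run nix-shell with the musl shell and return its exit status."""
    return subprocess.call(["nix-shell", *args], env=nix_shell_env(musl_bash))
```

For example, run_nix_shell("/path/to/musl/bash", ["-p", "hello"]) would start nix-shell without pulling in the glibc bash, assuming the given bash exists.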

Other caveats


  1. There’s another tool called proot that intercepts system calls using ptrace() to implement the functionality of chroot, but in practice I found proot’s overhead to be problematic.