Monday, March 21, 2016

pbs-stage1: Building toolchains for cross-compilation


Cross-compiling is the process of compiling code on one (build) machine to be run on another (host) machine. Often the host machine will use a different architecture than the build machine. The process is mostly the same as regular compilation, with the notable exception that the code that is built cannot be run on the build machine. This can be significant, and excruciating, under certain circumstances, but that's a story that will be told another time.


Toolchain is a rather loosely defined term, but it typically refers to the collection of utilities required to build software. Depending on the software, this set can vary widely. For the purposes of this post I'll limit the set to a C compiler, an assembler and a linker. In the GNU/Linux world these are provided by the GCC and GNU binutils projects. For bare-metal software those will typically be enough, but most developers will at least want to be able to run software under a Linux operating system, so we'll need to include a C runtime library (such as the GNU C library, a.k.a. glibc) and Linux kernel headers in our set as well. There are a couple of dependencies that the above (primarily GCC) require, so those go into the set as well. If we want anything fancy we may even consider adding things like m4, autoconf, automake and libtool, or even pkg-config and quilt. For debugging support, let's also add the GNU debugger (gdb) to the mix.

As you can see, there's a bunch of software involved in building software. It's fairly complicated to build all these pieces and make them work well together. There are a couple of projects that do this already, such as crosstool, crosstool-ng and buildroot.
Mostly out of curiosity I started looking into how to build a cross-compilation toolchain many moons ago. Some of the above tools didn't exist at the time, and those that did exist didn't quite do what I wanted them to. I could of course have done the right thing and improved one of the existing solutions, but instead I went and did something new. Because I thought I could do better. Also, doing so was very instructive, and that's what it's really all about, right?

Both crosstool-ng and buildroot work perfectly for a large number of people, so I definitely recommend looking at those if you're looking for ways to build cross-compilation toolchains. crosstool has become somewhat outdated, so you'll probably have less luck with it.

Some distributions also provide cross-compilation toolchain packages. Debian has quite a few, while others often only provide toolchains targeting the ARM architecture. It's also quite common for distributions to ship only bare-metal toolchains, so developing against a Linux-based distribution doesn't work. There are also some other well-known toolchain providers, such as CodeSourcery, DENX (ELDK) and Linaro.


The name pbs-stage1 is a historic relic. A long time ago (or at least it now seems so) my day job involved building custom distributions for embedded systems. Again, a number of alternatives existed (and still do) to do the same thing. And again, curiosity was what made me want to roll my own. Back then, the system to build these distributions was called PBS (platform build system). As the name indicates, these weren't generic distributions, but they were highly configurable and tailored to a specific (hardware) platform.

Initially, PBS included a toolchain builder, but that turned out not to scale very well, because it essentially meant that you needed to build a toolchain every time you wanted to start a new platform, even if both used the same architecture or even chip. To get rid of that duplication, the project was split into stages: the toolchain builder (stage 1) and the platform builder (stage 2).

pbs-stage1 uses a fairly simple build system. It has a top-level makefile that includes a toolchain host definition and builds all required packages in the right order, using make dependencies. Some of the components of the toolchain are architecture-independent (libtool, autoconf, automake, pkg-config, ...) and the scripts install stamp files in the installation directory to prevent those from being built multiple times. Effectively, only binutils, gcc, the Linux kernel header files, glibc and gdb are built per host definition.
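
The stamp-file idea can be sketched in a few lines of shell. This is only an illustration of the technique, not the actual pbs-stage1 logic: the build_once function, the .stamps directory and the temporary PREFIX are all made up for the example.

```shell
# Hypothetical sketch of stamp files; names and layout are assumptions,
# not taken from pbs-stage1 itself.
PREFIX=$(mktemp -d)          # stand-in for the real installation directory
STAMPS="$PREFIX/.stamps"     # assumed location for the stamp files

build_once() {
    pkg=$1
    if [ -e "$STAMPS/$pkg" ]; then
        echo "skip $pkg"     # a stamp exists, so the package is already installed
        return 0
    fi
    echo "build $pkg"        # a real script would configure, build and install here
    mkdir -p "$STAMPS"
    touch "$STAMPS/$pkg"     # record success so later runs can skip this package
}

first=$(build_once libtool)   # no stamp yet, so this one builds
second=$(build_once libtool)  # the stamp is found, so this one is skipped
echo "$first, then $second"

rm -rf "$PREFIX"             # clean up the temporary directory
```

Because the stamp lives in the installation directory rather than the build tree, a second toolchain installed into the same prefix would find it and skip the architecture-independent package.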

By default, toolchains will be installed into a pbs-stage1 subdirectory of your home directory. This can be changed by setting the PREFIX variable on the make command line.
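
To illustrate how a variable given on the make command line takes precedence over a default set in a makefile, here is a small self-contained sketch. The makefile it generates is a stand-in written for this example (including the show target and the /opt/toolchains path), not the pbs-stage1 makefile, and it assumes make is installed.

```shell
# Stand-in makefile for illustration only; not the real pbs-stage1 makefile.
tmp=$(mktemp -d)
printf 'PREFIX = $(HOME)/pbs-stage1\nshow:\n\t@echo $(PREFIX)\n' > "$tmp/Makefile"

# Without an override, the default from the makefile is used.
default_prefix=$(make -s -f "$tmp/Makefile" show)

# A variable set on the command line overrides the makefile assignment.
custom_prefix=$(make -s -f "$tmp/Makefile" show PREFIX=/opt/toolchains)

echo "default:  $default_prefix"
echo "override: $custom_prefix"

rm -rf "$tmp"                # clean up the temporary directory
```

The same mechanism applies to any make-based build: whatever follows the make command in NAME=value form wins over an ordinary assignment inside the makefile.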

Using the toolchain is straightforward once you've built and installed it. Simply add the $HOME/pbs-stage1/bin directory to your PATH and you're good to go.
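
As a minimal sketch, assuming the default installation location mentioned above, the following adds the directory to PATH only if it isn't there already. The PBS_BIN variable name is just a convenience for this example.

```shell
# Assumes the default install location; adjust if you overrode PREFIX.
PBS_BIN="$HOME/pbs-stage1/bin"

# Prepend the directory only when PATH doesn't already contain it.
case ":$PATH:" in
    *":$PBS_BIN:"*) ;;
    *) PATH="$PBS_BIN:$PATH" ;;
esac
export PATH
```

Putting these lines into your shell's startup file makes the change permanent. The cross tools should then be reachable by name, with a prefix that depends on the host definition you built.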


You can find pbs-stage1 in a repository on GitHub.
The repository contains a README file that explains briefly how to build a toolchain. There are a bunch of host definitions included, though some of the more exotic ones aren't guaranteed to build or work.
