Thursday, June 23, 2016

What's new for Tegra in Linux v4.7

With v4.7-rc1 out[1], it's time to look at what to expect from Tegra support in Linux v4.7.


XUSB is a USB host controller that supports USB 3.0. After loading it with a proprietary firmware blob, it exposes an xHCI-compliant interface and provides USB 3.0 on ports that support it, as well as high-speed and full-speed where super-speed is not available.

The XUSB driver has been under development for a ridiculously long time. One of the reasons is that it relies on the XUSB pad controller to configure its pins as required by the board design. The XUSB pad controller is very likely one of the least intuitive pieces of hardware I've ever encountered, and there have been numerous attempts to come up with a device tree binding to describe it. We did eventually settle on something earlier this year, and after the existing code was updated for the new binding, we're finally able to support super-speed USB on Tegra124 and later.

Most of the work on this was done by Google as part of the Nyan Chromebooks and the Pixel C.

Core SoC drivers

Jon Hunter cleaned up the PMC driver and fixed a number of issues in it. He then drove to conclusion another long-standing patch set, prototyped by Vince Hsu and me, that exposes the Tegra power partitions as generic power domains; this will eventually allow us to get rid of a custom Tegra API.

Some of the other core SoC drivers, such as I2C and DMA, have seen a couple of updates by Laxman Dewangan, though they've been feature-complete for the longest time, so there isn't anything really new there.


Alex Courbot has been continuing to improve support for the Tegra GPUs in the Nouveau driver. After adding support for secure boot, required by the Maxwell generation of GPUs, in v4.6, work has been ongoing in other parts of the kernel to enable the GPU on Jetson TX1. We will hopefully see that happen in the v4.8 timeframe.


Not much happened on the Tegra DRM front because I was occupied with the XUSB driver. However, a lot of code for Tegra X1 had already been merged in earlier releases, so many of the prerequisites to enable display on Jetson TX1 are in place already.

The Jetson TX1 comes with an HDMI connector that supports HDMI 2.0. The board also has a display connector that can be used to connect a DSI or eDP panel. Most of the NVIDIA engineers have a default display board equipped with a DSI panel running at a 1200x1920 resolution.


Laxman Dewangan has also added support for the Maxim MAX77620 PMIC that can be found on the P2180[2] and other Tegra X1-based boards (such as the Google Pixel C, a.k.a. Smaug). The MFD driver as well as the regulator drivers have been merged in time for v4.7, whereas a couple of others, such as the pinctrl, GPIO and RTC drivers have been merged for v4.8.

This work allows us to make good progress for the next release, because a lot of the hardware needs the regulators and GPIOs exposed by these drivers for power. Patches are being worked on to enable DSI, HDMI, XUSB and GPU on the Jetson TX1.

Most 32-bit and 64-bit ARM Tegra boards now use the stdout-path property in device tree, which causes the right UART to be chosen for the debug console, so the old console kernel command-line parameter no longer needs to be passed explicitly.
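For reference, such a chosen node might look like this (a sketch; the serial alias and baud rate are board-specific assumptions):

```dts
/ {
	chosen {
		/* pick the UART behind the "serial0" alias as debug console */
		stdout-path = "serial0:115200n8";
	};
};
```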

Support was added for the Google Pixel C (a.k.a. Smaug), though it isn't very complete yet. It's enough to boot to a login prompt from a root filesystem on the eMMC. Further support was blocked on the MAX77620 PMIC patches, but since those have been merged, more features can now be enabled on Smaug for v4.8 and later.

  1. We're actually close to v4.7-rc5 at the time of this writing, but I had originally planned to finish this up much earlier.

  2. The P2180 is also known as the Jetson TX1 compute module. I guess it is technically called simply Jetson TX1, but everybody uses that as the short name for the Jetson TX1 Development Kit, which is a P2180 processor module on a P2597 I/O board.

Friday, April 22, 2016

Display panels are not special

Why are video timings not specified in device tree?

This is a question that arises every once in a while regarding display panel device tree bindings. This post tries to provide a complete rationale for why this is, so that I will hopefully never have to answer that question again.

First, let's see how a panel is typically described in device tree:
 panel: panel {  
   compatible = "auo,b133xtn01";  
   backlight = <&backlight>;  
   ddc-i2c-bus = <&dpaux>;  
 };  
The node name is panel and it has an associated label ("panel:") that consumers can use to easily reference it. The first property defines the compatible string. It identifies both the binding that the device tree node adheres to and also provides the primary means of identifying what kind of device it is.

What binding the device tree node adheres to is important because it defines the other properties, required and optional, that are relevant. In this case the panel can have a backlight associated with it (via a phandle reference) and a DDC bus (via another phandle reference) that is typically used to query the panel's EDID[1].

Given that compatible strings are required to uniquely identify a device (via a vendor prefix separated from the model by a comma), they imply a great deal of information about the device. For IP blocks they imply the register layout and programming model. They also imply many static properties of the device, such as internal delays or constraints imposed on external signals.

Extending this to display panels it is natural to conclude that video timings are implied by the compatible string. Similarly, properties such as the physical dimensions of the active display area, type of video bus or pixel format are all implicitly defined by the compatible string.

But panels are special!

For some reason that evades me, people keep thinking that display panels are somehow special.

They really aren't.

Very early on in the process of defining the device tree bindings for panels people suggested that we come up with a generic binding that would allow us to fully describe all panels. But it became clear very quickly that this was never going to work.

I suspect one of the reasons why people think panels are special is that they think video timings (and perhaps physical dimensions) are the only relevant properties of a panel. However, once you go into the details, there are quite a few other very important characteristics.

It didn't take very long before panels were encountered that required a number of external resources (such as enable GPIO, reset GPIO, power supply regulator and so forth) to be controlled in a very specific way in order to properly turn a panel on and off. The first attempt at a solution was a debacle. The idea was to introduce a generic binding to describe power sequences in device tree and attach these power sequences to devices. The curious reader can find the discussion in public mailing list archives, but the gist of it is that the implementation turned into a monster of a scripting language within device tree that nobody wanted to have in the end.

Once you decide that the power sequences are implied by the compatible string this problem solves itself because you can simply write a driver that binds to the compatible string and provides simple C code to request the resources and control them in exactly the right way to match the sequences as specified in the datasheet.

Complex display panels

The matter becomes even more important when you start using complex display panels such as those on a DSI bus. It's fairly common for these display panels to require register level programming to get into a functional state. There are standards (such as DCS) that try to make this somewhat uniform, but the reality is that most panels require vendor- and model-specific register programming before they start displaying anything.

If you wanted to support such panels in a fully generic device tree binding you'd need to add support for DSI register programming to your power sequencing script language. You don't want to go there.

Dumb display panels

Many panels, fortunately, are fairly dumb and require a small set of resources with relatively simple power on and power off sequences. The Linux kernel implements a simple-panel driver that can support a large number of these simple panels with a single device tree binding.

Bloat versus duplication

People were, and still are, concerned about the kernel becoming bloated by the large number of video timings within the kernel. I don't think that's turned out to be true. The simple-panel driver supports 46 panels (in v4.6-rc4) and the resulting loadable kernel module is ~37 KiB with a .rodata section of ~20 KiB. In my opinion that's not very much. If people cared enough they could always go and add Kconfig symbols for all the simple panels in order to allow the size to be reduced.

While it is certainly true that this data wouldn't be shipped with every kernel image if the data was contained within device tree, doing so comes at its own cost: duplication. As argued above, the compatible string implies the video timings for a display panel. It follows that specifying the video timings in device tree is duplicating information. Even more so if you happen to have two boards which use the same display panel.

With the current model it is fairly trivial to add board support. Provided that your panel is already supported it is a simple matter of adding the node with the given compatible string and hooking up the board-specific resources (GPIOs, regulators, ...). If, however, the video timings, physical dimensions and power sequences had to be defined in device tree, everybody would have to tediously copy/paste all that data from existing boards, or resort to reading the datasheet in order to find out the information.
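To illustrate, a hypothetical board reusing the panel from the earlier snippet might only need something like this (the GPIO and regulator phandles are placeholders for whatever the board actually provides):

```dts
panel: panel {
	compatible = "auo,b133xtn01";

	/* board-specific resources; these names are made up */
	enable-gpios = <&gpio 28 GPIO_ACTIVE_HIGH>;
	power-supply = <&vdd_3v3_panel>;
	backlight = <&backlight>;
};
```

Everything else (timings, dimensions, bus type) comes with the compatible string.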

  1. Ironically, the EDID is where, among other things, the panel's video timings are stored. If an EDID is found, the DRM panel framework will of course use those timings and fall back to the built-ins otherwise.

Wednesday, March 30, 2016

Build-testing for the Linux kernel DRM subsystem

Following up on this earlier post, I'll briefly detail how I run build tests on the Linux kernel DRM subsystem.

I maintain the scripts and build configurations in a git repository:


The build script takes a number of arguments, most of which should be familiar. You can get a list of them by running the scripts with the -h or --help option. Here's how I typically run the script:

 $ drm/build --jobs 13 --color  

This instructs the script to have make start 13 jobs (that's the number of cores * 2 + 1, feel free to use whatever you think is a good number) and colorize the output. I find colorized output convenient because it will mark failed builds with a bright red status report.

By default the script will derive the build directory name from the output of git describe and prepend build/ as well as append /drm. Each configuration will be built in a subdirectory (with the same name as the configuration) of that build directory. This should give you a fairly unique name, so no need to worry about losing any existing builds.
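In other words, the naming boils down to something like this (a sketch, with a hypothetical helper standing in for the script's internals):

```shell
# compute the per-configuration build directory: build/<git describe>/drm/<config>
builddir_for() {
	echo "build/$1/drm/$2"
}

# the first argument would normally be "$(git describe)"
builddir_for v4.7-rc1 x86-64
```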

In addition to command-line options, the script can take a list of configurations to build via positional arguments.

Once done, each configuration's build directory contains a build.log file as well as an error.log file. The contents of error.log are included in the build.log file. In addition the script will output a single line per build, along with a status report of the build result.


There's another script in the above-mentioned repository: coverage. You can run it after a run of the build script to check for compile coverage. This is fairly simplistic, since it checks only on a per-file basis, not how much code in the file actually got compiled. It'll report any source files that aren't compiled in the drivers/gpu/drm directory and can be a quick indicator of whether or not the configurations are adequate.


Most of the above can be achieved by simply building x86 defconfig and ARM multi_v7_defconfig configurations. However there are drivers that aren't covered by those configurations (drm/vc4 needs ARM v6, EDIT: this is no longer true, ARCH_BCM2835 which DRM_VC4 depends on is now available with ARM v7 and hence multi_v7_defconfig) and the default configurations usually take much longer to build than the minimal configurations contained in the repository. Also the included configurations build 32-bit and 64-bit variants, which occasionally catches inconsistent type usage (size_t vs. unsigned long/int, ...).

Compile-testing for the Linux kernel subsystems

One of the most common pitfalls when contributing to Linux kernel development is that if you make subsystem-wide changes, it can become difficult to ensure everything still builds fine. Chances are that if the subsystem is moderately big you won't be able to build all drivers on a single architecture or configuration. Contributors may not always care about test-building all drivers, but at least maintainers will have to, otherwise chances are that they'll make life miserable for others.


The Linux kernel configuration system provides some assistance to maintainers through the COMPILE_TEST Kconfig symbol. It can be used as a catch-all dependency to override architectural dependencies of drivers. This allows all drivers marked with this dependency to be built on all architectures (unless excluded via other dependency chains), which means that you can compile-test the code without resorting to cross-compilation or multiple configurations.
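In Kconfig terms this is simply an alternative in a driver's dependency expression; for a hypothetical driver it might look like:

```
config DRM_FOO
	tristate "Foo display driver"
	# normally restricted to its SoC, but buildable everywhere for testing
	depends on ARCH_FOO || COMPILE_TEST
	depends on DRM
```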

If used without care it can be surprisingly painful, though. The problem with the COMPILE_TEST symbol is that drivers are exposed on architectures and configurations that they normally aren't compile-tested on. Build breakage is a common result of adding COMPILE_TEST dependencies without careful consideration. The reason is primarily that the architectural dependencies imply the existence of a specific API that may not be supported on all architectures. Builds on some of the more exotic architectures will trigger this kind of failure, though such failures have also been known to happen on fairly common configurations.

Perhaps the most popular solution to this is to provide dummy implementations of an API that will allow compilation to succeed on configurations where it isn't available. This cuts both ways, though, because it also means that you get to successfully build a kernel image that includes drivers which will use the dummy implementation and therefore be completely dysfunctional. There are legitimate uses for that, but they are very rare.


A more rigorous approach is to use cross-compilation toolchains and build multiple configurations that will ensure full coverage without resorting to fake dependencies.

Using toolchains, scripts and configuration that I've written about previously, I perform quick sanity builds for a number of architectures and configurations for several subsystems on a regular basis. Often I will run them on the latest linux-next tree or before pushing code to a public repository.

Maintaining configurations

Sanity builds often rely on allmodconfig configurations, which enable all drivers on a particular architecture. This is useful because it always gives you maximum coverage. The downside is that such builds take a very long time to complete. If all you want to do is compile a set of drivers (i.e. all those in a particular subsystem), you can get done much faster.

However, you wouldn't want to keep various .config files around, because they can become a nightmare to maintain as kernel development progresses. I've been using a method that takes advantage of some of Kconfig's functionality to create a set of minimal configurations that will provide maximum build coverage.

The idea is to create a sort of script for each configuration that will gradually tune the .config file. Each script starts out by specifying the architecture and will then typically use allnoconfig as a starting point. Individual symbols can then be enabled as needed. Finally all the changes will be applied and additional dependencies resolved using an olddefconfig run before the configuration is built. The script language is easily extensible, we'll see shortly why that is, but these are the most common commands:
  • include: includes another script
  • kconfig: implements a Kconfig frontend with additional subcommands:
    • architecture: selects the architecture to build
    • allnoconfig, olddefconfig, ...: this is really a wildcard on *config that will run the given configuration target using the Linux kernel's makefile
    • enable: enable a Kconfig symbol
    • module: enable a Kconfig symbol as module
    • disable: disable a Kconfig symbol
  • kbuild: build the configuration
An example configuration might look like this:

kconfig architecture x86  
kconfig allnoconfig  
# basic configuration  
kconfig enable DEBUG_FS  
kconfig enable SYSFS  
# for PWM_CRC  
kconfig enable GPIOLIB  
kconfig enable I2C  
kconfig enable INTEL_SOC_PMIC  
# for PWM_LPSS_PCI  
kconfig enable PCI  
kconfig enable ACPI  
# PWM drivers  
kconfig enable PWM  
kconfig enable PWM_CRC  
kconfig enable PWM_LPSS  
kconfig enable PWM_LPSS_PCI  
kconfig enable PWM_LPSS_PLATFORM  
kconfig enable PWM_PCA9685  
# PWM users  
include users.include  
kconfig enable DRM  
kconfig enable DRM_I915  
kconfig enable MFD_INTEL_SOC_PMIC  
kconfig olddefconfig  

This is one of the configurations I use to compile-test the PWM subsystem. As you can see, this selects the x86 architecture and starts off with an allnoconfig. It then enables some basic options such as DEBUG_FS and SYSFS because they enable optional code. What follows are some sections that enable dependencies for various drivers, the driver options themselves and a set of users. Note how this includes users.include, a file that contains a set of options that enables users of the PWM API and which is shared with various other configurations. Finally the olddefconfig command will resolve all dependencies and generate the final .config file which is used by the kbuild command to build the kernel image.

An interesting implementation detail is that this script is really a shell script. This has a number of advantages:
  • comments are automatically parsed and discarded by the shell
  • commands can be implemented simply by shell functions
  • the include command is trivial to implement in terms of the source builtin
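A minimal sketch of that implementation, assuming the enable/module/disable subcommands collect their settings into a configuration fragment (the real scripts ultimately drive the kernel's Makefile; the file name here is made up):

```shell
#!/bin/sh
# fragment that the kconfig subcommands below accumulate settings into
fragment=config.fragment

include() {
	# scripts are shell scripts, so including one is just sourcing it
	. "$1"
}

kconfig() {
	cmd=$1; shift

	case "$cmd" in
	architecture) export ARCH="$1" ;;
	enable)  echo "CONFIG_$1=y" >> "$fragment" ;;
	module)  echo "CONFIG_$1=m" >> "$fragment" ;;
	disable) echo "# CONFIG_$1 is not set" >> "$fragment" ;;
	*)       # wildcard on *config: run the matching Makefile target
		 make ARCH="$ARCH" "$cmd" ;;
	esac
}
```

With definitions along these lines, a configuration like the PWM example above runs unmodified as a shell script.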
There are various example scripts that make use of this:
Configuration scripts are available in the configs subdirectories of the respective parent directories. If you look at the code you'll note that there's potential for refactoring. Each of the scripts, at the time of this writing, duplicates the implementation of the configuration script commands. I'm likely going to factor this out into a common script library that can be shared.

Wednesday, March 23, 2016

Dealing with multiple cross-compilation toolchains

I've written previously about how I use a set of scripts to build cross-compilation toolchains. Managing a multitude of toolchains can be quite painful, because you need to remember the host triplets (or quadruplets) for each of them along with their install location. The latter isn't really an issue if you have them installed in the same directory, but if you rely on prebuilt toolchains that is often not possible.

Being a Linux kernel maintainer, I spend most of my time cross-compiling Linux kernels, so it's natural to write scripts to automate as much as possible. Over the years I've accumulated a number of scripts for various tasks. For example, I have a script that builds the Linux kernel with several default configurations for sanity checking. I often run it on linux-next for baseline results; if the build fails on any of those configurations, I usually don't bother rebasing any of my development trees on top.

I also have a set of scripts to assist with building more specific configurations, such as for testing Tegra in particular, or build all source files that are related to the PWM subsystem along with a script that provides some measure of coverage.

Each of these scripts needs some code that sets up the cross-compilation toolchain for a particular build. That code is always the same, so after a while I started thinking about how to reuse it. I used to have large case statements in the scripts that selected the right host triplet/quadruplet based on the architecture and added the directory containing the toolchain binaries to the PATH. Copy/pasting that into new scripts became tedious, and maintaining a consistent set of mappings across all scripts was a nightmare.

The solution I came up with is rather simple. I now configure the cross-compilation toolchains using a .cross-compile file in my home directory. It consists of key-value pairs where the key is either path or the name of an architecture such as arm, arm64, mips, x86 or x86_64. Keys are separated from values by a colon. path entries add the value to the PATH. Architecture entries specify the host triplet/quadruplet to be used for a specific architecture.

Here's the ~/.cross-compile file that I use:
path: $HOME/pbs-stage1/bin:$HOME/toolchain/avr32/bin:$HOME/toolchain/unicore32/bin
arm: armv7l-unknown-linux-gnueabihf-
arm64: aarch64-unknown-linux-gnu-
avr32: avr32-
blackfin: bfin-unknown-elf-
mips: mips-linux-gnu-
unicore32: unicore32-linux-

I also have a small shell script library that I can easily include in scripts which will parse the file and set up the PATH, ARCH and CROSS_COMPILE environment variables. That's kind of tailored to the Linux kernel build system, unsurprisingly, given where this comes from. It's fairly easy to reuse this in other contexts, such as autotools, though.
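As a sketch of what the parsing amounts to (assumptions: only the two entry types described above, with $HOME references in path entries expanded via eval):

```shell
#!/bin/sh
# parse a .cross-compile file and export PATH, ARCH and CROSS_COMPILE
# for the given architecture
setup_cross_compile() {
	arch=$1 config=${2:-$HOME/.cross-compile}

	while IFS=': ' read -r key value; do
		case "$key" in
		path)     # path entries are prepended to the PATH
			  PATH="$(eval echo "$value"):$PATH" ;;
		"$arch")  # architecture entries name the host triplet prefix
			  CROSS_COMPILE=$value ;;
		esac
	done < "$config"

	export PATH ARCH="$arch" CROSS_COMPILE
}
```

After something like setup_cross_compile arm, a kernel build is just a plain make invocation, since ARCH and CROSS_COMPILE are picked up from the environment.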

The shell library can be found here:
An example script that makes use of it can be found here:

Monday, March 21, 2016

pbs-stage1: Building toolchains for cross-compilation


Cross-compiling is the process of compiling code on one (build) machine to be run on another (host) machine. Often the host machine will use a different architecture than the build machine. The process is mostly the same as regular compilation, with the notable exception that the code that is built cannot be run on the build machine. This can be significant, and excruciating, under certain circumstances, but that's a story that will be told another time.


Toolchain is a rather loosely defined term, but it typically refers to the collection of utilities required to build software. Depending on the software this set can vary widely. For the purposes of this post I'll limit the set to a C compiler, an assembler and a linker. In the GNU/Linux world this is provided by the GCC and GNU binutils projects. For bare-metal software those will typically be enough, but most developers will at least want to be able to run software under a Linux operating system, so we'll need to include a C runtime library (such as the GNU C library, a.k.a. glibc) and Linux kernel headers in our set as well. There are a couple of dependencies that the above (primarily GCC) require, so those go into the set as well. If we want anything fancy we may even consider adding things like m4, autoconf, automake and libtool, or even pkgconfig and quilt. For debugging support, let's also add the GNU debugger (gdb) to the mix.

As you can see, it takes quite a lot of software just to build software. It's fairly complicated to build all these pieces and make them work well together. There are a couple of projects that do this already:
Mostly out of curiosity I started looking into how to build a cross-compilation toolchain many moons ago. Some of the above tools didn't exist at the time, and those that did exist didn't quite do what I wanted them to. I could of course have done the right thing and improve one of the existing solutions, but instead I went and did something new. Because I thought I could do better. Also doing so was very instructive, and that's what it's really all about, right?

Both crosstool-ng and buildroot work perfectly for a large number of people, so I definitely recommend looking at those if you're looking for ways to build cross-compilation toolchains. crosstool has become somewhat outdated, so you'll probably have less luck with it.

Some distributions also provide cross-compilation toolchain packages. Debian has quite a few, while others often only provide toolchains targeting the ARM architecture. It's also quite common for distributions to ship only bare-metal toolchains, with which developing against a Linux-based system doesn't work. There are also some other well-known toolchain providers, such as CodeSourcery, DENX (ELDK) and Linaro.


The name is a historic relic. A long time ago (or at least it now seems so) my day job involved building custom distributions for embedded systems. Again, a number of alternatives existed (and still do) to do the same thing. And again, curiosity was what made me want to roll my own. Back at the time, the system to build these distributions was called PBS (platform build system). As the name indicates, these weren't the kinds of generic distributions, but they were highly configurable and tailored to a specific (hardware) platform.

Initially, PBS included a toolchain builder, but that turned out not to scale very well, because it essentially meant that you needed to build a toolchain every time you started a new platform, even if both used the same architecture or even the same chip. To get rid of that duplication, the project was split into stages: the toolchain builder (stage 1) and the platform builder (stage 2).

pbs-stage1 uses a fairly simple build system. It has a top-level makefile that will include a toolchain host definition and build all required packages in the right order, using make dependencies. Some of the components of the toolchain are architecture-independent (libtool, autoconf, automake, pkgconfig, ...) and the scripts will install stamp files in the installation directory to prevent those from being built multiple times. Effectively only the binutils, gcc, Linux kernel header files, glibc and gdb will be built per host definition.

By default, toolchains are installed into a pbs-stage1 subdirectory of your home directory. This can be changed by overriding the PREFIX variable on the command-line of the make invocation.

Using the toolchain is straightforward once you've built and installed it. Simply add the $HOME/pbs-stage1/bin directory to your PATH and you're good to go.
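For example (assuming the default install location and the ARM host definition from these toolchains):

```shell
# make the freshly installed toolchains available
export PATH="$HOME/pbs-stage1/bin:$PATH"

# after that, a cross-build is just a matter of, e.g.:
#   make ARCH=arm CROSS_COMPILE=armv7l-unknown-linux-gnueabihf- zImage
```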


You can find pbs-stage1 in a repository on github:
The repository contains a README file that explains briefly how to build a toolchain. There are a bunch of host definitions included, though some of the more exotic ones aren't guaranteed to build or work.

git-topic: workflow based on topic branches

Topic Branches

The work that I do requires me to contribute to a number of different projects, often across various areas within these projects. Managing patches can become a tedious task. A common workflow for dealing with the patch load is called topic branches.

Topic branches allow you to organize patches into sets that deal with separate... well... topics. A topic is usually a particular new feature you're working on, or it might be a set of patches that cleanup a particular usage pattern. At other times you might want to fix a bug, but doing so requires a set of preparatory patches that need to be applied in a specific sequence. Keeping the set of patches in a separate topic branch makes it easy to ensure the required order. Often a topic branch will result in a patch series that you send to the project's development mailing list.

A bit of background

As an example, as part of my work as a Linux kernel maintainer for Tegra, I work across various subsystems such as graphics, clocks, power management and USB, among others. A lot of the time I don't write patches myself but rather pull in work done by others. Often my own work and that by others isn't quite ready for inclusion into linux-next or mainline, but I keep a development/integration tree that can be used to test what I like to think of as the "bleeding edge".

The idea is that while developers are busy working with subsystem maintainers to get their patches included into mainline or linux-next, users (myself included) can get a peek at the work in progress. This is especially important given the long time it often takes to get patches merged upstream. New hardware can be crippled for the first couple of months after its release by the lack of support in mainline.

A development/integration tree gives people something that they can boot on their hardware and make it work. It also gives people a good indication of what's being worked on and what isn't, so it can help to reduce duplication of work. Furthermore, some developers' work might be blocked by the lack of a specific feature. An integration tree can help them get a head-start.

Topic branches are a good way to model such an integration tree. Individual patch sets can go into separate topic branches. One could of course apply all sets to a single branch, but that would become rather messy very quickly. Topic branches help order patches in a natural way. Integrating all the work into a working tree is often simply a matter of merging together all of the topic branches. Conflicts will result sometimes if two or more of the branches modify the same files (device tree, default configurations, ...), but those are usually trivial to resolve.

Of course one of the dangers of a development/integration tree is that it can devolve into a vendor tree. To prevent that, every contributor needs to make sure that the work will eventually get merged upstream. One additional measure to encourage the flow upstream is to keep rebasing the entire tree. The rebase intervals really depend on the project and the amount of work you're willing to invest. This could be releases, or release candidates. For the Linux kernel I've found linux-next to be a good choice. You'll see later on why it is a good choice.

One difficulty is that if you keep sufficiently many of these topic branches, it can become very tedious to continually merge them together. And if you want to rebase the tree fairly frequently, the amount of time spent running git commands can become significant.

Development/integration tree structure

First, let's take a look at how a development/integration tree can be structured.

The whole tree is built on a common base. That is, each of the topic branches starts with the same commit. This isn't strictly necessary because git is really good when it comes to merging. But having a single base is very convenient for scripting. Starting from the base commit we can create any number of topic branches. A fairly common branch that I happened to create in many projects is called "fixes". This contains cleanups, build fixes and other, mostly trivial, patches. For the Linux kernel I keep other branches, often named after a particular driver or subsystem (pci, xhci, clk, drm/tegra, drm/panel, ...).

All of these branches get merged into a master branch. On top of this master branch, I like to have a work branch that contains various bits that are work in progress, code that's not quite finalized yet, or that I haven't categorized yet.

To keep the repository clean I like to have a common namespace for this tree. All branches get a staging/ prefix. This gives us the following tree-like structure:
  • staging/base
    • staging/clk
    • staging/pci
      • staging/master
        • staging/work
Rebasing the tree onto a new base involves the following steps:
  1. rebase all topic branches onto the new base
  2. make staging/base point at the new base
  3. merge all topic branches into staging/master
  4. rebase staging/work onto staging/master
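The four steps above can be sketched as a small shell function. This is an illustration of the procedure, not the actual git-topic implementation; the retopic name and argument handling are made up, and conflict handling is ignored:

```shell
# Sketch of the four rebase steps; usage: retopic <new-base> <topic>...
retopic() {
    new_base=$1; shift

    for topic in "$@"; do                          # 1. rebase all topic branches
        git rebase --onto "$new_base" staging/base "staging/$topic"
    done
    old_master=$(git rev-parse staging/master)
    git branch -f staging/base "$new_base"         # 2. move the base
    git checkout -q -B staging/master staging/base # 3. remerge the topics
    for topic in "$@"; do
        git merge -q --no-edit "staging/$topic"
    done
    # 4. replay staging/work from the old merge result onto the new one
    git rebase --onto staging/master "$old_master" staging/work
}
```

Note that step 4 needs the pre-rebase staging/master as the upstream argument, which is why the function records it before moving anything.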
Publishing the development/integration tree is merely a matter of pushing all the branches to a remote.
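Thanks to the common staging/ prefix, a single wildcard refspec covers the whole tree. A sketch, again in a throwaway repository; the "publish" remote is a stand-in, and --force is needed because the branches get rewritten on every rebase:

```shell
# Sketch: publish all staging/ branches with one wildcard refspec push.
set -e
cd "$(mktemp -d)"
git init -q --bare publish.git
git init -q tree && cd tree
git -c user.name=t -c user.email=t@example.com \
    commit -q --allow-empty -m base
git branch staging/base
git branch staging/master
git remote add publish ../publish.git

# the actual push: one refspec matches every staging/ branch
git push -q --force publish 'refs/heads/staging/*:refs/heads/staging/*'
```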

Using a frequently updated upstream as a base might be a bit of work, but it comes with a few advantages. For one, you get to run your latest code on top of other people's latest code (truly bleeding edge). Contrast that with most vendor trees, where the vendor's latest code runs on top of some very old baseline. Running on top of other people's latest code means you'll quickly notice when an API changes and breaks the code in one of your topic branches. You'll be forced to update patches to the latest changes upstream, which prevents the patches from becoming stale. Also, by basing on top of bleeding-edge code you'll notice any runtime fallout early on, so you can take measures to fix things (in your code or upstream's) before anything even hits mainline or linux-next. Finally, a nice side effect is that once features or patch sets get merged upstream, the corresponding topic branches are automatically flushed during the rebase. "Automatically" might sometimes involve skipping patches manually (git rebase --skip) when git can't figure it out on its own.

git-topic to the rescue

As with many repetitive tasks, scripting is our friend. git is very easy to extend: any script named git-foo and available on the PATH can be run like any git sub-command using git foo. It is also fairly trivial to enable command-line completion using bash (or other shells).
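As a minimal illustration of the mechanism (the git-hello command name is made up):

```shell
# Sketch: any executable named git-<name> on the PATH becomes "git <name>".
set -e
cd "$(mktemp -d)"
cat > git-hello <<'EOF'
#!/bin/sh
echo "hello from a custom sub-command"
EOF
chmod +x git-hello
PATH="$PWD:$PATH" git hello    # prints: hello from a custom sub-command
```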

To make it easier for me to maintain my own development trees, I wrote a script that automates most of this work for me:

git-topic is a custom sub-command that automates the maintenance of a development/integration tree. git-topic keeps a list of topic branches in a file stored in a separate orphan branch (staging/branches). An orphan branch is one that has no common ancestor with the master branch. Keeping the list of branches in a separate branch makes it possible to version the branch list just like any source file. It also becomes trivial to share the topic branch information with others, because the staging/branches branch will be pushed to remotes along with all the other branches. The format of the file is trivial: one branch name (not including the staging/ prefix) per line; lines starting with a # are comments and will be ignored.
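Such an orphan branch can also be created by hand. A sketch, in a throwaway repository; the file contents are an assumption based on the format described above:

```shell
# Sketch: keep the topic-branch list in a hand-made orphan branch.
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name t && git config user.email t@example.com
git commit -q --allow-empty -m base

git checkout -q --orphan staging/branches   # no common ancestor with master
git rm -r -f -q --ignore-unmatch .          # start from an empty tree
cat > branches <<'EOF'
# one topic branch per line, without the staging/ prefix
fixes
clk
pci
EOF
git add branches
git commit -q -m 'Add topic branch list'
```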

The following explains how to set up a tree for use with git-topic and goes over the commands used in day-to-day work.

Initializing a tree

The init sub-command initializes the tree using a given base commit:
$ git topic init next/master
This sets up staging/base and staging/master branches that point at next/master (I have next set up as a remote to track linux-next, and master always points at the latest release).
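Under the hood this presumably boils down to little more than creating the two branches; a sketch of what the init step amounts to, not the actual implementation:

```shell
# Sketch (assumption): roughly what "git topic init next/master" sets up.
set -e
cd "$(mktemp -d)"
git init -q .
git -c user.name=t -c user.email=t@example.com \
    commit -q --allow-empty -m base
git branch next/master          # stand-in for the real linux-next remote branch

git branch -f staging/base next/master
git branch -f staging/master next/master
```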

You can use the branches sub-command to edit the list of branches. This uses the EDITOR environment variable to open the branch list file in your favorite editor.
$ git topic branches
Creating branches isn't done automatically; you'll have to do that yourself:
$ git checkout -b staging/foo staging/base

Maintaining the tree 

Once you have all your branches set up, run the rebase sub-command to have everything merged into staging/master and staging/work rebased on top.
$ git topic rebase
If not passed a new base, the rebase sub-command will use staging/base and skip rebasing the individual topic branches.

Whenever you want to rebase the tree, it's as simple as running:
$ git fetch next
$ git topic rebase next/master
This fetches the next remote and then rebases all of the topic branches onto next/master, merges them together and rebases staging/work on top. If next/master hasn't changed, git-topic will notice and skip the rebase of the individual branches.

If git encounters any conflicts during the rebase or merge steps, it will drop you to a shell and let you resolve things. This is one of the areas that could use some work, because it's not entirely trivial to know how to continue after resolving:
  • If rebasing a topic branch fails, you'll be dropped to a shell by git first. Resolve the conflict and git rebase --continue. After the rebase finishes, git-topic will drop you to a shell again so you can check that all is well. If so, simply exit the sub-shell to continue with the next topic branch.
  • Similarly, if the merge fails at some point, resolve the conflict and commit the resolution, just like you would with any other merge. If everything else looks good at that point, exit the sub-shell to continue the merge.
  • The final rebase of staging/work behaves exactly like the intermediate ones: if a conflict is encountered, resolve it and git rebase --continue. You'll still be dropped to a shell by git-topic after staging/work has been rebased, so make sure to exit that sub-shell, otherwise git-topic won't be able to clean up.
You're done when you see this line:
$ git topic rebase next/master
git-topic: rebase: all done

Publishing the tree

Making the tree available to others is as simple as running the push sub-command with the target remote:
$ git topic push github