Thursday, January 26, 2017




I recently came across editorconfig while working on patches for Mesa. One thing I had noticed was that the file I was modifying was using inconsistent indentation (tabs vs. spaces) and it just happened to be in the lines that I modified, so I wanted to first make it consistent and then make the changes.

When I posted the patches I also included a vim modeline that instructs vim to use specific settings for that particular file. That approach has, of course, several disadvantages:

  1. It only works for vim, or editors that understand that modeline format
  2. Each file requires a duplicate of the modeline
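For illustration, such a vim modeline, placed in a comment near the top or bottom of a file, might look like this (the option values here are examples, not the ones from the actual patches):

```vim
/* vim: set ts=8 sw=3 expandtab: */
```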

There are various attempts to work around the second point using insecure vim options. And apparently there is a way to achieve this in Emacs. Given that I don't use Emacs, and evidently other Mesa developers don't either, that left me with no options.

The flaw in using these modelines was pointed out during review. There are already a couple of vim modelines in Mesa code, so it had seemed like the best option at first, but it's obviously not viable to add modelines to each and every file of a code base as huge as Mesa. Also, a solution like that would have to be duplicated for other types of modelines.

Enter editorconfig

An interesting hint was dropped during review, though. The Mesa git repository contains a file named .editorconfig in the top-level directory. This is in a format specified by the editorconfig project.

The format is very simple and well documented on the above website. In a nutshell it allows you to specify coding style options that are applied to a set of files selected by a glob pattern.
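As an illustration, a minimal .editorconfig file might look like the following. The globs and values here are made up for the example (Mesa's actual file differs):

```ini
# root = true stops the search for .editorconfig files
# in parent directories
root = true

# defaults for all files
[*]
charset = utf-8
insert_final_newline = true

# C sources and headers: indent with three spaces
[*.{c,h}]
indent_style = space
indent_size = 3
```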

Coding style is about more than just indentation, but indentation is certainly the easiest to leave up to the editor. Other aspects of the coding style are easy to adhere to when writing code (curly bracket placement, spaces around operators, ...). editorconfig focuses mainly on getting the indentation and whitespace usage right.

The best part is that editorconfig is supported by a number of editors. Some have built-in support and for all others it's likely that you can add support using one of the plugins that you can find on the editorconfig website.

For vim, the plugin to use is editorconfig-vim, which is available in many distributions. Debian has a package called vim-editorconfig, and it is also available to Arch Linux users via the editorconfig-vim package in the AUR.

Again, the editorconfig website has links to many, many more plugins, so go look there if you want to try it out on your favorite editor.

Spread the word

Inconsistent indentation is one of the more annoying parts of open source software development [1]. I was quite surprised that I hadn't heard about the editorconfig project before, after all it's been around for half a decade.

If you've never heard of it before either, please help spread the word [2].

The world needs more editorconfig.

[1]... and software development in general.
[2]Please also help spread the word if you've heard of it before.

Thursday, June 23, 2016

What's new for Tegra in Linux v4.7

With v4.7-rc1 out [1], it's time to look at what to expect from Tegra support in Linux v4.7.


XUSB is a USB host controller that supports USB 3.0. After loading it with a proprietary firmware blob, it will expose an xHCI compliant interface and do USB 3.0 on ports that support it, as well as high-speed and full-speed where super-speed is not available.

The XUSB driver has been under development for a ridiculously long time. One of the reasons is that it relies on the XUSB pad controller to configure its pins as required by the board design. The XUSB pad controller is very likely one of the least-intuitive pieces of hardware I've ever encountered, and the attempts to come up with a device tree binding to describe it have been very numerous. We did finally settle on something earlier this year and after the existing code was updated for the new binding, we're finally able to support super-speed USB on Tegra124 and later.

Most of the work on this was done by Google as part of the Nyan Chromebooks and the Pixel C.

Core SoC drivers

Jon Hunter cleaned up the PMC driver and fixed a number of issues in it. He then went on to drive to conclusion another long-standing patchset that had been prototyped by Vince Hsu and me with the goal to expose the Tegra power partitions as generic power domains, which will eventually allow us to get rid of a custom Tegra API.

Some of the other core SoC drivers, such as I2C and DMA, have seen a couple of updates by Laxman Dewangan, though they've been feature-complete for the longest time, so there isn't anything really new there.


Alex Courbot has been continuing to improve support for the Tegra GPUs in the Nouveau driver. After adding support for secure boot, required by the Maxwell generation of GPUs, in v4.6, work has been ongoing in other parts of the kernel to enable the GPU on Jetson TX1. We will hopefully see that happen in the v4.8 timeframe.


Not much happened on the Tegra DRM front because I was occupied with the XUSB driver. However, a lot of code for Tegra X1 had already been merged in earlier releases, so many of the prerequisites to enable display on Jetson TX1 are in place already.

The Jetson TX1 comes with an HDMI connector that supports HDMI 2.0. The board also has a display connector that can be used to connect a DSI or eDP panel. Most of the NVIDIA engineers have a default display board equipped with a DSI panel running at a 1200x1920 resolution.


Laxman Dewangan has also added support for the Maxim MAX77620 PMIC that can be found on the P2180[2] and other Tegra X1-based boards (such as the Google Pixel C, a.k.a. Smaug). The MFD driver as well as the regulator drivers have been merged in time for v4.7, whereas a couple of others, such as the pinctrl, GPIO and RTC drivers have been merged for v4.8.

This work allows us to make good progress for the next release, because a lot of the hardware needs the regulators and GPIOs exposed by these drivers for power. Patches are being worked on to enable DSI, HDMI, XUSB and GPU on the Jetson TX1.

Most 32-bit and 64-bit ARM Tegra boards now use the stdout-path property in device tree that will cause the right UART to be chosen for the debug console, so the old console kernel command-line parameter no longer needs to be passed explicitly.
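A hypothetical device tree fragment using this property could look like the following; the alias and UART label are made up for the example and not taken from an actual Tegra device tree:

```dts
/ {
	chosen {
		/* route kernel console output to the debug UART */
		stdout-path = "serial0:115200n8";
	};

	aliases {
		serial0 = &uartd;
	};
};
```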

Support was added for the Google Pixel C (a.k.a. Smaug), though it isn't very complete yet. It's enough to boot to a login prompt from a root filesystem on the eMMC. Further support was blocked on the MAX77620 PMIC patches, but since those have been merged, more features can now be enabled on Smaug for v4.8 and later.

  1. We're actually close to v4.7-rc5 at the time of this writing, but I had originally planned to finish this up much earlier.

  2. The P2180 is also known as the Jetson TX1 compute module. I guess it is technically called simply Jetson TX1, but everybody uses that as the short name for the Jetson TX1 Development Kit, which is a P2180 processor module on a P2597 I/O board.

Friday, April 22, 2016

Display panels are not special

Why are video timings not specified in device tree?

This is a question that arises every once in a while regarding display panel device tree bindings. This post tries to provide a complete rationale for why this is, so that I will hopefully never have to answer that question again.

First, let's see how a panel is typically described in device tree:
 panel: panel {
   compatible = "auo,b133xtn01";
   backlight = <&backlight>;
   ddc-i2c-bus = <&dpaux>;
 };
The node name is panel and it carries a label (panel, the identifier before the colon) that consumers can use to easily reference it. The first property defines the compatible string. It identifies both the binding that the device tree node adheres to and provides the primary means of identifying what kind of device it is.

What binding the device tree node adheres to is important because it defines the other properties, required and optional, that are relevant. In this case the panel can have a backlight associated with it (via a phandle reference) and a DDC bus (via another phandle reference) that is typically used to query the panel's EDID [1].

Given that compatible strings are required to uniquely identify a device (via a vendor prefix separated from the model by a comma), they imply a great deal of information about the device. For IP blocks they imply the register layout and programming model. They also imply many static properties of the device, such as internal delays or constraints imposed on external signals.

Extending this to display panels it is natural to conclude that video timings are implied by the compatible string. Similarly, properties such as the physical dimensions of the active display area, type of video bus or pixel format are all implicitly defined by the compatible string.

But panels are special!

For some reason that evades me, people keep thinking that display panels are somehow special.

They really aren't.

Very early on in the process of defining the device tree bindings for panels people suggested that we come up with a generic binding that would allow us to fully describe all panels. But it became clear very quickly that this was never going to work.

I suspect one of the reasons why people think panels are special is because they think that video timings (and perhaps physical dimensions) are the only relevant properties of a panel. However once you go into the details, there are quite a few very important other characteristics.

It didn't take very long before panels were encountered that required a number of external resources (such as an enable GPIO, a reset GPIO, a power supply regulator and so forth) to be controlled in a very specific way in order to properly turn a panel on and off. The first attempt at a solution was a debacle. The idea was to introduce a generic binding to describe power sequences in device tree and attach these power sequences to devices. The curious reader can find the discussion in public mailing list archives, but the gist of it is that the implementation turned into a monster of a scripting language within device tree that nobody wanted to have in the end.

Once you decide that the power sequences are implied by the compatible string this problem solves itself because you can simply write a driver that binds to the compatible string and provides simple C code to request the resources and control them in exactly the right way to match the sequences as specified in the datasheet.

Complex display panels

The matter becomes even more important when you start using complex display panels such as those on a DSI bus. It's fairly common for these display panels to require register level programming to get into a functional state. There are standards (such as DCS) that try to make this somewhat uniform, but the reality is that most panels require vendor- and model-specific register programming before they start displaying anything.

If you wanted to support such panels in a fully generic device tree binding you'd need to add support for DSI register programming to your power sequencing script language. You don't want to go there.

Dumb display panels

Many panels, fortunately, are fairly dumb and require a small set of resources with relatively simple power on and power off sequences. The Linux kernel implements a simple-panel driver that can support a large number of these simple panels with a single device tree binding.

Bloat versus duplication

People were, and still are, concerned about the kernel becoming bloated by carrying a large number of video timings. I don't think that's turned out to be true. The simple-panel driver supports 46 panels (in v4.6-rc4) and the resulting loadable kernel module is ~37 KiB with a .rodata section of ~20 KiB. In my opinion that's not very much. If people cared enough they could always go and add Kconfig symbols for all the simple panels in order to allow the size to be reduced.

While it is certainly true that this data wouldn't be shipped with every kernel image if the data was contained within device tree, doing so comes at its own cost: duplication. As argued above, the compatible string implies the video timings for a display panel. It follows that specifying the video timings in device tree is duplicating information. Even more so if you happen to have two boards which use the same display panel.

With the current model it is fairly trivial to add board support. Provided that your panel is already supported it is a simple matter of adding the node with the given compatible string and hooking up the board-specific resources (GPIOs, regulators, ...). If, however, the video timings, physical dimensions and power sequences had to be defined in device tree, everybody would have to tediously copy/paste all that data from existing boards, or resort to reading the datasheet in order to find out the information.

[1] Ironically, the EDID is where, among other things, the panel's video timings are stored. If an EDID is found, the DRM panel framework will of course use those timings and fall back to the built-in ones otherwise.

Wednesday, March 30, 2016

Build-testing for the Linux kernel DRM subsystem

Following up on this earlier post, I'll briefly detail how I run build tests on the Linux kernel DRM subsystem.

I maintain the scripts and build configurations in a git repository:


The build script takes a number of arguments, most of which should be familiar. You can get a list of them by running the script with the -h or --help option. Here's how I typically run it:

 $ drm/build --jobs 13 --color  

This instructs the script to have make start 13 jobs (that's the number of cores * 2 + 1, feel free to use whatever you think is a good number) and colorize the output. I find colorized output convenient because it will mark failed builds with a bright red status report.

By default the script will derive the build directory name from the output of git describe and prepend build/ as well as append /drm. Each configuration will be built in a subdirectory (with the same name as the configuration) of that build directory. This should give you a fairly unique name, so no need to worry about losing any existing builds.

In addition to command-line options, the script can take a list of configurations to build via positional arguments.

Once done, each configuration's build directory contains a build.log file as well as an error.log file. The contents of error.log are included in the build.log file. In addition the script will output a single line per build, along with a status report of the build result.


There's another script in the above-mentioned repository: coverage. You can run it after a run of the build script to check for compile coverage. This is fairly simplistic, since it checks only on a per-file basis, not how much code in the file actually got compiled. It'll report any source files that aren't compiled in the drivers/gpu/drm directory and can be a quick indicator of whether or not the configurations are adequate.
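The per-file check can be sketched in a few lines of shell. This is not the actual coverage script, just an illustration of the idea, under the assumption that the build directory mirrors the source tree:

```sh
#!/bin/sh
# For every .c file in the source tree, check whether the corresponding
# .o file exists in the build directory and report the ones that don't.
coverage() {
    src=$1
    build=$2

    find "$src" -name '*.c' | while read -r file; do
        # map drivers/gpu/drm/foo.c to <build>/drivers/gpu/drm/foo.o
        obj="$build/${file#$src/}"
        obj="${obj%.c}.o"
        [ -e "$obj" ] || echo "not compiled: $file"
    done
}
```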


Most of the above can be achieved by simply building the x86 defconfig and ARM multi_v7_defconfig configurations. However, there are drivers that aren't covered by those configurations (drm/vc4 needs ARM v6; EDIT: this is no longer true, since ARCH_BCM2835, which DRM_VC4 depends on, is now available with ARM v7 and hence multi_v7_defconfig), and the default configurations usually take much longer to build than the minimal configurations contained in the repository. Also, the included configurations build 32-bit and 64-bit variants, which occasionally catches inconsistent type usage (size_t vs. unsigned long/int, ...).

Compile-testing for the Linux kernel subsystems

One of the most common pitfalls when contributing to Linux kernel development is that if you make subsystem-wide changes, it can become difficult to ensure everything still builds fine. Chances are that if the subsystem is moderately big you won't be able to build all drivers on a single architecture or configuration. Contributors may not always care about test-building all drivers, but at least maintainers will have to, otherwise chances are that they'll make life miserable for others.


The Linux kernel configuration system provides some assistance to maintainers through the COMPILE_TEST Kconfig symbol. It can be used as a catch-all dependency to override architectural dependencies of drivers. This allows all drivers marked with this dependency to be built on all architectures (unless excluded via other dependency chains), which means that you can compile-test the code without resorting to cross-compilation or multiple configurations.
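In Kconfig terms this is typically expressed as an alternative dependency. The driver below is hypothetical, but the pattern is the common one:

```kconfig
config DRM_FOO
	tristate "Foo display controller"
	depends on ARCH_FOO || COMPILE_TEST
	depends on DRM
	help
	  Driver for the display controller found on Foo SoCs. The
	  COMPILE_TEST alternative allows the driver to be built on
	  other architectures for build coverage.
```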

If used without care it can be surprisingly painful, though. The problem with the COMPILE_TEST symbol is that drivers are exposed on architectures and configurations that they normally aren't compile-tested on. Build breakage is a common result of adding COMPILE_TEST dependencies without careful consideration. The reason is primarily that the architectural dependencies imply the existence of a specific API that may not be supported on all architectures. Builds on some of the more exotic architectures will trigger this kind of failure, though failures have also been known to happen on fairly common configurations.

Perhaps the most popular solution to this is to provide dummy implementations of an API that will allow compilation to succeed on configurations where it isn't available. This cuts both ways, though, because it also means that you get to successfully build a kernel image that includes drivers which will use the dummy implementation and therefore be completely dysfunctional. There are legitimate uses for that, but they are very rare.


A more rigorous approach is to use cross-compilation toolchains and build multiple configurations that will ensure full coverage without resorting to fake dependencies.

Using toolchains, scripts and configuration that I've written about previously, I perform quick sanity builds for a number of architectures and configurations for several subsystems on a regular basis. Often I will run them on the latest linux-next tree or before pushing code to a public repository.

Maintaining configurations

Sanity builds often rely on allmodconfig configurations, which enable all drivers on a particular architecture. This is useful because it always gives you the maximum coverage. The downside is that it will require a very long time for such builds to complete. If all you want to do is compile a set of drivers (i.e. all those in a particular subsystem) you can get done much faster.

However, you wouldn't want to keep various .config files around, because they can become a nightmare to maintain as kernel development progresses. I've been using a method that takes advantage of some of Kconfig's functionality to create a set of minimal configurations that will provide maximum build coverage.

The idea is to create a sort of script for each configuration that will gradually tune the .config file. Each script starts out by specifying the architecture and will then typically use allnoconfig as a starting point. Individual symbols can then be enabled as needed. Finally all the changes are applied and additional dependencies resolved using an olddefconfig run before the configuration is built. The script language is easily extensible (we'll see shortly why that is); these are the most common commands:
  • include: includes another script
  • kconfig: implements a Kconfig frontend with additional subcommands:
    • architecture: selects the architecture to build
    • allnoconfig, olddefconfig, ...: this is really a wildcard on *config that will run the given configuration target using the Linux kernel's makefile
    • enable: enable a Kconfig symbol
    • module: enable a Kconfig symbol as module
    • disable: disable a Kconfig symbol
  • kbuild: build the configuration
An example configuration might look like this:

kconfig architecture x86  
kconfig allnoconfig  
# basic configuration  
kconfig enable DEBUG_FS  
kconfig enable SYSFS  
# for PWM_CRC  
kconfig enable GPIOLIB  
kconfig enable I2C  
kconfig enable INTEL_SOC_PMIC  
# for PWM_LPSS_PCI  
kconfig enable PCI  
kconfig enable ACPI  
# PWM drivers  
kconfig enable PWM  
kconfig enable PWM_CRC  
kconfig enable PWM_LPSS  
kconfig enable PWM_LPSS_PCI  
kconfig enable PWM_LPSS_PLATFORM  
kconfig enable PWM_PCA9685  
# PWM users  
include users.include  
kconfig enable DRM  
kconfig enable DRM_I915  
kconfig enable MFD_INTEL_SOC_PMIC  
kconfig olddefconfig  

This is one of the configurations I use to compile-test the PWM subsystem. As you can see, this selects the x86 architecture and starts off with an allnoconfig. It then enables some basic options such as DEBUG_FS and SYSFS because they enable optional code. What follows are some sections that enable dependencies for various drivers, the driver options themselves and a set of users. Note how this includes users.include, a file that contains a set of options that enables users of the PWM API and which is shared with various other configurations. Finally the olddefconfig command will resolve all dependencies and generate the final .config file which is used by the kbuild command to build the kernel image.

An interesting implementation detail is that this script is really a shell script. This has a number of advantages:
  • comments are automatically parsed and discarded by the shell
  • commands can be implemented simply by shell functions
  • the include command is trivial to implement in terms of the source builtin
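A minimal version of such an interpreter could look like the following. This is a sketch of the idea, not the actual implementation: here the kconfig subcommands simply append to a .config file in the current directory, whereas the real commands would drive the kernel's configuration targets via make.

```sh
#!/bin/sh
# Configuration-script commands implemented as plain shell functions.
# Comments in the scripts are handled by the shell itself, and
# "include" is just a thin wrapper around the source (.) builtin.

include() {
    . "$1"
}

kconfig() {
    cmd=$1
    shift

    case "$cmd" in
        architecture)
            export ARCH=$1
            ;;
        enable)
            echo "CONFIG_$1=y" >> .config
            ;;
        module)
            echo "CONFIG_$1=m" >> .config
            ;;
        disable)
            echo "# CONFIG_$1 is not set" >> .config
            ;;
        *)
            # allnoconfig, olddefconfig, ...: forward the *config
            # target to the kernel makefile
            make "ARCH=$ARCH" "$cmd"
            ;;
    esac
}

kbuild() {
    make "ARCH=$ARCH" "$@"
}
```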
There are various example scripts that make use of this:
Configuration scripts are available in the configs subdirectories of the respective parent directories. If you look at the code you'll note that there's potential for refactoring. Each of the scripts, at the time of this writing, duplicates the implementation of the configuration script commands. I'm likely going to factor this out into a common script library that can be shared.

Wednesday, March 23, 2016

Dealing with multiple cross-compilation toolchains

I've written previously about how I use a set of scripts to build cross-compilation toolchains. Managing a multitude of toolchains can be quite painful, because you need to remember the host triplets (or quadruplets) for each of them along with their install location. The latter isn't really an issue if you have them installed in the same directory, but if you rely on prebuilt toolchains that is often not possible.

Being a Linux kernel maintainer, I spend most of my time cross-compiling Linux kernels. It's natural to write scripts to automate as much as possible. Over a couple of years I've accumulated a number of scripts for various tasks. For example I have a script that will build the Linux kernel with various default configurations for sanity checking. I often run it on linux-next for baseline results, which means that if I see the build fail on any of those configurations I usually don't bother rebasing any of my development trees on top.

I also have a set of scripts to assist with building more specific configurations, such as for testing Tegra in particular, or build all source files that are related to the PWM subsystem along with a script that provides some measure of coverage.

Each of these scripts needs some code that sets up the cross-compilation toolchain to use for a particular build. That code is always the same, so after a while I started thinking about how to reuse it. I used to have large case statements in the scripts to select the right host triplet/quadruplet based on the architecture and to add the directory containing the toolchain binaries to the PATH. Copy/pasting that to new scripts became tedious, and maintaining a consistent set of mappings across all scripts was a nightmare.

The solution I came up with is rather simple. I now configure the cross-compilation toolchains using a .cross-compile file in my home directory. It consists of key-value pairs where the key is either path or the name of an architecture such as arm, arm64, mips, x86 or x86_64. Keys are separated from values by a colon. path entries add the value to the PATH. Architecture entries specify the host triplet/quadruplet to be used for a specific architecture.

Here's the ~/.cross-compile file that I use:
path: $HOME/pbs-stage1/bin:$HOME/toolchain/avr32/bin:$HOME/toolchain/unicore32/bin
arm: armv7l-unknown-linux-gnueabihf-
arm64: aarch64-unknown-linux-gnu-
avr32: avr32-
blackfin: bfin-unknown-elf-
mips: mips-linux-gnu-
unicore32: unicore32-linux-

I also have a small shell script library that I can easily include in scripts which will parse the file and set up the PATH, ARCH and CROSS_COMPILE environment variables. That's kind of tailored to the Linux kernel build system, unsurprisingly, given where this comes from. It's fairly easy to reuse this in other contexts, such as autotools, though.
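A sketch of such a library might look like the following; the function name and the way path entries are expanded are assumptions for illustration, not the actual code:

```sh
#!/bin/sh
# Parse ~/.cross-compile style key-value pairs and export the PATH,
# ARCH and CROSS_COMPILE environment variables for the requested
# architecture.
cross_compile_setup() {
    arch=$1
    file=${2:-$HOME/.cross-compile}

    while IFS=': ' read -r key value; do
        case "$key" in
            path)
                # allow $HOME and friends in path entries
                PATH="$(eval echo "$value"):$PATH"
                ;;
            "$arch")
                CROSS_COMPILE=$value
                ;;
        esac
    done < "$file"

    ARCH=$arch
    export PATH ARCH CROSS_COMPILE
}
```

Sourcing this from a build script and calling cross_compile_setup arm is then enough to invoke make with the proper ARCH and CROSS_COMPILE settings.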

The shell library can be found here:
An example script that makes use of it can be found here:

Monday, March 21, 2016

pbs-stage1: Building toolchains for cross-compilation


Cross-compiling is the process of compiling code on one (build) machine to be run on another (host) machine. Often the host machine will use a different architecture than the build machine. The process is mostly the same as regular compilation, with the notable exception that the code that is built cannot be run on the build machine. This can be significant, and excruciating, under certain circumstances, but that's a story that will be told another time.


Toolchain is a rather loosely defined term, but it typically refers to the collection of utilities required to build software. Depending on the software this set can vary widely. For the purposes of this post I'll limit the set to a C compiler, an assembler and a linker. In the GNU/Linux world this is provided by the GCC and GNU binutils projects. For bare-metal software those will typically be enough, but most developers will at least want to be able to run software under a Linux operating system, so we'll need to include a C runtime library (such as the GNU C library, a.k.a. glibc) and Linux kernel headers in our set as well. There are a couple of dependencies that the above (primarily GCC) require, so those go into the set as well. If we want anything fancy we may even consider adding things like m4, autoconf, automake and libtool, or even pkgconfig and quilt. For debugging support, let's also add the GNU debugger (gdb) to the mix.

As you see there's a bunch of software involved to build software. It's fairly complicated to build all these pieces and make them work well together. There are a couple of projects that do this already:
Mostly out of curiosity I started looking into how to build a cross-compilation toolchain many moons ago. Some of the above tools didn't exist at the time, and those that did exist didn't quite do what I wanted them to. I could of course have done the right thing and improved one of the existing solutions, but instead I went and did something new. Because I thought I could do better. Also, doing so was very instructive, and that's what it's really all about, right?

Both crosstool-ng and buildroot work perfectly for a large number of people, so I definitely recommend looking at those if you're looking for ways to build cross-compilation toolchains. crosstool has become somewhat outdated, so you'll probably have less luck with it.

Some distributions also provide cross-compilation toolchain packages. Debian has quite a few, while others often only provide toolchains targeting the ARM architecture. It's also quite common for distributions to ship only bare-metal toolchains, in which case developing against a Linux-based system doesn't work. There are also some other well-known toolchain providers, such as CodeSourcery, DENX (ELDK) and Linaro.


The name is a historic relic. A long time ago (or at least it now seems so) my day job involved building custom distributions for embedded systems. Again, a number of alternatives existed (and still do) to do the same thing. And again, curiosity was what made me want to roll my own. Back at the time, the system to build these distributions was called PBS (platform build system). As the name indicates, these weren't generic distributions; they were highly configurable and tailored to a specific (hardware) platform.

Initially, PBS included a toolchain builder, but that turned out not to scale very well, because it essentially meant that you needed to build a toolchain every time you wanted to start a new platform, even if both used the same architecture or even the same chip. To get rid of that duplication, the project was split into stages: the toolchain builder (stage 1) and the platform builder (stage 2).

pbs-stage1 uses a fairly simple build system. It has a top-level makefile that will include a toolchain host definition and build all required packages in the right order, using make dependencies. Some of the components of the toolchain are architecture-independent (libtool, autoconf, automake, pkgconfig, ...) and the scripts will install stamp files in the installation directory to prevent those from being built multiple times. Effectively only the binutils, gcc, Linux kernel header files, glibc and gdb will be built per host definition.

By default, toolchains will be installed into a pbs-stage1 subdirectory of your home directory. This can be changed by overriding the PREFIX variable on the command line of the make invocation.

Using the toolchain is straightforward once you've built and installed it. Simply add the $HOME/pbs-stage1/bin directory to your PATH and you're good to go.


You can find pbs-stage1 in a repository on github:
The repository contains a README file that explains briefly how to build a toolchain. There are a bunch of host definitions included, though some of the more exotic ones aren't guaranteed to build or work.