Identify Yourself, Firmware!

July 29, 2019 | Jacob Lewallen

It all began with a common firmware header. This header sits at the beginning of every binary our build system produces and contains metadata about that particular binary: the timestamp of the build, the hash of the binary, the git commit of the source tree, the binary’s size and, critically, symbol information.
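As a rough picture of what a header like that might hold, here is a minimal C sketch. Only `signature`, `version`, and `size` appear in the real declaration later in this post. Every other field name, width, and ordering here, along with the `"FKB"` placeholder magic and the `example_` names, is an assumption for illustration and not the actual layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: names, widths, and order are illustrative only. */
typedef struct {
    char     signature[4];   /* magic string a bootloader can look for */
    uint32_t version;        /* header format version */
    uint32_t size;           /* size of the header itself, in bytes */
    uint32_t flags;
    uint32_t timestamp;      /* build time, seconds since the epoch */
    uint32_t binary_size;    /* size of the final binary, in bytes */
    uint32_t vtor_offset;    /* offset from the header to the vector table */
    uint8_t  hash[20];       /* hash of the binary (20 bytes fits a SHA-1) */
    char     commit[40];     /* git commit of the source tree, as hex */
    char     name[64];       /* human-readable build name */
} example_header_t;

/* Fixed-width fields with no padding keep the layout predictable for
 * every tool that has to read or rewrite the header. */
static_assert(sizeof(example_header_t) == 152, "unexpected padding");

/* Fill in the defaults that are known at compile time; everything else
 * stays zero until a later build step populates it. */
void example_header_init(example_header_t *header) {
    assert(header != NULL);
    memset(header, 0, sizeof(*header));
    memcpy(header->signature, "FKB", 4);  /* placeholder magic: 3 chars + NUL */
    header->version = 1;
    header->size = (uint32_t)sizeof(*header);
}
```

Using explicit-width integers and byte arrays means the same layout can be parsed on the device and on the build machine without depending on either compiler's idea of `int` or `long`.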

There are a few ways to get a header like this onto the front of a binary. The quick and dirty way is just to concatenate the header and the generated binary. This approach would work, but it leaves a little to be desired, especially when compared to “The Right Way”. In this situation, the right way is to include the firmware header as an actual symbol declared in the source and to carry it through the entire build process. This meant I got to spend a lot of time learning about linking and linker scripts.

Linking in modern C/C++ is incredibly complex, so this project was a good introduction to prepare myself for future functionality. What follows is by no means an exhaustive description of that process. Our build chain looks something like this:

  1. Compile *.c, *.cpp, and *.s to object files.
  2. Archive grouped source files into static libraries.
  3. Link those libraries together to form an ELF file.
  4. Run that ELF file through a custom tool [1], generating an “FKB-ELF” file (FKB is FK Binary).
  5. Developers can then use that generated ELF with gdb or dump a raw binary from it using objcopy.

To get our headers working, it all starts with a declaration.

Typically, the very first chunk of data in your firmware binary (for a Cortex-M chip) is the ISR vector table. This table starts with the initial stack pointer value, followed by pointers to the functions that handle the various exceptions and IRQs. This is where the hardware finds your Reset_Handler function, which is the first function to be invoked.
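To make the layout of the first two entries concrete, here is a small C sketch that decodes them from the raw bytes of an image. The type and function names are made up for illustration, and the bytes are assumed to be little-endian, as on typical Cortex-M parts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian word without relying on host endianness. */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* The first two words of a Cortex-M vector table. */
typedef struct {
    uint32_t initial_sp;     /* word 0: initial stack pointer value */
    uint32_t reset_handler;  /* word 1: address of Reset_Handler */
} vector_table_head_t;

/* `table` points at the first byte of the vector table. */
vector_table_head_t read_vector_table_head(const uint8_t *table) {
    assert(table != NULL);
    vector_table_head_t head;
    head.initial_sp = read_le32(table);
    head.reset_handler = read_le32(table + 4);
    return head;
}
```

On real hardware the reset handler address has its lowest bit set to mark Thumb code, so an odd value in the second word is expected.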

Executable files are composed of multiple sections, or segments, each with a special purpose. For example, executable instructions are stored in .text segments. As covered in the earlier post on Memory, initialized data is stored in a .data segment. There’s also a .bss segment, which isn’t present in the binary; it is only tracked so that we can determine its size. In all major compilers you can override the section/segment that variables and functions are kept in.

What I wanted was for the FKB header to occupy the leading bytes of the final binary, before the vector table, so that the bootloader and other tools could find it. This is very easily done by assigning variables to custom sections in the source, in my case using gcc’s __attribute__((section())) mechanism. So, the header is declared like so:

__attribute__((section(".fkb.header")))
const fkb_header_t fkb_header = {
  .signature = FKB_HEADER_SIGNATURE(),
  .version = 1,
  .size = sizeof(fkb_header_t),
  /* Etc */
};

The linker script then places this section before the ISR vectors, being sure to maintain the alignment the hardware expects on that table.

.data.fkb.header : {
    KEEP(*(.fkb.header))
    . = ALIGN(0x1000);
} > FLASH

I should mention that the header as compiled is basically empty and filled with default values. I wanted to be able to customize this header after compilation, especially because certain things become tricky if you try to fill in the header values during compilation. How do you include the hash of the final binary, for example? Catch-22 town. I also knew that there would be other steps that would have to be performed after linking, which we’ll get to later.

Next, enter our custom firmware tool. I wrote this tool in Python using the libraries provided by the LIEF project [2]. This is a library for manipulating ELF files and it has been great. With this library it was very easy to open up the fresh ELF file, find the section I was looking for and replace the contents with the final header, with all values populated. Because this happens after linking, the tool has access to all kinds of information it might not otherwise know, like code and data section sizes and the final size of the binary. A new ELF file is then generated and the header appears at the start of the binary as expected.
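The sketch below is not that tool. It is a minimal C illustration of why patching after linking sidesteps the catch-22, working on a flat image buffer rather than an ELF file. The header layout and the names `patch_header_t` and `patch_header` are hypothetical, and 32-bit FNV-1a stands in for a real cryptographic hash purely to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header sitting at the very front of the image. */
typedef struct {
    char     signature[4];
    uint32_t version;
    uint32_t size;          /* size of this header, set at compile time */
    uint32_t binary_size;   /* only known after linking */
    uint32_t hash;          /* only known after linking (FNV-1a stand-in) */
} patch_header_t;

/* FNV-1a, 32-bit: a simple stand-in for a real hash such as SHA-1. */
uint32_t fnv1a32(const uint8_t *data, size_t length) {
    uint32_t hash = 2166136261u;       /* FNV offset basis */
    for (size_t i = 0; i < length; i++) {
        hash ^= data[i];
        hash *= 16777619u;             /* FNV prime */
    }
    return hash;
}

/* Fill in the fields that only exist once the image is final.
 * The hash covers the bytes after the header, so storing the hash
 * inside the header does not change the data being hashed.
 * Returns 0 on success, -1 if the image is too small to hold a header. */
int patch_header(uint8_t *image, size_t length) {
    assert(image != NULL);
    patch_header_t header;
    if (length < sizeof(header)) {
        return -1;
    }
    memcpy(&header, image, sizeof(header));
    header.binary_size = (uint32_t)length;
    header.hash = fnv1a32(image + sizeof(header), length - sizeof(header));
    memcpy(image, &header, sizeof(header));
    return 0;
}
```

Hashing only the payload is one way out of the loop. Another would be to hash the whole image with the hash field zeroed, as long as whoever verifies it does the same.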

Our bootloader knows to look for these headers when loading firmware. It can tell if there’s a header using a magic string. If that header is missing it’ll just boot the binary as regular firmware. Otherwise, the bootloader can find the vector table using an offset baked into our header so that future headers can be sized arbitrarily. Our bootloader also reports this information to the debugging console:

bl: looking for executable...
bl: [0x00004000] checking for header
bl: [0x00004000] found ('fk-bundled-fkb.elf_JACOB-HOME_20190726_155624') flags=0x0 size=161424 vtor=0x1000
bl: [0x00004000] hash='884b434ee60e8d38bad81ae5d3e48c07acadfdd2' timestamp=1564156584
bl: [0x00004000] number-syms=0 number-rels=0 got=0x0 data=0x2000048c
bl: [0x0002C000] checking for header
bl: [0x00005000] executing (entry=0x00005004) (got=0x20000000)

There’s some other information in there that I’ll get to later. This information can also easily be included in diagnostic reports and compared to binaries on phones or provided over a network. It’s also alerted me once to a situation where I broke our build system and wasn’t getting new binaries!
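The bootloader's header check described above can be sketched in C along these lines. The `"FKB"` magic, the position of the offset field, and the function names are all assumptions for illustration; only the behavior (check for a magic string, fall back to plain firmware, otherwise add the stored offset) comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_SIGNATURE "FKB"        /* placeholder magic, not the real one */
#define EXAMPLE_VTOR_FIELD_OFFSET 8u   /* hypothetical position of the offset field */

/* Decode a 32-bit little-endian word without relying on host endianness. */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* A header is present if the image starts with the magic string
 * (compared including its terminating NUL, so 4 bytes here). */
bool has_header(const uint8_t *flash) {
    assert(flash != NULL);
    return memcmp(flash, EXAMPLE_SIGNATURE, sizeof(EXAMPLE_SIGNATURE)) == 0;
}

/* `flash` points at the bytes stored at `address`.
 * Returns the address where the vector table should be found. */
uint32_t find_vector_table(const uint8_t *flash, uint32_t address) {
    if (!has_header(flash)) {
        return address;  /* regular firmware: the vector table comes first */
    }
    /* The offset is stored in the header, so the header can grow
     * without the bootloader needing to know its size. */
    return address + read_le32(flash + EXAMPLE_VTOR_FIELD_OFFSET);
}
```

With the values from the log above (header at 0x00004000, vtor=0x1000), this yields 0x00005000, which matches the address the bootloader reports executing from.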

[1] https://github.com/jlewallen/loading/blob/master/tools/mkfirmware.py

[2] https://github.com/lief-project/LIEF