# Archive for the 'p.m.o' Category

## Google Reader death, or how the cloud model can fail you

If you’re a Google Reader user, you probably read in one of your subscriptions that Google is pulling the plug on Google Reader. It is yet another demonstration of why putting your data in the cloud isn’t such a great idea: the service you rely on may well disappear some day, taking all the data it contains with it.

Sure, Google, in its extreme goodness, allows you to “take out” your Google Reader data. Or does it?

• followers.json, following.json: both files contain similar data, which I suspect corresponds to Buzz subscriptions (yet another dead service). Each friends item contains some information about your “friend”, and a stream identifier for their activity (I guess), as well as a few website urls. For instance, Tim Bray’s stream is “user/05198174665841271019/state/com.google/broadcast”. What the hell do I do with that? Fortunately, he has websites, but not all my “friends” do. Thankfully, I haven’t really been using this feature, so there’s almost nothing in these files.
• liked.json, starred.json, shared.json, shared-by-followers.json: all have the same structure, and contain items you liked, starred, shared, or that the people you follow shared (yeah, that last file is badly named). Each item contains a url (or so I hope), and the corresponding content (yay). shared-by-followers.json, however, doesn’t contain more than the items the people you follow actively shared: it doesn’t contain their feeds (and I’m pretty sure I read more from Tim Bray than the two links he shared).
• subscriptions.xml: Essentially, a list of RSS feed urls, with no content; nothing from Tim Bray here, but now that I think about it, I think I was only following his Buzz feed, so that went away with Buzz without me noticing.

Interestingly, while looking into shared-by-followers.json, I found urls that would correspond to friend streams. For instance, Tim Bray’s is http://www.google.com/reader/public/atom/user/05198174665841271019/state/com.google/broadcast. But it’s useless: all it displays is “permission denied”.

As for subscriptions, one of the strengths of Google Reader was that it allowed you to search through past items, which means a big part of the interesting data is the archived items. But that’s not part of the “take out”. Sure, you have the feed urls, but most RSS feeds contain a limited number of items, not the entire history of items for the given feed. So, history is more or less lost, unless I star, like or share all items in all my subscriptions and “take out” again.

So much goodness.

It could have been worse, though.

2013-03-14 08:35:45+0200

## Ten years

Ten years ago, this very day, my first Debian package entered the Debian unstable repository. It was an addon for Mozilla Composer, Daniel Glazman’s Cascades.

On the same day, my second Debian package entered the Debian unstable repository as well. It was an addon for Mozilla Browser, Checky.

A few days later, my third Debian package entered Debian unstable. It was an addon for Mozilla Browser, Diggler.

Do you see a pattern? They are all now abandoned software, although I made Checky and Diggler live a little longer (and I’m actually considering reviving Diggler). But they had their importance in my journey, and are part of the reason why I am where I am now.

My journey on the web started with NCSA Mosaic on VAX/VMS, then continued with Netscape Navigator, Netscape Communicator and Mozilla Suite on Linux.

That’s where I was ten years ago, sailing between Galeon (a browser using the Mozilla engine) and Mozilla Suite, and filing some layout bugs.

Ten years ago, there was a new kid on the block. It used to be called Phoenix, and it had just changed its name to Firebird. Eventually, it changed again, to Firefox. You may have heard about it. Because Firebird was so much nicer than the browser in the Mozilla Suite, I started using its Debian package, and wanted my packaged addons to work with it. So I contacted Eric Dorland, the Phoenix/Firebird package maintainer at the time, and got the addons working. I then ended up fixing a bunch of packaging issues.

This is how I got involved in Firefox packaging for Debian, and what eventually led me to work for Mozilla.

2013-02-19 22:45:30+0200

## Using distcc and clang on Linux to build Firefox faster on MacOSX

Apple makes nice hardware, until you look into the details. As it turns out, my Macbook Pro has issues with heat. So much so that when building Firefox, the CPU decides it’s too hot and throttles its frequency to emit less heat. The result is that building Firefox takes much more time than it would if the CPU could stay at its highest frequency all the time. On this machine, it takes more than an hour to build Firefox.

Most of the time, this MBP runs Linux. The heat problem is the same, but I also have a much beefier build server, running Linux too, that I use for builds. I used to use distcc from the MBP, using both the MBP and the build server to build everything. This was convenient, because the source tree could stay on the MBP. It however turned out to be slower than doing a local build on the build server. So I now push my changes to the build server, which then does the build alone. It takes about 18 minutes for a clobber build (without ccache), this way.

Anyway, I don’t build often on Windows or Mac, but when I need to, I’m essentially doing a clobber build, and as a result, the MBP spends an awful lot of time doing so. So, since I already had a distcc daemon running, with clang available, and since I was in need of a Mac build, I figured I’d try something seemingly crazy: use the Linux build server.

It turns out Linux clang is able to emit Mach-O objects (the binary format used on MacOSX). So all I needed to do was to add something like the following to my .mozconfig:

export CC="distcc clang -ccc-host-triple x86_64-apple-darwin"
export CXX="distcc clang++ -ccc-host-triple x86_64-apple-darwin"


(After fixing bug 818092)

Now I can enjoy builds taking slightly less than 30 minutes, which is more than twice as fast.

Update: in case this wasn’t very clear, the build is initiated from OSX, where preprocessing is handled (distcc pump mode is not enabled), which allows the Linux build server to do the right thing with clang.

2012-12-04 19:30:15+0200

## Hooking the memory allocator in Firefox

Supplanting the system memory allocator usually involves some tricks. In a cross-platform software like Firefox, this involves different tricks on different platforms. Firefox uses such tricks to implant jemalloc. Sadly, this makes replacing jemalloc itself even trickier.

For instance, trace-malloc, our leak detection tool, used on debug builds, requires that jemalloc is disabled.

Work is under way to make supplanting jemalloc much easier. It is not yet clear if this will be enabled by default on release builds, but it would make sense to enable the feature at least on nightlies.

What does the feature provide? A way to hook or replace jemalloc in Firefox at startup time (as opposed to build time, like trace-malloc). The idea is to build a specialized library (more on that further below) and make Firefox use it instead of, or on top of, jemalloc, with some weak linking tricks. To enable the feature, pass --enable-replace-malloc to configure or add ac_add_options --enable-replace-malloc to your mozconfig (provided you applied the patches or got a tree where the patches have landed).

With the feature built, you can start Firefox with a malloc replacement library easily:

• On GNU/Linux:

$ LD_PRELOAD=/path/to/library.so firefox

• On OSX:

$ DYLD_INSERT_LIBRARIES=/path/to/library.dylib firefox

• On Windows:

$ MOZ_REPLACE_MALLOC_LIB=drive:\path\to\library.dll firefox

• On Android:

$ am start -a android.activity.MAIN -n org.mozilla.fennec/.App --es env0 MOZ_REPLACE_MALLOC_LIB=/path/to/library.so

As I happen to have built Firefox with the feature enabled for all platforms on try, to validate that it works, you can toy around with these builds.

A replacement library is expected to provide the following functions, or any subset:

• void replace_init(const malloc_table_t *table)
• void *replace_malloc(size_t size)
• int replace_posix_memalign(void **ptr, size_t alignment, size_t size)
• void *replace_aligned_alloc(size_t alignment, size_t size)
• void *replace_calloc(size_t num, size_t size)
• void *replace_realloc(void *ptr, size_t size)
• void replace_free(void *ptr)
• void *replace_memalign(size_t alignment, size_t size)
• void *replace_valloc(size_t size)
• size_t replace_malloc_usable_size(usable_ptr_t ptr)
• size_t replace_malloc_good_size(size_t size)
• void replace_jemalloc_stats(jemalloc_stats_t *stats)
• void replace_jemalloc_purge_freed_pages()
• void replace_jemalloc_free_dirty_pages()

The first function, replace_init, is the first function from the library that will be called (if it exists), before the first call to any other. It is passed a pointer to a function table containing pointers to the corresponding jemalloc functions from Firefox.

The last three functions are specific to jemalloc. jemalloc_stats is only important to replace if you want about:memory to remain accurate with respect to whatever you’ve done in the other functions, and jemalloc_purge_freed_pages and jemalloc_free_dirty_pages are used to force the allocator to return some unused memory to the system.

The other functions are the usual suspects, picked from C89, POSIX, C11, or OSX (malloc_good_size). They should however all be considered cross-platform (especially malloc_good_size).

All these functions, when they exist, are called instead of the corresponding jemalloc functions, which makes it the responsibility of the replacement functions to call back the corresponding jemalloc function if necessary.
This allows you, for example, to:

• Replace jemalloc entirely. The third patch in bug 804303 does that, to allow replacing the (currently default) old fork of jemalloc with a fresh jemalloc. Something similar could be done to test other allocators, like tcmalloc.
• Make memory allocation functions randomly return NULL as in Out of Memory conditions, aka fuzzing.
• Make all allocations bigger to add tracing data.
• Log allocations.
• etc.
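As a sketch of the fuzzing idea, here is roughly what an OOM-injecting replacement library could look like. Note the malloc_table_t below is a simplified stand-in for the real definition in replace_malloc.h, and the failure period is arbitrary:

```c
#include <stdlib.h>

/* Simplified stand-in for the real malloc_table_t from replace_malloc.h. */
typedef struct {
  void *(*malloc)(size_t size);
  void (*free)(void *ptr);
} malloc_table_t;

static const malloc_table_t *funcs = NULL;
static unsigned int count = 0;

void replace_init(const malloc_table_t *table)
{
  funcs = table;
}

/* Deterministically fail one allocation out of every 16, simulating OOM. */
void *replace_malloc(size_t size)
{
  if (++count % 16 == 0)
    return NULL;
  return funcs->malloc(size);
}
```

A real replacement library would forward replace_free and the other entry points to the table as well; Firefox calls replace_init with its jemalloc function table before any other hook is called.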

## A small implementation example

Consider the following question: how many times does realloc end up copying data? Stated differently, how many times does realloc not return the pointer it was given?

Create the memory/replace/realloc/realloc.c file with the following content:

// This header will declare all the replacement functions, such that you don't need
// to worry about exporting them with the right idiom (dllexport, visibility...)
#include "replace_malloc.h"
#include <stdlib.h>
#include <stdio.h>

static const malloc_table_t *funcs = NULL;
static unsigned int total = 0, copies = 0;

void print_stats()
{
  printf("%u reallocs, %u copies\n", total, copies);
}

void replace_init(const malloc_table_t *table)
{
  funcs = table;
  atexit(print_stats);
}

void *replace_realloc(void *ptr, size_t size)
{
  void *newptr = funcs->realloc(ptr, size);
  // Not thread-safe, but it's only an example.
  total++;
  // We don't want to count deallocations as copies.
  if (newptr && newptr != ptr)
    copies++;
  return newptr;
}


Add a memory/replace/realloc/Makefile.in file:

DEPTH           = @DEPTH@
topsrcdir       = @top_srcdir@
srcdir          = @srcdir@
VPATH           = @srcdir@

include $(DEPTH)/config/autoconf.mk

LIBRARY_NAME = replace_realloc
FORCE_SHARED_LIB = 1
NO_DIST_INSTALL = 1
CSRCS = realloc.c

MOZ_GLUE_LDFLAGS = # Don't link against mozglue
WRAP_LDFLAGS = # Never wrap malloc function calls with -Wl,--wrap

include $(topsrcdir)/config/rules.mk


Add the following to memory/replace/Makefile.in:

DIRS += realloc

Finally, build objdir/memory/replace. You’ll get a library in objdir/memory/replace/realloc that you can use as described at the beginning of this post.

On my system, after starting and quitting Firefox without doing much, it prints:

41078 reallocs, 37197 copies

It sure is a simple example, and one that could be achieved with other tools (like dtrace), but it’s now up to you, developers, to come up with more useful applications. The blocked bugs already show some. Note this facility still has the advantage of being more cross-platform than tools like dtrace, and of working happily on top of jemalloc (valgrind, for instance, doesn’t support that gracefully), which can be important when looking at some particular aspects of memory allocation. The above example, while simple, is a typical case where the underlying memory allocation library has an impact on the result: other memory allocation libraries have different size classes, which changes how often realloc needs to actually reallocate, as opposed to growing the existing allocation in place.

2012-11-27 13:49:15+0200

## Debian EFI mode boot on a Macbook Pro, without rEFIt

Diego’s post got me to switch from grub-pc to grub-efi to boot Debian on my Macbook Pro. But I wanted to go further: getting rid of rEFIt.

rEFIt is a pretty useful piece of software, but it’s essentially dead. There is the rEFInd fork, which keeps it up-to-date, but it doesn’t really help with FileVault. Moreover, the boot sequence for a Linux distro with rEFIt/rEFInd looks like: Apple EFI firmware → rEFIt/rEFInd → GRUB → Linux kernel. Each intermediate step adds its own timeout, so rEFIt/rEFInd ends up being a not-so-useful intermediate step.

Thankfully, Matthew Garrett did all the research to allow booting GRUB directly from the Apple EFI firmware. Unfortunately, his blog post didn’t have much actual detail on how to do it.

So here it is, for a Debian system:

• Install a few packages you’ll need in this process:
# apt-get install hfsprogs icnsutils
• Create a small HFS+ partition. I have a 9MB one, but it’s only filled with about 500K, so even smaller should work too. If, like me, you were previously using grub-pc, you probably have a GRUB partition that you can repurpose. In gdisk, it looks like this:
Number  Start (sector)    End (sector)  Size       Code  Name
5       235284480       235302943   9.0 MiB     AF00  Apple HFS/HFS+

Partition GUID code: 48465300-0000-11AA-AA11-00306543ECAC (Apple HFS/HFS+)
First sector: 235284480 (at 112.2 GiB)
Last sector: 235302943 (at 112.2 GiB)
Partition size: 18464 sectors (9.0 MiB)
Attribute flags: 0000000000000000
Partition name: 'Apple HFS/HFS+'

• Create a HFS+ filesystem on that partition:

# mkfs.hfsplus /dev/sda5 -v Debian

(replace /dev/sda5 with whatever your partition is)

• Add a fstab entry for that filesystem:
# echo $(blkid -o export -s UUID /dev/sda5) /boot/efi auto defaults 0 0 >> /etc/fstab

• Mount the filesystem:

# mkdir /boot/efi
# mount /boot/efi

• Edit /usr/sbin/grub-install, look for « xfat », and remove the block of code that looks like:

if test "x$efi_fs" = xfat; then :; else
    echo "${efidir} doesn't look like an EFI partition." 1>&2
    efidir=
fi

• Run grub-install. At this point, there should be a /boot/efi/EFI/debian/grubx64.efi file (if using grub-efi-amd64).
• Create a /boot/efi/System/Library/CoreServices directory:

# mkdir -p /boot/efi/System/Library/CoreServices

• Create a hard link:

# ln /boot/efi/EFI/debian/grubx64.efi /boot/efi/System/Library/CoreServices/boot.efi

• Create a dummy mach_kernel file:

# echo "This file is required for booting" > /boot/efi/mach_kernel

• Grab the mactel-boot source code, unpack and build it:

# wget http://www.codon.org.uk/~mjg59/mactel-boot/mactel-boot-0.9.tar.bz2
# tar -jxf mactel-boot-0.9.tar.bz2
# cd mactel-boot-0.9
# make PRODUCTVERSION=Debian

• Copy the SystemVersion.plist file:

# cp SystemVersion.plist /boot/efi/System/Library/CoreServices/

• Bless the boot file:

# ./hfs-bless /boot/efi/System/Library/CoreServices/boot.efi

• (optional) Add an icon:

# rsvg-convert -w 128 -h 128 -o /tmp/debian.png /usr/share/reportbug/debian-swirl.svg
# png2icns /boot/efi/.VolumeIcon.icns /tmp/debian.png
# rm /tmp/debian.png

Now, the Apple Boot Manager, shown when holding down the option key while booting the Macbook Pro, looks like this:

And the Startup disk preferences dialog under OSX, like this:

2012-11-18 11:18:14+0200

## Fun with weak dynamic linking

Dynamic linkers, at least in the UNIX world, usually allow loading libraries into a process address space at startup. On Linux systems, you load such a library with LD_PRELOAD. On OSX, with DYLD_INSERT_LIBRARIES.

On Linux systems, when using LD_PRELOAD, the dynamic linker will also use symbols from the (pre)loaded library instead of those from system libraries. For example, if a program calls the write function and a library exporting a write symbol is loaded with LD_PRELOAD, the write function from the loaded library will be used instead of that of the libc (even when the symbol version doesn’t match).
On OSX, symbol resolution is usually done with a “two-level namespace”: symbols are associated with library names, and when resolving symbols, both are used. More than that, the library name is used to find the library in the dyld search path. Libraries loaded with DYLD_INSERT_LIBRARIES won’t be used for two-level namespace symbol resolution, which makes it less useful than LD_PRELOAD.

Fortunately, it is also possible to use a “flat” namespace, in which case only the symbol name is considered during symbol resolution, and it is searched in all loaded libraries, in the order in which they were loaded. A flat namespace can be triggered by setting the DYLD_FORCE_FLAT_NAMESPACE environment variable, linking the main program with -force_flat_namespace, or linking programs and libraries with the -flat_namespace argument. Note the latter only affects the programs and libraries built with that argument, while the former two force a flat namespace for all libraries, including those which, like system ones, were built with a two-level namespace. There are also cases where a single symbol may be resolved with the flat namespace, while others in the program or library use the two-level namespace.

Weak dynamic linking is another feature that can be used to tell the dynamic linker to ignore missing symbols. Consider the following source code:

extern void foo() __attribute__((weak)); // weak_import is preferred on OSX.

int main() {
  if (foo)
    foo();
  return 0;
}

On Linux systems, this just works. Compile the program (you’ll need to build it with -fPIC, though), start it, and it will do nothing, since foo is defined nowhere. Combined with shared library (pre)loading, this can be used to provide simple hooks in your application.
For example, with the following source code built as a shared library:

#include <stdio.h>

void foo() {
  printf("foo\n");
}

Running the program again with LD_PRELOAD set to load that shared library (note LD_PRELOAD=foo.so won’t work, a path is needed, like LD_PRELOAD=./foo.so), it will print foo because the dynamic linker will have resolved the foo symbol to that of the shared library.

On OSX, unfortunately, things are not as easy. First, building the test program above fails:

Undefined symbols for architecture x86_64:
  "_foo", referenced from:
      _main in test-LIeVtB.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

There are linker options to force undefined symbols to be resolved at runtime:

• -undefined dynamic_lookup, which will mark all undefined symbols as having to be looked up at runtime,
• -Wl,-U,symbol_name, which only does so for the given symbol (note: you have to prepend an underscore to the symbol name).

With one of these options, the program works as on Linux: it does nothing when run alone, and prints foo when loading the library with DYLD_INSERT_LIBRARIES… if you build with XCode 4.5. Running the program built with XCode 4.3 fails when not loading the shared library, with the following error:

dyld: Symbol not found: _foo
  Referenced from: ./test
  Expected in: flat namespace
 in ./test
Trace/BPT trap: 5

And if at build time, you target OSX 10.5 (with MACOSX_DEPLOYMENT_TARGET or -mmacosx-version-min), the error is slightly different:

dyld: Symbol not found: _foo
  Referenced from: ./test
  Expected in: dynamic lookup
Trace/BPT trap: 5

Each error is due to a different bug:

• Since OSX 10.6, the linking information in the __LINKEDIT segment is in a new, compressed format: DYLD_INFO. The linker in Xcode < 4.5 forgets to flag weak imports as weak in the DYLD_INFO data. Compare the dyldinfo output for a binary built with Xcode 4.5:

$ dyldinfo -bind test-xcode4.5 | sed -n '2p;/foo/p'
segment section          address        type    addend dylib            symbol
__DATA  __got            0x100001038    pointer      0 flat-namespace   _foo (weak import)

with the output for a binary built with Xcode 4.3:

$ dyldinfo -bind test-xcode4.3 | sed -n '2p;/foo/p'
segment section          address        type    addend dylib            symbol
__DATA  __got            0x100001038    pointer      0 flat-namespace   _foo

Notice the missing weak import. Manually setting the flag with a hexadecimal editor fixes it (in both bind and lazy_bind tables).

• In older OSX releases, the linking information uses a “classic” format. In that format, the symbol table is used for flags such as N_WEAK_REF (weak reference). In the new DYLD_INFO format, the N_WEAK_REF flag isn’t used, which partly explains the previous bug. When using the old format and two-level namespaces, missing weak symbols are errors. With a flat namespace, they aren’t. This means the error we get with a program built for a 10.5 target goes away when building with -flat_namespace. Note this doesn’t work at the symbol level: if the library is built for a two-level namespace (and is marked as such in the Mach-O header), and the weak symbol has no corresponding library name (making it resolved with a flat namespace), it doesn’t work.

At this point, one could think weak dynamic linking is pretty much useless on OSX; at least, that it was before Xcode 4.5 was released. As it turns out, there are other use cases where it actually works. Consider the following code:

#include <malloc/malloc.h>

// The following is defined in malloc/malloc.h:
// extern size_t malloc_zone_pressure_relief(malloc_zone_t *zone, size_t goal) __attribute__((weak_import));

int main() {
  if (malloc_zone_pressure_relief)
    malloc_zone_pressure_relief(NULL, 0);
  return 0;
}

This program in itself is useless, but the point is, the malloc_zone_pressure_relief function is only available since OSX 10.7. Without the weak_import, the program would fail to start with an undefined symbol on OSX 10.6 and below. Without the if, it would crash because the symbol would resolve to NULL, and the program would jump there.
But the program itself has to be built on OSX 10.7 at least, for the malloc_zone_pressure_relief function to be there when building. And in that use-case, we end up with the right flags in DYLD_INFO:

$ dyldinfo -bind test | sed -n '2p;/malloc/p'
segment section          address        type    addend dylib            symbol
__DATA  __got            0x100001038    pointer      0 libSystem        _malloc_zone_pressure_relief (weak import)


This actually gives us a hint for a way out of our misery for our foo function on Xcode < 4.5: linking against a dummy library implementing the weak symbol. When doing so, we get the proper flag in DYLD_INFO, like with malloc_zone_pressure_relief:

$ dyldinfo -bind test | sed -n '2p;/foo/p'
segment section          address        type    addend dylib            symbol
__DATA  __got            0x100001038    pointer      0 libfoo           _foo (weak import)

And the linker additionally does something nice: when all symbols needed from a library are weak references, it marks the library import itself as weak:

$ otool -l test | grep -B 2 libfoo
          cmd LC_LOAD_WEAK_DYLIB
      cmdsize 40
         name libfoo.dylib (offset 24)


What this means is that even if the libfoo.dylib is missing, it will still work. The downside is that we are now using two-level namespace, which, as mentioned above, makes DYLD_INSERT_LIBRARIES useless. The program thus needs to be built with a flat namespace.

Update: It turns out some versions of XCode don’t conveniently mark libraries with only weak imports as weakly linked, so the -Wl,-weak_library,libraryname.dylib flag is required instead of -llibraryname.

In summary, if you want to add optional hooks that a library loaded through LD_PRELOAD/DYLD_INSERT_LIBRARIES can implement:

• on OSX < 10.6, use weak symbols and build with -flat_namespace,
• on OSX >= 10.6, when building with Xcode < 4.5, use weak symbols, build with -flat_namespace and link against a dummy library implementing the hook functions with -Wl,-weak_library,libraryname.dylib,
• on Linux systems and on OSX >= 10.6, when building with Xcode 4.5, simply use weak symbols,
• on Windows, to the best of my knowledge, weak dynamic linking is not supported.
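On Linux, the hook pattern summarized above fits in very few lines. The following is a minimal, self-contained sketch; my_hook is a made-up name for the example, and as discussed above, the same code on OSX would additionally need weak_import and the appropriate linker options:

```c
#include <stdio.h>

/* Weak and undefined: with no definition anywhere, the dynamic linker
   resolves my_hook to NULL instead of failing to start the program. */
extern void my_hook(void) __attribute__((weak));

/* Returns 1 if some loaded library (e.g. via LD_PRELOAD) defined the hook. */
int hook_present(void)
{
  return my_hook != NULL;
}

void call_hook_if_present(void)
{
  if (my_hook)
    my_hook();
  else
    printf("hook absent\n");
}
```

Run as-is, the hook is absent; run with LD_PRELOAD pointing to a shared library exporting my_hook, the hook gets called.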

Stay tuned for the next post, which will describe what this will be used for in Firefox.

2012-11-05 21:49:26+0200

## Android Debug Bridge for Firefox

I uploaded an experimental add-on to AMO that adds some kind of support for the Android Debug Bridge to Firefox. It requires the adb/adb.exe executable from the Android SDK. Disclaimer: the add-on is experimental, and is known to leak small amounts of memory and to have various problems handling errors. It’s also on github, if you want to give a hand.

The first feature is access to Android and Firefox OS devices’ filesystems through the adb:// protocol. Without a device serial number, a list of connected devices is shown, like in the screenshot above. With a device serial number in the url (adb://serial/), it shows filesystem contents, as below:

Another feature is the ability to grab the framebuffer on a device and display it as a PNG image. This feature is available through adbview:// urls, and like adb://, a list of connected devices will be shown if no device serial is given.

It can show a screen capture of Android devices:

As well as Firefox OS devices:

For Firefox OS devices, it requires a fork of the screencap tool. It should be included in recent Firefox OS builds.

Finally, the add-on improves the Remote Debugging feature included in Firefox. As described on Mark Finkle’s blog, you would normally type an adb command to forward a tcp port first. When you have several devices, that requires forwarding different local ports to port 6000 on each device, and remembering which is which.

The addon allows you to skip that step and to connect through the Android Debug Bridge directly, without local port forwarding.

Check out Mark’s post for instructions to enable Remote Debugging on both the device and desktop.

The prompt shown when opening the Remote Debugger is augmented with a list of devices. local is the desktop machine running Firefox. The other items in the list are the connected Android or Firefox OS devices. This relies on monkey-patching the Firefox Remote Debugger code, so it is not completely reliable: it currently only works with a recent Firefox 17 nightly, and may break at any time in the future.

When you select a device, you shouldn’t need to change the host:port field, but you could theoretically change it to access even more remote devices through the device’s network.

Once you’ve selected a device in the Remote Debugger prompt, you get the usual prompt on the device itself:

(By the way, the screenshot above was taken with adbview://)

That brings up scripts running in the Firefox instance on the device:

Thanks to the mobile team for having brought Remote Debugging.

2012-08-23 20:35:51+0200

When you add a flag somewhere in configure.in and need to use it in code or in the build system, you use AC_DEFINE or AC_SUBST. While using AC_DEFINE in configure.in is usually sufficient for its purpose, using AC_SUBST requires adding some boilerplate to config/autoconf.mk.in, in the form:

VARIABLE_NAME = @VARIABLE_NAME@


Here’s the news for you: these days are over. With bug 742795, now on mozilla-central, autoconf.mk is auto-generated. No need to touch config/autoconf.mk.in anymore.

2012-08-07 19:58:00+0200

## The DEPTH of me

When adding a new directory to the Mozilla codebase, one usually needs to add a Makefile.in file, with some magic incantations at the beginning of it:

DEPTH = ../..
topsrcdir = @top_srcdir@
srcdir = @srcdir@
VPATH = @srcdir@

include $(DEPTH)/config/autoconf.mk

etc.

Some even add:

relativesrcdir = foo/bar

In the above, DEPTH and relativesrcdir both need to be carefully filled in depending on where the Makefile.in is. As of bug 774032, landed over the week-end (along with some now hopefully fixed tree breakage for some people, sorry for the inconvenience), there are two additional substitution variables for Makefile.in that remove the need to be careful. Now the boilerplate for a new Makefile.in is:

DEPTH = @DEPTH@
topsrcdir = @top_srcdir@
srcdir = @srcdir@
VPATH = @srcdir@

include $(DEPTH)/config/autoconf.mk

etc.


And, if needed,

relativesrcdir = @relativesrcdir@


2012-08-06 18:52:34+0200

## Building a Linux kernel module without the exact kernel headers

Imagine you have a Linux kernel image for an Android phone, but you don’t have the corresponding source, nor do you have the corresponding kernel headers. Imagine that kernel has module support (fortunately), and that you’d like to build a module for it to load. There are several good reasons why you can’t just build a new kernel from source and be done with it (e.g. the resulting kernel lacks support for important hardware, like the LCD or touchscreen). With the ever-changing Linux kernel ABI, and the lack of source and headers, you’d think you’re pretty much in a dead end.

As a matter of fact, if you build a kernel module against different kernel headers, the module will fail to load, with errors depending on how different they are. It can complain about bad signatures, a bad version, or other things.

But more on that later.

## Configuring a kernel

The first thing to do is to find a kernel source for something close enough to the kernel image you have. That, along with getting a proper configuration, is probably the trickiest part. Start from the version number you can read in /proc/version. If, like me, you’re targeting an Android device, try Android kernels from Code Aurora, Linaro, Cyanogen or Android, whichever is closest to what is in your phone. In my case, it was the msm-3.0 kernel. Note you don’t necessarily need the exact same version. A minor version difference is still likely to work. I’ve been using a 3.0.21 source, while the kernel image was 3.0.8. Don’t, however, try e.g. using a 3.1 kernel source when the kernel you have is 3.0.x.

If the kernel image you have is kind enough to provide a /proc/config.gz file, you can start from there; otherwise, you can try starting from the default configuration, but you need to be extra careful then (although I won’t detail using the default configuration, because I was fortunate enough not to have to, there will be some details further below as to why a proper configuration is important).

Assuming arm-eabi-gcc is in your PATH, and that you have a shell opened in the kernel source directory, you need to start by configuring the kernel and install headers and scripts:

$ mkdir build
$ gunzip -c config.gz > build/.config # Or whatever you need to prepare a .config
$ make silentoldconfig prepare headers_install scripts ARCH=arm CROSS_COMPILE=arm-eabi- O=build KERNELRELEASE=`adb shell uname -r`

The silentoldconfig target is likely to ask you some questions about whether you want to enable some things. You may want to opt for the default, but that may also not work properly. You may use something different for KERNELRELEASE, but it needs to match the exact kernel version you’ll be loading the module from.

## A simple module

To create a dummy module, you need to create two files: a source file, and a Makefile.

Place the following content in a hello.c file, in some dedicated directory:

#include <linux/module.h>       /* Needed by all modules */
#include <linux/kernel.h>       /* Needed for KERN_INFO */
#include <linux/init.h>         /* Needed for the macros */

static int __init hello_start(void)
{
  printk(KERN_INFO "Hello world\n");
  return 0;
}

static void __exit hello_end(void)
{
  printk(KERN_INFO "Goodbye world\n");
}

module_init(hello_start);
module_exit(hello_end);

Place the following content in a Makefile under the same directory:

obj-m = hello.o

Building such a module is pretty straightforward, but at this point, it won’t work yet. Let me go into some details first.

## The building of a module

When you normally build the above module, the kernel build system creates a hello.mod.c file, whose content can create several kinds of problems:

MODULE_INFO(vermagic, VERMAGIC_STRING);

VERMAGIC_STRING is derived from the UTS_RELEASE macro defined in include/generated/utsrelease.h, generated by the kernel build system. By default, its value is derived from the actual kernel version and git repository status. This is what setting KERNELRELEASE when configuring the kernel above modified. If VERMAGIC_STRING doesn’t match the kernel version, loading the module will lead to the following kind of message in dmesg:

hello: version magic '3.0.21-perf-ge728813-00399-gd5fa0c9' should be '3.0.8-perf'

Then, there’s the module definition.
```c
struct module __this_module
__attribute__((section(".gnu.linkonce.this_module"))) = {
	.name = KBUILD_MODNAME,
	.init = init_module,
#ifdef CONFIG_MODULE_UNLOAD
	.exit = cleanup_module,
#endif
	.arch = MODULE_ARCH_INIT,
};
```

In itself, this looks benign, but struct module, defined in include/linux/module.h, comes with an unpleasant surprise:

```c
struct module {
	(...)
#ifdef CONFIG_UNUSED_SYMBOLS
	(...)
#endif
	(...)
	/* Startup function. */
	int (*init)(void);
	(...)
#ifdef CONFIG_GENERIC_BUG
	(...)
#endif
#ifdef CONFIG_KALLSYMS
	(...)
#endif
	(...)
	(... plenty more ifdefs ...)
#ifdef CONFIG_MODULE_UNLOAD
	(...)
	/* Destruction function. */
	void (*exit)(void);
	(...)
#endif
	(...)
};
```

This means that for the init pointer to be at the right place, CONFIG_UNUSED_SYMBOLS needs to be defined according to what the kernel image uses. And for the exit pointer, the same goes for CONFIG_GENERIC_BUG, CONFIG_KALLSYMS, CONFIG_SMP, CONFIG_TRACEPOINTS, CONFIG_JUMP_LABEL, CONFIG_TRACING, CONFIG_EVENT_TRACING, CONFIG_FTRACE_MCOUNT_RECORD and CONFIG_MODULE_UNLOAD. Starting to understand why you’re supposed to use the exact kernel headers matching your kernel?

Then, there are the symbol version definitions:

```c
static const struct modversion_info ____versions[]
__used
__attribute__((section("__versions"))) = {
	{ 0xsomehex, "module_layout" },
	{ 0xsomehex, "__aeabi_unwind_cpp_pr0" },
	{ 0xsomehex, "printk" },
};
```

These come from the Module.symvers file you get with your kernel headers. Each entry represents a symbol the module requires, and the signature that symbol is expected to have. The first symbol, module_layout, varies depending on what struct module looks like, i.e. depending on which of the config options mentioned above are enabled. The second, __aeabi_unwind_cpp_pr0, is an ARM ABI specific function, and the last is for our printk function calls. The signature for each function symbol may vary depending on the kernel code for that function, and on the compiler used to compile the kernel.
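For reference, Module.symvers is a simple tab-separated text file; an excerpt for the symbols above might look like this (the CRC values here are made up for illustration):

```
0x2d0db07f	module_layout	vmlinux	EXPORT_SYMBOL
0xefd6cf06	__aeabi_unwind_cpp_pr0	vmlinux	EXPORT_SYMBOL
0x27e1a049	printk	vmlinux	EXPORT_SYMBOL
```

Each line gives the expected signature (CRC) for a symbol, the symbol name, where it is defined, and how it is exported.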
This means that if you have a kernel you built from source, modules built for that kernel, and you rebuild the kernel after modifying e.g. the printk function, even in a compatible way, the modules you built initially won’t load with the new kernel.

So, if we were to build a kernel from the hopefully-close-enough source code, with the hopefully-close-enough configuration, chances are we wouldn’t get the same signatures as the binary kernel we have, and it would complain as follows when loading our module:

```
hello: disagrees about version of symbol symbol_name
```

Which means we need a proper Module.symvers corresponding to the binary kernel, which, at the moment, we don’t have.

## Inspecting the kernel

Conveniently, since the kernel has to do these verifications when loading modules, it actually contains a list of the symbols it exports, and the corresponding signatures. When the kernel loads a module, it goes through all the symbols the module requires, finds them in its own symbol table (or other modules’ symbol tables when the module uses symbols from other modules), and checks the corresponding signature.
The kernel uses the following function to search its symbol table (in kernel/module.c):

```c
bool each_symbol_section(bool (*fn)(const struct symsearch *arr,
				    struct module *owner,
				    void *data),
			 void *data)
{
	struct module *mod;
	static const struct symsearch arr[] = {
		{ __start___ksymtab, __stop___ksymtab, __start___kcrctab,
		  NOT_GPL_ONLY, false },
		{ __start___ksymtab_gpl, __stop___ksymtab_gpl,
		  __start___kcrctab_gpl,
		  GPL_ONLY, false },
		{ __start___ksymtab_gpl_future, __stop___ksymtab_gpl_future,
		  __start___kcrctab_gpl_future,
		  WILL_BE_GPL_ONLY, false },
#ifdef CONFIG_UNUSED_SYMBOLS
		{ __start___ksymtab_unused, __stop___ksymtab_unused,
		  __start___kcrctab_unused,
		  NOT_GPL_ONLY, true },
		{ __start___ksymtab_unused_gpl, __stop___ksymtab_unused_gpl,
		  __start___kcrctab_unused_gpl,
		  GPL_ONLY, true },
#endif
	};

	if (each_symbol_in_section(arr, ARRAY_SIZE(arr), NULL, fn, data))
		return true;
	(...)
```

The struct used in this function is defined in include/linux/module.h as follows:

```c
struct symsearch {
	const struct kernel_symbol *start, *stop;
	const unsigned long *crcs;
	enum {
		NOT_GPL_ONLY,
		GPL_ONLY,
		WILL_BE_GPL_ONLY,
	} licence;
	bool unused;
};
```

Note: this kernel code hasn’t changed significantly in the past four years.

What we have above is three (or five, when CONFIG_UNUSED_SYMBOLS is defined) entries, each of which contains the start of a symbol table, the end of that symbol table, the start of the corresponding signature table, and two flags. The data is static and constant, which means it will appear as-is in the kernel binary. By scanning the kernel for three consecutive sequences of three pointers within the kernel address space followed by two integers whose values match the definitions in each_symbol_section, we can deduce the location of the symbol and signature tables, and regenerate a Module.symvers from the kernel binary.

Unfortunately, most kernels these days are compressed (zImage), so a simple scan of the kernel binary is not possible.
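The pattern scan described above can be sketched as follows. This is a minimal illustration, not the actual script: it assumes a little-endian 32-bit ARM kernel that has already been decompressed into a bytes buffer, and the helper name find_symsearch_array is made up for the example.

```python
import struct

# Layout of one struct symsearch on 32-bit ARM: three pointers
# (start, stop, crcs), then the licence enum and the unused bool,
# each padded to 4 bytes by the compiler.
RECORD = struct.Struct("<5I")

def find_symsearch_array(kernel, base):
    """Scan a raw (decompressed) kernel image for the symsearch array.

    Look for three consecutive records whose (licence, unused) fields
    are (NOT_GPL_ONLY=0, false), (GPL_ONLY=1, false) and
    (WILL_BE_GPL_ONLY=2, false), with all three pointers falling
    inside the kernel address space (base is the kernel load
    address). Returns the file offset of the array, or None.
    """
    end = base + len(kernel)
    for offset in range(0, len(kernel) - 3 * RECORD.size + 1, 4):
        ok = True
        for i, licence in enumerate((0, 1, 2)):
            start, stop, crcs, lic, unused = RECORD.unpack_from(
                kernel, offset + i * RECORD.size)
            if (lic, unused) != (licence, 0) or not all(
                    base <= p < end for p in (start, stop, crcs)):
                ok = False
                break
        if ok:
            return offset
    return None
```

From the offset of the array, the start/stop/crcs pointers can then be translated back to file offsets (address minus base) to dump the symbol names and their CRCs.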
A compressed kernel is actually a small bootstrap binary followed by a compressed stream. It is possible to scan the kernel zImage for the compressed stream, and decompress it from there. I wrote a script that does the decompression and the extraction of the symbols info automatically. It should work on any recent kernel, provided it is not relocatable and you know the base address where it is loaded. It takes options for the number of bits and endianness of the architecture, but defaults to values suitable for ARM. The base address, however, always needs to be provided. On ARM kernels, it can be found in dmesg:

```
$ adb shell dmesg | grep "\.init"
<5>[01-01 00:00:00.000] [0: swapper]      .init : 0xc0008000 - 0xc0037000   ( 188 kB)
```

The base address in the example above is 0xc0008000.
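The search for the compressed stream inside a zImage can be sketched as follows. This is a simplified illustration assuming a gzip-compressed kernel (real zImages may also use LZMA, XZ, etc., which this sketch does not handle), and the helper name find_gzip_kernel is made up for the example.

```python
import zlib

# gzip magic bytes plus the deflate compression-method byte
GZIP_MAGIC = b"\x1f\x8b\x08"

def find_gzip_kernel(zimage):
    """Scan a zImage for an embedded gzip stream and decompress it.

    The bootstrap code at the start of the zImage is skipped by
    searching for the gzip magic; attempting decompression then tells
    us whether we actually found the kernel. Returns the decompressed
    kernel, or None if no valid gzip stream is found.
    """
    offset = 0
    while True:
        offset = zimage.find(GZIP_MAGIC, offset)
        if offset == -1:
            return None
        try:
            # wbits=47 auto-detects zlib/gzip headers; any data
            # trailing the stream ends up in d.unused_data.
            d = zlib.decompressobj(47)
            return d.decompress(zimage[offset:])
        except zlib.error:
            offset += 1  # false positive, keep scanning
```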

If, like me, you’re interested in loading the module on an Android device, then what you have as a binary kernel is probably a complete boot image. A boot image contains other things besides the kernel, so you can’t use it directly with the script. That is, unless the kernel in that boot image is compressed, in which case the part of the script that looks for the compressed stream will find it anyway.
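The traditional Android boot image layout itself is simple, which is why the compressed-stream search gets away with ignoring it: a header page starting with the "ANDROID!" magic, followed by the kernel on the next page boundary. Extracting the kernel by hand can be sketched as follows (a minimal sketch assuming the traditional header layout; the helper name extract_kernel is made up for the example):

```python
import struct

BOOT_MAGIC = b"ANDROID!"

def extract_kernel(bootimg):
    """Extract the kernel image from a traditional Android boot image.

    The header starts with the "ANDROID!" magic, followed by, among
    other fields, the kernel size and the page size; the kernel
    itself is stored right after the first page.
    """
    if bootimg[:8] != BOOT_MAGIC:
        raise ValueError("not an Android boot image")
    # 8 little-endian u32s follow the magic: kernel_size, kernel_addr,
    # ramdisk_size, ramdisk_addr, second_size, second_addr, tags_addr,
    # page_size
    fields = struct.unpack_from("<8I", bootimg, 8)
    kernel_size, page_size = fields[0], fields[7]
    return bootimg[page_size:page_size + kernel_size]
```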

If the kernel is not compressed, you can use the unbootimg program as outlined in this old post of mine to get the kernel image out of your boot image. Once you have the kernel image, run the script on it to regenerate Module.symvers, and the module can finally be built:

```
$ make M=/path/to/module/source ARCH=arm CROSS_COMPILE=arm-eabi- O=build modules
```

And that’s it. You can now copy the resulting hello.ko onto the device and load it.

## And enjoy

```
$ adb shell
# insmod hello.ko
# dmesg | grep insmod
<6>[mm-dd hh:mm:ss.xxx] [id: insmod]Hello world
# lsmod
hello 586 0 - Live 0xbf008000 (P)
# rmmod hello
# dmesg | grep rmmod
<6>[mm-dd hh:mm:ss.xxx] [id: rmmod]Goodbye world
```


2012-08-06 15:11:41+0200