Archive for the 'p.d.o' Category

libgcc.a symbol visibility considered harmful

I recently got to rebuild an Android NDK with a fresh toolchain again, and hit an interesting problem. I had actually hit it before, but only this time did I fully analyze what was going on. [As a side note, if you build such an NDK, don't use mpfr 3.1.0, as there is a bug in the libtool it ships]

Linking an application or a library pulls in many things that aren't part of the code being built. One of these many things is the libgcc static library. Part of libgcc consists of an implementation of the platform ABI. On Android systems, this means the ARM EABI. When compiling certain constructs, GCC will generate calls to these ABI functions. For example, integer divisions may call __aeabi_idiv.

Consider the following minimized real world scenario:

$ echo "int foo(int a) { return 42 % a; }" > foo.c
$ arm-linux-androideabi-gcc -o libfoo.so -shared foo.c -mandroid

GCC will emit a call to __aeabi_idivmod for the % operation. With GCC 4.6.3, this function is in _divsi3.o under libgcc.a. That function itself calls __aeabi_idiv0, which lives in _dvmd_lnx.o under libgcc.a.

When statically linking, ld will thus include foo.o, _divsi3.o and _dvmd_lnx.o, meaning it will include all functions from these object files: foo, __divsi3, __aeabi_idiv, __aeabi_idivmod, __aeabi_idiv0 and __aeabi_ldiv0. And more than being included, these functions are exported, because symbol visibility in libgcc.a is default. So while we expect to export foo from our library, we're actually exporting much more, including functions that just happen to sit near the ones our code (indirectly) uses.
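
You can see this for yourself by dumping the dynamic symbol table of the resulting library:

$ arm-linux-androideabi-objdump -T libfoo.so | grep aeabi

All the __aeabi_* helpers show up as global defined symbols, right alongside foo.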

Now, let's say we want to build another library, using that foo function from libfoo:

$ cat > bar.c <<EOF
extern int foo(int a);
long long bar(long long a) { return foo(a) % a; }
EOF
$ arm-linux-androideabi-gcc -o libbar.so -shared bar.c -mandroid

(The code above has absolutely no meaning; it just triggers the same function calls as what I was getting in the actual real world case)

When compiling the above code, GCC will generate a call to __aeabi_ldivmod, which itself calls __aeabi_ldiv0 and many other things, directly or indirectly. When linking as above, nothing particularly nasty happens. However, linking as above is actually wrong: the resulting library has an undefined reference to the foo symbol, yet doesn't depend on libfoo. At runtime, if libfoo wasn't already loaded somehow, loading libbar would fail.

The proper way to link is the following:

$ arm-linux-androideabi-gcc -o libbar.so -shared bar.c -mandroid -L. -lfoo

A feature of ELF static linking is that, when resolving undefined symbols, the linker uses the first occurrence of a symbol it finds in the various objects and libraries given on its command line. So with the command line above, for each __aeabi_* symbol, it will first check whether libfoo provides it. And while __aeabi_ldivmod is not in libfoo, __aeabi_ldiv0 is (see above).

So instead of including the code for __aeabi_ldiv0 from libgcc.a, it will call the copy from libfoo.

This wouldn't be so much of a problem if __aeabi_ldiv0 weren't a weak symbol.
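
You can check this with nm; weak defined symbols are flagged with a W:

$ arm-linux-androideabi-nm -D libfoo.so | grep aeabi_ldiv0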

Enter faulty.lib. In the real world case, libfoo is loaded by the system dynamic linker, and libbar by faulty.lib. When resolving symbols for libbar, faulty.lib has to resolve libfoo's symbols with the system linker, using dlsym(). On Android, dlsym() returns NULL for weak (defined) symbols, so faulty.lib can't resolve __aeabi_ldiv0.

The real world case wasn't a problem with GCC 4.4.3 from the vanilla Android NDK because in that GCC version, __aeabi_ldivmod doesn't call __aeabi_ldiv0.

This wouldn't happen if shared libraries didn't expose random platform-ABI-specific bits depending on what code they use and on what other symbols happen to live in the same object files.
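
For what it's worth, one way to avoid leaking these symbols in the first place (not something the NDK did at the time) is to ask the linker to give hidden visibility to everything pulled from libgcc.a:

$ arm-linux-androideabi-gcc -o libfoo.so -shared foo.c -mandroid -Wl,--exclude-libs,libgcc.a

The __aeabi_* helpers are then still included, but no longer exported.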

A similar issue happened a little while ago on Debian powerpc because a shared library was exporting ABI-specific bits. Even worse, the toolchain was assuming those symbols would come from libgcc.a and generated wrong relocations for them.

Update: Interestingly, the __aeabi_* symbols are hidden in libgcc.a as provided on the Debian armel port.

2012-03-06 17:19:34+0900

faulty.lib, p.d.o, p.m.o | 5 Comments »

Introducing faulty.lib

TL;DR link: faulty.lib is on github.

Android applications are essentially zip archives, with the APK extension instead of ZIP. While most of Android is Java, and Java classes are usually loaded from a ZIP archive (usually with the JAR extension), Android applications using native code need to have their native libraries on the file system. These native libraries are found under /data/data/$appid/lib, where $appid is the package name, as defined in the AndroidManifest.xml file.

So, when Android installs an application, it puts that APK file under /data/app. Then, if the APK contains native libraries under a lib/$ABI subdirectory (where $ABI is armeabi, armeabi-v7a or x86), it also decompresses the files and places them under /data/data/$appid/lib. This means native libraries are actually stored twice on internal flash: once compressed and once decompressed.

This is why Chrome for Android takes almost 50MB of internal flash space after installation.

Firefox for Android used to have that problem, and we decided we should stop doing that. Michael Wu thus implemented a custom dynamic linker, which would load most of Firefox's libraries directly off the APK. This involves decompressing the zipped data somewhere in memory, and doing ld.so's job to make the library usable (note that on Android, ld.so is actually named linker). There were initially circumstances under which we would decompress into a file and reuse it the next time Firefox starts, but we subsequently removed that possibility (except for debugging purposes) because it ended up being slower than decompressing each time (thanks to internal flash being so slow).

Anyway, in order to do ld.so's job, our custom linker was directly derived from Android's system linker, with many tweaks. This custom linker has done its job quite well for some time now, but it was recently replaced; see further below.

Considering Firefox can't do anything useful involving Gecko until its libraries are loaded, in practice, this means Firefox can't display a web page faster than completely decompressing the libraries. Or can it?

Just don’t sit down 'cause I’ve moved your chair

We know that a lot of code and data is not used during Firefox startup. Based on that knowledge, we started working on only loading the necessary bits. The core of the idea is, when a library is requested to be loaded, to reserve anonymous memory for its decompressed size, and... that's all. That memory is protected such that any access to it triggers a segmentation fault. When a segmentation fault happens, the required bits are decompressed, and execution is resumed where it was before the segmentation fault.
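
The following standalone C program sketches the mechanism, under simplifying assumptions (fixed 4KB chunks, a single library, and a dummy decompress_chunk standing in for real decompression); it is not faulty.lib's actual code:

#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define CHUNK_SIZE 4096

static char *base;        /* start of the reserved area */
static size_t total_size; /* decompressed size of the library */

/* Stand-in for decompressing the chunk at `offset` into `dest`. */
static void decompress_chunk(size_t offset, void *dest)
{
    memset(dest, 0, CHUNK_SIZE);
}

static void handler(int sig, siginfo_t *info, void *ctx)
{
    char *addr = (char *) info->si_addr;
    size_t offset;
    (void) sig; (void) ctx;
    if (addr < base || addr >= base + total_size)
        _exit(1); /* a real crash, not one of ours */
    offset = (size_t) (addr - base) & ~(size_t) (CHUNK_SIZE - 1);
    /* Fill the faulting chunk, then drop the write permission. */
    mprotect(base + offset, CHUNK_SIZE, PROT_READ | PROT_WRITE);
    decompress_chunk(offset, base + offset);
    mprotect(base + offset, CHUNK_SIZE, PROT_READ | PROT_EXEC);
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);
    total_size = 1 << 20;
    /* Reserve anonymous, inaccessible memory... and that's all. */
    base = mmap(NULL, total_size, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return 1;
    return base[42]; /* faults, gets "decompressed", execution resumes */
}

Only the chunks that are actually touched ever get decompressed; everything else stays as reserved, unbacked address space.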

The original prototype decompressed from a normal zip deflated stream, which means it was impossible to seek in it. So, if an access was made at the end of the library, it was necessary to decompress the whole library. With some nasty binary reordering, and some difficulty, it was possible to avoid accessing the end of the library, but the approach is very fragile. It only takes an unexpected code path to make things much slower than they should be.

Consequently, for the past months, I've been working on improving the original idea and, with some assistance from Julian Seward, implemented the scheme with seekable compressed streams. Instead of letting the zip archive creation tool deflate libraries, we store specially crafted files. Essentially, files are cut into small chunks, and each chunk is compressed individually. This means less efficient compression, but it also means random access to chunks is possible.
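
As an illustration, here is roughly what random access into such a stream looks like; the structure and names below are made up for the example, not the actual on-disk format:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical index: each chunk is compressed independently, and the
 * file offset of each compressed chunk is recorded. */
struct chunk_index {
    uint32_t chunk_size;  /* decompressed size of each chunk */
    uint32_t nchunks;
    uint32_t offset[16];  /* file offset of each compressed chunk */
};

int main(void)
{
    struct chunk_index idx = { 16384, 16, { 0 } };
    uint32_t target = 100000; /* decompressed offset being accessed */
    uint32_t chunk = target / idx.chunk_size;
    /* Only this one chunk needs inflating, not everything before it. */
    printf("inflate chunk %u at file offset %u\n",
           (unsigned) chunk, (unsigned) idx.offset[chunk]);
    return 0;
}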

However, instead of stacking on top of our existing custom linker, I started over from the ground up. First, because it needed a serious clean-up: a good part of linker.c is leftovers from the Android linker that we don't use, and APKOpen.cpp is a messy mix of JNI stubs, library decompression handling (which in itself was also a mess) and Gecko initialization code. Most importantly, because it relied on some Android system linker internals and thus required binary compatibility with the system linker. That compatibility, according to Google engineers who contacted us a few months ago, was going to break in what we now know will be called Android Jelly Bean.

The benefit of the clean slate approach is that the new code is not tied to Gecko at all and was designed to work on Android as well as on desktop Linux systems (which made debugging much much easier). We're thus releasing the code as a separate project: faulty.lib. It is licensed under the Mozilla Public License version 2.0. Please feel free to try, contribute, and/or fork it.

This dynamic linker is not meant to completely follow standard ELF rules (most notably for symbol resolution), and as a result it makes some assumptions. It's also still a work in progress, with some obvious optimizations pending (like avoiding resolving the same symbols again and again during relocations), and some features missing (for example, symbol versioning).

The next blog post will give some information about how to build Firefox for Android to benefit from on-demand decompression. I will also detail a few of the tricks involved in this dynamic linker in subsequent blog posts.

2012-03-01 17:10:11+0900

faulty.lib, p.d.o, p.m.o | 5 Comments »

Fun with weak symbols

Consider the following foo.c source file:

extern int bar() __attribute__((weak));
int foo() {
  return bar();
}

And the following bar.c source file:

int bar() {
  return 42;
}

Compile both sources:

$ gcc -o foo.o -c foo.c -fPIC
$ gcc -o bar.o -c bar.c -fPIC

In the resulting object for foo.c, we have an undefined symbol reference to bar. That symbol is marked weak.

In the resulting object for bar.c, the bar symbol is defined and not weak.

What we expect from linking both objects is that the weak reference is fulfilled by the existence of the bar symbol in the second object, and that in the resulting binary, the foo function calls bar.

$ ld -shared -o test1.so foo.o bar.o

And indeed, this is what happens.

$ objdump -T test1.so | grep "\(foo\|bar\)"
0000000000000260 g    DF .text  0000000000000007 foo
0000000000000270 g    DF .text  0000000000000006 bar

What do you think happens if the bar.o object file is embedded in a static library?

$ ar cr libbar.a bar.o
$ ld -shared -o test2.so foo.o libbar.a
$ objdump -T test2.so | grep "\(foo\|bar\)"
0000000000000260 g    DF .text  0000000000000007 foo
0000000000000000  w   D  *UND*  0000000000000000 bar

The bar function is now undefined, and all that is left is a weak reference to the symbol. Since an unresolved weak reference resolves to a null address, calling foo will crash at runtime.
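
Forcing the linker to include the whole archive brings bar back:

$ ld -shared -o test3.so foo.o --whole-archive libbar.a --no-whole-archive
$ objdump -T test3.so | grep "\(foo\|bar\)"

With --whole-archive, bar.o is included whether or not anything references it, so bar ends up defined again.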

This is apparently a feature of the linker: it only extracts an object file from a static library to satisfy a strong undefined reference; a weak undefined reference alone is not enough to pull bar.o out of libbar.a. If anyone knows why it works that way, I would be interested to hear about it. Good to know, though.

2012-02-23 10:46:50+0900

p.d.o, p.m.o | 10 Comments »

Debian Mozilla news

Here are a few noteworthy pieces of news about Mozilla packages in Debian:

  • Iceape 2.7 made its way to unstable. This is a huge jump from the previously available 2.0.14, and it finally happened because Iceape had reached the top of my TODO list.
  • Iceape 2.7 is also available for Squeeze users, on the Debian Mozilla team APT archive.
  • Localization is now part of Iceweasel uploads, which means that upgrades won't break localization anymore. It also means the Debian Mozilla team APT archive now also ships Iceweasel locales.

2012-02-18 09:37:10+0900

firefox, iceape | 4 Comments »

How to waste a lot of space without knowing

const char *foo = "foo";

This was recently mentioned on bugzilla, and the problem is usually underestimated, so I thought I would give some details about what is wrong with the code above.

The first common mistake here is to believe foo is a constant. It is a pointer to a constant. In practical ELF terms, this means the pointer lives in the .data section, and the string constant in .rodata. The following code defines a constant pointer to a constant:

const char * const foo = "foo";

The above code will put both the pointer and the string constant in .rodata. But keeping a constant pointer to a constant string is pointless. In the above examples, the string itself is 4 bytes (3 characters and a zero termination). On 32-bit architectures, a pointer is 4 bytes, so storing the pointer and the string takes 8 bytes: a 100% overhead. On 64-bit architectures, a pointer is 8 bytes, putting the total weight at 12 bytes: a 200% overhead.

The overhead is always the same size, though, so the longer the string, the smaller the overhead relative to the string size.

But there is another, not well known, hidden overhead: relocations. When loading a library in memory, its base address varies depending on how many other libraries were loaded beforehand, or depending on the use of address space layout randomization (ASLR). This also applies to programs built as position independent executables (PIE). For pointers embedded in the library or program image to point to the appropriate place, they need to be adjusted to the base address where the program or library is loaded. This process is called relocation.

The relocation process requires information that is stored in the .rel.* or .rela.* ELF sections. Each pointer needs one relocation. The relocation overhead varies depending on the relocation type and the architecture. REL relocations use 2 words, and RELA relocations use 3 words, where a word is 4 bytes on 32-bit architectures and 8 bytes on 64-bit architectures.

On x86 and ARM, to mention the most popular 32-bit architectures nowadays, REL relocations are used, which makes a relocation weigh 8 bytes. This puts the pointer overhead for our example string at 12 bytes, or 300% of the string size.

On x86-64, RELA relocations are used, making a relocation weigh 24 bytes! This puts the pointer overhead for our example string at 32 bytes, or 800% of the string size!
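
You can see the relocation with readelf (assuming foo.c contains the one-liner above; the library name is just for the example):

$ gcc -o libfoo.so -shared -fPIC foo.c
$ readelf -r libfoo.so

On x86-64, the pointer shows up as an R_X86_64_RELATIVE entry in the .rela.dyn section.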

Another hidden cost of using a pointer to a constant is that every time it is used in the code, there will be a pointer dereference. A function as simple as

const char *bar() { return foo; }

weighs one instruction more when foo is defined as const char *. On x86, that instruction weighs 2 bytes; on x86-64, 3 bytes; on ARM, 4 bytes (or 2 in Thumb). The exact weight can vary depending on the additional instructions required, but you get the idea: using a pointer to a constant also adds overhead to the code, in both time and space. Also, if the string is defined as a constant instead of being used as a literal in the code, chances are it's used several times, multiplying the number of such instructions. Update: Note that in the case of const char * const, the compiler will optimize these instructions away and avoid reading the pointer, since it's never going to change.
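
Comparing the pointer definition with the array definition from the conclusion below makes the difference visible (hypothetical file names, one per variant):

$ gcc -O2 -fPIC -c ptr.c array.c
$ objdump -d ptr.o array.o

With const char *foo, bar has to load the pointer's value from memory; with const char foo[], it merely computes foo's address.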

The symbol for foo is also exported, making it available to other libraries or programs. That might not be required, and it adds its own overhead: an entry in the symbols table (5 words), an entry in the string table for the symbol name (strlen("foo") + 1 bytes), an entry in the symbols hash chain table (4 bytes if only one type of hash table (SysV or GNU) is present, 8 if both are), and possibly an entry in the symbols hash bucket table, depending on the other exported symbols (4 or 8 bytes, like the chain table). It can also affect the size of the bloom filter table in the GNU symbol hash table.

So here we are, with a seemingly tiny 3-character string possibly taking 64 bytes or more! Now imagine what happens when you have an array of such tiny strings. And this doesn't only apply to strings; it applies to any kind of global pointer to constants.

In conclusion, using a definition like

const char *foo = "foo";

is almost never what you want. Instead, you want to use one of the following forms:

  • For a string meant to be exported:

    const char foo[] = "foo";

  • For a string meant to be used in the same source file:

    static const char foo[] = "foo";

  • For a string meant to be used across several source files for the same library:

    __attribute__((visibility("hidden"))) const char foo[] = "foo";

2012-02-18 09:17:21+0900

p.d.o, p.m.o | 15 Comments »

Building a custom kernel for the Nexus S

There are several reasons why someone would want to build a custom kernel for their Android phone. In my case, it is because I wanted performance counters (those used by the perf tool that comes with the kernel source). In Julian Seward's case, he wanted swap support, to overcome the limited amount of memory on these devices, in order to run valgrind. In both cases, the usual suspects (AOSP, CyanogenMod) don't provide the wanted features in prebuilt ROMs.

There are also several reasons why someone would NOT want to build a complete ROM for their Android phone. In my case, the Nexus S is what I use to work on Firefox Mobile, but it is also my actual mobile phone. It's quite a painful and long process to create a custom ROM, and another long (but arguably less painful, thanks to ROM manager) process to back up the phone data, install the ROM, and restore the phone data. And if you happen to like or use the proprietary Google apps that don't come with the AOSP sources, you need to add more steps.

There are, however, tricks that allow building a custom kernel for the Nexus S and using it with the system already on the phone. Please note that the following procedure has only been tested on two Nexus S devices with a 2.6.35.7-something kernel (one with a stock ROM, but unlocked, and another with an AOSP build). Also please note that there are various ways to achieve many of the steps in this procedure, but I'll only mention one (or two in a few cases). Finally, please note some steps rely on your device being rooted. There may be ways to do without, but I'm pretty sure it requires an unlocked device at the very least. This post covers neither rooting nor unlocking.

Preparing a build environment

To build an Android kernel, you need a cross-compiling toolchain. Theoretically, any toolchain will do, provided it targets ARM. I just used the one coming with the Android NDK:

$ wget http://dl.google.com/android/ndk/android-ndk-r6b-linux-x86.tar.bz2
$ tar -jxf android-ndk-r6b-linux-x86.tar.bz2
$ export ARCH=arm
$ export CROSS_COMPILE=$(pwd)/android-ndk-r6b/toolchains/arm-linux-androideabi-4.4.3/prebuilt/linux-x86/bin/arm-linux-androideabi-

For the latter, you need to use a directory path containing prefixed versions (such as arm-eabi-gcc or arm-linux-androideabi-gcc), and include the prefix, but not "gcc".

You will also need the adb tool coming from the Android SDK. You can install it this way:

$ wget http://dl.google.com/android/android-sdk_r12-linux_x86.tgz
$ tar -zxf android-sdk_r12-linux_x86.tgz
$ android-sdk-linux_x86/tools/android update sdk -u -t platform-tool
$ export PATH=$PATH:$(pwd)/android-sdk-linux_x86/platform-tools

Building the kernel

For the Nexus S, one needs to use the Samsung Android kernel tree, which happens to be unavailable at the time of writing due to the kernel.org outage. Fortunately, there is a clone used for the B2G project, which also happens to contain the necessary cherry-picked patch adding support for the PMU registers on the Nexus S CPU, needed for the performance counters.

$ git clone -b devrom-2.6.35 https://github.com/cgjones/samsung-android-kernel
$ cd samsung-android-kernel

You can then either start from the default kernel configuration:

$ make herring_defconfig

or use the one from the B2G project, which enables interesting features such as oprofile:

$ wget -O .config https://raw.github.com/cgjones/B2G/master/config/kernel-nexuss4g

From there, you can use make menuconfig or similar commands to further configure your kernel.

One of the problems you'd first encounter when booting such a custom kernel image is that the bcm4329 driver module that is shipped in the system partition (and not in the boot image) won't match the kernel, and won't be loaded. The unfortunate consequence is the lack of WiFi support.

One way to overcome this problem is to overwrite the kernel module in the system partition, but I didn't want to have to deal with switching modules when switching kernels.

There is however a trick allowing the existing module to be loaded by the kernel: compile a kernel with the same version string as the one already on the phone. Please note this only really works if the kernel is essentially the same. If there are differences in the binary interface between the kernel and the modules, it will fail in possibly dangerous ways.
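
To double-check which version string the existing module was built against, you can pull it from the device and read its vermagic (the module path below is an assumption; it may vary across builds):

$ adb pull /system/lib/modules/bcm4329.ko
$ modinfo -F vermagic bcm4329.ko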

To use that trick, you first need to know what kernel version is running on your device. Settings > About phone > Kernel version will give you that information on the device itself. You can also retrieve that information with the following command:

$ adb shell cat /proc/version

With my stock ROM, this looks like the following:

Linux version 2.6.35.7-ge382d80 (android-build@apa28.mtv.corp.google.com) (gcc version 4.4.3 (GCC) ) #1 PREEMPT Thu Mar 31 21:11:55 PDT 2011

In the About phone information, it looks like:

2.6.35.7-ge382d80
android-build@apa28

The important part above is -ge382d80, and that is what we will be using in our kernel build. Make sure the part preceding -ge382d80 does match the output of the following command:

$ make kernelversion

The trick is to write that -ge382d80 in a .scmversion file in the kernel source tree (obviously, you need to replace -ge382d80 with whatever your device has):

$ echo -ge382d80 > .scmversion

The kernel can now be built:

$ make -j$(($(grep -c processor /proc/cpuinfo) * 3 / 2))

The -j... part is the general rule I use when choosing the number of parallel processes make can use at the same time. You can pick whatever suits you better.

Before going further, we need to get back to the main directory:

$ cd ..

Getting the current boot image

The Android boot image living on the device doesn't only contain a kernel. It also contains a ramdisk with a few scripts and binaries that start the system initialization. As we will be reusing the ramdisk coming with the existing kernel, we need to get it from the device flash memory:

$ adb shell cat /proc/mtd | awk -F'[:"]' '$3 == "boot" {print $1}'

The above command will print the mtd device name corresponding to the "boot" partition. On the Nexus S, this should be mtd2.

$ adb shell
$ su
# dd if=/dev/mtd/mtd2 of=/sdcard/boot.img bs=4096
2048+0 records in
2048+0 records out
8388608 bytes transferred in x.xxx secs (xxxxxxxx bytes/sec)
# exit
$ exit

In the above command sequence, replace mtd2 with whatever the previous command output for you. Now, you can retrieve the boot image:

$ adb pull /sdcard/boot.img

Creating the new boot image

We first want to extract the ramdisk from that boot image. There are various tools to do so, but for convenience, I took unbootimg, on github, and modified it slightly to seamlessly support the page size on the Nexus S. For convenience as well, we'll use mkbootimg, even though fastboot is able to create boot images.

Building unbootimg, as well as the other tools, relies on the Android build system, but since I didn't want to go through setting it up, I figured out a minimalistic way to build the tools:

$ git clone https://github.com/glandium/unbootimg.git
$ git clone git://git.linaro.org/android/platform/system/core.git

The latter is a clone of git://android.git.kernel.org/platform/system/core.git, which is down at the moment.

$ gcc -o unbootimg/unbootimg unbootimg/unbootimg.c core/libmincrypt/sha.c -Icore/include -Icore/mkbootimg
$ gcc -o mkbootimg core/mkbootimg/mkbootimg.c core/libmincrypt/sha.c -Icore/include
$ gcc -o fastboot core/fastboot/{protocol,engine,bootimg,fastboot,usb_linux,util_linux}.c core/libzipfile/{centraldir,zipfile}.c -Icore/mkbootimg -Icore/include -lz

Once the tools are built, we can extract the various data from the boot image:

$ unbootimg/unbootimg boot.img
section sizes incorrect
kernel 1000 2b1b84
ramdisk 2b3000 22d55
second 2d6000 0
total 2d6000 800000
...but we can still continue

Don't worry about the error messages about incorrect section sizes as long as it tells you "we can still continue". The unbootimg program creates three files:

  • boot.img-mk, containing the mkbootimg options required to produce a working boot image,
  • boot.img-kernel, containing the kernel image,
  • boot.img-ramdisk.cpio.gz, containing the gzipped ramdisk, which we will reuse as-is.

All that is left to do is to generate the new boot image:

$ eval ./mkbootimg $(sed s,boot.img-kernel,samsung-android-kernel/arch/arm/boot/zImage, boot.img-mk)

Booting the image

There are two ways you can use the resulting boot image: a one-time boot, or flashing it. If you want to go for the latter, it is best to actually do both, starting with the one-time boot, to be sure you won't be leaving your phone unusable (recovery is there to the rescue, but it is not covered here).

First, you need to get your device into "fastboot" mode, a.k.a. the boot-loader:

$ adb reboot bootloader

Alternatively, you can power it off, and power it back on while pressing the volume up button.

Once you see the boot-loader screen, you can test the boot image with a one-time boot:

$ ./fastboot boot boot.img
downloading 'boot.img'...
OKAY [ 0.xxxs]
booting...
OKAY [ 0.xxxs]
finished. total time: 0.xxxs

As a side note, if fastboot sits at "waiting for device", it either means your device is not in fastboot mode (or is not connected), or that you have permission issues on the corresponding USB device in /dev.

Your device should now be starting up, and eventually be usable under your brand new kernel (and WiFi should be working, too). Congratulations.

If you want to use that kernel permanently, you can now flash it after going back into the bootloader:

$ adb reboot bootloader
$ ./fastboot flash boot boot.img
sending 'boot' (2904 KB)...
OKAY [ 0.xxxs]
writing 'boot'...
OKAY [ 0.xxxs]
finished. total time: 0.xxxs
$ ./fastboot reboot

Voilà.

2011-09-14 09:23:47+0900

p.d.o, p.m.o | 9 Comments »

Initial VMFS 5 support

Today I added initial VMFS 5 support to vmfs-tools. For the most part, VMFS 5 is VMFS 3 with new features added, so existing functionality should just work as before; but this initial support is very limited:

  • Unified 1MB File Block Size - Nothing has been changed here, so file size is still limited to 256 GB with 1MB File Block Size.
  • Large Single Extent Volumes - This is not supported yet. So the 2TB extent limitation still exists.
  • Smaller Sub-Block - This actually doesn't change anything in the on-disk format; it is only really the tuning of an existing value in the format. This should be handled out of the box by vmfs-tools.
  • Small File Support - VMFS 5 now stores files smaller than 1KB in the inode itself instead of allocating a Sub-Block. Support for this has been added on master.
  • Increased File Count - Like smaller Sub-Blocks, this was already supported by the on-disk format, and the change is only about tuning, so it should just work out of the box.

In related news, while the git repository here is kept alive, I also pushed it to github. The main reason I did so is the issue tracker.

Update: It turns out the small file support makes vmfs-tools crash when accessing files bigger than 256GB, because the assumption made when reverse engineering it was wrong and clashes with how files bigger than 256GB are implemented. It also turns out large single extent volumes may already be working, because that feature too looks like mere tuning of an existing value, like smaller sub-blocks and increased file count.

Update 2: Latest master now supports small files without crashing with files bigger than 256GB.

2011-09-09 17:34:17+0900

vmfs-tools | 23 Comments »

Extreme tab browsing

I have a pathological use of browser tabs: I use a lot of them. A lot is probably an understatement. You could say I use them as bookmarks of things I need to track. A couple of weeks ago, I was saying I had around two hundred tabs open. I now actually have many more.

It affected startup until I discovered that setting the browser.sessionstore.max_concurrent_tabs pref to 0 made things much better, by only loading tabs when they are selected. This preference is being renamed browser.sessionstore.restore_on_demand. However, since I only start my main browser once a day, while other applications start and while I begin to read email, I hadn't noticed that this was still heavily affecting startup time: about:startup tells me reaching the sessionRestored state takes seven seconds, even on a warm startup.

It also affects memory usage, because even when tabs are only loaded on demand, there is quite a big overhead for each tab.

And more importantly, it gets worse with time. And I think the user interface is actively making it worse.

So, to get an idea of how bad things were in my session, I wrote a little restartless extension. After installing it, you can go to the about:tabs url to see the damage on your own session. Please note that the number of groups is currently wrong until you open the tab grouping interface.

This is what the extension has to say about my session 2 days ago, right after a startup:

  • 556 tabs across 4 groups in 1 window
  • 1 tab has been loaded
  • 444 unique addresses
  • 105 unique hosts
  • 9 empty tabs
  • 210 http:
  • 319 https:
  • 14 ftp:
  • 2 about:
  • 2 file:
  • 55 addresses in more than 1 tab
  • 39 hosts in more than 1 tab

The first thing to note is that when I filed the memory bug 4 days earlier, I had a bit more than 470 tabs in that session. As you can see, 4 days later, I now have 555 tabs (excluding the about:tabs tab).

The second thing to note is something I suspected because it's so easy to get there: a lot of the tabs are open on the same address. Since Firefox 4.0, if I'm not mistaken, there has been a great feature in the awesomebar that allows jumping to an existing tab matching what you type in the urlbar. That is very useful, and I use it a lot. However, there are a lot of cases where it's not as useful as it could be.

One of the addresses I visit a lot is http://buildd.debian.org/iceweasel. It gives me the build status of the latest iceweasel package I uploaded to Debian unstable. That url is particularly prominent in my browsing history, and is the first hit when I type "buildd" in the urlbar (actually, even typing "b" brings it up first). Unfortunately, that url redirects to https://buildd.debian.org/status/package.php?p=iceweasel through an HTTP redirection. I say unfortunately because when I type "buildd" in the urlbar, I get 6 suggestions for urls of the form http://buildd.debian.org/package (I also watch other packages' build status), and the suggestion to switch to the existing tab for what the first hit would get me to comes 7th. Guess what? The suggestion list only shows 6 items; you have to scroll to see the 7th.

The result is that I effectively have fifteen tabs open on that url.

I also keep a lot of bugzilla.mozilla.org bugs open in different tabs. The extension tells me there are 255 of them... for 166 unique bugs. Largely, the duplicate bug tabs are due to having these bugs open in some tab, but accessing the same bugs from somewhere else, usually a dependent bug or TBPL. I also have 5 tabs opened on my request queue. I usually get there by going to the bugzilla home page and clicking on the "My Requests" link. And I have several tabs opened on the same bug lists. For the same reason.

When I started using tab groups, I split my tabs into very distinct groups. Basically, one for Mozilla, one for Debian, one for stuff I want to follow (usually blog posts I want to follow comments on), and one for the rest. While I kept up with grouping at the beginning, I don't anymore, and the result is that each group is now a real mess.

Firefox has hundreds of millions of users. It's impossible to create a user experience that works for everyone. One thing is sure: it doesn't work for me. My usage is probably very wrong at different levels, but I don't feel my browser is encouraging me to use it better, except by making my number of open tabs explode to an unmanageable level (I already have 30 tabs more than when I started writing this post 2 days ago).

There are a few other things I would like to know about my usage that my extension hasn't told me yet, either because it doesn't tell, or because I haven't looked:

  • How many tabs end up loaded at the end of a typical day?
  • How many tabs do I close?
  • How many duplicate tabs do I open and close?
  • How long has it been since I looked at a given tab?
  • How do the number of tabs and duplicates evolve with time?

Reflecting on my usage patterns, I think a few improvements, either in the stock browser, or through extensions, could make my browsing easier:

  • Auto-grouping tabs: When I click on a link to a url under mozilla.org, I most likely want it in the Mozilla group. A url under debian.org would most likely go in the Debian group.
  • Switching to an existing tab when following a link to an already opened url: That might not be very useful as a general rule, but at least for some domains, it would be useful if the browser switched to an existing tab not only through the urlbar, but also when following links in a page. If I'm reading a bug, click on a bug it depends on, and that bug is already open in another tab, take me there. There would be a history problem to solve, though (e.g. where would back and forward bring me?).

Maybe these exist as extensions, I don't know. It's hard to find very specific things like that through an add-on search (though I haven't searched very hard). [Looks like there is an experiment for the auto tab grouping part]

I think it would also be interesting to have something like Test Pilot, but for users that want to know the answer to "How do I use my browser?". As I understand it, Test Pilot can show individual user data, but it only can do so if there is such data, and you can't get data for past studies you didn't take.

In my case, I'm not entirely sure that, apart from the pinned tabs, I use the tab bar a lot. And even for pinned tabs, most of the time I use keyboard shortcuts. I'm not using the menu button that much either. I already removed the url and search bar (most of the time) with LessChrome HD. Maybe I could go further and use the full window for web browsing.

2011-08-29 09:27:55+0900

firefox, p.m.o | 48 Comments »

-feliminate-dwarf2-dups FAIL

DWARF-2 is a format to store debugging information. It is used on many ELF systems such as GNU/Linux. With the way things are compiled, there is a lot of redundant information in the DWARF-2 sections of an ELF binary.

Fortunately, there is a gcc option that helps deal with the redundant information and downsizes the DWARF-2 sections of ELF binaries. This option is -feliminate-dwarf2-dups.

Unfortunately, it doesn't work with C++.

With -g alone, libxul.so is 468 MB. With -g -feliminate-dwarf2-dups, it is... 1.5 GB. FAIL.

The good news is that, as stated in the message linked above, -gdwarf-4 does indeed help reduce the size of the debugging information. libxul.so, built with -gdwarf-4, is 339 MB. This however requires gcc 4.6 and a pretty recent gdb.
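
If you want to see where the bytes go in your own binaries, readelf can list the sizes of the .debug_* sections:

$ readelf -WS libxul.so | grep debug

Most of the difference shows up in the Size column for .debug_info.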

2011-07-30 11:21:01+0900

p.d.o, p.m.o | 1 Comment »

debian-rules-missing-recommended-target, dh, and dumb make

Lintian now has a warning for debian/rules files missing the build-arch and build-indep targets. As a dh user, I was surprised that some of my dh-using packages had this problem. And when looking at the source, I remembered how I came to this: GNU make is stupid.

Consider the following excerpt of the GNU make manual:

.PHONY
The prerequisites of the special target .PHONY are considered to be phony targets. When it is time to consider such a target, make will run its recipe unconditionally, regardless of whether a file with that name exists or what its last-modification time is.

And the following debian/rules:

.PHONY: build
%:
        dh $@

What do you think happens when you run debian/rules build in a directory containing a build file or directory?

make: Nothing to be done for `build'.

However, an explicit rule, like the following, works:

.PHONY: build
build:
        dh $@

It happens that many of the packages I maintain contain a build subdirectory in their source. As such, to work around the aforementioned issue, I just declared the dh rules explicitly, as in:

.PHONY: build binary binary-arch binary-indep (...)
build binary binary-arch binary-indep (...):
        dh $@

And this obviously doesn't scale for new targets such as build-arch and build-indep.

To be future-proof, I'll use the following instead:

.PHONY: build
build %:
        dh $@
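
A quick sanity check, in a package tree that contains a build directory (with the GNU make of the time):

$ mkdir -p build
$ debian/rules build

This time, make picks the explicit build rule and runs dh, instead of deciding there is nothing to be done.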

I don't know why I didn't do that the first time...

2011-07-23 10:48:07+0900

p.d.o | 1 Comment »