Preloading, reloaded

As David Baron reminded me in the corresponding bug, the stupid preloading trick is just too stupid: it would severely impact builds with debugging symbols, since those symbols would be preloaded as well. And debugging symbols in libxul.so are pretty massive (several hundred megabytes, versus around twenty megabytes without them).

So I came up with a smarter preloader that would only load the more or less relevant parts (all those that the dynamic linker would load), and use the readahead() system call instead of read().
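To illustrate what "the parts the dynamic linker would load" means for an ELF library: these are the file ranges covered by PT_LOAD program headers, and debugging sections normally sit outside them. The C sketch below is not the actual preloader, just one way the ranges could be found. The names `find_load_ranges` and `struct file_range` are made up for this example, and it assumes a Linux system with `<elf.h>` and `<link.h>` and a library of the same ELF class (32 or 64 bits) as the running process.

```c
#include <elf.h>
#include <fcntl.h>
#include <link.h>      /* ElfW(): selects Elf32_* or Elf64_* for the native word size */
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: a byte range within the library file. */
struct file_range {
    off_t  offset;
    size_t length;
};

/* Collect the file ranges of the PT_LOAD segments of a native-class ELF file.
 * Returns the number of ranges stored in out (at most max), or -1 on error. */
int find_load_ranges(const char *path, struct file_range *out, int max)
{
    ElfW(Ehdr) ehdr;
    int count = 0;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Read and sanity-check the ELF header. */
    if (pread(fd, &ehdr, sizeof(ehdr), 0) != (ssize_t)sizeof(ehdr) ||
        memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr.e_ident[EI_CLASS] !=
            (sizeof(ElfW(Addr)) == 8 ? ELFCLASS64 : ELFCLASS32) ||
        ehdr.e_phentsize != sizeof(ElfW(Phdr))) {
        close(fd);
        return -1;
    }

    /* Walk the program headers and keep the loadable segments only. */
    for (int i = 0; i < ehdr.e_phnum && count < max; i++) {
        ElfW(Phdr) phdr;
        off_t pos = (off_t)ehdr.e_phoff + (off_t)i * ehdr.e_phentsize;
        if (pread(fd, &phdr, sizeof(phdr), pos) != (ssize_t)sizeof(phdr)) {
            close(fd);
            return -1;
        }
        if (phdr.p_type == PT_LOAD && phdr.p_filesz > 0) {
            out[count].offset = (off_t)phdr.p_offset;
            out[count].length = (size_t)phdr.p_filesz;
            count++;
        }
    }

    close(fd);
    return count;
}
```

Each returned range can then be handed to whatever populates the page cache, instead of reading the whole file.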

Using readahead() has a double advantage: it limits the number of system calls (cat would read in 32KB chunks), and it avoids copying data from the page cache into a userspace buffer only to write() it to /dev/null, since readahead() only populates the page cache and returns nothing to userspace.
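For reference, here is a minimal sketch of a single readahead() call on a byte range of a file. It is an illustration under assumptions rather than the preloader's actual code: `preload_range` is a hypothetical name, and readahead() is Linux-specific, which is why `_GNU_SOURCE` has to be defined before any header is included.

```c
#define _GNU_SOURCE    /* readahead() is a Linux-specific extension */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel to bring [offset, offset + length) of the file at path
 * into the page cache. No data is copied to userspace.
 * Returns 0 on success, -1 on error. */
int preload_range(const char *path, off_t offset, size_t length)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* One system call for the whole range, however large it is. */
    int rc = (readahead(fd, offset, length) == 0) ? 0 : -1;

    close(fd);
    return rc;
}
```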

Together, these two savings make preloading even (slightly) faster.

| | x86 | x86-64 |
|---|---|---|
| 4.0b8 | 3,228.76 ± 0.57% | 3,382.0 ± 0.51% |
| 4.0b8 with preload | 2,347.18 ± 0.67% | 2,709.82 ± 0.54% |
| Difference | 881.58 (27.30%) | 672.18 (19.86%) |
| 4.0b8 with better preload | 2,231.16 ± 0.73% | 2,636.76 ± 0.42% |
| Difference | 997.6 (30.89%) | 745.24 (22.04%) |

When I first talked about this stupid hack, I mentioned that it wouldn't work on OS X, since we are using (fat) universal binaries there. Well, with an approach like this one, we should be able to load only the parts relevant to the runtime architecture. Over the next week, I'll check whether that works out.

2011-02-11 16:22:09+0900
