Hooking the memory allocator in Firefox
Supplanting the system memory allocator usually involves some tricks. In cross-platform software like Firefox, this involves different tricks on different platforms. Firefox uses such tricks to implant jemalloc. Sadly, this makes replacing jemalloc itself even trickier.
For instance, trace-malloc, our leak detection tool used on debug builds, requires that jemalloc be disabled.
Work is under way to make supplanting jemalloc much easier. It is not yet clear if this will be enabled by default on release builds, but it would make sense to enable the feature at least on nightlies.
What does the feature provide? A way to hook or replace jemalloc in Firefox at startup time (as opposed to build time, like trace-malloc). The idea is to build a specialized library (more on that further below) and make Firefox use it instead of, or on top of, jemalloc, with some weak linking tricks. To enable the feature, pass --enable-replace-malloc to configure, or add ac_add_options --enable-replace-malloc to your mozconfig (provided you applied the patches or have a tree where the patches have landed).
With the feature built, you can start Firefox with a malloc replacement library easily:
- On GNU/Linux:
$ LD_PRELOAD=/path/to/library.so firefox
- On OSX:
$ DYLD_INSERT_LIBRARIES=/path/to/library.dylib firefox
- On Windows:
$ MOZ_REPLACE_MALLOC_LIB=drive:\path\to\library.dll firefox
- On Android:
$ am start -a android.activity.MAIN -n org.mozilla.fennec/.App --es env0 MOZ_REPLACE_MALLOC_LIB=/path/to/library.so
I happen to have built Firefox with the feature enabled for all platforms on try, to validate that it works, so you can toy around with these builds.
A replacement library is expected to provide the following functions, or any subset:
void replace_init(const malloc_table_t *table)
void *replace_malloc(size_t size)
int replace_posix_memalign(void **ptr, size_t alignment, size_t size)
void *replace_aligned_alloc(size_t alignment, size_t size)
void *replace_calloc(size_t num, size_t size)
void *replace_realloc(void *ptr, size_t size)
void replace_free(void *ptr)
void *replace_memalign(size_t alignment, size_t size)
void *replace_valloc(size_t size)
size_t replace_malloc_usable_size(usable_ptr_t ptr)
size_t replace_malloc_good_size(size_t size)
void replace_jemalloc_stats(jemalloc_stats_t *stats)
void replace_jemalloc_purge_freed_pages()
void replace_jemalloc_free_dirty_pages()
The first function, replace_init, is the first function from the library that will be called (if it exists), before the first call to any other. It is passed a pointer to a function table containing pointers to the corresponding jemalloc functions from Firefox.
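As a rough illustration of how the table passed to replace_init can be used, here is a minimal sketch that keeps the pointer around and counts allocations that have not been freed yet. The example_table_t type and the live counter are hypothetical stand-ins so that the snippet compiles on its own; an actual library would include replace_malloc.h and use malloc_table_t instead. A real library would also need to hook calloc, realloc and friends for the count to be meaningful.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for the table type declared in
 * replace_malloc.h. It only exists to keep this sketch self-contained. */
typedef struct {
    void *(*malloc)(size_t);
    void (*free)(void *);
} example_table_t;

static const example_table_t *funcs = NULL;
static long live = 0; /* allocations not yet freed; not thread-safe */

/* Called before any other replacement function: remember the table. */
void replace_init(const example_table_t *table)
{
    funcs = table;
}

void *replace_malloc(size_t size)
{
    /* Forward to the underlying allocator through the table. */
    void *ptr = funcs->malloc(size);
    if (ptr)
        live++;
    return ptr;
}

void replace_free(void *ptr)
{
    /* free(NULL) is a no-op, so don't count it. */
    if (ptr)
        live--;
    funcs->free(ptr);
}
```

The counter is only touched after the forwarded call has returned, so the hook itself never allocates.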
The last three functions are specific to jemalloc. jemalloc_stats only needs to be replaced if you want about:memory to remain accurate with respect to whatever your other functions do, and jemalloc_purge_freed_pages and jemalloc_free_dirty_pages are used to force the allocator to return some unused memory to the system.
The other functions are the usual suspects, picked from C89, POSIX, C11, or OSX (malloc_good_size). They should however all be considered cross-platform (especially malloc_good_size).
All these functions, when they exist, are called instead of the corresponding jemalloc functions, which makes it the responsibility of the replacing functions to call the corresponding jemalloc function (through the table passed to replace_init) if necessary.
This makes it possible, for example, to:
- Replace jemalloc entirely. The third patch in bug 804303 does that, to allow replacing the (currently default) old fork of jemalloc with a fresh jemalloc. Something similar could be done to test other allocators, like tcmalloc.
- Make memory allocation functions randomly return NULL as in Out of Memory conditions, aka fuzzing.
- Make all allocations bigger to add tracing data.
- Log allocations.
- etc.
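To give an idea of what the Out of Memory item in the list above might look like, here is a minimal sketch. It is an assumption-laden illustration, not code from the patches: example_table_t is a hypothetical stand-in for malloc_table_t, the EXAMPLE_FAIL_PERIOD environment variable is made up, and failures are injected every Nth call rather than randomly so that runs are reproducible.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for the table type declared in
 * replace_malloc.h. */
typedef struct {
    void *(*malloc)(size_t);
} example_table_t;

static const example_table_t *funcs = NULL;
static unsigned long calls = 0;       /* not thread-safe */
static unsigned long fail_period = 0; /* 0 disables injected failures */

void replace_init(const example_table_t *table)
{
    funcs = table;
    /* EXAMPLE_FAIL_PERIOD is a made-up knob for this sketch. */
    const char *period = getenv("EXAMPLE_FAIL_PERIOD");
    if (period)
        fail_period = strtoul(period, NULL, 10);
}

void *replace_malloc(size_t size)
{
    calls++;
    /* Every fail_period-th call pretends that memory ran out. */
    if (fail_period && calls % fail_period == 0)
        return NULL;
    return funcs->malloc(size);
}
```

A random variant would replace the modulo test with a pseudo-random draw, at the cost of making failures harder to reproduce.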
A small implementation example
Consider the following question: how many times does realloc end up copying data? Stated differently, how many times does realloc not return the pointer it was given?
Create the memory/replace/realloc/realloc.c file with the following content:
// This header will declare all the replacement functions, such that you don't need
// to worry about exporting them with the right idiom (dllexport, visibility...)
#include "replace_malloc.h"
#include <stdlib.h>
#include <stdio.h>

static const malloc_table_t *funcs = NULL;
static unsigned int total = 0, copies = 0;

// Registered with atexit() to print the counters when Firefox exits.
static void print_stats(void)
{
  printf("%u reallocs, %u copies\n", total, copies);
}

void replace_init(const malloc_table_t *table)
{
  funcs = table;
  atexit(print_stats);
}

void *replace_realloc(void *ptr, size_t size)
{
  void *newptr = funcs->realloc(ptr, size);
  // Not thread-safe, but it's only an example.
  total++;
  // We don't want to count deallocations as copies.
  if (newptr && newptr != ptr)
    copies++;
  return newptr;
}
Add a memory/replace/realloc/Makefile.in file:
DEPTH = @DEPTH@
topsrcdir = @top_srcdir@
srcdir = @srcdir@
VPATH = @srcdir@

include $(DEPTH)/config/autoconf.mk

LIBRARY_NAME = replace_realloc
FORCE_SHARED_LIB = 1
NO_DIST_INSTALL = 1

CSRCS = realloc.c

MOZ_GLUE_LDFLAGS = # Don't link against mozglue
WRAP_LDFLAGS = # Never wrap malloc function calls with -Wl,--wrap

include $(topsrcdir)/config/rules.mk
Add the following to memory/replace/Makefile.in:
DIRS += realloc
Finally, build objdir/memory/replace. You'll get a library in objdir/memory/replace/realloc that you can use as described at the beginning of this post.
On my system, after starting and quitting Firefox without doing much, it prints:
41078 reallocs, 37197 copies
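The counters in realloc.c are updated without any synchronization, as its own comment points out. If exact numbers mattered, one possible variant, assuming a compiler that supports C11 <stdatomic.h>, would be to use atomic counters. Here again, example_table_t is a hypothetical stand-in for malloc_table_t so that the sketch compiles on its own, and the reporting part is left out since it would not change.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for the table type declared in
 * replace_malloc.h. */
typedef struct {
    void *(*realloc)(void *, size_t);
} example_table_t;

static const example_table_t *funcs = NULL;
static atomic_uint total = 0, copies = 0;

void replace_init(const example_table_t *table)
{
    funcs = table;
}

void *replace_realloc(void *ptr, size_t size)
{
    void *newptr = funcs->realloc(ptr, size);
    /* Relaxed ordering is enough: the counters are independent
     * statistics and do not order any other memory access. */
    atomic_fetch_add_explicit(&total, 1, memory_order_relaxed);
    /* Same test as in the original example, so a first allocation
     * through realloc(NULL, size) is counted as well. */
    if (newptr && newptr != ptr)
        atomic_fetch_add_explicit(&copies, 1, memory_order_relaxed);
    return newptr;
}
```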
It sure is a simple example, and one that other tools (like dtrace) can actually handle, but it's now up to you, developers, to come up with more useful uses. The blocked bugs already show some. Note that this facility still has the advantage of being more cross-platform than tools like dtrace, and of working happily on top of jemalloc (valgrind, for instance, doesn't support that gracefully), which can be important when looking at some particular aspects of memory allocation. The above example, while simple, is a typical case where the underlying memory allocation library has an impact on the result: other memory allocation libraries have different size classes, which changes how often realloc needs to actually reallocate, as opposed to growing the existing allocation in place.
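The dependence on the allocator can be observed outside of Firefox too. The following sketch, which is not part of the patches, grows a single buffer step by step with whatever realloc the C library provides and counts how many calls moved it. The grow_and_count function and the grow_stats_t type are made up for the illustration, and the number of moves will differ from one allocator to the next.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    unsigned total; /* realloc calls made */
    unsigned moved; /* calls that returned a different address */
} grow_stats_t;

/* Grows a buffer from `start` bytes up to `limit` bytes, `step` bytes
 * at a time. Returns 0 on success, -1 on bad arguments or failure. */
int grow_and_count(size_t start, size_t step, size_t limit,
                   grow_stats_t *stats)
{
    if (step == 0 || stats == NULL)
        return -1;
    stats->total = 0;
    stats->moved = 0;

    char *buf = malloc(start);
    if (!buf)
        return -1;

    for (size_t size = start + step; size <= limit; size += step) {
        /* Save the address as an integer before the call, since the
         * old pointer must not be used once realloc has succeeded. */
        uintptr_t before = (uintptr_t)buf;
        char *grown = realloc(buf, size);
        if (!grown) {
            free(buf);
            return -1;
        }
        stats->total++;
        if ((uintptr_t)grown != before)
            stats->moved++;
        buf = grown;
    }
    free(buf);
    return 0;
}
```

Calling grow_and_count(16, 16, 4096, &stats) performs 255 realloc calls; how many of them end up in stats.moved is exactly the part that depends on the allocator's size classes.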
2012-11-27 13:49:15+0900