On my Nexus S, the timeline when starting up a current nightly looks like this:
 550ms  (+550ms)  The application entry point in Java code is reached
1350ms  (+800ms)  The application entry point in native code is reached
1700ms  (+350ms)  Start creating the top level window (i.e. no XUL has been loaded yet)
2500ms  (+800ms)  Something is painted on screen
That’s quite a good scenario, currently. On the very first run after installation, even more time is spent extracting a bunch of files and creating the profile. Opening a URL is also slower, because of the time needed to initialize the content process: tapping the icon brings up about:home, which doesn’t start a separate content process.
The Nexus S also isn’t exactly a slow device, even if there are much better devices nowadays. This means there are devices out there where startup is even slower than that.
Considering that Android devices have much less memory than most desktop computers, and that switching between applications is therefore very likely to get background applications killed, startup time is a particularly important issue on Android.
The Mobile team is working on making most of the last 800ms go away. The Performance team is working on the other bits.
Reaching Java code
550ms to reach our code sounds both outrageous and impossible to solve: we control neither how nor when our code is called by Android after the user has tapped on our icon. Yet, in fact, we do have some control.
Not all Firefox builds are created equal, and it turns out some builds don’t have the outrageous 550ms overhead; they have a 300ms overhead instead. The former builds contain multiple languages. The latter contain only English.
Android applications come in the form of APK files, which are really ZIP files. It turns out Android takes a whole lot of time at startup to read the entire list of files in the APK and/or to handle the corresponding manifest file, and the more files there are, the more time is spent. A multi-language build contains 3651 files. A single-language build contains “only” 1428 files.
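As an illustration (this is not Android’s actual implementation, just a sketch of the same kind of per-entry work), the following Java snippet builds an in-memory archive with a given number of tiny entries and then scans all of them, the way any full scan of an APK’s file list scales with the file count:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class EntryScan {
    // Build an in-memory archive with `count` tiny entries, then walk
    // the whole entry list, returning how many entries the scan saw.
    // The entry names are made up for the example.
    static int scanEntries(int count) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (int i = 0; i < count; i++) {
                zip.putNextEntry(new ZipEntry("chrome/file" + i + ".txt"));
                zip.write('x');
                zip.closeEntry();
            }
        }
        int seen = 0;
        ZipInputStream zin =
            new ZipInputStream(new ByteArrayInputStream(out.toByteArray()));
        while (zin.getNextEntry() != null) seen++;
        return seen;
    }

    public static void main(String[] args) throws IOException {
        // The scan cost is linear in the number of entries, so fewer
        // entries means less work on every startup.
        System.out.println("multi-language-sized scan: " + scanEntries(3651));
        System.out.println("single-language-sized scan: " + scanEntries(1428));
    }
}
```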
We expect to get down to about 100ms by packing chrome, components and some other resources together in a single file, effectively storing a ZIP archive (omni.jar) inside the APK. We (obviously) won’t compress at both levels.
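A minimal sketch of what “not compressing at both levels” means: the inner archive goes into the outer one as a STORED (uncompressed) entry, which `java.util.zip` supports as long as the size and CRC are declared up front. The entry name `assets/omni.jar` is illustrative, not the actual packaging layout:

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class StoredEntryDemo {
    // Pack an inner blob (e.g. an already-compressed omni.jar-like
    // archive) into an outer zip as a STORED entry, so its data is
    // not compressed a second time.
    static byte[] packStored(String name, byte[] inner) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            ZipEntry entry = new ZipEntry(name);
            entry.setMethod(ZipEntry.STORED);
            // STORED entries must declare size and CRC before writing.
            entry.setSize(inner.length);
            entry.setCompressedSize(inner.length);
            CRC32 crc = new CRC32();
            crc.update(inner);
            entry.setCrc(crc.getValue());
            zip.putNextEntry(entry);
            zip.write(inner);
            zip.closeEntry();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] inner = "already-compressed payload".getBytes("UTF-8");
        byte[] outer = packStored("assets/omni.jar", inner);
        System.out.println("outer archive: " + outer.length + " bytes");
    }
}
```

A STORED entry can still be read back like any other, but reading it is a plain copy rather than an inflate pass.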
Reaching native code
Native code, as on other platforms, comes in the form of libraries. These libraries are stored in the APK and decompressed into memory when Firefox starts. Not so long ago, when space permitted, Firefox would actually store the decompressed libraries on the internal flash, but as this wasn’t a clear win in actual use cases (on top of making the first startup abysmally slow), it was disabled. We still have the possibility to enable it for use with tools such as valgrind or oprofile, which don’t support the way we load libraries directly off the APK.
These 800ms between reaching Java code and reaching native code are spent decompressing the libraries, applying relocations and running static initialization code. Most of that time is actually spent decompressing the main library (libxul.so) alone.
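To get a feel for that cost, here is a rough Java sketch that inflates a single DEFLATED entry from an in-memory archive and times it. The size, the pseudo-random contents and the `lib/armeabi/libxul.so` name are placeholders; real numbers depend entirely on the device and the actual library:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Random;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class InflateCost {
    // Build an in-memory archive holding one DEFLATED entry of `size`
    // pseudo-random bytes, standing in for a compressed library.
    static byte[] makeArchive(int size) throws IOException {
        byte[] lib = new byte[size];
        new Random(42).nextBytes(lib);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("lib/armeabi/libxul.so"));
            zip.write(lib);
            zip.closeEntry();
        }
        return out.toByteArray();
    }

    // Inflate the entry and return how many bytes came out; the time
    // this takes is the kind of cost paid before native code can run.
    static long inflateAll(byte[] archive) throws IOException {
        ZipInputStream zin =
            new ZipInputStream(new ByteArrayInputStream(archive));
        zin.getNextEntry();
        byte[] buf = new byte[65536];
        long total = 0;
        for (int n; (n = zin.read(buf)) > 0; ) total += n;
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] archive = makeArchive(4 << 20);
        long start = System.nanoTime();
        long total = inflateAll(archive);
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println("inflated " + total + " bytes in " + ms + " ms");
    }
}
```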
But we don’t actually need all that code from the very start. We don’t even need all of it to display the user interface and the first web page we load. The analysis I conducted last year on desktop Firefox very much applies to mobile too.
By only loading the pieces that are needed, we expect to cut the loading time at least in half, and even more on multi-core devices. Last week, we built a prototype with promising results, but the experience also uncovered difficulties. More details will follow in a subsequent post.
Another significant part of the startup phase is initializing Gecko. This includes, but is not limited to:
- Registering and initializing components
- Initializing the various subsystems using sqlite databases
The switch to a native Android user interface is going to significantly change how filesystem accesses (most notably sqlite’s) happen. We have experimented with asynchronous I/O for sqlite (and seen it break Places), but we can’t currently go much further until the native UI reaches a certain level of maturity.
We do, however, need to identify which particular pieces of initialization contribute the most to the 350ms between reaching native code and starting to create the top level window.