Standing up the Cross-Compilation of Firefox for Windows on Linux

I've spent the past few weeks, and will spend the next few weeks, setting up cross-compiled builds of Firefox for Windows on Linux workers on Mozilla's CI. What follows is a long wall of text. If that's too much for you, you may want to check the TL;DR near the end. If you're a Windows user wondering about the Windows Subsystem for Linux, please at least check the end of the post.

What is it?

Traditionally, compiling software happens mostly on the platform it is going to run on. Obviously, this becomes less true when you're building software that runs on smartphones, because you're usually not developing on said smartphone. This is where Cross-Compilation comes in.

Cross-Compilation is compiling for a platform that is not the one you're compiling on.

Cross-Compilation is less frequent for desktop software, because most developers will be testing the software on the machine they are building it with, which means building software for macOS on a Windows PC is not all that interesting to begin with.

Continuous Integration, on the other hand, in the era of "build pipelines", doesn't necessarily care that the software is built in the same environment as the one it runs on, or is being tested on.

But... why?

Five years ago or so, we started building Firefox for macOS on Linux. The main drivers, as far as I can remember, were resources and performance, and they were both tied: the only (legal) way to run macOS in a datacenter is to rack... Macs. And it's not like Apple had been producing rackable, server-grade, machines. Okay, they have, but that didn't last. So we were using aging Mac minis. Switching to Linux machines led to faster compilation times, and allowed us to recycle the Mac minis to grow the pool running tests.

But, you might say, Windows runs on standard, rackable, server-grade machines. Or on virtually all cloud providers. And that is true. But for the same hardware, it turns out Linux performs better (more on that below), and the cost per hour per machine is also increased by the Windows license.

But then... why only now?

Firefox has a legacy of more than 20 years of development. That shows in its build system. All the things that allow cross-compiling Firefox for Windows on Linux only lined up recently.

The first of them is the compiler. You might interject with "mingw something something", but the reality is that binary compatibility for accessibility (screen readers, etc.) and plugins (Flash is almost dead, but not quite) required Microsoft Visual C++ until recently. What changed the deal is clang-cl, and Mozilla stopped using MSVC for the builds of Firefox it ships as of Firefox 63, about 20 months ago.

Another is the process of creating the symbol files used to process crash reports, which was using one of the tools from breakpad to dump the debug info from PDB files in the right format. Unfortunately, that was using a Windows DLL to do so. What recently changed is that we now have a platform-independent tool to do this, one that doesn't require that DLL. And to give credit where credit is due, this was thanks to the people from Sentry providing Rust crates for most of the pieces necessary to do so.

Another is the build system itself, which assumed in many places that building for Windows meant you were on Windows, which doesn't help cross-compiling for Windows. But worse than that, it also assumed that the compiler used for the build's own tools and the compiler used for Firefox were similar. This worked fine when cross-compiling for Android or macOS on Linux because compiling tools for the build itself (most notably a clang plugin) and compiling Firefox use compatible compilers that take the same kind of arguments. The story is different when one of the compilers is clang, which has command line arguments like GCC, and the other is clang-cl, which has command line arguments like MSVC. This changed recently with work to allow building Android GeckoView on Windows (I'm not entirely sure all the pieces for that are there just yet, but the ones in place surely helped me; I might have inadvertently broken some things, though).

So how does that work?

The above is unfortunately not the whole story, so when I started looking a few weeks ago, the idea was to figure out how far off we were, and what kind of shortcuts we could take to make it happen.

It turns out we weren't that far off, and for a few things, we could work around by... just running the necessary Windows programs with Wine with some tweaks to the build system (Ironically, that means the tool to create symbol files didn't matter). For others... more on that further below.

But let's start by looking at how you could try this for yourself, now that the blockers have been fixed.

First, what do you need?

  • A copy of Microsoft Visual C++. Unfortunately, we still need some of the tools it contains, like the assembler, as well as the platform development files.
  • A copy of the Windows 10 SDK.
  • A copy of the Windows Debug Interface Access (DIA) SDK.
  • A good old VFAT filesystem, large enough to hold a copy of all the above.
  • A WOW64-supporting version of Wine (wine64).
  • A full install of clang, including clang-cl (it usually comes along).
  • A copy of the Windows version of clang-cl (yes, both a Linux clang-cl and a Windows clang-cl are required at the moment, more on this further below).

Next, you need to set up a .mozconfig that sets the right target:

ac_add_options --target=x86_64-pc-mingw32

(Note: the target will change in the future)

You also need to set a few environment variables:

  • WINDOWSSDKDIR, with the full path to the base of the Windows 10 SDK in your VFAT filesystem.
  • DIA_SDK_PATH, with the full path to the base of the Debug Interface Access SDK in your VFAT filesystem.

You also need to ensure all the following are reachable from your $PATH:

  • wine64
  • ml64.exe (somewhere in the copy of MSVC in your VFAT filesystem, under a Hostx64/x64 directory)
  • clang-cl.exe (you also need to ensure it has the executable bit set)

And I think that's about it. If not, please leave a comment or ping me on Matrix (@glandium:mozilla.org), and I'll update the instructions above.
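As a rough way to double-check the checklist above before running the build, here is a small Python sketch. The helper name check_cross_build_env is made up for illustration and is not part of mach or the Firefox build system. It only looks at the two environment variables and the three programs listed above, and it says nothing about whether the SDK contents themselves are complete.

```python
import os
import shutil

# Names taken from the instructions above.
REQUIRED_ENV_VARS = ("WINDOWSSDKDIR", "DIA_SDK_PATH")
REQUIRED_PROGRAMS = ("wine64", "ml64.exe", "clang-cl.exe")


def check_cross_build_env(environ=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means the basics look fine.

    Hypothetical helper, not part of mach or the Firefox build system.
    """
    environ = os.environ if environ is None else environ
    problems = []
    for name in REQUIRED_ENV_VARS:
        value = environ.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not os.path.isdir(value):
            problems.append(f"{name} does not point to a directory: {value}")
    for program in REQUIRED_PROGRAMS:
        # shutil.which only returns files that are executable, so a
        # clang-cl.exe without the executable bit is reported as missing.
        if which(program, path=environ.get("PATH")) is None:
            problems.append(f"{program} is not reachable from $PATH (or is not executable)")
    return problems


if __name__ == "__main__":
    for problem in check_cross_build_env():
        print(problem)
```

Running the script with nothing set up prints one line per missing piece. An empty output only means the names resolve, not that the build will succeed.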

With an up-to-date mozilla-central, you should now be able to use ./mach build, and get a fresh build of Firefox for 64-bit Windows as a result (well, not right now as of writing: the final pieces only just landed on autoland, and they will be on mozilla-central in a few hours).

What's up with that VFAT filesystem?

You probably noticed I was fairly insistent about some things being in a VFAT filesystem. The reason is filesystem case-(in)sensitivity. As you probably know, filesystems on Windows are case-insensitive. If you create a file Foo, you can access it as foo, FOO, fOO, etc.

On Linux, filesystems are most usually case-sensitive. So when some C++ file contains #include "windows.h" and your filesystem actually contains Windows.h, things don't align right. Likewise when the linker wants kernel32.lib and you have kernel32.Lib.

Ext4 recently gained some optional case-insensitivity, but it requires a very recent kernel, and doesn't work on existing filesystems. VFAT, however, as supported by Linux, has always(?) been case-insensitive. It is the simpler choice.

There's another option, though, in the form of FUSE filesystems that wrap an existing directory to expose it as case-insensitive. That's what I tried first, actually. CIOPFS does just that, with the caveat that you need to start from an empty directory, or an all-lowercase directory, because files with any uppercase characters in their name in the original directory don't appear in the mountpoint at all. Unfortunately, the last version, from almost 9 years ago, doesn't withstand parallelism: when several processes access files under the mountpoint, one or several of them get failures they wouldn't otherwise get if they were working alone. So during my first attempts cross-building Firefox I was actually using -j1. Needless to say, the build took a while, but it also made it more obvious when I hit something broken that needed fixing.

Now, on Mozilla CI, we can't really mount a VFAT filesystem or use FUSE filesystems that easily. Which brings us to the next option: LD_PRELOAD. LD_PRELOAD is an environment variable that can be set to tell the dynamic loader (ld.so) to load a specified library when loading programs. Which in itself doesn't do much, but the symbols the library exposes will take precedence over similarly named symbols from other libraries. Such as libc.so symbols. Which allows diverting e.g. open, opendir, etc. See where this is going? The library can divert the functions programs use to access files and change the paths the programs are trying to use on the fly.

Such libraries do exist, but I had issues with the few I tried. The most promising one was libcasefold, but building its dependencies turned out to be more work than it should have been, and the hooking it does via libsyscall_intercept is more hardcore than what I'm talking about above, and I wasn't sure we wanted to support something that hotpatches libc.so machine code at runtime rather than divert it.

The result is that we now use our own, written in Rust (because who wants to write bullet-proof path munging code in C?). It can be used instead of a VFAT filesystem in the setup described above, but, being a hack, is not guaranteed to work in all setups.
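To give an idea of the kind of path munging involved, here is a minimal Python sketch of case-insensitive path resolution. It is not how the Rust library works internally (that one diverts libc functions through LD_PRELOAD, which Python cannot show directly). The function name resolve_case_insensitive is made up for illustration, and the sketch ignores symlinks, races, and directories containing several names that differ only by case.

```python
import os


def resolve_case_insensitive(root, relative_path):
    """Map relative_path onto what actually exists under root, ignoring case.

    Returns the real path, or None when some component has no match.
    Illustration only: a hypothetical helper, not the LD_PRELOAD library.
    """
    current = root
    for component in relative_path.replace("\\", "/").split("/"):
        if component in ("", "."):
            continue
        exact = os.path.join(current, component)
        if os.path.lexists(exact):
            # Fast path: the name already has the right case.
            current = exact
            continue
        try:
            entries = os.listdir(current)
        except OSError:
            return None
        wanted = component.lower()
        # Take the first entry that matches when case is ignored.
        match = next((e for e in entries if e.lower() == wanted), None)
        if match is None:
            return None
        current = os.path.join(current, match)
    return current
```

With a directory that actually contains Include/Windows.h, asking for include/windows.h under that root would come back with the on-disk spelling, which is the rewrite a diverted open would then use.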

So what's up with needing clang-cl.exe?

One of the tools Firefox needs to build is the MIDL compiler. To do its work, the MIDL compiler uses a C preprocessor, and the Firefox build system makes it use clang-cl. Something amazing that I discovered while working on this is that Wine actually supports executing Linux programs from Windows programs. So it looked like it was going to be possible to use the Linux clang-cl for that. Unfortunately, that doesn't quite work the same way executing a Windows program does from the parent process's perspective, and the MIDL compiler ended up being unable to read the output from the preprocessor.

Technically speaking, we could have made the MIDL compiler use MSVC's cl.exe as a preprocessor, since it conveniently is in the same directory as ml64.exe, meaning it is already in $PATH. But that would have been a step backwards, since we specifically moved off cl.exe.

Alternatively, it is also theoretically possible to compile with --disable-accessibility to avoid requiring the MIDL compiler at all, but that currently doesn't work in practice. And while that would help for local builds, we still want to ship Firefox with accessibility on.

What about those compilation times, then?

Past my first attempts at -j1, I was able to get a Windows build on my Linux machine in slightly less than twice the time for a Linux build, which doesn't sound great. Several things factor into this:

  • the build system isn't parallelizing many of the calls to the MIDL compiler, and in practice that means the build sits there doing only that and nothing else (there are some known inefficiencies in the phase where this runs).
  • the build system isn't parallelizing the calls to the Effect compiler (FXC), and this has the same effect on build times as the MIDL compiler above.
  • the above two wouldn't actually be that much of a problem if ... Wine wasn't slow. When running full fledged applications or games, it really isn't, but there is a very noticeable overhead when running a lot of short-lived processes. That accumulates to several minutes over a full Firefox compilation.

That third point may or may not be related to the version of Wine available in Debian stable (what I was compiling on), or how it's compiled, but some lucky accident made things much faster on my machine.
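To get a feel for that per-process overhead on a given machine, a sketch like the following can help. It simply times a number of sequential launches of a short-lived command. The helper name and the example command lines are assumptions for illustration, not something the build system runs, and the results depend heavily on the machine and on how Wine was built.

```python
import subprocess
import sys
import time


def average_launch_time(command, runs=20):
    """Run command `runs` times, one after the other, and return the mean wall-clock seconds per run."""
    start = time.perf_counter()
    for _ in range(runs):
        # Output is discarded; a non-zero exit status raises CalledProcessError.
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    # Baseline: a trivial native process.
    native = average_launch_time([sys.executable, "-c", "pass"])
    print(f"native: {native * 1000:.1f} ms per process")
    # Example Wine command line (assumes wine64 is in $PATH):
    # wine = average_launch_time(["wine64", "cmd", "/c", "exit"])
```

Multiplying the difference between the two averages by the number of Wine invocations in a build gives a rough idea of the total cost, keeping in mind that the real tools do more work than an empty command.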

See, we actually already have some Windows cross-compilation of Firefox on Mozilla CI, using mingw. Those were put in place to avoid breaking Tor Browser, because that's how they build for Windows, and because not breaking the Tor Browser is important to us. And those builds are already using Wine for the Effect compiler (FXC).

But the Wine they use doesn't support WOW64. So one of the first things necessary to set up 64-bit Windows cross-builds with clang-cl on Mozilla CI was to get a WOW64-supporting Wine. Following the Wine build instructions was more or less straightforward, but I hit a snag: it wasn't possible to install the freetype development files for both the 32-bit version and the 64-bit version, because the docker images where we build Wine are still based on Debian 9 for reasons, and the freetype development package was not multi-arch ready on Debian 9, while it now is on Debian 10.

Upgrading to Debian 10 is most certainly possible, but that has far more implications than what I was trying to achieve should have. You might ask "why are you building Wine anyways, you could use the Debian package", to which I'd answer "it's a good question, and I actually don't know. I presume the version in Debian 9 was too old (it is older than the one we build)".

Anyways, in the moment, while I happened to be reading Wine's configure script to get things working, I noticed the option --without-x and thought "well, we're not running Wine for any GUI stuff, how about I try that, that certainly would make things easy". YOLO, right?

Not only did it work, but testing the resulting Wine on my machine, compilation times were now down to being only 1 minute slower than a Linux build, rather than 4.5 minutes! That was surely good enough to go ahead and try to get something running on CI.

Tell us about those compilation times already!

I haven't given absolute values so far, mainly because my machine is not representative (I'll have a blog post about that soon enough; you may have heard about it on Twitter, IRC or Slack, but I won't give more details here), and because the end goal here is Mozilla automation, for both the actual release of Firefox (still a long way to go there), and the Try server. Those are what matter more to my fellow developers. Also, I actually haven't built under Windows on my machine for a fair comparison.

So here it comes:

[Figure: build times on CI]

Let's unwrap a little:

  • The yellowish and magenta points are native Windows "opt" builds, on two different kinds of AWS instances.
  • The other points are Cross-Compilations with the same "opt" configuration on three different kinds of AWS instances, one of which is the same as one used for Windows, and another one having better I/O than all the others (the cyan circles).
  • We use a tool to share a compilation cache between builds on automation (sccache), which explains the very noisy nature of the build times, because they depend on the amount of source code changes and of the cache misses they induce.
  • The Cross-Compiled builds were turned on around the 27th of February and started about as fast as the native Windows builds were at the beginning of the graph, but they had just seen a regression.
  • The regression was due to a recent change that made the clang plugin change in every build, which led to large numbers of cache misses.
  • After fixing the regression, the build times came back to their previous level on the native jobs.
  • Sccache handled clang-cl arguments in a way that broke cross-compilation, so when we turned on the cross-compiled jobs on automation, they actually had the cache turned off!
  • Let me state this explicitly because that wasn't expected at all: the cross-compiled jobs WITHOUT a cache were as fast as native jobs WITH a cache!
  • A day later, after fixing sccache, we turned it on for the cross-compiled jobs, and build times dropped.
  • The week-end passed, and with more realistic work loads where actual changes to compiled code happen and invalidate parts of the cache, build times get more noisy but stay well under what they are on native Windows.
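To see why a clang plugin that changes in every build is so damaging, it helps to think of a compilation cache as a lookup keyed on a hash of everything that can influence the output. The sketch below is a simplified model, not sccache's actual key derivation: the function name and the exact set of inputs are assumptions for illustration.

```python
import hashlib


def cache_key(compiler_id, plugin_bytes, arguments, preprocessed_source):
    """Derive a cache key from the inputs of one compilation.

    Simplified model: any change to any input yields a different key,
    which means a cache miss.
    """
    digest = hashlib.sha256()
    for part in (
        compiler_id.encode(),
        plugin_bytes,
        "\0".join(arguments).encode(),
        preprocessed_source,
    ):
        # Prefix each part with its length so boundaries between parts are unambiguous.
        digest.update(len(part).to_bytes(8, "little"))
        digest.update(part)
    return digest.hexdigest()


if __name__ == "__main__":
    args = ["-c", "foo.cpp", "-O2"]
    source = b"int main() { return 0; }\n"
    same_plugin = cache_key("clang-cl 9.0.1", b"plugin build A", args, source)
    rebuilt_plugin = cache_key("clang-cl 9.0.1", b"plugin build B", args, source)
    # The source and arguments are identical, yet the keys differ.
    print(same_plugin != rebuilt_plugin)
```

In this model, a plugin that is rebuilt differently for every push invalidates the key of every file compiled with it, even when no source file changed, which matches the large number of cache misses described above.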

But the above only captures build times. On automation, a job actually does more than build. It also needs to get the source code, and install the tools needed to build. The latter is unfortunately not tracked at the moment, but the former is:

[Figure: clone times on CI]

Now, for some explanation of the above graph:

  • The colors don't match the previous graph. Sorry about that.
  • The colors vary by AWS instance type, and there is no actual distinction between Windows and Linux, so the instance type that is shared between them has values for both, which explains why it now looks bimodal.
  • It can be seen that the ones with better I/O (in red) are much faster to get the source code, but also that for the shared instance type, Linux is noticeably faster.

It would be fair to say that independently of Windows vs. Linux, way too much time is spent getting the source code, and there's other ongoing work to make things better.

TL;DR

Overall, the fast end of native Windows builds on Mozilla CI, including Try server, is currently around 45 minutes. That is the time taken by the entire job, and the minimum time between a developer pushing and Windows tests starting to run.

With Cross-Compilation, the fast end is, as of writing, 13 minutes, and can improve further.

As of writing, no actual Windows build job has switched over to Cross-compilation yet. Only an experimental, tier 2, job has been added. But the main Try server jobs developers rely on are going to switch real soon now™ (opt and debug for 32-bit, 64-bit and aarch64). Running all the test suites on Try against them yields successful results (modulo the usual known intermittent failures).

Actually shipping Cross-compiled builds will take longer. We first need to understand the extent of the differences with the native builds and be confident that no subtle breakage happens. Also, PGO and LTO haven't been tested so far. Everything will come in time.

What about Windows Subsystem for Linux (WSL)?

The idea of allowing developers on Windows to build Firefox from WSL has been floating around for a while. The work to stand up Cross-compiled builds on automation has brought us the closest ever to actually being able to do it! If you're interested in getting it over the finish line, please come talk to me in #build:mozilla.org on Matrix. There shouldn't be much work left and we can figure it out (essentially, all the places using Wine would need to do something else, and... that's it(?)). That should yield faster build times than natively with MozillaBuild.

2020-03-05 15:31:45+0900


9 Responses to “Standing up the Cross-Compilation of Firefox for Windows on Linux”

  1. Anders Says:

    re: “So what’s up with needing clang-cl.exe? One of the tools Firefox needs to build is the MIDL compiler.”

    Could pidl be used?
    “pidl is an IDL compiler written in Perl that aims to be somewhat compatible with the midl compiler”
    https://metacpan.org/pod/distribution/Parse-Pidl/pidl
    https://wiki.samba.org/index.php/Pidl
    https://git.samba.org/?p=samba.git;a=tree;f=pidl;hb=HEAD

  2. glandium Says:

    @Anders: At my nth re-read/edit of this post, I was wondering whether to add a paragraph about that, but the post was already long enough. Not exactly that, though, because I actually didn’t know about pidl, but about widl, which is what the mingw-based builds use. As far as I know, widl is not suitable as a replacement, as in, it doesn’t generate code that would be binary compatible. Maybe pidl would, or maybe widl could get there. Both are definitely worth investigating.

  3. @PrivacyMatters Says:

    Hi, sooooo, interesting here, I must be the user test-case involuntarily included in the “experimental, tier 2, job”.

    Here is a copy of my build config, although I did attempt to download the stable release version (well, you know how that goes glandium!):

    Build platform
    target
    x86_64-pc-mingw32
    Build tools
    Compiler Version Compiler flags
    z:/task_1581954009/fetches/clang/bin/clang-cl.exe -Xclang -std=gnu99 9.0.1 -fcrash-diagnostics-dir=z:/task_1581954009/public/build -nologo -D_HAS_EXCEPTIONS=0 -W3 -Gy -Zc:inline -Gw -Wno-unknown-pragmas -Wno-ignored-pragmas -Wno-deprecated-declarations -Wno-invalid-noreturn
    z:/task_1581954009/fetches/clang/bin/clang-cl.exe -Xclang -std=c++17 9.0.1 -Qunused-arguments -Qunused-arguments -fcrash-diagnostics-dir=z:/task_1581954009/public/build -TP -nologo -Zc:sizedDealloc- -D_HAS_EXCEPTIONS=0 -W3 -Gy -Zc:inline -Gw -Wno-inline-new-delete -Wno-invalid-offsetof -Wno-microsoft-enum-value -Wno-microsoft-include -Wno-unknown-pragmas -Wno-ignored-pragmas -Wno-deprecated-declarations -Wno-invalid-noreturn -Wno-inconsistent-missing-override -Wno-implicit-exception-spec-mismatch -Wno-microsoft-exception-spec -Wno-unused-local-typedef -Wno-ignored-attributes -Wno-used-but-marked-unused -D_SILENCE_TR1_NAMESPACE_DEPRECATION_WARNING -GR- -Z7 -O2 -Oy
    Z:/task_1581954009/fetches/rustc/bin/rustc.exe 1.39.0
    Configure options

    MOZ_AUTOMATION=1 'MOZILLABUILD=C:mozilla-build' --target=x86_64-pc-mingw32 MOZILLA_OFFICIAL=1 --enable-update-channel=release 'MOZBUILD_STATE_PATH=Z:task_1581954009.mozbuild' VC_PATH=z:/task_1581954009/build/src/vs2017_15.8.4/VC CC=clang-cl CXX=clang-cl WINDOWSSDKDIR=z:/task_1581954009/build/src/vs2017_15.8.4/SDK 'DIA_SDK_PATH=z:/task_1581954009/build/src/vs2017_15.8.4/DIA SDK' LINKER=lld-link MAKECAB=z:/task_1581954009/build/src/makecab.exe NASM=Z:/task_1581954009/fetches/nasm/nasm.exe ENABLE_CLANG_PLUGIN=1 RUSTC=Z:/task_1581954009/fetches/rustc/bin/rustc CARGO=Z:/task_1581954009/fetches/rustc/bin/cargo RUSTDOC=Z:/task_1581954009/fetches/rustc/bin/rustdoc CBINDGEN=Z:/task_1581954009/fetches/cbindgen/cbindgen RUSTFMT=Z:/task_1581954009/fetches/rustc/bin/rustfmt --enable-profile-use=cross '--with-pgo-profile-path=Z:task_1581954009/fetches/merged.profdata' '--with-pgo-jarlog=Z:task_1581954009/fetches/en-US.log' MOZ_LTO=cross --enable-js-shell --enable-rust-simd NODEJS=Z:/task_1581954009/fetches/node/node.exe '--with-mozilla-api-keyfile=Z:task_1581954009/mozilla-desktop-geoloc-api.key' '--with-google-location-service-api-keyfile=Z:task_1581954009/gls-gapi.data' '--with-google-safebrowsing-api-keyfile=Z:task_1581954009/sb-gapi.data' MAKE=z:/task_1581954009/build/src/mozmake.EXE --enable-crashreporter --enable-official-branding

    If you would like to actually work “ABOVE GROUND” on this project, if there is anything I can do , which wouldn’t be much considering you are already plugged in and cross-compiling my browser (and who knows what else, a hidden Home-Net???) Anyway, it would be interesting to hear something back. Thanks

  4. glandium Says:

    @PrivacyMatters: You are not a test subject. In fact, what you copy/pasted says the opposite (all the paths being Windows paths, etc.). The target matches because that’s how Windows builds are called, and when you build from Linux, you have to tell the build system you want a Windows build.

  5. Emilio Says:

    I tried it, and it works!

    It took a bit of wrangling and a few chmod +x’s around, but I managed to run it on my windows laptop, and once set up the build on my desktop is much faster!

    I had to send a patch to wine, and one to liblowercase because I was too lazy to repartition my disk. I hit some snafus when packaging, but they were easy to work around.

  6. Emilio Says:

    Fwiw someone in wine-devel mentioned a possible solution for slowness:

    For me, for running lots of short-lived wine processes, the single biggest speedup comes from running a persistent wine server. Without any wine processes running, run “wineserver -p”. That wineserver instance will then stay around (until terminated with “wineserver -k”), greatly reducing the startup time for each command.

  7. Emilio Says:

    Ah, also: https://github.com/wine-staging/wine-staging/tree/master/patches/gdi32-Lazy_Font_Initialization

  8. glandium Says:

    @Emilio: I tried wineserver back then, and it didn’t make a difference. I guess the lazy font initialization thing doesn’t matter when building without X and without freetype.

  9. ddiss Says:

    Interesting article, thanks!
    Regarding midl -> pidl migration: pidl is showing its age a little, but we’re still using it in Samba for generating wire-compatible DCE/RPC marshalling / unmarshalling code (among other uses). It’d be great to hear about some of the specific midl use cases that Firefox has, so that we can see whether they can be supported with pidl.
