No wonders with PGO on Android
I got Profile Guided Optimization (a.k.a. Feedback Directed Optimization) to work for Android builds, using GCC 4.6.1 and Gold 2.21.53.
Getting such a build is not difficult, just a bit time-consuming.
- Apply the patches from bug 632954
- Get an instrumented build with the following command:
$ make -f client.mk MOZ_PROFILE_GENERATE=1 MOZ_PROFILE_BASE=/sdcard/mozilla-pgo
- Create a Fennec Android package:
$ make -C $objdir package
If you get an elfhack error during this phase, make sure to update your tree; the corresponding elfhack bug has been fixed.
- Install the package on your device:
$ adb install -r $objdir/dist/fennec-8.0a1.en-US.android-arm.apk
- Open Fennec on your device, and do some things in your browser, so that execution data is collected. For my last build, I installed the Zippity Test Harness add-on, and ran the V8, Sunspider and PageLoad tests.
- Collect the execution data:
$ adb pull /sdcard/mozilla-pgo /
- Clean-up the build tree:
$ make -f client.mk clean
- Build using the execution data:
$ make -f client.mk MOZ_PROFILE_USE=1
- Create the final Fennec Android package, install and profit:
$ make -C $objdir package
$ adb install -r $objdir/dist/fennec-8.0a1.en-US.android-arm.apk
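The steps above can be gathered into a small shell sketch. The names `DRY_RUN`, `run`, `pgo_instrument` and `pgo_optimize` are made up for illustration and are not part of the Mozilla build system; the default `objdir` value is likewise an assumption. The sketch only prints the commands unless `DRY_RUN` is set to 0, and it does not apply the patches from bug 632954 for you.

```shell
# Illustrative wrappers around the commands listed above.
: "${DRY_RUN:=1}"                        # 1 = only print commands, 0 = run them
: "${objdir:=obj-fennec}"                # assumption: adjust to your object directory
: "${PROFILE_BASE:=/sdcard/mozilla-pgo}" # where the device stores execution data

# Print a command, and execute it only when DRY_RUN is not 1.
run() {
    printf '$ %s\n' "$*"
    if [ "$DRY_RUN" != 1 ]; then
        "$@"
    fi
}

# Phase 1: instrumented build, package, install.
pgo_instrument() {
    run make -f client.mk MOZ_PROFILE_GENERATE=1 MOZ_PROFILE_BASE="$PROFILE_BASE" &&
    run make -C "$objdir" package &&
    run adb install -r "$objdir/dist/fennec-8.0a1.en-US.android-arm.apk"
}

# Phase 2 (after exercising Fennec on the device): pull the data,
# clean, rebuild with the profile, package, install.
pgo_optimize() {
    run adb pull "$PROFILE_BASE" / &&
    run make -f client.mk clean &&
    run make -f client.mk MOZ_PROFILE_USE=1 &&
    run make -C "$objdir" package &&
    run adb install -r "$objdir/dist/fennec-8.0a1.en-US.android-arm.apk"
}
```

Source the file from the top of the tree, call `pgo_instrument`, run your training workload in Fennec, then call `pgo_optimize`. The two phases are kept separate because the training step in between is manual.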
As the title indicates, though, this actually leads to some disappointment. On my Nexus S, the resulting build is slightly slower on Sunspider than the corresponding nightly. It is however much faster on V8 (down to around 1200 from around 1800), but... it is just as fast as a non-PGO/FDO build with GCC 4.6. Even sadder, the non-PGO/FDO build with GCC 4.6 is faster on Sunspider than the PGO/FDO build, and on par with the GCC 4.4-built nightly.
So, my experiments suggest that switching to GCC 4.6 would give us some nice speed-ups, but enabling PGO/FDO wouldn't add to that.
If you want to test and compare my builds on different devices, please go ahead, with the following builds:
- Yesterday's nightly build, built with GCC 4.4 (13,522,584 bytes)
- Build of the same commit, with GCC 4.6 (12,984,560 bytes)
- Build of the same commit, with GCC 4.6 and PGO/FDO enabled (13,992,139 bytes)
The first will install as "Nightly", while the other two will install as "Fennec".
The sizes are also interesting: while the PGO build is bigger than the Nightly build, the plain GCC 4.6 build is smaller.
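To put numbers on the size differences, here is a quick sketch that computes the relative change from the byte counts listed above. `rel_size` is a hypothetical helper written for this example, and it assumes an `awk` that supports the `+` printf flag.

```shell
# Package sizes in bytes, as listed above.
nightly=13522584     # GCC 4.4 nightly
gcc46=12984560       # GCC 4.6
gcc46_pgo=13992139   # GCC 4.6 with PGO/FDO

# Percent change of size $1 relative to baseline $2.
rel_size() {
    awk -v new="$1" -v base="$2" \
        'BEGIN { printf "%+.2f%%\n", (new - base) * 100 / base }'
}

rel_size "$gcc46" "$nightly"       # prints -3.98%
rel_size "$gcc46_pgo" "$nightly"   # prints +3.47%
rel_size "$gcc46_pgo" "$gcc46"     # prints +7.76%
```

So the plain GCC 4.6 build is about 4% smaller than the nightly, while enabling PGO/FDO costs roughly 7.8% on top of the plain GCC 4.6 build.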
2011-08-04 14:50:50+0900
2011-08-04 16:52:44+0900
> It is however much faster on V8 (down to around 1200 from around 1800)
Maybe I’m misunderstanding but isn’t that also worse, not better, since in the V8 benchmark “higher scores means better performance”?
2011-08-04 17:13:05+0900
I never actually looked at PDO on ARM, but the slowdowns should not really happen. Would it be possible to analyse this a bit further (i.e. figure out what slows down) so we could fix it for 4.7?
2011-08-04 17:13:52+0900
Also do you get any warnings on profile mismatches? Perhaps something is wrong to the degree that the relevant part of profile gets misapplied.
2011-08-04 17:40:37+0900
Torbjörn Andersson: The scores reported by the Zippity Test Harness for V8 are milliseconds.
Jan Hubicka: I posted on the gcc list, I think it would make more sense to discuss there.
2011-08-04 19:36:00+0900
Weren’t we building Fennec with -Os before? In which case PGO should start building some parts with -O3? Surprising that that gives no speedup…
2011-08-04 20:16:54+0900
It depends on the rest of the flags. -fprofile-use -Os still defaults to optimizing for size everywhere, just using profile info to guide decisions where size is not traded for speed (of which there are not that many). So perhaps that would explain the lack of speedup.
2011-08-05 12:52:16+0900
> The scores reported by the Zippity Test Harness for V8 are milliseconds.
Ah, I didn’t realize that’s what it was. Thanks for the clarification.
2011-08-08 00:38:14+0900
I’m curious if, for a control, you ran *just* SunSpider for training and took a look at what the speedup was on SunSpider in that case. Maybe the process is too painful to run a bunch of control experiments like that?
2011-08-08 09:00:23+0900
Chris: my first attempt was actually that: training with sunspider only. Same results.
2011-10-31 02:01:18+0900
http://gcc.gnu.org/
GCC 4.6.2 Final released.