Archive for the 'p.d.o' Category

Using git to interact with mercurial repositories

I was planning to publish this later, but after talking about this project to a few people yesterday and seeing the amount of excitement in response, I took some time this morning to tie a few loose ends and publish this now. Mozillians, here comes the git revolution.

Let me start with a bit of history. I am an early git user. I’ve been using git almost since its first release. I like it. A lot. I’ve contributed dozens of patches to git.

I started using mercurial when I got commit access to Mozilla repositories, much later. I don’t enjoy using mercurial much.

There are many tools to make git talk to mercurial. Most are called git-remote-hg because they use the git remote helpers infrastructure. All of them rely on having a local mercurial clone. When dealing with repositories like mozilla-central, it means storing more than 1.5GB of data just to talk to mercurial, on top of the git database.

So a few years ago, I started to toy with the idea to make git talk to mercurial directly. I got as far as being able to do a full clone of mozilla-central back then, in a reasonable amount of time. But I left it at that because I needed to figure out how to efficiently store all the metadata required to handle incremental updates/pulling, and didn’t have enough incentive to go forward: working with mercurial was not painful enough.

Fast forward to the beginning of this year. The mozilla-central repository is now much bigger than it used to be, and mercurial handles it much less smoothly than it used to when Mozilla switched to using it. That was enough to get me started again, but not enough to dedicate enough time to it.

Fast forward to a few weeks ago. Gregory Szorc poked dev-platform to know what kind of workflows people were using with git to work on Mozilla code. And I was really not satisfied with the answers. First, I was wondering why no-one was mentioning the existing tools. So I picked one, and tried.

Cloning mozilla-central took 12 hours and left me with a ~10GB .git directory. Running git gc –agressive for another 10 hours (my settings may have made gc take more time than it would have with the default configuration) brought it down to about 2.6GB, only 700MB of which is actual git data, the remainder being the associated mercurial repository. And as far as I understand it, the tool doesn’t really support our use of mercurial repositories, especially try (but I could be wrong, I didn’t really look too much).

That was the straw that broke the camel’s back. So after a couple weeks hacking, I now have something that can clone mozilla-central within 30 minutes on my machine (network transfer excluded). The resulting .git directory is around 1.5GB with the default git config, without running git gc. If you tweak the compression level in your git config, cloning takes a bit longer, and the repo takes about 1.1GB, And you can subsequently pull from mozilla-central. As well as pull from other branches without having to clone them from scratch. Push support is not there yet because it’s an early prototype, but I should be able to get that to work in the next couple weeks.

At this point, you may be wondering how you can use that thing. Here it comes:

$ git clone https://github.com/glandium/git-remote-hg
$ export PATH=$PATH:$(pwd)/git-remote-hg

Note it requires having the mercurial code available to python, because git-remote-hg uses the mercurial code to talk the mercurial wire protocol. Usually, having mercurial installed is enough.

You can now clone a mercurial repository:

$ git clone hg::http://hg.mozilla.org/mozilla-central

If, like me, you had a local mercurial clone, you can do the following instead:

$ git clone hg::/path/to/mozilla-central-clone
$ git remote set-url origin hg::http://hg.mozilla.org/mozilla-central

You can then use git fetch/pull like with git repositories:

$ git pull

Now, you can add other repositories:

$ git remote add inbound hg::http://hg.mozilla.org/integration/mozilla-inbound
$ git remote update inbound

There are a few caveats, like the fact that it currently creates new remote branches essentially any time you pull something. But it shouldn’t disrupt anything.

It should be noted that while the contents are identical to the gecko-dev git repositories (the git tree object sha1s are identical, I checked), the commit SHA1s are different. For two reasons: gecko-dev also contains the CVS history, and hg-git, which is used to fill it adds some mercurial metadata to commit messages that git-remote-hg doesn’t add.

It is, however, possible to graft the CVS history from gecko-dev to a clone created with git-remote-hg. Assuming you have a remote for gecko-dev and fetched from it, you can do the following:

$ echo eabda6aae98d14c71d7e7b95a66896868ff9500b 3ec464b55782fb94dbbb9b5784aac141f3e3ac01 >> .git/info/grafts

Last note: please read the README file when you update your git clone of the git-remote-hg repository. As the prototype evolves, there might be things that you need to do to your existing clones, and it will be written there.

2014-12-05 20:45:10+0200

cinnabar, p.m.o | 4 Comments »

Building a Firefox Debian package

It’s actually been possible for some time, but I made that simpler recently, and I figured I should mention it.

  • Grab the iceweasel source
    $ apt-get source iceweasel
  • Install its build dependencies
    $ apt-get build-dep iceweasel
  • Build it
    $ cd iceweasel-*
    $ PRODUCT_NAME=firefox dpkg-buildpackage -rfakeroot

2014-11-11 11:26:38+0200

firefox | 2 Comments »

No PIE for you!

You are a software vendor. You distribute software on multiple operating systems. Let’s say your software is a mildly popular internet browser. Let’s say its logo represents an animal and a globe.

Now, because you care about the security of your users, let’s say you would like the entire address space of your application to be randomized, including the main executable portion of it. That would be neat, wouldn’t it? And there’s even a feature for that: Position independent executables.

You get that working on (almost) all the operating systems you distribute software on. Great.

Then a Gnome user (or an Ubuntu user, for that matter) comes, and tells you they downloaded your software tarball, unpacked it, and tried opening your software, but all they get is a dialog telling them:

Could not display “application-name”
There is no application installed for “shared library” files

Because, you see, a Position independent executable, in ELF terms, is actually a (position independent) shared library that happens to be executable, instead of being an executable that happens to be position independent.

And nautilus (the file manager in Gnome and Ubuntu’s Unity) usefully knows to distinguish between executables and shared libraries. And will happily refuse to execute shared libraries, even when they have the file-system-level executable bit set.

You’d think you can get around this by using a .desktop file, but the Exec field in those files requires a full path. (No, ./ doesn’t work unless the executable is in the nautilus process current working directory, as in, the path nautilus was run from)

Dear lazyweb, please prove me wrong and tell me there’s a way around this.

2014-10-03 18:00:03+0200

p.d.o, p.m.o | 8 Comments »

So, hum, bash…

So, I guess you heard about the latest bash hole.

What baffles me is that the following still is allowed:

env echo='() { xterm;}' bash -c "echo this is a test"

Interesting replacements for “echo“, “xterm” and “echo this is a test” are left as an exercise to the reader.

Update: Another thing that bugs me: Why is this feature even enabled in posix mode? (the mode you get from bash --posix, or, more importantly, when running bash as sh) After all, export -f is a bashism.

2014-09-25 09:43:14+0200

p.d.o, p.m.o | 8 Comments »

Firefox and Gtk+ 3

Folks from Collabora and Red Hat have been working on making Firefox on Gtk+ 3 a thing. See Emilio’s blog post for some recent update. But getting Firefox to build and run locally is unfortunately not the whole story.

I’ve been working on getting Gtk+ 3 Firefox builds going on Mozilla build infrastructure, and I’m proud to announce today that those builds are now going through Mozilla continuous integration on a project branch: Elm, and receive the same automated testing as mozilla-central.

And when I said getting Firefox to build and run was unfortunately not the whole story, I meant it: if you click on the Elm link above, you’ll notice that there’s a lot of orange, when it should be all green.

So, yes, Firefox on Gtk+ 3 is a thing, and it now has continuous integration. But there’s still a whole bunch of things to fix. So if you’re interested in making those builds work better, you can hop in, there are many things you can do:

  • check the Gtk+ 3 tracking bug and its dependencies for a list of known issues or improvements to be made.
  • download one of the builds from the elm branch, test it, and file bugs if you find some that aren’t currently tracked. There aren’t nightlies, but you can get the latest builds for 32-bits and 64-bits systems.
  • and if you have level 1 commit access, you can test patches on the Try server, provided you pull from the elm branch or apply this patch on top of the tree you push there.

2014-07-02 08:24:25+0200

p.d.o, p.m.o | 4 Comments »

怒り、失望、ストレス発散

I started learning japanese calligraphy a few months ago, with no prior experience with a brush and ink. It is an interesting endeavour. For various reasons, I had to skip class for a few weeks, but after the past ten days, I needed some stress relief on paper.

怒り
失望

スッキリしました。

2014-04-05 11:21:58+0200

me, p.d.o, p.m.o | 1 Comment »

Don’t trust python’s os.execv

Python is nice and all, but its low-level functions have real disruptive discrepancies between platforms.

Case at point:

import os
os.execvp("sh", ["sh", "-c", "exit 1"])

As a UNIXy person, I’d expect running the above script to return an error code of 1. And I would be perfectly right… on UNIX systems.

On Windows, it returns 0.

You’d think such a difference in behavior would be documented? It’s not.

Thank you python.

2013-11-23 01:24:26+0200

p.d.o, p.m.o | 8 Comments »

日本へ引っ越し

Today, May the 30th, was my last day as a Mozilla employee. In a couple weeks, my wife, my cat and I will be on board of a flight heading about ten thousand kilometers east, and most of our stuff will be in some container on a boat. We’re moving to Japan. As adventurous as this may sound, I’m not venturing into unknown territory. My wife is Japanese, and I’ve lived there for close to 15 months. A long time ago, arguably.

I’m not actually leaving Mozilla. I’ll be back as a contractor, hopefully around the 25th of June. So as far as my fellow coworkers are concerned, I’ll be going on a long-ish vacation and changing timezone (but I’ll probably be around in the meanwhile on irc or bugmail, with high latency).

Jump-starting in a different country is not something really easy to pull off, and working for Mozilla as a remotee has been a key element in being able to do so. Although I’ve made it clear when I joined Mozilla that this would eventually happen, I’m thankful I can now actually do it.

2013-05-30 19:52:08+0200

me, p.d.o, p.m.o | 5 Comments »

signal() doubly considered harmful

When you want to set signal handlers on UNIX systems, the typical choice is to use signal (specified in C89, C99 and POSIX.1-2001) or sigaction (specified in POSIX.1-2001 and System V r4).

Quoting the signal manual page:

The only portable use of signal() is to set a signal’s disposition to SIG_DFL or SIG_IGN. The semantics when using signal() to establish a signal handler vary across systems (and POSIX.1 explicitly permits this variation); do not use it for this purpose.

POSIX.1 solved the portability mess by specifying sigaction(2), which provides explicit control of the semantics when a signal handler is invoked; use that interface instead of signal().

Then it goes on about the UNIX vs BSD semantics, and how they affect signal delivery, which essentially is the main reason why one would want to stop using signal and use sigaction instead, with specifically chosen flags.

But this is not really what I wanted to talk about here.

One of the uses of signal or sigaction is to temporarily set a signal handler and restore the old signal handler once the job is done. Notwithstanding the fact that it’s a pretty horrible thing to do in a multi-threaded program, it’s also a horrible thing to do at all with signal if sigaction is used.

The core of the problem is the following: the information you get from signal() about the old signal handler is missing all the important pieces about it if it was originally set with sigaction(), namely, flags, masks and restorer.

So if you do use signal() to temporarily set a signal handler and then restore the previous signal handler, you risk resetting flags, masks and restorer. The first awful thing this means is the previous signal handler might be expecting three arguments, only one of which will be valid when it’s invoked. Unexpected things can also happen with the lack of expected flags or masks. This is why you’ll see horrible workarounds like this or that.

In short, if you do use signal() to temporarily set a signal handler and then restore the previous signal handler, you’re doing it wrong. And if you do that in a system library or driver, thank you for screwing things up. I’m looking at you libsc-a3xx.so.

2013-05-27 17:15:13+0200

p.d.o, p.m.o | 2 Comments »

Google Reader death, or how the cloud model can fail you

If you’re a Google Reader user, you probably read in one of your subscriptions that Google is pulling the plug on Google Reader. It is yet another demonstration of why putting data in the cloud isn’t so much of a nice idea: the service you rely on may well disappear some day, with all the data it contains.

Sure Google, in its extreme goodness, allows you to “take out” the Google Reader data. Or does it?
These are what you’ll get from Google Takeout for Reader:

  • followers.json, following.json: both files contain similar data, that I suspect correspond to Buzz subscriptions (yet another dead service). Each friends item contains some information about your “friend”, and a stream identifier for their activity (I guess), as well as a few websites urls. For instance Tim Bray’s stream is “user/05198174665841271019/state/com.google/broadcast“. What the hell do I do with that? Fortunately, he has websites, but not all my “friends” have. Thankfully, I haven’t really been using this feature, so there’s almost nothing in these files.
  • liked.json, starred.json, shared.json, shared-by-followers.json: all have the same structure, and contain items you liked, starred, shared, or that the people you follow shared (yeah, that file is badly named). Each item contains an url (or so I hope), and the corresponding content (yay). shared-by-followers.json however doesn’t contain more than the items the people you follow actively shared: it doesn’t contain their feeds (and I’m pretty sure I read more from Tim Bray than the two links he shared)
  • subscriptions.xml: Essentially, a list of RSS feed urls, with no content ; nothing from Tim Bray here, but now that I think about it, I think I was only following his Buzz feed, so that went away with Buzz without me noticing.

Interestingly, while looking into shared-by-followers.json, I found urls that would correspond to friend streams. For instance, Tim Bray’s is http://www.google.com/reader/public/atom/user/05198174665841271019/state/com.google/broadcast. But it’s useless: all it displays is “permission denied”.

As for subscriptions, one of the strengths of Google Reader is that it allowed to search though past items, which means a big part of the interesting data is the archived items. But that’s not part of the “take out”. Sure, you have the feed urls, but most RSS feeds contain a limited amount of items, not the entire history of items for the given feed. So, history is more or less lost. Except if I star, like or share all items in all my subscriptions and “take out” again.

So much goodness.

It could have been worse, though.

2013-03-14 08:35:45+0200

p.d.o, p.m.o | 14 Comments »