Archive for the 'p.d.o' Category

One step closer to git push to mercurial

In case you missed it, I'm working on a new tool to use mercurial remotes in git. Since my previous post, I landed several fixes making clone and pull more reliable:

Of 247316 unique changesets in the various mozilla-* repositories, only two are now "corrupted" (and both in fact come from the same patch, one of the changesets being a backport to aurora of the other), because their mercurial dates have a timezone with a seconds component.
Of 23542 unique changesets in the canonical mercurial repository, only three are "corrupted" because their raw mercurial data contains, for an unknown reason, a whitespace after the timezone.

By corrupted, here, I mean that the round-trip hg->git->hg doesn't reproduce their original sha1. They will be fixed eventually, but I haven't decided how yet, because they're really edge cases. They're old enough that they don't really matter for push anyway.
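To illustrate why a timezone with seconds can break the round-trip, here is a minimal sketch. It relies on two facts about the formats: mercurial records the UTC offset in seconds (positive meaning west of UTC), while git records it as `+HHMM`. The function names and the `-32459` offset are made up for the example, and this is presumably, not certainly, the exact mechanism behind the two changesets mentioned above.

```python
def hg_offset_to_git_tz(offset_seconds):
    """Convert a mercurial UTC offset (seconds, positive = west of UTC)
    to a git-style "+HHMM" string.

    Returns (tz_string, leftover_seconds); the leftover seconds are what
    the git format has no room for.
    """
    sign = "-" if offset_seconds > 0 else "+"
    minutes, leftover = divmod(abs(offset_seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return "%s%02d%02d" % (sign, hours, minutes), leftover


def git_tz_to_hg_offset(tz):
    """Convert a git-style "+HHMM" string back to a mercurial offset."""
    sign = 1 if tz[0] == "-" else -1
    return sign * (int(tz[1:3]) * 3600 + int(tz[3:5]) * 60)


# A whole-minute offset survives the round-trip.
tz, leftover = hg_offset_to_git_tz(-32400)
assert (tz, leftover) == ("+0900", 0)
assert git_tz_to_hg_offset(tz) == -32400

# A hypothetical offset with seconds does not: 59 seconds are lost,
# so the regenerated changeset text (and its sha1) would differ.
tz, leftover = hg_offset_to_git_tz(-32459)
assert (tz, leftover) == ("+0900", 59)
assert git_tz_to_hg_offset(tz) != -32459
```

Any fix has to stash the leftover seconds somewhere on the git side so they can be restored when regenerating the mercurial changeset.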

Pushing to mercurial, however, is still not there, but it's getting closer. It involves several operations:

Negotiating with the mercurial server what it doesn't have that we do.
Creating mercurial changesets, manifests and files for local git commits that were not imported from mercurial.
Creating a bundle of the mercurial changesets, manifests and files that we have that the server doesn't.
Pushing that bundle to the server.
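As a rough sketch of what the first and third steps boil down to: once the negotiation has established which heads both sides have in common, the bundle needs to contain everything reachable from the local heads that is not reachable from those common heads. The toy DAG and the function names below are invented for the example; the real negotiation goes through the mercurial wire protocol and is more involved.

```python
def ancestors(dag, heads):
    """Return every node reachable from `heads`, heads included.

    `dag` maps each node to the list of its parents.
    """
    seen = set()
    stack = list(heads)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(dag[node])
    return seen


def missing_on_server(dag, local_heads, common_heads):
    """Nodes we have that the server doesn't, given the common heads."""
    return ancestors(dag, local_heads) - ancestors(dag, common_heads)


# Toy history: a <- b <- c, b <- d, and e merges c and d.
dag = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"], "e": ["c", "d"]}

# If the server already has everything up to c, only d and e need bundling.
to_bundle = missing_on_server(dag, local_heads=["e"], common_heads=["c"])
assert sorted(to_bundle) == ["d", "e"]
```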

The first step is mostly covered by the pull code, which does a similar negotiation. I now have the third step covered (although I cheated around the "corruptions" mentioned above):

$ git clone hg::http://selenic.com/hg
Cloning into 'hg'...
(...)
Checking connectivity... done.
$ cd hg
$ git hgbundle > ../hg.hg
$ mkdir ../hg2
$ cd ../hg2
$ hg init
$ hg unbundle ../hg.hg
adding changesets
adding manifests
adding file changes
added 23542 changesets with 44305 changes to 2272 files
(run 'hg update' to get a working copy)
$ hg verify
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
2272 files, 23542 changesets, 44305 total revisions

Note: that hgbundle command won't actually exist. It's just an intermediate step allowing me to work incrementally.

In case you wonder what happens when the bundle contains bad data, mercurial fortunately rejects it:

$ cd ../hg
$ git hgbundle-corrupt > ../hg.hg
$ mkdir ../hg3
$ cd ../hg3
$ hg unbundle ../hg.hg
adding changesets
transaction abort!
rollback completed
abort: integrity check failed on 00changelog.i:3180!
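For context, the integrity check works because a mercurial node id is the sha1 of the two parent ids (sorted) followed by the revision text, so any change to the data changes the id. The sketch below reimplements that computation; the helper names and sample data are made up, and it is not mercurial's own code.

```python
import hashlib

NULL_ID = b"\0" * 20  # parent id used when a parent is absent


def revlog_node(text, p1=NULL_ID, p2=NULL_ID):
    """Compute a node id: sha1 of the sorted parent ids, then the text."""
    first, second = sorted([p1, p2])
    h = hashlib.sha1(first)
    h.update(second)
    h.update(text)
    return h.digest()


def check_integrity(expected_node, text, p1=NULL_ID, p2=NULL_ID):
    """Return True if the data hashes to the node id it claims to have."""
    return revlog_node(text, p1, p2) == expected_node


# Hypothetical revision text; any byte string works for the illustration.
text = b"some changeset data"
node = revlog_node(text)

assert check_integrity(node, text)
# A single stray trailing byte is enough to fail the check.
assert not check_integrity(node, text + b" ")
```

This is also why details as small as a trailing whitespace in the raw data matter for the round-trip.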

2014-12-16 13:54:15+0900

cinnabar, p.m.o | No Comments »

Using git to interact with mercurial repositories

I was planning to publish this later, but after talking about this project to a few people yesterday and seeing the amount of excitement in response, I took some time this morning to tie a few loose ends and publish this now. Mozillians, here comes the git revolution.

Let me start with a bit of history. I am an early git user. I've been using git almost since its first release. I like it. A lot. I've contributed dozens of patches to git.

I started using mercurial when I got commit access to Mozilla repositories, much later. I don't enjoy using mercurial much.

There are many tools to make git talk to mercurial. Most are called git-remote-hg because they use the git remote helpers infrastructure. All of them rely on having a local mercurial clone. When dealing with repositories like mozilla-central, it means storing more than 1.5GB of data just to talk to mercurial, on top of the git database.

So a few years ago, I started to toy with the idea of making git talk to mercurial directly. I got as far as being able to do a full clone of mozilla-central back then, in a reasonable amount of time. But I left it at that because I needed to figure out how to efficiently store all the metadata required to handle incremental updates/pulling, and didn't have enough incentive to go forward: working with mercurial was not painful enough.

Fast forward to the beginning of this year. The mozilla-central repository is now much bigger than it used to be, and mercurial handles it much less smoothly than it used to when Mozilla switched to using it. That was enough to get me started again, but not enough to dedicate enough time to it.

Fast forward to a few weeks ago. Gregory Szorc poked dev-platform to find out what kind of workflows people were using with git to work on Mozilla code. And I was really not satisfied with the answers. First, I was wondering why no one was mentioning the existing tools. So I picked one, and tried.

Cloning mozilla-central took 12 hours and left me with a ~10GB .git directory. Running git gc --aggressive for another 10 hours (my settings may have made gc take more time than it would have with the default configuration) brought it down to about 2.6GB, only 700MB of which is actual git data, the remainder being the associated mercurial repository. And as far as I understand it, the tool doesn't really support our use of mercurial repositories, especially try (but I could be wrong, I didn't really look too much).

That was the straw that broke the camel's back. So after a couple weeks hacking, I now have something that can clone mozilla-central within 30 minutes on my machine (network transfer excluded). The resulting .git directory is around 1.5GB with the default git config, without running git gc. If you tweak the compression level in your git config, cloning takes a bit longer, and the repo takes about 1.1GB. And you can subsequently pull from mozilla-central, as well as pull from other branches without having to clone them from scratch. Push support is not there yet because it's an early prototype, but I should be able to get that to work in the next couple weeks.

At this point, you may be wondering how you can use that thing. Here it comes:

$ git clone https://github.com/glandium/git-remote-hg
$ export PATH=$PATH:$(pwd)/git-remote-hg

Note it requires having the mercurial code available to python, because git-remote-hg uses the mercurial code to talk the mercurial wire protocol. Usually, having mercurial installed is enough.

You can now clone a mercurial repository:

$ git clone hg::http://hg.mozilla.org/mozilla-central

If, like me, you had a local mercurial clone, you can do the following instead:

$ git clone hg::/path/to/mozilla-central-clone
$ git remote set-url origin hg::http://hg.mozilla.org/mozilla-central

You can then use git fetch/pull like with git repositories:

$ git pull

Now, you can add other repositories:

$ git remote add inbound hg::http://hg.mozilla.org/integration/mozilla-inbound
$ git remote update inbound

There are a few caveats, like the fact that it currently creates new remote branches essentially any time you pull something. But it shouldn't disrupt anything.

It should be noted that while the contents are identical to the gecko-dev git repositories (the git tree object sha1s are identical, I checked), the commit sha1s are different, for two reasons: gecko-dev also contains the CVS history, and hg-git, which is used to fill it, adds some mercurial metadata to commit messages that git-remote-hg doesn't add.

It is, however, possible to graft the CVS history from gecko-dev to a clone created with git-remote-hg. Assuming you have a remote for gecko-dev and fetched from it, you can do the following:

$ echo eabda6aae98d14c71d7e7b95a66896868ff9500b 3ec464b55782fb94dbbb9b5784aac141f3e3ac01 >> .git/info/grafts

Last note: please read the README file when you update your git clone of the git-remote-hg repository. As the prototype evolves, there might be things that you need to do to your existing clones, and it will be written there.

2014-12-05 20:45:10+0900

cinnabar, p.m.o | 4 Comments »

Building a Firefox Debian package

It's actually been possible for some time, but I made that simpler recently, and I figured I should mention it.

Grab the iceweasel source

$ apt-get source iceweasel

Install its build dependencies

$ apt-get build-dep iceweasel

Build it

$ cd iceweasel-*
$ PRODUCT_NAME=firefox dpkg-buildpackage -rfakeroot

2014-11-11 11:26:38+0900

firefox | 2 Comments »

No PIE for you!

You are a software vendor. You distribute software on multiple operating systems. Let's say your software is a mildly popular internet browser. Let's say its logo represents an animal and a globe.

Now, because you care about the security of your users, let's say you would like the entire address space of your application to be randomized, including the main executable portion of it. That would be neat, wouldn't it? And there's even a feature for that: Position independent executables.

You get that working on (almost) all the operating systems you distribute software on. Great.

Then a Gnome user (or an Ubuntu user, for that matter) comes, and tells you they downloaded your software tarball, unpacked it, and tried opening your software, but all they get is a dialog telling them:

Could not display "application-name"
There is no application installed for "shared library" files

Because, you see, a Position independent executable, in ELF terms, is actually a (position independent) shared library that happens to be executable, instead of being an executable that happens to be position independent.

And nautilus (the file manager in Gnome and Ubuntu's Unity) usefully knows to distinguish between executables and shared libraries. And will happily refuse to execute shared libraries, even when they have the file-system-level executable bit set.
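The distinction lives in the `e_type` field of the ELF header: `ET_EXEC` (2) for a traditional executable, `ET_DYN` (3) for a shared object, which is also what a position independent executable gets. The sketch below reads that field; the function names are made up, and that nautilus bases its decision on this field (through file type detection) is an assumption, not something checked against its source.

```python
import struct

ET_EXEC = 2  # traditional, fixed-address executable
ET_DYN = 3   # shared object, which is also what a PIE looks like


def elf_type(header):
    """Return the e_type field from the first 18 bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] is byte 5: 1 = little endian, 2 = big endian.
    endian = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field right after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type


def describe(path):
    """Say whether a file looks like an executable or a shared object."""
    with open(path, "rb") as f:
        e_type = elf_type(f.read(18))
    names = {ET_EXEC: "executable", ET_DYN: "shared object (or PIE)"}
    return names.get(e_type, "other")
```

Nothing in that field alone tells a PIE apart from a real shared library, which is why a tool that only looks at the file type ends up treating both the same way.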

You'd think you can get around this by using a .desktop file, but the Exec field in those files requires a full path. (No, ./ doesn't work unless the executable is in the nautilus process's current working directory, that is, the directory nautilus was run from.)

Dear lazyweb, please prove me wrong and tell me there's a way around this.

2014-10-03 18:00:03+0900

p.d.o, p.m.o | 8 Comments »

So, hum, bash…

So, I guess you heard about the latest bash hole.

What baffles me is that the following still is allowed:

env echo='() { xterm;}' bash -c "echo this is a test"

Interesting replacements for "echo", "xterm" and "echo this is a test" are left as an exercise to the reader.

Update: Another thing that bugs me: Why is this feature even enabled in posix mode? (the mode you get from bash --posix, or, more importantly, when running bash as sh) After all, export -f is a bashism.

2014-09-25 09:43:14+0900

p.d.o, p.m.o | 8 Comments »

Firefox and Gtk+ 3

Folks from Collabora and Red Hat have been working on making Firefox on Gtk+ 3 a thing. See Emilio's blog post for a recent update. But getting Firefox to build and run locally is unfortunately not the whole story.

I've been working on getting Gtk+ 3 Firefox builds going on Mozilla build infrastructure, and I'm proud to announce today that those builds are now going through Mozilla continuous integration on a project branch: Elm, and receive the same automated testing as mozilla-central.

And when I said getting Firefox to build and run was unfortunately not the whole story, I meant it: if you click on the Elm link above, you'll notice that there's a lot of orange, when it should be all green.

So, yes, Firefox on Gtk+ 3 is a thing, and it now has continuous integration. But there's still a whole bunch of things to fix. So if you're interested in making those builds work better, you can hop in, there are many things you can do:

check the Gtk+ 3 tracking bug and its dependencies for a list of known issues or improvements to be made.
download one of the builds from the elm branch, test it, and file bugs if you find some that aren't currently tracked. There aren't nightlies, but you can get the latest builds for 32-bit and 64-bit systems.
and if you have level 1 commit access, you can test patches on the Try server, provided you pull from the elm branch or apply this patch on top of the tree you push there.

2014-07-02 08:24:25+0900

p.d.o, p.m.o | 4 Comments »

怒り、失望、ストレス発散 (Anger, disappointment, stress relief)

I started learning Japanese calligraphy a few months ago, with no prior experience with a brush and ink. It is an interesting endeavour. For various reasons, I had to skip class for a few weeks, but after the past ten days, I needed some stress relief on paper.

スッキリしました。 (I feel refreshed now.)

2014-04-05 11:21:58+0900

me, p.d.o, p.m.o | 1 Comment »

Don’t trust python’s os.execv

Python is nice and all, but its low-level functions have really disruptive discrepancies between platforms.

Case in point:

import os
os.execvp("sh", ["sh", "-c", "exit 1"])

As a UNIXy person, I'd expect running the above script to return an error code of 1. And I would be perfectly right... on UNIX systems.

On Windows, it returns 0.

You'd think such a difference in behavior would be documented? It's not.

Thank you python.
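A common way around this, when the goal is only to run a command and exit with its status, is to spawn a child, wait for it, and propagate its exit code explicitly. The sketch below uses `subprocess.call`, which returns the child's exit code; the helper names are made up. The likely explanation for the Windows behavior is that the exec functions there start a new process and let the calling one exit right away, but treat that as an assumption.

```python
import subprocess
import sys


def run_like_exec(argv):
    """Run argv as a child process, wait for it, return its exit code."""
    return subprocess.call(argv)


def main(argv):
    """Stand-in for os.execvp(argv[0], argv): exit with the child's code."""
    sys.exit(run_like_exec(argv))


# In a script, the earlier example would become:
#     main(["sh", "-c", "exit 1"])
```

Unlike a real exec, this keeps the Python process alive while the child runs, which is usually an acceptable trade-off for getting the same exit code on every platform.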

2013-11-23 01:24:26+0900

p.d.o, p.m.o | 8 Comments »

日本へ引っ越し (Moving to Japan)

Today, May the 30th, was my last day as a Mozilla employee. In a couple weeks, my wife, my cat and I will be on board a flight heading about ten thousand kilometers east, and most of our stuff will be in some container on a boat. We're moving to Japan. As adventurous as this may sound, I'm not venturing into unknown territory. My wife is Japanese, and I've lived there for close to 15 months. A long time ago, arguably.

I'm not actually leaving Mozilla. I'll be back as a contractor, hopefully around the 25th of June. So as far as my fellow coworkers are concerned, I'll be going on a long-ish vacation and changing timezone (but I'll probably be around in the meantime on irc or bugmail, with high latency).

Jump-starting in a different country is not something really easy to pull off, and working for Mozilla as a remotee has been a key element in being able to do so. Although I made it clear when I joined Mozilla that this would eventually happen, I'm thankful I can now actually do it.

2013-05-30 19:52:08+0900

me, p.d.o, p.m.o | 5 Comments »

signal() doubly considered harmful

When you want to set signal handlers on UNIX systems, the typical choice is to use signal (specified in C89, C99 and POSIX.1-2001) or sigaction (specified in POSIX.1-2001 and System V r4).

Quoting the signal manual page:

The only portable use of signal() is to set a signal's disposition to SIG_DFL or SIG_IGN. The semantics when using signal() to establish a signal handler vary across systems (and POSIX.1 explicitly permits this variation); do not use it for this purpose.

POSIX.1 solved the portability mess by specifying sigaction(2), which provides explicit control of the semantics when a signal handler is invoked; use that interface instead of signal().

Then it goes on about the UNIX vs BSD semantics, and how they affect signal delivery, which essentially is the main reason why one would want to stop using signal and use sigaction instead, with specifically chosen flags.

But this is not really what I wanted to talk about here.

One of the uses of signal or sigaction is to temporarily set a signal handler and restore the old signal handler once the job is done. Notwithstanding the fact that it's a pretty horrible thing to do in a multi-threaded program, it's also a horrible thing to do at all with signal if sigaction is used.

The core of the problem is the following: the information you get from signal() about the old signal handler is missing all the important pieces about it if it was originally set with sigaction(), namely, flags, masks and restorer.

So if you do use signal() to temporarily set a signal handler and then restore the previous signal handler, you risk resetting flags, masks and restorer. The first awful thing this means is the previous signal handler might be expecting three arguments, only one of which will be valid when it's invoked. Unexpected things can also happen with the lack of expected flags or masks. This is why you'll see horrible workarounds like this or that.

In short, if you do use signal() to temporarily set a signal handler and then restore the previous signal handler, you're doing it wrong. And if you do that in a system library or driver, thank you for screwing things up. I'm looking at you, libsc-a3xx.so.

2013-05-27 17:15:13+0900

p.d.o, p.m.o | 2 Comments »