diffing

I always assumed the diff algorithm was fairly standard and the same wherever there is a diff function. It seems it is not, and human readability depends a lot on the tool you use.

Here is what diff -u, git diff and tla diff give:

@@ -42,10 +42,9 @@
 
 include $(DEPTH)/config/autoconf.mk
 
-include $(topsrcdir)/config/rules.mk
+EXTRA_COMPONENTS = nsKillAll.js
 
-libs::
-       $(INSTALL) $(srcdir)/nsKillAll.js $(DIST)/bin/components
+include $(topsrcdir)/config/rules.mk
 
 clean::
        rm -f $(DIST)/bin/components/nsKillAll.js

This is the least human-readable output. In comparison, svn diff, svk diff and bzr diff output:

@@ -42,11 +42,10 @@
 
 include $(DEPTH)/config/autoconf.mk
 
+EXTRA_COMPONENTS = nsKillAll.js
+
 include $(topsrcdir)/config/rules.mk
 
-libs::
-       $(INSTALL) $(srcdir)/nsKillAll.js $(DIST)/bin/components
-
 clean::
        rm -f $(DIST)/bin/components/nsKillAll.js
 

Mercurial outputs:

@@ -42,10 +42,9 @@
 
 include $(DEPTH)/config/autoconf.mk
 
+EXTRA_COMPONENTS = nsKillAll.js
+
 include $(topsrcdir)/config/rules.mk
-
-libs::
-       $(INSTALL) $(srcdir)/nsKillAll.js $(DIST)/bin/components
 
 clean::
        rm -f $(DIST)/bin/components/nsKillAll.js

which is pretty similar.

I got too fed up with tla and baz to try more (and didn't even get as far as committing a file in baz, so there's no diff result for it).

2007-03-16 21:55:44+0900

miscellaneous, p.d.o

3 Responses to “diffing”

  1. Romain Lenglet Says:

    When using GNU diff, I always use a unique set of options:
    diff -Naurd
    I remember it because it sounds like “nerd”. (^_^)
    The -d option should give you a result more similar to the other two?

  2. glandium Says:

    No, the -d option changes nothing.

  3. Jan Hudec Says:

    No, there are basically two different diff algorithms:
    – minimal edit distance: This is what GNU diff, tla (which uses GNU diff) and git (which initially used GNU diff as well) use.
    – patience diff: This is what bzr and obviously also the others use. It matches unique lines first (the include line is unique in your example, so it is matched in that step) and then runs minimal edit distance on the blocks between them. This gives larger, but more sensible output.
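
The "match unique lines first" step described in the last comment can be sketched roughly as below. This is only an illustration of that first step, not the code bzr or any other tool actually ships: the names unique_anchors, OLD and NEW are made up for this example, and the second step (diffing the blocks between the anchors) is left out entirely. OLD and NEW are the Makefile lines from the hunks above, with the indentation written as tabs.

```python
from bisect import bisect_left
from collections import Counter


def unique_anchors(a, b):
    """Return (i, j) index pairs of lines that occur exactly once in a and
    exactly once in b, keeping only the longest set of pairs that appears
    in the same order in both files."""
    count_a, count_b = Counter(a), Counter(b)
    pos_b = {line: j for j, line in enumerate(b) if count_b[line] == 1}
    # Candidate pairs, already sorted by their position in a.
    pairs = [(i, pos_b[line]) for i, line in enumerate(a)
             if count_a[line] == 1 and line in pos_b]

    # Longest increasing subsequence over the b positions (patience sorting).
    tails = []     # tails[k]: smallest b position ending a run of length k + 1
    tail_idx = []  # tail_idx[k]: index into pairs of that run's last element
    prev = [None] * len(pairs)
    for n, (_, j) in enumerate(pairs):
        k = bisect_left(tails, j)
        if k == len(tails):
            tails.append(j)
            tail_idx.append(n)
        else:
            tails[k] = j
            tail_idx[k] = n
        prev[n] = tail_idx[k - 1] if k > 0 else None

    # Walk back from the end of the longest run to recover it.
    anchors = []
    n = tail_idx[-1] if tail_idx else None
    while n is not None:
        anchors.append(pairs[n])
        n = prev[n]
    return anchors[::-1]


# The two versions of the Makefile fragment from the post (hypothetical names).
OLD = [
    "",
    "include $(DEPTH)/config/autoconf.mk",
    "",
    "include $(topsrcdir)/config/rules.mk",
    "",
    "libs::",
    "\t$(INSTALL) $(srcdir)/nsKillAll.js $(DIST)/bin/components",
    "",
    "clean::",
    "\trm -f $(DIST)/bin/components/nsKillAll.js",
]
NEW = [
    "",
    "include $(DEPTH)/config/autoconf.mk",
    "",
    "EXTRA_COMPONENTS = nsKillAll.js",
    "",
    "include $(topsrcdir)/config/rules.mk",
    "",
    "clean::",
    "\trm -f $(DIST)/bin/components/nsKillAll.js",
]

print(unique_anchors(OLD, NEW))
```

For these two versions the anchors include the pair (3, 5), i.e. the rules.mk include line, so a diff built around the anchors keeps that line as unchanged context instead of removing it and adding it back. The blank lines occur more than once, so they are never used as anchors.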