I’m currently looking for a (very) fast way to (more or less reliably) checksum multi-gigabyte filesystem images (namely VMFS), to validate that my compressed image files, once expanded, still match what the original image used to be. One of the main features of these image files is that they are sparse, so you’d get bonus points for a checksum algorithm that allows fast addition of a huge run of zeroes.
So far, I’ve considered Adler-32 and Fletcher-32, though I’m unsure how reliably they would detect the kind of bogus output I could get. I’ve also considered “md5 of md5” or “sha-1 of sha-1”: compute the md5 or sha-1 sum of the concatenation of the per-block md5 or sha-1 sums, for blocks of some fixed size, where the sum of an all-zero block can be pre-calculated once.
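For what it’s worth, Adler-32 does have the “fast zeroes” property: a zero byte leaves the A accumulator unchanged and adds A to B, so a run of n zero bytes can be folded in with one multiplication instead of a scan. A minimal sketch (the function name is mine, checked against Python’s `zlib.adler32`):

```python
import zlib

MOD = 65521  # the Adler-32 modulus

def adler32_skip_zeros(checksum: int, n: int) -> int:
    """Advance an Adler-32 checksum past n zero bytes without scanning them."""
    a = checksum & 0xFFFF
    b = (checksum >> 16) & 0xFFFF
    # Each zero byte leaves A unchanged and adds A to B,
    # so n zero bytes add n*A to B (mod 65521).
    b = (b + n * a) % MOD
    return (b << 16) | a

# Sanity check against zlib's incremental implementation:
c = zlib.adler32(b"some header bytes")
assert adler32_skip_zeros(c, 10_000_000) == zlib.adler32(b"\0" * 10_000_000, c)
```

Fletcher-32 admits the same shortcut, since it has the same two-accumulator structure with a different modulus.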
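The “md5 of md5” idea could look something like this sketch (the 1 MiB block size and the `None`-means-hole convention are arbitrary choices of mine):

```python
import hashlib

BLOCK = 1 << 20  # 1 MiB blocks; the block size is an arbitrary choice
ZERO_DIGEST = hashlib.md5(b"\0" * BLOCK).digest()  # pre-calculated once

def md5_of_md5(blocks):
    """blocks yields BLOCK-sized chunks; yield None for an all-zero
    (sparse) block to reuse ZERO_DIGEST instead of hashing it."""
    outer = hashlib.md5()
    for blk in blocks:
        outer.update(ZERO_DIGEST if blk is None else hashlib.md5(blk).digest())
    return outer.hexdigest()

# A sparse hole and an explicit block of zeroes must produce the same sum:
assert md5_of_md5([None, b"data".ljust(BLOCK, b"\0")]) == \
       md5_of_md5([b"\0" * BLOCK, b"data".ljust(BLOCK, b"\0")])
```

On the sparse side of things, holes never touch the disk at all; the expanded image just has to be chunked on the same block boundaries for the sums to be comparable.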
Dear lazyweb, do you have any better ideas, before I start looking more seriously at the ideas above?