Visual Hash

A visual hash is an image generated from an arbitrary string, in the same way that an ordinary cryptographic hash is a value usually displayed as a hexadecimal string. The advantage of a visual hash is that humans find it easier to remember and compare.

What makes a hash good enough?

A good visual hash should have the following properties:

  1. A high information content, as measured by its Shannon entropy. This gives the hash a low chance of collision and provides pre-image resistance, i.e. it makes it difficult to create an input whose hash is identical to a given hash.

  2. A high minimum self-information, i.e. the self-information of the most likely output should be as high as possible. This property implies the previous one (the entropy is the mean self-information, so it cannot be lower than the minimum), but is distinct from it. The output with the lowest self-information is the most prone to collisions. Testing for this property is harder than estimating the mean information content discussed above.

    Failing that, a second-best option is to identify hashes that have low self-information. If we can reliably identify them, we can (in principle) eliminate them entirely by checking for low self-information during hashing and trying again whenever it is encountered.

  3. Second pre-image resistance, which means that knowing an input and its hash, we should not be able to produce a different input with a similar (or identical) hash. We achieve this by using a cryptographic hash as input to our visual hashes, which ensures that the second pre-image resistance is the same as the first pre-image resistance.

    In order to generate testable visual hashes, we actually aim for raw algorithms that lack second pre-image resistance (which we then add on via the cryptographic hash). This allows us to more readily explore “similar” images in order to estimate the information content in the visual hash, and thus its first pre-image resistance.
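The first two properties can be made concrete with a small sketch. It assumes we have counted how often each visually distinguishable hash occurs in some sample. The Shannon entropy is then the mean self-information, and the minimum self-information is that of the most common output. The function names and the example counts are illustrative only and are not part of VisualHash.

```python
import math
from collections import Counter


def self_information(p):
    """Self-information, in bits, of an outcome with probability p."""
    return -math.log2(p)


def shannon_entropy(counts):
    """Mean self-information (bits) of the empirical distribution in `counts`."""
    total = sum(counts.values())
    return sum((c / total) * self_information(c / total) for c in counts.values())


def min_self_information(counts):
    """Self-information (bits) of the most frequent outcome."""
    total = sum(counts.values())
    return self_information(max(counts.values()) / total)


# Hypothetical observations: each label stands for one distinguishable image.
observed = Counter({"a": 4, "b": 2, "c": 1, "d": 1})
entropy_bits = shannon_entropy(observed)    # 1.75 bits
min_bits = min_self_information(observed)   # 1.0 bit, from the outcome "a"
```

The minimum can never exceed the mean, which is why a high minimum self-information guarantees a high entropy but not the other way around.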

The VisualHash module

Create a visual hash of a string.

VisualHash is a package that includes several functions to create a visual hash of an arbitrary string. Each function implements a distinct algorithm that, given a random number generator, produces an image. The cryptographic strength of the hash relies on using a cryptographically strong random number generator that is seeded by the data to be hashed.

We provide a strong random number generator (called StrongRandom), which is based on taking the SHA512 hash of the data, followed by the SHA512 hash of the hash, and so on. This puts an upper bound of 512 bits of entropy on any of our hashes (which should not be a problem).
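The hash-chain idea can be sketched with the standard hashlib module. This is a minimal illustration of the construction described above, not the actual StrongRandom code: the class name HashChainRandom and its methods next_byte and random are hypothetical, and the real class may expose a different interface.

```python
import hashlib


class HashChainRandom:
    """Hypothetical stand-in for a SHA512 hash-chain generator."""

    def __init__(self, data: bytes):
        # The first block is the SHA512 hash of the data itself.
        self._block = hashlib.sha512(data).digest()
        self._pos = 0

    def next_byte(self) -> int:
        """Return the next byte of the chain as an integer in 0..255."""
        if self._pos == len(self._block):
            # Block exhausted: the next block is the hash of the current one.
            self._block = hashlib.sha512(self._block).digest()
            self._pos = 0
        value = self._block[self._pos]
        self._pos += 1
        return value

    def random(self) -> float:
        """Return a float in [0, 1) built from the next four bytes."""
        n = 0
        for _ in range(4):
            n = (n << 8) | self.next_byte()
        return n / 2**32
```

Because every later block is derived from the first 64-byte digest, the whole sequence is determined by those 512 bits, which is where the upper bound on entropy comes from.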

We also provide a “tweaked” random number generator, TweakedRandom, which gives a slight variation on a specific strong random number sequence. This enables us to test the effect of small changes on the generated hashes.

The visual hash styles supported are:

  • Fractal
  • Flag
  • T-Flag
  • RandomArt
  • Identicon

VisualHash.Flag(random=<VisualHash.random.StrongRandom object>, size=128)

Create a hash using the “flag” algorithm.

Given a random generator (and optionally a size in pixels) return a PIL Image that is a hash generated by the random generator.

VisualHash.Fractal(random=<VisualHash.random.StrongRandom object>, size=128)

Create a hash as a fractal flame.

Given a random generator (and optionally a size in pixels) return a PIL Image that is a hash generated by the random generator.

VisualHash.Identicon(random=<VisualHash.random.StrongRandom object>, size=128)

Create an identicon hash.

Given a random generator (and optionally a size in pixels) return a PIL Image that is a hash generated by the random generator. This hash has only 32 bits in it, so it is not a strong hash.
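To see why 32 bits is weak, the standard birthday approximation gives a rough estimate of how quickly accidental collisions appear. The sketch below assumes every 32-bit value is equally likely and that two hashes only collide when they are exactly identical, which is the best case; the function names are illustrative and not part of VisualHash.

```python
import math


def collision_probability(bits, n):
    """Approximate chance that at least two of n uniform `bits`-bit hashes coincide."""
    space = 2.0 ** bits
    # Birthday approximation: 1 - exp(-n(n-1) / (2 * space)).
    return -math.expm1(-n * (n - 1) / (2.0 * space))


def hashes_for_even_odds(bits):
    """Approximate number of hashes at which a collision becomes a coin flip."""
    return math.sqrt(2.0 * math.log(2.0) * 2.0 ** bits)


identicon_limit = hashes_for_even_odds(32)          # roughly 77,000 hashes
full_strength = collision_probability(512, 10**9)   # negligible for 512 bits
```

Under these assumptions a collision between two identicons becomes more likely than not after fewer than a hundred thousand inputs, whereas a hash that used the full 512 bits would remain effectively collision-free.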

VisualHash.OptimizedFractal(random=<VisualHash.random.StrongRandom object>, size=128)

Create a hash as a fractal flame.

Given a random generator (and optionally a size in pixels) return a PIL Image that is a hash generated by the random generator.

VisualHash.RandomArt(random=<VisualHash.random.StrongRandom object>, size=128)

Create a hash using the randomart algorithm.

Given a random generator (and optionally a size in pixels) return a PIL Image that is a hash generated by the random generator.

VisualHash.TFlag(random=<VisualHash.random.StrongRandom object>, size=128)

Create a hash using the “T-flag” algorithm.

Given a random generator (and optionally a size in pixels) return a PIL Image that is a hash generated by the random generator.
