The Circle/Sharing Common Keys

From InfoAnarchy
Jump to: navigation, search

A number of distributed hash tables have mechanisms for dealing with very common hashes. A very common hash might result if a particular file is very popular, or if that hash represents a service that a great many nodes provide (the developers of The Circle refer to this as The Britney Spears Effect). If a hash value is very common, the node responsible could be placed under very heavy network load.

The usual solution involves some kind of caching mechanism, so that the load from queries is reduced but a single node still has to store all the data for a hash. This is only a partial solution. It will still fail if, say, a billion nodes all decide to publish a particular hash.

The Circle takes a slightly different approach. In The Circle, keys stored to the hashtable are 20 bytes long, consisting of a 16 byte MD5 hash followed by a 4 byte random number. Such a key can be interpreted as a position on a circle (ie like a position on a ruler, except with the end of the ruler bent round so it touches the start -- this simplifies the algorithms somewhat).

Nodes also have a 20 byte "name", and are responsible for the section of the hashtable between their name and the name of their immediate clockwise neighbour on the circle.

This means that keys in the hashtable that have the same MD5 hash will be adjacent in the hashtable, but not in exactly the same position. So a single hash can be spread over several nodes. These nodes will be adjacent on the circle, and will know their neighbours on the circle, so once one of the nodes has been found the rest can be located easily.

Circle nodes takes advantage of this by choosing as a node name a key that was published in the last session. This means that if a particular hash is common, then (with high probability) several nodes will pick this hash and divide up the section of hashtable that stores it between them.

The result is that the hashtable is divided fairly evenly between nodes, irrespective of the frequency of particular hashes.