Perceptual hashing


Perceptual hashing is the use of an algorithm that produces a snippet, or fingerprint, of various forms of multimedia. A perceptual hash function produces similar hashes when the features of its inputs are similar, whereas a cryptographic hash function relies on the avalanche effect, in which a small change in the input value creates a drastic change in the output value. Because hashes of similar data are correlated, similar data can be found by comparing hashes, and perceptual hash functions are therefore widely used in finding cases of online copyright infringement as well as in digital forensics. For example, Wikipedia could maintain a database of text hashes of popular online books or articles for which the authors hold copyrights. Any time a Wikipedia user uploaded a copyrighted online book or article, its hash would be almost exactly the same as one in the database, and the upload could be flagged as plagiarism. The same flagging system can be used for any multimedia or text file. According to research at Northumbria University, perceptual hashing can also be applied to simultaneously identify similar content for video copy detection and detect malicious manipulations for video authentication. The proposed system was reported to perform better than existing video hashing techniques in terms of both identification and authentication.
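The contrast with cryptographic hashing described above can be illustrated with a minimal sketch. It uses a simple "average hash" (one bit per pixel, set when the pixel is brighter than the image mean), which is only one basic perceptual hashing scheme and is not the method of any system mentioned in this article. The 4×4 grayscale image, the function names average_hash and hamming_distance, and the brightness change of 3 are all assumptions chosen for illustration.

```python
import hashlib


def average_hash(pixels):
    """Return a bit string with '1' where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(a, b):
    """Count the positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def sha256_hex(pixels):
    """Cryptographic hash of the raw pixel bytes, for comparison."""
    return hashlib.sha256(bytes(v for row in pixels for v in row)).hexdigest()


# Hypothetical 4x4 grayscale image (values 0-255): dark left, bright right.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [10, 20, 200, 210],
    [15, 25, 205, 215],
]

# A slightly brightened copy: perceptually almost identical to the original.
brightened = [[value + 3 for value in row] for row in original]

# Perceptual hashes of similar images are close (here, identical).
perceptual_distance = hamming_distance(
    average_hash(original), average_hash(brightened)
)

# Cryptographic hashes of the same two images are unrelated.
sha_original = sha256_hex(original)
sha_brightened = sha256_hex(brightened)
differing_hex_digits = hamming_distance(sha_original, sha_brightened)

print("perceptual bit difference:", perceptual_distance)  # 0 for this example
print("SHA-256 hex digits differing:", differing_hex_digits, "of 64")
```

In a flagging system of the kind described above, an upload would be treated as a likely match when the Hamming distance between its hash and a stored hash falls below some threshold. The right threshold depends on the hash length and the application, and is not specified here.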
In addition to its uses in digital forensics, research has shown that perceptual hashing can be applied to a wide variety of situations. In an application similar to comparing images for copyright infringement, a group of researchers found that it could be used to compare and match images in a database. Their proposed algorithm proved to be not only effective but also more efficient than the standard means of database image searching. A team from China also found that perceptual hashing could be applied effectively to speech encryption, creating a system in which the encryption was both more accurate and more compact.