Keystroke dynamics


Keystroke dynamics, also called keystroke biometrics, typing dynamics and, more recently, typing biometrics, refers to the detailed timing information that describes exactly when each key was pressed and when it was released as a person types on a computer keyboard.
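As a minimal sketch of what this timing information looks like in practice, the following Python fragment derives two quantities from press and release timestamps. The names `KeyEvent`, `dwell_times` and `flight_times` are hypothetical, and "dwell time" and "flight time" are terms commonly used in the field rather than terms taken from this article; the millisecond values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class KeyEvent:
    key: str           # key label, e.g. "t"
    press_ms: float    # time the key went down, in milliseconds
    release_ms: float  # time the key came back up, in milliseconds


def dwell_times(events):
    """Hold duration of each key: release time minus press time."""
    return [e.release_ms - e.press_ms for e in events]


def flight_times(events):
    """Gap between releasing one key and pressing the next.

    A negative value means the next key went down before the
    previous one came up (overlapping keystrokes).
    """
    return [nxt.press_ms - cur.release_ms
            for cur, nxt in zip(events, events[1:])]


# Illustrative (made-up) timings for typing "the".
events = [
    KeyEvent("t", 0.0, 95.0),
    KeyEvent("h", 140.0, 220.0),
    KeyEvent("e", 210.0, 300.0),
]
print(dwell_times(events))
print(flight_times(events))
```

In this example the "e" is pressed before the "h" is released, so the second flight time is negative.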

Science

The behavioral biometric of keystroke dynamics uses the manner and rhythm in which an individual types characters on a keyboard or keypad. The keystroke rhythms of a user are measured to develop a unique biometric template of the user's typing pattern for future authentication. Vibration information may also be used to create a pattern for future use in both identification and authentication tasks.
The data needed to analyze keystroke dynamics are obtained by keystroke logging. Normally, all that is retained when logging a typing session is the sequence of characters, in the order in which the keys were pressed; the timing information is discarded. When reading email, the receiver cannot tell from the phrase "I saw 3 zebras!" how it was typed, for example whether it was entered quickly or slowly.
When the message "What hath God wrought" was sent by telegraph on May 24, 1844, from the U.S. Capitol in Washington, D.C. to the Baltimore and Ohio Railroad "outer depot" in Baltimore, Maryland, a new era in long-distance communications began. By the 1860s the telegraph revolution was in full swing and telegraph operators were a valuable resource. With experience, each operator developed a unique "signature" and could be identified simply by their tapping rhythm.
As late as World War II, the military transmitted messages in Morse code. Using a methodology called "The Fist of the Sender", military intelligence identified that each individual had a unique way of keying in a message's "dots" and "dashes", creating a rhythm that could help distinguish ally from enemy.

Use as biometric data

Researchers are interested in using this keystroke dynamic information, which is normally discarded, to verify or even try to determine the identity of the person who is producing those keystrokes. The techniques used to do this vary widely in power and sophistication, and range from statistical techniques to AI approaches such as neural networks.
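As an illustration of the statistical end of that range, the sketch below builds a per-user template from the mean and spread of each timing feature and scores a new attempt by its average scaled distance from the template. This is one simple approach chosen for illustration, not the method of any particular product; the function names, the 1 ms floor on the spread and the default threshold of 2.0 are all assumptions.

```python
from statistics import mean, pstdev


def build_template(samples):
    """Build a template from enrollment samples.

    samples: list of equal-length timing vectors (e.g. hold times in ms)
    recorded while the same user typed the same text.
    """
    columns = list(zip(*samples))
    means = [mean(c) for c in columns]
    # Floor the spread (assumed 1 ms) so a perfectly consistent
    # feature does not cause a division by zero.
    spreads = [max(pstdev(c), 1.0) for c in columns]
    return means, spreads


def distance(template, attempt):
    """Average absolute deviation from the template, scaled per feature."""
    means, spreads = template
    total = sum(abs(x - m) / s for x, m, s in zip(attempt, means, spreads))
    return total / len(means)


def verify(template, attempt, threshold=2.0):
    """Accept the attempt when its distance is within the threshold."""
    return distance(template, attempt) <= threshold


# Illustrative (made-up) enrollment data: three samples, three features.
enrollment = [[100, 80, 120], [110, 90, 130], [90, 70, 110]]
template = build_template(enrollment)
print(verify(template, [105, 82, 118]))  # close to the enrolled rhythm
print(verify(template, [160, 40, 200]))  # far from the enrolled rhythm
```

A neural-network approach would replace `distance` with a learned model, but the enrollment and thresholding steps would look broadly similar.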
The time to get to and depress a key, and the time the key is held down, may be very characteristic of a person, regardless of how fast they are typing overall. Most people have specific letters that take them longer to find or reach than their average seek time over all letters, but which letters those are varies dramatically between people while staying consistent for each person. Right-handed people may be statistically faster in getting to keys they hit with their right-hand fingers than with their left-hand fingers. Index fingers may be characteristically faster than other fingers, to a degree that is consistent for a person from day to day regardless of their overall speed that day.
In addition, sequences of letters may have characteristic properties for a person. In English, the word "the" is very common, and those three letters may be known to the typist as a rapid-fire sequence and not as just three unrelated letters hit in that order. Common endings, such as "ing", may be entered far faster than, say, the same letters in reverse order, to a degree that varies consistently by person. This consistency may hold, and may reveal the common sequences of the person's native language, even when they are writing entirely in a different language, just as an accent might in spoken English.
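The sequence effect described above can be measured by averaging the latency of each two-letter sequence (a "digraph", a term assumed here rather than taken from the article). The following is a minimal sketch with a hypothetical function name and made-up timings:

```python
from collections import defaultdict
from statistics import mean


def digraph_latencies(keys, press_times_ms):
    """Mean press-to-press latency for each two-letter sequence."""
    buckets = defaultdict(list)
    pairs = list(zip(keys, press_times_ms))
    for (k1, t1), (k2, t2) in zip(pairs, pairs[1:]):
        buckets[k1 + k2].append(t2 - t1)
    return {digraph: mean(vals) for digraph, vals in buckets.items()}


# Illustrative (made-up) press times for typing "thethe".
latencies = digraph_latencies("thethe", [0, 60, 130, 400, 470, 530])
for digraph, ms in sorted(latencies.items()):
    print(digraph, ms)
```

Here the practiced sequences "th" and "he" come out much faster than the unpracticed transition "et", which is the kind of contrast a per-person profile could record.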
Common "errors" may also be quite characteristic of a person, and there is an entire taxonomy of errors, such as a person's most common "substitutions", "reversals", "drop-outs", "double-strikes", "adjacent letter hits", "homonyms" and "hold-length errors". Even without knowing what language a person is working in, these errors might be detected by looking at the rest of the text and at which letters the person goes back and replaces. Again, the patterns of errors might be sufficiently different to distinguish two people.
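A few of these categories can be recognized mechanically by comparing what was first typed with what it was corrected to. The sketch below handles only single-character substitutions, reversals of two adjacent letters, drop-outs and double-strikes; the function name and the exact matching rules are assumptions, and categories such as adjacent letter hits (which need a keyboard layout) or homonyms (which need a dictionary) are out of scope.

```python
def classify_error(typed, corrected):
    """Label the difference between the first attempt and its correction."""
    if typed == corrected:
        return "none"
    if len(typed) == len(corrected):
        diffs = [i for i, (a, b) in enumerate(zip(typed, corrected)) if a != b]
        if len(diffs) == 1:
            return "substitution"
        if (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and typed[diffs[0]] == corrected[diffs[1]]
                and typed[diffs[1]] == corrected[diffs[0]]):
            return "reversal"
    if len(typed) == len(corrected) + 1:
        # One extra character: a double-strike if it repeats a neighbor.
        for i in range(len(typed)):
            if typed[:i] + typed[i + 1:] == corrected:
                repeats_prev = i > 0 and typed[i] == typed[i - 1]
                repeats_next = i + 1 < len(typed) and typed[i] == typed[i + 1]
                if repeats_prev or repeats_next:
                    return "double-strike"
                return "insertion"
    if len(typed) + 1 == len(corrected):
        # One missing character.
        for i in range(len(corrected)):
            if corrected[:i] + corrected[i + 1:] == typed:
                return "drop-out"
    return "other"


for attempt in ("tha", "teh", "te", "thhe"):
    print(attempt, "->", classify_error(attempt, "the"))
```

Counting how often each label occurs for a given typist would give a simple error profile of the kind described above.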

Authentication versus identification

Keystroke dynamics is part of a larger class of biometrics known as behavioral biometrics, a field in which observed patterns are statistical in nature. Because of this inherent uncertainty, a commonly held belief is that behavioral biometrics are not as reliable as biometrics used for authentication based on physically observable characteristics such as fingerprints, retinal scans or DNA. In practice, behavioral biometrics use a confidence measurement instead of the traditional pass/fail measurement. As such, the traditional benchmarks of False Acceptance Rate (FAR) and False Rejection Rate (FRR) no longer have a linear relationship.
The benefit of keystroke dynamics is that FRR and FAR can be adjusted by changing the acceptance threshold at the individual level. This allows for explicitly defined individual risk mitigation, something physical biometric technologies could never achieve.
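The effect of moving the acceptance threshold can be sketched as follows. The scores are assumed to be distances (lower means a closer match to the user's template), the function name is hypothetical and the score lists are made up for illustration.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for a given acceptance threshold.

    Scores are distances: an attempt is accepted when score <= threshold.
    """
    false_accepts = sum(s <= threshold for s in impostor_scores)
    false_rejects = sum(s > threshold for s in genuine_scores)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))


# Illustrative (made-up) distance scores.
genuine = [0.4, 0.6, 0.9, 1.3, 2.1]    # attempts by the real user
impostor = [1.1, 1.8, 2.5, 3.0, 4.2]   # attempts by other people

for threshold in (1.0, 2.0, 3.0):
    far, frr = far_frr(genuine, impostor, threshold)
    print(f"threshold={threshold}: FAR={far:.2f} FRR={frr:.2f}")
```

A strict threshold lowers FAR at the cost of a higher FRR, and a lenient one does the opposite, so the threshold can be chosen per user according to how much risk is acceptable for that account.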
One of the major problems that keystroke dynamics runs into is that a person's typing varies substantially during a day and between different days, and may be affected by any number of external factors.
Because of these variations, any system will make false-positive and false-negative errors. Some of the successful commercial products have strategies to handle these issues and have proven effective in large-scale use in real-world settings and applications.

Legal and regulatory issues

Use of keylogging software may be in direct and explicit violation of local laws, such as the U.S. Patriot Act, under which such use may constitute wire-tapping. Violations could carry severe penalties, including jail time. See spyware for a fuller description of user-consent issues and various fraud statutes.

Patents

Because keystroke timings are generated by human beings, they are not well correlated with external processes, and are frequently used as a source of hardware-generated random numbers for computer systems.
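As a rough sketch of the idea, the fragment below folds the low-order byte of each interval between keystrokes into a hash. It is an illustration only and not a vetted random number generator: the function name is hypothetical, the choice of SHA-256 and of the low byte is an assumption, and real systems estimate how much unpredictability each event contributes before relying on it.

```python
import hashlib


def mix_timings(timestamps_ns):
    """Fold the low byte of each inter-key interval into a SHA-256 digest.

    timestamps_ns: key-press times in nanoseconds, in increasing order.
    """
    h = hashlib.sha256()
    for earlier, later in zip(timestamps_ns, timestamps_ns[1:]):
        delta = later - earlier
        # Keep only the least significant byte, which is the part
        # least predictable from the typist's overall rhythm.
        h.update((delta & 0xFF).to_bytes(1, "big"))
    return h.digest()


# Illustrative (made-up) timestamps.
seed = mix_timings([0, 1000, 2500, 2600])
print(seed.hex())
```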

Other references