Audio-to-video synchronization

Audio-to-video synchronization refers to the relative timing of the audio and video parts of a program during creation, post-production, transmission, reception and playback processing. AV synchronization can be an issue in television, videoconferencing, or film.
In industry terminology, the lip-sync error is expressed as the amount of time by which the audio departs from perfect synchronization with the video: a positive number indicates that the audio leads the video, and a negative number indicates that the audio lags the video. This terminology and the standardized numeric lip-sync error are used in the professional broadcast industry, as evidenced by professional papers and standards such as ITU-R BT.1359-1.
Digital or analog audio-video streams and video files usually contain some sort of synchronization mechanism, either interleaved video and audio data or explicit relative timestamping of the data. Processing must respect this relative timing, for example by stretching or interpolating the received data. If it does not, the AV-sync error will grow whenever data is lost through transmission errors or through missing or mis-timed processing.

Incorrectly synchronized

AV-sync can go wrong in several ways:

During creation, AV-sync errors happen because of:
*Internal AV-sync error: Different signal processing delays between image and sound in the video camera and microphone. This AV-sync delay is normally fixed.
*External AV-sync error: If a microphone is placed far away from the sound source, the audio will be out of sync because the speed of sound is much lower than the speed of light. If the sound source is 340 meters from the microphone, the sound arrives approximately 1 second later than the light. This AV-sync delay increases with distance.

During mixing of video clips, either the audio or the video normally needs to be delayed so that the two are synchronized. This AV-sync delay is static, but can vary from clip to clip.

Video editing effects can also introduce AV-sync errors.
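
The external delay described above follows from dividing the distance by the speed of sound. The following is a minimal sketch in Python, assuming the round figure of 340 m/s implied by the example in the text (the real value varies with temperature) and neglecting the travel time of light. The function names are illustrative only.

```python
# Approximate speed of sound in air; an assumption matching the 340 m example.
SPEED_OF_SOUND_M_PER_S = 340.0


def acoustic_delay_ms(distance_m: float) -> float:
    """Milliseconds by which sound lags the image over distance_m meters."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


def acoustic_delay_frames(distance_m: float, frame_rate_hz: float) -> float:
    """The same delay expressed in video frames at the given frame rate."""
    return acoustic_delay_ms(distance_m) / 1000.0 * frame_rate_hz


if __name__ == "__main__":
    for distance in (3.4, 34.0, 340.0):
        print(distance, "m:", round(acoustic_delay_ms(distance), 1), "ms")
```

At 25 frames per second, a microphone 34 meters from the source already lags by about two and a half frames, which is why the external error matters for distant pickups.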

Examples of how transmission, reception and playback can disturb AV-sync:

*A video camera with built-in microphones or line-in may not delay the sound and video paths by the same number of milliseconds. A video camera should put some sort of explicit AV-sync timing into the video and audio streams. Solid-state video cameras can delay the video signal by one or more frames.
*An AV stream may be corrupted during transmission by electrical glitches or wireless interruptions, which may cause it to go out of sync. The AV-sync delay normally increases with time.
*Television systems make extensive use of audio and video signal processing circuitry with significant delays. Widely used video processing circuitry that contributes significant video delay includes frame synchronizers, digital video effects processors, video noise reduction, format converters and compression systems.
*The video monitor's processing circuitry may delay the video stream. Pixelated displays require video format conversion and deinterlace processing, which can add one or more frames of video delay.
*A video monitor with built-in speakers or line-out may not delay the sound and video paths by the same number of milliseconds. Some video monitors contain internal user-adjustable audio delays to help correct such errors.
*Some transmission protocols, such as RTP, require an out-of-band method for synchronizing media streams. In RTP's case, each media stream has its own timestamps, using an independent clock rate and a per-stream randomized starting value. An RTCP Sender Report is needed for each stream in order to synchronize the streams. The necessary RTCP packets might be lost, or might not be sent until several seconds or more after the stream has begun. Many software clients do not send RTCP at all or send non-compliant data.

Effect of no explicit AV-sync timing

When a digital or analog audio-video stream has no explicit AV-sync timing, the following effects can cause it to go out of sync:

In film, these timing errors are most commonly caused by worn film skipping over the projector sprockets because its sprocket holes are torn.
Errors can also be caused by the projectionist misthreading the film in the projector, although this is rare with competent projectionists.
Audio-to-video synchronization is commonly corrected and maintained with an audio synchronizer. Television industry standards organizations have established acceptable amounts of audio and video timing error and suggested practices for maintaining acceptable timing.
A/V sync errors are becoming a significant problem in the digital television industry because of the use of large amounts of video signal processing in television production, television broadcasting and pixelated television displays such as LCD, DLP and plasma displays.
In the television field, audio-video sync problems are commonly caused when significant amounts of processing are performed on the video part of the television program.
Typical sources of significant video delays in the television field include video synchronizers and video compression encoders and decoders. Particularly troublesome encoders and decoders are used in MPEG compression systems utilized for broadcasting digital television and storing television programs on consumer and professional recording and playback devices.
A source of significant video delay is found in pixelated television displays which utilize complex video signal processing to convert the resolution of the incoming video signal to the native resolution of the pixelated display, for example converting standard definition video to be displayed on a high definition display. "Lip-flap" may exceed 200 ms at times.
In broadcast television, it is not unusual for lip-sync error to vary by over 100 ms from time to time.
EBU Recommendation R37, “The relative timing of the sound and vision components of a television signal”, states that end-to-end audio/video sync should be within +40 ms and -60 ms (audio before and after video, respectively) and that each stage should be within +5 ms and -15 ms.
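
The two EBU R37 figures interact: a chain in which every stage is individually within tolerance can still exceed the end-to-end limit once the errors accumulate. The sketch below illustrates this under the assumption that stage errors simply add, with positive values meaning the audio is ahead of the video. The function and constant names are illustrative.

```python
# Limits quoted in the text, in milliseconds (lower bound, upper bound).
EBU_R37_END_TO_END_MS = (-60.0, 40.0)
EBU_R37_PER_STAGE_MS = (-15.0, 5.0)


def check_chain(stage_errors_ms):
    """Return (total_ms, indexes_of_offending_stages, end_to_end_ok).

    Assumes the per-stage errors add up linearly along the chain.
    """
    stage_lo, stage_hi = EBU_R37_PER_STAGE_MS
    offending = [i for i, err in enumerate(stage_errors_ms)
                 if not stage_lo <= err <= stage_hi]
    total = sum(stage_errors_ms)
    end_lo, end_hi = EBU_R37_END_TO_END_MS
    return total, offending, end_lo <= total <= end_hi


if __name__ == "__main__":
    # Five stages, each at the per-stage limit for lagging audio.
    print(check_chain([-15.0] * 5))
```

In the example, no single stage is at fault, yet the accumulated lag of 75 ms is outside the end-to-end window.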

Viewer experience of incorrectly synchronized AV-sync

The result typically leaves a filmed or televised character moving his or her mouth when there is no spoken dialog to accompany it, hence the term "lip flap" or "lip-sync error". The resulting audio-video sync error can be annoying to the viewer and may even cause the viewer to not enjoy the program, decrease the effectiveness of the program or lead to a negative perception of the speaker on the part of the viewer. The potential loss of effectiveness is of particular concern for product commercials and political candidates. Television industry standards organizations, such as the Advanced Television Systems Committee, have become involved in setting standards for audio-video sync errors.
Because of these annoyances, AV-sync error is a concern to the television programming industry, including television stations, networks, advertisers and program production companies. Unfortunately, the advent of high-definition flat-panel display technologies, which can delay video more than audio, has moved the problem into the viewer's home and beyond control of the television programming industry alone. Consumer product companies now offer audio-delay adjustments to compensate for video-delay changes in TVs and A/V receivers, and several companies manufacture dedicated digital audio delays made exclusively for lip-sync error correction.

Recommendations

For television applications, the Advanced Television Systems Committee recommends that audio should lead video by no more than 15 milliseconds and lag video by no more than 45 milliseconds. However, the ITU performed strictly controlled tests with expert viewers and found that the threshold of detectability is -125 ms to +45 ms. For film, acceptable lip sync is considered to be no more than 22 milliseconds in either direction.
The Consumer Electronics Association has published a set of recommendations for how digital television receivers should implement A/V sync.
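
The figures above can be compared side by side for a measured error. The sketch below uses the sign convention from the introduction (positive means the audio leads the video). The dictionary keys and function name are illustrative labels, not official names.

```python
# (lower bound, upper bound) in milliseconds; positive = audio leads video.
LIMITS_MS = {
    "ATSC recommendation": (-45.0, 15.0),
    "ITU detectability threshold": (-125.0, 45.0),
    "film": (-22.0, 22.0),
}


def evaluate(error_ms: float) -> dict:
    """Report, for each limit, whether error_ms falls inside it."""
    return {name: lo <= error_ms <= hi for name, (lo, hi) in LIMITS_MS.items()}


if __name__ == "__main__":
    for error in (-30.0, 20.0, -130.0):
        print(error, evaluate(error))
```

An error of -30 ms, for instance, satisfies the ATSC recommendation and stays below the ITU detectability threshold, but exceeds the film tolerance.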

SMPTE ST2064

SMPTE standard ST2064, published in 2015, provides technology to reduce or eliminate lip-sync errors in digital television. The standard uses audio and video fingerprints taken from a television program. The fingerprints can be recovered and used to correct the accumulated lip-sync error. When fingerprints have been generated for a TV program and the required technology is incorporated, the viewer's display device can continuously measure and correct lip-sync errors.
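
The general idea of measuring lip-sync error from fingerprints can be sketched as follows. This is not the ST2064 fingerprint format or matching algorithm, neither of which is described here. It is a toy illustration that assumes fingerprints are plain numeric sequences sampled at a fixed period, and that delay is estimated by finding the shift with the smallest mean absolute difference. All names are hypothetical.

```python
def best_lag(reference, received, max_lag, min_overlap=8):
    """Shift (in samples) at which received best matches reference.

    A positive result means the received sequence is delayed.
    """
    best, best_cost = None, float("inf")
    for lag in range(-max_lag, max_lag + 1):
        pairs = [(reference[i], received[i + lag])
                 for i in range(len(reference))
                 if 0 <= i + lag < len(received)]
        if len(pairs) < min_overlap:
            continue  # too little overlap to trust this shift
        cost = sum(abs(a - b) for a, b in pairs) / len(pairs)
        if cost < best_cost:
            best, best_cost = lag, cost
    if best is None:
        raise ValueError("sequences too short for the requested lag range")
    return best


def lip_sync_error_ms(audio_ref, audio_rx, audio_period_ms,
                      video_ref, video_rx, video_period_ms, max_lag=5):
    """Positive when audio leads video, i.e. video was delayed more."""
    audio_delay = best_lag(audio_ref, audio_rx, max_lag) * audio_period_ms
    video_delay = best_lag(video_ref, video_rx, max_lag) * video_period_ms
    return video_delay - audio_delay


if __name__ == "__main__":
    ref = [0, 1, 4, 2, 7, 3, 9, 5, 6, 8, 1, 0]
    audio_rx = [0] + ref       # audio delayed by one sample period
    video_rx = [0, 0] + ref    # video delayed by two sample periods
    print(lip_sync_error_ms(ref, audio_rx, 20.0, ref, video_rx, 40.0, max_lag=3))
```

Measuring the audio and video delays separately and subtracting them gives the accumulated error, which a downstream device could then remove by delaying whichever signal is ahead.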

Timestamps

Timestamps are embedded in MPEG transport streams to signal precisely when each audio and video segment is to be presented, in order to avoid AV-sync errors. However, these timestamps are often added after the video has undergone frame synchronization, format conversion and preprocessing, so the lip-sync errors created by those operations are not corrected by adding and using timestamps.
The Real-time Transport Protocol clocks media using origination timestamps on an arbitrary timeline. A real-time clock, such as one delivered by the Network Time Protocol and described in the Session Description Protocol associated with the media, may be used to syntonize media. A server may then be used for final synchronization to remove any residual offset.
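
As an illustration of how a player might compare such timestamps, the sketch below computes the presentation offset between an audio unit and a video frame. It relies on the MPEG presentation timestamp being a 33-bit counter of a 90 kHz clock; the function names are illustrative, and the wraparound handling assumes the two timestamps are less than half the counter range apart.

```python
PTS_CLOCK_HZ = 90_000      # MPEG presentation timestamps count 90 kHz ticks
PTS_MODULUS = 1 << 33      # the counter is 33 bits wide and wraps around


def pts_diff_ticks(a: int, b: int) -> int:
    """a - b in ticks, treating the 33-bit counter as circular."""
    d = (a - b) % PTS_MODULUS
    return d - PTS_MODULUS if d >= PTS_MODULUS // 2 else d


def presentation_offset_ms(audio_pts: int, video_pts: int) -> float:
    """Positive when the audio unit is scheduled later than the video frame."""
    return pts_diff_ticks(audio_pts, video_pts) * 1000.0 / PTS_CLOCK_HZ


if __name__ == "__main__":
    # 3600 ticks at 90 kHz is one frame period at 25 frames per second.
    print(presentation_offset_ms(93_600, 90_000))
```

Such a comparison only reflects the timing as stamped. As noted above, any error introduced before the timestamps were applied is invisible to it.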
