Speex

Speex is a free-software audio compression codec tuned for the reproduction of human speech, suitable for VoIP applications and podcasts. It is based on the CELP speech coding algorithm. Speex claims to be free of any patent restrictions and is licensed under the revised BSD license. It may be used with the Ogg container format or transmitted directly over UDP/RTP. It may also be used with the FLV container format.
The Speex designers see their project as complementary to the Vorbis general-purpose audio compression project.
Speex is a lossy format, i.e. quality is permanently degraded to reduce file size.
The Speex project was created on February 13, 2002. The first development versions of Speex were released under the LGPL, but as of version 1.0 beta 1, Speex is released under Xiph's version of the BSD license. Speex 1.0 was announced on March 24, 2003, after a year of development. The latest stable version of the Speex encoder and decoder is 1.2.0.
Xiph.Org now considers Speex obsolete; its successor is the more modern Opus codec, which surpasses its performance in most areas except at the lowest sample rates.

Description

Speex is targeted at voice over IP and file-based compression. The design goal was a codec optimized for high-quality speech at low bit rates. To achieve this, the codec supports multiple bit rates as well as ultra-wideband, wideband and narrowband operation. Since Speex was designed for VoIP instead of cell phone use, the codec must be robust to lost packets, but not to corrupted ones. All this led to the choice of code excited linear prediction (CELP) as the encoding technique for Speex. One of the main reasons is that CELP has long proven that it can do the job and scale well to both low and high bit rates.
The main characteristics can be summarized as follows:

Free software/open-source, patent and royalty-free.
Integration of narrowband and wideband in the same bit-stream.
Wide range of bit rates available.
Dynamic bit rate switching and variable bit-rate.
Voice activity detection.
Variable complexity.
Ultra-wideband mode at 32 kHz.
Intensity stereo encoding option.

Features

;Sampling rate: Speex is mainly designed for three different sampling rates: 8 kHz, 16 kHz, and 32 kHz. These are respectively referred to as narrowband, wideband and ultra-wideband.
;Quality: Speex encoding is controlled most of the time by a quality parameter that ranges from 0 to 10. In constant bit-rate operation, the quality parameter is an integer, while for variable bit-rate, the parameter is a real number.
;Complexity: With Speex, it is possible to vary the complexity allowed for the encoder. This is done by controlling how the search is performed with an integer ranging from 1 to 10, in a way similar to the -1 to -9 options of the gzip compression utility. For normal use, the noise level at complexity 1 is between 1 and 2 dB higher than at complexity 10, but the CPU requirements for complexity 10 are about five times higher than for complexity 1. In practice, the best trade-off is between complexity 2 and 4, though higher settings are often useful when encoding non-speech sounds like DTMF tones, or if encoding is not in real time.
;Variable bit-rate: Variable bit-rate (VBR) allows a codec to change its bit rate dynamically to adapt to the "difficulty" of the audio being encoded. In the case of Speex, sounds like vowels and high-energy transients require a higher bit rate to achieve good quality, while fricatives can be coded adequately with fewer bits. For this reason, VBR can achieve a lower bit rate for the same quality, or better quality for a given bit rate. Despite its advantages, VBR has three main drawbacks. First, by only specifying quality, there is no guarantee about the final average bit-rate. Second, for some real-time applications like voice over IP, what counts is the maximum bit-rate, which must be low enough for the communication channel. Third, encryption of VBR-encoded speech may not ensure complete privacy, as phrases can still be identified, at least in a controlled setting with a small dictionary of phrases, by analysing the pattern of variation of the bit rate.
;Average bit-rate: Average bit-rate (ABR) solves one of the problems of VBR, as it dynamically adjusts VBR quality in order to meet a specific target bit-rate. Because the quality/bit-rate is adjusted in real time, the overall quality will be slightly lower than that obtained by encoding in VBR with exactly the right quality setting to meet the target average bit-rate.
;Voice activity detection: When enabled, voice activity detection (VAD) detects whether the audio being encoded is speech or silence/background noise. VAD is always implicitly activated when encoding in VBR, so the option is only useful in non-VBR operation. In this case, Speex detects non-speech periods and encodes them with just enough bits to reproduce the background noise. This is called "comfort noise generation". The last version in which VAD worked properly was 1.1.12; since version 1.2 it has been replaced by a simple "any activity detection".
;Discontinuous transmission: Discontinuous transmission (DTX) is an addition to VAD/VBR operation which allows transmission to stop completely when the background noise is stationary. In a file, 5 bits are used for each missing frame.
;Perceptual enhancement: Perceptual enhancement is a part of the decoder which, when turned on, tries to reduce the noise produced by the coding/decoding process. In most cases, perceptual enhancement makes the sound further from the original objectively, but in the end it still sounds better.
;Algorithmic delay: Every codec introduces a delay in the transmission. For Speex, this delay is equal to the frame size, plus some amount of "look-ahead" required to process each frame. In narrowband operation, the delay is 30 ms, while for wideband, the delay is 34 ms. These values do not account for the CPU time it takes to encode or decode the frames.
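
Several of the figures above follow from simple frame arithmetic. The sketch below assumes a 20 ms frame duration, which comes from the Speex documentation rather than from the text above; the function and constant names are illustrative and are not part of any Speex API. It derives the number of samples per frame at each sampling rate, the look-ahead implied by the quoted algorithmic delays, and the bit rate of the 5-bit markers written to a file for frames skipped by discontinuous transmission.

```python
# Frame arithmetic for the figures quoted in the feature list.
# Assumption: Speex frames last 20 ms (taken from the Speex documentation,
# not stated in the text above). All names here are illustrative.

FRAME_MS = 20

SAMPLE_RATES_HZ = {
    "narrowband": 8000,
    "wideband": 16000,
    "ultra-wideband": 32000,
}

# Algorithmic delays quoted in the text, in milliseconds.
ALGORITHMIC_DELAY_MS = {"narrowband": 30, "wideband": 34}

# Bits written to a file for each frame skipped by discontinuous transmission.
DTX_BITS_PER_MISSING_FRAME = 5


def samples_per_frame(mode: str) -> int:
    """Number of PCM samples in one frame for the given mode."""
    return SAMPLE_RATES_HZ[mode] * FRAME_MS // 1000


def look_ahead_ms(mode: str) -> int:
    """Look-ahead implied by: delay = frame size + look-ahead."""
    return ALGORITHMIC_DELAY_MS[mode] - FRAME_MS


def dtx_file_bitrate_bps() -> float:
    """Bit rate of a stretch of file made up only of skipped frames."""
    frames_per_second = 1000 / FRAME_MS
    return DTX_BITS_PER_MISSING_FRAME * frames_per_second


if __name__ == "__main__":
    for mode in SAMPLE_RATES_HZ:
        print(mode, samples_per_frame(mode), "samples per frame")
    for mode in ALGORITHMIC_DELAY_MS:
        print(mode, look_ahead_ms(mode), "ms look-ahead")
    print("DTX in a file:", dtx_file_bitrate_bps(), "bit/s")
```

Under the 20 ms assumption, a narrowband frame holds 160 samples, and a file consisting only of skipped frames costs 250 bit/s.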

Applications

There is a large base of applications supporting the Speex codec. Examples include:

Streaming applications such as teleconferencing
VoIP systems
Videogames
Audio processing applications

Most of these are based on the DirectShow filter or OpenACM codec on Microsoft Windows, or Xiph.org's reference implementation, libspeex, on Linux. There are also plugins for many audio players. See the plugin and software page on the speex.org site for more details.
The media type for Speex is audio/ogg while contained by Ogg, and audio/speex when transported through RTP or without container.
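
The media-type rule above can be written as a small lookup. This is a minimal sketch: the function name and the `container` labels ("ogg", "rtp", or `None` for a bare stream) are hypothetical and chosen only for illustration. Containers for which the text gives no media type, such as FLV, are rejected rather than guessed.

```python
from typing import Optional


def speex_media_type(container: Optional[str]) -> str:
    """Return the media type for Speex data in the given transport.

    `container` is a hypothetical label: "ogg", "rtp", or None for a
    bare stream with no container.
    """
    if container is None:
        # No container at all: plain Speex data.
        return "audio/speex"
    key = container.lower()
    if key == "ogg":
        return "audio/ogg"
    if key == "rtp":
        return "audio/speex"
    # The article states no media type for other containers (e.g. FLV).
    raise ValueError(f"no media type known for container {container!r}")


if __name__ == "__main__":
    print(speex_media_type("ogg"))
    print(speex_media_type("rtp"))
    print(speex_media_type(None))
```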
The United States Army's Land Warrior system, designed by General Dynamics, also uses Speex for VoIP on an EPLRS radio designed by Raytheon.
The Ear Bible is a single-ear headphone with a built-in Speex player with 1 GB of flash memory, preloaded with a recording of the New American Standard Bible.
ASL Safety & Security's Linux-based VIPA OS software, which is used in long line public address systems and voice alarm systems at major international air transport hubs and rail networks, also uses Speex.
The Rockbox project uses Speex for its voice interface. It can also play Speex files on supported players, such as the Apple iPod or the iRiver H10.
The Vernier LabQuest handheld data acquisition device for science education uses Speex for voice annotations created by students and teachers using either the built-in or an external microphone.
The Google Mobile App for iPhone currently incorporates Speex. It has also been suggested that the new Google voice search iPhone app is using Speex to transmit voice to Google servers for interpretation.
Adobe Flash Player supports Speex starting with Flash Player 10.0.12.36, released in October 2008. Because of some bugs in Flash Player, the first version recommended for Speex support is 10.0.22.87. Speex in Flash Player can be used for both kinds of communication, through Flash Media Server or P2P. Unlike Nellymoser audio, which was the only speech format in previous versions of Flash Player, Speex can be decoded or converted to any format. Speex can also be used in the Flash Video container format, starting with version 10 of the Video File Format Specification.
The JavaSonics ListenUp voice recorder uses Speex to compress voice messages that are recorded in a browser and then uploaded to a web server. Primary applications are language training, transcription and social networking.
Speex is used as the voice compression algorithm in the Siri voice assistant on the iPhone 4S. Since speech recognition occurs on Apple's servers, the Speex codec is used to minimize network bandwidth.
