FMLLR

In signal processing, feature-space maximum likelihood linear regression (fMLLR) is a global feature transform that is typically applied in a speaker-adaptive way: fMLLR transforms acoustic features into speaker-adapted features by multiplying them with a transformation matrix. In some literature, fMLLR is also known as constrained maximum likelihood linear regression (CMLLR).
fMLLR transformations are trained in a maximum likelihood sense on adaptation data. These transformations may be estimated in many ways, but only maximum likelihood estimation is considered in fMLLR. The fMLLR transformation is trained on a particular set of adaptation data, such that it maximizes the likelihood of that adaptation data given a current model-set.
This technique is a widely used approach for speaker adaptation in HMM-based speech recognition.
Later research has also shown that fMLLR-transformed features work well as input to DNN/HMM hybrid speech recognition models.
The advantages of fMLLR include the following:

The adaptation process can be performed in a pre-processing phase and is independent of the ASR training and decoding process.
This type of adapted feature can be fed to deep neural networks in place of the traditionally used mel-spectrogram in end-to-end speech recognition models.
The speaker adaptation performed by fMLLR can noticeably improve ASR performance; in reported experiments, fMLLR features generally outperform other features such as MFCC and FBANK coefficients.
fMLLR features can be computed efficiently with speech toolkits like Kaldi.

The main disadvantage of fMLLR:

When the amount of adaptation data is limited, the transformation matrices tend to overfit the given data.
Computing fMLLR transform

The fMLLR feature transform can be computed with the open-source speech toolkit Kaldi. The Kaldi script uses the standard estimation scheme described in Appendix B of the original paper, in particular Appendix B.1, "Direct method over rows".
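In the Kaldi formulation, fMLLR is an affine feature transform of the form $x \rightarrow A x + b$, which can be written in the form $x \rightarrow W x^{+}$, where $x^{+} = \begin{bmatrix} x \\ 1 \end{bmatrix}$ is the acoustic feature $x$ with a 1 appended. Note that this differs from some of the literature, where the 1 comes first, as $x^{+} = \begin{bmatrix} 1 \\ x \end{bmatrix}$.
The sufficient statistics stored are:
$$K = \sum_{t,j,m} \gamma_{j,m}(t)\, \Sigma_{jm}^{-1} \mu_{jm}\, x(t)^{+\top}$$
where $\Sigma_{jm}^{-1}$ is the inverse covariance matrix.
And for $0 \le i < D$, where $D$ is the feature dimension:
$$G^{(i)} = \sum_{t,j,m} \gamma_{j,m}(t)\, \frac{1}{\sigma^{2}_{j,m}(i)}\, x(t)^{+} x(t)^{+\top}$$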
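As a minimal NumPy sketch of the affine form described above, the transform can be applied to a matrix of frames by appending a 1 to each feature vector and multiplying by W = [A b]. The function names `apply_fmllr` and `to_one_first` and the toy numbers are illustrative assumptions, not part of Kaldi; the second helper only shows how the same transform would be laid out under the other convention, where the 1 comes first.

```python
import numpy as np

def apply_fmllr(feats, W):
    """Apply x -> W x^+ to every row of feats, with x^+ = [x; 1] (1 appended last)."""
    num_frames = feats.shape[0]
    x_plus = np.hstack([feats, np.ones((num_frames, 1))])  # shape (T, D+1)
    return x_plus @ W.T                                     # shape (T, D)

def to_one_first(W):
    """Reorder W = [A b] into [b A] for the convention x^+ = [1; x]."""
    return np.hstack([W[:, -1:], W[:, :-1]])

# Toy example with made-up numbers: D = 2, three frames.
A = np.array([[2.0, 0.0], [0.5, 1.0]])
b = np.array([1.0, -1.0])
W = np.hstack([A, b[:, None]])                              # shape (D, D+1)
feats = np.array([[0.0, 0.0], [1.0, 2.0], [-1.0, 3.0]])
adapted = apply_fmllr(feats, W)

# Same result under the "1 first" convention, after moving the bias column.
ones_first = np.hstack([np.ones((feats.shape[0], 1)), feats])
adapted_one_first = ones_first @ to_one_first(W).T
```

Both layouts describe the same affine map; only the position of the bias column in W changes.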
For a thorough review that explains fMLLR and the commonly used estimation techniques, see the original paper "Maximum likelihood linear transformations for HMM-based speech recognition".
Note that the Kaldi script that performs the fMLLR feature transform differs from the scheme in the original paper by using a column of the inverse in place of the cofactor row. In other words, the factor of the determinant is ignored, as it does not affect the resulting transform and can cause numerical underflow or overflow.
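The row-by-row estimation can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not Kaldi's implementation: it assumes diagonal-covariance Gaussians, posteriors given as a matrix, statistics K (one row per feature dimension) and G^(i) accumulated from features with a 1 appended last, and a fixed number of sweeps. All function names are hypothetical. Each row update solves a quadratic in a scalar alpha and keeps the root with the larger auxiliary value, using a column of the inverse of A rather than the cofactor row.

```python
import numpy as np

def accumulate_stats(feats, post, means, variances):
    """Accumulate beta, K and G^(i). feats: (T, D); post: (T, M); means, variances: (M, D)."""
    num_frames = feats.shape[0]
    x_plus = np.hstack([feats, np.ones((num_frames, 1))])     # x(t)^+ with the 1 last
    beta = post.sum()                                          # total occupancy
    K = (post @ (means / variances)).T @ x_plus                # (D, D+1)
    inv_var = post @ (1.0 / variances)                         # (T, D)
    G = np.einsum("ti,tj,tk->ijk", inv_var, x_plus, x_plus)    # (D, D+1, D+1)
    return beta, K, G

def auxiliary(W, beta, K, G):
    """beta * log|det A| + sum_i (w_i . k_i - 0.5 * w_i G^(i) w_i^T)."""
    _, logdet = np.linalg.slogdet(W[:, :-1])
    quad = np.einsum("ij,ijk,ik->", W, G, W)
    return beta * logdet + np.sum(W * K) - 0.5 * quad

def estimate_fmllr(beta, K, G, num_iters=10):
    """Coordinate ascent over the rows of W, starting from the identity transform."""
    D = K.shape[0]
    W = np.hstack([np.eye(D), np.zeros((D, 1))])
    for _ in range(num_iters):
        for i in range(D):
            # Column i of A^{-1} (in place of the cofactor row), padded with 0 for the bias.
            q = np.append(np.linalg.inv(W[:, :-1])[:, i], 0.0)
            G_inv = np.linalg.inv(G[i])
            a = q @ G_inv @ q
            b = q @ G_inv @ K[i]
            root = np.sqrt(b * b + 4.0 * a * beta)
            candidates = [(-b + root) / (2.0 * a), (-b - root) / (2.0 * a)]
            # Keep the root of a*alpha^2 + b*alpha - beta = 0 with the larger objective.
            alpha = max(candidates,
                        key=lambda r: beta * np.log(abs(r * a + b)) - 0.5 * a * r * r)
            W[i] = G_inv @ (alpha * q + K[i])
    return W

# Synthetic check: features are a scaled and shifted version of model-matched data.
rng = np.random.default_rng(0)
D, M, T = 3, 2, 400
means = rng.normal(size=(M, D))
variances = rng.uniform(0.5, 1.5, size=(M, D))
labels = rng.integers(0, M, size=T)
clean = means[labels] + rng.normal(size=(T, D)) * np.sqrt(variances[labels])
feats = 2.0 * clean + 1.0                  # simulated speaker mismatch
post = np.eye(M)[labels]                   # hard alignments used as posteriors
beta, K, G = accumulate_stats(feats, post, means, variances)
W_identity = np.hstack([np.eye(D), np.zeros((D, 1))])
W = estimate_fmllr(beta, K, G)
print(auxiliary(W_identity, beta, K, G), auxiliary(W, beta, K, G))
```

Because each row update maximizes the auxiliary function with the other rows held fixed, the printed value for the estimated W should not be lower than the value for the identity transform.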

Comparing with other features or transforms

Experimental results show that using fMLLR features in speech recognition generally yields an improvement over other acoustic features on various commonly used benchmark datasets.
In particular, fMLLR features generally outperform MFCC and FBANK coefficients, mainly because of the speaker adaptation that fMLLR performs.
The following phoneme error rates (PER, in %) are reported for the TIMIT test set with various neural architectures:

Models/Features	MFCC	FBANK	fMLLR
MLP	18.2	18.7	16.7
RNN	17.7	17.2	15.9
LSTM	15.1	14.3	14.5
GRU	16.0	15.2	14.9
Li-GRU	15.3	14.9	14.2

fMLLR features outperform MFCC and FBANK coefficients for most of the architectures; the one exception in the table is the LSTM, where FBANK features give a slightly lower error rate (14.3 vs. 14.5).
The MLP serves as a simple baseline, while RNN, LSTM, and GRU are all well-known recurrent models.
The Li-GRU architecture is based on a single gate and thus saves 33% of the computations compared with a standard GRU model; it is also designed to mitigate the vanishing-gradient problem of recurrent models.
The best performance overall (14.2) is obtained with the Li-GRU model on fMLLR features.
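The comparison can be checked with a few lines of Python. The numbers are copied from the table above; the helper names are illustrative, and "relative reduction" here means the PER decrease of fMLLR with respect to MFCC as a percentage of the MFCC value.

```python
# PER (%) on the TIMIT test set, copied from the table above.
per = {
    "MLP":    {"MFCC": 18.2, "FBANK": 18.7, "fMLLR": 16.7},
    "RNN":    {"MFCC": 17.7, "FBANK": 17.2, "fMLLR": 15.9},
    "LSTM":   {"MFCC": 15.1, "FBANK": 14.3, "fMLLR": 14.5},
    "GRU":    {"MFCC": 16.0, "FBANK": 15.2, "fMLLR": 14.9},
    "Li-GRU": {"MFCC": 15.3, "FBANK": 14.9, "fMLLR": 14.2},
}

def best_feature(scores):
    """Return the feature type with the lowest error rate."""
    return min(scores, key=scores.get)

def relative_reduction(baseline, value):
    """Relative error reduction in percent with respect to the baseline."""
    return 100.0 * (baseline - value) / baseline

for model, scores in per.items():
    rel = relative_reduction(scores["MFCC"], scores["fMLLR"])
    print(f"{model}: best = {best_feature(scores)}, "
          f"fMLLR vs MFCC = {rel:.1f}% relative PER reduction")
```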

Extract fMLLR features with Kaldi

fMLLR can be extracted as reported in the s5 recipe of Kaldi.
Kaldi scripts can extract fMLLR features from different datasets; below are basic example steps for extracting fMLLR features from the open-source speech corpus LibriSpeech.
Note that the instructions below are for the subsets train-clean-100, train-clean-360, dev-clean, and test-clean,
but they can be easily extended to support the other sets dev-other, test-other, and train-other-500.

These instructions are based on code provided in a repository that contains Kaldi recipes for the LibriSpeech corpus. To execute the fMLLR feature extraction process, replace the files under $KALDI_ROOT/egs/librispeech/s5/ with the files in the repository.
Install Kaldi.
Install .
If running on a single machine, change the following lines in $KALDI_ROOT/egs/librispeech/s5/cmd.sh, replacing queue.pl with run.pl:

export train_cmd="run.pl --mem 2G"
export decode_cmd="run.pl --mem 4G"
export mkgraph_cmd="run.pl --mem 8G"

Change the data path in run.sh to your LibriSpeech data path; the directory LibriSpeech/ should be under that path. For example:

data=/media/user/SSD # example path

Install flac with: sudo apt-get install flac
Run the Kaldi recipe run.sh for LibriSpeech at least until Stage 13; for simplicity, you can use the modified version of the script.
Copy exp/tri4b/trans.* files into exp/tri4b/decode_tgsmall_train_clean_*/ with the following command:

mkdir exp/tri4b/decode_tgsmall_train_clean_100 && cp exp/tri4b/trans.* exp/tri4b/decode_tgsmall_train_clean_100/

Compute the fMLLR features by running the following script (the script can also be downloaded):
#!/bin/bash

. ./cmd.sh ## You'll want to change cmd.sh to something that will work on your system.
. ./path.sh ## Source the tools/utils

gmmdir=exp/tri4b

for chunk in dev_clean test_clean train_clean_100 train_clean_360 ; do
    dir=fmllr/$chunk
    # Apply the speaker transforms stored in the decode directory to the features.
    steps/nnet/make_fmllr_feats.sh --nj 10 --cmd "$train_cmd" \
        --transform-dir $gmmdir/decode_tgsmall_$chunk \
        $dir data/$chunk $gmmdir $dir/log $dir/data || exit 1
    # Per-speaker CMVN statistics on the fMLLR features.
    compute-cmvn-stats --spk2utt=ark:data/$chunk/spk2utt scp:fmllr/$chunk/feats.scp ark:$dir/data/cmvn_speaker.ark
done

Compute alignments using:
# alignments on dev_clean, test_clean, train_clean_100, and train_clean_360

steps/align_fmllr.sh --nj 10 data/dev_clean data/lang exp/tri4b exp/tri4b_ali_dev_clean
steps/align_fmllr.sh --nj 10 data/test_clean data/lang exp/tri4b exp/tri4b_ali_test_clean
steps/align_fmllr.sh --nj 30 data/train_clean_100 data/lang exp/tri4b exp/tri4b_ali_clean_100
steps/align_fmllr.sh --nj 30 data/train_clean_360 data/lang exp/tri4b exp/tri4b_ali_clean_360

Apply CMVN and dump the fMLLR features to new .ark files (the script can also be downloaded):
#!/bin/bash

data=/user/kaldi/egs/librispeech/s5 ## You'll want to change this path to something that will work on your system.

rm -rf $data/fmllr_cmvn/
mkdir $data/fmllr_cmvn/

for part in dev_clean test_clean train_clean_100 train_clean_360; do
    mkdir $data/fmllr_cmvn/$part/
    apply-cmvn --utt2spk=ark:$data/fmllr/$part/utt2spk ark:$data/fmllr/$part/data/cmvn_speaker.ark scp:$data/fmllr/$part/feats.scp ark:- | add-deltas --delta-order=0 ark:- ark:$data/fmllr_cmvn/$part/fmllr_cmvn.ark
done

du -sh $data/fmllr_cmvn/*
echo "Done!"
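To make the normalization step concrete, the following NumPy sketch mimics what per-speaker cepstral mean (and optionally variance) normalization does conceptually. It is an illustration under assumptions, not Kaldi's implementation: `apply_speaker_cmvn` and its arguments are hypothetical names, features are held in plain dictionaries, and variance normalization is off by default on the assumption that apply-cmvn, when run without --norm-vars as in the script above, only subtracts the mean.

```python
import numpy as np

def apply_speaker_cmvn(feats_by_utt, utt2spk, norm_vars=False):
    """Subtract each speaker's mean (and optionally divide by the std) from its utterances.

    feats_by_utt: dict utterance ID -> (T, D) array; utt2spk: dict utterance ID -> speaker ID.
    """
    # Pool all frames of each speaker to get speaker-level statistics.
    frames_by_spk = {}
    for utt, feats in feats_by_utt.items():
        frames_by_spk.setdefault(utt2spk[utt], []).append(feats)
    stats = {}
    for spk, chunks in frames_by_spk.items():
        frames = np.vstack(chunks)
        stats[spk] = (frames.mean(axis=0), frames.std(axis=0))

    normalized = {}
    for utt, feats in feats_by_utt.items():
        mean, std = stats[utt2spk[utt]]
        out = feats - mean
        if norm_vars:
            out = out / np.maximum(std, 1e-10)   # guard against zero variance
        normalized[utt] = out
    return normalized

# Toy data: two speakers with different offsets (made-up values).
rng = np.random.default_rng(0)
feats_by_utt = {
    "spk1-utt1": rng.normal(loc=3.0, size=(50, 4)),
    "spk1-utt2": rng.normal(loc=3.0, size=(30, 4)),
    "spk2-utt1": rng.normal(loc=-2.0, size=(40, 4)),
}
utt2spk = {"spk1-utt1": "spk1", "spk1-utt2": "spk1", "spk2-utt1": "spk2"}
normalized = apply_speaker_cmvn(feats_by_utt, utt2spk)
```

After normalization, the frames pooled over each speaker have zero mean in every dimension, which removes speaker-level offsets that remain after the fMLLR transform.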

Use the Python script to convert the Kaldi-generated .ark features to .npy files for your own dataloader; an example is provided:

python ark2libri.py
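The contents of ark2libri.py are not shown here. As a rough sketch of what such a conversion could look like, the following assumes the third-party kaldiio package and its `load_ark` reader, which yields (utterance ID, matrix) pairs. The function names and the output layout (one .npy file per utterance) are illustrative choices, not necessarily those of the original script.

```python
import os
import numpy as np

def save_as_npy(utt_feature_pairs, out_dir):
    """Write each (utterance ID, feature matrix) pair to <out_dir>/<utterance ID>.npy."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for utt_id, feats in utt_feature_pairs:
        path = os.path.join(out_dir, f"{utt_id}.npy")
        np.save(path, np.asarray(feats, dtype=np.float32))
        paths.append(path)
    return paths

def convert_ark(ark_path, out_dir):
    """Read a Kaldi .ark file with kaldiio (assumed dependency) and dump .npy files."""
    import kaldiio  # imported lazily so save_as_npy works without kaldiio installed
    return save_as_npy(kaldiio.load_ark(ark_path), out_dir)
```

With kaldiio installed, a call such as `convert_ark("fmllr_cmvn/dev_clean/fmllr_cmvn.ark", "fmllr_npy/dev_clean")` would write one file per utterance; both paths are examples and should be adapted to your setup.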
