Finite-state transducer

A finite-state transducer (FST) is a finite-state machine with two memory tapes, following the terminology for Turing machines: an input tape and an output tape. This contrasts with an ordinary finite-state automaton (FSA), which has a single tape. An FST maps between two sets of symbols and is therefore more general than an FSA: an FSA defines a formal language by defining a set of accepted strings, while an FST defines a relation between sets of strings.
An FST reads a string on its input tape and writes a corresponding string on its output tape; because it may be nondeterministic, one input string can be related to several output strings. An FST can be thought of as a translator between the strings of two sets.
In morphological parsing, for example, the input to the FST is a string of letters and the output is a string of morphemes.

Overview

An automaton can be said to recognize a string if we view the content of its tape as input. In other words, the automaton computes a function that maps strings into the set {0,1}. Alternatively, we can say that an automaton generates strings, which means viewing its tape as an output tape. On this view, the automaton generates a formal language, which is a set of strings. The two views of automata are equivalent: the function that the automaton computes is precisely the indicator function of the set of strings it generates. The class of languages generated by finite automata is known as the class of regular languages.
The two tapes of a transducer are typically viewed as an input tape and an output tape. On this view, a transducer is said to transduce the contents of its input tape to its output tape, by accepting a string on its input tape and generating another string on its output tape. It may do so nondeterministically and it may produce more than one output for each input string. A transducer may also produce no output for a given input string, in which case it is said to reject the input. In general, a transducer computes a relation between two formal languages.
Each string-to-string finite-state transducer relates the input alphabet Σ to the output alphabet Γ. Relations R on Σ*×Γ* that can be implemented as finite-state transducers are called rational relations. Rational relations that are partial functions, i.e. that relate every input string from Σ* to at most one string from Γ*, are called rational functions.
Finite-state transducers are often used for phonological and morphological analysis in natural language processing research and applications. Pioneers in this field include Ronald Kaplan, Lauri Karttunen, Martin Kay and Kimmo Koskenniemi.
A common way of using transducers is in a so-called "cascade", where transducers for various operations are combined into a single transducer by repeated application of the composition operator.

Formal construction

Formally, a finite transducer T is a 6-tuple (Q, Σ, Γ, I, F, δ) such that:

Q is a finite set, the set of states;
Σ is a finite set, called the input alphabet;
Γ is a finite set, called the output alphabet;
I is a subset of Q, the set of initial states;
F is a subset of Q, the set of final states; and
δ ⊆ Q × (Σ ∪ {ε}) × (Γ ∪ {ε}) × Q (where ε is the empty string) is the transition relation.

We can view (Q, δ) as a labeled directed graph, known as the transition graph of T: the set of vertices is Q, and (q, a, b, r) ∈ δ means that there is a labeled edge going from vertex q to vertex r. We also say that a is the input label and b the output label of that edge.
NOTE: This definition of finite transducer is also called a letter transducer; alternative definitions are possible, but they can all be converted into transducers following this one.
Define the extended transition relation δ* as the smallest set such that:

δ ⊆ δ*;
(q, ε, ε, q) ∈ δ* for all q ∈ Q; and
whenever (q, x, y, r) ∈ δ* and (r, a, b, s) ∈ δ, then (q, xa, yb, s) ∈ δ*.

The extended transition relation is essentially the reflexive transitive closure of the transition graph that has been augmented to take edge labels into account. The elements of δ* are known as paths. The edge labels of a path are obtained by concatenating the edge labels of its constituent transitions in order.
The behavior of the transducer T is the rational relation [T] defined as follows: x [T] y if and only if there exist i ∈ I and f ∈ F such that (i, x, y, f) ∈ δ*. This is to say that T transduces a string x ∈ Σ* into a string y ∈ Γ* if there exists a path from an initial state to a final state whose input label is x and whose output label is y.
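The definition of the behavior [T] can be illustrated with a small Python sketch that searches the paths of a letter transducer. The function name `transduce`, the tuple encoding of transitions, the use of the empty string for ε, and the `max_out_len` guard (needed only to cut off ε-input cycles that keep emitting output) are assumptions made for this illustration, not part of the formal definition.

```python
from collections import deque

EPS = ""  # the empty string stands in for ε (an encoding choice of this sketch)


def transduce(initial, final, delta, x, max_out_len=64):
    """Return the set of all y such that x [T] y.

    delta is a set of transitions (q, a, b, r), where a and b are
    single symbols or EPS. max_out_len is a hypothetical safety bound
    that stops ε-input cycles from producing ever longer outputs.
    """
    # Index the transitions by their source state.
    outgoing = {}
    for q, a, b, r in delta:
        outgoing.setdefault(q, []).append((a, b, r))

    results = set()
    # A configuration is (state, input symbols consumed, output so far).
    start = [(q, 0, "") for q in initial]
    seen = set(start)
    queue = deque(start)
    while queue:
        q, i, y = queue.popleft()
        if i == len(x) and q in final:
            results.add(y)  # a path from I to F labeled (x, y) exists
        for a, b, r in outgoing.get(q, []):
            if a == EPS:
                nxt = (r, i, y + b)        # consume no input
            elif i < len(x) and x[i] == a:
                nxt = (r, i + 1, y + b)    # consume one input symbol
            else:
                continue
            if len(nxt[2]) <= max_out_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return results


# Toy transducer with one state: 'a' is rewritten as 'b' or 'c', 'b' is deleted.
I, F = {0}, {0}
DELTA = {(0, "a", "b", 0), (0, "a", "c", 0), (0, "b", EPS, 0)}
```

With this toy transducer, `transduce(I, F, DELTA, "ab")` yields two outputs because the machine is nondeterministic, while an input containing a symbol with no matching transition yields the empty set, i.e. the input is rejected.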

Weighted automata

Finite-state transducers can be weighted, where each transition is labelled with a weight in addition to the input and output labels. A weighted finite-state transducer (WFST) over a set K of weights can be defined similarly to an unweighted one as an 8-tuple T = (Q, Σ, Γ, I, F, E, λ, ρ), where:

Q, Σ, Γ, I, F are defined as above;
E ⊆ Q × (Σ ∪ {ε}) × (Γ ∪ {ε}) × Q × K (where ε is the empty string) is the finite set of transitions;
λ : I → K maps initial states to weights;
ρ : F → K maps final states to weights.

In order to make certain operations on WFSTs well-defined, it is convenient to require the set of weights to form a semiring. Two typical semirings used in practice are the log semiring and the tropical semiring. Unweighted automata may be regarded as having weights in the Boolean semiring.
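As a rough illustration of why a semiring is convenient, the following Python sketch combines weights along a path with the semiring product ⊗ and combines alternative paths with the semiring sum ⊕. The `Semiring` class, the helper `total_weight`, and the representation of a path as a plain list of weights are assumptions of this sketch; the initial and final weights λ and ρ are left out for brevity. The tropical semiring is taken to be (min, +) and the log semiring to use ⊕(a, b) = −log(e^−a + e^−b), which are the usual conventions.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    plus: Callable[[Any, Any], Any]   # ⊕: combines alternative paths
    times: Callable[[Any, Any], Any]  # ⊗: combines weights along one path
    zero: Any                         # identity element of ⊕
    one: Any                          # identity element of ⊗


def log_plus(a, b):
    """-log(exp(-a) + exp(-b)), computed in a numerically stable way."""
    if a == math.inf:
        return b
    if b == math.inf:
        return a
    return min(a, b) - math.log1p(math.exp(-abs(a - b)))


TROPICAL = Semiring(min, lambda a, b: a + b, math.inf, 0.0)
LOG = Semiring(log_plus, lambda a, b: a + b, math.inf, 0.0)
BOOLEAN = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)


def total_weight(sr, paths):
    """⊕ over all paths of the ⊗-product of the weights on each path."""
    total = sr.zero
    for weights in paths:
        w = sr.one
        for k in weights:
            w = sr.times(w, k)
        total = sr.plus(total, w)
    return total


# Two alternative paths for the same (input, output) pair.
PATHS = [[1.0, 2.0], [0.5, 4.0]]
```

In the tropical semiring `total_weight(TROPICAL, PATHS)` keeps only the cheaper of the two paths, whereas in the log semiring both paths contribute to the result. With Boolean weights the same function merely reports whether at least one path exists, which matches the unweighted case.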

Stochastic FST

Stochastic FSTs, whose transitions carry probabilities, are generally regarded as a form of weighted FST.

Operations on finite-state transducers

The following operations defined on finite automata also apply to finite transducers:

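Union. Given transducers T and S, there exists a transducer T ∪ S such that x [T ∪ S] y if and only if x [T] y or x [S] y.
Concatenation. Given transducers T and S, there exists a transducer T · S such that x [T · S] y if and only if there exist x1, x2, y1, y2 with x = x1x2, y = y1y2, x1 [T] y1 and x2 [S] y2.
Kleene closure. Given a transducer T, there exists a transducer T* with the following properties: (k1) ε [T*] ε; (k2) if w [T*] y and x [T] z, then wx [T*] yz; and x [T*] y does not hold unless mandated by (k1) or (k2).
Composition. Given a transducer T on alphabets Σ and Γ and a transducer S on alphabets Γ and Δ, there exists a transducer T ∘ S on Σ and Δ such that x [T ∘ S] z if and only if there exists a string y ∈ Γ* such that x [T] y and y [S] z. This operation extends to the weighted case.
Projection to an automaton. There are two projection functions: π1 preserves the input tape, and π2 preserves the output tape. The first projection, π1, is defined as follows: given a transducer T, there exists a finite automaton π1 T such that π1 T accepts x if and only if there exists a string y for which x [T] y. The second projection, π2, is defined similarly.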
Determinization. Given a transducer, we want to build an equivalent transducer that has a unique initial state and such that no two transitions leaving any state share the same input label. The powerset construction can be extended to transducers, or even weighted transducers, but sometimes fails to halt; indeed, some non-deterministic transducers do not admit equivalent deterministic transducers. Characterizations of determinizable transducers have been proposed along with efficient algorithms to test them: they rely on the semiring used in the weighted case as well as a general property on the structure of the transducer.
Weight pushing for the weighted case.
Minimization for the weighted case.
Removal of epsilon-transitions.
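Composition of unweighted letter transducers can be sketched as a product construction. The Python below is one possible construction under stated assumptions, not a reference implementation: the dictionary layout, the names `compose`, `states_of` and `outputs`, and the empty string as ε are all choices of this sketch. It handles ε by letting one machine move while the other waits, which is adequate for unweighted transducers; the weighted case needs extra care to avoid counting redundant ε-paths and is not covered here.

```python
EPS = ""  # the empty string stands in for ε


def states_of(M):
    """Collect every state mentioned in the transducer M."""
    qs = set(M["initial"]) | set(M["final"])
    for q, _a, _b, r in M["delta"]:
        qs |= {q, r}
    return qs


def compose(T, S):
    """Build T ∘ S: states are pairs (state of T, state of S)."""
    QT, QS = states_of(T), states_of(S)
    delta = set()
    for q, a, b, r in T["delta"]:
        if b == EPS:
            # T writes nothing, so S stays where it is.
            for s in QS:
                delta.add(((q, s), a, EPS, (r, s)))
        else:
            # T writes b and S must read the same b.
            for s, b2, c, t in S["delta"]:
                if b2 == b:
                    delta.add(((q, s), a, c, (r, t)))
    for s, b, c, t in S["delta"]:
        if b == EPS:
            # S reads nothing, so T stays where it is.
            for q in QT:
                delta.add(((q, s), EPS, c, (q, t)))
    return {
        "initial": {(q, s) for q in T["initial"] for s in S["initial"]},
        "final": {(q, s) for q in T["final"] for s in S["final"]},
        "delta": delta,
    }


def outputs(M, x, max_out_len=64):
    """All y with x [M] y; max_out_len guards against ε-cycles."""
    todo = [(q, 0, "") for q in M["initial"]]
    seen, found = set(todo), set()
    while todo:
        q, i, y = todo.pop()
        if i == len(x) and q in M["final"]:
            found.add(y)
        for p, a, b, r in M["delta"]:
            if p != q:
                continue
            if a == EPS:
                nxt = (r, i, y + b)
            elif i < len(x) and x[i] == a:
                nxt = (r, i + 1, y + b)
            else:
                continue
            if len(nxt[2]) <= max_out_len and nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return found


# T maps 'a' to 'b' and deletes 'x'; S maps 'b' to 'c'.
T = {"initial": {"t0"}, "final": {"t0"},
     "delta": {("t0", "a", "b", "t0"), ("t0", "x", EPS, "t0")}}
S = {"initial": {"s0"}, "final": {"s0"},
     "delta": {("s0", "b", "c", "s0")}}
TS = compose(T, S)
```

Here `outputs(TS, "axa")` agrees with feeding the input through T and then feeding each result through S, which is the defining property of composition and the basis of the cascades mentioned in the overview.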

Additional properties of finite-state transducers

It is decidable whether the relation [T] of a transducer T is empty.
It is decidable whether there exists a string y such that x [T] y for a given string x.
It is undecidable whether two transducers are equivalent. Equivalence is however decidable in the special case where the relation [T] of a transducer T is a (partial) function.
If one defines the alphabet of labels L = (Σ ∪ {ε}) × (Γ ∪ {ε}), finite-state transducers are isomorphic to nondeterministic finite automata (NDFA) over the alphabet L, and may therefore be determinized (as deterministic finite automata over the alphabet L) and subsequently minimized so that they have the minimum number of states.
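The emptiness test reduces to graph reachability: the relation of a transducer is non-empty exactly when some final state can be reached from some initial state in the transition graph, because every such path carries some pair of labels. A minimal Python sketch follows; the function name `is_empty` and the tuple encoding of transitions are assumptions of this example.

```python
from collections import deque


def is_empty(initial, final, delta):
    """Return True iff no final state is reachable from an initial state.

    delta is a set of transitions (q, a, b, r); the labels a and b are
    ignored because only reachability matters for emptiness.
    """
    successors = {}
    for q, _a, _b, r in delta:
        successors.setdefault(q, set()).add(r)

    seen = set(initial)
    queue = deque(initial)
    while queue:
        q = queue.popleft()
        if q in final:
            return False  # a path from I to F exists, so [T] is non-empty
        for r in successors.get(q, ()):
            if r not in seen:
                seen.add(r)
                queue.append(r)
    return True
```

The search visits each state and transition at most once, so the check runs in time linear in the size of the transducer.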

Applications

Context-sensitive rewriting rules of the form a → b / c _ d, used in linguistics to model phonological rules and sound change, are computationally equivalent to finite-state transducers, provided that application is nonrecursive, i.e. the rule is not allowed to rewrite the same substring twice.
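To make the correspondence concrete, here is a hedged Python sketch of one possible transducer for the single rule a → b / c _ d. It assumes a toy alphabet {a, b, c, d}, contexts that are checked on the input side, and a nondeterministic machine that guesses whether a 'd' follows and then verifies the guess. The state numbering and the names `build_rule_transducer` and `apply_rule` are hypothetical and exist only for this illustration.

```python
ALPHABET = "abcd"  # assumed toy alphabet


def build_rule_transducer():
    """Letter transducer (ε-free) for the rule a -> b / c _ d.

    States:
      0: the previous input symbol was not 'c' (also the start state)
      1: the previous input symbol was 'c'
      2: 'a' was rewritten to 'b'; the next symbol must be 'd'
      3: 'a' was copied after 'c'; the next symbol must not be 'd'
    """
    delta = set()
    for x in ALPHABET:
        target = 1 if x == "c" else 0
        delta.add((0, x, x, target))      # plain copying
        if x != "a":
            delta.add((1, x, x, target))  # after 'c', copy anything but 'a'
        if x != "d":
            delta.add((3, x, x, target))  # the guess "no 'd' follows" holds
    delta.add((1, "a", "b", 2))  # guess that 'd' follows: rewrite
    delta.add((1, "a", "a", 3))  # guess that no 'd' follows: copy
    delta.add((2, "d", "d", 0))  # verify the first guess
    initial, final = {0}, {0, 1, 3}  # state 2 still owes a 'd'
    return initial, final, delta


def apply_rule(x):
    """Return the set of outputs the transducer relates to input x."""
    initial, final, delta = build_rule_transducer()
    configs = {(q, "") for q in initial}
    for sym in x:
        configs = {
            (r, y + b)
            for (q, y) in configs
            for (p, a, b, r) in delta
            if p == q and a == sym
        }
    return {y for (q, y) in configs if q in final}
```

For this rule the transducer is functional: every input over the assumed alphabet has exactly one output, with 'a' replaced by 'b' only where it stands between 'c' and 'd' in the input.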
Weighted FSTs have found applications in natural language processing, including machine translation, and in machine learning. An implementation for part-of-speech tagging can be found as one component of the OpenGrm library.
