Joseph Mariani


Joseph Mariani is a French computer science researcher and pioneer in the field of speech processing.

Education and career

After obtaining a Doctor of Engineering degree in 1977 from the Pierre and Marie Curie University, Joseph Mariani joined the National Center for Scientific Research (CNRS) as a researcher in the Computer Science Laboratory for Mechanics and Engineering Sciences (LIMSI). He was head of the Speech Communication group from 1982 to 1985. He then left for the United States, where he worked as a visiting researcher at the IBM T.J. Watson Research Center. Back in France, he was in charge of the Human-Machine Communication Department from 1987 to 2001 and was Director of LIMSI from 1989 to 2000. Later, he was named Director of the Department of Information and Communication Technologies at the Ministry of Research. Within the Ministry, he created the Techno-Langue and Techno-Vision programs on the development and evaluation of language and vision technologies.
During this time, he was named President of the European Language Resources Association and was on the boards of several organizations including the ANFr, the IGN, the OST and INRIA. He participated in the creation of many associations and international conferences such as ELSNET, COCOSDA, ESCA/ISCA, ELRA and LREC.
From 2006 through December 2013, he was director of the Institute for Multilingual and Multimedia Information, a CNRS International Joint Unit that was part of the Quaero Program and a collaboration between LIMSI, the Karlsruhe Institute of Technology and RWTH Aachen University. In February 2016, he was named Emeritus Senior Researcher by the CNRS.

Research areas

Mariani’s research activities mainly concern Human-Machine Communication, both spoken and written, within the domain of Natural Language Processing.
Early in his career, he concentrated on automatic speech recognition and signal processing.
In the early 1980s, within the evaluation activities of the NATO RSG-10 working group, Joseph Mariani was already using the name “evaluation paradigm” to denote an open evaluation effort in which systems are treated as black boxes and assessed quantitatively with performance metrics on shared data, after which the results are combined and compared. This kind of effort is now referred to as a “shared task”. The evaluation paradigm allowed for the continuous improvement of speech processing and the eventual appearance of voice assistants such as Siri, Cortana, Echo and Google Voice.
He was involved in NIST becoming the center of automatic speech and text processing evaluation activities in the US in 1987. In 1994, with Robert Martin, then Director of the Institut National de la Langue Française, he organized the first francophone open evaluation of morphosyntactic analyzers of French text, with the support of two CNRS departments, Humanities and Social Sciences and Engineering Sciences. The same year, he helped start a program in the field of linguistic engineering, run by Aupelf-Uref and coordinated by the Francophone Network on Language Engineering, to strengthen francophone activities in this area. The program encompassed Concerted Research Actions, a major action built on the text and speech evaluation paradigm. In the early 2000s, he contributed to a major publication on automatic speech processing: Spoken Language Processing.
Between 2000 and 2010, his activities focused on multilingualism with the development of language matrices for the 24 languages of the European Union. Later, he worked on the publication of the META-NET White Paper Series in order to establish an inventory of the resources available for French.
Since 2010, he has worked on the automatic processing of regional languages and is interested in ethical problems related to the use of computers in daily life.
Since 2013, he has collected and studied articles from the whole field of natural language processing, including speech processing and information retrieval. This work has been carried out within the framework of the NLP4NLP project, which began with the ISCA archives, then added those of LREC, TALN and IEEE, and later other conferences and journals, such as TREC. After this collection phase, which for the first time gathered a major part of the publications in the field, the publications were automatically analyzed from several points of view. First, all of the technical terms were extracted and compiled in a lexicon. Second, each lexical entry was attributed to the author who first used it, an innovation in the analysis of scientific publications. The goal was to understand the mechanisms that influence the domain and thus to identify current and future trends. The work covered the creation of technical terms and their evolution, for example that of the term “neural networks”. Another component was predictive analysis, which consists of building a statistical representation of the use of technical terms in order to predict their use over the following four years. The study also examined the impact of one conference on another, as well as plagiarism and reuse in scientific publications. A full synthesis of the NLP4NLP project was published in 2019 as a two-part article in Frontiers in Research Metrics and Analytics.
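The attribution step described above, in which each lexicon entry is credited to the author who first used it, can be illustrated with a minimal Python sketch. This is not the NLP4NLP implementation: the `Paper` record, the `first_users` function, the plain substring matching and the sample data are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Paper:
    """Hypothetical record for one publication in the collected corpus."""
    year: int
    authors: Tuple[str, ...]
    text: str


def first_users(papers: Iterable[Paper],
                lexicon: List[str]) -> Dict[str, Tuple[int, Tuple[str, ...]]]:
    """Map each lexicon term to the (year, authors) of the earliest paper using it.

    Assumes lowercase lexicon terms and simple substring matching; a real
    system would need proper term extraction and normalization.
    """
    first: Dict[str, Tuple[int, Tuple[str, ...]]] = {}
    # Process papers in chronological order so the first match is the earliest.
    for paper in sorted(papers, key=lambda p: p.year):
        lowered = paper.text.lower()
        for term in lexicon:
            if term not in first and term in lowered:
                first[term] = (paper.year, paper.authors)
    return first


# Invented sample data, for illustration only.
papers = [
    Paper(1992, ("C. Author",), "We revisit the neural network approach."),
    Paper(1988, ("B. Author",), "A neural network for phoneme recognition."),
    Paper(1990, ("A. Author",), "Hidden Markov model training for speech."),
]
lexicon = ["neural network", "hidden markov model"]

attribution = first_users(papers, lexicon)
for term, (year, authors) in attribution.items():
    print(term, year, ", ".join(authors))
```

Sorting by year before scanning keeps the logic simple: the first paper in which a term is seen is, by construction, the earliest one in the corpus.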

Distinctions

Joseph Mariani was named Knight of the French National Order of Merit and Officer of the Ordre des Arts et des Lettres. He is an honorary member of the Francophone Association for Speech Communication and a fellow and life member of ISCA, from which he received the Special Service Medal in 1999. He has been honorary president of ELRA since 2010.