The HAAWAII project, in association with other related projects supported by the EC, has had its proposal accepted to organize a special session at the Interspeech 2021 conference (30 August to 3 September 2021). Interspeech is a leading international conference, held annually, that focuses primarily on research into speech technologies and their applications. The conference is highly ranked, with a paper acceptance rate below 50%.
The session is dedicated to automatic speech recognition (ASR) in air-traffic management (ATM), and the following agenda has been released:
Thu-M-SS-2 Thursday, September 2, 11:00-13:00 Special-Hybrid: Automatic Speech Recognition in Air Traffic Management
- 11:00 Introduction
- 11:10 Thu-M-SS-2-1 333 Towards an Accent-Robust Approach for ATC Communications Transcription, Nataly Jahchan, Florentin Barbier, Ariyanidevi Dharma Gita, Khaled Khelif and Estelle Delpech
- 11:25 Thu-M-SS-2-2 1033 Detecting English Speech in the Air Traffic Control Voice Communication, Igor Szöke, Santosh Kesiraju, Ondřej Novotný, Martin Kocour, Karel Veselý and Jan Černocký
- 11:40 Thu-M-SS-2-6 935 Robust Command Recognition for Lithuanian Air Traffic Control Tower Utterances, Oliver Ohneiser, Saeed Sarfjoo, Hartmut Helmke, Shruthi Shetty, Petr Motlicek, Matthias Kleinert, Heiko Ehr and Šarūnas Murauskas
- 11:55 Thu-M-SS-2-3 1373 Contextual Semi-Supervised Learning: An Approach to Leverage Air-Surveillance and Untranscribed ATC Data in ASR Systems, Juan Zuluaga-Gomez, Iuliia Nigmatulina, Amrutha Prasad, Petr Motlicek, Karel Veselý, Martin Kocour and Igor Szöke
- 12:10 Thu-M-SS-2-4 1619 Boosting of Contextual Information in ASR for Air-Traffic Call-Sign Recognition, Martin Kocour, Karel Veselý, Alexander Blatt, Juan Zuluaga Gomez, Igor Szöke, Jan Černocký, Dietrich Klakow and Petr Motlíček
- 12:25 Thu-M-SS-2-5 1650 Modeling the Effect of Military Oxygen Masks on Speech Characteristics, Benjamin Elie, Jodie Gauvain, Jean-Luc Gauvain and Lori Lamel
- 12:40 Panel discussion
Goal of the special session:
Air-traffic management is a specialized domain in which, in addition to the voice signal, other contextual information (e.g. air traffic surveillance data, meteorological data) plays an important role. Automatic speech recognition is only the first challenge in the processing chain. Further processing usually requires transforming the recognized word sequence into a conceptual form, which is the more important output for ATM applications. This also means that the usual metrics for evaluating ASR systems (e.g. word error rate) are less important, and other performance criteria are employed instead. These may be objective, such as command recognition error rate, callsign detection accuracy, overall algorithmic delay, real-time factor, or reduced flight times, or subjective, such as a decrease in the workload of the users.
This special session aims to bring together ATM players (both academic and industrial) interested in ASR and ASR researchers looking for new challenges. This can accelerate near-term R&D plans to enable the integration of speech technologies into the challenging and highly safety-oriented air-traffic management domain.
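To illustrate why word error rate alone can be misleading in this domain, the following minimal Python sketch compares it with a command-level error rate on a toy example. It is not taken from any of the session papers: the utterances, the `(callsign, command type, value)` tuples, and the function names `word_error_rate` and `command_error_rate` are all hypothetical, and the commands are assumed to be already extracted from the transcripts by some downstream step.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def command_error_rate(ref_commands: list, hyp_commands: list) -> float:
    """Fraction of reference commands that were not reproduced exactly.

    Assumes both lists are aligned one-to-one (a simplification).
    """
    if not ref_commands or len(ref_commands) != len(hyp_commands):
        raise ValueError("command lists must be non-empty and aligned")
    errors = sum(1 for r, h in zip(ref_commands, hyp_commands) if r != h)
    return errors / len(ref_commands)


# Hypothetical reference utterance and its annotated command.
reference = "lufthansa one two three descend flight level eight zero"
ref_cmd = [("DLH123", "DESCEND", "FL80")]

# Hypothesis A: one inserted filler word, command still correct.
hyp_a = "lufthansa one two three descend to flight level eight zero"
cmd_a = [("DLH123", "DESCEND", "FL80")]

# Hypothesis B: one substituted digit, command value is wrong.
hyp_b = "lufthansa one two three descend flight level nine zero"
cmd_b = [("DLH123", "DESCEND", "FL90")]

wer_a = word_error_rate(reference, hyp_a)
wer_b = word_error_rate(reference, hyp_b)
cer_a = command_error_rate(ref_cmd, cmd_a)
cer_b = command_error_rate(ref_cmd, cmd_b)
```

In this toy case both hypotheses contain a single word error out of nine reference words, so their word error rates are identical, yet only hypothesis B yields a wrong command. This is the kind of difference that command-level criteria are meant to capture.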