CoCoLoToCoCo – Implementation of a Complex Annotation Ontology

In the last blog post we showed how CoCoLoToCoCo assists the transcription process to produce high-quality transcriptions of ATC data (CoCoLoToCoCo – Easy and fast transcription of ATC utterances). But CoCoLoToCoCo can do much more: it also supports the annotation process, i.e. specifying the semantics of an utterance using a defined ontology that was agreed between 22 partners in SESAR-2 solution 16-04 and extended in the HAAWAII project. For that purpose, CoCoLoToCoCo provides an interface that assists and speeds up the task of creating annotations for ATC speech utterances.

An annotation consists of a set of commands, and each command is composed of a callsign, type, second type, value(s), qualifier(s) and condition(s). An annotation can easily be created or edited, as shown in the picture below, by simply clicking the elements of the list corresponding to callsign or type. Only the command types, qualifiers etc. that are supported by the respective area, e.g. tower or approach, can be chosen. This is specified in a configuration file, which for example defines that a TAXI command type is not possible for an approach controller.
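To make the structure more concrete, here is a minimal Python sketch of a command with the fields listed above, together with an area-specific check of the selectable command types. It is an illustration under assumptions, not the actual CoCoLoToCoCo code: the names Command, ALLOWED_TYPES and is_selectable are hypothetical, the real configuration file format is not shown in this post, and all command types other than TAXI are placeholders.

```python
from dataclasses import dataclass, field

# Hypothetical per-area configuration. The real configuration file format
# is not shown in this post; every type except TAXI is a placeholder.
ALLOWED_TYPES = {
    "tower": {"TAXI", "HOLD_SHORT"},
    "approach": {"DESCEND", "HEADING"},
}


@dataclass
class Command:
    """One command of an annotation, with the fields named in the text."""
    callsign: str
    type: str
    second_type: str = ""
    values: list = field(default_factory=list)
    qualifiers: list = field(default_factory=list)
    conditions: list = field(default_factory=list)


def is_selectable(command_type: str, area: str) -> bool:
    """Return True if the command type may be chosen in the given area."""
    return command_type in ALLOWED_TYPES.get(area, set())


# Illustrative command only, not the annotation shown in the figures.
cmd = Command(callsign="DLH1289", type="TAXI")
print(is_selectable(cmd.type, "tower"))     # TAXI is configured for tower
print(is_selectable(cmd.type, "approach"))  # TAXI is not configured for approach
```

Keeping the allowed types in one table per area means that the user interface only has to look up the area once to decide which list entries to offer.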

The following view shows the resulting annotation (created with CoCoLoToCoCo) for the transcription “lufthansa one two eight nine hello continue via november east and november your stand victor one zero nine”.

The colors visualize the close integration of CoCoLoToCoCo with ABSR (Assistant Based Speech Recognition). ABSR relies on the integration of a speech recognizer with an assistant system, which periodically predicts the commands that are likely to appear based on weather, surveillance and/or flight plan data. The green color in the above figure marks commands which were predicted at the time the utterance was spoken. This helps to easily identify possible errors in the annotations.

The two commands marked in red in the figure below are examples of commands which were not predicted. In this case the annotated callsign did not appear in the surveillance or flight plan data, and the prediction therefore classified the callsign as unlikely to appear in the annotations.
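The green and red marking can be pictured with a short Python sketch. It assumes, purely for illustration, that a command is reduced to a (callsign, type) pair and that a command counts as predicted when exactly this pair is in the prediction list. The function name color_annotations and the sample data are hypothetical; the matching rule that ABSR really uses is not described in this post.

```python
def color_annotations(annotated, predicted):
    """Pair each annotated command with 'green' if predicted, else 'red'.

    Both arguments are lists of (callsign, type) tuples. This exact-match
    rule is an assumption made for this sketch.
    """
    predicted_set = set(predicted)
    return [
        (command, "green" if command in predicted_set else "red")
        for command in annotated
    ]


# Made-up sample data: the second callsign is missing from the predictions,
# e.g. because it did not appear in surveillance or flight plan data.
predicted = [("DLH1289", "TAXI"), ("DLH1289", "CONTINUE")]
annotated = [("DLH1289", "TAXI"), ("AUA123", "TAXI")]

for command, color in color_annotations(annotated, predicted):
    print(command, color)
```

A red entry is not necessarily wrong. It only tells the annotator that this command deserves a second look.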

CoCoLoToCoCo users receive the biggest support when they let the tool automatically generate annotations from transcriptions. Ideally, the transcriptions in that case also come from an automatic pre-transcription, because this means the least manual work for the user: only errors need to be corrected.

If you want to know more about this automatic process, stay tuned for our next blog post.