Book chapter
English

Lightweight Spoken Utterance Classification with CFG, tf-idf and Dynamic Programming

Published in: Camelin, Nathalie, Estève, Yannick, Martín-Vide, Carlos (Ed.), Statistical Language and Speech Processing (SLSP), p. 143-154
Publisher: Springer
Collection
  • Lecture Notes in Computer Science; 10583
Publication date: 2017
Abstract

We describe a simple spoken utterance classification method suitable for data-sparse domains which can be approximately described by CFG grammars. The central idea is to perform robust matching of CFG rules against output from a large-vocabulary recogniser, using a dynamic programming method which optimises the tf-idf score of the matched grammar string. We present results of experiments carried out on a substantial CFG-based medical speech translator and the publicly available Spoken CALL Shared Task. Robust utterance classification using the tf-idf method strongly outperforms plain CFG-based recognition for both domains. When comparing with Naive Bayes classifiers trained on data sampled from the CFG grammars, the tf-idf/dynamic programming method is much better on the complex speech translation domain, but worse on the simple Spoken CALL Shared Task domain.
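
The abstract does not spell out the matching algorithm, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes each class's grammar has already been expanded into a small list of strings (the hypothetical `GRAMMAR_STRINGS` below), weights words by inverse document frequency across classes as a simplified stand-in for tf-idf, and uses a weighted longest-common-subsequence dynamic program to score how well a noisy recogniser hypothesis matches each grammar string. All names and data here are illustrative assumptions.

```python
import math
from collections import Counter

# Hypothetical toy data: a few strings each class's grammar might generate.
GRAMMAR_STRINGS = {
    "ask_pain_location": ["where is the pain", "where does it hurt"],
    "ask_pain_duration": ["how long have you had the pain",
                          "when did the pain start"],
    "ask_fever": ["do you have a fever", "have you had a fever"],
}


def idf_weights(grammar_strings):
    """Inverse document frequency per word, treating each class as one document."""
    n_docs = len(grammar_strings)
    doc_freq = Counter()
    for strings in grammar_strings.values():
        vocab = {word for s in strings for word in s.split()}
        doc_freq.update(vocab)
    # Words that occur in every class get weight 0 and so never help a match.
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}


def match_score(hypothesis, target, weights):
    """Best total weight of words shared, in order, by both strings (weighted LCS)."""
    h, t = hypothesis.split(), target.split()
    # best[i][j]: best score using the first i words of h and the first j of t.
    best = [[0.0] * (len(t) + 1) for _ in range(len(h) + 1)]
    for i in range(1, len(h) + 1):
        for j in range(1, len(t) + 1):
            # Skip a word on either side (insertions and deletions cost nothing).
            best[i][j] = max(best[i - 1][j], best[i][j - 1])
            if h[i - 1] == t[j - 1]:
                matched = best[i - 1][j - 1] + weights.get(h[i - 1], 0.0)
                best[i][j] = max(best[i][j], matched)
    return best[-1][-1]


def classify(hypothesis, grammar_strings):
    """Return the class whose best-matching grammar string scores highest."""
    weights = idf_weights(grammar_strings)
    scored = (
        (match_score(hypothesis, s, weights), label)
        for label, strings in grammar_strings.items()
        for s in strings
    )
    return max(scored)[1]
```

With this toy data, `classify("um where is uh the pain please", GRAMMAR_STRINGS)` selects `ask_pain_location`: the filler words are skipped by the dynamic program, and the rarer words "where" and "is" carry more weight than "the" and "pain", which occur in two classes.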

Keywords
  • Speech recognition
  • Spoken utterance classification
  • Robustness
  • Context-free grammar
  • Tf-idf
  • Medical applications
Note: 5th International Conference, SLSP 2017, Le Mans, France, October 23–25, 2017, Proceedings
Citation (ISO format)
RAYNER, Emmanuel, TSOURAKIS, Nikolaos, GERLACH, Johanna. Lightweight Spoken Utterance Classification with CFG, tf-idf and Dynamic Programming. In: Statistical Language and Speech Processing (SLSP). Camelin, Nathalie, Estève, Yannick, Martín-Vide, Carlos (Ed.). [s.l.] : Springer, 2017. p. 143–154. (Lecture Notes in Computer Science)
Identifiers
  • PID : unige:99298
  • ISBN : 978-3-319-68455-0

Technical information

Creation: 10/18/2017 4:42:00 PM
First validation: 10/18/2017 4:42:00 PM
Update time: 03/15/2023 2:24:06 AM
Status update: 03/15/2023 2:24:06 AM
Last indexation: 10/31/2024 8:40:36 AM
All rights reserved by Archive ouverte UNIGE and the University of Geneva