Master
English

Automatic post-editing of subtitles - Rule-based post-editing of subtitles from Swiss German to Standard German

Master program title: Maîtrise universitaire en traitement informatique multilingue
Defense date: 2022
Abstract

This master's thesis is part of the PASSAGE project, which aims to develop a system that automatically generates Standard German subtitles for spoken Swiss German, in collaboration with « Schweizer Radio und Fernsehen » (SRF). For this task, a normalized transcription produced by speech recognition must be post-edited in order to train an automatic post-editing system. The goal of the thesis is to develop and evaluate a rule-based automatic post-editing system intended to assist post-editors in generating corrected segments. The system uses regular expressions and rules built on the natural language processing library spaCy. The developed rules cover errors caused by spoken language and by phenomena specific to Swiss German. The system detected errors in approximately 5% of the normalized transcription input and performed modifications in approximately 4%. A human evaluation confirmed that the modifications were necessary in 97% of the cases and correct in 88%.
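To illustrate the kind of regular-expression rule the abstract describes, here is a minimal sketch in Python. It is not taken from the thesis: the filler list, the repetition rule, and the function name `post_edit` are assumptions chosen for illustration, and it uses only the standard `re` module, not spaCy.

```python
import re

# Hypothetical filler tokens typical of spoken language; the thesis's
# actual rule set is not given in the abstract.
FILLERS = re.compile(r"\b(?:äh|ähm|öh)\b,?\s*", flags=re.IGNORECASE)

# Immediate repetition of the same word ("ich ich gehe"), a common
# spoken-language disfluency. Assumption: any such repetition is an error,
# which will not hold for every legitimate German sentence.
REPEATED_WORD = re.compile(r"\b(\w+)(?:\s+\1\b)+", flags=re.IGNORECASE)


def post_edit(segment):
    """Apply the illustrative rules to one segment.

    Returns the edited segment and a flag telling whether it was modified,
    so that a human post-editor could review only the changed segments.
    """
    edited = FILLERS.sub("", segment)
    edited = REPEATED_WORD.sub(r"\1", edited)
    # Collapse whitespace left behind by the deletions.
    edited = re.sub(r"\s{2,}", " ", edited).strip()
    return edited, edited != segment
```

For example, `post_edit("Ich ich gehe äh nach Hause")` returns `("Ich gehe nach Hause", True)`, while a segment that matches no rule is returned unchanged with the flag set to `False`. Rules that depend on part-of-speech or morphological information would need a linguistic pipeline such as spaCy and are outside the scope of this sketch.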

Keywords
  • Rule-based post-editing
  • Automatic post-editing
  • APE
  • Swiss German
  • Regular expressions
  • Regex
  • spaCy
  • Natural language processing
  • NLP
Citation (ISO format)
HABERKORN, Veronika Christine. Automatic post-editing of subtitles - Rule-based post-editing of subtitles from Swiss German to Standard German. Master, 2022.
Main files (1)
Master thesis
Access level: Public
Identifiers
  • PID : unige:160778
317 views
410 downloads

Technical information

Creation: 04/26/2022 2:37:00 PM
First validation: 04/26/2022 2:37:00 PM
Update time: 03/16/2023 6:31:58 AM
Status update: 03/16/2023 6:31:57 AM
Last indexation: 12/17/2024 3:37:46 PM
All rights reserved by Archive ouverte UNIGE and the University of Geneva