UNIGE document Proceedings chapter
Title

Normalising orthographic and dialectal variants for the automatic processing of Swiss German

Authors
Samardzic, Tanja
Scherrer, Yves
Glaser, Elvira
Published in Proceedings of the 7th Language and Technology Conference. Poznan, 27-29 Nov 2015
Abstract Swiss dialects of German are, unlike most dialects of well standardised languages, widely used in everyday communication. Despite this, they lack tools and resources for natural language processing. The main reason is that the dialects are mostly spoken, so written resources are small and highly inconsistent. This paper addresses the great variability in writing, which poses a problem for automatic processing. We propose an automatic approach to normalising the variants to a single representation intended for processing tools’ internal use (not shown to human users). We manually create a sample of transcribed and normalised texts, which we use to train and test three methods based on machine translation: word-by-word mappings, character-based machine translation, and language modelling. We show that an optimal combination of the three approaches gives better results than any one of them alone.
Structures
Research group Laboratoire d'Analyse et de Traitement du Langage (LATL)
Citation
(ISO format)
SAMARDZIC, Tanja, SCHERRER, Yves, GLASER, Elvira. Normalising orthographic and dialectal variants for the automatic processing of Swiss German. In: Proceedings of the 7th Language and Technology Conference. Poznan. [s.l.] : [s.n.], 2015. https://archive-ouverte.unige.ch/unige:82397

Deposited on: 2016-04-06
