Proceedings chapter
Open access
English

Normalising orthographic and dialectal variants for the automatic processing of Swiss German

Presented at Poznan, 27-29 Nov 2015
Publication date: 2015
Abstract

Swiss dialects of German are, unlike most dialects of well-standardised languages, widely used in everyday communication. Despite this, they lack tools and resources for natural language processing. The main reason is that the dialects are mostly spoken, and that written resources are small and highly inconsistent. This paper addresses the great variability in writing, which poses a problem for automatic processing. We propose an automatic approach to normalising the variants to a single representation intended for the internal use of processing tools (not shown to human users). We manually create a sample of transcribed and normalised texts, which we use to train and test three methods based on machine translation: word-by-word mappings, character-based machine translation, and language modelling. We show that an optimal combination of the three approaches gives better results than any of them separately.
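As a rough illustration of the first of the three methods named in the abstract, the sketch below learns a word-by-word mapping from aligned (written form, normalised form) pairs and falls back to another component for unseen words. It is not the authors' implementation: the function names `train_word_map` and `normalise`, the `fallback` parameter (a stand-in for something like the character-based component), and the toy word pairs are all assumptions made for this example, and the pairs are not taken from the paper's corpus.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def train_word_map(pairs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each written form to its most frequent normalisation in the pairs."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for written, normalised in pairs:
        counts[written][normalised] += 1
    # Keep only the single most frequent normalisation per written form.
    return {written: c.most_common(1)[0][0] for written, c in counts.items()}


def normalise(
    tokens: List[str],
    word_map: Dict[str, str],
    fallback: Callable[[str], str] = lambda word: word,
) -> List[str]:
    """Normalise known words via the map; pass unseen words to `fallback`.

    `fallback` is a hypothetical hook; by default it leaves the word unchanged.
    """
    return [word_map[t] if t in word_map else fallback(t) for t in tokens]


# Toy training pairs, invented for illustration only.
toy_pairs = [
    ("gsi", "gewesen"),
    ("gsii", "gewesen"),   # spelling variant of the same word
    ("mir", "wir"),
    ("mir", "wir"),
    ("mir", "mir"),        # ambiguous form: the majority reading wins
]

word_map = train_word_map(toy_pairs)
result = normalise(["mir", "sind", "gsii"], word_map)
```

A pure lookup of this kind cannot handle unseen spellings and always picks the majority reading of an ambiguous form, which is one way to see why the abstract reports that combining it with the other two approaches works better than using it alone.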

Citation (ISO format)
SAMARDZIC, Tanja, SCHERRER, Yves, GLASER, Elvira. Normalising orthographic and dialectal variants for the automatic processing of Swiss German. In: Proceedings of the 7th Language and Technology Conference. Poznan. [s.l.] : [s.n.], 2015.
Main files (1)
Proceedings chapter (Accepted version)
Identifiers
  • PID : unige:82397

Technical information

Created: 01/04/2016 12:32:00
First validated: 01/04/2016 12:32:00
Last updated: 15/03/2023 00:15:23
Status change: 15/03/2023 00:15:22
Last indexed: 16/01/2024 20:33:30
All rights reserved by Archive ouverte UNIGE and the University of Geneva