Proceedings chapter
Open access
English

Word-Based Dialect Identification with Georeferenced Rules

Contributors: Scherrer, Yves; Rambow, Owen
Published in: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Editors: Hang Li & Lluís Màrquez, p. 1151-1161
Presented at: Boston (USA), 9-11 October 2010
Publisher: Stroudsburg, PA (USA) : The Association for Computational Linguistics
Publication date: 2010
Abstract

We present a novel approach for (written) dialect identification based on the discriminative potential of entire words. We generate Swiss German dialect words from a Standard German lexicon with the help of hand-crafted phonetic/graphemic rules that are associated with occurrence maps extracted from a linguistic atlas created through extensive empirical fieldwork. In comparison with a character n-gram approach to dialect identification, our model is more robust to individual spelling differences, which are frequently encountered in non-standardized dialect writing. Moreover, it covers the whole Swiss German dialect continuum, which trained models struggle to achieve due to sparsity of training data.
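The word-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rules, the per-region weights, the region names, and the helper names (`derive_forms`, `build_lexicon`, `identify`) are all invented for illustration, and the scoring (summing per-region weights of recognized words) is an assumed simplification.

```python
# Hypothetical toy rules: each maps a Standard German substring to variant
# spellings, and each variant carries an INVENTED weight per region
# (a stand-in for an occurrence map taken from a linguistic atlas).
REGIONS = ("region_A", "region_B")
RULES = {
    "au": {"uu": {"region_A": 0.9, "region_B": 0.2},
           "ou": {"region_A": 0.1, "region_B": 0.8}},
    "ei": {"ii": {"region_A": 0.7, "region_B": 0.3},
           "ei": {"region_A": 0.3, "region_B": 0.7}},
}


def derive_forms(standard_word):
    """Return {dialect_form: {region: weight}} derived from one standard word."""
    forms = {standard_word: {r: 1.0 for r in REGIONS}}
    for source, variants in RULES.items():
        updated = {}
        for form, weights in forms.items():
            if source not in form:
                # Rule does not apply: keep the form unchanged.
                candidates = [(form, weights)]
            else:
                # Rule applies: one new form per variant, weights multiplied.
                candidates = [
                    (form.replace(source, target),
                     {r: weights[r] * vw[r] for r in REGIONS})
                    for target, vw in variants.items()
                ]
            for new_form, new_weights in candidates:
                old = updated.get(new_form, {r: 0.0 for r in REGIONS})
                # If two derivations collide, keep the larger weight per region.
                updated[new_form] = {r: max(old[r], new_weights[r]) for r in REGIONS}
        forms = updated
    return forms


def build_lexicon(standard_words):
    """Merge the derived forms of all standard words into one lookup table."""
    lexicon = {}
    for word in standard_words:
        lexicon.update(derive_forms(word))
    return lexicon


def identify(text, lexicon):
    """Sum region weights over recognized words; return (best_region, scores)."""
    scores = {r: 0.0 for r in REGIONS}
    for token in text.lower().split():
        for region, weight in lexicon.get(token, {}).items():
            scores[region] += weight
    return max(scores, key=scores.get), scores


lexicon = build_lexicon(["haus", "zeit"])
best, scores = identify("huus ziit", lexicon)
print(best, scores)
```

Because evidence is gathered per whole word, tokens missing from the derived lexicon simply contribute nothing, and no dialect-labelled training text is needed; the toy weights play the role that atlas-derived maps play in the approach described above.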

Citation (ISO format)
SCHERRER, Yves, RAMBOW, Owen. Word-Based Dialect Identification with Georeferenced Rules. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing. Boston (USA). Stroudsburg, PA (USA) : The Association for Computational Linguistics, 2010. p. 1151–1161.
Main files (1)
Proceedings chapter
Access level: Public
Identifiers
  • PID : unige:22821
  • ISBN : 978-1-932432-86-2
571 views
190 downloads

Technical information

Created: 29.08.2012 14:28:00
First validation: 29.08.2012 14:28:00
Updated: 14.03.2023 17:40:27
Status change: 14.03.2023 17:40:27
Last indexed: 12.02.2024 20:25:02
All rights reserved by Archive ouverte UNIGE and the University of Geneva