Conference presentation
Open access

Machine translation into multiple dialects: The example of Swiss German

Contributors: Scherrer, Yves
Presented at: 7th SIDG Congress - Dialect 2.0, Vienna (Austria), 23rd - 28th July 2012
Publication date: 2012

In this paper, we propose to approach dialects and dialectology from a Natural Language Processing (NLP) point of view. NLP covers a series of computational applications that analyze and/or transform linguistic data, such as machine translation, parsing, or text summarization. For practical reasons, most NLP applications focus on standardized, written language varieties. We argue that non-standard varieties, which are often also characterized by internal variation, can yield interesting methodological insights for NLP.

Our work focuses on Swiss German dialects. Today, dialect is the default variety of oral communication in the German-speaking part of Switzerland, while Standard German is used almost exclusively for writing. Recently, dialect writing has also become popular in electronic media (Siebenhaar 2003). This evolution justifies the development of dialect NLP tools and at the same time provides us with data to validate them.

We present a system that automatically translates (written) Standard German sentences into (written) sentences of any Swiss German dialect. It is based on hand-built transfer rules operating on several linguistic levels, such as phonology, morphology and syntax (Scherrer 2011a, 2011b). The target variety is not homogeneous but a continuum of dialects. This multi-dialectal approach requires the rules to be linked to distributional information, which we extracted from existing Swiss German dialect atlases (Hotzenköcherle et al. 1962-1997; Bucheli & Glaser 2002).

This paper focuses on two crucial aspects of our work. First, we discuss the methodological choices and issues involved in processing the dialectological data: a large part of the maps had to be digitized and interpolated to fit our needs for probabilistic interpretation. For this task, we rely on methods recently proposed in dialectometry (Rumpf et al. 2009). Second, we present the different datasets and methodologies that have been used to evaluate the proposed system.
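
To illustrate what linking transfer rules to interpolated distributional information might look like, the following is a minimal sketch and not the system described in the abstract. It assumes inverse-distance weighting as a simple stand-in for the dialectometric interpolation methods cited above. The survey points, coordinates, variant labels and function names are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical survey points: (x, y) map coordinates and the variant
# recorded at that point. Real atlas data would be digitized from maps.
SURVEY_POINTS = [
    ((0.0, 0.0), "variant_a"),
    ((10.0, 0.0), "variant_b"),
    ((0.0, 10.0), "variant_a"),
    ((10.0, 10.0), "variant_b"),
]


def variant_probabilities(location, points, power=2.0):
    """Estimate variant probabilities at `location`.

    Uses inverse-distance weighting (an assumption made for this sketch):
    nearer survey points contribute more weight to their variant.
    """
    weights = defaultdict(float)
    for (px, py), variant in points:
        dist = math.hypot(location[0] - px, location[1] - py)
        if dist == 0.0:
            # Exactly on a survey point: use that observation directly.
            return {variant: 1.0}
        weights[variant] += 1.0 / dist ** power
    total = sum(weights.values())
    return {variant: w / total for variant, w in weights.items()}


def most_likely_variant(location, points):
    """Pick the variant (and hence the transfer rule) with highest probability."""
    probs = variant_probabilities(location, points)
    return max(probs, key=probs.get)
```

In this sketch, a translation step for a given target dialect would call `most_likely_variant` with the dialect's coordinates to decide which variant of a rule to apply. Sampling from the probabilities instead of taking the maximum would be an equally plausible design.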

Citation (ISO format)
SCHERRER, Yves. Machine translation into multiple dialects: The example of Swiss German. In: 7th SIDG Congress - Dialect 2.0. Vienna (Austria). 2012.
PID: unige:22818

Technical information

Creation: 08/29/2012 3:05:00 PM
First validation: 08/29/2012 3:05:00 PM
Update time: 03/14/2023 5:40:26 PM
Status update: 03/14/2023 5:40:26 PM
Last indexation: 01/16/2024 12:12:10 AM
All rights reserved by Archive ouverte UNIGE and the University of Geneva