UNIGE document Conference Presentation

Machine translation into multiple dialects: The example of Swiss German

Presented at 7th SIDG Congress - Dialect 2.0, Vienna (Austria), 23rd - 28th July 2012
Abstract In this paper, we propose to approach dialects and dialectology from a Natural Language Processing (NLP) point of view. NLP covers a series of computational applications that analyze and/or transform linguistic data, such as machine translation, parsing, or text summarization. For practical reasons, most NLP applications focus on standardized, written language varieties. We argue that non-standard varieties, which are often also characterized by internal variation, can yield interesting methodological insights for NLP. Our work focuses on Swiss German dialects. Today, dialect is the default variety of oral communication in the German-speaking part of Switzerland (Standard German is used almost exclusively for writing). Recently, dialect writing has also become popular in electronic media (Siebenhaar 2003). This evolution justifies the development of dialect NLP tools, and at the same time provides us with data to validate them. We present a system that automatically translates (written) Standard German sentences into (written) sentences of any Swiss German dialect. It is based on hand-built transfer rules operating on several linguistic levels such as phonology, morphology and syntax (Scherrer 2011a, 2011b). The target variety is not homogeneous, but a continuum of dialects. This multi-dialectal approach requires the rules to be linked to distributional information. We extracted these data from existing Swiss German dialect atlases (Hotzenköcherle et al. 1962-1997; Bucheli & Glaser 2002). This paper focuses on two crucial aspects of our work. First, we discuss methodological choices and issues involved in processing the dialectological data. Indeed, a large part of the maps had to be digitized and interpolated to fit our needs for probabilistic interpretation. For this task, we rely on methods recently proposed in dialectometry (Rumpf et al. 2009). Second, we present different datasets and different methodologies that have been used to evaluate the proposed system.
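
The link between transfer rules and distributional information described in the abstract can be pictured with a minimal Python sketch. It is an illustration only, not the presented system: the names `RuleMap`, `ND_RULE`, `best_variant` and `apply_rule`, the location labels, and all probability values are hypothetical and are not taken from the dialect atlases. The sketch assumes that each rule stores, per location, a probability for each output variant, and that translation picks the most probable variant at the target location.

```python
from typing import Dict

# Hypothetical structure: location -> {output variant -> probability}.
# In a real system such values might come from interpolated atlas maps;
# the numbers below are toy values chosen for illustration.
RuleMap = Dict[str, Dict[str, float]]

ND_RULE: RuleMap = {
    "location_A": {"nd": 0.9, "ng": 0.1},
    "location_B": {"nd": 0.2, "ng": 0.8},
}


def best_variant(rule: RuleMap, location: str) -> str:
    """Return the most probable output variant of a rule at a location."""
    variants = rule[location]
    return max(variants, key=variants.get)


def apply_rule(word: str, source: str, rule: RuleMap, location: str) -> str:
    """Replace the source segment with the locally preferred variant."""
    return word.replace(source, best_variant(rule, location))


if __name__ == "__main__":
    for loc in ("location_A", "location_B"):
        print(loc, apply_rule("Hund", "nd", ND_RULE, loc))
```

Picking the single most probable variant is one possible design choice; sampling variants in proportion to their probabilities would be another.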
Full text
Presentation (2.1 MB) - public document Free access
Research group Laboratoire d'Analyse et de Traitement du Langage (LATL)
(ISO format)
SCHERRER, Yves. Machine translation into multiple dialects: The example of Swiss German. In: 7th SIDG Congress - Dialect 2.0. Vienna (Austria). 2012. https://archive-ouverte.unige.ch/unige:22818

Deposited on: 2012-09-11
