Proceedings chapter
Open access
English

Two Approaches to Correcting Homophone Confusions in a Hybrid Machine Translation System

Presented at Sofia (Bulgaria), 8 Aug. 2013
Publisher: Association for Computational Linguistics
Publication date: 2013
Abstract

In the context of a hybrid French-to-English SMT system for translating online forum posts, we present two methods for addressing the common problem of homophone confusions in colloquial written language. The first is based on hand-coded rules; the second on weighted graphs derived from a large-scale pronunciation resource, with weights trained from a small bicorpus of domain language. With automatic evaluation, the weighted graph method yields an improvement of about +0.63 BLEU points, while the rule-based method scores about the same as the baseline. On contrastive manual evaluation, both methods give highly significant improvements (p < 0.0001) and score about equally when compared against each other.
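
The weighted-graph idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the homophone sets, the bigram costs, and the names `HOMOPHONES`, `BIGRAM_COST`, `CHANGE_COST`, and `correct` are all hypothetical stand-ins. In the paper the alternatives come from a pronunciation resource and the weights are trained on a bicorpus; here both are hard-coded toy values. Each token is expanded into its homophone alternatives, and a Viterbi-style search picks the cheapest path through the resulting lattice.

```python
from typing import Dict, List, Tuple

# Hypothetical homophone sets (stand-in for a pronunciation resource).
HOMOPHONES: Dict[str, List[str]] = {
    "a": ["a", "à"],
    "à": ["a", "à"],
}

# Hypothetical bigram costs, lower is more plausible (stand-in for trained weights).
BIGRAM_COST: Dict[Tuple[str, str], float] = {
    ("il", "a"): 0.2,
    ("il", "à"): 2.0,
    ("va", "à"): 0.3,
    ("va", "a"): 2.5,
}
DEFAULT_COST = 1.0  # cost of any bigram not listed above (assumption)
CHANGE_COST = 0.5   # penalty for replacing the word the user typed (assumption)


def correct(tokens: List[str]) -> List[str]:
    """Return the cheapest path through the homophone lattice of `tokens`."""
    # best[word] = (cost, path) of the cheapest path ending in `word`.
    best: Dict[str, Tuple[float, List[str]]] = {"<s>": (0.0, [])}
    for tok in tokens:
        new_best: Dict[str, Tuple[float, List[str]]] = {}
        for cand in HOMOPHONES.get(tok, [tok]):
            edit = 0.0 if cand == tok else CHANGE_COST
            options = [
                (cost + BIGRAM_COST.get((prev, cand), DEFAULT_COST) + edit,
                 path + [cand])
                for prev, (cost, path) in best.items()
            ]
            new_best[cand] = min(options, key=lambda o: o[0])
        best = new_best
    return min(best.values(), key=lambda o: o[0])[1]


print(" ".join(correct(["il", "va", "a", "Paris"])))
```

With these toy costs, the confusion "va a" is rewritten to "va à", while a correct "il a" is left unchanged because the change penalty outweighs any gain.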

Citation (ISO format)
BOUILLON, Pierrette et al. Two Approaches to Correcting Homophone Confusions in a Hybrid Machine Translation System. In: Proceedings of the Second Workshop on Hybrid Approaches to Translation. Sofia (Bulgaria). [s.l.] : Association for Computational Linguistics, 2013. p. 109–116.
Main files (1)
Proceedings chapter (Published version)
Access level: Public
Identifiers
  • PID : unige:30954

Technical information

Creation: 11/05/2013 11:50:00 AM
First validation: 11/05/2013 11:50:00 AM
Update time: 03/14/2023 8:36:03 PM
Status update: 03/14/2023 8:36:02 PM
Last indexation: 01/16/2024 8:06:46 AM
All rights reserved by Archive ouverte UNIGE and the University of Geneva