Proceedings chapter
Open access
English

Combining pre-editing and post-editing to improve SMT of user-generated content

Published in: Proceedings of MT Summit XIV Workshop on Post-editing Technology and Practice, Editors O'Brien, S., Simard, M. & Specia, L., p. 45-53
Presented at Nice (France), 2 Sept. 2013
Publication date: 2013
Abstract

The poor quality of user-generated content (UGC) found in forums hinders both readability and machine-translatability. To improve these two aspects, we have developed human- and machine-oriented pre-editing rules, which correct or reformulate this content. In this paper, we present the results of a study that investigates whether pre-editing rules that improve the quality of statistical machine translation (SMT) output also have a positive impact on post-editing productivity. For this study, pre-editing rules were applied to a set of French sentences extracted from a technical forum. After SMT, the post-editing temporal effort and final quality were compared for translations of the raw source and its pre-edited version. The results suggest that pre-editing speeds up post-editing and that the combination of the two processes is worthy of further investigation.

Keywords
  • User-generated content
  • Pre-editing
  • Post-editing
  • Statistical machine translation
Citation (ISO format)
GERLACH, Johanna et al. Combining pre-editing and post-editing to improve SMT of user-generated content. In: Proceedings of MT Summit XIV Workshop on Post-editing Technology and Practice. Nice (France). [s.l.] : [s.n.], 2013. p. 45–53.
Main files (1)
Proceedings chapter (Accepted version)
Access level: Public
Identifiers
  • PID : unige:30952

Technical information

Creation: 11/05/2013 12:21:00 PM
First validation: 11/05/2013 12:21:00 PM
Update time: 03/14/2023 8:36:02 PM
Status update: 03/14/2023 8:36:02 PM
Last indexation: 05/02/2024 1:26:44 PM
All rights reserved by Archive ouverte UNIGE and the University of Geneva