Proceedings chapter
English

Combining pre-editing and post-editing to improve SMT of user-generated content

Presented at Nice (France), 2 Sept. 2013
Published in O'Brien, S., Simard, M. & Specia, L. (Ed.), Proceedings of MT Summit XIV Workshop on Post-editing Technology and Practice, p. 45-53
Publication date: 2013
Abstract

The poor quality of user-generated content (UGC) found in forums hinders both readability and machine-translatability. To improve these two aspects, we have developed human- and machine-oriented pre-editing rules, which correct or reformulate this content. In this paper, we present the results of a study investigating whether pre-editing rules that improve the quality of statistical machine translation (SMT) output also have a positive impact on post-editing productivity. For this study, pre-editing rules were applied to a set of French sentences extracted from a technical forum. After SMT, post-editing temporal effort and final quality were compared for translations of the raw source and of its pre-edited version. The results suggest that pre-editing speeds up post-editing and that the combination of the two processes is worthy of further investigation.

Keywords
  • User-generated content
  • Pre-editing
  • Post-editing
  • Statistical machine translation
Citation (ISO format)
GERLACH, Johanna et al. Combining pre-editing and post-editing to improve SMT of user-generated content. In: Proceedings of MT Summit XIV Workshop on Post-editing Technology and Practice. O’Brien, S., Simard, M. & Specia, L. (Ed.). Nice (France). [s.l.] : [s.n.], 2013. p. 45–53.
Main files (1)
Proceedings chapter (Accepted version)
Access level: Public
Identifiers
  • PID : unige:30952
2568 views
1477 downloads

Technical information

Creation: 11/05/2013 12:21:00 PM
First validation: 11/05/2013 12:21:00 PM
Update time: 03/14/2023 8:36:02 PM
Status update: 03/14/2023 8:36:02 PM
Last indexation: 10/30/2024 2:52:44 PM
All rights reserved by Archive ouverte UNIGE and the University of Geneva