Rule-based Automatic Post-processing of SMT Output to Reduce Human Post-editing Effort

Published in Translating and the Computer 36, London (United Kingdom), 27-28 November 2014.
Abstract To enhance sharing of knowledge across the language barrier, the ACCEPT project focuses on improving machine translation of user-generated content by investigating pre- and post-editing strategies. Within this context, we have developed automatic monolingual post-editing rules for French, aimed at correcting frequent errors automatically. The rules were developed using the AcrolinxIQ technology, which relies on shallow linguistic analysis. In this paper, we present an evaluation of these rules, considering their impact on the readability of MT output and their usefulness for subsequent manual post-editing. Results show that the readability of a high proportion of the data is indeed improved when automatic post-editing rules are applied. Their usefulness is confirmed by the fact that a large share of the edits brought about by the rules are in fact kept by human post-editors. Moreover, results reveal that edits which improve readability are not necessarily the same as those preserved by post-editors in the final output, hence the importance of considering both readability and post-editing effort in the evaluation of post-editing strategies.
Keywords Post-editing; Statistical machine translation; User-generated content; Language communities
Research group TIM/ISSCO
PORRO RODRIGUEZ, Victoria et al. Rule-based Automatic Post-processing of SMT Output to Reduce Human Post-editing Effort. In: Translating and the Computer 36. London (United Kingdom). [s.l.] : [s.n.], 2014. https://archive-ouverte.unige.ch/unige:42657

Deposited on: 2014-12-06

