Master
English

The Effect of Simple English Wikipedia on Machine Translation Output: An Evaluation with Cloze Procedure

Contributors: Fitzroy, Kirsty
Master program title: Maîtrise universitaire en traduction et technologies, mention localisation et traduction automatique
Defense date: 2022
Abstract

Simplified text is known to improve the quality of machine translation output. Wikipedia and its simplified counterpart, Simple English Wikipedia, have been used extensively in datasets designed to capitalize on this. However, Simple English Wikipedia has previously been found to be unreliable as a simplified corpus. In this work, passages from the original Wikipedia and Simple English Wikipedia were machine translated by two neural systems, Google Translate and DeepL, and the output was evaluated by human judges: first for quality, and second for comprehensibility using cloze procedure, an uncommon method. No differences were found between the output for the two source types. A corpus analysis of the source texts yielded few signs of simplification. These findings support previous studies showing that Simple English Wikipedia cannot be relied upon as a simplified corpus. Results also showed DeepL to be the better-performing system and cloze procedure to be a suitable evaluation protocol for comprehensibility.

Keywords
  • Neural machine translation
  • Simple English Wikipedia
  • Cloze procedure
  • Text simplification
  • Google Translate
  • DeepL
  • Gap-filling
  • Traduction automatique neuronale
  • Simplification de textes
Citation (ISO format)
FITZROY, Kirsty. The Effect of Simple English Wikipedia on Machine Translation Output: An Evaluation with Cloze Procedure. Master, 2022.
Main files (1)
Master thesis
Access level: Public
Identifiers
  • PID: unige:164439

Technical information

Creation: 26/10/2022 09:45:00
First validation: 26/10/2022 09:45:00
Update time: 16/03/2023 08:03:44
Status update: 16/03/2023 08:03:43
Last indexation: 17/12/2024 15:38:06
All rights reserved by Archive ouverte UNIGE and the University of Geneva