Proceedings chapter
English

A New Method for the Study of Correlations between MT Evaluation Metrics

Presented at: Skövde (Sweden)
Publication date: 2007
Abstract

This paper aims to provide a reliable method for measuring the correlations between the scores of different evaluation metrics applied to machine-translated texts. A series of examples from recent MT evaluation experiments is first discussed, including results from the recent French MT evaluation campaign, CESTA, whose data are used here. To compute correlations, a set of 1,500 samples is created for each system and each evaluation metric using bootstrapping. Correlations between metrics, both automatic ones and those applied by human judges, are then computed over these samples. The results confirm the previously observed correlations between some automatic metrics, but also indicate a lack of correlation between human and automatic metrics on the CESTA data, which raises a number of questions regarding their validity. In addition, the roles of corpus size and of the selection procedure for bootstrapping (low vs. high scores) are examined.
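
The bootstrapping step described in the abstract can be illustrated with a minimal sketch. It is one plausible reading of the procedure, not the authors' implementation: the per-segment scores are synthetic, the function names `pearson` and `bootstrap_scores` are hypothetical, and scoring each sample as the mean of its segment scores is an assumption (corpus-level metrics such as BLEU are not computed this way). Only the figure of 1,500 samples comes from the abstract.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def bootstrap_scores(metric_a, metric_b, n_samples=1500, seed=0):
    """Resample segments with replacement and score each sample.

    Both metrics are scored on the same resampled segments, so the two
    returned lists are paired. Assumption: a sample's score is the mean
    of its segment scores.
    """
    rng = random.Random(seed)
    n = len(metric_a)
    scores_a, scores_b = [], []
    for _ in range(n_samples):
        idx = [rng.randrange(n) for _ in range(n)]
        scores_a.append(sum(metric_a[i] for i in idx) / n)
        scores_b.append(sum(metric_b[i] for i in idx) / n)
    return scores_a, scores_b


# Synthetic per-segment scores for one system (illustration only).
rng = random.Random(42)
automatic = [rng.random() for _ in range(200)]
human = [0.5 * a + 0.5 * rng.random() for a in automatic]

sample_auto, sample_human = bootstrap_scores(automatic, human)
r = pearson(sample_auto, sample_human)
print(f"samples: {len(sample_auto)}, Pearson r = {r:.3f}")
```

Resampling the same segment indices for both metrics keeps the scores paired, which is what makes a correlation over the 1,500 samples meaningful. Restricting `idx` to low-scoring or high-scoring segments would be one way to mimic the selection procedure the abstract mentions, though the paper's exact procedure is not given here.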

Citation (ISO format)
ESTRELLA, Paula Susana, POPESCU-BELIS, Andréi, KING, Margaret. A New Method for the Study of Correlations between MT Evaluation Metrics. In: Proceedings of the 11th Conference on Theoretical and Methodological Issues in Machine Translation (TMI-07). Skövde (Sweden). [s.l.] : [s.n.], 2007. p. 55–64.
Access level: Public
Identifiers
  • PID : unige:3460

Technical information

Creation: 10/02/2009 9:28:57 AM
First validation: 10/02/2009 9:28:57 AM
Update time: 03/14/2023 3:14:53 PM
Status update: 03/14/2023 3:14:52 PM
Last indexation: 10/29/2024 12:18:28 PM
All rights reserved by Archive ouverte UNIGE and the University of Geneva