Abu-MaTran at WMT15 Machine Translation and Quality Estimation Shared Tasks

Project researchers Raphaël Rubino and Miquel Esplà-Gomis represented the Abu-MaTran consortium at the Tenth Workshop on Statistical Machine Translation (WMT15), co-located with EMNLP. We participated in two shared tasks, Machine Translation and Quality Estimation, and our submissions ranked first in both: in English-to-Finnish translation and in word-level quality estimation.

Raphaël presented the systems submitted by the Abu-MaTran consortium to the Machine Translation shared task. We participated in the Finnish–English language pair, where we tackled the scarcity of resources and the complex morphology of Finnish by (i) crawling parallel (FiEnWaC) and monolingual (FiWaC) data from the Web and (ii) applying rule-based and unsupervised methods for morphological segmentation. Our submissions were the top-performing systems in three settings: English-to-Finnish unconstrained (according to all automatic metrics), English-to-Finnish constrained (according to BLEU), and Finnish-to-English constrained (according to TER).

Miquel presented the systems submitted to the Quality Estimation shared task. We participated in the word-level sub-task with a method that uses external sources of bilingual information as a black box to spot sub-segment correspondences between a source segment and the translation hypothesis produced by a machine translation system. We used two sources of bilingual information in our submissions: machine translation (Apertium and Google Translate) and the bilingual concordancer Reverso Context. Our system ranked first in the sub-task.
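
To give a rough idea of how sub-segment correspondences can be spotted, here is a minimal Python sketch. It is a simplified illustration and not the submitted system: the toy lexicon stands in for a black-box source such as Apertium, Google Translate, or Reverso Context, and the names coverage_counts, toy_translate, and TOY_LEXICON are hypothetical. It counts, for each word of the translation hypothesis, how many translated source sub-segments cover it. The final OK/BAD thresholding is only an assumption made for this example.

```python
from typing import Callable, Iterator, List, Optional

Translator = Callable[[List[str]], Optional[List[str]]]


def ngrams(tokens: List[str], max_len: int) -> Iterator[List[str]]:
    """Yield every contiguous sub-segment of up to max_len tokens."""
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            yield tokens[i:i + n]


def find_sub(haystack: List[str], needle: List[str]) -> List[int]:
    """Return the start positions at which needle occurs in haystack."""
    n = len(needle)
    return [i for i in range(len(haystack) - n + 1)
            if haystack[i:i + n] == needle]


def coverage_counts(source: List[str], hypothesis: List[str],
                    translate: Translator, max_len: int = 3) -> List[int]:
    """Count, per hypothesis word, the translated sub-segments covering it."""
    counts = [0] * len(hypothesis)
    for segment in ngrams(source, max_len):
        translated = translate(segment)  # black-box call
        if not translated:
            continue  # the source of bilingual information had no answer
        for start in find_sub(hypothesis, translated):
            for j in range(start, start + len(translated)):
                counts[j] += 1
    return counts


# Hypothetical stand-in for an external source of bilingual information.
TOY_LEXICON = {
    ("the", "house"): ["la", "casa"],
    ("house",): ["casa"],
    ("is",): ["es"],
    ("big",): ["grande"],
}


def toy_translate(segment: List[str]) -> Optional[List[str]]:
    return TOY_LEXICON.get(tuple(segment))


source = "the house is big".split()
hypothesis = "la casa es pequeña".split()
counts = coverage_counts(source, hypothesis, toy_translate)
# Assumed decision rule for illustration: uncovered words are flagged BAD.
labels = ["OK" if c > 0 else "BAD" for c in counts]
print(list(zip(hypothesis, counts, labels)))
```

In this toy case the word "pequeña" is covered by no translated sub-segment, so it is the only one flagged. A real system would more plausibly feed such evidence to a trained classifier as features than apply a fixed threshold.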