Generalised alignment templates for the inference of shallow-transfer MT rules from small parallel corpora – Felipe Sánchez Martínez, Universitat d’Alacant
Rule-based machine translation (MT) is the paradigm of choice when the amount of bilingual resources available is not large enough to train a full-fledged statistical MT system. Building a rule-based MT system usually implies a considerable investment in the development of linguistic resources. However, even when bilingual parallel corpora are scarce, structural transfer rules can be inferred automatically.
In this talk I will present current developments at Universitat d’Alacant aimed at learning shallow-transfer MT rules from small parallel corpora for use by the shallow-transfer MT platform Apertium. Inspired by the work of Sánchez-Martínez & Forcada (2009), we use alignment templates (ATs), like those used in statistical MT, and overcome the main limitations of their approach: the inability to find the appropriate level of generalisation for the ATs from which rules are generated; the inability to perform context-dependent lexicalisations, which would give a different treatment to words that are incorrectly translated by more general ATs; and the deficient selection of the sequences of lexical categories for which transfer rules are generated. Preliminary experiments show that translation quality improves compared to the method of Sánchez-Martínez & Forcada (2009), and that the number of inferred rules is considerably smaller.
Time: 3-4pm on Friday, November 29th
Venue: CG05 (Henry Grattan Building)