Abu-MaTran at the Machine Translation Marathon 2015

The Abu-MaTran project is present this week at the tenth Machine Translation Marathon in Prague (Czech Republic).

Project researcher Jorge Ferrández Tordera from Prompsit Language Engineering presents a poster about CloudLM, a novel tool that allows cloud-based language models to be used in statistical machine translation systems. The paper is co-authored by project researchers Sergio Ortiz-Rojas and Antonio Toral. Most of the work leading to the publication was carried out during Jorge’s industry-to-academia secondment at Dublin City University.

Second Workshop on Data Creation for Apertium

The second Workshop on data creation has been completed!

This workshop, entitled “Workshop on the Apertium free/open-source machine translation platform: transferring structures from one language to another”, took place on 22nd May 2015 at the University of Zagreb (Croatia). It aimed to encourage people to contribute data (transfer rules) for South-Slavic languages to the Apertium platform.

Participants of the first workshop acquired advanced knowledge of one of the most interesting modules in Apertium: the transfer module, where structures are translated from one language to another.

These are the materials that were produced and used for the workshop:

– the workshop guide: abumatran-apertium-workshop2-guide

– the workshop slides: abumatran-apertium-workshop2-slides

All materials are distributed under free/open-source licenses. For a copy of the original files, just contact us.

If you use these materials, please let us know. We would be glad to receive your feedback.

Workshop on data creation for Apertium RBMT language pairs

We are back to share the materials from the Workshop on data creation, organised within the Abu-MaTran project as one of its planned outreach activities.

This workshop, entitled “Workshop on the Apertium free/open-source machine translation platform: basics on how to control the engine through linguistics”, took place in November 2014 at the University of Zagreb (Croatia). It aimed to encourage people to contribute data (dictionaries and manually disambiguated corpora) for South-Slavic languages to the Apertium platform.

Both newcomers and existing Apertium contributors are targeted as testers of a new approach that seeks to lower the bar for contributions through experimental user interfaces.

We are making available:

– the workshop guide: for days 1 and 2

– the workshop slides: for day 1 and day 2

All materials are distributed under free/open-source licenses. For a copy of the original files, just contact us.

If you use these materials, please let us know. We would be glad to receive your feedback.

Guest lecture by Antonio Toral at Dublin City University

Project researcher Antonio Toral is giving a guest lecture at Dublin City University as part of the Digital World module.

Multimedia in Mobile-based Machine Translation (PDF slides)

Advances in recent years in the fields of machine translation and mobile computing have made the long-awaited dream of automatic translation on a handheld device feasible.
This lecture aims to give a general overview of the techniques behind machine translation and the technologies that make it possible to handle multimedia, so that one can build mobile applications that translate not only from text but also from other types of media such as speech and images.

Time: 10.00, 30th October 2014
Venue: Room XG20, Science Building, Dublin City University

Talk by Gema Ramírez at UZ NLP circle

This afternoon, Gema Ramírez, from partner Prompsit, gives an invited talk at the NLP circle at the University of Zagreb, which meets on the last Monday of each month.

More than 50 people, including researchers, students and professionals, will attend this talk about the free/open-source machine translation platform Apertium and about Prompsit, the company Gema manages, which offers services related to this platform as well as other NLP services.

If you are not able to make it, the slides of the talk are available here.

Enjoy!

When: 27 October 2014, 17:00
Where: Meeting Room. 2nd floor. Faculty of Humanities and Social Sciences at the University of Zagreb
Title: The Apertium platform: opportunities for research and business

Three highlights of the Abu-MaTran project at the mid-term review

We have prepared a short visual presentation of three highlights of the Abu-MaTran project as we reach the mid-term review. These highlights cover the following topics:

Talk by Andy Way at Universitat d’Alacant

Project researcher Andy Way is giving an invited talk at Universitat d’Alacant.

Bilingual Termbank Creation via Log-Likelihood Comparison and Phrase-Based Statistical Machine Translation

Bilingual termbanks are important for many natural language processing (NLP) applications, especially in translation workflows in industrial settings. In this paper, we apply a log-likelihood comparison method to extract monolingual terminology from the source and target sides of a parallel corpus. Then, using a Phrase-Based Statistical Machine Translation model, we create a bilingual terminology with the extracted monolingual term lists. We manually evaluate our novel terminology extraction model on English-to-Spanish and English-to-Hindi data sets, and observe excellent performance for all domains. Furthermore, we report the performance of our monolingual terminology extraction model compared with a number of state-of-the-art terminology extraction models on the English-to-Hindi data sets.

Time: 10.30, 12th September 2014
Venue: Seminari de III cicle del Departament de Llenguatges i Sistemes Informàtics, Edifici Politècnica IV (Edifici 39), Mòdul 2, 1a planta

 

Abu-MaTran at WMT 2014

The Abu-MaTran project was present at the ninth Workshop on Statistical Machine Translation (WMT) 2014, organised jointly with the annual Association for Computational Linguistics conference.

The workshop took place in Baltimore, MD, on 26th and 27th June 2014. Two papers related to the Abu-MaTran project were presented, describing the Machine Translation systems built for the English-French translation task. The data-constrained Abu-MaTran system was ranked first and second, based on human and automatic evaluations respectively.

Abu-MaTran at WMT 2014 Translation Task: Two-step Data Selection and RBMT-Style Synthetic Rules
Raphael Rubino, Antonio Toral, Víctor M. Sánchez-Cartagena, Jorge Ferrández-Tordera, Sergio Ortiz Rojas, Gema Ramírez-Sánchez, Felipe Sánchez-Martínez and Andy Way

The UA-Prompsit hybrid machine translation system for the 2014 Workshop on Statistical Machine Translation
Víctor M. Sánchez-Cartagena, Juan Antonio Pérez-Ortiz and Felipe Sánchez-Martínez

Abu-MaTran at EAMT 2014

The Abu-MaTran project was present at the recent conference of the European Association for Machine Translation, held in Dubrovnik (Croatia).

Two papers related to the Abu-MaTran project were presented:

Extrinsic Evaluation of Web-Crawlers in Machine Translation: a Case Study on Croatian–English for the Tourism Domain
Antonio Toral, Raphael Rubino, Miquel Esplà, Tommi Pirinen, Andy Way and Gema Ramírez-Sánchez.

An efficient method to assist non-expert users in extending dictionaries by assigning stems and inflectional paradigms to unknown words
Miquel Esplà-Gomis, Víctor M. Sánchez-Cartagena, Juan A. Pérez-Ortiz, Felipe Sánchez-Martínez, Mikel L. Forcada and Rafael C. Carrasco

Moreover, a poster dedicated to the Abu-MaTran project was presented during the conference, as well as a demo of our tourism-specific Machine Translation system working on a mobile phone.

Talk by Mikel Artetxe at Dublin City University

Mikel Artetxe from the University of the Basque Country will be giving an invited talk as part of the NCLT Seminar Series.

Mitzuli: offline machine translation on a mobile phone

Mobile platforms are changing the way in which people interact with technology, and they offer a whole new world of possibilities to make something like machine translation more useful for the general public. This talk is about Mitzuli, a translator app for Android that includes support for OCR, TTS and a full offline mode. The challenges of creating something like this will be presented in the talk, analyzing the main features and restrictions of these mobile platforms when compared to the traditional desktop platforms. As an example of this, we will see how Apertium, the RBMT system that Mitzuli is based on, was ported to Android.

Time: 3pm, Tuesday June 10th
Venue: S206 (Engineering)