Application of Factored Models in English-Latvian Statistical Machine Translation System (2009-2012)
Aim of the project. The project aims to evaluate the main factors influencing the performance of the English-Latvian baseline SMT system (developed in the project "Evaluation of statistical Machine Translation methods for English-Latvian translation system", 2005-2008) and to integrate different factors into the baseline system in order to improve translation quality and widen its domain of application.
Expected results. Several theoretical and practical results are planned in the project. The main theoretical results will be an analysis of the translation quality of the baseline and factored systems, recommendations for the development of English-Latvian factored SMT, and an evaluation of syntax-based methods in statistical MT. The main practical result will be a prototype of a factored English-Latvian statistical MT system.
Tools and resources. We use GIZA++ for alignment, SRILM for language models, the Ailab tagger for Latvian text annotation and the Moses decoder for baseline and factored SMT. Initially the JRC Acquis parallel corpus (version 3.0) was used; several other Web resources have since been added.
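As an illustration only, the sketch below shows how tagger output might be turned into the `surface|lemma|tag` token format that the Moses decoder reads for factored models. The helper name `to_factored`, the choice of three factors, and the example tags are assumptions made for this sketch; they are not the project's actual scripts or the Ailab tagger's tagset.

```python
from typing import Iterable, Tuple

# One annotated token: (surface form, lemma, morphological tag).
# The three-factor layout is an assumption for this sketch.
Token = Tuple[str, str, str]


def to_factored(tokens: Iterable[Token], sep: str = "|") -> str:
    """Join the factors of each token with `sep` and the tokens with spaces."""
    parts = []
    for token in tokens:
        for factor in token:
            # A factor may not be empty or contain the separator or a space,
            # otherwise the resulting corpus line would be ambiguous.
            if not factor or sep in factor or " " in factor:
                raise ValueError(f"invalid factor: {factor!r}")
        parts.append(sep.join(token))
    return " ".join(parts)


# Hypothetical annotation of the Latvian sentence "kaķi guļ" ("cats sleep");
# the tags are made up for illustration.
sentence = [("kaķi", "kaķis", "N.PL.NOM"), ("guļ", "gulēt", "V.PRS.3")]
line = to_factored(sentence)
print(line)
```

Each line produced this way corresponds to one sentence of the annotated side of the parallel corpus; the unannotated side can stay as plain tokens.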
Project coordinator: Dr. Inguna Skadiņa
Related publications:
- Skadiņa I. Machine Translation for Latvian. In: Proceedings of First Baltic Conference „Human Language Technologies – the Baltic Perspective”, Riga, 2004, 102-106.
- Skadiņa I. Studies of English-Latvian Legal texts for Machine Translation. // Meaningful Texts: The Extraction of Semantic Information from Monolingual and Multilingual Corpora, Continuum, 2005, 188-195.
- Skadiņa I., Brālītis E. Experimental Statistical Machine Translation System for Latvian. // Proceedings of the 3rd Baltic Conference on HLT, Vilnius, 2008, 281-286.
- Skadiņa I., Brālītis E. English-Latvian SMT: knowledge or data? // Proceedings of the 17th Nordic Conference on Computational Linguistics NODALIDA, May 14-16, 2009, Odense, Denmark, NEALT Proceedings Series, Vol. 4 (2009), 242–245.
The project is funded by the Latvian Council of Science.
University of Latvia
Institute of Mathematics and Computer Science
Artificial Intelligence Laboratory