Peeking through the language barrier: the development of a free/open-source gisting system for Basque to English based on apertium.org

  1. O'Regan, Jim
  2. Forcada Zubizarreta, Mikel L.
Journal:
Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2013

Issue: 51

Pages: 15-22

Type: Article

Abstract

This article describes the development of a Basque-to-English machine translation system intended for assimilation (comprehension), built on Apertium, the free/open-source rule-based machine translation platform. It presents a preliminary evaluation of the system using a new method based on cloze tests, in which readers are asked to fill in gaps in a reference translation. The results indicate that access to the raw translations produced by a system with a dictionary of about 10,000 entries and about 300 translation rules significantly increases readers' ability to complete the tests successfully.

References

  • Bonev, Boyan, Gema Ramírez-Sánchez, and Sergio Ortiz Rojas. 2012. Opinum: statistical sentiment analysis for opinion classification. In Proceedings of the 3rd Workshop in Computational Approaches to Subjectivity and Sentiment Analysis, pages 29-37. Association for Computational Linguistics.
  • Callison-Burch, Chris, Philipp Koehn, Christof Monz, Kay Peterson, Mark Przybocki, and Omar Zaidan. 2010. Findings of the 2010 joint workshop on statistical machine translation and metrics for machine translation. In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, pages 17-53, Uppsala, Sweden, July. Association for Computational Linguistics. Revised August 2010.
  • Callison-Burch, Chris, Philipp Koehn, Christof Monz, and Josh Schroeder. 2009. Findings of the 2009 Workshop on Statistical Machine Translation. In Proceedings of the Fourth Workshop on Statistical Machine Translation, pages 1-28, Athens, Greece, March. Association for Computational Linguistics.
  • Forcada, Mikel L., Mireia Ginestí-Rosell, Jacob Nordfalk, Jim O'Regan, Sergio Ortiz-Rojas, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez, Gema Ramírez-Sánchez, and Francis M. Tyers. 2011. Apertium: a free/open-source platform for rule-based machine translation. Machine Translation, 25(2):127-144.
  • Ginestí-Rosell, Mireia, Gema Ramírez-Sánchez, Sergio Ortiz-Rojas, Francis M Tyers, and Mikel L Forcada. 2009. Development of a free Basque to Spanish machine translation system. Procesamiento del Lenguaje Natural, 43:187-195.
  • Jones, Douglas, Martha Herzog, Hussny Ibrahim, Arvind Jairam, Wade Shen, Edward Gibson, and Michael Emonts. 2007. ILR-based MT comprehension test with multi-level questions. In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers, pages 77-80. Association for Computational Linguistics.
  • Mayor, Aingeru, Iñaki Alegria, Arantza Díaz de Ilarraza, Gorka Labaka, Mikel Lersundi, and Kepa Sarasola. 2011. Matxin, an open-source rule-based machine translation system for Basque. Machine Translation, 25(1):53-82.
  • Mayor, Aingeru and Francis M. Tyers. 2009. Matxin: moving towards language independence. In Proceedings of the first international workshop on free/open-source rule-based machine translation, Alacant, pages 11-17.
  • Porsiel, Jörg. 2008. Machine translation at Volkswagen: a case study. Multilingual Computing & Technology, 100.
  • Somers, Harold and Elizabeth Wild. 2000. Evaluating machine translation: the cloze procedure revisited. In Translating and the Computer 22: proceedings of the Twenty-second International Conference on Translating and the Computer, 16-17 November 2000.
  • Taylor, Wilson L. 1953. "Cloze procedure": a new tool for measuring readability. Journalism Quarterly, 30:415-433.
  • Tiedemann, Jörg. 2009. News from OPUS - a collection of multilingual parallel corpora with tools and interfaces. In Recent Advances in Natural Language Processing, volume 5, pages 237-248.
  • Tyers, Francis M. 2009. Rule-based augmentation of training data in Breton-French statistical machine translation. In Proceedings of the 13th Annual Conference of the European Association of Machine Translation, EAMT09, pages 213-218.
  • Tyers, Francis M., Felipe Sánchez-Martínez, and Mikel L. Forcada. 2012. Flexible finite-state lexical selection for rule-based machine translation. In Mauro Cettolo, Marcello Federico, Lucia Specia, and Andy Way, editors, EAMT 2012: Proceedings of the 16th Annual Conference of the European Association for Machine Translation, Trento, Italy, May 28-30 2012, pages 213-220.