I.U. DE INVESTIGACIÓN INFORMÁTICA
Institute
Antonio
Toral Ruiz
Publications in collaboration with Antonio Toral Ruiz (34)
2024
-
Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
2023
-
MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages
Proceedings of the 24th Annual Conference of the European Association for Machine Translation, EAMT 2023
2022
-
Building Domain-specific Corpora from the Web: the Case of European Digital Service Infrastructures
Proceedings of the International Conference on Language Resources and Evaluation, LREC 2022 - 15th Workshop on Building and Using Comparable Corpora, BUCC 2022
-
MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages
EAMT 2022 - Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
2018
-
Quantitative fine-grained human evaluation of machine translation systems: a case study on English to Croatian
Machine Translation, Vol. 32, No. 3, pp. 195-215
2017
-
A multifaceted evaluation of neural versus phrase-based machine translation for 9 language directions
15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017 - Proceedings of Conference
-
Crawl and crowd to bring machine translation to under-resourced languages
Language Resources and Evaluation, Vol. 51, No. 4, pp. 1019-1051
-
Final results of Abu-MaTran (automatic building of machine translation)
20th Annual Conference of the European Association for Machine Translation, EAMT 2017
2016
-
Abu-MaTran at WMT 2016 Translation Task: Deep Learning, Morphological Segmentation and Tuning on Character Sequences
Proceedings of the Annual Meeting of the Association for Computational Linguistics
-
Producing monolingual and parallel web corpora at the same time - SpiderLing and Bitextor's love affair
Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016
-
Producing monolingual and parallel web corpora at the same time: SpiderLing and Bitextor's love affair
10th conference on International Language Resources and Evaluation (LREC'16) (European Language Resources Association), pp. 2949-2956
2015
-
Abu-MaTran at WMT 2015 translation task: Morphological segmentation and web crawling
10th Workshop on Statistical Machine Translation, WMT 2015 at the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015 - Proceedings
-
Automatic Acquisition of Machine Translation Resources in the Abu-MaTran Project
Procesamiento del lenguaje natural, No. 55, pp. 185-188
2014
-
Abu-MaTran at WMT 2014 translation task: Two-step data selection and RBMT-style synthetic rules
Proceedings of the Annual Meeting of the Association for Computational Linguistics
-
Extrinsic evaluation of web-crawlers in machine translation: A case study on Croatian-English for the tourism domain
Proceedings of the 17th Annual Conference of the European Association for Machine Translation, EAMT 2014
2012
-
Web 2.0, Language Resources and standards to automatically build a multilingual Named Entity Lexicon
Language Resources and Evaluation, Vol. 46, No. 3, pp. 383-419
2009
-
A study on Linking Wikipedia categories to WordNet synsets using text similarity
International Conference Recent Advances in Natural Language Processing, RANLP
-
Exploiting Wikipedia and EuroWordNet to solve Cross-Lingual Question Answering
Information Sciences, Vol. 179, No. 20, pp. 3473-3488
2007
-
Applying Wikipedia's multilingual knowledge to cross-lingual question answering
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
-
GIR with geographic query expansion
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)