Publications in collaboration with researchers from the Jožef Stefan Institute (4)

2024

  1. Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages

    2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings

2023

  1. MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages

    Proceedings of the 24th Annual Conference of the European Association for Machine Translation, EAMT 2023

2022

  1. MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages

    EAMT 2022 - Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

2016

  1. Dealing with data sparseness in SMT with factored models and morphological expansion: A case study on Croatian

    Proceedings of the 19th Annual Conference of the European Association for Machine Translation, EAMT 2016