Publications in collaboration with researchers from Indiana University Bloomington (24)

2024

  1. Developing a Benchmark for Pronunciation Feedback: Creation of a Phonemically Annotated Speech Corpus of isiZulu Language Learner Speech

    2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings

  2. Producing a Parallel Universal Dependencies Treebank of Ancient Hebrew and Ancient Greek via Cross-Lingual Projection

    2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings

2023

  1. Codex to corpus: Exploring annotation and processing for an open and extensible machine-readable edition of the Florentine Codex

    Proceedings of the Annual Meeting of the Association for Computational Linguistics

  2. Comparing methods of orthographic conversion for Bàsàá, a language of Cameroon

    4th Workshop on Resources for African Indigenous Languages, RAIL 2023 - Proceedings of the Workshop

  3. The ITML Submission to the IberLEF2023 Shared Task on Guarani-Spanish Code Switching Analysis

    CEUR Workshop Proceedings

  4. Towards a finite-state morphological analyser for San Mateo Huave

    COMPUTEL 2023 - 6th Workshop on the Use of Computational Methods in the Study of Endangered Languages, Proceedings of the Workshop

2022

  1. A Free/Open-Source Morphological Analyser and Generator for Sakha

    2022 Language Resources and Evaluation Conference, LREC 2022

  2. Curriculum Optimization for Low-Resource Speech Recognition

    ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

  3. Non-finite verb forms in Turkic exhibit syncretism, not multifunctionality

    Folia Linguistica, Vol. 56, No. 3, pp. 693-742

2021

  1. A finite-state morphological analyser for Paraguayan Guaraní

    Proceedings of the 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021

  2. A survey of part-of-speech tagging approaches applied to K’iche’

    Proceedings of the 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021

  3. Correction to: Morphological analysis and disambiguation for Breton (Language Resources and Evaluation, (2021), 55, 2, (431-473), 10.1007/s10579-020-09510-8)

    Language Resources and Evaluation

  4. Investigating variation in written forms of Nahuatl using character-based language models

    Proceedings of the 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021

  5. Keyword spotting for audiovisual archival search in Uralic languages

    IWCLUL 2021 - 7th International Workshop on Computational Linguistics of Uralic Languages, Proceedings

  6. Morphological analysis and disambiguation for Breton

    Language Resources and Evaluation, Vol. 55, No. 2, pp. 431-473

  7. Recent advances in Apertium, a free/open-source rule-based machine translation platform for low-resource languages

    Machine Translation, Vol. 35, No. 4, pp. 475-502

2020

  1. An unsupervised method for weighting finite-state morphological analyzers

    LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings

  2. Common Voice: A massively-multilingual speech corpus

    LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings

  3. Unsupervised weighting of transfer rules in rule-based machine translation using maximum-entropy approach

    Journal of Information Science and Engineering, Vol. 36, No. 2, pp. 309-322

2019

  1. An approach to abstractive summarization for Norwegian Bokmål

    Communications in Computer and Information Science