What matters more: the size of the corpora or their quality? The case of automatic translation of multiword expressions using comparable corpora

  1. Mitkov, Ruslan 1
  2. Taslimipoor, Shiva 1
  1. 1 University of Wolverhampton

    University of Wolverhampton

    Wolverhampton, United Kingdom

    ROR https://ror.org/01k2y1055

Book:
Computational phraseology
  1. Corpas Pastor, Gloria (coord.)
  2. Colson, Jean-Pierre (coord.)

Publisher: John Benjamins

ISBN: 978-90-272-0535-3

Year of publication: 2020

Pages: 177-187

Type: Book chapter

Abstract

This study investigates (and compares) the impact of the size and the similarity/quality of comparable corpora on the specific task of extracting translation equivalents of verb-noun collocations from such corpora. The comprehensive evaluation of different configurations of English and Spanish corpora sheds some light on the more general and perennial question: what matters more – the quantity or quality of corpora?