IBEREVAL OM: Mining Opinions from the new textual genres

  1. Balahur Dobrescu, Alexandra
  2. Boldrini, Ester
  3. Montoyo Guijarro, Andrés
  4. Martínez Barco, Patricio
Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2010

Issue: 45

Pages: 267-272

Type: Article

Other publications: Procesamiento del lenguaje natural


The increasing amount of subjective data on the Web is creating the need to develop effective Question Answering systems that can discriminate such information from factual data and subsequently process it with specific methods. The participants in the IBEREVAL OM tasks will be given a set of opinion questions (in Spanish and English). Optionally, they will also be able to receive the same set of opinion questions with the source, target and expected polarity given, as well as the time span the question refers to. They will also be provided with a collection of blog posts, extracted using the Technorati blog search engine (in Spanish and English), in which the answers to the opinion questions should be found. The gold standard for this blog post collection will be annotated beforehand by three annotators using the EmotiBlog scheme. The EmotiBlog corpus and the set of questions presented in (Balahur et al., 2009), in their present state, will be provided for system training. The participants will be able to take part in two subtasks: 1) in the first, they will be asked to provide the list of answers to each of the questions (in the same language as the questions, or in the other language); 2) in the second, they will be asked to provide a summary of the question answers, that is, the top x% of the most important answers, presented in a non-redundant manner. The gold standard for the summaries will be automatically extracted from the manual annotations, taking into account the “intensity” parameter of the opinions expressed.
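As an illustration of the summary subtask described in the abstract, the following is a minimal Python sketch of how the top x% of answers might be selected by intensity while skipping redundant ones. It is not the organizers' extraction procedure: the function name `summarize_answers`, the numeric intensity scores, the Jaccard token-overlap test for redundancy and the `max_overlap` threshold are all assumptions made for this example.

```python
import math
import re


def _tokens(text):
    """Lowercased word set used for a crude redundancy check (an assumption)."""
    return set(re.findall(r"\w+", text.lower()))


def _jaccard(a, b):
    """Jaccard similarity between two token sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def summarize_answers(answers, top_fraction=0.5, max_overlap=0.8):
    """Pick the most intense answers without near-duplicates.

    answers: list of (text, intensity) pairs, where intensity is a
        hypothetical numeric score derived from the manual annotations.
    top_fraction: the "x%" of answers to keep, as a fraction in (0, 1].
    max_overlap: answers whose token overlap with an already selected
        answer reaches this value are treated as redundant.
    """
    if not answers:
        return []
    budget = max(1, math.ceil(top_fraction * len(answers)))
    # Highest intensity first; sorted() is stable, so ties keep input order.
    ranked = sorted(answers, key=lambda pair: pair[1], reverse=True)
    summary, kept_tokens = [], []
    for text, _intensity in ranked:
        toks = _tokens(text)
        if any(_jaccard(toks, seen) >= max_overlap for seen in kept_tokens):
            continue  # near-duplicate of an answer already in the summary
        summary.append(text)
        kept_tokens.append(toks)
        if len(summary) == budget:
            break
    return summary


example = [
    ("The phone is great", 3),
    ("the phone is GREAT", 2),
    ("Battery life is poor", 2),
    ("It is fine", 1),
]
selected = summarize_answers(example, top_fraction=0.5)
```

With these example inputs, the second answer is dropped as a near-duplicate of the first, so the next most intense answer takes its place in the summary. A real system would likely need a stronger notion of redundancy than word overlap, especially across Spanish and English.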


