EmotiBlog: a fine-grained annotation schema for labelling subjectivity in the new-textual genres born with the Web 2.0

  1. Boldrini, Ester
  2. Balahur Dobrescu, Alexandra
  3. Martínez Barco, Patricio
  4. Montoyo Guijarro, Andrés
Journal:
Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2010

Issue: 45

Pages: 41-48

Type: Article

Abstract

The exponential growth of subjective information in the framework of the Web 2.0 has created a need for Natural Language Processing tools able to analyse and process such data for multiple practical applications. These applications require training on specifically annotated corpora whose level of detail is fine enough to capture the phenomena involved. This paper presents EmotiBlog, a fine-grained annotation scheme for subjectivity. We show how it is built and demonstrate the benefits it brings to systems that use it for training, through experiments on opinion mining and emotion detection. We employ corpora of different textual genres: a set of annotated reported speech extracted from news articles, the set of news titles annotated with polarity and emotion from SemEval 2007 (Task 14), and ISEAR, a corpus of real-life self-expressed emotion. We also show how the model built from the EmotiBlog annotations can be enhanced with external resources. The results demonstrate that EmotiBlog, through its structure and annotation paradigm, offers high-quality training data for systems dealing with both opinion mining and emotion detection.

References

  • Balahur A., Steinberger R., Kabadjov M., Zavarella V., van der Goot E., Halkia M., Pouliquen B., and Belyaeva J. 2010. Sentiment Analysis in the News. In Proceedings of LREC 2010.
  • Balahur A., Boldrini E., Montoyo A., Martínez-Barco P. 2009. A Comparative Study of Open Domain and Opinion Question Answering Systems for Factual and Opinionated Queries. In Proceedings of the Recent Advances in Natural Language Processing.
  • Balahur A., Montoyo A. 2008. Applying a Culture Dependent Emotion Triggers Database for Text Valence and Emotion Classification. In Proceedings of the AISB 2008 Symposium on Affective Language in Human and Machine, Aberdeen, Scotland.
  • Balahur A., Steinberger R. 2009. Rethinking Sentiment Analysis in the News: from Theory to Practice and back. In Proceedings of WOMSA 2009. Seville.
  • Balahur A., Boldrini E., Montoyo A., Martínez-Barco P. 2009. Summarizing Threads in Blogs Using Opinion Polarity. In Proceedings of ETTS workshop. RANLP 2009.
  • Boldrini E., Balahur A., Martínez-Barco P., Montoyo A. 2009. EmotiBlog: a fine-grained model for emotion detection in non-traditional textual genres. In Proceedings of WOMSA. Seville, Spain.
  • Boldrini E., Fernández J., Gómez J.M., Martínez-Barco P. 2009. Machine Learning Techniques for Automatic Opinion Detection in Non-Traditional Textual Genres. In Proceedings of WOMSA 2009. Seville, Spain.
  • Chaovalit P., Zhou L. 2005. Movie Review Mining: a Comparison between Supervised and Unsupervised Classification Approaches. In Proceedings of HICSS-05.
  • Carletta J. 1996. Assessing agreement on classification tasks: the kappa statistic. Computational Linguistics, 22(2): 249–254.
  • Cui H., Mittal V., Datar M. 2006. Comparative Experiments on Sentiment Classification for Online Product Reviews. In Proceedings of the 21st National Conference on Artificial Intelligence AAAI.
  • Cerini S., Compagnoni V., Demontis A., Formentelli M., Gandini G. 2007. Micro-WNOp: A gold standard for the evaluation of automatically compiled lexical resources for opinion mining. In Language resources and linguistic theory: Typology, second language acquisition, English linguistics (forthcoming). Franco Angeli Editore, Milano, IT.
  • Dave K., Lawrence S., Pennock D. 2003. Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews. In Proceedings of WWW-03.
  • Esuli A., Sebastiani F. 2006. SentiWordNet: A Publicly Available Resource for Opinion Mining. In Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2006, Genoa, Italy.
  • Goldberg A.B., Zhu J. 2006. Seeing stars when there aren’t many stars: Graph-based semi-supervised learning for sentiment categorization. In HLT-NAACL 2006 Workshop on Textgraphs: Graph-based Algorithms for Natural Language Processing.
  • Hu M., Liu B. 2004. Mining Opinion Features in Customer Reviews. In Proceedings of Nineteenth National Conference on Artificial Intelligence AAAI.
  • Hatzivassiloglou V., Wiebe J. 2000. Effects of adjective orientation and gradability on sentence subjectivity. In Proceedings of COLING.
  • Kim S.M., Hovy E. 2004. Determining the Sentiment of Opinions. In Proceedings of COLING.
  • Lin W.H., Wilson T., Wiebe J., Hauptmann A. 2006. Which Side are You On? Identifying Perspectives at the Document and Sentence Levels. In Proceedings of the Tenth Conference on Natural Language Learning CoNLL.
  • Mullen T., Collier N. 2004. Sentiment Analysis Using Support Vector Machines with Diverse Information Sources. In Proceedings of EMNLP.
  • Pang B., Lee L., Vaithyanathan S. 2002. Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of EMNLP-02, the Conference on Empirical Methods in Natural Language Processing.
  • Riloff E., Wiebe J. 2003. Learning Extraction Patterns for Subjective Expressions. In Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing.
  • Scherer K. R. 2005. What are emotions? And how can they be measured? Social Science Information, 44(4), 693–727.
  • Stoyanov V., Cardie C. 2006. Toward Opinion Summarization: Linking the Sources. COLING-ACL Workshop on Sentiment and Subjectivity in Text.
  • Stoyanov V., Cardie C., Litman D., and Wiebe J. 2004. Evaluating an Opinion Annotation Scheme Using a New Multi-Perspective Question and Answer Corpus. AAAI Spring Symposium on Exploring Attitude and Affect in Text.
  • Strapparava C., Mihalcea R. 2007. SemEval 2007 Task 14: Affective Text. In Proceedings of the ACL.
  • Turney P. 2002. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews. ACL 2002: 417-424.
  • Uspensky B. 1973. A Poetics of Composition. University of California Press, Berkeley, California.
  • Wiebe J.M. 1994. Tracking point of view in narrative. Computational Linguistics, 20: 233–287.
  • Wiebe J., Wilson T. and Cardie C. 2005. Annotating expressions of opinions and emotions in language. Language Resources and Evaluation.
  • Wilson T., Wiebe J., Hwa R. 2004. Just how mad are you? Finding strong and weak opinion clauses. In Proceedings of AAAI.