Cross-Document Event Ordering through Temporal Relation Inference and Distributional Semantic Models
ISSN: 1135-5948
Year of publication: 2017
Issue: 58
Pages: 61-68
Type: Article
Other publications in: Procesamiento del lenguaje natural
Abstract
This article studies the contribution that temporal relation inference and distributional semantic models make to the event ordering task. Our system automatically builds timelines from events extracted from different documents written in English. To do so, it first performs a temporal clustering step and then a semantic clustering step. To determine temporal compatibility, it runs an inference process over the temporal relations between events, which are extracted by an automatic temporal information processing system. For semantic compatibility between events, we analyzed two different distributional semantic models: LDA Topic Modeling and Word2Vec Word Embeddings. Both semantic models, together with the temporal inference, were evaluated under the SemEval 2015 Task 4 Track B evaluation framework. The experiments show that using both models improves on the current state of the art, which represents an important advance in the cross-document event ordering task.
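The two-stage idea described in the abstract (temporal compatibility first, semantic compatibility second) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that each event already carries an ISO-formatted time anchor and a dense vector (for example, one derived from word embeddings), that temporal inference is reduced to the transitive closure of BEFORE relations, and that semantic compatibility is a cosine similarity above a hypothetical threshold. The names `infer_before`, `cluster_events`, and `threshold` are illustrative only.

```python
from itertools import product
from math import sqrt


def infer_before(relations):
    """Transitive closure of BEFORE pairs: (a, b) and (b, c) imply (a, c)."""
    closure = set(relations)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow during the pass.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def cluster_events(events, threshold=0.8):
    """Build a timeline from (event_id, time_anchor, vector) triples.

    Stage 1 (temporal): group events that share a time anchor.
    Stage 2 (semantic): inside each group, greedily attach an event to the
    first cluster whose seed vector is similar enough (assumed criterion).
    """
    by_time = {}
    for event_id, anchor, vector in events:
        by_time.setdefault(anchor, []).append((event_id, vector))

    timeline = []
    for anchor in sorted(by_time):  # ISO dates sort chronologically as strings
        clusters = []
        for event_id, vector in by_time[anchor]:
            for cluster in clusters:
                if cosine(cluster[0][1], vector) >= threshold:
                    cluster.append((event_id, vector))
                    break
            else:
                clusters.append([(event_id, vector)])
        timeline.extend((anchor, [eid for eid, _ in c]) for c in clusters)
    return timeline
```

In this sketch, events that share a time anchor and have similar vectors end up in the same timeline slot, while dissimilar events on the same date stay in separate slots. A real system would replace the exact-anchor grouping with the inferred temporal relations and would obtain the vectors from a trained LDA or Word2Vec model.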