Overview of FLARES at IberLEF 2024: Fine-grained Language-based Reliability Detection in Spanish News

Sepúlveda-Torres, Robiert; Bonet-Jover, Alba; Diab, Isam; Guillén-Pacho, Ibai; Cabrera-de Castro, Isabel; Badenes-Olmedo, Carlos; Saquete, Estela; Martín-Valdivia, M. Teresa; Martínez-Barco, Patricio; Ureña-López, L. Alfonso

Journal: Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2024

Issue: 73

Pages: 369-379

Type: Article

Abstract

This article presents FLARES, a shared task organized within IberLEF 2024, the evaluation campaign for Natural Language Processing systems in Spanish and other Iberian languages. FLARES aims to detect reliability patterns in the language used in news, so that effective techniques can be developed for the future detection of misleading information. To this end, it builds on the 5W1H journalistic technique to identify the relevant content of a news item, together with an annotation guideline designed to detect linguistic reliability. Two subtasks are proposed: the first focuses on identifying the 5W1H elements and the second on detecting reliability. A total of 7 participants registered for the shared task, of whom 3 took part in the first subtask and 4 in the second. The teams proposed a variety of approaches, chiefly based on fine-tuning encoder models and on instruction tuning of decoder models.

Data source: Dialnet