Extracting Multiple Relations between Entities from Unstructured Threat Intelligence Reports
DOI: https://doi.org/10.54654/isj.v3i20.976
Keywords: Relation extraction, named entity recognition, threat intelligence, deep learning
Abstract: Cyber threats are a growing concern for organizations and even countries. Attackers constantly search for new and sophisticated attack vectors. Defenders, in turn, need to gather as much information as possible about threats on the Internet and analyze it to understand current and emerging attack trends, so that they can detect and mitigate potential threats effectively and respond quickly. This paper addresses the problem of automatically extracting threat intelligence from unstructured text sources. We focus specifically on the case where multiple relations may hold between two entities and propose a two-stage process that allows any binary classifier to be used for multi-class classification without modifying the underlying binary algorithm. The experimental results illustrate the efficiency of our proposed approach.
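The abstract does not spell out the two stages, so the following Python sketch is only one possible reading and should not be taken as the authors' method. It assumes that stage one decomposes the task into one independent binary problem per relation type, and that stage two runs every binary model on an entity pair's context and keeps all relation types that fire, so a single pair can receive several relations. The class names (KeywordPerceptron, TwoStageRelationExtractor), the relation labels, and the toy sentences are all hypothetical and invented for illustration.

```python
from typing import Callable, Dict, List, Sequence, Set


class KeywordPerceptron:
    """Toy binary classifier; a stand-in for any model exposing fit/predict."""

    def __init__(self, epochs: int = 10) -> None:
        self.epochs = epochs
        self.weights: Dict[str, float] = {}
        self.bias = 0.0

    def _score(self, tokens: Sequence[str]) -> float:
        return self.bias + sum(self.weights.get(t, 0.0) for t in tokens)

    def predict(self, tokens: Sequence[str]) -> bool:
        return self._score(tokens) > 0

    def fit(self, samples: List[Sequence[str]], targets: List[int]) -> None:
        # Classic perceptron rule: update only on misclassified samples.
        for _ in range(self.epochs):
            for tokens, target in zip(samples, targets):
                delta = target - int(self.predict(tokens))
                if delta == 0:
                    continue
                for t in tokens:
                    self.weights[t] = self.weights.get(t, 0.0) + delta
                self.bias += delta


class TwoStageRelationExtractor:
    """Wraps an unmodified binary classifier for multi-relation prediction."""

    def __init__(self, make_binary: Callable[[], KeywordPerceptron]) -> None:
        self.make_binary = make_binary
        self.models: Dict[str, KeywordPerceptron] = {}

    def fit(self, samples: List[Sequence[str]], label_sets: List[Set[str]]) -> None:
        # Stage 1 (assumed): one binary problem per relation type.
        for label in sorted(set().union(*label_sets)):
            targets = [1 if label in labels else 0 for labels in label_sets]
            model = self.make_binary()
            model.fit(samples, targets)
            self.models[label] = model

    def predict(self, tokens: Sequence[str]) -> Set[str]:
        # Stage 2 (assumed): keep every relation whose binary model fires.
        return {label for label, m in self.models.items() if m.predict(tokens)}


# Invented toy data: tokenized contexts with entities replaced by placeholders.
train_samples = [
    ["actor", "uses", "tool"],
    ["actor", "develops", "tool"],
    ["actor", "develops", "and", "uses", "tool"],
    ["actor", "mentions", "tool"],
]
train_labels = [{"uses"}, {"develops"}, {"uses", "develops"}, set()]

extractor = TwoStageRelationExtractor(KeywordPerceptron)
extractor.fit(train_samples, train_labels)
print(extractor.predict(["actor", "uses", "and", "develops", "tool"]))
```

Because the wrapper only calls fit and predict, the perceptron could be swapped for any other binary model with the same two methods without changing the wrapper, which is the property the abstract emphasizes.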