Technical research of detection algorithmically generated malicious domain names using machine learning methods
DOI:
https://doi.org/10.54654/isj.v7i1.54Keywords:
cyber security, malicious domain, Domain Generation Algorithm, machine learning, deep learningTóm tắt
Abstract— In recent years, many malware use domain generation algorithm for generating a large of domains to maintain their Command and Control (C&C) network infrastructure. In this paper, we present an approach for detecting malicious domain names using machine learning methods. This approach is using Viterbi algorithm and dictionary for constructing feature of domain names. The approach is demonstrated using a range of legitimate domains and a number of malicious algorithmically generated domain names. The numerical results show the efficiency of this method.
Tóm tắt— Trong những năm gần đây, nhiều phần mềm độc hại sử dụng thuật toán sinh tên miền tạo ra lượng lớn các tên miền để duy trì cơ sở hạ tầng mạng ra lệnh và điều khiển (C&C). Trong bài báo này, chúng tôi trình bày một cách tiếp cận để phát hiện tên miền độc hại bằng phương pháp học máy. Cách tiếp cận này sử dụng thuật toán Viterbi và tập từ điển để trích xuất các đặc trưng của tên miền. Cách tiếp cận được thể hiện bằng cách sử dụng một lượng lớn các tên miền hợp pháp và một lượng lớn tên miền độc hại được tạo ra bằng thuật toán sinh tên miền. Các kết quả thực nghiệm đã chỉ ra tính hiệu quả của phương pháp.
Downloads
References
[1].Hà Quang Thụy, Nguyễn Hà Nam, Nguyễn Trí Thành, “Giáo trình khai phá dữ liệu”, VNU Publishing, 2013.
[2]. Moran Baruch, “DGA Detection Using Machine Learning Methods”, Master Thesis, University of Jyväskylä, 2016.
[3]. Thomas Edgar and David Manz, “Research Methods for Cyber Security”, Syngress, 2017.
[4]. Xingguo Li, Junfeng Wang, and Xiao song Zhang, “Botnet Detection Technology Based on DNS”, Future Internet 2017, 9, 55.
[5]. Michael Sikorski, Andrew Honig, “Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software”, No Starch Press, 2012.
[6]. Konrad Rieck, Philipp Trinius, Carsten Willems, and Thorsten Holz, “Automatic Analysis of Malware Behavior using Machine Learning”, 2011.
[7]. Daisuke Miyamoto, Hiroaki Hazeyama, Youki Kadobayashi, “An Evaluation of Machine Learning-based Methods for Detection of Phishing Sites”, 2017.
[8]. Jasper Abbink, “Popularity-based Detection of Domain Generation Algorithms, Master Thesis”, Delft University of Technology, 2017.
[9]. Christopher M. Bishop, “Pattern Recognition and Machine Learning”, Springer Science, 2006.
[10]. Ian Goodfellow, Yoshua Bengio, Aaron Courville, “Deep Learning”, MIT Press book, 2016.
[11]. M. Namazifar, Y. Pan, “Research Spotlight: Detecting Algorithmically Generated Domains”, Cisco, 2015.
[12]. Enoch Agyepong, William J. Buchanan, Kevin Jones, “Detection of Algorithmically Generated Malicious Domain”, Conference: 6th International Conference of Advanced Computer Science & Information Technology, 2018.
[13]. M. Antonakakis, R. Perdisci, Y. Nadji, N. Vasiloglou II, S. Abu-Nimeh, W. Lee, and D. Dagon, “From Throw-Away Traffic to Bots: Detecting the Rise of DGA-Based Malware”. In USENIX security symposium Vol. 12, 2012.
[14]. S. Yadav, A.K.K Reddy, A.L. Reddy, and S. Ranjan, (2010, November). “Detecting Algorithmically generated malicious domain names”. In Proceedings of the 10th ACM SIGCOMM conference on Internet measurement pp. 48-61. ACM.
[15]. G. Zhao, K. Xu, L. Xu, and B. Wu, (2015). “Detecting APT Malware Infections Based on Malicious DNS and Traffic A nalysis”. IEEE Access, 3, pp. 1132-1142, 2015.
[16]. N. Goodman, “A Survey of Advances in Botnet Technologies”. arXiv preprint arXiv:1702.01132, 2017.
[17]. V. Oujezsky, T. Horvath, and V. Skorpil, “Botnet C&C Traffic and Flow Lifespans Using Survival Analysis”. International Journal of Advances in Telecommunications, Electrotechnics, Signals and Systems, 6(1), pp. 38-44, 201.
[18]. R. Sharifnya, and M. Abadi, DFBotKiller: “Domain-flux botnet detection based on the history of group activities and failures in DNS traffic”. Digital Investigation, 12, pp. 15-26, 2015.
[19]. Kotsiantis, Sotiris B., I. Zaharakis, and P. Pintelas. “Supervised machine learning: A review of classification techniques”. (2007): 3-24.
[20]. A. Chailytko, and A. Trafimchuk, “DGA clustering and analysis: mastering modern, evolving threats”, 2015.
[21]. L. Bilge, E. Kirda, C. Kruegel, and M. Balduzzi, “Finding Malicious Domains Using Passive DNS Analysis”, In Ndss, 2011.
[22]. J. Kwon, J. Lee, H. Lee and A Perrig, PsyBoG: “A scalable botnet detection method for large-scale DNS traffic, Computer Networks”, 97, pp. 48-73, 2016.
[23]. J. Lee, and H. Lee, “GMAD: Graph-based Malware Activity Detection by DNS traffic analysis”, Computer Communications, 49, 33-47, 2014.
[24]. R. Sharifnya, and M. Abadi, “DFBotKiller: Domain-flux botnet detection based on the history of group activities and failures in DNS traffic”, Digital Investigation, 12, pp. 15-26, 2015.
[25]. Yu Fu, Lu Yu, Richard Brooks, “Poster: Zero-day Botnet Domain Generation Algorithm (DGA) Detection using Hidden Markov Models (HMMs)”, 38th IEEE Symposium on Security and Privacy, 2017.
[26]. Tianyu Wang, Li-Chiou Ch, “Detecting Algorithmically Generated Domains Using Data Visualization and N-Grams Methods”, Proceedings of Student-Faculty Research Day, 2017.
[27]. Jonathan Woodbridge, Hyrum S. Anderson, Anjum Ahuja, and Daniel Grant, “Predicting Domain Generation Algorithms with Long Short-Term Memory Networks”, arXiv:1611.00791, 2016
Downloads
Published
How to Cite
Issue
Section
License
Proposed Policy for Journals That Offer Open Access
Authors who publish with this journal agree to the following terms:
1. Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
Proposed Policy for Journals That Offer Delayed Open Access
Authors who publish with this journal agree to the following terms:
1. Authors retain copyright and grant the journal right of first publication, with the work [SPECIFY PERIOD OF TIME] after publication simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).