A method of generating mutated Windows malware to evade ensemble learning
DOI: https://doi.org/10.54654/isj.v1i18.906
Keywords: evasion attack, adversarial attack, malware mutation, generative adversarial networks, reinforcement learning, ensemble learning
Abstract: The application of machine learning (ML) to cybersecurity, particularly to malware detection and prevention, has received significant attention in recent years. Numerous studies on malware analysis have been proposed, showing promising results for practical applications. At the same time, Generative Adversarial Networks (GANs) and Reinforcement Learning (RL) can help adversaries create mutated malware that evades detection. In this study, we propose a method for generating mutated Windows malware against ensemble-learning-based malware detection, combining GANs and RL to overcome the limitations of the MalGAN model. Specifically, we develop the FeaGAN model, an extension of MalGAN, and combine it with the Deep Q-network anti-malware Engines Attacking Framework (DQEAF) RL model. Furthermore, the FeaGAN model employs ensemble learning for malware detection to enhance the evasion capabilities of the generated adversarial samples. Experimental results show that 100% of the selected mutated samples maintain their format integrity. In addition, preservation of the executable functionality of the malware variants shows promising results, with a stable success rate.
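For readers unfamiliar with the detection side of this setting, the following is a minimal sketch of what an ensemble-learning detector can look like. It is not the authors' implementation: the abstract does not name the ensemble members, so the three scikit-learn classifiers, the soft-voting rule, and the synthetic binary feature vectors (hypothetical stand-ins for extracted PE features) are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier


def make_synthetic_features(n_samples=400, n_features=32, seed=0):
    """Build toy binary feature vectors (assumption: NOT real PE features).

    Label 1 stands in for "malicious", label 0 for "benign". Samples with
    label 1 switch on the first 8 features more often than label 0 does.
    """
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_samples)
    prob_on = np.full((n_samples, n_features), 0.2)
    prob_on[y == 1, :8] = 0.8
    X = (rng.random((n_samples, n_features)) < prob_on).astype(int)
    return X, y


def build_ensemble_detector(seed=0):
    """Soft-voting ensemble: averages the class probabilities of its members.

    The member models are illustrative choices, not taken from the paper.
    """
    members = [
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(max_depth=5, random_state=seed)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
    ]
    return VotingClassifier(estimators=members, voting="soft")


X, y = make_synthetic_features()
X_train, y_train = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]

detector = build_ensemble_detector().fit(X_train, y_train)
accuracy = detector.score(X_test, y_test)          # fraction classified correctly
scores = detector.predict_proba(X_test)[:, 1]      # averaged "malicious" probability
print(f"held-out accuracy: {accuracy:.2f}")
```

Because the final score is an average over several differently structured models, a sample has to look benign to most members at once rather than to a single classifier, which is the property that makes an ensemble a harder target than one model.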