AnLibsXAI: Android malware classification using library lists and explainable artificial intelligence with SHAP

Authors

  • Nguyen Tan Cam, University of Information Technology, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam

DOI:

https://doi.org/10.54654/isj.v2i22.1018

Keywords:

Android malware, malware classification, machine learning, explainable artificial intelligence, SVM

Abstract

In this study, we combine several feature types, including permission lists, API system calls, and library lists, to build AnLibsXAI, a system for classifying Android malware. We experimented with six machine learning models: Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), Decision Tree (DT), K-Nearest Neighbors (KNN), and Multilayer Perceptron (MLP). We also apply explainable artificial intelligence platforms to assess the importance and impact of individual features in the Android malware classification models. Evaluation on the CICMalDroid 2020 dataset shows that SVM achieves the highest accuracy of the six models, reaching 96%. The findings of this study can inform future research on Android malware classification.
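The feature-importance idea behind SHAP can be illustrated with a small, self-contained sketch. SHAP builds on Shapley values, which split the difference between a model's output for one sample and its output for a baseline across the input features. The code below computes exact Shapley values by enumerating feature subsets. It is an illustration under stated assumptions, not the implementation used in the paper: the three feature names, the toy scoring function, and its weights are hypothetical, and the SHAP library itself relies on more efficient approximations than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial

# Hypothetical binary features (1 = present in the APK, 0 = absent).
FEATURES = ["perm:SEND_SMS", "api:getDeviceId", "lib:com.example.adsdk"]


def score(x):
    """Toy malware score. The weights are illustrative, not from the paper."""
    sms, dev_id, ads = x
    # The last term is an interaction between the first two features.
    return 0.1 + 0.4 * sms + 0.2 * dev_id + 0.1 * ads + 0.2 * sms * dev_id


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, relative to a baseline input."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # subset sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Features in the subset take the sample's value, the rest the baseline's.
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi


sample = [1, 1, 0]    # SEND_SMS and getDeviceId present, ad library absent
baseline = [0, 0, 0]  # reference app with none of the features
phi = shapley_values(score, sample, baseline)

for name, value in zip(FEATURES, phi):
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the gap between the sample's score and the baseline's score, which is the additivity property that gives SHAP its name. The interaction term is shared equally between the two features involved, and a feature whose value matches the baseline receives zero attribution.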

Published

2024-10-01

How to Cite

Cam, N. T. (2024). AnLibsXAI: Android malware classification using library lists and explainable artificial intelligence with SHAP. Journal of Science and Technology on Information Security, 2(22), 5-16. https://doi.org/10.54654/isj.v2i22.1018

Section

Papers