Improve the effectiveness of machine learning models in detecting website phishing using morphological features in URL analysis

Authors

  • Dang Thi Mai
  • Nguyễn Việt Hùng

DOI:

https://doi.org/10.54654/isj.v2i22.1040

Keywords:

website phishing detection, phishing URL, morphological features

Abstract

As the Internet has proliferated, online threats have become increasingly prevalent, particularly phishing websites. These websites carry malicious content designed to exploit users who inadvertently access them, and they represent a significant risk for users in cyberspace. Detecting and eliminating phishing websites has therefore attracted considerable research interest. In this study, we propose a set of morphological features for URL path analysis, combined with machine learning methods, to detect phishing website URLs. Experimental evaluation on the UCI Repository dataset demonstrates that the proposed feature set is more effective than previous methods on all metrics (Accuracy, Precision, Recall, and F1 Score).
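To make the idea of word-shape features on a URL path concrete, the following is a minimal sketch and not the authors' feature set, which this page does not list. The feature names (`path_length`, `token_count`, `longest_token`, `digit_ratio`, `vowel_ratio`) and the helper names `path_features` and `binary_metrics` are assumptions chosen for illustration. The second function computes the four metrics named in the abstract from binary labels, where 1 is assumed to mean phishing.

```python
import re
from urllib.parse import urlparse

VOWELS = set("aeiou")


def path_features(url: str) -> dict:
    """Illustrative lexical features of a URL path (hypothetical, not the paper's set)."""
    path = urlparse(url).path
    # Split the path into alphanumeric tokens, dropping empty pieces.
    tokens = [t for t in re.split(r"[^A-Za-z0-9]+", path) if t]
    letters = [c for c in path.lower() if c.isalpha()]
    digits = sum(c.isdigit() for c in path)
    vowels = sum(c in VOWELS for c in letters)
    return {
        "path_length": len(path),
        "token_count": len(tokens),
        "longest_token": max((len(t) for t in tokens), default=0),
        "digit_ratio": digits / len(path) if path else 0.0,
        "vowel_ratio": vowels / len(letters) if letters else 0.0,
    }


def binary_metrics(y_true: list, y_pred: list) -> dict:
    """Accuracy, Precision, Recall and F1 for binary labels (1 = phishing, assumed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # The URL below is a made-up example, not taken from the dataset.
    print(path_features("http://example.com/login/verify123.php"))
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 1]))
```

In a pipeline of the kind the abstract describes, a feature dictionary like this would be turned into a numeric vector and passed to a classifier. The choice of classifier and the actual feature definitions are left open here because they are not stated on this page.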

Author Biography

Dang Thi Mai

She received her BSc and MSc degrees in Applied Mathematics and Informatics from Hanoi University of Natural Science, and her PhD degree in Applied Mathematics from Moscow Institute of Physics and Technology in 2012.

Published

2024-10-01

How to Cite

Mai, D. T., & Hùng, N. V. (2024). Improve the effectiveness of machine learning models in detecting website phishing using morphological features in URL analysis. Journal of Science and Technology on Information Security, 2(22), 49-57. https://doi.org/10.54654/isj.v2i22.1040

Section

Papers