Improve the effectiveness of machine learning models in detecting website phishing using morphological features in URL analysis
DOI:
https://doi.org/10.54654/isj.v2i22.1040
Keywords:
website phishing detection, phishing URL, morphological features
Abstract
As the Internet has proliferated, online threats have become increasingly prevalent, particularly phishing websites. These websites carry malicious content designed to exploit users who inadvertently access them, and they pose a significant risk to users in cyberspace. Detecting and eliminating phishing websites has therefore attracted considerable research interest. In this study, we propose a set of morphological features for URL path analysis, combined with machine learning methods, to detect phishing website URLs. Experimental evaluation on the UCI Repository dataset demonstrates the effectiveness of the proposed feature set on all metrics (Accuracy, Precision, Recall, and F1 Score) compared to previous methods.
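As a rough illustration of the pipeline the abstract outlines, the sketch below derives a few word-shape features from a URL path and computes the four reported metrics from binary labels. The feature names in `path_features` are hypothetical and chosen only for illustration; the abstract does not list the paper's actual morphological features, and the example URL and labels are made up. The metric formulas are the standard definitions, with 1 taken to mean "phishing".

```python
from urllib.parse import urlparse
import re


def path_features(url: str) -> dict:
    """Illustrative word-shape features of a URL path.

    Hypothetical feature set for demonstration; NOT the paper's features.
    """
    path = urlparse(url).path
    # Split the path into alphanumeric tokens ("words").
    tokens = [t for t in re.split(r"[^A-Za-z0-9]+", path) if t]
    digits = sum(ch.isdigit() for ch in path)
    return {
        "path_length": len(path),
        "token_count": len(tokens),
        "longest_token": max((len(t) for t in tokens), default=0),
        "digit_ratio": digits / len(path) if path else 0.0,
    }


def binary_metrics(y_true, y_pred) -> dict:
    """Accuracy, Precision, Recall, and F1 for labels in {0, 1} (1 = phishing)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up URL and labels, for demonstration only.
    print(path_features("http://example.com/login/secure-update123.php"))
    print(binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0]))
```

In practice, feature dictionaries like these would be turned into vectors and passed to a classifier. The choice of classifier and the real feature definitions are left open here because the abstract does not specify them.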