A novel method for detecting URLs phishing using hybrid machine learning algorithm
DOI: https://doi.org/10.54654/isj.v2i19.978
Keywords: URL, phishing, SVM, Naive Bayes, machine learning
Abstract: A phishing attack is a type of cyberattack that exploits people's trust by disguising malicious intent as communication from reputable sources. The goal is to steal sensitive data from victims (banking information, social identification, credentials, etc.) for various purposes (selling it for monetary gain, committing identity theft, or using it as leverage to escalate an attack). In 2022, the number of reported phishing attacks reached 255 million, an increase of 61% compared to 2021. Existing methods of phishing URL detection have limitations. This article proposes a method to increase the accuracy of malicious URL detection by combining two machine learning classifiers, Linear Support Vector Classification and multinomial Naive Bayes, through a voting mechanism.
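The combination described in the abstract can be sketched with scikit-learn's `LinearSVC`, `MultinomialNB`, and `VotingClassifier`. This is only an illustrative sketch, not the authors' implementation: the abstract does not state the feature extraction or the voting type, so the character n-gram counts, the hard (majority) voting, the helper name `build_model`, and the toy URLs below are all assumptions made for the example.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data invented for illustration only (1 = phishing, 0 = benign).
urls = [
    "http://paypal.com.secure-login.example-verify.ru/update/account",
    "http://192.168.10.5/bank/login.php?user=confirm",
    "http://free-gift-card.win/claim?id=12345&verify=true",
    "http://appleid.apple.com.signin-support.info/reset",
    "https://www.wikipedia.org/",
    "https://docs.python.org/3/library/urllib.parse.html",
    "https://github.com/scikit-learn/scikit-learn",
    "https://www.bbc.com/news/technology",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def build_model():
    """Hypothetical helper: URL text -> n-gram counts -> voting ensemble."""
    # Assumption: character 2- to 4-gram counts as features. Counts are
    # non-negative, which MultinomialNB requires.
    vectorizer = CountVectorizer(analyzer="char", ngram_range=(2, 4))
    # Assumption: hard voting, because LinearSVC has no predict_proba
    # and so cannot take part in soft voting without calibration.
    voter = VotingClassifier(
        estimators=[("svc", LinearSVC()), ("nb", MultinomialNB())],
        voting="hard",
    )
    return make_pipeline(vectorizer, voter)


model = build_model()
model.fit(urls, labels)

# Classify an unseen URL; the result is an array of labels (0 or 1).
predictions = model.predict(
    ["http://secure-login.paypal.com.verify-account.ru/update"]
)
print(predictions)
```

With only two voters, hard voting can tie; scikit-learn then picks the class that comes first in ascending sort order (here `0`, benign). A real system would likely need a tie-breaking strategy or a third classifier, and would train on a labeled URL dataset rather than a handful of examples.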