Content based Detection and Blocking of Spam/Phishing Emails using Machine Learning
Abstract
Use of the web is increasing day by day as more people rely on it, especially for communication. E-mail remains one of the most efficient and effective communication tools for purposes ranging from social to business, owing to its low cost and minimal time consumption. Through e-mail, however, one can flood the internet by sending multiple copies of the same message to a large number of users. One important issue to be addressed is that inboxes are commonly affected by attacks, chiefly spam. Currently, spam e-mails are identified by detecting stop words in them; however, if a new spam, fake or irrelevant e-mail is sent without those stop words, it is not properly identified. Therefore, a system should learn words and their meanings in order to detect spam e-mails efficiently. To overcome this issue of blocking new and unrecognised spam e-mails, a machine-learning-based approach using the ‘Phishing Websites’ dataset from the UCI repository is proposed. The proposed methodology uses morphological analysis in Natural Language Processing (NLP) for better spam identification. By applying machine learning techniques efficiently, spam and phishing e-mails are to be detected and blocked on the server side itself.
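The contrast drawn above, between matching a fixed word list and learning from word forms, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the keyword list, the six toy training messages, the crude suffix-stripping `stem` function (a stand-in for proper morphological analysis), and the choice of a multinomial naive Bayes classifier. The abstract does not name the classifier or the features actually used.

```python
import math
import re
from collections import Counter

# Hypothetical fixed word list, standing in for a static word-matching filter.
KEYWORDS = {"lottery", "winner", "prize"}

# Crude suffix stripping; an assumption, not a real morphological analyser.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Strip one common English suffix so inflected forms share a token."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokens(text):
    """Lower-case the text, split it into alphabetic words, and stem each one."""
    return [stem(w) for w in re.findall(r"[a-z]+", text.lower())]


def keyword_filter(text):
    """Flag a message only if it contains a word from the fixed list."""
    return any(w in KEYWORDS for w in re.findall(r"[a-z]+", text.lower()))


def train(messages):
    """Count stemmed tokens per class from (text, label) pairs."""
    counts = {"spam": Counter(), "ham": Counter()}
    docs = Counter()
    for text, label in messages:
        counts[label].update(tokens(text))
        docs[label] += 1
    vocab = set(counts["spam"]) | set(counts["ham"])
    return counts, docs, vocab


def classify(text, model):
    """Multinomial naive Bayes with add-one smoothing over known tokens."""
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    scores = {}
    for label in counts:
        total = sum(counts[label].values())
        score = math.log(docs[label] / total_docs)
        for tok in tokens(text):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((counts[label][tok] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Toy training data, invented for this example.
TRAINING = [
    ("Verify your account now to avoid suspension", "spam"),
    ("Your account was suspended, confirm your password", "spam"),
    ("Click here to claim your lottery prize", "spam"),
    ("Meeting moved to Monday at ten", "ham"),
    ("Please review the attached project report", "ham"),
    ("Lunch on Friday with the project team", "ham"),
]

model = train(TRAINING)
msg = "Confirming passwords prevents account suspensions"

print(keyword_filter(msg))   # False: no word from the fixed list appears
print(classify(msg, model))  # spam: stemmed forms match learned spam tokens
```

In this sketch the message avoids every listed keyword, so the fixed filter misses it, while stemming maps "Confirming", "passwords" and "suspensions" onto tokens the classifier has already seen in spam. A server-side deployment would replace the toy data with a labelled corpus and the suffix stripper with a real morphological analyser.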
References
Nguyen, M., Nguyen, T., & Nguyen, T. H. (2018). A deep learning model with hierarchical LSTMs and supervised attention for anti-phishing. arXiv preprint arXiv:1805.01554.
Anti-Phishing Working Group. (2018). Phishing Activity Trends Report 1st Quarter 2018. Available: http://docs.apwg.org/reports/apwg_trends_report_q1_2018.pdf
PhishLabs. (2018). 2018 Phish Trends & Intelligence Report. Available: https://info.phishlabs.com/hubfs/2018%20PTI%20Report/PhishLabs%20Trend%20Report_2018-digital.pdf
Anti-Phishing Working Group. (2016). Phishing Activity Trends Report 4th Quarter 2016. Available: http://docs.apwg.org/reports/apwg_trends_report_q4_2016.pdf
Anti-Phishing Working Group. (2015). Phishing Activity Trends Report 1st-3rd Quarter 2015. Available: http://docs.apwg.org/reports/apwg_trends_report_q1-q3_2015.pdf
Phishing Websites Data Set - Machine Learning Repository. Available: https://archive.ics.uci.edu/ml/datasets/Phishing+Websites
Mohammed, M. A., Mostafa, S. A., Obaid, O. I., Zeebaree, S. R., Abd Ghani, M. K., Mustapha, A., ... & AL-Dhief, F. T. (2019). An anti-spam detection model for emails of multi-natural language. Journal of Southwest Jiaotong University, 54(3).
Subramaniam, T., Jalab, H. A., & Taqa, A. Y. (2010). Overview of textual anti-spam filtering techniques. International Journal of Physical Sciences, 5(12), 1869-1882.
Sharma, A. K., & Sahni, S. (2011). A comparative study of classification algorithms for spam email data analysis. International Journal on Computer Science and Engineering, 3(5), 1890-1895.
Chhabra, P., Wadhvani, R., & Shukla, S. (2010). Spam filtering using support vector machine. Special Issue of IJCCT, 1(2), 3.