A Machine Learning Approach to Network Intrusion Detection System Using K Nearest Neighbor and Random Forest is a well-researched Engineering Master’s Thesis topic for final-year students and undergraduates. It is to be used as a guide or framework for your academic research.
The evolving area of cybersecurity presents a dynamic battlefield for cyber criminals and security experts. Intrusions have become a major concern in cyberspace. Different methods are employed to tackle these threats, but there is now a greater need than ever to update the traditional methods and move beyond rudimentary approaches such as manually updated blacklists and whitelists.
Another method involves manually creating rules; this remains one of the most common methods to date. Much recent research has involved incorporating machine learning and artificial intelligence into both host-based and network-based intrusion detection systems.
Doing this originally presented problems of low accuracy, but the growth in the area of machine learning over the last decade has led to vast improvements in machine learning
algorithms and their requirements.
This research applies the k-nearest neighbours algorithm with 10-fold cross-validation and the random forest algorithm to a network-based intrusion detection system in order to improve its detection accuracy.
This project focused on specific feature selection to increase detection accuracy, using k-fold cross-validation with the random forest algorithm on approximately 126,000 samples of the NSL-KDD dataset.
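The evaluation procedure described above can be sketched as follows. This is a minimal, illustrative sketch only: it implements k-nearest neighbours and 10-fold cross-validation in plain Python on a small synthetic two-class data set. It does not use the NSL-KDD data or the feature selection from the thesis, and it omits the random forest model. The function names (knn_predict, k_fold_accuracy, make_toy_data), the choice of k = 5, and the class labels are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k=5):
    """Label a query point by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to the query, keep the k closest.
    neighbours = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def k_fold_accuracy(X, y, n_folds=10, k=5, seed=0):
    """Return the mean kNN accuracy over n_folds cross-validation folds."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds roughly equal folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        # Train on every sample outside the held-out fold.
        train_X = [X[i] for i in indices if i not in held]
        train_y = [y[i] for i in indices if i not in held]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in held_out
        )
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


def make_toy_data(n_per_class=100, seed=1):
    """Generate two separated Gaussian clusters (hypothetical stand-in data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in (("normal", 0.0), ("attack", 5.0)):
        for _ in range(n_per_class):
            X.append((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)))
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print(f"mean 10-fold accuracy: {k_fold_accuracy(X, y):.3f}")
```

Each sample is held out exactly once, so the reported figure is the average accuracy on data the classifier did not see during that fold. On real traffic records the features would first need to be encoded numerically and scaled, since Euclidean distance is sensitive to feature ranges.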
Cybersecurity is a growing problem in modern times because of rapid technological growth and advancement. The internet provides access to the accumulated knowledge of humankind, and with mobile computing now at every person’s fingertips, cyber attacks and cyber crimes have become all too common.
A report from the Anti-Phishing Working Group (APWG) has shown that about 227,000 malware detections occur daily, linked to over 20 million new malware samples daily. Malware can simply be defined as a computer program created to cause harm to a computer system (Kaspersky, 2017).
In the past, dealing with malware was straightforward, but over the past two decades cyber attacks and the way exploits are carried out have evolved. As such, cybersecurity techniques are also evolving towards more intelligent approaches.
The main problem that has arisen from the growth of technology and the internet is how little skill is now required to perform an attack on an unsuspecting target computer.
Automated scripts and sophisticated programs that can bypass and evade security measures are readily available to anyone who wishes to perform an attack, and attacks from low-skilled cyber criminals have been on the rise (Aliyev, 2010).
A 2016 report from the APWG showed that billions of US dollars were lost due to phishing attacks, and that 42.71% of these attacks targeted the retail industry. Attacks on business infrastructure at such a scale can greatly cripple the growth of businesses in a country. The report also showed that the United States hosted the largest number of phishing websites and that China was the country most affected by phishing.
This has motivated research into improving cybersecurity applications to mitigate the loss of personal data and reduce the damage caused by cyber attacks, because a properly coordinated cyber attack can cause extensive damage to a business.
The traditional methods currently in place cannot keep up with the rapid innovations happening in the cyber crime space. An example of a traditional method for mitigating phishing attacks is the use of blacklists. A blacklist is a list of harmful URLs that is curated and updated by a security company such as Avast.
A network blocks all URLs that appear on the blacklist and allows all other URLs to flow through the network.
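The blacklist behaviour described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not any vendor's implementation: the entries in BLACKLIST are hypothetical placeholder hosts, the function name is_allowed is invented for this example, and matching is done on the host name rather than on the full URL for simplicity.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist entries, for illustration only.
BLACKLIST = {"malicious.example", "phish.example"}


def is_allowed(url, blacklist=BLACKLIST):
    """Return False if the URL's host is on the blacklist, True otherwise."""
    # urlsplit lower-cases the host name; a URL without a scheme yields None.
    host = urlsplit(url).hostname or ""
    return host not in blacklist


if __name__ == "__main__":
    for url in ("https://example.org/", "http://phish.example/login"):
        print(url, "allowed" if is_allowed(url) else "blocked")
```

Any URL whose host is not yet listed passes through, which illustrates why a manually curated list struggles to keep pace with newly created phishing sites.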