Skip to Main content Skip to Navigation

Anomaly-based network intrusion detection using machine learning

Abstract : In recent years, hacking has become an industry unto itself, increasing the number and diversity of cyber attacks. Threats on computer networks range from malware to denial of service attacks, phishing and social engineering. An effective cyber security plan can no longer rely solely on antiviruses and firewalls to counter these threats: it must include several layers of defence. Network-based Intrusion Detection Systems (IDSs) are a complementary means of enhancing security, with the ability to monitor packets from OSI layer 2 (Data link) to layer 7 (Application). Intrusion detection techniques are traditionally divided into two categories: signatured-based (or misuse) detection and anomaly detection. Most IDSs in use today rely on signature-based detection; however, they can only detect known attacks. IDSs using anomaly detection are able to detect unknown attacks, but are unfortunately less accurate, which generates a large number of false alarms. In this context, the creation of precise anomaly-based IDS is of great value in order to be able to identify attacks that are still unknown.In this thesis, machine learning models are studied to create IDSs that can be deployed in real computer networks. Firstly, a three-step optimization method is proposed to improve the quality of detection: 1/ data augmentation to rebalance the dataset, 2/ parameters optimization to improve the model performance and 3/ ensemble learning to combine the results of the best models. Flows detected as attacks can be analyzed to generate signatures to feed signature-based IDS databases. However, this method has the disadvantage of requiring labelled datasets, which are rarely available in real-life situations. Transfer learning is therefore studied in order to train machine learning models on large labeled datasets, then finetune them on benign traffic of the network to be monitored. This method also has flaws since the models learn from already known attacks, and therefore do not actually perform anomaly detection. Thus, a new solution based on unsupervised learning is proposed. It uses network protocol header analysis to model normal traffic behavior. Anomalies detected are then aggregated into attacks or ignored when isolated. Finally, the detection of network congestion is studied. The bandwidth utilization between different links is predicted in order to correct issues before they occur.
Complete list of metadata

Cited literature [123 references]  Display  Hide  Download
Contributor : Abes Star :  Contact
Submitted on : Wednesday, November 4, 2020 - 4:29:29 PM
Last modification on : Saturday, June 26, 2021 - 3:41:08 AM
Long-term archiving on: : Friday, February 5, 2021 - 6:55:36 PM


Version validated by the jury (STAR)


  • HAL Id : tel-02988296, version 1


Maxime Labonne. Anomaly-based network intrusion detection using machine learning. Cryptography and Security [cs.CR]. Institut Polytechnique de Paris, 2020. English. ⟨NNT : 2020IPPAS011⟩. ⟨tel-02988296⟩



Les métriques sont temporairement indisponibles