Proposed Research Plan for an M. Sc. Thesis
Thesis Title:
Efficient Network Intrusion Detection Model Using Dynamic Machine Learning
Submitted to
Proposed By
Under Supervision of
2019
Introduction
Intrusion detection systems (IDS) are an important tool for monitoring and securing network traffic and infrastructure. Network intrusion detection system (NIDS) sensors are the network components responsible for monitoring all incoming and outgoing traffic in order to detect abnormal activity. Because intrusion techniques vary so widely, signature-based detection, which matches traffic against a database of signatures of previously known attacks, cannot cope and cannot provide a complete security solution, since it cannot detect novel attacks. Most current research has therefore moved toward evaluating and improving anomaly-based intrusion detection systems, which rely on learning algorithms that give machines the ability to distinguish between normal and abnormal network behavior. The field is still at an early stage, and deep learning techniques can bring substantial improvements to network security.
Related Work
Bo Li and Khe Chai Sim proposed a Deep Split Temporal Context (DSTC) structure for DNNs to directly model long temporal contexts. By assuming independence between the sub-contexts of a long span of speech signals in the early acoustic modeling stages, the complete context is split into multiple sub-contexts. These are first modeled independently and then merged at the last hidden layer to jointly give the final predictions. Although DSTC is a generic acoustic modeling approach that does not explicitly handle noisy data, it gives competitive performance [1].
A group at the Technical University of Munich constructed deep neural networks to improve the modeling and classification of system call sequences. By combining convolutional and recurrent layers in one neural network architecture, they obtained their best classification results. Their hybrid network, containing two convolutional layers and one recurrent layer, is a novel approach to malware classification. It outperforms not only simpler neural architectures but also the previously widely used Hidden Markov Models and Support Vector Machines. Overall, their approach exhibits better performance than previous malware classification approaches [2].
Another group showed that UMAP produces representations as meaningful as those of t-SNE, particularly in its ability to resolve subtly differing cell populations. UMAP also has the useful and intuitive property of preserving more of the global structure and, notably, the continuity of the cell subsets. In addition to making plots easier to interpret, the authors note that this improves its utility for generating hypotheses related to cellular development. UMAP outputs are faster to compute than Barnes–Hut t-SNE, much faster than scvis, and comparable to FIt-SNE [3].
Objectives
As the number and variety of intrusions against systems increase, improving automatic intrusion detection tools has become a necessity. Deep learning is a natural choice for coping with this growth, since it addresses the need to find underlying patterns in large-scale datasets. In this work, I propose to combine the UMAP dimensionality reduction technique for feature extraction with a Long Short-Term Memory (LSTM) network, a recurrent neural network (RNN) architecture designed for time series data, as the learning model. The goal is an efficient network intrusion detection model based on dynamic machine learning: the model is updated by storing the latest network traffic sequences and learning from them, which is expected to improve the accuracy and efficiency of the IDS while identifying the best feature subset and detection result. The solution will be tested on the CICIDS2017 network traffic dataset, and the results will be benchmarked against other available techniques. The objectives of this work are as follows:
1. Propose an automated system that dynamically improves learning accuracy using time-based data and feature extraction through dimensionality reduction (UMAP).
2. Implement the proposed model using the CICIDS2017 dataset.
3. Compare the performance of the proposed model with other deep learning models that use the same dataset.
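Objective 1 depends on feeding the learning model recent, time-ordered traffic. As a minimal sketch of how flow records might be grouped into fixed-length sequences for a recurrent model, consider the following. The function name make_windows, the window length of 10, the toy data, and the rule of labeling each window by its most recent record are all assumptions made for illustration, not part of the proposed design.

```python
import numpy as np

def make_windows(features, labels, window=10):
    """Turn a time-ordered feature matrix into overlapping sequences.

    features: array of shape (n_records, n_features), ordered by time.
    labels:   array of shape (n_records,), 0 = benign, 1 = attack.
    Returns X of shape (n_windows, window, n_features) and y of shape
    (n_windows,). Each window takes the label of its most recent record
    (an assumption of this sketch).
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    n_records = features.shape[0]
    if n_records < window:
        raise ValueError("need at least `window` records")
    n_windows = n_records - window + 1
    # Window i covers records i .. i + window - 1.
    X = np.stack([features[i:i + window] for i in range(n_windows)])
    y = labels[window - 1:]
    return X, y

# Toy data standing in for reduced flow features (hypothetical values).
rng = np.random.default_rng(0)
toy_features = rng.normal(size=(50, 8))
toy_labels = rng.integers(0, 2, size=50)

X_seq, y_seq = make_windows(toy_features, toy_labels, window=10)
print(X_seq.shape, y_seq.shape)
```

The three-dimensional output (windows, time steps, features) is the layout that sequence models such as LSTMs commonly expect. The appropriate window length for CICIDS2017 would have to be chosen experimentally.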
Problem Statement
As attacks become more sophisticated and evolve rapidly, protection methods have to become more advanced and smarter. As new types of attacks appear, signature-based IDSs perform poorly at attack prevention. Because of this limitation of static approaches, dynamic approaches are essential. We propose to build an efficient dynamic machine learning model for network intrusion detection using deep learning and new techniques for feature selection and dimensionality reduction.
Proposed Methodology
UMAP Dimensional Reduction:
UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. The result is a practical, scalable algorithm that applies to real-world data. The UMAP algorithm is competitive with t-SNE for visualization quality and arguably preserves more of the global structure, with superior run-time performance. Furthermore, UMAP has no computational restrictions on embedding dimension, making it viable as a general-purpose dimension reduction technique for machine learning. Dimension reduction seeks to produce a low-dimensional representation of high-dimensional data that preserves relevant structure (relevance often being application dependent). It is an important problem in data science, both for visualization and as a potential pre-processing step for machine learning [4].
Deep Learning for the Learning Model:
Since 2006, deep structured learning, more commonly called deep learning or hierarchical learning, has emerged as a new area of machine learning research. Over the past several years, techniques developed from deep learning research have been impacting a wide range of signal and information processing work within both the traditional and the new, widened scopes, including key aspects of machine learning and artificial intelligence [5].
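The learning model named in this proposal is an LSTM, a recurrent architecture suited to time series. To make the recurrence concrete, the sketch below runs the forward pass of a single LSTM cell over one traffic window in NumPy. The layer sizes, the random untrained weights, and the gate ordering inside the stacked weight matrices are assumptions for illustration; an actual implementation would use a deep learning framework and trained weights.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(sequence, W, U, b):
    """Run one LSTM cell over a sequence and return the final hidden state.

    sequence: (timesteps, n_features)
    W: (4 * hidden, n_features) input weights
    U: (4 * hidden, hidden)     recurrent weights
    b: (4 * hidden,)            biases
    Gates are stacked as [input, forget, output, candidate] (assumed order).
    """
    hidden = U.shape[1]
    h = np.zeros(hidden)  # hidden state
    c = np.zeros(hidden)  # cell state (long-term memory)
    for x_t in sequence:
        z = W @ x_t + U @ h + b
        i = sigmoid(z[:hidden])                # input gate
        f = sigmoid(z[hidden:2 * hidden])      # forget gate
        o = sigmoid(z[2 * hidden:3 * hidden])  # output gate
        g = np.tanh(z[3 * hidden:])            # candidate cell update
        c = f * c + i * g                      # keep old memory, add new
        h = o * np.tanh(c)                     # expose part of the memory
    return h

# Hypothetical sizes: 20 time steps of 10 reduced features, 16 hidden units.
rng = np.random.default_rng(1)
n_features, hidden, timesteps = 10, 16, 20
W = rng.normal(scale=0.1, size=(4 * hidden, n_features))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
window = rng.normal(size=(timesteps, n_features))

h_final = lstm_forward(window, W, U, b)
print(h_final.shape)
```

The final hidden state summarizes the whole window and could be passed to a classification layer that outputs benign or attack. The forget gate is what lets the cell retain or discard information from earlier traffic in the sequence.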
Research plan
My work plan is organized around the following main steps:
I. Review previous related work to learn from its recommendations and avoid its mistakes.
II. Investigate the capabilities and limitations of dimensionality reduction.
III. Investigate the capabilities and limitations of deep learning models to find a suitable approach for implementation.
IV. Implement the proposed model.
V. Train and test the proposed model, evaluate its performance, and benchmark it against other anomaly-based IDSs.
VI. Start writing the thesis (chapters 1 to 5).
VII. Discovery and finalization
VIII. Review and revise
IX. Final submission
X. Thesis decision
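Step V of the plan above calls for benchmarking against other anomaly-based IDSs. One possible way to compute comparison metrics is sketched below. The choice of metrics (accuracy, precision, recall, F1, and false positive rate), the function names, and the toy labels are assumptions for illustration, not the evaluation protocol of this thesis.

```python
def confusion_counts(y_true, y_pred):
    """Count outcomes for binary labels: 1 = attack, 0 = benign."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # attack correctly flagged
        elif truth == 0 and pred == 1:
            fp += 1  # benign traffic flagged as attack
        elif truth == 0 and pred == 0:
            tn += 1  # benign traffic correctly passed
        else:
            fn += 1  # attack missed
    return tp, fp, tn, fn

def detection_metrics(y_true, y_pred):
    """Return common IDS metrics; undefined ratios fall back to 0.0."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_positive_rate": false_positive_rate,
    }

# Toy labels (hypothetical): 4 attack records and 4 benign records.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 0, 1, 0, 1, 0]
metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

Intrusion datasets are usually imbalanced toward benign traffic, so recall and false positive rate are generally more informative than accuracy alone when comparing models.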
Time plan
References
[1] Li, B. and Sim, K. C. (2014). Modeling long temporal contexts for robust DNN-based speech recognition. In INTERSPEECH, pp. 353–357.
[2] Kolosnjaji, B., Zarras, A., Webster, G. and Eckert, C. (2019). Deep Learning for Classification of Malware System Call Sequences.
[3] McInnes, L. and Healy, J. (2018). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv preprint arXiv:1802.03426.
[4] McInnes, L. and Healy, J. (2018). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. ArXiv e-prints, February 2018.
[5] Deng, L. and Yu, D. (2014). Deep Learning: Methods and Applications. NOW Publishers