Network intrusion detection is a well-known, relevant problem that has gained even more interest due to the exponential growth of technologies, systems, and data volume. Constantly evolving attacks require a continuous effort towards the development of novel and robust detection solutions. In this paper we propose the Heterogeneous Ensemble and Active Learning (HEAL) system, a novel tool that implements a dynamic heterogeneous ensemble model with active learning capabilities. This yields a solution that: i) adapts to changes in the data over time, ii) remains robust while providing good performance, iii) handles a continuous flow of data, and iv) requires less human intervention than pure active learning solutions. The HEAL system uses multiple individual base models to build a heterogeneous ensemble learner that adapts to the specific characteristics of the data. Active learning is then applied to the ensemble so that it is retrained and re-evaluated over time as new instances arrive. Instances on which the model has low confidence are labeled by a domain expert. A new model is trained with these instances and its performance is evaluated. The deployed model is replaced when the new model exhibits performance advantages. Finally, an experimental comparison of the performance at different stages is carried out in a case study using the well-known NSL-KDD data set. In this study we show the advantages of using the HEAL system.
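As an illustration only, the query, retrain, and replace loop described above can be sketched in plain Python. The three toy base learners, the vote-agreement confidence measure, the `threshold` parameter, the synthetic two-feature data, and the `oracle` function standing in for the domain expert are all assumptions made for this sketch; they are not the base models, confidence measure, or data (NSL-KDD) used by HEAL.

```python
import random
from collections import Counter

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

class NearestCentroid:
    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self
    def predict(self, x):
        return min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))

class KNearest:
    def __init__(self, k=3):
        self.k = k
    def fit(self, X, y):
        self.rows = list(zip(X, y))
        return self
    def predict(self, x):
        nearest = sorted(self.rows, key=lambda r: sq_dist(x, r[0]))[: self.k]
        return Counter(t for _, t in nearest).most_common(1)[0][0]

class Stump:
    """One-feature threshold rule; assumes labels 0/1 are both present."""
    def fit(self, X, y):
        best = None
        for j in range(len(X[0])):
            v0 = [x[j] for x, t in zip(X, y) if t == 0]
            v1 = [x[j] for x, t in zip(X, y) if t == 1]
            m0, m1 = sum(v0) / len(v0), sum(v1) / len(v1)
            thr, hi = (m0 + m1) / 2, (1 if m1 > m0 else 0)
            hits = sum((hi if x[j] > thr else 1 - hi) == t for x, t in zip(X, y))
            if best is None or hits > best[0]:
                best = (hits, j, thr, hi)
        _, self.j, self.thr, self.hi = best
        return self
    def predict(self, x):
        return self.hi if x[self.j] > self.thr else 1 - self.hi

class Ensemble:
    """Heterogeneous ensemble: majority vote over different model types."""
    def __init__(self, X, y):
        self.models = [NearestCentroid().fit(X, y), KNearest(3).fit(X, y), Stump().fit(X, y)]
    def predict_with_confidence(self, x):
        votes = Counter(m.predict(x) for m in self.models)
        label, count = votes.most_common(1)[0]
        return label, count / len(self.models)  # confidence = share of agreeing models

def accuracy(model, X, y):
    return sum(model.predict_with_confidence(x)[0] == t for x, t in zip(X, y)) / len(y)

def active_learning_round(deployed, X_lab, y_lab, X_new, oracle, X_val, y_val, threshold=1.0):
    # Query the expert (oracle) only for instances the deployed model is unsure about.
    queried = [x for x in X_new if deployed.predict_with_confidence(x)[1] < threshold]
    X_lab = X_lab + queried
    y_lab = y_lab + [oracle(x) for x in queried]
    candidate = Ensemble(X_lab, y_lab)
    # Replace the deployed model only if the candidate scores strictly better.
    if accuracy(candidate, X_val, y_val) > accuracy(deployed, X_val, y_val):
        deployed = candidate
    return deployed, X_lab, y_lab, len(queried)

# Synthetic stand-in data (hypothetical): label 1 ("attack") when x0 + x1 > 2.
random.seed(0)
def oracle(x):
    return int(x[0] + x[1] > 2)
def sample(n):
    return [[random.uniform(-1, 3), random.uniform(-1, 3)] for _ in range(n)]

X_lab = [[0.0, 0.0], [0.5, 1.0], [1.0, 0.2], [2.0, 2.0], [1.5, 2.5], [3.0, 0.5]]
y_lab = [oracle(x) for x in X_lab]
X_val = sample(200)
y_val = [oracle(x) for x in X_val]

deployed = Ensemble(X_lab, y_lab)
updated, X_lab2, y_lab2, n_queried = active_learning_round(
    deployed, X_lab, y_lab, sample(200), oracle, X_val, y_val)
print(n_queried, accuracy(deployed, X_val, y_val), accuracy(updated, X_val, y_val))
```

In this sketch the expert is consulted only when the base models disagree, which is one simple way to reduce labeling effort compared with labeling every incoming instance. The deployed model is never replaced by a candidate that scores worse on the validation set.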
Article ID: 2021S03
Publisher: Canadian Artificial Intelligence Association