SAR / RAR

Intrusion Detection System (IDS): Anomaly Detection Using Outlier Detection Approach by J. Jabez and B. Muthukumar from Procedia Computer Science is available under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license. Copyright © 2015 The Authors.

Available online at www.sciencedirect.com

1877-0509 © 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Peer-review under responsibility of scientific committee of International Conference on Computer, Communication and Convergence (ICCC 2015)

doi:10.1016/j.procs.2015.04.191

1. Introduction

With the heavy use of the Internet in day-to-day life, network security has become the key foundation for all web applications, such as online auctions and online retail sales. Intrusion detection attempts to detect attacks on computers by examining the various data records observed in network processes [2] [9]. It can be considered one of the most significant ways of dealing effectively with network security problems.

An intrusion over the Internet can compromise data security through several channels. The rapid proliferation of networks, rising data transfer rates, and unpredictable Internet usage have added further anomaly problems. Researchers therefore need to develop more reliable, effective, and self-monitoring systems that can resolve problems and operate without human interaction. Such efforts can reduce catastrophic failures of susceptible systems.

Detection stability and detection precision are two key indicators used to evaluate an Intrusion Detection System (IDS) [26]. Many IDS research studies have sought to improve detection stability and detection precision [22]. Early research focused on statistical approaches and rule-based expert systems [17]. However, the results of statistical approaches and rule-based expert systems were not accurate on larger datasets. To overcome this problem, many data mining techniques were developed [7].

Several machine-learning paradigms, including Linear Genetic Programming (LGP) [19], neural networks [18], Bayesian networks, Support Vector Machines (SVM), Fuzzy Inference Systems (FISs) [25], and Multivariate Adaptive Regression Splines (MARS) [20], have been investigated for the design of Intrusion Detection Systems (IDS). One of the most common of these is the Neural Network (NN), which is used to solve many complex practical problems and has been applied successfully to intrusion detection [9]. Nevertheless, Neural Network-based IDSs have two major drawbacks:

1. Lower detection precision, particularly for low-frequency attacks, e.g., U2R (User to Root) and R2L (Remote to Local).

2. Weaker detection stability [4].

To solve these two problems, this work proposes a novel outlier-computation-based approach to IDS, the Outlier Detection Approach, which aims to enhance both detection precision for low-frequency attacks and detection stability. The proposed approach has two stages: training with normal big datasets and testing with intrusion datasets. In the first stage, a set of various big datasets is used to train the IDS in a distributed storage environment; training on normal big datasets improves the performance of the IDS. In the second stage, an error value is computed for an intrusion dataset against the trained big datasets. If the number of errors exceeds the specified threshold, the tested dataset is considered an anomaly dataset.
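The two-stage idea described above (train on normal data, then count errors in a test dataset against a threshold) can be illustrated with a minimal Python sketch. This section does not define the error measure, so the sketch assumes one: the largest per-feature z-score of a record relative to the normal training data. The function names and the parameters `error_tolerance` and `count_threshold` are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def fit_normal_profile(normal_records):
    """Stage 1: summarize normal traffic as per-feature mean and spread."""
    columns = list(zip(*normal_records))
    means = [mean(col) for col in columns]
    # Replace a zero spread with 1.0 so the division below is always defined.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def record_error(record, profile):
    """Assumed error measure: largest absolute z-score across features."""
    means, stds = profile
    return max(abs(x - m) / s for x, m, s in zip(record, means, stds))


def is_anomalous_dataset(test_records, profile,
                         error_tolerance=3.0, count_threshold=1):
    """Stage 2: flag the dataset when the error count exceeds the threshold."""
    errors = sum(1 for r in test_records
                 if record_error(r, profile) > error_tolerance)
    return errors > count_threshold


# Toy data: two numeric features per record (purely illustrative values).
normal = [(10, 1), (12, 1), (11, 2), (9, 2), (10, 1)]
profile = fit_normal_profile(normal)
benign_flag = is_anomalous_dataset([(10, 1), (11, 2)], profile)
attack_flag = is_anomalous_dataset([(50, 1), (60, 9), (10, 1)], profile)
```

With these toy values, the benign dataset produces no high-error records, while two of the three records in the second dataset lie far from the normal profile, so only the second dataset is flagged.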

The rest of the paper is organized as follows. Section 2 reviews the existing work. Section 3 details the concept and classification of conventional intrusion detection system components and the proposed approach. Section 4 presents the proposed approach, the experimental results, and their analysis. Finally, Section 5 concludes the work and outlines future directions.

2. Literature Survey

This section reviews the work of researchers in the area of network-based intrusion detection systems; most of this detection work is based on the KDD dataset. Rule-based expert systems and statistical approaches are the two most commonly used approaches to intrusion detection. A rule-based expert system detects known intrusions at a high rate but does not identify new intrusions, so its rule database must be updated continuously. Statistical approaches to intrusion detection include methods such as cluster analysis, multivariate analysis, Bayesian analysis, and principal component analysis. Many new data mining techniques have been proposed to overcome the problems of these approaches. Many results have been reported on the KDD Cup 99 dataset, and they are briefly discussed here.

Anderson [25] suggested an intrusion detection method to detect intrusions efficiently. Denning [3] developed an intrusion detection mechanism using time series, Markov chains, and statistics; Denning treated changes in a user's normal behavior as anomalous. Stanford Research Centre developed an expert system for intrusion detection that monitors and detects user events. The centre also developed a next-generation mechanism that maintains audit profiles of users and monitors each user's current status; if a user's activity changes relative to the user's audit profile, it generates an alarm. Haystack [22] later provided a framework for intrusion detection based on user and anomaly strategies. It detected six types of intrusion: masquerade attacks, malicious use, leakage, denial of service, attempted break-ins by unauthorized users, and access control of the security system. Sourcefire developed Snort, an open-source network-based intrusion detection and prevention system. In 1996, Forrest [10] drew on the analogy between intrusion detection and the protective mechanisms of the human body, creating a normal profile by analyzing call sequences. An attack in this system is a deviation of the observed sequence from the normal profile sequence. The system works offline on previously collected information and uses a view table algorithm to learn program profiles.
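The notion of an attack as a deviation from a normal profile of call sequences can be sketched as follows. This is a simplified illustration and not the algorithm of [10]: it assumes that the normal profile is the set of all fixed-length windows seen in normal traces, and that deviation is measured as the fraction of windows in a new trace that are absent from that set. The window length and the function names are hypothetical.

```python
def build_profile(normal_traces, window=3):
    """Collect every length-`window` subsequence seen in normal traces."""
    profile = set()
    for trace in normal_traces:
        for i in range(len(trace) - window + 1):
            profile.add(tuple(trace[i:i + window]))
    return profile


def mismatch_rate(trace, profile, window=3):
    """Fraction of windows in `trace` that never occurred in normal traces."""
    windows = [tuple(trace[i:i + window])
               for i in range(len(trace) - window + 1)]
    if not windows:
        return 0.0  # trace shorter than one window: nothing to compare
    misses = sum(1 for w in windows if w not in profile)
    return misses / len(windows)


# Toy call traces (the call names are illustrative only).
normal_traces = [
    ["open", "read", "write", "close"],
    ["open", "read", "read", "close"],
]
profile = build_profile(normal_traces)
seen_rate = mismatch_rate(["open", "read", "write", "close"], profile)
unseen_rate = mismatch_rate(["open", "exec", "write", "close"], profile)
```

A trace whose windows all appear in the profile gets a mismatch rate of 0, while the second trace, which contains a call never seen in normal behavior, mismatches on every window.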

Duan et al. [8] concentrated on identifying compromised machines that have been recruited as spam zombies. Their approach, SPOT, sequentially scans outgoing messages using the Sequential Probability Ratio Test (SPRT), which quickly determines whether a host is compromised. BotHunter [13] identifies compromised machines by following the malware infection process: it uses a large number of steps to correlate intrusion detection alarms triggered by inbound traffic with the resulting outgoing message exchange patterns. BotSniffer [14] detects zombies by exploiting a characteristic of compromised machines, namely their uniform spatial-temporal behavior. It identifies zombies by grouping flows according to server connections and searching for flows with similar behavior.
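To make the sequential test concrete, the following is a minimal sketch of Wald's SPRT applied to a stream of outgoing messages, each already classified as spam or not spam. It is not the SPOT implementation: the probabilities `theta0` and `theta1` (the assumed chance that a message from a normal or a compromised host is spam) and the target error rates `alpha` and `beta` are illustrative assumptions, as is the function name.

```python
import math


def sprt(observations, theta0=0.2, theta1=0.9, alpha=0.01, beta=0.01):
    """Wald's SPRT on Bernoulli observations (True means the message is spam).

    H0: host is normal (spam probability theta0).
    H1: host is compromised (spam probability theta1, with theta1 > theta0).
    alpha and beta are the desired false positive and false negative rates.
    Returns the decision and the number of messages examined.
    """
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this bound
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this bound
    llr = 0.0                              # running log-likelihood ratio
    for n, is_spam in enumerate(observations, start=1):
        if is_spam:
            llr += math.log(theta1 / theta0)
        else:
            llr += math.log((1 - theta1) / (1 - theta0))
        if llr >= upper:
            return "compromised", n
        if llr <= lower:
            return "normal", n
    return "undecided", len(observations)


spam_only = sprt([True] * 10)
ham_only = sprt([False] * 10)
mixed = sprt([True, False])
```

Under these assumed parameters the test stops after only a few messages in the clear-cut cases, which is the property that makes a sequential test attractive for quick per-host decisions; a short mixed stream leaves the decision open until more messages arrive.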

Kumar and Goyal [12] applied genetic algorithms to dataset training in order to classify connections labeled as smurf attacks, achieving a low false positive ratio of 0.2%. Abdullah and co-workers [1] elaborated intrusion detection classification rules using genetic algorithms. Ojugo et al. [21] also studied intrusion detection rules derived with genetic algorithms; their method uses a fitness function to evaluate the rules.
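The cited works do not specify their fitness functions in this section, so the following is only a hypothetical illustration of how a genetic algorithm might score a candidate detection rule. It assumes a rule is a set of required field values and that fitness rewards the fraction of attack records matched while penalizing the fraction of normal records matched. The field names, labels, and weights are assumptions made for the example.

```python
def rule_matches(rule, record):
    """A rule matches when every field it constrains has the required value."""
    return all(record.get(field) == value for field, value in rule.items())


def fitness(rule, records, w_detect=1.0, w_false=1.0):
    """Assumed fitness: weighted detection rate minus weighted false alarm rate."""
    attacks = [r for r in records if r["label"] == "attack"]
    normals = [r for r in records if r["label"] == "normal"]
    tp = sum(rule_matches(rule, r) for r in attacks)   # attacks caught
    fp = sum(rule_matches(rule, r) for r in normals)   # normals wrongly matched
    detect = tp / len(attacks) if attacks else 0.0
    false_alarm = fp / len(normals) if normals else 0.0
    return w_detect * detect - w_false * false_alarm


# Toy labeled records (illustrative values only).
records = [
    {"protocol": "icmp", "service": "ecr_i", "label": "attack"},
    {"protocol": "icmp", "service": "ecr_i", "label": "attack"},
    {"protocol": "tcp", "service": "http", "label": "normal"},
    {"protocol": "icmp", "service": "ecr_i", "label": "normal"},
]
good_rule_score = fitness({"protocol": "icmp", "service": "ecr_i"}, records)
bad_rule_score = fitness({"protocol": "tcp"}, records)
```

In a genetic algorithm, scores of this kind would guide selection: rules with higher fitness are more likely to be kept and recombined in the next generation.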

Machine learning techniques have also been applied to intrusion detection. Roshani and colleagues [23] described existing machine learning techniques (Artificial Neural Networks, ANN) for intrusion detection.

Gaikwad et al. [11] introduced a technique based on fuzzy clustering and an ANN approach. This method can be applied to overcome the issues of weak detection stability and low detection precision. The restore point in this method was employed for registry keys, system file rollback, the project database, and installed programs. Fuzzy clustering generates different training subsets in order to reduce subset size and complexity. Each subset is then trained with a different type of artificial neural network, and the outputs are finally processed to obtain the results. Jaiganesh et al. [15] suggested a novel back-propagation model for intrusion detection. In this method, training pairs, each combining an input with its corresponding target, are generated and fed into the network. Performance is measured by false alarm rate and detection rate. The detection rate proved to be less than 80% for U2R, R2L, DoS, and Probe attacks. Moreover, the method was found to be inefficient at detecting hidden attackers present in the system. Devikrishna et al. [5] used an MLP (Multi-Layer Perceptron) architecture for intrusion detection that detects attacks and classifies them into six types. The MLP method was considered a failure model because of its irrelevant output. The present paper tries to overcome this problem and to establish a better detection technique.
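The two performance measures mentioned above are conventionally computed from a confusion matrix: detection rate is the share of actual attacks that are flagged, and false alarm rate is the share of normal records that are wrongly flagged. The sketch below uses these standard definitions; the label strings and the function name are assumptions made for illustration.

```python
def detection_metrics(true_labels, predicted_labels, positive="attack"):
    """Return (detection_rate, false_alarm_rate) for binary IDS output.

    detection_rate   = TP / (TP + FN)
    false_alarm_rate = FP / (FP + TN)
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    detection_rate = tp / (tp + fn) if tp + fn else 0.0
    false_alarm_rate = fp / (fp + tn) if fp + tn else 0.0
    return detection_rate, false_alarm_rate


# Toy labels: 4 attacks (3 caught) and 4 normal records (1 false alarm).
truth = ["attack"] * 4 + ["normal"] * 4
preds = ["attack", "attack", "attack", "normal",
         "normal", "normal", "normal", "attack"]
rates = detection_metrics(truth, preds)
```

On this toy input the detection rate is 0.75 and the false alarm rate is 0.25. Reporting the two together matters because a detector can trivially raise its detection rate by flagging everything, at the cost of a high false alarm rate.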

Lin Gu et al. [16] presented an empirical study of the unstable, growing demand for big data processing, which places a heavy burden on storage, data center communication, and computation and thus incurs substantial operational expenditure for data center providers. Unlike traditional cloud services, an important characteristic of big data is the tight coupling of computation and data: computation tasks can be performed only with the relevant data. However, none of these researchers has so far clearly conveyed the means of improving the IDS. The main aim of this paper is therefore to present a clear picture of an IDS built on the distributed big data concept.

Issues of existing techniques

Many issues are stated in the existing literature survey, such as additional training time, accurate identification of low-frequency attacks, and attack classification. To address the issue of additional training time, a new high-speed algorithm for intrusion detection systems must be developed and its results compared with those of existing techniques. In contrast to existing approaches, which show some inefficiency in intrusion detection, the main aim of this research work is to propose a new high-speed algorithm that reduces training time. The results obtained are also discussed alongside those of the existing methods.