













Authors: Peter Mell and Vincent Hu (National Institute of Standards and Technology, ITL); Richard Lippmann, Josh Haines, and Marc Zissman (Massachusetts Institute of Technology Lincoln Laboratory)
Abstract

While intrusion detection systems are becoming ubiquitous defenses in today's networks, currently we have no comprehensive and scientifically rigorous methodology to test the effectiveness of these systems. This paper explores the types of performance measurements that are desired and that have been used in the past. We review many past evaluations that have been designed to assess these metrics. We also discuss the hurdles that have blocked successful measurements in this area and present suggestions for research directed toward improving our measurement capabilities.
Keywords: false positive, vulnerability, attack, ROC, stealthy, traffic
Within the last four years, the use of commercial intrusion detection system (IDS) technology has grown considerably, and IDSs are now standard equipment for large networks. According to International Data Corp., revenue from IDSs and related services is projected to rise to $443.5 million in 2002 from an estimated $350 million in IDS revenue realized in 2001 [1]. Despite this enormous investment in IDS technology, no comprehensive and scientifically rigorous methodology is available today to test the effectiveness of these systems. Some might blame consumers for not demanding such effectiveness measures before purchasing IDSs. Some might also blame research-funding entities for not investing more money in this area, although DARPA, which does fund research in measurement methodologies, is an exception. However, quantitative IDS performance measurements are not available because there are research hurdles that must be overcome before such tests can be created. This paper outlines the quantitative measurements that we need, the obstacles that are impeding progress toward developing these measurements, and our ideas for research in IDS performance measurement methodology to overcome those obstacles.
This work was sponsored by the Defense Advanced Research Projects Agency under Air Force Contract F19628-00-C0002. The identification of any commercial product or trade name does not imply endorsement or recommendation by the National Institute of Standards and Technology.
There are many potential customers for the results of quantitative evaluations of IDS accuracy. Acquisition managers need such information to improve the process of system selection, which is too often based only on the claims of the vendors and limited-scope reviews in trade magazines. Security analysts who review the output of IDSs would like to know the likelihood that alerts will result when particular kinds of attacks are initiated. Finally, R&D program managers need to understand the strengths and weaknesses of currently available systems so that they can effectively focus research efforts on improving systems, and measure their progress.
In this section we list a partial set of measurements that can be made on IDSs. We focus specifically upon those measurements that are quantitative and that relate to detection accuracy.
3.1 Coverage

This measurement determines which attacks an IDS can detect under ideal conditions. For signature-based systems, this would simply consist of counting the number of signatures and mapping them to a standard naming scheme. For non-signature-based systems, one would need to determine which attacks, out of the set of all known attacks, could be detected by a particular methodology. The number of dimensions that make up each attack makes this measurement difficult. Each attack has a particular goal (e.g., denial of service, penetration, or scanning), works against particular software running on a particular version of an operating system or against a particular protocol, and leaves evidence or a trace in different locations. Attacks may also depend upon the hardware of the system that is attacked, on the version of a protocol used, or on the mode of operation used (e.g., stealthiness techniques). Researchers emphasize a variety of different features in their measurement studies, including the attack goal; the victim type; the data that must be collected to obtain evidence of the attack; whether the attack uses stealthy IDS evasion techniques; and combinations of these features. The result is that some researchers define attacks with coarse granularity (and acknowledge that each attack has multiple targets and modes of operation), while others define attacks at the finest level of granularity (where each attack has a very specific target configuration and mode of operation). This disparity concerning the proper level of granularity for viewing attacks makes it difficult to count the number of attacks that an IDS detects and to compare the coverage of multiple IDSs. The problem is somewhat alleviated by the Common Vulnerabilities and Exposures (CVE) list, a standard enumeration of virtually all known vulnerabilities [2]. However, CVE does not solve the problem when multiple attacks exploit the same vulnerability using different approaches to evade IDSs. To address this issue, the CVE standards group has started a project to name attacks, but this work is still in the research stages.
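To make the signature-counting view of coverage concrete, the following minimal sketch computes the fraction of a reference vulnerability list that an IDS's signature set maps to. The signature-to-CVE mapping and the reference list here are hypothetical placeholders, not data from any real product.

```python
# Sketch: coverage of an IDS signature set measured against a reference CVE list.
# The signature-to-CVE mapping and the reference list are hypothetical.

ids_signatures = {
    "SIG-1001": ["CVE-1999-0153"],                   # each signature maps to zero or more CVE names
    "SIG-1002": ["CVE-2000-0884", "CVE-2001-0333"],
    "SIG-1003": [],                                  # proprietary check with no CVE mapping
}

reference_cves = {"CVE-1999-0153", "CVE-2000-0884", "CVE-2001-0333", "CVE-2001-0500"}

covered = {cve for cves in ids_signatures.values() for cve in cves} & reference_cves
coverage = len(covered) / len(reference_cves)

print(f"Covered {len(covered)} of {len(reference_cves)} reference entries ({coverage:.0%})")
```

As the section notes, this coarse counting says nothing about stealthy variants of the same exploit, which is exactly where a vulnerability list stops being a sufficient naming scheme for attacks.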
3.2 Probability of False Alarms

Figure 1 summarizes the relationship between two of the most important IDS characteristics: the false positive and detection probabilities. It shows a receiver operating characteristic (ROC) curve produced by two systems in an IDS test. The x axis shows the percentage of false alarms produced during a test and the y axis shows the percentage of attacks detected at any given false alarm percentage. Note that an IDS can be operated at any given point on the curve. In this example, system 1 can be tuned to a variety of operating points, while system 2 would be difficult to configure because it has only one realistic operating point. By definition, a receiver operating characteristic curve shows probabilities on the x and y axes, but sometimes the unit of measurement for normal traffic is difficult to define. As a result, researchers have used false alarms per unit time on the x axis. A unit of measurement is required to determine the maximum number of false alarms that could occur over a given time period. This unit of measurement depends critically on how an IDS extracts features from its input data and is difficult to determine for commercial IDSs and other black-box systems. For evaluation purposes, it is probably better to plot detection rate versus false alarms per unit time. Such curves are not truly ROC curves, but they convey information of importance when analyzing and comparing IDSs. In Figure 1, both systems were network-based IDSs, and the unit of measurement was the total number of packets transmitted over the network. The analysis assumed that, at worst, an IDS could issue one alert per packet, so the maximum number of alerts was the total number of packets transmitted.
Figure 1. Receiver operating characteristic (ROC) curves for System 1 and System 2, plotting the percentage of attacks detected (% Detection) against the percentage of false alarms (% False Alarm).
An interesting aspect of ROC curves is that there is an optimal operating point for an IDS, given a particular network being monitored. However, to determine it, one must know the cost of a false alarm, the value of a correct detection, and the prior probabilities of normal and attack traffic.
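A short sketch, under assumed costs and priors, of how such an operating point could be chosen from an empirical ROC curve by minimizing expected cost; every number below is illustrative rather than measured.

```python
# Sketch: choosing an IDS operating point from an empirical ROC curve by
# minimizing expected cost per event. All numbers are illustrative assumptions.

roc_points = [   # (false alarm probability, detection probability), one per threshold
    (0.00, 0.00), (0.01, 0.55), (0.05, 0.80), (0.10, 0.90), (0.50, 0.98), (1.00, 1.00),
]

p_attack = 1e-4        # assumed prior probability that an event belongs to an attack
cost_false_alarm = 1   # assumed analyst cost of handling one false alarm
cost_miss = 5000       # assumed cost of one missed attack

def expected_cost(p_fa, p_d):
    # Cost from false alarms on normal events plus cost from missed attacks.
    return (1 - p_attack) * p_fa * cost_false_alarm + p_attack * (1 - p_d) * cost_miss

best = min(roc_points, key=lambda point: expected_cost(*point))
print("Lowest-cost operating point (P_fa, P_d):", best)
```

Changing the assumed prior or costs moves the chosen point along the curve, which is why a single published detection rate, without the corresponding false alarm rate, says little about how an IDS will behave at a given site.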
3.3 Probability of Detection

This measurement determines the rate of attacks detected correctly by an IDS in a given environment during a particular time frame. The difficulty in measuring the detection rate is that the success of an IDS is largely dependent upon the set of attacks used during the test. Also, the probability of detection varies with the false positive rate, and an IDS can be configured or tuned to favor either the ability to detect attacks or the ability to minimize false positives (see section 3.2 for an explanation of this trade-off). One must be careful to use the same configuration when testing for false positives and for hit rates. Further, an NIDS can be evaded by stealthy versions of attacks: it may detect an attack when it is launched in a simple, straightforward manner, but not when even simple approaches to stealthiness are used. Techniques used to make attacks stealthy include fragmenting packets, using various types of data encoding, using unusual TCP flags, encrypting attack packets, spreading attacks over multiple network sessions, and launching attacks from multiple sources [3, 4].
3.4 Resistance to Attacks Directed at the IDS

This measurement demonstrates how resistant an IDS is to an attacker's attempt to disrupt the correct operation of the IDS itself. Attacks against an IDS may take several forms.
3.5 Ability to Handle High Bandwidth Traffic

This measurement demonstrates how well an IDS functions when presented with a large volume of traffic. Most network-based IDSs will begin to drop packets as the traffic volume increases, thereby causing the IDS to miss a percentage of the attacks. At a certain threshold, most IDSs will stop detecting any attacks.
load as it does on a quiescent network. For example, Hall [34] has proposed a test methodology and traffic metrics for standardized capacity benchmarking of NIDSs. NIDS customers can then use the standardized capacity test results for each metric, together with a profile of their own networks, to determine whether a NIDS is even capable of sustaining inspection of their traffic.
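As a sketch of how a site might compare published capacity figures against its own traffic profile, the fragment below checks each metric in turn; the metric names and numbers are hypothetical placeholders, not metrics taken from Hall's proposal.

```python
# Sketch: comparing published NIDS capacity figures against a site traffic profile.
# Metric names and values are hypothetical placeholders.

nids_capacity = {"packets_per_sec": 45_000, "mbits_per_sec": 80, "new_tcp_conns_per_sec": 900}
site_profile  = {"packets_per_sec": 7_000,  "mbits_per_sec": 88, "new_tcp_conns_per_sec": 400}

for metric, needed in site_profile.items():
    rated = nids_capacity[metric]
    if needed > rated:
        print(f"{metric}: site needs {needed}, sensor rated for {rated} -- inspection may not keep up")
    else:
        print(f"{metric}: rated capacity {rated} covers the site's {needed}")
```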
3.11 Other Measurements

There are other measurements, such as ease of use, ease of maintenance, deployment issues, resource requirements, and the availability and quality of support. These measurements are not directly related to IDS detection performance but may be more significant in many commercial situations.
IDS testing efforts vary significantly in their depth, scope, methodology, and focus. Table 1 summarizes the characteristics of some of the more developed efforts listed in chronological order. This table demonstrates that evaluations have increased in complexity over time to include more IDSs and more attack types, such as stealthy and denial of service (DoS) attacks. Only research evaluations have included novel attacks designed specifically for the evaluation, evaluated the performance of anomaly detection systems, and generated ROC curves. Evaluations of commercial systems have included measurements of performance under high traffic loads. Traffic loads were generated using real high-volume background traffic mirrored from a live network and also with commercial load-testing tools. The remainder of this section provides details concerning all the evaluations shown in Table 1, as well as for a few other more limited evaluations. Evaluations of research systems are described first, followed by descriptions of commercial system evaluations.
4.1 University of California at Davis (UCD)

Early research efforts at the University of California at Davis led to the first IDS testing platform that automatically launched attacks using interactive telnet, FTP, and rlogin sessions [5]. Scripts of normal and attack sessions were launched during tests that evaluated the ability of an IDS to distinguish intrusions from normal behavior and to operate when the computer running the IDS was under high load. An early network intrusion detection system called the Network Security Monitor (NSM) [6], which uses keyword matching to detect attacks in network traffic, was evaluated. The evaluation used a few attacks, including password guessing, transmitting a password file to a remote host, and exploiting a vulnerability in the load module program to obtain superuser status on a Unix machine. NSM missed some of these attacks until additional keywords were added. Under high CPU loads, it dropped packets and did not successfully detect attacks.
4.2 IBM Zurich

A similar IDS testing platform, not included in Table 1 (detailed data are not available), was developed at the IBM Zurich Research Laboratory to support IDS research [7]. Automated attacks and background traffic were generated as in [5] and used to improve IDSs designed to detect attacks against FTP servers. Background traffic was generated only for FTP servers, and a small number of FTP attacks were scripted. It was noted that initial low detection and high false alarm rates were improved by cyclical, repeatable testing and that generating realistic normal background traffic was complex and time-consuming. A focus of this effort was an attempt to completely automate the tasks required to run an evaluation.
4.3 MIT Lincoln Laboratory (MIT/LL)

MIT/LL has performed the most extensive quantitative IDS testing to date. Sponsored by the Defense Advanced Research Projects Agency (DARPA), MIT/LL conducted large-scale testing of research IDSs in 1998 and 1999 [8, 9, 10]. The 1999 evaluation used a test bed that generated live background traffic similar to that on an Air Force base containing hundreds of users on thousands of hosts. More than 200 instances of 58 attack types (including stealthy and novel attacks) were embedded in seven weeks of training data and two weeks of test data. Unlike the 1998 evaluation, the 1999 evaluation was designed primarily to measure the ability of systems to detect new attacks without first training on instances of these attacks. Automated attacks were launched against three UNIX victim machines (SunOS, Solaris, Linux), Windows NT hosts, and a router in the presence of background traffic. Detection rates for attacks, false alarm rates for background traffic, and ROC curves were generated for more than 18 research IDSs across attack categories including DoS, probe, remote-to-local, and user-to-superuser attacks. Attacks were counted as detected if an IDS produced an alert for the attacked machine that indicated traffic or actions on a victim host generated by the attack. An analysis of attack identification was also performed to determine whether IDSs could provide the correct name for old attacks and attack details including the IP source address, ports used, and the beginning and end of the attack. In addition, detailed analyses of detections and high-scoring false alarms were performed for a smaller number of high-performance systems to determine why specific attacks were missed. Results demonstrated that a combination of network- and host-based approaches would have provided the best attack coverage. Novel attacks were difficult to detect because signatures do not generalize to new attacks and because network protocols or host audit logs were not analyzed sufficiently to extract attack evidence. Systems that detected old attacks could provide the correct names for these attacks and accurate information on the attack source, protocols, and ports used. The MIT/LL evaluations resulted in the development of an intrusion detection corpus that includes weeks of background traffic and host audit logs and hundreds of labeled and documented attacks. This corpus, used extensively by researchers, has been used as part of a data mining competition [11].
capabilities; response capabilities; and remote management system. These analyses revealed weaknesses in the implementation and stability of some of the systems tested. Two of the commercial systems tested detected many more low-level attacks than the three government-developed systems, presumably because they had many more signatures for these attacks. The government-developed systems, however, were better suited to detecting and analyzing high-level attacks performed during interactive sessions. Attackers, given knowledge of the signatures used in the IDSs, were able to create stealthy attacks that were not detected.
4.6 Neohapsis/Network Computing

Since 1999, Network Computing magazine has sponsored evaluations of commercial intrusion detection products by Neohapsis Laboratories [18-20]. The most recent review [20] included 13 commercial IDSs and the open-source Snort IDS [21]. These reviews assess performance under highly realistic traffic loads and have grown in complexity over the years. Qualitative results focus on practical characteristics including ease of use of the management framework, stability, cost effectiveness, signature quality/depth, and ease of customization. Quantitative results include the number of attacks detected [18-20] and sometimes the maximum traffic rates that can be handled before the IDS starts dropping packets, fails to perform signature checking, or fails to reassemble fragmented packets [19]. Table 1 shows characteristics of the most recent evaluation [20]. Realistic background traffic was created by mirroring traffic from DePaul University in Chicago onto an isolated testbed network. This traffic came from a backbone network with roughly 10,000 hosts, traffic rates of 30 to 88 Mbits/sec, and 5,000 to 7,000 packets per second. The evaluation showed that only seven of the IDSs tested could operate at these high traffic loads and that one crashed after only a few minutes. No careful analysis was made of false alarm rates. Nine recent attacks were launched against eight network IDSs: one detected all nine attacks, two detected only two attacks, and the other six detected from five to eight attacks. Each IDS was scored in the areas of management framework, signature quality/depth, stability of engine, cost-effectiveness, and customization. Detailed tables were also provided for each IDS, covering features such as availability of a back-end database API, ability to reassemble fragmented packets and TCP streams, automatic signature update capabilities, availability of CVE cross-references for signatures, and ability to log and display offending packets. A somewhat surprising result is that the open-source Snort IDS was rated third, an outcome better than 7 of the 9 commercial products tested.
4.7 The NSS Group

The NSS Group evaluated IDSs and vulnerability scanners in 2000 and 2001. The report of the 2001 IDS evaluation [22] covers 15 commercial IDS products and the open-source Snort IDS [21]. The bulk of the report presents detailed information for each IDS on its architecture, ease of installation and configuration, and the types of reporting and analysis provided. Twelve IDSs were compared using either 18 or 66 commonly available exploits, including port scans, DoS, distributed DoS, Trojan, Web, FTP, SMTP, POP3, ICMP, and Finger attacks. Attacks were counted as detected only when they were reported “in as straightforward and clear a manner as possible.” Attack detection rates were measured on a 100 Mbits/sec network with no traffic and with small (64 byte), real-world, and large (1514 byte) packets that consumed 0%, 25%, 50%, 75%, and 100% of the network bandwidth. In addition, IDS attack detection for a smaller set of attacks (7 to 8) was measured using the packet manipulation IDS evasion techniques described in [3] and HTTP syntax manipulation provided by “whisker,” a tool designed to hide web attacks. Vulnerability to two tools named “stick” and “snot,” designed to generate packets that elicit false alarms on IDSs, was also determined. Attack detection rates are difficult to compare to other studies because correct detection required correct labeling and not simply detection. Detection rates with no traffic were high, above 80% for all but two IDSs. Detection rates fell off dramatically for some systems under high traffic loads but remained high for others. Most systems were not vulnerable to the IDS evasion techniques tested and did not respond to the tools designed to produce false alarms.
4.8 Network World Fusion

A more limited review of five commercial IDSs was reported by Network World Fusion magazine [23]. This review discussed ease of setup and use as well as features, but it also evaluated detection accuracy using 27 common attacks launched against three victim machines. These included stealthy web attacks generated using the “whisker” attack tool. Systems were tested using no background traffic and with artificially generated loads of 40 Mbits/sec, 60 Mbits/sec (14,200 packets/sec), and 90 Mbits/sec on a 100 Mbits/sec testbed network. Under no load, from roughly 78% to 93% of the 27 attacks were detected. Under the highest load of 90 Mbits/sec, performance degraded substantially: detection accuracy fell to 15% for one system and ranged from 63% to 78% for the other four.
Table 1. Characteristics of selected IDS evaluations, in chronological order.

Evaluation            Number of attacks   Number of victims
MITRE 1997            10                  6
UC Davis 1997         4                   1
MIT/LL 1998           38                  4
MIT/LL 1999           56                  5
AFRL 1998             19                  4
Neohapsis 2001        9                   3
NSS 2001              66                  3
Network World 2001    27                  3

For each evaluation, the table also records whether new/novel attacks were included, whether stealthy attacks were included (seven of the eight evaluations), whether DoS attacks were included (six of the eight), and whether the probability of false alarms and the probability of detection were measured.
5.2 Differing Requirements for Testing Signature-Based vs. Anomaly-Based IDSs

Although most commercial IDSs are signature-based, many research systems are anomaly-based, and it would be ideal if an IDS testing methodology worked for both. This is especially important since we would like to compare the performance of upcoming research systems to existing commercial ones. However, creating a single test to cover both types of systems presents some problems. First, anomaly-based systems require normal traffic for training that does not include attacks. The MIT/LL 1999 evaluation [9, 10] provided this type of training data specifically to train anomaly detection systems, but such data have not been available for other evaluations. Anomaly-based systems may also learn artifacts of the testing methodology and thereby perform well without actually detecting any real attacks. This may happen when all the attacks in a test are launched from a particular user, IP address, subnet, or MAC address. Anomaly systems may also learn subtle characteristics that are hard to predetermine, such as packet window size, ports, typing speed, command set used, TCP flags, or connection durations, which enable them to perform artificially well in the test environment. Conversely, testing signature-based IDSs also presents some problems. Each signature-based IDS detects a particular set of attacks. This means that the performance of a signature-based IDS in a test will, to a large degree, reflect the set of attacks used in the test. This presents a problem since researchers must decide which attacks to include, and this decision may arbitrarily favor one of the tested systems. To a lesser extent, anomaly systems can also have this problem, since they only process certain data streams when looking for attacks and cannot detect attacks in the unexamined data streams.
5.3 Differing Requirements for Testing Network-Based vs. Host-Based IDSs

Testing host-based IDSs presents some difficulties not present when testing network-based IDSs. In particular, network-based IDSs can be tested in an off-line manner by creating a log file containing TCP traffic and then replaying that traffic to the IDSs, as reported in [8-13]. This is convenient because all of the IDSs do not have to be tested at the same time, and repeatability of the test is easy to achieve. Host-based IDSs, in contrast, use a variety of system inputs to determine whether or not a system is under attack, and this set of inputs changes from IDS to IDS. Also, host-based IDSs are designed to monitor a host as a whole rather than a single data feed (as network-based IDSs do). This makes it difficult to replay activity from log files in order to test a host-based IDS. Since it is difficult to test a host-based IDS off-line, researchers must explore more difficult real-time testing, which presents problems of repeatability and consistency between runs.
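A minimal sketch of the off-line replay approach for network-based IDSs, assuming the recorded traffic is available as a pcap file, Scapy is installed, and an interface named eth1 faces the sensor; dedicated tools such as tcpreplay are normally used instead, and faithful timing matters far more than this sketch suggests.

```python
# Sketch: replaying previously recorded traffic toward a network-based IDS so it
# can be tested off-line and repeatably. File name and interface are assumptions.
import time
from scapy.all import rdpcap, sendp

packets = rdpcap("background_plus_attacks.pcap")

previous = None
for pkt in packets:
    if previous is not None:
        # Preserve the original inter-packet gaps so traffic rates stay realistic.
        time.sleep(max(0.0, float(pkt.time) - float(previous.time)))
    sendp(pkt, iface="eth1", verbose=False)
    previous = pkt
```

Because the same capture can be replayed to many sensors at different times, the IDSs under test do not all have to be present simultaneously, which is the repeatability advantage noted above; no comparable trick exists for the heterogeneous inputs of host-based IDSs.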
5.4 Four Approaches to Using Background Traffic in IDS Tests

Most IDS testing approaches can be classified into one of four categories with regard to their use of background traffic: testing using no background traffic/logs, testing using real traffic/logs, testing using sanitized traffic/logs, and testing using simulated traffic/logs. While there may be other valid approaches, most researchers find it necessary to choose among these categories when designing their experiments. Furthermore, it is not yet clear which approach is the most effective for testing IDSs, since each has unique advantages and disadvantages.
Testing using no background traffic/logs

Many evaluations [19, 22, 23] test IDSs using no background traffic as a reference condition. In such experiments, an IDS is set up on a host or network on which there is no activity. Then, computer attacks are launched on this host or network to determine whether or not the IDS can detect them. This technique can determine an IDS hit rate but can say nothing about false positives. The approach is useful for verifying that an IDS has signatures for a set of attacks and that it can properly label each attack. Furthermore, testing schemes using this approach are often much less costly to implement than those that include, sanitize, or create background traffic or logs.
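The bookkeeping for such a reference test is simple; a sketch with hypothetical attack names and alert labels follows.

```python
# Sketch: hit rate from a no-background-traffic test. Attack names and the
# set of alert labels reported by the IDS are hypothetical.

attacks_launched = {"ftp-bounce", "smurf", "portsweep", "phf", "back"}
alerts_raised    = {"ftp-bounce", "portsweep", "phf"}

detected = attacks_launched & alerts_raised
hit_rate = len(detected) / len(attacks_launched)
print(f"Detected {len(detected)}/{len(attacks_launched)} attacks ({hit_rate:.0%}); "
      "false positives cannot be measured in this configuration")
```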
However, one drawback is that tests using this scheme rest on the implicit assumption that an IDS's ability to detect an attack is the same regardless of background activity. At low levels of background activity it is not known whether this assumption is true (although we believe that it is). At high levels of background activity we know that IDS performance often degrades (e.g., see [19, 22, 23]), and thus the assumption may no longer hold.
Testing using real traffic/logs

Some researchers have tested IDSs by injecting attacks into a stream of real background activity [20]. This is a very effective approach for determining the hit rate of an IDS given a particular level of background activity. Hit rate tests using this technique may be well received because the background activity is real and contains all of the anomalies and subtleties of actual traffic. Furthermore, this technique enables the comparison of IDS hit rates at differing levels of activity.
However, there are some drawbacks to using this technique.
Having unidentified attacks in the background activity would pose a problem for any false positive testing. Further, sanitizing data may remove information needed to detect attacks.
Testing by generating traffic on a testbed network

The most common approach to testing IDSs is to create a testbed network, with hosts and network infrastructure that can be successfully attacked, and to generate background traffic on this network [5, 7, 8-10, 12, 16, 19, 22, 23]. The testbed network includes victims of interest, with background traffic generated by complex traffic generators that model actual network traffic statistics [8-10]. Simpler commercial traffic generators can also be used to create a small number of packet types at a high rate [19, 22, 23]. Network traffic and host audit logs can be recorded in such a testbed for later playback [8-10, 12], or evaluations can be performed in real time [16, 22, 23]. An advantage of this approach is that the data can be distributed freely, since it does not contain any private or sensitive information. Another advantage is that the background activity is guaranteed not to contain any unknown attacks, since it was created by the simulator. Lastly, IDS tests using simulated traffic are usually repeatable, since one can either replay previously generated background activity or have the simulator regenerate the same background activity used in a previous test. This kind of test would appear to be one of the best, since one can measure both hit rates and false positive rates and use them to create ROC curves.
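The sketch below shows the flavor of a very simple background-traffic generator for such a testbed; the server address, request mix, and Poisson-like arrival rate are assumptions, and real evaluations rely on much richer traffic models [8-10] or commercial load tools.

```python
# Sketch: a toy background-traffic generator aimed at a testbed web server.
# Host address, pages, rate, and duration are all assumed values.
import http.client
import random
import time

TESTBED_SERVER = "192.168.1.10"      # hypothetical victim web server on the testbed
PAGES = ["/", "/index.html", "/cgi-bin/search?q=test", "/images/logo.gif"]
MEAN_REQUESTS_PER_SEC = 5.0

def generate(duration_sec=60):
    end = time.time() + duration_sec
    while time.time() < end:
        conn = http.client.HTTPConnection(TESTBED_SERVER, timeout=2)
        try:
            conn.request("GET", random.choice(PAGES))
            conn.getresponse().read()
        except (OSError, http.client.HTTPException):
            pass                      # keep generating even if the server is briefly unavailable
        finally:
            conn.close()
        time.sleep(random.expovariate(MEAN_REQUESTS_PER_SEC))   # Poisson-like arrivals

if __name__ == "__main__":
    generate()
```

Because every request is produced by the generator, the resulting traffic is guaranteed attack-free and can be regenerated for repeatable runs, which is precisely the property argued for above.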
However, this approach also presents several difficulties.
This section presents a variety of research recommendations in the area of IDS testing. We first present recommendations for improving data sets and then present recommendations for enhancing metrics.
6.1 Shared Datasets

There is a great need for IDS testing datasets that can be shared openly among multiple organizations. Very few such datasets contain even semi-realistic data or have the attacks within the background traffic labeled. The existing datasets [12, 29] have been used frequently by researchers. Without shareable datasets, IDS researchers must either expend enormous resources creating proprietary datasets or use fairly simplistic data for their testing.
6.2 Attack Traces

In section 5 we pointed out that it is difficult and expensive to collect a large set of attack scripts for the purposes of IDS testing. A possible alternative is to use attack "traces" instead of real attacks. Attack traces are the log files produced when an attack is launched, specifying exactly what happened during the attack. Such traces usually consist of files containing network packets or system logs that correspond to an instance of an attack. We need a better understanding of the advantages and disadvantages of replaying such traces as part of an IDS test. In addition, there is a great need to provide the security community with a large set of attack traces. Such information could easily be added to, and would greatly augment, existing vulnerability databases such as [25] or [26]. The resulting vulnerability/attack-trace databases would aid IDS testing researchers and would also provide valuable data for IDS developers.
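As an illustration of what an attack-trace database entry might record alongside an existing vulnerability identifier, here is a minimal sketch; the field names and example values are assumptions, not a proposed standard.

```python
# Sketch: a possible record format for an attack-trace database entry tied to a
# known vulnerability. Field names and example values are hypothetical.
from dataclasses import dataclass, field

@dataclass
class AttackTrace:
    cve_id: str                                        # vulnerability exploited
    tool: str                                          # attack script or tool that produced the trace
    stealth_options: list = field(default_factory=list)
    pcap_file: str = ""                                # captured network packets, if any
    host_logs: list = field(default_factory=list)      # audit or system logs, if any

trace = AttackTrace(cve_id="CVE-1999-0153", tool="winnuke",
                    stealth_options=["fragmented packets"], pcap_file="winnuke_frag.pcap")
print(trace)
```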
6.3 Cleansing Real Data

Real data generally cannot be distributed due to privacy and sensitivity issues. Research into methods that remove the confidential data within background traffic while preserving the essential features of the traffic could enable the use of such data in IDS tests. Such an advance would alleviate the need for researchers to spend additional effort creating expensive simulated environments.
Another problem with real background data is that it may contain attacks about which we know nothing. It is possible, however, that such attacks could be removed automatically. One idea is to collect a trace of events in the real world and use a simulation system to produce data similar to those in the collected trace. The simulator would use its traffic generation routines to approximate the real-world trace. After the simulated activity is created and stored, the original collected traces could be discarded. The result is a system that can model real-world environments, with the model limited only by the fidelity of the simulator. Since the test data are created by the simulation system from only a model of the system being simulated, traffic in the trace that does not conform to the specified model (and that the simulator thus cannot reproduce), such as attacks, would be filtered out by default. In addition, there should no longer be any concern about privacy issues, since the traces that drive the simulator would contain only traffic summaries and not proprietary traffic content.
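A toy sketch of this model-then-regenerate idea: the real trace is reduced to coarse per-service summaries, discarded, and synthetic activity is regenerated from the summaries alone, so original content (including any embedded attacks the model does not describe) never reaches the final dataset. The flow-record fields and values are hypothetical.

```python
# Sketch: reduce a real trace to a statistical model, discard the trace, and
# regenerate synthetic activity from the model alone. Fields are hypothetical.
import random
from collections import Counter

real_flows = [                       # stand-in for flow summaries parsed from a real trace
    {"service": "http", "bytes": 5400}, {"service": "http", "bytes": 1200},
    {"service": "smtp", "bytes": 800},  {"service": "dns",  "bytes": 120},
]

# Step 1: keep only coarse statistics (service mix and mean flow size).
counts = Counter(f["service"] for f in real_flows)
model = {svc: {"share": n / len(real_flows),
               "mean_bytes": sum(f["bytes"] for f in real_flows if f["service"] == svc) / n}
         for svc, n in counts.items()}
del real_flows                       # the original trace can now be discarded

# Step 2: regenerate synthetic flows from the model alone.
services = list(model)
weights = [model[s]["share"] for s in services]
synthetic = []
for _ in range(10):
    svc = random.choices(services, weights=weights)[0]
    size = max(1, int(random.gauss(model[svc]["mean_bytes"], 200)))
    synthetic.append({"service": svc, "bytes": size})
print(synthetic)
```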
6.4 Sensor and Detector Alert Datasets

Some intrusion correlation systems do not use a raw data stream (like network or audit data) as input, but instead rely upon alerts and aggregated information reports from IDSs and other sensors [27, 28]. We need to develop systems that can generate realistic alert log files for testing correlation systems. One solution is to deploy real sensors and to "sanitize" the resulting alert stream by replacing IP addresses. Sanitization in general is difficult for network activity traces, but it is relatively easy in this special case, since alert streams use well-defined formats and generally contain little sensitive data (the main exception being IP addresses).
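A minimal sketch of that special case, assuming a simple line-oriented alert format: every IP address is replaced consistently, so correlation structure across alerts is preserved while the real addresses are not.

```python
# Sketch: sanitizing an IDS alert stream by remapping IP addresses consistently.
# The alert line format is hypothetical; real sensors define their own formats.
import re

ip_pattern = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
ip_map = {}                                    # real address -> anonymized address

def anonymize(match):
    real = match.group(0)
    if real not in ip_map:
        ip_map[real] = f"10.0.{len(ip_map) // 256}.{len(ip_map) % 256}"
    return ip_map[real]

alerts = [
    "2002-01-30 12:01:07 portsweep src=192.0.2.44 dst=203.0.113.9",
    "2002-01-30 12:01:09 portsweep src=192.0.2.44 dst=203.0.113.17",
]
for line in alerts:
    print(ip_pattern.sub(anonymize, line))
```

Because both alerts share a source address, both sanitized lines map it to the same anonymized address, which is what a correlation system under test needs to see.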
In this paper, we have described some of the previous efforts to measure IDS accuracy, and we have outlined some of the difficulties that have been encountered. We believe that a periodic, comprehensive evaluation of IDSs could be valuable for acquisition managers, security analysts and R&D program managers. However, because both normal and attack traffic are so variable from site to site, and because normal and attack traffic evolve over time, these evaluations will likely be complex and expensive to conduct. To enable evaluations to be made more efficiently, we recommend that the community find ways to create, label, share and update relevant data sets containing normal and attack activity.
References

[1] Howell D., "Hackers Often Choose Their Corporate Targets," Investor's Business Daily, January 30, 2002.
[2] Mann D. E. and Christey S. M., Towards a Common Enumeration of Vulnerabilities, 2nd Workshop on Research with Security Vulnerability Databases, Purdue University, West Lafayette, Indiana, January 21-22, 1999, http://www.cve.mitre.org/docs/cerias.html.
[3] Ptacek T.H. and Newsham T.N., Insertion, Evasion, and Denial of Service: Eluding Network Intrusion Detection , 1998, Secure Networks Inc., http://secinf.net/info/ids/idspaper/idspaper.html.
[4] http://www.networkmagazine.com/article/NMG20000517S
[5] Puketza N., Chung M., Olsson R. O., and Mukherjee B., A Software Platform for Testing Intrusion Detection Systems, IEEE Software, 1997, 14(5), 43-51, http://seclab.cs.ucdavis.edu/papers/pdfs/np-mc-97.pdf.
[6] Heberlein T.L., Dias G. V., Levitt K. N., Mukherjee B., Wood J., and Wolber D., A Network Security Monitor , in Proceedings 1990 IEEE Symposium on Research in Security and Privacy. 1990, IEEE Computer Society Press: Los Alamitos, CA. p. 296-304, http://seclab.cs.ucdavis.edu/papers/pdfs/th-gd-90.pdf.
[7] Debar H., Dacier M., Wespi A., and Lampart S., A workbench for intrusion detection systems , March 1998, IBM Zurich Research Laboratory, Ruschlikon, Switzerland.
[8] Lippmann R.P., Fried D.J., Graf I., Haines J.W., Kendall K.R., McClung D., Weber D., Webster S.E., Wyschogrod D., Cunningham R.K., and Zissman M.A., Evaluating Intrusion Detection Systems: The 1998 DARPA Off-Line Intrusion Detection Evaluation, in Proceedings of the 2000 DARPA Information Survivability Conference and Exposition (DISCEX), Vol. 2, 12-26, 2000, IEEE Computer Society Press: Los Alamitos, CA, http://www.ll.mit.edu/IST/pubs/discex2000-rpl-paper.pdf.
[9] Lippmann R.P., Haines J.W., Fried D.J., Korba J., and Das K., The 1999 DARPA Off-Line Intrusion Detection Evaluation, Computer Networks, 2000, 34(2), 579-595, http://www.ll.mit.edu/IST/ideval/pubs/2000/1999Eval-ComputerNetworks2000.pdf.
[10] Lippmann R.P. and Haines J., Analysis and Results of the 1999 DARPA Off-Line Intrusion Detection Evaluation, in Recent Advances in Intrusion Detection, Third International Workshop, RAID 2000, Toulouse, France, October 2-4, 2000, Proceedings, H. Debar, L. Me, and S.F. Wu, Editors, 2000, Springer Verlag, 162-182, http://link.springer.de/link/service/series/0558/bibs/1907/19070162.htm.
[11] Elkan C., Results of the KDD'99 Classifier Learning Contest, September 1999, sponsored by the International Conference on Knowledge Discovery in Databases, http://www-cse.ucsd.edu/users/elkan/clresults.html.
[12] MIT Lincoln Laboratory, DARPA Intrusion Detection Evaluation Data Sets, Jan. 2002, http://www.ll.mit.edu/IST/ideval/data/data_index.html
[13] Song D., Shaffer G., and Undy M., Nidsbench - A network intrusion detection test suite, Recent Advances in Intrusion Detection, Second International Workshop, RAID 1999, West Lafayette, Indiana, 1999, http://citeseer.nj.nec.com/cache/papers/cs/19418/http:zSzzSzwww.monkey.orgzSz~dugsongzSztalkszSznidsbench-slides.pdf/song99nidsbench.pdf.
[14] Haines J.A., Rossey L.M., Lippmann R.P., and Cunningham R.K., Extending the DARPA Off-Line Intrusion Detection Evaluations, in DARPA Information Survivability Conference and Exposition (DISCEX) II, 2001, Anaheim, CA: IEEE Computer Society.
[15] McHugh J., Testing intrusion detection systems: A critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory, ACM Transactions on Information and System Security, 2000, 3(4), 262-294, http://www1.acm.org/pubs/citations/journals/tissec/2000-3-4/p262-mchugh/.
[16] Durst R., Champion T., Witten B., Miller E., and Spagnuolo L., Testing and Evaluating Computer Intrusion Detection Systems. Communications of the ACM, 1999. 42, (7), 53-61, http://www1.acm.org/pubs/articles/journals/cacm/1999-42-7/p53-durst/p53-durst.pdf.
[17] Aguirre S.J. and Hill W.H., Intrusion Detection Fly-Off: Implications for the United States Navy , September 1997, MITRE Technical Report MTR 97W096, McLean, Virginia.
[18] Shipley G., ISS RealSecure Pushes Past Newer IDS Players. Network Computing, 17 May 1999, http://www.networkcomputing.com/1010/1010r1.html.
[19] Shipley G., Intrusion Detection, Take Two. Network Computing, 15 November 1999, http://www.nwc.com/1023/1023f1.html.
[20] Mueller P. and Shipley G., Dragon claws its way to the top. Network Computing, 20 August 2001, 45-67, http://www.networkcomputing.com/1217/1217f2.html.
[21] Roesch M., Snort - Lightweight Intrusion Detection for Networks , in USENIX 13th Systems Administration Conference - LISA '99. 1999: Seattle, Washington, http://www.usenix.org/publications/library/proceedings/lisa99/roesch.html.
[22] The NSS Group, Intrusion Detection Systems Group Test (Edition 2), December 2001, Oakwood House, Wennington, Cambridgeshire, England, http://www.nss.co.uk/ids/.
[23] Yocom B. and Brown K., Intrusion battleground evolves, Network World Fusion, October 8 2001, 53-62, http://www.nwfusion.com/reviews/2001/1008bg.html.
[24] National Laboratory for Applied Network Research, NLANR network traffic packet header traces, 2002, http://pma.nlanr.net/Traces/.