Machine Learning Intruder Detection Approaches

Neural networks: Simulate human brain operation with neurons and the synapses between them

Clustering and outlier detection: Group the observed data into clusters, then identify subsequent data as either belonging to a cluster or as an outlier.
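A minimal sketch of the outlier step, assuming the clusters have already been formed. The sample points, the two features (logins per hour, MB transferred), and the radius threshold are hypothetical; a real system would derive the clusters with an algorithm such as k-means.

```python
import math

def centroid(points):
    """Average each coordinate over all points in one cluster."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def classify(point, centroids, radius):
    """Return the index of the nearest cluster, or None if the point is an outlier."""
    distances = [math.dist(point, c) for c in centroids]
    nearest = min(range(len(centroids)), key=distances.__getitem__)
    return nearest if distances[nearest] <= radius else None

# Hypothetical training data, already grouped into two clusters.
clusters = [
    [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2)],
    [(8.0, 9.0), (8.2, 9.1), (7.8, 8.9)],
]
centroids = [centroid(c) for c in clusters]

near_first = classify((1.1, 2.1), centroids, radius=1.0)
far_from_all = classify((4.5, 5.0), centroids, radius=1.0)
```

Here a point is treated as an outlier when it lies farther than the chosen radius from every centroid; the choice of radius directly trades missed detections against false alarms.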

Limitations of Anomaly Detection
They are generally trained on legitimate data
This limits the effectiveness of some of the techniques discussed.

Relatively high false positive rate: anomalies can just be new normal activities

Detect intrusion by:
– observing events in the system
– applying a set of patterns or rules to the data
– determining if the activity is intrusive or normal

Signature Approaches
– match a large collection of known patterns of malicious data against data stored on system or in transit over a network
– the signatures need to be large enough to minimize the false alarm rate, while still detecting a sufficiently large fraction of malicious data
– Widely used in anti-virus products, network traffic scanning proxies, and in NIDS
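The matching step described above can be sketched as a substring search over a byte stream. The signature names and byte patterns below are made up for illustration and are not taken from any real product's signature set.

```python
# Hypothetical signature database: name -> byte pattern.
SIGNATURES = {
    "example-shell-dropper": b"/bin/sh -c wget",
    "example-sql-injection": b"' OR '1'='1",
}

def match_signatures(data: bytes, signatures=SIGNATURES):
    """Return the sorted names of all signatures whose pattern occurs in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)

request = b"GET /item?id=1' OR '1'='1 HTTP/1.1"
hits = match_signatures(request)
```

Data that matches no stored pattern returns an empty list, which is why this approach cannot flag an attack it has no signature for.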

Signature Approach
-Advantages:
low cost in time and resource use
Wide acceptance
-Disadvantages:
significant effort to identify and review new malware to create signatures
inability to detect zero-day attacks

Rule-Based Detection
-involves the use of rules for identifying known penetrations or penetrations that would exploit known weaknesses
-Rules can also be defined that identify suspicious behavior
-Typically rules used are specific
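As a rough illustration, specific rules can be written as named predicates over an event record. The event fields (`type`, `count`, `path`, `user`) and both rules are hypothetical examples, not rules from any particular IDS.

```python
# Each rule is a (name, predicate) pair; the predicate inspects one event dict.
RULES = [
    ("repeated failed logins",
     lambda e: e.get("type") == "login_failed" and e.get("count", 0) >= 5),
    ("password file read by non-root",
     lambda e: e.get("type") == "file_read"
     and e.get("path") == "/etc/shadow"
     and e.get("user") != "root"),
]

def triggered_rules(event, rules=RULES):
    """Return the names of all rules that the event satisfies."""
    return [name for name, predicate in rules if predicate(event)]

alerts = triggered_rules({"type": "login_failed", "count": 7, "user": "alice"})
```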

A variety of classification approaches

Statistical: Analysis of the observed behavior using univariate, multivariate, or time-series models of observed metrics.
Knowledge based: Approaches use an expert system that classifies observed behavior according to a set of rules that model legitimate behavior.
Machine learning: Approaches automatically determine a suitable classification model from the training data using data mining techniques.

Issues Affecting Performance:
Efficiency, cost of detection

Statistical Approaches
characteristics:
– use captured sensor data
– multivariate models using the time and order of events

Advantages:
– their relative simplicity
– low computation cost
– lack of assumptions about expected behavior

Disadvantages:
– difficulty selecting suitable metrics
– not all behaviors can be modeled using these approaches.
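A minimal univariate sketch of the statistical approach: flag an observation when it lies more than a chosen number of standard deviations from the mean of past observations. The metric (logins per hour), the sample history, and the threshold of 3.0 are assumptions for illustration.

```python
import statistics

def is_anomalous(history, observed, threshold=3.0):
    """Flag observed if its z-score against the history exceeds the threshold."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)  # sample standard deviation
    if stdev == 0:
        # No variation in the history: any different value is a deviation.
        return observed != mean
    return abs(observed - mean) / stdev > threshold

# Hypothetical history of one metric collected from sensor data.
logins_per_hour = [4, 5, 6, 5, 4, 6, 5, 5]

typical = is_anomalous(logins_per_hour, 6)
burst = is_anomalous(logins_per_hour, 30)
```

The sketch also shows the metric-selection problem noted above: it only detects intrusions that happen to move the one metric being tracked.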

Knowledge base approaches
– developed during training to characterize data into distinct classes

advantages:
– robust
– flexible

disadvantages:
– the difficulty and time required to develop knowledge from the data
– human experts must assist with the process

Machine learning approaches
– use data mining techniques to develop a model that can classify data as normal or anomalous

Advantages:
– flexibility
– adaptability
– ability to capture inter-dependencies between observed metrics

disadvantages:
– dependency on assumptions about accepted behavior
– high false alarm rate
– high resource cost
– significant time and computational resources

Bayesian networks: encode probabilistic relationship among observed metrics
Markov models: Develop a model with sets of states
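A small sketch of the Markov-model idea, assuming states are event names and the model is a table of transition probabilities estimated from legitimate sessions. The event names and training sessions are hypothetical.

```python
from collections import Counter, defaultdict

def train(sequences):
    """Estimate P(next state | current state) from observed state sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for state, c in counts.items()
    }

def sequence_probability(model, seq):
    """Multiply transition probabilities; unseen transitions contribute 0."""
    prob = 1.0
    for current, nxt in zip(seq, seq[1:]):
        prob *= model.get(current, {}).get(nxt, 0.0)
    return prob

normal_sessions = [
    ["login", "read", "read", "logout"],
    ["login", "read", "write", "logout"],
]
model = train(normal_sessions)

seen_pattern = sequence_probability(model, ["login", "read", "logout"])
unseen_pattern = sequence_probability(model, ["login", "write", "read"])
```

A sequence whose probability falls below some cutoff would be treated as anomalous; the cutoff itself is a tuning choice not covered here.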

Elements of intrusion detection

components of intrusion detection systems:
From an algorithmic perspective
-Features – capture intrusion evidences
-Models – piece evidences together

From a system architecture perspective:
Audit data processor, knowledge base, decision engine, alarm generation and responses

Data preprocessor
Detection Engine <- Detection Models
Decision Engine <- Decision Table

Modeling and analysis
- misuse detection (a.k.a. signature-based)
- anomaly detection

Deployment
- host-based
- network-based

Development and maintenance
- hand-coding of "expert knowledge"
- learning based on data

Analysis Approaches
- anomaly detection
- misuse / signature detection

Anomaly Detection:
involves the collection of data relating to the behavior of legitimate users over a period of time
current observed behavior is analyzed to determine whether this behavior is that of a legitimate user or that of an intruder

Misuse / Signature Detection
uses a set of known malicious data patterns or attack rules that are compared with current behavior
also known as misuse detection
Can only identify known attacks for which it has patterns or rules

Defense-in-Depth

Prevent -> Detect -> Survive

Intrusion Examples
– remote root compromise, running packet sniffer, web server defacement, distributing pirated software, guessing/cracking password, using an unsecured modem to access internal network, copying databases containing credit card numbers, impersonating an executive to get information, viewing sensitive data without authorization, using an unattended workstation

Designed to Counter Threats:
known, less sophisticated attacks
sophisticated targeted attacks
new, zero-day exploits

Defense-in-depth strategies include:
encryption
detailed audit trails
strong authentication and authorization controls
active management of operating systems
application security

Intruder behavior
primary assumptions:
system activities are observable
Normal and intrusive activities have distinct evidence

Personal Firewalls

-Can be housed in a router that connects all of the home computers to a DSL, cable modem, or other Internet interface
– Typically much less complex than server-based or standalone firewalls
– Primary role is to deny unauthorized remote access
– May also monitor outgoing traffic to detect and block worms and malware activity

Stealth Mode hides the system from the internet by dropping unsolicited communication packets
UDP packets can be blocked
Logging for checking on unwanted activity
Applications must have authorization to provide services

Deploying firewalls
– Internal DMZ network
– Internal protected network

Add more stringent filtering capability
Provide two-way protection with respect to the DMZ
Multiple firewalls can be used to protect portions of the internal network from each other

An important aspect of distributed firewall configuration: security monitoring

Host-resident firewall, screening router, single bastion inline, single bastion T, double bastion inline, double bastion T, distributed firewall configuration

Bastion Host

Serves as a platform for an application-level gateway
System identified as a critical strong point in the network’s security

common characteristics
– runs secure o/s, only essential services
– may require user authentication to access proxy or host
– each proxy can restrict features, hosts accessed
– each proxy is small, simple, checked for security
– limited disk use, hence read-only code
– each proxy runs as a non-privileged user in a private and secured directory on the bastion host

Host Based Firewalls
– used to secure an individual host
– available in operating systems or can be provided as an add-on package
– Filter and restrict packet flows
– Common location is a server

Advantages:
filtering rules can be tailored to the host environment
protection is provided independent of topology
provides an additional layer of protection

Personal Firewalls
– controls traffic between a personal computer or workstation and the internet or enterprise network
– for both home or corporate use
– typically is a software module on a personal computer

Packet Filtering Firewall Countermeasures

IP Address spoofing Countermeasure: Discard packets with an inside source address if the packet arrives on an external interface.
Source routing attacks countermeasure: Discard all packets in which the source specifies the route.
Tiny Fragment Attack Countermeasure: Enforcing a rule that the first fragment of a packet must contain a predefined minimum amount of the transport header.
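The first two countermeasures can be sketched as a single check. The internal address range (10.0.0.0/8), the interface labels, and the `has_source_route_option` flag are assumptions for illustration; a real filter would read these from the packet header and the firewall configuration.

```python
import ipaddress

# Assumed internal address range for this sketch.
INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")

def should_discard(src_ip, arrival_interface, has_source_route_option=False):
    """Apply the spoofing and source-routing countermeasures to one packet."""
    # Source routing: discard any packet in which the source specifies the route.
    if has_source_route_option:
        return True
    # Spoofing: an inside source address must never arrive on the external interface.
    return (
        arrival_interface == "external"
        and ipaddress.ip_address(src_ip) in INTERNAL_NET
    )

spoofed = should_discard("10.1.2.3", "external")
legitimate = should_discard("203.0.113.7", "external")
```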

Tightens rules for TCP traffic by creating a directory of TCP connections
– there is an entry for each currently established connection
– Packet filter will allow incoming traffic to high-numbered ports only for those packets that fit the profile of one of the entries in this directory
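A minimal sketch of such a connection directory, assuming entries are recorded when an internal host opens an outbound TCP connection. The class and method names are hypothetical, and connection teardown and timeouts are reduced to a single `close` call.

```python
from typing import NamedTuple

class Connection(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

class ConnectionTable:
    """Directory of currently established outbound TCP connections."""

    def __init__(self):
        self._established = set()

    def record_outbound(self, conn):
        self._established.add(conn)

    def close(self, conn):
        self._established.discard(conn)

    def allow_inbound(self, src_ip, src_port, dst_ip, dst_port):
        # An inbound packet fits an entry when it is the reverse direction
        # of a connection that an internal host established.
        return Connection(dst_ip, dst_port, src_ip, src_port) in self._established

table = ConnectionTable()
conn = Connection("192.168.1.10", 50000, "203.0.113.5", 80)
table.record_outbound(conn)

reply_allowed = table.allow_inbound("203.0.113.5", 80, "192.168.1.10", 50000)
stranger_allowed = table.allow_inbound("203.0.113.9", 80, "192.168.1.10", 50000)
```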

Reviews packet information but also records information about TCP connections
– Keep track of TCP sequence numbers to prevent attacks that depend on the sequence number
– Inspects data for protocols like FTP, IM, and SIPS commands

Application-Level Gateway
Also called an application proxy
Acts as a relay of application-level traffic (basically a man or system in the middle)

User -> Gateway -> RemoteHost

Must have proxy code for each application
– may restrict application features supported
– tend to be more secure than packet filters

Disadvantage
– Additional processing overhead on each connection

Packet Filtering

Filtering rules are based on information contained in a network packet:
– source IP address
– Destination IP address
– Source and destination transport-level address:
– IP protocol field
– Interface

Two default policies:
-Discard – prohibit unless expressly permitted
more conservative, controlled, visible to users
-Forward – permit unless expressly prohibited
easier to manage and use but less secure
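The rule fields and the two default policies can be sketched as a first-match rule list with a configurable fallback. The `Rule` class, the packet dictionary layout, and the sample addresses are hypothetical; the interface field is omitted to keep the sketch short.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    action: str                      # "forward" or "discard"
    src: str = "0.0.0.0/0"           # source network, default matches any IPv4
    dst: str = "0.0.0.0/0"           # destination network
    protocol: Optional[str] = None   # None matches any protocol
    dst_port: Optional[int] = None   # None matches any port

    def matches(self, pkt):
        return (
            ipaddress.ip_address(pkt["src"]) in ipaddress.ip_network(self.src)
            and ipaddress.ip_address(pkt["dst"]) in ipaddress.ip_network(self.dst)
            and self.protocol in (None, pkt["protocol"])
            and self.dst_port in (None, pkt["dst_port"])
        )

def filter_packet(pkt, rules, default="discard"):
    """Return the action of the first matching rule, else the default policy."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action
    return default

# Permit inbound SMTP to one mail host; everything else falls to the default.
rules = [Rule("forward", dst="192.0.2.10/32", protocol="tcp", dst_port=25)]
smtp = {"src": "198.51.100.4", "dst": "192.0.2.10", "protocol": "tcp", "dst_port": 25}
telnet = {"src": "198.51.100.4", "dst": "192.0.2.10", "protocol": "tcp", "dst_port": 23}

smtp_action = filter_packet(smtp, rules)
telnet_action = filter_packet(telnet, rules)
```

Passing `default="forward"` switches to the permissive policy, under which the same unmatched packet would be let through.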

If dynamic protocols are in use, entire ranges of ports must be allowed for the protocol to work.
Ports > 1024 left open

Packet filtering advantages
– simplicity
– Typically transparent to users and are very fast

Cannot prevent attacks that employ application specific vulnerabilities or functions
limited logging functionality
vulnerable to attacks and exploits that take advantage of TCP/IP
Packet filter firewalls are susceptible to security breaches caused by improper configurations

Firewalls

Firewall Design Goals
– Enforcement of security policies
All traffic from internal network to the Internet, and vice versa, must pass through firewall
Only traffic authorized by policy is allowed to pass
Dependable
The firewall itself is immune to subversion

Lists the types of traffic authorized to pass through the firewall
includes: address ranges, protocols, applications and content types

Developed from the organization’s information security risk assessment and policy, and a broad specification of which traffic types the organization needs to support
– Refined to detail the filter elements that can be implemented within an appropriate firewall topology

Firewalls cannot protect against:
traffic that does not cross them
– routing around
– internal traffic
their own misconfiguration

Gives insight into traffic mix via logging
Network address translation
Encryption

Firewalls and Filtering
-packets checked then passed
-inbound & outbound affect when policy is checked

Filtering Types
-Packet filtering
access control list
-Session filtering
dynamic packet filtering
stateful inspection
context based access control

Decision made on a per-packet basis
No state information saved

Applies rules to each incoming and outgoing IP packet
typically a list of rules based on matches in the IP or TCP header
Forwards or discards the packet based on rules match

Botnet C&C design

How can bots contact their master safely?
Simple, naive approach:
victims contact single IP, website, ping a server, etc.
Easily defeated (ISP intervention, blackhole routing, etc.)
still used by script-kiddies, first-time malware authors

Efficient and reliable
– able to reach a sizable set of bots within a time limit
– hard to detect (i.e., blended with normal/regular traffic)
– Hard to disable or block

Advanced Persistent Threat (APT)
-Advanced:
malware, special operation and operators
-Persistent:
Long-term presence, multi-step, “low-and-slow”
-Threat:
Targeted at high-value organization and information

APT characteristics
– Zero-day exploit or a specially crafted malware
– No readily available signature for its detection

Social-engineering to trick even the most sophisticated users
– First compromise core internal network control elements such as routers and web servers to learn about the valuable targets
– Then play man-in-the-middle on the compromised routers/servers to make social-engineering attacks very convincing, even forging answers to challenges or inquiries from suspicious users