Higher Level DoS

SSL/TLS handshake
RSA Encrypt -> RSA Decrypt

DoS Mitigation
Client puzzle: slow down attacker
Moderately hard problem: given challenge c find x such that
-LSBn(SHA-1 (c || x)) = 0^n
hardness of challenge:n
-decided based on DoS attack volume
Limitations:
-requires changes to both clients and servers
-Hurts low power legitimate clients during attack
CPU power ratio
-high end server / low end cell phone = 8000
-> impossible to scale to hard puzzles
Interesting observation
– Main memory access time ratio
– high end server / low end cell phone = 2
Solution requires many main memory accesses
– dwork-goldberg-naor, crypto
– abadi-burrows-manasse-wobber, acm toit

Traceback
-given set of attack packets
-determine path to source
assumption
-most routers remain uncompromised
-attacker sends many packets

TCP

IP header format
-connectionless
-unrerliable
-best effort

version, header length, type of service, total length, identification, flags, fragment offset, time to live, protocol, header checksum, source address of originating host, destination address of target host, options, padding, ip data

TCP
-session based, congestion control, in order delivery
source port, dest port, seq number, ack number, urg, ack, psh, psr, syn, fin, other stuff

TCP handhake
syn: SNc <- randc, ANc <- 0 SYN packets with random source IP addresses Fills up backlog queue on server No further connections possible A classic SYN flood example MS Blaster worm(2003) - SYN flood on port 80 to windowsupdate.com -50 SYN packets every second, each packet is 40 bytes -spoofed source IP:a.b.X.Y where X,Y random Low rate SYN flood defenses Non-solution -increase backlog queue size or decrease timeout Correct solution - sycookies: remove state from server Massive flood
command bot army to flood specific target
20,000 bots can generate 2Gb/sec of SYNs
at web site:
saturates network uplink or network router
random source IP -> attack SYNs look the same as real SYNs

Idea: only forward established TCP connections to site

Stronger attacks: TCP connection flood
Command bot army
-complete TCP connection to web site
-send short HTTP head request
-repeat

will bypass SYN flood protection proxy but
attacker can no longer use random source IPs
reveals location of bot zombies
proxy can now block or rate-limit bots

Javascript-based DDoS:
github.com <- honest end user <- inject imageFlood.js <- popular server imageFlood.js

Function imgflood(){
	var TARGET = ‘victim-website.com/index.php?’
	var rand = Math.floor(Math.random() * 1000)
	var pic = new Image()
	Pic.src = ‘http://’+TARGET+rand+’=val’
}
setInterval(imgflood,10)

DOS Taxonomy

subnet spoofing: Generate random addresses within a given address space
random spoofing: Generate 32-bit numbers and stamp packets with them.
fixed spoofing: The spoofed address is the address of the target.

server application: the attack is targeted to a specific application on a server
network access: the attack is used to overload or crash the communication mechanism of a network
infrastructure: the motivation of this attack is a crucial service of a global internet operation, for example core router

application
-> small number of packets -> big effect
application attacks:
DoS bug: Design flaw allowing one Machine to disrupt a service
Dos flood: Command botnet to generate flood of requests
sample Dos at different layers: link, TCP/UDP, application

First Set Algorithm

for each nonterminal X do First(X) :={}
while there are changes to any First(X) do
	for each production rule X ->X1X2...Xn do
		k:=1;
		while k<=n do
			First(X) = First(X) U (First(Xk)-{ε})
			if ε is not in First(Xk) then break;
			k := k+1;
		if (k>n) then First(X) = First(X) U {ε} 

follow sets

stmt -> if-stmt|other
if-stmt -> if(exp)stmt else-part
else-part -> else stmt | ε
exp -> 0 | 1

Regular Expressions

Regular Expressions are used in grep, sed, awk, perl, vi, shells, lex, yacc
– Each may use slightly different convention

State Machines

Parser Classification
LR: Bottom Up Parser
-L: scan left to right
-R: traces rightmost derivation of input string

LL: Top Down Parser
-L: Scan left to right
-L: traces leftmost derivation of input string

Typical notation
– LL(0), LL(1), LR(1), LR(k)
– number(k) refers to maximum look ahead

Recursive Descent Parsing
Non-terminal variable
convert
correct select rule for expansion matchToken
consumes token or Error

EBNF to BNF
 ::=  '*'  | 
EBNF
 ::=  {'*'  }
BNF
 ::= 
 ::= '*'   | ε

RegEx: Finite Automata

Deterministic Finite Automata (DFA)

Lexical Analysis(Scanner)
Read input: one character at a time
Group characters into tokens
Remove white spaces and comments
Encode token types and form tuples and return to parser

type value
FloatConst 123.45
VAR String=DaysOfWeek
Operator Value =+

Regular expression r (a pattern of characters) and its language L(r)
-> used in many Unix programs (e.g., grep, vi., etc.)
State machine (finite automata)

Representations of String
Basics of Regular Expression
Σ= AaBbCcDdEeFf…. {}|[]”;’<>?,./

Symbols & alphabet
-> A symbol is a valid character in a language
-> alphabet is set of legal symbols

Metacharacters/metasymbols that have special meanings:
Defining reg-ex operations(e.g. |,(,), *, + etc.)
Escape character(\) to turn off special meanings
Empty string e and empty set

a | b* = {a, ε, b, bb, bbb, …}
(a | b)* = {ε, a, b, ab, ba, aa, bb, aaa, aab, …} = Any number (including zero) of a’s and b’s in any order

unix-style regular expression
a-b same as abcd
^1-3 denotes characters other than 1, 2, or 3
-, [], +, ?, ^ and $
[a-zA-Z_][a-zA-Z_0-9]*
[0-9][0-9]*
(+|-)?[0-9][0-9]*

semantic symbols

-> #put-type;#do-decl
-> int | float
-> #add-decl
-> #add-decl
-> ID#proc-decl
#put-type puts given type on semantic stack
#proc-decl builds decl record for var on stack
#add-decl builds decl-chain
p#do-decl traverses chain on semantic stack using backwards pointers entering each var into symbol table

float a, b, c

translation: generate temporary values, propagate them to keep semantic context

Compiler Structure
scanner -> parser -> semantic analysis -> intermdeiate representation -> code generation -> assembly code

semantic actions

Symbol Table

int a, b;
declares a and b, within current scope, type integer

Basic symbol table
name, type, scope
a, int, “main”

semantic action
1. enter variable declaration into symbol table
2. look up variables in symbol table
3. do binding of looked-up variables (scoping rules, etc.)
4. Do type checking for compatibility
5. keep the semantic context of processing
e.g. a + b + c -> t1 = a + b
t2 = t1 + c

How Compilers Work

Start
-parse
-get next token
-semantic analysis
-look up variable name
-do semantic checks -> generate code(passed) or semantic error(failed)
-put in symbol table

Front End: Analysis
1. Lexical Analysis
2. Syntax Analysis
3. Semantic Analysis

Back End: Systhesis
4. Code Generation
5. Optimization

Lexical rules of language dictate how legal word is formed by concatenating alphabet.

If x==1 then if y==2 print 1 else print 2
stmt -> if expr then stmt else stmt

why compilers?

High-level Language

C/Java Compiler

temp = v[k];
v[k] = v[k+1];
v[k+1] = temp;

Fortran Compiler

TEMP = V(K)
V(K) = V(K+1)
V(K+1) = TEMP

Assembly Language

lw $to, 0($2)
lw $t1, 4($2)
sw $t1, 0($2)
sw $to, 4($2)
#include

int main()
{
	char A = '2';
	int sum = A + 10;
	std::cout<<"sum ="<
		Posted on Categories Compiler