R(s) -> +100, -100, -3
E[Σ_{t=0}^∞ γ^t R_t] -> max
value iteration
V(a3, E) = 0.8×100 -3 = 77
V(s) <- [max_a γ Σ_{s'} P(s'|s,a) V(s')] + R(s)
back-up theorem: converges
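A minimal sketch of one back-up step, reproducing the 77 computed above. It assumes γ = 1, a goal cell worth +100, and all other values still at 0. The cell names, the 0.1 slip probabilities, and the second action are hypothetical and only there to make the example self-contained.

```python
def backup(reward, gamma, action_outcomes, values):
    """Return R(s) + gamma * max_a sum_s' P(s'|s,a) * V(s')."""
    expected = [
        sum(prob * values[s_next] for s_next, prob in outcomes)
        for outcomes in action_outcomes.values()
    ]
    return reward + gamma * max(expected)


# Current value estimates: the goal is worth +100, everything else starts at 0.
values = {"goal": 100.0, "a3": 0.0, "b3": 0.0}

# Assumed outcome model for cell a3; only the 0.8 comes from the notes.
actions_in_a3 = {
    "E": [("goal", 0.8), ("a3", 0.1), ("b3", 0.1)],
    "W": [("a3", 0.9), ("b3", 0.1)],
}

v_a3 = backup(reward=-3.0, gamma=1.0, action_outcomes=actions_in_a3, values=values)
print(v_a3)
```

Value iteration repeats this back-up over all states until the values stop changing.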
Adapt to circumstances. ABCD: Always Be Coding and … : good
Planning under uncertainty and learning
MDP, POMDPs
deterministic, stochastic
fully observable: A*, breadth first, depth first, MDP
partially observable, POMDP
Markov decision process (MDP)
state, actions, state transition,
T(s,a,s’)
reward function R(s)
MDP gridworld
policy π(s)->A
tree too deep
stole
branching factor large
many states visited more than once
Stochastic
Multi agent
Partial observability [A, S, F, B]
– unknown
– hierarchical
[s, r, s][s, while a:r, s]
[a, s, f]
result(result(a, a->s), s->f) ∈ goals
s' = result(s, a)
b' = update(predict(b, a), o)
classical planning
state space: k booleans (2^k states)
world state: complete assignment
belief state: complete assignment, partial assignment, arbitrary formula
Action(fly(p, x, y))
precond: plane(p) ^ airport(x) ^ airport(y) ^ at(p, x)
effect: ¬at(p, x) ^ at(p, y)
at(D, sfo) at(c, sfo) load(c, d1, sfo)
Regression vs Progression
Action(buy(b), precond: ISBN(b), effect: own(b))
goal(own(0136042597))
situation calculus
actions: objects fly(p, x, y)
situations: objects
successor-state axioms at(p, x, s)
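A rough sketch of progression (forward application) for the fly action schema, with states as sets of ground facts. The object names p1 and jfk and the helper apply_action are hypothetical, and the precondition/add/delete split is one common way to encode such a schema, not necessarily the lecture's.

```python
def apply_action(state, precond, add, delete):
    """Progression: return the successor state, or None if the precondition fails."""
    if not precond <= state:
        return None
    return (state - delete) | add


# World state: a complete assignment, listed as the facts that are true.
state = frozenset({"plane(p1)", "airport(sfo)", "airport(jfk)", "at(p1, sfo)"})

# Ground instance fly(p1, sfo, jfk) of Action(fly(p, x, y)).
fly = dict(
    precond=frozenset({"plane(p1)", "airport(sfo)", "airport(jfk)", "at(p1, sfo)"}),
    add=frozenset({"at(p1, jfk)"}),      # effect: at(p, y)
    delete=frozenset({"at(p1, sfo)"}),   # effect: not at(p, x)
)

next_state = apply_action(state, **fly)
print(sorted(next_state))
```

Applying the same ground action a second time returns None, because at(p1, sfo) no longer holds.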
propositional logic
(E V B) => A
A => (J ^ M)
J <=> M
J <=> ¬M
{B true, E false}
Truth Table
O P O=> P
(E V B) => A
A => (J ^ M)
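A small sketch that enumerates the truth table for the two sentences (E v B) => A and A => (J ^ M), then filters for the partial model {B true, E false}. The helper names are hypothetical.

```python
from itertools import product


def implies(p, q):
    """Material implication: P => Q is false only when P is true and Q is false."""
    return (not p) or q


def kb_holds(e, b, a, j, m):
    """Both sentences from the notes must be true."""
    return implies(e or b, a) and implies(a, j and m)


symbols = ("E", "B", "A", "J", "M")
rows = [dict(zip(symbols, vals)) for vals in product([True, False], repeat=5)]

# Rows of the truth table in which the knowledge base is true.
models = [r for r in rows if kb_holds(r["E"], r["B"], r["A"], r["J"], r["M"])]

# Restrict to the partial model {B true, E false}.
consistent = [r for r in models if r["B"] and not r["E"]]
print(consistent)
```

With B true the first sentence forces A, and A then forces both J and M, so only one row survives.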
first-order logic: relations, objects, functions; T/F/?
propositional logic: facts; T/F/?
probability theory: facts; [0..1]
atomic -> problem solving
factored
structured
{P:T, Q:F}
Syntax
-sentences terms
vowel(A)
above(A, B)
2 = 2
operators ^ v ¬ => <=> ( )
terms
A, B, 2 x, y
number of(A)
quantifiers: ∀x vowel(x) => number of(x) = 1
∃x number of(x) = 2
Dimensionality reduction
local linear embedding
Isomap
cluster by affinity
do EM/k-means succeed in finding the 2 clusters?
Affinity matrix
dimensionality for large environment
supervised vs unsupervised learning
Maximum likelihood
3, 4, 5, 6, 7
m = 5
μ = 5
σ^2 = 2
3, 9, 9, 3
μ = 6
σ^2 = 9
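The two worked examples above can be checked with a short sketch of the maximum likelihood estimates (mean, and variance with divisor m rather than m - 1). The function name is hypothetical.

```python
def ml_gaussian(data):
    """Maximum likelihood mean and variance of a 1-D sample."""
    m = len(data)
    mu = sum(data) / m
    var = sum((x - mu) ** 2 for x in data) / m  # divide by m, not m - 1
    return mu, var


print(ml_gaussian([3, 4, 5, 6, 7]))  # (5.0, 2.0)
print(ml_gaussian([3, 9, 9, 3]))     # (6.0, 9.0)
```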
Gaussians
– functional form
– fit from data
– multivariate gaussians
Expectation maximization
p(x) = Σ_{i=1}^k P(C=i) p(x|C=i)
π_i, μ_i, Σ_i
EM versus k-means
minimize: -Σ_j log p(x_j | σ, Σ, k) + cost · k
guess
run EM
remove
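A minimal 1-D sketch of EM for a mixture of Gaussians, assuming scalar variances in place of the covariance matrices. The data, the starting parameters, and the variance floor of 1e-6 are hypothetical choices for illustration.

```python
import math


def gauss(x, mu, var):
    """1-D Gaussian density."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_step(data, pis, mus, variances):
    k = len(pis)
    # E-step: soft assignment of each point to each component.
    resp = []
    for x in data:
        weights = [pis[i] * gauss(x, mus[i], variances[i]) for i in range(k)]
        total = sum(weights)
        resp.append([w / total for w in weights])
    # M-step: re-estimate priors, means and variances from the soft counts.
    new_pis, new_mus, new_vars = [], [], []
    for i in range(k):
        n_i = sum(r[i] for r in resp)
        mu_i = sum(r[i] * x for r, x in zip(resp, data)) / n_i
        var_i = sum(r[i] * (x - mu_i) ** 2 for r, x in zip(resp, data)) / n_i
        new_pis.append(n_i / len(data))
        new_mus.append(mu_i)
        new_vars.append(max(var_i, 1e-6))  # floor keeps the density finite
    return new_pis, new_mus, new_vars


data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
pis, mus, variances = [0.5, 0.5], [0.0, 6.0], [1.0, 1.0]
for _ in range(20):
    pis, mus, variances = em_step(data, pis, mus, variances)
print(pis, mus, variances)
```

Unlike k-means, every point contributes to every component, weighted by its responsibility.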
clustering
– k-means, em
Unsupervised learning
-find structure
density estimation
-clustering
-dimensionality reduction
blind signal separation
K Means Clustering
– need to know k
– local minimum
– high dimensionality
– lack of mathematical basis
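A short sketch of k-means in one dimension, to make the "need to know k" and "local minimum" points concrete: k is fixed by the number of starting centers, and the result depends on where they start. The data and starting centers are hypothetical.

```python
def kmeans_1d(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and re-averaging."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its old center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


centers, clusters = kmeans_1d([1, 2, 3, 10, 11, 12], centers=[0.0, 5.0])
print(centers, clusters)
```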
Gaussian Learning
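parameters of a Gaussian
f(x | μ, σ^2) = 1/√(2πσ^2) · exp(-(x-μ)^2 / (2σ^2))
μ = (1/m) Σ_{j=1}^m x_j
Data x_1…x_m: p(x_1…x_m | μ, σ^2) = Π_i f(x_i | μ, σ^2) = (1/(2πσ^2))^{m/2} · exp(-Σ_i (x_i-μ)^2 / (2σ^2))
log-likelihood: (m/2) log(1/(2πσ^2)) - (1/(2σ^2)) Σ_{i=1}^m (x_i-μ)^2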
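As a worked step, setting the derivatives of the log-likelihood to zero gives the maximum likelihood estimates. This is the standard derivation, written out here as a sketch; the symbol \ell for the log-likelihood is introduced only for this block.

```latex
\ell(\mu,\sigma^2) = \frac{m}{2}\log\frac{1}{2\pi\sigma^2}
                   - \frac{1}{2\sigma^2}\sum_{i=1}^{m}(x_i-\mu)^2

\frac{\partial \ell}{\partial \mu}
  = \frac{1}{\sigma^2}\sum_{i=1}^{m}(x_i-\mu) = 0
  \;\Rightarrow\; \mu = \frac{1}{m}\sum_{i=1}^{m} x_i

\frac{\partial \ell}{\partial \sigma^2}
  = -\frac{m}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{m}(x_i-\mu)^2 = 0
  \;\Rightarrow\; \sigma^2 = \frac{1}{m}\sum_{i=1}^{m}(x_i-\mu)^2
```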
Gradient
L = Σ_j (y_j - w_1 x_j - w_0)^2 -> min
∂L/∂w_1 = -2 Σ_j (y_j - w_1 x_j - w_0) x_j
∂L/∂w_0 = -2 Σ_j (y_j - w_1 x_j - w_0)
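A minimal sketch of gradient descent using the two partial derivatives above. The data set, learning rate, and step count are hypothetical; the rate has to be small enough for the iteration to converge.

```python
def gradients(xs, ys, w1, w0):
    """Partial derivatives of L = sum_j (y_j - w1*x_j - w0)^2."""
    residuals = [y - w1 * x - w0 for x, y in zip(xs, ys)]
    d_w1 = -2 * sum(r * x for r, x in zip(residuals, xs))
    d_w0 = -2 * sum(residuals)
    return d_w1, d_w0


def gradient_descent(xs, ys, rate=0.01, steps=5000):
    w1 = w0 = 0.0
    for _ in range(steps):
        d_w1, d_w0 = gradients(xs, ys, w1, w0)
        w1 -= rate * d_w1  # step against the gradient
        w0 -= rate * d_w0
    return w1, w0


xs = [0.0, 1.0, 2.0, 3.0]
ys = [3.0, 2.0, 1.0, 0.0]  # points on the line y = -x + 3
w1, w0 = gradient_descent(xs, ys)
print(w1, w0)
```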
Perceptron algorithm
Linear separator
f(x) = 1 if w_1 x + w_0 >= 0
f(x) = 0 if w_1 x + w_0 < 0
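A sketch of perceptron training for the 1-D linear separator above. The update rule w <- w + rate * (y - f(x)) * x is the usual perceptron rule and is an assumption here, since the notes only give the threshold function. Data and learning rate are hypothetical.

```python
def predict(w1, w0, x):
    """Threshold unit: 1 if w1*x + w0 >= 0, else 0."""
    return 1 if w1 * x + w0 >= 0 else 0


def train_perceptron(xs, labels, rate=0.1, epochs=100):
    w1 = w0 = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, labels):
            error = y - predict(w1, w0, x)  # 0 when the point is classified correctly
            w1 += rate * error * x
            w0 += rate * error
    return w1, w0


xs = [-2.0, -1.0, 1.0, 2.0]
labels = [0, 0, 1, 1]  # linearly separable
w1, w0 = train_perceptron(xs, labels)
print([predict(w1, w0, x) for x in xs])
```

Weights change only on misclassified points, so training stops moving once the data are separated.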
Linear function
Linear Method
-regression vs classification
-exact solution vs iterative solution
-smoothing
-non-linear problems
Supervised Learning
-> parametric
KNN definition
learning: memorize all data
Problems of KNN
-very large data set
k-d trees
-very large feature spaces
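A brute-force sketch of k-nearest-neighbor classification: "learning" just stores the data, and each query scans all of it, which is why very large data sets are a problem. The 1-D training set and labels are hypothetical.

```python
from collections import Counter


def knn_classify(train, query, k=3):
    """Majority vote among the k training points closest to the query.

    train is a list of (x, label) pairs with scalar x.
    """
    nearest = sorted(train, key=lambda item: abs(item[0] - query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# "Learning" is memorizing the data.
train = [(1.0, "a"), (1.5, "a"), (2.0, "a"), (8.0, "b"), (9.0, "b")]

print(knn_classify(train, 1.2))
print(knn_classify(train, 8.5))
```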
Minimize quadratic loss
min Σ_i (y_i - w_1 x_i - w_0)^2 = L
∂L/∂w_0 = 0: w_0 = (1/m) Σ_i y_i - (w_1/m) Σ_i x_i
∂L/∂w_1 = 0: Σ_i x_i y_i - (1/m) Σ_i y_i Σ_i x_i + (w_1/m) (Σ_i x_i)^2 = w_1 Σ_i x_i^2
f(x) = w_1 x + w_0
w_0 = 3
w_1 = -1
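Solving the normal equation for w_1 and substituting back for w_0 gives an exact fit. The sketch below uses a hypothetical data set chosen so that the result is w_0 = 3, w_1 = -1; the notes do not record the original data.

```python
def fit_line(xs, ys):
    """Closed-form least squares fit of f(x) = w1*x + w0."""
    m = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # From: sxy - (1/m)*sy*sx + (w1/m)*sx^2 = w1*sxx
    w1 = (sxy - sx * sy / m) / (sxx - sx * sx / m)
    w0 = sy / m - w1 * sx / m
    return w0, w1


w0, w1 = fit_line([0, 1, 2, 3], [3, 2, 1, 0])
print(w0, w1)  # 3.0 -1.0
```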
Regularization
loss = loss(data)+loss(parameters)
Σ_j (y_j - w_1 x_j - w_0)^2 + Σ_i |w_i|^p
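A small sketch of the regularized loss as the sum of a data term and a parameter term. Penalizing both w_0 and w_1 is an assumption (some formulations leave the offset out), and the data are hypothetical.

```python
def regularized_loss(xs, ys, w1, w0, p=1):
    """loss(data) + loss(parameters) for a 1-D linear model."""
    data_loss = sum((y - w1 * x - w0) ** 2 for x, y in zip(xs, ys))
    # Assumption: both weights are penalized; p=1 gives L1, p=2 gives L2.
    param_loss = sum(abs(w) ** p for w in (w1, w0))
    return data_loss + param_loss


xs = [0.0, 1.0, 2.0, 3.0]
ys = [3.0, 2.0, 1.0, 0.0]
print(regularized_loss(xs, ys, w1=-1.0, w0=3.0, p=1))
print(regularized_loss(xs, ys, w1=-1.0, w0=3.0, p=2))
```

On this data the line fits exactly, so the whole loss comes from the parameter term.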
advanced spam filters
-known spamming IP?
-have you emailed this person before?
-have other people received same message?
-email header consistent?
-all caps?
-do inline urls point to where they say?
-are you addressed by name?
Digit recognition
-input vector = pixel values
16 x 16
overfitting prevention
-Occam’s razor k?
cross validation
supervised learning
->classification y_i ∈ {0, 1}
->regression y_i ∈ [0, 1], y_i ∈ R
f(x) = w_1 x + w_0
w_0 = 3, w_1 = -1
Linear Regression
Data: f(x) = w_1 x + w_0, f(x) = w x + w_0
y = f(x)
Loss = Σ_j (y_j - w_1 x_j - w_0)^2