Probability in AI

Bayes network

alternator broken, fanbelt broken ->
battery dead -> battery flat -> car won’t start
-battery meter, battery age
light, oil light, gas gauge
no oil, no gas, fuel line blocked, starter broken

Binary events
Probability
Simple bayes networks
Conditional independence
Bayes networks
D-separation
Parameter counts
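
To illustrate parameter counts: with binary variables, a node that has k parents needs 2^k probabilities. The sketch below assumes a small hypothetical four-node network (the node names and edges are illustrative, not the full car network above) and compares it with the size of the full joint table.

```python
# Hypothetical network: each key is a node, each value lists its parents.
parents = {
    "alternator_broken": [],
    "fanbelt_broken": [],
    "battery_dead": ["alternator_broken", "fanbelt_broken"],
    "car_wont_start": ["battery_dead"],
}

def parameter_count(parents):
    """Binary nodes: a node with k parents needs 2**k probabilities."""
    return sum(2 ** len(p) for p in parents.values())

network_params = parameter_count(parents)   # 1 + 1 + 4 + 2
joint_params = 2 ** len(parents) - 1        # full joint table over 4 binary variables
print(network_params, joint_params)
```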

Bayes networks -> diagnostics, prediction, machine learning
Finance, Google, Robotics
particle filters, HMMs, MDPs + POMDPs, Kalman filters …

Probability expresses uncertainty in AI
P(heads) = 1/2, P(tails) = 1/2
P(h, h, h) = 1/8, given P(h) = 1/2
P(x1=x2=x3=x4) = 0.125
P({x1,x2,x3,x4} contains >= 3 h) = 5/16
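
The coin-flip values above can be checked by brute force. This is a minimal sketch that enumerates every equally likely sequence of fair flips; the helper name prob is an assumption made for illustration.

```python
from fractions import Fraction
from itertools import product

def prob(n_flips, event):
    """Probability of `event` over all equally likely sequences of fair flips."""
    outcomes = list(product("HT", repeat=n_flips))
    hits = sum(1 for o in outcomes if event(o))
    return Fraction(hits, len(outcomes))

p_three_heads = prob(3, lambda o: all(x == "H" for x in o))   # P(h, h, h)
p_all_equal = prob(4, lambda o: len(set(o)) == 1)             # P(x1=x2=x3=x4)
p_at_least_3h = prob(4, lambda o: o.count("H") >= 3)          # >= 3 heads in 4 flips
print(p_three_heads, p_all_equal, p_at_least_3h)
```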

Search Comparison

Breadth-first
Cheapest-first
Depth-first

Greedy best-first search
A* algorithm
f = g + h
g(path) = path cost
h(path) = h(s) = estimated distance to goal
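
A minimal sketch of A* using f = g + h. The graph, the heuristic table h, and the function name a_star are hypothetical examples chosen for illustration; the heuristic values were picked so that they never exceed the true remaining cost.

```python
import heapq

def a_star(graph, h, start, goal):
    """Return (cost, path) for the cheapest path found, or None.

    graph: dict mapping node -> {neighbor: step cost}
    h: dict mapping node -> estimated distance to goal
    """
    # Frontier entries are (f, g, path); the heap pops the lowest f = g + h.
    frontier = [(h[start], 0, [start])]
    best_g = {}
    while frontier:
        f, g, path = heapq.heappop(frontier)
        s = path[-1]
        if s == goal:
            return g, path
        if s in best_g and best_g[s] <= g:
            continue  # already expanded via a path at least as cheap
        best_g[s] = g
        for nxt, cost in graph[s].items():
            g2 = g + cost
            heapq.heappush(frontier, (g2 + h[nxt], g2, path + [nxt]))
    return None

# Hypothetical example graph and heuristic.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 5},
    "C": {"D": 1},
    "D": {},
}
h = {"A": 3, "B": 2, "C": 1, "D": 0}
result = a_star(graph, h, "A", "D")
print(result)
```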

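A* finds the lowest-cost path if:
h(s) <= true cost (h never overestimates the cost to the goal)

Sliding blocks puzzle (15-puzzle)
h1 = # misplaced blocks
h2 = sum(distances of blocks from their goal positions)

A block can move A -> B
if (A adjacent to B)
and (B is blank)
Dropping "(B is blank)" gives h2; dropping both conditions gives h1.
h = max(h1, h2)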
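
The two heuristics can be sketched as follows, assuming a state is a 16-tuple read row by row with 0 standing for the blank (this encoding and the example state are assumptions, not part of the notes). The example state is three moves' worth of displaced blocks away from the goal layout.

```python
def h1(state, goal):
    """Number of misplaced blocks (the blank, 0, is not counted)."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def h2(state, goal, width=4):
    """Sum of city-block distances of each block from its goal position."""
    where = {tile: i for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        j = where[tile]
        total += abs(i // width - j // width) + abs(i % width - j % width)
    return total

goal = tuple(range(1, 16)) + (0,)
# Hypothetical state: blocks 11, 12 and 15 are out of place.
state = (1, 2, 3, 4,
         5, 6, 7, 8,
         9, 10, 12, 15,
         13, 14, 11, 0)
h = max(h1(state, goal), h2(state, goal))
print(h1(state, goal), h2(state, goal), h)
```

Here h2 is at least as large as h1, so max(h1, h2) simply picks h2 for this state.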

Problem-solving works when:
-fully observable
-known
-discrete
-deterministic
-static

AI and Uncertainty

AI as uncertainty management
AI = What to do when you don’t know what to do?
Reasons for uncertainty:
-sensor limits

Definition of a problem
-initial state
-Actions(s) -> {a1, a2, a3 …}
-Result(s, a) -> s’
-GoalTest(s) -> T|F
-PathCost(s -> a -> s -> a -> s) -> n
-StepCost(s, a, s’) -> n
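
One way to express this definition in code is a small class with one method per component. The sketch below assumes a hypothetical route-finding problem over a weighted graph; the class name RouteProblem and the example graph are illustrative only.

```python
class RouteProblem:
    """Hypothetical route-finding problem over a weighted graph."""

    def __init__(self, graph, initial, goal):
        self.graph = graph      # dict: state -> {next state: step cost}
        self.initial = initial
        self.goal = goal

    def actions(self, s):
        # An action is named by the neighboring state it moves to.
        return sorted(self.graph[s])

    def result(self, s, a):
        return a

    def goal_test(self, s):
        return s == self.goal

    def step_cost(self, s, a, s2):
        return self.graph[s][s2]

    def path_cost(self, states):
        # Path cost is the sum of the step costs along the path.
        return sum(self.step_cost(s, s2, s2) for s, s2 in zip(states, states[1:]))

problem = RouteProblem({"A": {"B": 1, "C": 4}, "B": {"C": 2}, "C": {}}, "A", "C")
print(problem.actions("A"), problem.path_cost(["A", "B", "C"]))
```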

Tree search

function TREE-SEARCH(problem):
	frontier = {[initial]}
	loop:
		if frontier is empty: return FAIL
		path = remove_choice(frontier)
		s = path.end
		if s is a goal: return path
		for a in actions(s):
			add [path + a -> result(s, a)] to frontier
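
A runnable sketch of the pseudocode above, assuming the frontier choice is "oldest path first" (breadth-first). Paths store states only, the example graph is hypothetical, and max_expansions is an added safety limit that is not part of the original algorithm.

```python
from collections import deque

def tree_search(initial, actions, result, is_goal, max_expansions=10000):
    """Breadth-first tree search: always expand the oldest path on the frontier."""
    frontier = deque([[initial]])
    for _ in range(max_expansions):
        if not frontier:
            return None                 # FAIL
        path = frontier.popleft()       # popleft -> breadth-first; pop -> depth-first
        s = path[-1]
        if is_goal(s):
            return path
        for a in actions(s):
            frontier.append(path + [result(s, a)])
    return None

# Hypothetical graph: an action is named by the state it leads to.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
path = tree_search("A", lambda s: graph[s], lambda s, a: a, lambda s: s == "D")
print(path)
```

Swapping the frontier policy (cheapest path first, or newest path first) gives the other search strategies without changing the rest of the loop.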

Terminology for AI

1.Fully versus partially observable

-perception action cycle
Agent, State
(sensors, actuators)

2.Deterministic versus stochastic

3.Discrete versus continuous

4.Benign(no objective) versus adversarial(such as chess, games)

for example:
robot car -> partially observable, stochastic, continuous, adversarial

The Basics of AI

An AI program is called an
・INTELLIGENT AGENT

How does an agent make a decision?
AI has successfully been used in
-finance
-robotics
-games
-medicine
-the web

ex.
trading agent for the stock market, bond market, commodities
-> reads online news, makes buy or sell decisions

AI in Robotics
camera, microphone, touch
-> motors, voice

AI in games
A game agent plays against you: it observes your moves and chooses its own moves.

AI in medicine
A diagnostic agent takes vital signs as input.

AI on the web
crawler

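Reducer Code

import sys

def reducer():
	salesTotal = 0
	oldKey = None

	for line in sys.stdin:
		data = line.strip().split("\t")

		# skip malformed lines
		if len(data) != 2:
			continue

		thisKey, thisSale = data

		# key changed: emit the total for the previous key
		if oldKey and oldKey != thisKey:
			print("{0}\t{1}".format(oldKey, salesTotal))
			salesTotal = 0

		oldKey = thisKey
		salesTotal += float(thisSale)

	# emit the total for the last key
	if oldKey is not None:
		print("{0}\t{1}".format(oldKey, salesTotal))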
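
For local testing without Hadoop, the same reduce step can be sketched over an in-memory list of lines. This variant uses itertools.groupby instead of tracking oldKey by hand, and it assumes the input is already sorted by key (as the shuffle phase guarantees). The function name reduce_sales and the sample values are hypothetical.

```python
from itertools import groupby

def reduce_sales(lines):
    """Yield (key, total) for tab-separated key/value lines sorted by key."""
    rows = (line.strip().split("\t") for line in lines)
    valid = (row for row in rows if len(row) == 2)   # skip malformed lines
    for key, group in groupby(valid, key=lambda row: row[0]):
        yield key, sum(float(sale) for _, sale in group)

# Hypothetical sample input, including one malformed line.
sample = ["Miami\t12.5\n", "Miami\t99.25\n", "bad line\n", "NYC\t88.5\n"]
totals = list(reduce_sales(sample))
print(totals)
```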

Defensive Mapper

import sys

def mapper():
	for line in sys.stdin:
		data = line.strip().split("\t")
		# defensive check: skip lines without exactly six fields
		if len(data) != 6:
			continue
		date, time, store, item, cost, payment = data
		print("{0}\t{1}".format(store, cost))
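
The defensive check is easiest to see on a small in-memory sample. The sketch below is a generator version of the mapper, assuming the six-field layout above; the function name map_sales and the sample lines are hypothetical.

```python
def map_sales(lines):
    """Yield (store, cost) for well-formed six-field lines; skip the rest."""
    for line in lines:
        data = line.strip().split("\t")
        if len(data) != 6:
            continue  # malformed line: wrong number of fields
        date, time, store, item, cost, payment = data
        yield store, cost

# Hypothetical sample: one valid line and one corrupted line.
sample = [
    "2012-01-01\t09:00\tMiami\tBook\t12.50\tVisa\n",
    "corrupted line with no tabs\n",
]
pairs = list(map_sales(sample))
print(pairs)
```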

Using match

def get_db(db_name):
	from pymongo import MongoClient
	client = MongoClient('localhost:27017')
	db = client[db_name]
	return db

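def make_pipeline():
	# aggregation stages (for example a $match stage) go in this list;
	# the notes leave it empty
	pipeline = [ ]
	return pipeline

def aggregate(db, pipeline):
	return [doc for doc in db.tweets.aggregate(pipeline)]

if __name__ == '__main__':
	import pprint
	db = get_db('twitter')
	pipeline = make_pipeline()
	result = aggregate(db, pipeline)
	# these checks pass only once the pipeline is filled in
	assert len(result) == 1
	assert result[0]["followers"] == 17209
	pprint.pprint(result[0])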
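
As an illustration of what a pipeline with a $match stage can look like, here is a sketch built from plain dictionaries. It is not the pipeline that produces the 17209 result above: the filter criterion and the min_followers threshold are hypothetical, and the field names assume the user sub-document shown in the twitter data-set section.

```python
def make_example_pipeline(min_followers):
    """Build a hypothetical pipeline: filter, reshape, sort, keep the top document."""
    return [
        # keep only tweets whose author has at least min_followers followers
        {"$match": {"user.followers_count": {"$gte": min_followers}}},
        # keep two fields under shorter names
        {"$project": {"_id": 0,
                      "screen_name": "$user.screen_name",
                      "followers": "$user.followers_count"}},
        # highest follower count first
        {"$sort": {"followers": -1}},
        {"$limit": 1},
    ]

example_pipeline = make_example_pipeline(1000)
print([list(stage)[0] for stage in example_pipeline])
```

A list like this can be passed to the aggregate(db, pipeline) helper above in place of the empty pipeline.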

twitter data-set

{
	"_id" : ObjectId("xxxx"),
	"text" : "Something interesting ...",
	"entities" : {
		"user_mentions" : [
			{
				"screen_name" : "somebody_else",
				...
			}
		],
		"urls" : [],
		"hashtags": []
	},
	"user" : {
		"friends_count" : 544,
		"screen_name" : "somebody",
		"followers_count" : 100
	}
}

from pymongo import MongoClient
import pprint

client = MongoClient("mongodb://localhost:27017")
db = client.twitter

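def most_tweets():
	# group tweets by author, count them, and sort by count, highest first
	result = db.tweets.aggregate([
			{ "$group" : {"_id" : "$user.screen_name",
				"count": {"$sum" : 1}}},
			{ "$sort" : {"count" : -1 }}
		])
	# aggregate() returns a cursor in current PyMongo; materialize it for printing
	return list(result)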

if __name__ == '__main__':
	result = most_tweets()
	pprint.pprint(result)
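
To see what the $group and $sort stages compute without a running MongoDB server, the same counting can be sketched in plain Python. The function name most_tweets_local and the sample tweets are hypothetical; only the user.screen_name field from the data-set above is assumed.

```python
from collections import Counter

def most_tweets_local(tweets):
    """Count tweets per screen name and list the counts, highest first."""
    counts = Counter(t["user"]["screen_name"] for t in tweets)
    return [{"_id": name, "count": n} for name, n in counts.most_common()]

# Hypothetical in-memory sample of tweet documents.
tweets = [
    {"user": {"screen_name": "somebody"}},
    {"user": {"screen_name": "somebody_else"}},
    {"user": {"screen_name": "somebody"}},
]
local_result = most_tweets_local(tweets)
print(local_result)
```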