import ...

public class ToastAdListener extends AdListener {
    private Context mContext;
    private String mErrorReason;

    public ToastAdListener(Context context) {
        this.mContext = context;
    }

    @Override
    public void onAdLoaded() {
        Toast.makeText(mContext, "onAdLoaded()", Toast.LENGTH_SHORT).show();
    }

    @Override
    public void onAdOpened() {
        Toast.makeText(mContext, "onAdOpened()", Toast.LENGTH_SHORT).show();
    }
}
Banner Ad
package com.example.adviewer;

import ...

public class BannerActivity extends Activity {
    private AdView mAdView;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_banner);
        mAdView = (AdView) findViewById(R.id.adView);
        AdRequest adRequest = new AdRequest.Builder().build();
        mAdView.loadAd(adRequest);
    }
}
AdListener
public void onAdLoaded()
public void onAdFailedToLoad(int errorCode) (AdRequest error code)
public void onAdOpened()
public void onAdLeftApplication() (e.g., when an ad click launches the browser)
AdMob
Monetization model: Easy to Launch / Long Term / User Value
– Paid Downloads: no, no, yes
– Subscription: no, yes, yes
– Displaying Ads: yes, yes, yes
– In-app purchase: yes, yes, yes
Common Monetization Models
-Ads & in-app purchases, subscription & in-app purchases, paid download & in-app purchases
USERS -> APPS(publishers) -> AdMob -> ADVERTISERS
https://www.google.co.jp/admob/
Types of Ads
-Banner ads: text, image
-Interstitial ads: text, image, video
-Native ads
<?xml version="1.0" encoding="utf-8"?>
<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:ads="http://schemas.android.com/apk/res-auto"
    android:id="@+id/mainLayout"
    android:layout_width="match_parent"
    android:layout_height="match_parent">

    <com.google.android.gms.ads.AdView
        android:id="@+id/adView"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_alignParentBottom="true"
        android:layout_alignParentLeft="true"
        ads:adSize="BANNER"
        ads:adUnitId="ca-app-pub-xxxx/xxxx"/>

</RelativeLayout>
Mapper and Reducer
import sys
import string
import logging

from util import mapper_logfile
logging.basicConfig(filename=mapper_logfile, format='%(message)s',
                    level=logging.INFO, filemode='w')

def mapper():
    for line in sys.stdin:
        data = line.strip().split(",")
        # skip malformed rows and the header line
        if len(data) != 12 or data[0] == 'Register':
            continue
        # emit a tab-separated <key, value> pair from the 4th and 9th fields
        print "{0}\t{1}".format(data[3], data[8])

mapper()
import sys
import logging

from util import reducer_logfile
logging.basicConfig(filename=reducer_logfile, format='%(message)s',
                    level=logging.INFO, filemode='w')

def reducer():
    aadhaar_generated = 0
    old_key = None
    for line in sys.stdin:
        data = line.strip().split("\t")
        if len(data) != 2:
            continue
        this_key, count = data
        # keys arrive sorted, so a key change means the previous group is complete
        if old_key and old_key != this_key:
            print "{0}\t{1}".format(old_key, aadhaar_generated)
            aadhaar_generated = 0
        old_key = this_key
        aadhaar_generated += float(count)
    if old_key != None:
        print "{0}\t{1}".format(old_key, aadhaar_generated)

reducer()
MapReduce programming model -> Hadoop!
(1) Hive, (2) Pig
Mahout, Giraph, Cassandra
Using MapReduce with subway data
Mapper
import sys
import string

def mapper():
    for line in sys.stdin:
        data = line.strip().split()
        for i in data:
            # strip punctuation and lowercase each word, then emit <word, 1>
            cleaned_data = i.translate(string.maketrans("", ""), string.punctuation).lower()
            print "{0}\t{1}".format(cleaned_data, 1)

mapper()
Reduce stage -> reducer
import sys

def reducer():
    word_count = 0
    old_key = None
    for line in sys.stdin:
        data = line.strip().split("\t")
        if len(data) != 2:
            continue
        this_key, count = data
        # input is sorted by key, so a new key means the previous word is done
        if old_key and old_key != this_key:
            print "{0}\t{1}".format(old_key, word_count)
            word_count = 0
        old_key = this_key
        word_count += float(count)
    if old_key != None:
        print "{0}\t{1}".format(old_key, word_count)

reducer()
#!/bin/bash
cat ../../data/aliceInWorderland.txt \
  | python word_count_mapper.py \
  | sort \
  | python word_count_reducer.py
Scatter Plot
Line Chart
– Mitigate some shortcomings of scatterplot
– Emphasize trends
– Focus on year to year variability, not overall trends
LOESS Curve
– Emphasize long term trends
– LOESS weighted regression (see the sketch after this list)
– Easier to take a quick look at chart and understand big picture
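The LOESS item above can be sketched by fitting a locally weighted regression over a scatter of points. A minimal sketch, assuming statsmodels and matplotlib are available; the default smoothing fraction of 0.3 is an arbitrary choice, not from the notes:

import matplotlib.pyplot as plt
from statsmodels.nonparametric.smoothers_lowess import lowess

def loess_plot(x, y, frac=0.3):
    # lowess returns (x, smoothed y) pairs sorted by x
    smoothed = lowess(y, x, frac=frac)
    plt.scatter(x, y, alpha=0.5, label='data')
    plt.plot(smoothed[:, 0], smoothed[:, 1], color='red', label='LOESS')
    plt.legend()
    plt.show()

A larger frac averages over more neighboring points and emphasizes the long-term trend; a smaller frac tracks year-to-year variability.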
Multivariate
– How to incorporate more variables (see the sketch after this list)
– Use an additional encoding
– Size
– Color / Saturation
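One rough way to add more variables is to map them onto marker size and color, as the list suggests. A minimal matplotlib sketch; 'W' and 'avg' are hypothetical column names used only for illustration:

import matplotlib.pyplot as plt

def multivariate_scatter(df):
    # x/y carry two variables; marker size and color encode two more
    plt.scatter(df['yearID'], df['HR'],
                s=df['W'],       # size encodes a third variable
                c=df['avg'],     # color/saturation encodes a fourth
                cmap='viridis', alpha=0.7)
    plt.colorbar(label='avg')
    plt.xlabel('Year')
    plt.ylabel('HR')
    plt.show()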
Basics of MapReduce
MapReduce is a parallel programming model
Python dictionary
{"alice": 1, "was": 1, "of": 2, …, "do": 1}
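Before distributing the computation, a word-count dictionary like the one above can be built in plain Python. A minimal sketch; the input sentence is only an illustration:

def word_counts(text):
    # count occurrences of each word, like the reducer output in dictionary form
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts

print(word_counts("alice was beginning to get very tired of sitting"))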
Data Types
– Numeric data
A measurement (e.g., height, weight) or a count (e.g., HR or hits)
Discrete and continuous
discrete: whole numbers (e.g., 10, 34, 25)
continuous: any number within a range (e.g., .250, .357, .511)
Categorical Data
Represent characteristics (e.g., position, team, hometown, handedness)
Can take on numerical values, but those values don't have mathematical meaning
Ordinal data
Categories with some order or ranking
e.g., very low, average, high (see the sketch below)
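Ordered categories can be made explicit, for example with pandas. A minimal sketch; the rating labels are only an illustration:

import pandas as pd

# an ordered categorical: sorting respects the declared ranking, not alphabetical order
ratings = pd.Categorical(['average', 'very low', 'high', 'average'],
                         categories=['very low', 'average', 'high'],
                         ordered=True)
print(pd.Series(ratings).sort_values())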
Time-series data
– data collected via repeated measurements over time
– example: average HR
import pandas
from ggplot import *

def lineplot_compare(hr_by_team_year_sf_la_csv):
    hr_year = pandas.read_csv(hr_by_team_year_sf_la_csv)
    print ggplot(hr_year, aes('yearID', 'HR', color='teamID')) + \
        geom_point() + geom_line() + \
        ggtitle('Total HRs by Year') + xlab('Year') + ylab('HR')

if __name__ == '__main__':
    lineplot_compare('hr_by_team_year_sf_la.csv')
Effective Information Visualization
-effective communication of complex quantitative ideas
clarity, precision, efficiency
Visual Encoding
Position: x, y
Length: A, B, C
Angle
Visual Encoding: Direction, Shape, Area/Volume
Color: Hue, Saturation
Combination: min, max
Limit Hues
Plotting in Python
– Many packages
– matplotlib <- very popular
- ggplot <- use this, looks nicer, grammar of graphics
ggplot(data, aes(xvar, yvar)) + geom_point() + geom_line()
first step: create plot
second step: represent data with geometric objects
third step: add labels
import pandas
from ggplot import *

def lineplot(hr_year_csv):
    hr_year = pandas.read_csv(hr_year_csv)
    print ggplot(hr_year, aes('yearID', 'HR')) + \
        geom_point(color='red') + geom_line(color='red') + \
        ggtitle('Total HRs by Year') + xlab('Year') + ylab('HR')

if __name__ == '__main__':
    lineplot('hr_year.csv')
Coefficient of Determination
-data = y_1, ..., y_n
-predictions = f_1, ..., f_n
-average of the data = ȳ
R^2 = 1 - Σ_{i=1}^{n} (y_i - f_i)^2 / Σ_{i=1}^{n} (y_i - ȳ)^2
Calculating R^2
import numpy as np

def compute_r_squared(data, predictions):
    # total sum of squares and residual sum of squares
    SST = ((data - np.mean(data))**2).sum()
    SSReg = ((predictions - data)**2).sum()
    r_squared = 1 - SSReg / SST
    return r_squared
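A quick usage sketch of the function above (the numbers are arbitrary): perfect predictions should give R^2 = 1, and predicting the mean everywhere should give R^2 = 0.

import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0])
print(compute_r_squared(data, data))                     # 1.0: perfect predictions
print(compute_r_squared(data, np.full(4, data.mean())))  # 0.0: predicting the mean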
Additional Considerations
– other types of linear regression
– ordinary least squares regression (see the sketch after this list)
– parameter estimation
– under / overfitting
– multiple local minima
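For the ordinary least squares item above, the parameters can also be estimated in closed form instead of by gradient descent. A minimal numpy sketch; the feature matrix and targets are made-up values used only for illustration:

import numpy as np

# made-up data: 5 observations, an intercept column plus 2 features
X = np.array([[1, 0.5, 2.0],
              [1, 1.5, 1.0],
              [1, 2.0, 3.5],
              [1, 3.0, 2.5],
              [1, 4.5, 4.0]])
y = np.array([3.1, 4.0, 7.2, 7.9, 11.5])

# least-squares solution minimizing ||X theta - y||^2
theta, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
print(theta)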
Types of Machine Learning
Different types of learning
Data -> Model -> Predictions
Supervised Learning
-learn from labeled data to make predictions (e.g., the linear regression below)
Unsupervised Learning
-trying to understand the structure of the data
-clustering (see the sketch below)
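As a sketch of the clustering idea, assuming scikit-learn is available; the toy points are only an illustration:

import numpy as np
from sklearn.cluster import KMeans

# toy 2-D points forming two loose groups
X = np.array([[1.0, 1.1], [0.9, 1.3], [1.2, 0.8],
              [5.0, 5.2], [5.1, 4.8], [4.9, 5.1]])

# k-means looks for structure (cluster assignments) without any labels
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)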
Linear Regression with gradient descent
Gradient Descent – Cost Function: J(Θ) = (1/2m) Σ_{i=1}^{m} (y_predicted,i - y_actual,i)^2
Minimize J(Θ) … how? Repeatedly step Θ in the direction of the negative gradient: Θ := Θ - α ∇J(Θ)
import numpy
import pandas

def compute_cost(features, values, theta):
    # cost J(theta): sum of squared errors divided by 2m
    m = len(values)
    sum_of_square_errors = numpy.square(numpy.dot(features, theta) - values).sum()
    cost = sum_of_square_errors / (2 * m)
    return cost

def gradient_descent(features, values, theta, alpha, num_iterations):
    # repeatedly step theta in the direction of the negative gradient of J(theta)
    m = len(values)
    cost_history = []
    for _ in range(num_iterations):
        cost_history.append(compute_cost(features, values, theta))
        predictions = numpy.dot(features, theta)
        theta = theta + (alpha / m) * numpy.dot(values - predictions, features)
    return theta, pandas.Series(cost_history)
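A quick usage sketch of gradient_descent above, with made-up toy data (an intercept column plus one feature); the learning rate and iteration count are arbitrary choices:

import numpy

features = numpy.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
values = numpy.array([1.0, 3.0, 5.0, 7.0])
theta = numpy.zeros(2)

theta, cost_history = gradient_descent(features, values, theta,
                                        alpha=0.1, num_iterations=1000)
print(theta)                 # approaches [1, 2] for y = 1 + 2x
print(cost_history.tail())   # cost decreases toward 0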