The Questioning Phase

Questioning -> Modeling -> Validating => Answers!

e.g.
“Can we predict the next time a person will tweet?”
=> time of day

regression estimator, hypothesis test, classification

r(time since last tweet (Δt)) = time of next tweet
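
One possible reading of the estimator r, as a rough sketch only: fit a straight line that predicts the next gap between tweets from the gap before it. The linear form, the numpy.polyfit fit, and the gap values below are all assumptions made for illustration; the notes do not say which form r takes.

```python
import numpy

# Hypothetical gaps between consecutive tweets, in minutes (illustrative only).
gaps = numpy.array([5.0, 12.0, 7.0, 30.0, 9.0, 14.0])

# Pair each gap (the Δt since the last tweet) with the gap that followed it.
previousGap = gaps[:-1]
nextGap = gaps[1:]

# Assumption: r is a straight line, r(Δt) = slope * Δt + intercept,
# fitted by ordinary least squares.
slope, intercept = numpy.polyfit(previousGap, nextGap, deg=1)

def r(deltaT):
    """Estimated time until the next tweet, given the time since the last one."""
    return slope * deltaT + intercept
```

Calling r(10.0) then gives the fitted estimate, in minutes, for a person who last tweeted 10 minutes ago. A hypothesis test or a classifier would frame the same question differently.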

Prepare data for histogram

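import pandas

# Load the tweets; each record is expected to carry "created_at" and "text" fields.
tweetsDF = pandas.read_json("new_gruber_tweets.json")
# .ix was removed from pandas; .loc selects all rows and the named columns.
createdDF = tweetsDF.loc[:, ["created_at"]]
createdTextDF = tweetsDF.loc[:, ["created_at", "text"]]
createdTextVals = createdTextDF.values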

Collect the "created_at" attribute of each tweet in tweetsDF

# Sorted list of tweet timestamps, oldest first.
tweetTimes = sorted(createdDF["created_at"])
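
The histogram step uses a series named timeToNextSeries, whose construction is not shown in these notes. A minimal sketch of how it might be built, assuming it holds the gap in seconds between each tweet and the next one; the timestamps below are hypothetical stand-ins for tweetTimes:

```python
import pandas

# Hypothetical timestamps standing in for the sorted tweetTimes list.
tweetTimes = sorted(pandas.to_datetime([
    "2013-03-01 09:00:00",
    "2013-03-01 09:05:00",
    "2013-03-01 10:05:00",
    "2013-03-01 10:06:30",
]))

# Difference between each timestamp and the previous one.
# The first entry has no predecessor, so it is dropped.
timesSeries = pandas.Series(tweetTimes)
gapSeries = timesSeries.diff().dropna()

# Assumption: gaps are expressed in seconds.
timeToNextSeries = gapSeries.dt.total_seconds().reset_index(drop=True)
```

With N tweets this gives N - 1 gaps. Seconds is an arbitrary choice of unit; minutes or hours may make the histogram easier to read.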

Create initial histogram

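# density=True scales the bars so their total area is 1 (older matplotlib called this normed=True).
timeToNextSeries.hist(bins=30, density=True)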
<matplotlib.axes.AxesSubplot at 0x10c625390>
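
To see what the normalized histogram means, here is a small sketch using numpy.histogram on hypothetical gap values (not the real tweet data). With density=True the bar heights are densities rather than counts, so height times bin width summed over all bins should come out to 1.

```python
import numpy

# Hypothetical gaps between tweets, in seconds (illustrative only).
gaps = numpy.array([30.0, 45.0, 60.0, 90.0, 120.0, 300.0, 600.0, 3600.0])

# density=True rescales counts so that sum(height * width) equals 1.
heights, edges = numpy.histogram(gaps, bins=30, density=True)
widths = numpy.diff(edges)
totalArea = float((heights * widths).sum())
```

Because the heights are densities, a single bar can be taller or shorter than 1 depending on the bin width; only the total area is fixed.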