Data Types

Numeric data
– a measurement (e.g., height, weight) or a count (e.g., HR or hits)
– discrete or continuous
– discrete: whole numbers (e.g., 10, 34, 25)
– continuous: any number within a range (e.g., .250, .357, .511)
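
A minimal sketch of the discrete/continuous distinction in Python. The season numbers are made up for illustration: counts are stored as integers, while a ratio of counts such as a batting average can fall anywhere between 0 and 1.

```python
# Hypothetical season line for one batter (made-up numbers for illustration).
home_runs = 34   # discrete: a count, always a whole number
at_bats = 500    # discrete
hits = 150       # discrete

# Continuous: a ratio of counts can land anywhere in the range 0 to 1.
batting_average = hits / float(at_bats)

print(type(home_runs).__name__, home_runs)
print(type(batting_average).__name__, round(batting_average, 3))
```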

Categorical data
– represents characteristics (e.g., position, team, hometown, handedness)
– can take on numerical values, but those values have no mathematical meaning
– ordinal data: categories with some order or ranking (e.g., very low, average, high)
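
A rough sketch of the difference between plain categories and ordinal ones. The team codes, level names, and ratings below are hypothetical, chosen only to illustrate the idea.

```python
# Categories coded as numbers: the codes are labels, not quantities.
# Averaging them (1.5) would not correspond to any team.
team_code = {"SF": 1, "LA": 2}

# Ordinal categories: labels with a ranking. These levels are illustrative.
LEVELS = ["low", "average", "high"]
RANK = {level: i for i, level in enumerate(LEVELS)}

ratings = ["high", "low", "average", "low"]

# Sorting and comparing by rank is meaningful for ordinal data.
sorted_ratings = sorted(ratings, key=RANK.get)
is_higher = RANK["high"] > RANK["average"]

print(sorted_ratings)
print(is_higher)
```

The ranks support ordering and comparison only; the distance between "low" and "average" is not assumed to equal the distance between "average" and "high".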

Time-series data
– data collected via repeated measurements over time
– example: average HR
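
One way the "average HR" example could be computed, sketched with the standard library only. The (year, HR) observations are invented for illustration; the point is that repeated measurements are reduced to one value per time step and kept in time order.

```python
from collections import defaultdict

# Hypothetical (year, home runs) observations; values are made up.
observations = [
    (2010, 20), (2010, 30),
    (2011, 25), (2011, 35),
    (2012, 10), (2012, 20),
]

totals = defaultdict(int)
counts = defaultdict(int)
for year, hr in observations:
    totals[year] += hr
    counts[year] += 1

# One value per time step, ordered by year: this is the time series.
average_hr_by_year = [
    (year, totals[year] / float(counts[year])) for year in sorted(totals)
]

print(average_hr_by_year)
```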

import pandas
from ggplot import *

def lineplot_compare(filename):
	# Load the yearly home-run totals; the plot expects yearID, HR and teamID columns.
	hr_year = pandas.read_csv(filename)
	# Plot HR against year, with one colored line (plus points) per team.
	gg = ggplot(hr_year, aes('yearID', 'HR', color='teamID')) + geom_point() + geom_line() + ggtitle('Total HRs by Year') + xlab('Year') + ylab('HR')
	return gg

if __name__ == '__main__':
	print(lineplot_compare('hr_by_team_year_sf_la.csv'))