Distribution of variables

Inspecting Distribution of Variables
f0: Average Submitted Charge
f1: Average Payment Amount
f2: Average Allowed Amount

g0 = f_CA.average_submitted_chrg_amt.values
g1 = f_CA.average_Medicare_payment_amt.values
g0 = f_CA.average_Medicare_allowed_amt.values

n0, bins0, patches0=plt.hist(g0,50,normed=0, range=(0,1000),histtype='stepfilled')
n2, bins2, patches2=plt.hist(g2,50,normed=0, range=(0,1000),histtype='stepfilled')
plt.setp(patches0, 'facecolor','g','alpha', 0.75)
plt.setp(patches2, 'facecolor','b','alpha', 0.75)

n0, bins0, patches0=plt.hist(f0,50,normed=1,log=range=(0,1000))

Scale variable we can
f0 – f0.min() / (f0.max() – f0.min())

n0, bins0, patches0=plt.hist((f0-f0.min())/(f0.max()-f0.min()),50,normed=1, log=1,range(-0.2,1.2),histtype='stepfilled')
n1, bins1, patches1=plt.hist((f2-f2.min())/(f2.max()-f2.min()),40,normed=1, log=1,range(-0.2,1,2),histtype='stepfilled')
plt_step(patches0,'facecolor','g','alpha',0.75)
plt_step(patches1,'facecolor','r','alpha',0.75)

The range of scaled variables: [0, 1]

Calculating Correlations
f1, f2: linearly correlated
Converiance: E[(x-μx)(y-μy)]
Pearson’s Correlation Co-efficient: Px,y = cov(x,y)/αxαy= E[(x-μx)(y-μy)]/αxαy

Parametric, Non Parametric, Mathematical
Kenerl K, Density D, Estimates E