Stat 306: Finding Relationships in Data.

Lecture 1: Introduction to Course

Page 1: Stat 306: Finding Relationships in Data.

Stat 306: Finding Relationships in Data.

Lecture 1: Introduction to Course

Page 2: Stat 306: Finding Relationships in Data.

The main topic of this course is regression, which means fitting prediction equations. Regression is a common statistical method in scientific research.

Page 3: Stat 306: Finding Relationships in Data.

Statistics – Recap: the two-sample t-test

Age vs. Money

Page 4: Stat 306: Finding Relationships in Data.

Age vs. Money

Page 5: Stat 306: Finding Relationships in Data.

Age vs. Money

Dependent variable

Independent variable

Page 6: Stat 306: Finding Relationships in Data.

Age vs. Money

X: independent variable

Y: dependent variable

Page 7: Stat 306: Finding Relationships in Data.

Age vs. Money

X (independent variable): age, old (0) or young (1)

Y (dependent variable): dollars ($) in bank account

Page 8: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (independent variable): age, old (0) or young (1)

Y (dependent variable): dollars ($) in bank account

Page 9: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (independent variable): age, old (0) or young (1)

Y (dependent variable): dollars ($) in bank account

Population parameters: μ0, μ1, σ²

Page 10: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (independent variable): age, old (0) or young (1)

Y (dependent variable): dollars ($) in bank account

Population parameters:

μ0: mean money ($) for old people

μ1: mean money ($) for young people

σ²: variance of money ($²) for everyone

Page 11: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (independent variable): age, old (0) or young (1)

Y (dependent variable): dollars ($) in bank account

Population parameters: μ0, μ1, σ²

Hypothesis test:

H0: μ0 = μ1

H1: μ0 ≠ μ1

Page 12: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (independent variable): age, old (0) or young (1)

Y (dependent variable): dollars ($) in bank account

Population parameters: μ0, μ1, σ²

Hypothesis test:

H0: μ0 = μ1 (“null” hypothesis)

H1: μ0 ≠ μ1 (“alternative” hypothesis)

Page 13: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (independent variable): age, old (0) or young (1)

Y (dependent variable): dollars ($) in bank account

Population parameters: μ0, μ1, σ²

Hypothesis test:

H0: μ0 = μ1

H1: μ0 ≠ μ1

Sample

Page 15: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (independent variable): age, old (0) or young (1)

Y (dependent variable): dollars ($) in bank account

Population parameters: μ0, μ1, σ²

Hypothesis test:

H0: μ0 = μ1

H1: μ0 ≠ μ1

Sample: old, young

Page 16: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (independent variable): age, old (0) or young (1)

Y (dependent variable): dollars ($) in bank account

Population parameters: μ0, μ1, σ²

Hypothesis test:

H0: μ0 = μ1

H1: μ0 ≠ μ1

Sample, n = 9: John, Paul, Mary, Lisa, Andy, Tim, Peter, Rose, Tony

Page 17: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (independent variable): age, old (0) or young (1)

Y (dependent variable): dollars ($) in bank account

Population parameters: μ0, μ1, σ²

Hypothesis test:

H0: μ0 = μ1

H1: μ0 ≠ μ1

Sample, n = 9

X: old, young, old, old, young, young, young, young, young

y: 71, 54, 43, 45, 21, 11, 30, 45, 10

Page 18: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (independent variable): age, old (0) or young (1)

Y (dependent variable): dollars ($) in bank account

Population parameters: μ0, μ1, σ²

Hypothesis test:

H0: μ0 = μ1

H1: μ0 ≠ μ1

Sample, n = 9

X: old, young, old, old, young, young, young, young, young

y: 71, 54, 43, 45, 21, 11, 30, 45, 10

Sample statistics

Page 19: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (independent variable): age, old (0) or young (1)

Y (dependent variable): dollars ($) in bank account

Population parameters: μ0, μ1, σ²

Hypothesis test:

H0: μ0 = μ1

H1: μ0 ≠ μ1

Sample, n = 9

X: old, young, old, old, young, young, young, young, young

y: 71, 54, 43, 45, 21, 11, 30, 45, 10

Sample statistics: t = 2.68, df = 7, p-value = 0.03, 95% C.I. = [3.4, 54.6]

Page 20: Stat 306: Finding Relationships in Data.

Age vs. Money

Objective: The purpose of this observational study was to demonstrate if, and to what extent, age is associated with money.

Design and Methods: We surveyed a number of individuals and for each determined approximate age (recorded as “old” or “young”) and the amount of money (in dollars) in their bank accounts. Comparison of the two groups was done using a Student two-sample t-test.

Results: We obtained a random sample of n = 9 subjects. The “young” group had an average of $27, while the “old” group had an average of $56. This estimated difference of $29 (95% C.I. = [$3.4, $54.6]) is statistically significant, t = 2.68, df = 7; p-value = 0.03.

Conclusions: We found that, as hypothesized, age is associated with money. On average, younger people have less in their accounts than older people.

Small Print: The analysis rests on the following assumptions:

- the observations are independently and identically distributed.
- the dependent variable, money, is normally distributed.
- the two populations being compared have the same variance.

t = 2.68, df = 7, p-value = 0.03, 95% C.I. = [3.4, 54.6]
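
The reported statistics can be checked by hand with a pooled-variance two-sample t-test. The sketch below uses only the Python standard library. The split of the nine balances into an old and a young group is an assumption: the transcript does not align rows reliably, and this split is the one that reproduces the reported group means of $56 and $27. The critical value 2.365 is the tabulated 97.5th percentile of Student's t with 7 degrees of freedom, and the function name is illustrative.

```python
import math
from statistics import mean, variance

# Assumed grouping of the nine bank balances ($); it reproduces the reported
# group means of $56 (old) and $27 (young).
old = [71, 54, 43]
young = [45, 21, 11, 30, 45, 10]

# 97.5th percentile of Student's t with 7 degrees of freedom (standard tables).
T_CRIT_DF7 = 2.365


def pooled_t_test(a, b):
    """Return (difference in means, t statistic, degrees of freedom, standard error)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    diff = mean(a) - mean(b)
    return diff, diff / se, df, se


diff, t_stat, df, se = pooled_t_test(old, young)
ci_low = diff - T_CRIT_DF7 * se
ci_high = diff + T_CRIT_DF7 * se

print(f"difference = {diff:.1f}, t = {t_stat:.2f}, df = {df}")
print(f"95% C.I. = [{ci_low:.1f}, {ci_high:.1f}]")
```

Under that grouping the output agrees with the slide: a difference of 29, t = 2.68 on 7 degrees of freedom, and a 95% C.I. of [3.4, 54.6].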

Page 21: Stat 306: Finding Relationships in Data.

Boxplot of Money ($) by Age (Young, Old); vertical axis from 0 to 100.

Page 22: Stat 306: Finding Relationships in Data.

Age vs. Money

X (independent variable): age, old (1) or young (0)

Y (dependent variable): dollars ($) in bank account

Linear Regression
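
The slides do not spell this out, but it is a standard fact that regressing money on a 0/1 age indicator recovers the two-sample comparison: the intercept is the mean of the group coded 0 and the slope is the difference between the group means. The sketch below illustrates this with the old (1) / young (0) coding. The grouping of the balances is the same assumption as before (chosen to match the reported means of $56 and $27), and `least_squares` is an illustrative helper, not something from the course.

```python
from statistics import mean

# Assumed grouping of the nine balances ($), chosen to match the reported
# group means; x is the indicator old (1) / young (0).
old = [71, 54, 43]
young = [45, 21, 11, 30, 45, 10]
x = [1] * len(old) + [0] * len(young)
y = old + young


def least_squares(x, y):
    """Return (intercept, slope) of the simple least-squares line."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


b0, b1 = least_squares(x, y)
print(f"b0 = {b0:.1f} (mean of the young group)")
print(f"b1 = {b1:.1f} (old mean minus young mean)")
```

With this coding the fitted intercept is $27 and the fitted slope is $29, the same difference the t-test estimated.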

Page 24: Stat 306: Finding Relationships in Data.

Age vs. Money

X (independent variable): Age in Years

Y (dependent variable): dollars ($) in bank account

Linear Regression

Page 25: Stat 306: Finding Relationships in Data.

Age vs. Money

X (PREDICTOR variable): Age in Years

Y (RESPONSE variable): dollars ($) in bank account

Linear Regression

Page 26: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (PREDICTOR variable): Age in Years

Y (RESPONSE variable): dollars ($) in bank account

Population parameters: β0, β1, σ²
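
The population parameters β0, β1 and σ² are usually tied together by the simple linear regression model. The slide does not write the model out, so the following is a sketch in standard notation; the error term ε_i and the index i are assumed symbols that do not appear in the transcript.

```latex
Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,
\qquad \varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2),
\quad i = 1, \dots, n
```

Here β0 is the intercept, β1 is the change in mean money per additional year of age, and σ² is the common variance of the errors.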

Page 27: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (PREDICTOR variable): Age in Years

Y (RESPONSE variable): dollars ($) in bank account

Population parameters: β0, β1, σ²

Hypothesis test:

H0: β1 = 0

H1: β1 ≠ 0

Page 28: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (PREDICTOR variable): Age in Years

Y (RESPONSE variable): dollars ($) in bank account

Population parameters: β0, β1, σ²

Hypothesis test:

H0: β1 = 0 (“null” hypothesis)

H1: β1 ≠ 0 (“alternative” hypothesis)

Page 29: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (PREDICTOR variable): Age in Years

Y (RESPONSE variable): dollars ($) in bank account

Population parameters: β0, β1, σ²

Hypothesis test:

H0: β1 = 0

H1: β1 ≠ 0

Sample

Page 31: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (PREDICTOR variable): Age in Years

Y (RESPONSE variable): dollars ($) in bank account

Population parameters: β0, β1, σ²

Hypothesis test:

H0: β1 = 0

H1: β1 ≠ 0

Sample: old, young

Page 32: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (PREDICTOR variable): Age in Years

Y (RESPONSE variable): dollars ($) in bank account

Population parameters: β0, β1, σ²

Hypothesis test:

H0: β1 = 0

H1: β1 ≠ 0

Sample, n = 9: John, Paul, Mary, Lisa, Andy, Tim, Peter, Rose, Tony

Page 33: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (PREDICTOR variable): Age in Years

Y (RESPONSE variable): dollars ($) in bank account

Population parameters: β0, β1, σ²

Hypothesis test:

H0: β1 = 0

H1: β1 ≠ 0

Sample, n = 9

X: 82, 22, 45, 71, 29, 12, 9, 18, 24

y: 71, 54, 43, 45, 21, 11, 30, 45, 10

Page 34: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (PREDICTOR variable): Age in Years

Y (RESPONSE variable): dollars ($) in bank account

Population parameters: β0, β1, σ²

Hypothesis test:

H0: β1 = 0

H1: β1 ≠ 0

Sample, n = 9

X: 82, 22, 45, 71, 29, 12, 9, 18, 24

y: 71, 54, 43, 45, 21, 11, 30, 45, 10

Sample statistics: b0 = 17.7, b1 = 0.55, s = 15.5, R² = 0.49

Page 35: Stat 306: Finding Relationships in Data.

Age vs. Money

Population

X (PREDICTOR variable): Age in Years

Y (RESPONSE variable): dollars ($) in bank account

Population parameters: β0, β1, σ²

Hypothesis test:

H0: β1 = 0

H1: β1 ≠ 0

Sample, n = 9

X: 82, 22, 45, 71, 29, 12, 9, 18, 24

y: 71, 54, 43, 45, 21, 11, 30, 45, 10

Sample statistics: b0 = 17.7, b1 = 0.55, s = 15.5, R² = 0.49

For parameter β1:

Page 36: Stat 306: Finding Relationships in Data.

Age vs. Money

Objective: The purpose of this observational study was to demonstrate if, and to what extent, age is associated with money.

Design and Methods: We collected a random sample of individuals and for each determined their age (recorded in years) and the amount of money (in dollars) in their accounts. Analysis of the data was done using linear regression.

Results: We obtained a random sample of n = 9 subjects. There is a statistically significant association between age and money (p-value = 0.036). For every additional year in age, an individual’s amount of money increases on average by an estimated $0.55 (95% C.I. = [$0.05, $1.05]).

Conclusions: We found that, as hypothesized, age is associated with money. In our sample, age accounted for about half of the variability observed in money (R² = 0.49). We predict that a 50-year-old will have $45.1 (95% P.I. = [$5.6, $84.5]), whereas a 40-year-old will have $39.6 (95% P.I. = [$0.8, $78.4]).

Small Print: The analysis rests on the following assumptions:

- the observations are independently and identically distributed.
- the response variable, money, is normally distributed.
- homoscedasticity of residuals, or equal variance.
- the relationship between response and predictor variables is linear.

For parameter β1:
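
The reported regression results can be cross-checked from the rounded sample statistics alone. The sketch below assumes the fitted line is money = b0 + b1 × age with the slide's rounded values, backs out the slope's standard error from the reported 95% C.I., and uses the identity t² = R²(n − 2)/(1 − R²), which holds for simple linear regression. The critical value 2.365 is the tabulated 97.5th percentile of Student's t with 7 degrees of freedom. The function and variable names are illustrative.

```python
import math

# Rounded sample statistics as reported on the slides.
B0 = 17.7   # intercept ($)
B1 = 0.55   # slope ($ per year of age)
R2 = 0.49   # coefficient of determination
N = 9       # sample size

# 97.5th percentile of Student's t with N - 2 = 7 degrees of freedom (standard tables).
T_CRIT_DF7 = 2.365


def predict_money(age_years):
    """Point prediction of money ($) from the fitted line."""
    return B0 + B1 * age_years


# Back out the slope's standard error from the reported 95% C.I. [0.05, 1.05].
ci_low, ci_high = 0.05, 1.05
se_b1 = (ci_high - ci_low) / (2 * T_CRIT_DF7)
t_b1 = B1 / se_b1

# Independent check of the same t statistic from R-squared.
t_from_r2 = math.sqrt(R2 * (N - 2) / (1 - R2))

for age in (40, 50):
    print(f"age {age}: predicted ${predict_money(age):.1f}")
print(f"SE(b1) ~ {se_b1:.3f}, t ~ {t_b1:.2f}, t from R^2 ~ {t_from_r2:.2f}")
```

Because the coefficients are rounded, the point predictions come out about $0.1 above the slide's $39.6 and $45.1. Both routes give a t statistic near 2.6 on 7 degrees of freedom, which is in line with the reported p-value of 0.036.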

Page 37: Stat 306: Finding Relationships in Data.

Plot of Money ($) against Age (years); both axes from 0 to 100.