Simple Linear Regression

Date posted: 13-Mar-2016

  • Simple Linear Regression

    Often we want to understand the relationships among variables, e.g.,

      • SAT scores and college GPA
      • car weight and gas mileage
      • amount of a certain pollutant in wastewater and bacteria growth in local streams
      • number of takeoffs and landings and degree of metal fatigue in aircraft structures

    Simplest relationship:

    $Y = \beta_0 + \beta_1 x$

    ETM 620 - 09U

    [Chart: scatter plot of the response variable, y, against the predictor variable, x]

    Excel Regression

    SUMMARY OUTPUT

    Regression Statistics

    Multiple R         0.8848890612
    R Square           0.7830286506
    Adjusted R Square  0.7559072319
    Standard Error     0.2826250474
    Observations       10

    ANOVA

                df  SS              MS              F               Significance F
    Regression  1   2.3061446606    2.3061446606    28.8712275689   0.0006670461
    Residual    8   0.6390153394    0.0798769174
    Total       9   2.94516

                   Coefficients    Standard Error  t Stat          P-value         Lower 95%       Upper 95%       Lower 95.0%     Upper 95.0%
    Intercept      -7.8771866841   2.0266861397    -3.8867324001   0.0046305403    -12.5507363253  -3.2036370428   -12.5507363253  -3.2036370428
    Attendance, x  0.0867558747    0.0161460491    5.3731952848    0.0006670461    0.0495229947    0.1239887546    0.0495229947    0.1239887546

    $t(\alpha/2;\, n-2) = 2.3060056265$

    0.0499998835
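The Regression Statistics block can be recomputed from the ANOVA sums of squares. The following is a minimal sketch, assuming the usual definitions (adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), standard error = square root of the residual mean square) with p = 1 predictor; the variable names are illustrative and not taken from the slides.

```python
import math

# Sums of squares and sample size copied from the ANOVA table above.
ssr = 2.3061446606   # regression sum of squares
sse = 0.6390153394   # residual (error) sum of squares
n = 10               # observations
p = 1                # predictors (assumption: simple linear regression)

ssto = ssr + sse                                    # total sum of squares
r_square = ssr / ssto                               # share of variability explained
multiple_r = math.sqrt(r_square)                    # absolute correlation of x and Y
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - p - 1)
standard_error = math.sqrt(sse / (n - p - 1))       # sqrt of MS Residual

print(f"Multiple R         {multiple_r:.6f}")
print(f"R Square           {r_square:.6f}")
print(f"Adjusted R Square  {adj_r_square:.6f}")
print(f"Standard Error     {standard_error:.6f}")
```

With these inputs the four printed values should agree with the Excel output to the displayed precision.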

    Attendance, x  Amount Bet, Y  Ŷ             Residual (Ŷ - Y)  Residual²     (Y - Ȳ)²
    117            2.07           2.2732506527  0.2032506527      0.0413108278  0.868624
    128            2.8            3.2275652742  0.4275652742      0.1828120637  0.040804
    122            3.14           2.7070300261  -0.4329699739     0.1874629983  0.019044
    119            2.26           2.4467624021  0.1867624021      0.0348801948  0.550564
    131            3.4            3.4878328982  0.0878328982      0.007714618   0.158404
    135            3.89           3.8348563969  -0.0551436031     0.003040817   0.788544
    125            2.93           2.9672976501  0.0372976501      0.0013911147  0.005184
    120            2.66           2.5335182768  -0.1264817232     0.0159976263  0.116964
    130            3.33           3.4010770235  0.0710770235      0.0050519433  0.107584
    127            3.54           3.1408093995  -0.3991906005     0.1593531355  0.289444

    Total:    1254   30.02  30.02                  0.6390153394  2.94516
    Average:  125.4  3.002  3.002
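The fitted line behind the Ŷ column can be reproduced from the attendance and amount-bet columns. This is a minimal least-squares sketch, assuming the usual closed-form estimates (slope = Sxy/Sxx, intercept = Ȳ − slope · x̄); the function name `fit_line` is hypothetical.

```python
# Attendance (x) and amount bet (Y) from the worksheet above.
x = [117, 128, 122, 119, 131, 135, 125, 120, 130, 127]
y = [2.07, 2.80, 3.14, 2.26, 3.40, 3.89, 2.93, 2.66, 3.33, 3.54]

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

b0, b1 = fit_line(x, y)
y_hat = [b0 + b1 * xi for xi in x]

# The worksheet's residual column is Y-hat minus Y (the opposite sign of the
# more common Y minus Y-hat); the squared residuals are the same either way.
residuals = [yh - yi for yh, yi in zip(y_hat, y)]

print(f"intercept = {b0:.6f}, slope = {b1:.6f}")
print(f"sum of squared residuals = {sum(r * r for r in residuals):.6f}")
```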

    a) Finding sums of squares (see pp. 356-357):

    $SSTO = \sum (Y_i - \bar{Y})^2 = 2.94516$

    $SSR = \sum (\hat{Y}_i - \bar{Y})^2 = 2.306145$

    $SSE = \sum (Y_i - \hat{Y}_i)^2 = 0.639015$

    b) Coefficient of determination, $R^2$:

    $R^2 = SSR/SSTO = 1 - SSE/SSTO = 0.7830286506$
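The three sums of squares can be checked directly from the Y and Ŷ columns of the worksheet. The sketch below copies those two columns as given and assumes the standard definitions of SSTO, SSR and SSE; it also confirms the decomposition SSTO = SSR + SSE.

```python
# Observed Y and fitted Y-hat columns copied from the worksheet.
y = [2.07, 2.80, 3.14, 2.26, 3.40, 3.89, 2.93, 2.66, 3.33, 3.54]
y_hat = [2.2732506527, 3.2275652742, 2.7070300261, 2.4467624021, 3.4878328982,
         3.8348563969, 2.9672976501, 2.5335182768, 3.4010770235, 3.1408093995]

y_bar = sum(y) / len(y)

ssto = sum((yi - y_bar) ** 2 for yi in y)                 # total variability
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)              # explained by the line
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))     # left in the residuals

r_square = ssr / ssto

print(f"SSTO = {ssto:.6f}")
print(f"SSR  = {ssr:.6f}")
print(f"SSE  = {sse:.6f}")
print(f"R^2  = {r_square:.6f} (also 1 - SSE/SSTO = {1 - sse / ssto:.6f})")
```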

    c) ANOVA:

    $H_0: \beta_1 = 0$

    $H_1: \beta_1 \neq 0$

    Translation:

    Source      SS    df   MS  F
    Regression  SSR   1
    Error       SSE   n-2
    Total       SSTO  n-1
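The MS and F columns of the ANOVA table follow from SS and df. The sketch below assumes the standard definitions for a one-predictor model (MS = SS/df, F = MS Regression / MS Error); the helper name `anova_rows` is hypothetical.

```python
def anova_rows(ssr, sse, n):
    """Return (source, SS, df, MS, F) rows for a one-predictor regression."""
    df_reg, df_err = 1, n - 2
    msr = ssr / df_reg          # mean square for regression
    mse = sse / df_err          # mean square error
    f_stat = msr / mse          # test statistic for H0: beta_1 = 0
    return [
        ("Regression", ssr, df_reg, msr, f_stat),
        ("Error", sse, df_err, mse, None),
        ("Total", ssr + sse, n - 1, None, None),
    ]

# SSR, SSE and n as reported in parts a) and b).
rows = anova_rows(ssr=2.306145, sse=0.639015, n=10)

for source, ss, df, ms, f_stat in rows:
    ms_text = f"{ms:.6f}" if ms is not None else ""
    f_text = f"{f_stat:.4f}" if f_stat is not None else ""
    print(f"{source:<11}{ss:>10.6f}{df:>4}{ms_text:>11}{f_text:>9}")
```

In simple linear regression F equals the square of the slope's t statistic, which is why Significance F and the slope's P-value coincide in the Excel output.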

    d) Confidence intervals:

    1. Around $\beta_0$: $\hat{\beta}_0 \pm t(\alpha/2;\, n-2)\, s(\hat{\beta}_0)$

    $t(\alpha/2;\, n-2) = 2.3060056265$

    $s(\hat{\beta}_0) = 2.0266861397$

    C.I. = [-12.55074, -3.2036370428]

              x      (x - x̄)²
              117    70.56
              128    6.76
              122    11.56
              119    40.96
              131    31.36
              135    92.16
              125    0.16
              120    29.16
              130    21.16
              127    2.56
    Total:    1254   306.4
    Average:  125.4  30.64
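The standard error of the intercept and the resulting interval can be reproduced from the x column. This sketch assumes the usual formula, s(β̂0) = s · sqrt(1/n + x̄² / Σ(x − x̄)²), and takes s, β̂0 and the t value from the output above rather than recomputing them.

```python
import math

x = [117, 128, 122, 119, 131, 135, 125, 120, 130, 127]
n = len(x)

b0_hat = -7.8771866841    # intercept from the regression output
s = 0.2826250474          # Standard Error from the regression output
t_crit = 2.3060056265     # t(alpha/2; n-2) as given on the worksheet

x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)    # total of the (x - x_bar)^2 column

# Standard error of the intercept, then the interval b0_hat +/- t * s(b0_hat).
s_b0 = s * math.sqrt(1 / n + x_bar ** 2 / sxx)
half_width = t_crit * s_b0
lower, upper = b0_hat - half_width, b0_hat + half_width

print(f"Sxx = {sxx:.2f}, s(b0) = {s_b0:.6f}")
print(f"C.I. = [{lower:.5f}, {upper:.5f}]")
```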

    2. Around the mean response:

    $t(\alpha/2;\, n-2) = 2.3060056265$

    $s(\hat{\beta}_0 + \hat{\beta}_1 x) = s(\hat{Y})$

    x      Y      Ŷ             s(Ŷ)          Lower         Upper
    117    2.07   2.2732506527  0.4381027429  1.2629832626  3.2835180428
    128    2.8    3.2275652742  0.1600333309  2.8585275127  3.5966030356
    122    3.14   2.7070300261  0.195253738   2.2567738077  3.1572862446
    119    2.26   2.4467624021  0.3387747377  1.665545951   3.2279788532
    131    3.4    3.4878328982  0.2995690446  2.7970249959  4.1786408004
    135    3.89   3.8348563969  0.4982410178  2.6859098066  4.9838029872
    125    2.93   2.9672976501  0.0916777158  2.7558883216  3.1787069787
    120    2.66   2.5335182768  0.2898384458  1.86514919    3.2018873635
    130    3.33   3.4010770235  0.2512980952  2.8215822021  3.9805718448
    127    3.54   3.1408093995  0.1210846034  2.8615876227  3.4200311763

    1254   30.02  30.02
    125.4  3.002  3.002
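For comparison, the following sketch computes the interval around the mean response with the textbook formula s(Ŷ) = s · sqrt(1/n + (x0 − x̄)² / Σ(x − x̄)²). One caveat: the tabulated s(Ŷ) values above are reproduced only when the denominator is the average squared deviation (30.64) rather than the sum (306.4), so the numbers printed here are narrower than those in the table. The function name and the choice of x0 = 125 are illustrative.

```python
import math

x = [117, 128, 122, 119, 131, 135, 125, 120, 130, 127]
n = len(x)

b0_hat, b1_hat = -7.8771866841, 0.0867558747   # from the regression output
s = 0.2826250474                               # Standard Error from the output
t_crit = 2.3060056265                          # t(alpha/2; n-2) from the worksheet

x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)       # sum (not average) of squared deviations

def mean_response_interval(x0):
    """Return (y_hat, s_mean, lower, upper) for the mean response at x0."""
    y_hat = b0_hat + b1_hat * x0
    s_mean = s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
    return y_hat, s_mean, y_hat - t_crit * s_mean, y_hat + t_crit * s_mean

y_hat, s_mean, lower, upper = mean_response_interval(125)
print(f"x0 = 125: Y-hat = {y_hat:.4f}, s(Y-hat) = {s_mean:.4f}, "
      f"interval = [{lower:.4f}, {upper:.4f}]")
```

The interval is narrowest at x0 = x̄ and widens as x0 moves away from the mean attendance.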

    3. Around the prediction:

    $t(\alpha/2;\, n-2) = 2.3060056265$

    $s(\text{pred}) = \sqrt{s^2 + s(\hat{Y})^2}$

    x      Y      Ŷ             s(pred)       Lower         Upper
    117    2.07   2.2732506527  0.5213548991  1.0710033221  3.4754979833
    128    2.8    3.2275652742  0.3247885226  2.4786011136  3.9765294347
    122    3.14   2.7070300261  0.3435126484  1.9148879261  3.4991721262
    119    2.26   2.4467624021  0.4411861742  1.4293846021  3.4641402021
    131    3.4    3.4878328982  0.4118477023  2.5381097795  4.4375560169
    135    3.89   3.8348563969  0.5728184959  2.5139337224  5.1557790713
    125    2.93   2.9672976501  0.2971224007  2.2821317223  3.6524635779
    120    2.66   2.5335182768  0.4048249524  1.5999896588  3.4670468947
    130    3.33   3.4010770235  0.3781899656  2.528968835   4.273185212
    127    3.54   3.1408093995  0.307470972   2.4317796081  3.8498391909

    1254   30.02  30.02
    125.4  3.002  3.002
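The prediction table is tied to the mean-response table by one relationship: the variance of a prediction is the residual variance plus the variance of the estimated mean. The sketch below takes three tabulated s(Ŷ) values from the mean-response table as given and applies sqrt(s² + s(Ŷ)²) to recover the prediction standard errors and limits; the helper name is hypothetical.

```python
import math

s = 0.2826250474          # Standard Error from the regression output
t_crit = 2.3060056265     # t(alpha/2; n-2) from the worksheet

# (x, Y-hat, tabulated s(Y-hat)) for three rows of the mean-response table.
rows = [
    (117, 2.2732506527, 0.4381027429),
    (125, 2.9672976501, 0.0916777158),
    (135, 3.8348563969, 0.4982410178),
]

def prediction_interval(y_hat, s_mean):
    """Return (s_pred, lower, upper) for a single new observation."""
    s_pred = math.sqrt(s ** 2 + s_mean ** 2)
    return s_pred, y_hat - t_crit * s_pred, y_hat + t_crit * s_pred

results = {x0: prediction_interval(y_hat, s_mean) for x0, y_hat, s_mean in rows}

for x0, (s_pred, lower, upper) in results.items():
    print(f"x = {x0}: s(pred) = {s_pred:.6f}, interval = [{lower:.4f}, {upper:.4f}]")
```

Because s² is added under the square root, the prediction interval is always wider than the interval around the mean response at the same x.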

    YOUR TURN:

    Predict the amount bet and calculate the confidence intervals around the mean response and the prediction if the attendance is:

      • 126
      • 114

    Translation: The $R^2$ value indicates that ___________% of the variability in the amount bet is explained by the regression line.


  • Example

    An electric power cooperative is concerned about the cost of power outages in the winter, and the analyst has an idea that these costs are directly related to the average temperature during the outage period. A random sampling of power outages over a number of years was conducted and the cost per 100 homes (adjusted for inflation) was determined, with these results:

    Temp, °F  Cost/Outage
    45        $3,639
    42        $4,111
    44        $3,928
    37        $4,252
    33        $5,020
    45        $3,838
    35        $4,293
    38        $4,244
    39        $4,227
    40        $4,111
    30        $5,335
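Before reading the regression output for this example, it can help to quantify how strongly cost and temperature move together. The sketch below computes the sample correlation r and the least-squares line from the eleven observations, assuming the standard formulas r = Sxy / sqrt(Sxx · Syy) and slope = Sxy / Sxx; the variable names are illustrative.

```python
import math

temp = [45, 42, 44, 37, 33, 45, 35, 38, 39, 40, 30]            # degrees F
cost = [3639, 4111, 3928, 4252, 5020, 3838, 4293, 4244, 4227,
        4111, 5335]                                            # cost per outage, $

n = len(temp)
t_bar = sum(temp) / n
c_bar = sum(cost) / n

sxx = sum((t - t_bar) ** 2 for t in temp)
syy = sum((c - c_bar) ** 2 for c in cost)
sxy = sum((t - t_bar) * (c - c_bar) for t, c in zip(temp, cost))

r = sxy / math.sqrt(sxx * syy)        # negative: cost falls as temperature rises
slope = sxy / sxx                     # change in cost per additional degree F
intercept = c_bar - slope * t_bar

print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
print(f"cost = {intercept:.2f} + ({slope:.2f}) * temp")
```

Excel's "Multiple R" is the absolute value of r, so the direction of the relationship has to be read from the sign of the slope.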

    [Chart: Avg. Cost/Outage. Cost plotted against Temperature]

    SUMMARY OUTPUT

    Regression Statistics

    Multiple R         0.9318277576
    R Square           0.8683029698
    Adjusted R Square  0.8536699664
    Standard Error     189.4337334352
    Observations       11

    ANOVA

                df  SS                MS                F                 Significance F
    Regression  1   2129376.47300398  2129376.47300398  59.3386708479     0.0000299026
    Residual    9   322966.254268745  35885.1393631939
    Total       10  2452342.72727273

               Coefficients      Standard Error    t Stat            P-value           Lower 95%         Upper 95%         Lower 95.0%       Upper 95.0%
    Intercept  7900.6087602079   474.4344051715    16.6526893372     0.0000000454      6827.3635744654   8973.8539459504   6827.3635744654   8973.8539459504
    Temp       -93.2446176689    12.1047232738     -7.7031597963     0.0000299026      -120.6274040705   -65.8618312672    -120.6274040705   -65.8618312672

    RESIDUAL OUTPUT

    Observation  Predicted Avg. Cost/Outage  Residuals        Standard Residuals
    1            3704.6009651076             -65.6009651076   -0.365032603
    2            3984.3348181143             126.6651818857   0.7048207443
    3            3797.8455827765             130.1544172235   0.7242363833
    4            4450.5579064588             -198.5579064588  -1.1048634624
    5            4823.5363771344             196.4636228656   1.093209948
    6            3704.6009651076             133.3990348924   0.7422908621
    7            4637.0471417966             -344.0471417966  -1.9144295137
    8            4357.3132887899             -113.3132887899  -0.6305249427
    9            4264.068671121              -37.068671121    -0.2062663787
    10           4170.8240534521             -59.8240534521   -0.3328873275
    11           5103.270230141              231.7297698589   1.2894462901
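The Standard Residuals column appears to be each residual divided by the sample standard deviation of the residuals (n − 1 in the denominator). That is an inference from the numbers in the table, not something the slides state, so treat the sketch below as a check rather than a definition.

```python
import statistics

# Residuals column copied from the RESIDUAL OUTPUT table.
residuals = [
    -65.6009651076, 126.6651818857, 130.1544172235, -198.5579064588,
    196.4636228656, 133.3990348924, -344.0471417966, -113.3132887899,
    -37.068671121, -59.8240534521, 231.7297698589,
]

# Sample standard deviation (n - 1 denominator) of the residuals.
resid_sd = statistics.stdev(residuals)

standard_residuals = [r / resid_sd for r in residuals]

for obs, (r, z) in enumerate(zip(residuals, standard_residuals), start=1):
    flag = "  <- largest in magnitude" if abs(z) == max(map(abs, standard_residuals)) else ""
    print(f"{obs:>2}  {r:>10.2f}  {z:>7.4f}{flag}")
```

Observation 7 (35 °F, $4,293) has the largest standardized residual, just under 2 in magnitude.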

    [Chart: Temp Residual Plot. Residuals plotted against Temp]


    PROBABILITY OUTPUT

    Percentile     Avg. Cost/Outage
    4.5454545455   3639
    13.6363636364  3838
    22.7272727273  3928
    31.8181818182  4111
    40.9090909091  4111
    50             4227
    59.0909090909  4244
    68.1818181818  4252
    77.2727272727  4293
    86.3636363636  5020
    95.4545454545  5335
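The probability output pairs the sorted costs with evenly spaced sample percentiles. The percentiles shown are consistent with 100 · (i − 0.5) / n for rank i = 1, …, n; this formula is inferred from the numbers rather than stated in the slides. A short sketch:

```python
cost = [3639, 4111, 3928, 4252, 5020, 3838, 4293, 4244, 4227, 4111, 5335]

n = len(cost)
ranked = sorted(cost)                                       # smallest to largest
percentiles = [100 * (i - 0.5) / n for i in range(1, n + 1)]

# Pairs of (sample percentile, cost) as plotted on the normal probability plot.
for pct, value in zip(percentiles, ranked):
    print(f"{pct:>8.4f}  {value}")
```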


    [Chart: Temp Line Fit Plot. Avg. Cost/Outage and Predicted Avg. Cost/Outage plotted against Temp]

    [Chart: Normal Probability Plot. Avg. Cost/Outage plotted against Sample Percentile]


    SUMMARY OUTPUT

    Regression Statistics

    Multiple R         0.9999820029
    R Square           0.9999640062
    Adjusted R Square  0.9999550077
    Standard Error     3.3216948422
    Observations       11

    ANOVA

                df  SS                MS                F                 Significance F
    Regression  2   2452254.45801973  1226127.22900986  111126.099959415  0
    Residual    8   88.2692529987     11.0336566248
    Total       10  2452342.72727273

                Coefficients      Standard Error    t Stat            P-value           Lower 95%         Upper 95%         Lower 95.0%       Upper 95.0%
    Intercept   34.1931110992     46.7315759459     0.7316918038      0.4852354688      -73.5700961876    141.956318386     -73.5700961876    141.956318386
    Temp        -1.3003555112     0.5778759181      -2.2502330872     0.0545474615      -2.6329397669     0.0322287445      -2.6329397669     0.0322287445
    Households  14.7941134954     0.0864827441      171.0643395506    0                 14.5946839301     14.9935