Regression Continued: Functional Form Notes/Fall... 4 Functional Form Functional Form A first point


  • date post

    08-Jul-2020

  • 1

    Regression Continued: Functional Form

    LIR 832

    Topics for the Evening

    1. Qualitative Variables
    2. Non-linear Estimation

  • 2

    Functional Form

    Not all relations among variables are linear. Our basic linear model:

    y=β0+ β1X1 + β2X2 +…+ βkXk + e

    Functional Form

    Q: Given that we are using OLS, can we mimic these non-linear forms?
    A: We have a small bag of tricks which we can use with OLS.

  • 3

    Functional Form

    Functional Form

  • 4

    Functional Form

    Functional Form

    A first point about functional form: You must have an intercept.

    Consider the following case: We estimate a model and test whether the intercept is significantly different from zero. We cannot reject the null hypothesis, so we decide to re-estimate the model without an intercept. What is really going on? Return to our basic model:

    y = β0 + β1X1 + β2X2 + … + βkXk + e

    What are we doing when we remove the intercept? We are imposing β0 = 0:

    y = 0 + β1X1 + β2X2 + … + βkXk + e

  • 5

    Functional Form

    Functional Form

  • 6

    Functional Form
    /* Regression without an intercept */
    Regression Analysis: weekearn versus years ed

    The regression equation is
    weekearn = 57.3 years ed

    47576 cases used, 7582 cases contain missing values

    Predictor      Coef  SE Coef       T      P
    Noconstant
    years ed    57.3005   0.1541  371.96  0.000

    S = 534.450

    Functional Form
    /* Regression with an intercept */
    Regression Analysis: weekearn versus years ed

    The regression equation is
    weekearn = - 485 + 87.5 years ed

    47576 cases used, 7582 cases contain missing values

    Predictor      Coef  SE Coef       T      P
    Constant    -484.57    18.18  -26.65  0.000
    years ed     87.492    1.143   76.54  0.000

    S = 530.510   R-Sq = 11.0%   R-Sq(adj) = 11.0%

  • 7

    Functional Form

    Consequences of forcing through zero:

    Unless the intercept is really zero, we are going to bias both the intercept and the slope coefficients. Remember that we calculate the intercept so that the line passes through the point of means (X̄, Ȳ). This assures that Σe = 0.

    If we impose 0 as the intercept, the line may not pass through the point of means and the sum of the errors may not equal zero. This biases the coefficients and leads to incorrect estimates of the standard errors of the βs.

    Never suppress the intercept, even if your theory suggests that it is not necessary.
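
Why the intercept forces the residuals to sum to zero can be seen in the one-regressor case. The derivation below is a sketch in standard OLS notation (hats for estimates, bars for sample means); that notation is an assumption here, not taken from the slides.

```latex
\begin{align*}
\hat{\beta}_0 &= \bar{Y} - \hat{\beta}_1 \bar{X} \\
\sum_{i=1}^{n} \hat{e}_i
  &= \sum_{i=1}^{n} \left( Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i \right)
   = n\bar{Y} - n\hat{\beta}_0 - n\hat{\beta}_1 \bar{X} = 0
\end{align*}
```

With the intercept fixed at 0, the residual sum becomes n(Ȳ − β̂1 X̄), and nothing guarantees that this quantity is zero.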

    Functional Form
    /* What About Those Residuals? */

    Descriptive Statistics: RESI1, RESI2

    Variable      N    N*   Mean  SE Mean   StDev   Minimum       Q1   Median
    RESI1     47576  7582  -8.67     2.45  534.38  -1180.31  -359.12  -122.21
    RESI2     47576  7582   0.00     2.43  530.50  -1329.77  -340.32  -107.62

    Variable      Q3  Maximum
    RESI1     218.59  2311.61
    RESI2     237.69  2494.26
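
The residual summary shows one set of residuals with a nonzero mean (RESI1, presumably from the no-intercept model) and one with a mean of zero (RESI2). A minimal Python sketch of the same effect follows. The data are made up for illustration and are not the course's earnings file, and the helper names `fit_with_intercept` and `fit_through_origin` are hypothetical.

```python
# Hypothetical toy data: NOT the earnings data used in the slides.
years_ed = [8, 10, 12, 12, 14, 16, 16, 18, 20]
weekearn = [310, 390, 520, 610, 700, 930, 880, 1150, 1300]


def fit_with_intercept(x, y):
    """Simple OLS with an intercept: returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # line passes through the point of means
    return intercept, slope


def fit_through_origin(x, y):
    """OLS with the intercept forced to zero: returns the slope only."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


b0, b1 = fit_with_intercept(years_ed, weekearn)
b1_origin = fit_through_origin(years_ed, weekearn)

resid_with = [yi - (b0 + b1 * xi) for xi, yi in zip(years_ed, weekearn)]
resid_origin = [yi - b1_origin * xi for xi, yi in zip(years_ed, weekearn)]

mean_resid_with = sum(resid_with) / len(resid_with)
mean_resid_origin = sum(resid_origin) / len(resid_origin)

print(f"with intercept:  slope = {b1:.2f}, mean residual = {mean_resid_with:.4f}")
print(f"through origin:  slope = {b1_origin:.2f}, mean residual = {mean_resid_origin:.4f}")
```

On this toy data the slope changes when the intercept is dropped and the mean residual moves away from zero, which is the same pattern as in the MINITAB output.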

  • 8

    Functional Form

    Returning to the issue of non-linearity… In our basic model:

    β = ΔY/ΔX = change in Y for a one-unit change in X

    Consider the effect of Education on base salary…

    Functional Form
    Descriptive Statistics: years ed, Exp

    Variable      N  N*    Mean  SE Mean   StDev  Minimum      Q1  Median      Q3  Maximum
    years ed  55158   0  15.734  0.00941   2.211    1.000  14.000  16.000  18.000   21.000
    Exp       55107  51  21.644   0.0496  11.640   0.0000  13.000  22.000  30.000   76.000

    Regression Analysis: weekearn versus years ed

    The regression equation is
    weekearn = - 485 + 87.5 years ed

    47576 cases used, 7582 cases contain missing values

    Predictor      Coef  SE Coef       T      P
    Constant    -484.57    18.18  -26.65  0.000
    years ed     87.492    1.143   76.54  0.000

    S = 530.510   R-Sq = 11.0%   R-Sq(adj) = 11.0%

  • 9

    Functional Form

    Now create a graph in MINITAB:

    Work in a new worksheet:
    Create values for years of education 0 - 21
    Use the calculator to create the predicted weekly earnings.
    Use the scatterplot graphing function.

    Functional Form

    Every year of education increases earnings by $87.49!
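
The MINITAB steps (create years of education 0 to 21, then compute predicted weekly earnings) can be sketched in Python as follows. The coefficients are the ones reported in the regression output; the function name `predict_weekearn` is hypothetical.

```python
# Coefficients from the reported regression: weekearn = -484.57 + 87.492 * years ed
B0 = -484.57
B1 = 87.492


def predict_weekearn(years_ed):
    """Predicted weekly earnings from the linear model."""
    return B0 + B1 * years_ed


# Years of education 0 through 21, as in the MINITAB worksheet.
predictions = {years: predict_weekearn(years) for years in range(0, 22)}

for years, earnings in predictions.items():
    print(f"{years:2d} years -> {earnings:9.2f}")
```

Every step of one year adds the same 87.492 to the prediction. Note that the fitted line gives negative predicted earnings for very low values of education, a reminder that the linear form is only an approximation.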

  • 10

    Functional Form

    Q: How do we estimate non-linear relations?
    A: We can use log transforms of variables to measure relations between variables as percentages rather than units.

    What is a log? What is a log transform? Take any number, let’s take 10. Then calculate b such that 10 = 2.71828^b. Then b is the log of 10. In this case b = 2.302585. You can do this on your calculator, in a spreadsheet, or in MINITAB.

    Functional Form

    As your text shows:

    ln(100) = 4.605              100 = 2.71828^4.605
    ln(1000) = 6.908             1000 = 2.71828^6.908
    ln(10,000) = 9.210           10,000 = 2.71828^9.210
    ln(1,000,000) = 13.816       1,000,000 = 2.71828^13.816

    We typically do not write 2.71828; rather, we substitute e, the natural base (there are also base 10 logs). So… 10 = e^2.302585

    Some nice properties of log functions:

    ln(X*Y) = ln(X) + ln(Y)
    ln(X^2) = 2*ln(X)
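
These log facts are easy to check numerically. The short sketch below uses Python's standard `math` module; the test values `x` and `y` are arbitrary choices for illustration.

```python
import math

# The log of 10: the b such that 10 = e^b.
b = math.log(10)
back = math.exp(b)  # should recover 10
print(f"ln(10) = {b:.6f}, e^ln(10) = {back:.6f}")

# The values quoted from the text.
for value in (100, 1000, 10_000, 1_000_000):
    print(f"ln({value}) = {math.log(value):.3f}")

# The two properties, checked on arbitrary positive numbers.
x, y = 7.5, 120.0
lhs_product = math.log(x * y)
rhs_sum = math.log(x) + math.log(y)
lhs_power = math.log(x ** 2)
rhs_double = 2 * math.log(x)
print(lhs_product, rhs_sum)
print(lhs_power, rhs_double)
```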

  • 11

    Functional Form

    This property made it possible to manipulate very large numbers very easily and provides the foundation for slide rules and many modern computer calculations.

    Consider: 1,212,345*375,282. A real mess to do by hand.

    Now consider the following transformation of this problem:

    ln(1,212,345*375,282)
    = ln(1,212,345) + ln(375,282)
    = 14.008067 + 12.83543
    = 26.8435

    Taking the antilog recovers the product: e^26.8435 = antilog(26.8435) ≈ 4.55 × 10^11 (the exact product is 454,971,256,290).
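
The same multiplication-by-logs calculation can be sketched in Python. Variable names are illustrative only.

```python
import math

a = 1_212_345
b = 375_282

# Add the logs instead of multiplying the numbers.
log_sum = math.log(a) + math.log(b)

# The antilog (exponential) of the sum recovers the product.
product_via_logs = math.exp(log_sum)

# Direct integer multiplication for comparison.
exact_product = a * b

print(f"ln(a) + ln(b)    = {log_sum:.4f}")
print(f"antilog of sum   = {product_via_logs:,.0f}")
print(f"exact product    = {exact_product:,}")
```

The antilog of a rounded sum such as 26.8435 only approximates the product, so rounding the logs too early costs precision in the final answer.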

    Functional Form

    The Shell presentation has an equation associated with an upward curve of:

    Earnings = 62988*x^0.2676
    Or… y = β0*X^β1

    We cannot estimate this in its current form using regression, but think about taking the log of each side:

    ln(y) = ln(β0*X^β1)
    ln(y) = ln(β0) + ln(X^β1)
    ln(y) = ln(β0) + β1*ln(X)

    So, if we take the log of each side, we get a linear equation that we can estimate!
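
To see the linearization work, the sketch below generates points that lie exactly on the curve Earnings = 62988·x^0.2676, regresses ln(y) on ln(x) by simple OLS, and recovers both parameters. The x values and the helper `ols` are hypothetical choices for illustration, and real data would include an error term.

```python
import math


def ols(x, y):
    """Simple OLS with an intercept: returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


BETA0 = 62988.0   # multiplicative constant from the slide's equation
BETA1 = 0.2676    # exponent from the slide's equation

# Hypothetical x values; y is computed exactly from y = BETA0 * x^BETA1.
x_values = [1, 2, 5, 10, 20, 50, 100]
y_values = [BETA0 * x ** BETA1 for x in x_values]

# Log both sides, then fit a straight line: ln(y) = ln(BETA0) + BETA1 * ln(x).
ln_x = [math.log(x) for x in x_values]
ln_y = [math.log(y) for y in y_values]
intercept, slope = ols(ln_x, ln_y)

recovered_beta0 = math.exp(intercept)  # undo the log on the intercept
recovered_beta1 = slope

print(f"recovered beta0 = {recovered_beta0:.2f}")
print(f"recovered beta1 = {recovered_beta1:.4f}")
```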

  • 12

    Functional Form

    Consider the following equation (single log equation):

    ln(weekearn) = β0 + β1*YearsEd + e

    The interpretation of the coefficient on years of education is now the % change in base salary for a 1 year change in Education (approximately 100*β1 percent).

    How to do this in MINITAB:

    Calculate the log of weekly earnings
    Estimate the regression as…

    Functional Form
    Regression Analysis: ln week earn versus years ed

    The regression equation is
    ln week earn = 4.87 + 0.109 years ed

    47576 cases used, 7582 cases contain missing values

    Predictor       Coef   SE Coef       T      P
    Constant     4.86646   0.02382  204.33  0.000
    years ed    0.108980  0.001497   72.78  0.000

    S = 0.694967   R-Sq = 10.0%   R-Sq(adj) = 10.0%

    Analysis of Variance

    Source             DF       SS      MS        F      P
    Regression          1   2558.4  2558.4  5297.03  0.000
    Residual Error  47574  22977.3     0.5
    Total           47575  25535.6

  • 13

    Functional Form

    Now we find that an additional year of education results in roughly a 10.90% increase in salary (100 × 0.108980).
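
Reading the coefficient as a percentage is an approximation that works well for small coefficients. The sketch below compares 100·β1 with the exact percentage change implied by the log model, 100·(e^β1 − 1), using the reported coefficient 0.108980.

```python
import math

b1 = 0.108980  # coefficient on years ed from the ln(weekearn) regression

# Usual reading: coefficient times 100.
approx_pct = 100 * b1

# Exact change implied by ln(y) = b0 + b1*x when x rises by one unit.
exact_pct = 100 * (math.exp(b1) - 1)

print(f"approximate: {approx_pct:.2f}%")
print(f"exact:       {exact_pct:.2f}%")
```

The gap between the two grows with the size of the coefficient, so the quick reading is safest when β1 is well below 0.1 in absolute value.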

    Interpretation is different from the linear model.
    R² is different between the linear and log models.

    Linear: R² = 11.0%
    Log: R² = 10.0%

    Does this mean the fit of the log model is worse than the linear model? No. You cannot compare the two because you have transformed the dependent variable, which fundamentally alters its variance.

    Functional Form
    Descriptive Statistics: weekearn, ln week earn

    Variable          N    N*    Mean  SE Mean   StDev  Minimum      Q1  Median       Q3  Maximum
    weekearn      47576  7582  894.53     2.58  562.22     0.01  519.00  769.23  1153.00  2884.61
    ln week earn  47576  7582  6.5843  0.00336  0.7326  -4.6052  6.2519  6.6454   7.0501    7.967

    What Does the Log Model Look Like? How to create a prediction in MINITAB & graph:

    Use the regression equation to create estimated log wage from years of education data
    Exponentiate the predicted value using the MINITAB calculator
    Graph predicted wage against years of education
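
The prediction steps for the log model can be sketched in Python, using the coefficients from the reported regression of ln week earn on years ed. The function names are hypothetical, and the graphing step is left out.

```python
import math

# Coefficients from the reported regression: ln week earn = 4.86646 + 0.108980 * years ed
B0 = 4.86646
B1 = 0.108980


def predicted_log_wage(years_ed):
    """Step 1: estimated log wage from years of education."""
    return B0 + B1 * years_ed


def predicted_wage(years_ed):
    """Step 2: exponentiate the predicted log wage to get back to dollars."""
    return math.exp(predicted_log_wage(years_ed))


years = list(range(0, 22))
wages = [predicted_wage(y) for y in years]

# Step 3 would plot wages against years; here the values are just printed.
for y, w in zip(years, wages):
    print(f"{y:2d} years -> {w:9.2f}")
```

Unlike the linear model, every prediction is positive, and each extra year multiplies the predicted wage by the same factor rather than adding the same dollar amount.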

  • 14

    Functional Form

    Functional Form

    What is the equation underlying this model?

    Model of growth (such as compound interest)…
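
The slide's own equation did not survive extraction. A plausible reconstruction, offered as a sketch, exponentiates both sides of the single log model; the comparison to compound interest with principal P and rate r is an assumption about what the slide intended.

```latex
\begin{align*}
\ln(Y) &= \beta_0 + \beta_1 X \\
Y &= e^{\beta_0} \, e^{\beta_1 X} = e^{\beta_0} \left( e^{\beta_1} \right)^{X} \\
\text{Compound interest: } A_t &= P (1 + r)^{t},
  \qquad \text{with } P = e^{\beta_0},\ 1 + r = e^{\beta_1}
\end{align*}
```

With the reported estimates (4.86646 and 0.108980), this gives predicted weekly earnings of roughly 129.9 × 1.115^X.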

  • 15

    Functional Form

    Now let's try another approach, taking the log of both sides (double log equation).

    The interpretation of the coefficient on JEP is now the % change in base salary for a 1% change in JEP. Note that this is an elasticity (which you will discuss in 809 in talking about supply and demand: the elasticity of labor demand with respect to the wage is the % change in the demand for labor for a 1% change in the wage).
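
Why the double log coefficient is an elasticity can be shown in one line of calculus. In the sketch below, y stands for base salary and x for JEP; the exact form of the slide's equation is not available, so the first line is an assumed reconstruction.

```latex
\begin{align*}
\ln(y) &= \beta_0 + \beta_1 \ln(x) + e \\
\beta_1 &= \frac{d \ln(y)}{d \ln(x)} = \frac{dy / y}{dx / x}
  \approx \frac{\%\Delta y}{\%\Delta x}
\end{align*}
```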

    Fun