Posted: 16-Apr-2017

Category: Data & Analytics


### Transcript of Logistic Regression: Behind the Scenes

• Logistic Regression: Behind the Scenes

Chris White

Capital One

October 9, 2016

Logistic Regression October 9, 2016 1 / 20

• Outline Logistic Regression: A quick refresher

Generative Model

$$y_i \mid \beta, x_i \sim \mathrm{Bernoulli}\big(\sigma(\beta, x_i)\big)$$

where

$$\sigma(\beta, x) := \frac{1}{1 + \exp(-\beta^\top x)}$$

is the sigmoid function.

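The sigmoid above can be sketched in plain Python (toy inputs and helper names are illustrative, not from the talk):

```python
import math

def sigmoid(beta, x):
    # sigma(beta, x) = 1 / (1 + exp(-beta . x))
    z = sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-z))

# at beta . x = 0 the model is maximally uncertain
print(sigmoid([0.5, -0.5], [1.0, 1.0]))  # 0.5
```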

• Outline Logistic Regression: A quick refresher

Interpretation

This setup implies that the log-odds are a linear function of the inputs:

$$\log\left(\frac{\mathbb{P}[y = 1]}{\mathbb{P}[y = 0]}\right) = \beta^\top x$$

This linear relationship makes the individual coefficients $\beta_k$ easy to interpret: a unit increase in $x_k$ multiplies the odds of $y = 1$ by a factor of $e^{\beta_k}$ (all else held equal).

Question: Given data, how do we determine the best values for $\beta$?

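This interpretation is easy to verify numerically (toy coefficients, not from the slides): under the logistic model the odds $p/(1-p)$ equal $\exp(\beta^\top x)$, so bumping one input by 1 multiplies the odds by $e$ raised to its coefficient.

```python
import math

def odds(beta, x):
    # p / (1 - p) simplifies to exp(beta . x) under the logistic model
    z = sum(b * xi for b, xi in zip(beta, x))
    p = 1.0 / (1.0 + math.exp(-z))
    return p / (1.0 - p)

beta = [0.7, -0.3]
x = [1.0, 2.0]
x_plus = [2.0, 2.0]  # unit increase in the first input
ratio = odds(beta, x_plus) / odds(beta, x)
print(ratio, math.exp(beta[0]))  # both ~ exp(0.7) ~ 2.0138
```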

• Outline Logistic Regression: A quick refresher

Maximum Likelihood Estimation

$$\hat{\beta} = \arg\max_{\beta} \prod_i \sigma(\beta, x_i)^{y_i}\,\big(1 - \sigma(\beta, x_i)\big)^{1 - y_i}$$

Log-likelihood is concave

Maximum Likelihood Estimation allows us to do classical statistics (i.e., we know the asymptotic distribution of the estimator)

Note: the likelihood is a function of the predicted probabilities and the observed response

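In practice the concave log-likelihood is maximized iteratively. A minimal gradient-ascent sketch (toy data and step size chosen purely for illustration, not how any of the packages discussed here do it):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# toy data: a column of 1s for the intercept plus one feature
xs = [[1.0, -2.0], [1.0, -1.0], [1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
ys = [0, 0, 1, 0, 1, 1]

beta = [0.0, 0.0]
lr = 0.1
for _ in range(5000):
    # gradient of the log-likelihood: sum_i (y_i - p_i) x_i
    grad = [0.0, 0.0]
    for x, y in zip(xs, ys):
        p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
        for j in range(2):
            grad[j] += (y - p) * x[j]
    beta = [b + lr * g for b, g in zip(beta, grad)]

# the fitted slope is positive: larger x makes y = 1 more likely
```

Because the log-likelihood is concave, this simple ascent finds the global maximizer whenever a finite one exists.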

• Outline This is not the end of the story

This is not the end of the story

Now we are confronted with an optimization problem...

The story doesn't end here!

Uncritically throwing your data into an off-the-shelf solver could result in a bad model.

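One concrete way a naive solver goes wrong (a toy sketch, not an example from the talk): with perfectly separated data the MLE does not exist, and plain gradient ascent simply inflates the coefficient forever instead of converging.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# perfectly separated: every y = 1 point lies to the right of every y = 0 point
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

def fit_slope(steps, lr=0.5):
    b = 0.0  # single slope, no intercept, for simplicity
    for _ in range(steps):
        g = sum((y - sigmoid(b * x)) * x for x, y in zip(xs, ys))
        b += lr * g
    return b

# the likelihood keeps improving as b grows; there is no finite optimum
print(fit_slope(100), fit_slope(10000))  # the second is strictly larger
```

Different packages handle this differently (warnings, errors, or silently huge coefficients), which is exactly why the implementation choices below matter.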

• Outline This is not the end of the story

Important Decisions

Inference vs. Prediction: your goals should influence yourimplementation choices

Coefficient accuracy vs. speed vs. predictive accuracy

Multicollinearity leads to both statistical and numerical issues

Desired Input (e.g., missing values?)

Desired Output (e.g., p-values)

Handling of edge cases (e.g., quasi-separable data)

Regularization and Bayesian Inference

Goal

We want to explore these questions with an eye on statsmodels, scikit-learn, and SAS's proc logistic.

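The regularization bullet can be made concrete with a small sketch (toy data, pure Python rather than any of the packages named above): an L2 penalty, equivalent to a Gaussian prior on $\beta$, shrinks the fitted coefficient toward zero.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

xs = [-2.0, -1.0, 0.0, 0.0, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]

def fit_slope(lam, steps=5000, lr=0.1):
    # ascent on the penalized log-likelihood: ll(b) - (lam / 2) * b**2
    b = 0.0
    for _ in range(steps):
        g = sum((y - sigmoid(b * x)) * x for x, y in zip(xs, ys)) - lam * b
        b += lr * g
    return b

b_mle, b_ridge = fit_slope(0.0), fit_slope(1.0)
# the penalized estimate sits strictly between 0 and the unpenalized MLE
```

Whether a library penalizes by default is exactly the kind of implementation choice that changes coefficients, and hence their interpretation.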

• Under the Hood How Needs Translate to Implementation Choices

Prediction vs. Inference

Coefficients and Convergence

In the case of inference, coefficient sign and precision are important.

Example: A model includes economic variables, and will be used for stress-testing adverse economic scenarios.


• Under the Hood How Needs Translate to Implementation Choices

Prediction vs. Inference, cont'd

Coefficients and Multicollinearity

Linear models are invariant under affine transformations