Least squares and Regression Techniques
stockage.univ-brest.fr/~herbette/...least-squares.pdf
Goodness of fits (and tests)
Non-linear least squares techniques
Glover, D. M., W. J. Jenkins, S. C. Doney: Modeling Methods for Marine Science, Cambridge University Press, Chapter 3
I. Basic Statistics used for Regression
1. The Chi Squared χ²
How can we judge the goodness of a fit (other than by “eyeball”)?
The “BEST” fit reduces the “distance” between the collected data and the model.
If the measurement errors follow a Gaussian distribution, then the “chi-squared” χ² provides a standard measure of this distance:
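$$\chi^2 = \sum_{i=1}^{N} \left( \frac{y_i - \hat{y}_i}{\sigma_i} \right)^2$$
where:
● ŷi is the “estimation from our model”
● yi is the “real” (collected) data
● i is the sample index (ex: time index)
● σi is the uncertainty in the individual measurement yi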
σi could:
● be the size of the smallest graduation on your measuring stick
● be related to some fundamental physical limitation of your measurement technique
● depend on some internal statistics associated with the measurement (ex: you may take as your measurement the “actual time average” over some given time)
2. The reduced Chi Squared χ²
The Root Mean Square (RMS) deviation normalized to measurement error should tend to one if things are working “correctly”. Here we define such a measure, as the reduced Chi-Squared:
$$\chi^2_\nu = \frac{\chi^2}{\nu}, \qquad \nu = N - n$$
where:
● ν is the number of degrees of freedom
● N is the number of collected samples
● n is the number of parameters used by “your regression fit” (your “model”)
Example 1:
You have N measurements
Your “best estimate” is the mean:
$$\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i$$
Your “model” has therefore ONE parameter: n=1
And you have N-1 degrees of freedom (or N-1 independent variables. Knowing the mean and N-1 variables, you can deduce the Nth one)
Example 2:
You have N measurements
Your “best estimate” is a linear fit (regression)
Your “model” has therefore TWO parameters: n=2
And you have N-2 degrees of freedom
If you are “doing a bad job” at collecting your measurements, or if your “model” is inappropriate, then your reduced chi-squared will have large values (much larger than 1)
If you have been too pessimistic about your measurement errors (σi too large), then your reduced chi-squared value will be very small (<0.1)
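The behaviour described above can be checked numerically. The following is a minimal sketch, assuming synthetic data (a constant value plus Gaussian noise) and hypothetical helper names `chi_squared` and `reduced_chi_squared`; the "model" is the mean, so n = 1.

```python
import numpy as np

def chi_squared(y, y_model, sigma):
    """Sum of squared residuals, each scaled by its measurement uncertainty."""
    y, y_model, sigma = (np.asarray(v, dtype=float) for v in (y, y_model, sigma))
    return float(np.sum(((y - y_model) / sigma) ** 2))

def reduced_chi_squared(y, y_model, sigma, n_params):
    """Chi-squared divided by the degrees of freedom nu = N - n."""
    dof = len(y) - n_params
    return chi_squared(y, y_model, sigma) / dof

# Hypothetical data: N = 200 noisy measurements of a constant, true noise level 0.5
rng = np.random.default_rng(0)
true_sigma = 0.5
y = 3.0 + rng.normal(0.0, true_sigma, size=200)

# Model with ONE parameter (the mean), hence N - 1 degrees of freedom
y_model = np.full_like(y, y.mean())

results = {}
for assumed_sigma in (0.5, 0.05, 5.0):
    sigma = np.full_like(y, assumed_sigma)
    results[assumed_sigma] = reduced_chi_squared(y, y_model, sigma, n_params=1)
    print(f"assumed sigma = {assumed_sigma}: reduced chi-squared = {results[assumed_sigma]:.3g}")
```

With the correct σ the value should sit near one; an optimistic σ (0.05) inflates it, and a pessimistic σ (5.0) drives it far below one.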
3. Look at the residuals
A “good chi-squared” may not mean that you have a good fit (or “model”)
Always look at the “shape” of the residuals
Minimizing the chi-squared is the foundation of all the least squares regression techniques!
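To illustrate why the residuals deserve a look, here is a small sketch with made-up data: a curved signal fitted with a straight line, with a deliberately pessimistic σ (an assumption chosen so that the reduced chi-squared comes out near one). The lag-1 autocorrelation of the residuals is used here as one possible, simple indicator of "shape"; it is not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 50)
y = x**2 + rng.normal(0.0, 0.02, x.size)   # hypothetical curved data
sigma = np.full_like(x, 0.08)              # deliberately pessimistic uncertainty

# Straight-line fit (np.polyfit returns the highest power first)
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)

# Reduced chi-squared with N - 2 degrees of freedom
reduced_chi2 = np.sum((residuals / sigma) ** 2) / (x.size - 2)

# Lag-1 autocorrelation: near 0 for random residuals, near 1 for smooth structure
lag1 = np.sum(residuals[:-1] * residuals[1:]) / np.sum(residuals**2)

print(f"reduced chi-squared = {reduced_chi2:.2f}, lag-1 autocorrelation = {lag1:.2f}")
```

The reduced chi-squared looks acceptable, yet the residuals are positive at both ends and negative in the middle, which points to a missing curvature term in the model.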
II. Least squares fitting a straight line
Introduction
The most common data regression model (aside from the mean and standard deviation) is the fit to a straight line.
We therefore define the following “model”:
$$\hat{y}_i = a_1 + a_2 x_i$$
This model is based on TWO parameters: a1 and a2
yi is the dependent variable
xi is the independent variable
We want to find the “BEST” estimates for the two parameters a1 and a2
TYPE I regression techniques: no uncertainty on the independent variable x: σx ≈ 0
The “BEST” estimates for the two parameters are the ones that minimize the chi-squared, i.e. the VERTICAL distance between the estimated y values and the measured y values
1. The normal equations
“Normal” equations
Cramer's rule:
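The equations on these slides did not survive in the transcript. The following is a sketch of the standard derivation, assuming the straight-line model ŷi = a1 + a2 xi (a1 the intercept, a2 the slope) and the shorthand sums S, Sx, Sy, Sxx, Sxy. That shorthand is an assumption, chosen to agree with the Sxx and Δ referred to further on.

```latex
\chi^2(a_1, a_2) = \sum_{i=1}^{N} \left( \frac{y_i - a_1 - a_2 x_i}{\sigma_i} \right)^2

% Minimum: both partial derivatives vanish
\frac{\partial \chi^2}{\partial a_1} = -2 \sum_{i} \frac{y_i - a_1 - a_2 x_i}{\sigma_i^2} = 0,
\qquad
\frac{\partial \chi^2}{\partial a_2} = -2 \sum_{i} \frac{x_i \,(y_i - a_1 - a_2 x_i)}{\sigma_i^2} = 0

% Shorthand sums
S = \sum_i \frac{1}{\sigma_i^2}, \quad
S_x = \sum_i \frac{x_i}{\sigma_i^2}, \quad
S_y = \sum_i \frac{y_i}{\sigma_i^2}, \quad
S_{xx} = \sum_i \frac{x_i^2}{\sigma_i^2}, \quad
S_{xy} = \sum_i \frac{x_i y_i}{\sigma_i^2}

% "Normal" equations in matrix form
\begin{pmatrix} S & S_x \\ S_x & S_{xx} \end{pmatrix}
\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}
=
\begin{pmatrix} S_y \\ S_{xy} \end{pmatrix}

% Cramer's rule
\Delta = S\, S_{xx} - S_x^2, \qquad
a_1 = \frac{S_{xx} S_y - S_x S_{xy}}{\Delta}, \qquad
a_2 = \frac{S\, S_{xy} - S_x S_y}{\Delta}
```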
2. Uncertainties in the coefficients
Error on measurements
If there is no systematic error (uncorrelated noise) between TWO distinct measurements (taken at a different time or at a different location, i≠j), then:
- the cross-terms cancel
- what remains is the sample mean of the error squared (“dispersion”)
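The propagation-of-errors argument sketched on this slide can be written out as follows. This is a reconstruction under the stated assumption of uncorrelated measurement errors; the symbols δyi (error on measurement yi) and a (any fitted coefficient) are notation introduced here for illustration.

```latex
% First-order effect of the measurement errors on a fitted coefficient a
\delta a = \sum_{i} \frac{\partial a}{\partial y_i}\, \delta y_i

% Variance of a: mean of the squared error
\sigma_a^2 = \langle \delta a^2 \rangle
= \sum_{i} \sum_{j} \frac{\partial a}{\partial y_i} \frac{\partial a}{\partial y_j}
  \langle \delta y_i \, \delta y_j \rangle

% Uncorrelated noise: cross-terms vanish, diagonal terms give the dispersion
\langle \delta y_i \, \delta y_j \rangle = 0 \quad (i \neq j), \qquad
\langle \delta y_i^2 \rangle = \sigma_i^2

\sigma_a^2 = \sum_{i} \sigma_i^2 \left( \frac{\partial a}{\partial y_i} \right)^2
```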
Back to the Type I regression fitting to a straight line:
The amplitude of the error does NOT depend on the collected data (the yi values)
The amplitude of the error depends:
- on where (or at which time) you made the measurements (the xi values)
- on the uncertainties in the measurements (the σi values)
Maximizing Δ (the determinant of A) is a good thing !
Maximizing Δ is equivalent to maximizing the spreading of the measurements in time or space (increasing the range of the x-values): the larger you spread the data, the lower the uncertainty on the intercept and the slope. We want to spread the cloud of data around the centroid.
Increasing the number of measurements? It makes Sxx (which is always positive) grow. Therefore the determinant Δ also grows
Widely spread data reduce the uncertainty on the slope
Poorly spread data far away from the x=0 axis lead to large uncertainty on the intercept
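The statements above about spread, sample count and distance from x=0 can be tried out numerically. This sketch assumes the standard results for a weighted straight-line fit, σ²(a1) = Sxx/Δ and σ²(a2) = S/Δ with Δ = S·Sxx − Sx² (the formulas themselves are not preserved in the transcript). The function name `line_fit_uncertainties` and the sampling layouts are hypothetical.

```python
import numpy as np

def line_fit_uncertainties(x, sigma):
    """Return (sigma_a1, sigma_a2, delta) for a weighted straight-line fit.

    Only the x positions and the measurement uncertainties are needed:
    the y values do not enter.
    """
    x = np.asarray(x, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    w = 1.0 / sigma**2
    S, Sx, Sxx = w.sum(), (w * x).sum(), (w * x**2).sum()
    delta = S * Sxx - Sx**2
    return np.sqrt(Sxx / delta), np.sqrt(S / delta), delta

sigma10 = np.full(10, 0.1)
narrow = line_fit_uncertainties(np.linspace(4.5, 5.5, 10), sigma10)    # small x range
wide = line_fit_uncertainties(np.linspace(0.0, 10.0, 10), sigma10)     # large x range
more = line_fit_uncertainties(np.linspace(0.0, 10.0, 40), np.full(40, 0.1))  # more samples
far = line_fit_uncertainties(np.linspace(100.0, 110.0, 10), sigma10)   # same range, far from x=0

for name, (s_a1, s_a2, delta) in [("narrow", narrow), ("wide", wide), ("more", more), ("far", far)]:
    print(f"{name:>6}: sigma_intercept = {s_a1:.4f}, sigma_slope = {s_a2:.4f}, delta = {delta:.3g}")
```

Comparing `wide` with `far` isolates the effect of the distance to x=0: the slope uncertainty is unchanged, while the intercept uncertainty grows.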
3. Uncertainties in the estimated y-values
4. Type 2 regression (for 2 independent variables)
We have assumed so far that we know x infinitely well
What if we have uncertainties on both the yi values and the xi values?
Should we perform a fit of y against x or x against y? If you perform both on scattered data, you will get significantly different predicted slopes
Minimizing the vertical distances between the y-data and the fit is now INCORRECT
You should consider the “TRUE” distance, and minimize the perpendicular distance
For a straight line, it becomes:
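The slide's formula is not preserved here. As an illustration of minimizing perpendicular distances, the sketch below uses one common approach (the major axis, obtained from an SVD of the centred data), which assumes equal uncertainties on x and y. It is not necessarily the formulation used on the slide; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def orthogonal_line_fit(x, y):
    """Fit y = a1 + a2*x by minimizing perpendicular distances.

    Assumes equal, uncorrelated errors on x and y, and a line that is
    not vertical (otherwise the slope is undefined).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xm, ym = x.mean(), y.mean()
    centered = np.column_stack((x - xm, y - ym))
    # The first right singular vector points along the direction of largest scatter
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    dx, dy = vt[0]
    a2 = dy / dx
    a1 = ym - a2 * xm       # the fitted line passes through the centroid
    return a1, a2

# Hypothetical data: true line y = 1 + 2x, with noise on BOTH x and y
rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 200)
x = t + rng.normal(0.0, 1.0, t.size)
y = 1.0 + 2.0 * t + rng.normal(0.0, 1.0, t.size)

slope_y_on_x = np.polyfit(x, y, 1)[0]         # vertical distances
slope_x_on_y = 1.0 / np.polyfit(y, x, 1)[0]   # horizontal distances, expressed as dy/dx
a1, a2 = orthogonal_line_fit(x, y)            # perpendicular distances

print(f"y on x: {slope_y_on_x:.3f}, x on y: {slope_x_on_y:.3f}, perpendicular: {a2:.3f}")
```

The two ordinary regressions bracket the perpendicular-distance slope, which is the discrepancy between fitting y against x and x against y mentioned above.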
III. General Least squares techniques
You can derive the normal equations for any set of basis functions. Basis functions can be thought of as building blocks for describing your data.
The more complicated the functions, the more difficult it is to write the normal equations, and the greater the risk that the solution to the normal equations becomes numerically ill-behaved.
Example: polynomial sharpness
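One way to see the numerical risk with polynomial basis functions is to look at condition numbers. This sketch assumes a plain monomial basis 1, x, x², ... on a hypothetical sampling of x between 0 and 10; it may or may not match the example shown on the original slide.

```python
import numpy as np

x = np.linspace(0.0, 10.0, 50)
degrees = (1, 2, 3, 4, 5)

cond_A = []     # condition number of the design matrix
cond_AtA = []   # condition number of the normal-equations matrix
for degree in degrees:
    # Columns are 1, x, x^2, ..., x^degree
    A = np.vander(x, degree + 1, increasing=True)
    cond_A.append(np.linalg.cond(A))
    cond_AtA.append(np.linalg.cond(A.T @ A))
    print(f"degree {degree}: cond(A) = {cond_A[-1]:.3g}, cond(A^T A) = {cond_AtA[-1]:.3g}")
```

The condition number of AᵀA is roughly the square of that of A, so the normal equations lose precision much faster than the design matrix itself as the polynomial degree increases.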
The design matrix approach
Example:
Linear in the parameter space
Can be fit with linear least squares
Problem: A is not a square matrix
Where are the weighting factors, the σi?
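One common way to bring the σi into the design matrix formulation, assuming uncorrelated errors, is to divide each row of A and each yi by σi before solving. The sketch below does this for a hypothetical quadratic basis (1, x, x²) and uses `np.linalg.lstsq`, which handles the non-square A directly. The function name, basis and data are illustrative assumptions.

```python
import numpy as np

def weighted_design_fit(A, y, sigma):
    """Least squares solution of A a ~ y with per-sample uncertainties sigma."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    Aw = A / sigma[:, None]   # scale each ROW of the design matrix by 1/sigma_i
    yw = y / sigma            # scale each measurement by 1/sigma_i
    coeffs, _, _, _ = np.linalg.lstsq(Aw, yw, rcond=None)
    return coeffs

# Hypothetical example: N = 30 samples, M = 3 basis functions
x = np.linspace(-1.0, 1.0, 30)
A = np.column_stack((np.ones_like(x), x, x**2))   # N x M, not square
true_coeffs = np.array([1.0, -2.0, 0.5])
sigma = 0.05 * (1.0 + np.abs(x))                  # made-up uncertainties

rng = np.random.default_rng(3)
y = A @ true_coeffs + rng.normal(0.0, sigma)

coeffs = weighted_design_fit(A, y, sigma)
print("fitted coefficients:", coeffs)
```

Minimizing the squared norm of the scaled residuals is the same as minimizing the chi-squared, so samples with large σi contribute less to the fit.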
Solving the design matrix approach with SVD
The problem amounts to minimizing the sum of the squares of the residuals:
$$\min_{\mathbf{a}} \lVert A\,\mathbf{a} - \mathbf{y} \rVert^2$$
This is exactly what singular value decomposition does!
Singular Value decomposition (SVD)
For any matrix A (N rows × M columns), there exists a factorization as a triple product
$$A = U S V^{T}$$
U: column orthonormal matrix of size N×M (i.e. any column vector is orthogonal to the others and the sum of the squares of its elements is ONE)
V: orthonormal square matrix of size M×M
S: diagonal matrix of size M×M. The diagonal elements are called the singular values. These values may be zero if the matrix is rank deficient (i.e. the rank is less than the shortest dimension of the matrix A)
W is a diagonal matrix defined from diagonal matrix S
$$W_{jj} = \begin{cases} 1/S_{jj} & \text{if } S_{jj} > \varepsilon_w \\ 0 & \text{otherwise} \end{cases}$$
with εw a small threshold value
How do you compute the uncertainties ?
Covariance matrix of uncertainties
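The SVD solution and its covariance matrix can be sketched as follows. This assumes the standard expressions a = V W Uᵀ y and C = V W² Vᵀ, with the rows of A and y already divided by σi. Applying the threshold εw relative to the largest singular value is a choice made here, not something stated on the slides. The function name `svd_solve` is hypothetical.

```python
import numpy as np

def svd_solve(A, y, eps_w=1e-10):
    """Least squares via SVD, zeroing the inverse of tiny singular values."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U diag(s) Vt
    keep = s > eps_w * s.max()     # threshold taken relative to the largest singular value
    w = np.zeros_like(s)
    w[keep] = 1.0 / s[keep]        # diagonal of W
    a = Vt.T @ (w * (U.T @ y))     # a = V W U^T y
    cov = (Vt.T * w**2) @ Vt       # C = V W^2 V^T
    return a, cov, s

x = np.linspace(0.0, 1.0, 20)
y = 1.0 + 4.0 * x

# Rank-deficient design matrix: the basis function x appears twice
A_deficient = np.column_stack((np.ones_like(x), x, x))
a, cov, s = svd_solve(A_deficient, y)
print("singular values:", s)
print("solution:", a)

# Full-rank case: the covariance reduces to the inverse of A^T A
A_full = np.column_stack((np.ones_like(x), x))
a_full, cov_full, _ = svd_solve(A_full, y)
```

In the rank-deficient case the normal equations cannot be inverted, but the SVD still returns a solution: the minimum-norm one, which splits the slope equally between the two identical columns.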
USEFUL MATRIX ALGEBRA:
Identity 1:
Identity 2:
Demo:
k=1,...,N
Looking for an extremum, and anticipating that this extremum is a minimum...
To have a solution, this matrix must be invertible
If not, use SVD methods
Generalization of the linear regression
Residuals Sum of Squares (RSS):
2 methods:
● AᵀA is invertible
● AᵀA is not invertible
Covariance matrix of
Covariance matrix of
Estimator of σε
Student distribution with n = N-K degrees of freedom
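The pieces of this generalization can be put together in a short sketch for the case where AᵀA is invertible. It assumes the usual ordinary least squares expressions: â = (AᵀA)⁻¹Aᵀy, RSS = ‖y − Aâ‖², the estimator σ̂ε² = RSS/(N−K), and the covariance σ̂ε²(AᵀA)⁻¹. These are not preserved in the transcript, and the function name `ols_summary` and the data are hypothetical.

```python
import numpy as np

def ols_summary(A, y):
    """Ordinary least squares with N samples and K parameters (A^T A invertible)."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    N, K = A.shape
    AtA = A.T @ A
    a_hat = np.linalg.solve(AtA, A.T @ y)     # solves the normal equations
    residuals = y - A @ a_hat
    rss = float(residuals @ residuals)        # Residuals Sum of Squares
    sigma2_hat = rss / (N - K)                # estimator of the noise variance
    cov = sigma2_hat * np.linalg.inv(AtA)     # covariance matrix of a_hat
    std_err = np.sqrt(np.diag(cov))
    # Ratios to compare against a Student distribution with N - K degrees of freedom
    t_stats = a_hat / std_err
    return a_hat, rss, sigma2_hat, cov, t_stats

# Hypothetical data: straight line with Gaussian noise of standard deviation 0.3
rng = np.random.default_rng(4)
x = np.linspace(0.0, 10.0, 100)
A = np.column_stack((np.ones_like(x), x))
y = 2.0 + 0.5 * x + rng.normal(0.0, 0.3, x.size)

a_hat, rss, sigma2_hat, cov, t_stats = ols_summary(A, y)
print("a_hat =", a_hat)
print("estimated noise std =", np.sqrt(sigma2_hat))
print("t statistics =", t_stats)
```

Each t statistic is the coefficient divided by its standard error; under the hypothesis that the true coefficient is zero, it follows the Student distribution with N−K degrees of freedom.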