Linear regression

By gradient descent (with thanks to Prof. Ng’s machine learning course)

Extending single-variable to multivariate linear regression

$h_\Theta(x) = \Theta_0 + \Theta_1 x$

$h_\Theta(x) = \Theta_0 + \Theta_1 x_1 + \Theta_2 x_2 + \Theta_3 x_3 + \dots + \Theta_n x_n$

For example, start with house price versus square footage, then move to house price versus square footage, number of bedrooms, and age of the house.

$h_\Theta(x) = \Theta_0 x_0 + \Theta_1 x_1 + \Theta_2 x_2 + \Theta_3 x_3 + \dots + \Theta_n x_n$

with $x_0 = 1$

$h_\Theta(x) = \Theta^T x$

Cost function

$J(\Theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\Theta(x^{(i)}) - y^{(i)} \right)^2$

Gradient descent:

Repeat { $\Theta_j := \Theta_j - \alpha \frac{\partial J(\Theta)}{\partial \Theta_j}$ } for all j simultaneously

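$\Theta_j := \Theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left( h_\Theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$

$\Theta_0 := \Theta_0 - \frac{\alpha}{m} \sum_{i=1}^{m} \left( h_\Theta(x^{(i)}) - y^{(i)} \right) x_0^{(i)}$ (where $x_0^{(i)} = 1$)

$\Theta_1 := \Theta_1 - \frac{\alpha}{m} \sum_{i=1}^{m} \left( h_\Theta(x^{(i)}) - y^{(i)} \right) x_1^{(i)}$

$\Theta_2 := \Theta_2 - \frac{\alpha}{m} \sum_{i=1}^{m} \left( h_\Theta(x^{(i)}) - y^{(i)} \right) x_2^{(i)}$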
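The factor $x_j^{(i)}$ in each update comes from differentiating the cost function with the chain rule. The short derivation below is a sketch of that step; the matrix $X$ (one training example per row, first column all ones) is notation assumed here so that the update can also be written in vector form.

```latex
\begin{align*}
\frac{\partial J(\Theta)}{\partial \Theta_j}
  &= \frac{1}{2m} \sum_{i=1}^{m} 2 \left( h_\Theta(x^{(i)}) - y^{(i)} \right)
     \frac{\partial h_\Theta(x^{(i)})}{\partial \Theta_j} \\
  &= \frac{1}{m} \sum_{i=1}^{m} \left( h_\Theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
\end{align*}

% Stacking all j into one vector update:
\[
\Theta := \Theta - \frac{\alpha}{m} X^T \left( X \Theta - y \right)
\]
```

The vector form updates every $\Theta_j$ from the same old $\Theta$, which is what "simultaneously" requires.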

What the Equations Mean

The matrices: y and X

    y      X
PRICE     x0   SQFT   AGE   FEATS
 2050      1   2650    13       7
 2150      1   2664     6       5
 2150      1   2921     3       6
 1999      1   2580     4       4
 1900      1   2580     4       4
 1800      1   2774     2       4
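As a sketch of how the table maps onto the matrices, the six rows can be entered in Matlab as a column vector y and a matrix X whose first column is x0 = 1. The variable names and the all-zero starting theta are assumptions made for illustration.

```matlab
% Six training examples from the table above.
y = [2050; 2150; 2150; 1999; 1900; 1800];   % PRICE, m x 1
features = [2650 13 7;                       % columns: SQFT, AGE, FEATS
            2664  6 5;
            2921  3 6;
            2580  4 4;
            2580  4 4;
            2774  2 4];
m = length(y);                  % number of training examples
X = [ones(m, 1), features];     % prepend the x0 = 1 column, m x (n+1)

theta = zeros(size(X, 2), 1);   % assumed starting point: all parameters zero
h = X * theta;                  % hypothesis for every example at once
J = (1/(2*m)) * sum((h - y).^2);    % cost J(theta) at this theta
```

With X laid out this way, a single matrix product X*theta evaluates the hypothesis for all m examples.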

Feature Scaling

We would like all features to fall roughly into the range $-1 \le x \le +1$.

Replace $x_i$ with $(x_i - \mu_i)/s_i$, where $\mu_i$ is the mean and $s_i$ is the range (maximum minus minimum); alternatively, use the mean and the standard deviation.

Do not scale $x_0$.
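A minimal sketch of both scaling options, applied to the SQFT column of the table. The variable names are assumptions made for illustration.

```matlab
% SQFT column from the table; x0 is left out because it is not scaled.
sqft = [2650; 2664; 2921; 2580; 2580; 2774];

mu = mean(sqft);                    % mean of the feature
s_range = max(sqft) - min(sqft);    % range (maximum minus minimum)
s_std = std(sqft);                  % alternative: standard deviation

sqft_by_range = (sqft - mu) / s_range;   % mean 0, every value within [-1, +1]
sqft_by_std = (sqft - mu) / s_std;       % mean 0, standard deviation 1
```

Keep mu and the chosen scale factor: the same values are needed later to scale any new input before making a prediction.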

Converting results back
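The body of this slide is not in the transcript, so the following is only one plausible reading: parameters fitted on normalized features can be mapped back so that they apply directly to raw inputs. The sketch assumes normalization by mean and standard deviation, and theta_norm holds made-up values for illustration, not a fitted result.

```matlab
% Raw features (SQFT, AGE, FEATS) from the table.
features = [2650 13 7; 2664 6 5; 2921 3 6; 2580 4 4; 2580 4 4; 2774 2 4];
m = size(features, 1);
mu = mean(features);
sigma = std(features);
X_norm = [ones(m, 1), (features - repmat(mu, m, 1)) ./ repmat(sigma, m, 1)];

% Hypothetical parameters in normalized units (illustrative values only).
theta_norm = [2000; 100; -50; 25];

% Equivalent parameters for raw, unscaled features:
% slopes are divided by sigma, and the intercept absorbs the mean shift.
theta_raw = zeros(size(theta_norm));
theta_raw(2:end) = theta_norm(2:end) ./ sigma';
theta_raw(1) = theta_norm(1) - sum(theta_norm(2:end) .* (mu ./ sigma)');

% Both parameter sets give the same predictions.
pred_norm = X_norm * theta_norm;
pred_raw = [ones(m, 1), features] * theta_raw;
```

The other option is to leave theta in normalized units and scale each new input with the stored mu and sigma before multiplying.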

Learning Rate and Debugging

With a small enough α, J should decrease on every iteration; this is the first test. An α that is too large can overshoot the minimum and climb the other side of the curve.

If α is too small, convergence is too slow.

Try a series of α values, say 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, …
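The test above can be sketched as a loop over candidate α values that records J after every iteration. The data are the table values with normalized features; the iteration budget of 50 and the variable names are assumptions made for illustration.

```matlab
% Table data with SQFT, AGE, FEATS normalized; x0 is not scaled.
y = [2050; 2150; 2150; 1999; 1900; 1800];
features = [2650 13 7; 2664 6 5; 2921 3 6; 2580 4 4; 2580 4 4; 2774 2 4];
m = length(y);
mu = mean(features);
sigma = std(features);
X = [ones(m, 1), (features - repmat(mu, m, 1)) ./ repmat(sigma, m, 1)];

alphas = [0.001 0.003 0.01 0.03 0.1 0.3 1];   % candidate learning rates
num_iters = 50;                               % assumed iteration budget
J_all = zeros(num_iters, numel(alphas));      % one cost history per alpha

for k = 1:numel(alphas)
    theta = zeros(size(X, 2), 1);             % restart from zero for each alpha
    for iter = 1:num_iters
        theta = theta - (alphas(k)/m) * (X' * (X*theta - y));
        J_all(iter, k) = (1/(2*m)) * sum((X*theta - y).^2);
    end
end

% First test: did J decrease (or at least not increase) at every step?
decreasing = all(diff(J_all) <= 0, 1);   % one logical flag per alpha
% plot(J_all) would show the cost histories side by side.
```

Among the α values that pass the test, the one whose J history falls fastest is a reasonable choice.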

Matlab Implementation

Feature Normalization

function [X_norm, mu, sigma] = featureNormalize(X)
    % Normalize each column of X to zero mean and unit standard deviation.
    % X must not include the x0 = 1 column.
    mu = mean(X);          % 1 x n row of column means
    sigma = std(X);        % 1 x n row of column standard deviations
    m = size(X, 1);        % number of training examples
    X_norm = (X - repmat(mu, m, 1)) ./ repmat(sigma, m, 1);
end

Gradient Descent

function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
    m = length(y);                      % number of training examples
    J_history = zeros(num_iters, 1);    % cost after each iteration
    for iter = 1:num_iters
        A = (X*theta - y);                  % residuals, m x 1
        deltatheta = (alpha/m)*(A'*X);      % update step, 1 x (n+1)
        theta = theta - deltatheta';        % update all parameters simultaneously
        J_history(iter) = computeCostMulti(X, y, theta);
    end
end

Cost Function

function J = computeCostMulti(X, y, theta)

m = length(y); % number of training examples

A = (X*theta - y);          % residuals, m x 1
J = (1/(2*m))*(A'*A);       % sum of squared residuals over 2m

end

Polynomials

$h_\Theta(x) = \Theta_0 + \Theta_1 x + \Theta_2 x^2 + \Theta_3 x^3$

Replace $x$ with $x_1$, $x^2$ with $x_2$, and $x^3$ with $x_3$.

Scale the $x$, $x^2$, and $x^3$ values.
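A minimal sketch of this substitution: the powers of x become ordinary feature columns, which are then scaled like any other features. The sample inputs 1 to 10 and the variable names are assumptions made for illustration.

```matlab
x = (1:10)';                    % assumed sample inputs
X_poly = [x, x.^2, x.^3];       % treat x, x^2, x^3 as features x1, x2, x3

% Unscaled, the columns reach very different maxima: 10, 100 and 1000.
mu = mean(X_poly);
sigma = std(X_poly);
m = length(x);
X_poly_norm = (X_poly - repmat(mu, m, 1)) ./ repmat(sigma, m, 1);

X = [ones(m, 1), X_poly_norm];  % add the unscaled x0 = 1 column
```

After this step the problem is an ordinary multivariate linear regression in the columns of X.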

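Normal Equations

$\Theta = (A^T A)^{-1} A^T y$

A(:,n+1) = ones(length(x),1,class(x));    % constant column

For a polynomial of degree n:

for j = n:-1:1
    A(:,j) = x.*A(:,j+1);                 % column j holds x.^(n+1-j)
end

W = A'*A;
Y = A'*y;

theta = W\Y;    % solves W*theta = Y without forming the inverse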
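As a quick check of the steps above, the sketch below fits a degree-2 polynomial to points that lie exactly on a known quadratic, so the recovered coefficients can be compared with the true ones. The sample data and the explicit preallocation of A are assumptions made for illustration.

```matlab
x = [0; 1; 2; 3; 4];          % assumed sample inputs
y = 1 + 2*x + 3*x.^2;         % exact quadratic, so the fit should recover it
n = 2;                        % polynomial degree

A = zeros(length(x), n + 1);
A(:, n+1) = ones(length(x), 1, class(x));   % constant column
for j = n:-1:1
    A(:, j) = x .* A(:, j+1);               % columns: x.^2, x, 1
end

W = A' * A;
Y = A' * y;
theta = W \ Y;    % highest-degree coefficient first: approximately [3; 2; 1]
```

Note the ordering: because the constant column is last, theta(1) multiplies the highest power of x, which is the same convention polyfit uses. No learning rate or iteration count is needed here.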
