
ICME Refresher Course

Lecture 1

Nicole Taheri

Institute for Computational and Mathematical Engineering

September 19, 2011

Nicole Taheri (Stanford) ICME Refresher Course September 19, 2011 1 / 20

scalars, and vectors, and matrices. oh my!

scalar: a single quantity or measurement. (lowercase Greek) $\alpha, \beta, \gamma \in \mathbb{R}$.

vector: an ordered collection of scalars. (lowercase) $x, y, z \in \mathbb{R}^n$.

\[
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\]

all vectors are column vectors, unless otherwise specified.

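matrix: a two-dimensional collection of scalars. (uppercase) $A, B, \Sigma, \Lambda$. $A \in \mathbb{R}^{m \times n}$ is a matrix with $m$ rows and $n$ columns; its columns are $a_1, \dots, a_n \in \mathbb{R}^m$.

\[
A = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}
= \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix},
\qquad
A^T = \begin{bmatrix} a_1^T \\ \vdots \\ a_n^T \end{bmatrix}
\]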


operations on vectors

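scalar multiplication: for $\alpha \in \mathbb{R}$, $x \in \mathbb{R}^n$,

\[
\alpha x = \begin{bmatrix} \alpha x_1 \\ \alpha x_2 \\ \vdots \\ \alpha x_n \end{bmatrix} \in \mathbb{R}^n
\]

vector addition: for $x \in \mathbb{R}^n$, $y \in \mathbb{R}^n$,

\[
z = x + y = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix} \in \mathbb{R}^n
\]

vector-vector multiplication: for $x \in \mathbb{R}^n$, $y \in \mathbb{R}^n$,

\[
x^T y = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
= \sum_{i=1}^{n} x_i y_i
\]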
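The three vector operations above can be sketched in a few lines of plain Python. This is only an illustration: vectors are stored as Python lists, and the helper names `scale`, `add`, and `dot` are made up for this example.

```python
def scale(alpha, x):
    """Scalar multiplication: multiply every entry of x by alpha."""
    return [alpha * xi for xi in x]


def add(x, y):
    """Vector addition: entrywise sum of two vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return [xi + yi for xi, yi in zip(x, y)]


def dot(x, y):
    """Vector-vector multiplication x^T y = sum_i x_i * y_i."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(xi * yi for xi, yi in zip(x, y))


x = [1, 2, 3]
y = [4, 5, 6]
print(scale(2, x))  # [2, 4, 6]
print(add(x, y))    # [5, 7, 9]
print(dot(x, y))    # 1*4 + 2*5 + 3*6 = 32
```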


linear functions

Definition

a function $f$ is linear if it satisfies the following two properties for all scalars $\alpha$ and all vectors $x, y$ in its domain:

1 $f(\alpha x) = \alpha f(x)$

2 $f(x + y) = f(x) + f(y)$
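As a rough illustration of the definition, the two properties can be spot-checked on a handful of sample points. The helper `passes_linearity_checks` is hypothetical, and the example uses functions of a single real variable with integer samples so that the comparisons are exact. Passing the check is only evidence of linearity, while a single failure proves the function is not linear.

```python
def passes_linearity_checks(f, samples, scalars):
    """Check f(alpha*x) == alpha*f(x) and f(x+y) == f(x)+f(y) on the samples."""
    for x in samples:
        for alpha in scalars:
            if f(alpha * x) != alpha * f(x):
                return False
        for y in samples:
            if f(x + y) != f(x) + f(y):
                return False
    return True


samples = [-2, 0, 1, 3]
scalars = [-1, 0, 2]

# f(x) = 3x satisfies both properties on every sample.
print(passes_linearity_checks(lambda x: 3 * x, samples, scalars))  # True

# f(x) = x + 1 is affine, not linear: for example f(0) is not 0.
print(passes_linearity_checks(lambda x: x + 1, samples, scalars))  # False
```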

matrix-vector multiplication

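let $A \in \mathbb{R}^{m \times n}$. we compute the operation of matrix-vector multiplication, as a function $f(x) = Ax$, $f : \mathbb{R}^n \to \mathbb{R}^m$, as

\[
f(x) = Ax =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}
+ x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}
+ \cdots
+ x_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}
\]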
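The formula above says that $Ax$ is a linear combination of the columns of $A$. A small sketch, assuming a matrix is stored as a list of rows, compares that column view with the more familiar row-by-row dot products. The function names are illustrative only.

```python
def matvec_rows(A, x):
    """Entry i of Ax as the dot product of row i of A with x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def matvec_columns(A, x):
    """Ax as x_1 * (column 1) + ... + x_n * (column n)."""
    m, n = len(A), len(x)
    result = [0] * m
    for j in range(n):          # add x_j times column j
        for i in range(m):
            result[i] += x[j] * A[i][j]
    return result


A = [[1, 2, 3],
     [4, 5, 6]]
x = [1, 0, -1]
print(matvec_rows(A, x))     # [-2, -2]
print(matvec_columns(A, x))  # [-2, -2]
```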


matrix multiplication is a linear function

Claim

f(x) = Ax is a linear function

Proof.

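1 $f(\alpha x) = \alpha f(x)$

\[
\begin{aligned}
f(\alpha x) = A(\alpha x) &= (\alpha x_1) a_1 + \cdots + (\alpha x_n) a_n \\
&= \alpha (x_1 a_1 + \cdots + x_n a_n) = \alpha A x = \alpha f(x).
\end{aligned}
\]

2 $f(x + y) = f(x) + f(y)$

\[
\begin{aligned}
f(x + y) &= A(x + y) \\
&= (x_1 + y_1) a_1 + \cdots + (x_n + y_n) a_n \\
&= x_1 a_1 + \cdots + x_n a_n + y_1 a_1 + \cdots + y_n a_n \\
&= Ax + Ay \\
&= f(x) + f(y).
\end{aligned}
\]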


there is a matrix for every linear function

Theorem

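for every linear function $g : \mathbb{R}^n \to \mathbb{R}^m$ there is a matrix $A \in \mathbb{R}^{m \times n}$ such that $g(x) = Ax$.

Proof.

by construction. let the columns of $A$, $a_1, \dots, a_n \in \mathbb{R}^m$, be given by $a_i = g(e_i)$, where $e_i \in \mathbb{R}^n$ is the vector with a 1 in entry $i$ and 0 elsewhere.

\[
\forall x \in \mathbb{R}^n, \quad x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = x_1 e_1 + \cdots + x_n e_n
\]

\[
\begin{aligned}
\forall x \in \mathbb{R}^n, \quad g(x) &= g(x_1 e_1 + \cdots + x_n e_n) \\
&= x_1 g(e_1) + \cdots + x_n g(e_n) \\
&= x_1 a_1 + \cdots + x_n a_n = Ax
\end{aligned}
\]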
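The construction in the proof can be tried out directly: apply $g$ to each standard basis vector and use the results as columns. The sketch below assumes matrices are stored as lists of rows; `matrix_of` and the sample map `g` are hypothetical names chosen for the example.

```python
def matrix_of(g, n):
    """Build the matrix of a linear map g: R^n -> R^m from a_i = g(e_i)."""
    columns = []
    for i in range(n):
        e_i = [1 if j == i else 0 for j in range(n)]  # i-th standard basis vector
        columns.append(g(e_i))
    m = len(columns[0])
    # Convert the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def matvec(A, x):
    """Matrix-vector product with A stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def g(x):
    """A sample linear map from R^3 to R^2 (chosen only for illustration)."""
    return [x[0] + 2 * x[1], 3 * x[2] - x[0]]


A = matrix_of(g, 3)
print(A)             # [[1, 2, 0], [-1, 0, 3]]

x = [2, -1, 4]
print(g(x))          # [0, 10]
print(matvec(A, x))  # [0, 10]
```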


matrix-matrix multiplication is function composition

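let $A \in \mathbb{R}^{m \times p}$ and $B \in \mathbb{R}^{p \times n}$. define $f(x) = Ax$ and $g(y) = By$. then for $C = AB \in \mathbb{R}^{m \times n}$,

\[
f(g(y)) = f(By) = ABy = Cy
\]

let the columns of the matrix $C$ be $c_1, \dots, c_n \in \mathbb{R}^m$, and the columns of the matrix $B$ be $b_1, \dots, b_n \in \mathbb{R}^p$. then $c_i = A b_i$,

\[
C = \begin{bmatrix} c_1 & \cdots & c_n \end{bmatrix}
= A \begin{bmatrix} b_1 & \cdots & b_n \end{bmatrix}
= \begin{bmatrix} A b_1 & \cdots & A b_n \end{bmatrix}
\]

Equivalent definition:

\[
C_{ij} = \sum_{k=1}^{p} A_{ik} B_{kj}, \quad i = 1, \dots, m, \quad j = 1, \dots, n
\]

Example: outer product, $x \in \mathbb{R}^{m \times 1}$, $y^T \in \mathbb{R}^{1 \times n}$

\[
x y^T = \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix}
\begin{bmatrix} y_1 & \cdots & y_n \end{bmatrix}
= \begin{bmatrix}
x_1 y_1 & x_1 y_2 & \cdots & x_1 y_n \\
\vdots & \vdots & \ddots & \vdots \\
x_m y_1 & x_m y_2 & \cdots & x_m y_n
\end{bmatrix}
\in \mathbb{R}^{m \times n}
\]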
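A short sketch can confirm on a concrete example that multiplying by $C = AB$ gives the same result as applying $B$ and then $A$, and that the outer product has the entries $x_i y_j$. Matrices are assumed to be lists of rows, and the function names are illustrative.

```python
def matvec(A, x):
    """Matrix-vector product with A stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def matmul(A, B):
    """Matrix product via C_ij = sum_k A_ik * B_kj."""
    p = len(B)
    if any(len(row) != p for row in A):
        raise ValueError("inner dimensions must agree")
    n = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(n)]
            for i in range(len(A))]


def outer(x, y):
    """Outer product x y^T, an (m x n) matrix with entries x_i * y_j."""
    return [[xi * yj for yj in y] for xi in x]


A = [[1, 2], [0, 1], [3, 0]]   # 3 x 2
B = [[1, 0, 2], [-1, 1, 0]]    # 2 x 3
y = [1, 2, 3]

C = matmul(A, B)
print(C)                        # [[-1, 2, 2], [-1, 1, 0], [3, 0, 6]]
print(matvec(A, matvec(B, y)))  # f(g(y)) = [9, 1, 21]
print(matvec(C, y))             # Cy      = [9, 1, 21]
print(outer([1, 2], [3, 4, 5])) # [[3, 4, 5], [6, 8, 10]]
```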


subspaces and linear combinations

Definition

a set $S \subseteq \mathbb{R}^n$ is a subspace if the following three properties hold:

1 $0 \in S$

2 $x \in S \implies \alpha x \in S$, $\forall \alpha \in \mathbb{R}$.

3 $x, y \in S \implies (x + y) \in S$.

Definition

a linear combination of $n$ vectors $a_1, \dots, a_n \in \mathbb{R}^m$ is a vector of the form

\[
\alpha_1 a_1 + \cdots + \alpha_n a_n, \quad \alpha_1, \dots, \alpha_n \in \mathbb{R}
\]

the set of all linear combinations of $a_1, \dots, a_n$ is a subspace. this is called the subspace spanned by $a_1, \dots, a_n$.

Definition

a set of vectors $a_1, \dots, a_n \in S \subseteq \mathbb{R}^m$ spans $S$ if every $x \in S$ can be expressed as a linear combination of $a_1, \dots, a_n$.


linear dependence

Definition

the vectors $a_1, \dots, a_n$ are called linearly dependent if there is a set of scalars $\alpha_1, \dots, \alpha_n \in \mathbb{R}$, not all zero (i.e. $\exists\, \alpha_i \neq 0$), such that

\[
\alpha_1 a_1 + \cdots + \alpha_n a_n = 0.
\]

Definition

a set of vectors $a_1, \dots, a_n$ that is not linearly dependent is called linearly independent.

question: if $a_1, \dots, a_n$ are linearly independent, can we have some $a_i = 0$?

answer: no. why not?
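One way to experiment with these definitions is to test independence numerically. The sketch below is an assumption of this example rather than a method from the slides: it stacks the vectors as rows, runs Gaussian elimination with exact rational arithmetic, and calls the vectors independent when every row produces a pivot. A set containing the zero vector comes out dependent, because the coefficient on the zero vector can be nonzero while all other coefficients are zero.

```python
from fractions import Fraction


def rank(rows):
    """Number of pivots found by Gaussian elimination (exact arithmetic)."""
    M = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(M[0]) if M else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def linearly_independent(vectors):
    """Independent exactly when the stacked vectors have as many pivots as vectors."""
    return rank(vectors) == len(vectors)


print(linearly_independent([[1, 0, 0], [0, 1, 0]]))  # True
print(linearly_independent([[1, 2], [2, 4]]))        # False: second = 2 * first
print(linearly_independent([[1, 0], [0, 0]]))        # False: contains the zero vector
```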


linear independence

Lemma

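suppose that the vectors $x_1, \dots, x_n$ span the subspace $S \subseteq \mathbb{R}^m$ and the vectors $y_1, \dots, y_p \in S$ are linearly independent. then $p \le n$.

Proof.

since the $x$ vectors span $S$, we can write $y_1 \in S$ as

\[
y_1 = \alpha_1 x_1 + \cdots + \alpha_n x_n
\]

since $y_1 \neq 0$, not all the $\alpha$'s are zero ($\exists\, \alpha_i \neq 0$), so we can write

\[
x_i = \frac{1}{\alpha_i} y_1 - \sum_{j \neq i} \frac{\alpha_j}{\alpha_i} x_j
\]

so the set of $x$ vectors with $x_i$ replaced by $y_1$ spans $S$. now suppose $p > n$ and repeat this step $n - 1$ more times, for $y_2, \dots, y_n$. at step $k$, $y_k$ is a linear combination of $y_1, \dots, y_{k-1}$ and the remaining $x$ vectors, and at least one remaining $x$ vector has a nonzero coefficient, since otherwise $y_k$ would be a linear combination of $y_1, \dots, y_{k-1}$. that $x$ vector can be replaced by $y_k$. we conclude that $y_1, \dots, y_n$ span the space $S$, so $y_{n+1} = \beta_1 y_1 + \cdots + \beta_n y_n$ for some scalars $\beta_1, \dots, \beta_n$. this contradicts the linear independence of the $y$ vectors, hence $p \le n$.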


basis and dimension

Definition

a set of vectors that spans $S$ and is linearly independent is called a basis for $S$.

Theorem

all bases for a subspace $S \subseteq \mathbb{R}^m$ contain the same number of vectors. this number is called the dimension of $S$ and is denoted $\dim(S)$.

Proof.

this follows from the previous lemma and the definition of a basis. suppose we have two bases $a_1, \dots, a_n$ and $b_1, \dots, b_p$ for $S$. since $a_1, \dots, a_n$ are a basis for $S$ they must span $S$, and since $b_1, \dots, b_p$ are linearly independent, the previous lemma gives $p \le n$. exchanging the roles of the two bases, we similarly find that $n \le p$, and hence $n = p$.


unique representation in a basis

Theorem

if the vectors $a_1, \dots, a_n$ are a basis for a subspace $S$, every vector in $S$ can be uniquely represented as a linear combination of these basis vectors.

Proof.

let $b \in S$, and suppose it can be written as two different linear combinations

\[
b = \alpha_1 a_1 + \cdots + \alpha_n a_n = \beta_1 a_1 + \cdots + \beta_n a_n
\]

rearranging, we have

\[
(\alpha_1 - \beta_1) a_1 + \cdots + (\alpha_n - \beta_n) a_n = 0
\]

since the $a$'s are linearly independent, the above can only be true if $\alpha_i = \beta_i$ for all $i$. thus, the coefficients are unique.
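For a concrete feel of the theorem, the coefficients of a vector in a basis of $\mathbb{R}^2$ can be computed explicitly. The sketch below uses Cramer's rule, which is a choice made for this example and not something taken from the slides, and it assumes integer entries so that the result is exact. The function name `coordinates_2d` is hypothetical.

```python
from fractions import Fraction


def coordinates_2d(a1, a2, b):
    """Return (alpha1, alpha2) with b = alpha1 * a1 + alpha2 * a2."""
    det = a1[0] * a2[1] - a2[0] * a1[1]
    if det == 0:
        raise ValueError("a1 and a2 are linearly dependent, so they are not a basis")
    alpha1 = Fraction(b[0] * a2[1] - a2[0] * b[1], det)
    alpha2 = Fraction(a1[0] * b[1] - b[0] * a1[1], det)
    return alpha1, alpha2


a1, a2 = [1, 1], [1, -1]   # a basis for R^2
b = [3, 1]

alpha1, alpha2 = coordinates_2d(a1, a2, b)
print(alpha1 == 2, alpha2 == 1)  # True True

# Reassemble b from its coefficients to confirm the representation.
rebuilt = [alpha1 * a1[i] + alpha2 * a2[i] for i in range(2)]
print(rebuilt == b)              # True
```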


the range of a matrix

Definition

the range of a matrix $A \in \mathbb{R}^{m \times n}$, denoted $\mathcal{R}(A)$, is the set

\[
\mathcal{R}(A) \equiv \{ Ax : x \in \mathbb{R}^n \}.
\]

let $a_1, \dots, a_n \in \mathbb{R}^m$ be the columns of $A$. then, from the definition of matrix-vector multiplication

\[
Ax = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n
\]

so $\mathcal{R}(A)$ is the set of all linear combinations of the columns of $A$. thus, $\mathcal{R}(A)$ is a subspace of $\mathbb{R}^m$. we also call $\mathcal{R}(A)$ the column space of $A$.


the nullspace of a matrix

Definition

the nullspace of a matrix $A \in \mathbb{R}^{m \times n}$, denoted $\mathcal{N}(A)$, is the set

\[
\mathcal{N}(A) \equiv \{ x : Ax = 0 \}.
\]

$\mathcal{N}(A)$ is a subspace of $\mathbb{R}^n$.

since $Ax = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n$, we note that

if the columns of $A$ are linearly independent then $Ax = 0 \implies x = 0$.

if the columns of $A$ are linearly dependent then $\exists\, x \neq 0$ such that $Ax = 0$.
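The two cases above can be explored with a small nullspace computation. The sketch below is one possible approach, assumed for this example: reduce $A$ to reduced row echelon form with exact rational arithmetic and read off one nullspace vector per free column. The function name `nullspace_basis` is hypothetical.

```python
from fractions import Fraction


def matvec(A, x):
    """Matrix-vector product with A stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def nullspace_basis(A):
    """Return vectors spanning N(A), one per free column of the reduced form."""
    M = [[Fraction(v) for v in row] for row in A]
    m, n = len(M), len(M[0])
    pivot_cols = []
    r = 0
    for c in range(n):
        pivot = next((i for i in range(r, m) if M[i][c] != 0), None)
        if pivot is None:
            continue  # column c is a free column
        M[r], M[pivot] = M[pivot], M[r]
        pivot_value = M[r][c]
        M[r] = [v / pivot_value for v in M[r]]  # scale the pivot to 1
        for i in range(m):
            if i != r and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        pivot_cols.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n) if c not in pivot_cols):
        z = [Fraction(0)] * n
        z[free] = Fraction(1)
        for row_index, pc in enumerate(pivot_cols):
            z[pc] = -M[row_index][free]
        basis.append(z)
    return basis


dependent = [[1, 2, 3], [2, 4, 6]]        # columns are linearly dependent
independent = [[1, 0], [0, 1], [1, 1]]    # columns are linearly independent

for z in nullspace_basis(dependent):
    print(matvec(dependent, z) == [0, 0])  # True for every basis vector
print(nullspace_basis(independent))        # []: only x = 0 satisfies Ax = 0
```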


the rank of a matrix

Definition

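the column rank of a matrix $A \in \mathbb{R}^{m \times n}$ is $\dim(\mathcal{R}(A))$. the row rank of a matrix $A \in \mathbb{R}^{m \times n}$ is $\dim(\mathcal{R}(A^T))$.

Claim

row rank equals column rank. this common value is called the rank of $A$ and is denoted $\mathrm{rank}(A)$.

Definition

a matrix $A \in \mathbb{R}^{m \times n}$ is of full rank if $\mathrm{rank}(A) = \min(m, n)$.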
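The claim that row rank equals column rank can be checked on examples by computing the rank of a matrix and of its transpose. The sketch below assumes Gaussian elimination with exact rational arithmetic as the way to count pivots; the helper names are illustrative, and a check on a few matrices is of course not a proof.

```python
from fractions import Fraction


def rank(rows):
    """Number of pivots found by Gaussian elimination (exact arithmetic)."""
    M = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(M[0]) if M else 0
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def transpose(A):
    """Swap rows and columns."""
    return [list(col) for col in zip(*A)]


def is_full_rank(A):
    """Full rank means rank(A) = min(m, n)."""
    return rank(A) == min(len(A), len(A[0]))


A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
print(rank(A), rank(transpose(A)))             # 2 2
print(is_full_rank(A))                         # False
print(is_full_rank([[1, 0], [0, 1], [1, 1]]))  # True
```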


one-to-one maps

Theorem

a matrix $A \in \mathbb{R}^{m \times n}$ with $m \ge n$ has full rank if and only if it maps no two distinct vectors to the same vector.

Proof.

$\implies$ since $m \ge n$, $A$ is full rank means its columns $a_1, \dots, a_n$ are linearly independent. if there are vectors $x, y$ where $x \neq y$ such that $Ax = Ay$, then $A(x - y) = 0$, or for $z = x - y \neq 0$, $Az = 0$. but this means that $z_1 a_1 + z_2 a_2 + \cdots + z_n a_n = 0$ with not all $z_i$ zero, which contradicts the linear independence of the columns.

$\impliedby$ if there do not exist vectors $x, y$ where $x \neq y$ such that $Ax = Ay$, then there is no vector $z = x - y \neq 0$ such that $Az = 0$, or equivalently, there do not exist scalars $z_1, z_2, \dots, z_n$, not all zero, such that $z_1 a_1 + z_2 a_2 + \cdots + z_n a_n = 0$. therefore, the vectors $a_1, a_2, \dots, a_n$ are linearly independent, and $\mathrm{rank}(A) = n$.
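A tiny example may help make the theorem concrete. The two matrices below are made up for illustration: one has linearly independent columns, the other has a second column equal to twice the first. The rank-deficient matrix sends two distinct vectors to the same image, while the full-rank one keeps them apart.

```python
def matvec(A, x):
    """Matrix-vector product with A stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


full_rank = [[1, 0], [0, 1], [1, 1]]       # columns are linearly independent
rank_deficient = [[1, 2], [2, 4], [3, 6]]  # second column = 2 * first column

x = [2, 0]
y = [0, 1]   # x != y, and x - y = [2, -1] lies in the nullspace of rank_deficient

print(matvec(rank_deficient, x), matvec(rank_deficient, y))  # [2, 4, 6] [2, 4, 6]
print(matvec(full_rank, x), matvec(full_rank, y))            # [2, 0, 2] [0, 1, 1]
```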


nonsingular matrices

Corollary

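if $A$ is full rank with $m \ge n$, then $Ax = Ay$ implies $x = y$. this implication fails for a matrix that is not full rank.

Definition

a square matrix $A \in \mathbb{R}^{n \times n}$ not of full rank is said to be singular. a square matrix $A \in \mathbb{R}^{n \times n}$ of full rank is said to be nonsingular.

if $A$ is nonsingular, we can uniquely express any vector in $\mathbb{R}^n$ as $Ax$. in particular we can express $e_i = A x_i$ for $i = 1, \dots, n$. collecting the $x_i$ as the columns of a matrix $X$,

\[
AX = A \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix}
= \begin{bmatrix} A x_1 & \cdots & A x_n \end{bmatrix}
= \begin{bmatrix} e_1 & \cdots & e_n \end{bmatrix}
= I
\]

$I$ is the $n \times n$ matrix known as the identity.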


matrix inverses

the matrix $B$ such that $AB = I$ is called the matrix inverse of $A$ and is denoted $A^{-1}$.

any square nonsingular matrix $A$ has a unique inverse $A^{-1}$ which satisfies $A^{-1} A x = x$, $A^{-1} A = I$, $A A^{-1} = I$.

if $b = Ax$ then $A^{-1} b$ gives the vector of coefficients in the linear combination of the columns of $A$ that yields $b$

\[
b = Ax = x_1 a_1 + \cdots + x_n a_n, \qquad
x = A^{-1} b = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}
\]

if $A^{-1}$ and $B^{-1}$ exist, then $(AB)^{-1} = B^{-1} A^{-1}$.
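The inverse properties listed above can be verified on small examples. The sketch below computes an inverse by Gauss-Jordan elimination on the augmented matrix $[A \mid I]$ with exact rational arithmetic. This is one standard way to do it, assumed here for illustration, and the function names are hypothetical.

```python
from fractions import Fraction


def identity(n):
    """The n x n identity matrix."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Matrix product via C_ij = sum_k A_ik * B_kj."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def inverse(A):
    """Invert a square matrix, raising ValueError if it is singular."""
    n = len(A)
    # Augmented matrix [A | I]; row operations turn it into [I | A^{-1}].
    M = [[Fraction(v) for v in row] + e for row, e in zip(A, identity(n))]
    for c in range(n):
        pivot = next((i for i in range(c, n) if M[i][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[c], M[pivot] = M[pivot], M[c]
        pivot_value = M[c][c]
        M[c] = [v / pivot_value for v in M[c]]
        for i in range(n):
            if i != c and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[c])]
    return [row[n:] for row in M]


A = [[2, 1], [1, 1]]
B = [[1, 2], [0, 1]]

print(inverse(A) == [[1, -1], [-1, 2]])                           # True
print(matmul(A, inverse(A)) == identity(2))                       # True
print(matmul(inverse(A), A) == identity(2))                       # True
print(inverse(matmul(A, B)) == matmul(inverse(B), inverse(A)))    # True
```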


linear systems

one of the fundamental problems of linear algebra: given $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$, find $x \in \mathbb{R}^n$ such that

\[
Ax = b
\]

we must have $b \in \mathcal{R}(A)$ for $x$ to exist. if $b \in \mathcal{R}(A)$ the system is compatible. if $b \notin \mathcal{R}(A)$ the system is incompatible.

if the system is compatible, there is a unique solution if and only if the columns of $A$ are linearly independent.

if the system is compatible and the columns of $A$ are linearly dependent, there is an infinite number of solutions. since $\exists\, z \neq 0$ such that $Az = 0$, if $x$ satisfies $Ax = b$ then for any $\delta \in \mathbb{R}$, $A(x + \delta z) = Ax + \delta Az = Ax = b$.
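The statements above can be illustrated with a matrix whose columns are linearly dependent. The sketch below assumes one way of testing compatibility, namely that $b \in \mathcal{R}(A)$ exactly when appending $b$ as an extra column does not increase the rank; the helper names and the example data are made up for this illustration.

```python
from fractions import Fraction


def rank(rows):
    """Number of pivots found by Gaussian elimination (exact arithmetic)."""
    M = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(M[0]) if M else 0
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def matvec(A, x):
    """Matrix-vector product with A stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def is_compatible(A, b):
    """b is in R(A) exactly when [A | b] has the same rank as A."""
    augmented = [row + [bi] for row, bi in zip(A, b)]
    return rank(augmented) == rank(A)


A = [[1, 2],
     [2, 4]]       # second column = 2 * first, so the columns are dependent
z = [2, -1]        # nonzero vector with Az = 0
x = [1, 1]         # one particular solution
b = matvec(A, x)   # b = [3, 6]

print(is_compatible(A, b))       # True
print(is_compatible(A, [3, 7]))  # False: [3, 7] is not in R(A)

# Every x + delta * z solves the same system.
for delta in [-1, 0, 2, 10]:
    shifted = [xi + delta * zi for xi, zi in zip(x, z)]
    print(matvec(A, shifted) == b)  # True each time
```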
