Chapter 5
Orthogonality
The scalar product in R^2
• Euclidean length: ||x|| = (x^T x)^{1/2} = √(x_1^2 + x_2^2).
• Distance: dist(x, y) = ||x − y||.
• Angle: x^T y = ||x|| ||y|| cos θ, where θ is the angle between x and y.
[Figures: the vectors x, y, and y − x, showing the angle θ between x and y; the law of cosines, illustrated by a triangle with sides a, b, c, angle φ, and vertices at (0, a) and (b cos φ, b sin φ).]
• Cauchy-Schwarz Inequality: |x^T y| ≤ ||x|| ||y||, with equality if and only if one of the vectors is a multiple of the other.
Augustin-Louis Cauchy, 1789–1857
Hermann Schwarz, 1843–1921
Definition 1 Two vectors x, y are called orthogonal if x^T y = 0.
• Scalar projection of x onto y:
α = x^T y / ||y||
• Vector projection of x onto y (see the numerical sketch after this list):
p = (α / ||y||) y
• Let N be a vector and let P_0 be a point in R^3. The set of points P such that the vector P_0P is orthogonal to N forms a plane, which is said to be normal to N.
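A quick numerical sketch of the projection formulas above (a minimal example assuming NumPy; the vectors x and y are arbitrary):

```python
import numpy as np

x = np.array([3.0, 4.0])
y = np.array([1.0, 0.0])

alpha = x @ y / np.linalg.norm(y)      # scalar projection of x onto y
p = (alpha / np.linalg.norm(y)) * y    # vector projection of x onto y

print(alpha)        # 3.0
print(p)            # [3. 0.]
print((x - p) @ y)  # 0.0: the residual x - p is orthogonal to y
```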
The scalar product in R^n
• Euclidean length: ||x|| = (x^T x)^{1/2} = √(x_1^2 + · · · + x_n^2).
• Distance dist(x, y) = ||x − y||.
• Cosine of the angle: cos θ = x^T y / (||x|| ||y||).
• Cauchy-Schwarz Inequality: |x^T y| ≤ ||x|| ||y||, with equality if and only if one of the vectors is a multiple of the other.
• Pythagorean Law: if x, y are orthogonal, then
||x + y||^2 = ||x||^2 + ||y||^2.
(A numerical check of these formulas follows this list.)
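A minimal NumPy sketch, with an arbitrarily chosen orthogonal pair x, y in R^3:

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0])
y = np.array([2.0, 1.0, -2.0])   # x^T y = 2 + 2 - 4 = 0, so x ⊥ y

print(np.linalg.norm(x))         # 3.0 = sqrt(1 + 4 + 4)
print(np.linalg.norm(x - y))     # dist(x, y)
print(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))   # cos θ = 0.0
# Pythagorean Law for orthogonal vectors:
print(np.linalg.norm(x + y)**2)                          # 18.0
print(np.linalg.norm(x)**2 + np.linalg.norm(y)**2)       # 18.0
```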
Example 1 (Searching a database) Consider the problem of
searching a database consisting of m documents.
• Let Q′ be the n × m matrix whose rows correspond to possible keywords and whose columns correspond to documents; the entry Q′(i, j) is the number of times keyword i appears in document j.
• Let Q be obtained from Q′ by normalizing every column vector.
• Let x′ be an n × 1 search vector with a 1 in the i-th position if we are searching for the i-th keyword, and let x = x′/||x′||.
• Then the i-th entry of y = Q^T x is q_i^T x = cos θ_i ∈ [0, 1], and the documents with the largest values align best with the search vector (see the sketch after this example).
• If the i-th document contains none of the keywords from the search vector, then q_i and x are orthogonal and cos θ_i = 0.
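A minimal sketch of this search scheme in NumPy; the keyword counts in Qp (standing in for Q′) and the query are made up for illustration:

```python
import numpy as np

# Q'(i, j) = number of times keyword i appears in document j (made-up data)
Qp = np.array([[3.0, 0.0, 1.0],
               [1.0, 2.0, 0.0],
               [0.0, 1.0, 2.0]])

Q = Qp / np.linalg.norm(Qp, axis=0)   # normalize every column

xp = np.array([1.0, 1.0, 0.0])        # searching for keywords 1 and 2
x = xp / np.linalg.norm(xp)

y = Q.T @ x                           # y[i] = cos θ_i ∈ [0, 1]
print(y)
print(np.argsort(y)[::-1])            # documents ranked by best alignment
```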
Orthogonal subspaces
Definition 2 • Two subspaces X, Y of R^n are called orthogonal if x^T y = 0 for every x ∈ X, y ∈ Y. We write X ⊥ Y.
• Let Y be a subspace of R^n. Then the orthogonal complement of Y is
Y^⊥ = {x ∈ R^n | x^T y = 0 for every y ∈ Y}.
Fact 1 • If X, Y are orthogonal subspaces of R^n, then X ∩ Y = {0}.
• If Y is a subspace of R^n, then Y^⊥ is a subspace of R^n.
Notation: Let A be an m × n matrix.
R(A) = {b ∈ R^m | b = Ax for some x ∈ R^n},
R(A^T) = {b ∈ R^n | b = A^T x for some x ∈ R^m}.
Therefore, b is in R(A^T) if and only if b is a linear combination of the columns of A^T, i.e. of the rows of A; R(A^T) is essentially the row space of A.
Theorem 2 Let A be an m × n matrix. Then
N(A) = R(A^T)^⊥
N(A^T) = R(A)^⊥.
(This is verified numerically below.)
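A sketch of the check, assuming NumPy: a basis of N(A), computed here from the SVD, is annihilated by A and hence orthogonal to every row of A. The matrix A is an arbitrary rank-deficient example:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])    # rank 1, so dim N(A) = 3 - 1 = 2

# The right singular vectors beyond the rank span N(A)
U, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
null_basis = Vt[rank:].T            # columns form a basis of N(A)

# Every null space vector is orthogonal to the rows of A, i.e. to R(A^T)
print(np.allclose(A @ null_basis, 0))   # True
```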
Definition 3 Let U, V be subspaces of W such that every w ∈ W can be written uniquely as u + v for some vectors u ∈ U, v ∈ V. Then we say that W is the direct sum of U and V, and we write W = U ⊕ V.
Theorem 3 Let S be a subspace of R^n.
• dim(S) + dim(S^⊥) = n, and if {x_1, . . . , x_r} is a basis of S and {y_1, . . . , y_{n−r}} is a basis of S^⊥, then {x_1, . . . , x_r, y_1, . . . , y_{n−r}} is a basis of R^n.
• R^n = S ⊕ S^⊥.
• (S^⊥)^⊥ = S.
Least Squares Problems
Problem: Let A be an m × n matrix of rank n and let b ∈ R^m. Find a vector x such that Ax is closest to b.
Method:
• Find x such that ||Ax − b|| is as small as possible.
• Theorem 4 Let S be a subspace of R^m and b ∈ R^m. There is a unique vector p in S such that
||p − b|| < ||y − b||
for every y ≠ p in S. Moreover, p is such that b − p ∈ S^⊥.
• Let S = R(A). Find x such that b − Ax ∈ R(A)^⊥.
• Since R(A)^⊥ = N(A^T), we want b − Ax ∈ N(A^T), that is,
A^T(b − Ax) = 0
A^T A x = A^T b   (the normal equations).
• Theorem 5 If A is an m × n matrix of rank n, then A^T A is nonsingular.
• x = (A^T A)^{−1} A^T b is called the least squares solution to Ax = b, and the vector p in R(A) closest to b is p = A(A^T A)^{−1} A^T b (see the sketch below).
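A minimal NumPy sketch of the method; A and b are made-up examples. Note that forming A^T A explicitly squares the condition number, so library routines such as np.linalg.lstsq (based on orthogonal factorizations) are preferred in practice:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])    # m = 3, n = 2, rank 2
b = np.array([6.0, 0.0, 0.0])

# Normal equations: A^T A x = A^T b
x = np.linalg.solve(A.T @ A, A.T @ b)
p = A @ x                                  # vector in R(A) closest to b

print(x)                                   # [ 5. -3.]
print(np.allclose(A.T @ (b - p), 0))       # True: b - p ∈ N(A^T)
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))   # True
```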
Approximating data with polynomials
• Given a set of points in R^2, {(x_i, y_i) | i = 1, . . . , n}, find a straight line y = c_0 + c_1 x which is closest to these points. This is the least squares problem

    [ 1  x_1 ]            [ y_1 ]
    [ 1  x_2 ]  [ c_0 ]   [ y_2 ]
    [  ...   ]  [ c_1 ] = [ ... ]
    [ 1  x_n ]            [ y_n ]
• Given a set of points in R^2, {(x_i, y_i) | i = 1, . . . , n}, find a polynomial f(x) = ∑_{i=0}^m c_i x^i of degree m which is closest to these points. This is the least squares problem

    [ 1  x_1  x_1^2  . . .  x_1^m ]  [ c_0 ]   [ y_1 ]
    [ 1  x_2  x_2^2  . . .  x_2^m ]  [ c_1 ]   [ y_2 ]
    [            ...              ]  [ ... ] = [ ... ]
    [ 1  x_n  x_n^2  . . .  x_n^m ]  [ c_m ]   [ y_n ]

(Both fits are sketched below.)
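A sketch of both fits in NumPy; the data points are invented for illustration:

```python
import numpy as np

xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([1.1, 1.9, 3.2, 3.8, 5.1])      # made-up data

# Straight line y = c0 + c1 x: columns 1 and x_i
A = np.column_stack([np.ones_like(xs), xs])
print(np.linalg.lstsq(A, ys, rcond=None)[0])  # [c0, c1]

# Degree-m polynomial: Vandermonde matrix with columns 1, x, ..., x^m
m = 2
V = np.vander(xs, m + 1, increasing=True)
print(np.linalg.lstsq(V, ys, rcond=None)[0])  # [c0, c1, c2]
```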
Inner product spaces
Definition 4 An inner product on a vector space V is a function
〈·, ·〉 : V × V → R such that
• 〈x, x〉 ≥ 0, with equality if and only if x = 0.
• 〈x, y〉 = 〈y, x〉.
• 〈αx + βy, z〉 = α〈x, z〉 + β〈y, z〉.
Example 2 • R^n:
〈x, y〉 = x^T y
• R^{m×n}:
〈A, B〉 = ∑_{i=1}^m ∑_{j=1}^n a_ij b_ij
• C[a, b]:
〈f, g〉 = ∫_a^b f(t) g(t) dt
• P_n: let x_1, . . . , x_n be distinct real numbers, and set
〈p, q〉 = ∑_{i=1}^n p(x_i) q(x_i).
(Each of these is evaluated numerically below.)
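A short sketch evaluating each of these inner products (SciPy's quad handles the integral numerically; all the concrete vectors, matrices, and functions are arbitrary examples):

```python
import numpy as np
from scipy.integrate import quad

# R^n: <x, y> = x^T y
x, y = np.array([1.0, 2.0]), np.array([3.0, -1.0])
print(x @ y)                    # 1.0

# R^{m×n}: <A, B> = sum of a_ij * b_ij
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
print(np.sum(A * B))            # 5.0

# C[a, b]: <f, g> = integral of f(t) g(t) dt
val, _ = quad(lambda t: np.sin(t) * np.cos(t), 0.0, np.pi)
print(val)                      # ~0: sin and cos are orthogonal on [0, pi]

# P_n: <p, q> = sum of p(x_i) q(x_i) at chosen points
pts = np.array([-1.0, 0.0, 1.0])
print(np.sum(pts * pts**2))     # 0.0: p(x) = x and q(x) = x^2 are orthogonal here
```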
Let V be an inner product space. Then the norm (or length) is given by
||v|| = √〈v, v〉.
Theorem 6 If x, y ∈ V are orthogonal, then
||x + y||^2 = ||x||^2 + ||y||^2.
Theorem 7 (Cauchy-Schwarz) If x, y ∈ V, then
|〈x, y〉| ≤ ||x|| · ||y||,
with equality if and only if x, y are linearly dependent.
Normed linear spaces
Definition 5 A pair (V, || · ||) is called a normed linear space if V
is a vector space and || · || : V → R is such that
• ||v|| ≥ 0, with equality if and only if v = 0.
• ||αv|| = |α| ||v||.
• ||v + w|| ≤ ||v|| + ||w|| (the triangle inequality).
Example 3 • Let V = R^n. Then ||x||_∞ = max_i |x_i|.
• Let V = R^n and 1 ≤ p < ∞. Then
||x||_p = ( ∑_{i=1}^n |x_i|^p )^{1/p}.
• Let V = C[a, b]. Then
||f||_p = ( ∫_a^b |f(x)|^p dx )^{1/p},
||f||_∞ = max_{x ∈ [a,b]} |f(x)|.
(A numerical illustration follows.)
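A quick NumPy illustration of these norms on an arbitrary vector:

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0])

print(np.linalg.norm(x, 1))       # 7.0: sum of |x_i|
print(np.linalg.norm(x, 2))       # 5.0: Euclidean norm
print(np.linalg.norm(x, np.inf))  # 4.0: max |x_i|

# Triangle inequality: ||x + y|| <= ||x|| + ||y||
y = np.array([1.0, 1.0, 1.0])
print(np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y))  # True
```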
Orthonormal sets
Definition 6 • Vectors v_1, . . . , v_n in an inner product space V are called orthogonal if 〈v_i, v_j〉 = 0 for every i ≠ j.
• Vectors v_1, . . . , v_n are called orthonormal if
〈v_i, v_j〉 = δ_ij.
Example 4 • Consider C[−π, π] with 〈f, g〉 = (1/π) ∫_{−π}^{π} f(x) g(x) dx. Then
1/√2, cos x, cos 2x, . . . , cos nx
is an orthonormal set of vectors.
• Consider P_4 with 〈p, q〉 = ∫_{−1}^{1} p(x) q(x) dx. Then
1, x, (1/2)(3x^2 − 1), (1/2)(5x^3 − 3x)
are orthogonal (these are the first four Legendre polynomials).
Definition 7 A basis of orthonormal vectors is called an orthonormal basis.
Theorem 8 Let {u_1, . . . , u_n} be an orthonormal basis of an inner product space V. If v = ∑_{i=1}^n c_i u_i, then c_i = 〈v, u_i〉.
Theorem 9 Let {u_1, . . . , u_n} be an orthonormal basis of an inner product space V.
• If v = ∑_{i=1}^n c_i u_i and u = ∑_{i=1}^n d_i u_i, then
〈v, u〉 = ∑_{i=1}^n c_i d_i.
• (Parseval's Identity) If v = ∑_{i=1}^n c_i u_i, then
||v||^2 = ∑_{i=1}^n c_i^2.
(A numerical check of Theorems 8 and 9 follows.)
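A sketch assuming NumPy, with an arbitrary orthonormal basis of R^2 (a rotated standard basis):

```python
import numpy as np

u1 = np.array([1.0, 1.0]) / np.sqrt(2)    # orthonormal basis of R^2
u2 = np.array([-1.0, 1.0]) / np.sqrt(2)

v = np.array([3.0, 1.0])
c = np.array([v @ u1, v @ u2])            # c_i = <v, u_i>  (Theorem 8)

print(np.allclose(c[0] * u1 + c[1] * u2, v))   # True: v = sum of c_i u_i
print(np.allclose(np.sum(c**2), v @ v))        # True: Parseval's Identity
```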
Theorem 10 Let S be a subspace of an inner product space V and let x ∈ V. Let {x_1, . . . , x_k} be an orthonormal basis of S and let p = ∑_{i=1}^k c_i x_i, where c_i = 〈x, x_i〉. Then
• p − x ∈ S^⊥.
• ||x − y||^2 ≥ ||x − p||^2 for every y ∈ S.
• The least squares approximation of x from S is
p = ∑_{i=1}^k 〈x, x_i〉 x_i.
Definition 8 An n × n matrix Q is called orthogonal if its columns form an orthonormal set in R^n.
Properties of orthogonal matrices (see the check below):
• The columns form an orthonormal basis of R^n.
• Q^T Q = I, so Q^T = Q^{−1}.
• 〈Qx, Qy〉 = 〈x, y〉 and ||Qx||_2 = ||x||_2.
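A quick check of these properties on a rotation matrix, the standard 2 × 2 example of an orthogonal matrix:

```python
import numpy as np

t = 0.3
Q = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])   # rotation by angle t

x = np.array([1.0, 2.0])
print(np.allclose(Q.T @ Q, np.eye(2)))                        # True: Q^T Q = I
print(np.allclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # True: length preserved
```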
The Gram-Schmidt orthogonalization
Problem: Find an orthonormal basis of a vector space.
Method:
• Find a basis, E = {x_1, . . . , x_k}.
• Construct an orthonormal basis from E using the Gram-Schmidt orthogonalization algorithm.
Gram-Schmidt Method:
•
u_1 = (1 / ||x_1||) x_1
•
u_{k+1} = (1 / ||x_{k+1} − p_k||) (x_{k+1} − p_k),
where p_k is the vector in Span(u_1, . . . , u_k) which is closest to x_{k+1}, i.e.
p_k = 〈x_{k+1}, u_1〉 u_1 + · · · + 〈x_{k+1}, u_k〉 u_k.
Theorem 11 If E = {x_1, . . . , x_k} is a basis of an inner product space V and {u_1, . . . , u_k} is obtained from E using the Gram-Schmidt process, then {u_1, . . . , u_k} is an orthonormal basis of V. (An implementation sketch follows.)
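A minimal NumPy implementation of the process as described above (classical Gram-Schmidt; the basis X is an arbitrary example):

```python
import numpy as np

def gram_schmidt(X):
    """Columns of X form a basis; returns U whose columns are orthonormal
    and span the same subspace."""
    U = np.zeros_like(X, dtype=float)
    for k in range(X.shape[1]):
        # p_k = projection of x_{k+1} onto Span(u_1, ..., u_k)
        p = U[:, :k] @ (U[:, :k].T @ X[:, k])
        w = X[:, k] - p
        U[:, k] = w / np.linalg.norm(w)
    return U

X = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])                 # basis of a 2-dim subspace of R^3
U = gram_schmidt(X)
print(np.allclose(U.T @ U, np.eye(2)))     # True: orthonormal columns
```

In floating point, classical Gram-Schmidt can lose orthogonality; modified Gram-Schmidt or a QR factorization (np.linalg.qr) is the usual remedy.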
Orthogonal Polynomials
Goal: a sequence of polynomials p_0(x), . . . , p_n(x), . . . such that p_n(x) is a polynomial of degree n and 〈p_i, p_j〉 = 0 for i ≠ j.
Inner product:
〈p, q〉 = ∫_a^b p(x) q(x) w(x) dx,
where different choices of the weight w yield different families of polynomials. How does one obtain an orthogonal sequence from 1, x, x^2, . . . ?
• Gram-Schmidt procedure.
• Recurrence relation: for n ≥ 0,
α_{n+1} p_{n+1}(x) = (x − β_{n+1}) p_n(x) − α_n γ_n p_{n−1}(x),
where
α_n = a_{n−1} / a_n,
β_n = 〈p_{n−1}, x p_{n−1}〉 / 〈p_{n−1}, p_{n−1}〉,
γ_n = 〈p_n, p_n〉 / 〈p_{n−1}, p_{n−1}〉,
a_i is the lead coefficient of p_i, p_{−1} = 0, and α_0 = γ_0 = 1. (A sketch of the Gram-Schmidt route follows.)
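A sketch of the Gram-Schmidt route using numpy.polynomial, with weight w ≡ 1 on [−1, 1]; applied to 1, x, x^2, x^3 it reproduces scalar multiples of the Legendre polynomials from Example 4:

```python
import numpy as np
from numpy.polynomial import Polynomial as P

def inner(p, q, a=-1.0, b=1.0):
    """<p, q> = integral of p(x) q(x) on [a, b] (weight w = 1)."""
    r = (p * q).integ()
    return r(b) - r(a)

ps = []
for n in range(4):
    p = P([0.0] * n + [1.0])      # start from the monomial x^n
    for q in ps:                  # subtract projections onto earlier polynomials
        p = p - (inner(p, q) / inner(q, q)) * q
    ps.append(p)

for p in ps:
    print(p.coef)   # 1, x, x^2 - 1/3, x^3 - (3/5)x: the monic Legendre polynomials
```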