Calculus Vector Principia Mathematica


Lynne Ryan, Associate Professor, Mathematics, Blue Ridge Community College

Defining a vector

Vectors in the plane

• A scalar is a quantity that can be represented by a single real number (such as length or area).

• A vector (denoted v) is a quantity that has both magnitude (denoted ||v||) and direction (denoted θ). An example of a vector quantity would be force, such as a force of 4 N directed at 45° above the horizontal.

• Vectors can be sketched in the coordinate plane as directed line segments. A vector denoted v = <v1, v2> can be represented with its tail at the origin (0, 0), and head at the point (v1, v2).

• A vector v = <v1, v2> has x component v1 and y component v2. The relationship between components, magnitude, and direction angle is shown below:

$$v_1 = \|v\| \cos\theta$$

$$v_2 = \|v\| \sin\theta$$

$$v_1^2 + v_2^2 = \|v\|^2$$

$$\tan\theta = \frac{v_2}{v_1}$$

Example: What are the magnitude and direction of the vector v =< 3, 4 >?


Example: What are the components of a vector with magnitude 25 m and direction θ = 2π/3?
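The relationships between components, magnitude, and direction angle can be checked numerically. The following is a minimal Python sketch; the helper names `magnitude_and_angle` and `components` are illustrative and are not part of the notes. It uses `math.atan2` rather than the inverse tangent of v2/v1 so that the angle lands in the correct quadrant.

```python
import math

def magnitude_and_angle(v1, v2):
    """Return (||v||, theta) for the plane vector <v1, v2>; theta is in radians."""
    magnitude = math.hypot(v1, v2)   # sqrt(v1^2 + v2^2)
    theta = math.atan2(v2, v1)       # quadrant-aware form of tan(theta) = v2 / v1
    return magnitude, theta

def components(magnitude, theta):
    """Return (v1, v2) from a magnitude and a direction angle in radians."""
    return magnitude * math.cos(theta), magnitude * math.sin(theta)

# Magnitude and direction of v = <3, 4>
mag, theta = magnitude_and_angle(3, 4)
print(mag, math.degrees(theta))

# Components of a vector with magnitude 25 and direction 2*pi/3
v1, v2 = components(25, 2 * math.pi / 3)
print(v1, v2)
```

Floating-point results are approximate, so compare them with a tolerance rather than with exact equality.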

• Vectors can be relocated in space. For a vector with tail at the point (a1, a2) and head at the point (b1, b2)

v = <b1 − a1, b2 − a2>

• A vector is really a set of directed line segments; all directed line segments with the same magnitude and direction are considered to be the same vector, regardless of where their heads and tails are located.


Defining a vector

Vectors in three or more dimensions

• Vectors in three dimensional space can be defined as ordered triples of components; the vector v = <v1, v2, v3> has

x component v1

y component v2

z component v3

It may also be denoted < vx, vy, vz >.

• We can generalize the idea of a vector in n-dimensional space as an ordered n-tuple:

v =< v1, v2, ..., vn >

The v1 ... vn are the components of v.

• Vectors may be located anywhere in space. A vector with tail at (a1, a2, ..., an) and head at (b1, b2, ..., bn) has components

v =< b1 − a1, b2 − a2, ..., bn − an >

Example: How would you denote a vector with tail at (3,−2, 1) and head at (1, 1, 4)?

• For vectors in space, the magnitude is still obtained from the Pythagorean Theorem:

$$\|v\|^2 = v_1^2 + v_2^2 + v_3^2$$

or, in general,

$$\|v\|^2 = v_1^2 + v_2^2 + \cdots + v_n^2 = \sum_{i=1}^{n} v_i^2$$

• It takes more than one angle to fix the direction of a vector v = <v1, v2, v3> in space. Direction cosines can be used. Let

α be the angle between v and the positive x axis
β be the angle between v and the positive y axis
γ be the angle between v and the positive z axis

Then

$$\cos\alpha = \frac{v_1}{\|v\|}, \qquad \cos\beta = \frac{v_2}{\|v\|}, \qquad \cos\gamma = \frac{v_3}{\|v\|}$$

(The angles themselves are obtained by taking inverse cosines.)

Example: Find the magnitude and direction cosines of a vector with tail at (3,−2, 1) and head at (1, 1, 4). What are the angles between the vector and the x, y, and z axes?

• The distance formula can be written in terms of vectors. If v is the vector with tail at (a1, a2, ..., an) and head at (b1, b2, ..., bn), then the distance between head and tail is simply the magnitude of the vector v = <b1 − a1, b2 − a2, ..., bn − an> = <v1, v2, ..., vn>:

$$d = \|v\| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$$

Example: What is the distance between the points (1, 4, 5) and (−1,−2,−5)?
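The component, magnitude, direction-cosine, and distance formulas of this section translate directly into a few lines of code. This is a hedged sketch in plain Python; the function names `displacement`, `norm`, and `direction_cosines` are assumptions chosen for illustration.

```python
import math

def displacement(tail, head):
    """Components <b1 - a1, ..., bn - an> of the vector from tail to head."""
    return [b - a for a, b in zip(tail, head)]

def norm(v):
    """Magnitude ||v|| = sqrt(v1^2 + ... + vn^2)."""
    return math.sqrt(sum(x * x for x in v))

def direction_cosines(v):
    """Cosines of the angles between v and the positive coordinate axes.
    Not defined for the zero vector (division by zero)."""
    length = norm(v)
    return [x / length for x in v]

# Vector with tail at (3, -2, 1) and head at (1, 1, 4)
v = displacement((3, -2, 1), (1, 1, 4))
cosines = direction_cosines(v)
angles_deg = [math.degrees(math.acos(c)) for c in cosines]
print(v, norm(v), cosines, angles_deg)

# Distance between (1, 4, 5) and (-1, -2, -5) is the norm of the displacement
d = norm(displacement((1, 4, 5), (-1, -2, -5)))
print(d)
```

A useful sanity check is that the squares of the direction cosines always sum to 1.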


Vector addition and scalar multiplication

Using components

• Let u = <u1, u2, ..., un> and v = <v1, v2, ..., vn> be vectors and let c be a scalar. We define vector addition:

u + v = <u1 + v1, u2 + v2, ..., un + vn>

and scalar multiplication:

cv = <cv1, cv2, ..., cvn>

Example: For u =< 1, 4, 5 >, v =< −1, 5, 2 >, c = −4, find

cu + v

• Properties of vector addition:

* u + v = v + u (commutative property of addition)

* u + (v + w) = (u + v) + w (associative property of addition)

* u + 0 = 0 + u = u (additive identity; the identity element is the zero vector, a vector with ui = 0 for all i)

* u + (−u) = (−u) + u = 0 (additive inverse)

• Properties of scalar multiplication:

* c(u + v) = cu + cv (distributive property; scalar over vector sum)

* (c + d)u = cu + du (distributive property; vector over scalar sum)

* c(du) = (cd)u (associative; scalars with vector)

* 1u = u (multiplicative identity for scalar multiplication)
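The componentwise definitions and a couple of the listed properties can be spot-checked in code. This is a minimal sketch with vectors stored as Python lists; the helper names `add` and `scale` are illustrative. A numerical check on one example is not a proof of the property.

```python
def add(u, v):
    """Vector addition, component by component (assumes equal lengths)."""
    return [a + b for a, b in zip(u, v)]

def scale(c, v):
    """Scalar multiplication: multiply every component by c."""
    return [c * a for a in v]

u = [1, 4, 5]
v = [-1, 5, 2]
c = -4

# cu + v, built up in pieces
result = add(scale(c, u), v)
print(result)

# Spot-check two properties on this example
commutative = add(u, v) == add(v, u)
distributive = scale(c, add(u, v)) == add(scale(c, u), scale(c, v))
print(commutative, distributive)
```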


Proof: Prove that for vectors u and v,

u + v = v + u

Suppose that we have two vectors in n dimensional space:

u = < u1, u2, ..., un >

v = < v1, v2, ..., vn >

By the definition of vector addition,

u + v =< u1 + v1, u2 + v2, ..., un + vn >

Since the components themselves are scalars, by the commutative property of addition

u + v = < u1 + v1, u2 + v2, ..., un + vn >

= < v1 + u1, v2 + u2, ..., vn + un >

= v + u

• The set of all vectors in the plane (v = <v1, v2>) with accompanying scalars and the operations of vector addition and scalar multiplication defined on them form a vector space, R2. In general, for any n, the set of all vectors in the form v = <v1, v2, ..., vn> with accompanying scalars form a vector space Rn under the operations of vector addition and scalar multiplication.

Vector addition and scalar multiplication

Graphical interpretation

• Multiplying a vector by −1 has the effect of reversing the direction of the vector.

• Multiplying a vector by a constant c scales the vector to |c| times the original length (and reverses its direction when c is negative):

||cv|| = |c| ||v||

• Vectors can be added graphically by either the parallelogram method, or the tail to head method.

• An expression in the form cu + dv (combining operations of scalar multiplication and vector addition) is a linear combination of u and v.

Example: For the vectors u = <1, −2> and v = <2, 3>, find −2u + v by sketching (use the parallelogram method).

Unit vectors and direction

• Any vector of length (magnitude) 1 is a unit vector. Given a nonzero vector v, you can construct a unit vector u in the same direction as v by normalizing v:

$$u = \frac{1}{\|v\|} v$$

Example: Normalize v =< 1,−1, 5 >.

• A unit vector can be used to represent the direction of a given vector - normalizing v gives us a way to borrow its direction, without borrowing its magnitude. To find a vector w with a given magnitude, and in the direction of a given vector v,

* Find the unit vector in the direction of v.

* Multiply the unit vector by the given magnitude, to scale it to the correct length.

Example: Find a vector w in the direction of v =< 2, 3 >, but with ||w|| = 5.
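The two-step recipe (normalize, then scale) can be sketched as follows. The function names `normalize` and `with_magnitude` are hypothetical helpers introduced for this illustration.

```python
import math

def norm(v):
    """Magnitude of v."""
    return math.sqrt(sum(x * x for x in v))

def normalize(v):
    """Unit vector in the direction of v; the zero vector has no direction."""
    length = norm(v)
    if length == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / length for x in v]

def with_magnitude(v, magnitude):
    """Vector in the direction of v with the requested magnitude."""
    return [magnitude * x for x in normalize(v)]

u = normalize([1, -1, 5])        # unit vector along <1, -1, 5>
w = with_magnitude([2, 3], 5)    # direction of <2, 3>, length 5
print(u, norm(u))
print(w, norm(w))
```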


• In the plane, the vectors i = <1, 0> and j = <0, 1> are the standard unit vectors. In space, the standard unit vectors are i = <1, 0, 0>, j = <0, 1, 0>, and k = <0, 0, 1>.

• Any vector can be written as a linear combination of the standard unit vectors; for example

<2, −3, 4> = 2<1, 0, 0> − 3<0, 1, 0> + 4<0, 0, 1> = 2i − 3j + 4k

• The vectors i and j are referred to as the standard basis for R2 (they span and are independent). i, j, and k are the standard basis for R3, and so on. The idea of basis is covered in detail in Linear Algebra.

Example: Express < −2, 0,−4 > as a linear combination of the standard basis vectors.

Example: Calculate (3i − 2k) − 3(i + j + 5k)


Application

Forces

• Vectors can be used to represent forces - a force has both a magnitude and a direction. The resultant of a collection of forces is the net force.

Example: Find the resultant force: the vector sum of <1, 4> and <−2, 5>. What are the magnitude and direction of the resultant? Sketch.

Example: An example of a tension problem is presented in the lecture. Take the time to work through the steps, and sketch and write yourself some notes as you go!

The dot product

Defining the dot product

• For u =< u1, u2, ..., un > and v =< v1, v2, ..., vn > in Rn we define the dot product

$$u \cdot v := u_1 v_1 + u_2 v_2 + \cdots + u_n v_n = \sum_{i=1}^{n} u_i v_i$$

• Note the dot product is a scalar. This function is a type of inner product.

Example: For u =< 1, 2, 3 >, v =< −4, 1, 0 >, find u · v.

• Properties of the dot product:

* u · v = v · u

* (u + v) · w = u · w + v · w

* cu · v = c(u · v) = u · cv

* u · u ≥ 0 and u · u = 0 if and only if u = 0

Proof: The first property:

Suppose u and v are in Rn. Then

$$u \cdot v = \sum_{i=1}^{n} u_i v_i = \sum_{i=1}^{n} v_i u_i = v \cdot u$$

since ui and vi are real numbers, and the commutative property of multiplication holds.

Example: Prove the second property:

Proof: The fourth property:

Prove u · u ≥ 0 and u · u = 0 if and only if u = 0

[First, prove that u · u ≥ 0]

$$u \cdot u = \sum_{i=1}^{n} u_i u_i = \sum_{i=1}^{n} (u_i)^2$$

Since $(u_i)^2 \ge 0$ for all $u_i$, we must have $\sum_{i=1}^{n} (u_i)^2 \ge 0$. So, u · u ≥ 0.

[Now, the if and only if] Suppose u · u = 0. Then $\sum_{i=1}^{n} (u_i)^2 = 0$. So each $u_i$ must equal zero, since if any $u_i$ were nonzero, $(u_i)^2$ would add a positive value to the sum. Since each $u_i = 0$, u = 0.

[The other direction] Suppose u = 0. Then $u_i = 0$ for each i, and $\sum_{i=1}^{n} (u_i)^2 = 0$. So u · u = 0.

• Dot product and magnitude are related by the following formula:

$$\|v\|^2 = v \cdot v$$

Exercise: Prove it!

The dot product

The angle between two vectors

• If u and v are both unit vectors and θ is the angle between them, then

$$u \cdot v = \cos\theta$$

• If u and v are not both unit vectors, then we need to normalize:

$$\left(\frac{u}{\|u\|}\right) \cdot \left(\frac{v}{\|v\|}\right) = \cos\theta$$

This can be rearranged as

$$u \cdot v = \|u\| \, \|v\| \cos\theta$$

• The proof (for vectors in the plane) comes from the Law of Cosines. Please take a few minutes to work through the derivation - sketch and write it down!

• The relationship

$$\cos\theta = \frac{u \cdot v}{\|u\| \, \|v\|}$$

is useful for calculating the angle between two vectors (and can be used to define the idea of “angle” between generalized vectors in any vector space).

Example: Find the angle between the vectors u =< 1,−2 > and v =< −3,−4 >.

Example: Find the angle between the vectors u = 3i − 2j + k and v = −j + 5k.
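The dot product and the angle formula can be sketched in a few lines. The helper names below (`dot`, `norm`, `angle_between`) are assumptions for illustration; the clamp before `math.acos` guards against floating-point round-off pushing the cosine slightly outside [−1, 1].

```python
import math

def dot(u, v):
    """Dot product u1*v1 + ... + un*vn (a scalar)."""
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """Magnitude, using ||v||^2 = v . v."""
    return math.sqrt(dot(v, v))

def angle_between(u, v):
    """Angle in radians from cos(theta) = (u . v) / (||u|| ||v||)."""
    cos_theta = dot(u, v) / (norm(u) * norm(v))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against round-off
    return math.acos(cos_theta)

print(dot([1, 2, 3], [-4, 1, 0]))

theta_plane = angle_between([1, -2], [-3, -4])
theta_space = angle_between([3, -2, 1], [0, -1, 5])   # 3i - 2j + k and -j + 5k
print(math.degrees(theta_plane), math.degrees(theta_space))
```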


The dot product

Orthogonal vectors

• Two vectors u and v are orthogonal (perpendicular) if and only if u · v = 0.

Example: Are the vectors u =< 1, 2,−4 > and v =< −1, 2, 4 > orthogonal?

Example: Find a vector orthogonal to u = <1, −3>. How many such vectors are there, and what do they look like relative to u?

Example: Find a vector orthogonal to u = <1, −3, 2>. How many such vectors are there, and what do they look like?

• An orthogonal set is a collection of vectors that is pairwise orthogonal (the dot product of any pair of vectors in the set is zero).

Example: The set of vectors <1, 2, 1>, <4, −2, 0>, <2, 4, −10> is an orthogonal set. The vectors in any pairing are perpendicular - dot all possible combinations to see:

• A set of vectors is orthonormal if it is orthogonal and all the vectors in the set are normalized (have magnitude equal to 1, i.e. are unit vectors). The set of vectors in the previous example is not orthonormal - the vectors are not unit vectors.

• However, you can construct an orthonormal set from an orthogonal set by normalizing each of the vectors:
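Checking pairwise orthogonality and normalizing each vector is mechanical, so it is easy to sketch in code. The helpers `is_orthogonal_set` and `normalize_all` are illustrative names, and the tolerance is an assumption needed because normalized components are floating-point numbers.

```python
import math
from itertools import combinations

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def is_orthogonal_set(vectors, tol=1e-12):
    """True if every pair of distinct vectors has dot product (numerically) zero."""
    return all(abs(dot(u, v)) <= tol for u, v in combinations(vectors, 2))

def normalize_all(vectors):
    """Divide each vector by its own magnitude."""
    return [[x / norm(v) for x in v] for v in vectors]

vectors = [[1, 2, 1], [4, -2, 0], [2, 4, -10]]
print(is_orthogonal_set(vectors))

orthonormal = normalize_all(vectors)
print([norm(v) for v in orthonormal])
```

Normalizing does not change directions, so the normalized set is still pairwise orthogonal.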


The dot product

Vector projections and components

• Recall that a vector can be broken down into horizontal and vertical components:

v =< v1, v2 >= v1i + v2j

The v1 and v2 are scalars, and indicate the length of each component; the i and j are vectors indicating the horizontal and vertical directions.

• In some applications, it may be more convenient to break a vector down into perpendicular components that are not horizontal and vertical.

• The vector projection of v onto u is denoted $\mathrm{proj}_u v$ and is shown below. It is also called the vector component of v along u.

• The length (magnitude) of $\mathrm{proj}_u v$ is denoted $\mathrm{comp}_u v$:

$$\mathrm{comp}_u v = \|\mathrm{proj}_u v\|$$

$$\mathrm{comp}_u v = \|v\| \cos\theta$$

$$\mathrm{comp}_u v = \frac{u \cdot v}{\|u\|}$$

• The direction of $\mathrm{proj}_u v$ is that of u. However, we need to normalize u in order to borrow its direction:

$$\text{Direction of } \mathrm{proj}_u v = \frac{u}{\|u\|}$$

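• Putting it all together,

$$\mathrm{proj}_u v = (\text{magnitude})(\text{direction}) = (\mathrm{comp}_u v)\left(\frac{u}{\|u\|}\right) = \left(\frac{u \cdot v}{\|u\|}\right)\left(\frac{u}{\|u\|}\right)$$

or

$$\mathrm{proj}_u v = \left(\frac{u \cdot v}{\|u\|^2}\right) u$$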

Example: Let v = <1, 4> and u = <−3, 1>. Find $\mathrm{proj}_u v$ and $\mathrm{proj}_v u$. Sketch both.

• The vector component of v orthogonal to u is denoted $\mathrm{orth}_u v$ and is shown below:

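• v is the vector sum of $\mathrm{orth}_u v$ and $\mathrm{proj}_u v$:

$$v = \mathrm{proj}_u v + \mathrm{orth}_u v$$

so

$$\mathrm{orth}_u v = v - \mathrm{proj}_u v$$

Example: In the previous example, we found that for v = <1, 4> and u = <−3, 1>, $\mathrm{proj}_u v = \left\langle -\tfrac{3}{10}, \tfrac{1}{10} \right\rangle$. Use this to find $\mathrm{orth}_u v$.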
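The projection and orthogonal-component formulas can be sketched as below. The function names `proj` and `orth` mirror the notation of the notes but are otherwise illustrative; the code uses the fact that ||u||² = u · u.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def proj(u, v):
    """Vector projection of v onto u: ((u . v) / ||u||^2) u, with u nonzero."""
    factor = dot(u, v) / dot(u, u)
    return [factor * x for x in u]

def orth(u, v):
    """Vector component of v orthogonal to u: v - proj_u v."""
    p = proj(u, v)
    return [a - b for a, b in zip(v, p)]

v = [1, 4]
u = [-3, 1]
p = proj(u, v)
o = orth(u, v)
print(p, o)
print(dot(o, u))   # should be zero up to round-off
```

The two pieces add back up to v, and the orthogonal piece has zero dot product with u, which is a quick way to check a hand computation.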

Example: Sketch and work the last example of a box on a hill:


Application

Work

• When a constant force F is used to move an object through a distance d, and the force is entirely in the direction of the object's motion, we have

W = Fd

In this version, familiar from Physics courses, force is represented as a scalar (and that’s OK, since the force and the distance line up), but this is really just a special case.

• When the force is directed at an angle from the line of motion, the above no longer holds. It is necessary to think of force as a vector F, with magnitude and direction. Only the component of F in the direction of the motion does work.

• We can also think of distance as a vector, so that we can describe work as

$$W = (\mathrm{comp}_d F)\,\|d\|$$

$$W = \|F\| \, \|d\| \cos\theta$$

$$W = F \cdot d$$

(Work is the dot product of the force vector and the distance vector.)

Example: What is the work done by a force of 30 N directed 70° above the horizontal, moving a box up an incline of 25°?

Example: A force of (30i − 20j + 100k) N is applied to move an object from a location at (1, 1, 1) m to a location at (10, 5, 1) m. What is the work done by the force?
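Work as a dot product is a one-line computation once the displacement vector is known. A minimal sketch, with illustrative helper names, for a force in newtons and positions in meters:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def displacement(start, end):
    """Displacement vector from the start point to the end point."""
    return [b - a for a, b in zip(start, end)]

F = [30, -20, 100]                          # force components in N
d = displacement((1, 1, 1), (10, 5, 1))     # displacement in m
W = dot(F, d)                               # work in N*m (joules)
print(d, W)
```

Note that the k component of the force does no work here, because the displacement has no k component.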


Theorems

Cauchy-Schwarz inequality, Triangle inequality, Pythagorean theorem

• This section is a collection of three important theorems involving dot product and magnitude. Please work through the proofs!

• Cauchy-Schwarz inequality

For any two vectors u and v in a vector space,

|u · v| ≤ ||u|| ||v||

Example: Verify the Cauchy-Schwarz inequality for u =< 1, 4, 3 > and v =< −1, 0, 2 >.

Proof: Work through the proof:

Note that this form of the proof relies on the idea of angle - this proof (which is the simple one) is OK for vectors in Rn, where we can verify the dot product formula through geometry. The C-S inequality does hold for vectors in general vector spaces, but you have to be a little careful or you end up going in circles on the idea of angle - a different proof approach is preferred.

• Triangle inequality

For any two vectors u and v in a vector space

||u + v|| ≤ ||u||+ ||v||

Proof: Work through the proof:

The proof relies on the Cauchy-Schwarz inequality. When the C-S inequality is established for general vector spaces, this also establishes the Triangle inequality for general vector spaces.

• Pythagorean theorem

If u and v are orthogonal, then

$$\|u + v\|^2 = \|u\|^2 + \|v\|^2$$

Proof: Prove it! Start it like the proof of the Triangle inequality ... but what do you know about the dot product of orthogonal vectors? Use that, and the proof is over with quickly!
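The three theorems can be spot-checked numerically before working through the proofs. The sketch below is only a check on particular vectors, not a proof; the helper names and the choice of the orthogonal pair p, q are assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def add(u, v):
    return [a + b for a, b in zip(u, v)]

u = [1, 4, 3]
v = [-1, 0, 2]

# Cauchy-Schwarz: |u . v| <= ||u|| ||v||
cauchy_schwarz_holds = abs(dot(u, v)) <= norm(u) * norm(v)

# Triangle inequality: ||u + v|| <= ||u|| + ||v||
triangle_holds = norm(add(u, v)) <= norm(u) + norm(v)

# Pythagorean theorem needs orthogonal vectors: p . q = 0
p = [1, 2, 1]
q = [4, -2, 0]
pythagorean_gap = norm(add(p, q)) ** 2 - (norm(p) ** 2 + norm(q) ** 2)

print(cauchy_schwarz_holds, triangle_holds, pythagorean_gap)
```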


The cross product

Defining the cross product

• The cross product of two vectors u and v in space is defined by

u × v = (||u|| ||v|| sin θ)n

where θ is the angle between u and v.

• n is a unit vector orthogonal to both u and v, and indicates the direction of the cross product.

• The magnitude is given by the (||u|| ||v|| sin θ) part; i.e. the cross product is a multiple of n that has this length.

• It is important to distinguish between dot and cross products - the dot product results in a scalar, the cross product results in a vector.

• There are two possible choices for the unit direction vector n; the right hand rule is used to determine direction.

Example: Compute i × j.


Example: For an arbitrary vector v, what is v × v?

Example: Let u and v be vectors which lie in the xy plane. ||u|| = 10 and θu = 75° (the angle u makes with the x-axis). ||v|| = 20 and θv = 15° (the angle v makes with the x-axis). What are the magnitude and direction of u × v?

• The cross product is only defined for vectors in space (R3). Vectors in the plane are a subset of vectors in space, and can be crossed, but we should think of them as having a third (zero) component:

v = v1i + v2j = v1i + v2j + 0k

The cross product

Computing the cross product

• The definition

u × v = (||u|| ||v|| sin θ)n

is inconvenient for computing the cross product of arbitrary vectors in space.

• A list of what you would have to do appears on the slides. The point of this list is not that you should do it, but that you shouldn’t! Instead, we’ll derive a computational formula based on the components of u and v.

• Let u = <u1, u2, u3> and v = <v1, v2, v3> be vectors in R3. The cross product can be computed by

u × v = <u2v3 − u3v2, −(u1v3 − u3v1), u1v2 − u2v1>

Example: Let u = <1, −2, 3>, v = <−5, −1, 2>. Compute u × v. Verify that u × v is orthogonal to both u and v.
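The component formula translates directly into code. This is a minimal sketch for vectors stored as three-element lists; the function name `cross` is an illustrative choice.

```python
def cross(u, v):
    """Cross product via <u2v3 - u3v2, -(u1v3 - u3v1), u1v2 - u2v1>."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return [u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u = [1, -2, 3]
v = [-5, -1, 2]
w = cross(u, v)
print(w)

# Orthogonality check: both dot products should be zero
print(dot(w, u), dot(w, v))
```

Swapping the arguments, `cross(v, u)`, negates every component, which matches the right hand rule.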

• We need to verify that the computational formula matches the definition, and gives the correct magnitude and direction.

• Verify magnitude:

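$$
\begin{aligned}
\|u\| \, \|v\| \sin\theta &= \|u\| \, \|v\| \sqrt{1 - \cos^2\theta} \\
&= \|u\| \, \|v\| \sqrt{1 - \left(\frac{u \cdot v}{\|u\| \, \|v\|}\right)^2} \\
&= \|u\| \, \|v\| \sqrt{\frac{(\|u\| \, \|v\|)^2 - (u \cdot v)^2}{(\|u\| \, \|v\|)^2}} \\
&= \sqrt{(\|u\| \, \|v\|)^2 - (u \cdot v)^2} \\
&= \sqrt{(u_1^2 + u_2^2 + u_3^2)(v_1^2 + v_2^2 + v_3^2) - (u_1 v_1 + u_2 v_2 + u_3 v_3)^2} \\
&= \text{(a good bit of algebra - distribute, regroup, factor)} \\
&= \sqrt{(u_2 v_3 - u_3 v_2)^2 + (u_1 v_3 - u_3 v_1)^2 + (u_1 v_2 - u_2 v_1)^2} \\
&= \|u \times v\|
\end{aligned}
$$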

• Verify direction (orthogonality):

(u × v) · u = < u2v3 − u3v2,−(u1v3 − u3v1), u1v2 − u2v1 > · < u1, u2, u3 >

= (u2v3 − u3v2)u1 − (u1v3 − u3v1)u2 + (u1v2 − u2v1)u3

= u1u2v3 − u1u3v2 − u1u2v3 + u2u3v1 + u1u3v2 − u2u3v1

= 0

(Similar to verify (u× v) · v = 0.)

• The cross product can be used to find a vector which is orthogonal to two given vectors (since this is what it, by definition, produces).

Example: Let u = 5i − j + k, v = −i − 2j + 4k. Find a vector which is orthogonal to both u and v. Is this vector unique?

The cross product

Using determinants to compute

• The cross product formula

u× v =< u2v3 − u3v2,−(u1v3 − u3v1), u1v2 − u2v1 >

can be remembered using a determinant structure.

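• For a 2x2 matrix

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

the determinant is denoted

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix}$$

and is computed by

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$$

Example: Compute

$$\begin{vmatrix} 1 & -3 \\ 2 & 5 \end{vmatrix}$$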

• Computing the cross product:

This is best explained by example. Suppose we wish to calculate u × v for

u = 3i − 2j + k, v = −i + 7j + 4k

Start by arranging u and v in a 3x3 determinant structure, with i, j, and k forming the top row, and the components of u and v forming the second and third rows:

$$\begin{vmatrix} i & j & k \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix} = \begin{vmatrix} i & j & k \\ 3 & -2 & 1 \\ -1 & 7 & 4 \end{vmatrix}$$

Set up a structure and fill it in (watch the demo):

$$\begin{vmatrix} \_ & \_ \\ \_ & \_ \end{vmatrix} i - \begin{vmatrix} \_ & \_ \\ \_ & \_ \end{vmatrix} j + \begin{vmatrix} \_ & \_ \\ \_ & \_ \end{vmatrix} k$$

And compute using the rule for 2x2 determinants:

(−8 − 7)i − (12 + 1)j + (21 − 2)k

−15i − 13j + 19k

(or < −15,−13, 19 >)

• In general

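$$u \times v = \begin{vmatrix} i & j & k \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix} = \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix} i - \begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix} j + \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix} k$$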

(Don’t memorize that as a formula; learn the pattern of crossing out columns in turn!)

Example: Use the determinant structure to compute u × v for u = <1, 1, −4>, v = <3, 0, 1>.

The cross product

The area of a parallelogram

• The area of a parallelogram with adjacent edges formed by the vectors u and v is given by

$$A = \|u \times v\|$$

Example: Use ||u × v|| to compute the area of a parallelogram with adjacent edges bounded by u = <1, 1, 2> and v = <−5, 4, 3>.

Example: Compute the area of a parallelogram with vertices A = (5, 2, 0), B = (2, 6, 1), C = (2, 4, 7), D = (5, 0, 6).

• The area of a triangle with adjacent edges formed by the vectors u and v is given by

$$A = \frac{\|u \times v\|}{2}$$

Example: Compute the area of a triangle with vertices A = (1, 1, 1), B = (−1, 4, 7), C = (0,−2, 2).
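Both area formulas reduce to the magnitude of a cross product. The sketch below assumes the edge vectors of a triangle are taken from one shared vertex; the function names are hypothetical helpers.

```python
import math

def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return [u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def displacement(tail, head):
    return [b - a for a, b in zip(tail, head)]

def parallelogram_area(u, v):
    """Area of the parallelogram with adjacent edges u and v."""
    return norm(cross(u, v))

def triangle_area(a, b, c):
    """Area of the triangle with vertices a, b, c (edges taken from vertex a)."""
    return norm(cross(displacement(a, b), displacement(a, c))) / 2

area_par = parallelogram_area([1, 1, 2], [-5, 4, 3])
area_tri = triangle_area((1, 1, 1), (-1, 4, 7), (0, -2, 2))
print(area_par, area_tri)
```

When a figure is given by vertices, the edge vectors must share a vertex; for a parallelogram, pick two edges that meet at the same corner.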


The cross product

Properties of the cross product

For all of the following, u, v, and w are vectors in R3, c scalar.

• Geometric properties of the cross product:

* u× v is orthogonal to both u and v.

* ||u× v|| = ||u|| ||v|| sin θ.

* u × v = 0 if and only if u and v are parallel (one of them is a scalar multiple of the other).

* ||u× v|| gives the area of a parallelogram with adjacent sides u and v.

• Algebraic properties of the cross product:

* v × v = 0

* v × 0 = 0 = 0 × v

* c(u × v) = (cu) × v = u× (cv)

* u× (v + w) = (u × v) + (u× w)

* u× v = −(v × u)

Proof: Prove the third property:

c(u × v) = (cu) × v = u× (cv)


• The commutative property does not hold:

In general, u × v ≠ v × u

(This is important; we’re used to being able to “multiply” in any order, and we’ve now met a multiplication-type operation where that is no longer the case.)

Instead

u × v = −(v × u)

(Same magnitudes, but opposite directions because of the right hand rule.)

The cross product

The triple scalar product

• The triple scalar product of vectors u, v and w in space is defined by

u · (v × w)

Example: Compute the triple scalar product of the vectors

u =< 1, 2, 2 > v =< −1, 4, 7 > w =< 0,−2, 2 >

• The triple scalar product can be computed using the determinant structure:

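$$u \cdot (v \times w) = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = u_1 \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix} - u_2 \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix} + u_3 \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix}$$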

Example: Compute the triple scalar product of the vectors (use the determinant structure):

u = <1, 2, 2>, v = <−1, 4, 7>, w = <0, −2, 2>

• A parallelepiped is a polyhedron whose faces are parallelograms. The triple scalar product can be used to compute the volume of a parallelepiped with edges u, v and w:

V = |u · (v × w)|

Example: Compute the volume of a parallelepiped with edges u = 5i − 2j + k, v = i + j − 4k, w = 3i + k.

• Properties of the triple scalar product:

u · (v × w) = v · (w × u) = w · (u × v)

u · (v × w) = (u× v) ·w
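The triple scalar product, the volume formula, and the listed properties can be checked with a short sketch. The name `triple` is an illustrative helper built from the dot and cross products defined earlier in the notes.

```python
def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return [u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def triple(u, v, w):
    """Triple scalar product u . (v x w)."""
    return dot(u, cross(v, w))

u, v, w = [1, 2, 2], [-1, 4, 7], [0, -2, 2]
t = triple(u, v, w)
print(t)

# Volume of the parallelepiped with edges 5i - 2j + k, i + j - 4k, 3i + k
volume = abs(triple([5, -2, 1], [1, 1, -4], [3, 0, 1]))
print(volume)

# Cyclic property and the dot/cross interchange, checked on this example
cyclic = triple(u, v, w) == triple(v, w, u) == triple(w, u, v)
interchange = triple(u, v, w) == dot(cross(u, v), w)
print(cyclic, interchange)
```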


Application

Torque

• Torque is the name given to the twisting effect of applying a force to an object, causing it to pivot.

• It depends on 3 things: (1) the amount of force applied, (2) the direction of the force, and (3) the distance from the point of application of the force to the pivot point.

• The lever arm is the line along which the force is transmitted to the pivot. It is denoted by a vector r. When r and F are orthogonal, we have

||τ || = ||r|| ||F||

* That τ is supposed to be boldface - torque is a vector. Can’t seem to get Greek and bold at the same time.

Example: What is the magnitude of the torque produced by applying a force of 3 lb to the pictured wrench?

• When the force is directed at an angle θ to the lever arm, the component which produces torque is the component of F orthogonal to r: $\mathrm{orth}_r F$. The magnitude of $\mathrm{orth}_r F$ is ||F|| sin θ, and

||τ|| = ||r|| ||F|| sin θ

Example: What is the magnitude of the torque produced by applying a force of 3 lb to the pictured wrench?

• We define torque as a vector by the cross product:

τ = r × F

• The convention for direction comes from the right hand rule, and we have

* Clockwise rotation = IN

* Counterclockwise rotation = OUT

• If we further apply the convention to a right handed coordinate system, where i × j = k determines the direction of the positive z axis, we have

* Clockwise rotation = IN = unit direction vector (−k)

* Counterclockwise rotation = OUT = unit direction vector (+k)

Example: What is the torque produced by applying a force of 3 lb to the pictured wrench? Express the answer as a vector.

• Expressing torque as a vector cross product makes it possible to work with forces and pivoting objects in 3D space:

Suppose you have a lever with one end at the origin, and the other located at the point (20, 30, 10) cm. The lever is free to pivot about the origin. A force of <5, 10, −4> N is applied to the end of the lever, causing it to pivot. What is the torque produced by the force?

Compute the cross product τ = r × F:

And interpret your answer (what do the magnitude and direction tell you?):
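The torque computation for the lever can be sketched as follows. Keeping r in centimeters gives a torque in N·cm; converting to N·m by dividing by 100 is an assumption about the units you want to report. Variable names are illustrative.

```python
import math

def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return [u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1]

r_cm = [20, 30, 10]      # lever arm from the pivot (origin) to the point of application, in cm
F = [5, 10, -4]          # applied force, in N

tau_N_cm = cross(r_cm, F)                    # torque vector in N*cm
tau_N_m = [c / 100 for c in tau_N_cm]        # same vector in N*m
magnitude = math.sqrt(sum(c * c for c in tau_N_m))
print(tau_N_cm, tau_N_m, magnitude)
```

The magnitude measures the strength of the twisting effect, and the direction of the vector is the axis about which the lever tends to rotate (counterclockwise when viewed from the tip of the vector, by the right hand rule).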


Vector algebra

Summary of operations

• Let u and v be vectors, c scalar. We have defined

* The norm (magnitude) of a vector [||v||]

* Scalar multiplication [cv]

* Vector addition [u + v]

* Dot product of vectors [u · v]

* Cross product of vectors [u× v]

• Expressions containing these operations need to make sense; for example, we can’t add a scalar and a vector (or cross a scalar and a vector, or dot a scalar and a vector).

Examples: For each of the following expressions, decide whether the expression is defined, and if so, whether the final result is a vector or a scalar.

(u · v)(3u× v)

(u · v)× (−2v)

(u × v) · (−2v)

• As long as the expression is defined, we compute by building up in pieces.

Example: Let u =< 1, 1, 3 >, v =< 3,−4, 2 >, w =< 0, 1, 7 >. Find

||(u × v) + 3w||


Lines and planes

Lines

• Lines in 2D can be defined by a point and a slope. You should recall the various forms of equations of lines:

* y − y0 = m(x − x0) (point-slope)

* y = mx + b (slope-intercept)

* Ax + By + C = 0 (general or standard)

• In 3D space, the idea of slope no longer makes sense. We can define a line in space using a point (to fix its location) and a vector (to indicate its direction).

• The equation of a line in space is given by

r = r0 + tv

where

* P0 is a given point on the line.

* r0 is the position vector pointing to P0

* v is a vector indicating the direction of the line, called the parallel vector for the line

* t is an arbitrary scalar parameter. The quantity tv produces an infinite number of scalar multiples of the vector v, extending the line infinitely in both directions.

* r, the vector sum of r0 and tv, is the position vector pointing to any arbitrary point (x, y, z) on the line. r varies with t.

• The equation

r = r0 + tv

or

<x, y, z> = <x0, y0, z0> + t<a, b, c>

is the point-parallel form of the equation of a line.

Example: Write the equation of a line passing through the point (1, 4, −5) with direction parallel to the vector <−1, −2, 7>.

Example: Give three points on the line

< x, y, z >=< 1, 1,−1 > +t < 2, 5,−3 >

Sketch.

• By rearranging the point-parallel form and equating components, we arrive at the parametric equations for a line:

x = x0 + at

y = y0 + bt

z = z0 + ct

Example: Write the parametric equations for the line passing through (1, 5, −3) parallel to v = <−2, 0, −1>.

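• If we solve the parametric equations for t (this requires a, b, and c to be nonzero)

$$x - x_0 = at \;\Rightarrow\; t = \frac{x - x_0}{a}$$

$$y - y_0 = bt \;\Rightarrow\; t = \frac{y - y_0}{b}$$

$$z - z_0 = ct \;\Rightarrow\; t = \frac{z - z_0}{c}$$

and equate:

$$\frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c}$$

we arrive at the symmetric equations for the line.

Example: Write the symmetric equations for the line passing through (1, 5, −3) parallel to v = <−2, 0, −1>.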

Example: Write the parametric and symmetric equations for the line passing through the points (−2, 4, 5) and (1, 0, −1).

• And, having done all this for lines in 3D space, we should note it works perfectly well for lines in the plane as well:

* <x, y> = <x0, y0> + t<a, b> (point-parallel)

* x = x0 + at, y = y0 + bt (parametric)

* (x − x0)/a = (y − y0)/b (symmetric)

Lines and planes

Intersection of lines

• We solve for the coordinates of points of intersection by equating the x, y, and z expressions of the parametric equations for the lines. The parametric equations are

x = x0 + at

y = y0 + bt

z = z0 + ct

• As an example, look at

L1: x = 1 − 2t, y = 1 + 3t, z = 1 + 4t

L2: x = 1, y = 2 − t, z = 3 − 2t

• And be careful, there’s a catch. Equating the x expressions gives 1 − 2t = 1 ⇒ t = 0, but substituting t = 0 into the equations for y and z yields an inconsistent system (y = 1 on L1 but y = 2 on L2). This is not the solution.

• Be sure to use a different parameter for the second line:

L1: x = 1 − 2t, y = 1 + 3t, z = 1 + 4t

L2: x = 1, y = 2 − s, z = 3 − 2s

• We can now equate and solve (work through the solution here):
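The procedure (separate parameters t and s, solve two of the component equations, then check the third) can be sketched in code. This is one possible implementation under stated assumptions: the function name `intersect_lines` is hypothetical, the 2x2 system is solved with Cramer's rule, and lines with parallel direction vectors are simply reported as having no single intersection point.

```python
def intersect_lines(p1, d1, p2, d2, tol=1e-9):
    """Intersection of r = p1 + t*d1 and r = p2 + s*d2 in space, or None."""
    rhs = [b - a for a, b in zip(p1, p2)]        # t*d1 - s*d2 = p2 - p1
    # Solve for t and s from a pair of coordinates with a nonzero determinant.
    for i, j in ((0, 1), (0, 2), (1, 2)):
        det = d1[j] * d2[i] - d1[i] * d2[j]
        if abs(det) > tol:
            t = (d2[i] * rhs[j] - d2[j] * rhs[i]) / det
            s = (d1[i] * rhs[j] - d1[j] * rhs[i]) / det
            break
    else:
        return None                              # parallel direction vectors
    point1 = [p + t * d for p, d in zip(p1, d1)]
    point2 = [p + s * d for p, d in zip(p2, d2)]
    # The remaining coordinate must agree too, otherwise the lines miss each other.
    if all(abs(a - b) <= tol for a, b in zip(point1, point2)):
        return point1
    return None

# L1: x = 1 - 2t, y = 1 + 3t, z = 1 + 4t   and   L2: x = 1, y = 2 - s, z = 3 - 2s
hit = intersect_lines((1, 1, 1), (-2, 3, 4), (1, 2, 3), (0, -1, -2))
print(hit)

# <0, 1, -2> + t<1, 1, 4>   and   <1, 1, 0> + s<1, -1, 2>
miss = intersect_lines((0, 1, -2), (1, 1, 4), (1, 1, 0), (1, -1, 2))
print(miss)
```

A result of `None` for non-parallel direction vectors means the lines are skew.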


Example: Find the intersection of the lines

L1 : < x, y, z >=< 0, 1,−2 > +t < 1, 1, 4 >

L2 : < x, y, z >=< 1, 1, 0 > +t < 1,−1, 2 >

or show that they do not intersect.

• Parallel lines lie in the same plane and do not intersect. They must have parallel direction vectors (i.e., v1 and v2 must be scalar multiples of each other).

• Skew lines do NOT lie in the same plane, and do not intersect. The direction vectors cannot be scalar multiples of each other.

Example: We have shown that the lines

L1 : < x, y, z >=< 0, 1,−2 > +t < 1, 1, 4 >

L2 : < x, y, z >=< 1, 1, 0 > +t < 1,−1, 2 >

do not intersect. Are they parallel or skew?


Example: The lines

$$L_1: \quad \frac{x - 1}{3} = \frac{y + 4}{-1} = z - 5$$

$$L_2: \quad \frac{x + 1}{-6} = \frac{y + 4}{2} = \frac{z + 3}{-2}$$

do not intersect (take the time to verify this for practice!). Are they parallel or skew?

*The point of this question is to make sure you can start with any form of a line and get back to the other forms. What part of the symmetric equations shows the parallel vector? What part shows the point?

Lines and planes

Planes

• A plane can be defined by specifying a point and a normal vector. The plane is the set of all (x, y, z) such that the vector from the fixed point to any (x, y, z) is orthogonal to the normal vector.

• The point-normal form of the equation of a plane is

n · (r − r0) = 0

Suppose we have a plane with normal n =< 1,−2, 5 >, and a point on the plane (3, 4, 6).

n =< 1,−2, 5 > r0 =< 3, 4, 6 > r =< x, y, z >

then

n · (r − r0) = 0

becomes

<1, −2, 5> · <x − 3, y − 4, z − 6> = 0

*You should learn point-normal as

< n1, n2, n3 > · < x − x0, y − y0, z − z0 >= 0

• If you work through the dot product, you get the scalar form:

n1(x − x0) + n2(y − y0) + n3(z − z0) = 0

In our example:

1(x − 3) − 2(y − 4) + 5(z − 6) = 0

• Distributing out the scalar form gives the general form:

Ax + By + Cz + D = 0

In our example:

1(x − 3) − 2(y − 4) + 5(z − 6) = 0

x − 3 − 2y + 8 + 5z − 30 = 0

x − 2y + 5z − 25 = 0

• Notice that the normal vector is still “visible” in Ax + By + Cz + D = 0: <n1, n2, n3> = <A, B, C>.

Example: Write equations (point-normal, scalar, and general forms) for a plane passing through (5, −1, −2) and normal to the vector <6, 2, −3>.

Example: The equation of a plane is given: 3x − 2z = 4. What is a normal vector for this plane? Give the coordinates of three points on the plane.

• The general form can be rearranged again:

x − 2y + 5z − 25 = 0

5z = −x + 2y + 25

$$z = -\frac{1}{5}x + \frac{2}{5}y + 5$$

This expresses z as a multivariable function of x and y [z = f(x, y)], with domain the entire xy plane: −∞ < x < ∞, −∞ < y < ∞.

• This form is usually needed to plot using software.

• Planes can be hand sketched using intercepts:

* to get the x-intercept, set y = 0, z = 0

* to get the y-intercept, set x = 0, z = 0

* to get the z-intercept, set x = 0, y = 0

• The angle between planes is the (acute) angle between normal vectors. Compute from

$$\cos\theta = \frac{|n_1 \cdot n_2|}{\|n_1\| \, \|n_2\|}$$

Example: Find the angle between planes 3x − 2y + z − 5 = 0 and −x + y + 3 = 0.


• You can determine whether planes are parallel, orthogonal, or neither by comparing normal vectors.

Examples: Determine whether the planes are parallel, orthogonal, or neither:

3x − 2y + z = 5 and x + y − z = 1

3x − y + z = −7 and 2x + y − z = 1

x − 2y + 3z = −2 and − 2x + 4y − 6z = 1
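Comparing normal vectors is easy to automate. In this sketch the normals are read off the coefficients of x, y, and z; the function names are hypothetical, and the exact comparisons with zero in `classify` assume integer components (floating-point normals would need a tolerance). Testing parallelism through a zero cross product is a design choice made here, equivalent to checking for a scalar multiple.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return [u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1]

def angle_between_planes(n1, n2):
    """Acute angle (radians) from cos(theta) = |n1 . n2| / (||n1|| ||n2||)."""
    cos_theta = abs(dot(n1, n2)) / (norm(n1) * norm(n2))
    return math.acos(min(1.0, cos_theta))

def classify(n1, n2):
    """'parallel', 'orthogonal', or 'neither', based on the normal vectors."""
    if all(c == 0 for c in cross(n1, n2)):
        return "parallel"
    if dot(n1, n2) == 0:
        return "orthogonal"
    return "neither"

# 3x - 2y + z - 5 = 0 and -x + y + 3 = 0
theta = angle_between_planes([3, -2, 1], [-1, 1, 0])
print(math.degrees(theta))

print(classify([3, -2, 1], [1, 1, -1]))
print(classify([3, -1, 1], [2, 1, -1]))
print(classify([1, -2, 3], [-2, 4, -6]))
```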


Lines and planes

Writing equations of lines and planes

• Examples:

* Write the symmetric equations of a line through the point (1, −1, 2) that is parallel to the line with the parametric equations

x = 3t , y = −5 − 2t , z = 4 + t

* Write the general equation of the plane that contains the intersecting lines

L1 : x = 3t , y = −5 − 2t , z = 4 + t

L2 : x = 3 − t , y = −7 + t , z = 5 + 2t

• There are too many possible relationships between lines and planes to learn each type of problem. Instead, we need to develop a general plan.

• The key to setting up these problems is to keep in mind that

* Lines are defined in terms of parallel vectors (lying on the line)

* Planes are defined in terms of normal vectors.

• The Plan:

* What are you given?

* What do you want?

* How are they related?

* What’s the equation, and what form is it in?


• Work through the first example:

Write the symmetric equations of a line through the point (1, −1, 2) that is parallel to the line with the parametric equations

x = 3t , y = −5 − 2t , z = 4 + t


* What do you want?




• Work through the second example:

Write the general equation of the plane that contains the intersecting lines

L1 : x = 3t , y = −5 − 2t , z = 4 + t

L2 : x = 3 − t , y = −7 + t , z = 5 + 2t


* What do you want?




Lines and planes

Intersection of planes

• Any two planes which are not parallel will eventually intersect in a line which is common to both planes.

• To solve for the line of intersection, simultaneously solve the system of equations of the planes. Equations should be in general form (and then, move the constant over to the right of the equals).

Example: Find the intersection of the planes (work through the solution):

3x − 4y + z − 10 = 0

x − 2y − z + 4 = 0
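As a cross-check on the hand solution, here is a hedged sketch in plain Python. It does not use the elimination approach described above: it takes the direction of the line from the cross product of the two normals and finds one point by setting z = 0, which assumes the line actually crosses the plane z = 0. All names are illustrative.

```python
def cross(u, v):
    """Cross product of two 3D vectors (illustrative helper)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

# Planes written as n . <x, y, z> = c, with the constant moved to the right.
n1, c1 = (3, -4, 1), 10     # 3x - 4y + z = 10
n2, c2 = (1, -2, -1), -4    # x - 2y - z = -4

# The line lies in both planes, so its direction is orthogonal to both normals.
direction = cross(n1, n2)

# One point on the line: set z = 0 and solve the 2x2 system by Cramer's rule.
det = n1[0] * n2[1] - n1[1] * n2[0]   # assumed nonzero here
x0 = (c1 * n2[1] - n1[1] * c2) / det
y0 = (n1[0] * c2 - c1 * n2[0]) / det
point = (x0, y0, 0.0)

print(point, direction)
```

The line is then x = 18 + 6t, y = 11 + 4t, z = −2t; any other point on the line or any nonzero multiple of the direction vector describes the same line.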

• Three planes in space may intersect in a point, a line, or not at all.

• Solving three equations in three unknowns is tedious; continue on to “Using SciLab to solve for the intersection of three planes” for a crash course in matrix solving.


Lines and planes

Distances in space

• The distance between two points a = (a1, a2, a3) and b = (b1, b2, b3) in space is the magnitude of the vector $\mathbf{v} = \langle b_1 - a_1,\ b_2 - a_2,\ b_3 - a_3 \rangle$:

$$d = \|\mathbf{v}\| = \sqrt{(b_1 - a_1)^2 + (b_2 - a_2)^2 + (b_3 - a_3)^2}$$

Example: What is the distance between the points (1, 4, 5) and (1,−2,−5)?

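• The shortest distance from a point to a line is the orthogonal distance. However, we generally don’t know the exact point on the line which is closest to the given point. On the other hand, it is easy to find an arbitrary point on the line. How do we relate the distance we can find to the distance we want?

• Set up some vectors: let P be the given point, Q an arbitrary point on the line, v the direction vector of the line, b the vector from Q to P, θ the angle between b and v, and d the orthogonal distance from P to the line.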

• And do some algebra:

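Start with $d = \|\mathbf{b}\| \sin\theta$, and multiply both sides by $\|\mathbf{v}\|$:

$$\|\mathbf{v}\|\,d = \|\mathbf{b}\|\,\|\mathbf{v}\| \sin\theta$$

The right hand side is the magnitude of the cross product of b and v, so

$$\|\mathbf{v}\|\,d = \|\mathbf{b} \times \mathbf{v}\|$$

and

$$d = \frac{\|\mathbf{b} \times \mathbf{v}\|}{\|\mathbf{v}\|}$$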


• So, to calculate the distance from a point to a line:

* A point P and the equation of a line L need to be given.

* Read the direction vector v from the equation of the line.

* Find the coordinates of a point Q on the line.

* Write the vector b from Q to P .

* And compute

$$d = \frac{\|\mathbf{b} \times \mathbf{v}\|}{\|\mathbf{v}\|}$$

Example: What is the distance from the point (1,−4, 7) to the line

x = 2 − t , y = 3 + 2t , z = −t
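The steps listed above can be sketched in plain Python as a check on the hand calculation. The function and helper names are illustrative; Q is taken as the point on the line at t = 0.

```python
import math

def cross(u, v):
    """Cross product of two 3D vectors (illustrative helper)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    """Magnitude ||u||."""
    return math.sqrt(sum(c * c for c in u))

def point_line_distance(P, Q, v):
    """d = ||b x v|| / ||v||, where b is the vector from Q (on the line) to P."""
    b = tuple(p - q for p, q in zip(P, Q))
    return norm(cross(b, v)) / norm(v)

P = (1, -4, 7)
Q = (2, 3, 0)       # point on the line at t = 0
v = (-1, 2, -1)     # direction vector read from x = 2 - t, y = 3 + 2t, z = -t
d = point_line_distance(P, Q, v)
print(d)
```

Choosing a different point Q on the line changes b but not the result, because only the component of b orthogonal to v survives the cross product.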

• Finding the distance from a point to a plane has the same problem: we generally don’t know the exact point on the plane closest to the given point.

• As before, we can start by finding an arbitrary point on the plane.

• Then, since planes are defined by their normals, we can get the distance by projecting:

$$\mathrm{comp}_{\mathbf{n}}\mathbf{b} = \frac{\mathbf{b} \cdot \mathbf{n}}{\|\mathbf{n}\|}$$


• To calculate the distance from a point to a plane:

* A point P and the equation of a plane Ax + By + Cz + D = 0 (or one of the other forms) need to be given.

* Read the normal vector n from the equation of the plane.

* Find the coordinates of a point Q on the plane.

* Write the vector b from Q to P .

* And compute

$$d = \frac{|\mathbf{b} \cdot \mathbf{n}|}{\|\mathbf{n}\|}$$

Example: What is the distance from the point (1,−4, 7) to the plane

3x − 2y + z − 1 = 0
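The point-to-plane procedure above can be sketched the same way. This is an illustrative plain-Python check, with Q = (0, 0, 1) chosen here as one convenient point satisfying the plane equation; the function name is an assumption, not from the notes.

```python
import math

def point_plane_distance(P, n, Q):
    """d = |b . n| / ||n||, where b is the vector from Q (on the plane) to P."""
    b = tuple(p - q for p, q in zip(P, Q))
    b_dot_n = sum(x * y for x, y in zip(b, n))
    return abs(b_dot_n) / math.sqrt(sum(c * c for c in n))

P = (1, -4, 7)
n = (3, -2, 1)      # normal read from 3x - 2y + z - 1 = 0
Q = (0, 0, 1)       # 3(0) - 2(0) + 1 - 1 = 0, so Q lies on the plane
d = point_plane_distance(P, n, Q)
print(d)
```

Any other point on the plane gives the same distance, since moving Q within the plane changes b only by a vector orthogonal to n.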


• Line to plane and plane to plane distances only make sense when the objects are parallel (if they aren’t they’ll eventually intersect).

• To calculate distance from a line to a plane, or a plane to a plane:

* Verify that the figures are parallel.

* Find an arbitrary point on the line (or first plane).

* And go through the process of computing point-to-plane.

Example: What is the distance between the planes

3x − 2y + z − 1 = 0

−6x + 4y − 2z − 10 = 0

• Distance from line to line is a little tricky - you can’t pick arbitrary points on both lines.

• The trick is to recognize that non-intersecting lines lie in parallel planes. Figure out what the planes are, and restate as a plane-to-plane problem.

• Because the planes are parallel, they have the same normal vector n, orthogonal to both v1 and v2. Find n = v1 × v2.

• To calculate the distance from a line to a line:

* Equations of two lines L1 and L2 need to be given.

* Read the direction vectors v1 and v2 from the equations for L1 and L2. Find a point P on L1 and a point Q on L2.

* Find n = v1 × v2

* Use n and point Q to write the equation of a plane.

* And find the distance from point P to the plane.


Example: What is the distance between the lines

L1 : x = 1 + t , y = −2 + 3t , z = 4 − t

L2 : x = 2t , y = 3 + t , z = −3 + 4t

Solution (in steps):

• What are v1 and v2? Find a point on each line (let t = 0).

• Find n = v1 × v2. Use n and Q to write the equation of a plane containing L2.

• Find the distance from the point P = (1,−2, 4) to the plane 13x − 6y − 5z + 3 = 0. Point Q = (0, 3,−3) is a known point on the plane.
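The solution steps above can be traced in a short plain-Python sketch. Helper names are illustrative; the approach assumes the lines are not parallel, since otherwise v1 × v2 is the zero vector and the formula breaks down.

```python
import math

def cross(u, v):
    """Cross product of two 3D vectors (illustrative helper)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def line_line_distance(P, v1, Q, v2):
    """Distance between non-parallel lines through P (direction v1) and Q (direction v2)."""
    n = cross(v1, v2)                       # normal shared by the two parallel planes
    b = tuple(p - q for p, q in zip(P, Q))  # vector from Q to P
    return abs(dot(b, n)) / math.sqrt(dot(n, n))

P, v1 = (1, -2, 4), (1, 3, -1)    # L1 at t = 0 and its direction
Q, v2 = (0, 3, -3), (2, 1, 4)     # L2 at t = 0 and its direction
n = cross(v1, v2)
d = line_line_distance(P, v1, Q, v2)
print(n, d)
```

The normal comes out as ⟨13, −6, −5⟩, which matches the coefficients of the plane 13x − 6y − 5z + 3 = 0 quoted in the last step.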


Vector valued functions

Defining vector valued functions

• A line is an example of a vector valued function; a function which takes a scalar (t) as input, and returns a vector

$$\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle$$

or

$$\mathbf{r}(t) = f(t)\,\mathbf{i} + g(t)\,\mathbf{j} + h(t)\,\mathbf{k}$$

The components of the vector are all scalar functions of t.

• The graph of a vector valued function is a curve in space, where the points on the curve are traced out by the position vector

< x, y, z >=< f(t), g(t), h(t) >

• The functions f(t), g(t), and h(t) are the component functions.

• We can write vector valued functions in parametric form:

x = f(t)

y = g(t)

z = h(t)

Example: What are the parametric equations for the vector valued function

$$\mathbf{r}(t) = \langle \sin t,\ t^2,\ t \rangle$$

• The domain of a vector valued function is the set of values of t for which the function is defined. r(t) is defined wherever all three of its component functions are defined, so the domain of $\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle$ is the intersection of the domains of f(t), g(t), and h(t).

• The range of a vector valued function is the set of all vectors $\langle x, y, z \rangle$ such that $\langle x, y, z \rangle = \mathbf{r}(t)$ for some t in the domain.


Example: What are the domain and range of the vector valued function

$$\mathbf{r}(t) = \langle t^2,\ 3t + 1,\ \sqrt{t - 2} \rangle$$

Example: Give a few points on the graph of

$$\mathbf{r}(t) = \langle t^2,\ 3t + 1,\ \sqrt{t - 2} \rangle$$

Can you envision what it looks like?



Plane curves

• A vector valued function with two components

r(t) =< f(t), g(t) >= f(t)i + g(t)j

and parametric equations

x = f(t) , y = g(t)

traces out a curve in the xy plane:

< x, y >=< f(t), g(t) >

• We can eliminate the parameter t and write an equation for the curve in terms of x and y. This is the Cartesian equation for the curve.

Example: Eliminate the parameter and write the Cartesian equation for the curve

$$\mathbf{r}(t) = \langle 3t + 1,\ t^2 \rangle$$

What is the graph of this function?


Example: Eliminate the parameter and write the Cartesian equation for the curve

r(t) =< 3 sin t, 2 cos t >


• The curves

$$\mathbf{r}(t) = \langle \cos t,\ \sin t \rangle$$

$$\mathbf{s}(t) = \langle \cos t,\ -\sin t \rangle$$

both trace out the circle

$$x^2 + y^2 = 1$$

but they provide more information than the Cartesian equation. Parametric curves have a direction in which the curve is traced out.

• To determine direction, pick a few values for t. For example, the curve $\mathbf{r}(t) = \langle t, t^2 \rangle$ has parametric equations $x = t$, $y = t^2$ and draws $y = x^2$ in the direction indicated (make yourself a little margin sketch).

• Domain information is also part of the vector valued function. For example, the graph of $\mathbf{r}(t) = \langle \sqrt{t},\ 3t \rangle$ is not the entire parabola $y = 3x^2$.

The domain of $\mathbf{r}(t) = \langle \sqrt{t},\ 3t \rangle$ is $t \ge 0$, because of the radical. This is passed through to the values of x and y: since $x = \sqrt{t}$, $x \ge 0$, and since $y = 3t$ but $t \ge 0$, $y \ge 0$ as well.

And, if we start picking values for t, we’ll never draw the left side of the parabola.


When we rewrite the parametric equations as the single equation

$$y = 3x^2$$

we have to keep track of those restrictions; in particular, since y = f(x), we need to specify that the domain for this function of x must be $x \ge 0$. This should both be included with the expression for the function:

$$y = 3x^2, \quad x \ge 0$$

and be reflected in the graph.

Example: Eliminate the parameter and write the Cartesian equation for

$$\mathbf{r}(t) = \langle \sqrt{2 - t},\ 3t^2 + 1 \rangle$$




Projections onto planes

• Seeing a 3D object projected onto a 2D plane can help you visualize the object. In particular, we look at projections of space curves onto the xy, xz, and yz planes.

• Examine the projections of the graph of

r(t) =< cos t, sin t, t >

Make some sketches, and write the equations for the various projections.


Example: What are the projections of the graph of $\mathbf{r}(t) = \langle t,\ t^2,\ e^t \rangle$ onto the xy, yz, and xz planes? Write equations and sketch.


Calculus of vector valued functions

Limits and continuity

• For a vector valued function $\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle$ we define the limit as t approaches a of r(t) by

$$\lim_{t \to a} \mathbf{r}(t) = \left\langle \lim_{t \to a} f(t),\ \lim_{t \to a} g(t),\ \lim_{t \to a} h(t) \right\rangle$$

Example: Find $\lim_{t \to 3} \mathbf{r}(t)$ for $\mathbf{r}(t) = \left\langle t^3,\ \frac{t - 1}{t + 4},\ \sqrt{9 - t} \right\rangle$.

Example: Find $\lim_{t \to -2} \mathbf{r}(t)$ for $\mathbf{r}(t) = \left\langle e^t,\ \frac{t^2 - 4}{t + 2},\ 3 - t^2 \right\rangle$.


Example: Find $\lim_{t \to 1} \mathbf{r}(t)$ for $\mathbf{r}(t) = \langle e^t,\ \ln(2 - t),\ 3t - 1 \rangle$.

• A vector valued function r(t) =< f(t), g(t), h(t) > is continuous at a if and only if

* r(a) exists

* $\lim_{t \to a} \mathbf{r}(t)$ exists

* $\lim_{t \to a} \mathbf{r}(t) = \mathbf{r}(a)$

Example: Are the three previous examples continuous at their respective values for a?

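$\mathbf{r}(t) = \left\langle t^3,\ \frac{t - 1}{t + 4},\ \sqrt{9 - t} \right\rangle$ (a = 3)

$\mathbf{r}(t) = \left\langle e^t,\ \frac{t^2 - 4}{t + 2},\ 3 - t^2 \right\rangle$ (a = −2)

$\mathbf{r}(t) = \langle e^t,\ \ln(2 - t),\ 3t - 1 \rangle$ (a = 1)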


• A vector valued function is continuous on an open interval (a, b) if it is continuous at every value for t in the interval.

• Left and right limits are defined by:

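$$\lim_{t \to a^-} \mathbf{r}(t) = \left\langle \lim_{t \to a^-} f(t),\ \lim_{t \to a^-} g(t),\ \lim_{t \to a^-} h(t) \right\rangle$$

$$\lim_{t \to a^+} \mathbf{r}(t) = \left\langle \lim_{t \to a^+} f(t),\ \lim_{t \to a^+} g(t),\ \lim_{t \to a^+} h(t) \right\rangle$$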

• A vector valued function is continuous on a closed interval [a, b] if it is continuous on the open interval (a, b) and if

$$\lim_{t \to a^+} \mathbf{r}(t) = \mathbf{r}(a)$$

$$\lim_{t \to b^-} \mathbf{r}(t) = \mathbf{r}(b)$$

Example: On what intervals are the previous examples continuous?

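$\mathbf{r}(t) = \left\langle t^3,\ \frac{t - 1}{t + 4},\ \sqrt{9 - t} \right\rangle$ (a = 3)

$\mathbf{r}(t) = \left\langle e^t,\ \frac{t^2 - 4}{t + 2},\ 3 - t^2 \right\rangle$ (a = −2)

$\mathbf{r}(t) = \langle e^t,\ \ln(2 - t),\ 3t - 1 \rangle$ (a = 1)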



Differentiation

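• For a vector valued function $\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle$, we define the derivative of r(t), denoted $\mathbf{r}'(t)$ or $\frac{d\mathbf{r}}{dt}$, by

$$\mathbf{r}'(t) = \lim_{\Delta t \to 0} \frac{\mathbf{r}(t + \Delta t) - \mathbf{r}(t)}{\Delta t}$$

• After some vector algebra:

$$\begin{aligned}
\frac{\mathbf{r}(t + \Delta t) - \mathbf{r}(t)}{\Delta t}
&= \frac{1}{\Delta t}\Big( \langle f(t + \Delta t),\ g(t + \Delta t),\ h(t + \Delta t) \rangle - \langle f(t),\ g(t),\ h(t) \rangle \Big) \\
&= \frac{1}{\Delta t} \big\langle f(t + \Delta t) - f(t),\ g(t + \Delta t) - g(t),\ h(t + \Delta t) - h(t) \big\rangle \\
&= \left\langle \frac{f(t + \Delta t) - f(t)}{\Delta t},\ \frac{g(t + \Delta t) - g(t)}{\Delta t},\ \frac{h(t + \Delta t) - h(t)}{\Delta t} \right\rangle
\end{aligned}$$

• And some vector calculus:

$$\begin{aligned}
\mathbf{r}'(t) &= \lim_{\Delta t \to 0} \frac{\mathbf{r}(t + \Delta t) - \mathbf{r}(t)}{\Delta t} \\
&= \lim_{\Delta t \to 0} \left\langle \frac{f(t + \Delta t) - f(t)}{\Delta t},\ \frac{g(t + \Delta t) - g(t)}{\Delta t},\ \frac{h(t + \Delta t) - h(t)}{\Delta t} \right\rangle \\
&= \left\langle \lim_{\Delta t \to 0} \frac{f(t + \Delta t) - f(t)}{\Delta t},\ \lim_{\Delta t \to 0} \frac{g(t + \Delta t) - g(t)}{\Delta t},\ \lim_{\Delta t \to 0} \frac{h(t + \Delta t) - h(t)}{\Delta t} \right\rangle
\end{aligned}$$

• We get

$$\mathbf{r}'(t) = \langle f'(t),\ g'(t),\ h'(t) \rangle$$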

Example: Given $\mathbf{r}(t) = \left\langle e^t,\ t^4 + 3\sqrt{t},\ \frac{1}{t^2} \right\rangle$, find $\mathbf{r}'(t)$.


Example: Given $\mathbf{r}(t) = \langle \sin t,\ t^2,\ t \rangle$, find $\mathbf{r}'(t)$. Then, find $\mathbf{r}(4)$ and $\mathbf{r}'(4)$.
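The component-wise result can be sanity-checked numerically. The sketch below, in plain Python, applies a central-difference quotient to each component of r(t) = ⟨sin t, t², t⟩ and compares it with the component-wise derivative ⟨cos t, 2t, 1⟩. The function names and the step size h are illustrative choices, not part of the notes.

```python
import math

def r(t):
    """Position vector r(t) = <sin t, t^2, t>."""
    return (math.sin(t), t ** 2, t)

def r_prime_exact(t):
    """Derivative taken component by component: <cos t, 2t, 1>."""
    return (math.cos(t), 2 * t, 1.0)

def r_prime_numeric(t, h=1e-6):
    """Central-difference estimate of r'(t), applied to each component."""
    forward, backward = r(t + h), r(t - h)
    return tuple((f - b) / (2 * h) for f, b in zip(forward, backward))

exact = r_prime_exact(4.0)
approx = r_prime_numeric(4.0)
print(exact)
print(approx)
```

The two printed vectors should agree to several decimal places, which is the numerical counterpart of taking the limit inside the angle brackets.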

• The geometric interpretation of the derivative at t = a is a vector tangent to the curve traced by r(t) at the point defined by r(a).

• The derivative function takes r′(t) at each value of t and uses it as the position to trace out another curve.

• A curve is smooth on an interval if its derivative is continuous and never zero on that interval. A curve may have a cusp at t = a if r′(a) = 0.

Example: Sketch and work the example shown for $\mathbf{r}(t) = \langle t^3,\ t^2 \rangle$ at t = 0.



Tangents and tangent lines

• For a vector valued function r(t) and a given t = a, r′(a) gives us a vector tangent to the curve at the point r(a).

• The unit tangent vector at t = a is obtained by normalizing r′(a):

$$\mathbf{T}(a) = \frac{\mathbf{r}'(a)}{\|\mathbf{r}'(a)\|}$$

• The unit tangent vector function gives the value of the unit tangent vector at any t:

$$\mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}$$

Example: Find the unit tangent vector function for $\mathbf{r}(t) = \langle \sin t,\ \cos t,\ t^2 \rangle$.

Example: Find a vector tangent to the curve traced by $\mathbf{r}(t) = (3t)\,\mathbf{i} + (t^2)\,\mathbf{j}$ at t = 1. Then, find the unit tangent vector at t = 1. Sketch the curve, and indicate r(1), r′(1) and T(1) on the sketch.


• The equation of a line in space is given by

$$\langle x, y, z \rangle = \langle x_0, y_0, z_0 \rangle + t\,\langle a, b, c \rangle$$

where $(x_0, y_0, z_0)$ is a point on the line, and $\langle a, b, c \rangle$ is a parallel vector indicating the direction of the line. To write the equation of the tangent line at t = a, we get the point by evaluating r(a), and the direction vector by evaluating r′(a).

Example: Write the equation of the line tangent to the curve $\mathbf{r}(t) = (3t)\,\mathbf{i} + (t^2)\,\mathbf{j}$ at t = 1. Convert to Cartesian form, and add a sketch of the tangent line to your previous sketch.

Example: Write the equation of the line tangent to the curve $\mathbf{r}(t) = \langle e^{2t},\ t^3,\ 5t \rangle$ at the point $(e^4, 8, 10)$.



Integration

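• For a vector valued function $\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle$ we define the indefinite integral (antiderivative), denoted $\int \mathbf{r}(t)\,dt$, by

$$\int \mathbf{r}(t)\,dt = \left\langle \int f(t)\,dt,\ \int g(t)\,dt,\ \int h(t)\,dt \right\rangle$$

Example: Given $\mathbf{r}(t) = \left\langle \sin t,\ \frac{1}{t},\ t^3 + \sqrt{t} \right\rangle$, find $\int \mathbf{r}(t)\,dt$.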

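• For a vector valued function $\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle$ we define the definite integral, denoted $\int_a^b \mathbf{r}(t)\,dt$, by the limit of the Riemann sum:

$$\begin{aligned}
\int_a^b \mathbf{r}(t)\,dt &= \lim_{n \to \infty} \sum_{i=1}^{n} \mathbf{r}(t_i^*)\,\Delta t \\
&= \left\langle \lim_{n \to \infty} \sum_{i=1}^{n} f(t_i^*)\,\Delta t,\ \lim_{n \to \infty} \sum_{i=1}^{n} g(t_i^*)\,\Delta t,\ \lim_{n \to \infty} \sum_{i=1}^{n} h(t_i^*)\,\Delta t \right\rangle
\end{aligned}$$

$$\int_a^b \mathbf{r}(t)\,dt = \left\langle \int_a^b f(t)\,dt,\ \int_a^b g(t)\,dt,\ \int_a^b h(t)\,dt \right\rangle$$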


• We can extend the Fundamental Theorem of Calculus to vector valued functions: if r(t) is continuous on [a, b], then

$$\int_a^b \mathbf{r}(t)\,dt = \mathbf{R}(b) - \mathbf{R}(a)$$

where R(t) is any antiderivative of r(t): $\mathbf{R}'(t) = \mathbf{r}(t)$.

Example: Given $\mathbf{r}(t) = \langle t^3,\ t^4,\ t^5 \rangle$, find $\int_0^1 \mathbf{r}(t)\,dt$.
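The Riemann-sum definition and the Fundamental Theorem can be compared directly on this example. The sketch below is plain Python; the midpoint rule is one possible choice of sample points t_i*, and the function name and n are illustrative assumptions.

```python
def r(t):
    """Integrand r(t) = <t^3, t^4, t^5>."""
    return (t ** 3, t ** 4, t ** 5)

def riemann_integral(r, a, b, n=10_000):
    """Component-wise midpoint Riemann sum of a 3-component vector function on [a, b]."""
    dt = (b - a) / n
    totals = [0.0, 0.0, 0.0]
    for i in range(n):
        sample = r(a + (i + 0.5) * dt)   # midpoint of the i-th subinterval
        for k in range(3):
            totals[k] += sample[k] * dt
    return tuple(totals)

approx = riemann_integral(r, 0.0, 1.0)
exact = (1 / 4, 1 / 5, 1 / 6)   # R(1) - R(0) with R(t) = <t^4/4, t^5/5, t^6/6>
print(approx)
print(exact)
```

Each component is summed independently, which is exactly what the angle-bracket form of the definition says.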



Normal and binormal vectors

• The derivative of the unit tangent vector, T′(t), always produces vectors which are normal to the curve r(t).

Proof: Prove that for any vector valued function v(t), if ||v(t)|| = c (constant), then v′(t) is orthogonal to v(t) for all t.

• Then, apply that result to T(t): since T(t) is by definition a unit vector with ||T(t)|| = 1 for all t, T(t) and T′(t) are orthogonal for all t, and for any t = a, T′(a) gives a vector normal to the curve defined by r(t).

• The unit normal vector is obtained by normalizing T′:

$$\mathbf{N}(t) = \frac{\mathbf{T}'(t)}{\|\mathbf{T}'(t)\|}$$

• And, we can obtain a third vector orthogonal to both T and N, the binormal vector:

B(t) = T(t) × N(t)


Example: Find T(t), N(t) and B(t) for $\mathbf{r}(t) = \langle \sin(3t^2),\ \cos(3t^2),\ 0 \rangle$.

Example: Sketch the graph of $\mathbf{r}(t) = \langle \sin(3t^2),\ \cos(3t^2),\ 0 \rangle$. Place the vectors T(0) and N(0) on the graph. What is the direction of B(0) (in or out of the page)?


• The normal plane at r(a) contains the vectors N and B, and is orthogonal to the vector T.

• The osculating plane at r(a) contains the vectors T and N, and is orthogonal to the vector B. The circle contained in this plane and tangential to the curve is the osculating circle and is the closest approximation to the curve near r(a).


Example: Find T, N and B for $\mathbf{r}(t) = \langle t^2,\ \tfrac{2}{3}t^3,\ t \rangle$ at the point $(1, \tfrac{2}{3}, 1)$.

Example: Write the equations for the normal and osculating planes for $\mathbf{r}(t) = \langle t^2,\ \tfrac{2}{3}t^3,\ t \rangle$ at the point $(1, \tfrac{2}{3}, 1)$.



Arc length

• Recall that the length of a curve can be approximated using lengths of line segments

$$L \approx \sum_{i=1}^{n} \sqrt{(\Delta x_i)^2 + (\Delta y_i)^2}$$

• If the curve is parameterized by x = f(t) and y = g(t), then

$$\Delta x_i \approx f'(t_i)\,\Delta t_i \qquad \Delta y_i \approx g'(t_i)\,\Delta t_i$$

and we have the Riemann sum

$$\begin{aligned}
L &\approx \sum_{i=1}^{n} \sqrt{(\Delta x_i)^2 + (\Delta y_i)^2} \\
&= \sum_{i=1}^{n} \sqrt{[f'(t_i)\,\Delta t_i]^2 + [g'(t_i)\,\Delta t_i]^2} \\
&= \sum_{i=1}^{n} \sqrt{[f'(t_i)]^2 + [g'(t_i)]^2}\ \Delta t_i
\end{aligned}$$

Taking the limit as n → ∞ gives the length of the curve, or arc length, on the interval (a, b):

$$L = \int_a^b \sqrt{[f'(t)]^2 + [g'(t)]^2}\ dt$$

$$L = \int_a^b \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\ dt$$

• For a space curve, an analogous argument gives

$$L = \int_a^b \sqrt{[f'(t)]^2 + [g'(t)]^2 + [h'(t)]^2}\ dt$$

$$L = \int_a^b \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2}\ dt$$

• If the curve is defined by $\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle$, the formula can be written compactly as

$$L = \int_a^b \|\mathbf{r}'(t)\|\ dt$$


Example: Find the length of the curve given by $\mathbf{r}(t) = \langle e^t \cos t,\ e^t \sin t \rangle$, $0 \le t \le \pi$.
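A numerical check of the compact formula L = ∫ ||r′(t)|| dt for this curve can be sketched as follows in plain Python. Simpson's rule is just one convenient quadrature choice, and the closed form √2 (e^π − 1) used for comparison is worked out here from ||r′(t)|| = √2 e^t rather than taken from the notes. Function names are illustrative.

```python
import math

def speed(t):
    """||r'(t)|| for r(t) = <e^t cos t, e^t sin t>."""
    dx = math.exp(t) * (math.cos(t) - math.sin(t))
    dy = math.exp(t) * (math.sin(t) + math.cos(t))
    return math.hypot(dx, dy)

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

L_numeric = simpson(speed, 0.0, math.pi)
L_closed = math.sqrt(2) * (math.exp(math.pi) - 1)
print(L_numeric, L_closed)
```

Agreement between the two values is a quick way to catch an algebra slip when simplifying the integrand by hand.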

• The arc length function gives the length of a curve on (a, t) as t varies (a is fixed):

$$s(t) = \int_a^t \|\mathbf{r}'(u)\|\ du$$

Example: Find the arc length function for the curve $\mathbf{r}(t) = \langle 2\sin t,\ 5t,\ 2\cos t \rangle$, $t \ge 0$.


• Differentiating the arc length function

$$\frac{d}{dt}s(t) = \frac{d}{dt}\int_a^t \|\mathbf{r}'(u)\|\ du$$

and applying the Fundamental Theorem of Calculus gives

$$\frac{ds}{dt} = \|\mathbf{r}'(t)\|$$

The rate of change of the arc length function with respect to t is the same as the magnitude of the rate of change of the position vector.



Curvature

• We can think of the unit tangent vector, $\mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}$, as providing information about the direction of the curve at any t, without any additional magnitude information.

• Curvature is a measure of the tightness of the curve: how fast the direction is changing as you travel along the length of the curve.

• Curvature is defined as the norm of the rate of change of the unit tangent vector with respect to arc length:

$$\kappa = \left\| \frac{d\mathbf{T}}{ds} \right\|$$

The implication is that T (unit tangent) is expressed in terms of s (arc length), and not in terms of the usual parameter t. While it is possible to reparameterize T in terms of s, it is easier to modify the formula:

$$\frac{d\mathbf{T}}{dt} = \frac{d\mathbf{T}}{ds}\,\frac{ds}{dt} \quad \text{(chain rule)}$$

$$\left\| \frac{d\mathbf{T}}{dt} \right\| = \left\| \frac{d\mathbf{T}}{ds} \right\| \left( \frac{ds}{dt} \right) \quad \text{(don’t forget, } s \text{ is scalar, and } \tfrac{ds}{dt} = \|\mathbf{r}'(t)\| \text{)}$$

$$\|\mathbf{T}'(t)\| = \kappa\,\|\mathbf{r}'(t)\|$$

$$\kappa = \frac{\|\mathbf{T}'(t)\|}{\|\mathbf{r}'(t)\|}$$

Emphasizing that κ is a function of t:

$$\kappa(t) = \frac{\|\mathbf{T}'(t)\|}{\|\mathbf{r}'(t)\|}$$

(You’ll see it both ways in various texts.)


Example: Find curvature as a function of t for $\mathbf{r}(t) = \langle t^2,\ 2t,\ \ln t \rangle$.

Example: What are the values of the curvature at t = 1 and t = 4 for $\mathbf{r}(t) = \langle t^2,\ 2t,\ \ln t \rangle$?


• The radius of curvature is the reciprocal of the curvature value:

$$\rho = \frac{1}{\kappa}$$

The radius is that of an inscribed circle tangent to the curve (so high κ value = small ρ value = tight curve, and vice versa).

• An alternate curvature formula (the derivation of this formula is at the end of the notes):

$$\kappa(t) = \frac{\|\mathbf{r}'(t) \times \mathbf{r}''(t)\|}{\|\mathbf{r}'(t)\|^3}$$

Example: Use the alternate formula to find κ(t) for $\mathbf{r}(t) = \langle t^2,\ 2t,\ \ln t \rangle$.
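The alternate formula is easy to evaluate numerically once r′ and r″ are known. The sketch below, in plain Python, hard-codes r′(t) = ⟨2t, 2, 1/t⟩ and r″(t) = ⟨2, 0, −1/t²⟩ for this example (valid for t > 0). The simplified expression 2t/(2t² + 1)² used in the checks is worked out here, not quoted from the notes, and the helper names are illustrative.

```python
import math

def cross(u, v):
    """Cross product of two 3D vectors (illustrative helper)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    """Magnitude ||u||."""
    return math.sqrt(sum(c * c for c in u))

def curvature(t):
    """kappa(t) = ||r' x r''|| / ||r'||^3 for r(t) = <t^2, 2t, ln t>, t > 0."""
    r1 = (2 * t, 2.0, 1 / t)          # r'(t)
    r2 = (2.0, 0.0, -1 / t ** 2)      # r''(t)
    return norm(cross(r1, r2)) / norm(r1) ** 3

k1 = curvature(1.0)
k4 = curvature(4.0)
print(k1, k4)
```

The curvature drops sharply between t = 1 and t = 4, consistent with the curve straightening out as t grows.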


• There is also a curvature formula for plane curves in the form y = f(x):

$$\kappa(x) = \frac{|f''(x)|}{[1 + (f'(x))^2]^{3/2}}$$

(derivation at end of notes)

Example: Find the curvature and radius of curvature of the graph of y = cos 2x at x = 0. Sketch the graph and the inscribed circle.
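For a quick check of the plane-curve formula on this example, here is a minimal plain-Python sketch. The derivatives f′(x) = −2 sin 2x and f″(x) = −4 cos 2x are supplied by hand, and the function name `curvature_of_graph` is illustrative.

```python
import math

def curvature_of_graph(f1, f2, x):
    """kappa(x) = |f''(x)| / (1 + f'(x)^2)^(3/2), given f' and f'' as functions."""
    return abs(f2(x)) / (1 + f1(x) ** 2) ** 1.5

def f1(x):
    """First derivative of y = cos 2x."""
    return -2 * math.sin(2 * x)

def f2(x):
    """Second derivative of y = cos 2x."""
    return -4 * math.cos(2 * x)

kappa = curvature_of_graph(f1, f2, 0.0)
rho = 1 / kappa   # radius of curvature
print(kappa, rho)
```

At x = 0 the slope is zero, so the denominator is 1 and the curvature is just |f″(0)|.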


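Derivation of

$$\kappa(t) = \frac{\|\mathbf{r}'(t) \times \mathbf{r}''(t)\|}{\|\mathbf{r}'(t)\|^3}$$

Start with $\mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}$ and solve for r′(t):

$$\mathbf{r}'(t) = \|\mathbf{r}'(t)\|\,\mathbf{T}(t)$$

Substitute $\|\mathbf{r}'(t)\| = \frac{ds}{dt}$:

$$\mathbf{r}'(t) = \left(\frac{ds}{dt}\right)\mathbf{T}(t)$$

Differentiate both sides with respect to t. This involves the scalar/vector function product rule on the right:

$$\frac{d}{dt}\mathbf{r}'(t) = \frac{d}{dt}\left[\left(\frac{ds}{dt}\right)\mathbf{T}(t)\right]$$

$$\mathbf{r}''(t) = \frac{d}{dt}\left[\left(\frac{ds}{dt}\right)\right]\mathbf{T}(t) + \left(\frac{ds}{dt}\right)\frac{d}{dt}[\mathbf{T}(t)]$$

$$\mathbf{r}''(t) = \left(\frac{d^2s}{dt^2}\right)\mathbf{T}(t) + \left(\frac{ds}{dt}\right)\mathbf{T}'(t)$$

Cross r′(t) with both sides:

$$\mathbf{r}'(t) \times \mathbf{r}''(t) = \mathbf{r}'(t) \times \left[\left(\frac{d^2s}{dt^2}\right)\mathbf{T}(t) + \left(\frac{ds}{dt}\right)\mathbf{T}'(t)\right]$$

Apply properties of cross product (distributes over vector addition; scalars can move to the front):

$$\mathbf{r}'(t) \times \mathbf{r}''(t) = \left(\frac{d^2s}{dt^2}\right)\mathbf{r}'(t) \times \mathbf{T}(t) + \left(\frac{ds}{dt}\right)\mathbf{r}'(t) \times \mathbf{T}'(t)$$

Substitute in $\mathbf{r}'(t) = \|\mathbf{r}'(t)\|\,\mathbf{T}(t)$:

$$\mathbf{r}'(t) \times \mathbf{r}''(t) = \left(\frac{d^2s}{dt^2}\right)\|\mathbf{r}'(t)\|\,\mathbf{T}(t) \times \mathbf{T}(t) + \left(\frac{ds}{dt}\right)\|\mathbf{r}'(t)\|\,\mathbf{T}(t) \times \mathbf{T}'(t)$$

Note that $\mathbf{T}(t) \times \mathbf{T}(t) = \mathbf{0}$ (property of cross product; vector crossed with itself):

$$\mathbf{r}'(t) \times \mathbf{r}''(t) = \mathbf{0} + \left(\frac{ds}{dt}\right)\|\mathbf{r}'(t)\|\,\mathbf{T}(t) \times \mathbf{T}'(t)$$

Substitute in $\|\mathbf{r}'(t)\| = \frac{ds}{dt}$ and combine:

$$\mathbf{r}'(t) \times \mathbf{r}''(t) = \left(\frac{ds}{dt}\right)^2 \mathbf{T}(t) \times \mathbf{T}'(t)$$

Take magnitudes of both sides:

$$\|\mathbf{r}'(t) \times \mathbf{r}''(t)\| = \left\|\left(\frac{ds}{dt}\right)^2 \mathbf{T}(t) \times \mathbf{T}'(t)\right\| = \left|\left(\frac{ds}{dt}\right)^2\right|\,\|\mathbf{T}(t) \times \mathbf{T}'(t)\|$$

We have $\|\mathbf{T}(t) \times \mathbf{T}'(t)\| = \|\mathbf{T}(t)\|\,\|\mathbf{T}'(t)\| \sin\theta$. We have also proven that since ||T(t)|| = 1 (constant), T(t) and T′(t) are orthogonal and therefore sin θ = sin(π/2) = 1. So

$$\|\mathbf{r}'(t) \times \mathbf{r}''(t)\| = \left|\left(\frac{ds}{dt}\right)^2\right|\,\|\mathbf{T}(t)\|\,\|\mathbf{T}'(t)\|$$

But ||T(t)|| = 1:

$$\|\mathbf{r}'(t) \times \mathbf{r}''(t)\| = \left|\left(\frac{ds}{dt}\right)^2\right|\,\|\mathbf{T}'(t)\|$$

Substitute back $\frac{ds}{dt} = \|\mathbf{r}'(t)\|$ (and yes, we did have to work with $\frac{ds}{dt}$ for a while to differentiate, so the original substitution was necessary):

$$\|\mathbf{r}'(t) \times \mathbf{r}''(t)\| = \|\mathbf{r}'(t)\|^2\,\|\mathbf{T}'(t)\|$$

Solve for ||T′(t)||:

$$\|\mathbf{T}'(t)\| = \frac{\|\mathbf{r}'(t) \times \mathbf{r}''(t)\|}{\|\mathbf{r}'(t)\|^2}$$

Divide both sides by ||r′(t)||:

$$\frac{\|\mathbf{T}'(t)\|}{\|\mathbf{r}'(t)\|} = \frac{\|\mathbf{r}'(t) \times \mathbf{r}''(t)\|}{\|\mathbf{r}'(t)\|^3}$$

The expression on the left was our first formula for κ(t). So

$$\kappa(t) = \frac{\|\mathbf{r}'(t) \times \mathbf{r}''(t)\|}{\|\mathbf{r}'(t)\|^3}$$

Because it involves the cross product, this formula only makes sense for vectors in R3 (and R2 as a subset of R3 with 0k). This is still pretty useful, as we are unlikely to be sketching any four dimensional curves. It should be noted that the idea of curvature (and the original formula) could be defined in any dimension.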


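Derivation of

$$\kappa(x) = \frac{|f''(x)|}{[1 + (f'(x))^2]^{3/2}}$$

for plane curves in the form y = f(x):

If y = f(x), then a parameterization of the curve is x = t, y = f(t), giving us the vector function

$$\mathbf{r}(t) = \langle t,\ f(t),\ 0 \rangle$$

Note that

$$\mathbf{r}'(t) = \langle 1,\ f'(t),\ 0 \rangle \qquad \mathbf{r}''(t) = \langle 0,\ f''(t),\ 0 \rangle$$

and apply the previous formula:

$$\mathbf{r}'(t) \times \mathbf{r}''(t) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & f'(t) & 0 \\ 0 & f''(t) & 0 \end{vmatrix} = \langle 0,\ 0,\ f''(t) \rangle$$

Then

$$\|\mathbf{r}'(t)\| = \sqrt{1 + (f'(t))^2} \qquad \|\mathbf{r}'(t) \times \mathbf{r}''(t)\| = \sqrt{(f''(t))^2} = |f''(t)|$$

and

$$\kappa(t) = \frac{|f''(t)|}{\left(\sqrt{1 + (f'(t))^2}\right)^3}$$

Since t = x,

$$\kappa(x) = \frac{|f''(x)|}{[1 + (f'(x))^2]^{3/2}}$$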


Application:

Differential equations and initial value problems

• A differential equation is an equation involving various order derivatives of a function (the solution is a function that satisfies the equation). Simple differential equations of the form y′ = f′(x), y′′ = f′′(x) etc. can be solved by antidifferentiation, and we can do something analogous with vector valued functions.

Example: Given $\mathbf{r}'(t) = \langle t^2,\ \sec^2 t,\ 1 \rangle$, what is r(t)?

• An initial value problem is a differential equation with additional information about the function and/or its derivatives that allows you to solve for constants of integration. For example, to solve

$$y' = x^2 + \sin x, \qquad y(0) = 1$$

we antidifferentiate

$$y = \frac{1}{3}x^3 - \cos x + C$$

and use y(0) = 1 to solve for the constant:

$$y(0) = \frac{1}{3}(0)^3 - \cos 0 + C = 1 \ \Rightarrow\ 0 - 1 + C = 1 \ \Rightarrow\ C = 2$$

$$y = \frac{1}{3}x^3 - \cos x + 2$$

For a vector valued function...


Example: Given $\mathbf{r}'(t) = \langle t^2,\ \sec^2 t,\ 1 \rangle$, $\mathbf{r}(0) = \langle 1,\ 2,\ -5 \rangle$, what is r(t)?
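Once you have a candidate answer, it can be verified numerically: differentiate it and compare with the given r′(t), then evaluate it at t = 0. The plain-Python sketch below does this for the candidate r(t) = ⟨t³/3 + 1, tan t + 2, t − 5⟩, which is worked out here rather than stated in the notes. The helper names and the test point t = 0.5 are illustrative.

```python
import math

def r_prime(t):
    """Given derivative r'(t) = <t^2, sec^2 t, 1>."""
    return (t ** 2, 1 / math.cos(t) ** 2, 1.0)

def r(t):
    """Candidate solution; constants of integration chosen so r(0) = <1, 2, -5>."""
    return (t ** 3 / 3 + 1, math.tan(t) + 2, t - 5)

def numeric_derivative(f, t, h=1e-6):
    """Central-difference derivative of a vector function, component by component."""
    forward, backward = f(t + h), f(t - h)
    return tuple((p - q) / (2 * h) for p, q in zip(forward, backward))

initial = r(0.0)
check = numeric_derivative(r, 0.5)
print(initial)
print(check, r_prime(0.5))
```

Each component carries its own constant of integration, so the initial condition fixes three constants at once.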

Example: Given $\mathbf{r}''(t) = \langle 2t,\ 3t^2 + 1,\ \cos t \rangle$, $\mathbf{r}'(\tfrac{\pi}{2}) = \langle 0,\ 0,\ 1 \rangle$, $\mathbf{r}(0) = \langle 1,\ 1,\ -1 \rangle$, what is r(t)?


Application:

Position, velocity, acceleration

• If t is interpreted as a time parameter, the vector valued function r(t) traces out a curve over time, and at t = a, r(a) points to the position of the object moving along that curve.

• The derivative r′(t) gives the rate of change of position with respect to time; i.e. the velocity. v(t) = r′(t).

• Differentiating again gives rate of change of velocity; i.e. the acceleration. a(t) = v′(t) = r′′(t).

Example: What are the velocity and acceleration functions for an object whose position is given by $\mathbf{r}(t) = \langle te^t,\ t^2,\ \tan t \rangle$?

Example: What are the velocity and position functions for an object whose acceleration is given by $\mathbf{a}(t) = \langle 2t,\ 3t^2,\ 4t^3 \rangle$ with $\mathbf{v}(0) = \langle 1,\ 1,\ 0 \rangle$, $\mathbf{r}(0) = \langle 0,\ 0,\ 10 \rangle$?


• In 2D and 3D motion, velocity and acceleration are vector valued functions, having both magnitude and direction.

• The speed of a moving object is the magnitude of the velocity, denoted ||v|| or v.

• The direction of the velocity vector (which indicates the direction of motion) is the unit tangent vector

$$\mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|} = \frac{\mathbf{v}(t)}{\|\mathbf{v}(t)\|}$$

and velocity can be expressed as a product of magnitude and direction:

$$\mathbf{v}(t) = \|\mathbf{v}(t)\| \left( \frac{\mathbf{v}(t)}{\|\mathbf{v}(t)\|} \right) = v\,\mathbf{T}$$

Example: What is the speed function for the object whose path is described by $\mathbf{r}(t) = \langle te^t,\ t^2,\ \tan t \rangle$? How fast is it moving at t = 1 s (r in meters)? Express v(1) in the form $\mathbf{v} = v\,\mathbf{T}$.

• In 1D motion, an object moving at a constant speed has zero acceleration. Is the same true for 2D or 3D motion?


Application:

Tangential and normal components of acceleration

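• Unit tangent $\left(\mathbf{T} = \frac{\mathbf{v}}{\|\mathbf{v}\|}\right)$ and unit normal $\left(\mathbf{N} = \frac{\mathbf{T}'}{\|\mathbf{T}'\|}\right)$ vectors provide information about the direction of motion. We know that velocity is tangential to the curve, and can be expressed as

$$\mathbf{v} = v\,\mathbf{T} \quad \text{where } v = \|\mathbf{v}\|$$

We are interested in how acceleration is related to these vectors.

• Since $\mathbf{v} = \|\mathbf{v}\|\,\mathbf{T}$ and $\mathbf{a} = \mathbf{v}'$, we obtain by differentiating:

$$\begin{aligned}
\frac{d}{dt}\mathbf{v} &= \frac{d}{dt}\big[\|\mathbf{v}\|\,\mathbf{T}\big] \\
&= \left(\frac{d}{dt}\big[\|\mathbf{v}\|\big]\right)\mathbf{T} + \|\mathbf{v}\|\,\frac{d}{dt}[\mathbf{T}] \\
&= \left(\frac{d}{dt}\big[\|\mathbf{v}\|\big]\right)\mathbf{T} + \|\mathbf{v}\|\,\mathbf{T}' \\
&= \left(\frac{d}{dt}\big[\|\mathbf{v}\|\big]\right)\mathbf{T} + \|\mathbf{v}\| \left(\frac{\|\mathbf{T}'\|}{\|\mathbf{T}'\|}\right)\mathbf{T}' \\
&= \left(\frac{d}{dt}\big[\|\mathbf{v}\|\big]\right)\mathbf{T} + \big(\|\mathbf{v}\|\,\|\mathbf{T}'\|\big)\,\frac{\mathbf{T}'}{\|\mathbf{T}'\|}
\end{aligned}$$

$$\mathbf{a} = \left(\frac{d}{dt}\big[\|\mathbf{v}\|\big]\right)\mathbf{T} + \big(\|\mathbf{v}\|\,\|\mathbf{T}'\|\big)\,\mathbf{N}$$

• In this expression, the scalar coefficient of the unit tangent vector is the tangential component of acceleration:

$$a_T = \frac{d}{dt}\big[\|\mathbf{v}\|\big]$$

• The scalar coefficient of the unit normal vector is the normal component of acceleration:

$$a_N = \|\mathbf{v}\|\,\|\mathbf{T}'\|$$

• The form of the expression

$$\mathbf{a} = a_T\,\mathbf{T} + a_N\,\mathbf{N}$$

tells us about the acceleration:

* Acceleration lies in a plane containing T and N (the osculating plane).

* The tangential component $a_T = \frac{d}{dt}\big[\|\mathbf{v}\|\big]$ is the rate of change of the speed.

* The normal component $a_N = \|\mathbf{v}\|\,\|\mathbf{T}'\|$ provides information about changes in direction. The easiest way to visualize the effects of $a_N$ is in terms of centripetal force: $F = m\,a_N$.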


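• Computing $a_T$ and $a_N$ from their respective definitions is tedious ($a_T$ isn’t too bad, but computing the vector T′ to get $a_N$ is inconvenient).

• Consider the relationship

$$\mathbf{a} = a_T\,\mathbf{T} + a_N\,\mathbf{N}$$

Since T and N are orthogonal, we have a right triangle with

$$\|\mathbf{a}\|^2 = (a_T)^2 + (a_N)^2$$

• We can further consider $a_T\mathbf{T}$ as the vector projection of a onto T, and $a_T$ as the length of that projection. Working through the formulas we derived for the vector projection, we arrive at

$$a_T\,\mathbf{T} = \mathrm{proj}_{\mathbf{T}}\mathbf{a}$$

$$a_T = \|\mathrm{proj}_{\mathbf{T}}\mathbf{a}\| \quad (\text{or } \mathrm{comp}_{\mathbf{T}}\mathbf{a})$$

$$a_T = \|\mathrm{proj}_{\mathbf{T}}\mathbf{a}\| = \frac{\mathbf{a} \cdot \mathbf{T}}{\|\mathbf{T}\|} = \mathbf{a} \cdot \mathbf{T} \quad (\text{since } \|\mathbf{T}\| = 1)$$

Using $\mathbf{T} = \frac{\mathbf{v}}{\|\mathbf{v}\|}$, we have

$$a_T = \frac{\mathbf{a} \cdot \mathbf{v}}{\|\mathbf{v}\|}$$

• We can also consider $a_N\mathbf{N}$ as the orthogonal projection of a onto T, and $a_N$ as the length of that projection. Working through the formulas we derived for the orthogonal projection, we arrive at

$$a_N\,\mathbf{N} = \mathrm{orth}_{\mathbf{T}}\mathbf{a}$$

$$a_N = \|\mathrm{orth}_{\mathbf{T}}\mathbf{a}\|$$

$$a_N = \|\mathrm{orth}_{\mathbf{T}}\mathbf{a}\| = \frac{\|\mathbf{a} \times \mathbf{T}\|}{\|\mathbf{T}\|} = \|\mathbf{a} \times \mathbf{T}\| \quad (\text{since } \|\mathbf{T}\| = 1)$$

Using $\mathbf{T} = \frac{\mathbf{v}}{\|\mathbf{v}\|}$, we have

$$a_N = \frac{\|\mathbf{a} \times \mathbf{v}\|}{\|\mathbf{v}\|}$$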


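• So, after a great deal of explanation, we come up with two simple formulas for $a_T$ and $a_N$:

$$a_T = \frac{\mathbf{a} \cdot \mathbf{v}}{\|\mathbf{v}\|} \qquad a_N = \frac{\|\mathbf{a} \times \mathbf{v}\|}{\|\mathbf{v}\|}$$

• Alternately, compute $a_T$ and ||a||, and get $a_N$ from

$$a_N = \sqrt{\|\mathbf{a}\|^2 - (a_T)^2}$$

Example: What are the tangential and normal components of acceleration for $\mathbf{r}(t) = \langle t^2,\ e^t,\ \ln t \rangle$ m at t = 1 s?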
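The two short formulas can be evaluated directly for this example. The plain-Python sketch below uses v(t) = ⟨2t, e^t, 1/t⟩ and a(t) = ⟨2, e^t, −1/t²⟩, obtained by differentiating r(t) by hand, and then confirms the right-triangle relation ||a||² = (a_T)² + (a_N)². Helper names are illustrative.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def cross(u, v):
    """Cross product of two 3D vectors (illustrative helper)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

t = 1.0
v = (2 * t, math.exp(t), 1 / t)          # velocity r'(t)
a = (2.0, math.exp(t), -1 / t ** 2)      # acceleration r''(t)

a_T = dot(a, v) / norm(v)                # tangential component
a_N = norm(cross(a, v)) / norm(v)        # normal component
print(a_T, a_N)
```

Computing both components independently and then checking them against ||a|| is a cheap way to catch a sign or arithmetic error in either one.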

Important footnote to all this... Computing $a_T$ and $a_N$ turns out to be painless; however these are only the scalar components; i.e. magnitudes. In themselves, they tell us nothing about the direction of the acceleration. To examine the directions that these magnitudes belong to, we must go through the process of computing the vectors T and N described in the section on “Unit tangent and unit normal vectors”. In particular, there is no nice, short way to compute the normal vector N.

One useful thing you can do to visualize tangential and normal components (without doing the vector calculations) is take a look in MVT’s “TNB Frames” application (see the link from the web page). The TNB frames will show you the tangent, normal, and binormal vectors as you move along the curve. Combined with your calculations for ||v||, $a_T$ and $a_N$, you can get a feel for both the size AND the direction of the quantities involved.


Functions of several variables


• Functions of several variables, or multivariable functions have the form

$$f(\mathbf{x}) = f(x_1, x_2, ..., x_n)$$

for example

$$f(x_1, x_2, ..., x_n) = x_1^2 + x_2^2 - x_3 + \sin x_4$$

The variables $x_1, x_2, ..., x_n$ are all independent.

• For the most part, we’ll be looking at functions of two variables, expressed as

z = f(x, y)

(e.g. $z = x^2 y + \sin y$) where x and y are the independent variables, and z is dependent on x and y.

• The graph of z = f(x, y) is the set of all ordered triplets that satisfy the relation defined by z = f(x, y). These triplets form a surface in space.

• The domain of a multivariable function $f(\mathbf{x}) = f(x_1, x_2, ..., x_n)$ is the set of all values $x_1, x_2, ..., x_n$ for which it is defined.

For example, the domain of $f(w, x, y, z) = \sqrt{w + x + y + z}$ is the set of all values for w, x, y, and z which do not produce a negative quantity under the square root:

$$\{(w, x, y, z) \mid w + x + y + z \ge 0\}$$

Examples: Give the domains of

• $f(x, y) = \ln(xy)$

• $g(x, y, z) = \frac{1}{x^2 + y^2 + z^2}$

• $h(x, y) = \frac{2 - x}{\sqrt{4 - x^2 - y^2}}$


• In the case of a function of two variables, the domain will be a region in the xy plane.

For example, we found the domain of f(x, y) = ln(xy) to be $\{(x, y) \mid xy > 0\}$. For this to hold, we must have x and y both positive (1st quadrant), or x and y both negative (3rd quadrant).

Example: Sketch the domain of $h(x, y) = \frac{2 - x}{\sqrt{4 - x^2 - y^2}}$.

• The graph of a function of three variables is four dimensional (and tough to sketch). The domains of these functions are expressed in terms of three variables, and can be described as regions of 3D space.

We found the domain of $g(x, y, z) = \frac{1}{x^2 + y^2 + z^2}$ to be $\{(x, y, z) \mid (x, y, z) \ne (0, 0, 0)\}$. This is all of space except the origin.

Example: Find and describe the domain of f(x, y, z) = ln(z − sin(y)).

• Projections of surfaces onto various planes may not be that informative. Take a moment and sketch the example of the projections of $f(x, y) = x^2 + y^2$:


• Projecting a function of two variables onto the xy plane in particular shows the region that is the domain of the function. You can see (sort of, owing to the limitations of graphing software and some bad behavior around asymptotes) that the projection of $h(x, y) = \frac{2 - x}{\sqrt{4 - x^2 - y^2}}$ is the circular region you found for the domain in a previous example.

• Operations on multivariable functions are analogous to operations on single variable functions:

* (cf)(x, y) = cf(x, y) (scalar multiple)

* (f ± g)(x, y) = f(x, y) ± g(x, y) (sum or difference)

* (fg)(x, y) = f(x, y)g(x, y) (product)

* $\left(\frac{f}{g}\right)(x, y) = \frac{f(x, y)}{g(x, y)}$ (quotient)

Domains of the resulting functions are the intersections of the domains of f and g, with g nonzero in the case of the quotient.

• We can define the composition of a single variable function with a multivariable function. Suppose g(x) is a function of one variable, and h(x, y) is a function of two variables. The composition

$$(g \circ h)(x, y) = g(h(x, y))$$

makes sense, and produces a function of two variables

$$f(x, y) = (g \circ h)(x, y)$$

Example: What is the composition of $g(x) = \frac{1}{\sqrt{x}}$ with $h(x, y) = x^2 + y^2$?



Intersections of surfaces and traces

• Intersections of surfaces are curves in space, and we should be able to write a set of parametric equations that describe this curve.

Example: Let

$$z = f_1(x, y) = 8 - x^2 - y^2$$

$$z = f_2(x, y) = x^2 + y^2$$

The intersection of these surfaces appears to give a circle. Work through the example screens where we solve for the intersection of the surfaces:
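One way to check a proposed parameterization of the intersection is to substitute it into both surface equations. Setting 8 − x² − y² = x² + y² gives x² + y² = 4 and z = 4, which suggests the candidate curve x = 2 cos t, y = 2 sin t, z = 4. This candidate is worked out here as an assumption, not quoted from the notes, and the plain-Python sketch below measures how far sample points on it are from each surface.

```python
import math

def curve(t):
    """Candidate parameterization of the intersection: a circle of radius 2 at height z = 4."""
    return (2 * math.cos(t), 2 * math.sin(t), 4.0)

def f1(x, y):
    return 8 - x ** 2 - y ** 2

def f2(x, y):
    return x ** 2 + y ** 2

# Residuals z - f1(x, y) and z - f2(x, y) at sample points; both should be ~0.
residuals = []
for k in range(8):
    x, y, z = curve(k * math.pi / 4)
    residuals.append((z - f1(x, y), z - f2(x, y)))

print(max(abs(e) for pair in residuals for e in pair))
```

A residual near zero on both surfaces at every sample point supports the claim that the curve lies on the intersection, though it does not replace the algebra.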

Example: Find parametric equations for the intersection of the surfaces z = sin(x) and z = x + y


• The trace of a surface in a plane is the intersection of that surface with the plane. Traces can be expressed in parametric form (as above), or, more commonly, in Cartesian form (specify the plane that the equation lies in).

For example, the trace of the surface $z = f(x, y) = x^2 + 4y^2$ in the plane z = 2 is the ellipse

$$2 = x^2 + 4y^2$$

(the curve lies in the plane z = 2).

Example: What is the trace of the surface $z = x^2 + 4y^2$ in the plane x = 2?

Example: What are the traces of the surface z = sin x + cos y in the xz, yz, and xy planes?


(The algebra on this one is kind of involved, so here’s some more room. Just keep working ...)

• The traces of a surface in the xy, xz, and yz planes can be used to visualize the surface. A 3D surface would be hand sketched by sketching its traces.

Example: What are the traces in the xz, yz, and xy planes of the surface $z = 9 - x^2 - y^2$? Use the traces to hand sketch the surface.


Example: You’ve found the traces of the surface z = sin x + cos y in the xz, yz, and xy planes. Can you build up a picture of the surface and sketch it?

104

Calculus of multivariable functions

Limits, part 1: the intuitive approach

• In single variable Calculus, we start by introducing the idea of a limit intuitively (through graphs and tables), before turning to the formal definition. We’re going to do the same thing here.

• We think of $\lim_{x \to a} f(x) = L$ in terms of “approaching”: as x values get close to a, function values get close to L.

• And we think of continuity in terms of “unbroken”: the graph can be drawn without lifting your pencil.

• Recall that most of the functions we work with (polynomial, rational, radical, trigonometric, logarithmic, exponential) are continuous on their domains.

• And the first technique we learn for evaluating limits is direct substitution: we expect that as long as we’re in the domain of the function, $\lim_{x \to a} f(x) = f(a)$.

• So, we go through the same process with functions of two variables (or more, but as usual, we’ll stick with two because we can graph the surfaces). Keep in mind you can approach a point (a, b) from all directions in the xy plane.

• Informal definition of limit: We say that the limit of f(x, y) as (x, y) approaches (a, b) is L and write

$$\lim_{(x,y) \to (a,b)} f(x, y) = L$$

if the values of f(x, y) can be made arbitrarily close to L by choosing points (x, y) sufficiently close to (a, b).

• A function is continuous at (a, b) if and only if

$$\lim_{(x,y) \to (a,b)} f(x, y) = f(a, b)$$

• A function is continuous on an open region R if it is continuous at every point in R.

• We expect all the familiar functions to be continuous on their domains, and for compositions g(h(x, y)) of continuous g(x) and h(x, y) to be continuous. If we know that the function we’re interested in is continuous on its domain, and (a, b) is in the domain of the function, the limit may be evaluated by direct substitution:

$$\lim_{(x,y) \to (a,b)} f(x, y) = f(a, b)$$

Example: Find $\lim_{(x,y) \to (1,2)} \left( x^2 y + x y^2 \right)$.

Example: Find $\lim_{(x,y) \to (1,2)} \frac{x^2 y + 2x}{3xy + 6}$.

Example: Find $\lim_{(x,y) \to (-1,2)} \frac{x^2 y + 2x}{3xy + 6}$.
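The last two examples differ only in the point being approached. At (1, 2) the point is in the domain, so direct substitution applies; at (−1, 2) substitution gives 0/0, and factoring the numerator as x(xy + 2) and the denominator as 3(xy + 2) reduces the quotient to x/3 away from the curve xy = −2. The sketch below (the helper `approach` and the chosen directions are illustrative assumptions) evaluates the quotient at points closing in on (−1, 2) as a numerical sanity check.

```python
def q(x, y):
    return (x**2 * y + 2 * x) / (3 * x * y + 6)

def approach(a, b, direction, steps=(1e-2, 1e-4, 1e-6)):
    # values of q at points a small step h away from (a, b) along one direction
    dx, dy = direction
    return [q(a + h * dx, b + h * dy) for h in steps]

# direct substitution works at (1, 2)
value_at_1_2 = q(1.0, 2.0)

# near (-1, 2) the values should settle toward -1/3 from several directions
along_x = approach(-1.0, 2.0, (1, 0))
along_y = approach(-1.0, 2.0, (0, 1))
along_diag = approach(-1.0, 2.0, (1, 1))
```

Agreement along a few directions is only evidence, not proof; here the algebraic reduction to x/3 is what actually settles the limit.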

Examples: Where are the functions below continuous?

$$f(x, y) = \ln(xy)$$

$$h(x, y) = \frac{2 - x}{\sqrt{4 - x^2 - y^2}}$$


Limits, part 2: using paths to show a limit does not exist

• For a single variable function, one way to show a limit does NOT exist is to consider the limit from the left and from the right; if these values do not agree, then the limit does not exist. For example, for the function

$$f(x) = \begin{cases} x^2 & x < 1 \\ x + 3 & x \ge 1 \end{cases}$$

we know that $\lim_{x \to 1} f(x)$ does not exist, because

$$\lim_{x \to 1^-} f(x) = \lim_{x \to 1^-} x^2 = 1$$

and

$$\lim_{x \to 1^+} f(x) = \lim_{x \to 1^+} (x + 3) = 4$$

When we do this, we are taking two different paths to the value a = 1, “from the left” and “from the right”.

• We can extend this approach to functions of two variables - if we are interested in

$$\lim_{(x,y) \to (a,b)} f(x, y)$$

we can look at that limit by approaching (a, b) along various paths in the (x, y) plane. Unlike the single variable case, though, there are an infinite number of paths passing through a given (a, b), so we can never show a limit exists by considering paths; there’s no way to consider them all! If we suspect the limit doesn’t exist, though, we only need to find two paths that disagree.

Example: What is $\lim_{(x,y) \to (0,0)} \frac{2x^2 - 3y^2}{x^2 + y^2}$ along the path y = 0 (i.e., the x-axis)? Note that limiting values to this path puts you on a curve that is the trace of the function in the xz plane. To solve, simply substitute y = 0 in the expression, and find the limit.

Example: What is $\lim_{(x,y) \to (0,0)} \frac{2x^2 - 3y^2}{x^2 + y^2}$ along the path x = 0 (i.e., the y-axis)? Note that limiting values to this path puts you on a curve that is the trace of the function in the yz plane. To solve, simply substitute x = 0 in the expression, and find the limit.

Example: What does that tell you about $\lim_{(x,y) \to (0,0)} \frac{2x^2 - 3y^2}{x^2 + y^2}$ overall?

Example: Show that $\lim_{(x,y) \to (0,0)} \frac{xy}{x^2 + y^2}$ does not exist by considering paths.

• Instead of looking at specific paths one at a time (y = 0, x = 0, y = x, y = 2x, and so on, if we’re considering a limit going to the origin), you can shorten things by checking all paths of a certain type at once. For example, all lines through the origin (except x = 0) have the form y = kx. Making the substitution y = kx and examining the limit as x → 0 will give us an expression that may or may not depend on the value of k. If it depends on k, we know the limit does not exist.

Example: Find $\lim_{(x,y) \to (0,0)} \frac{x^2}{x^2 + y^2}$ along the paths y = kx. What can you conclude?

Example: Find $\lim_{(x,y) \to (0,0)} \frac{x y^2}{x^2 + y^4}$ along the paths y = kx. What can you conclude?
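The y = kx substitution is easy to explore numerically. The sketch below evaluates both quotients at a point close to the origin on several lines. For the first, the values track 1/(1 + k²), which depends on k. For the second, every line gives a value near 0, which on its own proves nothing; the parabola x = y² (an extra path chosen here for illustration, not one from the list above) gives 1/2 instead. The names `g1`, `g2`, and `along_line` are made up for this sketch.

```python
def along_line(f, k, x):
    # value of f at the point (x, kx) on the line y = kx
    return f(x, k * x)

def g1(x, y):
    return x**2 / (x**2 + y**2)

def g2(x, y):
    return x * y**2 / (x**2 + y**4)

x_small = 1e-6
line_values_g1 = {k: along_line(g1, k, x_small) for k in (0, 1, 2)}
line_values_g2 = {k: along_line(g2, k, x_small) for k in (0, 1, 2)}

# extra path for g2: points (y**2, y) on the parabola x = y**2
y_small = 1e-3
parabola_value_g2 = g2(y_small**2, y_small)
```

So agreement along all lines through the origin is not enough to conclude a limit exists; a curved path can still disagree.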



Limits, part 3: the delta-epsilon definition

• We have the informal definition of a limit:

We say that the limit of f(x, y) as (x, y) approaches (a, b) is L and write

$$\lim_{(x,y) \to (a,b)} f(x, y) = L$$

if the values of f(x, y) can be made arbitrarily close to L by choosing points (x, y) sufficiently close to (a, b).

• In this section, we are going to look at the formal definition of a limit, and define exactly what we mean by “sufficiently close to.”

• We start by defining a δ-neighborhood about a point (a, b) in the xy plane: a δ-neighborhood about (a, b) is a disk centered at (a, b) with radius δ > 0. We can have open or closed δ-neighborhoods, depending on whether we include the boundary of the disk.

Open disk: $\{(x, y) \mid \sqrt{(x - a)^2 + (y - b)^2} < \delta\}$

Closed disk: $\{(x, y) \mid \sqrt{(x - a)^2 + (y - b)^2} \le \delta\}$

• Recall the definition of the limit for a single variable function:

Let f be a function defined on an open interval containing a (except possibly at a), and let L be a real number. The statement

$$\lim_{x \to a} f(x) = L$$

means that given any ε > 0, there exists a δ > 0 such that if

0 < |x − a| < δ, then |f(x) − L| < ε

• We can extend this to the formal δ-ε definition of the limit for a function of two variables:

Let f be a function of two variables defined on an open disk centered at (a, b) (except, possibly, at (a, b) itself), and let L be a real number. We say that the limit of f(x, y) as (x, y) approaches (a, b) is L and write

$$\lim_{(x,y) \to (a,b)} f(x, y) = L$$

if, given any ε > 0, there exists δ > 0 such that if

$$0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta$$

then

$$|f(x, y) - L| < \varepsilon$$

• Examples are posted as separate .pdf’s.
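A numerical spot check can make the roles of ε and δ concrete. The sketch below uses a hypothetical function chosen for simplicity, f(x, y) = x² + y² with L = 0 at (a, b) = (0, 0); it is not one of the posted examples. Since |f(x, y) − 0| is exactly the squared distance to the origin, the choice δ = √ε works, and the code samples points in the punctured disk to confirm the inequality at those points.

```python
import math

def f(x, y):
    return x**2 + y**2

def delta_for(eps):
    # |f(x, y) - 0| equals the squared distance to the origin, so delta = sqrt(eps) suffices
    return math.sqrt(eps)

def check(eps, n_angles=60, n_radii=20):
    # sample points with 0 < distance < delta and test |f - L| < eps at each one
    delta = delta_for(eps)
    for i in range(n_angles):
        theta = 2 * math.pi * i / n_angles
        for j in range(1, n_radii + 1):
            r = delta * j / (n_radii + 1)
            if abs(f(r * math.cos(theta), r * math.sin(theta)) - 0.0) >= eps:
                return False
    return True
```

Sampling can only fail to find a counterexample; the actual proof is the inequality that justifies the choice of δ.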

• General notation: We’ve been looking at these concepts in terms of two variable functions, with the note that they can be extended to functions of three or more variables by analogy. Since we’re being formal here, I’d like to provide the general definition of the limit for a function of n variables using proper notation. The easiest way to do this is to make use of vectors (or ordered n-tuples). Let x and a be vectors in $\mathbb{R}^n$:

$$\mathbf{x} = \langle x_1, x_2, \ldots, x_n \rangle$$

$$\mathbf{a} = \langle a_1, a_2, \ldots, a_n \rangle$$

• We define an open δ-neighborhood about a by

$$\{\mathbf{x} \mid \|\mathbf{x} - \mathbf{a}\| < \delta\}$$

• And the limit:

Let $f(\mathbf{x}) = f(\langle x_1, x_2, \ldots, x_n \rangle)$ be a function of n variables defined on an open neighborhood of $\mathbf{a} = \langle a_1, a_2, \ldots, a_n \rangle$ (except possibly at a), and let L be a real number. The statement

$$\lim_{\mathbf{x} \to \mathbf{a}} f(\mathbf{x}) = L$$

means that given any ε > 0, there exists a δ > 0 such that if

$$0 < \|\mathbf{x} - \mathbf{a}\| < \delta, \text{ then } |f(\mathbf{x}) - L| < \varepsilon$$

• Finally, recall discussing continuity on an open region. Now that we’ve defined open neighborhood, we can formally define open region.

• A point (a, b) in a region R is an interior point if there exists a δ-neighborhood about (a, b) that lies entirely within R. A point (a, b) is a boundary point of R if every δ-neighborhood about (a, b) contains points both inside and outside R.

• If every point in R is an interior point, it’s an open region (it doesn’t contain any of its boundary points). If R contains all its boundary points, it’s a closed region.


Limits, part 4: using the squeeze theorem to prove the limit exists

• Squeeze Theorem:

If

$$g(x, y) \le f(x, y) \le h(x, y)$$

in a neighborhood of (a, b), then

$$\lim_{(x,y) \to (a,b)} g(x, y) \le \lim_{(x,y) \to (a,b)} f(x, y) \le \lim_{(x,y) \to (a,b)} h(x, y)$$

(assuming these limits exist).

If $\lim_{(x,y) \to (a,b)} g(x, y)$ and $\lim_{(x,y) \to (a,b)} h(x, y)$ exist and are equal, then $\lim_{(x,y) \to (a,b)} f(x, y)$ exists, and

$$\lim_{(x,y) \to (a,b)} f(x, y) = \lim_{(x,y) \to (a,b)} g(x, y) = \lim_{(x,y) \to (a,b)} h(x, y)$$

Example: Use the squeeze theorem to find

$$\lim_{(x,y) \to (0,0)} \frac{3x^2 y^2}{x^2 + y^2}$$

* First, check some preliminaries: what does direct substitution give you? Can you reduce the expression? What does the graph look like?

* Now, set up the squeeze:

* Next, pass the limit through:

* And conclude:
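For reference, one possible way to fill in the three steps for this example is sketched below. The choice of bounding functions (0 below and 3x² above) is one option among several, and relies only on the fact that y² ≤ x² + y².

```latex
% Set up the squeeze: for (x, y) \ne (0, 0),
0 \;\le\; \frac{3x^2 y^2}{x^2 + y^2}
  \;=\; 3x^2 \cdot \frac{y^2}{x^2 + y^2}
  \;\le\; 3x^2
  \qquad \text{since } 0 \le \frac{y^2}{x^2 + y^2} \le 1 .

% Pass the limit through:
\lim_{(x,y) \to (0,0)} 0 = 0
\qquad \text{and} \qquad
\lim_{(x,y) \to (0,0)} 3x^2 = 0 .

% Conclude:
\lim_{(x,y) \to (0,0)} \frac{3x^2 y^2}{x^2 + y^2} = 0 .
```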

Example: Use the squeeze theorem to find

$$\lim_{(x,y) \to (0,0)} \frac{y^4 \sin^2 y}{x^2 + y^2}$$

• Hint: for functions which are not nonnegative for all (x, y), try squeezing the absolute value of the function. This works when you suspect the limit is 0. The absolute value gives the distance from the function value to 0, and if that distance approaches 0, you know the function itself must be approaching 0.

Example: Use the squeeze theorem to show the limit is zero:

$$\lim_{(x,y) \to (0,0)} \frac{-2xy^2}{x^4 + y^2}$$


Limits, part 5: general strategy

• In the past few sections, we’ve illustrated various techniques for showing whether a limit does/does not exist. It’s up to you to decide what technique to use; a typical limit problem reads “Find the limit, if it exists, or show that it doesn’t.”

• Strategy. Which way you proceed depends on your intuition about whether the limit exists or not.

* Look at domain, try direct substitution. You may get a conclusive answer and not have to go through the rest of the process!

* If you get an indeterminate form (usually 0/0), see if there’s any algebra you can do to reduce.

* Now, if that hasn’t gotten you anywhere, you need to decide. Exists, or doesn’t exist. If at all possible, GRAPH! Inspection will tell you if anything weird is going on.

* If you think the limit doesn’t exist, prove it by trying paths.

* If you think the limit does exist, prove it by trying to squeeze.

• The worst part of doing limit problems is deciding which way to go ... and that’s been simplified considerably with graphing software. If you aren’t in a position to graph and inspect, the usual approach is to try paths first, and if all the paths seem to be heading to the same place, change tactics and try for a squeeze.

Example 1: Find the limit, if it exists, or show that it doesn’t:

$$\lim_{(x,y) \to (0,0)} \frac{6x^2 y^2}{x^4 + y^4}$$

* Substitute:

* Algebra:

* Graph:

* Paths:

Example 2: Find the limit, if it exists, or show that it doesn’t:

$$\lim_{(x,y) \to (0,0)} \frac{x^3 y^2}{x^4 + y^2}$$

* Substitute:

* Algebra:

* Graph:

* Squeeze:

• Now that we have a strategy for examining limits, we can discuss the continuity of piecewise functions, such as

$$f(x, y) = \begin{cases} \dfrac{6x^2 y^2}{x^4 + y^4} & (x, y) \ne (0, 0) \\ 1 & (x, y) = (0, 0) \end{cases}$$

When considering piecewise functions, we can apply the usual rules on the open regions on which the pieces are defined. The main question is “What happens at the spots the pieces switch?” (In this case, what happens at (0, 0)?)

Example: Discuss the continuity of

$$f(x, y) = \begin{cases} \dfrac{6x^2 y^2}{x^4 + y^4} & (x, y) \ne (0, 0) \\ 1 & (x, y) = (0, 0) \end{cases}$$


Partial differentiation

• We define partial derivatives of functions of two or more variables by considering what happens if we hold one of the variables constant. Given a point (a, b), and a function f(x, y), we define the partial derivative of f with respect to x at the point (a, b) by

$$f_x(a, b) = \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h}$$

Similarly, we define the partial derivative of f with respect to y at the point (a, b) by

$$f_y(a, b) = \lim_{k \to 0} \frac{f(a, b + k) - f(a, b)}{k}$$

• The definition of fx(a, b) is equivalent to

Hold y = b constant in the function f(x, y); i.e., intersect the graph of the function with the plane y = b. f(x, y) becomes f(x, b) = g(x), a curve in that plane. Now, differentiate with respect to x, and evaluate at a. fx(a, b) = g′(a), the slope of the tangent to the curve in that plane.

• The definition of fy(a, b) is equivalent to

Hold x = a constant in the function f(x, y); i.e., intersect the graph of the function with the plane x = a. f(x, y) becomes f(a, y) = h(y), a curve in that plane. Now, differentiate with respect to y, and evaluate at b. fy(a, b) = h′(b), the slope of the tangent to the curve in that plane.

• We get from the partial derivative at a point to the partial derivative function in exactly the same way as in the one variable case: within a fixed plane (say y = b) the values of the slope vary as we move along the trace. So, we could talk about fx(x, b) as a function of x. Then, if we allow the planes y = b to vary as well, the slope will vary depending on which plane we’re in. So we can define fx(x, y) as a function of two variables (and the same reasoning holds for fy(x, y)):

$$f_x(x, y) = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}$$

$$f_y(x, y) = \lim_{k \to 0} \frac{f(x, y + k) - f(x, y)}{k}$$

• Since the partial derivative is defined by holding one variable constant and differentiating with respect to the other variable, it follows that all the differentiation rules for functions of one variable apply here.

• Before working examples, it will help to introduce the partial differentiation operators, $\frac{\partial}{\partial x}$ and $\frac{\partial}{\partial y}$. These act in the same way that $\frac{d}{dx}$ acts on a single variable function; they indicate “differentiate this” (with respect to x or with respect to y, as indicated). So,

$$f_x(x, y) = f_x = \frac{\partial f}{\partial x} = \frac{\partial}{\partial x} f(x, y)$$

$$f_y(x, y) = f_y = \frac{\partial f}{\partial y} = \frac{\partial}{\partial y} f(x, y)$$

Example: Find fx(x, y) and fy(x, y) for $f(x, y) = x^4 + x^2 y^2 - y \sin x$.
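Once you have worked this example by hand, a difference quotient gives a quick way to check the answer. The sketch below compares hand-computed partials (fx = 4x³ + 2xy² − y cos x and fy = 2x²y − sin x) against a central difference, which is a symmetric variant of the one-sided quotient in the limit definition and is used here only because it is more accurate numerically. The function names are made up for illustration.

```python
import math

def f(x, y):
    return x**4 + x**2 * y**2 - y * math.sin(x)

def partial_x(f, x, y, h=1e-6):
    # central difference in x, holding y constant
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, k=1e-6):
    # central difference in y, holding x constant
    return (f(x, y + k) - f(x, y - k)) / (2 * k)

def fx_by_hand(x, y):
    return 4 * x**3 + 2 * x * y**2 - y * math.cos(x)

def fy_by_hand(x, y):
    return 2 * x**2 * y - math.sin(x)
```

For instance, comparing `partial_x(f, 1.0, 2.0)` with `fx_by_hand(1.0, 2.0)` should show agreement to several decimal places if the hand computation is right.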

Example: Find fx(x, y) and fy(x, y) for $f(x, y) = \frac{x + e^y}{x - y}$.

Example: Find fx(x, y) and fy(x, y) for f(x, y) = ln(2x + 3y + xy).

• To find a partial derivative at a point (a, b), differentiate and evaluate at (a, b). And keep in mind what that gets you: fx(a, b) gives the slope of the line tangent to the surface at (a, b, f(a, b)), in the plane y = b; fy(a, b) gives the slope of the line tangent to the surface at (a, b, f(a, b)), in the plane x = a.

Example: Find fx(1, 2) and fy(1, 2) for $f(x, y) = x^y$.

Hint: think about this one a bit; there are two different differentiation rules applying, depending on which variable is constant. If you’re holding y constant, the appropriate rule is the power rule ($x^n$). If you’re holding x constant, the appropriate rule is the exponential rule ($a^x$, or in this case, $a^y$).

• We can define partial derivatives for functions of more than two variables: differentiate with respect to one variable while holding the others constant. For example, for a function f(x, y, z),

$$f_z(x, y, z) = \lim_{h \to 0} \frac{f(x, y, z + h) - f(x, y, z)}{h}$$

(hold x and y constant while differentiating with respect to z).

Example: Find fx(x, y, z), fy(x, y, z), and fz(x, y, z) for $f(x, y, z) = e^{2x + y + z^2}$.


• We can define higher order derivatives. There are four second partial derivatives:

* $f_{xx} = (f_x)_x$: Take the function fx and differentiate it with respect to x.

* $f_{xy} = (f_x)_y$: Take the function fx and differentiate it with respect to y.

* $f_{yx} = (f_y)_x$: Take the function fy and differentiate it with respect to x.

* $f_{yy} = (f_y)_y$: Take the function fy and differentiate it with respect to y.

fxy and fyx are referred to as mixed partial derivatives.

There’s a posted live example for higher order derivatives.

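• Notation note: if you’re using the differentiation operators, pay attention to the difference in ordering of the x’s and y’s (you’ll notice it in the mixed partials):

* $f_{xx} = \frac{\partial}{\partial x}(f_x) = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2}$

* $f_{xy} = \frac{\partial}{\partial y}(f_x) = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y \, \partial x}$

* $f_{yx} = \frac{\partial}{\partial x}(f_y) = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x \, \partial y}$

* $f_{yy} = \frac{\partial}{\partial y}(f_y) = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2}$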

• Clairaut’s Theorem (equality of mixed partials): Suppose f is defined on a disk D that contains the point (a, b). If the functions fxy and fyx are continuous at (a, b), then

$$f_{xy}(a, b) = f_{yx}(a, b)$$
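Clairaut’s Theorem can be illustrated numerically by differentiating in the two orders and comparing. The sketch below uses f(x, y) = x⁴ + x²y² − y sin x (borrowed from an earlier example; its mixed partial works out to 4xy − cos x) and nested central differences. The helpers `d_dx` and `d_dy` are hypothetical names for this illustration, and the step size is an arbitrary choice.

```python
import math

def f(x, y):
    return x**4 + x**2 * y**2 - y * math.sin(x)

def d_dx(g, h=1e-4):
    # returns a function approximating the partial derivative of g with respect to x
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, h=1e-4):
    # returns a function approximating the partial derivative of g with respect to y
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)

f_xy = d_dy(d_dx(f))   # differentiate in x first, then in y
f_yx = d_dx(d_dy(f))   # differentiate in y first, then in x

def mixed_by_hand(x, y):
    return 4 * x * y - math.cos(x)
```

At a point such as (1, 2), both orders should agree with each other and with the hand-computed value up to the accuracy of the difference scheme.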

• Higher order partial derivatives, e.g. $f_{xxx}$ or $f_{xyx}$, are defined analogously. (Example posted)


Differentials

Review of differentials for single variable functions

The idea behind differentials goes back to the secant line vs. tangent line problem. Initially, we noted that the slope of a secant line between two points (a, f(a)) and (b, f(b)) on a curve, with b = a + h, is given by

$$m_{sec} = \frac{f(a + h) - f(a)}{h} = \frac{f(b) - f(a)}{b - a}$$

and the slope of a line tangent to the curve at (a, f(a)) is given by

$$m_{tan} = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} = f'(a)$$

One of the first explorations of differentiation (before learning the various rules), was to compute the slopes of various secant lines, and notice that as Δx = h grew smaller and smaller, the slope values approached the value of the slope of the tangent. In other words,

For small values of Δx = h, the slope of the secant line between (a, f(a)) and (b, f(b)) provides a decent approximation to the slope of the tangent line at (a, f(a)).

Now that we’ve got some fairly easy ways to compute f′(x) for any function f(x), we can turn this idea around:

For small values of Δx = h, the slope of the tangent line at (a, f(a)) provides a decent approximation to the slope of the secant line between (a, f(a)) and (b, f(b)).

In addition,

A line tangent to a curve provides a decent approximation to the curve itself near the point of tangency.

In functions of a single variable, we manipulate the notation

$$\frac{dy}{dx} = f'(x)$$

to obtain

$$dy = f'(x)\,dx$$

The quantities dy and dx on their own are strange little things: they are infinitesimally small quantities called differentials. The most useful way to interpret them in the current context is that they represent the change in y and the change in x along the tangent line of the function. If we let dx = Δx (a change in the horizontal displacement), from a fixed point x = a, we get

Δy ≈ dy = f′(x)dx

Δx represents a change in x (usually relative to some fixed point x = a).

Δy represents the corresponding change in the function value: Δy = f(a + Δx) − f(a).

Δx and Δy are called increments of x and y.

dy/dx gives the slope of the tangent (again, generally at a fixed x = a).

The individual symbols dx and dy are called differentials, and are related by dy = f′(x)dx (or dy = f′(a)dx at x = a).

When we let dx = Δx, then Δy ≈ dy.
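To see how close dy is to Δy in practice, here is a small sketch with a hypothetical function chosen for illustration, f(x) = √x at a = 4 with Δx = 0.1 (none of these values come from the notes). The differential uses only f′(4) = 1/4, while the increment needs the actual function value at 4.1.

```python
import math

def f(x):
    return math.sqrt(x)

def f_prime(x):
    return 1 / (2 * math.sqrt(x))

def increment(a, dx):
    # delta y: actual change in the function value
    return f(a + dx) - f(a)

def differential(a, dx):
    # dy: change along the tangent line at x = a
    return f_prime(a) * dx

a, dx = 4.0, 0.1
delta_y = increment(a, dx)
dy = differential(a, dx)
gap = abs(delta_y - dy)
```

Shrinking dx should shrink the gap between the two quantities much faster than dx itself shrinks, which is what makes the tangent line a useful stand-in near the point of tangency.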


The chain rule

The chain rule - part 1

• The chain rule for functions of one variable describes the relationship between the derivatives of functions in a composition. Suppose y = f(x) and x = g(t). Then y = f(g(t)) is a function of t, and

$$y'(t) = f'(g(t))\,g'(t)$$

$$\frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt}$$

Example: If $y = x^2$ and x = sin t, what is $\frac{dy}{dt}$?

• In the previous example, we could also have composed before differentiating:

$$y(t) = f(x(t)) = f(\sin t) = (\sin t)^2$$

so

$$y'(t) = 2(\sin t)^1(\cos t) = 2 \sin t \cos t$$

That’s still using the chain rule, just without explicitly writing out the steps (you’re using the chain rule every time you differentiate a composition and think “derivative of the outside times derivative of the inside”).

• There are several versions of the chain rule for multivariable functions. To figure out a chain rule that makes sense for a given function, you need to identify

– the independent variables (which are they and how many are there?)

– the intermediate variables (which are they and how many are there?)

– the dependent variable (ultimately, what are you differentiating?)


Case 1:

f = f(x, y), x = x(t) and y = y(t)

This is best explained by example ...

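Example: Suppose $f(x, y) = x^2 - y^3$

• Here, x and y appear to be independent variables, and z = f(x, y) is the dependent variable. Because x and y are independent of each other, it makes sense to talk about the rate of change of z with respect to x, or z with respect to y. These give us the partial derivatives

$$f_x(x, y) = \frac{\partial f}{\partial x} \text{ or } \frac{\partial z}{\partial x}, \quad \text{and} \quad f_y(x, y) = \frac{\partial f}{\partial y} \text{ or } \frac{\partial z}{\partial y}$$

Find $\frac{\partial z}{\partial x} = f_x(x, y)$ and $\frac{\partial z}{\partial y} = f_y(x, y)$: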

• Now, what if x and y themselves are dependent on another variable, say t? Let

$$x = x(t) = \sin t, \qquad y = y(t) = \ln t$$

Here, t is the independent variable. Since both x and y are functions of a single variable, we can differentiate them with respect to t (notice these aren’t partial derivatives!):

Find $\frac{dx}{dt}$ and $\frac{dy}{dt}$:

• All together, we have $z = f(x, y) = x^2 - y^3$, x = x(t) = sin t, and y = y(t) = ln t.

Perform the composition z(t) = f(x(t), y(t)):

How many variables is z ultimately a function of, and what derivative makes sense? Find that derivative:

• Now, see how all the pieces fit together. We have

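$$x = \sin t \quad \text{and} \quad y = \ln t$$

$$\frac{dx}{dt} = \cos t \quad \text{and} \quad \frac{dy}{dt} = \frac{1}{t}$$

$$\frac{\partial z}{\partial x} = 2x \quad \text{and} \quad \frac{\partial z}{\partial y} = -3y^2$$

$$\frac{dz}{dt} = 2(\sin t)(\cos t) - 3(\ln t)^2\left(\frac{1}{t}\right)$$

$$\frac{dz}{dt} = 2x\,\frac{dx}{dt} - 3y^2\,\frac{dy}{dt}$$

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\,\frac{dx}{dt} + \frac{\partial z}{\partial y}\,\frac{dy}{dt}$$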

• The chain rule for

$$z = f(x, y), \quad x = x(t) \quad \text{and} \quad y = y(t)$$

is given by

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\,\frac{dx}{dt} + \frac{\partial z}{\partial y}\,\frac{dy}{dt}$$

(or $\frac{df}{dt} = \frac{\partial f}{\partial x}\,\frac{dx}{dt} + \frac{\partial f}{\partial y}\,\frac{dy}{dt}$, depending on how you prefer to notate).

z or f is the dependent variable, t is the independent variable, and x and y are the intermediate variables. z is a multivariable function of its intermediate variables, and ultimately a single variable function of the independent variable.

A proof of this chain rule is attached.
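The worked example z = x² − y³ with x = sin t and y = ln t can be checked numerically: compute dz/dt once by assembling the chain rule pieces, and once by differencing the composed function z(t) directly. The function names below are made up for this sketch, and the central difference is just a convenient numerical stand-in for the derivative.

```python
import math

def z_of_t(t):
    # the composition z(t) = f(x(t), y(t))
    x, y = math.sin(t), math.log(t)
    return x**2 - y**3

def dz_dt_chain(t):
    # chain rule: dz/dt = (dz/dx)(dx/dt) + (dz/dy)(dy/dt)
    x, y = math.sin(t), math.log(t)
    dz_dx, dz_dy = 2 * x, -3 * y**2
    dx_dt, dy_dt = math.cos(t), 1 / t
    return dz_dx * dx_dt + dz_dy * dy_dt

def dz_dt_numeric(t, h=1e-6):
    # central difference applied to the composed function
    return (z_of_t(t + h) - z_of_t(t - h)) / (2 * h)
```

The two routes should agree at any t > 0 (where ln t is defined), up to the accuracy of the difference quotient.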

Example: Use the chain rule to find $\frac{dz}{dt}$ for $z = f(x, y) = 4e^{xy}$, with $x = 5t^3$ and $y = e^t$.

• We can generalize this case of the chain rule to multivariable functions of n variables, where the n variables are still all functions of a single variable (say t), and ultimately, the function depends on t:

$$f = f(x_1, x_2, \ldots, x_n) \quad \text{with} \quad x_1 = x_1(t),\ x_2 = x_2(t),\ \ldots,\ x_n = x_n(t)$$

$$\frac{df}{dt} = \frac{\partial f}{\partial x_1}\,\frac{dx_1}{dt} + \frac{\partial f}{\partial x_2}\,\frac{dx_2}{dt} + \cdots + \frac{\partial f}{\partial x_n}\,\frac{dx_n}{dt}$$

Example: Find $\frac{dw}{dt}$ for $w = f(x, y, z) = x^2 yz + yz^3$, $x = e^t$, y = 2t, z = cos t.

The chain rule

Implicit differentiation

The chain rule can be used for differentiating implicitly. Before we go there, take a look at

Case 1a

z = f = f(x, y), y = y(x)

• This is a variation on case 1: Suppose you have a function of two variables, z = f(x, y). Now, instead of x and y both dependent on another parameter t, suppose that x is independent, but y is dependent on x. So z = f(x, y) is really z = f(x, y(x)), and is at the bottom level a function of one variable x. It now makes sense to talk about $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$, but also to talk about $\frac{dz}{dx}$. Applying the chain rule as in case 1:

$$\frac{dz}{dx} = \frac{\partial z}{\partial x}\,\frac{dx}{dx} + \frac{\partial z}{\partial y}\,\frac{dy}{dx}$$

$$\frac{dz}{dx} = \frac{\partial z}{\partial x} + \frac{\partial z}{\partial y}\,\frac{dy}{dx}$$

You’ll also frequently see it written in the following way:

$$f' = f_x + f_y\,y'$$

• One context where this occurs is in the study of differential equations in the form

y′ = f(x, y)

such as y′ = x sin y or $y' = x^2 + y^2$, where y (implicitly) depends on x. Given this kind of setup for y′, it’s possible to find y′′ and higher derivatives; note

$$y' = f(x, y) \;\Rightarrow\; y'' = f'(x, y) = f_x + f_y\,y'$$

so

$$y'' = f_x + f_y f$$

Example: For y′ = x sin y, compute y′′ using $y'' = f_x + f_y f$.

Implicit differentiation

• We could have accomplished the same thing using implicit differentiation, but the chain rule can be a good bit easier for messy expressions. In fact, the chain rule is going to give us a nice alternative to implicit differentiation. The first thing you want to do here is make sure you recall how to differentiate implicitly. If we have an equation in x and y, where y is understood to be an implicit function of x, we use the chain rule (and product and quotient rules as needed) to find $\frac{dy}{dx}$.

Example: Use implicit differentiation to find $\frac{dy}{dx}$ for $x^2 - x \sin y = y^3$.

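• The chain rule with partial derivatives gives us an alternate way to find $\frac{dy}{dx}$. Suppose you have an expression

$$F(x, y) = 0$$

where x is independent, and y is dependent on x, as in case 1a. Differentiate both sides, using the chain rule to find $\frac{dF}{dx}$:

$$\frac{d}{dx} F(x, y) = \frac{d}{dx} 0 \;\Rightarrow\; \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\,\frac{dy}{dx} = 0$$

And solve for $\frac{dy}{dx}$:

$$\frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\,\frac{dy}{dx} = 0 \;\Rightarrow\; \frac{\partial F}{\partial y}\,\frac{dy}{dx} = -\frac{\partial F}{\partial x} \;\Rightarrow\; \frac{dy}{dx} = -\frac{\partial F / \partial x}{\partial F / \partial y} = -\frac{F_x}{F_y}$$

• So, a convenient formula for finding $\frac{dy}{dx}$, when y is implicitly defined as a function of x, is

$$\frac{dy}{dx} = -\frac{F_x}{F_y}$$

when the expression relating x and y is in the form F(x, y) = 0.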

Example: Use the above formula to find $\frac{dy}{dx}$ for $x^2 - x \sin y = y^3$.
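A numerical cross-check of the formula for this example is sketched below, assuming the rearrangement F(x, y) = x² − x sin y − y³ = 0, so that Fx = 2x − sin y and Fy = −x cos y − 3y². The code solves F(x, y) = 0 for y by bisection (the bracket [0, 1] is an assumption that happens to work for x near 1, where F(x, 0) > 0 > F(x, 1)), then compares −Fx/Fy with the slope of the solved curve. All helper names are made up for illustration.

```python
import math

def F(x, y):
    return x**2 - x * math.sin(y) - y**3

def Fx(x, y):
    return 2 * x - math.sin(y)

def Fy(x, y):
    return -x * math.cos(y) - 3 * y**2

def solve_y(x, lo=0.0, hi=1.0, iterations=60):
    # bisection on [lo, hi]; assumes F(x, lo) > 0 > F(x, hi), which holds for x near 1
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if F(x, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def slope_formula(x):
    # dy/dx = -Fx/Fy evaluated on the curve
    y = solve_y(x)
    return -Fx(x, y) / Fy(x, y)

def slope_numeric(x, h=1e-5):
    # central difference of the numerically solved y(x)
    return (solve_y(x + h) - solve_y(x - h)) / (2 * h)
```

Both slopes are evaluated at a point that actually lies on the curve, which matters: the formula −Fx/Fy only has meaning where F(x, y) = 0 and Fy is nonzero.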

• $\frac{dy}{dx} = -\frac{F_x}{F_y}$ isn’t going to work under all conditions; the obvious one is that we have a problem if Fy = 0.

• The Implicit Function Theorem gives the hypotheses under which this formula holds:

If F(x, y) is defined on a disk containing (a, b), F(a, b) = 0, Fx and Fy are continuous on the disk, and Fy(a, b) ≠ 0, then F(x, y) = 0 implicitly defines y as a function of x near (a, b), and the differentiation formula holds.¹

• A similar derivation gives us some formulas for implicit partial differentiation. Suppose we have an expression relating x, y, and z, where z is implicitly a multivariable function of x and y. If we rearrange the expression into the form F(x, y, z) = 0, we have

$$\frac{\partial z}{\partial x} = -\frac{F_x}{F_z} \quad \text{and} \quad \frac{\partial z}{\partial y} = -\frac{F_y}{F_z}$$

Example: Suppose z is implicitly a function of x and y, and $xyz - 3x^2 = \sin z$. Find $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$.

¹ A proof of the Implicit Function Theorem is beyond the scope of this class; see for example Rudin, Principles of Mathematical Analysis, McGraw-Hill.

The chain rule

The chain rule, part 2

• We can construct more cases of the chain rule for functions of more than one variable whose inputs are themselves functions of more than one variable, say f = f(x, y), x = x(s, t), y = y(s, t). If we substitute in, we have that f = f(x(s, t), y(s, t)); i.e. f is still a function of two variables, s and t. In this case, we would call s and t the independent variables, x and y the intermediate variables, and f (or z) the dependent variable.

• Instead of memorizing a bunch of variations, the trick is to ask yourself for any given problem,“what derivatives make sense?” Drawing a chain rule diagram can be helpful.

Example: Let

$$z = f(x, y) = x^2 + y^3, \quad x = \sin(t + s), \quad y = \cos(t - s)$$

• What is z ultimately a function of?

• What derivatives of z does it now make sense to compute?

• What do you think the correct chain rules would be? Make a diagram.

• Compute the partials with respect to s and t of z:

Example: Let

$$w = f(x, y, z) = x^2 + xyz, \quad x = \sin(t + s), \quad y = 2t, \quad z = 3s$$

• What is w ultimately a function of?

• What derivatives of w does it now make sense to compute?

• What do you think the correct chain rules would be? Make a diagram.

• Compute the partials with respect to s and t of w:

Applications of partial differentiation

Tangent lines in the planes x = a and y = b

All of this bit is really just piecing together stuff you already know. Recall...

• fx(a, b) gives the slope of the line tangent to the graph of f at (a, b, f(a, b)) in the plane y = b. This line is in the direction of (parallel to) the x axis.

• fy(a, b) gives the slope of the line tangent to the graph of f at (a, b, f(a, b)) in the plane x = a. This line is in the direction of (parallel to) the y axis.

• To write the equation of a line in 3D space, we need a point on the line (which we have; it’s (a, b, f(a, b))), and a vector which is parallel to the line. The question is, how to convert the values of fx and fy (slopes) into vectors?

• In 2D, we can go from slope to vector easily, by considering slope as a ratio m/1, sketching a triangle with those proportions, and writing as a vector. A line with slope m has direction vector < 1, m >. We can still do this in the particular 3D cases under consideration, because in each case, we are holding one dimension constant in a plane.

• The slope fx(a, b) is the slope of the trace in the plane y = b. Expressed in 3D, the rise/run are for z and x, while y is held constant. A direction vector for a tangent line in the plane y = b with this slope is given by

< 1, 0, fx(a, b) >

• The slope fy(a, b) is the slope of the trace in the plane x = a. Expressed in 3D, the rise/run are for z and y, while x is held constant. A direction vector for a tangent line in the plane x = a with this slope is given by

< 0, 1, fy(a, b) >

• Remember that the point-parallel form of a line is given by

< x, y, z > = < x0, y0, z0 > + t < v1, v2, v3 >

We now have sufficient information to write the equations of tangent lines passing through (a, b, f(a, b)) and parallel to the x and y axes.

Point: < x0, y0, z0 > = < a, b, f(a, b) >

Vector: either < v1, v2, v3 > = < 1, 0, fx(a, b) > (parallel to x axis) or < v1, v2, v3 > = < 0, 1, fy(a, b) > (parallel to y axis).

Example: Write the equation of the line which is parallel to the x axis and tangent to the surface

$$f(x, y) = x^2 - 3y + e^y$$

at the point (2, 0, f(2, 0)).

Example: Write the equation of the line which is parallel to the y axis and tangent to the surface

$$f(x, y) = x^2 - 3y + e^y$$

at the point (2, 0, f(2, 0)).


Tangent plane and normal line

• We have seen that the vectors < 1, 0, fx(a, b) > and < 0, 1, fy(a, b) > are tangent to the surface z = f(x, y) at (a, b, f(a, b)) in the direction of the x and y axes, respectively.

• The plane which passes through the point (a, b, f(a, b)) and contains these vectors is the tangent plane to the surface z = f(x, y) at that point. (It also contains all other vectors/lines which are tangent to the surface at this point, but we only need two vectors to define the plane.)

• It is simple to derive the equation for the tangent plane at (a, b, f(a, b)): We’ve got a point, we need a normal, and the normal is obtained by crossing the vectors which lie in the plane. Do it.

< 1, 0, fx > × < 0, 1, fy > = ?

And now, use the point-normal equation for a plane to write the equation of the tangent plane:

Example: Write the equation of the plane tangent to the surface

$$f(x, y) = x^2 - 3y + e^y$$

at the point (2, 0, f(2, 0)).

• While we have a normal vector lying around, we might as well write the equation of the normal line passing through the surface at (a, b, f(a, b)).

This line has the vector we just found as its direction vector, and its equation is:

< x, y, z > = < a, b, f(a, b) > + t < −fx(a, b), −fy(a, b), 1 >

or

< x, y, z > = < a, b, f(a, b) > + t < fx(a, b), fy(a, b), −1 >
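The cross product and the resulting plane are easy to check by machine once you have done them by hand. The sketch below uses the surface f(x, y) = x² − 3y + e^y at (2, 0) from the example above, with hand-computed partials fx = 2x and fy = −3 + e^y. The point-normal equation with normal < −fx, −fy, 1 > rearranges to z = f(a, b) + fx(a, b)(x − a) + fy(a, b)(y − b), which is what `tangent_plane` evaluates. The helper names are illustrative only.

```python
import math

def cross(u, v):
    # cross product of two 3D vectors given as tuples
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def f(x, y):
    return x**2 - 3 * y + math.exp(y)

def fx(x, y):
    return 2 * x

def fy(x, y):
    return -3 + math.exp(y)

a, b = 2.0, 0.0
# normal to the tangent plane: <1, 0, fx> x <0, 1, fy>, which should have the form <-fx, -fy, 1>
normal = cross((1, 0, fx(a, b)), (0, 1, fy(a, b)))

def tangent_plane(x, y):
    # z-value on the tangent plane at (a, b, f(a, b))
    return f(a, b) + fx(a, b) * (x - a) + fy(a, b) * (y - b)

def normal_line(t):
    # point on the normal line through (a, b, f(a, b))
    return (a + t * normal[0], b + t * normal[1], f(a, b) + t * normal[2])
```

Near (2, 0) the plane should track the surface closely, for example `tangent_plane(2.01, 0.01)` against `f(2.01, 0.01)`.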

Example: Write the equation of the plane tangent to the surface and the normal line passing through the surface for

$$z = \sqrt{4 - x^2 - 2y^2}$$

at the point (1, −1, 1).

• The tangent plane is the graph of a linear function. It is the linear function that most closely resembles f(x, y) at the point (a, b). At that point, the plane and the function have the same (1) function value, and (2) partial derivatives with respect to x and y. As we move away from the point (a, b), the plane and the function no longer coincide, but in a small neighborhood around (a, b), the tangent plane provides a good approximation to the function. For these reasons, the equation of the tangent plane is also called the linearization of f(x, y) at (a, b), or the linear approximation to f(x, y) at (a, b). We can then say:

* z = f(a, b) + fx(a, b)(x − a) + fy(a, b)(y − b) (equation of tangent plane)

* L(x, y) = f(a, b) + fx(a, b)(x − a) + fy(a, b)(y − b) (linearization)

* f(x, y) ≈ f(a, b) + fx(a, b)(x − a) + fy(a, b)(y − b) (linear approximation)

They all say the same thing, and are interchangeable.

Example: Find the linearization of

$$f(x, y) = xe^{2y} + \sin y$$

at the point (0, 0, 0).

Example: Use the linear approximation of $f(x, y) = xe^{2y} + \sin y$ at the point (0, 0, 0) to compute the value of f(−.05, .03). Compare the approximate value to the ‘exact’¹ value.

This would be the point of having linear approximations: I have a value that estimates the function value ... and required no calculation other than adding two numbers.
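To see the comparison in numbers, the sketch below assumes the hand-computed values f(0, 0) = 0, fx(0, 0) = 1, and fy(0, 0) = 1 (from fx = e^{2y} and fy = 2xe^{2y} + cos y), so the linearization reduces to L(x, y) = x + y. The ‘exact’ value is whatever floating point arithmetic gives.

```python
import math

def f(x, y):
    return x * math.exp(2 * y) + math.sin(y)

def linearization(x, y):
    # L(x, y) = f(0,0) + fx(0,0) x + fy(0,0) y, with f(0,0) = 0, fx(0,0) = 1, fy(0,0) = 1
    return 0.0 + 1.0 * x + 1.0 * y

approx = linearization(-0.05, 0.03)
exact = f(-0.05, 0.03)
error = abs(exact - approx)
```

The approximation lands within a few thousandths of the computed value, which is typical this close to the point of tangency; the gap grows as (x, y) moves away from (0, 0).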

• We have seen the tangent plane before in the context of differentials (so now, we’ve derived where that equation came from). Go back and look at that again. And note, when we say that the tangent plane provides an approximation to the function at (a, b), we are assuming that the function is differentiable (as defined in that section) at (a, b); otherwise, all bets are off.

¹ ‘exact’ here would mean ‘however many digits your calculator gives you’


Directional derivatives

• Recall that we can specify direction (in the plane or in space) in terms of a unit vector. In thexy plane, “in the direction of the x axis” could be expressed as “in the direction of the vectori =< 1, 0 >, and “in the direction of the y axis” could be expressed as “in the direction of thevector j =< 0, 1 >

• When we compute the partial derivative fx for f(x, y), we get a function that, when evaluated at afixed point, gives us the rate of change in f as we move parallel to the x-axis (i.e. the slope of thetangent line along the surface parallel to the x-axis). We can think of this as the rate of change inthe direction of the vector < 1, 0 >. An equivalent way to write the formula for fx would be

fx(x, y) = lim_{h→0} [f(< x, y > + h < 1, 0 >) − f(< x, y >)] / h

Similarly, fy gives us the rate of change in the direction of < 0, 1 > (parallel to the y axis):

fy(x, y) = lim_{h→0} [f(< x, y > + h < 0, 1 >) − f(< x, y >)] / h

• Suppose we wish to consider the rate of change in the function f along some other unit vector u =< u1, u2 >. Geometrically, for f = f(x, y), you can think of this as asking how does the function change as we walk out on a line which is not necessarily parallel to the x-axis or y-axis, and whose direction is given by a unit vector u.

• We refer to this quantity as the derivative of f in the direction of u, or the directional derivative of f along u, and define it by

Duf = lim_{h→0} [f(x + hu1, y + hu2) − f(x, y)] / h

or

Duf = lim_{h→0} [f(< x, y > + hu) − f(< x, y >)] / h

(As written, this expression is a scalar valued function of x and y; we could then evaluate the derivative at a specific point.)


• As you’d expect, we don’t generally want to calculate the directional derivative from the definition. It turns out that it can be computed very conveniently:

Duf =< fx, fy > ·u = fx(a, b)u1 + fy(a, b)u2 (at the point (a, b, f(a, b)))

= fx(x, y)u1 + fy(x, y)u2 (as a function of x and y)

(derivation attached)

• If the specified direction vector is not a unit vector, be sure to normalize it.

Example: What is the directional derivative of the function

f(x, y) = x^2 y − 3xy^2 + 5

in the direction of v =< 1,−2 > at the point (−1, 3, f(−1, 3))?

• Compute fx and fy and evaluate at (−1, 3) (express as a vector < fx, fy >):

• Normalize v =< 1,−2 > to obtain a unit vector u = v / ||v||:

• And compute Duf =< fx, fy > ·u:

• What you just found: the slope of the line tangent to the surface f(x, y) at the point (a, b), in the direction of the vector u.
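The three steps above can be sketched in Python as a way to check hand work. The function names are illustrative, the partials are the ones obtained by hand (fx = 2xy − 3y^2, fy = x^2 − 6xy), and the finite-difference quotient at the end is only a rough approximation of the limit definition, not part of the method.

```python
import math

def f(x, y):
    return x**2 * y - 3 * x * y**2 + 5

def grad_f(x, y):
    # Hand-computed partials: fx = 2xy - 3y^2, fy = x^2 - 6xy
    return (2 * x * y - 3 * y**2, x**2 - 6 * x * y)

def directional_derivative(a, b, v):
    # Normalize v, then dot the gradient at (a, b) with the unit vector
    norm = math.hypot(v[0], v[1])
    u = (v[0] / norm, v[1] / norm)
    fx, fy = grad_f(a, b)
    return fx * u[0] + fy * u[1]

d = directional_derivative(-1, 3, (1, -2))

# Rough check against the limit definition with a small h
h = 1e-6
u = (1 / math.sqrt(5), -2 / math.sqrt(5))
d_numeric = (f(-1 + h * u[0], 3 + h * u[1]) - f(-1, 3)) / h
print(d, d_numeric)
```

Both printed values should agree to several digits; the first is the dot-product formula, the second the difference quotient.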


• A vector in the direction of this line is < u1, u2, Duf >. This allows us to write the equation of the tangent line using the point and direction vector.

< x, y, z >=< a, b, f(a, b) > +t < u1, u2, Duf >

Example: For the function f(x, y) = x^2 y − 3xy^2 + 5, write the equation of the line tangent to the surface at (−1, 3, f(−1, 3)) in the direction of v =< 1,−2 >.

• We can also specify direction as an angle θ, measured from the positive x axis. In this case, the unit vector u is obtained from

u =< cos θ, sin θ >


Example: Find the directional derivative of

f(x, y) = ln(x^2 + y^2)

at the point (3, 4, f(3, 4)), in the direction indicated by angle θ = π/3. Write the equation of the line tangent to the surface at that point and in that direction.
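A short Python sketch of this kind of problem, for checking hand work. It assumes the hand-computed gradient of ln(x^2 + y^2), namely < 2x/(x^2 + y^2), 2y/(x^2 + y^2) >; the name tangent_line is illustrative.

```python
import math

a, b = 3.0, 4.0
theta = math.pi / 3

# Unit vector from the angle: u = <cos θ, sin θ>
u = (math.cos(theta), math.sin(theta))

# Gradient of ln(x^2 + y^2) at (a, b), computed by hand
r2 = a**2 + b**2
grad = (2 * a / r2, 2 * b / r2)

# Directional derivative = slope of the tangent line in direction u
slope = grad[0] * u[0] + grad[1] * u[1]

def tangent_line(t):
    # <x, y, z> = <a, b, f(a, b)> + t <u1, u2, Duf>
    return (a + t * u[0], b + t * u[1], math.log(r2) + t * slope)

print(slope, tangent_line(0.0), tangent_line(1.0))
```

At t = 0 the line passes through the point of tangency; other values of t move along the line in the direction of u.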



The gradient vector

• Recall that the directional derivative Duf for a function f(x, y) is calculated by

Duf =< fx, fy > · < u1, u2 >

The quantity < fx, fy > (a vector formed from the first partial derivatives of f) gets its own notation (among other things, it’ll make writing formulas for functions of more than two variables a bit more compact). We call it the gradient vector.

• The gradient vector of a function f(x, y) is denoted by ∇f , and is defined

∇f =< fx, fy >

Depending on the context, this can be a vector valued function:

∇f(x, y) =< fx(x, y), fy(x, y) >

or, evaluated at a point (a, b), a single vector:

∇f(a, b) =< fx(a, b), fy(a, b) >

For a function of three variables, f(x, y, z), we have

∇f =< fx, fy, fz >

and so on.

Example: For f(x, y) = x^3 e^y + 2y^2, find ∇f(x, y) and ∇f(1, 0).

• What does ∇f give us? First off, another way to compactly express the formula for calculating Duf :

Duf = ∇f · u

But there are some additional properties associated with the gradient vector...

• Note that the gradient vector (at a point) is a 2D vector associated with a 3D surface. We can move it around in space as needed, but it’s convenient to visualize it as a direction vector in the xy plane, or in a plane parallel to the xy plane, in the same way that we think of u when looking at directional derivatives.


• Consider the following: For any direction u, the derivative in that direction is calculated by

Duf = ∇f · u

By the definition of the dot product, this becomes

Duf = ||∇f || ||u|| cos θ

where θ is the angle between ∇f and u.

Since u is by definition a unit vector, it has magnitude 1, and

Duf = ||∇f || cos θ

* What θ would give the maximum value for Duf?

* What does this mean?

• The gradient vector is associated with the maximum increase in the surface:

* v = ∇f gives the direction of maximum increase (where the surface has the steepest positive slope).

* u = ∇f / ||∇f || is the unit vector in that direction.

* The slope in this direction is given by

Duf = ∇f · u = (∇f · ∇f) / ||∇f || = ||∇f ||^2 / ||∇f ||

Duf = ||∇f ||

The value of the directional derivative in the direction of the gradient vector is the same as the norm of the gradient vector.


* Looking in the opposite direction, v = −∇f gives the direction of maximum decrease (with unit vector u = −∇f / ||∇f ||).

* And the slope in that direction is Duf = −||∇f ||

Example: In what direction does the function f(x, y) = x^2 y − sin(2y) + 5 increase most rapidly, from the point (1, π, f(1, π))? Decrease most rapidly from that point? What are the rates of increase and decrease (slopes)?
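A minimal Python sketch for checking an answer to this kind of question. It assumes the hand-computed partials fx = 2xy and fy = x^2 − 2cos(2y); the variable names are illustrative.

```python
import math

def grad_f(x, y):
    # f = x^2 y - sin(2y) + 5, so fx = 2xy and fy = x^2 - 2cos(2y)
    return (2 * x * y, x**2 - 2 * math.cos(2 * y))

g = grad_f(1.0, math.pi)      # direction of maximum increase
rate = math.hypot(*g)         # ||grad f|| = slope in that direction

u_up = (g[0] / rate, g[1] / rate)   # unit vector, steepest increase
u_down = (-u_up[0], -u_up[1])       # unit vector, steepest decrease

print("gradient:", g)
print("max rate of increase:", rate)
print("max rate of decrease:", -rate)
```

The rates of increase and decrease differ only in sign, as the bullets above predict.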

• Recall that the level curves of a function z = f(x, y) are obtained by setting f(x, y) = k, k constant. They represent the traces of the surface in the planes z = k. As such, they can be thought of as curves along which the rate of increase of the function is 0. How do gradient vectors relate to level curves? Derive it:

* For an expression f(x, y) = k, what is dy/dx? Think in terms of implicit differentiation using partials of F(x, y) = f(x, y) − k.

* So, at a fixed point (a, b), what is the slope of the tangent?

* Translate that slope into a tangent vector:


* What is the gradient of f at (a, b)?

* What do you notice about how they are related? (Try dotting them.)

* And conclude:

Example: For the function z = f(x, y) = x^2 + 4y^2, sketch the level curve corresponding to k = 4, and find (and sketch) normal vectors (to the level curve) at various points along the curve.
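The relationship between gradients and level curves can be spot-checked numerically. The sketch below assumes a parameterization of the level curve x^2 + 4y^2 = 4 as (2 cos t, sin t), which is a choice made here for illustration and does not appear in the notes.

```python
import math

def grad_f(x, y):
    # f = x^2 + 4y^2, so grad f = <2x, 8y>
    return (2 * x, 8 * y)

dots = []
for k in range(8):
    t = 2 * math.pi * k / 8
    point = (2 * math.cos(t), math.sin(t))      # lies on x^2 + 4y^2 = 4
    tangent = (-2 * math.sin(t), math.cos(t))   # derivative of the parameterization
    normal = grad_f(*point)                     # gradient at that point
    dots.append(normal[0] * tangent[0] + normal[1] * tangent[1])

print(dots)  # each entry should be zero up to rounding
```

A zero dot product at every sampled point is consistent with the gradient being normal to the level curve.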



Notation for higher dimensions

• We generally work with f(x, y) as the basic example for defining concepts such as derivatives, partial derivatives, the gradient, and so on - easy to write, easy to visualize. It’s easiest to extend these concepts by analogy - if I say that we define ∇f(x, y) as

∇f(x, y) =< fx(x, y), fy(x, y) >

and that analogously for a function f(x, y, z), we have

∇f(x, y, z) =< fx(x, y, z), fy(x, y, z), fz(x, y, z) >

you’ll be able to tell me immediately that for a function of four variables, f(w, x, y, z),

∇f(w, x, y, z) =< fw(w, x, y, z), fx(w, x, y, z), fy(w, x, y, z), fz(w, x, y, z) >

The concept here is “I make a vector out of the partial derivatives,” and you can figure out how that works no matter how many variables we’ve got. The only drawback is that the notation isn’t very general - I’d write yet another formula for a function of five variables (and eventually I’d start to run out of letters!).

• First off, if I’m going to be working with a bunch of variables (or a non-specific number of variables), it’s better to work with subscripts rather than working through the alphabet - (x, y) becomes (x1, x2), (x, y, z) becomes (x1, x2, x3). A function such as

f(x, y, z) = sin(xy) − z^2

becomes

f(x1, x2, x3) = sin(x1 x2) − (x3)^2

This allows me to talk about a function of n variables in general as

f(x1, x2, ..., xn)

• Let’s look at this in the context of partial derivatives: if I were talking about a function f(x, y, z), I’d say that to get the partial with respect to x, hold y and z constant, and allow x to vary. To get the partial with respect to y, hold x and z constant, and allow y to vary, and so on.

Under the subscripted notation, f(x1, x2, x3), I’d say to get the partial with respect to the first variable, x1, hold x2 and x3 constant, and allow x1 to vary, and so on. Doesn’t seem like much of an improvement ... but it allows me to describe a partial derivative easily for a function of n variables in a general sense:

For a function f(x1, x2, ..., xn), to obtain the partial derivative with respect to the ith variable, xi, hold all the other variables constant and allow xi to vary.

Note we still aren’t up to the notation, but we’re getting there!


• We need vectors - there’s not a lot of distinction between a point (x1, x2, ..., xn) and the vector < x1, x2, ..., xn > pointing to that point. Using vector notation, I can say let x =< x1, x2, ..., xn >, and start referring to the function

f(x1, x2, ..., xn)

as simply

f(x)

Things suddenly got a lot more compact.

I could denote the partial derivative of f with respect to (for example) the second variable as

fx2(x) or ∂/∂x2 f(x) or ∂f/∂x2

and the partial with respect to the ith variable in general as

fxi(x) or ∂/∂xi f(x) or ∂f/∂xi

Example: Let x =< x1, x2, ..., x7 >, f(x) = x1x5 + 4x3 − cos(x7). What are ∂/∂x7 f(x) and ∂/∂x2 f(x)?

• We still don’t have quite enough notation to write the definitions of these things. Just one more thing though, and we’ll be ready. We have seen i and j used to denote the vectors < 1, 0 > and < 0, 1 > in R2, and i =< 1, 0, 0 >, j =< 0, 1, 0 > and k =< 0, 0, 1 > in R3. We’re running into the same issue as with the components in higher dimensions - we’re going to run out of letters, so we turn to subscripting:

• Let the vectors ei represent the elementary basis vectors for Rn

e1 =< 1, 0, 0, ..., 0 >

e2 =< 0, 1, 0, ..., 0 >

en =< 0, 0, 0, ..., 1 >

e.g., in R3, we have i = e1, j = e2, k = e3. The point is we can have as many of these as we need for whatever dimension n we’re working in.

We have a subscripted list of vectors ... whose individual components are subscripted. To refer to the third component of the second basis vector, I’d use (e2)3.


• The basis vectors can be described as vectors such that (ei)i = 1 (the ith component of the ith vector is one), and (ei)j = 0, j ≠ i (all the other components are zero).

• We now have the pieces in place to write the definitions of partial derivatives, the gradient, and directional derivatives in any number of variables in a compact form.

• Partial derivatives: For

f = f(x1, x2, ..., xn)

we let

x =< x1, x2, ..., xn >

We can then write, using vector notation

fxi = lim_{h→0} [f(x + hei) − f(x)] / h

• Gradient vector: For

f = f(x1, x2, ..., xn)

we let

x =< x1, x2, ..., xn >

We can then write

∇f(x) =< fx1, fx2, ..., fxn >

• Directional derivative: For

f = f(x1, x2, ..., xn), u =< u1, u2, ..., un >

we let

x =< x1, x2, ..., xn >

We can then write

Duf = lim_{h→0} [f(x + hu) − f(x)] / h

And we would compute the directional derivative using

Duf = ∇f · u

• The additional notation is for the purpose of having a good formal definition of all these quantities in a general sense. From a practical standpoint, there’s nothing new here - keep doing everything exactly the way you’d expect to do it!

* To differentiate with respect to a variable, hold all the other variables constant.

* To write the gradient vector, form a vector from the partial derivatives.

* To find a directional derivative, dot the gradient with the unit direction vector.

Example: Find the directional derivative of

f(x, y, z) = xyz^3

at the point (1, 2,−1, f(1, 2,−1)) in the direction of u =< 3/√29, 2/√29, −4/√29 >.
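The n-variable definitions translate almost directly into code. The sketch below is an approximation under stated assumptions: it swaps the limit for a central difference with a small fixed h (a numerical stand-in, not the definition itself), and the names partial, gradient, and directional_derivative are illustrative.

```python
import math

def partial(f, x, i, h=1e-6):
    # Approximates f_xi at x by stepping +h and -h along the basis vector e_i
    xp = list(x)
    xm = list(x)
    xp[i] += h
    xm[i] -= h
    return (f(xp) - f(xm)) / (2 * h)

def gradient(f, x):
    # Vector of (approximate) partials, one per variable
    return [partial(f, x, i) for i in range(len(x))]

def directional_derivative(f, x, u):
    # Dot the gradient with the unit direction vector u
    return sum(g * ui for g, ui in zip(gradient(f, x), u))

f = lambda x: x[0] * x[1] * x[2] ** 3      # f(x, y, z) = x y z^3
s = math.sqrt(29)
u = [3 / s, 2 / s, -4 / s]                 # already a unit vector
d = directional_derivative(f, [1.0, 2.0, -1.0], u)
print(gradient(f, [1.0, 2.0, -1.0]), d)
```

Because nothing in the helpers depends on the number of variables, the same three functions work unchanged for a function of seven variables or seventy.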


Optimization

Maxima and minima of functions of one variable (a review)

Start by recalling how to locate local (relative) maxima and minima for functions of one variable:

• Obtain the critical values of the function by finding all x where f ′(x) = 0 or f ′(x) does not exist.

• These critical values are potentially values at which the maximum or minimum function values are located, but some sort of test needs to be performed to determine if there’s a maximum, a minimum, or neither at each. In other words, everywhere f has a max or a min, we know f ′ = 0 (or does not exist), but the converse doesn’t hold - not all values where f ′ = 0 (or DNE) are necessarily extrema. That’s why the testing.

• One test, the first derivative test, involves checking for intervals of increasing/decreasing around the critical values.

• But there’s another test, the second derivative test, that allows you to immediately determine if a critical value leads to a maximum or minimum. The sign of the second derivative indicates whether the graph is concave up or concave down, which allows you to make the determination. This is the test you want to recall, as we’ll be doing something analogous for surfaces.

• To obtain the max and min values themselves, you need to substitute back into the original function.


Example: Find the local extrema (maxima and minima) of the function

f(x) = 2x^3 + 3x^2 − 36x + 6

Solution:

• Step 1: Find the critical values by setting f ′(x) = 0 and solving.

• Step 2: Examine the sign of f ′′(x) at each critical value. Interpret the concavity to identify the locations of maxima and minima.

• Step 3: Find the value of f at each critical value. Identify maximum and minimum values.
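The three steps can be sketched in Python as a check on hand work. The derivatives and the factorization f ′(x) = 6(x + 3)(x − 2) are assumed from a hand computation; the names f1, f2, and results are illustrative.

```python
def f(x):
    return 2 * x**3 + 3 * x**2 - 36 * x + 6

def f1(x):
    # First derivative: 6x^2 + 6x - 36 = 6(x + 3)(x - 2)
    return 6 * x**2 + 6 * x - 36

def f2(x):
    # Second derivative: 12x + 6
    return 12 * x + 6

critical = [-3, 2]   # Step 1: roots of f'(x) = 0, found by factoring

results = {}
for c in critical:
    # Step 2: sign of f'' gives concavity
    if f2(c) < 0:
        kind = "local max"
    elif f2(c) > 0:
        kind = "local min"
    else:
        kind = "inconclusive"
    # Step 3: substitute back into the original function
    results[c] = (kind, f(c))

print(results)
```

The dictionary pairs each critical value with its classification and function value, which mirrors the layout of Steps 1 through 3.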


Absolute extrema

A key result of single variable Calculus is that a continuous function on a closed interval has an absolute maximum and absolute minimum value. To find absolute extrema on a given interval:

• Find all critical values that fall within the interval.

• Test each critical value, and the endpoints of the interval. In this case, the test is to simply evaluate f(x) at each of those values; the largest is the absolute maximum, smallest the absolute minimum.

Example: Find the absolute extrema of the function

f(x) = 2x^3 + 3x^2 − 36x + 6

on the interval [−5, 5].

Solution:

• Step 1: Find the critical values (as before) by setting f ′(x) = 0 and solving.

• Step 2: Note whether the critical values fall within the interval. Evaluate f at critical values and endpoints, and identify maximum and minimum points.
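A minimal Python sketch of the closed-interval procedure, assuming the critical values x = −3 and x = 2 found by hand from f ′(x) = 6(x + 3)(x − 2). The variable names are illustrative.

```python
def f(x):
    return 2 * x**3 + 3 * x**2 - 36 * x + 6

lo, hi = -5, 5
critical = [-3, 2]

# Keep only critical values inside the interval, then add the endpoints
candidates = [c for c in critical if lo <= c <= hi] + [lo, hi]

# Evaluate f at every candidate and pick the largest and smallest
values = {x: f(x) for x in candidates}
x_max = max(values, key=values.get)
x_min = min(values, key=values.get)

print(values)
print("absolute max:", values[x_max], "at x =", x_max)
print("absolute min:", values[x_min], "at x =", x_min)
```

Note that the absolute maximum here lands on an endpoint rather than at the local maximum, which is why the endpoints have to be tested.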

Look for these ideas: As we go through the topic in multivariable Calculus, you’ll see them all extended by analogy. The horizontal tangent line translates to a horizontal tangent plane. f ′ = 0 translates to fx = 0 and fy = 0. And, there is a version of the second derivative test, for determining whether your critical value is a max or a min. Absolute extrema will be extended to optimization under constraints.


Optimization

Local extrema of functions of two variables

Definition:

A function of two variables has a local maximum [local minimum] at (a, b) if

f(x, y) ≤ f(a, b) [f(x, y) ≥ f(a, b)]

for all (x, y) in some disk with center (a, b).

Theorem:

If f has a local maximum or minimum at (a, b) and the first order partial derivatives exist there, then

fx(a, b) = 0 and fy(a, b) = 0

Proof:

Let g(x) = f(x, b) [hold y = b fixed and allow x to vary; you’re now looking at the single variable function which is a slice of f(x, y) in the plane y = b]. If f has a local maximum [or minimum] at (a, b), then g must have a local maximum [or minimum] at a. Since we’re now looking at a single variable function, we can apply the [previously established in single variable Calculus] result that g′(a) = 0. Since g′(a) = fx(a, b), we must have fx(a, b) = 0. The proof for fy(a, b) is analogous [hold x = a fixed and consider h(y) = f(a, y)].

This result is analogous to the one for single variable functions. If the function has a max or min, and the (partial) derivatives are known to exist (meaning we won’t worry about the case analogous to f ′ DNE), then both partials must be zero. This implies that setting the partials equal to zero and simultaneously solving for (x, y) pairs will give potential local extrema, which can then be tested.

Critical values of f are values (a, b) such that fx(a, b) = 0 (or DNE) and fy(a, b) = 0 (or DNE). Critical values are obtained by finding the partials, setting equal to zero, and solving simultaneously.


Example:

Find the critical values of

f(x, y) = x^3 y + 12x^2 − 8y

We still need to test the location (a, b) = (2,−4) to determine whether the graph has a maximum or minimum (or possibly neither) at that value. In single variable Calculus, we can determine this by checking the concavity - the value of the second derivative. This test extends to multivariable Calculus in the following way...

Second derivatives test

If the second partials (fxx, fxy, fyx, fyy) are continuous on a disk centered at (a, b), where (a, b) is a critical value of f(x, y), compute the quantity

D = fxx(a, b)fyy(a, b) − [fxy(a, b)]^2

and test according to the following rules:

If sign of D is       and also sign of fxx(a, b)     then f(a, b) is a:

positive (D > 0)      positive (fxx > 0)             local minimum
positive (D > 0)      negative (fxx < 0)             local maximum
negative (D < 0)      (doesn’t matter)               saddle point
zero (D = 0)          (doesn’t matter)               test fails

*A derivation of how the second partials test gives results is generally omitted from Calculus texts at this point - it involves a Taylor polynomial expansion in two variables. I think the first time I saw one of those was in a partial differential equations course, a good bit further down the line. If you’re curious, COW (Calculus on the Web) has an explanation/derivation.


The Hessian

For those of you with some background in Linear Algebra, the quantity

D = fxx(a, b)fyy(a, b) − [fxy(a, b)]^2

can be expressed as the determinant of a matrix of the partials:

D = | fxx  fxy |
    | fyx  fyy |

Note that when the second partials are continuous, we have fxy = fyx by Clairaut’s Theorem, so fxy · fyx is the same as the (fxy)^2 in the above formula for D.

The matrix

[ fxx  fxy ]
[ fyx  fyy ]

is referred to as the Hessian of f, so

Hess(f) = [ fxx  fxy ]
          [ fyx  fyy ]

and

D = det(Hess(f))


Example (continued):

• Obtain expressions for the second partials of the function f(x, y) = x^3 y + 12x^2 − 8y

• Then, evaluate the second partials at the critical value of (2,−4), and compute the value of D.

• Finally, apply the test, and determine whether (2,−4, f(2,−4)) is a local maximum, minimum, or saddle point. You need the values of fxx(2,−4) and D.
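The second derivatives test is mechanical enough to sketch in Python. The function classify below is a hypothetical helper written for this illustration, and the second partials (fxx = 6xy + 24, fyy = 0, fxy = 3x^2) are assumed from a hand computation.

```python
def classify(fxx, fyy, fxy):
    # D = fxx * fyy - (fxy)^2, then read off the table of rules
    D = fxx * fyy - fxy**2
    if D > 0:
        return D, ("local minimum" if fxx > 0 else "local maximum")
    if D < 0:
        return D, "saddle point"
    return D, "test fails"

a, b = 2, -4   # critical value of f(x, y) = x^3 y + 12x^2 - 8y

# Confirm (a, b) really is a critical value: fx = 3x^2 y + 24x, fy = x^3 - 8
fx = 3 * a**2 * b + 24 * a
fy = a**3 - 8

# Second partials evaluated at (a, b)
fxx = 6 * a * b + 24
fyy = 0
fxy = 3 * a**2

D, kind = classify(fxx, fyy, fxy)
print(fx, fy, D, kind)
```

The same classify helper can be reused for any critical value once the three second partials have been evaluated there.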

What’s a saddle point?

You should recall from single variable Calculus that when f ′(x) = 0, the graph has a horizontal tangent. This may indicate a maximum or minimum, but it’s possible for the graph to have a horizontal tangent without it being either. The graph of x^3 is a typical example; there is a horizontal tangent at x = 0, and the graph crosses its tangent, rather than having a max or min there.

A saddle point is the 3D version of this. The graph of f(x, y) has a horizontal tangent plane at (a, b), but the graph crosses its tangent plane.


Example:

Find all local maxima, minima and saddle points of

f(x, y) = x^4 + y^4 − 4xy + 1

Start by getting fx and fy, setting to 0, and solving for critical values.

Then, get the second partials, and test each critical value. You might find it helpful to organize in a table.

fxx =

fxy =

fyy =

(a, b) [CV]     fxx(a, b)     fxy(a, b)     fyy(a, b)     D = fxx fyy − (fxy)^2

(−1,−1)

(0, 0)

(1, 1)

Then, interpret the results, and get function values [z = f(a, b)] for each of the critical values:
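The table can also be filled in programmatically as a check. This sketch assumes the hand-computed second partials fxx = 12x^2, fxy = −4, fyy = 12y^2 and the three critical values listed in the table; the tuple layout of rows is a choice made for this illustration.

```python
def f(x, y):
    return x**4 + y**4 - 4 * x * y + 1

rows = []
for (a, b) in [(-1, -1), (0, 0), (1, 1)]:
    # Second partials of f evaluated at the critical value (a, b)
    fxx, fxy, fyy = 12 * a**2, -4, 12 * b**2
    D = fxx * fyy - fxy**2
    if D > 0:
        kind = "local minimum" if fxx > 0 else "local maximum"
    elif D < 0:
        kind = "saddle point"
    else:
        kind = "test fails"
    rows.append(((a, b), fxx, fxy, fyy, D, kind, f(a, b)))

for row in rows:
    print(row)
```

Each printed row lists the critical value, the three second partials, D, the classification, and the function value z = f(a, b).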


Optimization

Constrained optimization and Lagrange multipliers

Constrained optimization is what it sounds like - the problem of finding a maximum or minimum value (optimization), subject to some other restrictions or constraints.

Example:

Suppose we have a rectangle inscribed within a given ellipse, say x^2/9 + y^2/4 = 1. We would like to find the dimensions of the rectangle that have the maximum area.

[Figure: rectangle of width 2x and height 2y inscribed in the ellipse, with area A = ??]

The function that we wish to maximize, called the objective function, is the area function:

f(x, y) = (2x)(2y) = 4xy

The constraint is that x and y are related through the ellipse:

x^2/9 + y^2/4 = 1 → x^2/9 + y^2/4 − 1 = 0

We let g(x, y) be the constraint function

g(x, y) = x^2/9 + y^2/4 − 1

Now, this particular example can be solved through single variable Calculus, and I’ll suggest that you try that as a refresher. It’ll also let us check the solution later on after we use the method of Lagrange multipliers.

• Solve x^2/9 + y^2/4 = 1 for y (use the positive square root).

• Substitute that into the area function, creating a function of one variable, x.

• Use the usual single variable Calculus process for finding a max or min value - differentiate, set equal to 0 and solve for x, and verify/test that you in fact have a maximum.


The solution is posted as a separate file, and linked in the lecture - you should get x = 3/√2 [and can easily obtain y from there]. Now, back to Lagrange multipliers...

If we look at the area as a function of two variables, we have a 3D surface. And just looking at f(x, y) = 4xy, we’d see that that function by itself doesn’t have a maximum value - as x and y get larger, the area grows without bound - so the constraint is needed to make sense of the problem.

If we consider all possible values that the area could take on, e.g. an area of 1, 1.5, 2, 2.9996,and so on, so

f(x, y) = 4xy = 1

f(x, y) = 4xy = 1.5

f(x, y) = 4xy = 2

f(x, y) = 4xy = 2.9996

etc., etc., then we are looking at things in the form

f(x, y) = 4xy = k

i.e., the level curves of the function. Here is a contour plot showing some of the level curves of f(x, y):

If we superimpose the constraint ellipse on the level curves, we see first of all that only some of the level curves intersect with the constraint, and therefore contain allowable values for x and y, and we can discard any that don’t:


Now, it would help if I noted the heights that those level curves are at, which are the prospective values for area, because you’d see that they are increasing as you move out from the center. So what we’re really looking for is the level curve that just barely intersects the constraint equation - that’s the one where the area will be maximal. “Just barely intersects” would be the same as “tangent to.”

[Figure: level curves 4xy = k for k = .5, 1.5, 3, 5, 8, 12, with the constraint ellipse superimposed]

It looks like the maximal area that still intersects the constraint is when k = 12, so when 4xy = 12. Set 4xy = 12, so y = 3/x, and sub into x^2/9 + y^2/4 = 1:

x^2/9 + (3/x)^2/4 = 1

x^2/9 + 9/(4x^2) = 1

4x^4 + 81 = 36x^2

4x^4 − 36x^2 + 81 = 0

(2x^2 − 9)(2x^2 − 9) = 0

2x^2 − 9 = 0

x = 3/√2


Now, we need to formalize that approach...

Method of Lagrange multipliers

The trick here is to use the gradient vectors, and recall the useful result that the gradient vector at any point is always orthogonal to the level curve at that point. So it’s orthogonal to a vector tangent to the level curve in the plane of the curve. If you want the curves to be tangent to each other, they need to have parallel tangent vectors...which means they need to have parallel gradient vectors. And “parallel” for vectors means “the same vector, or scalar multiples of each other.” So we need to find (x0, y0) that solve

∇f(x, y) = λ∇g(x, y)

and also satisfy the constraint equation g(x, y) = 0.

Lagrange’s Theorem

Let f and g have continuous first partial derivatives such that f has an extremum at a point (x0, y0) on the smooth constraint curve g(x, y) = 0. If ∇g(x0, y0) ≠ 0, then there is a real number λ such that

∇f(x0, y0) = λ∇g(x0, y0)

Proof:

Let r(t) be a parameterization of the smooth curve given by g(x, y) = 0;

r(t) = x(t)i + y(t)j

with r′(t) ≠ 0, where x′ and y′ are continuous functions of t on an open interval I. [Note that you don’t have to find the parameterization to work the problem, but asserting that one exists is needed for the proof. The “smooth” part is where r′(t) ≠ 0 comes in.]

Define a function h(t) = f(x(t), y(t)), and let t0 be the value of t with (x(t0), y(t0)) = (x0, y0). Since f has an extreme value at (x0, y0) along the curve, we know that

h(t0) = f(x(t0), y(t0)) = f(x0, y0)

must be an extreme value of h. So h′(t0) = 0. [h(t) is a single variable function, and single variable Calculus tells us its derivative is zero wherever it has a max or min value.]

Since h(t) is a composite function, h(t) = f(x(t), y(t)), we apply the chain rule to get

dh/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt)

or

h′(t) = fx(x(t), y(t))x′(t) + fy(x(t), y(t))y′(t) = ∇f(x(t), y(t)) · r′(t)

Since h′(t0) = 0, ∇f(x0, y0) · r′(t0) = 0, and ∇f(x0, y0) is orthogonal to r′(t0).


Also, we have already established that the gradient is orthogonal to the level curve of a function at a point: ∇g(x0, y0) is orthogonal to the level curve of g passing through that point. Since r(t) is the curve g(x, y) = 0, r′(t) must be tangential to that curve, and so ∇g(x0, y0) is also orthogonal to r′(t0).

Therefore, ∇f(x0, y0) and ∇g(x0, y0) must be parallel to (i.e. scalar multiples of) each other, and so there exists some λ such that

∇f(x0, y0) = λ∇g(x0, y0)

Back to the example...

Use the method of Lagrange multipliers to maximize

f(x, y) = 4xy

subject to the constraint

x^2/9 + y^2/4 − 1 = 0

• Step 1: Find ∇f for the objective function f(x, y) = 4xy, and ∇g for the constraint function g(x, y) = x^2/9 + y^2/4 − 1

• Step 2: Set up the equation ∇f = λ∇g, and split that into two equations

fx = λgx

fy = λgy


• Step 3: Solve the system of three equations and three unknowns - the two equations that come from ∇f = λ∇g, and also the constraint g(x, y) = 0. Occasionally you may be lucky enough to have a system of linear equations, but frequently, these will be nonlinear, and you need to resort to creative algebra.


In general, how do we know that the solution is the extremum we want? If the problem calls for a max, and we get one solution, what’s telling us that that solution isn’t in fact a min? Recall that with previous optimization techniques, there’s been a test at the end to determine whether you have a max or min (recently, we had a second derivative type test for two variable optimization).

The method guarantees an extremum of some sort - so to test that your solution (x0, y0) produces a maximum f(x0, y0), verifying that f(x0, y0) ≥ f(x, y) for all other (x, y) that satisfy the constraint reduces to the problem of verifying f(x0, y0) ≥ f(x, y) for just any one (x, y). Pick any other point on the graph of the constraint equation, and plug it in to f :

Since (x, y) = (0, 2) is a point on x^2/9 + y^2/4 = 1, compute f(0, 2) = 4(0)(2) = 0. Since f(3/√2, √2) ≥ f(0, 2), f(3/√2, √2) = 12 is a maximum.
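The solution can be checked numerically in two ways. In the sketch below, the value λ = 12 is an assumption obtained by substituting (3/√2, √2) into 4y = λ(2x/9); it is not stated in the notes. The brute-force scan uses the parameterization (3 cos t, 2 sin t) of the ellipse, also a choice made here for illustration.

```python
import math

x0, y0, lam = 3 / math.sqrt(2), math.sqrt(2), 12.0

# Residuals of the three equations: fx = lam*gx, fy = lam*gy, g = 0
r1 = 4 * y0 - lam * (2 * x0 / 9)
r2 = 4 * x0 - lam * (y0 / 2)
r3 = x0**2 / 9 + y0**2 / 4 - 1

# Independent check: scan the first-quadrant part of the ellipse for the largest area
ts = [k * (math.pi / 2) / 1000 for k in range(1001)]
best = max(4 * (3 * math.cos(t)) * (2 * math.sin(t)) for t in ts)

print(r1, r2, r3)        # each should be zero up to rounding
print(4 * x0 * y0, best) # Lagrange value vs. best value found by scanning
```

Agreement between the two printed areas supports the claim that the Lagrange solution is the maximum rather than a minimum.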

Finally...

• The process is more streamlined than it looks at first read - keep in mind we solved the same problem 3 ways and derived the method in the process. Implementing it isn’t all that tedious. Look for a live example.

• The method extends to three (and more) variables - in the three variable case, the level curves become level surfaces. You’ll see a couple of these in the suggested problems.


Iterated, double, and triple integrals

Iterated integrals

• Integrals in the forms

∫_c^d ∫_a^b f(x, y) dx dy   or   ∫_a^b ∫_c^d f(x, y) dy dx   or   ∫_a^b ∫_c^d ∫_e^f f(x, y, z) dz dy dx

are iterated integrals; e.g.

∫_1^2 ∫_{−3}^8 (3x^2 y + y^2) dx dy

is an example of an iterated integral of a function of two variables.

• We would interpret

∫_c^d ∫_a^b f(x, y) dx dy

as

∫_c^d [ ∫_a^b f(x, y) dx ] dy

where

∫_a^b f(x, y) dx

would indicate that we integrate f(x, y) with respect to x; in other words, treat y as a constant and integrate as you normally would for a function of one variable (x).

• Using the Fundamental Theorem of Calculus:

∫_a^b f(x, y) dx = F(x, y)]_{x=a}^{x=b} = F(b, y) − F(a, y)

where F is an antiderivative of f (with respect to x); i.e., Fx(x, y) = f(x, y).

• We would then go on to the outer integral and integrate the previous result.

• As with partial differentiation, integrating with respect to a variable while holding the others constant is simply working with a function of one variable, which is painless - assuming you recall all your integration techniques from first year Calculus! You may wish to refresh your memory on basic techniques, u substitution, and integration by parts.

• Although there’s nothing new here, it will take a while to walk through even a basic example - you’re working with definite integrals, and those take a while to go through. More examples will be posted, showing different integration techniques.


Example:

∫_0^1 ∫_1^2 (xy − e^y) dx dy

Start with the inside:

∫_1^2 (xy − e^y) dx

You are integrating with respect to x, so treat as a single variable function of x, with y constant. Until you’ve practiced a few, you may want to explicitly use the rules for sums/differences and constant multiples:

I’ve split into two integrals, and pulled out constant multiples. Since we are integrating with respect to x, both y and e^y are constants.

Now, integrate:

And evaluate, as by the Fundamental Theorem. Since you are integrating with respect to x, you are evaluating x from 1 to 2 (y is just hanging out being y for a while):

You have now obtained:


That was just the inner integral. You now have a function of y. Returning to the original problem, sub in your result for the inner integral:

And you can finish integrating:

So the final answer is:
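To check a hand-computed answer, the iterated integral can be approximated numerically by nesting one single-variable rule inside another, which mirrors the inner-then-outer order used above. The midpoint rule and the helper name midpoint are choices made for this sketch; the closed form 7/4 − e is the value obtained by carrying out the integration by hand.

```python
import math

def midpoint(g, lo, hi, n=200):
    # Midpoint-rule approximation of the integral of g from lo to hi
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def inner(y):
    # Inner integral in x from 1 to 2, with y held constant
    return midpoint(lambda x: x * y - math.exp(y), 1.0, 2.0)

# Outer integral in y from 0 to 1 of the inner result
numeric = midpoint(inner, 0.0, 1.0)
exact = 7 / 4 - math.e

print(numeric, exact)
```

The function inner plays the role of the function of y that is left over after the x integration.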

Example:

∫_0^π ∫_0^1 ∫_1^3 2z (y sin x / (1 + y^2)) dz dy dx

Go for the inner integral first:

∫_1^3 2z (y sin x / (1 + y^2)) dz

Now, nasty as that looks, everything except the z is a constant - pull it all out:

∫_1^3 2z (y sin x / (1 + y^2)) dz = (2y sin x / (1 + y^2)) ∫_1^3 z dz

And get to work ...


Here’s another page for you ... check against the solution when you’re done. Hint: at some point, you’ll need to do a u-substitution. But all the integration is pretty straightforward; it’s just that there’s so much of it!



More general iterated integrals

• We have termed integrals in the form

∫_c^d ∫_a^b f(x, y) dx dy

(and similar) iterated integrals, and we interpret them in terms of integration of single variable functions:

∫_c^d ∫_a^b f(x, y) dx dy = ∫_c^d [ ∫_a^b f(x, y) dx ] dy

where

∫_a^b f(x, y) dx

would indicate that we integrate f(x, y) with respect to x; in other words, treat y as a constant and integrate as you normally would for a function of one variable (x).

• So consider this expression a bit more:

∫_a^b f(x, y) dx

Since y is constant (as far as integrating with respect to x is concerned), there’s no reason the upper and lower bounds on the integral can’t have y’s in them; for example

∫_{1−y}^{3+y} (x^2 + y^2) dx

The FTC still holds - get the antiderivative, and evaluate.

Example: Find

∫_{1−y}^{3+y} (x^2 + y^2) dx


• That was a single integral. Suppose that was the inner integral of an iterated integral, let’s say

∫_0^1 ∫_{1−y}^{3+y} (x^2 + y^2) dx dy

Making the substitution gives

∫_0^1 ((8/3)y^3 + 4y^2 + 10y + 26/3) dy

• While we were considering x, that y was constant. Now we’re integrating with respect to y, so the integrand is a function of y. So integrate it:
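Both stages of this example can be verified with exact rational arithmetic. The sketch uses Python's fractions module; the antiderivatives F and G are the ones found by hand, and the names inner, claimed, and total are illustrative.

```python
from fractions import Fraction as Fr

def inner(y):
    # Antiderivative in x: F(x, y) = x^3/3 + y^2 x, evaluated from x = 1 - y to x = 3 + y
    F = lambda x: x**3 / 3 + y**2 * x
    return F(3 + y) - F(1 - y)

def claimed(y):
    # The polynomial in y that results from the inner integral
    return Fr(8, 3) * y**3 + 4 * y**2 + 10 * y + Fr(26, 3)

def G(y):
    # Antiderivative in y of the polynomial above
    return Fr(2, 3) * y**4 + Fr(4, 3) * y**3 + 5 * y**2 + Fr(26, 3) * y

# Compare the two forms of the inner result at several sample values of y
samples = [Fr(k, 4) for k in range(5)]
matches = all(inner(y) == claimed(y) for y in samples)

total = G(Fr(1)) - G(Fr(0))
print(matches, total)
```

Using fractions instead of floats means the comparison is an exact equality test rather than a tolerance check.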

• In general, since any function of x is a constant with respect to y, iterated integrals of the form

∫_a^b ∫_{y1=g1(x)}^{y2=g2(x)} f(x, y) dy dx

make sense, and can be evaluated.

Similarly, since any function of y is constant with respect to x, iterated integrals of the form

∫_c^d ∫_{x1=h1(y)}^{x2=h2(y)} f(x, y) dx dy

also make sense, and can be evaluated.


Example: Find

∫_0^1 ∫_0^x √(1 − x^2) dy dx

Example: Find

∫_0^2 ∫_{3y^2−6y}^{2y−y^2} 3y dx dy
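Integrals with variable inner bounds can be approximated the same way as ones with constant bounds, by letting the inner limits be functions of the outer variable. The helper iterated and its argument order are hypothetical choices for this sketch, and the midpoint rule is only an approximation; the reference values 1/3 and 16 come from working the two examples by hand.

```python
import math

def midpoint(g, lo, hi, n=400):
    # Midpoint-rule approximation of the integral of g from lo to hi
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def iterated(f, outer_lo, outer_hi, inner_lo, inner_hi):
    # f(s, t): s is the outer variable, t the inner variable;
    # inner_lo and inner_hi are functions of s giving the inner bounds
    def inner(s):
        return midpoint(lambda t: f(s, t), inner_lo(s), inner_hi(s))
    return midpoint(inner, outer_lo, outer_hi)

# First example: outer variable x on [0, 1], inner variable y on [0, x]
ex1 = iterated(lambda x, y: math.sqrt(1 - x * x), 0.0, 1.0,
               lambda x: 0.0, lambda x: x)

# Second example: outer variable y on [0, 2], inner variable x on [3y^2 - 6y, 2y - y^2]
ex2 = iterated(lambda y, x: 3 * y, 0.0, 2.0,
               lambda y: 3 * y * y - 6 * y, lambda y: 2 * y - y * y)

print(ex1, ex2)
```

Note that the outer bounds are plain numbers while the inner bounds are functions, which matches the rule for which variables may appear in which limits.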


• The rules for allowable bounds extend to three integrals in a row. Look at

∫_a^b ∫_{g1(x)}^{g2(x)} ∫_{h1(x,y)}^{h2(x,y)} f(x, y, z) dz dy dx

* The inner (dz) integral can have bounds with x’s and y’s (everything but z)

* The middle (dy) can have bounds with x’s (everything but y... and z is already gone)

* The outer (dx) must have constant bounds (everything but x... and y and z are already gone. Which doesn’t leave anything but constants.)

The same type of rule holds for other arrangements (dx dy dz, dy dz dx, etc.).



Iterated integrals and area

• We started off by looking at iterated integrals in the forms

∫_c^d ∫_a^b f(x, y) dx dy   and   ∫_a^b ∫_c^d f(x, y) dy dx

Ignore the function and think about just the bounds for a moment; in fact, consider the iterated integrals

∫_c^d ∫_a^b 1 dx dy   and   ∫_a^b ∫_c^d 1 dy dx

• For

∫_c^d ∫_a^b 1 dx dy

the inner dx integral tells me I’ll be evaluating from a low bound of x1 = a to a high bound of x2 = b. The outer dy integral tells me I’ll be evaluating from a low bound of y1 = c to a high bound of y2 = d. In other words (or symbols), we’re talking about

a ≤ x ≤ b

c ≤ y ≤ d

This describes a rectangular region in the plane.

• Now, what would

$$\int_c^d \int_a^b 1\,dx\,dy$$

give?

$$\int_a^b 1\,dx = x\,\Big]_a^b = b - a$$

$$\int_c^d (b - a)\,dy = (b - a)\,y\,\Big]_c^d = (b - a)(d - c)$$

(b − a) is the width of that rectangular region, and (d − c) is the height. So...

$$\int_c^d \int_a^b 1\,dx\,dy$$

gives the area of that rectangular region bounded by a ≤ x ≤ b, c ≤ y ≤ d.

• What about

$$\int_a^b \int_c^d 1\,dy\,dx$$

That’s still c ≤ y ≤ d and a ≤ x ≤ b - the same region. And,

$$\int_c^d 1\,dy = y\,\Big]_c^d = d - c$$

$$\int_a^b (d - c)\,dx = (d - c)\,x\,\Big]_a^b = (d - c)(b - a)$$

gives the same area.

• So

$$\int_c^d \int_a^b 1\,dx\,dy = \int_a^b \int_c^d 1\,dy\,dx$$

Order doesn’t matter; area is area. Be careful not to overgeneralize this, though - right now, we’re talking specifically about f(x, y) = 1, and constant bounds on the integral.

• We will denote the area of a rectangular region R with a ≤ x ≤ b, c ≤ y ≤ d by the symbol

$$\iint_R dA$$

This is a special case of the double integral over a region R.

$$\iint_R dA = \int_c^d \int_a^b dx\,dy = \int_a^b \int_c^d dy\,dx$$

translates as “The area of the rectangular region R can be computed by using iterated integrals.”

Example: Find $\iint_R dA$ where R is the rectangle 3 ≤ x ≤ 5, −2 ≤ y ≤ 1.
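A short worked version of this example, using the dx dy order (the other order gives the same number):

```latex
\iint_R dA = \int_{-2}^{1} \int_{3}^{5} dx\,dy
           = \int_{-2}^{1} (5 - 3)\,dy
           = 2\,\big(1 - (-2)\big) = 6
```

This agrees with width times height for the rectangle, (5 − 3)(1 − (−2)) = 6.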

• Turning to more general iterated integrals, the bounds on

$$\int_a^b \int_{g_1(x)}^{g_2(x)} 1\,dy\,dx$$

can be expressed as

a ≤ x ≤ b

g1(x) ≤ y ≤ g2(x)

This (typically - note to follow) describes a region R with the vertical lines x1 = a and x2 = b as the left and right bounds, and the functions y1 = g1(x) and y2 = g2(x) as the bottom and top bounds.

• Regions of this type are referred to as vertically simple. A vertical line drawn anywhere in the region will always hit the same function g1(x) at the bottom, and the same function g2(x) at the top.

• Note: the preceding gives the impression that you’ll always see vertical “walls” on the left and right - that isn’t true (although that’s the picture you want to have in mind if you’re thinking about a generic vertically simple region). The region shown below is

0 ≤ x ≤ 1, x ≤ y ≤ √x

with

$$\iint_R dA = \int_0^1 \int_x^{\sqrt{x}} dy\,dx$$

It is still vertically simple, even though you don’t have walls - it is still true that the bottom function is always g1(x) = x, and the top function is always g2(x) = √x, for 0 ≤ x ≤ 1.

• Similarly, the bounds on

$$\int_c^d \int_{x_1 = h_1(y)}^{x_2 = h_2(y)} 1\,dx\,dy$$

can be expressed as

h1(y) ≤ x ≤ h2(y)

c ≤ y ≤ d

This (typically) describes a region R with the horizontal lines y1 = c and y2 = d as the bottom and top bounds, and the functions x1 = h1(y) and x2 = h2(y) as the left and right bounds.

• Regions of this type are referred to as horizontally simple. A horizontal line drawn anywhere in the region will always hit the same function h1(y) on the left, and the same function h2(y) on the right.

• The same note: as with the vertically simple regions, you won’t necessarily see the walls (horizontal, in this case), but can still have a horizontally simple region; the key is that it can be expressed as

h1(y) ≤ x ≤ h2(y)

c ≤ y ≤ d

The region below is

y² ≤ x ≤ y, 0 ≤ y ≤ 1

with

$$\iint_R dA = \int_0^1 \int_{y^2}^{y} dx\,dy$$

Notice it’s the same region I used as a previous example. Some regions are both horizontally and vertically simple (and some are neither). You may get some choice as to how you want to set the area integral up.

• In the lecture, you’ll see a link to an additional example, using MVT to plot a horizontally simple region.

Example: Sketch the region R whose area is given by

$$\iint_R dA = \int_0^1 \int_{y^2}^{\sqrt[3]{y}} dx\,dy$$

Is the region vertically or horizontally simple?

Example: Sketch the region R whose area is given by

$$\iint_R dA = \int_{-1}^{2} \int_0^{e^{-x}} dy\,dx$$

Is the region vertically or horizontally simple?

Example: Sketch the region R bounded by the graphs of 2x − 3y = 0, x + y = 5 and y = 0. Hint: Since you won’t be able to decide whether it’s vertically or horizontally simple until after you’ve made the sketch, you should solve in whatever way you prefer to sketch those lines. You can always rearrange it afterwards.

Now that you’ve established what type of region it is, rearrange the expressions for the bounds in the correct form needed to set up the inner integral:

You could probably get the low and high bounds for y by inspection, but make a habit of solving for them whenever they involve a point of intersection:

Summarize: write the inequalities that describe the region, and write the iterated integral that will give you the area of the region:

And solve for the area of the region R by evaluating the integral:
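A possible worked version of this example is sketched below. It assumes the horizontally simple (dx dy) setup, which is the one that avoids splitting the triangle into two pieces.

```latex
\begin{aligned}
&\text{Intersection of } 2x - 3y = 0 \text{ and } x + y = 5:\quad
  \tfrac{3}{2}y + y = 5 \;\Rightarrow\; y = 2,\ x = 3 \\
&R:\ \tfrac{3}{2}y \le x \le 5 - y,\qquad 0 \le y \le 2 \\
&\iint_R dA = \int_0^2 \int_{3y/2}^{5-y} dx\,dy
  = \int_0^2 \left(5 - \tfrac{5}{2}y\right) dy
  = \Big[\, 5y - \tfrac{5}{4}y^2 \,\Big]_0^2 = 5
\end{aligned}
```

As a check, R is the triangle with vertices (0, 0), (5, 0) and (3, 2), which has area ½ · 5 · 2 = 5.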



The double integral and volume (defining)

• We’re ready to take a look at the general definition of the double integral

$$\iint_R f(x, y)\,dA$$

over a region R. We have already established the interpretation for the special case of

$$\iint_R 1\,dA$$

showing that it gives the area of the region R, and can be computed using iterated integrals:

$$\iint_R dA = \int_a^b \int_{g_1(x)}^{g_2(x)} dy\,dx = \int_c^d \int_{h_1(y)}^{h_2(y)} dx\,dy$$

• We’ve managed to avoid the Riemann sum part of the picture so far, by building on results from single variable Calculus (and assuming that you have of course remembered the Riemann sum definition of a single definite integral!). To properly define the double integral and discuss the geometry, we need to return to Riemann sums and limits.

A quick refresher:

• Assume first that f is a continuous function on an interval [a, b], and for convenience, assume f(x) ≥ 0 for all x ∈ [a, b]. The (single) definite integral is defined by

$$\int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{i=1}^{n} f(x_i)\,\Delta x$$

where we

* Consider the interval [a, b] and form a partition of [a, b] by dividing into n subintervals. In the Riemann sum, we are using n subintervals of equal width (we can write a more general definition where the partition does not need subintervals of equal width, but the above is a bit simpler to work with and good enough for our purposes).

* Let $\Delta x = \frac{b - a}{n}$.

* Choose a point xi in each subinterval (for convenience, I’m showing the right endpoint).

* Evaluate the function at xi giving f(xi). At this point, we note that on any subinterval, we can draw a rectangle with Δx as the width of the base, and f(xi) as the height.

* Compute the area of a rectangle: A = f(xi)Δx.

* Note that each subinterval has its own rectangle, and the sum of these n rectangles gives a total area that approximately equals the area under the curve:

$$\sum_{i=1}^{n} f(x_i)\,\Delta x$$

* And pushing n to infinity “smoothes out” the rectangular area, and gives us the exact area under the graph of f.

$$\int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{i=1}^{n} f(x_i)\,\Delta x$$

• The Fundamental Theorem of Calculus is the thing that tells us we don’t have to use the definition, but that there is a connection between the definite integral (representing area), and the antiderivative. The FTC is a marvelous thing, that lets us avoid much painful algebra. We are not going to prove it again here, though - in theory, you’ve already done that at some point in your life.

Suppose f(x) is a continuous function on an interval [a, b]. Then

$$\int_a^b f(x)\,dx = F(b) - F(a)$$

where F is any antiderivative of f.

• If f changes sign on [a, b], we know that the definite integral gives us “net area” (the difference between the area that lies above the x axis and that which lies below), and we adjust our area problem setups accordingly to break into separate regions.

$$\int_a^b f(x)\,dx = A_1 - A_2$$

Onward to 3D:

• Returning to multivariable Calculus, what do you expect the double integral

$$\iint_R f(x, y)\,dA$$

to give? Volume under the surface (assuming f(x, y) ≥ 0 on R).

• Definition of the double integral: Suppose f(x, y) is a continuous function on a region R. For convenience, and for visualization purposes, let R be a rectangular region a ≤ x ≤ b, c ≤ y ≤ d, and assume f(x, y) ≥ 0 for all (x, y) ∈ R. The double integral over R is defined by

$$\iint_R f(x, y)\,dA = \lim_{n\to\infty} \sum_{i=1}^{n} f(x_i, y_i)\,\Delta A$$

where ...

* Consider the region R and form a partition by dividing [a, b] into j subintervals of equal width, and [c, d] into k subintervals of equal width. (Again, this is a simplification of a more general partitioning we could set up.) This creates a grid of rectangles in the xy plane. There are n = j ∗ k rectangles total.

* Let $\Delta x = \frac{b - a}{j}$, and $\Delta y = \frac{d - c}{k}$ (the length and width of each rectangle). The area of each rectangle is ΔA = Δx ∗ Δy.

* Choose a point (xi, yi) in each rectangle (for convenience, I’m showing the center point).

* Evaluate the function at (xi, yi) giving f(xi, yi). At this point, we note that on any rectangle, we can draw a rectangular prism with ΔA as the area of the base, and f(xi, yi) as the height.

* Compute the volume of a prism: V = f(xi, yi)ΔA.

* Note that each rectangle has its own prism, and the sum of these n prisms gives a total volume that approximately equals the volume under the surface:

$$\sum_{i=1}^{n} f(x_i, y_i)\,\Delta A$$

* And pushing n to infinity “smoothes out” the rectangular volume, and gives us the exact volume under the graph of f.

$$\iint_R f(x, y)\,dA = \lim_{n\to\infty} \sum_{i=1}^{n} f(x_i, y_i)\,\Delta A$$
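The steps of the definition can be sketched in a few lines of Python. This is only an illustration: the function name `double_riemann_sum` and the test surface f(x, y) = x² + y² are assumptions, not part of the notes. It uses the center point of each rectangle, as in the definition above.

```python
def double_riemann_sum(f, a, b, c, d, j, k):
    """Riemann sum of f over the rectangle a <= x <= b, c <= y <= d.

    [a, b] is cut into j equal subintervals and [c, d] into k, giving
    n = j * k rectangles; f is sampled at the center of each one.
    """
    dx = (b - a) / j
    dy = (d - c) / k
    dA = dx * dy                      # area of each base rectangle
    total = 0.0
    for p in range(j):
        x = a + (p + 0.5) * dx        # center x of column p
        for q in range(k):
            y = c + (q + 0.5) * dy    # center y of row q
            total += f(x, y) * dA     # volume of one prism
    return total


# Hypothetical test surface: f(x, y) = x^2 + y^2 over 0 <= x <= 1, 0 <= y <= 2.
for cuts in (2, 10, 100):
    print(cuts * cuts, double_riemann_sum(lambda x, y: x**2 + y**2, 0, 1, 0, 2, cuts, cuts))
# The sums approach the exact volume 10/3 as n = j * k grows.
```

The exact value 10/3 comes from the iterated integral of x² + y² with 0 ≤ y ≤ 2 on the inside and 0 ≤ x ≤ 1 on the outside.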


Things to consider:

• As noted before, the rectangles don’t have to all be the same: the subintervals don’t have to be of equal length and width (it’s just convenient to simplify things a bit). We could write a more general version of the definition.

• The region R doesn’t have to be rectangular. This will work perfectly well over any region in the xy plane. Keep those vertically and horizontally simple regions in mind.

• The definition doesn’t tell you how to compute the double integral ... just as the definition of the definite integral doesn’t tell you how to compute it. We need a theorem for that.

• If the surface f(x, y) dips below the xy plane, we’ll end up getting “net volume” - the difference between the volume which lies above the plane, and that which lies below.


The double integral and volume (computing)

• Theorem: The double integral over a region R,

$$\iint_R f(x, y)\,dA$$

can be computed using iterated integrals, where the bounds on the iterated integrals describe the region R in the plane.

If f(x, y) ≥ 0 for all (x, y) ∈ R, this will give the volume under the surface z = f(x, y), above the xy plane. (If f(x, y) dips below the xy plane, we get the difference in volume above and volume below.)

• For rectangular regions a ≤ x ≤ b, c ≤ y ≤ d,

$$V = \iint_R f(x, y)\,dA = \int_a^b \int_c^d f(x, y)\,dy\,dx = \int_c^d \int_a^b f(x, y)\,dx\,dy$$

• For vertically simple regions a ≤ x ≤ b, g1(x) ≤ y ≤ g2(x),

$$V = \iint_R f(x, y)\,dA = \int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx$$

• For horizontally simple regions h1(y) ≤ x ≤ h2(y), c ≤ y ≤ d,

$$V = \iint_R f(x, y)\,dA = \int_c^d \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx\,dy$$

• Setting up the region is no different than setting up the region for an area problem - the fact that we’re now looking at $\iint_R f(x, y)\,dA$ in general instead of $\iint_R 1\,dA$ in particular doesn’t affect how you set up the bounds on the iterated integral. That comes from the region R.

• You’ve already practiced the techniques of computing iterated integrals in the form $\int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx$ and so on.

• Therefore, you already know how to do this (the only thing missing by the time we got to this point was the assertion that you could, in fact, use iterated integrals to compute volume).

Example: Let R be the region bounded by $x = \frac{1}{2}y$, $x = \sqrt{y}$, y = 0, y = 4. Let f(x, y) = x²y². Find the volume under the surface f(x, y) over the region R.

• Start by sketching R and identifying whether vertically or horizontally simple:

• Set up the iterated integral:

• And integrate:
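A sketch of one way this example can go, under the assumption that the first boundary is the line x = ½y (the fraction is flattened in the printed text). With that reading the two curves meet at y = 0 and y = 4, and the region reads naturally left to right:

```latex
\begin{aligned}
R &:\ \tfrac{1}{2}y \le x \le \sqrt{y},\qquad 0 \le y \le 4 \\
V &= \int_0^4 \int_{y/2}^{\sqrt{y}} x^2 y^2\,dx\,dy
   = \int_0^4 \frac{y^2}{3}\left( y^{3/2} - \frac{y^3}{8} \right) dy \\
  &= \frac{1}{3}\Big[\, \tfrac{2}{9}\,y^{9/2} - \tfrac{1}{48}\,y^{6} \,\Big]_0^4
   = \frac{1}{3}\left( \frac{1024}{9} - \frac{256}{3} \right)
   = \frac{256}{27}
\end{aligned}
```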

• That’s pretty much all there is to it, although there are lots of variations in setting up the region (see posted examples). The only thing missing is that we haven’t proven the theorem ... and you may be relieved to learn that we won’t prove it rigorously with the Riemann sums - there are some technical details involving double summations and double limits that are beyond the scope of this class. I will offer a geometric “proof”, however ...

Consider the iterated integral

$$\int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx$$

(so we’ll look at a vertically simple region as an example - you could repeat this with a dx dy setup as well). When we hold x constant and compute the inner integral

$$\int_{g_1(x)}^{g_2(x)} f(x, y)\,dy$$

we are getting the area of a cross section under the surface of f(x, y), in the plane of x = whatever x we’re stuck in (imagine the integration of f over dy as sweeping out that plane).

The cross sectional area will vary depending on where x is. So, we can say that

$$A(x) = \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy$$

is a function that gives cross sectional areas.

However, we have learned in single variable Calculus when studying volumes by slicing that

$$\int_a^b A(x)\,dx$$

gives the volume of a solid with cross section area A(x) (you can think of integration with respect to x as sweeping out the volume of the solid).

So,

$$\int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx$$

gives the volume of the solid under the surface z = f(x, y) and above R in the xy plane.

$$V = \int_0^2 \int_0^{2-x} (4 - x^2 + y^2)\,dy\,dx$$


Changing the order of integration

• We have seen that many regions are both vertically and horizontally simple, for example, this one:

• We could compute the area of that region as either

$$\iint_R dA = \int_0^1 \int_x^{\sqrt{x}} dy\,dx$$

or

$$\iint_R dA = \int_0^1 \int_{y^2}^{y} dx\,dy$$

and find the volume of a solid under a surface f(x, y) and over that region by

$$\iint_R f(x, y)\,dA = \int_0^1 \int_x^{\sqrt{x}} f(x, y)\,dy\,dx$$

or

$$\iint_R f(x, y)\,dA = \int_0^1 \int_{y^2}^{y} f(x, y)\,dx\,dy$$

• The question is, why would we want to? Consider the problem of computing

$$\iint_R f(x, y)\,dA = \int_0^1 \int_x^1 \sin(y^2)\,dy\,dx$$

This would give the volume under f(x, y) = sin(y²) over the region R: 0 ≤ x ≤ 1, x ≤ y ≤ 1. Perfectly nice region, the problem is already set up, so we proceed to the inner integral

$$\int_x^1 \sin(y^2)\,dy$$

and start by antidifferentiating sin(y²) with respect to y. What’s the problem?

• It is possible (although not guaranteed) that switching the order of integration from

$$\int_0^1 \int_x^1 \sin(y^2)\,dy\,dx$$

to

$$\int_c^d \int_{h_1(y)}^{h_2(y)} \sin(y^2)\,dx\,dy$$

will improve things. Notice that I can’t just swap dx and dy in the integral - the bounds are tied to the variables of integration. And we know that

$$\int_x^1 \int_0^1 \sin(y^2)\,dx\,dy$$

is not allowed - the outer integral must always have constant bounds.

• We need to take steps to determine exactly what c, d, h1(y) and h2(y) are. The way to do this is to

* Sketch the region described by the original bounds.

* And “reimagine” it. If the bounds were originally given as y as a function of x, the region was vertically simple. Switch it to a horizontally simple setup, solving the expressions for x as a function of y.

* If the bounds were originally given as x as a function of y, the region was horizontally simple. Switch it to a vertically simple setup, solving the expressions for y as a function of x.

* Write the new iterated integral reflecting the new setup.

* And try again.

Example: What is the region described by the bounds on $\int_0^1 \int_x^1 \sin(y^2)\,dy\,dx$? Sketch it:

Rearrange the expressions so the region reads horizontally, instead of vertically - the left and right bounds should become functions of y, while the top and bottom bounds should become constants:

Set up the new iterated integral:

And try again with the integration:
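One way the steps of this example can play out is sketched here. The region is the triangle above the line y = x inside the unit square, so read horizontally it runs from x = 0 on the left to x = y on the right:

```latex
\begin{aligned}
R &:\ 0 \le x \le 1,\ x \le y \le 1
   \quad\Longleftrightarrow\quad 0 \le y \le 1,\ 0 \le x \le y \\
\int_0^1 \int_x^1 \sin(y^2)\,dy\,dx
  &= \int_0^1 \int_0^{y} \sin(y^2)\,dx\,dy
   = \int_0^1 y\,\sin(y^2)\,dy \\
  &= \Big[ -\tfrac{1}{2}\cos(y^2) \Big]_0^1
   = \tfrac{1}{2}\,(1 - \cos 1)
\end{aligned}
```

The inner integration with respect to x supplies the factor of y that makes the substitution u = y² work.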

• A typical mistake is to do this without sketching, and simply rearrange the expressions and replace them in the integral. Don’t - as you switch from a bottom to top setup to a left to right setup, you may find that the top function (at the top of the integral) becomes the left function (at the bottom of the integral).

• Switching the bounds on regions to change the order of integration is typically done

* when the original problem cannot be integrated.

* when the original problem is annoying to integrate, and might work better in the other order.

* simply because I said to.


Double integrals in polar coordinates

• We’ve discussed integration over rectangular regions, and integration over general regions where the bounds for the regions can be expressed as functions of x or functions of y. For regions like the one below, it’s difficult to express the bounds as functions of x or y, but simple to express in polar coordinates as functions of a radius r and an angle θ.

• Furthermore, there are integrals out there that can’t be integrated as functions of x and y, but can easily be integrated if we could rewrite in terms of r and θ. To do this, we have to

* Recall the conversion between polar and rectangular coordinate systems

* Figure out what happens to the expression being integrated.

* Figure out how to rewrite the bounds.

• Converting: A point in the plane may be specified in terms of x and y (horizontal and vertical displacement from the origin), or r and θ (distance from the origin and angle with the positive x axis). This should look familiar - it’s identical to what we do with vectors, and the points they point to. You can specify magnitude and direction (r, θ), or x and y components (x, y).

x = r cos θ

y = r sin θ

r² = x² + y²

$\tan\theta = \frac{y}{x}$

• Plotting: Plotting points in polar is a matter of looking in the direction of the angle, and moving out the distance specified by the radius. Polar graph paper with radius rings and marked angles is sometimes used to plot.

For example, to plot $(r, \theta) = (2, \frac{\pi}{3})$, move out a distance of 2 along an angle of $\frac{\pi}{3}$.

The only odd thing to get used to is having negative radius - interpret $(r, \theta) = (-3, \frac{\pi}{4})$ as facing along a line at the angle $\frac{\pi}{4}$ ... and then walking backwards along that line.

Curves in polar coordinates

Polar coordinates are well suited for describing circles centered at the origin and lines through the origin.

• A circle of radius a with equation x² + y² = a² becomes r² = a², and the curve is described as

r = a

0 ≤ θ ≤ 2π

• You should recognize semicircles from the x² + y² = a² equation solved for either x or y:

$y = \sqrt{a^2 - x^2}$ draws the top half of a circle with radius a.

$y = -\sqrt{a^2 - x^2}$ draws the bottom half of a circle with radius a.

$x = \sqrt{a^2 - y^2}$ draws the right half of a circle with radius a.

$x = -\sqrt{a^2 - y^2}$ draws the left half of a circle with radius a.

All of these equations are expressed in the form r = a, α ≤ θ ≤ β, where the range of θ’s draws out the correct semicircle.

• Lines through the origin are expressed in terms of their angle.

A line with slope m has tan θ = m, or θ = tan⁻¹ m. r is allowed to vary (by not specifying anything about r at all, we’re implying it runs from −∞ to ∞, drawing out the line).

The example shown is y = x, with m = 1. So θ = tan⁻¹ 1 and

$$\theta = \frac{\pi}{4}$$

is the polar equation of this line.

• Horizontal and vertical lines have more complicated expressions in polar than they do in rectangular (but we may need this for rectangular regions).

Vertical:

x = a

r cos θ = a

$r = \frac{a}{\cos\theta}$

r = a sec θ

Horizontal:

y = b

r sin θ = b

$r = \frac{b}{\sin\theta}$

r = b csc θ

• Cartesian equations in general are converted to polar by making the substitutions

x = r cos θ, y = r sin θ

For example, if

f(x, y) = x² + xy

we can say

f(r cos θ, r sin θ) = (r cos θ)² + (r cos θ)(r sin θ)

= r² cos² θ + r² cos θ sin θ

= r² cos θ (cos θ + sin θ)

Example: Express the paraboloid f(x, y) = 9 − x² − y² as a function of r and θ.

Regions in polar coordinates

Regions in polar coordinates are expressed as inequalities in r and θ. For a θ-simple region, we have

h1(θ) ≤ r ≤ h2(θ)

α ≤ θ ≤ β

Examples:

$0 \le r \le 3,\quad \frac{\pi}{4} \le \theta \le \frac{3\pi}{4}$

$1 \le r \le 3,\quad 0 \le \theta \le 2\pi$

Rectangular regions require a little work to express in polar, since you’re slicing radially. The region

0 ≤ x ≤ 2

0 ≤ y ≤ 2

is broken into two regions:

$0 \le r \le 2\sec\theta,\quad 0 \le \theta \le \frac{\pi}{4}$

$0 \le r \le 2\csc\theta,\quad \frac{\pi}{4} \le \theta \le \frac{\pi}{2}$
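As a sanity check on that split (a sketch, using the polar area element r dr dθ that is introduced in the next part of these notes), the two pieces should add up to the area of the 2 by 2 square:

```latex
\begin{aligned}
A &= \int_0^{\pi/4} \int_0^{2\sec\theta} r\,dr\,d\theta
   + \int_{\pi/4}^{\pi/2} \int_0^{2\csc\theta} r\,dr\,d\theta \\
  &= \int_0^{\pi/4} 2\sec^2\theta\,d\theta
   + \int_{\pi/4}^{\pi/2} 2\csc^2\theta\,d\theta \\
  &= \Big[\, 2\tan\theta \,\Big]_0^{\pi/4}
   + \Big[ -2\cot\theta \,\Big]_{\pi/4}^{\pi/2}
   = 2 + 2 = 4
\end{aligned}
```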

Rewriting integrals using polar coordinates:

If f is continuous on a polar region of the form

R = {(r, θ) | α ≤ θ ≤ β, h1(θ) ≤ r ≤ h2(θ)}

then

$$\iint_R f(x, y)\,dA = \int_\alpha^\beta \int_{h_1(\theta)}^{h_2(\theta)} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta$$

Note that dA becomes r dr dθ. We won’t do a full derivation of how this change of variables works for the integrand, but we will at least justify this geometrically.

• Proceed on to the posted examples of integration. The key to these will be to

* Express the bounds of the region in polar coordinates, and put these new bounds on the integrals.

* Express the function being integrated as f(x, y) = f(r cos θ, r sin θ).

* Integrate $\int_\alpha^\beta \int_{h_1(\theta)}^{h_2(\theta)} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta$
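The three steps above can be tried out numerically. The sketch below is an illustration only: the name `polar_double_integral`, the midpoint rule, and the choice of the disk r ≤ 3 as the region are assumptions, while the paraboloid 9 − x² − y² is the one from the earlier conversion example. The extra factor r in the sum is the r from dA = r dr dθ.

```python
import math


def polar_double_integral(f, alpha, beta, h1, h2, n=400):
    """Midpoint-rule estimate of the integral of f(x, y) over the polar region
    alpha <= theta <= beta, h1(theta) <= r <= h2(theta).

    The integrand is evaluated at (r cos(theta), r sin(theta)) and
    multiplied by the extra factor r that comes from dA = r dr dtheta.
    """
    dtheta = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        theta = alpha + (i + 0.5) * dtheta
        r_lo, r_hi = h1(theta), h2(theta)
        dr = (r_hi - r_lo) / n
        for j in range(n):
            r = r_lo + (j + 0.5) * dr
            x, y = r * math.cos(theta), r * math.sin(theta)
            total += f(x, y) * r * dr * dtheta
    return total


# Volume under the paraboloid f(x, y) = 9 - x^2 - y^2 over the disk r <= 3.
volume = polar_double_integral(lambda x, y: 9 - x**2 - y**2,
                               0.0, 2 * math.pi,
                               lambda theta: 0.0, lambda theta: 3.0)
print(volume, 81 * math.pi / 2)  # the two values should agree closely
```

By hand, the same integral is 2π times the integral of (9 − r²) r from 0 to 3, which is 81π/2.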


Triple integrals

The triple integral and volume

The development of the triple integral is analogous to that of the double integral, simply moving into one more dimension, so I’m going to go light on the theory here, and simply present several examples.

Area ⇒ Volume:

The double integral $\iint dA$ over a region in the xy plane gives the area of that region and can be computed with iterated integrals:

$$R: a \le x \le b,\ g_1(x) \le y \le g_2(x) \;\Rightarrow\; A = \iint_R dA = \int_a^b \int_{g_1(x)}^{g_2(x)} dy\,dx \quad\text{(vertically simple)}$$

$$R: h_1(y) \le x \le h_2(y),\ c \le y \le d \;\Rightarrow\; A = \iint_R dA = \int_c^d \int_{h_1(y)}^{h_2(y)} dx\,dy \quad\text{(horizontally simple)}$$

$$R: h_1(\theta) \le r \le h_2(\theta),\ \alpha \le \theta \le \beta \;\Rightarrow\; A = \iint_R dA = \int_\alpha^\beta \int_{h_1(\theta)}^{h_2(\theta)} r\,dr\,d\theta \quad\text{(polar, } \theta\text{-simple)}$$

$$\int_a^b \int_{g_1(x)}^{g_2(x)} dy\,dx = \int_a^b \big(g_2(x) - g_1(x)\big)\,dx$$

When we introduce a function f(x, y) as the integrand, with f(x, y) > 0 on R, we can interpret

$$\iint_R f(x, y)\,dA = \int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx$$

$$\left( \text{or } \int_c^d \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx\,dy \quad\text{or}\quad \int_\alpha^\beta \int_{h_1(\theta)}^{h_2(\theta)} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta \right)$$

as the volume between the surface f(x, y) and the xy plane over R.

We could express the same quantity using three iterated integrals; performing the inner integration on

$$\int_a^b \int_{g_1(x)}^{g_2(x)} \int_0^{f(x,y)} dz\,dy\,dx$$

would immediately give the expression

$$\int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx$$

And I’m going to quit with the “or” at this point; assume that this could be done just as well with a dz dx dy setup or an r dz dr dθ setup. If we use the r dz dr dθ setup, we’re using the cylindrical coordinate system, which is simply the polar coordinate system extended into 3D.

Example: Write an expression that would give the volume below the surface f(x, y) = x + y over the region

R = {(x, y) | 0 ≤ x ≤ 3, 0 ≤ y ≤ x}

using (a) a double iterated integral, and (b) a triple iterated integral.
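A sketch of what the two expressions could look like, with the evaluation included as a check that they agree (the dy dx ordering is an assumed choice; the region is also horizontally simple):

```latex
\begin{aligned}
\text{(a)}\quad V &= \int_0^3 \int_0^{x} (x + y)\,dy\,dx \\
\text{(b)}\quad V &= \int_0^3 \int_0^{x} \int_0^{x+y} dz\,dy\,dx \\
\text{either way}\quad V &= \int_0^3 \Big[\, xy + \tfrac{1}{2}y^2 \,\Big]_{y=0}^{y=x}\,dx
   = \int_0^3 \tfrac{3}{2}x^2\,dx = \tfrac{27}{2}
\end{aligned}
```

Performing the inner dz integration in (b) reproduces (a) exactly.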

At this level, it’s a matter of perspective - are you integrating a function f(x, y) over a 2D planar region, or are you integrating “1” over a 3D solid region? Same difference. But if we introduce the triple integral, we have a more flexible notation, and can extend things a bit further.

Note in the following I’m going to switch the notation (and if I’d had any sense I’d have done this right at the start instead of following the textbook). We’re about to have a lot of letters kicking around, so instead of using a’s and b’s and g1(x)’s and h1(y)’s, I’m going to switch to subscripting (x1 ≤ x ≤ x2 instead of a ≤ x ≤ b, y1(x) ≤ y ≤ y2(x) instead of g1(x) ≤ y ≤ g2(x), and so on). This will make it easier to keep track of everybody, so you aren’t looking around going “Now, do the h’s go with the y’s or with the z’s?”

Volume

The triple integral $\iiint_D dV$ over a rectangular prism

D : x1 ≤ x ≤ x2, y1 ≤ y ≤ y2, z1 ≤ z ≤ z2

in space gives us the volume of that region and can be computed with iterated integrals:

$$V = \iiint_D dV = \int_{x_1}^{x_2} \int_{y_1}^{y_2} \int_{z_1}^{z_2} dz\,dy\,dx$$

$$V = \iiint_D dV = \int_{x_1}^{x_2} \int_{z_1}^{z_2} \int_{y_1}^{y_2} dy\,dz\,dx$$

$$V = \iiint_D dV = \int_{z_1}^{z_2} \int_{y_1}^{y_2} \int_{x_1}^{x_2} dx\,dy\,dz$$

(or any permutation that you like).

Over a general solid region in space, for example

D : x1 ≤ x ≤ x2, y1(x) ≤ y ≤ y2(x), 0 ≤ z ≤ z2(x, y)

the triple integral gives the volume between z = z2(x, y) and the xy plane:

$$V = \iiint_D dV = \int_{x_1}^{x_2} \int_{y_1(x)}^{y_2(x)} \int_0^{z_2(x,y)} dz\,dy\,dx$$

and more generally,

$$\iiint_D dV = \int_{x_1}^{x_2} \int_{y_1(x)}^{y_2(x)} \int_{z_1(x,y)}^{z_2(x,y)} dz\,dy\,dx$$

would give the volume (or net volume if the surfaces crossed each other) between z1 and z2.

Setting up and solving these is mostly done live - I’ve put a few example pictures in the slides, though, so you can see what we’re talking about...

Example: Let D be the region below the plane z = 4x + 2y + 1 and above the region in the xy plane bounded by x = 0, y = 0, and y = 3 − x.

D = {(x, y, z) | 0 ≤ x ≤ 3, 0 ≤ y ≤ 3 − x, 0 ≤ z ≤ 4x + 2y + 1}

The integral that would be used to compute the volume of this region would be

$$V = \iiint_D dV = \int_0^3 \int_0^{3-x} \int_0^{4x+2y+1} dz\,dy\,dx$$
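For reference, here is a sketch of how that integral evaluates, working from the inside out:

```latex
\begin{aligned}
V &= \int_0^3 \int_0^{3-x} (4x + 2y + 1)\,dy\,dx
   = \int_0^3 \Big[\, (4x + 1)\,y + y^2 \,\Big]_{y=0}^{y=3-x}\,dx \\
  &= \int_0^3 \left( -3x^2 + 5x + 12 \right) dx
   = \Big[ -x^3 + \tfrac{5}{2}x^2 + 12x \,\Big]_0^3
   = \tfrac{63}{2}
\end{aligned}
```

The first step is the inner dz integration, which simply returns the height 4x + 2y + 1 of the plane above each point of the base.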

Example: Let D be the region bounded by the paraboloid y = x² + z² and the plane y = 4. Write the integral that would give the volume of this region.

Notice that the way this one is oriented, its trace in the plane y = 4 (which is parallel to the xz plane) is the circle x² + z² = 4. You can think of this as the way the graph is facing - the best way to set up the region is with y = 4 as the floor.

So the best way to describe D is

$$D = \left\{ (x, y, z) \;\middle|\; -2 \le x \le 2,\ x^2 + z^2 \le y \le 4,\ -\sqrt{4 - x^2} \le z \le \sqrt{4 - x^2} \right\}$$

The iterated integral that would be used to compute the volume of that region is:

$$V = \iiint_D dV = \int_{-2}^{2} \int_{-\sqrt{4-x^2}}^{\sqrt{4-x^2}} \int_{x^2+z^2}^{4} dy\,dz\,dx$$

Basically, we can order our integrals any way we need to describe the region, as long as we follow the rules for two variables on the inner bound, one in the middle, constant on the outer. These are all permissible variations (and there are others):

$$V = \iiint_D dV = \int_{x_1}^{x_2} \int_{y_1(x)}^{y_2(x)} \int_{z_1(x,y)}^{z_2(x,y)} dz\,dy\,dx$$

$$V = \iiint_D dV = \int_{z_1}^{z_2} \int_{x_1(z)}^{x_2(z)} \int_{y_1(x,z)}^{y_2(x,z)} dy\,dx\,dz$$

$$V = \iiint_D dV = \int_{z_1}^{z_2} \int_{y_1(z)}^{y_2(z)} \int_{x_1(y,z)}^{x_2(y,z)} dx\,dy\,dz$$

$$V = \iiint_D dV = \int_{y_1}^{y_2} \int_{z_1(y)}^{z_2(y)} \int_{x_1(y,z)}^{x_2(y,z)} dx\,dz\,dy$$

Volume ⇒ ???:

So, we have this progression...

$$\int_{x_1}^{x_2} [y_2(x) - y_1(x)]\,dx$$

is used in one variable Calculus to give the area between the curves y1(x) and y2(x) with x1 ≤ x ≤ x2. But we can also write this as

$$\iint_R 1\,dA = \int_{x_1}^{x_2} \int_{y_1(x)}^{y_2(x)} 1\,dy\,dx$$

Drop a function (or difference of functions) in the “1” spot and we’ve bumped things up to volume:

$$\iint_R [z_2(x, y) - z_1(x, y)]\,dA = \int_{x_1}^{x_2} \int_{y_1(x)}^{y_2(x)} [z_2(x, y) - z_1(x, y)]\,dy\,dx$$

gives the volume between z1(x, y) and z2(x, y) over the region x1 ≤ x ≤ x2, y1(x) ≤ y ≤ y2(x), and we can turn that into the triple integral:

$$\iiint_D 1\,dV = \int_{x_1}^{x_2} \int_{y_1(x)}^{y_2(x)} \int_{z_1(x,y)}^{z_2(x,y)} 1\,dz\,dy\,dx$$

So, by analogy, we could drop a function in the “1” spot there and move up another dimension (and we’ll go back to using f now...I suppose w would be the other alternative after we’ve used x, y and z).

$$\iiint_D f(x, y, z)\,dV = \int_{x_1}^{x_2} \int_{y_1(x)}^{y_2(x)} \int_{z_1(x,y)}^{z_2(x,y)} f(x, y, z)\,dz\,dy\,dx$$

What does that give? If we continue to extend the geometry by analogy, w = f(x, y, z) is now a four dimensional surface, and we’re getting the “hypervolume” of that surface “over” the region D. A bit tough to visualize, and no, we can’t exactly sketch it!

Depending on what f represents, however, we may have other interpretations. A common application is that if a region has a variable density, ρ(x, y, z), integrating will give the mass of the solid region. We’ll look at a couple applications of that after we finish with the techniques of integrating over 3D regions.

The critical thing with these is being able to set up the 3D region that forms the bounds that you’re integrating over. Whether the integrand is a 1 dV or an f(x, y, z) dV plays no role in setting up that region - that function is simply the function that is being integrated.

Triple integrals

Integrals using cylindrical coordinates

Cylindrical coordinates are the extension of polar coordinates to 3D; we know that a point in 2D can be specified by either (x, y), or (r, θ), related by

$$x = r\cos\theta \qquad y = r\sin\theta \qquad x^2 + y^2 = r^2 \qquad \tan\theta = \frac{y}{x}$$

In 3D, we may specify a location in similar fashion, by either giving the (x, y, z) coordinates of the point, or the (r, θ, z) coordinates - r and θ specify a distance and angle with the x axis in the xy plane, and the third z coordinate tells you how far up or down from the xy plane to proceed.

Example:

Where would the point $(r, \theta, z) = (2, \frac{\pi}{6}, 4)$ be located? Sketch.

What are the (x, y, z) coordinates of the point $(r, \theta, z) = (2, \frac{\pi}{6}, 4)$?
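A short sketch of the conversion for this point, using x = r cos θ and y = r sin θ with z left alone:

```latex
x = 2\cos\tfrac{\pi}{6} = \sqrt{3}, \qquad
y = 2\sin\tfrac{\pi}{6} = 1, \qquad
z = 4
\quad\Rightarrow\quad (x, y, z) = (\sqrt{3},\, 1,\, 4)
```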

Surfaces in cylindrical coordinates

Note that you need to pay attention to the context in order to recognize what geometric object you’re looking at. In the context of 2D space, what does x² + y² = 4 describe? What is the corresponding polar equation?

In the context of 3D space, what does the equation x² + y² = 4 describe? What is the corresponding cylindrical equation?

Equation forms you should recognize

• r = C (radius is constant, θ and z are allowed to vary arbitrarily).

Cylinder of radius C. Trace in the xy plane is the circle r = C; i.e. x² + y² = C².

• θ = C (angle with x axis is constant, radius and z are allowed to vary).

Plane through the origin containing the z axis. Trace in xy is the line θ = C; in rectangular, that’s the line y = (tan C)x.

• z = C (z constant, radius and angle vary).

Plane parallel to the xy plane at a height of C. The polar part is causing it to be drawn in a disk shape, but that disk extends indefinitely, creating the plane z = C.

• z = Cr (height is proportional to radius, angle varies).

Double cone. Traces in planes z = k parallel to the xy plane are circles in the polar form $r = \frac{k}{C}$, rectangular form $x^2 + y^2 = \left(\frac{k}{C}\right)^2$. The sides of the cone have slope ±C.

Integration in cylindrical coordinates

The integrating factor for triple integrals in cylindrical coordinates looks like that for double integrals in polar coordinates, because the conversion equations are identical (z remains z). In cylindrical coordinates

$$\iiint_D f(x, y, z)\,dV = \int_\alpha^\beta \int_{r_1(\theta)}^{r_2(\theta)} \int_{z_1(r,\theta)}^{z_2(r,\theta)} f(r\cos\theta, r\sin\theta, z)\,r\,dz\,dr\,d\theta$$

Note this one’s order specific; we don’t get into how to switch the order of integration on these things (although it is possible). The inner bounds imply that z can be a function of r and θ, in the middle, r can be a function of θ, and on the outer bounds, θ must be constant.

Examples: Examples for this one are all done live; go there now.


Spherical coordinates

Spherical coordinates give an alternate way of specifying the location of a point based on radius and angle. We use an ordered triplet (ρ, θ, φ) where

• ρ – radius, distance from origin. ρ ≥ 0.

• θ – angle with positive x axis, in xy plane. 0 ≤ θ < 2π.

• φ – angle with positive z axis. 0 ≤ φ ≤ π.

The best way to visualize (ρ, θ, φ) is as (distance, spin, tilt).

How this works

Take a segment (or vector) of length ρ.

Tilt it down into the xz plane at an angle of φ.

Drop a line down to the x axis. Think of this as the base of a triangular wedge.

Then, grab the wedge and spin it in the xy plane. That’s your θ.

Converting from spherical (ρ, θ, φ) to rectangular (x, y, z)

To derive the conversion factors, first focus on the triangle formed by dropping a line to the xy plane. We see that the angle φ belongs in the top of that triangle. The length of the leg parallel to the z axis is also the distance from the point to the xy plane - it must be the z coordinate of the point. Also, one leg lies in the xy plane - call its length r.

z = ρ cos φ

r = ρ sin φ

Now, focus just on the xy plane. If you “drop” a leg parallel to the y axis, and perpendicular to x, you form another right triangle with hypotenuse r and angle θ.

x = r cos θ

y = r sin θ

The conversion factors that will take us back and forth between spherical and rectangular coordinate systems are:

x = ρ sin φ cos θ

y = ρ sin φ sin θ

z = ρ cos φ

Example:

Convert the point $(\rho, \theta, \phi) = (5, \frac{3\pi}{4}, \frac{5\pi}{6})$ to rectangular coordinates.
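A sketch of the conversion for this point, substituting directly into the three conversion equations:

```latex
\begin{aligned}
x &= 5 \sin\tfrac{5\pi}{6} \cos\tfrac{3\pi}{4}
   = 5 \cdot \tfrac{1}{2} \cdot \left(-\tfrac{\sqrt{2}}{2}\right) = -\tfrac{5\sqrt{2}}{4} \\
y &= 5 \sin\tfrac{5\pi}{6} \sin\tfrac{3\pi}{4}
   = 5 \cdot \tfrac{1}{2} \cdot \tfrac{\sqrt{2}}{2} = \tfrac{5\sqrt{2}}{4} \\
z &= 5 \cos\tfrac{5\pi}{6} = -\tfrac{5\sqrt{3}}{2}
\end{aligned}
```

The negative z makes sense: φ = 5π/6 is more than a quarter turn away from the positive z axis, so the point lies below the xy plane.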

210

Converting from rectangular to spherical

The same equations allow us to convert from rectangular to spherical; to go the other way, sayfrom (x, y, z) = (−1, 2, 3) into spherical, we could solve simultaneous equations

−1 = ρ sin φ cos θ

2 = ρ sin φ sin θ

3 = ρ cos φ

However, it’s probably better to go and derive some formulas in the abstract.

First, note from the triangles used to set up the conversions, we have $x^2 + y^2 = r^2$, and then $r^2 + z^2 = \rho^2$, so

$$\rho^2 = x^2 + y^2 + z^2$$

$$\rho = \sqrt{x^2 + y^2 + z^2}$$

Then, to get φ, we have

$$\cos\phi = \frac{z}{\rho}$$

Since the range of the inverse cosine function lines up neatly with the allowable values for φ (0 ≤ φ ≤ π), we can say

$$\phi = \cos^{-1}\left(\frac{z}{\rho}\right)$$

Finally, note that if we divide y = ρ sin φ sin θ by x = ρ sin φ cos θ, we get

$$\tan\theta = \frac{y}{x}$$

as with cylindrical (which we should, it’s the same θ). Since 0 ≤ θ < 2π, be sure when solving to locate θ in the correct quadrant depending on the signs of x and y.

Summary:

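Spherical to rectangular

$$x = \rho\sin\phi\cos\theta, \qquad y = \rho\sin\phi\sin\theta, \qquad z = \rho\cos\phi$$

Rectangular to spherical

$$\rho = \sqrt{x^2 + y^2 + z^2}, \qquad \phi = \cos^{-1}\left(\frac{z}{\rho}\right), \qquad \tan\theta = \frac{y}{x}$$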


Example:

Convert the point (x, y, z) = (−1, 2, 3) to spherical coordinates.
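The rectangular-to-spherical formulas can be sketched the same way. This is only an illustrative Python version (the name `rectangular_to_spherical` is hypothetical); it uses `math.atan2` to handle the quadrant issue for θ, which is a convenience of the language rather than something the notes prescribe.

```python
import math

def rectangular_to_spherical(x, y, z):
    """Convert (x, y, z) to (rho, theta, phi), with 0 <= theta < 2*pi and 0 <= phi <= pi."""
    rho = math.sqrt(x * x + y * y + z * z)
    phi = math.acos(z / rho)                    # assumes the point is not the origin
    theta = math.atan2(y, x) % (2 * math.pi)    # atan2 picks the quadrant from the signs of x and y
    return rho, theta, phi

rho, theta, phi = rectangular_to_spherical(-1, 2, 3)
print(rho, theta, phi)
```

For (−1, 2, 3) this should give ρ = √14 (about 3.74), θ a little over 2 radians (second quadrant, since x < 0 and y > 0), and φ roughly 0.64.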


Triple integrals

Spherical regions and triple integrals in spherical coordinates

Integration using spherical coordinates

I’m going to get a little ahead of things on this one and start by just showing you what the integral will look like in spherical coordinates:

$$\iiint_D f(x, y, z)\,dV = \int_{\phi_1}^{\phi_2}\int_{\theta_1}^{\theta_2}\int_{\rho_1}^{\rho_2} f(\rho\sin\phi\cos\theta,\ \rho\sin\phi\sin\theta,\ \rho\cos\phi)\,\rho^2\sin\phi\,d\rho\,d\theta\,d\phi$$

This is a fairly limited form (note all the variables of integration have constant bounds), so we can only use it on shapes that can be expressed as spherical wedges. That makes the region part of the program pretty simple - the main things that we’re interested in expressing in spherical coordinates are spheres and cones.

Note: different texts use different orderings and notation conventions (and since all the bounds are constant, you can do these in any order). Don’t be surprised if different forms/ordering/notation appear in examples and problems.

Spherical regions

We’ve seen how to plot points in spherical coordinates. The previous integral suggests we are interested in regions in the form

{ (ρ, θ, φ) | ρ1 ≤ ρ ≤ ρ2, θ1 ≤ θ ≤ θ2, φ1 ≤ φ ≤ φ2 }

The first thing is to look at a quick gallery of shapes that are easily drawn in spherical coordinates. (Images are in lecture.)

The basic shape is a sphere, of course: ρ = C translates as $\rho^2 = C^2$ and so $x^2 + y^2 + z^2 = C^2$; a sphere with radius C. With θ and φ unspecified, we get the full sphere.

The role of φ:

Recall that in spherical coordinates, (ρ, θ, φ) is (radius, spin, tilt). φ = C with 0 < φ < π/2 describes a cone (with θ and ρ unspecified):

Side note on converting cones back to rectangular: The key one that you should just recognize is that the cone φ = π/4 is the cone

$$z = \sqrt{x^2 + y^2}$$

but it’s instructive to look at how that got there so you can generalize it to cones with the sides at an angle other than 45°.


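Using the conversion factors

$$x = \rho\sin\phi\cos\theta, \qquad y = \rho\sin\phi\sin\theta, \qquad z = \rho\cos\phi$$

and plugging in φ = π/4, we get

$$x = \rho\sin\frac{\pi}{4}\cos\theta = \frac{\sqrt{2}}{2}\rho\cos\theta$$

$$y = \rho\sin\frac{\pi}{4}\sin\theta = \frac{\sqrt{2}}{2}\rho\sin\theta$$

$$z = \rho\cos\frac{\pi}{4} = \frac{\sqrt{2}}{2}\rho$$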

Then $\rho = \frac{2}{\sqrt{2}}z$, and we can plug that back in to the equations for x and y:

$$x = z\cos\theta$$

$$y = z\sin\theta$$

So x/z = cos θ, y/z = sin θ and

$$\left(\frac{x}{z}\right)^2 + \left(\frac{y}{z}\right)^2 = 1$$

$$z^2 = x^2 + y^2$$

$$z = \sqrt{x^2 + y^2}$$

since initially we had the top cone. If you want the lower cone, $z = -\sqrt{x^2 + y^2}$, specify φ = 3π/4.

Cones with other degrees of tilt can be converted in the same way, although it won’t work out quite as neatly when cos φ and sin φ have different values.

ρ and φ together:

Specifying ρ = C and φ1 ≤ φ ≤ φ2 will draw out part of the surface of the sphere, from tilt to tilt. Note that this still isn’t giving us a spherical solid, since I have a fixed value for ρ.

To draw the “ice cream cone” solid, we need to fill it in by specifying 0 ≤ ρ ≤ C, φ1 ≤ φ ≤ φ2. Other solid wedges of spheres can be described by varying the bounds for radius and tilt, and you’ll see some variations in the suggested problems.


The role of θ:

The equation θ = C describes a plane perpendicular to the xy plane in exactly the same way it would in cylindrical (strictly speaking a half-plane, since ρ ≥ 0). Allowing ρ to vary makes it look like it’s drawing out a disk, but that disk extends infinitely - so really, a plane.

Combined with the other parameters, specifying a range for θ will carve out wedges of the solid; for example 0 ≤ ρ ≤ 4, 0 ≤ φ ≤ π/2, 0 ≤ θ ≤ π/2 will give the eighth of a sphere that lies in the first octant.

Integrals in spherical coordinates

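$$\iiint_D f(x, y, z)\,dV = \int_{\phi_1}^{\phi_2}\int_{\theta_1}^{\theta_2}\int_{\rho_1}^{\rho_2} f(\rho\sin\phi\cos\theta,\ \rho\sin\phi\sin\theta,\ \rho\cos\phi)\,\rho^2\sin\phi\,d\rho\,d\theta\,d\phi$$

Coming back to this thing, notice that (as with cylindrical coordinates), we pick up an extra factor in the integral: dV becomes $\rho^2\sin\phi\,d\rho\,d\theta\,d\phi$. Where that comes from is shown in the picture below, and I’m simply going to link to a textbook explanation rather than rewriting the whole thing - it’s a bit involved.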

So, to integrate over spherical regions

• Describe the region in terms of bounds on ρ, θ and φ. If the integral is given with x, y, z or r, θ, z bounds, sketch and reinterpret in spherical.

• The integral

$$\iiint_D 1\,dV = \int_{\phi_1}^{\phi_2}\int_{\theta_1}^{\theta_2}\int_{\rho_1}^{\rho_2} 1\cdot\rho^2\sin\phi\,d\rho\,d\theta\,d\phi$$

would give the volume of the region.

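• We can also integrate a function over that region. In the case of

$$\iiint_D f(x, y, z)\,dV = \int_{\phi_1}^{\phi_2}\int_{\theta_1}^{\theta_2}\int_{\rho_1}^{\rho_2} f(\rho\sin\phi\cos\theta,\ \rho\sin\phi\sin\theta,\ \rho\cos\phi)\,\rho^2\sin\phi\,d\rho\,d\theta\,d\phi$$

be sure to convert the function over to spherical as well.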
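As a quick sanity check of the volume form, here is a worked sketch (not one of the lecture's live examples) for the first-octant eighth of a sphere described earlier, 0 ≤ ρ ≤ 4, 0 ≤ φ ≤ π/2, 0 ≤ θ ≤ π/2:

```latex
\begin{aligned}
V &= \int_{0}^{\pi/2}\int_{0}^{\pi/2}\int_{0}^{4} \rho^{2}\sin\phi \, d\rho \, d\theta \, d\phi \\
  &= \int_{0}^{\pi/2}\int_{0}^{\pi/2} \frac{64}{3}\sin\phi \, d\theta \, d\phi \\
  &= \frac{64}{3}\cdot\frac{\pi}{2}\int_{0}^{\pi/2}\sin\phi \, d\phi
   = \frac{32\pi}{3}
\end{aligned}
```

This agrees with one eighth of the familiar sphere volume, (1/8)(4/3)π(4³) = 32π/3.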

Turn to the live examples to see it in action...


Applications of double and triple integrals

Density, mass, and volume

We’ve established that the volume of a solid region D can be computed from

$$\iiint_D 1\,dV$$

where the triple integral itself can be expressed as iterated integrals in rectangular, cylindrical, or spherical - whichever is appropriate for the region.

We’ve also seen that we can integrate a function over a region in space:

$$\iiint_D f(x, y, z)\,dV \quad \left[\text{or } \iiint_D f(r, \theta, z)\,dV \text{ or } \iiint_D f(\rho, \theta, \phi)\,dV\right]$$

and the natural question is “well, what is that?”

The answer so far has been to reason up by analogy - since you can also get volume using a double integral and integrating under a function, as in

$$\int_a^b\int_c^d\int_0^{f(x,y)} 1\,dz\,dy\,dx = \int_a^b\int_c^d f(x, y)\,dy\,dx$$

you can interpret

$$\iiint_D f(x, y, z)\,dV$$

as the hypervolume “under” the 4D surface w = f(x, y, z). While a bit difficult to picture, that might be the interpretation you need in some contexts.

However, that’s not the only interpretation, and it really depends on what f(x, y, z) is representing. Instead of a fourth spatial dimension, it could be representing, say, the temperature of an object as a function of (x, y, z). The interpretation we’re going to look at is to consider what you get if f(x, y, z) is a function giving the density of a 3D object at each point in space.


Constant density, volume, and mass

Recall the basic relationship: density (ρ) equals mass (m) over volume (V), or

$$\rho = \frac{m}{V}$$

and therefore m = ρV.

If a solid region is of uniform (constant) density ρ, we can compute the mass immediately by multiplying ρ by the volume obtained through integration:

$$m = \rho\iiint_D 1\,dV$$

Notation note: the ρ here has nothing to do with the radius ρ used in spherical coordinates; it’s just conventionally used for density, and it will probably confuse things a bit if we also happen to be working in spherical.

Variable density

Now, suppose density varies as a function of (x, y, z): ρ = ρ(x, y, z). Now we have to consider a little arbitrary chunk of mass:

$$\Delta m = \rho(x_i, y_i, z_i)\,\Delta V$$

Sum up all the little chunks in the usual Riemann sum, smooth it out with the limit, and we get

$$m = \iiint_D \rho(x, y, z)\,dV$$

And that pretty much covers it for the concept, and the implementation; you’ve already been setting up triple integrals and integrating functions. The only thing new here is a possible interpretation for what that function could be. And while I’m using the (x, y, z) form in the explanation, these problems could also be occurring in cylindrical and spherical coordinate systems as well.

Keep going for a couple of worked examples, and see the suggested problems for some more.


Examples:

Example 1: Suppose the solid bounded by $x = y^2$, z = x, z = 0, x = 0, y = 1 has uniform density ρ = 3. Find the mass of the solid.

Additional notation note: yes, we lack units. Perk of a math class. If it makes you feel better, imagine the (x, y, z) coordinate system is position in meters, and the density is given in kg/m³. That’ll make the mass come out in kg.

The hardest thing about this problem is figuring out what that region looks like. If I had to hand sketch, I’d note that just $x = y^2$, x = 0, y = 1 is pretty easy to deal with - it describes a region bounded by a parabola in the xy plane, and so a parabolic cylinder in x, y, z:

The plane z = x runs parallel to the y axis and slices through that cylinder, forming the top (the bottom is just z = 0). And if all else fails, there’s always Maple (code for plot attached at end).


So the region D is best described by

$$D = \{(x, y, z) \mid 0 \le x \le y^2,\ 0 \le y \le 1,\ 0 \le z \le x\}$$

In terms of setting up the integral, y needs to be to the outside, x in the middle, and z in the inner:

$$V = \iiint_D 1\,dV = \int_0^1\int_0^{y^2}\int_0^x 1\,dz\,dx\,dy$$

Since this one’s easy enough to do by hand, I’ll just go ahead and do it - for some of them, you’ll just want to focus on the setup and use Maple.

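Inner:

$$\int_0^x 1\,dz = \big[z\big]_0^x = x$$

Middle:

$$\int_0^{y^2} x\,dx = \left[\frac{1}{2}x^2\right]_0^{y^2} = \frac{1}{2}y^4$$

Outer:

$$\int_0^1 \frac{1}{2}y^4\,dy = \left[\frac{1}{10}y^5\right]_0^1 = \frac{1}{10}$$

So the volume of the solid is V = 1/10 m³. Since m = ρV with constant ρ = 3 kg/m³, mass is

$$m = \rho V = 3\left(\frac{1}{10}\right) = \frac{3}{10} = 0.3\ \text{kg}$$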


Example 2:

Now, suppose we have the same solid, but density is variable:

ρ(x, y, z) = x + y + z

Now what is the mass of the region?

The bounds of the integral aren’t going to change, but we now have to incorporate the density function into the integration, instead of multiplying at the end:

$$m = \iiint_D \rho(x, y, z)\,dV = \int_0^1\int_0^{y^2}\int_0^x (x + y + z)\,dz\,dx\,dy$$

Still not particularly difficult, since it’s polynomial, but a slightly more annoying integral.

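Inner:

$$\int_0^x (x + y + z)\,dz = \left[xz + yz + \frac{1}{2}z^2\right]_0^x = x(x) + y(x) + \frac{1}{2}x^2 = \frac{3}{2}x^2 + xy$$

Middle:

$$\int_0^{y^2}\left(\frac{3}{2}x^2 + xy\right)dx = \left[\frac{1}{2}x^3 + \frac{1}{2}x^2y\right]_0^{y^2} = \frac{1}{2}y^6 + \frac{1}{2}y^5$$

Outer:

$$\int_0^1\left(\frac{1}{2}y^6 + \frac{1}{2}y^5\right)dy = \left[\frac{1}{14}y^7 + \frac{1}{12}y^6\right]_0^1 = \frac{1}{14} + \frac{1}{12} = \frac{13}{84}$$

The mass of the solid is 13/84 ≈ 0.15 kg.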
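If Maple isn't handy, a crude numerical check is easy to write. The sketch below is an assumption-laden illustration, not the course's method: it hard-codes the bounds 0 ≤ y ≤ 1, 0 ≤ x ≤ y², 0 ≤ z ≤ x used in the iterated integrals above and applies a plain midpoint rule in the order dz dx dy. The name `triple_midpoint` and the grid size `n` are made up for the example.

```python
def triple_midpoint(f, n=40):
    """Midpoint-rule estimate of the integral of f over
    D = {0 <= y <= 1, 0 <= x <= y^2, 0 <= z <= x}, in the order dz dx dy."""
    total = 0.0
    dy = 1.0 / n
    for i in range(n):
        y = (i + 0.5) * dy
        dx = y * y / n              # x runs from 0 to y^2
        for j in range(n):
            x = (j + 0.5) * dx
            dz = x / n              # z runs from 0 to x
            for k in range(n):
                z = (k + 0.5) * dz
                total += f(x, y, z) * dz * dx * dy
    return total

volume = triple_midpoint(lambda x, y, z: 1.0)        # constant integrand 1 gives volume
mass = triple_midpoint(lambda x, y, z: x + y + z)    # variable density from Example 2
print(volume, 1 / 10)     # estimate next to the exact volume
print(mass, 13 / 84)      # estimate next to the exact mass
```

The estimates should land close to 1/10 and 13/84; the point is only that swapping the integrand from 1 to the density function is the entire difference between the two examples.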

And that would be the whole idea - keep doing what you’ve been doing. It’s just now, you have a physical meaning to assign to what you’re getting besides “4D hypervolume.” For more examples using spherical and cylindrical coordinate systems, see the suggested problems.


Applications of double and triple integrals

Moments and center of mass

This lecture is rather sparse, and assumes you recall what moment and center of mass are from single variable Calculus (if not, dig out your Calc text and refresh, plus I’ve got some old notes that I dug up on the topic).

Moments and center of mass in 2D plane figures

Recall that we call a thin sheet of a plane region a lamina, and by “thin” we mean “has no thickness at all” - we are looking at something that is purely a 2D area, and not a solid. Despite that, we can associate a density with it, where instead of the usual density as mass per unit volume, we use mass per unit surface area.

Define the region by the usual set R of points in the plane. If the figure has variable density ρ(x, y) [or ρ(r, θ) in polar], we have a little chunk of mass given by $\Delta m = \rho(x_i, y_i)\,\Delta A$, and a total mass of

$$m = \iint_R \rho(x, y)\,dA$$

If density is constant, we can pull it through the integral and have m = ρA in the same way we’d have m = ρV.


We can also define moments about the x and y axes. These are sometimes called the “first moments” and should not be confused with “moment of inertia” which is a bit different and a topic for another day. These moments are just your moments in the sense of torque, measuring the tendency to rotate around each of the axes. The lever arm is perpendicular to the axis of rotation, and so

$$M_x = \iint_R y\,\rho(x, y)\,dA \qquad M_y = \iint_R x\,\rho(x, y)\,dA$$

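Then, we can compute the center of mass of the region $(\bar{x}, \bar{y})$ by

$$\bar{x} = \frac{M_y}{m} = \frac{\iint_R x\,\rho(x, y)\,dA}{\iint_R \rho(x, y)\,dA} \qquad \bar{y} = \frac{M_x}{m} = \frac{\iint_R y\,\rho(x, y)\,dA}{\iint_R \rho(x, y)\,dA}$$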


Example:

Find the center of mass of the lamina described by the region

$$R = \{(x, y) \mid x \ge 0,\ 0 \le y \le 9 - x^2\}$$

with density ρ(x, y) = xy.

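Set up and evaluate the integrals for m, Mx and My:

$$m = \iint_R \rho(x, y)\,dA = \int_0^3\int_0^{9-x^2} xy\,dy\,dx$$

Inner:

$$\int_0^{9-x^2} xy\,dy = \frac{1}{2}x\big[y^2\big]_0^{9-x^2} = \frac{1}{2}x(9 - x^2)^2 = \frac{81}{2}x - 9x^3 + \frac{1}{2}x^5$$

Outer:

$$\int_0^3\left(\frac{81}{2}x - 9x^3 + \frac{1}{2}x^5\right)dx = \left[\frac{81}{4}x^2 - \frac{9}{4}x^4 + \frac{1}{12}x^6\right]_0^3 = \frac{81}{4}(9) - \frac{9}{4}(81) + \frac{1}{12}(729) = 60.75$$

Mass is m = 60.75.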


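$$M_x = \iint_R y\,\rho(x, y)\,dA = \int_0^3\int_0^{9-x^2} y(xy)\,dy\,dx$$

Inner:

$$\int_0^{9-x^2} xy^2\,dy = \frac{1}{3}x\big[y^3\big]_0^{9-x^2} = \frac{1}{3}x(9 - x^2)^3 = -\frac{1}{3}x^7 + 9x^5 - 81x^3 + 243x$$

Outer:

$$\int_0^3\left(-\frac{1}{3}x^7 + 9x^5 - 81x^3 + 243x\right)dx = \left[-\frac{1}{24}x^8 + \frac{3}{2}x^6 - \frac{81}{4}x^4 + \frac{243}{2}x^2\right]_0^3 = \frac{2187}{8} = 273.375$$

Moment about x is Mx = 273.375.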

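$$M_y = \iint_R x\,\rho(x, y)\,dA = \int_0^3\int_0^{9-x^2} x(xy)\,dy\,dx$$

Inner:

$$\int_0^{9-x^2} x^2y\,dy = \frac{1}{2}x^2\big[y^2\big]_0^{9-x^2} = \frac{1}{2}x^2(9 - x^2)^2 = \frac{1}{2}x^6 - 9x^4 + \frac{81}{2}x^2$$

Outer:

$$\int_0^3\left(\frac{1}{2}x^6 - 9x^4 + \frac{81}{2}x^2\right)dx = \left[\frac{1}{14}x^7 - \frac{9}{5}x^5 + \frac{27}{2}x^3\right]_0^3 = \frac{2916}{35} \approx 83.314$$

Moment about y is My ≈ 83.314.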

Center of mass:

$$(\bar{x}, \bar{y}) = \left(\frac{M_y}{m}, \frac{M_x}{m}\right) \approx \left(\frac{83.314}{60.75}, \frac{273.375}{60.75}\right) \approx (1.37, 4.5)$$
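The three integrals differ only in the integrand, which makes them easy to cross-check numerically. The following is a minimal sketch under stated assumptions: the helper `double_midpoint` is hypothetical, it hard-codes the region 0 ≤ x ≤ 3, 0 ≤ y ≤ 9 − x², and it uses a simple midpoint rule rather than anything exact.

```python
def double_midpoint(g, n=200):
    """Midpoint-rule estimate of the integral of g over
    R = {0 <= x <= 3, 0 <= y <= 9 - x^2}, in the order dy dx."""
    total = 0.0
    dx = 3.0 / n
    for i in range(n):
        x = (i + 0.5) * dx
        dy = (9.0 - x * x) / n      # y runs from 0 up to the parabola
        for j in range(n):
            y = (j + 0.5) * dy
            total += g(x, y) * dy * dx
    return total

def density(x, y):
    return x * y

m = double_midpoint(density)
M_x = double_midpoint(lambda x, y: y * density(x, y))   # lever arm y for the moment about x
M_y = double_midpoint(lambda x, y: x * density(x, y))   # lever arm x for the moment about y
x_bar, y_bar = M_y / m, M_x / m
print(m, M_x, M_y)
print(x_bar, y_bar)
```

The estimates should sit near m = 60.75, Mx = 273.375, My ≈ 83.314, giving a center of mass close to (1.37, 4.5). In exact form those ratios are 48/35 and 9/2.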


Moments and center of mass in 3D solid figures

Define the region by the usual set D of points in the space (your choice of coordinate system). If the figure has variable density ρ(x, y, z) [or similar in cylindrical or spherical], we have seen that mass

$$m = \iiint_D \rho(x, y, z)\,dV$$

We define the moments about the three coordinate planes as:

$$M_{yz} = \iiint_D x\,\rho(x, y, z)\,dV$$

$$M_{xz} = \iiint_D y\,\rho(x, y, z)\,dV$$

$$M_{xy} = \iiint_D z\,\rho(x, y, z)\,dV$$

And locate the center of mass $(\bar{x}, \bar{y}, \bar{z})$:

$$\bar{x} = \frac{M_{yz}}{m} \qquad \bar{y} = \frac{M_{xz}}{m} \qquad \bar{z} = \frac{M_{xy}}{m}$$


Example:

Find the center of mass of the solid bounded by $x = y^2$, z = x, z = 0, x = 0, y = 1 with density ρ(x, y, z) = x + y + z.

This solid is the same one I used in the density/mass example, so we’ve already got it sketched and the mass calculated.

$$D = \{(x, y, z) \mid 0 \le x \le y^2,\ 0 \le y \le 1,\ 0 \le z \le x\}$$

$$m = \iiint_D \rho(x, y, z)\,dV = \int_0^1\int_0^{y^2}\int_0^x (x + y + z)\,dz\,dx\,dy$$

The mass of the solid is 13/84 ≈ 0.15 kg.

Now, we’ll set up the integrals for the moments:

$$M_{yz} = \iiint_D x\,\rho(x, y, z)\,dV = \int_0^1\int_0^{y^2}\int_0^x x(x + y + z)\,dz\,dx\,dy$$

$$M_{xz} = \iiint_D y\,\rho(x, y, z)\,dV = \int_0^1\int_0^{y^2}\int_0^x y(x + y + z)\,dz\,dx\,dy$$

$$M_{xy} = \iiint_D z\,\rho(x, y, z)\,dV = \int_0^1\int_0^{y^2}\int_0^x z(x + y + z)\,dz\,dx\,dy$$

It’s not that any one of those integrals is particularly difficult to do, it’s just that to do any center of mass problem, you have to do four separate integral calculations. At three integrals each, that’s twelve integrations to solve one problem. And the introduction of the different variables x, y and z into the integrand means that the integrations don’t build off each other - each one has to be redone from scratch.

I...don’t have the patience. You probably don’t either. The expectation is that if needed, yes, you could calculate each of those integrals, but sitting around working twelve integration problems to solve one example isn’t the best use of time. Hello, Maple.

Maple informs me (code attached at end) that

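$$m = \int_0^1\int_0^{y^2}\int_0^x (x + y + z)\,dz\,dx\,dy = \frac{13}{84}$$

$$M_{yz} = \int_0^1\int_0^{y^2}\int_0^x x(x + y + z)\,dz\,dx\,dy = \frac{1}{12}$$

$$M_{xz} = \int_0^1\int_0^{y^2}\int_0^x y(x + y + z)\,dz\,dx\,dy = \frac{15}{112}$$

$$M_{xy} = \int_0^1\int_0^{y^2}\int_0^x z(x + y + z)\,dz\,dx\,dy = \frac{19}{432}$$

And therefore the center of mass is

$$(\bar{x}, \bar{y}, \bar{z}) = \left(\frac{M_{yz}}{m}, \frac{M_{xz}}{m}, \frac{M_{xy}}{m}\right) = \left(\frac{7}{13}, \frac{45}{52}, \frac{133}{468}\right)$$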

These could be set up in cylindrical or spherical coordinate systems as well - see the suggested problems. In general, for 3D center of mass problems, the focus will be on the setup (and being able to switch between coordinate systems); we’ll let Maple handle the integration.


Center of mass example (Maple):

Density function:

d := (x, y, z) -> x + y + z:

m := int(int(int(d(x, y, z), z = 0 .. x), x = 0 .. y^2), y = 0 .. 1);
        13/84

Myz := int(int(int(x*d(x, y, z), z = 0 .. x), x = 0 .. y^2), y = 0 .. 1);
        1/12

Mxz := int(int(int(y*d(x, y, z), z = 0 .. x), x = 0 .. y^2), y = 0 .. 1);
        15/112

Mxy := int(int(int(z*d(x, y, z), z = 0 .. x), x = 0 .. y^2), y = 0 .. 1);
        19/432

xbar := Myz/m;
        7/13

ybar := Mxz/m;
        45/52

zbar := Mxy/m;
        133/468


Vector fields, line integrals, and Green’s Theorem

Vector fields

We’re already acquainted with vector valued functions - functions that take a single scalar t as input and produce a vector as output. For example

$$\mathbf{r}(t) = \langle t^2, t, 3t - 1\rangle = t^2\mathbf{i} + t\mathbf{j} + (3t - 1)\mathbf{k}$$

The easiest way to visualize vector valued functions of this type is to think of t as an unseen parameter that generates ordered triplets (x, y, z). We imagine the curve being traced out over time by a moving vector that points to points along it.

You can think of these functions as “scalar in, vector out.” The above function is a mapping from $\mathbb{R}$ to $\mathbb{R}^3$.

We’re also acquainted with multivariable functions - functions that take an ordered pair (or triplet, or higher) as input, and produce a single scalar as output. For example

$$f(x, y) = x^2 + y^2 - 3xy$$

We’ve been visualizing multivariable functions as points in space, where the coordinates of the xy floor determine the z coordinate of the function. Allowing x and y to vary generates a surface.

Although we haven’t really been describing them this way, multivariable functions are “vector in, scalar out;” there really isn’t any distinction between an ordered pair (x, y) and the vector < x, y > that points to it. The above function is a mapping from $\mathbb{R}^2$ to $\mathbb{R}$.

So you can guess what’s coming next: “vector in, vector out.” It’s easy enough to write a function that is a mapping from (say) $\mathbb{R}^2$ to $\mathbb{R}^2$, such as

$$\mathbf{F}(x, y) = (x^2 + y)\mathbf{i} + (2x + y)\mathbf{j}$$

And easy enough to evaluate it; calculate F(1, 3).


So the only thing left is what is the best way to visualize it. From a strictly algebraic perspective, it’s “vector in, vector out” - the image of the vector < 1, 3 > is the vector < 4, 5 > under this mapping. It’s just inconvenient geometry to think of both of them as vectors. It’s also inconvenient geometry to think of them as both points - you’d just have the input and output lying in the same plane with no real sense of the connection between them.

The best way to visualize is to treat the input as a point, and the output as a vector with its tail at that point. A representation of F(1, 3) = < 4, 5 > would be this:

Of course, (1, 3) is just one point in the domain of F. And the domain of F is $\mathbb{R}^2 = \mathbb{R}\times\mathbb{R}$, the set of all real valued ordered pairs. So each point in the plane has a vector associated with it. The overall picture of the function is a bunch of points with a bunch of vectors sticking out of them.


Example:

Generate a value table for F (evaluate at each of the points given).

Then, plot the various points in the plane with the associated vectors.


Sketching using vectors of equal magnitude

Obviously, we can’t sketch all the vectors in a vector field (infinite number of points and so on). One approach to sketching is analogous to the idea of level curves giving a picture of a surface; we look at the set of vectors where ||F|| = c (or, to make things more convenient, $||\mathbf{F}||^2 = c^2$) for various values of c. Note that ||F|| is a scalar valued function, and for $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$, ||F|| = c will describe a curve in the plane. All vectors of equal length will have their tails on that curve.

Example:

For the function $\mathbf{F}(x, y) = (x^2 + y)\mathbf{i} + (2x + y)\mathbf{j}$, what is the expression for the function $||\mathbf{F}(x, y)||^2$?

Plot an assortment of curves $||\mathbf{F}(x, y)||^2 = c^2$, say

$$c = 1 \;\to\; x^4 + 2x^2y + 4x^2 + 4xy + 2y^2 = 1$$

$$c = 2 \;\to\; x^4 + 2x^2y + 4x^2 + 4xy + 2y^2 = 4$$

$$c = 3 \;\to\; x^4 + 2x^2y + 4x^2 + 4xy + 2y^2 = 9$$

You’ll want to use plotting software for this; these are implicitly defined and not a familiar shape like an ellipse or parabola. Implicitly defined curves can be plotted in Maple, MVT, Winplot, whatever, so pick something you’re comfortable with.

Then, focus in on one of the curves (say c = 1) and find some points that lie on that curve.

This is a bunch of tedious scratch work, so I’ll help you out with this bit and generate a few for you. The by hand approach would just be to pick a bunch of x’s and solve for the matching y’s.
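The “pick an x, solve for y” approach is mechanical enough to script. The sketch below is one possible way to do it, not the course's posted procedure: it treats the level-curve equation as a quadratic in y, 2y² + (2x² + 4x)y + (x⁴ + 4x² − c²) = 0, and solves it with the quadratic formula. The function names and the sample x values are arbitrary choices for illustration.

```python
import math

def F(x, y):
    """The vector field F(x, y) = (x^2 + y) i + (2x + y) j, returned as a pair."""
    return (x * x + y, 2 * x + y)

def points_on_level_curve(xs, c=1.0):
    """For each x, solve 2y^2 + (2x^2 + 4x)y + (x^4 + 4x^2 - c^2) = 0 for y."""
    points = []
    for x in xs:
        a = 2.0
        b = 2 * x * x + 4 * x
        k = x ** 4 + 4 * x * x - c * c
        disc = b * b - 4 * a * k
        if disc < 0:
            continue                 # the curve has no point with this x
        for sign in (1, -1):
            y = (-b + sign * math.sqrt(disc)) / (2 * a)
            points.append((x, y))
    return points

points = points_on_level_curve([-0.4, -0.2, 0.0, 0.2, 0.4])
for x, y in points:
    u, v = F(x, y)
    print((round(x, 3), round(y, 3)), (round(u, 3), round(v, 3)), round(math.hypot(u, v), 6))
```

Each printed row is a point on the c = 1 curve, the vector F attaches there, and that vector's length, which should come out as 1 up to rounding.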


Evaluate $\mathbf{F}(x, y) = (x^2 + y)\mathbf{i} + (2x + y)\mathbf{j}$ at each of those points, as before. You should be able to observe that each of the vectors does have length 1, and you can attach them to the curve $||\mathbf{F}(x, y)||^2 = 1$.

Really, we just use software

Examples of vector fields plotted in MVT and Maple are shown in the slides. Instructions for doing so are posted. The main thing of note there is the scaling - one thing you may have noticed is that if the vectors are scaled at the same scale as the curves, they tend to take over the graph. Software generated vector fields tend to scale things down a good bit - they’re proportional to themselves (correct slope/direction) and to each other (relative lengths), but not necessarily the same scale as the underlying functions.


Domain and codomain - a note

It’s implied, but I should explicitly state this: vector fields are a subset of all possible vector-to-vector mappings. In particular, we call it a vector field when the domain and the codomain are the same space. Vector fields in the plane are mappings from $\mathbb{R}^2$ (the (x, y) points) to $\mathbb{R}^2$ (the < x, y > vectors), and vector fields in space are mappings from $\mathbb{R}^3$ (the (x, y, z) points) to $\mathbb{R}^3$ (the < x, y, z > vectors).

We could certainly look at mappings $\mathbb{R}^n \to \mathbb{R}^m$ where n ≠ m, but we wouldn’t call them vector fields, and we’d have a whole new set of interpretations. That’s another class...

So what do they represent?

The main physical interpretation is of the vector field lines as forces, or flow lines, or basically something vector-y acting on an object at a point. For example, if you look at the field in the previous example and visualize it as flowing water, you can chart the course of an object dropped in at any point - the vectors are describing how it will be carried along. A collection of physical examples appears as a set of notes.



grad, div, curl

Recall that for a multivariable scalar function f(x, y) [or f(x, y, z) - unless otherwise stated, figure that anything we develop with 2 variables extends to 3 or more], the gradient function of f is defined by

$$\nabla f(x, y) = \langle f_x(x, y), f_y(x, y)\rangle$$

$$\nabla f(x, y, z) = \langle f_x(x, y, z), f_y(x, y, z), f_z(x, y, z)\rangle$$

Example:

For $f(x, y) = x^2y^3$, find ∇f(x, y).

The point to the preceding is that the gradient of a scalar function always produces a vector field:

$$\mathbf{F}(x, y) = \nabla f(x, y) = f_x(x, y)\mathbf{i} + f_y(x, y)\mathbf{j}$$

One question that’s natural to ask is does that work in the other direction: is every vector field the gradient of some scalar function? The answer is “no, not always, but sometimes,” and that one will be addressed in the material for conservative vector fields. The purpose of this lecture is different: we’re going to introduce the idea of using the gradient symbol ∇ as a vector operator that can be applied to a scalar function (producing the above mentioned gradient), but can also be applied to vector fields using vector operations.

So this next bit is technically notation abuse, but it makes for a very simple way to express and remember three key formulas.


The ∇ operator

Consider ∇ to be a vector “operator”

$$\nabla = \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y} \right\rangle$$

That’s where the notation abuse comes in, because ∂/∂x and ∂/∂y are not quantities or functions. ∂/∂x means “take the derivative with respect to x of ...” and you’ll notice that the “of” isn’t specified!

So to make sure we’ve got this perfectly straight, note the difference between these two things:

∂/∂x (f) or ∂f/∂x is a function - the partial derivative of f with respect to x.

∂/∂x by itself is an operator looking for something to operate on - it stands for the “take the derivative with respect to x of...” part, but it needs to be paired with an operand before anything happens.

And now for more notation abuse...

∇ = < ∂/∂x, ∂/∂y > is an operator in the shape of a vector.

So ∇f would be notated

$$\nabla f = \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y} \right\rangle f$$

Well, that’s not really a vector “times” a scalar, because < ∂/∂x, ∂/∂y > isn’t a thing that can be multiplied, but notationally, it certainly looks like one, so act like you’re “multiplying” the scalar f into the vector < ∂/∂x, ∂/∂y >:

$$\nabla f = \left\langle \frac{\partial}{\partial x}(f), \frac{\partial}{\partial y}(f) \right\rangle$$

And interpret the operators as operators - ∂/∂x (f) doesn’t mean “some quantity times f,” it still means “the derivative of f with respect to x.” In other words ∂/∂x (f) is still just ∂f/∂x, otherwise known as $f_x$, and

$$\nabla f = \left\langle \frac{\partial}{\partial x}(f), \frac{\partial}{\partial y}(f) \right\rangle = \langle f_x, f_y\rangle$$

exactly as it should be.

Seems like an awfully convoluted way to go about thinking about the gradient, but the notation makes two otherwise unwieldy new computational formulas that we apply to vector fields simple to recall.


Divergence

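Suppose F(x, y) = M(x, y)i + N(x, y)j is a vector field in the plane. We define the divergence of F(x, y) by

$$\operatorname{div}\mathbf{F}(x, y) = \frac{\partial M}{\partial x} + \frac{\partial N}{\partial y}$$

For a vector field in space F(x, y, z) = M(x, y, z)i + N(x, y, z)j + P(x, y, z)k,

$$\operatorname{div}\mathbf{F}(x, y, z) = \frac{\partial M}{\partial x} + \frac{\partial N}{\partial y} + \frac{\partial P}{\partial z}$$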

Example:

Let $\mathbf{F}(x, y, z) = x^2z\,\mathbf{i} + \ln(x^2 + y^2)\,\mathbf{j} - 3xy^2z^3\,\mathbf{k}$. Find div F.
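One way the computation could go, sketched here term by term with M = x²z, N = ln(x² + y²), and P = −3xy²z³ read off from the field:

```latex
\operatorname{div}\mathbf{F}
  = \frac{\partial}{\partial x}\left(x^{2}z\right)
  + \frac{\partial}{\partial y}\left(\ln(x^{2}+y^{2})\right)
  + \frac{\partial}{\partial z}\left(-3xy^{2}z^{3}\right)
  = 2xz + \frac{2y}{x^{2}+y^{2}} - 9xy^{2}z^{2}
```

Note that the result is a scalar function, not a vector.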

In vector operator notation,

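$$\begin{aligned}
\operatorname{div}\mathbf{F}(x, y, z) &= \frac{\partial M}{\partial x} + \frac{\partial N}{\partial y} + \frac{\partial P}{\partial z} \\
&= \frac{\partial}{\partial x}(M) + \frac{\partial}{\partial y}(N) + \frac{\partial}{\partial z}(P) \\
&= \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle \cdot \langle M, N, P\rangle \\
&= \nabla\cdot\mathbf{F}
\end{aligned}$$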


Curl

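Suppose F(x, y, z) = M(x, y, z)i + N(x, y, z)j + P(x, y, z)k is a vector field in space. We define the curl of F(x, y, z) by

$$\operatorname{curl}\mathbf{F} = \left(\frac{\partial P}{\partial y} - \frac{\partial N}{\partial z}\right)\mathbf{i} - \left(\frac{\partial P}{\partial x} - \frac{\partial M}{\partial z}\right)\mathbf{j} + \left(\frac{\partial N}{\partial x} - \frac{\partial M}{\partial y}\right)\mathbf{k}$$

...which brings us to the real reason we like to think of ∇ as a vector operator. Gradient and divergence are easy enough to remember in terms of differentiation, but the above formula looks like a nightmare (differentiate who with respect to what, where?!)...until you recognize that the pattern of this thing minus that thing is very familiar - it’s the same structure as a cross product.

In vector operator notation,

$$\operatorname{curl}\mathbf{F} = \nabla\times\mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ M & N & P \end{vmatrix}$$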

Example:

Let $\mathbf{F}(x, y, z) = x^2z\,\mathbf{i} + \ln(x^2 + y^2)\,\mathbf{j} - 3xy^2z^3\,\mathbf{k}$. Find curl F.

This is just a matter of getting the correct derivatives of the correct things:
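A sketch of how the bookkeeping might be laid out, with M = x²z, N = ln(x² + y²), P = −3xy²z³ (the six partials are listed first so each slot of the curl formula can be filled in directly):

```latex
\begin{aligned}
&\frac{\partial P}{\partial y} = -6xyz^{3}, \quad
 \frac{\partial N}{\partial z} = 0, \quad
 \frac{\partial P}{\partial x} = -3y^{2}z^{3}, \quad
 \frac{\partial M}{\partial z} = x^{2}, \quad
 \frac{\partial N}{\partial x} = \frac{2x}{x^{2}+y^{2}}, \quad
 \frac{\partial M}{\partial y} = 0 \\[4pt]
&\operatorname{curl}\mathbf{F}
 = \left(-6xyz^{3} - 0\right)\mathbf{i}
 - \left(-3y^{2}z^{3} - x^{2}\right)\mathbf{j}
 + \left(\frac{2x}{x^{2}+y^{2}} - 0\right)\mathbf{k} \\
&\phantom{\operatorname{curl}\mathbf{F}}
 = -6xyz^{3}\,\mathbf{i} + \left(x^{2} + 3y^{2}z^{3}\right)\mathbf{j} + \frac{2x}{x^{2}+y^{2}}\,\mathbf{k}
\end{aligned}
```

Unlike the divergence, the curl is itself a vector field.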


Summary

Using the vector operator notation, ∇ = < ∂/∂x, ∂/∂y, ∂/∂z > or < ∂/∂x, ∂/∂y > as appropriate, we can succinctly state the three quantities grad, div, and curl as

• grad f = ∇f or ∇(f).

• div F = ∇ · F

• curl F = ∇× F



Conservative vector fields

We have seen that the gradient of a scalar function always produces a vector field:

∇f(x, y) = fx(x, y)i + fy(x, y)j = F(x, y) [in the plane]

∇f(x, y, z) = fx(x, y, z)i + fy(x, y, z)j + fz(x, y, z)k = F(x, y, z) [in space]

One question that comes up is, “Does this work in the other direction? Is every vector field the gradient of some scalar function?” The answer is no, but some are, and it’s useful to have a test to determine which, and a process for finding a scalar function f such that ∇f = F.

Conservative vector field

A vector field F is called conservative if there exists a differentiable scalar function f such that ∇f = F. The scalar function f is called the potential function of F.

Theorem: Test for conservative vector field in the plane

Let M and N have continuous first partial derivatives on an open disk R. The vector field given by F(x, y) = M(x, y)i + N(x, y)j is conservative if and only if

$$\frac{\partial N}{\partial x} = \frac{\partial M}{\partial y}$$

Proof:

It’s simple to prove one direction: if F is conservative, then ∂N/∂x = ∂M/∂y. Suppose that F is conservative; i.e. F is the gradient of some scalar function f. So

$$\nabla f = M\mathbf{i} + N\mathbf{j}$$

and $f_x = M$, $f_y = N$.

Differentiating $f_x$ with respect to y gives $f_{xy} = \partial M/\partial y$. Differentiating $f_y$ with respect to x gives $f_{yx} = \partial N/\partial x$. And Clairaut’s theorem says that [if continuous, which is part of the hypothesis] $f_{xy} = f_{yx}$ and so

$$\frac{\partial N}{\partial x} = \frac{\partial M}{\partial y}$$

The proof of the other direction, which is really the direction we use when applying the test, has to be put on hold. It requires the use of Green’s Theorem, which puts in an appearance later on in the material. We’ll prove it when we get there.


Example:

Determine whether F(x, y) = (2x)i + (y)j is conservative.

Example:

Determine whether $\mathbf{F}(x, y) = (x^2y)\mathbf{i} + (xy)\mathbf{j}$ is conservative.
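Applying the test to the two examples might look like the following sketch, where M is the i component and N the j component of each field:

```latex
\begin{aligned}
\mathbf{F} = (2x)\mathbf{i} + (y)\mathbf{j}:&\quad
  \frac{\partial N}{\partial x} = \frac{\partial}{\partial x}(y) = 0, \quad
  \frac{\partial M}{\partial y} = \frac{\partial}{\partial y}(2x) = 0
  \;\Rightarrow\; \text{equal, so conservative} \\[4pt]
\mathbf{F} = (x^{2}y)\mathbf{i} + (xy)\mathbf{j}:&\quad
  \frac{\partial N}{\partial x} = \frac{\partial}{\partial x}(xy) = y, \quad
  \frac{\partial M}{\partial y} = \frac{\partial}{\partial y}(x^{2}y) = x^{2}
  \;\Rightarrow\; \text{not equal, so not conservative}
\end{aligned}
```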

Finding a potential function

Once it has been established that a vector field is conservative, the next question is “What is the scalar function that this function is the gradient of?” or “What is the potential function?”

Since the assertion that F is conservative is an assertion that F = ∇f, that sets up the process for solving - set ∇f = F, resulting in two equations: $f_x = M$, and $f_y = N$.

These are differential equations - we integrate M with respect to x, and N with respect to y. The only unusual thing here is the constants of integration. And, since we know that in antidifferentiation problems, there’s more than one function that fits (because of those constants of integration), the question is really to find a potential function, rather than the potential function.


Example:

Find a potential function for F(x, y) = (2x)i + (y)j (which we have already established is conservative).

• Since $f_x = M = 2x$, f(x, y) must equal $\int M\,dx$, and we start by integrating M with respect to x. However, where we would normally pick up a +C as the constant of integration, this will be a little different. Since $f_x$ is the partial derivative with respect to x, any function of y is constant relative to that. So instead of tacking on a +C, tack on a +g(y) as the constant of integration.

• Do the same with $f_y = N = y$: f(x, y) must also equal $\int N\,dy$. But since any expression with an x in it is constant relative to $f_y$, the constant of integration can be a +h(x).

• Since both functions are expressions for f(x, y), equate the two. There’s no one correct solution, but there’s usually an obvious solution in how the pieces match up - you can see what g(y) and h(x) can be to satisfy the equation.

$$x^2 + g(y) = \frac{1}{2}y^2 + h(x)$$

• And write the final answer for f(x, y). You can check by differentiating - ∇f had better equal the original F!
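Carrying the steps through for this field, one consistent choice of the pieces is sketched below (any added constant C gives another valid potential function):

```latex
\begin{aligned}
f(x,y) &= \int 2x\,dx = x^{2} + g(y), \qquad
f(x,y) = \int y\,dy = \tfrac{1}{2}y^{2} + h(x) \\
x^{2} + g(y) &= \tfrac{1}{2}y^{2} + h(x)
  \;\Rightarrow\; g(y) = \tfrac{1}{2}y^{2}, \quad h(x) = x^{2} \\
f(x,y) &= x^{2} + \tfrac{1}{2}y^{2} \\
\text{Check: } \nabla f &= \langle 2x,\; y \rangle = \mathbf{F}
\end{aligned}
```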


Theorem: Test for conservative vector field in space

Let M, N, and P have continuous first partial derivatives on an open sphere Q. The vector field given by F(x, y, z) = M(x, y, z)i + N(x, y, z)j + P(x, y, z)k is conservative if and only if

$$\frac{\partial P}{\partial y} = \frac{\partial N}{\partial z}, \quad \frac{\partial P}{\partial x} = \frac{\partial M}{\partial z}, \quad \text{and} \quad \frac{\partial N}{\partial x} = \frac{\partial M}{\partial y}$$

Proof:

The proof of this one is basically an extension of the proof of the “in the plane” version, but with more mixed partials to consider. In one direction, if F = ∇f for some f, then, for example, we have

$$f_x = M \qquad f_z = P$$

$$f_{xz} = \frac{\partial M}{\partial z} \qquad f_{zx} = \frac{\partial P}{\partial x}$$

and by equivalence of mixed partials, those have to be equal. The other variations are similar. And as before, going in the other direction has to be deferred until Green’s Theorem.

The above result can be summarized and remembered more easily if you note that the pattern of partials is once again a determinant. Recall that

$$\operatorname{curl}\mathbf{F} = \nabla\times\mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ M & N & P \end{vmatrix}$$

So, for example, the i component of the curl is ∂P/∂y − ∂N/∂z, and a statement that ∂P/∂y = ∂N/∂z is the same as ∂P/∂y − ∂N/∂z = 0; i.e. the i component of the curl is zero.

Look at the other components as well, and you’ll see that the test for mixed partials is equivalent to the statement F(x, y, z) = M(x, y, z)i + N(x, y, z)j + P(x, y, z)k is conservative if and only if

$$\operatorname{curl}\mathbf{F} = 0\mathbf{i} + 0\mathbf{j} + 0\mathbf{k} = \mathbf{0}$$

Example:

Determine whether the vector field $\mathbf{F}(x, y, z) = ye^z\,\mathbf{i} + xe^z\,\mathbf{j} + e^z\,\mathbf{k}$ is conservative.
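Taking the field exactly as written, with M = ye^z, N = xe^z, and P = e^z, the curl test could be worked as in this sketch:

```latex
\begin{aligned}
\operatorname{curl}\mathbf{F}
 &= \left(\frac{\partial P}{\partial y} - \frac{\partial N}{\partial z}\right)\mathbf{i}
  - \left(\frac{\partial P}{\partial x} - \frac{\partial M}{\partial z}\right)\mathbf{j}
  + \left(\frac{\partial N}{\partial x} - \frac{\partial M}{\partial y}\right)\mathbf{k} \\
 &= \left(0 - xe^{z}\right)\mathbf{i} - \left(0 - ye^{z}\right)\mathbf{j} + \left(e^{z} - e^{z}\right)\mathbf{k}
  = -xe^{z}\,\mathbf{i} + ye^{z}\,\mathbf{j}
\end{aligned}
```

Since the curl is not the zero vector, the field as stated fails the test and is not conservative. Only the k component, ∂N/∂x = ∂M/∂y, checks out.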

There’s a live example posted for finding a potential function for a vector field in space (same process as for a vector field in the plane), and additional examples appear in the suggested problems.



Line integrals

The problem: Suppose you have a surface z = f(x, y) defined over a region D.

Restrict the domain of the function to the values of x and y which lie on the boundary of D.

On this domain, the values of f(x, y) define a curve in space. It does not make sense to ask what the volume is under this curve.

However, imagine a fence put up along the boundary, whose height at each point is given by the value of f(x, y). It does make sense to ask “What’s the surface area of that fence?” It also leads to questions such as “Is there a relationship between that area and the volume under the surface?”

Figure 1: [Plots not reproduced. Panels: the surface z = f(x, y), and f(x, y) on the boundary of D, each drawn on x, y, z axes.]

The value of that area is one way to define the value of an integral of f(x, y) along the boundary of D. Integrals of this type are called line integrals.

Notation

There are several ways to denote a line integral. The general version usually occurs outside of freshman Calc texts, works for any dimension, and uses the following conventions:

• Ω is a region in $\mathbb{R}^n$.

• The boundary of the region is denoted ∂Ω.

• The vector x is n dimensional: $\mathbf{x} = \langle x_1, x_2, \ldots, x_n \rangle$.

• f(x) is a function of n variables defined on Ω, so its graph lives in n + 1 dimensions.

• The integral of f over Ω is denoted

$$\int_{\Omega} f(\mathbf{x})\, d\mathbf{x}$$

(in one variable, the area under f(x); in two variables, the volume under $f(x_1, x_2) = f(x, y)$; in three variables, the hypervolume under $f(x_1, x_2, x_3) = f(x, y, z)$; and so on, for positive valued functions).

• The line integral of f over the boundary ∂Ω of Ω is denoted

$$\int_{\partial\Omega} f(\mathbf{x})\, dS$$

We’ll pretty much stick with the two variable / 3D case (since that’s the one we can visualize). In this case, we have the notation

• D is a region in $\mathbb{R}^2$ (the xy plane).

• The boundary of the region is denoted C = ∂D. (The C stands for ‘curve’.)

• The vector x is 2 dimensional: $\mathbf{x} = \langle x, y \rangle$.

• f(x, y) is a function of 2 variables defined on D, so its graph is a surface in 3 dimensions.

• The integral of f over D is denoted

$$\iint_D f(x, y)\, dA$$

• The line integral of f over the boundary of D, denoted ∂D or C, is denoted

$$\int_C f(x, y)\, ds$$

• The curve C doesn’t have to be the boundary of an enclosed region; we can evaluate the line integral along a non-closed curve.

Evaluating Line Integrals

First, note that the ds in the integral up there is what it was before: an infinitesimally small chunk of arc length. (Think about it for a second and make sure that it makes sense that you’ll get the area of the fence by multiplying the function value with a chunk of length along the curve, then summing up.) Recall from past experience that if we can parameterize the curve C by

$$\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}, \qquad a \le t \le b$$

then

$$ds = \sqrt{dx^2 + dy^2} = \sqrt{\frac{dx^2 + dy^2}{dt^2}\, dt^2} = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\; dt$$

So

$$\int_C f(x, y)\, ds = \int_a^b f(x(t), y(t)) \sqrt{(x'(t))^2 + (y'(t))^2}\; dt$$

Step by step

• Parameterize the curve if not already done.

• Write the function f(x, y) as a function of t by composing f(x(t), y(t)).

• Write the expression $ds = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\; dt = \sqrt{(x'(t))^2 + (y'(t))^2}\; dt$.

• Set up the integral

$$\int_C f(x, y)\, ds = \int_a^b f(x(t), y(t)) \sqrt{(x'(t))^2 + (y'(t))^2}\; dt$$

by making the above substitutions.

• Integrate and evaluate.
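The steps above can be sketched numerically. This is a minimal illustration, not a replacement for working the integral by hand: the function name `line_integral_scalar`, the choice of the midpoint rule, and the test curve (a circle of radius 2) are all assumptions made for the example.

```python
import math

def line_integral_scalar(f, x, y, dx, dy, a, b, n=20000):
    """Approximate the integral of f(x, y) ds along r(t) = (x(t), y(t)),
    a <= t <= b, using the composite midpoint rule with n subintervals."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h                 # midpoint of the k-th subinterval
        speed = math.hypot(dx(t), dy(t))      # sqrt(x'(t)^2 + y'(t)^2)
        total += f(x(t), y(t)) * speed
    return total * h

# Sanity check: with f = 1 the "fence" has height 1 everywhere, so the
# integral is just the length of the curve. For a circle of radius 2
# that length is 4*pi.
length = line_integral_scalar(
    lambda x, y: 1.0,
    lambda t: 2 * math.cos(t), lambda t: 2 * math.sin(t),
    lambda t: -2 * math.sin(t), lambda t: 2 * math.cos(t),
    0.0, 2 * math.pi)
print(length, 4 * math.pi)
```

Passing x(t), y(t) and their derivatives as separate functions mirrors the hand process: parameterize, compose, build ds, then integrate.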


Example:

Find

$$\int_C x y^4\, ds$$

where C is the right half of the circle $x^2 + y^2 = 16$ traversed counterclockwise from (0, −4) to (0, 4).

• Quickly sketch the curve C and parameterize it. Recall that the best way to parameterize a circle or ellipse is by using sine and cosine.

• Convert the function $f(x, y) = x y^4$ to f(x(t), y(t)) by substituting your parameterization.

• Compute $\frac{dx}{dt}$ and $\frac{dy}{dt}$ and use them to construct $ds = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\; dt$.

• Assemble the whole thing into an integral and integrate.
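One way the four steps might be filled in, offered as a sketch to check your own work against. The parameterization x = 4 cos t, y = 4 sin t on −π/2 ≤ t ≤ π/2 is an assumed choice; other parameterizations of the same half circle work too.

```latex
\begin{align*}
\mathbf{r}(t) &= 4\cos t\,\mathbf{i} + 4\sin t\,\mathbf{j}, \qquad -\tfrac{\pi}{2} \le t \le \tfrac{\pi}{2} \\
f(x(t), y(t)) &= (4\cos t)(4\sin t)^4 = 1024\cos t\,\sin^4 t \\
ds &= \sqrt{(-4\sin t)^2 + (4\cos t)^2}\; dt = 4\, dt \\
\int_C x y^4\, ds &= \int_{-\pi/2}^{\pi/2} 4096\cos t\,\sin^4 t\; dt
  = 4096\left[\frac{\sin^5 t}{5}\right]_{-\pi/2}^{\pi/2} = \frac{8192}{5}
\end{align*}
```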


The plot of the surface $f(x, y) = x y^4$ and one interpretation of what you just found is given below. This “fence” area is called the lateral surface area.

Figure 2: [Plots not reproduced. Panels: the surface z = f(x, y), and f(x, y) on the boundary, each drawn on x, y, z axes.]

Question:

Does the direction of travel along the path matter? What would happen if you had traversed the half circle clockwise from (0, 4) to (0, −4)?

More on curves

The curve in the previous example is an example of a smooth curve, which intuitively is just what it sounds like: no sharp changes in direction. Formally, a plane curve [space curve] C given by

$$\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} \qquad [\,\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}\,]$$

is smooth if $\frac{dx}{dt}$ and $\frac{dy}{dt}$ [and $\frac{dz}{dt}$] are continuous on [a, b], and not simultaneously zero on (a, b).

A curve is piecewise smooth if the interval [a, b] can be partitioned into a finite number of subintervals, on each of which C is smooth.

If a curve is piecewise smooth, we’ll have to break up the integral, since each piece will have a different parameterization.

$$\int_C f\, ds = \int_{C_1} f\, ds + \int_{C_2} f\, ds + \cdots + \int_{C_n} f\, ds$$

where $C = C_1 \cup C_2 \cup \cdots \cup C_n$ and each of the $C_i$ is smooth.

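As we saw in the first example, parameterizing a curve gives it an orientation: a specific direction of travel along a curve. For an integral with respect to arc length, $\int_C f\, ds$, reversing the orientation does not change the value, because ds is a length and is positive whichever way you travel. Direction does matter for the line integrals of vector fields coming up next: there, reversing the orientation reverses the sign of the value of the integral.

For example, both parameterizations below give the segment of the line y = x + 1 with endpoints (0, 1) and (1, 2), but have opposite orientations: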

r1(t) = ti + (t + 1)j, 0 ≤ t ≤ 1

r2(t) = (1 − t)i + (2 − t)j, 0 ≤ t ≤ 1
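A quick numerical check of how orientation affects each kind of line integral, using the two parameterizations above. The scalar function f and the vector field F below are hypothetical choices made only for this sketch, and the vector-field integral of F · dr is the one defined in the next section. The arc-length integral comes out the same in both directions, while the vector-field integral flips sign.

```python
import math

def midpoint(g, a, b, n=2000):
    """Composite midpoint rule for the integral of g(t) over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

# The two parameterizations from the text, with their derivatives.
r1, dr1 = (lambda t: (t, t + 1)), (lambda t: (1.0, 1.0))
r2, dr2 = (lambda t: (1 - t, 2 - t)), (lambda t: (-1.0, -1.0))

f = lambda x, y: x * y          # hypothetical scalar function
F = lambda x, y: (y, x * x)     # hypothetical vector field

def scalar_integral(r, dr):
    # integral of f ds = integral of f(r(t)) * |r'(t)| dt
    return midpoint(lambda t: f(*r(t)) * math.hypot(*dr(t)), 0.0, 1.0)

def vector_integral(r, dr):
    # integral of F . dr = integral of F(r(t)) . r'(t) dt
    def integrand(t):
        Fx, Fy = F(*r(t))
        dx, dy = dr(t)
        return Fx * dx + Fy * dy
    return midpoint(integrand, 0.0, 1.0)

s1, s2 = scalar_integral(r1, dr1), scalar_integral(r2, dr2)
v1, v2 = vector_integral(r1, dr1), vector_integral(r2, dr2)
print(s1, s2)   # same value for both orientations
print(v1, v2)   # equal magnitude, opposite sign
```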


Example:

Write a piecewise smooth parameterization of the curve shown below. Note also that the orientation is shown on the curve, and be sure your parameterization reflects that.

Line integrals in 3D and another physical interpretation

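There’s no particular reason mathematically why we can’t extend this to curves in space. If C is given by

$$\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}, \qquad a \le t \le b$$

then

$$ds = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2}\; dt$$

$$\int_C f(x, y, z)\, ds = \int_a^b f(x(t), y(t), z(t)) \sqrt{(x'(t))^2 + (y'(t))^2 + (z'(t))^2}\; dt$$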

But describing the result as an analog of surface area is tricky to visualize. Here is another possible physical interpretation, which certainly works for curves in 2D, but also extends nicely up into 3 dimensional curves: density and mass.

Imagine a wire in 2D or 3D whose shape is described by the smooth curve C. If f(x, y) or f(x, y, z) is a function that gives the density of the wire, then the line integral

$$\int_C f\, ds$$

gives the mass.
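As an illustration of the mass interpretation, here is a small numerical sketch. The wire (one turn of a helix) and the density f(x, y, z) = z are hypothetical, chosen only because the exact answer is easy to compute by hand for comparison; the name `wire_mass` is likewise made up for this example.

```python
import math

def wire_mass(density, r, dr, a, b, n=20000):
    """Mass of a wire along r(t), a <= t <= b: the integral of density ds,
    approximated with the composite midpoint rule."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        dx, dy, dz = dr(t)
        speed = math.sqrt(dx * dx + dy * dy + dz * dz)   # |r'(t)|
        total += density(*r(t)) * speed
    return total * h

# Hypothetical wire: one turn of the helix r(t) = (cos t, sin t, t),
# with hypothetical density f(x, y, z) = z (heavier toward the top).
mass = wire_mass(
    lambda x, y, z: z,
    lambda t: (math.cos(t), math.sin(t), t),
    lambda t: (-math.sin(t), math.cos(t), 1.0),
    0.0, 2 * math.pi)

# By hand: |r'(t)| = sqrt(2), so the mass is sqrt(2) times the integral
# of t dt on [0, 2*pi], which is sqrt(2) * 2 * pi^2.
exact = math.sqrt(2) * 2 * math.pi ** 2
print(mass, exact)
```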



Line integrals of vector fields

Suppose you have a vector field

F(x, y, z) = M(x, y, z)i + N(x, y, z)j + P (x, y, z)k

One of the innumerable things this function could represent is the force on a particle as afunction of its position in space.

Example:

Given a force $\mathbf{F}(x, y, z) = x^2\,\mathbf{i} - (x + y + z)\,\mathbf{j} + z\,\mathbf{k}$, what is the force on the particle when it is located at the point (1, 2, 3)?
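Evaluating a vector field at a point just means substituting the coordinates into each component, which for this example works out as:

```latex
\begin{align*}
\mathbf{F}(1, 2, 3) &= (1)^2\,\mathbf{i} - (1 + 2 + 3)\,\mathbf{j} + 3\,\mathbf{k} \\
&= \mathbf{i} - 6\,\mathbf{j} + 3\,\mathbf{k}
\end{align*}
```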

Now, suppose we want to move that particle around a given curve C. At each point on the curve, the force on the particle will vary. What is the work done by the force in moving the particle around the curve? Time to derive...

First, the basic formula for work is W = Fd (force times distance). We’ve seen that with vector forces, the only part of the force that does work is the part that’s moving you in the direction of travel. Since we’re traveling along a curve, at each point on the curve, the part of the force that’s doing work is the component that lies along the unit tangent vector T:

$$(\mathbf{F} \cdot \mathbf{T})\,\mathbf{T}$$

A little chunk of work done in moving the object along a little chunk of curve would be

$$\Delta W_i = \big(\mathbf{F}(x_i, y_i, z_i) \cdot \mathbf{T}(x_i, y_i, z_i)\big)\, \Delta s_i$$

Summing all the little chunks up and pushing to infinity in the usual way gives

$$W = \int_C \mathbf{F}(x, y, z) \cdot \mathbf{T}(x, y, z)\, ds$$

which has the form $\int_C f\, ds$ with $f = \mathbf{F} \cdot \mathbf{T}$.

We’ve seen however that integrating with respect to ds is a nuisance, since the function has to be re-parameterized in terms of the length s of the curve. So we make the substitutions

$$\mathbf{T} = \frac{\mathbf{r}'}{\|\mathbf{r}'\|}, \qquad ds = \|\mathbf{r}'\|\, dt$$

and parameterize in terms of t, a ≤ t ≤ b:

$$W = \int_a^b \mathbf{F}(x(t), y(t), z(t)) \cdot \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}\, \|\mathbf{r}'(t)\|\, dt$$

$$W = \int_a^b \mathbf{F}(x(t), y(t), z(t)) \cdot \mathbf{r}'(t)\, dt$$

We can further write r′(t) dt as dr and write the whole thing compactly as

$$W = \int_C \mathbf{F} \cdot d\mathbf{r}$$

Anything in this form is termed a line integral of a vector field. It doesn’t have to be work, since F could be representing something other than force, but that’s one possible physical interpretation that motivates the development of these things.

Line integral of a vector field

Let F be a continuous vector field defined on a smooth curve C with parameterization r(t), a ≤ t ≤ b. The line integral of F on C is given by

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_C \mathbf{F} \cdot \mathbf{T}\, ds = \int_a^b \mathbf{F}(x(t), y(t), z(t)) \cdot \mathbf{r}'(t)\, dt$$

To compute a line integral of a vector field:

• Parameterize the curve if not already done, obtaining r(t).

• Write the function F(x, y) or F(x, y, z) as a function of t by composing.

• Get r′(t).

• Get the dot product F · r′.

• Set up the integral $\int_a^b \mathbf{F} \cdot \mathbf{r}'\, dt$ and evaluate.

Example:

Find the work done by the force

$$\mathbf{F}(x, y) = x^2 y^3\,\mathbf{i} - y\sqrt{x}\,\mathbf{j}$$

in moving a particle along a curve defined by

$$\mathbf{r}(t) = t^2\,\mathbf{i} - t^3\,\mathbf{j}, \qquad 0 \le t \le 1$$

• Parameterize the curve if not already done, obtaining r(t).

• Write the function F(x, y) or F(x, y, z) as a function of t by composing.

• Get r′(t).

• Get the dot product F · r′.

• Set up the integral $\int_a^b \mathbf{F} \cdot \mathbf{r}'\, dt$ and evaluate.
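The five steps translate almost line for line into a numerical check. The sketch below is one possible way to verify a hand computation for this example; the function name `work` and the midpoint rule are choices made for the illustration. By hand, F(r(t)) · r′(t) = −2t¹⁴ − 3t⁶, which integrates to −59/105.

```python
import math

def work(F, r, dr, a, b, n=20000):
    """Approximate the integral of F . dr along r(t), a <= t <= b, as the
    integral of F(r(t)) . r'(t) dt (composite midpoint rule)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        Fx, Fy = F(*r(t))            # compose F with the parameterization
        dx, dy = dr(t)
        total += Fx * dx + Fy * dy   # dot product F . r'
    return total * h

W = work(
    lambda x, y: (x**2 * y**3, -y * math.sqrt(x)),   # F = x^2 y^3 i - y sqrt(x) j
    lambda t: (t**2, -t**3),                         # r(t) = t^2 i - t^3 j
    lambda t: (2 * t, -3 * t**2),                    # r'(t)
    0.0, 1.0)
print(W, -59 / 105)
```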



Line integrals of vector fields, differential form

This is largely a notational thing, although it does suggest one more possible technique that we’ll see at the end.

You may sometimes see a line integral of a vector field F(x, y) = M i + N j or F(x, y, z) = M i + N j + P k given in the differential form

$$\int_C M\, dx + N\, dy \qquad \text{or} \qquad \int_C M\, dx + N\, dy + P\, dz$$

The caution about this notation is that it appears to be suggesting something that you aren’t allowed to do: it does NOT typically mean integrate M with respect to x, and N with respect to y, and so on.

Here, try that for yourself and see that it makes no sense:

$$\int_C x y\, dx + x^2 y^3\, dy$$

with C given by $\mathbf{r}(t) = t\,\mathbf{i} - t^2\,\mathbf{j}$, 0 ≤ t ≤ 1.

Integrating M with respect to x and N with respect to y would give...well $\frac{1}{2}x^2 y + \frac{1}{4}x^2 y^4$ and...now what? How do you interpret the “integrate over C” part of things? You haven’t incorporated the curve into the integral. The 0 and 1 clearly aren’t the bounds on the integral, since those are values for t, and not x or y.

Of course we do have functions x(t) and y(t), and I suppose we could plug them in now, but we all know there’s a big difference between integrate and then substitute vs. substitute and then integrate: it won’t give you the same answer. And substitute before integrate is still the correct way to go.

As before, you really need to make all the substitutions and parameterize in terms of t before you integrate:

$\mathbf{r}(t) = t\,\mathbf{i} - t^2\,\mathbf{j}$ gives us x(t) = t, y(t) = −t², and so

$$\frac{dx}{dt} = 1 \;\rightarrow\; dx = 1\, dt$$

$$\frac{dy}{dt} = -2t \;\rightarrow\; dy = -2t\, dt$$

and so

$$\int_C x y\, dx + x^2 y^3\, dy = \int_0^1 (t)(-t^2)(1\, dt) + (t)^2(-t^2)^3(-2t\, dt)$$

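In the above integral, I’ve made substitutions for x, y, dx and dy. Now notice that dt is a common factor, and it becomes

$$\int_0^1 \left( (t)(-t^2)(1) + (t)^2(-t^2)^3(-2t) \right) dt$$

$$\int_0^1 \left( -t^3 + 2t^9 \right) dt$$

(note that $(-t^2)^3 = -t^6$, so the second term is $(t^2)(-t^6)(-2t) = +2t^9$).

Integrating that is trivial (I’ll leave it as an exercise to the reader). Instead, consider this. Since

$$\int_C x y\, dx + x^2 y^3\, dy$$

has been given as an example of something in the form $\int_C M\, dx + N\, dy$, the implication is that M and N are the component functions of a vector field F(x, y) = M i + N j.

Consider (in the usual way) the question of finding $\int_C \mathbf{F} \cdot d\mathbf{r}$ when

$$\mathbf{F}(x, y) = x y\,\mathbf{i} + x^2 y^3\,\mathbf{j}$$

with C given by $\mathbf{r}(t) = t\,\mathbf{i} - t^2\,\mathbf{j}$, 0 ≤ t ≤ 1.

• Get F(x(t), y(t)):

$$\mathbf{F}(x(t), y(t)) = (t)(-t^2)\,\mathbf{i} + (t)^2(-t^2)^3\,\mathbf{j} = -t^3\,\mathbf{i} - t^8\,\mathbf{j}$$

• Get r′(t):

$$\mathbf{r}'(t) = 1\,\mathbf{i} - 2t\,\mathbf{j}$$

• Dot them:

$$\mathbf{F} \cdot \mathbf{r}' = \langle -t^3, -t^8 \rangle \cdot \langle 1, -2t \rangle = -t^3 + 2t^9$$

And look, we’re right back around to the integral

$$\int_0^1 \left( -t^3 + 2t^9 \right) dt$$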

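Both forms really say the same thing: all that’s happening in the differential form is that the usual form has already been dotted out. If you work through the notation, you’ll see that the two forms are equivalent:

$$\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r} &= \int_a^b \mathbf{F} \cdot \mathbf{r}'\, dt \\
&= \int_a^b \langle M, N \rangle \cdot \langle x'(t), y'(t) \rangle\, dt \\
&= \int_a^b \langle M, N \rangle \cdot \left\langle \frac{dx}{dt}, \frac{dy}{dt} \right\rangle dt \\
&= \int_a^b \left( M \frac{dx}{dt} + N \frac{dy}{dt} \right) dt \\
&= \int_C M\, dx + N\, dy
\end{aligned}$$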

It’s really just a matter of whether you prefer to think of things as x′(t) and y′(t), or as dx and dy: if you make the substitutions for t in accordance with what’s there, you’ll come out with the same thing.

So why the additional and somewhat confusing notation? Because occasionally you can skip over the whole parameterization thing and get y in terms of x, and therefore dy in terms of dx. This will work, and be labor saving. Note, however, that it only works for vector fields in the plane. If you’re working in space, you can’t describe curves in a y = f(x) format, but have to use parametric descriptions.

Example one:

Evaluate the line integral $\int_C (2x - y)\, dx + (x + 3y)\, dy$ where C is the arc on $y = x^{3/2}$ from (0, 0) to (4, 8). Do this in two ways: (1) the original process of parameterizing everybody in terms of t, and (2) using the differential form M dx + N dy directly.

Example two:

Evaluate the line integral $\int_C (2x - y)\, dx + (x + 3y)\, dy$ where C is the parabolic path x = t, y = 2t², 0 ≤ t ≤ 2. Do this in two ways: (1) the original process of parameterizing everybody in terms of t, and (2) using the differential form M dx + N dy directly.
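Both approaches for this example can be compared numerically. The sketch below is only a check on hand work; the helper `midpoint` and the function names are invented for the illustration. By hand, either route reduces to integrating 2t + 2t² + 24t³ from 0 to 2, which gives 316/3.

```python
def midpoint(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

M = lambda x, y: 2 * x - y
N = lambda x, y: x + 3 * y

# Way (1): parameterize with x = t, y = 2t^2, so r'(t) = <1, 4t>, 0 <= t <= 2.
def by_parameter(t):
    x, y = t, 2 * t**2
    return M(x, y) * 1 + N(x, y) * (4 * t)

# Way (2): eliminate t to get y = 2x^2, so dy = 4x dx, with x from 0 to 2.
def by_differential_form(x):
    y = 2 * x**2
    return M(x, y) + N(x, y) * (4 * x)

way1 = midpoint(by_parameter, 0.0, 2.0)
way2 = midpoint(by_differential_form, 0.0, 2.0)
print(way1, way2, 316 / 3)
```

Because x = t on this path, the two integrands are the same function with the variable renamed, which is the point of the differential form.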



Fundamental theorem of line integrals

Theorem:

Let C be a (piecewise) smooth curve given by r(t), a ≤ t ≤ b. Let f be a differentiable function of two or three variables with continuous gradient vector ∇f. Then

$$\int_C \nabla f \cdot d\mathbf{r} = f(\mathbf{r}(b)) - f(\mathbf{r}(a))$$

We’ll get to the proof in a little bit (it’s surprisingly simple), but first we should look at the implications.

First, it tells us that if a given F is the gradient of some scalar function f, then

$$\int_C \mathbf{F} \cdot d\mathbf{r} = f(\mathbf{r}(b)) - f(\mathbf{r}(a)) \qquad \text{when } \mathbf{F} = \nabla f$$

We’ve already dealt with the concept “if F is the gradient of some scalar function f”; another way to phrase this is “if F is conservative.” So if F is conservative, then

$$\int_C \mathbf{F} \cdot d\mathbf{r} = f(\mathbf{r}(b)) - f(\mathbf{r}(a)) \qquad \text{when } \mathbf{F} = \nabla f$$

The fundamental theorem is giving us several things, then, or at least a way to piece some things we already know together:

• We do have tests to determine if F is a conservative vector field.

• If F is in fact conservative, we have a process for finding a scalar function f so that F = ∇f. Recall that f is called a potential function for F.

• And the fundamental theorem is giving us an alternative way to evaluate the line integral: find F’s potential function f, and evaluate it at the final and initial points of the curve.

Example:

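Let F = 2x i + y j. Let C be the curve parameterized by

$$\mathbf{r}(t) = t\,\mathbf{i} + \sqrt{t}\,\mathbf{j}, \qquad 0 \le t \le 4$$

• Start by evaluating the line integral in the usual way: compute $\int_C \mathbf{F} \cdot d\mathbf{r}$ by computing

$$\int_a^b \mathbf{F}(x(t), y(t)) \cdot \mathbf{r}'(t)\, dt$$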

• Now, try the fundamental theorem. First, verify that F is conservative.

• Find a potential function f so ∇f = F

• Evaluate f(r(b)) − f(r(a))
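The two routes in this example can be compared with a short numerical sketch. The potential function f = x² + y²/2 used below is one candidate found by inspection (its partials are 2x and y); the helper `midpoint` is an illustrative choice, picked because it never samples t = 0, where r′(t) is undefined.

```python
import math

def midpoint(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

F = lambda x, y: (2 * x, y)                  # the field from the example
r = lambda t: (t, math.sqrt(t))              # r(t) = t i + sqrt(t) j
dr = lambda t: (1.0, 0.5 / math.sqrt(t))     # r'(t), undefined at t = 0

def integrand(t):
    Fx, Fy = F(*r(t))
    dx, dy = dr(t)
    return Fx * dx + Fy * dy

# The usual way: integrate F(r(t)) . r'(t) over 0 <= t <= 4.
direct = midpoint(integrand, 0.0, 4.0)

# The Fundamental Theorem way, with candidate potential f = x^2 + y^2 / 2
# (check: f_x = 2x and f_y = y, matching the components of F).
f = lambda x, y: x**2 + y**2 / 2
via_potential = f(*r(4.0)) - f(*r(0.0))

print(direct, via_potential)
```

Both values should agree with the hand result: the integrand simplifies to 2t + 1/2, whose integral from 0 to 4 is 18, and f(4, 2) − f(0, 0) = 18.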


Independence of path

For a given vector field F, integrating F over different curves will generally NOT produce the same result, even if the curves start and end in the same place. Verify this for yourself:

Example:

Let $C_1$ be given by $\mathbf{r}_1(t) = t\,\mathbf{i} + t\,\mathbf{j}$, 0 ≤ t ≤ 1.

Let $C_2$ be given by $\mathbf{r}_2(t) = t\,\mathbf{i} + t^2\,\mathbf{j}$, 0 ≤ t ≤ 1.

• Sketch $C_1$ and $C_2$ and verify that they start and end in the same place.

• Now, take a vector field, say F(x, y) = y i + 2 j, and compute $\int_{C_1} \mathbf{F} \cdot d\mathbf{r}$ and $\int_{C_2} \mathbf{F} \cdot d\mathbf{r}$.
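A numerical version of this comparison, as a sketch for checking hand work. The helper names `midpoint` and `line_integral` are invented for the illustration. By hand, the integrands are t + 2 on the straight path and t² + 4t on the parabola, giving 5/2 and 7/3.

```python
def midpoint(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def line_integral(F, r, dr, a, b):
    """Integral of F . dr along r(t), a <= t <= b."""
    def integrand(t):
        Fx, Fy = F(*r(t))
        dx, dy = dr(t)
        return Fx * dx + Fy * dy
    return midpoint(integrand, a, b)

F = lambda x, y: (y, 2.0)    # F = y i + 2 j

# C1: straight line from (0, 0) to (1, 1); C2: parabola between the same points.
over_C1 = line_integral(F, lambda t: (t, t), lambda t: (1.0, 1.0), 0.0, 1.0)
over_C2 = line_integral(F, lambda t: (t, t**2), lambda t: (1.0, 2 * t), 0.0, 1.0)
print(over_C1, over_C2)
```

The two results differ even though the endpoints match, which is what path dependence means for this field.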


However, if F is conservative, the statement

$$\int_C \mathbf{F} \cdot d\mathbf{r} = f(\mathbf{r}(b)) - f(\mathbf{r}(a))$$

tells us that for two curves $\mathbf{r}_1(t)$ and $\mathbf{r}_2(t)$, as long as $\mathbf{r}_1(a) = \mathbf{r}_2(a)$ and $\mathbf{r}_1(b) = \mathbf{r}_2(b)$, it doesn’t matter which path you take...and as long as you know the endpoints, you don’t need to know the path at all. We say that the line integral $\int_C \mathbf{F} \cdot d\mathbf{r}$ is independent of path.

Theorem:

If F is continuous on an open connected region, then the line integral $\int_C \mathbf{F} \cdot d\mathbf{r}$ is independent of path if and only if F is conservative.

Example:

Let $\mathbf{F}(x, y, z) = 2xy\,\mathbf{i} + (x^2 + z^2)\,\mathbf{j} + 2zy\,\mathbf{k}$. Let C be a piecewise smooth curve from (1, 1, 0) to (0, 2, 3). Is it possible to evaluate $\int_C \mathbf{F} \cdot d\mathbf{r}$? If so, do it.

• First, check to see if F is conservative. Since F is a vector field in space, you need to check the condition curl F = 0.

• Since F is conservative, the line integral is independent of path, and it doesn’t matter how we get from (1, 1, 0) to (0, 2, 3). It is possible to evaluate $\int_C \mathbf{F} \cdot d\mathbf{r}$. To do so, we need to find a potential function for F:

• Apply the Fundamental Theorem and evaluate

$$f(x(b), y(b), z(b)) - f(x(a), y(a), z(a))$$

Notice you don’t have the values of a and b (which would be the t values for whatever parameterization) and you don’t need them, because you already have the result: (x(b), y(b), z(b)) = (0, 2, 3) and (x(a), y(a), z(a)) = (1, 1, 0).
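One way the potential function step and the final evaluation might go, as a sketch to compare against your own work. The intermediate functions g(y, z) and h(z) are the usual “constants” of partial integration, named here only for the illustration.

```latex
\begin{align*}
f_x = 2xy \;&\Rightarrow\; f = x^2 y + g(y, z) \\
f_y = x^2 + g_y = x^2 + z^2 \;&\Rightarrow\; g = y z^2 + h(z) \\
f_z = 2yz + h'(z) = 2zy \;&\Rightarrow\; h'(z) = 0 \\
f(x, y, z) &= x^2 y + y z^2 \quad (\text{plus any constant}) \\
\int_C \mathbf{F} \cdot d\mathbf{r} &= f(0, 2, 3) - f(1, 1, 0) = 18 - 1 = 17
\end{align*}
```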

Major and extremely useful consequence

Theorem:

If F is conservative and C is a closed curve (starts and ends at the same point), then

$$\int_C \mathbf{F} \cdot d\mathbf{r} = 0$$

If C is closed, then r(b) = r(a), so

$$f(\mathbf{r}(b)) = f(x(b), y(b), z(b)) = f(x(a), y(a), z(a)) = f(\mathbf{r}(a))$$

no matter what r is. And if F is conservative, then

$$\int_C \mathbf{F} \cdot d\mathbf{r} = f(\mathbf{r}(b)) - f(\mathbf{r}(a)) = 0$$

Example:

Let $\mathbf{F}(x, y) = x^2\,\mathbf{i} + \sin y\,\mathbf{j}$. Let C be the counterclockwise path around the triangle with vertices (1, 1), (3, 4), and (2, 7). What is the value of $\int_C \mathbf{F} \cdot d\mathbf{r}$?
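If you want to confirm the closed-path result without relying on the theorem, the three edges can be integrated numerically. This is only a sanity check under the assumption that each edge is parameterized as a straight segment r(t) = p + t(q − p), 0 ≤ t ≤ 1; the function name `segment_integral` is made up for the sketch.

```python
import math

def segment_integral(F, p, q, n=20000):
    """Integral of F . dr along the straight segment from p to q,
    parameterized as r(t) = p + t (q - p), 0 <= t <= 1 (midpoint rule)."""
    dx, dy = q[0] - p[0], q[1] - p[1]     # r'(t) is constant on a segment
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        Fx, Fy = F(p[0] + t * dx, p[1] + t * dy)
        total += Fx * dx + Fy * dy
    return total * h

F = lambda x, y: (x**2, math.sin(y))      # F = x^2 i + sin(y) j
vertices = [(1, 1), (3, 4), (2, 7)]       # counterclockwise order

# Walk each edge, returning to the starting vertex to close the path.
total = sum(segment_integral(F, vertices[i], vertices[(i + 1) % 3])
            for i in range(3))
print(total)
```

The printed value should be zero up to numerical error, matching what the theorem predicts for a conservative field on a closed path.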

Summary of results

What this comes down to is a variety of choices for evaluating line integrals of vector fields. First, test to see if F is conservative.

• If F is not conservative, you’ll have to evaluate it somehow.

– Parameterize the path (or paths) in terms of t and evaluate $\int_C \mathbf{F} \cdot d\mathbf{r}$ the “long way.”

– If in the plane and the path can be expressed as y = f(x) or x = f(y), express in differential form $\int_C M\, dx + N\, dy$ and evaluate using dx or dy if possible.

• If F is conservative and C is a closed path.

– Immediately conclude $\int_C \mathbf{F} \cdot d\mathbf{r} = 0$.

• If F is conservative and C is not a closed path.

– Parameterize the path (or paths) in terms of t and evaluate $\int_C \mathbf{F} \cdot d\mathbf{r}$ the “long way.”

– If in the plane and the path can be expressed as y = f(x) or x = f(y), express in differential form $\int_C M\, dx + N\, dy$ and evaluate using dx or dy if possible.

– Find a potential function and apply the fundamental theorem (must do if particular path is unknown).

– Replace the path with a simpler path with the same start and end points.


Green’s Theorem - Introduction

Green’s theorem is a key theorem that has several versions (all of which are equivalent, of course). The version that we are going to look at gives us another way to evaluate a line integral of a vector field in the plane by computing a related double integral.

Recall that one way to denote a line integral of a vector field

$$\mathbf{F}(x, y) = M(x, y)\,\mathbf{i} + N(x, y)\,\mathbf{j}$$

is by using the notation

$$\int_C M\, dx + N\, dy$$

(called the differential form).

Recall also (because I can’t stress this enough!) that this does not mean to integrate M with respect to x and N with respect to y! You’d find yourself ending up with an expression that made no sense when you hit the “now evaluate over C” part of the integral.

The M dx + N dy notation is simply the stretched out version of

$$\langle M, N \rangle \cdot \langle dx, dy \rangle = \mathbf{F} \cdot d\mathbf{r}$$

and we know that the most general way to find

$$\int_C \mathbf{F} \cdot d\mathbf{r}$$

is to use a parameterization r(t) on a ≤ t ≤ b and work our way through

$$\int_a^b \mathbf{F}(x(t), y(t)) \cdot \mathbf{r}'(t)\, dt$$

We’ve also seen alternatives to that, however; if r(t) is a curve in the plane and can be expressed as y = f(x) or x = f(y), then we may write dy in terms of dx or vice versa, and skip the whole parameterization part.

The Fundamental Theorem of Line Integrals gave us another way to evaluate line integrals if F happens to be a conservative vector field, either in the plane or in space. The Fundamental Theorem gives us the result that for conservative fields, the path from start to finish doesn’t matter, and if we can find a potential function, we can evaluate the integral without parameterizing the path, or even knowing or caring what the path is, as long as we know the endpoints.

A corollary of that is that the value of a line integral of a conservative field on a closed (starts and ends at the same point) path is 0.

Green’s Theorem gives us yet another way to evaluate a line integral for a vector field in the plane. It relates the value of the line integral along a closed path in the plane to a specific double integral computed over the area enclosed by that path. Why do we need yet another way? Because Green’s Theorem also does not require us to parameterize the path, and may be a simpler option when the path has multiple pieces, but the enclosed region is fairly straightforward to describe. Unlike the Fundamental Theorem, it doesn’t require that the vector field be conservative, and can be applied to any field in the plane. However, the path does need to be closed.

As a side bonus, since we’re turning this into a double integral, it allows us a way to work in polar! Yay!

Green’s Theorem states:

Let R be a simply connected region with a piecewise smooth boundary C, oriented counterclockwise. If M and N have continuous partial derivatives in an open region containing R, then

$$\int_C M\, dx + N\, dy = \iint_R \left( \frac{\partial N}{\partial x} - \frac{\partial M}{\partial y} \right) dA$$

One thing that comes out of this section is that there are numerous ways to evaluate line integrals, and part of that concept includes knowing how to choose the best way for a particular problem. I’ll summarize that after we get through Green’s Theorem, and may even try to flow chart the decision process!


Green’s Theorem

Let R be a simply connected region with a piecewise smooth boundary C, oriented counterclockwise. If M and N have continuous partial derivatives in an open region containing R, then

$$\int_C M\, dx + N\, dy = \iint_R \left( \frac{\partial N}{\partial x} - \frac{\partial M}{\partial y} \right) dA$$

Now, we need to define some terms...

Simple closed curve

A curve C is simple if it does not intersect itself at any point other than its endpoints. If C is given by r(t) = x(t)i + y(t)j with a ≤ t ≤ b, then C is simple if r(c) ≠ r(d) for all c ≠ d in the open interval (a, b). If r(a) = r(b) (the curve starts and ends at the same spot, enclosing a region), then C is a closed curve.

Simply connected region

A region R is simply connected if its boundary C consists of a single simple closed curve.

Oriented counterclockwise

Recall that reversing the direction of travel reverses the sign on the line integral of a vector field, so we need to establish a default direction. A curve which is oriented counterclockwise is exactly what it sounds like: if you walk along the curve, the enclosed region will always lie to your left. If you wish to traverse a path in a clockwise direction, you can still use Green’s Theorem, but you need to switch the sign of your answer.

Steps to implement Green’s Theorem

This is pretty painless, since it’s based on two things you already know how to do: (1) set up area integrals, and (2) take partial derivatives.

• Verify that the curve C forms the border of a simply connected region.

• Construct the expression $\frac{\partial N}{\partial x} - \frac{\partial M}{\partial y}$.

• Describe the region R in the way you would as if you were setting up an area integral. dx dy, dy dx and r dr dθ are all options here.

• Set up and integrate

$$\iint_R \left( \frac{\partial N}{\partial x} - \frac{\partial M}{\partial y} \right) dA$$

Notation note:

There’s nothing special about the $\int_C M\, dx + N\, dy$ notation for the line integral here (it’s just easiest to write the proof using that form). We could also state Green’s Theorem as

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \iint_R \left( \frac{\partial N}{\partial x} - \frac{\partial M}{\partial y} \right) dA$$

Both notations just mean “line integral of field F over curve C.”

Note about conservative fields:

We already know that if F is conservative, then the line integral of F around any closed path should be 0. Green’s Theorem only applies to closed paths in the first place, and also leads to that result. Recall the test for conservative fields in the plane is to check for $\frac{\partial N}{\partial x} = \frac{\partial M}{\partial y}$. If this is true, then the expression $\frac{\partial N}{\partial x} - \frac{\partial M}{\partial y} = 0$, and the value of the Green’s Theorem integral will be zero as well.

Example:

Use Green’s Theorem to evaluate the line integral

$$\int_C y^3\, dx + (x^3 + 3xy^2)\, dy$$

where C is the path shown in the figure below.

• Note that C is the boundary of a simply connected region R. Also note the orientation is counterclockwise; if it were clockwise, we’d just switch the sign of the answer at the end.

• Construct the expression $\frac{\partial N}{\partial x} - \frac{\partial M}{\partial y}$.

• Since R is both horizontally and vertically simple, you have your choice of setup. Write the inequalities that describe R in the usual way.

• Set up and integrate $\iint_R \left( \frac{\partial N}{\partial x} - \frac{\partial M}{\partial y} \right) dA$.

Exercise:

For comparison, compute that same line integral using the other techniques we’ve established:

• Since the curves are given in y = f(x) form, you can directly sub out y’s and dy’s in the $\int_C M\, dx + N\, dy$ structure without parameterizing. Note that it will take two integrals, one for each leg of the piecewise path.

• And, you can also do it from the initial definition of line integral, of course. Parameterize $C_1$ and $C_2$ and compute $\int_{C_1} \mathbf{F} \cdot d\mathbf{r}$ and $\int_{C_2} \mathbf{F} \cdot d\mathbf{r}$.

Solution to this one posted as a separate item - go and try it, then check.
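For a rough numerical comparison of the Green’s Theorem route with the direct route, here is a sketch. The figure is not reproduced here, so the region is an assumption: R is taken to lie between y = x³ and y = x for 0 ≤ x ≤ 1, with C running out along y = x³ and back along y = x. If the actual figure differs, the numbers will differ too.

```python
def midpoint(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

# ASSUMPTION: R is the region between y = x^3 (lower curve, C1) and
# y = x (upper curve, C2) for 0 <= x <= 1, traversed counterclockwise.
M = lambda x, y: y**3
N = lambda x, y: x**3 + 3 * x * y**2

# Green's Theorem side: N_x - M_y = (3x^2 + 3y^2) - 3y^2 = 3x^2, which does
# not depend on y, so the inner dy integral is 3x^2 * (x - x^3).
double_integral = midpoint(lambda x: 3 * x**2 * (x - x**3), 0.0, 1.0)

# Line integral side, using dy = f'(x) dx on each leg.
leg1 = midpoint(lambda x: M(x, x**3) + N(x, x**3) * 3 * x**2, 0.0, 1.0)  # y = x^3, x from 0 to 1
leg2 = -midpoint(lambda x: M(x, x) + N(x, x) * 1.0, 0.0, 1.0)            # y = x, x from 1 to 0
line_integral = leg1 + leg2

print(double_integral, line_integral)
```

Under that assumed region both sides come out to 1/4: the double integral directly, and the line integral as 3/2 − 5/4.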

Summary

Green’s Theorem gives us yet another technique for computing line integrals. Use when

• C forms the boundary of a simply connected region in the plane, and

• F is NOT conservative (if F is conservative, the answer is zero, and you’re done)


Line integral strategy

You’ve probably noticed at this point that we have several different ways to compute a line integral. The ultimate trick is to figure out which is the most efficient one to use for any given problem. I may try to flow chart this; in the meantime, I’ll at least list out what to look for.

If it’s a fairly simple integrand and a single curve which is already parameterized, you can probably just go straight to the definition, and it’ll take less time than sorting all this out. For something a bit more complicated (especially piecewise smooth curves with multiple pieces), here are some things you can look for:

First, test to see whether F is conservative or not:

For $\mathbf{F}(x, y) = M\,\mathbf{i} + N\,\mathbf{j}$, check $\frac{\partial N}{\partial x} = \frac{\partial M}{\partial y}$.

For $\mathbf{F}(x, y, z) = M\,\mathbf{i} + N\,\mathbf{j} + P\,\mathbf{k}$, check $\operatorname{curl} \mathbf{F} = \mathbf{0}$.

If F is conservative

• Is it a closed path? If yes, the answer is 0 and you’re done!

• If it’s not a closed path

o Are you given a path, or do you only know the endpoints [say $(x_1, y_1)$ and $(x_2, y_2)$ in the plane, or $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ in space] for the path? If you don’t have a path, you must use the Fundamental Theorem of Line Integrals:

$$\int_C \mathbf{F} \cdot d\mathbf{r} = f(x_2, y_2, z_2) - f(x_1, y_1, z_1) \qquad \text{where } \nabla f = \mathbf{F}$$

o If you have expression(s) for the curve(s) that form the path, you have several options:

– If the path has multiple pieces, and it’s easy to find a potential function, try the Fundamental Theorem.

– If the path has multiple pieces, but you’re having a hard time getting a potential function from ∇f, try replacing that path with a simpler path with the same start and end points.

– If the path has one piece (or you’ve replaced it so it has one piece), it’s in the plane, and you can express it easily as y = f(x) or x = f(y), try working with the $\int_C M\, dx + N\, dy$ form, and don’t bother with the t’s.

– If the path has one piece and can’t be expressed nicely as y = f(x) or x = f(y), or if it’s already parameterized and it’s easy enough to work with the parameterization, or it’s in space, work straight from the definition: get F · dr with the t’s and dt’s subbed in, and integrate.

If F is not conservative

• Is it a closed path in the plane, forming the boundary of a simply connected region? If yes, try Green’s Theorem. Vertically simple, horizontally simple, and theta-simple (polar) regions work out nicely.

• If it’s not a closed path

o Are you given a path, or do you only know the endpoints [say $(x_1, y_1)$ and $(x_2, y_2)$ in the plane, or $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ in space] for the path? If you don’t have a path, and it’s not conservative, that’s it, you’re stuck. These integrals are not independent of path, so it matters how you get from there to here.

o If you have expression(s) for the curve(s) that form the path, you have a couple of options:

– If the path has one piece (or you’ve replaced it so it has one piece), it’s in the plane, and you can express it easily as y = f(x) or x = f(y), try working with the $\int_C M\, dx + N\, dy$ form, and don’t bother with the t’s.

– If the path has one piece and can’t be expressed nicely as y = f(x) or x = f(y), or if it’s already parameterized and it’s easy enough to work with the parameterization, or it’s in space, work straight from the definition: get F · dr with the t’s and dt’s subbed in, and integrate.

Things of note:

• You have fewer options for paths in space: either the Fundamental Theorem if conservative, or straight from the definition if not.

• Green’s theorem only applies to closed paths in the plane (bounding simply connected regions), and only gets used for non-conservative fields (if it’s conservative and Green’s would apply, you already know the answer is zero).

• The Fundamental Theorem can be applied to any kind of path, or no path at all, but you must have a conservative field.
