Calculus in Crisis - Saint Joseph's University



Calculus in Crisis
In Which Mighty ε and δ Save the Day

Rachel W. Hall
Saint Joseph’s University
March 16, 2010

The Berlin Academy’s Problem of the Year, 1784

“It is well known that higher mathematics continually uses infinitely large and infinitely small quantities. Nevertheless, geometers, and even the ancient analysts, have carefully avoided everything which approaches the infinite; and some great modern analysts hold that the terms of the expression ‘infinite magnitude’ contradict one another.

The Academy hopes, therefore, that it can be explained how so many true theorems have been deduced from a contradictory supposition, and that a principle can be delineated which is sure, clear—in a word, truly mathematical—which can be substituted for ‘the infinite.’ This is to be done without making the researches which had been expedited by using the concept of ‘the infinite’ too difficult or tedious. We require that this matter be treated with all possible rigor, clarity, and simplicity.”*

*All translations are from Grabiner (1981).


And the Winner is…

• Simon L’Huilier won the prize, but it was clear from the judges’ comments that they didn’t think any of the essays really did the job.

• Most of the contestants failed to explain how “true theorems” had been proved on a non-rigorous foundation.

• Take the Berlin Challenge: what do the words “infinitely large” and “infinitely small” mean to you? Is “infinite magnitude” an oxymoron?

• If your answer involved ε, then you have Cauchy to thank.

The State of Calculus in the 18th Century

• In inventing calculus, Newton and Leibniz developed enormously powerful tools that demonstrably solved many physical and mathematical problems.

• The Bernoullis, d’Alembert, Euler, Lagrange, and Laplace built on their work.

• However, the “foundations” of calculus--rigorous definitions and proofs--were not considered important until the late 1700s.


Berkeley’s The Analyst (1734)

• George Berkeley, philosopher and bishop, was opposed to scientists’ criticisms of religion.

• He attacked calculus at its weakest link: the definition of the derivative.

Berkeley on Newton

“It must, indeed, be acknowledged, that he used Fluxions, like the Scaffold of a building, as things to be laid aside or got rid of, as soon as finite Lines were found proportional to them.

But then these finite Exponents are found by the help of Fluxions. Whatever therefore is got by such Exponents and Proportions is to be ascribed to Fluxions: which must therefore be previously understood.

And what are these Fluxions? The Velocities of evanescent Increments? And what are these same evanescent Increments? They are neither finite Quantities nor Quantities infinitely small, nor yet nothing. May we not call them the Ghosts of departed Quantities?”


Example: The Derivative of f(x) = x^2 (from Grabiner)

• Infinitesimals (Leibniz, l’Hospital, Johann Bernoulli)
• Fluxions (Newton, Maclaurin)
• Early Limit Concept (Newton, d’Alembert, Lacroix, Maclaurin)
• Compensation of Errors (Berkeley)
• Greek Style (Maclaurin)
• Zeros (Euler)
• Algebraic Method (Lagrange)
• Limits: Improved Version (Cauchy)
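As a hedged illustration of the first and last items on this list, here is how each approach might treat f(x) = x^2. This follows the standard textbook presentation, not the slide's own computations, and the symbols dx, h, ε, and δ are the usual conventions, assumed here.

```latex
% Infinitesimals (Leibniz style): expand, divide by dx, then drop the leftover dx.
\frac{(x + dx)^2 - x^2}{dx} = \frac{2x\,dx + (dx)^2}{dx} = 2x + dx
\quad \text{(and the remaining } dx \text{ is discarded, leaving } 2x\text{)}

% Limits (Cauchy style): h is never zero; the quotient is brought within any
% tolerance of 2x by restricting h.
\left| \frac{(x + h)^2 - x^2}{h} - 2x \right| = |h| < \varepsilon
\quad \text{whenever } 0 < |h| < \delta = \varepsilon
```

The algebra is the same in both cases. What differs is the justification for dropping the leftover term, which is the step Berkeley attacked.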

Devlin’s “Letter to a Calculus Student”

The subtlety that appears to have eluded Bishop Berkeley is that, although we initially think of h as denoting smaller and smaller numbers, the “lim” term in formula (*) asks us to take a leap (and it’s a massive one) to imagine not just calculating quotients infinitely many times, but regarding that entire process as a single entity. It's actually a breathtaking leap.

That's what formula (*) asks you to do: to hold infinity in the palm of your hand. To see an infinite (and hence unending) process as a single, completed thing.

Did any work of art, any other piece of human creativity, ever demand more of the observer? And to such enormous consequence for Humankind? If ever any painting, novel, poem, or statue can be thought of as having a beauty that goes beneath the surface, then the definition of the derivative may justly claim to have more beauty by far.


Augustin Cauchy on Limits

From Cours d’analyse (1821)

“When the successively attributed values of the same variable indefinitely approach a fixed value, so that finally they differ from it by as little as desired, the last is called the limit of all the others.”

The word “indefinitely” is important: previous mathematicians did not allow the variable to exceed or equal its limit.

But where are ε and δ?

Cauchy’s ε and δ

Here’s how Cauchy used ε and δ in a proof:

“Let δ and ε be two very small numbers; the first is chosen so that for all numerical [i.e., absolute] values of h less than δ and for any value of x included [in the interval of definition], the ratio (f(x+h) - f(x))/h will always be greater than f’(x) - ε and less than f’(x) + ε.”
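Cauchy's condition can be spot-checked numerically. The sketch below is only an illustration: the function names, the sampling scheme, and the choice of f(x) = x^2 are assumptions, not anything from Cauchy or the slides. It tests whether the difference quotient stays strictly between f'(x) - ε and f'(x) + ε for a sample of increments h with 0 < |h| < δ.

```python
def difference_quotient(f, x, h):
    """Return (f(x + h) - f(x)) / h for a nonzero increment h."""
    return (f(x + h) - f(x)) / h


def quotient_within_epsilon(f, f_prime, x, delta, epsilon, samples=1000):
    """Check Cauchy's condition on a finite sample of increments.

    Tests h = +/- delta * k / (samples + 1) for k = 1..samples, so every
    tested h satisfies 0 < |h| < delta. A finite sample cannot prove the
    condition; it can only fail to find a counterexample.
    """
    for k in range(1, samples + 1):
        for sign in (1.0, -1.0):
            h = sign * delta * k / (samples + 1)
            q = difference_quotient(f, x, h)
            if not (f_prime(x) - epsilon < q < f_prime(x) + epsilon):
                return False
    return True


def square(t):
    return t * t


def two_x(t):
    return 2 * t


# For f(x) = x^2 the quotient equals 2x + h, so delta = epsilon suffices.
epsilon = 1e-3
ok = quotient_within_epsilon(square, two_x, x=1.5, delta=epsilon, epsilon=epsilon)
# A delta ten times too large admits increments that break the bound.
too_wide = quotient_within_epsilon(square, two_x, x=1.5, delta=10 * epsilon, epsilon=epsilon)
```

A passing check is evidence, not proof: the proof for x^2 is the one-line identity that the quotient minus 2x equals h.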


The Algebra of Inequalities: The Missing Link

The math historian Judith Grabiner (1981) traced Cauchy’s choice of the letter ε to error analysis.

The idea is, given some infinite series (and assuming it converges), the nth error term is the difference between the nth partial sum and the sum of the series.

Mathematicians such as d’Alembert had made great progress in bounding the error of the nth partial sum.

These computations involved inequalities and were quite similar to some of the techniques Cauchy eventually used to prove limit laws.
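To make the "nth error term" concrete, here is a small sketch using a geometric series. The series is chosen for illustration and is an assumption, not an example attributed to d'Alembert or Grabiner. Exact fractions are used so the error terms are not blurred by rounding.

```python
from fractions import Fraction


def partial_sum(r, n):
    """Sum of the first n terms 1 + r + ... + r**(n - 1) of a geometric series."""
    return sum(r ** k for k in range(n))


def error_term(r, n):
    """Difference between the full sum 1 / (1 - r) and the nth partial sum."""
    return 1 / (1 - r) - partial_sum(r, n)


def terms_needed(r, epsilon):
    """Smallest n whose error term is below epsilon (assumes 0 < r < 1)."""
    n = 0
    while error_term(r, n) >= epsilon:
        n += 1
    return n


r = Fraction(1, 2)
errors = [error_term(r, n) for n in range(1, 6)]  # 1, 1/2, 1/4, 1/8, 1/16
n_for_thousandth = terms_needed(r, Fraction(1, 1000))
```

Asking "how many terms until the error is below ε?" has the same shape as the later ε-δ question "how small must h be?", which is the connection the slide draws.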

A Modern Calculus Crisis

• Calculus education was under attack in the late 1980s: too many “cookbook” techniques and arcane theorems obscured calculus’ fundamental principles, and very few courses incorporated technology.

• In 1988, the NSF launched a 7-million-dollar program to reform calculus education.

• The most famous “reform” textbook, the “Harvard” Calculus by Hughes-Hallett et al. (1994), inspired a strong backlash.

• Most textbooks used today incorporate elements of both traditional and reform calculus.


Tucker on the Mean Value Theorem

“Instead of an admission that Newton, Leibniz, the Bernoullis, and Euler all managed quite well without any rigorous foundations, instead of the story how a rigorous calculus took mathematicians two hundred years to get right, the Mean Value Theorem is waved, like a cross in front of a vampire, to hold the difficulties at bay. The origin of the Mean Value Theorem in the structure of the real numbers is not addressed; that is much too difficult for a standard course. Maybe it is traced back to the Extreme Value Theorem, but the trail ends there.

“The result is that a technical existence theorem is introduced without proof and used to prove intuitively obvious statements, such as ‘if your speedometer reads zero, you are not going anywhere’ (if f' = 0 on an interval, then f is constant on that interval).”

The MVT in Stewart’s Calculus


Rolle’s Theorem

The Extreme Value Theorem


The EVT in Hughes-Hallett, Calculus

www.wiley.com/college/hugheshallett (HERE)

The proof uses both
• the Completeness Axiom (a bounded set of real numbers has a least upper bound), and
• the Nested Interval Theorem (an infinite sequence of nested, closed intervals contains at least one common point).

Swann’s Response to Tucker

“Whether or not the militants’ ‘final product’ is ‘better,’ which is by no means established [3], one thing is clear: books such as the ‘Harvard Calculus’ are ‘enablers;’ by legitimizing the abandonment of the concepts of mathematical proof, related rates, convergence of series, and so forth from the calculus sequence, other texts and teachers will feel free to follow.

Mathematics is unique in its concern with rigorous foundations and proofs. Here its role as ‘Queen and servant of the Sciences’ is to offer the content of calculus as an anchor of certainty to aid the disciplines it serves. Should we not attempt to convey some sense of the remarkable way that the results of calculus can be proved to be true to those who will use it?”


My Own Take on Calculus

• For the most part, my goal in teaching first-semester calculus is to convey the excitement of 18th-century calculus. Students should use calculus to explore the world.

• In order to truly appreciate calculus, it is crucial to understand some of the challenges faced in establishing a rigorous foundation for the subject.

• Treating the calculus crisis as a “story” is a good way to make this struggle vivid.

• Perhaps the Extreme Value Theorem also makes a good story?

Discussion problem

Is the following equivalent to the standard definition of the derivative? Prove or give a counterexample.

This is the definition that calculators use. Why?
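The formula for this problem is not reproduced in the text. One well-known candidate, used by many graphing calculators for numerical derivatives, is the symmetric difference quotient (f(x+h) - f(x-h))/(2h). Treating it as the slide's formula is an assumption. Under that assumption, the sketch below compares it with the standard quotient on two test functions chosen for illustration.

```python
def forward_quotient(f, x, h):
    """Standard difference quotient (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h


def symmetric_quotient(f, x, h):
    """Symmetric difference quotient (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


steps = [10.0 ** -k for k in range(1, 6)]

# At x = 0 the function |x| has no derivative: the one-sided quotients disagree.
right = [forward_quotient(abs, 0.0, h) for h in steps]    # all +1
left = [forward_quotient(abs, 0.0, -h) for h in steps]    # all -1
# The symmetric quotient is 0 for every h, so its limit exists anyway.
symmetric = [symmetric_quotient(abs, 0.0, h) for h in steps]


def cube(t):
    return t ** 3


# For a smooth function the symmetric quotient is the more accurate estimate.
h = 1e-3
forward_error = abs(forward_quotient(cube, 1.0, h) - 3.0)      # about 3h
symmetric_error = abs(symmetric_quotient(cube, 1.0, h) - 3.0)  # about h**2
```

The second comparison suggests one reason a calculator might prefer the symmetric form. The first suggests why the two definitions need not agree.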


Sources

Grabiner, Judith V. The Origins of Cauchy’s Rigorous Calculus. MIT Press, 1981.

Tucker, Thomas. Rethinking rigor in calculus: the role of the Mean Value Theorem. Amer. Math. Monthly 104 (1997), 231-240.

Swann, Howard. Commentary on rethinking rigor in calculus. Amer. Math. Monthly 104 (1997), 241-245.

Brodie, Scott E. On “Rethinking rigor in calculus…,” or why we don’t do calculus on the rational numbers. Coll. Math. J. 30:2 (1999), 135-138.

Stewart, James. Calculus: Early Transcendentals. 4th ed., Brooks/Cole, 1999.

Hughes-Hallett, Deborah et al. Calculus. 4th ed., 2005.

http://www.maa.org/devlin/devlin_06_06.html

Devlin's Angle

June 2006

Letter to a calculus student

Dear Calculus Student,

Let me begin with a quotation from the great philosopher Bertrand Russell. He wrote, in Mysticism and Logic (1918): "Mathematics, rightly viewed, possesses not only truth, but supreme beauty - a beauty cold and austere, like that of sculpture, without appeal to any part of our weaker nature, without the gorgeous trappings of painting or music, yet sublimely pure, and capable of a stern perfection such as only the greatest art can show."

Beauty is one of the last things you are likely to associate with the calculus. Power, yes. Utility, that too. Hopefully also ingenuity on the part of Newton and Leibniz who invented the stuff. But not beauty. Most likely, you see the subject as a collection of techniques for solving problems to do with continuous change or the computation of areas and volumes. Those techniques are so different from anything you have previously encountered in mathematics, that it will take you every bit of effort and concentration simply to learn and follow the rules. Understanding those rules and knowing why they hold can come only later, if at all. Appreciation of the inner beauty of the subject comes later still. Again, if at all.

I fear, then, that at this stage in your career there is little chance that you will be able to truly see the beauty in the subject. Beauty - true, deep beauty, not superficial gloss - comes only with experience and familiarity. To see and appreciate true beauty in music we have to listen to a lot of music - even better we learn to play an instrument. To see the deep underlying beauty in art we must first look at a great many paintings, and ideally try our own hands at putting paint onto canvas. It is only by consuming a great deal of wine - over many years I should stress - that we acquire the taste to discern a great wine. And it is only after we have watched many hours of football or baseball, or any other sport, that we can truly appreciate the great artistry of its master practitioners. Reading descriptions about the beauty in the activities or creations of experts can never do more than hint at what the writer is trying to convey.

My hope then is not that you will read my words and say, "Yes, I get it. Boy this guy Devlin is right. Calculus is beautiful. Awesome!" What I do hope is that I can at least convince you that I (and my fellow mathematicians) can see the great beauty in our subject (including calculus). And maybe one day, many years from now, if you continue to study and use mathematics, you will remember reading these words, and at that stage you will nod your head knowingly and think, "Yes, now I can see what he was getting at. Now I too can see the beauty."

The first step toward seeing the beauty in calculus - or in any other part of mathematics - is to go beyond the techniques and the symbolic manipulations and see the subject for what it is. Like a Shakespearean sonnet that captures the very essence of love, or a painting that brings out the beauty of the human form that is far more than just skin deep, the true beauty of calculus can only be fully appreciated by digging deep enough.

The beauty of calculus is primarily one of ideas. And there is no more beautiful idea in calculus than the formula for the definition of the derivative:

(*) f'(x) = lim_{h -> 0}[f(x+h) - f(x)]/h

For this to make sense, it is important that h is not equal to zero. For if you allow h to be zero, then the quotient in the above formula becomes

[f(x+0) - f(x)]/0 = [f(x) - f(x)]/0 = 0/0

and 0/0 is undefined. Yet, if you take any nonzero value of h, no matter how small, the quotient

[f(x+h) - f(x)]/h

will not (in general) be the derivative.

So what exactly is h? The answer is, it's not a number, nor is it a symbol used to denote some unknown number. It's a variable.

What's that you say? "Isn't a variable just a symbol used to denote an unknown number?" The answer is "No." Sir Isaac Newton and Gottfried Leibniz, the two inventors of calculus, knew the difference, but as great a mind as the famous 18th Century philosopher and theologian (Bishop) George Berkeley seemed not to. In his tract The analyst: or a discourse addressed to an infidel mathematician, Berkeley argued that, although calculus led to true results, its foundations were insecure. He wrote of derivatives (which Newton called fluxions):

"And what are these fluxions? The velocities of evanescent increments. And what are these same evanescent increments? They are neither finite quantities, nor quantities infinitely small, nor yet nothing. May we not call them ghosts of departed quantities?"

The "evanescent increments" he was referring to are those h's in formula (*). Berkeley's problem - and he was by no means alone - was that he failed to see the subtlety in the formula. Like any great work of art, this formula simultaneously provides you with different ways of looking at the same thing. If you look at it just one way, you will miss its true meaning. It also asks you, nay like all great works of art it challenges you, to use your imagination - to go beyond the experience of your senses and step into an idealized world created by the human mind.

The expression to the right of the equal sign in (*) represents the result of a process. Not an actual process that you can carry out step-by-step, but an idealized, abstract process, one that exists only in the mind. It's the process of computing the ratio

[f(x+h) - f(x)]/h

for increasingly smaller nonzero values of h and then identifying the unique number that those quotient values approach, in the sense that the difference between those quotients and that number can be made as small as you please by taking values of h sufficiently small. (Part of the mathematical theory of the derivative is to decide when there is such a number, and to show that if it exists it is unique.) The reason you can't actually carry out this procedure is that it is infinite: it asks you to imagine taking smaller and smaller values of h ad infinitum.
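The first few steps of that idealized process can of course be carried out. The sketch below does so for an illustrative function, f(x) = x^3 at x = 2. The function, the point, and the names in the code are all assumptions made for the example. Exact rational arithmetic keeps rounding from hiding how the gap shrinks.

```python
from fractions import Fraction


def difference_quotient(f, x, h):
    """Return [f(x + h) - f(x)] / h for a nonzero h."""
    return (f(x + h) - f(x)) / h


def cube(t):
    return t ** 3


x = Fraction(2)
candidate = Fraction(12)  # the number the quotients appear to approach

# Exact rational arithmetic, so no rounding hides what happens as h shrinks.
gaps = []
for k in range(1, 7):
    h = Fraction(1, 10 ** k)
    gaps.append(difference_quotient(cube, x, h) - candidate)

# Each gap equals 6h + h**2: never zero, but as small as you please.
```

No finite run of this loop reaches 12. The limit is the statement that the gaps can be pushed below any tolerance, and that statement is about the whole process at once.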

The subtlety that appears to have eluded Bishop Berkeley is that, although we initially think of h as denoting smaller and smaller numbers, the "lim" term in formula (*) asks us to take a leap (and it's a massive one) to imagine not just calculating quotients infinitely many times, but regarding that entire process as a single entity. It's actually a breathtaking leap.

In Auguries of Innocence, the poet William Blake wrote:

To see a World in a Grain of Sand
And a Heaven in a Wild Flower
Hold Infinity in the palm of your hand
And Eternity in an hour

That's what formula (*) asks you to do: to hold infinity in the palm of your hand. To see an infinite (and hence unending) process as a single, completed thing. Did any work of art, any other piece of human creativity, ever demand more of the observer? And to such enormous consequence for Humankind? If ever any painting, novel, poem, or statue can be thought of as having a beauty that goes beneath the surface, then the definition of the derivative may justly claim to have more beauty by far.

Mathematician Keith Devlin is the Executive Director of the Center for the Study of Language and Information at Stanford University and The Math Guy on NPR's Weekend Edition. Devlin's newest book, THE MATH INSTINCT: Why You're a Mathematical Genius (along with Lobsters, Birds, Cats, and Dogs) was published recently by Thunder's Mouth Press.

Copyright ©2009 The Mathematical Association of America

Rethinking Rigor in Calculus: The Role of the Mean Value Theorem
Author(s): Thomas W. Tucker
Source: The American Mathematical Monthly, Vol. 104, No. 3 (Mar., 1997), pp. 231-240
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/2974788
Accessed: 05/03/2010 16:35


Rethinking Rigor in Calculus: The Role of the Mean Value Theorem

Thomas W. Tucker

1. INTRODUCTION. Mathematicians have been struggling with the theoretical foundations of the calculus ever since its inception. Bishop Berkeley's attack on Newton's "ghosts of departed quantities," Euler's claim that 1 - 1 + 1 - 1 + ... = 1/2, Cauchy's ε-δ definition of limit, all are part of the fascinating history of this struggle (see [7]). Calculus instructors and textbooks face the same struggle, but the tack taken, although formal, is often not sensible or honest. Instead of an admission that Newton, Leibnitz, the Bernoullis, and Euler all managed quite well without any rigorous foundations, instead of the story how a rigorous calculus took mathematicians two hundred years to get right, the Mean Value Theorem is waved, like a cross in front of a vampire, to hold the difficulties at bay. The origin of the Mean Value Theorem in the structure of the real numbers is not addressed; that is much too difficult for a standard course. Maybe it is traced back to the Extreme Value Theorem, but the trail ends there. The result is that a technical existence theorem is introduced without proof and used to prove intuitively obvious statements, such as "if your speedometer reads zero, you are not going anywhere" (if f' = 0 on an interval, then f is constant on that interval). That's the sort of thing that gives mathematics a bad name: assuming the nonobvious to prove the obvious. And by the way, there is nothing obvious about the Mean Value Theorem without the hypothesis of continuity of the derivative. Cauchy himself was never able to prove it in that form.

I have serious reservations about the need for formal theorems and proofs in a standard calculus course. On the other hand, for those mathematicians who do feel that need, I have a suggestion for an alternative theoretical cornerstone to replace the Mean Value Theorem (MVT); I hope textbook authors adopt it. It is much easier to state, much more intuitively obvious, and much more powerful than most mathematicians realize. It is simply this:

The Increasing Function Theorem (IFT). If f' ≥ 0 on an interval, then f is increasing on that interval.

Here, increasing means that if c < d, then f(c) ≤ f(d). This would usually be called nondecreasing, but that term is awkward; for example, nondecreasing and not decreasing mean different things. It seems to make more sense to use the term strictly increasing for the condition that if c < d, then f(c) < f(d). A function that is increasing, but not strictly increasing, we call weakly increasing.

Most of the rest of this paper is concerned with the consequences of the IFT, treating it as an axiom. I will give, however, a short independent proof of the IFT, for the sake of completeness and for readers who have probably never thought of proving the IFT directly without the MVT. Of course, the IFT follows easily from the MVT. In fact, the contrapositive of the IFT is a weak form of the MVT: if a < b and f(b) < f(a), there is a number c, a < c < b, such that f'(c) < 0.


It is impossible to be a pioneer in territory as well-trodden as the Mean Value Theorem. Others have championed calculus without the Mean Value Theorem (see [1], [4], [6]). The first two sections of this paper follow Lax, Burstein, and Lax [9] quite closely, although unintentionally. In fact, after searching through dozens of calculus books for the Taylor remainder proof given in this paper and finally finding it in Lax-Burstein-Lax (LBL), I felt a little uncomfortable. Maybe this paper shouldn't be published and all that is needed is an announcement "Go read LBL." Then I read Grabiner [7] and found that the Taylor remainder proof given here and in LBL is actually Lagrange's original proof. I was surprised that such a simple, direct proof could have been covered over by years of second-growth jungle.

Moreover, the idea of Lagrange's proof keeps being rediscovered for special cases like sin x or cos x. For example, the Monthly published such an article recently [2], which then generated a subsequent Editor's Note [2] citing calculus textbooks and Monthly articles where the idea of [2] had already been presented. None of these references noted that the same idea works for all functions; LBL is still the only book that does that, to my knowledge. And hardly anyone seems to know the idea is really Lagrange's! Under these circumstances, it appears that some dissemination is badly needed to clear up a memory lapse of generations of mathematicians. It also appears that previous calls ([4], [6]) to downplay the Mean Value Theorem have fallen on deaf ears. Perhaps the recent debates about calculus instruction have unplugged some ears and it is time to try the call again.

2. A PROOF OF THE INCREASING FUNCTION THEOREM. There is a reasonably elementary proof of the IFT that depends only on the nested interval property of the reals: if a_n ≤ a_{n+1} < b_{n+1} ≤ b_n for all n ≥ 1 and lim_{n→∞}(b_n - a_n) = 0, then there is a number c such that lim_{n→∞} a_n = lim_{n→∞} b_n = c. The proof of the IFT given here does not require the continuity of f' and is so self-contained that it probably could be given in a standard calculus course. Although I generated this proof in response to some remarks of Peter Lax, I should have known the proof is too natural to be original. In revising this paper, I discovered Richmond's article [10], which contains essentially the same proof, and as I already knew, Ampere and Cauchy used the key observation in their own proofs.

Proof of the IFT. The proof depends on the following simple

Observation. Given a function f, define slope(a,b) to be the usual quotient (f(b) - f(a))/(b - a). If slope(a, b) = m and c is between a and b, then one of slope(a, c) and slope(c, b) is greater than or equal to m and one is less than or equal to m. For a proof, draw the obvious picture.
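The "obvious picture" can also be written as one line of algebra. The identity below is a standard fact, supplied as a sketch of the omitted step and not as part of Tucker's text. The weight t is a symbol introduced for illustration.

```latex
% For c strictly between a and b, set t = (c - a)/(b - a), so that 0 < t < 1.
\operatorname{slope}(a,b)
  = \frac{\bigl(f(c) - f(a)\bigr) + \bigl(f(b) - f(c)\bigr)}{b - a}
  = t \,\operatorname{slope}(a,c) + (1 - t)\,\operatorname{slope}(c,b)
```

So m = slope(a, b) is a weighted average of the two sub-slopes and must lie between them: one of them is at least m and the other is at most m.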

Suppose now that f'(x) ≥ 0 on [a, b] and that f is not increasing; that is, for some a_1, b_1 with a ≤ a_1 < b_1 ≤ b, we have f(a_1) > f(b_1). Let m = slope(a_1, b_1). Note that m < 0. By repeated bisection and our observation, we can find a nested sequence of intervals [a_n, b_n] with slope(a_n, b_n) ≤ m and lim_{n→∞}(b_n - a_n) = 0. Let c = lim_{n→∞} a_n = lim_{n→∞} b_n (the possibility c = a or c = b causes no difficulty). Since f'(c) ≥ 0 and m < 0, for all x sufficiently near c, slope(x, c) > m. Thus for all large enough n, slope(a_n, c) > m and slope(c, b_n) > m, which contradicts our observation and the fact that, by construction, slope(a_n, b_n) ≤ m. If a_n = c or b_n = c the contradiction is immediate. ∎
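The bisection step in this proof is concrete enough to run. The following is a minimal sketch, not Tucker's own procedure: the function names, the fixed step count, and the choice of cos on [0, 6] as a test case are assumptions. At each step it keeps the half-interval with the smaller slope, which by the Observation is at most the slope of the parent interval.

```python
import math


def slope(f, a, b):
    """Usual quotient (f(b) - f(a)) / (b - a)."""
    return (f(b) - f(a)) / (b - a)


def bisect_toward_negative_slope(f, a, b, steps=20):
    """Follow the proof: repeatedly keep the half-interval with the smaller slope.

    Assumes a < b and f(a) > f(b), so slope(a, b) = m < 0. The smaller of the
    two half-interval slopes is <= the parent slope, so every interval kept
    has slope <= m. Returns the final interval [a_n, b_n].
    """
    for _ in range(steps):
        c = (a + b) / 2
        if slope(f, a, c) <= slope(f, c, b):
            b = c
        else:
            a = c
    return a, b


# cos(0) > cos(6), so cos is not increasing on [0, 6] and m is negative.
m = slope(math.cos, 0.0, 6.0)
a_n, b_n = bisect_toward_negative_slope(math.cos, 0.0, 6.0)
c = (a_n + b_n) / 2
derivative_at_c = -math.sin(c)  # derivative of cos, negative at the point found
```

The step count is kept modest on purpose: on very short intervals the subtraction f(b) - f(a) loses precision in floating point, and the slope comparisons become unreliable.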

As we have observed, the contrapositive of the IFT is an existence statement that if f is not increasing on the interval [a, b], there exists a number c between a and b where f'(c) < 0. The preceding proof is constructive, in that once one finds a_1 < b_1 with f(a_1) > f(b_1), the bisection procedure effectively computes a number c such that f'(c) < 0.

3. IMMEDIATE CONSEQUENCES OF THE IFT. We first consider some consequences and variations of the IFT.

Theorem 1. The following statements are consequences of the IFT. Assume f is differentiable on [a, b] and a < b.

a) If f'(x) ≤ 0 on the interval [a, b], then f is decreasing on the interval [a, b].
b) If f'(x) = 0 on the interval [a, b], then f is constant on the interval [a, b].
c) If f'(x) > 0 on the interval [a, b], then f is strictly increasing on the interval [a, b].
d) If f'(x) ≤ g'(x) on the interval [a, b], then f(x) - f(a) ≤ g(x) - g(a) for all x in [a, b].
e) If m ≤ f'(x) ≤ M on the interval [a, b], then m(x - a) ≤ f(x) - f(a) ≤ M(x - a) for all x in [a, b].

Proof:

(a) Multiplication by -1 reverses inequalities and interchanges "increasing" and "decreasing".

(b) By the IFT and (a), it follows that f is both (weakly) increasing and (weakly) decreasing on [a, b]. That means f is constant.

(c) By the IFT, f is increasing. Suppose that a ≤ c < d ≤ b and f(c) = f(d). Since f is increasing on [c, d] we must have f(x) = f(c) = f(d) on [c, d]. That is, f is constant on [c, d]. Therefore f'(x) = 0 on [c, d], contradicting f'(x) > 0 on [a, b].

(d) Apply the IFT to h(x) = g(x) - f(x) to conclude g(a) - f(a) ≤ g(x) - f(x).

(e) Apply (d) to f(x) and Mx to get the right inequality and to mx and f(x) to get the left inequality.

Theorem 1c could be called the Strictly Increasing Function Theorem (SIFT). Lax-Burstein-Lax [9] calls it the Criterion for Monotonicity. There the IFT is derived directly from the SIFT by looking at f(x) + mx = g(x), for all positive slopes m. If f'(x) ≥ 0, then g'(x) > 0, so by the SIFT g is strictly increasing. Thus if x > a, then f(a) + ma < f(x) + mx. Since this inequality holds for all m > 0, it follows that f(a) ≤ f(x), that is, f is increasing. I feel, however, that this proof is a little tricky. Although the idea of perturbing a function is important throughout analysis, it comes out of the blue for a first-year calculus student. I prefer the IFT over the SIFT as a theoretical cornerstone. First, our proof that the IFT implies the SIFT is easier and more natural than a proof that the SIFT implies the IFT. More importantly, Theorem 1b, which could be called the Constant Function Theorem, follows immediately from the IFT; the only way the SIFT can get this fundamental result is via the IFT. By the way, I view the Constant Function Theorem as even more basic than the IFT. It would be nice to use it as our theoretical cornerstone, but I know of no way to use it to get the IFT.

Theorem 1d is called the Racetrack Principle by Jerry Uhl: if one car goes faster than another, it travels farther during any time interval. It is used as a theoretical cornerstone in the text [5].


Theorem 1e is perhaps the most important, especially from a historical viewpoint. If the inequalities are rewritten:

m ≤ (f(x) - f(a))/(x - a) ≤ M

we have the Mean Value Inequality. The Mean Value Theorem follows immediately if we know that f' is continuous and that the Intermediate Value Theorem holds. That is exactly what Cauchy did [7]: he proved the Mean Value Inequality and assumed the continuity of f' and the Intermediate Value Theorem. His assumption of continuity should not be surprising since his proof of the Mean Value Inequality also assumes that the difference quotient (f(x + h) - f(x))/h approaches f'(x) uniformly as h approaches 0. Peter Lax has argued that, for the theoretical foundations of an introductory calculus course, one should always avoid pathology and assume uniform continuity and uniform convergence, just as Cauchy did. It is interesting to note that before Cauchy, Ampere [7] saw the importance of the Mean Value Inequality and even used it as the defining property of the derivative. One could argue in a similar vein that the Mean Value Theorem should be the defining property of the derivative; Andrew Gleason has told me that a calculus textbook by Donald Richmond around 1960 did exactly that, but I have been unable to find the book.
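As a quick sanity check of the Mean Value Inequality, the sketch below tests it for sin on [0, 1]. The example function, the helper name, the sample count, and the small floating-point tolerance are all assumptions made for illustration. The bounds m and M on the derivative are supplied by the caller.

```python
import math


def mean_value_inequality_holds(f, a, b, m, M, samples=100, tol=1e-12):
    """Check m(x - a) <= f(x) - f(a) <= M(x - a) at evenly spaced x in [a, b].

    m and M are assumed to be bounds on f' over [a, b], supplied by the caller.
    """
    for k in range(samples + 1):
        x = a + (b - a) * k / samples
        change = f(x) - f(a)
        if not (m * (x - a) - tol <= change <= M * (x - a) + tol):
            return False
    return True


# On [0, 1] the derivative of sin is cos, which lies between cos(1) and 1.
holds = mean_value_inequality_holds(math.sin, 0.0, 1.0, m=math.cos(1.0), M=1.0)
# A value of M that is not a true upper bound on f' is caught at some sample point.
fails = mean_value_inequality_holds(math.sin, 0.0, 1.0, m=math.cos(1.0), M=0.5)
```

Unlike the Mean Value Theorem, the inequality makes no claim that some point c exists, which is why it can be checked by plain evaluation.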

Finally, I should comment on the hypothesis of differentiability at the endpoints, both in the IFT and in Theorem 1. All one need assume is continuity at the endpoints, just as in the MVT. Simply observe in the proof of the IFT that the initial points $a_1$ and $b_1$ can be chosen so that $a < a_1 < b_1 < b$, since if $f(a) > f(b)$ then by continuity $f(a_1) > f(b)$ for $a_1 > a$ near enough $a$, and $f(a_1) > f(b_1)$ for $b_1 < b$ near enough $b$.

4. ERROR BOUNDS AND ERROR BEHAVIOR FOR TAYLOR POLYNOMIALS. If Theorem 1e is rewritten

$$f(a) + m(x - a) \le f(x) \le f(a) + M(x - a),$$

we see a glimmering of an error bound for Taylor polynomials. The proof we are about to give is almost too transparent and simple to believe: just antidifferentiate repeatedly the inequality $f^{(n+1)}(x) \le M$. Not only does the proof give the Lagrange form of the error bound, it also creates the Taylor polynomial itself. Moreover, as we have observed, it is Lagrange's original proof and can be found in LBL [9]. It is also the proof I wrote for the textbook of the Calculus Consortium Based at Harvard [8]. On the other hand, I have so far been unable to find it anywhere else. All the other proofs I know involve applications of Rolle's Theorem to rather elaborate auxiliary functions or repeated integration by parts or clever tricks with varying parameters. None are natural and none are likely to be discovered or appreciated by an average calculus student.

Theorem 2. (Taylor Error Bound). Suppose that $m \le f^{(n+1)}(x) \le M$ on the interval $[a, b]$, where $f^{(i)}$ denotes the $i$th derivative of $f$. Then on $[a, b]$

$$m\,\frac{(x-a)^{n+1}}{(n+1)!} \le f(x) - T_n(x) \le M\,\frac{(x-a)^{n+1}}{(n+1)!},$$

where $T_n(x)$ is the degree $n$ Taylor polynomial for $f$ centered at $x = a$.

Proof: To get the upper bound, we apply Theorem 1d (the Racetrack Principle) to $f^{(n)}(x)$ and $Mx$ (since $f^{(n+1)} \le M$), which gives

$$f^{(n)}(x) - f^{(n)}(a) \le M(x - a).$$

Applying the Racetrack Principle again, we get

$$f^{(n-1)}(x) - f^{(n-1)}(a) - f^{(n)}(a)(x - a) \le M\,\frac{(x-a)^2}{2},$$

and again

$$f^{(n-2)}(x) - f^{(n-2)}(a) - f^{(n-1)}(a)(x - a) - f^{(n)}(a)\,\frac{(x-a)^2}{2} \le M\,\frac{(x-a)^3}{3!}.$$

Applying the Racetrack Principle a total of $n + 1$ times gives the upper bound.

The lower bound is obtained the same way. ∎
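As a numerical sanity check of Theorem 2, here is a minimal Python sketch. The choice of $f(x) = e^x$ on $[0, 1]$ with $n = 3$ is an assumption made only for illustration, and the helper names `taylor_poly_exp` and `taylor_error_bounds` are hypothetical. Because every derivative of $e^x$ is $e^x$, we may take $m = e^0$ and $M = e^1$.

```python
import math

def taylor_poly_exp(x, n, a=0.0):
    """Degree-n Taylor polynomial of exp centered at a, evaluated at x."""
    return sum(math.exp(a) * (x - a) ** k / math.factorial(k) for k in range(n + 1))

def taylor_error_bounds(x, n, a, m, M):
    """Lower and upper bounds from Theorem 2 (valid for x >= a)."""
    scale = (x - a) ** (n + 1) / math.factorial(n + 1)
    return m * scale, M * scale

# Illustrative assumption: f = exp on [0, 1], so m <= f^(n+1) <= M with m = e^0, M = e^1.
a, b, n = 0.0, 1.0, 3
m, M = math.exp(a), math.exp(b)

for x in (0.25, 0.5, 1.0):
    error = math.exp(x) - taylor_poly_exp(x, n, a)
    lower, upper = taylor_error_bounds(x, n, a, m, M)
    print(f"x = {x}: {lower:.6f} <= {error:.6f} <= {upper:.6f}")
```

The printed error should sit between the two bounds at each sample point; other functions can be tried by supplying their own Taylor polynomial and derivative bounds.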

Theorem 2 gives error bounds only for $x \ge a$. To get similar bounds for $x < a$, we observe that if $f$ is increasing and $x < a$, then $f(x) \le f(a)$, rather than $f(a) \le f(x)$. Thus for $x < a$, each application of Theorem 1d reverses the inequalities, but since Theorem 2 sandwiches the error for $x \ge a$, reversing inequalities will simply sandwich the error again for $x < a$ (although which bound is the upper one depends on whether $n$ is odd or even). The usual two-sided error bound involving absolute values then follows immediately.

It is possible for students to discover Theorem 2 for themselves. Consider the following problem. A particle is traveling along the x-axis with position $x = f(t)$ and suppose the initial position, velocity, and acceleration are all 0. If $f'''(t) \le 5$ for $t \ge 0$, find an upper bound on the position at time $t = 2$. Since students are well-trained to antidifferentiate acceleration to get velocity and velocity to get position, it is not unnatural to see them argue as follows:

$f'''(t) \le 5$

$a = f''(t) \le 5t + c_1$, and here $c_1 = 0$ since $f''(0) = 0$

$v = f'(t) \le 5\,\frac{t^2}{2} + c_2$, and here $c_2 = 0$ since $f'(0) = 0$

$s = f(t) \le 5\,\frac{t^3}{6} + c_3$, and here $c_3 = 0$ since $f(0) = 0$.

Thus, we get $f(2) \le 5 \cdot 2^3/6 = 20/3$. This is a legitimate argument as long as one can justify antidifferentiating inequalities in the same way as equalities. That is exactly the point of the Racetrack Principle!
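The bound $f(2) \le 20/3$ can also be checked by simulation. The sketch below is only illustrative: the three jerk functions are assumptions chosen so that $f'''(t) \le 5$ on $[0, 2]$, and `position_with_jerk` is a hypothetical helper that integrates with small Euler steps from rest.

```python
import math

def position_with_jerk(jerk, t_end, steps=100_000):
    """Integrate f''' = jerk(t) with f(0) = f'(0) = f''(0) = 0 using Euler steps."""
    dt = t_end / steps
    acc = vel = pos = 0.0
    for i in range(steps):
        t = i * dt
        pos += vel * dt          # position from velocity
        vel += acc * dt          # velocity from acceleration
        acc += jerk(t) * dt      # acceleration from the third derivative
    return pos

bound = 5 * 2**3 / 6             # the students' bound, 20/3

# Illustrative third derivatives, each satisfying f'''(t) <= 5 on [0, 2].
examples = {
    "constant 5": lambda t: 5.0,
    "5 cos t": lambda t: 5.0 * math.cos(t),
    "5 - t": lambda t: 5.0 - t,
}
for name, jerk in examples.items():
    print(f"{name:10s} f(2) ~ {position_with_jerk(jerk, 2.0):.4f}  (bound {bound:.4f})")
```

The constant case essentially attains the bound, which is what one would expect since every inequality in the argument becomes an equality there.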

Acceleration and velocity are not a bad way of introducing Taylor series. The usual formula students memorize from physics,

$$s = s_0 + v_0 t + \tfrac{1}{2} a t^2,$$

is precisely the degree 2 Taylor polynomial for $s(t)$ when the constant acceleration $a$ is interpreted as the acceleration at time 0. This fact seems worth exploiting, but I don't know any textbook that makes the connection.

Taylor's theorem is usually presented as a method of bounding the error in approximating a function by its degree $n$ Taylor polynomial. This viewpoint is particularly appropriate in studying the error for fixed $x$ as $n \to \infty$, as in the proof of the convergence for all values of $x$ for the Taylor series for $e^x$ or $\sin x$. Nevertheless, I believe that this viewpoint is overemphasized and that the true power of Taylor series is in explaining error (or convergence) behavior for fixed $n$ as $x \to a$. Why is Simpson's Rule so much better than the Trapezoid Rule? What makes the approximation $\sin x \approx x$ so good? For numerical behavior, the important thing to know is the order of convergence for fixed $n$ under normal circumstances and what situations might affect that order of convergence. The real point of Taylor's theorem is that the error is order $n + 1$ in $(x - a)$ with a constant depending on the $(n + 1)$st derivative.

To be more precise, we say $E(h)$ is asymptotic to $Ch^n$, denoted $E(h) \sim Ch^n$, if $\lim_{h \to 0} E(h)/h^n = C$. Also, we say $E(h)$ is order $n$ with bound $M$ if $\limsup |E(h)/h^n| \le M$. Then Taylor's theorem can be viewed this way:

Corollary. Let $E(h)$ be the error $f(x) - T_n(x)$ where $T_n(x)$ is the $n$th degree Taylor polynomial for $f$ at $x = a$ and where $h = x - a$. If $f^{(n+1)}$ is continuous at $x = a$, then $E(h) \sim f^{(n+1)}(a)\,h^{n+1}/(n+1)!$. If $|f^{(n+1)}(x)| \le M$ in a neighborhood of $x = a$, then $E(h)$ is order $n + 1$ with bound $M/(n+1)!$.
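The Corollary can be watched in action on the approximation $\sin x \approx x$. The following Python sketch assumes $f = \sin$, $a = 0$, and $n = 2$ (so $T_2(x) = x$); the function name `taylor_sin` is illustrative. Since $f'''(0) = -1$, the Corollary predicts $E(h)/h^3 \to -1/3!$.

```python
import math

def taylor_sin(x, n):
    """Degree-n Taylor polynomial of sin centered at 0 (only odd powers appear)."""
    total = 0.0
    for k in range(1, n + 1, 2):
        total += (-1) ** ((k - 1) // 2) * x**k / math.factorial(k)
    return total

n = 2                                      # T_2(x) = x, the approximation sin x ~ x
predicted = -1 / math.factorial(n + 1)     # f'''(0)/(n+1)! with f''' = -cos

for h in (0.1, 0.01, 0.001):
    ratio = (math.sin(h) - taylor_sin(h, n)) / h ** (n + 1)
    print(f"h = {h}: E(h)/h^3 = {ratio:.8f}, predicted limit = {predicted:.8f}")
```

As $h$ shrinks, the ratio should settle toward the predicted constant, which is the sense in which the error is order 3 here.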

5. ERROR BEHAVIOR FOR NUMERICAL INTEGRATION. Another application of the Mean Value Theorem is to explain the error behavior for various common numerical integration rules: Left Rule, Right Rule, Trapezoid Rule, Midpoint Rule, Simpson's Rule. This behavior is best described using Taylor series in $\Delta x$ for the error. Numerical analysis texts sometimes do this, but calculus texts don't. Since this approach is not so well-known, I'll give a version.

The idea is to concentrate on one panel of the subdivided area. Without loss of generality, we can assume the panel is centered at the origin. Thus we wish to compute

$$I(h) = \int_{-h}^{h} f(x)\,dx, \quad \text{where } h = \Delta x/2.$$

The estimate for this single panel by the left-rectangle rule is

$$I(h) \approx L(h) = 2h\,f(-h).$$

The other estimates are given by

Left: L(h) = 2hf(-h)

Right: R(h)= 2hf(h)

Midpoint: M(h) = 2hf(0)

Trapezoid: T(h) = (L(h) + R(h))/2

Simpson: S(h) = (2M(h) + T(h))/3

The formula relating Simpson's Rule to the midpoint and trapezoidal rules is not as well known as it should be. Students can be led to guess the weighted mean as a better estimate, if they spend a little time looking at the error behavior of the midpoint and trapezoidal rules.
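The single-panel rules above translate directly into code. In this Python sketch the integrand $e^x$ and the half-width $h = 0.1$ are assumptions chosen for illustration, and `panel_rules` is a hypothetical helper name. It also compares the weighted mean $(2M + T)/3$ with the usual three-point form of Simpson's Rule on the panel $[-h, h]$.

```python
import math

def panel_rules(f, h):
    """Single-panel estimates of the integral of f over [-h, h]."""
    left = 2 * h * f(-h)
    right = 2 * h * f(h)
    mid = 2 * h * f(0.0)
    trap = (left + right) / 2
    simpson = (2 * mid + trap) / 3       # weighted mean of midpoint and trapezoid
    return {"left": left, "right": right, "midpoint": mid,
            "trapezoid": trap, "simpson": simpson}

h = 0.1
exact = math.exp(h) - math.exp(-h)       # integral of exp over [-h, h]
rules = panel_rules(math.exp, h)
for name, value in rules.items():
    print(f"{name:9s} error = {value - exact:+.3e}")

# The usual three-point form of Simpson's Rule on the same panel.
classic = (h / 3) * (math.exp(-h) + 4 * math.exp(0.0) + math.exp(h))
print("difference from classic Simpson:", rules["simpson"] - classic)
```

Looking at the printed errors, the trapezoid error is roughly twice the midpoint error with the opposite sign, which is what suggests weighting the midpoint estimate twice as heavily.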

We want to compute the Taylor series centered at $a = 0$ for all these functions. For the rules, this is simply a matter of replacing $f(h)$ or $f(-h)$ by the Taylor series for $f$ centered at $a = 0$. For $I(h)$, we observe that by the Fundamental Theorem of Calculus, $I'(h) = f(h) + f(-h)$. Thus $I''(h) = f'(h) - f'(-h)$, $I'''(h) = f''(h) + f''(-h)$, etc.

The Taylor series for $I(h)$ is therefore

$$I(h) = 2f(0)h + 2f''(0)\frac{h^3}{3!} + 2f''''(0)\frac{h^5}{5!} + \cdots$$

The series for the rules are

$$L(h) = 2h\left[f(0) + f'(0)(-h) + f''(0)\frac{(-h)^2}{2!} + \cdots\right]$$

$$R(h) = 2h\left[f(0) + f'(0)h + f''(0)\frac{h^2}{2!} + \cdots\right]$$

$$M(h) = 2h\left[f(0)\right]$$

$$T(h) = 2h\left[f(0) + f''(0)\frac{h^2}{2!} + f''''(0)\frac{h^4}{4!} + \cdots\right]$$

$$S(h) = 2h\left[f(0) + f''(0)\frac{h^2}{6} + f''''(0)\frac{h^4}{3 \cdot 4!} + \cdots\right]$$

The error behavior for each rule is obtained by subtracting the Taylor series for I(h) from the Taylor series for the rule and looking for the first term that doesn't cancel. The errors behave asymptotically as follows:

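Left Error $\sim -2f'(0)h^2$

Right Error $\sim 2f'(0)h^2$

Midpoint Error $\sim -2f''(0)\dfrac{h^3}{3!}$

Trapezoid Error $\sim 2f''(0)\left(\dfrac{1}{2} - \dfrac{1}{6}\right)h^3 = 2f''(0)\dfrac{h^3}{3}$

Simpson Error $\sim 2f''''(0)\left(\dfrac{1}{3 \cdot 4!} - \dfrac{1}{5!}\right)h^5 = 2f''''(0)\dfrac{h^5}{180}$.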

The error behavior of these rules for the entire interval is obtained by multiplying by the number $n$ of subdivisions and replacing $h$ by $\Delta x/2$ where $\Delta x = (b - a)/n$, except for Simpson's rule where $h = \Delta x$. We have to replace $f^{(k)}(0)$ by a bound $M_k$ on $|f^{(k)}|$ for the entire interval. Using $2nh = (b - a)$, we find the absolute value of the errors have the following behavior in terms of $\Delta x$:

Left: order 1 with bound $(b - a)(1/2)M_1$

Right: order 1 with bound $(b - a)(1/2)M_1$

Midpoint: order 2 with bound $(b - a)(1/24)M_2$

Trapezoid: order 2 with bound $(b - a)(1/12)M_2$

Simpson: order 4 with bound $(b - a)(1/180)M_4$.

The typical textbook problem on numerical integration is to find the value of n that guarantees the error is within a specified tolerance. In practice, one simply keeps doubling n until the desired number of digits seems to have stabilized. Thus, error behavior, rather than error bounds, may be what we really are interested in.
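The "keep doubling $n$" practice can be sketched as follows. This is a minimal illustration rather than a robust integrator: the function names, the stopping rule (two successive estimates agreeing to a chosen number of digits), and the test integrand $\sin x$ on $[0, \pi]$ are all assumptions. Here $n$ counts subintervals in the standard composite form of Simpson's Rule.

```python
import math

def composite_simpson(f, a, b, n):
    """Composite Simpson's Rule with n subintervals (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    dx = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * dx) for i in range(1, n, 2))
    total += 2 * sum(f(a + i * dx) for i in range(2, n, 2))
    return total * dx / 3

def integrate_by_doubling(f, a, b, digits=8, n=2, max_n=2**20):
    """Double n until two successive estimates agree to about `digits` digits."""
    previous = composite_simpson(f, a, b, n)
    while n < max_n:
        n *= 2
        current = composite_simpson(f, a, b, n)
        if abs(current - previous) <= 10.0 ** (-digits) * max(1.0, abs(current)):
            return current, n
        previous = current
    return previous, n

value, n_used = integrate_by_doubling(math.sin, 0.0, math.pi)
print(value, n_used)    # the exact integral of sin over [0, pi] is 2
```

Agreement of successive estimates is only a heuristic for accuracy, which is exactly why the behavior of the error, and not just a bound on it, is worth understanding.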

For example, it is useful to know that increasing $n$ by a factor of 10 for the Left or Right Rule reduces the error by a factor of 10, that is, it gives one more significant digit. Thus if it takes 1 second for a graphing calculator to compute an integral accurate to 2 digits using the Left or Right Rule, it will take $10^{10}$ seconds to get 12 digits of accuracy (that's about 317 years and, as my students have observed, a lot of batteries). By contrast, Simpson's Rule gets 4 extra digits for 10 times the work, and the same integral can be computed to 12 digits of accuracy in a minute or two on the same calculator (Simpson's Rule probably would get a headstart of 4 or 5 digits in the first second).
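The arithmetic behind these timings is easy to reproduce. The sketch below assumes a year of 365.25 days and, for Simpson's Rule, a head start of 4 digits in the first second; `time_for_digits` is a hypothetical helper name.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def time_for_digits(start_digits, target_digits, digits_per_tenfold, start_seconds=1.0):
    """Seconds needed if each tenfold increase in work adds `digits_per_tenfold` digits."""
    tenfold_steps = (target_digits - start_digits) / digits_per_tenfold
    return start_seconds * 10 ** tenfold_steps

left_seconds = time_for_digits(2, 12, 1)       # Left or Right Rule: order 1
simpson_seconds = time_for_digits(4, 12, 4)    # Simpson's Rule: order 4, assumed 4-digit head start

print("Left/Right Rule:", left_seconds, "seconds =", left_seconds / SECONDS_PER_YEAR, "years")
print("Simpson's Rule: ", simpson_seconds, "seconds")
```

With a 5-digit head start instead of 4, the Simpson estimate drops further, which is consistent with "a minute or two."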

The dependence of error behavior on the higher derivatives of the integrand is also important, because it is a warning to look out for integrals whose integrand has an unbounded derivative on the interval of integration. For example, even using Simpson's Rule on $\int_0^1 \sqrt{1 - x^2}\,dx$ to get an approximation for $\pi/4$ is painfully slow going. Indeed, the order of convergence is 3/2 rather than 4.
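The reduced order of convergence can be observed numerically. The following Python sketch estimates the order from the ratio of errors when the number of subintervals is doubled; the particular values of $n$ and the helper names are assumptions for illustration.

```python
import math

def composite_simpson(f, a, b, n):
    """Composite Simpson's Rule with n subintervals (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    dx = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * dx) for i in range(1, n, 2))
    total += 2 * sum(f(a + i * dx) for i in range(2, n, 2))
    return total * dx / 3

def quarter_circle(x):
    # max(...) guards against a tiny negative argument from round-off near x = 1
    return math.sqrt(max(0.0, 1.0 - x * x))

exact = math.pi / 4
errors = {n: abs(composite_simpson(quarter_circle, 0.0, 1.0, n) - exact)
          for n in (64, 128, 256, 512)}

# If error ~ C * dx^p, then halving dx divides the error by 2^p.
orders = [math.log2(errors[n] / errors[2 * n]) for n in (64, 128, 256)]
print(orders)
```

Running the same experiment on a smooth integrand such as $e^x$ should give observed orders near 4 instead.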

Taylor series can be used in the same way to analyze the error behavior for numerical differentiation approximations:

$$f'(x) \approx \frac{f(x + h) - f(x)}{h}$$

$$f'(x) \approx \frac{f(x + h) - f(x - h)}{2h}$$

$$f''(x) \approx \frac{f(x + h) + f(x - h) - 2f(x)}{h^2}$$

For example, students are often curious why some graphing calculators use the second of these approximations as a numerical derivative rather than the more familiar first approximation. Taylor series give the answer immediately: the error for the second approximation is order 2 while the error for the first is order 1. The dependence of the error of each approximation on higher derivatives of $f$ also has interesting effects. Try plotting the error near $x = 0$ with $h = .01$ for the second approximation to $f'$, when $f$ is the innocuous-looking function $f(x) = x^{8/3}$.
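Both observations can be explored with a few lines of Python. This is a sketch under stated assumptions: the test function $e^x$ at $x = 1$, the step sizes, and the helper names are illustrative, and $x^{8/3}$ is taken to mean the real-valued function $(\sqrt[3]{x})^8$, which is written here as `abs(x) ** (8 / 3)`.

```python
import math

def forward_difference(f, x, h):
    return (f(x + h) - f(x)) / h

def central_difference(f, x, h):
    return (f(x + h) - f(x - h)) / (2 * h)

def observed_order(approx, f, exact, x, h):
    """Estimate the order p from the errors at step sizes h and h/2."""
    e1 = abs(approx(f, x, h) - exact)
    e2 = abs(approx(f, x, h / 2) - exact)
    return math.log2(e1 / e2)

x0 = 1.0
order_forward = observed_order(forward_difference, math.exp, math.exp(x0), x0, 1e-2)
order_central = observed_order(central_difference, math.exp, math.exp(x0), x0, 1e-2)
print("forward:", order_forward, "central:", order_central)

# Error of the central difference for f(x) = x^(8/3) near 0 with h = .01.
def f(x):
    return abs(x) ** (8 / 3)                  # real-valued (cube root of x)^8

def f_prime(x):
    return math.copysign((8 / 3) * abs(x) ** (5 / 3), x)

h = 0.01
for x in (-0.02, -0.005, 0.0, 0.005, 0.02):
    print(x, central_difference(f, x, h) - f_prime(x))
```

A finer grid of $x$ values, passed to any plotting tool, gives the picture suggested in the text.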

6. THE FUNDAMENTAL THEOREMS OF CALCULUS. The proof given in [8] for the Taylor error bound appeals to the Fundamental Theorem of Calculus to turn the inequality $f^{(n+1)}(x) \le M$ into the inequality $f^{(n)}(x) - f^{(n)}(a) \le M(x - a)$. I suspect this is the natural inclination of most mathematicians, and it shows how underappreciated the IFT is. No definite integrals are needed; the IFT itself is a disguised form of integration. The subtle connection between the IFT and the Fundamental Theorem of Calculus is worth discussing.

There are of course two main versions of the Fundamental Theorem of Calculus. There are also variations on what restrictions are placed on the integrand $f$. I will assume $f$ is continuous. The theorems then are

First Fundamental Theorem of Calculus (FTC I). If $f$ is continuous on the interval $[a, b]$ and $F(x) = \int_a^x f(t)\,dt$ for $x$ in $[a, b]$, then $F'(x) = f(x)$.

Second Fundamental Theorem of Calculus (FTC II). If $f$ is continuous and $F'(x) = f(x)$, then $\int_a^b f(t)\,dt = F(b) - F(a)$.

The First Fundamental Theorem is not directly related to the IFT. The hard part of the proof is showing that continuous functions are Riemann integrable. The rest is a straightforward consequence of the integral version of the Mean Value Inequality:

$$m(b - a) \le \int_a^b f(x)\,dx \le M(b - a),$$

where $m \le f(x) \le M$ on the interval $[a, b]$. Note that unlike the Mean Value Inequality for derivatives, this inequality follows easily from the definition of the Riemann integral, so easily that it is not uncommon to view the inequality as a defining property of the definite integral (the corresponding view for the Mean Value Inequality for derivatives, Ampère notwithstanding, is much less common).

On the other hand, the Second Fundamental Theorem is closely connected to the IFT. The IFT for continuously differentiable functions follows directly from the FTC II and the fact that the integral of a nonnegative function is nonnegative. In fact, that is the way the IFT is proved in [8]. There, the FTC II, as embodied in the relation between velocity and change in position, is taken as the intuitively clear, theoretical cornerstone, and the IFT is derived from it. I suspect, however, that most students see the IFT as more "obvious" than the FTC II.

Conversely, the IFT implies the FTC II by the method used in many calculus books: simply invoke the FTC I with $x = b$ and observe that, by the IFT (the Constant Function Theorem, Theorem 1b), two antiderivatives of $f$ differ by a constant.

The assumption of continuity in the FTC I is necessary. The assumption of continuity in the FTC II is another matter. Of course, if $F' = f$ is not continuous, the integral might not exist. For example, if $F(0) = 0$ and $F(x) = x^2 \sin(1/x^2)$ for $x \ne 0$, then $F'$ exists everywhere but is not even Lebesgue integrable on $[0, 1]$. Suppose, however, that we assume only that $\int_a^b f(t)\,dt$ exists. Then the familiar argument using the Mean Value Theorem still works. Just represent $F(b) - F(a)$ as a telescoping sum and use the MVT on each term of the sum to turn it into a Riemann sum for $\int_a^b f(t)\,dt$. Here the IFT does not work. Just as the MVT follows from the IFT only under the assumption of continuity of the derivative, the FTC II follows from the IFT only under the assumption of continuity of the integrand.
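The telescoping argument mentioned above can be written out as follows. The partition points $x_i$, the count $N$, and the intermediate points $c_i$ are notation introduced here for illustration.

```latex
% Partition a = x_0 < x_1 < \dots < x_N = b (N, x_i, c_i are illustrative names).
F(b) - F(a) = \sum_{i=1}^{N} \bigl[ F(x_i) - F(x_{i-1}) \bigr]
            = \sum_{i=1}^{N} F'(c_i)\,(x_i - x_{i-1})
            = \sum_{i=1}^{N} f(c_i)\,\Delta x_i ,
\qquad c_i \in (x_{i-1}, x_i) \text{ supplied by the MVT}.
```

The right-hand side is a Riemann sum for $\int_a^b f(t)\,dt$, while the left-hand side does not depend on the partition. If the integral exists, refining the partition forces the Riemann sums to converge to it, so the integral must equal $F(b) - F(a)$.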

7. CONCLUSION. Many calculus textbooks have sections where the author is writing on automatic pilot, just putting in material demanded by users. These sections have the same dreary examples; little is new, or thought over fresh from the start. This shouldn't be surprising, since writing a calculus textbook is a significant project and one can't devote the same enthusiasm and energy to all parts of the project. I have always felt that the theoretical sections of standard calculus textbooks are most prone to such a pedestrian treatment. Moreover, calculus instruction does not place much emphasis on those theoretical sections, at least when it comes to testing. For example, a study of the compendium of final exams in [11] reveals only one question (out of more than 300 on 23 exams) involving the Mean Value Theorem, and that one asked for the value of $c$ satisfying the conclusion of the Mean Value Theorem for a quadratic function. When both textbooks and instruction appear to be just going through the motions with theory, it surprises me that some critics of new textbooks like [8] bemoan the absence of the Mean Value Theorem or an ε-δ definition of limit.

I sympathize with yearnings for an occasional foray into the theoretical structure of the calculus. I just ask that it be thoughtful and sensible. Use intuitive definitions. If a theorem is to be used without proof, like the Mean Value Theorem, keep it as simple and as "obvious" as possible. Don't use tricky proofs or deus-ex-machina auxiliary functions. Don't prove things in more generality than necessary; even analysts don't usually deal with the discontinuous derivatives allowed by the Mean Value Theorem.

In this paper, I have tried to give a sensible approach to the Mean Value Theorem and its usual applications to monotonicity, Taylor error bounds, quadrature error bounds, and the Fundamental Theorems of Calculus. One standard application of the MVT I have not considered is l'Hôpital's Rule; for a non-MVT approach, see [3]. LBL [9] has some other applications to concavity and the second derivative test for extrema.

In recent years, calculus content and pedagogy have been rethought completely. People have found that there is nothing sacred about related rates and the lecture method. It is time as well to rethink the theory taught in standard calculus classes. There is nothing sacred about the Mean Value Theorem.

ACKNOWLEDGMENTS. I wish to thank Andy Gleason, Peter Lax, and Jerry Uhl for numerous suggestions and corrections for this article. In particular, the proof given for the IFT was instigated by a bisection proof Lax showed me for the SIFT. He also showed me applications to l'Hôpital's Rule, the Corrected Midpoint Rule for quadrature, and the definition of volumes and arclengths using antiderivatives rather than definite integrals; all of this I hope he puts into print. I am indebted to Gleason, whose meticulous reading caught a number of egregious errors and whose comments cleared up my muddy thinking at numerous points.

REFERENCES

1. L. Bers, On avoiding the mean value theorem, Amer. Math. Monthly 74 (1967), 583.
2. D. Bo, A simple derivation of the Maclaurin series for sine and cosine, Amer. Math. Monthly 97 (1990), 836. Editor's Note in the Monthly 98 (1991), 364.
3. R. P. Boas, Lhospital's rule without mean value theorems, Amer. Math. Monthly 76 (1969), 1051-1053.
4. R. P. Boas, Who needs these mean-value theorems anyway?, Two-Year College Math J. 12 (1981), 178-181.
5. W. Davis, H. Porta, J. Uhl, Calculus & Mathematica: Derivatives: Measuring Growth, Addison-Wesley, 1994.
6. J. Dieudonne, Foundations of Modern Analysis, Academic Press, New York, 1960.
7. J. V. Grabiner, The Origins of Cauchy's Rigorous Calculus, MIT Press, Cambridge, 1981.
8. D. Hughes-Hallett, A. M. Gleason, et al., Calculus, John Wiley & Sons, New York, 1994.
9. P. Lax, S. Burstein, and A. Lax, Calculus with Applications and Computing, Volume 1, Springer-Verlag, New York, 1984.
10. D. E. Richmond, An elementary proof of a theorem of calculus, Amer. Math. Monthly 92 (1985), 589-590.
11. L. A. Steen, editor, Calculus for a New Century: A Pump, Not a Filter, MAA Notes 8, Mathematical Association of America, Washington, DC, 1988.

Department of Mathematics
Colgate University
Hamilton, NY 13346
ttucker@center.colgate.edu

Commentary on Rethinking Rigor in Calculus: The Role of the Mean Value Theorem

Howard Swann

Professor Tucker's article joins the current deconstructive attack on traditional content and methods of teaching of calculus that seems to be part of the mission of the militant wing of the 'Calculus Reform Movement.' Here the primary targets are current textbooks' efforts to present the foundations of calculus and the frequent use of the mean value theorem.

As the author remarks, the traditional presentation of the foundations of calculus is often poorly motivated and incomprehensible to most students. So in reforming the teaching of the calculus sequence, one should either omit the logical foundations or attempt to make them interesting and comprehensible. The author, who is one of the co-authors of the 'Harvard Calculus' text [2] where the first option is chosen and the concept of mathematical proof based on rigorous definitions is eliminated entirely, urges that we keep things as "intuitive ... , simple and obvious as possible." Various demonstrations are our new "proofs;" I use the quotation marks to make the distinction. The author's favored replacement for the Mean Value Theorem (MVT), the Increasing Function Theorem (IFT), finds its intuitive justification in an automotive ('Racetrack') argument. Such automotive arguments are a new addition to our pantheon of "proofs." An automotive "proof" of the IFT is 'if the speedometer on a motor-car always reports a number greater than or equal to zero, then the car must be moving (weakly) forward.' The IFT is to be treated as an 'axiom,' yet the essential first foundational question for calculus is 'What is it that a speedometer is supposed to report?' Intuition falters here, for nature has yet to provide us with a speedometer.

The author states, "The origin of The Mean Value Theorem in the structure of the real numbers is not addressed; that is much too difficult for a standard course." I agree that proofs of the extreme value theorem and other global results from basic principles do not belong in today's beginning calculus texts in the present educational climate. However, an informal discussion of human attempts to define 'number' is fascinating and accessible.

For example, although we currently use the 'real numbers,' today's students, brought up on Star Trek, are delighted with the realization that there still is the following problem: When we use real numbers to represent time $t$ and position $p(t)$, we are led to the conclusion that in moving from $p(t')$ to $p(t'')$ we must disappear infinitely often, for there is NO instant of time 'next to' $t'$ nor any position 'next to' $p(t')$. A variant of this problem bothered Zeno 2500 years ago; it has not been resolved; the reals are indeed 'full of holes.' Why shouldn't today's students again contemplate this version of the abyss confronting human attempts to comprehend infinity, particularly when Weierstrass has contrived a remarkably clever way across?

For Weierstrass, in treating continuity and differentiability, insisted that we consider only functions that are accompanied by suitable ε, δ arguments. In this class of functions he was able to show uniqueness (and thus define 'correctness') of guesses for limits and derivatives [1]. These notions are accessible to students and give the foundations for differential calculus. When we add the global results, the implications so astonished Bertrand Russell that he pronounced [4, p. 64]:

... all goes smoothly until we reach those studies in which the notion of infinity is employed - the infinitesimal calculus and the whole of higher mathematics. The solution of the difficulties which formerly surrounded the mathematically infinite is probably the greatest achievement of which our age has to boast.

Learning to understand and appreciate proofs is a gradual process; it surely is imperative to introduce the notion of mathematical proof in beginning multisemester calculus and keep it alive even though actual proofs are few. Such an introduction is essential for later mathematics courses, and students must be made aware that the assertions of mathematics can be proved to be true.

The bright promise of the new technology gives us a chance to explore these ideas in a striking way. For example, using, say, Mathematica, 'zoom' the graphs of $f(x) = |x|\sin(1/x)$ (continuously extended) and $g(x) = x^{2n/(2n-1)}\sin(1/x)$ (continuously extended), search for possible 'local linearity' and try to decide if they are differentiable at zero. For $n > 4$, the graph of $[g(x) - g(0)]/x$ has a delightful fractal quality when you magnify the domain around zero; we give examples in Figure 1. These are not 'important' functions in a practical sense, but a look at such graphs encourages a sense of delight and wonder concerning the difficulties of the foundations of analysis. It does not take very much time to present and discuss these ideas.
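The 'zoom' experiment can be imitated numerically. The text suggests Mathematica; the sketch below uses Python instead, and everything in it is an assumption made for illustration: the helper names, the choice $n = 5$ (so $g(x) = x^{10/9}\sin(1/x)$, as in Figure 1), the sample points (positive $x$ only), and the use of the real odd root so that $g$ is defined for negative $x$ as well.

```python
import math

def real_odd_root_power(x, numerator, denominator):
    """x**(numerator/denominator) for odd `denominator`, using the real root when x < 0."""
    root = math.copysign(abs(x) ** (1.0 / denominator), x)
    return root ** numerator

def f(x):
    return 0.0 if x == 0 else abs(x) * math.sin(1 / x)

def g(x, n=5):
    return 0.0 if x == 0 else real_odd_root_power(x, 2 * n, 2 * n - 1) * math.sin(1 / x)

def difference_quotient_at_zero(func, x):
    return (func(x) - func(0.0)) / x

# Largest sampled difference quotient on (0, scale] as the window shrinks toward 0.
for scale in (1e-2, 1e-4, 1e-6, 1e-12):
    xs = [scale * k / 50 for k in range(1, 51)]
    spread_f = max(abs(difference_quotient_at_zero(f, x)) for x in xs)
    spread_g = max(abs(difference_quotient_at_zero(g, x)) for x in xs)
    print(f"window {scale:g}: f quotient up to {spread_f:.3f}, g quotient up to {spread_g:.3f}")
```

The quotient for $g$ equals $x^{1/9}\sin(1/x)$, which is bounded by $|x|^{1/9}$ and so tends to 0, but very slowly; the quotient for $f$ is $\pm\sin(1/x)$ and keeps oscillating. The sampled maxima only hint at this, which is part of why the graphs are worth looking at.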

[Figure 1: graphs of $f(x) = |x|\sin(1/x)$ and $g(x) = x^{10/9}\sin(1/x)$, and of the difference quotient $(g(x) - g(0))/(x - 0)$ for $-.3 \le x \le .3$ and for $-.01 \le x \le .01$.]

As for the mean value theorem, the author states "And by the way, there is nothing obvious about the MVT without the hypothesis of continuity of the derivative." I believe that this is not true, for here is a pictorial "proof" of the MVT:

MEAN VALUE THEOREM. If $f(x)$ is continuous on $[a, b]$ and has a derivative on $(a, b)$, then there is some point $c$, $a < c < b$, such that

$$f'(c) = \frac{f(b) - f(a)}{b - a} = \text{slope of line through } (a, f(a)) \text{ and } (b, f(b)).$$

Pictorial "proof:" Intuitively, the assumptions of the theorem mean that the graph of $f(x)$ is smooth between $(a, f(a))$ and $(b, f(b))$ and presumably has no sharp corners since $f$ has derivatives. If the graph of $f(x)$ is not a straight line, some of the graph will be above the line through $(a, f(a))$ and $(b, f(b))$ or below this line. Suppose some of the graph is above the line. Imagine a line that is parallel to the line through $(a, f(a))$ and $(b, f(b))$ but far above the line. Move it down toward the line, keeping it parallel to the line through $(a, f(a))$ and $(b, f(b))$. Since there are no corners on the graph, when the line first hits the graph at some point $(c, f(c))$, surely it will be tangent to the graph at such a point. So, if our definition of the derivative as the slope of a line that is tangent to the graph at $(c, f(c))$ is any good, the slope of this tangent line must be $f'(c)$. But since the line is parallel to the line through $(a, f(a))$ and $(b, f(b))$, it will have the same slope as this line, i.e.

$$f'(c) = (f(b) - f(a))/(b - a).$$

A similar argument holds if some of the graph is below the line. ∎

[Figure 2: graph of $f$ with $a$, $c$, and $b$ marked on the horizontal axis, $f(a)$ on the vertical axis, and the rise $f(b) - f(a)$.]

It is to be hoped that the phrases 'presumably' and 'if the definition is any good' make the students suspicious. This should lead to consideration of a special case (Rolle's theorem) where it is clear that the only assumption necessary is the extreme value theorem. This in turn invites a discussion (no proofs) of the extreme value theorem as one of the crucial global tests of the effectiveness of Weierstrass' definition of continuity. The boundedness theorem, the extreme value theorem, and the intermediate value theorem are no longer 'obvious' when students have realized that the problems with the 'holes' in the real numbers extend to any intuitive sense of continuity of a function on an interval. For example, is Weierstrass' definition of continuity strong enough to force a continuous function to be bounded on a closed bounded interval? The announcement that the answer is 'yes' is an excellent promotional preview of later courses. At any rate, once students have 'bought' Rolle's theorem, we can then use the conventional proof to show that the MVT must hold.

Here we reverse the author's prescription for giving mathematics a bad name; such a sequence of arguments reveals the charm and power of mathematics, for we prove that a questionable complicated result must be true if we assume other simpler results that are less questionable.

The author offers us a mathematical proof of the Increasing Function Theorem, "easier than most proofs of the MVT". He presents the theorem as

If $f' \ge 0$ on an interval, then $f$ is increasing on that interval.

We infer from the proof that the interval is $[a, b]$, closed and bounded, and that we are to have one-sided derivatives at $a$ and $b$.

The key observation for the author's proof is the following: Given $f$, let slope$(a, b) = [f(b) - f(a)]/(b - a)$. The author points out that "If slope$(a, b) = m$ and $c$ is between $a$ and $b$, then one of slope$(a, c)$ and slope$(c, b)$ is greater than or equal to $m$ and one is less than or equal to $m$. For a proof, draw the obvious picture."

The "obvious picture" encourages this assertion, but knowing that the art of converting a "proof" to a proof is one of the key skills our majors should learn, if we are giving a proof here, we must go further. Two mathematical proofs are immediately discovered; a proof by contradiction (four main cases) or a direct proof. The direct proof shows first that the result must be true if $m = 0$, and then uses the same 'deus-ex-machina' auxiliary function that annoys the author when it is employed to prove the MVT from Rolle's Theorem.

The author admonishes us: " ... previous calls to downplay the MVT have fallen on the deaf ears of textbook writers. Maybe calculus reform has unblocked some ears and it is time to try the call again."

I borrow a phrase from the author and wave Occam's Razor, that 'principle of parsimony,' "like a cross in front of a vampire, to hold the" attack on the mean value theorem at bay.

The mean value theorem is actually a friendly theorem; what has it done to provoke such ire? It provides one more test of the effectiveness of Weierstrass' definition of limit and continuity and is used endlessly to establish all sorts of results; more, by the author's admission, than the IFT. For example, how do we prove that the formula for arc length is correct without the MVT? We do not prove it, apparently; with the wave of the symbol ' - ', we make ourselves content with the arguments of "Newton, Leibnitz, the Bernoullis and Euler."

Which is more intuitively 'obvious' and persuasive; the pictorial "proof" of the MVT that we sketched above, or an automotive "proof" of the IFT?

A one-line proof shows that the author's IFT follows from the MVT. The author's suggested proof for the MVT from the IFT requires, in addition to the usual assumptions for the MVT, that the function's derivatives be extendible to a continuous function on the closed interval, requires the extreme value theorem and the intermediate value theorem, and fails to establish that the sought-for value for $c$ is strictly between points $a$ and $b$. This is essential, for example, for showing that we can repeat the application of L'Hospital's rule a second time in evaluating a limit.

One positive note: The author's argument (Theorem 2) for an error bound for Taylor's series is elegant and should be adopted by one and all.

However, I do not find the main arguments of the paper to be persuasive. Those of us who, as the author says, "bemoan the absence of the Mean Value Theorem or the ε, δ definition of limit" regret that "it is time ... to rethink the theory taught in standard calculus classes." Some of us are disappointed that "there is nothing sacred about related rates;" we used to regard them highly, for related rates give us the heat equation, one of the classic models of mathematical physics, and the primary example for the study of elliptic and parabolic partial differential equations.

Whether or not the militants' 'final product' is 'better,' which is by no means established [3], one thing is clear: books such as the "Harvard Calculus" are "enablers;" by legitimizing the abandonment of the concepts of mathematical proof, related rates, convergence of series, and so forth from the calculus sequence, other texts and teachers will feel free to follow.

Mathematics is unique in its concern with rigorous foundations and proofs. Here its role as 'Queen and servant of the Sciences' is to offer the content of calculus as an anchor of certainty to aid the disciplines it serves. Should we not attempt to convey some sense of the remarkable way that the results of calculus can be proved to be true to those who will use it?

REFERENCES

1. C. B. Boyer, The History of the Calculus and its Conceptual Development, Dover, 1959.
2. D. Hughes-Hallett, A. M. Gleason et al., Calculus, John Wiley & Sons, New York, 1994.
3. K. Johnson, Harvard Calculus at Oklahoma State, Amer. Math. Monthly 102 (1995) 794-797.
4. B. Russell, Mysticism and Logic, W. W. Norton & Co., Inc., New York, 1929.

Department of Mathematics and Computer Science
San Jose State University
San Jose, CA

1997] COMMENTARY ON THE ROLE OF THE MEAN VALUE THEOREM 245


On "Rethinking Rigor in Calculus...," or Why We Don't Do Calculus on the Rational Numbers
Author(s): Scott E. Brodie
Source: The College Mathematics Journal, Vol. 30, No. 2 (Mar., 1999), pp. 135-138
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/2687725
Accessed: 16/03/2010 08:16


Who Gave You the Epsilon? Cauchy and the Origins of Rigorous Calculus
Judith V. Grabiner, 424 West 7th Street, Claremont, California 91711

The American Mathematical Monthly, March 1983, Volume 90, Number 3, pp. 185–194.

Student: The car has a speed of 50 miles an hour. What does that mean?

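Teacher: Given any $\varepsilon > 0$, there exists a $\delta$ such that if $|t_2 - t_1| < \delta$, then $\left| \frac{s_2 - s_1}{t_2 - t_1} - 50 \right| < \varepsilon$.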

Student: How in the world did anybody ever think of such an answer?

Perhaps this exchange will remind us that the rigorous basis for the calculus is not at all intuitive; in fact, quite the contrary. The calculus is a subject dealing with speeds and distances, with tangents and areas, not inequalities. When Newton and Leibniz invented the calculus in the late seventeenth century, they did not use delta-epsilon proofs. It took a hundred and fifty years to develop them. This means that it was probably very hard, and it is no wonder that a modern student finds the rigorous basis of the calculus difficult. How, then, did the calculus get a rigorous basis in terms of the algebra of inequalities?

Delta-epsilon proofs are first found in the works of Augustin-Louis Cauchy (1789–1857). This is not always recognized, since Cauchy gave a purely verbal definition of limit, which at first glance does not resemble modern definitions: “When the successively attributed values of the same variable indefinitely approach a fixed value, so that finally they differ from it by as little as desired, the last is called the limit of all the others” [1]. Cauchy also gave a purely verbal definition of the derivative of $f(x)$ as the limit, when it exists, of the quotient of differences $(f(x+h) - f(x))/h$ when $h$ goes to zero, a statement much like those that had already been made by Newton, Leibniz, d’Alembert, Maclaurin, and Euler. But what is significant is that Cauchy translated such verbal statements into the precise language of inequalities when he needed them in his proofs. For instance, for the derivative [2]:

(1) Let $\delta$, $\varepsilon$ be two very small numbers; the first is chosen so that for all numerical [i.e., absolute] values of $h$ less than $\delta$, and for any value of $x$ included [in the interval of definition], the ratio $(f(x+h) - f(x))/h$ will always be greater than $f'(x) - \varepsilon$ and less than $f'(x) + \varepsilon$.

This one example will be enough to indicate how Cauchy did the calculus, because the question to be answered in the present paper is not, “how is a rigorous delta-epsilon proof constructed?” As Cauchy’s intellectual heirs we all know this. The central question is, how and why was Cauchy able to put the calculus on a rigorous basis, when his predecessors were not? The answers to this historical question cannot be found by reflecting on the logical relations between the concepts, but by looking in detail at the past and seeing how the existing state of affairs in fact developed from that past. Thus we will examine the mathematical situation in the seventeenth and eighteenth centuries, the background against which we can appreciate Cauchy’s innovation. We will describe the powerful techniques of the calculus of this earlier period and the relatively unimpressive views put forth to justify them. We will then discuss how a sense of urgency about rigorizing analysis gradually developed in the eighteenth century. Most important, we will explain the development of the mathematical techniques necessary for the new rigor from the work of men like Euler, d’Alembert, Poisson, and especially Lagrange. Finally, we will show how these mathematical results, though often developed for purposes far removed from establishing foundations for the calculus, were used by Cauchy in constructing his new rigorous analysis.

The Practice of Analysis: From Newton to Euler. In the late seventeenth century, Newton and Leibniz, almost simultaneously, independently invented the calculus. This invention involved three things. First, they invented the general concepts of differential quotient and integral (these are Leibniz’s terms; Newton called the concepts “fluxion” and “fluent”). Second, they devised a notation for these concepts which made the calculus an algorithm: the methods not only worked, but were easy to use. Their notations had great heuristic power, and we still use Leibniz’s $dy/dx$ and $\int y\,dx$, and Newton’s $\dot{x}$, today. Third, both men realized that the basic processes of finding tangents and areas, that is, differentiating and integrating, are mutually inverse: what we now call the Fundamental Theorem of Calculus.

Once the calculus had been invented, mathematicians possessed an extremely powerful set of methods for solving problems in geometry, in physics, and in pure analysis. But what was the nature of the basic concepts? For Leibniz, the differential quotient was a ratio of infinitesimal differences, and the integral was a sum of infinitesimals. For Newton, the derivative, or fluxion, was described as a rate of change; the integral, or fluent, was its inverse. In fact, throughout the eighteenth century, the integral was generally thought of as the inverse of the differential. One might imagine asking Leibniz exactly what an infinitesimal was, or Newton what a rate of change might be. Newton’s answer, the best of the eighteenth century, is instructive. Consider a ratio of finite quantities (in modern notation, $(f(x+h) - f(x))/h$ as $h$ goes to zero). The ratio eventually becomes what Newton called an “ultimate ratio.” Ultimate ratios are “limits to which the ratios of quantities decreasing without limit do always converge, and to which they approach nearer than by any given difference, but never go beyond, nor ever reach until the quantities vanish” [3]. Except for “reaching” the limit when the quantities vanish, we can translate Newton’s words into our algebraic language. Newton himself, however, did not do this, nor did most of his followers in the eighteenth century. Moreover, “never go beyond” does not allow a variable to oscillate about its limit. Thus, though Newton’s is an intuitively pleasing picture, as it stands it was not and could not be used for proofs about limits. The definition sounds good, but it was not understood or applied in algebraic terms.

But most eighteenth-century mathematicians would object, “Why worry about foundations?” In the eighteenth century, the calculus, intuitively understood and algorithmically executed, was applied to a wide range of problems. For instance, the partial differential equation for vibrating strings was solved; the equations of motion for the solar system were solved; the Laplace transform and the calculus of variations and the gamma function were invented and applied; all of mechanics was worked out in the language of the calculus. These were great achievements on the part of eighteenth-century mathematicians. Who would be greatly concerned about foundations when such important problems could be successfully treated by the calculus? Results were what counted.

This point will be better appreciated by looking at an example which illustrates both the “uncritical” approach to concepts of the eighteenth century and the immense power of eighteenth-century techniques, from the work of the great master of such techniques: Leonhard Euler. The problem is to find the sum of the series

$$1/1 + 1/4 + 1/9 + \dots + 1/k^2 + \dots.$$

It clearly has a finite sum since it is bounded above by the series

$$1 + 1/(1 \cdot 2) + 1/(2 \cdot 3) + 1/(3 \cdot 4) + \dots + 1/[(k-1) \cdot k] + \dots,$$

whose sum was known to be 2; Johann Bernoulli had found this sum by treating $1/(1 \cdot 2) + 1/(2 \cdot 3) + 1/(3 \cdot 4) + \dots$ as the difference between the series $1/1 + 1/2 + 1/3 + \dots$ and the series $1/2 + 1/3 + 1/4 + \dots,$ and observing that this difference telescopes [4].

Euler’s summation of $\sum_{k=1}^{\infty} 1/k^2$ makes use of a lemma from the theory of equations: given a polynomial equation whose constant term is one, the coefficient of the linear term is the sum of the reciprocals of the roots with the signs changed. This result was both discovered and demonstrated by considering the equation $(x-a)(x-b) = 0,$ having roots $a$ and $b$. Multiplying and then dividing out $ab$, we obtain

$$(1/ab)x^2 - (1/a + 1/b)x + 1 = 0;$$

the result is now obvious, as is the extension to equations of higher degree.

Euler’s solution then considers the equation $\sin x = 0.$ Expanding this as an infinite series, Euler obtained

$$x - x^3/3! + x^5/5! - \dots = 0.$$

Dividing by $x$ yields

$$1 - x^2/3! + x^4/5! - \dots = 0.$$

Finally, substituting $x^2 = u$ produces

$$1 - u/3! + u^2/5! - \dots = 0.$$

But Euler thought that power series could be manipulated just like polynomials. Thus, we now have a polynomial equation in $u$, whose constant term is one. Applying the lemma to it, the coefficient of the linear term with the sign changed is $1/3! = 1/6.$ The roots of the equation in $u$ are the roots of $\sin x = 0$ with the substitution $u = x^2,$ namely $\pi^2, 4\pi^2, 9\pi^2, \dots.$ Thus the lemma implies

$$1/6 = 1/\pi^2 + 1/(4\pi^2) + 1/(9\pi^2) + \dots.$$

Multiplying by $\pi^2$ yields the sum of the original series [5]:

$$1/1 + 1/4 + 1/9 + \dots + 1/k^2 + \dots = \pi^2/6.$$
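Euler's result can at least be checked numerically, in the spirit of the remark that such results "could readily be verified." The following is a minimal sketch, not part of the original article: the function names are illustrative, and the tail estimate $1/(n+1) < \pi^2/6 - s_n < 1/n$ used in the comments is the standard integral comparison, brought in here as an assumption rather than taken from the text.

```python
import math

def basel_partial_sum(n):
    """Partial sum 1/1 + 1/4 + ... + 1/n^2 of Euler's series."""
    return sum(1.0 / (k * k) for k in range(1, n + 1))

def bounding_partial_sum(n):
    """Partial sum 1 + 1/(1*2) + ... + 1/((n-1)*n) of the bounding series.

    Bernoulli's telescoping argument gives the closed form 2 - 1/n.
    """
    return 1.0 + sum(1.0 / ((k - 1) * k) for k in range(2, n + 1))

target = math.pi ** 2 / 6
for n in (10, 100, 1000, 10000):
    s = basel_partial_sum(n)
    # The gap target - s should lie between 1/(n+1) and 1/n,
    # and s should stay below the bounding partial sum.
    print(n, s, target - s, bounding_partial_sum(n))
```

The partial sums creep toward $\pi^2/6$ only at rate $1/n$, which is one reason a direct numerical attack would not have suggested the closed form; Euler's formal manipulation did.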

Though it is easy to criticize eighteenth-century arguments like this for their lack of rigor, it is also unfair. Foundations, precise specifications of the conditions under which such manipulations with infinites or infinitesimals were admissible, were not very important to men like Euler, because without such specifications they made important new discoveries, whose results in cases like this could readily be verified. When the foundations of the calculus were discussed in the eighteenth century, they were treated as secondary. Discussions of foundations appeared in the introductions to books, in popularizations, and in philosophical writings, and were not (as they are now and have been since Cauchy’s time) the subject of articles in research-oriented journals.

Thus, where we once had one question to answer, we now have two. The first remains, where do Cauchy’s rigorous techniques come from? Second, one must now ask, why rigorize the calculus in the first place? If few mathematicians were very interested in foundations in the eighteenth century [6], then when, and why, were attitudes changed?

Of course, to establish rigor, it is necessary (though not sufficient) to think rigor is significant. But more important, to establish rigor, it is necessary (though also not sufficient) to have a set of techniques in existence which are suitable for that purpose. In particular, if the calculus is to be made rigorous by being reduced to the algebra of inequalities, one must have both the algebra of inequalities, and facts about the concepts of the calculus that can be expressed in terms of the algebra of inequalities.

In the early nineteenth century, three conditions held for the first time: Rigor was considered important; there was a well-developed algebra of inequalities; and, certain properties were known about the basic concepts of analysis (limits, convergence, continuity, derivatives, integrals), properties which could be expressed in the language of inequalities if desired. Cauchy, followed by Riemann and Weierstrass, gave the calculus a rigorous basis, using the already-existing algebra of inequalities, and built a logically-connected structure of theorems about the concepts of the calculus. It is our task to explain how these three conditions (the developed algebra of inequalities, the importance of rigor, the appropriate properties of the concepts of the calculus) came to be.

The Algebra of Inequalities. Today, the algebra of inequalities is studied in calculus courses because of its use as a basis for the calculus, but why should it have been studied in the eighteenth century when this application was unknown? In the eighteenth century, inequalities were important in the study of a major class of results: approximations. For example, consider an equation such as $(x+1)^m = a,$ for $m$ not an integer. Usually $a$ cannot be found exactly, but it can be approximated by an infinite series. In general, given some number $n$ of terms of such an approximating series, eighteenth-century mathematicians sought to compute an upper bound on the error in the approximation, that is, the difference between the sum of the series and the $n$th partial sum. This computation was a problem in the algebra of inequalities. Jean d’Alembert solved it for the important case of the binomial series; given the number of terms of the series $n$, and assuming implicitly that the series converges to its sum, he could find the bounds on the error (that is, on the remainder of the series after the $n$th term) by bounding the series above and below with convergent geometric progressions [7]. Similarly, Joseph-Louis Lagrange invented a new approximation method using continued fractions and, by extremely intricate inequality-calculations, gave necessary and sufficient conditions for a given iteration of the approximation to be closer to the result than the previous iteration [8]. Lagrange also derived the Lagrange remainder of the Taylor series [9], using an inequality which bounded the remainder above and below by the maximum and minimum values of the $n$th derivative and then applying the intermediate-value theorem for continuous functions. Thus through such eighteenth-century work [10], there was by the end of the eighteenth century a developed algebra of inequalities, and people used to working with it. Given an $n$, these people are used to finding an error, that is, an epsilon.

Changing Attitudes toward Rigor. Mathematicians were much more interested in finding rigorous foundations for the calculus in 1800 than they had been a hundred years before. There are many reasons for this: no one enough by itself, but apparently sufficient when acting together. Of course one might think that eighteenth-century mathematicians were always making errors because of the lack of an explicitly-formulated rigorous foundation. But this did not occur. They were usually right, and for two reasons. One is that if one deals with real variables, functions of one variable, series which are power series, and functions arising from physical problems, errors will not occur too often. A second reason is that mathematicians like Euler and Laplace had a deep insight into the basic properties of the concepts of the calculus, and were able to choose fruitful methods and evade pitfalls. The only “error” they committed was to use methods that shocked mathematicians of later ages who had grown up with the rigor of the nineteenth century.

What then were the reasons for the deepened interest in rigor? One set of reasons was philosophical. In 1734, the British philosopher Bishop Berkeley had attacked the calculus on the ground that it was not rigorous. In The Analyst, or a Discourse Addressed to an Infidel Mathematician, he said that mathematicians had no business attacking the unreasonableness of religion, given the way they themselves reasoned. He ridiculed fluxions (“velocities of evanescent increments”), calling the evanescent increments “ghosts of departed quantities” [11]. Even more to the point, he correctly criticized a number of specific arguments from the writings of his mathematical contemporaries. For instance, he attacked the process of finding the fluxion (our derivative) by reviewing the steps of the process: if we consider $y = x^2,$ taking the ratio of the differences $((x+h)^2 - x^2)/h,$ then simplifying to $2x + h,$ then letting $h$ vanish, we obtain $2x.$ But is $h$ zero? If it is, we cannot meaningfully divide by it; if it is not zero, we have no right to throw it away. As Berkeley put it, the quantity we have called $h$ “might have signified either an increment or nothing. But then, which of these soever you make it signify, you must argue consistently with such its signification” [12].
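Berkeley's dilemma can be made concrete with exact arithmetic. The sketch below is an illustration added here, not taken from Berkeley or from the article: the function name is hypothetical, and the sample point $x = 3$ is an arbitrary choice. For every nonzero $h$ the ratio equals $2x + h$ exactly, so it misses $2x$ by exactly $|h|$; at $h = 0$ the ratio is simply undefined.

```python
from fractions import Fraction

def difference_quotient(x, h):
    """Return ((x + h)^2 - x^2) / h, which is only defined for h != 0."""
    if h == 0:
        raise ZeroDivisionError("the ratio of differences is undefined at h = 0")
    return ((x + h) ** 2 - x ** 2) / h

x = Fraction(3)  # arbitrary sample point (an assumption for illustration)
for h in (Fraction(1, 10), Fraction(1, 1000), Fraction(-1, 1000)):
    q = difference_quotient(x, h)
    # For h != 0 the quotient is exactly 2x + h, so |q - 2x| = |h|.
    print(h, q, abs(q - 2 * x))
```

Read as inequalities, the output says that the quotient can be kept within any prescribed distance of $2x$ by taking $|h|$ smaller than that distance, with $h$ never set to zero. That is the sense in which a limit statement is shorthand for a family of inequalities.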

Since an adequate response to Berkeley’s objections would have involved recognizing that an equation involving limits is a shorthand expression for a sequence of inequalities (a subtle and difficult idea), no eighteenth-century analyst gave a fully adequate answer to Berkeley. However, many tried. Maclaurin, d’Alembert, Lagrange, Lazare Carnot, and possibly Euler, all knew about Berkeley’s work, and all wrote something about foundations. So Berkeley did call attention to the question. However, except for Maclaurin, no leading mathematician spent much time on the question because of Berkeley’s work, and even Maclaurin’s influence lay in other fields.

Another factor contributing to the new interest in rigor was that there was a limit to the number of results that could be reached by eighteenth-century methods. Near the end of the century, some leading mathematicians had begun to feel that this limit was at hand. D’Alembert and Lagrange indicate this in their correspondence, with Lagrange calling higher mathematics “decadent” [13]. The philosopher Diderot went so far as to claim that the mathematicians of the eighteenth century had “erected the pillars of Hercules” beyond which it was impossible to go [14]. Thus, there was a perceived need to consolidate the gains of the past century.

Another “factor” was Lagrange, who became increasingly interested in foundations, and through his activities, interested other mathematicians. In the eighteenth century, scientific academies offered prizes for solving major outstanding problems. In 1784, Lagrange and his colleagues posed the problem of foundations of the calculus as the Berlin Academy’s prize problem. Nobody solved it to Lagrange’s satisfaction, but two of the entries in the competition were later expanded into full-length books, the first on the Continent, on foundations: Simon L’Huilier’s Exposition élémentaire des principes des calculs supérieurs, Berlin, 1787, and Lazare Carnot’s Réflexions sur la métaphysique du calcul infinitésimal, Paris, 1797. Thus Lagrange clearly helped revive interest in the problem.

Lagrange’s interest stemmed in part from his respect for the power and generality of algebra; he wanted to gain for the calculus the certainty he believed algebra to possess. But there was another factor increasing interest in foundations, not only for Lagrange, but for many other mathematicians by the end of the eighteenth century: the need to teach. Teaching forces one’s attention to basic questions. Yet before the mid-eighteenth century, mathematicians had often made their living by being attached to royal courts. But royal courts declined; the number of mathematicians increased; and mathematics began to look useful. First in military schools and later on at the Ecole Polytechnique in Paris, another line of work became available: teaching mathematics to students of science and engineering. The Ecole Polytechnique was founded by the French revolutionary government to train scientists, who, the government believed, might prove useful to a modern state. And it was as a lecturer in analysis at the Ecole Polytechnique that Lagrange wrote his two major works on the calculus which treated foundations; similarly, it was 40 years earlier, teaching the calculus at the Military Academy at Turin, that Lagrange had first set out to work on the problem of foundations. Because teaching forces one to ask basic questions about the nature of the most important concepts, the change in the economic circumstances of mathematicians (the need to teach) provided a catalyst for the crystallization of the foundations of the calculus out of the historical and mathematical background. In fact, even well into the nineteenth century, much of foundations was born in the teaching situation; Weierstrass’s foundations come from his lectures at Berlin; Dedekind first thought of the problem of continuity while teaching at Zurich; Dini and Landau turned to foundations while teaching analysis; and, most important for our present purposes, so did Cauchy. Cauchy’s foundations of analysis appear in the books based on his lectures at the Ecole Polytechnique; his book of 1821 was the first example of the great French tradition of Cours d’analyse.

The Concepts of the Calculus. Arising from algebra, the algebra of inequalities was now there for the calculus to be reduced to; the desire to make the calculus rigorous had arisen through consolidation, through philosophy, through teaching, through Lagrange. Now let us turn to the mathematical substance of eighteenth-century analysis, to see what was known about the concepts of the calculus before Cauchy, and what he had to work out for himself, in order to define, and prove theorems about, limit, convergence, continuity, derivatives, and integrals.

First, consider the concept of limit. As we have already pointed out, since Newton the limit had been thought of as a bound which could be approached closer and closer, though not surpassed. By 1800, with the work of L’Huilier and Lacroix on alternating series, the restriction that the limit be one-sided had been abandoned. Cauchy systematically translated this refined limit-concept into the algebra of inequalities, and used it in proofs once it had been so translated; thus he gave reality to the oft-repeated eighteenth-century statement that the calculus could be based on limits.

For example, consider the concept of convergence. Maclaurin had said already that the sum of a series was the limit of the partial sums. For Cauchy, this meant something precise. It meant that, given an $\varepsilon,$ one could find $n$ such that, for more than $n$ terms, the sum of the infinite series is within $\varepsilon$ of the $n$th partial sum. That is the reverse of the error-estimating procedure that d’Alembert had used. From his definition of a series having a sum, Cauchy could prove that a geometric progression with ratio less in absolute value than 1 converged to its usual sum. As we have said, d’Alembert had shown that the binomial series for, say, $(1 + x)^{p/q}$ could be bounded above and below by convergent geometric progressions. Cauchy assumed that if a series of positive terms is bounded above, term-by-term, by a convergent geometric progression, then it converges; he then used such comparisons to prove a number of tests for convergence: the root test, the ratio test, the logarithm test. The treatment is quite elegant [15]. Taking a technique used a few times by men like d’Alembert and Lagrange on an ad hoc basis in approximations, and using the definition of the sum of a series based on the limit concept, Cauchy created the first rigorous theory of convergence.
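The reversal described here (from "given $n$, find the error" to "given $\varepsilon$, find $n$") can be sketched for the simplest case, the geometric series $1 + r + r^2 + \dots$ with $|r| < 1$. This is an illustrative sketch under stated assumptions, not Cauchy's own computation: the function names are hypothetical, and the bound $|r|^n / (1 - |r|)$ on the remainder after $n$ terms is the standard one.

```python
def terms_needed(r, eps):
    """Smallest n with |r|**n / (1 - |r|) < eps, assuming |r| < 1 and eps > 0.

    |r|**n / (1 - |r|) bounds the distance between the sum 1 / (1 - r)
    and the partial sum of the first n terms.
    """
    if not (abs(r) < 1 and eps > 0):
        raise ValueError("need |r| < 1 and eps > 0")
    n = 0
    bound = 1.0 / (1.0 - abs(r))  # remainder bound when no terms are taken
    while bound >= eps:
        bound *= abs(r)           # one more term shrinks the bound by |r|
        n += 1
    return n

def partial_sum(r, n):
    """Sum of the first n terms: 1 + r + ... + r**(n - 1)."""
    return sum(r ** k for k in range(n))

for r, eps in ((0.5, 1e-3), (-0.9, 1e-6)):
    n = terms_needed(r, eps)
    print(r, eps, n, abs(1.0 / (1.0 - r) - partial_sum(r, n)))
```

D'Alembert's procedure corresponds to reading the bound for a fixed $n$; Cauchy's definition asks that the loop terminate for every $\varepsilon$, which is what convergence means.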

Let us now turn to the concept of continuity. Cauchy gave essentially the modern definition of continuous function, saying that the function $f(x)$ is continuous on a given interval if for each $x$ in that interval “the numerical [i.e., absolute] value of the difference $f(x + \alpha) - f(x)$ decreases indefinitely with $\alpha$” [16]. He used this definition in proving the intermediate value theorem for continuous functions [17]. The proof proceeds by examining a function $f(x)$ on an interval, say $[b, c],$ where $f(b)$ is negative, $f(c)$ is positive, and dividing the interval $[b, c]$ into $m$ parts of width $h = (c - b)/m.$ Cauchy considered the sign of the function at the points $f(b), f(b + h), \dots, f(b + (m - 1)h), f(c);$ unless one of the values of $f$ is zero, there are two values of $x$ differing by $h$ such that $f$ is negative at one, positive at the other. Repeating this process for new intervals of width $(c - b)/m, (c - b)/m^2, \dots$ gives an increasing sequence of values of $x$: $b, b_1, b_2, \dots$ for which $f$ is negative, and a decreasing sequence of values of $x$: $c, c_1, c_2, \dots$ for which $f$ is positive, and such that the difference between $b_k$ and $c_k$ goes to zero. Cauchy asserted that these two sequences must have a common limit $a$. He then argued that since $f(x)$ is continuous, the sequence of the negative values $f(b_k)$ and of positive values $f(c_k)$ both converge toward the common limit $f(a),$ which must therefore be zero.
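Cauchy's subdivision argument is, read as a procedure, a root-approximation method, which is where the article says it came from. The following is a minimal sketch of that procedure, assuming a continuous `f` with `f(b) < 0 < f(c)`; the function name, the default `m = 10`, and the stopping tolerance are illustrative choices, not part of Cauchy's proof (which continues indefinitely and appeals to the existence of the common limit).

```python
def cauchy_subdivision(f, b, c, m=10, tol=1e-10):
    """Shrink [b, c], with f(b) < 0 < f(c), by repeated division into m parts.

    Returns (b_k, c_k) with f(b_k) < 0 < f(c_k) and c_k - b_k <= tol,
    or (x, x) if a point x with f(x) == 0 is hit exactly.
    """
    if not (f(b) < 0 < f(c)):
        raise ValueError("need f(b) < 0 < f(c)")
    while c - b > tol:
        h = (c - b) / m
        for i in range(m):
            left = b + i * h
            # Use c itself as the last division point to avoid rounding drift.
            right = c if i == m - 1 else b + (i + 1) * h
            f_right = f(right)
            if f_right == 0:
                return right, right
            if f_right > 0:
                # First sign change: f(left) < 0 < f(right).
                b, c = left, right
                break
    return b, c

lo, hi = cauchy_subdivision(lambda x: x * x - 2, 1.0, 2.0)
print(lo, hi)  # both endpoints approximate the square root of 2
```

The code can only report that the two sequences have come within `tol` of each other. That they have a common limit at all is exactly the completeness assumption that Cauchy left implicit.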

Cauchy’s proof involves an already existing technique, which Lagrange had applied in approximating real roots of polynomial equations. If a polynomial was negative for one value of the variable, positive for another, there was a root in between, and the difference between those two values of the variable bounded the error made in taking either as an approximation to the root [18]. Thus again we have the algebra of inequalities providing a technique which Cauchy transformed from a tool of approximation to a tool of rigor.

It is worth remarking at this point that Cauchy, in his treatment both of convergence and of continuity, implicitly assumed various forms of the completeness property for the real numbers. For instance, he treated as obvious that a series of positive terms, bounded above by a convergent geometric progression, converges: also, his proof of the intermediate-value theorem assumes that a bounded monotone sequence has a limit. While Cauchy was the first systematically to exploit inequality proof techniques to prove theorems in analysis, he did not identify all the implicit assumptions about the real numbers that such inequality techniques involve. Similarly, as the reader may have already noticed, Cauchy’s definition of continuous function does not distinguish between what we now call point-wise and uniform continuity; also, in treating series of functions, Cauchy did not distinguish between pointwise and uniform convergence. The verbal formulations like “for all” that are involved in choosing deltas did not distinguish between “for any epsilon and for all x” and “for any x, given any epsilon” [19]. Nor was it at all clear in the 1820s how much depended on this distinction, since proofs about continuity and convergence were in themselves so novel. We shall see the same confusion between uniform and point-wise convergence as we turn now to Cauchy’s theory of the derivative.

Again we begin with an approximation. Lagrange gave the following inequality about the derivative:

(2) $f(x + h) = f(x) + hf'(x) + hV,$

where $V$ goes to 0 with $h$. He interpreted this to mean that, given any $D$, one can find $h$ sufficiently small so that $V$ is between $-D$ and $+D$ [20]. Clearly this is equivalent to (1) above, Cauchy’s delta-epsilon characterization of the derivative. But how did Lagrange obtain this result? The answer is surprising; for Lagrange, formula (2) was a consequence of Taylor’s theorem. Lagrange believed that any function (that is, any analytic expression, whether finite or infinite, involving the variable) had a unique power-series expansion (except possibly at a finite number of isolated points). This is because he believed that there was an “algebra of infinite series,” an algebra exemplified by work of Euler such as the example we gave above. And Lagrange said that the way to make the calculus rigorous was to reduce it to algebra. Although there is no “algebra” of infinite series that gives power-series expansions without any consideration of convergence and limits, this assumption led Lagrange to define $f'(x)$ without reference to limits, as the coefficient of the linear term in $h$ in the Taylor series expansion for $f(x + h).$ Following Euler, Lagrange then said that, for any power series in $h$, one could take $h$ sufficiently small so that any given term of the series exceeded the sum of all the rest of the terms following it; this approximation, said Lagrange, is assumed in applications of the calculus to geometry and mechanics [21]. Applying this approximation to the linear term in the Taylor series produces (2), which I call the Lagrange property of the derivative. (Like Cauchy’s (1), the inequality-translation Lagrange gives for (2) assumes that, given any $D$, one finds $h$ sufficiently small so $|V| \le D,$ with no mention whatever of $x$.)

Not only did Lagrange state property (2) and the associated inequalities, he used them as a basis for a number of proofs about derivatives: for instance, to prove that a function with positive derivative on an interval is increasing there, to prove the mean-value theorem for derivatives, and to obtain the Lagrange remainder for the Taylor series. (Details may be found in the works cited in [22].) Lagrange also applied his results to characterize the properties of maxima and minima, and orders of contact between curves.

With a few modifications, Lagrange’s proofs are valid, provided that property (2) can be justified. Cauchy borrowed and simplified what are in effect Lagrange’s inequality proofs about derivatives, with a few improvements, basing them on his own (1). But Cauchy made these proofs legitimate because Cauchy defined the derivative precisely to satisfy the relevant inequalities. Once again, the key properties come from an approximation. For Lagrange, the derivative was exactly (no epsilons needed) the coefficient of the linear term in the Taylor series; formula (2), and the corresponding inequality that $f(x + h) - f(x)$ lies between $h(f'(x) \pm D),$ were approximations. Cauchy brought Lagrange’s inequality properties and proofs together with a definition of derivative devised to make those techniques rigorously founded [22].

The last of the concepts we shall consider, the integral, followed an analogous development. In the eighteenth century, the integral was usually thought of as the inverse of the differential. But sometimes the inverse could not be computed exactly, so men like Euler remarked that the integral could be approximated as closely as one liked by a sum. Of course, the geometric picture of an area being approximated by rectangles, or the Leibnizian definition of the integral as a sum, suggests this immediately. But what is important for our purposes is that much work was done on approximating the values of definite integrals in the eighteenth century, including considerations of how small the subintervals used in the sums should be when the function oscillates to a greater or lesser extent. For instance, Euler treated sums of the form

$$\sum_{k=0}^{n} f(x_k)(x_{k+1} - x_k)$$

as approximations to the integral $\int_{x_0}^{x_n} f(x)\,dx$ [23].

In 1820,S. D. Poisson,who was interested in complex integration and therefore moreconcerned than most people about the existence and behavior of integrals,asked thefollowing question. If the integral F is defined as the antiderivative of f, and if

can it be proved that is the limit of the sum

S 5 hf sad 1 hf sa 1 hd 1 . . . 1 hf sa 1 sn 2 1dhd

Fsbd 2 Fsad 5 eba f sxd dxb 2 a 5 nh,

exnxo

f sxd dx

on

k50f sxkdsxk11 2 xkd

hsf9sxd ± Dd,f sx 1 hd 2 f sxd

|V| ≤ D

f sx 1 hd.

9

Page 45: Calculus in Crisis - Saint Joseph's University

as h gets small? (S is an approximating sum of the eighteenth-century sort.) Poisson called this result “the fundamental proposition of the theory of definite integrals.” He proved it by using another inequality-result: the Taylor series with remainder. First, he wrote $F(b) - F(a)$ as the telescoping sum

$$F(a+h) - F(a) + F(a+2h) - F(a+h) + \cdots + F(b) - F(a+(n-1)h). \tag{3}$$

Then, for each of the terms of the form $F(a+kh) - F(a+(k-1)h)$, Taylor’s series with remainder gives, since by definition $F' = f$,

$$F(a+kh) - F(a+(k-1)h) = h f(a+(k-1)h) + R_k h^{1+w},$$

where $w > 0$, for some $R_k$. Thus the telescoping sum (3) becomes

$$h f(a) + h f(a+h) + \cdots + h f(a+(n-1)h) + (R_1 + \cdots + R_n) h^{1+w}.$$

So $F(b) - F(a)$ and the sum S differ by $(R_1 + \cdots + R_n) h^{1+w}$. Letting R be the maximum value for the $R_k$,

$$(R_1 + \cdots + R_n) h^{1+w} \le n \cdot R\,(h^{1+w}) = R \cdot nh \cdot h^{w} = R(b-a)h^{w}.$$

Therefore, if h is taken sufficiently small, $F(b) - F(a)$ differs from S by less than any given quantity [24].

Poisson’s was the first attempt to prove the equivalence of the antiderivative and limit-of-sums conceptions of the integral. However, besides the implicit assumptions of the existence of antiderivatives and bounded first derivatives for f on the given interval, the proof assumes that the subintervals on which the sum is taken are all equal. Should the result not hold for unequal divisions also? Poisson thought so, and justified it by saying, “If the integral is represented by the area of a curve, this area will be the same, if we divide the difference . . . into an infinite number of equal parts, or an infinite number of unequal parts following any law” [25]. This, however, is an assertion, not a proof. And Cauchy saw that a proof was needed.

Cauchy did not like formalistic arguments in supposedly rigorous subjects, saying that most algebraic formulas hold “only under certain conditions, and for certain values of the quantities they contain” [26]. In particular, one could not assume that what worked for finite expressions automatically worked for infinite ones. Thus, Cauchy showed that the sum of the series $1/1 + 1/4 + 1/9 + \cdots$ was $\pi^2/6$ by actually calculating the difference between the nth partial sum and $\pi^2/6$ and showing that it was arbitrarily small [27]. Similarly, just because there was an operation called taking a derivative did not mean that the inverse of that operation always produced a result. The existence of the definite integral had to be proved. And how does one prove existence in the 1820s?

One constructs the mathematical object in question by using an eighteenth-century approximation that converges to it. Cauchy defined the integral as the limit of Euler-style sums

$$\sum f(x_k)(x_{k+1} - x_k)$$

for $x_{k+1} - x_k$ sufficiently small. Assuming explicitly that $f(x)$ was continuous on the given interval (and implicitly that it was uniformly continuous), Cauchy was able to show that all sums of that form approach a fixed value, called by definition the integral of the function on that interval. This is an extremely hard proof [28]. Finally, borrowing from Lagrange the mean-value theorem for integrals, Cauchy proved the Fundamental Theorem of Calculus [29].
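As a numerical illustration (not part of Poisson's or Cauchy's own arguments), the following sketch checks the two ideas discussed above: that Poisson's equal-step sum S differs from F(b) − F(a) by at most R(b − a)h^w, and that Euler-style sums over unequal subintervals also approach the same value as the largest subinterval shrinks. Everything specific here is an assumption chosen for the example: the function f = cos with antiderivative F = sin on [0, 1], the exponent w = 1, the constant R = 1/2 (from the Lagrange remainder, since |f'| = |sin| ≤ 1), and the helper names `equal_step_sum`, `partition_sum`, and `random_partition`.

```python
import math
import random


def equal_step_sum(f, a, b, n):
    """Poisson's sum S = h f(a) + h f(a+h) + ... + h f(a+(n-1)h), with h = (b-a)/n."""
    h = (b - a) / n
    return sum(h * f(a + k * h) for k in range(n))


def partition_sum(f, points):
    """Euler/Cauchy-style sum of f(x_k) * (x_{k+1} - x_k) over a sorted partition."""
    return sum(f(x0) * (x1 - x0) for x0, x1 in zip(points, points[1:]))


def random_partition(a, b, n, seed=0):
    """A partition of [a, b] into n subintervals of unequal (random) lengths."""
    rng = random.Random(seed)
    interior = sorted(rng.uniform(a, b) for _ in range(n - 1))
    return [a] + interior + [b]


def demo():
    a, b = 0.0, 1.0
    exact = math.sin(b) - math.sin(a)  # F(b) - F(a) with F = sin, F' = cos

    # Equal subintervals: compare the error with Poisson's bound R(b-a)h^w,
    # taking w = 1 and R = 1/2 (illustrative constants for f = cos).
    for n in (10, 100, 1000):
        h = (b - a) / n
        err = abs(exact - equal_step_sum(math.cos, a, b, n))
        print(f"equal steps   n={n:5d}  error={err:.3e}  bound={0.5 * (b - a) * h:.3e}")

    # Unequal subintervals: the error is controlled by the largest subinterval.
    for n in (10, 100, 1000):
        pts = random_partition(a, b, n, seed=n)
        mesh = max(x1 - x0 for x0, x1 in zip(pts, pts[1:]))
        err = abs(exact - partition_sum(math.cos, pts))
        print(f"unequal steps n={n:5d}  error={err:.3e}  mesh={mesh:.3e}")


if __name__ == "__main__":
    demo()
```

A numerical check of this kind only makes the estimates plausible for one well-behaved function; it is no substitute for Cauchy's proof that the limit exists for every continuous function.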

Conclusion. Here are all the pieces of the puzzle we originally set out to solve. Algebraic approximations produced the algebra of inequalities; eighteenth-century approximations in the calculus produced the useful properties of the concepts of analysis: d’Alembert’s error-bounds for series, Lagrange’s inequalities about derivatives, Euler’s approximations to integrals. There was a new interest in foundations. All that was needed was a sufficiently great genius to build the new foundation.

Two men came close. In 1816, Carl Friedrich Gauss gave a rigorous treatment of the convergence of the hypergeometric series, using the technique of comparing a series with convergent geometric progressions; however, Gauss did not give a general foundation for all of analysis. Bernhard Bolzano, whose work was little known until the 1860s, echoing Lagrange’s call to reduce the calculus to algebra, gave in 1817 a definition of continuous function like Cauchy’s and then proved, by a different technique from Cauchy’s, the intermediate-value theorem [30]. But it was Cauchy who gave rigorous definitions and proofs for all the basic concepts; it was he who realized the far-reaching power of the inequality-based limit concept; and it was he who gave us, except for a few implicit assumptions about uniformity and about completeness, the modern rigorous approach to calculus.

Mathematicians are used to taking the rigorous foundations of the calculus as a completed whole. What I have tried to do as a historian is to reveal what went into making up that great achievement. This needs to be done, because completed wholes by their nature do not reveal the separate strands that go into weaving them, especially when the strands have been considerably transformed. In Cauchy’s work, though, one trace indeed was left of the origin of rigorous calculus in approximations: the letter epsilon. The ε corresponds to the initial letter in the word “erreur” (or “error”), and Cauchy in fact used ε for “error” in some of his work on probability [31]. It is both amusing and historically appropriate that the “ε,” once used to designate the “error” in approximations, has become transformed into the characteristic symbol of precision and rigor in the calculus. As Cauchy transformed the algebra of inequalities from a tool of approximation to a tool of rigor, so he transformed the calculus from a powerful method of generating results to the rigorous subject we know today.

References

[1] A.-L. Cauchy, Cours d’analyse, Paris, 1821; in Oeuvres complètes d’Augustin Cauchy, series 2, vol. 3, Paris, Gauthier-Villars, 1899, p. 19.

[2] A.-L. Cauchy, Résumé des leçons données à l’école royale polytechnique sur le calcul infinitésimal, Paris, 1823; in Oeuvres, series 2, vol. 4, p. 44. Cauchy used i for the increment; otherwise the notation is his.

[3] Isaac Newton, Mathematical Principles of Natural Philosophy, 3rd ed., 1726, tr. A. Motte, revised by Florian Cajori, University of California Press, Berkeley, 1934, Scholium to Lemma XI, p. 39.

[4] Johann Bernoulli, Opera Omnia, IV, 8; section entitled “De seriebus varia, Corollarium III,” cited by D. J. Struik, A Source Book in Mathematics, 1200–1800, Harvard, Cambridge, 1969, p. 321.

[5] Boyer, History of Mathematics, p. 487; Euler’s paper is in Comm. Acad. Sci. Petrop., 7, 1734–5, pp. 123–34; in Leonhard Euler, Opera omnia, series 1, vol. 14, pp. 73–86.

[6] J. V. Grabiner, The Origins of Cauchy’s Rigorous Calculus, M. I. T. Press, Cambridge and London, 1981, chapter 2.

[7] J. d’Alembert, Réflexions sur les suites et sur les racines imaginaires, in Opuscules mathématiques, vol. 5, Briasson, Paris, 1768, pp. 171–215; see especially pp. 175–178.

[8] J.-L. Lagrange, Traité de la résolution des équations numériques de tous les degrés, 2nd ed., Courcier, Paris, 1808; in Oeuvres de Lagrange, Gauthier-Villars, Paris, 1867–1892, vol. 8, pp. 162–163.

[9] Lagrange, Théorie des fonctions analytiques, 2nd ed., Paris, 1813, in Oeuvres, vol. 9, pp. 80–85; compare Lagrange, Leçons sur le calcul des fonctions, Paris, 1806, in Oeuvres, vol. 10, pp. 91–95.

[10] Grabiner, Origins of Cauchy’s Rigorous Calculus, pp. 56–68; compare H. Goldstine, A History of Numerical Analysis from the 16th through the 19th Century, Springer-Verlag, New York, Heidelberg, Berlin, 1977, chapters 2–4.

[11] George Berkeley, The Analyst, section 35.

[12] Analyst, section 15. Berkeley used the function $x^n$ where we have used $x^2$, and a Newtonian notation, lower-case o, for the increment.

[13] Letter from Lagrange to d’Alembert, 24 February 1772, in Oeuvres de Lagrange, vol. 13, p. 229.

[14] D. Diderot, De l’interprétation de la nature, in Oeuvres philosophiques, ed. P. Vernière, Garnier, Paris, 1961, pp. 180–181.

[15] Cauchy, Cours d’analyse, Oeuvres, series 2, vol. 3; for real-valued series, see especially pp. 114–138.

[16] Cauchy, op. cit., p. 43. So did Bolzano; see below, and note 30.

[17] Cauchy, op. cit., pp. 378–380. For an English translation of this proof, see Grabiner, Origins, pp. 167–168. For clarity, I have substituted $b, b_1, b_2, \ldots$ and $c, c_1, c_2, \ldots$ for Cauchy’s $x_0, x_1, x_2, \ldots$ and $X, X', X'', \ldots$ in the present version.

[18] Lagrange, Equations numériques, sections 2 and 6, in Oeuvres, vol. 8; also in Lagrange, Leçons élémentaires sur les mathématiques données à l’école normale en 1795, Séances des Ecoles Normales, Paris, 1794–1795; in Oeuvres, vol. 7, pp. 181–288; this method is on pp. 260–261.

[19] I. Grattan-Guinness, Development of the Foundations of Mathematical Analysis from Euler to Riemann, M. I. T. Press, Cambridge and London, 1970, p. 123, puts it well: “Uniform convergence was tucked away in the word ‘always,’ with no reference to the variable at all.”

[20] Lagrange, Leçons sur le calcul des fonctions, Oeuvres 10, p. 87; compare Lagrange, Théorie des fonctions analytiques, Oeuvres 9, p. 77. I have substituted h for the i Lagrange used for the increment.

[21] Lagrange, Théorie des fonctions analytiques, Oeuvres 9, p. 29. Compare Leçons sur le calcul des fonctions, Oeuvres 10, p. 101. For Euler, see his Institutiones calculi differentialis, St. Petersburg, 1755; in Opera, series 1, vol. 10, section 122.

[22] Grabiner, Origins of Cauchy’s Rigorous Calculus, chapter 5; also J. V. Grabiner, The origins of Cauchy’s theory of the derivative, Hist. Math., 5, 1978, pp. 379–409.

[23] The notation is modernized. For Euler, see Institutiones calculi integralis, St. Petersburg, 1768–1770, 3 vols; in Opera, series 1, vol. 11, p. 184. Eighteenth-century summations approximating integrals are treated in A. P. Iushkevich, O vozniknoveniya poiyatiya ob opredelennom integrale Koshi, Trudy Instituta Istorii Estestvoznaniya, Akademia Nauk SSSR, vol. 1, 1947, pp. 373–411.

[24] S. D. Poisson, Suite du mémoire sur les intégrales définies, Journ. de l’Ecole polytechnique, Cah. 18, 11, 1820, pp. 295–341, 319–323. I have substituted h, w for Poisson’s α, k, and have used $R_1$ for his $R_0$.

[25] Poisson, op. cit., pp. 329–330.

[26] Cauchy, Cours d’analyse, Introduction, Oeuvres, series 2, vol. 3, p. iii.

[27] Cauchy, Cours d’analyse, Note VIII, Oeuvres, series 2, vol. 3, pp. 456–457.

[28] Cauchy, Calcul infinitésimal, Oeuvres, series 2, vol. 4, 122–25; in Grabiner, Origins of Cauchy’s Rigorous Calculus, pp. 171–175 in English translation.

[29] Cauchy, op. cit., pp. 151–152.

[30] B. Bolzano, Rein analytischer Beweis des Lehrsatzes dass zwischen je zwey Werthen, die ein entgegengesetztes Resultat gewaehren, wenigstens eine reele Wurzel der Gleichung liege, Prague, 1817. English version, S. B. Russ, A translation of Bolzano’s paper on the intermediate value theorem, Hist. Math., 7, 1980, pp. 156–185. The contention by Grattan-Guinness, Foundations, p. 54, that Cauchy took his program of rigorizing analysis, definition of continuity, Cauchy criterion, and proof of the intermediate-value theorem, from Bolzano’s paper without acknowledgement is not, in my opinion, valid; the similarities are better explained by common prior influences, especially that of Lagrange. For a documented argument to this effect, see J. V. Grabiner, Cauchy and Bolzano: Tradition and transformation in the history of mathematics, to appear in E. Mendelsohn, Transformation and Tradition in the Sciences, Cambridge University Press, Cambridge, 1984, pp. 105–124; see also Grabiner, Origins of Cauchy’s Rigorous Calculus, pp. 69–75, 102–105, 94–96, 52–53.

[31] Cauchy, Sur la plus grande erreur à craindre dans un résultat moyen, et sur le système de facteurs qui rend cette plus grande erreur un minimum, Comptes rendus 37, 1853; in Oeuvres, series 1, vol. 12, pp. 114–124.