30. This method had met with opposition from the first. Christiaan Huygens, whose opinion carried more weight than that of any other scientific man of the day, declared that the employment of differentials was unnecessary, and that Leibnitz's second differential was meaningless (1691). A Dutch physician named Bernhard Nieuwentijt attacked the method on account of the use of quantities which are at one stage of the process treated as somethings and at a later stage as nothings, and he was especially severe in commenting upon the second and higher differentials (1694, 1695). Other attacks were made by Michel Rolle (1701), but they were directed rather against matters of detail than against the general principles. The fact is that, although Leibnitz in his answers to Nieuwentijt (1695), and to Rolle (1702), indicated that the processes of the calculus could be justified by the methods of the ancient geometry, he never expressed himself very clearly on the subject of differentials, and he conveyed, probably without intending it, the impression that the calculus leads to correct results by compensation of errors. In England the method of fluxions had to face similar attacks. George Berkeley, bishop and philosopher, wrote in 1734 a tract entitled _The Analyst; or a Discourse addressed to an Infidel Mathematician_, in which he proposed to destroy the presumption that the opinions of mathematicians in matters of faith are likely to be more trustworthy than those of divines, by contending that in the much vaunted fluxional calculus there are mysteries which are accepted unquestioningly by the mathematicians, but are incapable of logical demonstration. Berkeley's criticism was levelled against all infinitesimals, that is to say, all quantities vaguely conceived as in some intermediate state between nullity and finiteness, as he took Newton's moments to be conceived. 
The tract occasioned a controversy which had the important consequence of making it plain that all arguments about infinitesimals must be given up, and the calculus must be founded on the method of limits. During the controversy Benjamin Robins gave an exceedingly clear explanation of Newton's theories of fluxions and of prime and ultimate ratios regarded as theories of limits. In this explanation he pointed out that Newton's _moment_ (Leibnitz's "differential") is to be regarded as so much of the actual difference between two neighbouring values of a variable as is needful for the formation of the fluxion (or differential coefficient) (see G. A. Gibson, "The Analyst Controversy," _Proc. Math. Soc._, Edinburgh, xvii., 1899). Colin Maclaurin published in 1742 a _Treatise of Fluxions_, in which he reduced the whole theory to a theory of limits, and demonstrated it by the method of Archimedes. This notion was gradually transferred to the continental mathematicians. Leonhard Euler in his _Institutiones Calculi differentialis_ (1755) was reduced to the position of one who asserts that all differentials are zero, but, as the product of zero and any finite quantity is zero, the ratio of two zeros can be a finite quantity which it is the business of the calculus to determine. Jean le Rond d'Alembert in the _Encyclopédie méthodique_ (1755, 2nd ed. 1784) declared that differentials were unnecessary, and that Leibnitz's calculus was a calculus of mutually compensating errors, while Newton's method was entirely rigorous. D'Alembert's opinion of Leibnitz's calculus was expressed also by Lazare N. M. Carnot in his _Réflexions sur la métaphysique du calcul infinitésimal_ (1799) and by Joseph Louis de la Grange (generally called Lagrange) in writings from 1760 onwards. Lagrange proposed in his _Théorie des fonctions analytiques_ (1797) to found the whole of the calculus on the theory of series. 
It was not until 1823 that a treatise on the differential calculus founded upon the method of limits was published. The treatise was the _Résumé des leçons ... sur le calcul infinitésimal_ of Augustin Louis Cauchy. Since that time it has been understood that the use of the phrase "infinitely small" in any mathematical argument is a figurative mode of expression pointing to a limiting process. In the opinion of many eminent mathematicians such modes of expression are confusing to students, but in treatises on the calculus the traditional modes of expression are still largely adopted.

Arithmetical basis of modern analysis.

31. Defective modes of expression did not hinder constructive work. It was the great merit of Leibnitz's symbolism that a mathematician who used it knew what was to be done in order to formulate any problem analytically, even though he might not be absolutely clear as to the proper interpretation of the symbols, or able to render a satisfactory account of them. While new and varied results were promptly obtained by using them, a long time elapsed before the theory of them was placed on a sound basis. Even after Cauchy had formulated his theory much remained to be done, both in the rapidly growing department of complex variables, and in the regions opened up by the theory of expansions in trigonometric series. In both directions it was seen that rigorous demonstration demanded greater precision in regard to fundamental notions, and the requirement of precision led to a gradual shifting of the basis of analysis from geometrical intuition to arithmetical law. A sketch of the outcome of this movement--the "arithmetization of analysis," as it has been called--will be found in FUNCTION. Its general tendency has been to show that many theories and processes, at first accepted as of general validity, are liable to exceptions, and much of the work of the analysts of the latter half of the 19th century was directed to discovering the most general conditions in which particular processes, frequently but not universally applicable, can be used without scruple.

III. _Outlines of the Infinitesimal Calculus._

32. The general notions of functionality, limits and continuity are explained in the article FUNCTION. Illustrations of the more immediate ways in which these notions present themselves in the development of the differential and integral calculus will be useful in what follows.

[Illustration: FIG. 8.]

Geometrical limits.

Tangents.

33. Let y be given as a function of x, or, more generally, let x and y be given as functions of a variable t. The first of these cases is included in the second by putting x = t. If certain conditions are satisfied the aggregate of the points determined by the functional relations form a curve. The first condition is that the aggregate of the values of t to which values of x and y correspond must be continuous, or, in other words, that these values must consist of all real numbers, or of all those real numbers which lie between assigned extreme numbers. When this condition is satisfied the points are "ordered," and their order is determined by the order of the numbers t, supposed to be arranged in order of increasing or decreasing magnitude; also there are two senses of description of the curve, according as t is taken to increase or to diminish. The second condition is that the aggregate of the points which are determined by the functional relations must be "continuous." This condition means that, if any point P determined by a value of t is taken, and any distance [delta], however small, is chosen, it is possible to find two points Q, Q´ of the aggregate which are such that (i.) P is between Q and Q´, (ii.) if R, R´ are any points between Q and Q´ the distance RR´ is less than [delta]. The meaning of the word "between" in this statement is fixed by the ordering of the points. Sometimes additional conditions are imposed upon the functional relations before they are regarded as defining a curve. An aggregate of points which satisfies the two conditions stated above is sometimes called a "Jordan curve." It by no means follows that every curve of this kind has a tangent. 
In order that the curve may have a tangent at P it is necessary that, if any angle [alpha], however small, is specified, a distance [delta] can be found such that when P is between Q and Q´, and PQ and PQ´ are less than [delta], the angle RPR´ is less than [alpha] for all pairs of points R, R´ which are between P and Q, or between P and Q´ (fig. 8). When this condition is satisfied y is a function of x which has a differential coefficient. The only way of finding out whether this condition is satisfied or not is to attempt to form the differential coefficient. If the quotient of differences [Delta]y/[Delta]x has a limit when [Delta]x tends to zero, y is a differentiable function of x, and the limit in question is the differential coefficient. The derived function, or differential coefficient, of a function [f](x) is always defined by the formula

\[ f'(x) = \frac{d f(x)}{dx} = \lim_{h=0} \frac{f(x + h) - f(x)}{h}. \]

Rules for the formation of differential coefficients in particular cases have been given in § 11 above. The definition of a differential coefficient, and the rules of differentiation are quite independent of any geometrical interpretation, such as that concerning tangents to a curve, and the tangent to a curve is properly defined by means of the differential coefficient of a function, not the differential coefficient by means of the tangent.
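The limiting process in the definition can be watched numerically. The following is a minimal sketch, not part of the original article: the function names `difference_quotient` and `cube` are illustrative assumptions, and the example function x³ is chosen only because its differential coefficient at x = 2 is known to be 12.

```python
def difference_quotient(f, x, h):
    """Return [f(x + h) - f(x)] / h, the quotient whose limit defines f'(x)."""
    return (f(x + h) - f(x)) / h


def cube(x):
    """Illustrative function (an assumption of this sketch): f(x) = x^3."""
    return x ** 3


# As h diminishes, the quotient at x = 2 approaches 3 * 2^2 = 12.
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, difference_quotient(cube, 2.0, h))
```

No geometrical notion enters the computation, which agrees with the remark that the definition is independent of tangents.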

Progressive and Regressive Differential Coefficients.

It may happen that the limit employed in defining the differential coefficient has one value when h approaches zero through positive values, and a different value when h approaches zero through negative values. The two limits are then called the "progressive" and "regressive" differential coefficients. In applications to dynamics, when x denotes a coordinate and t the time, dx/dt denotes a velocity. If the velocity is changed suddenly the progressive differential coefficient measures the velocity just after the change, and the regressive differential coefficient measures the velocity just before the change. Variable velocities are properly defined by means of differential coefficients.
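A sudden change of velocity may be imitated numerically. The sketch below is an illustration under stated assumptions: the motion `position` (velocity -1 before t = 0 and +2 after) and the helper names `progressive` and `regressive` are invented for the example and do not come from the article.

```python
def progressive(f, t, h=1e-6):
    """One-sided quotient with h positive: approximates the progressive coefficient."""
    return (f(t + h) - f(t)) / h


def regressive(f, t, h=1e-6):
    """One-sided quotient with the increment -h: approximates the regressive coefficient."""
    return (f(t - h) - f(t)) / (-h)


def position(t):
    """Hypothetical motion: x = -t before t = 0, x = 2t afterwards."""
    return 2.0 * t if t >= 0 else -t


print(progressive(position, 0.0))  # velocity just after the change
print(regressive(position, 0.0))   # velocity just before the change
```

The two quotients disagree at t = 0, so the ordinary differential coefficient does not exist there, although each one-sided limit does.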

Areas.

Lengths of Curves.

All geometrical limits may be specified in terms similar to those employed in specifying the tangent to a curve; in difficult cases they must be so specified. Geometrical intuition may fail to answer the question of the existence or non-existence of the appropriate limits. In the last resort the definitions of many quantities of geometrical import must be analytical, not geometrical. As illustrations of this statement we may take the definitions of the areas and lengths of curves. We may not assume that every curve has an area or a length. To find out whether a curve has an area or not, we must ascertain whether the limit expressed by ∫ y dx exists. When the limit exists the curve has an area. The definition of the integral is quite independent of any geometrical interpretation. The length of a curve again is defined by means of a limiting process. Let P, Q be two points of a curve, and R1, R2, ... R_(n-1) a set of intermediate points of the curve, supposed to be described in the sense in which Q comes after P. The points R are supposed to be reached successively in the order of the suffixes when the curve is described in this sense. We form a sum of lengths of chords

PR1 + R1R2 + ... + R_(n-1)Q.

If this sum has a limit when the number of the points R is increased indefinitely and the lengths of all the chords are diminished indefinitely, this limit is the length of the arc PQ. The limit is the same whatever law may be adopted for inserting the intermediate points R and diminishing the lengths of the chords. It appears from this statement that the differential element of the arc of a curve is the length of the chord joining two neighbouring points. In accordance with the fundamental artifice for forming differentials (§§ 9, 10), the differential element of arc ds may be expressed by the formula

\[ ds = \sqrt{(dx)^2 + (dy)^2}, \]

of which the right-hand member is really the measure of the distance between two neighbouring points on the tangent. The square root must be taken to be positive. We may describe this differential element as being so much of the actual arc between two neighbouring points as need be retained for the purpose of forming the integral expression for an arc. This is a description, not a definition, because the length of the short arc itself is only definable by means of the integral expression. Similar considerations to those used in defining the areas of plane figures and the lengths of plane curves are applicable to the formation of expressions for differential elements of volume or of the areas of curved surfaces.
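The definition of the length of an arc as the limit of a sum of chords lends itself to direct computation. The following sketch is illustrative only: the name `chord_sum`, the equal spacing of the parameter, and the choice of a quarter of the unit circle (x = cos t, y = sin t) are assumptions made for the example.

```python
import math


def chord_sum(x, y, t0, t1, n):
    """Sum of the n chords PR1 + R1R2 + ... + R_(n-1)Q for the curve (x(t), y(t)).

    The intermediate points are taken at equal steps of the parameter t,
    which is only one of the admissible laws for inserting them.
    """
    total = 0.0
    px, py = x(t0), y(t0)
    for k in range(1, n + 1):
        t = t0 + (t1 - t0) * k / n
        qx, qy = x(t), y(t)
        total += math.hypot(qx - px, qy - py)
        px, py = qx, qy
    return total


# Quarter of the unit circle; the sums increase towards pi/2.
for n in (1, 10, 100, 1000):
    print(n, chord_sum(math.cos, math.sin, 0.0, math.pi / 2, n))
```

For this curve the sums increase with n and remain below the limit, since each chord is shorter than the arc it subtends.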

Constants of Integration.

34. In regard to differential coefficients it is an important theorem that, if the derived function [f]´(x) vanishes at all points of an interval, the function [f](x) is constant in the interval. It follows that, if two functions have the same derived function they can only differ by a constant. Conversely, indefinite integrals are indeterminate to the extent of an additive constant.

Higher Differential Coefficients.

35. The differential coefficient dy/dx, or the derived function [f]´(x), is itself a function of x, and its differential coefficient is denoted by [f]´´(x) or d²y/dx². In the second of these notations d/dx is regarded as the symbol of an operation, that of differentiation with respect to x, and the index 2 means that the operation is repeated. In like manner we may express the results of n successive differentiations by [f]^(n)(x) or by d^n·y/dx^n. When the second differential coefficient exists, or the first is differentiable, we have the relation

\[ f''(x) = \lim_{h=0} \frac{f(x + h) - 2 f(x) + f(x - h)}{h^2}. \tag{i.} \]

The limit expressed by the right-hand member of this equation may exist in cases in which [f]´(x) does not exist or is not differentiable. The result that, when the limit here expressed can be shown to vanish at all points of an interval, then [f](x) must be a linear function of x in the interval, is important.
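Both statements about the second difference quotient can be illustrated numerically. The sketch below is an assumption-laden example, not taken from the article: sin x is used where the second differential coefficient exists, and the real cube root (the hypothetical helper `cbrt`) is used as a function whose first differential coefficient is not finite at x = 0.

```python
import math


def second_difference_quotient(f, x, h):
    """Return [f(x + h) - 2 f(x) + f(x - h)] / h^2."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def cbrt(x):
    """Real cube root, an odd function whose derivative is infinite at 0."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


# Where f'' exists the quotient approaches it: here f'' = -sin(1).
print(second_difference_quotient(math.sin, 1.0, 1e-3), -math.sin(1.0))

# For the cube root at 0 the numerator vanishes for every h by symmetry,
# so the limit exists although f'(0) does not.
for h in (1e-1, 1e-2, 1e-3):
    print(h, second_difference_quotient(cbrt, 0.0, h))
```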

The relation (i.) is a particular case of the more general relation

\[ f^{(n)}(x) = \lim_{h=0} h^{-n} \Bigl[ f(x + nh) - n f\{x + (n - 1)h\} + \frac{n(n - 1)}{2!} f\{x + (n - 2)h\} - \dots + (-1)^n f(x) \Bigr]. \tag{ii.} \]

As in the case of relation (i.) the limit expressed by the right-hand member may exist although some or all of the derived functions [f]´(x), [f]´´(x), ... [f]^(n-1)(x) do not exist.
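The bracketed expression in relation (ii.) is an alternating sum with binomial coefficients, and it may be evaluated directly. The following is a minimal sketch under stated assumptions: the name `nth_difference_quotient` and the test function x⁴ (whose third differential coefficient is 24x) are chosen for illustration.

```python
import math


def nth_difference_quotient(f, x, n, h):
    """Return h^-n [f(x + nh) - n f(x + (n-1)h) + ... + (-1)^n f(x)]."""
    total = 0.0
    for k in range(n + 1):
        # k-th term: (-1)^k * C(n, k) * f(x + (n - k) h)
        total += (-1) ** k * math.comb(n, k) * f(x + (n - k) * h)
    return total / h ** n


def fourth_power(x):
    """Illustrative function (an assumption of this sketch): f(x) = x^4."""
    return x ** 4


# The third differential coefficient of x^4 at x = 1 is 24.
for h in (1e-1, 1e-2, 1e-3):
    print(h, nth_difference_quotient(fourth_power, 1.0, 3, h))
```

In floating-point arithmetic h cannot be taken too small, because the alternating sum loses significance when it is divided by h^n.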

Corresponding to the rule iii. of § 11 we have the rule for forming the nth differential coefficient of a product in the form

\[ \frac{d^n(uv)}{dx^n} = u \frac{d^n v}{dx^n} + n \frac{du}{dx} \frac{d^{n-1} v}{dx^{n-1}} + \frac{n(n - 1)}{1 \cdot 2} \frac{d^2 u}{dx^2} \frac{d^{n-2} v}{dx^{n-2}} + \dots + \frac{d^n u}{dx^n} v, \]

where the coefficients are those of the expansion of (1 + x)^n in powers of x (n being a positive integer). The rule is due to Leibnitz (1695).
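The rule for the nth differential coefficient of a product may be checked exactly on polynomials. The sketch below is illustrative: polynomials are represented as lists of integer coefficients [a0, a1, a2, ...], and all the helper names (`derivative`, `multiply`, `add`, `trim`, `leibniz`) as well as the two sample polynomials are assumptions made for the example.

```python
from math import comb


def derivative(p, order=1):
    """Differentiate the polynomial with coefficients [a0, a1, ...] `order` times."""
    for _ in range(order):
        p = [k * p[k] for k in range(1, len(p))] or [0]
    return p


def multiply(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def add(p, q):
    """Sum of two polynomials, padding the shorter list with zeros."""
    size = max(len(p), len(q))
    p = p + [0] * (size - len(p))
    q = q + [0] * (size - len(q))
    return [a + b for a, b in zip(p, q)]


def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p


def leibniz(u, v, n):
    """n-th derivative of uv as the sum of C(n, k) u^(k) v^(n-k)."""
    total = [0]
    for k in range(n + 1):
        term = multiply(derivative(u, k), derivative(v, n - k))
        total = add(total, [comb(n, k) * c for c in term])
    return trim(total)


u = [1, 2, 0, 3]  # 1 + 2x + 3x^3
v = [0, -1, 4]    # -x + 4x^2
direct = trim(derivative(multiply(u, v), 3))
print(direct, leibniz(u, v, 3))
```

Because the arithmetic is carried out in integers, the agreement of the two results is exact and not merely approximate.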

_Differentials of higher orders_ may be introduced in the same way as the differential of the first order. In general when y = [f](x), the nth differential d^n·y is defined by the equation

\[ d^n y = f^{(n)}(x)\,(dx)^n, \]

in which dx is the (arbitrary) differential of x.

Symbols of operation.

When d/dx is regarded as a single symbol of operation the symbol ∫ ... dx represents the inverse operation. If the former is denoted by D, the latter may be denoted by D^-1. D^n means that the operation D is to be performed n times in succession; D^-n that the operation of forming the indefinite integral is to be performed n times in succession. Leibnitz's course of thought (§ 24) naturally led him to inquire after an interpretation of D^n where n is not an integer. For an account of the researches to which this inquiry gave rise, reference may be made to the article by A. Voss in _Ency. d. math. Wiss._ Bd. ii. A, 2 (Leipzig, 1889). The matter is referred to as "fractional" or "generalized" differentiation.

[Illustration: FIG. 9.]

Theorem of Intermediate Value.

36. After the formation of differential coefficients the most important theorem of the differential calculus is the _theorem of intermediate value_ ("theorem of mean value," "theorem of finite increments," "Rolle's theorem," are other names for it). This theorem may be explained as follows: Let A, B be two points of a curve y = [f](x) (fig. 9). Then there is a point P between A and B at which the tangent is parallel to the secant AB. This theorem is expressed analytically in the statement that if [f]´(x) is continuous between a and b, there is a value x1 of x between a and b which has the property expressed by the equation

\[ \frac{f(b) - f(a)}{b - a} = f'(x_1). \tag{i.} \]

The value x1 can be expressed in the form a + [theta](b - a) where [theta] is a number between 0 and 1.
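The value x1, and with it the number [theta], can be located numerically in a simple case. The following sketch rests on assumptions that the theorem itself does not require: the derived function is supplied explicitly, and [f]´(x) minus the slope of the secant is supposed to have opposite signs at a and b, so that bisection applies. The name `mean_value_point` and the example x³ on the interval from 0 to 2 are illustrative.

```python
def mean_value_point(f, f_prime, a, b, tol=1e-12):
    """Find x1 in (a, b) with f'(x1) equal to the slope of the secant AB.

    Assumes f'(a) - slope and f'(b) - slope have opposite signs.
    """
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    g_lo = f_prime(lo) - slope
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = f_prime(mid) - slope
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


a, b = 0.0, 2.0
x1 = mean_value_point(lambda x: x ** 3, lambda x: 3.0 * x * x, a, b)
theta = (x1 - a) / (b - a)
print(x1, theta)
```

Here the secant has slope 4, so x1 is the positive root of 3x² = 4, and [theta] lies between 0 and 1 as the theorem asserts.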

A slightly more general theorem was given by Cauchy (1823) to the effect that, if [f]´(x) and F´(x) are continuous between x = a and x = b, then there is a number [theta] between 0 and 1 which has the property expressed by the equation

\[ \frac{F(b) - F(a)}{f(b) - f(a)} = \frac{F'\{a + \theta(b - a)\}}{f'\{a + \theta(b - a)\}}. \]

The theorem expressed by the relation (i.) was first noted by Rolle (1690) for the case where [f](x) is a rational integral function which vanishes when x = a and also when x = b. The general theorem was given by Lagrange (1797). Its fundamental importance was first recognized by Cauchy (1823). It may be observed here that the theorem of integral calculus expressed by the equation

\[ F(b) - F(a) = \int_a^b F'(x)\,dx \]

follows at once from the definition of an integral and the theorem of intermediate value.

The theorem of intermediate value may be generalized in the statement that, if [f](x) and all its differential coefficients up to the nth inclusive are continuous in the interval between x = a and x = b, then there is a number [theta] between 0 and 1 which has the property expressed by the equation

\[ f(b) = f(a) + (b - a) f'(a) + \frac{(b - a)^2}{2!} f''(a) + \dots + \frac{(b - a)^{n-1}}{(n - 1)!} f^{(n-1)}(a) + \frac{(b - a)^n}{n!} f^{(n)}\{a + \theta(b - a)\}. \tag{i.} \]
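The generalized theorem can be checked on the exponential function, for which every differential coefficient is e^x and the number [theta] can therefore be recovered explicitly. This is a sketch under stated assumptions: the choice of e^x, the values a = 0, b = 1, n = 6, and the function names are all illustrative.

```python
import math


def taylor_terms_exp(a, b, n):
    """Sum of the first n terms f(a) + (b-a) f'(a) + ... for f = exp."""
    return sum(math.exp(a) * (b - a) ** k / math.factorial(k) for k in range(n))


def theta_for_exp(a, b, n):
    """Solve remainder = (b-a)^n / n! * exp(a + theta (b-a)) for theta."""
    remainder = math.exp(b) - taylor_terms_exp(a, b, n)
    scale = (b - a) ** n / math.factorial(n)
    return (math.log(remainder / scale) - a) / (b - a)


a, b, n = 0.0, 1.0, 6
remainder = math.exp(b) - taylor_terms_exp(a, b, n)
theta = theta_for_exp(a, b, n)
print(remainder, theta)
```

Since e^x increases with x, the remainder must lie between (b - a)^n e^a / n! and (b - a)^n e^b / n!, which is the same as saying that the recovered [theta] lies between 0 and 1.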

Taylor's Theorem.

37. This theorem provides a means for computing the values of a function at points near to an assigned point when the value of the function and its differential coefficients at the assigned point are known. The function is expressed by a terminated series, and, when the remainder tends to zero as n increases, it may be transformed into an infinite series. The theorem was first given by Brook Taylor in his _Methodus Incrementorum_ (1717) as a corollary to a theorem concerning finite differences. Taylor gave the expression for [f](x + z) in terms of [f](x), [f]´(x), ... as an infinite series proceeding by powers of z. His notation was that appropriate to the method of fluxions which he used. This rule for expressing a function as an infinite series is known as Taylor's theorem. The relation (i.), in which the remainder after n terms is put in evidence, was first obtained by Lagrange (1797). Another form of the remainder was given by Cauchy (1823) viz.,

\[ \frac{(b - a)^n}{(n - 1)!} (1 - \theta)^{n-1} f^{(n)}\{a + \theta(b - a)\}. \]

The conditions of validity of Taylor's expansion in an infinite series have been investigated very completely by A. Pringsheim (_Math. Ann._ Bd. xliv., 1894). It is not sufficient that the function and all its differential coefficients should be finite at x = a; there must be a _neighbourhood_ of a within which Cauchy's form of the remainder tends to zero as n increases (cf. FUNCTION).

An example of the necessity of this condition is afforded by the function f(x) which is given by the equation

\[ f(x) = \frac{1}{1 + x^2} + \sum_{n=1}^{\infty} \frac{(-1)^n}{n!} \frac{1}{1 + 3^{2n} x^2}. \tag{i.} \]

The sum of the series

\[ f(0) + x f'(0) + \frac{x^2}{2!} f''(0) + \dots \tag{ii.} \]

is the same as that of the series

\[ e^{-1} - x^2 e^{-3^2} + x^4 e^{-3^4} - \dots \]

It is easy to prove that this is less than e^-1 when x lies between 0 and 1, and also that f(x) is greater than e^-1 when x = 1/√3. Hence the sum of the series (i.) is not equal to the sum of the series (ii.).
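The two inequalities may be confirmed by direct summation. The sketch below is an illustration with assumed names (`f`, `maclaurin_sum`) and assumed truncation points; both series converge so rapidly at the point considered that a modest number of terms is ample.

```python
import math


def f(x, terms=40):
    """Series (i.): the sum over n = 0, 1, 2, ... of (-1)^n / n! / (1 + 3^(2n) x^2)."""
    return sum(
        (-1) ** n / math.factorial(n) / (1.0 + 9.0 ** n * x * x)
        for n in range(terms)
    )


def maclaurin_sum(x, terms=6):
    """Series (ii.): e^-1 - x^2 e^(-3^2) + x^4 e^(-3^4) - ..."""
    return sum(
        (-1) ** k * x ** (2 * k) * math.exp(-(9.0 ** k))
        for k in range(terms)
    )


x = 1.0 / math.sqrt(3.0)
print(f(x), math.exp(-1.0), maclaurin_sum(x))
```

At x = 1/√3 the first value exceeds e^-1 and the last falls short of it, so the function is not represented by its own expansion in powers of x at that point.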

The particular case of Taylor's theorem in which a = 0 is often called Maclaurin's theorem, because it was first explicitly stated by Colin Maclaurin in his _Treatise of Fluxions_ (1742). Maclaurin like Taylor worked exclusively with the fluxional calculus.

Expansions in power series.

Examples of expansions in series had been known for some time. The series for log (1 + x) was obtained by Nicolaus Mercator (1668) by expanding (1 + x)^-1 by the method of algebraic division, and integrating the series term by term. He regarded his result as a "quadrature of the hyperbola." Newton (1669) obtained the expansion of sin^-1 x by expanding (1 - x²)^-½ by the binomial theorem and integrating the series term by term. James Gregory (1671) gave the series for tan^-1 x. Newton also obtained the series for sin x, cos x, and e^x by reversion of series (1669). The symbol e for the base of the Napierian logarithms was introduced by Euler (1739). All these series can be obtained at once by Taylor's theorem. James Gregory found also the first few terms of the series for tan x and sec x; the terms of these series may be found successively by Taylor's theorem, but the numerical coefficient of the general term cannot be obtained in this way.
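Mercator's procedure can be followed in a few lines: division gives (1 + x)^-1 = 1 - x + x² - ..., and integration term by term gives x - x²/2 + x³/3 - ... for log (1 + x). The sketch below is illustrative; the name `mercator_log1p` and the sample value x = 0.5 are assumptions, and the comparison uses the library function `math.log1p`.

```python
import math


def mercator_log1p(x, terms):
    """Partial sum x - x^2/2 + x^3/3 - ... of the series for log(1 + x)."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, terms + 1))


x = 0.5
for terms in (5, 10, 20, 50):
    print(terms, mercator_log1p(x, terms), math.log1p(x))
```

The series converges only slowly as x approaches 1, which is one reason why the rapidity of convergence, and not merely its existence, matters in numerical work.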

Taylor's theorem for the expansion of a function in a power series was the basis of Lagrange's theory of functions, and it is fundamental also in the theory of analytic functions of a complex variable as developed later by Karl Weierstrass. It has also numerous applications to problems of maxima and minima and to analytical geometry. These matters are treated in the appropriate articles.

The forms of the coefficients in the series for tan x and sec x can be expressed most simply in terms of a set of numbers introduced by James Bernoulli in his treatise on probability entitled _Ars Conjectandi_ (1713). These numbers B1, B2, ... called Bernoulli's numbers, are the coefficients so denoted in the formula

\[ \frac{x}{e^x - 1} = 1 - \frac{x}{2} + \frac{B_1}{2!} x^2 - \frac{B_2}{4!} x^4 + \frac{B_3}{6!} x^6 - \dots, \]

and they are connected with the sums of powers of the reciprocals of the natural numbers by equations of the type

\[ B_n = \frac{(2n)!}{2^{2n-1} \pi^{2n}} \left( \frac{1}{1^{2n}} + \frac{1}{2^{2n}} + \frac{1}{3^{2n}} + \dots \right). \]

The function

\[ x^m - \frac{m}{2} x^{m-1} + \frac{m(m - 1)}{2!} B_1 x^{m-2} - \dots \]

has been called Bernoulli's function of the mth order by J. L. Raabe (Crelle's _J. f. Math._ Bd. xlii., 1851). Bernoulli's numbers and functions are of especial importance in the calculus of finite differences (see the article by D. Seliwanoff in _Ency. d. math. Wiss._ Bd. i., E., 1901).
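Bernoulli's numbers, in the numbering used above (B1 = 1/6, B2 = 1/30, B3 = 1/42), can be computed exactly and compared with the sums of reciprocal powers. The following is a sketch under stated assumptions: the coefficients of x/(e^x - 1) are obtained from the standard recurrence which results from multiplying that series by the series for e^x - 1, and the function names and the truncation of the infinite sum at 20000 terms are choices made for the example.

```python
from fractions import Fraction
from math import comb, factorial, pi


def series_coefficients(count):
    """Return b_0 .. b_(count-1), where x/(e^x - 1) = sum of b_k x^k / k!."""
    b = [Fraction(1)]
    for m in range(1, count):
        s = sum(comb(m + 1, j) * b[j] for j in range(m))
        b.append(-s / (m + 1))
    return b


def bernoulli_number(n):
    """B_n in the article's numbering, i.e. the absolute value of b_(2n)."""
    return abs(series_coefficients(2 * n + 1)[2 * n])


def bernoulli_from_reciprocal_powers(n, terms=20000):
    """Approximate B_n from 1/1^(2n) + 1/2^(2n) + ... truncated after `terms` terms."""
    total = sum(1.0 / k ** (2 * n) for k in range(1, terms + 1))
    return factorial(2 * n) / (2 ** (2 * n - 1) * pi ** (2 * n)) * total


for n in (1, 2, 3):
    print(n, bernoulli_number(n), bernoulli_from_reciprocal_powers(n))
```

The exact values are rational, while the second function gives only an approximation whose accuracy for n = 1 is limited by the slow convergence of the sum of inverse squares.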

When x is given in terms of y by means of a power series of the form

\[ x = y\,(C_0 + C_1 y + C_2 y^2 + \dots) = y\,f_0(y), \ \text{say}, \qquad (C_0 \neq 0), \]

there arises the problem of expressing y as a power series in x. This problem is that of _reversion of series_. It can be shown that provided the absolute value of x is not too great,