
CHAPTER XII.

THE INDUCTIVE OR INVERSE APPLICATION OF THE THEORY OF PROBABILITY.

We have hitherto considered the theory of probability only in its simple deductive employment, in which it enables us to determine from given conditions the probable character of events happening under those conditions. But as deductive reasoning when inversely applied constitutes the process of induction, so the calculation of probabilities may be inversely applied; from the known character of certain events we may argue backwards to the probability of a certain law or condition governing those events. Having satisfactorily accomplished this work, we may indeed calculate forwards to the probable character of future events happening under the same conditions; but this part of the process is a direct use of deductive reasoning (p. 226).

Now it is highly instructive to find that whether the theory of probability be deductively or inductively applied, the calculation is always performed according to the principles and rules of deduction. The probability that an event has a particular condition entirely depends upon the probability that if the condition existed the event would follow. If we take up a pack of common playing cards, and observe that they are arranged in perfect numerical order, we conclude beyond all reasonable doubt that they have been thus intentionally arranged by some person acquainted with the usual order of sequence. This conclusion is quite irresistible, and rightly so; for there are but two suppositions which we can make as to the reason of the cards being in that particular order:--

1. They may have been intentionally arranged by some one who would probably prefer the numerical order.

2. They may have fallen into that order by chance, that is, by some series of conditions which, being unknown to us, cannot be known to lead by preference to the particular order in question.

The latter supposition is by no means absurd, for any one order is as likely as any other when there is no preponderating tendency. But we can readily calculate by the doctrine of permutations the probability that fifty-two objects would fall by chance into any one particular order. Fifty-two objects can be arranged in 52 × 51 × ... × 3 × 2 × 1 or about 8066 × (10)^{64} possible orders, the number obtained requiring 68 places of figures for its full expression. Hence it is excessively unlikely that anyone should ever meet with a pack of cards arranged in perfect order by accident. If we do meet with a pack so arranged, we inevitably adopt the other supposition, that some person, having reasons for preferring that special order, has thus put them together.
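The count of possible orders can be checked directly. The following is a minimal sketch in Python (the variable names are illustrative only); it computes the number of permutations of fifty-two objects exactly and counts the places of figures.

```python
import math

# Number of distinct orders of 52 cards: 52 x 51 x ... x 2 x 1.
orders = math.factorial(52)

# Places of figures needed to write the number out in full.
digits = len(str(orders))

print(orders)
print(digits)

# Probability that a chance arrangement hits one particular order
# (a float, so only approximate).
chance_of_one_order = 1 / orders
print(chance_of_one_order)
```

The leading figures 8065... and the sixty-eight places agree with the value of about 8066 × (10)^{64} quoted in the text.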

We know that of the immense number of possible orders the numerical order is the most remarkable; it is useful as proving the perfect constitution of the pack, and it is the intentional result of certain games. At any rate, the probability that intention should produce that order is incomparably greater than the probability that chance should produce it; and as a certain pack exists in that order, we rightly prefer the supposition which most probably leads to the observed result.

By a similar mode of reasoning we every day arrive, and validly arrive, at conclusions approximating to certainty. Whenever we observe a perfect resemblance between two objects, as, for instance, two printed pages, two engravings, two coins, two foot-prints, we are warranted in asserting that they proceed from the same type, the same plate, the same pair of dies, or the same boot. And why? Because it is almost impossible that with different types, plates, dies, or boots some apparent distinction of form should not be produced. It is impossible for the hand of the most skilful artist to make two objects alike, so that mechanical repetition is the only probable explanation of exact similarity.

We can often establish with extreme probability that one document is copied from another. Suppose that each document contains 10,000 words, and that the same word is incorrectly spelt in each. There is then a probability of less than 1 in 10,000 that the same mistake should be made in each. If we meet with a second error occurring in each document, the probability is less than 1 in 10,000 × 9999, that two such coincidences should occur by chance, and the numbers grow with extreme rapidity for more numerous coincidences. We cannot make any precise calculations without taking into account the character of the errors committed, concerning the conditions of which we have no accurate means of estimating probabilities. Nevertheless, abundant evidence may thus be obtained as to the derivation of documents from each other. In the examination of many sets of logarithmic tables, six remarkable errors were found to be present in all but two, and it was proved that tables printed at Paris, Berlin, Florence, Avignon, and even in China, besides thirteen sets printed in England between the years 1633 and 1822, were derived directly or indirectly from some common source.[150] With a certain amount of labour, it is possible to establish beyond reasonable doubt the relationship or genealogy of any number of copies of one document, proceeding possibly from parent copies now lost. The relations between the manuscripts of the New Testament have been elaborately investigated in this manner, and the same work has been performed for many classical writings, especially by German scholars.

[150] Lardner, *Edinburgh Review*, July 1834, p. 277.

*Principle of the Inverse Method.*

The inverse application of the rules of probability entirely depends upon a proposition which may be thus stated, nearly in the words of Laplace.[151] *If an event can be produced by any one of a certain number of different causes, all equally probable à priori, the probabilities of the existence of these causes as inferred from the event, are proportional to the probabilities of the event as derived from these causes.* In other words, the most probable cause of an event which has happened is that which would most probably lead to the event supposing the cause to exist; but all other possible causes are also to be taken into account with probabilities proportional to the probability that the event would happen if the cause existed. Suppose, to fix our ideas clearly, that E is the event, and C_{1} C_{2} C_{3} are the three only conceivable causes. If C_{1} exist, the probability is *p*_{1} that E would follow; if C_{2} or C_{3} exist, the like probabilities are respectively *p*_{2} and *p*_{3}. Then as *p*_{1} is to *p*_{2}, so is the probability of C_{1} being the actual cause to the probability of C_{2} being it; and, similarly, as *p*_{2} is to *p*_{3}, so is the probability of C_{2} being the actual cause to the probability of C_{3} being it. By a simple mathematical process we arrive at the conclusion that the actual probability of C_{1} being the cause is

*p*_{1}/(*p*_{1} + *p*_{2} + *p*_{3});

and the similar probabilities of the existence of C_{2} and C_{3} are,

*p*_{2}/(*p*_{1} + *p*_{2} + *p*_{3}) and *p*_{3}/(*p*_{1} + *p*_{2} + *p*_{3}).

[151] *Mémoires par divers Savans*, tom. vi.; quoted by Todhunter in his *History of the Theory of Probability*, p. 458.

The sum of these three fractions amounts to unity, which correctly expresses the certainty that one cause or other must be in operation.

We may thus state the result in general language. *If it is certain that one or other of the supposed causes exists, the probability that any one does exist is the probability that if it exists the event happens, divided by the sum of all the similar probabilities.* There may seem to be an intricacy in this subject which may prove distasteful to some readers; but this intricacy is essential to the subject in hand. No one can possibly understand the principles of inductive reasoning, unless he will take the trouble to master the meaning of this rule, by which we recede from an event to the probability of each of its possible causes.

This rule or principle of the indirect method is that which common sense leads us to adopt almost instinctively, before we have any comprehension of the principle in its general form. It is easy to see, too, that it is the rule which will, out of a great multitude of cases, lead us most often to the truth, since the most probable cause of an event really means that cause which in the greatest number of cases produces the event. Donkin and Boole have given demonstrations of this principle, but the one most easy to comprehend is that of Poisson. He imagines each possible cause of an event to be represented by a distinct ballot-box, containing black and white balls, in such a ratio that the probability of a white ball being drawn is equal to that of the event happening. He further supposes that each box, as is possible, contains the same total number of balls, black and white; then, mixing all the contents of the boxes together, he shows that if a white ball be drawn from the aggregate ballot-box thus formed, the probability that it proceeded from any particular ballot-box is represented by the number of white balls in that particular box, divided by the total number of white balls in all the boxes. This result corresponds to that given by the principle in question.[152]

[152] Poisson, *Recherches sur la Probabilité des Jugements*, Paris, 1837, pp. 82, 83.

Thus, if there be three boxes, each containing ten balls in all, and respectively containing seven, four, and three white balls, then on mixing all the balls together we have fourteen white ones; and if we draw a white ball, that is if the event happens, the probability that it came out of the first box is 7/14; which is exactly equal to (7/10)/(7/10 + 4/10 + 3/10), the fraction given by the rule of the Inverse Method.
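The rule may be sketched in a few lines of Python. This is only an illustration under the assumption stated in the text, that the causes are equally probable à priori; the function name `inverse_probabilities` is hypothetical and not drawn from any source.

```python
from fractions import Fraction

def inverse_probabilities(likelihoods):
    """Given the probability of the event under each equally probable cause,
    return the probability of each cause once the event has happened."""
    total = sum(likelihoods)
    return [p / total for p in likelihoods]

# Poisson's three boxes: 7, 4 and 3 white balls out of ten in each.
boxes = [Fraction(7, 10), Fraction(4, 10), Fraction(3, 10)]
posterior = inverse_probabilities(boxes)

print(posterior)       # probabilities that the white ball came from each box
print(sum(posterior))  # the three fractions together amount to unity
```

The first value is 1/2, that is 7/14, as found by mixing the balls together.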

*Simple Applications of the Inverse Method.*

In many cases of scientific induction we may apply the principle of the inverse method in a simple manner. If only two, or at the most a few hypotheses, may be made as to the origin of certain phenomena, we may sometimes easily calculate the respective probabilities. It was thus that Bunsen and Kirchhoff established, with a probability little short of certainty, that iron exists in the sun. On comparing the spectra of sunlight and of the light proceeding from the incandescent vapour of iron, it became apparent that at least sixty bright lines in the spectrum of iron coincided with dark lines in the sun’s spectrum. Such coincidences could never be observed with certainty, because, even if the lines only closely approached, the instrumental imperfections of the spectroscope would make them apparently coincident, and if one line came within half a millimetre of another, on the map of the spectra, they could not be pronounced distinct. Now the average distance of the solar lines on Kirchhoff’s map is 2 mm., and if we throw down a line, as it were, by pure chance on such a map, the probability is about one-half that the new line will fall within 1/2 mm. on one side or the other of some one of the solar lines. To put it in another way, we may suppose that each solar line, either on account of its real breadth, or the defects of the instrument, possesses a breadth of 1/2 mm., and that each line in the iron spectrum has a like breadth. The probability then is just one-half that the centre of each iron line will come by chance within 1/2 mm. of the centre of a solar line, so as to appear to coincide with it. The probability of casual coincidence of each iron line with a solar line is in like manner 1/2. Coincidence in the case of each of the sixty iron lines is a very unlikely event if it arises casually, for it would have a probability of only (1/2)^{60} or less than 1 in a trillion. The odds, in short, are more than a million million millions to unity against such casual coincidence.[153] But on the other hypothesis, that iron exists in the sun, it is highly probable that such coincidences would be observed; it is immensely more probable that sixty coincidences would be observed if iron existed in the sun, than that they should arise from chance. Hence by our principle it is immensely probable that iron does exist in the sun.
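The arithmetic of the casual hypothesis can be sketched as follows. The sketch assumes, as the argument does, that a line thrown down by chance is equally likely to fall anywhere on the map and that the sixty lines are independent of each other; the variable names are illustrative.

```python
from fractions import Fraction

mean_spacing_mm = Fraction(2)   # average distance between solar lines
tolerance_mm = Fraction(1, 2)   # lines closer than this cannot be told apart

# A chance line appears to coincide if it falls within the tolerance on
# either side of a solar line: a band 1 mm wide in every 2 mm of map.
p_single = 2 * tolerance_mm / mean_spacing_mm

n_lines = 60
p_all = p_single ** n_lines     # all sixty coincide by chance

print(p_single)                 # 1/2
print(p_all.denominator)        # 2**60
print(p_all.denominator > 10**18)
```

Since 2^{60} exceeds a million million millions, the stated odds follow.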

[153] Kirchhoff’s *Researches on the Solar Spectrum*. First part, translated by Roscoe, pp. 18, 19.

All the other interesting results, given by the comparison of spectra, rest upon the same principle of probability. The almost complete coincidence between the spectra of solar, lunar, and planetary light renders it practically certain that the light is all of solar origin, and is reflected from the surfaces of the moon and planets, suffering only slight alteration from the atmospheres of some of the planets. A fresh confirmation of the truth of the Copernican theory is thus furnished.

Herschel proved in this way the connection between the direction of the oblique faces of quartz crystals, and the direction in which the same crystals rotate the plane of polarisation of light. For if it is found in a second crystal that the relation is the same as in the first, the probability of this happening by chance is 1/2; the probability that in another crystal also the direction will be the same is 1/4, and so on. The probability that in *n* + 1 crystals there would be casual agreement of direction is the *n*th power of 1/2. Thus, if in examining fourteen crystals the same relation of the two phenomena is discovered in each, the odds that it proceeds from uniform conditions are more than 8000 to 1.[154] Since the first observations on this subject were made in 1820, no exceptions have been observed, so that the probability of invariable connection is incalculably great.

[154] *Edinburgh Review*, No. 185, vol. xcii. July 1850, p. 32; Herschel’s *Essays*, p. 421; *Transactions of the Cambridge Philosophical Society*, vol. i. p. 43.

It is exceedingly probable that the ancient Egyptians had exactly recorded the eclipses occurring during long periods of time, for Diogenes Laertius mentions that 373 solar and 832 lunar eclipses had been observed, and the ratio between these numbers exactly expresses that which would hold true of the eclipses of any long period, of say 1200 or 1300 years, as estimated on astronomical grounds. It is evident that an agreement between small numbers, or customary numbers, such as seven, one hundred, a myriad, &c., is much more likely to happen from chance, and therefore gives much less presumption of dependence. If two ancient writers spoke of the sacrifice of oxen, they would in all probability describe it as a hecatomb, and there would be nothing remarkable in the coincidence. But it is impossible to point out any special reason why an old writer should select such numbers as 373 and 832, unless they had been the results of observation.

On similar grounds, we must inevitably believe in the human origin of the flint flakes so copiously discovered of late years. For though the accidental stroke of one stone against another may often produce flakes, such as are occasionally found on the sea-shore, yet when several flakes are found in close company, and each one bears evidence, not of a single blow only, but of several successive blows, all conducing to form a symmetrical knife-like form, the probability of a natural and accidental origin becomes incredibly small, and the contrary supposition, that they are the work of intelligent beings, approximately certain.[155]

[155] Evans’ *Ancient Stone Implements of Great Britain*. London, 1872 (Longmans).

*The Theory of Probability in Astronomy.*

The science of astronomy, occupied with the simple relations of distance, magnitude, and motion of the heavenly bodies, admits more easily than almost any other science of interesting conclusions founded on the theory of probability. More than a century ago, in 1767, Michell showed the extreme probability of bonds connecting together systems of stars. He was struck by the unexpected number of fixed stars which have companions close to them. Such a conjunction might happen casually by one star, although possibly at a great distance from the other, happening to lie on a straight line passing near the earth. But the probabilities are so greatly against such an optical union happening often in the expanse of the heavens, that Michell asserted the existence of some connection between most of the double stars. It has since been estimated by Struve, that the odds are 9570 to 1 against any two stars of not less than the seventh magnitude falling within the apparent distance of four seconds of each other by chance, and yet ninety-one such cases were known when the estimation was made, and many more cases have since been discovered. There were also four known triple stars, and yet the odds against the appearance of any one such conjunction are 173,524 to 1.[156] The conclusions of Michell have been entirely verified by the discovery that many double stars are connected by gravitation.

[156] Herschel, *Outlines of Astronomy*, 1849, p. 565; but Todhunter, in his *History of the Theory of Probability*, p. 335, states that the calculations do not agree with those published by Struve.

Michell also investigated the probability that the six brightest stars in the Pleiades should have come by accident into such striking proximity. Estimating the number of stars of equal or greater brightness at 1500, he found the odds to be nearly 500,000 to 1 against casual conjunction. Extending the same kind of argument to other clusters, such as that of Præsepe, the nebula in the hilt of Perseus’ sword, he says:[157] “We may with the highest probability conclude, the odds against the contrary opinion being many million millions to one, that the stars are really collected together in clusters in some places, where they form a kind of system, while in others there are either few or none of them, to whatever cause this may be owing, whether to their mutual gravitation, or to some other law or appointment of the Creator.”

[157] *Philosophical Transactions*, 1767, vol. lvii. p. 431.

The calculations of Michell have been called in question by the late James D. Forbes,[158] and Mr. Todhunter vaguely countenances his objections,[159] otherwise I should not have thought them of much weight. Certainly Laplace accepts Michell’s views,[160] and if Michell be in error it is in the methods of calculation, not in the general validity of his reasoning and conclusions.

[158] *Philosophical Magazine*, 3rd Series, vol. xxxvii. p. 401, December 1850; also August 1849.

[159] *History*, &c., p. 334.

[160] *Essai Philosophique*, p. 57.

Similar calculations might no doubt be applied to the peculiar drifting motions which have been detected by Mr. R. A. Proctor in some of the constellations.[161] The odds are very greatly against any numerous group of stars moving together in any one direction by chance. On like grounds, there can be no doubt that the sun has a considerable proper motion because on the average the fixed stars show a tendency to move apparently from one point of the heavens towards that diametrically opposite. The sun’s motion in the contrary direction would explain this tendency, otherwise we must believe that thousands of stars accidentally agree in their direction of motion, or are urged by some common force from which the sun is exempt. It may be said that the rotation of the earth is proved in like manner, because it is immensely more probable that one body would revolve than that the sun, moon, planets, comets, and the whole of the stars of the heavens should be whirled round the earth daily, with a uniform motion superadded to their own peculiar motions. This appears to be mainly the reason which led Gilbert, one of the earliest English Copernicans, and in every way an admirable physicist, to admit the rotation of the earth, while Francis Bacon denied it.

[161] *Proceedings of the Royal Society*; 20 January, 1870; *Philosophical Magazine*, 4th Series, vol. xxxix. p. 381.

In contemplating the planetary system, we are struck with the similarity in direction of nearly all its movements. Newton remarked upon the regularity and uniformity of these motions, and contrasted them with the eccentricity and irregularity of the cometary orbits.[162] Could we, in fact, look down upon the system from the northern side, we should see all the planets moving round from west to east, the satellites moving round their primaries, and the sun, planets, and satellites rotating in the same direction, with some exceptions on the verge of the system. In the time of Laplace eleven planets were known, and the directions of rotation were known for the sun, six planets, the satellites of Jupiter, Saturn’s ring, and one of his satellites. Thus there were altogether 43 motions all concurring, namely:--

Orbital motions of eleven planets: 11
Orbital motions of eighteen satellites: 18
Axial rotations: 14
Total: 43

[162] *Principia*, bk. ii. General scholium.

The probability that 43 motions independent of each other would coincide by chance is the 42nd power of 1/2, so that the odds are about 4,400,000,000,000 to 1 in favour of some common cause for the uniformity of direction. This probability, as Laplace observes,[163] is higher than that of many historical events which we undoubtingly believe. In the present day, the probability is much increased by the discovery of additional planets, and the rotation of other satellites, and it is only slightly weakened by the fact that some of the outlying satellites are exceptional in direction, there being considerable evidence of an accidental disturbance in the more distant parts of the system.
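The figure may be verified with a short sketch. It assumes that each motion is independent and equally likely to take either direction, so that the first motion fixes the direction and each of the others agrees with it with probability 1/2; the function name is hypothetical.

```python
def reciprocal_chance_of_agreement(n_motions):
    """Reciprocal of the probability that n independent two-way motions
    all agree in direction: the first is free, the rest each have 1/2."""
    return 2 ** (n_motions - 1)

print(reciprocal_chance_of_agreement(43))  # the 43 concurring motions
print(reciprocal_chance_of_agreement(14))  # the fourteen quartz crystals
```

The first value is 4,398,046,511,104, or about 4,400,000,000,000; the second, 8192, is the same rule applied to the fourteen quartz crystals examined by Herschel.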

[163] *Essai Philosophique*, p. 55. Laplace appears to count the rings of Saturn as giving two independent movements.

Hardly less remarkable than the uniform direction of motion is the near approximation of the orbits of the planets to a common plane. Daniel Bernoulli roughly estimated the probability of such an agreement arising from accident as 1 ÷ (12)^{6}, the greatest inclination of any orbit to the sun’s equator being 1-12th part of a quadrant. Laplace devoted to this subject some of his most ingenious investigations. He found the probability that the sum of the inclinations of the planetary orbits would not exceed by accident the actual amount (·914187 of a right angle for the ten planets known in 1801) to be (·914187)^{10}/10! or about ·00000011235. This probability may be combined with that derived from the direction of motion, and it then becomes immensely probable that the constitution of the planetary system arose out of uniform conditions, or, as we say, from some common cause.[164]
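Laplace's figure can be reproduced numerically if the expression is read as the tenth power of ·914187 divided by factorial ten. That reading rests on a standard result: when *n* quantities are each equally likely to lie anywhere between 0 and 1, the probability that their sum does not exceed *s* (for *s* not greater than 1) is *s*^{n}/*n*!. The sketch below is a check of the arithmetic only, with illustrative variable names.

```python
import math

total_inclination = 0.914187   # sum of inclinations, in right angles
n_planets = 10                 # planets known in 1801

# Probability that ten independent inclinations, each uniform between
# 0 and a right angle, sum to no more than the observed total.
p_accident = total_inclination ** n_planets / math.factorial(n_planets)

print(p_accident)
```

The result is about 1.1235 × 10^{-7}, agreeing with the value ·00000011235 given in the text.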

[164] Lubbock, *Essay on Probability*, p. 14. De Morgan, *Encyc. Metrop.* art. *Probability*, p. 412. Todhunter’s *History of the Theory of Probability*, p. 543. Concerning the objections raised to these conclusions by Boole, see the *Philosophical Magazine*, 4th Series, vol. ii. p. 98. Boole’s *Laws of Thought*, pp. 364–375.

If the same kind of calculation be applied to the orbits of comets, the result is very different.[165] Of the orbits which have been determined 48·9 per cent. only are direct or in the same direction as the planetary motions.[166] Hence it becomes apparent that comets do not properly belong to the solar system, and it is probable that they are stray portions of nebulous matter which have accidentally become attached to the system by the attractive powers of the sun or Jupiter.

[165] Laplace, *Essai Philosophique*, pp. 55, 56.

[166] Chambers’ *Astronomy*, 2nd ed. pp. 346–49.

*The General Inverse Problem.*

In the instances described in the preceding sections, we have been occupied in receding from the occurrence of certain similar events to the probability that there must have been a condition or cause for such events. We have found that the theory of probability, although never yielding a certain result, often enables us to establish an hypothesis beyond the reach of reasonable doubt. There is, however, another method of applying the theory, which possesses for us even greater interest, because it illustrates, in the most complete manner, the theory of inference adopted in this work, which theory indeed it suggested. The problem to be solved is as follows:--

*An event having happened a certain number of times, and failed a certain number of times, required the probability that it will happen any given number of times in the future under the same circumstances.*

All the *larger* planets hitherto discovered move in one direction round the sun; what is the probability that, if a new planet exterior to Neptune be discovered, it will move in the same direction? All known permanent gases, except chlorine, are colourless; what is the probability that, if some new permanent gas should be discovered, it will be colourless? In the general solution of this problem, we wish to infer the future happening of any event from the number of times that it has already been observed to happen. Now, it is very instructive to find that there is no known process by which we can pass directly from the data to the conclusion. It is always requisite to recede from the data to the probability of some hypothesis, and to make that hypothesis the ground of our inference concerning future events. Mathematicians, in fact, make every hypothesis which is applicable to the question in hand; they then calculate, by the inverse method, the probability of every such hypothesis according to the data, and the probability that if each hypothesis be true, the required future event will happen. The total probability that the event will happen is the sum of the separate probabilities contributed by each distinct hypothesis.

To illustrate more precisely the method of solving the problem, it is desirable to adopt some concrete mode of representation, and the ballot-box, so often employed by mathematicians, will best serve our purpose. Let the happening of any event be represented by the drawing of a white ball from a ballot-box, while the failure of an event is represented by the drawing of a black ball. Now, in the inductive problem we are supposed to be ignorant of the contents of the ballot-box, and are required to ground all our inferences on our experience of those contents as shown in successive drawings. Rude common sense would guide us nearly to a true conclusion. Thus, if we had drawn twenty balls one after another, replacing the ball after each drawing, and the ball had in each case proved to be white, we should believe that there was a considerable preponderance of white balls in the urn, and a probability in favour of drawing a white ball on the next occasion. Though we had drawn white balls for thousands of times without fail, it would still be possible that some black balls lurked in the urn and would at last appear, so that our inferences could never be certain. On the other hand, if black balls came at intervals, we should expect that after a certain number of trials the black balls would appear again from time to time with somewhat the same frequency.

The mathematical solution of the question consists in little more than a close analysis of the mode in which our common sense proceeds. If twenty white balls have been drawn and no black ball, my common sense tells me that any hypothesis which makes the black balls in the urn considerable compared with the white ones is improbable; a preponderance of white balls is a more probable hypothesis, and as a deduction from this more probable hypothesis, I expect a recurrence of white balls. The mathematician merely reduces this process of thought to exact numbers. Taking, for instance, the hypothesis that there are 99 white and one black ball in the urn, he can calculate the probability that 20 white balls would be drawn in succession in those circumstances; he thus forms a definite estimate of the probability of this hypothesis, and knowing at the same time the probability of a white ball reappearing if such be the contents of the urn, he combines these probabilities, and obtains an exact estimate that a white ball will recur in consequence of this hypothesis. But as this hypothesis is only one out of many possible ones, since the ratio of white and black balls may be 98 to 2, or 97 to 3, or 96 to 4, and so on, he has to repeat the estimate for every such possible hypothesis. To make the method of solving the problem perfectly evident, I will describe in the next section a very simple case of the problem, originally devised for the purpose by Condorcet, which was also adopted by Lacroix,[167] and has passed into the works of De Morgan, Lubbock, and others.

[167] *Traité élémentaire du Calcul des Probabilités*, 3rd ed. (1833), p. 148.

*Simple Illustration of the Inverse Problem.*

Suppose it to be known that a ballot-box contains only four black or white balls, the ratio of black and white balls being unknown. Four drawings having been made with replacement, and a white ball having appeared on each occasion but one, it is required to determine the probability that a white ball will appear next time. Now the hypotheses which can be made as to the contents of the urn are very limited in number, and are at most the following five:--

4 white and 0 black balls
3 white and 1 black ball
2 white and 2 black balls
1 white and 3 black balls
0 white and 4 black balls

The actual occurrence of black and white balls in the drawings puts the first and last hypothesis out of the question, so that we have only three left to consider.

If the box contains three white and one black, the probability of drawing a white each time is 3/4, and a black 1/4; so that the compound event observed, namely, three white and one black, has the probability 3/4 × 3/4 × 3/4 × 1/4, by the rule already given (p. 204). But as it is indifferent in what order the balls are drawn, and the black ball might come first, second, third, or fourth, we must multiply by four, to obtain the probability of three white and one black in any order, thus getting 27/64.

Taking the next hypothesis of two white and two black balls in the urn, we obtain for the same probability the quantity 1/2 × 1/2 × 1/2 × 1/2 × 4, or 16/64, and from the third hypothesis of one white and three black we deduce likewise 1/4 × 1/4 × 1/4 × 3/4 × 4, or 3/64. According, then, as we adopt the first, second, or third hypothesis, the probability that the result actually noticed would follow is 27/64, 16/64, and 3/64. Now it is certain that one or other of these hypotheses must be the true one, and their absolute probabilities are proportional to the probabilities that the observed events would follow from them (pp. 242, 243). All we have to do, then, in order to obtain the absolute probability of each hypothesis, is to alter these fractions in a uniform ratio, so that their sum shall be unity, the expression of certainty. Now, since 27 + 16 + 3 = 46, this will be effected by dividing each fraction by 46, and multiplying by 64. Thus the probabilities of the first, second, and third hypotheses are respectively--

27/46, 16/46, 3/46.

The inductive part of the problem is completed, since we have found that the urn most likely contains three white and one black ball, and have assigned the exact probability of each possible supposition. But we are now in a position to resume deductive reasoning, and infer the probability that the next drawing will yield, say a white ball. For if the box contains three white and one black ball, the probability of drawing a white one is certainly 3/4; and as the probability of the box being so constituted is 27/46, the compound probability that the box will be so filled and will give a white ball at the next trial, is

27/46 × 3/4 or 81/184.

Again, the probability is 16/46 that the box contains two white and two black, and under those conditions the probability is 1/2 that a white ball will appear; hence the probability that a white ball will appear in consequence of that condition, is

16/46 × 1/2 or 32/184.

From the third supposition we get in like manner the probability

3/46 × 1/4 or 3/184.

Since one and not more than one hypothesis can be true, we may add together these separate probabilities, and we find that

81/184 + 32/184 + 3/184 or 116/184

is the complete probability that a white ball will be next drawn under the conditions and data supposed.
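The whole calculation for the box of four balls can be checked mechanically. The following is a minimal sketch in Python, assuming (as the text's figures imply) that each ball is returned to the box after drawing and that the three admissible hypotheses are equally likely beforehand; the names `likelihood`, `posteriors` and `next_white` are illustrative only.

```python
from fractions import Fraction
from math import comb

TOTAL = 4  # balls in the box, as in the example above

def likelihood(white_in_box, whites_drawn=3, blacks_drawn=1):
    """Chance of the observed draws, in any order, if the box holds
    `white_in_box` white balls out of TOTAL (drawing with replacement)."""
    p = Fraction(white_in_box, TOTAL)
    # Number of positions the black ball may occupy in the sequence of draws.
    orders = comb(whites_drawn + blacks_drawn, blacks_drawn)
    return orders * p**whites_drawn * (1 - p)**blacks_drawn

hypotheses = [3, 2, 1]  # possible numbers of white balls in the box
likelihoods = [likelihood(w) for w in hypotheses]

# Inverse step: rescale the likelihoods so that they sum to unity.
total = sum(likelihoods)
posteriors = [l / total for l in likelihoods]

# Direct step: add, over all hypotheses, the chance that the hypothesis
# is true multiplied by the chance of white under that hypothesis.
next_white = sum(post * Fraction(w, TOTAL)
                 for post, w in zip(posteriors, hypotheses))

print(posteriors, next_white)
```

The exact fractions agree with 27/46, 16/46, 3/46 and with 116/184, which `Fraction` reduces to 29/46.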

*General Solution of the Inverse Problem.*

In the instance of the inverse method described in the last section, the balls supposed to be in the ballot-box were few, for the purpose of simplifying the calculation. In order that our solution may apply to natural phenomena, we must render our hypotheses as little arbitrary as possible. Having no *à priori* knowledge of the conditions of the phenomena in question, there is no limit to the variety of hypotheses which might be suggested. Mathematicians have therefore had recourse to the most extensive suppositions which can be made, namely, that the ballot-box contains an infinite number of balls; they have then varied the proportion of white to black balls continuously, from the smallest to the greatest possible proportion, and estimated the aggregate probability which results from this comprehensive supposition.

To explain their procedure, let us imagine that, instead of an infinite number, the ballot-box contains a large finite number of balls, say 1000. Then the number of white balls might be 1 or 2 or 3 or 4, and so on, up to 999. Supposing that three white and one black ball have been drawn from the urn as before, there is a certain very small probability that this would have occurred in the case of a box containing one white and 999 black balls; there is also a small probability that from such a box the next ball would be white. Compound these probabilities, and we have the probability that the next ball really will be white, in consequence of the existence of that proportion of balls. If there be two white and 998 black balls in the box, the probability is greater and will increase until the balls are supposed to be in the proportion of those drawn. Now 999 different hypotheses are possible, and the calculation is to be made for each of these, and their aggregate taken as the final result. It is apparent that as the number of balls in the box is increased, the absolute probability of any one hypothesis concerning the exact proportion of balls is decreased, but the aggregate results of all the hypotheses will assume the character of a wider average.
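The procedure just described for a large finite box may be sketched as follows. This is an illustration under stated assumptions rather than Laplace's own method: every admissible number of white balls is treated as equally likely beforehand, the balls are supposed to be replaced after each drawing, and the function name `next_white_finite` is hypothetical.

```python
from fractions import Fraction

def next_white_finite(total_balls, whites_drawn=3, blacks_drawn=1):
    """Aggregate probability that the next ball is white, summed over
    every hypothesis of 1, 2, ..., total_balls - 1 white balls."""
    numerator = Fraction(0)
    denominator = Fraction(0)
    for white in range(1, total_balls):
        p = Fraction(white, total_balls)
        # Chance of the observed draws under this hypothesis; the factor
        # for the order of drawing is common to all and cancels out.
        weight = p**whites_drawn * (1 - p)**blacks_drawn
        denominator += weight
        numerator += weight * p  # compounded with the chance of white next
    return numerator / denominator

print(float(next_white_finite(4)))
print(float(next_white_finite(1000)))
```

With four balls the function returns the value found in the preceding section; with 1000 balls it lies very close to 2/3, the value (3 + 1)/(4 + 2) which the third rule of the inverse method assigns for an infinite number.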

When we take the step of supposing the balls within the urn to be infinite in number, the possible proportions of white and black balls also become infinite, and the probability of any one proportion actually existing is infinitely small. Hence the final result that the next ball drawn will be white is really the sum of an infinite number of infinitely small quantities. It might seem impossible to calculate out a problem having an infinite number of hypotheses, but the wonderful resources of the integral calculus enable this to be done with far greater facility than if we supposed any large finite number of balls, and then actually computed the results. I will not attempt to describe the processes by which Laplace finally accomplished the complete solution of the problem. They are to be found described in several English works, especially De Morgan’s *Treatise on Probabilities*, in the *Encyclopædia Metropolitana*, and Mr. Todhunter’s *History of the Theory of Probability*. The abbreviating power of mathematical analysis was never more strikingly shown. But I may add that though the integral calculus is employed as a means of summing infinitely numerous results, we in no way abandon the principles of combinations already treated. We calculate the values of infinitely numerous factorials, not, however, obtaining their actual products, which would lead to an infinite number of figures, but obtaining the final answer to the problem by devices which can only be comprehended after study of the integral calculus.

It must be allowed that the hypothesis adopted by Laplace is in some degree arbitrary, so that there was some opening for the doubt which Boole has cast upon it.[168] But it may be replied, (1) that the supposition of an infinite number of balls treated in the manner of Laplace is less arbitrary and more comprehensive than any other that can be suggested. (2) The result does not differ much from that which would be obtained on the hypothesis of any large finite number of balls. (3) The supposition leads to a series of simple formulas which can be applied with ease in many cases, and which bear all the appearance of truth so far as it can be independently judged by a sound and practised understanding.

[168] *Laws of Thought*, pp. 368–375.

*Rules of the Inverse Method.*

By the solution of the problem, as described in the last section, we obtain the following series of simple rules.

1. *To find the probability that an event which has not hitherto been observed to fail will happen once more, divide the number of times the event has been observed increased by one, by the same number increased by two.*

If there have been *m* occasions on which a certain event might have been observed to happen, and it has happened on all those occasions, then the probability that it will happen on the next occasion of the same kind is (*m* + 1)/(*m* + 2). For instance, we may say that there are nine places in the planetary system where planets might exist obeying Bode’s law of distance, and in every place there is a planet obeying the law more or less exactly, although no reason is known for the coincidence. Hence the probability that the next planet beyond Neptune will conform to the law is 10/11.

2. *To find the probability that an event which has not hitherto failed will not fail for a certain number of new occasions, divide the number of times the event has happened increased by one, by the same number increased by one and the number of times it is to happen.*

An event having happened *m* times without fail, the probability that it will happen *n* more times is (*m* + 1)/(*m* + *n* + 1). Thus the probability that three new planets would obey Bode’s law is 10/13; but it must be allowed that this, as well as the previous result, would be much weakened by the fact that Neptune can barely be said to obey the law.

3. *An event having happened and failed a certain number of times, to find the probability that it will happen the next time, divide the number of times the event has happened increased by one, by the whole number of times the event has happened or failed increased by two.*

If an event has happened *m* times and failed *n* times, the probability that it will happen on the next occasion is (*m* + 1)/(*m* + *n* + 2). Thus, if we assume that of the elements discovered up to the year 1873, 50 are metallic and 14 non-metallic, then the probability that the next element discovered will be metallic is 51/66. Again, since of 37 metals which have been sufficiently examined only four, namely, sodium, potassium, lanthanum, and lithium, are of less density than water, the probability that the next metal examined or discovered will be less dense than water is (4 + 1)/(37 + 2) or 5/39.
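The three rules may be written out as short functions and tried upon the instances given. The sketch below is in Python with exact fractions; the function names are merely illustrative.

```python
from fractions import Fraction

def rule_one(m):
    """Event happened m times without fail: chance it happens once more."""
    return Fraction(m + 1, m + 2)

def rule_two(m, n):
    """Event happened m times without fail: chance it happens n more times."""
    return Fraction(m + 1, m + n + 1)

def rule_three(m, n):
    """Event happened m times and failed n times: chance it happens next."""
    return Fraction(m + 1, m + n + 2)

next_planet = rule_one(9)           # nine places agreeing with Bode's law
three_planets = rule_two(9, 3)      # three further planets all agreeing
next_metallic = rule_three(50, 14)  # 50 metallic, 14 non-metallic elements
next_light = rule_three(4, 37 - 4)  # 4 of 37 metals lighter than water

print(next_planet, three_planets, next_metallic, next_light)
```

The first rule is the case *n* = 1 of the second, and the case *n* = 0 of the third, so that the three results are mutually consistent. `Fraction` prints each value in lowest terms, so 51/66 is shown as 17/22.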

We may state the results of the method in a more general manner thus,[169]--If under given circumstances certain events A, B, C, &c., have happened respectively *m*, *n*, *p*, &c., times, and one or other of these events must happen, then the probabilities of these events are proportional to *m* + 1, *n* + 1, *p* + 1, &c., so that the probability of A will be (*m* + 1)/(*m* + 1 + *n* + 1 + *p* + 1 + &c.) But if new events may happen in addition to those which have been observed, we must assign unity for the probability of such new event. The odds then become 1 for a new event, *m* + 1 for A, *n* + 1 for B, and so on, and the absolute probability of A is (*m* + 1)/(1 + *m* + 1 + *n* + 1 + &c.)

[169] De Morgan’s *Essay on Probabilities*, Cabinet Cyclopædia, p. 67.
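The more general statement lends itself to the same treatment. In the sketch below the counts for A, B and C are hypothetical figures chosen only for illustration, and the function name and the label "new event" are likewise assumptions.

```python
from fractions import Fraction

def event_probabilities(counts, allow_new_event=False):
    """Probabilities proportional to (count + 1) for each observed event,
    with one further unit for an unobserved event when that is allowed."""
    weights = {name: times + 1 for name, times in counts.items()}
    total = sum(weights.values()) + (1 if allow_new_event else 0)
    probabilities = {name: Fraction(w, total) for name, w in weights.items()}
    if allow_new_event:
        probabilities["new event"] = Fraction(1, total)
    return probabilities

observed = {"A": 5, "B": 2, "C": 1}  # hypothetical counts
closed = event_probabilities(observed)
open_ended = event_probabilities(observed, allow_new_event=True)

print(closed)
print(open_ended)
```

In either case the probabilities sum to unity, as they must when one or other of the events is certain to happen.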

It is interesting to trace out the variations of probability according to these rules. The first time a casual event happens it is 2 to 1 that it will happen again; if it does happen it is 3 to 1 that it will happen a third time; and on successive occasions of the like kind the odds become 4, 5, 6, &c., to 1. The odds of course will be discriminated from the probabilities which are successively 2/3, 3/4, 4/5, &c. Thus on the first occasion on which a person sees a shark, and notices that it is accompanied by a little pilot fish, the odds are 2 to 1, or the probability 2/3, that the next shark will be so accompanied.

When an event has happened a very great number of times, its happening once again approaches nearly to certainty. If we suppose the sun to have risen one thousand million times, the probability that it will rise again, on the ground of this knowledge merely, is (1,000,000,000 + 1)/(1,000,000,000 + 1 + 1). But then the probability that it will continue to rise for as long a period in the future is only (1,000,000,000 + 1)/(2,000,000,000 + 1), or almost exactly 1/2. The probability that it will continue so rising a thousand times as long is only about 1/1001. The lesson which we may draw from these figures is quite that which we should adopt on other grounds, namely, that experience never affords certain knowledge, and that it is exceedingly improbable that events will always happen as we observe them. Inferences pushed far beyond their data soon lose any considerable probability. De Morgan has said,[170] “No finite experience whatsoever can justify us in saying that the future shall coincide with the past in all time to come, or that there is any probability for such a conclusion.” On the other hand, we gain the assurance that experience sufficiently extended and prolonged will give us the knowledge of future events with an unlimited degree of probability, provided indeed that those events are not subject to arbitrary interference.

[170] *Essay on Probabilities*, p. 128.
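The decline of probability as the inference is pushed further beyond its data may be traced numerically from the second rule. A brief sketch follows, the name `continues` being an illustrative assumption:

```python
from fractions import Fraction

def continues(m, n):
    """Chance that an event seen m times without fail happens n more times."""
    return Fraction(m + 1, m + n + 1)

m = 1_000_000_000  # supposed number of past sunrises

once_more = continues(m, 1)
as_long_again = continues(m, m)
thousand_times_as_long = continues(m, 1000 * m)

print(float(once_more), float(as_long_again), float(thousand_times_as_long))
```

The first value falls short of certainty by about one part in a thousand million, the second is almost exactly 1/2, and the third is very nearly 1/1001.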

It must be clearly understood that these probabilities are only such as arise from the mere happening of the events, irrespective of any knowledge derived from other sources concerning those events or the general laws of nature. All our knowledge of nature is indeed founded in like manner upon observation, and is therefore only probable. The law of gravitation itself is only probably true. But when a number of different facts, observed under the most diverse circumstances, are found to be harmonized under a supposed law of nature, the probability of the law approximates closely to certainty. Each science rests upon so many observed facts, and derives so much support from analogies or connections with other sciences, that there are comparatively few cases where our judgment of the probability of an event depends entirely upon a few antecedent events, disconnected from the general body of physical science.

Events, again, may often exhibit a regularity of succession or preponderance of character, which the simple formula will not take into account. For instance, the majority of the elements recently discovered are metals, so that the probability of the next discovery being that of a metal, is doubtless greater than we calculated (p. 258). At the more distant parts of the planetary system, there are symptoms of disturbance which would prevent our placing much reliance on any inference from the prevailing order of the known planets to those undiscovered ones which may possibly exist at great distances. These and all like complications in no way invalidate the theoretic truth of the formulas, but render their sound application much more difficult.

Erroneous objections have been raised to the theory of probability, on the ground that we ought not to trust to our *à priori* conceptions of what is likely to happen, but should always endeavour to obtain precise experimental data to guide us.[171] This course, however, is perfectly in accordance with the theory, which is our best and only guide, whatever data we possess. We ought to be always applying the inverse method of probabilities so as to take into account all additional information. When we throw up a coin for the first time, we are probably quite ignorant whether it tends more to fall head or tail upwards, and we must therefore assume the probability of each event as 1/2. But if it shows head in the first throw, we now have very slight experimental evidence in favour of a tendency to show head. The chance of two heads is now slightly greater than 1/4, which it appeared to be at first,[172] and as we go on throwing the coin time after time, the probability of head appearing next time constantly varies in a slight degree according to the character of our previous experience. As Laplace remarks, we ought always to have regard to such considerations in common life. Events when closely scrutinized will hardly ever prove to be quite independent, and the slightest preponderance one way or the other is some evidence of connection, and in the absence of better evidence should be taken into account.

[171] J. S. Mill, *System of Logic*, 5th edition, bk. iii. chap. xviii. § 3.

[172] Todhunter’s *History*, pp. 472, 598.
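One way of seeing why the chance of two heads should exceed 1/4 is to suppose that the coin favours one face by some small unknown amount, and that we are ignorant which face is favoured. The following sketch rests upon that assumed model, which is offered as an illustration and is not taken from the text; the bias *e* of 1/20 is a hypothetical figure.

```python
from fractions import Fraction

def chance_two_heads(e):
    """Chance of two heads when the coin favours one face by e, and
    head or tail is equally likely to be the favoured face."""
    half = Fraction(1, 2)
    if_head_favoured = (half + e) ** 2
    if_tail_favoured = (half - e) ** 2
    # Average over our ignorance of which face is favoured.
    return (if_head_favoured + if_tail_favoured) / 2

fair = chance_two_heads(Fraction(0))
biased = chance_two_heads(Fraction(1, 20))
print(fair, biased)
```

Under this supposition the chance is 1/4 + *e*², which exceeds 1/4 for any bias whatever, in whichever direction it lies.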

The grand object of seeking to estimate the probability of future events from past experience, seems to have been entertained by James Bernoulli and De Moivre, at least such was the opinion of Condorcet; and Bernoulli may be said to have solved one case of the problem.[173] The English writers Bayes and Price are, however, undoubtedly the first who put forward any distinct rules on the subject.[174] Condorcet and several other eminent mathematicians advanced the mathematical theory of the subject; but it was reserved to the immortal Laplace to bring to the subject the full power of his genius, and carry the solution of the problem almost to perfection. It is instructive to observe that a theory which arose from petty games of chance, the rules and the very names of which are forgotten, gradually advanced, until it embraced the most sublime problems of science, and finally undertook to measure the value and certainty of all our inductions.

[173] Todhunter’s *History*, pp. 378, 379.

[174] *Philosophical Transactions*, [1763], vol. liii. p. 370, and [1764], vol. liv. p. 296. Todhunter, pp. 294–300.

*Fortuitous Coincidences.*

We should have studied the theory of probability to very little purpose, if we thought that it would furnish us with an infallible guide. The theory itself points out the approximate certainty, that we shall sometimes be deceived by extraordinary fortuitous coincidences. There is no run of luck so extreme that it may not happen, and it may happen to us, or in our time, as well as to other persons or in other times. We may be forced by correct calculation to refer such coincidences to a necessary cause, and yet we may be deceived. All that the calculus of probability pretends to give, is *the result in the long run*, as it is called, and this really means in *an infinity of cases*. During any finite experience, however long, chances may be against us. Nevertheless the theory is the best guide we can have. If we always think and act according to its well-interpreted indications, we shall have the best chance of escaping error; and if all persons, throughout all time to come, obey the theory in like manner, they will undoubtedly thereby reap the greatest advantage.

No rule can be given for discriminating between coincidences which are casual and those which are the effects of law. By a fortuitous or casual coincidence, we mean an agreement between events, which nevertheless arise from wholly independent and different causes or conditions, and which will not always so agree. It is a fortuitous coincidence, if a penny thrown up repeatedly in various ways always falls on the same side; but it would not be fortuitous if there were any similarity in the motions of the hand, and the height of the throw, so as to cause or tend to cause a uniform result. Now among the infinitely numerous events, objects, or relations in the universe, it is quite likely that we shall occasionally notice casual coincidences. There are seven intervals in the octave, and there is nothing very improbable in the colours of the spectrum happening to be apparently divisible into the same or similar series of seven intervals. It is hardly yet decided whether this apparent coincidence, with which Newton was much struck, is well founded or not,[175] but the question will probably be decided in the negative.

[175] Newton’s *Opticks*, Bk. I., Part ii. Prop. 3; *Nature*, vol. i. p. 286.

It is certainly a casual coincidence which the ancients noticed between the seven vowels, the seven strings of the lyre, the seven Pleiades, and the seven chiefs at Thebes.[176] The accidents connected with the number seven have misled the human intellect throughout the historical period. Pythagoras imagined a connection between the seven planets and the seven intervals of the monochord. The alchemists were never tired of drawing inferences from the coincidence in numbers of the seven planets and the seven metals, not to speak of the seven days of the week.

[176] Aristotle’s *Metaphysics*, xiii. 6. 3.

A singular circumstance was pointed out concerning the dimensions of the earth, sun, and moon; the sun’s diameter was almost exactly 110 times as great as the earth’s diameter, while in almost exactly the same ratio the mean distance of the earth was greater than the sun’s diameter, and the mean distance of the moon from the earth was greater than the moon’s diameter. The agreement was so close that it might have proved more than casual, but its fortuitous character is now sufficiently shown by the fact, that the coincidence ceases to be remarkable when we adopt the amended dimensions of the planetary system.

A considerable number of the elements have atomic weights, which are apparently exact multiples of that of hydrogen. If this be not a law to be ultimately extended to all the elements, as supposed by Prout, it is a most remarkable coincidence. But, as I have observed, we have no means of absolutely discriminating accidental coincidences from those which imply a deep producing cause. A coincidence must either be very strong in itself, or it must be corroborated by some explanation or connection with other laws of nature. Little attention was ever given to the coincidence concerning the dimensions of the sun, earth, and moon, because it was not very strong in itself, and had no apparent connection with the principles of physical astronomy. Prout’s Law bears more probability because it would bring the constitution of the elements themselves in close connection with the atomic theory, representing them as built up out of a simpler substance.

In historical and social matters, coincidences are frequently pointed out which are due to chance, although there is always a strong popular tendency to regard them as the work of design, or as having some hidden meaning. If to 1794, the number of the year in which Robespierre fell, we add the sum of its digits, the result is 1815, the year in which Napoleon fell; the repetition of the process gives 1830 the year in which Charles the Tenth abdicated. Again, the French Chamber of Deputies, in 1830, consisted of 402 members, of whom 221 formed the party called “La queue de Robespierre,” while the remainder, 181 in number, were named “Les honnêtes gens.” If we give to each letter a numerical value corresponding to its place in the alphabet, it will be found that the sum of the values of the letters in each name exactly indicates the number of the party.
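These arithmetical coincidences are easily verified. The sketch below assumes the ordinary alphabet of twenty-six letters, counts a = 1 up to z = 26, and treats an accented letter as its plain form; the function names are illustrative.

```python
import unicodedata

def add_digit_sum(year):
    """Add to a year the sum of its own digits."""
    return year + sum(int(digit) for digit in str(year))

def letter_value_sum(name):
    """Sum of alphabet positions of the letters, ignoring spaces and accents."""
    # NFD splits an accented letter into the plain letter plus a combining
    # mark, and the mark is then skipped by the a-z test below.
    plain = unicodedata.normalize("NFD", name).lower()
    return sum(ord(ch) - ord("a") + 1 for ch in plain if "a" <= ch <= "z")

print(add_digit_sum(1794), add_digit_sum(1815))
print(letter_value_sum("La queue de Robespierre"))
print(letter_value_sum("Les honnêtes gens"))
```

The years 1815 and 1830 follow from 1794, and the two names yield 221 and 181, as stated.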

A number of such coincidences, often of a very curious character, might be adduced, and the probability against the occurrence of each is enormously great. They must be attributed to chance, because they cannot be shown to have the slightest connection with the general laws of nature; but persons are often found to be greatly influenced by such coincidences, regarding them as evidence of fatality, that is of a system of causation governing human affairs independently of the ordinary laws of nature. Let it be remembered that there are an infinite number of opportunities in life for some strange coincidence to present itself, so that it is quite to be expected that remarkable conjunctions will sometimes happen.
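The force of this remark may be shown by a simple computation. The sketch assumes that the opportunities are independent and that each has the same small chance of producing a coincidence; the figures of one in a million and ten million opportunities are hypothetical, chosen only to exhibit the effect.

```python
def at_least_one(chance, opportunities):
    """Chance that a coincidence occurs at least once among a number of
    independent opportunities, each with the same small chance."""
    return 1 - (1 - chance) ** opportunities

single = at_least_one(1e-6, 1)
many = at_least_one(1e-6, 10_000_000)
print(single, many)
```

An occurrence against which the odds are a million to one on any single occasion thus becomes all but certain to be met with somewhere, when the occasions are sufficiently multiplied.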

In all matters of judicial evidence, we must bear in mind the probable occurrence from time to time of unaccountable coincidences. The Roman jurists refused for this reason to invalidate a testamentary deed, the witnesses of which had sealed it with the same seal. For witnesses independently using their own seals might be found to possess identical ones by accident.[177] It is well known that circumstantial evidence of apparently overwhelming completeness will sometimes lead to a mistaken judgment, and as absolute certainty is never really attainable, every court must act upon probabilities of a high amount, and in a certain small proportion of cases they must almost of necessity condemn the innocent victims of a remarkable conjuncture of circumstances.[178] Popular judgments usually turn upon probabilities of far less amount, as when the palace of Nicomedia, and even the bedchamber of Diocletian, having been on fire twice within fifteen days, the people entirely refused to believe that it could be the result of accident. The Romans believed that there was fatality connected with the name of Sextus.

“Semper sub Sextis perdita Roma fuit.”

[177] Possunt autem omnes testes et uno annulo signare testamentum. Quid enim si septem annuli una sculptura fuerint, secundum quod Pomponio visum est?--*Justinian*, ii. tit. x. 5.

[178] See Wills on *Circumstantial Evidence*, p. 148.

The utmost precautions will not provide against all contingencies. To avoid errors in important calculations, it is usual to have them repeated by different computers; but a case is on record in which three computers made exactly the same calculations of the place of a star, and yet all did it wrong in precisely the same manner, for no apparent reason.[179]

[179] *Memoirs of the Royal Astronomical Society*, vol. iv. p. 290, quoted by Lardner, *Edinburgh Review*, July 1834, p. 278.

*Summary of the Theory of Inductive Inference.*

The theory of inductive inference stated in this and the previous chapters, was suggested by the study of the Inverse Method of Probability, but it also bears much resemblance to the so-called Deductive Method described by Mill, in his celebrated *System of Logic*. Mill’s views concerning the Deductive Method, probably form the most original and valuable part of his treatise, and I should have ascribed the doctrine entirely to him, had I not found that the opinions put forward in other parts of his work are entirely inconsistent with the theory here upheld. As this subject is the most important and difficult one with which we have to deal, I will try to remedy the imperfect manner in which I have treated it, by giving a recapitulation of the views adopted.

All inductive reasoning is but the inverse application of deductive reasoning. Being in possession of certain particular facts or events expressed in propositions, we imagine some more general proposition expressing the existence of a law or cause; and, deducing the particular results of that supposed general proposition, we observe whether they agree with the facts in question. Hypothesis is thus always employed, consciously or unconsciously. The sole condition to which we need conform in framing any hypothesis is, that we both have and exercise the power of inferring deductively from the hypothesis to the particular results, which are to be compared with the known facts. Thus there are but three steps in the process of induction:--

(1) Framing some hypothesis as to the character of the general law.

(2) Deducing consequences from that law.

(3) Observing whether the consequences agree with the particular facts under consideration.

In very simple cases of inverse reasoning, hypothesis may seem altogether needless. To take numbers again as a convenient illustration, I have only to look at the series,

1, 2, 4, 8, 16, 32, &c.,

to know at once that the general law is that of geometrical progression; I need no successive trial of various hypotheses, because I am familiar with the series, and have long since learnt from what general formula it proceeds. In the same way a mathematician becomes acquainted with the integrals of a number of common formulas, so that he need not go through any process of discovery. But it is none the less true that whenever previous reasoning does not furnish the knowledge, hypotheses must be framed and tried (p. 124).
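Where the law is not already familiar, the three steps of framing, deducing, and comparing may be carried out by successive trial. The following sketch applies them to the series above; the three candidate laws are hypothetical examples chosen for illustration, and the names are assumptions.

```python
observed = [1, 2, 4, 8, 16, 32]

# (1) Frame hypotheses: each gives the term in position n = 0, 1, 2, ...
hypotheses = {
    "arithmetical progression, 1 + n": lambda n: 1 + n,
    "triangular law, 1 + n(n + 1)/2": lambda n: 1 + n * (n + 1) // 2,
    "geometrical progression, 2**n": lambda n: 2 ** n,
}

def agrees(law, facts):
    """(2) Deduce the consequences of the law, and
    (3) observe whether they agree with the facts."""
    deduced = [law(n) for n in range(len(facts))]
    return deduced == facts

surviving = [name for name, law in hypotheses.items() if agrees(law, observed)]
print(surviving)
```

The first law fails at the third term and the second at the fourth, so that the geometrical progression alone survives the comparison.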

There naturally arise two cases, according as the nature of the subject admits of certain or only probable deductive reasoning. Certainty, indeed, is but a singular case of probability, and the general principles of procedure are always the same. Nevertheless, when certainty of inference is possible, the process is simplified. Of several mutually inconsistent hypotheses, the results of which can be certainly compared with fact, but one hypothesis can ultimately be entertained. Thus in the inverse logical problem, two logically distinct conditions could not yield the same series of possible combinations. Accordingly, in the case of two terms we had to choose one of six different kinds of propositions (p. 136), and in the case of three terms, our choice lay among 192 possible distinct hypotheses (p. 140). Natural laws, however, are often quantitative in character, and the possible hypotheses are then infinite in variety.

When deduction is certain, comparison with fact is needed only to assure ourselves that we have rightly selected the hypothetical conditions. The law establishes itself, and no number of particular verifications can add to its probability. Having once deduced from the principles of algebra that the difference of the squares of two numbers is equal to the product of their sum and difference, no number of particular trials of its truth will render it more certain. On the other hand, no finite number of particular verifications of a supposed law will render that law certain. In short, certainty belongs only to the deductive process, and to the teachings of direct intuition; and as the conditions of nature are not given by intuition, we can only be certain that we have got a correct hypothesis when, out of a limited number conceivably possible, we select that one which alone agrees with the facts to be explained.

In geometry and kindred branches of mathematics, deductive reasoning is conspicuously certain, and it would often seem as if the consideration of a single diagram yields us certain knowledge of a general proposition. But in reality all this certainty is of a purely hypothetical character. Doubtless if we could ascertain that a supposed circle was a true and perfect circle, we could be certain concerning a multitude of its geometrical properties. But geometrical figures are physical objects, and the senses can never assure us as to their exact forms. The figures really treated in Euclid’s *Elements* are imaginary, and we never can verify in practice the conclusions which we draw with certainty in inference; questions of degree and probability enter.

Passing now to subjects in which deduction is only probable, it ceases to be possible to adopt one hypothesis to the exclusion of the others. We must entertain at the same time all conceivable hypotheses, and regard each with the degree of esteem proportionate to its probability. We go through the same steps as before.

(1) We frame an hypothesis.

(2) We deduce the probability of various series of possible consequences.

(3) We compare the consequences with the particular facts, and observe the probability that such facts would happen under the hypothesis.

The above processes must be performed for every conceivable hypothesis, and then the absolute probability of each will be yielded by the principle of the inverse method (p. 242). As in the case of certainty we accept that hypothesis which certainly gives the required results, so now we accept as most probable that hypothesis which most probably gives the results; but we are obliged to entertain at the same time all other hypotheses with degrees of probability proportionate to the probabilities that they would give the same results.

So far we have treated only of the process by which we pass from special facts to general laws, that inverse application of deduction which constitutes induction. But the direct employment of deduction is often combined with the inverse. No sooner have we established a general law, than the mind rapidly draws particular consequences from it. In geometry we may almost seem to infer that *because* one equilateral triangle is equiangular, therefore another is so. In reality it is not because one is that another is, but because all are. The geometrical conditions are perfectly general, and by what is sometimes called *parity of reasoning* whatever is true of one equilateral triangle, so far as it is equilateral, is true of all equilateral triangles.

Similarly, in all other cases of inductive inference, where we seem to pass from some particular instances to a new instance, we go through the same process. We form an hypothesis as to the logical conditions under which the given instances might occur; we calculate inversely the probability of that hypothesis, and compounding this with the probability that a new instance would proceed from the same conditions, we gain the absolute probability of occurrence of the new instance in virtue of this hypothesis. But as several, or many, or even an infinite number of mutually inconsistent hypotheses may be possible, we must repeat the calculation for each such conceivable hypothesis, and then the complete probability of the future instance will be the sum of the separate probabilities. The complication of this process is often very much reduced in practice, owing to the fact that one hypothesis may be almost certainly true, and other hypotheses, though conceivable, may be so improbable as to be neglected without appreciable error.

When we possess no knowledge whatever of the conditions from which the events proceed, we may be unable to form any probable hypotheses as to their mode of origin. We have now to fall back upon the general solution of the problem effected by Laplace, which consists in admitting on an equal footing every conceivable ratio of favourable and unfavourable chances for the production of the event, and then accepting the aggregate result as the best which can be obtained. This solution is only to be accepted in the absence of all better means, but like other results of the calculus of probability, it comes to our aid where knowledge is at an end and ignorance begins, and it prevents us from over-estimating the knowledge we possess. The general results of the solution are in accordance with common sense, namely, that the more often an event has happened the more probable, as a general rule, is its subsequent recurrence. With the extension of experience this probability increases, but at the same time the probability is slight that events will long continue to happen as they have previously happened.

We have now pursued the theory of inductive inference, as far as can be done with regard to simple logical or numerical relations. The laws of nature deal with time and space, which are infinitely divisible. As we passed from pure logic to numerical logic, so we must now pass from questions of discontinuous, to questions of continuous quantity, encountering fresh considerations of much difficulty. Before, therefore, we consider how the great inductions and generalisations of physical science illustrate the views of inductive reasoning just explained, we must break off for a time, and review the means which we possess of measuring and comparing magnitudes of time, space, mass, force, momentum, energy, and the various manifestations of energy in motion, heat, electricity, chemical change, and the other phenomena of nature.

BOOK III.

METHODS OF MEASUREMENT.