Part 11
Minkowski—Mechanics, appendix, page 65 of paper (2). Planck—Verh. d. D. P. G. Vol. 4, 1906, p. 136.
Footnote 33:
Schütz, Gött. Nachr. 1897, p. 110.
Footnote 34:
Lienard, L’Eclairage électrique T. 16, 1896, p. 53. Wiechert, Ann. d. Physik, Vol. 4.
Footnote 35:
K. Schwarzschild, Gött. Nachr. 1903. H. A. Lorentz, Enzyklopädie der Math. Wissenschaften V. Art 14, p. 199.
The Foundation of the Generalised Theory of Relativity By A. Einstein. From Annalen der Physik 4.49.1916.
The theory sketched in the following pages forms the most far-reaching generalization conceivable of what is at present known as “the theory of Relativity”; I shall call the latter the “Special Relativity theory” to distinguish it from the former, and I suppose it to be known. The generalization of the Relativity theory has been made much easier through the form given to the special Relativity theory by Minkowski, the mathematician who was the first to recognize clearly the formal equivalence of the space-like and time-like co-ordinates, and who made use of it in the building up of the theory. The mathematical apparatus needed for the general relativity theory lay already complete in the “Absolute Differential Calculus,” which is based on the researches of Gauss, Riemann and Christoffel on non-Euclidean manifolds, and which has been shaped into a system by Ricci and Levi-Civita and already applied to problems of theoretical physics. In part B of this communication I have developed, in the simplest and clearest manner possible, all the requisite mathematical auxiliaries that are not likely to be known to physicists, so that a study of the mathematical literature is not necessary for an understanding of this paper. Finally, I here thank my friend Grossmann, by whose help I was not only spared the study of the mathematical literature pertinent to this subject, but who also aided me in the researches on the field equations of gravitation.
A. Principal considerations about the Postulate of Relativity.
§ 1. Remarks on the Special Relativity Theory.
The special relativity theory rests on the following postulate which also holds valid for the Galileo-Newtonian mechanics.
If a co-ordinate system K be so chosen that, when referred to it, the physical laws hold in their simplest forms, these laws would also be valid when referred to another system of co-ordinates K′ which is subjected to a uniform translational motion relative to K. We call this postulate “The Special Relativity Principle.” By the word special, it is signified that the principle is limited to the case when K′ has _uniform translatory_ motion with reference to K, and that the equivalence of K and K′ does not extend to the case of non-uniform motion of K′ relative to K.
The Special Relativity Theory does not differ from the classical mechanics through the assumption of this postulate, but only through the postulate of the constancy of light-velocity in vacuum which, when combined with the special relativity postulate, gives, in a well-known way, the relativity of simultaneity as well as the Lorentz transformation, with all the associated relations between moving rigid bodies and clocks.
The modification which the theory of space and time has undergone through the special relativity theory is indeed a profound one, but an important point remains untouched. According to the special relativity theory, the theorems of geometry are to be looked upon as the laws about the possible relative positions of solid bodies at rest, and more generally the theorems of kinematics as theorems which describe the relations between measuring bodies and clocks. Consider two material points of a solid body at rest; then according to these conceptions there corresponds to these points a wholly definite length, independent of the kind, position and orientation of the body, and of the time.
Similarly, let us consider two positions of the pointers of a clock which is at rest with reference to a co-ordinate system; then to these positions there always corresponds a time-interval of a definite length, independent of time and place. It will soon be shown that the general relativity theory cannot hold fast to this simple physical significance of space and time.
§ 2. About the reasons which explain the extension of the relativity-postulate.
To the classical mechanics, no less than to the special relativity theory, is attached an epistemological defect, which was perhaps first clearly pointed out by E. Mach. We shall illustrate it by the following example. Let two fluid bodies of equal kind and magnitude float freely in space at such a great distance from one another (and from all other masses) that only those gravitational forces are to be taken into account which the parts of any one of these bodies exert upon each other. The distance of the bodies from one another is invariable. No relative motion of the different parts of either body is to occur. But each mass, as judged by an observer at rest relative to the other mass, is to rotate about the line connecting the masses with a constant angular velocity (an ascertainable relative motion of the two masses). Now let us think that the surfaces of both the bodies (S₁ and S₂) are measured with the help of measuring rods (relatively at rest); it is then found that the surface of S₁ is a sphere and the surface of S₂ is an ellipsoid of rotation. We now ask: why is there this difference between the two bodies? An answer to this question can only be regarded as satisfactory from the epistemological standpoint when the thing adduced as the cause is an observable fact of experience. The law of causality has the sense of a definite statement about the world of experience only when observable facts alone appear as causes and effects.
The Newtonian mechanics does not give any satisfactory answer to this question. It says the following: the laws of mechanics hold true for a space R₁ relative to which the body S₁ is at rest, not however for a space R₂ relative to which S₂ is at rest.
The Galilean space R₁ which is here introduced is however only a purely imaginary cause, not an observable thing. It is thus clear that the Newtonian mechanics does not, in the case treated here, actually fulfil the requirements of causality, but only lulls the mind into a fictitious complacency, in that it makes a _wholly imaginary cause_ R₁ responsible for the different behaviours of the bodies S₁ and S₂, which are actually observable.
A satisfactory answer to the question put forward above can only be given thus: the physical system composed of S₁ and S₂ shows for itself alone no conceivable cause to which the different behaviour of S₁ and S₂ can be attributed. The cause must thus lie outside the system. We are therefore led to the conception that the general laws of motion, which determine in particular the forms of S₁ and S₂, must be of such a kind that the mechanical behaviour of S₁ and S₂ is essentially conditioned by the distant masses, which we had not brought into the system considered. These distant masses (and their relative motion as regards the bodies under consideration) are then to be looked upon as the seat of the principal observable causes for the different behaviours of the bodies under consideration. They take the place of the imaginary cause R₁. Among all the conceivable spaces R₁, R₂, etc., moving in any manner relative to one another, there is a priori none which may be regarded as privileged, if the objection already raised from the standpoint of the theory of knowledge is not to be revived. The laws of physics must be so constituted that they remain valid for any system of co-ordinates moving in any manner. We thus arrive at an extension of the relativity postulate.
Besides this momentous epistemological argument, there is also a well-known physical fact which speaks in favour of an extension of the relativity theory. Let there be a Galilean co-ordinate system K relative to which (at least in the four-dimensional region considered) a mass at a sufficient distance from other masses moves uniformly in a straight line. Let K′ be a second co-ordinate system which has a uniformly accelerated motion relative to K. Relative to K′ any mass at a sufficiently great distance from other masses experiences an accelerated motion such that its acceleration and the direction of acceleration are independent of its material composition and its physical condition.
Can any observer, at rest relative to K′, then conclude that he is in an actually accelerated reference-system? This is to be answered in the negative; the above-named behaviour of the freely moving masses relative to K′ can be explained in as good a manner in the following way. The reference-system K′ has no acceleration. In the space-time region considered there is a gravitation-field which generates the accelerated motion relative to K′.
This conception is feasible because experience has taught us of the existence of a field of force (namely the gravitation field) which possesses the remarkable property of imparting the same acceleration to all bodies. The mechanical behaviour of the bodies relative to K′ is the same as experience leads us to expect of them with reference to systems which we are accustomed to regard as stationary; thus, from the physical stand-point, it may be assumed that the systems K and K′ can both with the same legitimacy be taken as at rest, that is, they are equivalent as systems of reference for a description of physical phenomena.
From these discussions we see, that the working out of the general relativity theory must, at the same time, lead to a theory of gravitation; for we can “create” a gravitational field by a simple variation of the co-ordinate system. Also we see immediately that the principle of the constancy of light-velocity must be modified, for we recognise easily that the path of a ray of light with reference to K′ must be, in general, curved, when light travels with a definite and constant velocity in a straight line with reference to K.
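The curvature of the light path relative to K′ can be illustrated numerically. The following is only a minimal sketch, assuming a Newtonian substitution y′ = y − a t²/2 between K and K′ (adequate while a·t is small compared with the velocity of light); the function names and the choice of axes are illustrative assumptions, not part of the original text.

```python
import math

C = 299_792_458.0  # velocity of light in m/s


def ray_in_k(t):
    """Position (x, y) in the Galilean system K of a ray emitted along x at t = 0."""
    return (C * t, 0.0)


def ray_in_k_prime(t, a):
    """The same ray referred to K', assumed to accelerate along +y relative to K.

    Uses the Newtonian substitution y' = y - a t^2 / 2 (an assumption of this sketch).
    """
    x, y = ray_in_k(t)
    return (x, y - 0.5 * a * t * t)


def deflection(x_prime, a):
    """y' expressed as a function of x' along the ray: a parabola, not a straight line."""
    return -0.5 * a * (x_prime / C) ** 2
```

Relative to K the ray satisfies y = 0; relative to K′ the same ray satisfies y′ = −a x′² / (2c²), which is a curved path.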
§ 3. The time-space continuum. Requirements of the general Co-variance for the equations expressing the laws of Nature in general.
In the classical mechanics as well as in the special relativity theory, the co-ordinates of time and space have an immediate physical significance. When we say that any arbitrary point-event has _x₁_ as its X₁ co-ordinate, it signifies that the projection of the point-event on the X₁-axis, _ascertained_ by means of solid rods according to the rules of Euclidean Geometry, is reached when a definite measuring rod, the unit rod, is carried _x₁_ times from the origin of co-ordinates along the X₁ axis. A point-event having _x₄_ = _t_ as the X₄ co-ordinate signifies that a unit clock which is at rest relative to the system of co-ordinates, coincides in its spatial position with the point-event, and is set according to some definite standard, has gone over _x₄_ = _t_ periods before the occurrence of the point-event.
This conception of time and space is continually present in the mind of the physicist, though often in an unconscious way, as is clearly recognised from the role which this conception has played in physical measurements. This conception must also appear to the reader to be lying at the basis of the second consideration of the last paragraph and imparting a sense to these conceptions. But we wish to show that we are to abandon it and in general to replace it by more general conceptions in order to be able to work out thoroughly the postulate of general relativity,—the case of special relativity appearing as a limiting case when there is no gravitation.
We introduce in a space which is free from gravitation-field a Galilean co-ordinate system K (_x_, _y_, _z_, _t_) and also another system K′ (_x′_, _y′_, _z′_, _t′_) rotating uniformly relative to K. The origins of both systems as well as their _z_-axes are to coincide permanently. We will show that for a space-time measurement in the system K′, the above established rules for the physical significance of time and space cannot be maintained. On grounds of symmetry it is clear that a circle round the origin in the XY plane of K can also be looked upon as a circle in the plane (X′, Y′) of K′. Let us now think of measuring the circumference and the diameter of this circle with a unit measuring rod (infinitely small compared with the radius) and take the quotient of the two results of measurement. If this experiment be carried out with a measuring rod at rest relative to the Galilean system K, we would get π as the quotient. The result of measurement with a rod at rest relative to K′ would be a number which is greater than π. This can be seen easily when we regard the whole measurement-process from the system K and remember that the rod placed along the periphery suffers a Lorentz contraction, whereas the rod placed along the radius does not. Euclidean Geometry therefore does not hold for the system K′; the above fixed conceptions of co-ordinates, which assume the validity of Euclidean Geometry, fail with regard to the system K′. Similarly, we cannot introduce in K′ a time corresponding to physical requirements which would be shown by similarly prepared clocks at rest relative to the system K′. In order to see this, we suppose that two similarly made clocks are arranged, one at the origin and one at the periphery of the circle, and are considered from the stationary system K.
According to the well-known results of the special relativity theory it follows (as viewed from K) that the clock placed at the periphery will go slower than the clock at the origin, which is at rest. The observer at the common origin of co-ordinates, who is able to see the clock at the periphery by means of light, will see the clock at the periphery going slower than the clock beside him. Since he cannot allow the velocity of light to depend explicitly upon the time in the way under consideration, he will interpret his observation by saying that the clock on the periphery actually goes slower than the clock at the origin. He cannot therefore do otherwise than define time in such a way that the rate of going of a clock depends on its position.
We therefore arrive at this result. In the general relativity theory time and space magnitudes cannot be so defined that the difference in spatial co-ordinates can be immediately measured by the unit-measuring rod, and time-like co-ordinate difference with the aid of a normal clock.
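The two rotating-system results can be put into numbers. This is a hedged sketch under the usual special-relativistic reading of the argument: a rod laid along the periphery is shortened by the factor √(1 − v²/c²) as judged from K, with v = ωr, while a radial rod is not, and the clock at the periphery runs slow by the same factor. The function names and the parameters `omega` and `r` are assumptions made for illustration.

```python
import math

C = 299_792_458.0  # velocity of light in m/s


def lorentz_factor(v):
    """1 / sqrt(1 - v^2/c^2) for a speed v below the velocity of light."""
    if abs(v) >= C:
        raise ValueError("rim speed must stay below the velocity of light")
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def circumference_over_diameter(omega, r):
    """Quotient measured with rods at rest in K' for a circle of radius r.

    More of the contracted tangential rods fit along the periphery, so the
    measured circumference is 2*pi*r*gamma while the diameter stays 2*r.
    """
    return math.pi * lorentz_factor(omega * r)


def rim_clock_rate(omega, r):
    """Rate of the clock at the periphery relative to the clock at the origin."""
    return 1.0 / lorentz_factor(omega * r)
```

For ω = 0 the quotient reduces to π and both clocks agree; for any non-zero rim speed the quotient exceeds π and the peripheral clock rate falls below 1.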
The means hitherto at our disposal for placing our co-ordinate system in the time-space continuum in a definite way therefore completely fail, and it appears that there is no other way which will enable us to fit the co-ordinate system to the four-dimensional world in such a way that by it we can expect to get a specially simple formulation of the laws of Nature. Nothing therefore remains for us but to regard all conceivable co-ordinate systems as equally suitable for the description of natural phenomena. This amounts to the following law:
_That in general, Laws of Nature are expressed by means of equations which are valid for all co-ordinate systems, that is, which are covariant for all possible transformations._ It is clear that a physics which satisfies this postulate will be unobjectionable from the standpoint of the general relativity postulate, because among all substitutions there are, in every case, contained those which correspond to all relative motions of the co-ordinate system (in three dimensions). This condition of general covariance, which takes away the last remnants of physical objectivity from space and time, is a natural requirement, as seen from the following considerations. All our _well-substantiated_ space-time propositions amount to the determination of space-time coincidences. If, for example, events consisted only in the motion of material points, then nothing else would really be observable except the encounters between two or more of these material points. The results of our measurements are nothing else than well-proved theorems about such coincidences of material points of our measuring rods with other material points, and coincidences between the hands of a clock, dial-marks and point-events occurring at the same position and at the same time.
The introduction of a system of co-ordinates serves no other purpose than an easy description of the totality of such coincidences. We fit to the world our space-time variables (_x₁_ _x₂_ _x₃_ _x₄_) such that to any and every point-event corresponds a system of values of (_x₁_ _x₂_ _x₃_ _x₄_). Two coincident point-events correspond to the same values of the variables (_x₁_ _x₂_ _x₃_ _x₄_); _i.e._, the coincidence is characterised by the equality of the co-ordinates. If we now introduce any four functions (_x′₁_ _x′₂_ _x′₃_ _x′₄_) of the original variables as new co-ordinates, so that there is a unique correspondence between the two sets, the equality of all the four co-ordinates in the new system will still be the expression of the space-time coincidence of two material points. As the purpose of all physical laws is to allow us to remember such coincidences, there is a priori no reason to prefer a certain co-ordinate system to another; _i.e._, we get the condition of general covariance.
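The statement that coincidences survive any one-to-one change of co-ordinates can be checked on a toy case. The sketch below is illustrative only: the substitution `to_primed`, its inverse, and the two world-lines are hypothetical choices, picked merely because the substitution is invertible and non-linear.

```python
import math


def to_primed(p):
    """A hypothetical one-to-one, non-linear substitution (x1..x4) -> (x'1..x'4)."""
    x1, x2, x3, x4 = p
    return (x1 + 0.5 * x4 ** 2, x2 + math.sin(x1), 2.0 * x3, x4)


def to_unprimed(q):
    """Inverse of to_primed, showing that the correspondence is unique."""
    y1, y2, y3, y4 = q
    x4 = y4
    x1 = y1 - 0.5 * x4 ** 2
    x2 = y2 - math.sin(x1)
    return (x1, x2, y3 / 2.0, x4)


def worldline_a(s):
    """Material point A: moves along x1 as the time co-ordinate x4 = s advances."""
    return (s, 0.0, 0.0, s)


def worldline_b(s):
    """Material point B: sits at x1 = 1 and moves along x2."""
    return (1.0, s - 1.0, 0.0, s)


def coincide(p, q, tol=1e-12):
    """Two point-events coincide when all four co-ordinates are equal."""
    return all(abs(u - v) <= tol for u, v in zip(p, q))
```

The two points meet at s = 1 in the original variables, and the primed co-ordinates agree there as well; at any other s they differ in both systems.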
§ 4. Relation of four co-ordinates to spatial and time-like measurements.
_Analytical expression for the Gravitation-field._
I am not trying in this communication to deduce the general Relativity-theory as the simplest logical system possible, with a minimum of axioms. But it is my chief aim to develop the theory in such a manner that the reader perceives the psychological naturalness of the way proposed, and the fundamental assumptions appear to be most reasonable according to the light of experience. In this sense, we shall now introduce the following supposition; that for an infinitely small four-dimensional region, the relativity theory is valid in the special sense when the axes are suitably chosen.
The state of acceleration of the infinitely small (“local”) co-ordinate system is hereby to be so chosen that the gravitational field does not appear; this is possible for an infinitely small region. X₁, X₂, X₃ are the spatial co-ordinates; X₄ is the corresponding time-co-ordinate measured by some suitable measuring clock. These co-ordinates have, with a given orientation of the system, an immediate physical significance in the sense of the special relativity theory (when we take a rigid rod as our unit of measure). The expression
(1) $$ ds^2 = -dX_1^2 - dX_2^2 - dX_3^2 + dX_4^2 $$
has then, according to the special relativity theory, a value which may be obtained by space-time measurement, and which is independent of the orientation of the local co-ordinate system. Let us take _ds_ as the magnitude of the line-element belonging to two infinitely near points in the four-dimensional region. If _ds²_ belonging to the element (_d_X₁, _d_X₂, _d_X₃, _d_X₄) be positive we call it, with Minkowski, time-like, and in the contrary case space-like.
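Expression (1) and the time-like/space-like distinction can be sketched in a few lines. The example assumes units in which the velocity of light is 1, and uses a Lorentz transformation along X₁ as one instance of a change of orientation of the local system; the function names are illustrative, and the separate label for _ds²_ = 0 is an addition of this sketch, since the text names only the two cases.

```python
import math


def interval_squared(dX):
    """ds^2 = -dX1^2 - dX2^2 - dX3^2 + dX4^2, as in expression (1)."""
    dX1, dX2, dX3, dX4 = dX
    return -dX1 ** 2 - dX2 ** 2 - dX3 ** 2 + dX4 ** 2


def classify(dX):
    """Name the line-element according to the sign of ds^2."""
    s2 = interval_squared(dX)
    if s2 > 0:
        return "time-like"
    if s2 < 0:
        return "space-like"
    return "null"  # boundary case, labelled separately in this sketch only


def boost_x1(dX, beta):
    """Lorentz transformation along X1 with speed beta (in units of c, |beta| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    dX1, dX2, dX3, dX4 = dX
    return (gamma * (dX1 - beta * dX4), dX2, dX3, gamma * (dX4 - beta * dX1))
```

Applying `boost_x1` to any element leaves `interval_squared` unchanged up to rounding, which is the independence of orientation referred to above.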
To the line-element considered, _i.e._, to both the infinitely near point-events belong also definite differentials _dx₁_, _dx₂_, _dx₃_, _dx₄_, of the four-dimensional co-ordinates of any chosen system of reference. If there be also a local system of the above kind given for the case under consideration, _d_X’s would then be represented by definite linear homogeneous expressions of the form
(2) $$ dX_{\nu} = \sum_{\sigma} a_{\nu\sigma} \, dx_{\sigma} $$
If we substitute the expression in (1) we get
(3) $$ ds^2 = \sum_{\sigma\tau} g_{\sigma\tau} \, dx_{\sigma} \, dx_{\tau} $$
where _g__{στ} will be functions of _x__{σ}, but will no longer depend upon the orientation and motion of the ‘local’ co-ordinates; for _ds²_ is a definite magnitude belonging to two point-events infinitely near in space and time and can be got by measurements with rods and clocks. The _g__{τσ}’s are here to be so chosen, that _g__{τσ} = _g__{στ}; the summation is to be extended over all values of σ and τ, so that the sum is to be extended, over 4 × 4 terms, of which 12 are equal in pairs.
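Substituting (2) into (1) gives each coefficient of (3) as a sum over the local index, g_{στ} = Σ_ν η_ν a_{νσ} a_{ντ}, where η_ν = (−1, −1, −1, +1) are the signs appearing in (1). The sketch below checks this on an arbitrary example; the symbol η_ν, the function names and the sample coefficients are assumptions introduced for illustration.

```python
ETA = (-1, -1, -1, 1)  # signs of the four squares in expression (1)


def local_differentials(a, dx):
    """Equation (2): dX_nu = sum over sigma of a[nu][sigma] * dx[sigma]."""
    return [sum(a[nu][s] * dx[s] for s in range(4)) for nu in range(4)]


def metric_from_local_frame(a):
    """g[s][t] = sum over nu of ETA[nu] * a[nu][s] * a[nu][t]."""
    return [[sum(ETA[nu] * a[nu][s] * a[nu][t] for nu in range(4))
             for t in range(4)] for s in range(4)]


def line_element_squared(g, dx):
    """Equation (3): ds^2 = sum over sigma, tau of g[s][t] * dx[s] * dx[t]."""
    return sum(g[s][t] * dx[s] * dx[t] for s in range(4) for t in range(4))


def independent_components():
    """Index pairs (sigma, tau) with sigma <= tau; symmetry leaves 10 of the 16."""
    return [(s, t) for s in range(4) for t in range(s, 4)]
```

Because g is built symmetrically, the 12 off-diagonal terms of the 4 × 4 sum are equal in pairs, leaving 4 + 6 = 10 independent functions.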
From the method adopted here, the case of the usual relativity theory comes out when owing to the special behaviour of _g__{στ} in a _finite_ region it is possible to choose the system of co-ordinates in such a way that _g__{στ} assumes constant values—
(4) $$ \left\{ \begin{array}{rrrr} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & +1 \end{array} \right. $$
We shall see later that the choice of such a system of co-ordinates for a finite region is in general not possible.
From the considerations in § 2 and § 3 it is clear that, from the physical stand-point, the quantities _g__{στ} are to be looked upon as magnitudes which describe the gravitation-field with reference to the chosen system of axes. We assume firstly that, in a certain four-dimensional region considered, the special relativity theory is true for some particular choice of co-ordinates. The _g__{στ}’s then have the values given in (4). A free material point moves with reference to such a system uniformly in a straight line. If we now introduce, by any substitution, new space-time co-ordinates _x₁_..._x₄_, then in the new system the _g__{στ}’s are no longer constants, but functions of space and time. At the same time, the motion of a free point-mass in the new co-ordinates will appear as curvilinear and not uniform, and the law of this motion will be _independent of the nature of the moving mass-points_. We can thus interpret this motion as one under the influence of a gravitation field. We see that the appearance of a gravitation-field is connected with the space-time variability of the _g__{στ}’s. In the general case, we cannot by any suitable choice of axes make the special relativity theory valid throughout a finite region. We thus arrive at the conception that the _g__{στ}’s describe the gravitational field. According to the general relativity theory, gravitation thus plays an exceptional rôle as distinguished from the other forces, especially the electromagnetic forces, in as much as the 10 functions _g__{στ} representing gravitation define immediately the metrical properties of the four-dimensional region.
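A simple substitution shows how constant coefficients (4) turn into functions of the co-ordinates. The sketch below assumes the particular, hypothetical substitution X₁ = x₁ + ½ a x₄², X₂ = x₂, X₃ = x₃, X₄ = x₄ (velocity of light set to 1), which is only one of many possible choices; the acceleration parameter `A` and all function names are illustrative.

```python
A = 0.5  # hypothetical acceleration parameter of the substitution (c = 1)
ETA = (-1.0, -1.0, -1.0, 1.0)  # diagonal values from (4)


def jacobian(x, a=A):
    """Coefficients dX_nu/dx_sigma for X1 = x1 + a*x4^2/2, X2 = x2, X3 = x3, X4 = x4."""
    x4 = x[3]
    return [[1.0, 0.0, 0.0, a * x4],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def metric(x, a=A):
    """g[s][t] in the new co-ordinates, obtained by substituting into (1)."""
    J = jacobian(x, a)
    return [[sum(ETA[n] * J[n][s] * J[n][t] for n in range(4))
             for t in range(4)] for s in range(4)]


def free_particle_x1(x4, v, X0, a=A):
    """A uniform straight-line motion X1 = X0 + v*X4, rewritten in the new co-ordinates."""
    return X0 + v * x4 - 0.5 * a * x4 ** 2
```

In these co-ordinates g₄₄ = 1 − a²x₄² and g₁₄ = −a x₄ vary with x₄, and every free point acquires the same acceleration −a whatever its initial velocity; no mass enters the law of motion.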
B. Mathematical Auxiliaries for Establishing the General Covariant Equations.
We have seen before that the general relativity-postulate leads to the condition that the system of equations for Physics must be co-variant for any possible substitution of co-ordinates _x₁_, ... _x₄_; we have now to see how such general co-variant equations can be obtained. We shall now turn our attention to these purely mathematical propositions. It will be shown that in the solution, the invariant _ds_ given in equation (3) plays a fundamental rôle, which we, following Gauss’s Theory of Surfaces, style as the line-element.
The fundamental idea of the general co-variant theory is this: with reference to any co-ordinate system, let certain things (tensors) be defined by a number of functions of co-ordinates which are called the components of the tensor. There are now certain rules according to which the components can be calculated in a new system of co-ordinates, when these are known for the original system, and when the transformation connecting the two systems is known. The things henceforth designated as “Tensors” have further the property that the transformation equations of their components are linear and homogeneous, so that all the components in the new system vanish if they are all zero in the original system. Thus a law of Nature can be formulated by putting all the components of a tensor equal to zero, so that it is a general co-variant equation; thus while we seek the laws of formation of the tensors, we also reach the means of establishing general co-variant laws.
§ 5. Contra-variant and co-variant Four-vector.
Contra-variant Four-vector. The line-element is defined by the four components _dx__{ν}, whose transformation law is expressed by the equation
(5) $$ dx'_{\sigma} = \sum_{\nu} \frac{\partial x'_{\sigma}}{\partial x_{\nu}} dx_{\nu} $$
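Equation (5) can be tried out numerically for a given substitution. The sketch below is illustrative only: it approximates the partial derivatives ∂x′_σ/∂x_ν by central differences rather than computing them analytically, and the example substitution `accelerated` is a hypothetical choice.

```python
def jacobian(f, x, h=1e-6):
    """Approximate J[sigma][nu] = d x'_sigma / d x_nu by central differences."""
    n = len(x)
    columns = []
    for nu in range(n):
        forward = list(x)
        backward = list(x)
        forward[nu] += h
        backward[nu] -= h
        f_plus, f_minus = f(forward), f(backward)
        columns.append([(f_plus[s] - f_minus[s]) / (2.0 * h) for s in range(n)])
    # columns[nu][sigma] holds d x'_sigma / d x_nu; transpose to J[sigma][nu]
    return [[columns[nu][s] for nu in range(n)] for s in range(n)]


def transform_contravariant(f, x, dx):
    """Equation (5): dx'_sigma = sum over nu of (d x'_sigma / d x_nu) * dx_nu."""
    J = jacobian(f, x)
    n = len(x)
    return [sum(J[s][nu] * dx[nu] for nu in range(n)) for s in range(n)]


def accelerated(x):
    """Hypothetical example substitution: x'1 = x1 - 0.15 * x4^2, the rest unchanged."""
    x1, x2, x3, x4 = x
    return [x1 - 0.15 * x4 ** 2, x2, x3, x4]
```

Since the rule is linear and homogeneous in the components _dx__{ν}, a set of components that all vanish in one system also vanishes in every other, which is the property used above to build general co-variant laws.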