INTERNATIONAL SERIES IN PHYSICS
LEE A. DuBRIDGE, Consulting Editor
INTRODUCTION TO THEORETICAL PHYSICS
The quality of the materials used in the
manufacture of this book is governed by
continued postwar shortages.
INTERNATIONAL SERIES IN PHYSICS
LEE A. DuBRIDGE, Consulting Editor
Bacher and Goudsmit — Atomic Energy States
Bitter — Introduction to Ferromagnetism
Clark — Applied X-rays
Condon and Morse — Quantum Mechanics
Curtis — Electrical Measurements
Davey — Crystal Structure and Its Applications
Edwards — Analytic and Vector Mechanics
Eldridge — The Physical Basis of Things
Hardy and Perrin — The Principles of Optics
Harnwell — Principles of Electricity and Electro-
magnetism
Harnwell and Livingood — Experimental Atomic Physics
Houston — Principles of Mathematical Physics
Hughes and DuBridge — Photoelectric Phenomena
Hund — High-frequency Measurements
Hund — Phenomena in High-frequency Systems
Kemble — The Fundamental Principles of Quantum
Mechanics
Kennard — Kinetic Theory of Gases
Koller — The Physics of Electron Tubes
Morse — Vibration and Sound
Muskat — The Flow of Homogeneous Fluids through
Porous Media
Pauling and Goudsmit — The Structure of Line Spectra
Richtmyer and Kennard — Introduction to Modern
Physics
Ruark and Urey — Atoms, Molecules and Quanta
Seitz — The Modern Theory of Solids
Slater — Introduction to Chemical Physics
Slater — Microwave Transmission
Slater and Frank — Introduction to Theoretical
Physics
Smythe — Static and Dynamic Electricity
Stratton — Electromagnetic Theory
White — Introduction to Atomic Spectra
Williams — Magnetic Phenomena
Dr. F. K. Richtmyer was consulting editor of the series from its
inception in 1929 until his death in 1939.
INTRODUCTION TO
THEORETICAL PHYSICS
BY
JOHN C. SLATER, Ph.D.
Professor of Physics, Massachusetts Institute of Technology
AND
NATHANIEL H. FRANK, Sc.D.
Assistant Professor of Physics, Massachusetts
Institute of Technology
First Edition
Tenth Impression
McGRAW-HILL BOOK COMPANY, Inc.
NEW YORK AND LONDON
1933
COPYRIGHT, 1933, BY THE
McGRAW-HILL BOOK COMPANY, INC.
PRINTED IN THE UNITED STATES OF AMERICA
All rights reserved. This book, or
parts thereof, may not be reproduced
in any form without permission of
the publishers.
PREFACE
The general plan of a book is often clearer if one knows how it
came to be written. This book started from two separate
sources. First, it originated in a year's lecture course of the
same title, covering about the first two-thirds of the ground pre-
sented here, the part on classical physics. This course grew out
of the conviction that the teaching of theoretical physics in a
number of separate courses, as in mechanics, electromagnetic
theory, potential theory, thermodynamics, tends to keep a
student from seeing the unity of physics, and from appreciating
the importance of applying principles developed for one branch
of science to the problems of another. The second source of this
book was a projected volume on the structure of matter, dealing
principally with applications of modern atomic theory to the
structure of atoms, molecules, and solids, and to chemical
problems. As work progressed on this, it became evident that
the structure of matter could not be treated without a thorough
understanding of the principles of wave mechanics, and that
such an understanding demanded a careful grounding in classical
physics, in mechanics, wave motion, the theory of vibrating
systems, potential theory, statistical mechanics, where many
principles needed in the quantum theory are best introduced.
The ideal solution seemed to be to combine the two projects,
including the classical and the more modern parts of theoretical
physics in a coherent whole, thus further increasing the unity of
treatment of which we have spoken.
Two general principles have determined the order of presenting
the material: mathematical difficulty, and order of historical
development. Mechanics and problems of oscillations, involving
ordinary differential equations and simple vector analysis,
come first. Then follow vibrations and wave motion, intro-
ducing partial differential equations which can be solved by
separation of variables, and Fourier series. Hydrodynamics,
electromagnetic theory, and optics bring in more general partial
differential equations, potential theory, and differential vector
operations. Wave mechanics uses almost all the mathematical
machinery which has been developed in the earlier part of the
book. It is natural that the historical order is in general the
same as the order of increasing mathematical difficulty, for each
branch of physics as it develops builds on the foundation of
everything that has gone before. In cases where the two
arrangements do not coincide, we have grouped together subjects
of mathematical similarity, thus emphasizing the unity of which
we have spoken.
In a book of such wide scope, it is inevitable that many impor-
tant subjects are treated in a cursory manner. An effort has
been made to present enough of the groundwork of each subject so
that not only is further work facilitated, but also the position of
these subjects in a more general scheme of physical thought is
clearly shown. In spite of this, however, the student will of
course make much use of other references, and we give a list of
references, by no means exhaustive, but suggesting a few titles
in each field which a student who has mastered the material of
this book should be able to appreciate.
At the end of each chapter is a set of problems. The ability
to work problems, in our opinion, is essential to a proper under-
standing of physics, and it is hoped that these problems will
provide useful practice. At the same time, in many cases, the
problems have been used to extend and amplify the discussion of
the subject matter, where limitations of space made such dis-
cussion impossible in the text. The attempt has been made,
though we are conscious of having fallen far short of succeeding
in it, to carry each branch of the subject far enough so that
definite calculations can be made with it. Thus a far surer
mastery is attained than in a merely descriptive discussion.
Finally, we wish to remind the reader that the book is very
definitely one on theoretical physics. Though at times descrip-
tive material, and descriptions of experimental results, are
included, it is in general assumed that the reader has a fair
knowledge of experimental physics, of the grade generally covered
in intermediate college courses. No doubt it is unfortunate, in
view of the unity which we have stressed, to separate the theoretical
side of the subject from the experimental in this way. This
is particularly true when one remembers that the greatest difficulty
which the student has in mastering theoretical physics
comes in learning how to apply mathematics to a physical situa-
tion, how to formulate a problem mathematically, rather than in
solving the problem when it is once formulated. We have tried
wherever possible, in problems and text, to bridge the gap
between pure mathematics and experimental physics. But the
only satisfactory answer to this difficulty is a broad training in
which theoretical physics goes side by side with experimental
physics and practical laboratory work. The same ability to
overcome obstacles, the same ingenuity in devising one method
of procedure when another fails, the same physical intuition
leading one to perceive the answer to a problem through a mass
of intervening detail, the same critical judgment leading one to
distinguish right from wrong procedures, and to appraise results
carefully on the ground of physical plausibility, are required in
theoretical and in experimental physics. Leaks in vacuum sys-
tems or in electric circuits have their counterparts in the many
disastrous things that can happen to equations. And it is often
as hard to devise a mathematical system to deal with a difficult
problem, without unjustifiable approximations and impossible
complications, as it is to design apparatus for measuring a diffi-
cult quantity or detecting a new effect. These things cannot be
taught. They come only from that combination of inherent
insight and faithful practice which is necessary to the successful
physicist. But half the battle is over if the student approaches
theoretical physics, not as a set of mysterious formulas, or as a
dull routine to be learned, but as a collection of methods, of tools,
of apparatus, subject to the same sort of rules as other physical
apparatus, and yielding physical results of great importance.
The title of this book might have been aptly extended to "Intro-
duction to the Methods of Theoretical Physics," for the aim has
constantly been, not to teach a great collection of facts, but to
teach mastery of the tools by which the facts have been dis-
covered and by which future discoveries will be made.
In a subject about which so much has been written, it seems
hardly practicable to acknowledge our indebtedness to any
specific books. From many of those mentioned in the section
on suggested references, and from many others, we have received
ideas, though the material in general has been written without
conscious following of earlier models. We wish to express
thanks to several of our colleagues for suggestions, and partic-
ularly to Professors P. M. Morse and J. A. Stratton, who have
read the manuscript with much care and have contributed greatly
by their discussions.
Cambridge, Mass., J. C. S.
September, 1933. N. H. F.
CONTENTS
Page
Preface v
Chapter I
POWER SERIES
Introduction 1
1. Power Series 2
2. Small Quantities of Various Orders 3
3. Taylor's Expansion 4
4. The Binomial Theorem 4
5. Expansion about an Arbitrary Point 4
6. Expansion about a Pole 5
7. Convergence 5
Problems 8
Chapter II
POWER SERIES METHOD FOR DIFFERENTIAL EQUATIONS
Introduction 10
8. The Falling Body 11
9. Falling Body with Viscosity 11
10. Particular and General Solutions for Falling Body with
Viscosity 14
11. Electric Circuit Containing Resistance and Inductance 16
Problems 17
Chapter III
POWER SERIES AND EXPONENTIAL METHODS FOR SIMPLE
HARMONIC VIBRATIONS
Introduction. 19
12. Particle with Linear Restoring Force 19
13. Oscillating Electric Circuit 20
14. The Exponential Method of Solution 21
15. Complex Exponentials 22
16. Complex Numbers 23
17. Application of Complex Numbers to Vibration Problems 25
Problems 26
Chapter IV
DAMPED VIBRATIONS, FORCED VIBRATIONS, AND
RESONANCE
Introduction 27
18. Damped Vibrational Motion 27
19. Damped Electrical Oscillations 28
20. Initial Conditions for Transients. 29
21. Forced Vibrations and Resonance 29
22. Mechanical Resonance 30
23. Electrical Resonance 31
24. Superposition of Transient and Forced Motion 33
25. Motion under General External Forces 35
26. Generalizations Regarding Linear Differential Equations 36
Problems 37
Chapter V
ENERGY
Introduction 39
27. Mechanical Energy 40
28. Use of the Potential for Discussing the Motion of a System 42
29. The Rolling-ball Analogy 45
30. Motion in Several Dimensions. 46
Problems 46
Chapter VI
VECTOR FORCES AND POTENTIALS
Introduction. 48
31. Vectors and Their Components 48
32. Scalar Product of Two Vectors 49
33. Vector Product of Two Vectors 50
34. Vector Fields 51
35. The Energy Theorem in Three Dimensions 52
36. Line Integrals and Potential Energy 52
37. Force as Gradient of Potential 53
38. Equipotential Surfaces 54
39. The Curl and the Condition for a Conservative System 55
40. The Symbolic Vector V 55
Problems
Chapter VII
LAGRANGE'S EQUATIONS AND PLANETARY MOTION
Introduction 58
41. Lagrange's Equations 58
42. Planetary Motion 60
43. Energy Method for Radial Motion in Central Field ... 61
44. Orbits in Central Motion 62
45. Justification of Lagrange's Method 64
Problems
Chapter VIII
GENERALIZED MOMENTA AND HAMILTON'S EQUATIONS
Introduction 69
46. Generalized Forces 69
47. Generalized Momenta 70
48. Hamilton's Equations of Motion 71
49. General Proof of Hamilton's Equations 72
50. Example of Hamilton's Equations 74
51. Applications of Lagrange's and Hamilton's Equations ... 75
Problems. . . 76
Chapter IX
PHASE SPACE AND THE GENERAL MOTION OF PARTICLES
Introduction 79
52. The Phase Space 80
53. Phase Space for the Linear Oscillator 81
54. Phase Space for Central Motion 82
55. Noncentral Two-dimensional Motion 83
56. Configuration Space and Momentum Space 83
57. The Two-dimensional Oscillator 84
58. Methods of Solution 86
59. Contact Transformations and Angle Variables 87
60. Methods of Solution for Nonperiodic Motions 90
Problems 90
Chapter X
THE MOTION OF RIGID BODIES
Introduction 92
61. Elementary Theory of Precessing Top 92
62. Angular Momentum, Moment of Inertia, and Kinetic Energy 94
63. The Ellipsoid of Inertia; Principal Axes of Inertia .... 95
64. The Equations of Motion 96
65. Euler's Equations 98
66. Torque-free Motion of a Symmetric Rigid Body 98
67. Euler's Angles 100
68. General Motion of a Symmetrical Top under Gravity . . . 102
69. Precession and Nutation 104
Problems 105
Chapter XI
COUPLED SYSTEMS AND NORMAL COORDINATES
Introduction 107
70. Coupled Oscillators 107
71. Normal Coordinates 111
72. Relation of Problem of Coupled Systems to Two-dimen-
sional Oscillator 114
73. The General Problem of the Motion of Several Particles 117
Problems 118
Chapter XII
THE VIBRATING STRING, AND FOURIER SERIES
Introduction 120
74. Differential Equation of the Vibrating String 120
75. The Initial Conditions for the String 122
76. Fourier Series 123
77. Coefficients of Fourier Series 124
78. Convergence of Fourier Series 125
79. Sine and Cosine Series, with Application to the String 126
80. The String as a Limiting Problem of Vibration of Particles 128
81. Lagrange's Equations for the Weighted String 131
82. Continuous String as Limiting Case 131
Problems 132
Chapter XIII
NORMAL COORDINATES AND THE VIBRATING STRING
Introduction 134
83. Normal Coordinates 134
84. Normal Coordinates and Function Space 137
85. Fourier Analysis in Function Space 139
86. Equations of Motion in Normal Coordinates 140
87. The Vibrating String with Friction 142
Problems 144
Chapter XIV
THE STRING WITH VARIABLE TENSION AND DENSITY
Introduction 146
88. Differential Equation for the Variable String 146
89. Approximate Solution for Slowly Changing Density and
Tension 147
90. Progressive Waves and Standing Waves 149
91. Orthogonality of Normal Functions 151
92. Expansion of an Arbitrary Function Using Normal Func-
tions. 152
93. Perturbation Theory 154
94. Reflection of Waves from a Discontinuity 156
Problems 158
Chapter XV
THE VIBRATING MEMBRANE
Introduction 160
95. Boundary Conditions on the Rectangular Membrane . . . 160
96. The Nodes in a Vibrating Membrane 162
97. Initial Conditions 162
98. The Method of Separation of Variables 163
99. The Circular Membrane 164
100. The Laplacian in Polar Coordinates. 164
101. Solution of the Differential Equation by Separation. . . 165
102. Boundary Conditions . 166
103. Physical Nature of the Solution 167
104. Initial Condition at t = 0 168
105. Proof of Orthogonality of the J's 169
Problems. . 170
Chapter XVI
STRESSES, STRAINS, AND VIBRATIONS OF AN ELASTIC SOLID
Introduction 172
106. Stresses, Body and Surface Forces 172
107. Examples of Stresses. 174
108. The Equation of Motion 175
109. Transverse Waves 176
110. Longitudinal Waves 178
111. General Wave Propagation 179
112. Strains and Hooke's Law 180
113. Young's Modulus 182
Problems 183
Chapter XVII
FLOW OF FLUIDS
Introduction 185
114. Velocity, Flux Density, and Lines of Flow 185
115. The Equation of Continuity 186
116. Gauss's Theorem 187
117. Lines of Flow to Measure Rate of Flow 188
118. Irrotational Flow and the Velocity Potential 188
119. Euler's Equations of Motion for Ideal Fluids 190
120. Irrotational Flow and Bernoulli's Equation 191
121. Viscous Fluids 192
122. Poiseuille's Law 194
Problems 195
Chapter XVIII
HEAT FLOW
Introduction 197
123. Differential Equation of Heat Flow 197
124. The Steady Flow of Heat 198
125. Flow Vectors in Generalized Coordinates 199
126. Gradient in Generalized Coordinates. 200
127. Divergence in Generalized Coordinates 200
128. Laplacian 201
129. Steady Flow of Heat in a Sphere 201
130. Spherical Harmonics 202
131. Fourier's Method for the Transient Flow of Heat .... 203
132. Integral Method for Heat Flow 205
Problems 209
Chapter XIX
ELECTROSTATICS, GREEN'S THEOREM, AND POTENTIAL
THEORY
Introduction 210
133. The Divergence of the Field . . 210
134. The Potential 211
135. Electrostatic Problems without Conductors 212
136. Electrostatic Problems with Conductors 215
137. Green's Theorem 217
138. Proof of Solution of Poisson's Equation 217
139. Solution of Poisson's Equation in a Finite Region 220
140. Green's Distribution. 221
141. Green's Method of Solving Differential Equations. . . . 222
Problems 223
Chapter XX
MAGNETIC FIELDS, STOKES'S THEOREM, AND VECTOR
POTENTIAL
Introduction 225
142. The Magnetic Field of Currents 226
143. Field of a Straight Wire 228
144. Stokes's Theorem 229
145. The Curl in Curvilinear Coordinates 229
146. Applications of Stokes's Theorem 230
147. Example: Magnetic Field in a Solenoid 231
148. The Vector Potential 231
149. The Biot-Savart Law 233
Problems 234
Chapter XXI
ELECTROMAGNETIC INDUCTION AND MAXWELL'S
EQUATIONS
Introduction 235
150. The Differential Equation for Electromagnetic Induction 235
151. The Displacement Current 236
152. Maxwell's Equations 239
153. The Vector and Scalar Potentials 241
Problems 244
Chapter XXII
ENERGY IN THE ELECTROMAGNETIC FIELD
Introduction 246
154. Energy in a Condenser 246
155. Energy in the Electric Field 247
156. Energy in a Solenoid 248
157. Energy Density and Energy Flow 249
158. Poynting's Theorem 250
159. The Nature of an E.M.F 250
160. Examples of Poynting's Vector 251
161. Energy in a Plane Wave . 253
162. Plane Waves in Metals 255
Problems 256
Chapter XXIII
REFLECTION AND REFRACTION OF ELECTROMAGNETIC
WAVES
Introduction 258
163. Boundary Conditions at a Surface of Discontinuity 258
164. The Laws of Reflection and Refraction 259
165. Reflection Coefficient at Normal Incidence 260
166. Fresnel's Equations 262
167. The Polarizing Angle 264
168. Total Reflection 265
169. The Optical Behavior of Metals 267
Problems 268
Chapter XXIV
ELECTRON THEORY AND DISPERSION
Introduction 270
170. Polarization and Dielectric Constant 271
171. The Relations of P, E, and D 273
172. Polarizability and Dielectric Constant of Gases 275
173. Dispersion in Gases 275
174. Dispersion of Solids and Liquids 278
175. Dispersion of Metals 280
Problems 283
Chapter XXV
SPHERICAL ELECTROMAGNETIC WAVES
Introduction. 286
176. Spherical Solutions of the Wave Equation 286
177. Scalar Potential for Oscillating Dipole 288
178. Vector Potential. 289
179. The Fields. 290
180. The Hertz Vector 291
181. Intensity of Radiation from a Dipole 293
182. Scattering of Light 293
183. Polarization of Scattered Light 295
184. Coherence and Incoherence of Light 295
185. Coherence and the Spectrum 298
186. Coherence of Different Sources 299
Problems 299
Chapter XXVI
HUYGENS' PRINCIPLE AND GREEN'S THEOREM
Introduction 302
187. The Retarded Potentials 303
188. Mathematical Formulation of Huygens' Principle 305
189. Application to Optics 307
190. Integration for a Spherical Surface by Fresnel's Zones 308
191. The Use of Huygens' Principle 310
192. Huygens' Principle for Diffraction Problems 310
193. Qualitative Discussion of Diffraction, Using Fresnel's
Zones 311
Problems 314
Chapter XXVII
FRESNEL AND FRAUNHOFER DIFFRACTION
Introduction 315
194. Comparison of Fresnel and Fraunhofer Diffraction 315
195. Fresnel Diffraction from a Slit 319
196. Cornu's Spiral 320
197. Fraunhofer Diffraction from Rectangular Slit 323
198. The Circular Aperture 324
199. Resolving Power of a Lens 325
200. Diffraction from Several Slits; the Diffraction Grating 326
Problems 328
Chapter XXVIII
WAVES, RAYS, AND WAVE MECHANICS
Introduction 329
201. The Quantum Hypothesis 330
202. The Statistical Interpretation of Wave Theory 332
203. The Uncertainty Principle for Optics 333
204. Wave Mechanics 335
205. Frequency and Wave Length in Wave Mechanics 337
206. Wave Packets and the Uncertainty Principle 337
207. Fermat's Principle 339
208. The Motion of Particles and the Principle of Least Action 342
Problems. 343
Chapter XXIX
SCHRÖDINGER'S EQUATION IN ONE DIMENSION
Introduction. 345
209. Schrödinger's Equation 345
210. One-dimensional Motion in Wave Mechanics 346
211. Boundary Conditions in One-dimensional Motion 350
212. The Penetration of Barriers 351
213. Motion in a Finite Region, and the Quantum Condition . . 353
214. Motion in Two or More Finite Regions 355
Problems 356
Chapter XXX
THE CORRESPONDENCE PRINCIPLE AND STATISTICAL
MECHANICS
Introduction 358
215. The Quantum Condition in the Phase Space 358
216. Angle Variables and the Correspondence Principle 359
217. The Quantum Condition for Several Degrees of Freedom 361
218. Classical Statistical Mechanics in the Phase Space .... 364
219. Liouville's Theorem 365
220. Distributions Independent of Time 366
221. The Microcanonical Ensemble 367
222. The Canonical Ensemble 368
223. The Quantum Theory and the Phase Space 369
Problems. 371
Chapter XXXI
MATRICES
Introduction 374
224. Mean Value of a Function of Coordinates 374
225. Physical Meaning of Matrix Components 375
226. Initial Conditions, and Determination of c's 377
227. Mean Values of Functions of Momenta 379
228. Schrödinger's Equation Including the Time 381
229. Some Theorems Regarding Matrices 382
Problems ... 384
Chapter XXXII
PERTURBATION THEORY
Introduction 386
230. The Secular Equation of Perturbation Theory 386
231. The Power Series Solution 387
232. Perturbation Theory for Degenerate Systems 390
233. The Method of Variation of Constants 391
234. External Radiation Field 392
235. Einstein's Probability Coefficients 393
236. Method of Deriving the Probability Coefficients. .... 395
237. Application of Perturbation Theory 396
238. Spontaneous Radiation and Coupled Systems 399
239. Applications of Coupled Systems to Radioactivity and
Electronic Collisions 402
Problems 404
Chapter XXXIII
THE HYDROGEN ATOM AND THE CENTRAL FIELD
Introduction. . . 406
240. The Atom and Its Nucleus 406
241. The Structure of Hydrogen 407
242. Discussion of the Function of r for Hydrogen 410
243. The Angular Momentum 414
244. Series and Selection Principles 416
245. The General Central Field 418
Problems 423
Chapter XXXIV
ATOMIC STRUCTURE
Introduction. . 425
246. The Periodic Table 426
247. The Method of Self-consistent Fields 430
248. Effective Nuclear Charges 431
249. The Many-body Problem in Wave Mechanics 432
250. Schrödinger's Equation and Effective Nuclear Charges 433
251. Ionization Potentials and One-electron Energies 435
Problems 437
Chapter XXXV
INTERATOMIC FORCES AND MOLECULAR STRUCTURE
Introduction 439
252. Ionic Forces 439
253. Polarization Force. . 439
254. Van der Waals' Force 440
255. Penetration or Coulomb Force 442
256. Valence Attraction 442
257. Atomic Repulsions 444
258. Analytical Formulas for Valence and Repulsive Forces. . 444
259. Types of Substances: Valence Compounds 447
260. Metals 449
261. Ionic Compounds 449
Problems 451
Chapter XXXVI
EQUATION OF STATE OF GASES
Introduction. 454
262. Gases, Liquids, and Solids 454
263. The Canonical Ensemble 456
264. The Free Energy 458
265. Properties of Perfect Gases on Classical Theory 461
266. Properties of Imperfect Gases on Classical Theory . . . 462
267. Van der Waals' Equation 464
268. Quantum Statistics 466
269. Quantum Theory of the Perfect Gas 468
Problems 470
Chapter XXXVII
NUCLEAR VIBRATIONS IN MOLECULES AND SOLIDS
Introduction 471
270. The Crystal at Absolute Zero 472
271. Temperature Vibrations of a Crystal 474
272. Equation of State of Solids 478
273. Vibrations of Molecules 480
274. Diatomic Molecules 481
275. Specific Heat of Diatomic Molecules 483
276. Polyatomic Molecules 485
Problems 486
Chapter XXXVIII
COLLISIONS AND CHEMICAL REACTIONS
Introduction 488
277. Chemical Reactions 488
278. Collisions with Electronic Excitation 491
279. Electronic and Nuclear Energy in Metals 494
280. Perturbation Method for Interaction of Nuclei 497
Problems 499
Chapter XXXIX
ELECTRONIC INTERACTIONS
Introduction 501
281. The Exclusion Principle 502
282. Results of Antisymmetry of Wave Functions ....... 506
283. The Electron Spin 507
284. Electron Spins and Multiplicity of Levels 509
285. Multiplicity and the Exclusion Principle . 510
286. Spin Degeneracy for Two Electrons 512
287. Effect of Exclusion Principle and Spin 514
Problems 516
Chapter XL
ELECTRONIC ENERGY OF ATOMS AND MOLECULES
Introduction 518
288. Atomic Energy Levels 518
289. Spin and Orbital Degeneracy in Atomic Multiplets .... 520
290. Energy Levels of Diatomic Molecules 522
291. Heitler and London Method for H 2 523
292. The Method of Molecular Orbitals 527
Problems 530
Chapter XLI
FERMI STATISTICS AND METALLIC STRUCTURE
Introduction 531
293. The Exclusion Principle for Free Electrons 531
294. Maximum Kinetic Energy and Density of Electrons 534
295. The Fermi-Thomas Atomic Model 535
296. Electrons in Metals 536
297. The Fermi Distribution 540
Problems 543
Chapter XLII
DISPERSION, DIELECTRICS, AND MAGNETISM
Introduction 545
298. Dispersion and Dispersion Electrons 546
299. Quantum Theory of Dispersion . 548
300. Polarizability 549
301. Van der Waals' Force 551
302. Types of Dielectrics 553
303. Theory of Dipole Orientation 554
304. Magnetic Substances 556
Problems 558
Suggested References. 561
Index 565
INTRODUCTION
TO
THEORETICAL PHYSICS
CHAPTER I
POWER SERIES
The first result of a physical experiment is ordinarily a table
of values, one column containing values of an independent
variable, another of a dependent variable. In mechanics, the
independent variable is ordinarily the time, the dependent
variable the displacement. In thermodynamics, we may have
two independent variables, as volume and temperature, and one
dependent variable, the pressure. With electric currents, we
may have the current flowing in some part of the circuit as
dependent variable, the electromotive force applied as inde-
pendent variable, as when in a vacuum tube we measure plate
current as function of grid voltage. In electromagnetic theory,
the electric or magnetic field strength, the dependent variable,
is a function of four independent variables, the three coordinates
of space, and time.
The relation between independent and dependent variable can
be given by a table of values, by drawing a graph, or analytically
by approximating the results by a mathematical formula. The
last method is by far the most powerful, particularly if further
calculations must be made using the experimental results, so that
we are led to the study of mathematical functions. There are a
good many well-known functions; for example, the algebraic
functions, as $ax + bx^2$; the trigonometric functions, as $\sin(ax + b)$;
exponential functions, as $ae^{-bx}$; and rarer things like Bessel's
functions, $J_n(x)$. It may be that, by inspection of the results,
or for some theoretical reason, we may decide that some such
well-known function can be used to describe our experimental
data within the experimental error. But in actual physical
problems, we meet many functions which are not included among
these well-known forms. The question presents itself, can we
not get some general method of describing functions analytically,
equally applicable to familiar and unfamiliar functions?
1. Power Series.— Power series present one such general
method, on the whole the most useful one. The simplest form
of power series is $A_0 + A_1 x + A_2 x^2 + \cdots$, where the $A$'s
are arbitrary coefficients. By giving these coefficients suitable
values, we can make the series approach any desired function
as closely as we please, with some exceptions as we shall note
below. As examples of common series, we have first the polynomials
(in which all $A_n$'s after a certain $n$ are zero); and then
many familiar infinite series, as
$$(1 + x)^n = 1 + nx + \frac{n(n-1)}{2!}x^2 + \frac{n(n-1)(n-2)}{3!}x^3 + \cdots, \tag{1}$$
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots, \tag{2}$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots, \tag{3}$$
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots. \tag{4}$$
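As a rough numerical check on series (2) to (4), the following Python sketch sums the first few terms of each and compares the result with the library functions. The function names and the sample value x = 0.5 are illustrative assumptions, not part of the text.

```python
import math

def exp_series(x, n_terms):
    """Partial sum of Eq. (2): 1 + x + x^2/2! + x^3/3! + ..."""
    return sum(x**k / math.factorial(k) for k in range(n_terms))

def cos_series(x, n_terms):
    """Partial sum of Eq. (3): 1 - x^2/2! + x^4/4! - ..."""
    return sum((-1)**k * x**(2 * k) / math.factorial(2 * k)
               for k in range(n_terms))

def sin_series(x, n_terms):
    """Partial sum of Eq. (4): x - x^3/3! + x^5/5! - ..."""
    return sum((-1)**k * x**(2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n_terms))

x = 0.5  # illustrative sample point
for n_terms in (2, 4, 6):
    # Print the absolute error of each truncated series.
    print(n_terms,
          abs(exp_series(x, n_terms) - math.exp(x)),
          abs(cos_series(x, n_terms) - math.cos(x)),
          abs(sin_series(x, n_terms) - math.sin(x)))
```

For a small argument such as this one the errors shrink quickly as terms are added, which is the behavior the text relies on when it keeps only a few terms.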
In fitting an experimental table of values, it is generally true
that we cannot use one of these well-known series. We must
determine coefficients to fit the data. A familiar process is that
in which we know beforehand that the graph of the function
should be a straight line. Then, either by actually plotting and
estimating by means of a ruler, or by using least squares, we
find the two constants of the linear relation y = a + bx. If
the graph is slightly curved, we may be able to determine the
constants of a parabola $y = a + bx + cx^2$ to fit it approximately.
More complicated curves can be approximated by taking more
terms. It is plain that, if there are n points determined experi-
mentally, we can find a polynomial containing n coefficients which
will just pass through them. But this is hardly a sportsmanlike
thing to do, and generally we look for a function containing
far fewer constants than the number of points we wish to fit.
In other words, in practice, rather than using infinite series, we
are accustomed to use only the first few terms of such a series.
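The least-squares determination of the two constants of the linear relation y = a + bx, mentioned above, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `fit_line` and the data values are hypothetical, made up only to show the procedure.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x to the points (xs, ys); returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx             # slope
    a = mean_y - b * mean_x   # intercept
    return a, b

# Hypothetical "experimental" table: roughly y = 1 + 2x with small errors.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.02, 2.97, 5.01, 7.03, 8.98]
a, b = fit_line(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Two constants are fitted to five points here, in keeping with the remark that one generally wants far fewer constants than data points.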
2. Small Quantities of Various Orders. — The general justifica-
tion of this method of using only a finite part of a series comes
from considering small quantities of various orders, as they are
called. A power series is practically useful only if it converges
rather rapidly; that is, if each term is decidedly smaller than the
one before it. If we imagine that a physical relation is really
expressed by a rapidly converging infinite series, then the sum
of all the terms after a certain one will be smaller than the
inevitable errors of experiment, and may be neglected, leaving
only a polynomial. Suppose, for instance, that the linear dimension
$d$ of a solid under pressure, expressed as a function of the
pressure $p$, is given exactly by a series $d = d_0 - ap + bp^2 - \cdots$.
For small pressures, the change of length $ap$ will be
small compared with $d_0$, and the second-order term $bp^2$ will be
in turn small compared with $ap$ (though of course this will not
be true for much higher pressures, since $ap$ will increase, and
$bp^2$ will increase even more). We express this by saying that
$ap$ is a small quantity of the first order, $bp^2$ a small quantity of
the second order. It may well be that the second-order quantities
are so small that we can neglect them, so that approximately
$d = d_0 - ap$. Now if we are interested in finding the way in
which the volume, proportional to $d^3$, changes with pressure,
we have accurately
$$d^3 = d_0^3 - 3d_0^2\,ap + (3d_0 a^2 + 3d_0^2 b)p^2 + \cdots. \tag{5}$$
But we are assuming that $ap$ is small compared with $d_0$, and
$bp^2$ is small compared with $ap$, for all pressures used. Thus we
readily see that the term in $p^2$ in this final expression (5) is
small compared with the term in $p$, and can be neglected in
comparison with the leading term $d_0^3$, so that in $d^3$, as in $d$, we
can neglect the second order of small quantities. We could
then have started with the abbreviated expression $d = d_0 - ap$,
and have obtained the same result for $d^3$, to the first order.
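The size of the neglected terms can be checked numerically. The sketch below treats d = d0 - a p + b p^2 as the exact relation and compares d^3 with its first-order and second-order truncations from Eq. (5). The coefficient values are hypothetical, chosen only so that a p and b p^2 are small at low pressure.

```python
def d_of_p(p, d0, a, b):
    """Linear dimension d = d0 - a*p + b*p**2, taken here as the exact relation."""
    return d0 - a * p + b * p**2

def d_cubed_first_order(p, d0, a):
    """d**3 keeping only small quantities of the first order."""
    return d0**3 - 3 * d0**2 * a * p

def d_cubed_second_order(p, d0, a, b):
    """d**3 through the second order, as in Eq. (5)."""
    return d0**3 - 3 * d0**2 * a * p + (3 * d0 * a**2 + 3 * d0**2 * b) * p**2

# Hypothetical coefficients, for illustration only.
D0, A, B = 1.0, 1.0e-3, 1.0e-6

for p in (1.0, 10.0, 100.0):
    exact = d_of_p(p, D0, A, B) ** 3
    err1 = abs(exact - d_cubed_first_order(p, D0, A))
    err2 = abs(exact - d_cubed_second_order(p, D0, A, B))
    print(f"p = {p:6.1f}  first-order error = {err1:.2e}  "
          f"second-order error = {err2:.2e}")
```

With these assumed numbers the first-order form is adequate at low pressure, while at the highest pressure the second-order terms are no longer negligible, which matches the caution about higher pressures.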
This method of cutting off infinite series at definite places,
retaining only terms of a certain order, is very commonly used,
and often is the only thing that simplifies computations with
series enough to make them practically possible. But we must
notice that the justification depends entirely on the physical
situation, and can be different in different cases. Thus if we
had to consider higher pressures in our problem above, we should
have to retain the second-order terms, but perhaps could neglect
third-order ones. One must always use good physical judgment
in neglecting small quantities. Now, of course, in many cases
we do not need to neglect high powers at all. The problems
which we meet will often have simple enough relations between
the coefficients of the successive terms so that we can write
down as many terms as we please, without trouble, as we can
with the binomial or exponential series. But it always pays
to inquire, if the high terms of the series get too complicated
to work with successfully, whether they can be neglected.
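The effect of dropping second-order terms can be checked numerically. The following is a minimal Python sketch, assuming purely illustrative values $d_0 = 1$, $a = 10^{-3}$, $b = 10^{-6}$ (these numbers are hypothetical and do not come from the text); it compares the cube of the full expression for $d$ with the first-order expression $d_0^3 - 3d_0^2ap$.

```python
# Hypothetical coefficients, chosen only for illustration (not from the text).
d0, a, b = 1.0, 1.0e-3, 1.0e-6


def d_exact(p):
    """Length including the second-order term: d = d0 - a*p + b*p**2."""
    return d0 - a * p + b * p**2


def volume_first_order(p):
    """d**3 kept to first order in p: d0**3 - 3*d0**2*a*p."""
    return d0**3 - 3 * d0**2 * a * p


for p in (0.1, 1.0, 10.0, 100.0):
    exact = d_exact(p) ** 3
    approx = volume_first_order(p)
    print(f"p={p:6.1f}  exact={exact:.8f}  first-order={approx:.8f}  "
          f"error={exact - approx:.2e}")
```

With these assumed coefficients the discrepancy grows roughly as $6 \times 10^{-6}p^2$: negligible at small $p$, but no longer so at $p = 100$, which is the kind of physical judgment the paragraph above calls for.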
3. Taylor's Expansion. — We have been speaking of series
representing functions obtained from experiment, or about which
we do not have much information. But it may be that we have
to work with a function whose analytical properties we know,
and in that case there is a standard method of finding its series
expansion, known as Taylor's theorem. This is as follows:
$$f(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \frac{f'''(0)}{3!}x^3 + \cdots, \qquad (6)$$
where $f(x)$ is the function of $x$, $f(0)$ means the value of the function
when $x = 0$, $f'(0)$ is the first derivative for $x = 0$, and so on,
so that $f(x) = A_0 + A_1x + A_2x^2 + \cdots$, where $A_n = f^{(n)}(0)/n!$.
To justify this, we need only differentiate $n$ times, obtaining
very easily
$$f^{(n)}(x) = n(n-1)\cdots(2)(1)A_n + (n+1)(n)\cdots(2)A_{n+1}x + (n+2)(n+1)\cdots(3)A_{n+2}x^2 + \cdots$$
$$= n!\,A_n + \frac{(n+1)!}{1!}A_{n+1}x + \cdots.$$
If now we let $x = 0$, all terms but the first vanish, so that we have
$f^{(n)}(0) = n!\,A_n$, or $A_n = f^{(n)}(0)/n!$.
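As a small numerical illustration of the rule $A_n = f^{(n)}(0)/n!$, the sketch below sums a Taylor series from a list of derivatives at the origin. The function name `taylor_partial_sum` and the choice of $f(x) = e^x$ (all of whose derivatives at $x = 0$ equal 1) are assumptions made for the example, not part of the text.

```python
import math


def taylor_partial_sum(derivs_at_0, x):
    """Sum A_n * x**n with A_n = f^(n)(0) / n!, given the derivatives at 0."""
    return sum(d / math.factorial(n) * x**n for n, d in enumerate(derivs_at_0))


# For f(x) = exp(x), every derivative at x = 0 equals 1.
x = 0.5
for terms in (2, 4, 8):
    approx = taylor_partial_sum([1.0] * terms, x)
    print(f"{terms} terms: sum={approx:.10f}  error={math.exp(x) - approx:.2e}")
```

Each added term reduces the error, as expected for a rapidly converging series.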
4. The Binomial Theorem. — As an illustration of Taylor's
expansion, we prove the binomial theorem, the expansion of
$(1 + x)^n$ given in Eq. (1). We have
$$f(x) = (1 + x)^n,$$
$$f'(x) = n(1 + x)^{n-1},$$
$$f''(x) = n(n - 1)(1 + x)^{n-2},$$
etc., by differentiation. Thus, setting $x = 0$, $(1 + x)$ goes into
1, so that we have $f(0) = 1$, $f'(0) = n$, $f''(0) = n(n - 1)$,
etc., and $A_0 = 1$, $A_1 = n/1!$, $A_2 = n(n - 1)/2!$, etc.
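The coefficients just found obey $A_{k+1} = A_k(n - k)/(k + 1)$, which follows directly from $A_k = n(n-1)\cdots(n-k+1)/k!$. A minimal sketch, with the hypothetical helper name `binomial_coefficients`, generates them this way and checks the series against $(1 + x)^n$ for an integral and a nonintegral exponent.

```python
def binomial_coefficients(n, count):
    """Return A_0 .. A_{count-1} for (1 + x)**n via A_{k+1} = A_k*(n-k)/(k+1)."""
    coeffs = [1.0]
    for k in range(count - 1):
        coeffs.append(coeffs[-1] * (n - k) / (k + 1))
    return coeffs


def binomial_series(n, x, count):
    """Partial sum of the binomial series with `count` terms."""
    return sum(c * x**k for k, c in enumerate(binomial_coefficients(n, count)))


print(binomial_coefficients(3, 5))           # series breaks off for integral n
print(binomial_series(0.5, 0.2, 20), 1.2 ** 0.5)
```

For integral $n$ the coefficients vanish after the term in $x^n$, so the series is a polynomial; for nonintegral $n$ it continues indefinitely.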
5. Expansion about an Arbitrary Point.— A slightly more
general expansion is obtained by shifting the origin along the x
axis to a point a. The expansion is
$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots. \qquad (7)$$
From Taylor's theorem, we can see immediately a general
condition which a function must satisfy if it can be expanded
in power series about a given point (by expanding about a point
we mean setting up an expansion in powers of x — a, if a is
the given point). The function and all its derivatives must be
finite at the point in question, since otherwise some coefficients
of the expansion will be infinite. Thus for example we cannot
expand $1/x$ in power series in $x$: we have $f(0) = 1/0$ = infinite,
and all the derivatives are also infinite. Such a point is called
a singular point of the function. But by expanding about
another point we can avoid this difficulty. Thus we can expand
$1/x$ about $a$, if $a \neq 0$:
$f(a) = 1/a$, $f'(a) = -1/a^2$, $f''(a) = 1\cdot2/a^3$, $f'''(a) = -1\cdot2\cdot3/a^4$,
etc., so that
$$\frac{1}{x} = \frac{1}{a} - \frac{(x - a)}{a^2} + \frac{(x - a)^2}{a^3} - \frac{(x - a)^3}{a^4} + \cdots. \qquad (8)$$
From this we can understand that a function can be expanded in
power series about a point which is not a singular point.
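The expansion of $1/x$ about $x = a$ can be tried numerically. The sketch below is only illustrative: the function name `reciprocal_series` and the sample values $a = 2$, $x = 2.5$, $x = 5$ are assumptions. Each term is obtained from the preceding one by multiplying by $-(x - a)/a$, so the terms shrink only when $|x - a| < |a|$.

```python
def reciprocal_series(x, a, terms):
    """Partial sum of 1/a - (x-a)/a**2 + (x-a)**2/a**3 - ... about x = a."""
    total, term = 0.0, 1.0 / a
    for _ in range(terms):
        total += term
        term *= -(x - a) / a   # each term is the previous one times -(x-a)/a
    return total


a = 2.0
for x in (2.5, 5.0):
    print(f"x={x}: 30-term sum={reciprocal_series(x, a, 30):.6g}  1/x={1 / x:.6g}")
```

For $x = 2.5$ the sum settles on $1/x$; for $x = 5$, which lies farther from $a$ than the singular point at the origin, the partial sums grow without limit.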
6. Expansion about a Pole. — At some singular points, the
function behaves like $1/x^n$, an inverse power of $x$. Such a
singularity is called a pole. If $f(x)$ has a pole of order $n$ at the
origin, then by definition $x^nf(x)$ has no singularity at the origin,
and can be expanded in power series $A_0 + A_1x + \cdots$. Thus
we have for $f(x)$ the expansion
$$f(x) = \frac{A_0}{x^n} + \frac{A_1}{x^{n-1}} + \cdots + \frac{A_{n-1}}{x} + A_n + A_{n+1}x + \cdots,$$
an infinite series starting with inverse powers, but turning into
an ordinary series of positive powers after its $n$th term. A similar
theorem holds for expansion about a pole at $x = a$. A singularity
which is not a pole is called an essential singularity. An example
of an essential singularity is that possessed by the function
$e^{-1/x}$ at $x = 0$. This function approaches 0 as $x$ approaches 0
through positive values, but becomes infinite as $x$ approaches 0
through negative values, and no inverse power $1/x^n$ has such a
behavior.
7. Convergence. — A series is said to converge if the process of
adding its terms is one that can be carried out and that leads to a
definite answer. Thus $(1 - x)^{-1}$, by the binomial theorem, is
equal to $1 + x + x^2 + x^3 + \cdots$. Now if $x$ is less than unity,
and we try to add these terms, we get an answer. For example,
if $x = 0.1$, we have $1 + 0.1 + 0.01 + 0.001 + 0.0001 + \cdots = 1.111\cdots$,
which equals $(1 - 0.1)^{-1} = 10/9$, as it should.
But if $x$ is greater than unity, this no longer holds: if $x = 2$,
we have $1 + 2 + 4 + 8 + \cdots$, which certainly is infinitely
great, and leads to no definite value. Another situation is
obtained if we set $x = -1$ in the series, when we have
$1 - 1 + 1 - 1 + 1 - \cdots$, a series which is said to oscillate (successive
terms have opposite signs). As a matter of fact, we find that
the series $1 + x + x^2 + x^3 + \cdots$, which is called the geometric
series, converges if $x$ is between $-1$ and $+1$, but does not
converge if $x$ is equal to or greater than 1, or equal to or less
than $-1$. This series illustrates two of the simplest types of
nonconvergence of series, the simple divergence, in which terms
get greater and greater, and the oscillation, where the terms have
the same order of magnitude but alternate sign. There is still
another type of series which does not converge, sometimes called
the semiconvergent or asymptotic series, whose terms begin to
decrease regularly as we go out in the series, but after a certain
point start in increasing, and eventually become infinite. These
asymptotic series often can be used for computation, for it can
be shown in many cases that, if we retain terms just up to the
smallest one, the resulting sum is a good approximation to the
function the series is supposed to represent.
Our definition of convergence in the last paragraph was very
crude. More exactly, a series converges if the sum of the first n
terms approaches a limit as n increases indefinitely. This defini-
tion agrees with the usual procedure of the physicist, for he often
computes by series, and he does it by adding a finite number of
terms. He carries this far enough so that adding more terms
does not change the sum, to the order of accuracy to which he
works, which essentially means that the sum is approaching
a limit.
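The definition in terms of partial sums is easy to try on the geometric series. The following sketch (the helper name `partial_sums` is an assumption for the example) lists the first few partial sums for the three values of $x$ discussed above.

```python
def partial_sums(x, count):
    """Return the first `count` partial sums of 1 + x + x**2 + ..."""
    sums, total, term = [], 0.0, 1.0
    for _ in range(count):
        total += term
        sums.append(total)
        term *= x
    return sums


for x in (0.1, 2.0, -1.0):
    print(x, partial_sums(x, 6))
```

For $x = 0.1$ the sums approach $10/9$; for $x = 2$ they increase without limit; for $x = -1$ they alternate between 1 and 0 and approach no limit.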
To tell whether a given series converges is not always easy.
In the first place, we can be sure in some cases that given Taylor's
expansions cannot converge if the argument (that is, the independent
variable) has too large a value. Thus $1 + x + x^2 + \cdots$
does not converge if $x$ is equal to, or greater than, 1, and we could
have seen this from the fact that the series equals $1/(1 - x)$,
which has a singularity for $x = 1$ (being equal to $1/0$). Thus the
function is infinite for $x = 1$, and the series to represent it could
not converge. And increasing x beyond 1 cannot make the series
converge again. In fact, as soon as the variable in a series
becomes greater than the value for which the function has a
singularity, the series will diverge. But it is a little more complicated
than this would seem, for $1 + x + x^2 + \cdots$ diverges
also for $x$ less than $-1$, and there is no singularity here. As a
matter of fact, a power series converges in general so long as the
argument is less in absolute value than the smallest value for
which there is a singularity, but not beyond. But this singu-
larity can come from imaginary or complex values of the argu-
ment, so that we might well miss it completely if we did not
consider imaginary values. For this reason, this criterion for
convergence is rather tricky.
When we actually examine a series, we can often tell whether
it converges or not. Surely a series cannot converge unless its
successive terms get smaller and smaller. We can investigate
this by the ratio test, taking the ratio of the nth term to the one
before, and seeing how this ratio changes as we go out in the series.
If the limiting ratio is less than 1, the series converges; if it is
greater than 1, it diverges. If the ratio is just 1, the test gives
no information. Thus for example with the series
$x + x^2/2 + x^3/3 + \cdots$, the ratio of the term in $x^n$ to that in $x^{n-1}$ is
$$\frac{x^n/n}{x^{n-1}/(n-1)} = \frac{n-1}{n}\,x.$$
As $n$ approaches infinity, $n - 1$ and $n$
become approximately equal, so that the ratio approaches $x$.
Thus we see that if $x$ is less numerically than unity, this series
converges; if $x$ is greater than unity, it diverges; if $x = 1$, we
cannot say. From other information, we know that the series
when $x = 1$, which is $1 + 1/2 + 1/3 + 1/4 + \cdots$, diverges.
But with the similar series $x + x^2/2^2 + x^3/3^2 + \cdots$, where the
ratio of terms also approaches $x$ as we go out in the series, and
the series again diverges for $x$ greater numerically than unity,
converges for $x$ less than unity, we have just the other situation
at $x = 1$: the series $1 + 1/2^2 + 1/3^2 + \cdots$ converges.
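A short numerical sketch of the ratio test for these two series follows. The function names are assumptions for the example, and the comparison value $\pi^2/6$ for $1 + 1/2^2 + 1/3^2 + \cdots$ is a standard result quoted here for reference, not something derived in the text.

```python
import math


def term_ratio(n, x, power):
    """Ratio of the term x**n / n**power to the one before it."""
    return x * ((n - 1) / n) ** power


def partial_sum(x, power, count):
    """Sum of x**n / n**power for n = 1 .. count."""
    total, x_power = 0.0, 1.0
    for n in range(1, count + 1):
        x_power *= x
        total += x_power / n**power
    return total


# The ratio approaches x for both series as n grows.
for n in (10, 100, 1000):
    print(n, term_ratio(n, 0.5, 1), term_ratio(n, 0.5, 2))

# At x = 1 the ratio test is silent; partial sums show the two behaviors.
for count in (10, 1000, 100000):
    print(count, partial_sum(1.0, 1, count), partial_sum(1.0, 2, count))
print("pi**2 / 6 =", math.pi**2 / 6)
```

The first column of sums keeps creeping upward (roughly like the logarithm of the number of terms), while the second settles toward a definite value.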
Often a series can be approximately summed by comparison
with an integral. Thus
$$1 + \frac{1}{2^n} + \frac{1}{3^n} + \cdots = \sum_{z=1}^{\infty}\frac{1}{z^n} = \int_1^{\infty}\frac{dz}{z^n} \quad \text{approximately.}$$
The approximation is rather poor for the small values of $z$, but
becomes better for large $z$ values, on which the convergence
depends. It would be a better approximation, for instance, to
write $\frac{1}{10^n} + \frac{1}{11^n} + \cdots = \int_{10}^{\infty}\frac{dz}{z^n}$. From this we see that the
series converges when $n > 1$, the integral being $-\frac{1}{(n-1)z^{n-1}}$,
which is zero at the upper limit, but diverges if $n \leq 1$. For
$n = 1$, for instance, the integral becomes logarithmically infinite
at $z = \infty$.
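How much better the comparison becomes when it is started at $z = 10$ can be seen numerically. The sketch below takes $n = 2$ as an assumed example and replaces the infinite upper limit of the sum by a large finite cutoff; both choices, and the function names, are illustrative only.

```python
def tail_sum(start, n, cutoff=200000):
    """Sum of 1/z**n for z = start .. cutoff (a finite stand-in for infinity)."""
    return sum(1.0 / z**n for z in range(start, cutoff + 1))


def tail_integral(start, n):
    """Integral of dz / z**n from `start` to infinity, valid for n > 1."""
    return 1.0 / ((n - 1) * start ** (n - 1))


for start in (1, 10):
    s, i = tail_sum(start, 2), tail_integral(start, 2)
    print(f"start={start:2d}  sum={s:.5f}  integral={i:.5f}  ratio={s / i:.3f}")
```

Starting from $z = 1$ the integral underestimates the sum badly; starting from $z = 10$ the two agree to within a few per cent.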
Problems
1. Plot $\frac{1}{x^2} - \frac{1}{x}$ as a function of $x$, and show that it has a minimum at
$x = 2$. Expand in Taylor's series about this point, obtaining an expansion
$y = A_0 + A_2(x - 2)^2 + A_3(x - 2)^3 + \cdots$, where necessarily the coefficient
$A_1$ is zero. Now plot on the graph the successive approximations
$y = A_0$, $y = A_0 + A_2(x - 2)^2$, $y = A_0 + A_2(x - 2)^2 + A_3(x - 2)^3$,
$y = A_0 + A_2(x - 2)^2 + A_3(x - 2)^3 + A_4(x - 2)^4$, observing how they approximate
the real curve more and more accurately.
2. a. Derive the exponential, cosine, and sine series directly
from Taylor's theorem.
b. Differentiate the series for $\sin x$ term by term, and show that the
result is the series for $\cos x$.
3. In the series for $e^x$, set $x = 1$, obtaining a series for $e$. Using this series,
compute the value of $e$ to four decimal places.
4. Why does one always have series for $\ln(1 + x)$ in powers of $x$, rather
than for $\ln x$? From the series for $\ln(1 + x)$, compute logarithms to
base $e$ of 1.1, 1.2, 1.3, 1.4, 1.5.
5. The function $1/(x - i)$, where $i = \sqrt{-1}$, has a singularity for $x = i$,
but not for any real value of $x$. Show that nevertheless the series expansion
about $x = 0$ diverges for $x$ greater than 1 or less than $-1$, obtaining the
power series by Taylor's theorem, and separating real and imaginary parts
of the series. This is an example of a case where the series diverges on
account of singularities for complex values of $x$.
6. As a result of an experiment, we are given the table of values following:

     x       y
     1      7.0
     2     11.1
     3     15.2
     4     19.3
     5     23.2
     6     27.1
     7     30.8
     8     34.5
     9     38.2
    10     41.7

Try to devise some practicable scheme for telling whether this function (in
which, being a result of experiment, the values are only approximations),
can be represented within the error of experiment by a linear, quadratic,
cubic, etc., polynomial. Get the coefficients of the resulting series, and use
them to find the value of the function and its slope at x = 0. Plot the
points, the curve which approximates them, and the straight-line tangent to
the curve at x = 0. It is legitimate to use graphical methods if you wish.
7. Expand $\tan^{-1} x$ in a power series about $x = 0$. Hints:
(a) $\frac{d}{dx}\tan^{-1}(x) = \frac{1}{1 + x^2}$.
(b) $\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots$.
(c) $\int \frac{d}{dx}(\tan^{-1} x)\,dx = \tan^{-1} x + c$.
What is the range of convergence of the resulting series? Calculate from
this series the value of $\pi/4 = \tan^{-1} 1$ correct to 5 per cent. How many terms
of the series are necessary to obtain this accuracy?
8. By a procedure analogous to that used in Prob. 7 expand $\sin^{-1} x$ in a
power series about $x = 0$. Find the range of convergence for this series.
9. From the known Taylor's series for $e^x$, write the corresponding series
for $e^{-x^2}$. By integrating this series obtain to 1 per cent a value for
$$\int_0^1 e^{-x^2}\,dx,$$
whose correct value is 0.7468 . . . .
10. Make use of the binomial theorem to obtain an expansion of
$\sqrt{1 + \sqrt{x}}$ in ascending powers of $x^{1/2}$. What is the range of convergence?
11. Discuss by the ratio test the convergence of the following series:
(a) $x + x^2/2 + x^3/3 + x^4/4 + \cdots$
(b) $x + x^2/2^2 + x^3/3^2 + x^4/4^2 + \cdots$
(c) The binomial expansion of $(1 + x)^k$, for nonintegral $k$.
(d) The series for $e^x$, $\sin x$, $\cos x$.
CHAPTER II
POWER SERIES METHOD FOR DIFFERENTIAL
EQUATIONS
Most important physical laws involve statements giving the
relation between the rate of change of some quantity and other
quantities. Such a relation, stated in mathematical language,
is a differential equation — an equation containing derivatives of
functions, as well as the functions themselves. For example,
the fundamental law of mechanics is Newton's second law of
motion: the force equals the time rate of change of the
momentum. Or in electricity, in a circuit containing an
inductance, the back electromotive force of the inductance equals
a constant times the time rate of change of the current. But
these differential relations are not in the form which can be used
in making direct connection with experiment. One cannot
directly plot graphs, or give tables of values, from them. One
must rather solve the differential equations, that is, find algebraic
relations between the variables, containing no differentiations,
but consistent with the differential equations. For most of our
course we shall be interested in finding such solutions of differen-
tial equations.
Solving differential equations is rather like integrating func-
tions: there are no general rules. Individual cases must be
treated by appropriate special methods. We shall meet some
such special rules, and shall make much use of some of them.
Those who have studied differential equations have learned a
variety of such rules. But rather more important on the whole
is a method which is applicable, though not always most con-
venient, in a very large number of cases: the method of power
series. In general, the solution of a differential equation consists
of a certain functional relation between variables. If we assume
that this function is expanded in power series, our only problem
is to determine the coefficients. And by substituting the series
back into the differential equation, we can very often get condi-
tions for determining them. We shall illustrate the method
by examples.
8. The Falling Body. — Imagine a body moving vertically
under the action of gravity. To describe its motion, we have an
independent variable, the time t, and a dependent variable, the
height x. Let the mass of the body be m, and let its velocity,
which is of course dx/dt, be also called v. The force acting on
it is $F$. Then Newton's law states that $F = \frac{d(mv)}{dt}$, where $mv$
is the momentum. If the mass is constant (which does not
always have to be the case, as we shall see in Prob. 7), we can
rewrite the equation as $F = m\,dv/dt$, or $F = ma$, where $a$ is the
acceleration. Substituting $v = dx/dt$, this is also $F = m\,d^2x/dt^2$.
These are all forms of Newton's second law, written as differential
equations. We shall first take the case where the force, like
that of gravity on the earth's surface, is constant : F = constant =
— mg, where g is the acceleration of gravity, and where the nega-
tive sign means that the force is downward. Then we have
$$F = -mg = m\frac{dv}{dt} = m\frac{d^2x}{dt^2}, \qquad (1)$$
or $d^2x/dt^2 = dv/dt = -g$. These can be solved at once, by direct
integration: integrating once with respect to $t$,
$dx/dt = v = \text{constant} - gt = v_0 - gt$, where $v_0$, the constant of integration,
obviously means the value of the velocity when $t = 0$. Integrating
again, and calling the second constant of integration $x_0$, we
have $x = x_0 + v_0t - \frac{1}{2}gt^2$, containing now two arbitrary constants,
the initial position and initial velocity. The presence
of such arbitrary constants is the most characteristic feature of
the solutions of differential equations. And we note that the
number of arbitrary constants equals the number of integrations
we must perform to get rid of the differentiations. If the dif-
ferential equation is one of the first order (with only first deriva-
tives in it), there will be one arbitrary constant in the solution;
if it is of the second order (second derivatives), there will be two,
and so on. And always the arbitrary constants must be deter-
mined so as to satisfy certain "initial conditions," such as the
values of the position and velocity at t = 0.
9. Falling Body with Viscosity.— With the problem of the
falling body, the solution has automatically come out as a poly-
nomial in t, which is simply a power series that breaks off, so
that there is no need of more complicated methods. But now
let us take a more difficult case: we assume the body to be falling
through a viscous medium under the action of gravity. Here
the force is a sum of two parts: gravity, $-mg$, and a frictional
force depending on velocity. It is found experimentally that
for small velocities this frictional force, in a viscous medium, is
proportional to the velocity, with, of course, a negative coeffi-
cient, since it opposes the motion, changing signs with the velocity.
Let it be called — kv, k being the coefficient, which depends in a
complicated way on the shape and size of the body, and is pro-
portional to the coefficient of viscosity of the fluid. Then we
have
$$m\frac{dv}{dt} = -mg - kv,$$
or
$$m\frac{dv}{dt} + kv = -mg. \qquad (2)$$
This is a simple sort of differential equation, in a standard form.
It is
1. A linear differential equation. That is, it contains $v$ and its
derivatives (as $v$, $dv/dt$, $d^2v/dt^2$, etc.) only in the first power (in
$dv/dt$, $kv$), or the zero power ($-mg$, independent of $v$), not as
squares or cubes [as, for example, $(dv/dt)^2$], or products (as
$v\,dv/dt$).
2. A differential equation of the first order (containing no
derivative higher than the first).
3. An inhomogeneous equation (it contains terms of both the
first power and the zero power in $v$ and its derivatives, while a
homogeneous equation contains only terms of the same power,
as all of the first power. That is, if the term $-mg$ were absent,
the equation would be homogeneous).
We cannot solve Eq. (2) by direct integration, for if we integrate
with respect to $t$, one term would be $\int v\,dt$, which we cannot
evaluate, since $v$ is an unknown function of time. Thus we must
proceed differently. Let us assume that $v$ is given by a power
series in the time, $v = A_0 + A_1t + A_2t^2 + \cdots$, and try to determine
the coefficients. We do this by substituting the series in the
equation. We have by direct differentiation
$$\frac{dv}{dt} = A_1 + 2A_2t + 3A_3t^2 + \cdots + (n + 1)A_{n+1}t^n + \cdots.$$
Then, substituting, we have
$$m[A_1 + 2A_2t + 3A_3t^2 + \cdots + (n + 1)A_{n+1}t^n + \cdots] + k(A_0 + A_1t + A_2t^2 + \cdots + A_nt^n + \cdots) = -mg.$$
Rearranging,
$$\left(A_1 + \frac{k}{m}A_0 + g\right) + \left(2A_2 + \frac{k}{m}A_1\right)t + \cdots + \left[(n + 1)A_{n+1} + \frac{k}{m}A_n\right]t^n + \cdots = 0. \qquad (3)$$
This states that a certain power series in $t$ is equal to zero, for all
values of t. But the only function of t which is always zero is
zero itself, and by Taylor's theorem the expansion of zero in
power series is a series all of whose coefficients are zero. Thus
Eq. (3) can only be satisfied, for all values of t, if each coefficient
vanishes :
$$A_1 + \frac{k}{m}A_0 + g = 0,$$
$$2A_2 + \frac{k}{m}A_1 = 0, \qquad (4)$$
$$\cdots$$
$$(n + 1)A_{n+1} + \frac{k}{m}A_n = 0.$$
Here we have an infinite set of equations to solve for the coeffi-
cients A. Fortunately they are so arranged that we can solve
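them, getting all $A$'s in terms of $A_0$, if we start with the first and
work down:
$$A_1 = -\left(\frac{k}{m}A_0 + g\right),$$
$$A_2 = -\frac{1}{2}\frac{k}{m}A_1 = \frac{1}{2}\frac{k}{m}\left(\frac{k}{m}A_0 + g\right), \qquad (5)$$
$$A_3 = -\frac{1}{3}\frac{k}{m}A_2 = -\frac{1}{3!}\frac{k^2}{m^2}\left(\frac{k}{m}A_0 + g\right),$$
$$A_{n+1} = -\frac{1}{n + 1}\frac{k}{m}A_n = (-1)^{n+1}\frac{1}{(n + 1)!}\frac{k^n}{m^n}\left(\frac{k}{m}A_0 + g\right).$$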
And the power series is
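$$v = A_0 + A_1t + \cdots = A_0 - \left(\frac{k}{m}A_0 + g\right)\left(t - \frac{1}{2!}\frac{k}{m}t^2 + \frac{1}{3!}\frac{k^2}{m^2}t^3 - \cdots\right). \qquad (6)$$
Thus we have the solution. If we set $t = 0$, we have $v = A_0$, so
that $A_0$ is simply the initial velocity, and is the arbitrary constant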
which we meet in the solution. We could compute from our
series the value of v at any time t, knowing the initial velocity.
It happens in this case that we can recognize the infinite series
as representing a familiar function. For we have
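$$e^{-(k/m)t} = 1 - \frac{k}{m}t + \frac{1}{2!}\frac{k^2}{m^2}t^2 - \frac{1}{3!}\frac{k^3}{m^3}t^3 + \cdots,$$
which has close connection with our series, so that we can write
at once
$$v = A_0 + \frac{\frac{k}{m}A_0 + g}{k/m}\left(e^{-(k/m)t} - 1\right) = \left(A_0 + \frac{mg}{k}\right)e^{-(k/m)t} - \frac{mg}{k}. \qquad (7)$$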
Fig. 1. — Velocity of damped falling body, with various initial conditions.
We can see the physical properties of the solution most clearly
from the graph in Fig. 1. No matter what the initial velocity
may have been, the particle finally settles down to motion with a
constant speed, given by $-mg/k$. The initial velocity is $A_0$, and
if this is greater than the final velocity, the body slows down; if
it is less, it speeds up, to attain this final speed.
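The agreement between the power series and the exponential form can be checked directly. The sketch below builds the coefficients from the recurrence $(n + 1)A_{n+1} = -(k/m)A_n$ and compares the summed series with the closed form. The numerical values ($g = 980$ cm per second per second, $k/m = 0.98$ per second, giving a limiting speed of 1,000 cm per second) and the function names are assumptions chosen for illustration.

```python
import math


def series_velocity(t, v0, k_over_m, g, terms=60):
    """Sum v = A_0 + A_1 t + ... using (n + 1) A_{n+1} = -(k/m) A_n."""
    a_n = -(k_over_m * v0 + g)      # A_1, from the first coefficient equation
    total, t_power = v0, t          # start from A_0 = v0
    for n in range(1, terms):
        total += a_n * t_power
        a_n *= -k_over_m / (n + 1)  # A_{n+1} = -(1/(n+1)) (k/m) A_n
        t_power *= t
    return total


def closed_form_velocity(t, v0, k_over_m, g):
    """v = (A_0 + mg/k) exp(-(k/m) t) - mg/k, with mg/k = g / (k/m)."""
    terminal = g / k_over_m
    return (v0 + terminal) * math.exp(-k_over_m * t) - terminal


g, k_over_m = 980.0, 0.98           # hypothetical values, cgs units
for v0 in (0.0, -2000.0):
    for t in (0.5, 2.0, 5.0):
        print(v0, t, series_velocity(t, v0, k_over_m, g),
              closed_form_velocity(t, v0, k_over_m, g))
```

Whether the body starts from rest or with a downward velocity larger than the limiting one, both columns agree and tend toward $-mg/k$.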
10. Particular and General Solutions for Falling Body with
Viscosity. — It is instructive to notice that we can solve our
problem in an elementary way. Our equation is $m\,dv/dt + kv = -mg$.
Plainly a particular solution is given by assuming a
constant velocity. Then $dv/dt$ is zero, so that the equation is
$kv = -mg$, or $v = -mg/k$. But this is not the most general
solution, for it does not have an arbitrary constant; it represents
merely the particular case in which the initial velocity happened
to be just the correct final value, and is unable to describe any
other initial condition. To get a general solution, we proceed as
follows: we take the homogeneous equation $m\,dv/dt + kv = 0$,
which we obtain from our inhomogeneous equation by leaving
out the term $-mg$. We can easily solve this: writing it $dv/v = -(k/m)\,dt$,
and integrating, we have $\ln v = -(k/m)t + \text{constant}$,
and taking the exponential, $v = \text{constant} \times e^{-(k/m)t}$, where
the constant is arbitrary. Then the sum of this general solution
of the homogeneous equation, and the particular solution $-(m/k)g$
of the inhomogeneous equation, is the solution we desire. We
may prove this easily. For we have
$$\left(m\frac{d}{dt} + k\right)\left(-\frac{mg}{k}\right) = -mg,$$
$$\left(m\frac{d}{dt} + k\right)\left(Ce^{-(k/m)t}\right) = 0.$$
Adding,
$$\left(m\frac{d}{dt} + k\right)\left(Ce^{-(k/m)t} - \frac{mg}{k}\right) = -mg,$$
showing that the function $Ce^{-(k/m)t} - (m/k)g$ satisfies the differential
Eq. (2).
The procedure we have just used is an illustration of the general
rule : A general solution of an inhomogeneous equation is obtained
by adding a particular solution of the inhomogeneous equation,
and a general solution of the related homogeneous equation. In this
statement, the terms "particular solution" and "general solu-
tion" are used in a technical sense: a "particular solution" is
one which satisfies the differential equation but has no arbitrary
constants; a "general solution" is one which has its full comple-
ment of arbitrary constants. The proof of the rule in general is
carried out just as in our case, adding the particular solution of
the inhomogeneous equation and a general solution of the homogeneous
equation, and showing that the sum satisfies the inhomogeneous
equation. One thing should be noted: the properties
we have been discussing depend entirely on the linear character
of the differential equation, for it is only with linear functions
$f$ that $f(x_1) + f(x_2) = f(x_1 + x_2)$.
11. Electric Circuit Containing Resistance and Inductance.—
The theory of the electrical circuit reminds one in many ways of
mechanical principles: electric current is analogous to velocity,
charge to displacement, electromotive force to mechanical force.
Thus in a circuit containing a resistance, inductance, and con-
denser, all in series, the current can flow through the circuit,
piling up in the condenser because it cannot flow through. Let
q be the charge on one plate of the condenser ( — q being the charge
on the other), and let i be the current flowing through the circuit
toward the condenser plate in question, so that the current
measures just the amount of charge per second flowing onto the
condenser plate, or i = dq/dt (as v = dx/dt). Now let the
coefficient of self-induction of the circuit be L, the resistance R,
the capacity of the condenser C. Then there are three e.m.fs.
(electromotive forces) acting on the current, in addition to a
possible external e.m.f. $E$ from a battery: the back e.m.fs. of
induction, resistance, and capacity. The first is $-L\,di/dt$, the
electromotive force induced in a circuit when the current changes;
the second is $-Ri$, the value familiar from Ohm's law; the third
is $-q/C$, as given by the elementary law of the condenser.
These are all negative, for they act to oppose the current. Now
the law of the circuit is that the total e.m.f. acting on the circuit
is zero:
$$-L\frac{di}{dt} - Ri - \frac{q}{C} + E = 0,$$
or
$$L\frac{di}{dt} + Ri + \frac{q}{C} = E. \qquad (8)$$
This is a differential equation. Let us take the special case where
there is no condenser, so that the equation is Ldi/dt + Ri = E.
The equation is then exactly analogous to the equation mdv/dt +
kv = F, which we had for a falling body with viscosity. And we
see that self-induction is analogous to inertia, resistance to
viscosity. The analogy is often valuable.
If now the applied e.m.f. E of the battery is constant, the
problem can be solved mathematically just as before, and we find
$i = \text{constant} \times e^{-(R/L)t} + E/R$. The first term is the transient
effect, of arbitrary size, as we see from the arbitrary constant,
rapidly dying out as time goes on, while the second is the constant
value given by Ohm's law, the value to which the current tends
if we wait long enough.
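The transient and steady parts of the current can be seen in a short numerical check. The sketch below compares the closed form, with the arbitrary constant fixed by an initial current $i_0$, against a step-by-step integration of $L\,di/dt + Ri = E$. The integration uses the fourth-order Runge-Kutta rule, a standard numerical method brought in here only as an independent check and not a method of the text; the circuit values ($E = 2$ volts, $R = 4$ ohms, $L = 1$ henry) and all function names are likewise assumptions.

```python
import math


def rl_current_closed_form(t, emf, resistance, inductance, i0=0.0):
    """i = (i0 - E/R) exp(-(R/L) t) + E/R."""
    steady = emf / resistance
    return (i0 - steady) * math.exp(-resistance / inductance * t) + steady


def rl_current_rk4(t_end, emf, resistance, inductance, i0=0.0, steps=1000):
    """Integrate L di/dt + R i = E from t = 0 to t_end by fourth-order Runge-Kutta."""
    def didt(i):
        return (emf - resistance * i) / inductance

    h = t_end / steps
    i = i0
    for _ in range(steps):
        k1 = didt(i)
        k2 = didt(i + 0.5 * h * k1)
        k3 = didt(i + 0.5 * h * k2)
        k4 = didt(i + h * k3)
        i += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return i


emf, resistance, inductance = 2.0, 4.0, 1.0   # hypothetical circuit values
for t in (0.1, 0.5, 1.0, 3.0):
    print(t, rl_current_closed_form(t, emf, resistance, inductance),
          rl_current_rk4(t, emf, resistance, inductance))
```

After a few multiples of $L/R$ the exponential term has died out and the current is indistinguishable from the Ohm's-law value $E/R$.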
Problems
1. Show that the solution $v = (A_0 + mg/k)e^{-(k/m)t} - mg/k$ reduces
properly to uniformly accelerated motion in the limiting case where the
viscous resistance vanishes. Illustrate this graphically, showing curves for
several different k's, and finally for k = 0, all with the same initial velocity.
2. A raindrop weighs 0.1 gm., and after falling from rest reaches a limiting
speed of 1,000 cm. per second by the time it reaches the earth. How long
did it take to reach half its final speed? Nine tenths of its final speed?
How far did it travel before reaching half its final speed? For how long
could its velocity be described by the simple law v = —gt to an error of
1 per cent?
3. At high velocities, the viscous resistance is proportional to the third
power of the velocity. Assuming this law, set up the differential equation
for a particle falling under gravity and acted on by such a viscous drag.
Solve by power series, obtaining at least four terms in the expansion for v
as a function of t. Draw graphs of velocity as function of time, and discuss
the solutions physically.
4. Using the same law of viscosity as in the preceding problem, but assum-
ing no gravitational force, solve by direct integration of the differential
equation for the case of a particle starting with given initial velocity and
being damped down to rest. Show by Taylor's expansion of this function
that it agrees with the special case of the power series of the preceding
problem obtained by letting the gravitational force be zero.
5. A large coil has a resistance of 0.7 ohm, inductance of 5 henries. Until
t = 0, no current is flowing in the coil. At that moment, a battery of 5
volts e.m.f. is connected to it. After 5 sec, the battery is short-circuited
and the current in the coil allowed to die down. Compute the current as
function of the time, drawing a curve to represent it.
6. A coil having L = 10 henries. R = 1 ohm, has no current flowing in
it until t = 0. Then it has an applied voltage increasing linearly with the
time, from zero at t = 0, to 1 volt at t = 1 sec. After t = 1, the e.m.f.
remains equal to 1 volt. By series methods find the current at any time,
and plot the curve.
7. Suppose we have a rocket, shot off with initial velocity $v_0$, and thereafter
losing mass according to the law $m = m_0(1 - ct)$, where $m$ is the mass
at any time, $m_0$ the initial mass at $t = 0$, $c$ is a constant, and where the mass
lost does not have appreciable velocity after it leaves the rocket. Show
that on account of the loss of mass the rocket is accelerated, just as if a force
were acting on a body of constant mass. The rocket is acted on by a viscous
resisting force in addition. Taking account of these forces, find the differ-
ential equation for its velocity as a function of time, and integrate the equa-
tion directly. Now find also the solution for v as a power series in the time.
Show that the resulting series agrees with that obtained by expanding the
exact solution. Calculate the limiting ratio of successive terms in the
power series, as we go out in the series, and from this result obtain the region
of convergence of the series. Is this result reasonable physically? What
happens in the exact solution outside the range of convergence?
8. In a radioactive disintegration, the number of atoms disintegrating per
second, and turning into atoms of another sort, is simply proportional to the
total number of radioactive atoms present. Write down the differential
equation for the number of atoms present at any time, and find its solution.
Assuming that half the atoms of a sample of radium disintegrate in 1,300
years, how many would decay in the first year?
9. If at the same time radium were being produced at a constant rate by
disintegration of uranium, how would this change the situation in the
preceding problem? Set up the new differential equation. Assuming
that we start without any radium, but with pure uranium, find the amount
of radium as a function of the time. Show that the amount of radium
approaches an equilibrium amount, which it reaches in time, whether the
initial amount of radium is greater or less than the equilibrium amount.
10. Find a series solution for the differential equation m dv/dt + kv = c/t,
where c is a constant, representing a damped motion under the action of an
external force which decreases inversely proportionally to the time, the
series having the form $v = a_1/t + a_2/t^2 + \cdots$. Show that this series is
divergent for all values of $t$. Show that the differential equation is formally
satisfied by the expression $v = \frac{c}{m}e^{-(k/m)t}\int \frac{e^{(k/m)t}}{t}\,dt$. This solution is convergent for $t$
negative. The integral $\int \frac{e^t}{t}\,dt$ is known as the exponential integral function,
and is important in physics and mathematics. It is frequently calculated
by using the above divergent series. Explain how this procedure might
be valid.
11. Suppose a particle is acted on by a damping force proportional to the
velocity, and to a force which varies sinusoidally with the time. Solve the
resulting differential equation for velocity as function of time, by the series
method, by expanding the force in power series in the time. Can you
recognize the analytical form of the resulting power series?
12. Solve by power series Bessel's equation $\frac{d^2y}{dx^2} + \frac{1}{x}\frac{dy}{dx} + y = 0$. The
result is Bessel's function of the zero order, $J_0(x)$. From the series, plot
$J_0(x)$ for $x$ between 0 and 5.
13. The equation for Bessel's function of the $m$th order, $J_m(x)$, is
$\frac{d^2y}{dx^2} + \frac{1}{x}\frac{dy}{dx} + \left(1 - \frac{m^2}{x^2}\right)y = 0$. Solve by power series, showing that the first
term in the expansion is that in $x^m$. Plot $J_1(x)$ for $x$ between 0 and 5.
Bessel's functions oscillate, like the sine and cosine, all the way to infinity.
We shall use them in discussing standing waves in a circular membrane, and
for many other problems. The second independent solution of the equation
is infinite at the origin, and hence cannot be expanded in power series.
CHAPTER III
POWER SERIES AND EXPONENTIAL METHODS FOR
SIMPLE HARMONIC VIBRATIONS
In the last chapter we have found a general method of power
series for solving differential equations, and have applied it to
the problem of motion under viscous forces. Next we consider
the same method, applied to somewhat different problems: a
particle acted on by restoring forces proportional to the distance,
or an electric circuit containing inductance and capacity.
12. Particle with Linear Restoring Force. — Suppose that the
force acting on a particle is proportional to the displacement
from a fixed position, and opposite to the displacement, a so-called
linear restoring force. This force is -kx, if x is the displace-
ment, k a constant. For the moment we assume that there is
no gravitational or other external force acting. Then the equation
of motion is $m\,d^2x/dt^2 = -kx$, or
$$m\frac{d^2x}{dt^2} + kx = 0. \tag{1}$$
This is a homogeneous linear differential equation of the second
order, with constant coefficients (that is, m, k are independent
of time). We solve it in series as before. If $x = A_0 + A_1t + A_2t^2 + \cdots$,
we have immediately, by the method used before,
$$(2mA_2 + kA_0) + (3\cdot 2\,mA_3 + kA_1)t + (4\cdot 3\,mA_4 + kA_2)t^2 + \cdots = 0.$$
Thus, setting the separate coefficients equal to zero, and solving
one equation after the other, we find
$$A_2 = -\frac{1}{2}\frac{k}{m}A_0, \qquad A_3 = -\frac{1}{3!}\frac{k}{m}A_1, \qquad \ldots \tag{2}$$
These equations determine all the coefficients in terms of two
arbitrary ones, $A_0$ and $A_1$, which are the two arbitrary constants
to be expected in the solution of a second-order differential
equation. The solution may be written
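$$x = A_0\left[1 - \frac{1}{2!}\frac{k}{m}t^2 + \frac{1}{4!}\left(\frac{k}{m}\right)^2t^4 - \cdots\right] + A_1\left[t - \frac{1}{3!}\frac{k}{m}t^3 + \frac{1}{5!}\left(\frac{k}{m}\right)^2t^5 - \cdots\right]. \tag{3}$$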
We now observe that these series represent well-known functions:
the first is the cosine, the second the sine, except for a factor, so
that we have
$$x = A_0\cos\sqrt{k/m}\,t + A_1\sqrt{m/k}\,\sin\sqrt{k/m}\,t. \tag{4}$$
Thus the motion is a periodic one, as shown by the sinusoidal
functions. The period T is found from the fact that when t
increases by T, the sine or cosine must come back to its initial
value, which it does when its argument (that is, the thing whose
cosine we are taking), increases by $2\pi$. Thus $\sqrt{k/m}\,T = 2\pi$,
$T = 2\pi\sqrt{m/k}$, the familiar formula for the period in simple
harmonic motion. From this, the frequency $\nu$ is given by
$\nu = 1/T = (1/2\pi)\sqrt{k/m}$, and the angular velocity $\omega$ by
$\omega = 2\pi\nu = \sqrt{k/m}$. It is often convenient to use these relations in
rewriting the equation of motion, writing it
$$\frac{d^2x}{dt^2} + \omega^2x = 0, \quad\text{or}\quad \frac{d^2x}{dt^2} + 4\pi^2\nu^2x = 0. \tag{5}$$
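As a numerical check on the argument of this section, the following minimal Python sketch sums the power series from the recurrence behind Eq. (2) and compares it with the closed form of Eq. (4). The function names and the sample values used with them are illustrative assumptions, not part of the text.

```python
import math

def series_solution(t, a0, a1, k, m, n_terms=40):
    """Partial sum of x = A0 + A1 t + A2 t^2 + ... for m x'' + k x = 0.

    Setting each coefficient of the series to zero gives
    A_{n+2} = -(k/m) A_n / ((n + 2)(n + 1)).
    """
    coeffs = [a0, a1]
    for n in range(n_terms - 2):
        coeffs.append(-(k / m) * coeffs[n] / ((n + 2) * (n + 1)))
    return sum(a * t**n for n, a in enumerate(coeffs))

def closed_form(t, a0, a1, k, m):
    """Eq. (4): x = A0 cos(sqrt(k/m) t) + A1 sqrt(m/k) sin(sqrt(k/m) t)."""
    w = math.sqrt(k / m)
    return a0 * math.cos(w * t) + a1 * math.sqrt(m / k) * math.sin(w * t)

def period(k, m):
    """T = 2 pi sqrt(m/k)."""
    return 2 * math.pi * math.sqrt(m / k)
```

For moderate values of t the two functions should agree to rounding error, and evaluating the series one period after t = 0 should return the initial displacement.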
13. Oscillating Electric Circuit.— In the last chapter, we have
seen that the equation for an electric circuit containing resistance,
inductance, and capacity, is L di/dt + Ri + q/C = E, where i
is the current, q the charge on the condenser, and E the impressed
electromotive force. We also saw that i = dq/dt. Substituting,
we obtain
$$L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{q}{C} = E. \tag{6}$$
This is an inhomogeneous second-order linear differential equation
for q, which becomes homogeneous if E = 0. We consider that
case, and in particular let R be zero. Then the problem becomes
mathematically equivalent to the preceding one, and has the
differential equation $d^2q/dt^2 + q/LC = 0$. The solution is
$q = A_0\cos\sqrt{1/LC}\,t + A_1\sqrt{LC}\,\sin\sqrt{1/LC}\,t$, so that the
current oscillates in the circuit. By differentiating, we can find
the current directly instead of the charge: $i = dq/dt = -A_0\sqrt{1/LC}\,\sin\sqrt{1/LC}\,t + A_1\cos\sqrt{1/LC}\,t$, so that the
oscillations of charge and current are similar. The period of
oscillation is given by $T = 2\pi\sqrt{LC}$, increasing as either the
inductance or the capacity becomes large.
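The relations of this section can be sketched numerically as follows. This is a minimal illustration in Python; the function names are assumptions, and a0, a1 stand for the two arbitrary constants of the solution.

```python
import math

def lc_period(L, C):
    """Period T = 2 pi sqrt(L C) of the circuit without resistance."""
    return 2 * math.pi * math.sqrt(L * C)

def charge(t, a0, a1, L, C):
    """q = A0 cos(t / sqrt(LC)) + A1 sqrt(LC) sin(t / sqrt(LC))."""
    w = 1 / math.sqrt(L * C)
    return a0 * math.cos(w * t) + a1 * math.sqrt(L * C) * math.sin(w * t)

def current(t, a0, a1, L, C):
    """i = dq/dt, obtained by differentiating the charge term by term."""
    w = 1 / math.sqrt(L * C)
    return -a0 * w * math.sin(w * t) + a1 * math.cos(w * t)
```

A central difference of charge should reproduce current, and quadrupling either L or C should double the period.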
14. The Exponential Method of Solution. — We have found
that the solutions of our vibration problems, as well as of several
other differential equations, come out either as exponential
functions, or as sines or cosines. As a matter of fact, any
homogeneous linear differential equation, with constant coefficients,
has such solutions. On account of the importance of
this type of equation, we shall consider its solution specially.
Let us take a second-order differential equation,
$$\frac{d^2y}{dx^2} + a\frac{dy}{dx} + by = 0, \tag{7}$$
a type which includes the mechanical and electrical problems
we have worked with. We can show very easily that this has
an exponential solution, $y = e^{kx}$. For let us substitute this
function into the equation. We have $dy/dx = ky$, $d^2y/dx^2 = k^2y$,
so that the equation becomes $(k^2 + ak + b)e^{kx} = 0$. This
equation is factored, and since $e^{kx}$ is not always zero, the other
factor must be, and we have $k^2 + ak + b = 0$, or solving the
quadratic by formula, $k = -a/2 \pm \sqrt{(a/2)^2 - b}$. Thus if k
equals either $k_1 = -a/2 + \sqrt{(a/2)^2 - b}$, or $k_2 = -a/2 - \sqrt{(a/2)^2 - b}$,
$e^{kx}$ is a solution of the equation. We have, in
fact, two independent solutions.
Now if we have two independent solutions of a second-order
linear homogeneous differential equation, we can readily show that
any linear combination of them is itself a solution. If such a
solution has two arbitrary constants, it is a general solution. Thus
we can write the general solution of Eq. (7)
$$y = Ae^{k_1x} + Be^{k_2x},$$
or
$$y = e^{-(a/2)x}\left[Ae^{\sqrt{(a/2)^2-b}\,x} + Be^{-\sqrt{(a/2)^2-b}\,x}\right]. \tag{8}$$
This is the solution, with its two arbitrary constants, and it
might seem as if no further discussion were necessary. But
there is an interesting feature still to consider: the quantity
$(a/2)^2 - b$ under the radical may easily be negative, and the
square root imaginary, so that we have to investigate the
exponentials of imaginary quantities.
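The exponential method can be tried out directly. The sketch below, in Python, computes the two roots of the quadratic for k and evaluates the left side of Eq. (7) for $y = e^{kx}$; it uses complex arithmetic so that a negative quantity under the radical causes no difficulty. The function names are illustrative assumptions.

```python
import cmath

def characteristic_roots(a, b):
    """Roots k1, k2 of k^2 + a k + b = 0; complex when (a/2)^2 < b."""
    root = cmath.sqrt((a / 2) ** 2 - b)
    return -a / 2 + root, -a / 2 - root

def residual(k, a, b, x):
    """Left side of Eq. (7), y'' + a y' + b y, evaluated for y = e^{kx}."""
    y = cmath.exp(k * x)
    return k * k * y + a * k * y + b * y
```

With a = 3, b = 2 the roots are real; with a = 0, b = 4 they are pure imaginary. In both cases the residual should vanish to rounding error.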
Suppose, for example, that the damping term is zero: a = 0,
and the differential equation is $d^2y/dx^2 + by = 0$. This is the
only case we have so far worked out in detail. Then the solution
becomes $y = Ae^{i\sqrt{b}\,x} + Be^{-i\sqrt{b}\,x}$, where $i = \sqrt{-1}$. But we
have already seen that the solution of this same equation is
$C\cos\sqrt{b}\,x + D\sin\sqrt{b}\,x$. If both forms are right, there must
be connections between exponential and sinusoidal functions,
which we now proceed to investigate.
15. Complex Exponentials. — Let us investigate the function
$e^{ix}$ by series methods. We have at once
$$e^{ix} = 1 + ix - \frac{x^2}{2!} - \frac{ix^3}{3!} + \frac{x^4}{4!} + \cdots = \left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots\right) + i\left(x - \frac{x^3}{3!} + \cdots\right),$$
or
$$e^{ix} = \cos x + i\sin x.$$
Similarly we have
$$e^{-ix} = \cos x - i\sin x. \tag{9}$$
We can solve for cos x by adding these equations and dividing
by 2, or for sin x by subtracting and dividing by 2i:
$$\cos x = \frac{e^{ix} + e^{-ix}}{2}, \qquad \sin x = \frac{e^{ix} - e^{-ix}}{2i}. \tag{10}$$
These theorems are fundamental in the study of exponential
and sinusoidal functions.
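These theorems are easily verified numerically. The following Python sketch sums the exponential series for an imaginary argument and forms the cosine and sine from the two exponentials as in Eq. (10). The function names and the number of terms retained are assumptions chosen for illustration.

```python
import cmath
import math

def exp_series(z, n_terms=30):
    """Partial sum 1 + z + z^2/2! + ... of the exponential series."""
    total = 0j
    term = 1 + 0j
    for n in range(n_terms):
        total += term
        term *= z / (n + 1)   # next term z^(n+1) / (n+1)!
    return total

def cos_from_exponentials(x):
    """Eq. (10): cos x = (e^{ix} + e^{-ix}) / 2."""
    return ((cmath.exp(1j * x) + cmath.exp(-1j * x)) / 2).real

def sin_from_exponentials(x):
    """Eq. (10): sin x = (e^{ix} - e^{-ix}) / 2i."""
    return ((cmath.exp(1j * x) - cmath.exp(-1j * x)) / 2j).real
```

For a real x of moderate size, exp_series(1j * x) should agree with cos x + i sin x to rounding error.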
In terms of the formulas of the last paragraph, we can readily
see that our two formulations of simple harmonic motion are
both correct. For we have
$$Ae^{i\sqrt{b}\,x} + Be^{-i\sqrt{b}\,x} = A(\cos\sqrt{b}\,x + i\sin\sqrt{b}\,x) + B(\cos\sqrt{b}\,x - i\sin\sqrt{b}\,x)$$
$$= (A + B)\cos\sqrt{b}\,x + i(A - B)\sin\sqrt{b}\,x,$$
or one constant times the cosine plus another times the sine,
which is the more familiar solution. By giving A and B suitable
complex values, we can have both coefficients real. But to
know how to do this, and to understand the whole process, we
should study complex numbers for themselves. Let us then
make a little survey of the theory of complex numbers.
16. Complex Numbers. — A complex number is usually written
A + Bi, where A and B are real, $i = \sqrt{-1}$. It is often plotted
in a diagram: we let abscissas represent real parts of numbers,
ordinates the imaginary parts, so that A measures the abscissa,
B the ordinate, of the point representing A + Bi. Every point
in the plane corresponds to a complex number, and vice versa.
All real numbers lie along the axis of abscissas, all pure imagi-
naries along the axis of ordinates, and the other complex numbers
between. But it is also often convenient to think of a complex
number as being represented, not merely by a point, but by the
vector from the origin out to the point. The fundamental reason
for this is that these vectors obey the parallelogram law of addition,
just as force or velocity vectors do (see Fig. 2). The vector treatment
is suggestive in many ways. For example, we can consider the
angle between two complex numbers. Thus, any real number, and
any pure imaginary number, are at an angle of 90 deg. to each
other. Or, the number 1 + i is at an angle of 45 deg. with either 1
or i. When a complex number is regarded as a vector, we can
describe it by two quantities: the absolute magnitude of the
vector, or its length, $\sqrt{A^2 + B^2}$; and the angle which it makes
with the real axis, or $\tan^{-1}(B/A)$.
The vector representation of complex numbers has very close
connection with complex exponential functions. Let us consider
the complex number $e^{i\theta}$, where $\theta$ is a real quantity. As we have
seen, this equals $\cos\theta + i\sin\theta$, so that the real part is $\cos\theta$,
the imaginary part $\sin\theta$. The vector representing this number
is then a vector of unit magnitude, for $\sqrt{\cos^2\theta + \sin^2\theta} = 1$.
Further, it makes just the angle $\theta$ with the real axis. We can
see interesting special cases. The number $e^{\pi i/2} = i$, as we can
see at once from the vector diagram, or from the fact that it
equals $\cos\pi/2 + i\sin\pi/2 = i$. Similarly $e^{\pi i} = -1$,
$e^{2\pi i} = e^{4\pi i} = \cdots = 1$. This last result shows that the exponential
function of an imaginary argument is periodic with period $2\pi i$,
similarly to the sine and cosine of a real argument.
Fig. 2. — Law of addition of complex vectors. The vector E + Fi
represents the vector sum of A + Bi and C + Di. Evidently OE =
OA + AE = OA + OC, and OF = OB + OD. Hence E + Fi =
(A + C) + (B + D)i.
Next we look at the number $re^{i\theta}$, where r, $\theta$ are both real. It
differs from $e^{i\theta}$ in that both real and imaginary parts are multiplied
by the same real factor r, which simply increases the length
of the vector to r, without changing the angle. Thus $re^{i\theta}$ is a
vector of length r, angle $\theta$. As a result, we can easily write any
complex number in complex exponential form: $A + Bi = re^{i\theta}$,
where $r = \sqrt{A^2 + B^2}$, $\theta = \tan^{-1}(B/A)$, or $A = r\cos\theta$,
$B = r\sin\theta$ (see Fig. 3). We may use these results in showing
what happens when two complex numbers are multiplied
together. Suppose we wish to form the product (A + Bi)(C + Di).
Of course, multiplying directly, this equals $(AC - BD) + (AD + BC)i$,
so that we can easily find real and imaginary parts of the product, but
this is not very informing. It is better to write $A + Bi = r_1e^{i\theta_1}$,
$C + Di = r_2e^{i\theta_2}$. Then the product is
$(r_1e^{i\theta_1})(r_2e^{i\theta_2}) = (r_1r_2)e^{i(\theta_1+\theta_2)}$. That is, the magnitude
of the product of two complex numbers is the product
of the magnitudes, and the angle is the sum of their angles.
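The rule for products can be illustrated with a short Python sketch that multiplies two complex numbers by way of their magnitudes and angles. The function name is an assumption; cmath.polar and cmath.rect are the standard-library conversions between the two forms.

```python
import cmath

def multiply_polar(z1, z2):
    """Multiply by taking the product of magnitudes and the sum of angles."""
    r1, theta1 = cmath.polar(z1)   # z1 = r1 e^{i theta1}
    r2, theta2 = cmath.polar(z2)   # z2 = r2 e^{i theta2}
    return cmath.rect(r1 * r2, theta1 + theta2)
```

The result should agree with direct multiplication; for instance (1 + 2i)(3 - i) = 5 + 5i by either route.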
Suppose we have a complex number $re^{i\theta}$, and consider the
closely related number $re^{-i\theta}$. The second is called the conjugate
of the first. If we have a complex number in the form A + Bi,
its conjugate is $A - Bi$. Or in general, if we change the sign
of i wherever it appears in a complex number, we obtain its
conjugate. Graphically, the vector representing the conjugate
of a number is the mirror image of the vector representing the
number itself, in the axis of real numbers. Now conjugate
numbers have two important properties: the sum of a number
and its conjugate is real (for the imaginary parts just cancel in
taking this sum), and the product is real (for this equals
$r^2e^{i(\theta-\theta)} = r^2$). The second fact is useful in finding the absolute
magnitude of a complex number: if z is complex, $\bar{z}$ its conjugate
(this is the usual notation), then $\sqrt{z\bar{z}}$ equals the absolute
magnitude of z. From the other fact, we may find the real and
imaginary parts of complex numbers: $\frac{z + \bar{z}}{2}$ equals the real part
of z, and as we can easily show, $\frac{z - \bar{z}}{2i}$ equals the imaginary
part. We see examples in our relations between sinusoidal and
exponential functions where $e^{-ix}$ is the conjugate of $e^{ix}$, so that
$\frac{e^{ix} + e^{-ix}}{2}$ should, and does, equal the real part of $e^{ix}$, or cos x,
and $\frac{e^{ix} - e^{-ix}}{2i}$ equals the imaginary part, or sin x.
Fig. 3. — The complex number P equals either A + Bi, or $re^{i\theta}$.
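The properties of conjugates just stated can be written out as a small Python sketch. The function names are illustrative assumptions; Python's built-in complex type supplies the conjugate.

```python
import cmath

def real_part(z):
    """(z + conjugate) / 2."""
    return ((z + z.conjugate()) / 2).real

def imag_part(z):
    """(z - conjugate) / 2i."""
    return ((z - z.conjugate()) / 2j).real

def magnitude(z):
    """Square root of z times its conjugate."""
    return cmath.sqrt(z * z.conjugate()).real
```

For z = 3 + 4i these should give 3, 4, and 5 respectively.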
17. Application of Complex Numbers to Vibration Problems. —
There are two different, though related, ways of applying com-
plex numbers to vibration problems. The first, and perhaps
more logical, is directly suggested by what we have done. We
found for undamped vibrations that $y = Ae^{i\sqrt{b}\,x} + Be^{-i\sqrt{b}\,x}$.
Now naturally we wish y to be real, since it represents a real
displacement. To do this, we make use of the proposition that
we have just found, that the sum of a complex number and its
conjugate is real. Since $e^{-i\sqrt{b}\,x}$ is the conjugate of $e^{i\sqrt{b}\,x}$, we
achieve the desired result if we make $B = \bar{A}$, for then the whole
second term is just the conjugate of the first. Incidentally,
if we write $A = \frac{C}{2}e^{-i\alpha}$, we have
$$y = \frac{C}{2}e^{i(\sqrt{b}\,x-\alpha)} + \frac{C}{2}e^{-i(\sqrt{b}\,x-\alpha)} = C\cos(\sqrt{b}\,x - \alpha), \tag{11}$$
giving a form, in terms of amplitude C and phase a, which is
often useful and important.
The second method of treatment is more common, particularly
in electrical applications. Suppose we work directly with the
complex solution $y = Ae^{i\sqrt{b}\,x}$, but consider that only the real
part is of physical significance. This real part, as we have
seen, is half the sum of this quantity and its conjugate, so that,
except for a factor of 2, it comes to the same thing we have
considered before. However, it is often easier to think of it in
this way, and the process of using a complex solution, and finally
taking the real part, is very common. Of course, if A is real,
the real part is simply $A\cos\sqrt{b}\,x$; if A is complex, we may write
it $Ce^{-i\alpha}$, and the real part of the product is $C\cos(\sqrt{b}\,x - \alpha)$.
This second method is particularly interesting in discussing simple
harmonic motion, where x is replaced by t, and $\sqrt{b}$ by $\omega$, so
that we are considering the real part of $Ae^{i\omega t}$. The complex
number is given by a vector of length A, rotating in the complex
plane with angular velocity $\omega$. And its real part is simply the
projection of the vector along the real axis. Thus it corresponds
exactly to the most elementary formulation of simple harmonic
motion, as the projection of a circular motion on a diameter.
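The second method can be sketched in a few lines of Python: the complex constant A is split into amplitude C and phase $\alpha$, and the displacement is taken as the real part of the rotating vector. The function names and sample values are assumptions made for illustration.

```python
import cmath
import math

def amplitude_and_phase(A):
    """Write A = C e^{-i alpha} and return (C, alpha)."""
    C, angle = cmath.polar(A)
    return C, -angle

def displacement(A, omega, t):
    """Real part of the rotating vector A e^{i omega t}."""
    return (A * cmath.exp(1j * omega * t)).real
```

With A = 2 e^{-0.3i}, for example, the displacement at any time should equal 2 cos(omega t - 0.3).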
Problems
1. Show directly that the solution $A\sin\omega t + B\cos\omega t$ for the particle
moving with simple harmonic motion can also be written $C\cos(\omega t - \alpha)$.
Find C and $\alpha$ as functions of A and B, and vice versa. The constant C is
called the amplitude of the motion, and $\alpha$ is called the phase. Note that $\alpha$
can be regarded as an angle, measured in radians.
2. A pendulum 1 m. long is held at an angle of 1 deg. to the vertical,
and released with an initial velocity of 5 cm. per second toward the position
of equilibrium. Find amplitude and phase of the resulting motion.
3. A circuit contains resistance, inductance, and capacity, but there is no
impressed e.m.f. Solve the differential equation in series, and show by
comparison of the first few terms that the series represents the function
$e^{-(R/2L)t}(A\sin\omega t + B\cos\omega t)$, where $\omega^2 = 1/LC - R^2/4L^2$.
4. In an oscillatory circuit, show that the phases of the charge and the
current differ by 90 deg.
5. Given a complex number represented by a vector, what is the nature
of the vector representing its square root; its cube root? Find the three
cube roots of unity, the four fourth roots, the five fifth roots, plotting them
in the complex plane, and giving real and imaginary components of each.
With one of the cube roots, in terms of its real and imaginary parts, cube by
direct multiplication and show that the result is unity.
6. Find real and imaginary parts of V3~+5t, jT+lBi y/ A + Bi ^^
A, B are real. .
7. Show that $\ln(-a) = \pi i + \ln a$, or $3\pi i + \ln a$, or in general $n\pi i + \ln a$,
where n is an odd integer.
8. Prove that if we have a complex solution of the problem of a vibrating
particle, the real part of this complex function is itself a solution of the
problem.
9. Show that in general a linear homogeneous differential equation of the
nth order with constant coefficients has n independent exponential solutions
of the sort we have considered.
10. Show that if we have n independent solutions of an nth order differential
equation, then an arbitrary linear combination of these solutions,
containing n coefficients, is a general solution of the equation.
CHAPTER IV
DAMPED VIBRATIONS, FORCED VIBRATIONS, AND
RESONANCE
We have now reached the point where we can discuss a wide
range of problems in oscillatory mechanical or electrical systems.
The general question we shall take up is that of a system containing
inertia, damping force proportional to the velocity, and
restoring force proportional to the displacement, under the
action of an impressed force. This leads to an inhomogeneous
second-order linear differential equation, of the form
$$m\frac{d^2x}{dt^2} + 2mk\frac{dx}{dt} + m\omega^2x = F(t), \tag{1}$$
where the coefficients 2mk and $m\omega^2$ of the damping and restoring
force terms, respectively, are written in this way to obtain a
simple result. The term F(t), which makes the equation inhomogeneous,
is the impressed force, a function of time. The solution
of such an inhomogeneous equation, as we have seen, can be
written as a sum of two parts. One is a particular solution
of the problem, the so-called forced motion, a steady-state
solution which persists as long as the force is applied. The other
is the transient term, a general solution of the corresponding
homogeneous equation obtained by setting F = 0. This
transient proves to be a damped simple harmonic motion, an
oscillation whose amplitude decreases exponentially with time,
soon passing away, and leaving only the steady-state solution.
The amplitude and phase of the transient are determined so
that the whole motion will have the correct initial displacement
and velocity, its two arbitrary constants being chosen to fit
the initial conditions.
18. Damped Vibrational Motion. — We first consider the
transient motion, whose equation is obtained from (1) above
by setting F = 0. In the preceding chapter we have seen that
the solution can be written
$$x = e^{-kt}\left(Ae^{\sqrt{k^2-\omega^2}\,t} + Be^{-\sqrt{k^2-\omega^2}\,t}\right). \tag{2}$$
There are three cases: (1) $k^2 - \omega^2 < 0$; (2) $k^2 - \omega^2 = 0$; (3) $k^2 - \omega^2 > 0$.
The first is the case where the damping is small. Here
$\sqrt{k^2 - \omega^2} = i\sqrt{\omega^2 - k^2}$, and the radical is real. Then we have
the same sort of expression we have considered before, and to
get a real answer we must write $B = \bar{A}$, or else we can take the
real part of a complex quantity. Let us do the latter: the solution
is the real part of $Ae^{-kt}e^{i\sqrt{\omega^2-k^2}\,t}$, or is
$Ce^{-kt}\cos(\sqrt{\omega^2-k^2}\,t - \alpha)$. This is like a simple harmonic motion, of angular velocity
$\sqrt{\omega^2 - k^2}$, phase $\alpha$, but with an amplitude $Ce^{-kt}$ which continually
decreases with time, and it is called damped simple harmonic
motion. For small damping, the angular velocity can be
expanded in power series, and is $\omega - \frac{k^2}{2\omega} \cdots$, differing from $\omega$
by a small quantity of the second order. Thus, for example,
a pendulum which is slightly damped will have its period only
very slightly altered by the damping. The amplitudes of
successive swings go down in exponential fashion, on account of the
factor $e^{-kt}$. Thus the logarithms of the amplitudes go down
linearly with the time, and as a result this kind of damping is
known as logarithmic damping. The decrease in the logarithm of
the amplitude in a period is known as the logarithmic decrement.
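The logarithmic decrement can be checked numerically. In the Python sketch below, which assumes small damping (k less than $\omega$), the decrement is computed as k times the damped period; the function names and sample values are illustrative assumptions.

```python
import math

def damped_displacement(t, C, k, omega, alpha=0.0):
    """x = C e^{-kt} cos(sqrt(omega^2 - k^2) t - alpha), valid for k < omega."""
    w_d = math.sqrt(omega**2 - k**2)
    return C * math.exp(-k * t) * math.cos(w_d * t - alpha)

def damped_period(k, omega):
    """Period of the damped oscillation, 2 pi / sqrt(omega^2 - k^2)."""
    return 2 * math.pi / math.sqrt(omega**2 - k**2)

def logarithmic_decrement(k, omega):
    """Decrease of the logarithm of the amplitude in one period: k T."""
    return k * damped_period(k, omega)
```

The logarithm of the ratio of the displacements one period apart should equal the decrement, and for small k the damped angular velocity should be close to $\omega - k^2/2\omega$.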
The other extreme case is the third, where $k^2 - \omega^2 > 0$, and
there is nothing complex about the solution at all. It simply
consists of two exponential terms, with only real coefficients.
The resulting motion is not oscillatory, but merely damps down
gradually to zero. The limiting case, $k^2 - \omega^2 = 0$, is called
the critical case, and is most easily discussed as the limit of
either of the others. An interesting practical application of
all the cases is found in the problem of the vibrations of galvanom-
eters. A galvanometer without damping oscillates back and
forth with simple harmonic motion. With slight damping, it
has nearly the same frequency, but a logarithmic decrement.
As the damping is made greater and greater, the period gets
larger and larger, until finally at critical damping and beyond
there are no oscillations at all. The galvanometer, if displaced,
simply settles slowly back to its normal position.
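The three cases can be distinguished by the sign of $k^2 - \omega^2$, as in the following Python sketch. The labels returned and the function names are assumptions chosen for illustration; the two exponents are those appearing in Eq. (2).

```python
import cmath

def damping_case(k, omega):
    """Classify the motion by the sign of k^2 - omega^2."""
    d = k * k - omega * omega
    if d < 0:
        return "oscillatory"   # case (1): small damping
    if d == 0:
        return "critical"      # case (2): limiting case
    return "overdamped"        # case (3): two real exponentials

def exponents(k, omega):
    """The two exponents -k +/- sqrt(k^2 - omega^2) of Eq. (2)."""
    root = cmath.sqrt(k * k - omega * omega)
    return -k + root, -k - root
```

For large damping both exponents are real and negative; for small damping they form a conjugate pair whose imaginary part is the angular velocity of the damped oscillation.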
19. Damped Electrical Oscillations. — The corresponding
electrical problem is given by the circuit containing resistance,
inductance, and capacity, and the equation is
$$L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{q}{C} = 0. \tag{3}$$
The solution is
$$q = ce^{-(R/2L)t}\cos(\omega t - \alpha), \tag{4}$$
where
$$\omega = \sqrt{1/LC - R^2/4L^2}.$$
This is the same solution which we found in Prob. 3 of the last
chapter by the series method. It is an interesting illustration
of the simplicity of the exponential method of solving the equa-
tion. As we see, the current oscillates with an angular velocity
which, for small R, differs only slightly from the undamped
angular velocity $\sqrt{1/LC}$, but it has a logarithmic damping,
which is greater the greater R is.
20. Initial Conditions for Transients. — To fix the two arbitrary
constants of the transient, we must fit the initial displacement
and velocity. Thus, for instance, consider the solution in the
form
$$x = Ce^{-kt}\cos(\sqrt{\omega^2-k^2}\,t - \alpha).$$
Assume that at t = 0, $x = x_0$, and $dx/dt = v_0$. From the first,
$$x_0 = C\cos\alpha. \tag{5}$$
To apply the second, we have
$$\frac{dx}{dt} = -Ce^{-kt}\sqrt{\omega^2-k^2}\,\sin(\sqrt{\omega^2-k^2}\,t - \alpha) - kCe^{-kt}\cos(\sqrt{\omega^2-k^2}\,t - \alpha).$$
Thus
$$v_0 = C\sqrt{\omega^2-k^2}\,\sin\alpha - kC\cos\alpha. \tag{6}$$
By simultaneous solution of Eqs. (5) and (6) we can find C
and $\alpha$ in terms of $x_0$ and $v_0$.
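One way of carrying out the simultaneous solution is sketched below in Python. Eq. (5) gives $C\cos\alpha$ directly, and Eq. (6) can be rearranged to give $C\sin\alpha$; the amplitude and phase then follow. The function names are assumptions, and the sketch takes the damping to be small (k less than $\omega$).

```python
import math

def fit_transient(x0, v0, k, omega):
    """Solve Eqs. (5) and (6) for C and alpha, assuming k < omega."""
    w_d = math.sqrt(omega**2 - k**2)
    c_cos = x0                      # Eq. (5): C cos(alpha) = x0
    c_sin = (v0 + k * x0) / w_d     # Eq. (6) rearranged for C sin(alpha)
    return math.hypot(c_cos, c_sin), math.atan2(c_sin, c_cos)

def displacement(t, C, alpha, k, omega):
    w_d = math.sqrt(omega**2 - k**2)
    return C * math.exp(-k * t) * math.cos(w_d * t - alpha)

def velocity(t, C, alpha, k, omega):
    """dx/dt of the damped solution, as written out before Eq. (6)."""
    w_d = math.sqrt(omega**2 - k**2)
    phase = w_d * t - alpha
    return -C * math.exp(-k * t) * (w_d * math.sin(phase) + k * math.cos(phase))
```

Evaluating displacement and velocity at t = 0 with the fitted constants should return the prescribed $x_0$ and $v_0$.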
21. Forced Vibrations and Resonance. — Our next task is to
find a particular solution of Eq. (1) containing the external
applied force. To do this, we shall first solve the case where
the force is a sinusoidal function of the time, a very important
special case. This leads to a solution also sinusoidal with the
same frequency, with an amplitude proportional to the amplitude
of the force, but for which the constant of proportionality depends
on the frequency, becoming large out of all proportion if the
impressed frequency is nearly equal to the natural frequency.
This phenomenon of enormously exaggerated response of the
oscillating system to a certain impressed frequency is called
resonance; it is of great physical importance.
Familiar examples of resonance will occur to one. In
mechanics, it is well known that a pendulum can be set swing-
ing with large oscillations if it receives small periodic impulses,
timed to synchronize with its own period, whereas any other
impressed frequency would soon get out of step with the oscil-
lations it sets up, and would force them to die down again.
Acoustical resonance is illustrated by the way in which one
vibrating tuning fork will set another into vibration if both have
the same pitch, but not otherwise. Another acoustical example
comes from Helmholtz's resonators: air chambers vibrating
with a definite pitch, which are set into resonant vibration if
sound of that particular pitch falls on them, but not appreciably
by any other pitch, so that they can be used to pick out a particu-
lar note in a complicated sound and estimate its intensity.
The resonance of electric circuits is illustrated in the tuned
circuits of the radio, which respond only to sending stations
of a particular wave length, and practically not at all to other
stations. In optics, the theory of refractive index and absorp-
tion coefficient is closely connected with resonance. As is
shown by the sharp spectrum lines, atoms contain oscillators
capable of damped simple harmonic motion, or at any rate act
as if they did; the real theory, using wave mechanics, is com-
plicated but leads essentially to this result. An external light
wave is a sinusoidal impressed force, leading to a forced motion
of the oscillators with the same frequency but different phase.
The component of motion in phase with the field reacts back on
the field to change its phase, and this progressive change of
phase as the light travels through the body is interpreted as a
changed velocity of propagation, or an index of refraction differ-
ent from unity. Similarly the other component produces a
diminution of intensity, or absorption. The phenomenon of
anomalous dispersion, with abnormally large index of refraction
and absorption coefficient, comes about when the external wave
is in resonance with the atom.
22. Mechanical Resonance. — Let the external force be $F_0\cos\omega t$.
It is simpler to regard this as being the real part of
$F_0e^{i\omega t}$. Thus we use the differential equation
$$m\frac{d^2x}{dt^2} + 2mk\frac{dx}{dt} + m\omega_0^2x = F_0e^{i\omega t}, \tag{7}$$
where we use $\omega_0$ for the natural angular frequency, to distinguish
from the impressed angular velocity $\omega$. The resulting x will be
complex, and its real part represents the actual motion. Now we
assume that the forced motion has the same frequency as the
impressed force, or that $x = Ae^{i\omega t}$, where A may be complex.
If $A_r$ and $A_i$ are the real and imaginary parts of A, we easily see
that the real part of x is given by
$$A_r\cos\omega t - A_i\sin\omega t, \tag{8}$$
so that in general the motion has one term in phase with the
force, whose amplitude is given by the real part of A, and another
out of phase, the amplitude being the negative of the imaginary
part. Substituting our exponential formula for x in Eq. (7), we
have
$$[m(-\omega^2) + 2mk(i\omega) + m\omega_0^2]Ae^{i\omega t} = F_0e^{i\omega t}.$$
Canceling the exponential, we have
$$A = \frac{F_0}{m}\,\frac{1}{(\omega_0^2 - \omega^2) + 2ik\omega}. \tag{9}$$
To get the coefficients of terms in phase and out of phase with
the force, or $A_r$ and $-A_i$, we multiply numerator and denominator
by the conjugate of the denominator, obtaining respectively
$$A_r = \frac{F_0}{m}\,\frac{\omega_0^2 - \omega^2}{(\omega_0^2 - \omega^2)^2 + 4k^2\omega^2}$$
and
$$-A_i = \frac{F_0}{m}\,\frac{2k\omega}{(\omega_0^2 - \omega^2)^2 + 4k^2\omega^2}. \tag{10}$$
These two functions are plotted in Fig. 4. It is seen that the
first has the form made familiar by the anomalous dispersion
curve in optics, the second resembling the corresponding absorp-
tion curve. This resemblance is an essential one, as we shall
see in Chap. XXIV. One feature of the curves should be
mentioned. The anomalous behavior in the neighborhood of
$\omega_0$ is confined to a narrower and narrower band of frequencies
as k becomes smaller and smaller compared with $\omega_0$, so that if
the damping is very small the resonance is very sharp, while if
there is large damping, there is a broad range of frequencies over
which resonance is appreciable.
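The resonance formulas can be explored numerically with the following Python sketch, which evaluates the complex amplitude of Eq. (9) and the two real components of Eq. (10). The function names and the sample values used with them are assumptions made for illustration.

```python
def complex_amplitude(F0, m, k, omega0, omega):
    """Eq. (9): A = (F0/m) / ((omega0^2 - omega^2) + 2 i k omega)."""
    return (F0 / m) / ((omega0**2 - omega**2) + 2j * k * omega)

def in_and_out_of_phase(F0, m, k, omega0, omega):
    """Eq. (10): the components A_r (in phase) and -A_i (out of phase)."""
    denom = (omega0**2 - omega**2) ** 2 + 4 * k**2 * omega**2
    a_in = (F0 / m) * (omega0**2 - omega**2) / denom
    a_out = (F0 / m) * 2 * k * omega / denom
    return a_in, a_out
```

At exact resonance the in-phase component vanishes and the out-of-phase component equals $F_0/2mk\omega_0$, which grows without limit as the damping constant k is made small.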
23. Electrical Resonance. — Suppose that a dynamo supplies
sinusoidally alternating electromotive force, given by $E\cos\omega t$,
to an electric circuit containing resistance, inductance, and
capacity. The differential equation for the charge is then
$$L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{q}{C} = E\cos\omega t. \tag{11}$$
Fig. 4. — Amplitude of forced motion of an oscillator, as function of frequency.
(a) Component in phase with force; (b) component out of phase.
We set up instead the differential equation for the current $i = dq/dt$,
which we obtain from Eq. (11) by differentiating with
respect to time:
$$L\frac{d^2i}{dt^2} + R\frac{di}{dt} + \frac{i}{C} = \frac{d}{dt}(E\cos\omega t). \tag{12}$$
As with the mechanical case, we replace $E\cos\omega t$ by the complex
exponential $Ee^{i\omega t}$, of which the real part gives the electromotive
force. Similarly we assume the current to be sinusoidal, given
by the real part of $i_0e^{i\omega t}$. Making these changes in Eq. (12),
and carrying out the differentiations, we have
$$\left(-L\omega^2 + iR\omega + \frac{1}{C}\right)i_0e^{i\omega t} = i\omega Ee^{i\omega t}, \qquad i_0 = \frac{E}{R + i(L\omega - 1/C\omega)}. \tag{13}$$
The denominator here equals $Ze^{i\alpha}$, where
$$Z = \sqrt{R^2 + X^2}, \qquad X = L\omega - \frac{1}{C\omega}, \tag{14}$$
and
$$\alpha = \tan^{-1}\frac{X}{R},$$
where X is called the reactance, Z the impedance. Then the
current is
$$i = \frac{E}{Z}\cos(\omega t - \alpha). \tag{15}$$
The impedance takes the place of the resistance in problems
involving alternating currents, since we divide the amplitude of
the e.m.f. by the impedance rather than by the resistance to get
the amplitude of the current. We note that the impedance is a
function of frequency. It becomes infinite when the frequency
becomes zero, on account of the term involving the capacity,
and showing that a direct current cannot go through a con-
denser; and also when the frequency becomes infinite, on account
of the term in the inductance, showing that infinitely rapid
oscillations cannot pass through the inductance. In between, it
goes through a minimum, at the frequency for which X = 0,
or $\omega = 1/\sqrt{LC}$, the natural frequency at which the circuit
would oscillate by itself if it had no resistance or impressed
e.m.f. Thus for impressed e.m.fs. of the same amplitude, but
of a variety of frequencies, that whose frequency agrees most
closely with the natural frequency will produce the largest cur-
rent, and the others may produce much smaller currents, so that
we have resonance, or tuning. To tune a circuit, one adjusts
L or C, or both. When it is tuned, the sharpness of tuning
depends on the size of R. For instance, if R were 0, there would
be infinite response at exact resonance, so that the tuning would
be infinitely sharp.
In addition to the dependence of amplitude on frequency,
there is also a phase difference between e.m.f. and current, given
by the quantity a above. We can get a simple interpretation
of this in the complex plane. The quantity R + iX is called
the complex impedance. Its magnitude is just the real imped-
ance Z, and its phase, or angle, is the angle a. It is interesting
to note that a goes from —90 deg. at zero frequency to +90 deg.
at infinite frequency, passing through zero at resonance.
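The behavior of impedance and phase with frequency can be examined with a short Python sketch built on the complex impedance R + iX. The function names and the circuit values used with them are illustrative assumptions.

```python
import cmath
import math

def complex_impedance(R, L, C, omega):
    """R + iX, with reactance X = L omega - 1/(C omega)."""
    return complex(R, L * omega - 1 / (C * omega))

def impedance_and_phase(R, L, C, omega):
    """Magnitude Z and angle alpha of the complex impedance."""
    return cmath.polar(complex_impedance(R, L, C, omega))

def current_amplitude(E, R, L, C, omega):
    """Amplitude E/Z of the forced current, as in Eq. (15)."""
    Z, _ = impedance_and_phase(R, L, C, omega)
    return E / Z

def resonant_omega(L, C):
    """Angular frequency 1/sqrt(LC) at which the reactance vanishes."""
    return 1 / math.sqrt(L * C)
```

At the resonant frequency Z reduces to R and the phase to zero; below resonance the phase is negative and above it positive, in agreement with the statement in the text.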
24. Superposition of Transient and Forced Motion. — The
general solution of an oscillatory problem is the sum of the steady-
state motion (the particular solution), and a transient with
arbitrary amplitude and phase, chosen to satisfy the initial con-
ditions. Thus, choosing an electrical case, we may have no
charge and current in a circuit at t = 0, but start applying a
sinusoidal e.m.f. at that instant. The charge and current at
any later time are given by
$$q = Ae^{-\frac{R}{2L}t}\cos(\omega_0t - \alpha_0) + \frac{E}{\omega Z}\sin(\omega t - \alpha),$$
$$i = -A\omega_0e^{-\frac{R}{2L}t}\sin(\omega_0t - \alpha_0) - A\frac{R}{2L}e^{-\frac{R}{2L}t}\cos(\omega_0t - \alpha_0) + \frac{E}{Z}\cos(\omega t - \alpha),$$
where $\omega_0$ is the natural angular frequency, A and $\alpha_0$ the amplitude
and phase of the transient. Then to determine A and $\alpha_0$ we have
the equations
$$0 = q_0 = A\cos\alpha_0 - \frac{E}{\omega Z}\sin\alpha,$$
$$0 = i_0 = A\omega_0\sin\alpha_0 - A\frac{R}{2L}\cos\alpha_0 + \frac{E}{Z}\cos\alpha, \tag{16}$$
where $q_0$, $i_0$ are initial charge and current, equal to zero for these
particular initial conditions.
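The determination of the transient from Eqs. (16) may be sketched in Python as follows. The sketch assumes that the natural angular frequency of the transient is the damped value $\sqrt{1/LC - R^2/4L^2}$ found in Sec. 19, and that the resistance is small enough for this to be real; the function name and sample values are likewise assumptions.

```python
import math

def driven_circuit(t, E, R, L, C, omega):
    """Return (q, i) at time t for zero initial charge and current."""
    X = L * omega - 1 / (C * omega)              # reactance
    Z = math.hypot(R, X)                         # impedance
    alpha = math.atan2(X, R)                     # phase of the forced current
    gamma = R / (2 * L)                          # decay rate of the transient
    omega0 = math.sqrt(1 / (L * C) - gamma**2)   # assumed damped natural frequency
    # Eqs. (16) with q0 = i0 = 0, solved for A cos(alpha0) and A sin(alpha0).
    a_cos = E / (omega * Z) * math.sin(alpha)
    a_sin = (gamma * a_cos - E / Z * math.cos(alpha)) / omega0
    A, alpha0 = math.hypot(a_cos, a_sin), math.atan2(a_sin, a_cos)
    decay = math.exp(-gamma * t)
    q = (A * decay * math.cos(omega0 * t - alpha0)
         + E / (omega * Z) * math.sin(omega * t - alpha))
    i = (-A * omega0 * decay * math.sin(omega0 * t - alpha0)
         - A * gamma * decay * math.cos(omega0 * t - alpha0)
         + E / Z * math.cos(omega * t - alpha))
    return q, i
```

Both quantities should vanish at t = 0, and at later times the current returned should agree with the time derivative of the charge.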
Three examples of the charge as a function of time are given
in Fig. 5. In (a), the natural frequency is taken to be much
greater than the external frequency, and the logarithmic decre-
ment large, so that the transient is a rapidly damped high
frequency vibration, which is imperceptible after a few periods
of the external force. The case (b) is that in which external and
natural frequencies are almost equal, and the damping small.
In this case, the forced and transient vibrations, having almost
the same frequencies, form beats with each other, as one always
has when two almost equal frequencies are superposed, the sum
of two sine waves leading to a sinusoidal vibration whose fre-
quency is the average of the two frequencies, but whose amplitude
is modulated with the slow difference frequency between the two
vibrations, as given by the equation
$$\cos\omega_1t + \cos\omega_2t = 2\cos\left(\frac{\omega_1 + \omega_2}{2}t\right)\cos\left(\frac{\omega_1 - \omega_2}{2}t\right).$$
Since the transient gradually dies down, however, the amplitude
of the beats grows less and less, until gradually only the forced
motion remains. In the case (c), the external frequency is
exactly equal to the natural frequency. Here there are no beats,
the amplitude merely building up exponentially to its final
value.
Fig. 5. — Transient and forced motion superposed. Curve A is forced motion, B transient, C combined motion.
(a) Natural frequency high, impressed frequency low, large damping.
(b) Impressed and natural frequency approximately equal.
(c) Impressed and natural frequency equal.
25. Motion under General External Forces. — If we are given
an arbitrary external force, say F(t), we shall show in a later
chapter that it is possible to write it as a sum of sinusoidal terms:
$$F(t) = \text{real part of } \sum_n F_n e^{i\omega_n t}.$$
Thus any sound may be considered as made up of a superposition
of pure tones, and any light as a superposition of pure colors.
Now suppose we find the forced motion resulting from each
of these sinusoidal vibrations acting separately, and then add
them. The result will be the solution of the whole problem.
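For suppose $x_n(t)$ is the solution of the problem whose force is
the nth term of the summation, so that we have
$$\left(m\frac{d^2}{dt^2} + 2mk\frac{d}{dt} + m\omega_0^2\right)x_n = F_n e^{i\omega_n t}.$$
Add all these equations. Then we have
$$\left(m\frac{d^2}{dt^2} + 2mk\frac{d}{dt} + m\omega_0^2\right)\sum_n x_n = \sum_n F_n e^{i\omega_n t},$$
showing that $\sum_n x_n$ satisfies the whole equation. We readily
see that this is a special case of a general theorem: if the impressed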
force, in an inhomogeneous linear equation, is written as a sum
of terms, and if we have solutions of the separate problems in
which only one term of the sum is impressed at a time, the solu-
tion of the whole problem is the sum of these separate solutions.
We note that those particular forces whose frequencies are near
the natural frequency will produce greatly exaggerated responses.
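The superposition argument can be tried out numerically. The Python sketch below is an illustration under assumptions: we suppose the equation of motion to be m x'' + 2mk x' + m w0^2 x = F(t), take a force made of three cosine terms, find the forced motion for each term separately from its complex amplitude, add them, and check by finite differences that the sum satisfies the whole equation. The numerical values and the function names are hypothetical.

```python
import cmath
import math

# Hypothetical oscillator constants: mass, damping constant, natural frequency.
m, k, omega0 = 1.0, 0.1, 2.0

# Hypothetical force terms (F_n, omega_n); the force is the sum of F_n cos(omega_n t).
terms = [(1.0, 0.5), (0.7, 1.9), (0.4, 3.0)]

def amplitude(F_n, w):
    """Complex amplitude of the forced motion under F_n exp(i w t)."""
    return F_n / (m * (omega0**2 - w**2 + 2j * k * w))

def forced_motion(t):
    """Sum of the separate forced motions, one for each term of the force."""
    return sum((amplitude(F_n, w) * cmath.exp(1j * w * t)).real
               for F_n, w in terms)

def residual(t, h=1e-4):
    """Left side minus right side of the equation of motion, by finite differences."""
    x_plus, x_mid, x_minus = forced_motion(t + h), forced_motion(t), forced_motion(t - h)
    acceleration = (x_plus - 2.0 * x_mid + x_minus) / h**2
    velocity = (x_plus - x_minus) / (2.0 * h)
    force = sum(F_n * math.cos(w * t) for F_n, w in terms)
    return m * acceleration + 2.0 * m * k * velocity + m * omega0**2 * x_mid - force
```

With these values, abs(amplitude(1.0, 1.9)) is several times abs(amplitude(1.0, 0.5)), illustrating the exaggerated response to the term whose frequency lies near the natural frequency.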
26. Generalizations Regarding Linear Differential Equations.
We have made several generalizations regarding linear differential
equations, and it is well to group these together. We have seen
that
1. Any linear combination of solutions of a homogeneous linear
differential equation is itself a solution, and if the linear combina-
tion contains as many arbitrary constants as the order of the
differential equation, it is a general solution.
2. A general solution of an inhomogeneous linear differential
equation is the sum of a particular solution, and a general solu-
tion of the corresponding homogeneous equation.
3. If the inhomogeneous part of an inhomogeneous linear
differential equation is a sum of terms, and if we have the solu-
tions of the equations formed by taking just one of these sepa-
rately, the particular solution of the whole problem can be formed
by adding these separate solutions.
Physically, the first statement means that free vibrations of a
system governed by a linear differential equation may be superposed
without affecting each other. The second means that
free vibrations can coexist with forced vibrations; and the last,
that forced vibrations from different sources can coexist without
affecting each other. All these properties of coexistence or
superposability of vibrations are characteristic only of linear
equations, but, as we shall see, a great many physical phenomena
are governed by such equations, so that the superposability of
vibrations is of widespread physical importance.
Problems
1. A coil of resistance 2 ohms, inductance 10 millihenries, is connected to a
condenser of capacity 10 mf. At t = 0, the condenser is charged to a
potential of 100 volts, and no current is flowing. Find the charge on the
condenser at any later time, and also the current flowing. What are the
period and logarithmic decrement of the circuit? What would the resist-
ance have to be, leaving inductance and capacity the same, such that the
system would be critically damped?
2. Prove that the displacement of a particle in damped oscillation is given
by
$$x = e^{-kt}\left(x_0\cos\sqrt{\omega^2 - k^2}\,t + \frac{v_0 + kx_0}{\sqrt{\omega^2 - k^2}}\sin\sqrt{\omega^2 - k^2}\,t\right),$$
where $x_0$, $v_0$ are initial values of displacement and velocity. Pass to the
case of critical damping, by letting $\omega^2 - k^2$ approach zero. Show that the
resulting motion has one term of the form $te^{-kt}$, and prove directly that this
satisfies the differential equation.
3. Letting $\omega = k/2$, draw curves for x as a function of t, representing the
damped motion for the case where the initial velocity is zero but the initial
displacement is not, and also for the case where the initial displacement is
zero but the velocity is not.
4. A pendulum is damped so that its amplitude falls to half its value in
1 min. Its actual period is 2 sec. Find the change in the period which
there would be if the damping were not present. (Hint: use power series
expansion for frequency, treating $k$ as a small quantity.)
5. A radio receiving station has a circuit tuned to a wave length of 500 m.
It is desired to have the tuning sharp enough so that a frequency differing
from this by 10,000 cycles per second gives only 1 per cent as much response
as the natural frequency, for the same amplitude of signal. Work out
reasonable values of resistance, inductance, and capacity to accomplish this.
6. The sharpness of tuning of a vibrating system may be measured by
the so-called half breadth of the resonance band, or the frequency difference
between the two frequencies for which the amplitude of response is half
that at exact resonance. Prove that the ratio of half breadth to resonance
frequency is proportional to the logarithmic decrement, if the damping is
not too great.
7. A tuning fork of pitch C (256 vibrations per second) is so slightly
damped that its amplitude after 10 sec. is 10 per cent of the original amplitude.
It is set into oscillation, first by another fork of the same pitch, then
by one a semitone higher, both vibrating with the same amplitude. Find
the ratio of amplitudes of forced motion in the two cases. What will be
the pitch of the forced vibration in the second case?
8. The support of a simple pendulum moves horizontally back and forth
with simple harmonic motion. Show that this sets the pendulum into forced
motion, as if there were a force applied directly to the bob. Show that the
motion has the following behavior: The pendulum pivots about a point
not its point of support, but such that, if it were really pivoted here, its
natural period would be the actual period of the forced motion. Discuss
the cases where the pivotal point is below the point of support; above the
point of support. Neglect transients.
9. A particle subject to a linear restoring force and a viscous damping is
acted on by a periodic force whose frequency differs from the natural fre-
quency by a small quantity. The particle starts from rest at t = 0, and
builds up the motion. Discuss the whole problem, including initial condi-
tions. Consider what happens in the limiting case when the frequency
gets nearer and nearer the natural frequency, and the damping gets smaller
and smaller. Show that the results are as indicated in Fig. 5, (b), (c).
10. The amplitude of the forced current in a circuit is
$$i_0 = \frac{E}{R + i(L\omega - 1/C\omega)}.$$
Plot real part as abscissa, imaginary part as ordinate, obtaining a curve by
taking points for all frequencies. Find the equation of the resulting curve,
and prove that it is a circle.
11. Show that for a particle subject to a linear restoring force and viscous
damping the maximum amplitude occurs when the applied frequency is
less than the natural frequency. Find this resonance frequency. Show
that maximum energy is attained when the applied frequency equals the
natural frequency. What are the maximum amplitude and maximum
energy?
12. The motion of an anharmonic undamped oscillator is described by
$$m\frac{d^2x}{dt^2} + m\omega_0^2 x + bx^2 = 0,$$
where $b$ is a small quantity. Solve this equation by successive approximations,
expanding $x$ in a power series in powers of $b$.
13. If the oscillator in Problem 12 is acted on by a force A cos pt + B cos
qt, show that the steady-state solution contains terms of frequencies $2p$,
$2q$, $q + p$, $q - p$, $2q + p$, $2q - p$, etc. Note that superposition does not
hold for the equation above. These new frequencies are called combination
tones.
CHAPTER V
ENERGY
We have progressed far enough in our study of mechanics so
that it will pay to stop and survey the situation. Mechanics is
a large subject, and we may consider some of the directions in
which we could extend what we have done already. In the first
place, we may treat the mechanics of many sorts of systems. We
may have the mechanics of particles, or of rigid bodies, or of
deformable, elastic solids, or of fluid media. All these we shall
treat, in more or less detail, before we are through. What we
have done so far comes under the heading of mechanics of parti-
cles, and we shall look at that field in more detail.
In the first place, one almost never has real particles to deal
with in a mechanical problem. Probably the closest approach
is found in the kinetic theory of monatomic gases, where the
atoms act like movable points exerting forces on each other.
But often very large bodies can act as particles, as, for instance,
the planets in their motions about the sun. Then again we can
have essentially complicated systems, like pendulums, or weights
suspended on springs, which yet have such simple motions that
we can apply the methods of the mechanics of particles to them.
Many of the problems we have treated so far have been of this
sort.
A particle has three coordinates, which may be x, y, z, and the
problem of mechanics is to find the way in which these coordinates
change with time. The starting point is Newton's second law of
motion, giving the accelerations, or second time derivatives of
the coordinates, in terms of the forces. All of our problems so far,
whether dealing with actual particles or not, fall under this
classification, and in fact belong to the more restricted class of
one-dimensional problems, with but one coordinate x. The next
few chapters will be devoted to the two- and three-dimensional
cases of mechanics of a particle.
The one-dimensional motions of a particle fall into different
classes, depending on the type of force acting. We have treated
several sorts of forces: viscous resistances, linear restoring forces,
external forces which are arbitrary functions of time. That
is, the force may be a function of velocity, of position, or of time,
or, of course, of all three combined. Most common mechanical
problems are of this type, the force depending on v, x, and t, but
this is not necessary. For instance, in radiation problems, in
electromagnetic theory, one meets a force proportional to the
time derivative of acceleration, or to $d^3x/dt^3$, which turns out to
act much like a viscous resistance. But such cases are rare.
The simplest cases are those in which the force depends only
on the coordinate. Then, in one-dimensional motion, we can
always introduce a potential energy, which added to the kinetic
energy gives a total energy that stays constant, expressing the
conservation of energy. If, on the other hand, there are external
impressed forces, the energy may increase or decrease with time,
depending on whether the impressed forces do work on the system
or have work done on them; while, if there are frictional forces,
the energy will decrease with time, being dissipated in heat, for
which reason these forces are called dissipative forces. It is
plain that the study of different types of forces is closely tied up
with the idea of energy, which we so far have not discussed, and
we turn to this question, first deriving the mathematical formula-
tion of kinetic energy for one-dimensional problems.
27. Mechanical Energy. — Let us see where the concept of
energy comes from, and how we can use it. We start with a
particle of mass $m$, acted on by a force $F$. Then Newton's
second law is $m\,d^2x/dt^2 = F$. Now let us multiply each side by
$dx/dt$, and integrate with respect to $t$, from time $t_0$ up to $t$:
$$m\int_{t_0}^{t}\frac{dx}{dt}\frac{d^2x}{dt^2}\,dt = \int_{t_0}^{t}F\frac{dx}{dt}\,dt.$$
Both these integrals can be transformed. First, we note that
$$\frac{d}{dt}\left(\frac{dx}{dt}\right)^2 = 2\frac{dx}{dt}\frac{d^2x}{dt^2}.$$
Thus the left side is
$$\frac{m}{2}\int_{t_0}^{t}\frac{d}{dt}\left(\frac{dx}{dt}\right)^2 dt = \frac{m}{2}\left(\frac{dx}{dt}\right)^2\Bigg|_{t_0}^{t},$$
or letting $dx/dt$ be denoted by $v$, and its value at $t = t_0$ by $v_0$, this
side is $mv^2/2 - mv_0^2/2$. On the right, $\int F\,(dx/dt)\,dt = \int F\,dx$, where
now the integral is from $x_0$ to $x$, if $x_0$ is the value of $x$ at $t = t_0$,
$x$ at $t$. Then the equation is
$$\frac{mv^2}{2} - \frac{mv_0^2}{2} = \int_{x_0}^{x}F\,dx. \qquad (1)$$
The quantity $mv^2/2$ is called the kinetic energy, $\int F\,dx$ is the work
done, and our equation says that the work done by the force on
the particle between two instants of time equals the increase in
kinetic energy during the time. This is the fundamental propo-
sition relating to energy, and our proof is the standard one.
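The proposition holds whatever the nature of the force, and a numerical check makes this plain. The Python sketch below integrates Newton's second law by a fourth-order Runge-Kutta step for a hypothetical force depending on position, velocity, and time, accumulating the work as the time integral of F dx/dt along the way, and then compares the work with the gain in kinetic energy. The force law, the mass, and the initial values are assumptions made only for illustration.

```python
import math

m = 2.0  # hypothetical mass

def force(x, v, t):
    """Hypothetical force: restoring, viscous, and sinusoidal external parts."""
    return -3.0 * x - 0.4 * v + 1.5 * math.cos(2.0 * t)

def derivs(state, t):
    """Time derivatives of (position, velocity, work done so far)."""
    x, v, _work = state
    F = force(x, v, t)
    return (v, F / m, F * v)

def rk4_step(state, t, dt):
    """One fourth-order Runge-Kutta step for the three quantities together."""
    k1 = derivs(state, t)
    k2 = derivs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), t + 0.5 * dt)
    k3 = derivs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), t + 0.5 * dt)
    k4 = derivs(tuple(s + dt * k for s, k in zip(state, k3)), t + dt)
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

x0, v0 = 1.0, 0.5
state = (x0, v0, 0.0)
t, dt = 0.0, 0.001
for _ in range(5000):
    state = rk4_step(state, t, dt)
    t += dt

x, v, work = state
kinetic_gain = 0.5 * m * v**2 - 0.5 * m * v0**2
```

The quantities kinetic_gain and work should agree to within the small error of the numerical integration.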
Next we consider the nature of the force F. First there is
the case where it depends only on the position of the particle,
as in a gravitational field or a linear restoring force, without
friction. Then $F = F(x)$, and we may write $\int F(x)\,dx = -V(x)$,
so that $mv^2/2 + V(x) = mv_0^2/2 + V(x_0)$. The quantity
V(x) is called the potential energy, and the sum of it and the kinetic
energy is the total energy; our equation states that the total
energy remains constant during the motion. The lower limit
x of integration may be chosen in an arbitrary way, or an arbi-
trary constant of integration may be added to the potential
energy, without changing the results, which depend only on
potential differences. The potential energy is related to the
force either by the equation above, or by its derivative, F =
-dV/dx.
In case the force depends on the velocity as well as the position,
the situation is quite different. Then the value of F cannot be
predicted when x is known, so that we cannot even evaluate the
work done without knowing more details about the system. In
such a case it is plainly impossible to set up a potential energy
function independent of time, or to speak of the total energy
being conserved. Such a system is called nonconservative, in
contrast to a conservative system in which the energy stays
constant. Even in a nonconservative system it is often possible
to write a potential function connected with part of the force.
Thus with a damped oscillator, we can write a potential function
for the restoring force, but not for the viscous resistance. In
such a case we shall still speak of the sum of the kinetic and
potential energy as being the total energy, but we can no longer
say that it remains constant. Rather we should say that the
time rate of change of the energy was equal to the rate of working
of outside forces, both of viscosity and of any external impressed
forces, on the system. Let us see what this means mathe-
matically. Let F = -dV/dx + G, where V is the potential
function for that part of the force derivable from a potential,
and $G$ is the remaining force. Then the energy is $mv^2/2 + V$.
The time rate of change of the energy is
$$\frac{d}{dt}\left(\frac{mv^2}{2}\right) + \frac{dV}{dt} = \left(m\frac{dv}{dt} + \frac{dV}{dx}\right)v,$$
using $dV/dt = (dV/dx)(dx/dt)$. But $m\,dv/dt + dV/dx = G$, by Newton's
second law, so that the time derivative of energy reduces to $Gv$,
or the external force times the velocity, as we should expect.
One should not be disturbed to find systems whose total energy
does not stay constant. At first sight they seem to contradict
the general law of conservation of energy, but on closer examina-
tion we always find that they are parts of a larger system whose
energy really is conserved. Thus if we consider not merely the
damped vibrating particle but also the viscous fluid doing the
damping, we shall find that the latter gains the energy lost by
the former, transforming it into heat, itself a form of energy.
It is in fact a general situation that there are two ways of treating
a mechanical problem: first, by considering the whole system,
and treating it as a conservative system; second, treating only
part of the system, and taking the forces exerted by the rest on
this part as being impressed or dissipative forces, which cannot
be derived from a potential.
28. Use of the Potential for Discussing the Motion of a System.
The one-dimensional motion of a particle in a conservative field
can be discussed with great ease by the use of the potential
function. Suppose we know V as a function of x, and suppose
that we inquire about the motion of a particle of total energy E
in this potential field. Then we have $mv^2/2 = E - V$,
$v = \sqrt{2(E - V)/m}$. Since this is a known function of $x$, we can
find the speed at every point. In the first place, we can use this
to get an explicit solution of the problem. For writing v = dx/dt,
and integrating, we have
$$t = \int\frac{dx}{\sqrt{2(E - V)/m}}, \qquad (2)$$
giving a relation between $t$ and $x$, involving two arbitrary constants
(energy, and the constant of integration, determining the
origin of time, or the phase). Thus for instance for a particle
moving under gravity, where $V = mgx$, we have
$$t = \int\frac{dx}{\sqrt{2E/m - 2gx}}.$$
Letting $z = 2E/m - 2gx$, so that $dz = -2g\,dx$, this is
$$t = -\frac{1}{2g}\int\frac{dz}{\sqrt{z}} = -\frac{\sqrt{z}}{g} + t_0,$$
where evidently $t_0$ is the value of $t$ at which $2E/m - 2gx = 0$,
or $x = E/mg$, which, as we readily see, is the highest point of
the path, at which the body commences to fall. If we let this
value of $x$ be $x_0$, then, squaring, we have $x - x_0 = -\tfrac{1}{2}g(t - t_0)^2$,
the familiar solution. Many one-dimensional problems can be
solved by this method, as for instance the pendulum with large
amplitude, which leads to an elliptic integral. On the other
hand, there are, of course, many cases where the integration is
too difficult to carry out.
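Where the integration cannot be carried out in closed form, it can still be done numerically. As a hedged sketch, the Python below evaluates the energy integral by the midpoint rule for the gravitational case V = mgx, where the answer is known, and compares the result with the closed form obtained from the substitution z = 2E/m - 2gx. The numerical values, the step count, and the function names are hypothetical; the interval is kept below the top of the path, where the integrand becomes infinite.

```python
import math

# Hypothetical values: mass, acceleration of gravity, total energy.
m, g, E = 1.0, 9.8, 49.0
x_top = E / (m * g)   # highest point of the path, where the speed vanishes

def speed(x):
    """Speed at height x from the energy equation, valid below x_top."""
    return math.sqrt(2.0 * E / m - 2.0 * g * x)

def travel_time(x_a, x_b, steps=20000):
    """Midpoint-rule value of the integral of dx / v from x_a up to x_b."""
    dx = (x_b - x_a) / steps
    return sum(dx / speed(x_a + (i + 0.5) * dx) for i in range(steps))

def analytic_time(x_a, x_b):
    """Closed form of the same integral for motion under gravity."""
    return (speed(x_a) - speed(x_b)) / g

numeric = travel_time(0.0, 4.0)
exact = analytic_time(0.0, 4.0)
```

The same travel_time routine would serve for any potential once speed is changed accordingly, which is the point of the method.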
Even if the solutions cannot be obtained exactly, however,
we can still use the method of the energy to get general information
about the problem. Let us imagine $V$ plotted as a function
of $x$ (see Fig. 6). Then we draw on the same graph a horizontal
line at height E. The square root of the difference between the
two curves is then proportional to the velocity of the particle
at that point. Thus the velocity is only real where this difference
is positive, and is imaginary elsewhere. If the velocity is only
real in certain regions of x, this means that the motion can only
occur within those regions. As the particle approaches the edge
of such a region, the speed gets smaller and smaller, and finally
at the edge the particle stops. Then it reverses and travels
away again. The possibility of going either toward or away
from the boundary comes from the two signs of the square
root: the velocity at a given point of space is always the same in
magnitude, but can be in either direction. If now the region
where the kinetic energy is positive is bounded at both ends,
then after reversing its motion at one edge, the particle will travel
to the other, reverse, come back, and repeat the process indefinitely.
Since at a given point the particle always travels with
the same speed, it will always require the same time to traverse
its path, and the motion will be periodic. Thus, if the total
energy is $E_1$ (Fig. 6), the motion is periodic, confined between
$c$ and $e$. If it is $E_2$, either of two periodic motions is possible,
between $b$ and $f$, or between $h$ and $j$. This is a general result
for a conservative motion in one dimension which does not
extend to infinity.
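The period of such a bounded motion is twice the time taken to pass from one turning point to the other. The Python sketch below is one way of computing it, under assumptions of our own: the substitution x = center + half_width sin(theta) is used to remove the infinity of the integrand at the turning points, and the result is tried on a linear restoring force, for which the period is known to be 2 pi sqrt(m/k). The potential, the constants, and the function names are hypothetical.

```python
import math

def half_period(V, E, m, x_left, x_right, steps=2000):
    """Time to travel between the turning points x_left and x_right.

    The substitution x = center + half_width * sin(theta) keeps the
    integrand finite at simple turning points; the midpoint rule is used.
    """
    center = 0.5 * (x_left + x_right)
    half_width = 0.5 * (x_right - x_left)
    d_theta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -0.5 * math.pi + (i + 0.5) * d_theta
        x = center + half_width * math.sin(theta)
        speed = math.sqrt(2.0 * (E - V(x)) / m)
        total += half_width * math.cos(theta) / speed * d_theta
    return total

# Hypothetical test case: linear restoring force with V = k x^2 / 2.
m, k = 2.0, 8.0

def V(x):
    return 0.5 * k * x**2

E = 1.0
a = math.sqrt(2.0 * E / k)          # turning points at x = -a and x = +a
period = 2.0 * half_period(V, E, m, -a, a)
expected = 2.0 * math.pi * math.sqrt(m / k)
```

For other potentials the turning points would first have to be found as the roots of E - V(x) = 0 on either side of the minimum.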
Fig. 6. Potential energy $V$ as function of coordinate $x$.
Total energy $E_1$, periodic motion between c and e.
Total energy $E_2$, periodic motion between b and f, or h and j.
Total energy $E_3$, nonperiodic motion between a and infinity, reversing at a.
Total energy $E_4$, nonperiodic nonreversing motion.
If, on the other hand, the kinetic energy remains positive in
one direction all the way to infinity, but becomes negative at a
finite point in the other direction, the particle will come in from
infinity, having taken since negatively infinite time to do it, will
reverse, and will return to infinity. This is the case with energy
$E_3$, the particle coming in from the right, speeding up in the
region about i, slowing down about g, speeding up about d,
finally coming to rest at a, and reversing, traveling back to the
right. An example of the first, periodic case is a particle vibrat-
ing in simple harmonic motion, and of the second nonperiodic
case is a ball coming from infinity, hitting a wall, and being
bounced back again, or a ball thrown up in the air and coming
down again. Finally there is the possibility of a potential such
that the kinetic energy is positive at all values of x. Then the
particle persists in one direction forever, like a free particle, but
generally travels with a variable velocity. Such a case is found
for energy $E_4$, where the particle starts from infinite distance in
one direction, travels toward the center, speeds up and slows
down corresponding to the maxima and minima, and finally goes
to infinity in the other direction. It is to be noted that motions
in the same potential field, but with different total energy, can
have quite different characteristics under this classification.
Thus oscillatory motions are always possible around minima in
the potential energy, for small enough total energy. But it
may be that for too high total energy the particle will be able to
get entirely away from the neighborhood of the minimum, and
will go to infinity. In Fig. 6, there are three points, d, g, and i,
at which the force is zero, and the particle would stay at rest
forever, if it were placed at one of these points. Of these, g
is a position of unstable equilibrium, and a small impact would
start the particle oscillating about either d or i. On the other
hand, d and i are both points of stable equilibrium, so that a
particle at rest at either of these points would suffer only small
oscillations about that point if given a small impact.
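The distinction between stable and unstable equilibrium can be put in computational form: the force vanishes at each equilibrium point, and the equilibrium is stable where the potential curves upward. The Python sketch below illustrates this for a hypothetical potential V = x^4 - 2x^2, which, like the curve of Fig. 6, has two minima separated by a maximum. The potential, the brackets used in the root search, and the function names are assumptions made for illustration only.

```python
def V(x):
    """Hypothetical potential with minima at x = -1, +1 and a maximum at x = 0."""
    return x**4 - 2.0 * x**2

def force(x, h=1e-5):
    """F = -dV/dx, estimated by a central difference."""
    return -(V(x + h) - V(x - h)) / (2.0 * h)

def curvature(x, h=1e-4):
    """Second derivative of V, estimated by a central difference."""
    return (V(x + h) - 2.0 * V(x) + V(x - h)) / h**2

def find_equilibrium(x_lo, x_hi, tol=1e-10):
    """Bisection for a zero of the force between x_lo and x_hi."""
    f_lo = force(x_lo)
    while x_hi - x_lo > tol:
        mid = 0.5 * (x_lo + x_hi)
        f_mid = force(mid)
        if f_lo * f_mid <= 0.0:
            x_hi = mid
        else:
            x_lo, f_lo = mid, f_mid
    return 0.5 * (x_lo + x_hi)

def classify(x):
    """Stable if the potential has a minimum at x, unstable otherwise."""
    return "stable" if curvature(x) > 0.0 else "unstable"

# Brackets chosen by inspection of the hypothetical potential.
points = [find_equilibrium(a, b) for a, b in ((-1.5, -0.5), (-0.5, 0.5), (0.5, 1.5))]
kinds = [classify(x) for x in points]
```

Here the outer two points play the part of d and i, and the middle one the part of g.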
29. The Rolling-ball Analogy. — A simple model which shows
the properties of one-dimensional motion can be set up as
follows. We imagine a track, like a roller coaster, set up, shaped
just like the potential curve. Then we start a ball rolling on
this track, starting from rest at a given height. Its motion will
then approximate that of a particle in the corresponding potential
field. The reason is that, since gravitational potential is proportional
to height, the ball actually has the potential at any
point which it should, and correspondingly the correct speed.
The only approximations made, other than friction, consist in
neglecting the fact that part of the kinetic energy actually goes
into up and down motion, and part into rotation, instead of all
into horizontal motion. From such a model, we can see how
motion may be oscillatory, if the track rises on the far side of a
dip up to the height where the ball started, or how it can go to
infinity if the track continues permanently at a lower level.
We can also see the general character of the solution in the case
where there is damping, just by imagining that the ball is subject
to friction. Obviously the motion still will have the character
of the undamped motion, but corresponding to a continually
decreasing energy. Thus with an oscillatory motion the ampli-
tude will constantly decrease until it stops, while with a motion
which originally was not oscillatory it may be possible that it
become trapped in a minimum of potential, settle down to oscillate,
and eventually come to rest. In any case, if the damping
continues, the motion will eventually stop, at a minimum of
potential.
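The behavior with damping can be followed numerically. In the Python sketch below, a particle moves in a hypothetical potential V = x^4 - 2x^2 under an added viscous force, starting from rest with enough energy to pass over the central maximum; we integrate by a fourth-order Runge-Kutta step and record the energy at each instant. The potential, the damping constant, the step size, and all names are assumptions chosen for illustration.

```python
def V(x):
    """Hypothetical potential with minima at x = -1 and x = +1."""
    return x**4 - 2.0 * x**2

def force(x):
    """F = -dV/dx for the potential above."""
    return -(4.0 * x**3 - 4.0 * x)

m, gamma = 1.0, 0.5   # hypothetical mass and viscous damping constant

def derivs(x, v):
    """Time derivatives of position and velocity, with viscous resistance."""
    return v, (force(x) - gamma * v) / m

def rk4_step(x, v, dt):
    """One fourth-order Runge-Kutta step."""
    k1x, k1v = derivs(x, v)
    k2x, k2v = derivs(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
    k3x, k3v = derivs(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
    k4x, k4v = derivs(x + dt * k3x, v + dt * k3v)
    return (x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            v + dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0)

x, v, dt = 1.8, 0.0, 0.001
energies = [0.5 * m * v**2 + V(x)]
for _ in range(60000):
    x, v = rk4_step(x, v, dt)
    energies.append(0.5 * m * v**2 + V(x))
```

At the end of the run the particle should be practically at rest at one of the two minima, and the recorded energy should never have increased.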
30. Motion in Several Dimensions. — So far, we have treated
only the motion of a particle in one dimension. If it can move
about in two- or three-dimensional space, however, the problem
becomes much more difficult. Suppose the coordinates of a
particle are x, y, z, so that its motion is described by finding x, y,
and z as functions of time. Force and acceleration are vectors, and
our first task is to investigate vector analysis enough to deal
with these quantities. We shall find that in two and three
dimensions it is by no means true that all force fields, in which
the force is a function of the position alone, can be derived from
a potential function. The next chapters, then, will deal with
vectors, force fields, and potentials. When we come to the
equations of motion, we find separate equations for each component:
if $F_x$, $F_y$, $F_z$ represent the components of force along
the axes, we have
$$m\frac{d^2x}{dt^2} = F_x, \qquad m\frac{d^2y}{dt^2} = F_y, \qquad m\frac{d^2z}{dt^2} = F_z, \qquad (3)$$
a set of simultaneous differential equations. Such equations
can be solved in a few simple cases. For instance, if $F_x$ depends
only on $x$, $F_y$ only on $y$, $F_z$ only on $z$, they are simply three
independent equations, which we can handle by the methods
already used. This is called the method of separation of vari-
ables, and much of our effort will be directed toward this method
of solution. We shall carry out methods of changing to arbitrary
coordinate systems, with a view to separating variables. For
instance, in motion under a force acting toward a center of attrac-
tion, we introduce polar coordinates, and in these the equation
for r is separated from those for the angles, so that we can solve.
The process of changing coordinate systems leads us to Lagrange's
equations, the equations of motion in generalized coordinates.
Finally, the method of energy, the rolling-ball analogy, and the
other methods of the present chapter, can be used in several
dimensions, and provide the best means for a qualitative discus-
sion of a problem.
Problems
1. Take the sinusoidal solution for the displacement of a harmonic oscillator,
find the velocity from it, compute kinetic and potential energy as functions
of time, and add to show that the sum remains constant. Show that
the energy is proportional to the square of the amplitude.
2. Proceed as in Prob. 1, but for the damped oscillator, finding the sum of
kinetic and potential energies, showing that it decreases with time. Com-
pute the time rate of change of the energy, find the rate of working of the
frictional force, and show by direct comparison that they are equal.
3. Let a particle move in a field whose potential is $-1/x + 1/x^2$. Show
by graphical methods that for small total energy the motion is oscillatory,
but that for larger energy it is nonperiodic and extends to infinity. Find
the energy which forms the dividing line between these two cases. Compute
the limiting frequency of the oscillatory motion as the amplitude gets
smaller and smaller (using the results of Prob. 1, Chap. I), and describe
qualitatively how the frequency changes when the amplitude increases.
4. Solve directly the problem of the motion of a particle moving in a
field of potential $-1/x + 1/x^2$, using the energy integral. Show that the
mathematical solution has the physical properties found in Prob. 3.
5. Using the solution of Prob. 4, find the period of oscillation of the oscillatory
solutions in the potential $-1/x + 1/x^2$, as functions of the energy.
To do this, note that the two ends of the path are the values of $x$ for which
$\sqrt{2(E - V)/m} = 0$. Thus the integral $\int_{x_0}^{x_1} dx/\sqrt{2(E - V)/m}$, from one of
these points to the other will give just the half period. Show that the
period approaches the value found in Prob. 3 for small oscillations.
6. In an electric circuit, show that one can set up a magnetic energy
$\tfrac{1}{2}Li^2$ analogous to a kinetic energy, and an electric energy $\tfrac{1}{2}q^2/C$ analogous
to a potential energy. Show that the rate of change of this total energy
equals the rate of working of the resistance and the applied electromotive
force.
7. An atom acts like a particle held to a position of equilibrium by a
definite restoring force and a viscous resistance. An external light wave
exerts a sinusoidal force, the atom executing a forced vibration under the
influence of the wave. Show that the atom continually absorbs energy from
the wave, the energy going into the viscous resistance. Show that the rate
of absorption is proportional to the component of amplitude out of phase
with the force, which we have already connected with the absorption
coefficient.
8. Solve the problem of the undamped oscillator, by using the equation
$t = \int dx/\sqrt{2(E - V)/m}$.
9. Discuss the problem of the pendulum with arbitrary amplitude by the
graphical method. Show that for low energies the motion is oscillatory,
but for high energies it is a continuous rotation. Sketch the qualitative
form of curves for angular displacement as a function of time, for several
energies, in both the oscillatory and rotatory ranges.
10. Set up the problem of the pendulum by the method of Prob. 8, and
show that t as a function of the angle is given by an elliptic integral. (Hint:
Use the information about elliptic integrals given in Peirce's table; note
that $1 - \cos\theta = 2\sin^2\tfrac{1}{2}\theta$.)
CHAPTER VI
VECTOR FORCES AND POTENTIALS
In our one-dimensional problems, we have had no occasion
to mention vectors; however, before we can treat the detailed
theory of motion in two or three dimensions, we must discuss
them, and their relation to such things as potential energy.
31. Vectors and Their Components. — The force, in two- or
three-dimensional motion, is a vector, and we must make a
study of the mathematical relations of vectors. In the first
place, a vector is often denoted by its components along three
axes at right angles, as $F_x$, $F_y$, $F_z$. Vectors, in the second place,
obey the following law of addition: if two vectors F and G have
components $F_x$, $F_y$, $F_z$ and $G_x$, $G_y$, $G_z$, respectively, the components
of the sum F + G are $(F_x + G_x)$, $(F_y + G_y)$, $(F_z + G_z)$. A
graphical discussion shows that this is equivalent to the familiar
parallelogram law of addition (as in Fig. 2, where the same
proposition was shown for complex numbers, regarded as vectors).
Third, if we multiply a vector by a constant, as $C$, each component
is multiplied by this constant. Thus the components of
$C$F are $CF_x$, $CF_y$, $CF_z$. Often a constant like $C$ is called a
"scalar," to distinguish it from a vector. A scalar is a quantity
which has magnitude but not direction, a vector having both
magnitude and direction.
It is often useful to write vectors in terms of three so-called
unit vectors, i, j, k. Here, i is a vector of unit length, pointing
along the x axis, and similarly j has unit length and points along
the y axis, and k along the z axis. Now we can build up a vector
F out of them, by forming the quantity $\mathbf{i}F_x + \mathbf{j}F_y + \mathbf{k}F_z$. This
is the sum of three vectors, one along each of the three axes, and
the first, which is just the component of the whole vector along
the x axis, is $F_x$, and the other components likewise are $F_y$ and
$F_z$. Thus the final vector has the components $F_x$, $F_y$, $F_z$, and
is just the vector F.
By the magnitude of a vector we mean its length. By the
three-dimensional analogy to the Pythagorean theorem, by which
the square on the diagonal of a rectangular prism is the sum
of the squares on the three sides, the magnitude of a vector F
equals $\sqrt{F_x^2 + F_y^2 + F_z^2}$. We often speak of unit vectors,
i.e., vectors whose magnitude is 1.
The component of a vector in a given direction is simply the
projection of the vector along a line in that direction. It evi-
dently equals the magnitude of the vector, times the cosine of
the angle between the direction of the vector and the desired
direction. As a special example, the component of a vector F
along the x axis is F x , and this must equal the magnitude of F,
times the cosine of the angle between F and x. If this angle is
called (F, x), then we must have
$$\cos(F, x) = \frac{F_x}{\sqrt{F_x^2 + F_y^2 + F_z^2}},$$
with similar formulas for y and z components. The three cosines
of the angles between a given direction, as the direction of the
vector F, and the three axes, are called direction cosines, and are
often denoted by letters $l$, $m$, $n$, so that in this case we have
$l = \cos(F, x)$, etc. It follows immediately that $l^2 + m^2 + n^2 = 1$.
We can make a simple interpretation of the direction cosines of
any direction: they are the components of a unit vector in the
desired direction, along the three coordinate axes.
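A small numerical example may make the direction cosines concrete. The Python sketch below computes the magnitude of a vector from its three components and then its direction cosines, which are the components of the unit vector along it. The particular vector and the function names are hypothetical.

```python
import math

def magnitude(F):
    """Length of a vector given by its components (F_x, F_y, F_z)."""
    return math.sqrt(sum(component * component for component in F))

def direction_cosines(F):
    """Cosines of the angles between F and the three axes."""
    length = magnitude(F)
    return tuple(component / length for component in F)

# Hypothetical vector, chosen so that its magnitude is a whole number.
F = (2.0, -3.0, 6.0)
length = magnitude(F)
l, m, n = direction_cosines(F)
sum_of_squares = l**2 + m**2 + n**2
```

For this vector the magnitude is 7, and the sum of the squares of the direction cosines comes out equal to 1 to within rounding.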
32. Scalar Product of Two Vectors. — Multiplication of two
vectors is a rather special process, and there are two entirely
independent products, called the "scalar product" and the
"vector product." We shall first consider the scalar product.
The scalar product of two vectors F and G is denoted by F • G,
and by definition it is a scalar, equal to either (1) the magnitude
of F times the magnitude of G times the cosine of the angle
between; or (2) the magnitude of F times the projection of G on
F; or (3) the magnitude of G times the projection of F on G.
From the last section we see that these definitions are equivalent.
It is often useful to have the scalar product of two vectors in
terms of the components along x, y, and z. We find this by
writing in terms of i, j, and k. Thus we have
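$$\begin{aligned}
\mathbf{F}\cdot\mathbf{G} &= (\mathbf{i}F_x + \mathbf{j}F_y + \mathbf{k}F_z)\cdot(\mathbf{i}G_x + \mathbf{j}G_y + \mathbf{k}G_z)\\
&= (\mathbf{i}\cdot\mathbf{i})F_xG_x + (\mathbf{i}\cdot\mathbf{j})F_xG_y + (\mathbf{i}\cdot\mathbf{k})F_xG_z\\
&\quad + (\mathbf{j}\cdot\mathbf{i})F_yG_x + (\mathbf{j}\cdot\mathbf{j})F_yG_y + (\mathbf{j}\cdot\mathbf{k})F_yG_z\\
&\quad + (\mathbf{k}\cdot\mathbf{i})F_zG_x + (\mathbf{k}\cdot\mathbf{j})F_zG_y + (\mathbf{k}\cdot\mathbf{k})F_zG_z.
\end{aligned}$$
But now by the fundamental definition,
$$\mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = 1,$$
$$\mathbf{i}\cdot\mathbf{j} = \mathbf{j}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{i} = \mathbf{i}\cdot\mathbf{k} = 0.$$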
Thus
$$\mathbf{F}\cdot\mathbf{G} = F_xG_x + F_yG_y + F_zG_z. \tag{1}$$
The scalar product has many uses, principally in cases where
we are interested in the projections of vectors. For example,
the scalar product of a vector with a unit vector in a given direc-
tion equals the projection of the vector in the desired direction.
The scalar product of a vector with itself equals the square of its
magnitude, and is often denoted by $F^2$. The scalar product of
two unit vectors gives the cosine of the angle between the
directions of the two vectors. To prove that two vectors are
at right angles, we need merely prove that their scalar product
vanishes.
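The uses of the scalar product just listed can be tried numerically. The sketch below is only an illustration with vectors chosen for convenience; `dot` and `angle_between` are hypothetical helper names, and Eq. (1) is used for the components.

```python
import math

def dot(F, G):
    """Scalar product from components, Eq. (1)."""
    return sum(f * g for f, g in zip(F, G))

def angle_between(F, G):
    """Angle in radians, from F . G = |F| |G| cos(angle)."""
    return math.acos(dot(F, G) / math.sqrt(dot(F, F) * dot(G, G)))

F = (1.0, 0.0, 0.0)
G = (1.0, 1.0, 0.0)
H = (0.0, 0.0, 5.0)

print(math.degrees(angle_between(F, G)))   # close to 45
print(dot(F, H))                           # 0.0, so F and H are at right angles
print(dot(G, G))                           # 2.0, the square of the magnitude of G
```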
33. Vector Product of Two Vectors.
The vector product of two vectors F and G is denoted by (F × G),
and by definition it is a vector, at right angles to the plane of
the two vectors, equal in magnitude to either (1) the magnitude
of F times the magnitude of G times the sine of the angle between
them; or (2) the magnitude of F times the projection of G on the
plane normal to F; or (3) the magnitude of G times the projection
of F on the plane normal to G. We must further specify the sense
of the vector, whether it points up or down from the plane. This
is shown in Fig. 7, where we see that F, G, and F × G have the
same relations as the coordinates x, y, z in a right-handed system
of coordinates. Another way to describe the rule in words is
that, if one rotates F into G, the rotation is such that a right-
handed screw turning in that direction would be driven along
the direction of the vector product. From this rule, we note one
interesting fact: if we interchange the order of the factors, we
reverse the vector. Thus (F × G) = −(G × F).
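Fig. 7. Direction of the vector product.

We can compute the vector product in terms of the components,
much as we did with the scalar product. Thus we have
$$\begin{aligned}
\mathbf{F}\times\mathbf{G} &= (\mathbf{i}F_x + \mathbf{j}F_y + \mathbf{k}F_z)\times(\mathbf{i}G_x + \mathbf{j}G_y + \mathbf{k}G_z)\\
&= (\mathbf{i}\times\mathbf{i})F_xG_x + (\mathbf{i}\times\mathbf{j})F_xG_y + (\mathbf{i}\times\mathbf{k})F_xG_z\\
&\quad + (\mathbf{j}\times\mathbf{i})F_yG_x + (\mathbf{j}\times\mathbf{j})F_yG_y + (\mathbf{j}\times\mathbf{k})F_yG_z\\
&\quad + (\mathbf{k}\times\mathbf{i})F_zG_x + (\mathbf{k}\times\mathbf{j})F_zG_y + (\mathbf{k}\times\mathbf{k})F_zG_z.
\end{aligned}$$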
But now, as we readily see from the definition,
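$$\mathbf{i}\times\mathbf{i} = \mathbf{j}\times\mathbf{j} = \mathbf{k}\times\mathbf{k} = 0,$$
(as, in fact, the vector product of any vector with itself is zero);
and
$$\mathbf{i}\times\mathbf{j} = -(\mathbf{j}\times\mathbf{i}) = \mathbf{k}, \qquad \mathbf{j}\times\mathbf{k} = -(\mathbf{k}\times\mathbf{j}) = \mathbf{i}, \qquad \mathbf{k}\times\mathbf{i} = -(\mathbf{i}\times\mathbf{k}) = \mathbf{j}.$$
Hence, rearranging terms, we have
$$\mathbf{F}\times\mathbf{G} = \mathbf{i}(F_yG_z - F_zG_y) + \mathbf{j}(F_zG_x - F_xG_z) + \mathbf{k}(F_xG_y - F_yG_x). \tag{2}$$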
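Equation (2) translates directly into a few lines of Python. The sketch below, with arbitrarily chosen integer vectors and our own function names, checks the unit-vector rule i × j = k, the reversal of sign on interchanging the factors, and the fact that the product is at right angles to both factors.

```python
def cross(F, G):
    """Vector product from components, Eq. (2)."""
    Fx, Fy, Fz = F
    Gx, Gy, Gz = G
    return (Fy * Gz - Fz * Gy,
            Fz * Gx - Fx * Gz,
            Fx * Gy - Fy * Gx)

def dot(F, G):
    """Scalar product from components, Eq. (1)."""
    return sum(f * g for f, g in zip(F, G))

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
F, G = (1, 2, 3), (4, 5, 6)          # arbitrary illustrative vectors
FxG = cross(F, G)

print(cross(i, j))                   # (0, 0, 1), that is, k
print(FxG)                           # (-3, 6, -3)
print(cross(G, F))                   # (3, -6, 3), the reversed vector
print(dot(FxG, F), dot(FxG, G))      # 0 0
```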
As an example of the use of the vector product, we may mention
the angular momentum vector. If we have, as in Chap. V,
a particle of mass m, velocity v (a vector), and we wish its
angular momentum about a certain center, we must take m times
the magnitude of the radius vector times the projection of v at
right angles to the radius. But this is just m times the magni-
tude of the vector product of r and v. Further, this vector
product is a vector pointing along the axis of rotation, and in a
positive direction if the rotation is positive, or counterclockwise,
so that it is just in the direction conventionally assigned to the
angular momentum. Hence we have angular momentum
= m(r × v).
Another example of the use of the vector product comes fre-
quently, when we may wish to prove two vectors to be parallel.
To do this, we need only show that their vector product vanishes.
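Both uses of the vector product can be sketched in Python. The numbers below describe a hypothetical particle moving counterclockwise in the xy plane; the helper names `angular_momentum` and `are_parallel` and the tolerance are assumptions made for this example only.

```python
def cross(F, G):
    """Vector product from components, Eq. (2)."""
    Fx, Fy, Fz = F
    Gx, Gy, Gz = G
    return (Fy * Gz - Fz * Gy, Fz * Gx - Fx * Gz, Fx * Gy - Fy * Gx)

def angular_momentum(m, r, v):
    """Angular momentum m (r x v) about the origin."""
    return tuple(m * c for c in cross(r, v))

def are_parallel(F, G, tol=1e-12):
    """Two vectors are parallel if their vector product vanishes."""
    return all(abs(c) <= tol for c in cross(F, G))

# Particle of mass 0.5 on the x axis, moving in the +y direction
# (counterclockwise about the origin): the result points along +z.
print(angular_momentum(0.5, (2.0, 0.0, 0.0), (0.0, 3.0, 0.0)))   # (0.0, 0.0, 3.0)
print(are_parallel((1.0, 2.0, 3.0), (2.0, 4.0, 6.0)))            # True
print(are_parallel((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))            # False
```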
34. Vector Fields. — Very often in physics one has vectors
which are functions of position. There are two particularly
common examples, a force field, and a velocity, or flux density,
in a flowing fluid. In an electric or magnetic or gravitational
field, for instance, the force on unit charge or pole or mass at
any point of space is a vector, of components $F_x$, $F_y$, $F_z$, varying
from point to point in both direction and magnitude. Often
such a vector field is indicated graphically by introducing lines
tangent at every point to the vector at that point, called lines of
force or lines of flow, as the case may be. We shall discuss the
nature of vector fields more in detail in connection with hydro-
dynamics and the flow of fluids, in Chap. XVII. Our present
application is to force fields, and our main interest is to discover
in what cases the force vector can be derived from a potential
function. To investigate this, let us consider the energy theorem
in three dimensions, deriving the work done in an arbitrary
displacement.
35. The Energy Theorem in Three Dimensions. — Let us start
with the equations of motion of a particle in a force field,
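$$m\frac{d^2x}{dt^2} = F_x, \qquad m\frac{d^2y}{dt^2} = F_y, \qquad m\frac{d^2z}{dt^2} = F_z. \tag{3}$$
Multiplying by dx/dt, dy/dt, dz/dt, respectively, and integrating
with respect to time, we have as in the last chapter
$$\begin{aligned}
\tfrac{1}{2}mv_x^2 - \tfrac{1}{2}mv_{x0}^2 &= \int F_x\,dx,\\
\tfrac{1}{2}mv_y^2 - \tfrac{1}{2}mv_{y0}^2 &= \int F_y\,dy,\\
\tfrac{1}{2}mv_z^2 - \tfrac{1}{2}mv_{z0}^2 &= \int F_z\,dz.
\end{aligned}$$
Adding,
$$\tfrac{1}{2}m(v_x^2 + v_y^2 + v_z^2) - \tfrac{1}{2}m(v_{x0}^2 + v_{y0}^2 + v_{z0}^2) = \int (F_x\,dx + F_y\,dy + F_z\,dz). \tag{4}$$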
Now $v_x^2 + v_y^2 + v_z^2$ is the square of the magnitude of the velocity,
or is $v^2$. Thus the left side of Eq. (4) is the final kinetic energy of
the particle minus the initial kinetic energy, so that the integral
on the right should be the work done. The integrand is evidently
a scalar product: the product of the vector F, and the infinitesimal
displacement vector of components dx, dy, dz, which we may
call ds. The scalar product, which is F • ds, is the displacement
times the projection of the force in the direction of motion. This
is what is ordinarily called the work done, since only the compo-
nent of force along the motion does work. The integral is simply
the sum of all the infinitesimal amounts of work done, or is the
total work done, as in one-dimensional motion.
36. Line Integrals and Potential Energy. — The integral
$\int \mathbf{F}\cdot d\mathbf{s}$ is called a line integral, for its evaluation demands the
knowledge of a definite path between starting point and end
point, as well as of the function F. In general this integral will
depend on the path as well as the end points. For instance,
suppose the lines of force went in circles, as in Fig. 8. Then the
work done along the path ABC is positive, since the force and
displacement are parallel; along ADC the work is negative, since
force and displacement are opposite; while along AEC it is zero,
force and displacement being at right angles. Intermediate
paths would yield any value we chose for the work done. Hence
we surely could not define a potential, for the work done between
A and C could not be set equal to the difference of potential in
any unique way. In discussing one-dimensional motion, we
saw that a potential depending only on position could not be
introduced if the force depended on velocity, time, or anything
except displacement. Here the condition is more stringent: we
cannot have a potential, even if force depends only on position,
unless the integral $\int \mathbf{F}\cdot d\mathbf{s}$ is independent of path. If this
condition is satisfied, however, we can set up a potential energy
V, such that $-\int \mathbf{F}\cdot d\mathbf{s}$, from some standard point where the
potential is zero, up to the point we are interested in, equals V.
Evidently another way of stating the criterion for existence of
a potential is that the work done in taking a particle about
any arbitrary closed path, or $\oint \mathbf{F}\cdot d\mathbf{s}$ where the integral is about
a closed curve and back to the starting point, be zero. Still a
third condition, easier to apply in actual cases, will be derived in
a later section.

Fig. 8. A nonconservative force field. The work done along ABC is
positive, and the work done between A and C is not independent of path.
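The dependence of the work on the path can be seen numerically. The sketch below assumes a particular field with circular lines of force, $F_x = -y$, $F_y = x$ (our choice; Fig. 8 does not specify one), and takes A = (1, 0), C = (−1, 0). The three hypothetical paths are the two unit semicircles and the straight line through the origin, which play the roles of ABC, ADC, and AEC.

```python
import math

def force(x, y):
    """Assumed field with circular lines of force about the origin."""
    return (-y, x)

def line_integral(path, n=10000):
    """Midpoint-rule estimate of the integral of F . ds along path(t), 0 <= t <= 1."""
    total = 0.0
    for s in range(n):
        x0, y0 = path(s / n)
        x1, y1 = path((s + 1) / n)
        Fx, Fy = force(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += Fx * (x1 - x0) + Fy * (y1 - y0)
    return total

def upper(t):      # A to C counterclockwise, along the force
    return (math.cos(math.pi * t), math.sin(math.pi * t))

def lower(t):      # A to C clockwise, against the force
    return (math.cos(math.pi * t), -math.sin(math.pi * t))

def straight(t):   # A to C through the origin, at right angles to the force
    return (1.0 - 2.0 * t, 0.0)

print(line_integral(upper))      # close to +pi
print(line_integral(lower))      # close to -pi
print(line_integral(straight))   # 0.0
```

Going out along the upper path and back along the lower one gives a closed-path integral close to 2π rather than zero, so no potential can be set up for this field.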
37. Force as Gradient of Potential. — Let us suppose that it is
possible to set up a potential function V in a given case. We
know how to write V as the negative line integral of F. Now we
ask the opposite question: Given V, how do we find F? Let us
suppose that we are at a given point of space, and that we allow
the coordinates to increase by small amounts dx, dy, dz, forming
a vector ds, while at the same time we exert a force −F to balance
the force of the field. Then first, we shall do the work −F • ds
on the system; second, the potential will increase by the amount
dV = V(x + dx, y + dy, z + dz) − V(x, y, z). These must be
equal; and writing the scalar product as $F_s|ds|$, where $|ds|$ is the
magnitude of the displacement, $F_s$ the component of F parallel
to the displacement, we have
$$dV = -\mathbf{F}\cdot d\mathbf{s} = -F_s\,|ds|, \qquad F_s = -\frac{dV}{ds}. \tag{5}$$
A derivative of the sort occurring in Eq. (5), where we take the
difference of a scalar function like V at two neighboring points,
divide by the magnitude of the displacement, and pass to the
limit, is called a directional derivative, for evidently its value
depends on the direction in which the displacement is made. We
thus have the result that the component of force in any direction
is the negative directional derivative of the potential in the
desired direction.
The x component of force is determined from the directional
derivative of V along the x direction. To find that, we allow x
to increase by dx, keeping y and z fixed; divide the difference
V(x + dx, y, z) — V(x, y, z) by dx; and pass to the limit as dx
becomes small. But this is simply the partial derivative of V
with respect to x. We see, in other words, that a partial deriva-
tive of a function is merely a special case of a directional deriva-
tive, in which the direction is along one of the coordinate axes.
Using this fact, we then have
$$F_x = -\frac{\partial V}{\partial x}, \qquad F_y = -\frac{\partial V}{\partial y}, \qquad F_z = -\frac{\partial V}{\partial z}. \tag{6}$$
The three partial derivatives in Eq. (6) are evidently the com-
ponents of a vector, called the gradient of V, and abbreviated
grad V. Thus
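$$\operatorname{grad} V = \mathbf{i}\frac{\partial V}{\partial x} + \mathbf{j}\frac{\partial V}{\partial y} + \mathbf{k}\frac{\partial V}{\partial z}, \tag{7}$$
and we may write a vector equation
$$\mathbf{F} = -\operatorname{grad} V. \tag{8}$$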
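Equations (6) and (8) can be checked with finite differences. In the sketch below the potential $V = x^2y + z$ is an arbitrary choice made for illustration, and `force_from_potential` is a hypothetical helper that replaces each partial derivative by a central difference with a small step h.

```python
def V(x, y, z):
    """Illustrative potential, chosen arbitrarily."""
    return x * x * y + z

def force_from_potential(V, x, y, z, h=1e-5):
    """F = -grad V, each partial derivative estimated by a central difference."""
    Fx = -(V(x + h, y, z) - V(x - h, y, z)) / (2 * h)
    Fy = -(V(x, y + h, z) - V(x, y - h, z)) / (2 * h)
    Fz = -(V(x, y, z + h) - V(x, y, z - h)) / (2 * h)
    return (Fx, Fy, Fz)

# Differentiating by hand gives F = (-2xy, -x^2, -1), or (-4, -1, -1) at (1, 2, 3).
print(force_from_potential(V, 1.0, 2.0, 3.0))
```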
38. Equipotential Surfaces. — Let us take a displacement ds in
a direction tangent to an equipotential surface, or surface on
which V is constant. Then no work is done, so that dV = 0.
But also F • ds = 0. If this is so, then F and ds must be at right
angles. Thus we have proved that the force, and hence the lines
of force, are at right angles to the equipotential surfaces. Any
scalar function of position can be described by a set of surfaces,
like equipotentials, on which it is constant. We see then that
the gradient of such a function is a vector, at right angles to the
equipotentials, measuring the rate of change of the function in
this direction. The name gradient comes from contour maps in
two dimensions. There the contours are lines of constant alti-
tude, and the ordinary gradient of a slope is the rate of change
of height with horizontal distance, in the direction at right angles
to the contours, or the direction of steepest slope. In our case,
the gradient points in the direction in which the function
increases, while the force, being the negative gradient of the
potential, points in the direction in which the potential decreases.
39. The Curl and the Condition for a Conservative System. —
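Let $F_x = -\partial V/\partial x$, $F_y = -\partial V/\partial y$. Differentiating the first
with respect to y, the second with respect to x, we have
$\partial F_x/\partial y = -\partial^2 V/\partial y\,\partial x$, $\partial F_y/\partial x = -\partial^2 V/\partial x\,\partial y$. But by the fundamental
theorem of partial differentiation, these two are equal, so that
$\partial F_x/\partial y = \partial F_y/\partial x$. Similarly we have two other equations.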
These can be combined in a single vector equation. We shall
find that it is useful to set up a vector called the curl, according
to the definition
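$$\operatorname{curl}\mathbf{F} = \mathbf{i}\left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right) + \mathbf{j}\left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right) + \mathbf{k}\left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right). \tag{9}$$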
Then our three equations are combined in the one vector equa-
tion curl F = 0. These form relations between the components
of force, which plainly must be fulfilled if there is a potential.
Yet it is by no means true that any set of forces will satisfy
these conditions. The vanishing of the curl at all points of
space, then, is a necessary condition which F must satisfy, if
it is derivable from a potential. It can be proved that it is
also a sufficient condition, so that it is the criterion which we
desired, telling whether a potential can be set up in a given
problem or not. As we shall see in a problem, the nonvanishing
of the curl of a vector in general means whirlpool-like lines of
force, as in Fig. 8.
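The criterion can be tried on two sample fields. The sketch below is an illustration under our own assumptions: `curl` estimates the derivatives in Eq. (9) by central differences, `conservative` is the force derived from the arbitrary potential $V = x^2y + z$, and `whirlpool` is an assumed field with circular lines of force.

```python
def curl(F, x, y, z, h=1e-5):
    """Central-difference estimate of curl F, Eq. (9). F returns (Fx, Fy, Fz)."""
    def d(comp, axis):
        # Partial derivative of component `comp` with respect to coordinate `axis`.
        p, q = [x, y, z], [x, y, z]
        p[axis] += h
        q[axis] -= h
        return (F(*p)[comp] - F(*q)[comp]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def conservative(x, y, z):
    """F = -grad V for the illustrative potential V = x^2 y + z."""
    return (-2 * x * y, -x * x, -1.0)

def whirlpool(x, y, z):
    """Assumed field whose lines of force are circles about the z axis."""
    return (-y, x, 0.0)

print(curl(conservative, 1.0, 2.0, 3.0))   # all three components close to 0
print(curl(whirlpool, 1.0, 2.0, 3.0))      # close to (0, 0, 2)
```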
40. The Symbolic Vector ∇. — We have seen two vector differential
operators, the gradient and the curl. These can both
be expressed conveniently in terms of a symbolic vector operator
∇, equal to $(\mathbf{i}\,\partial/\partial x + \mathbf{j}\,\partial/\partial y + \mathbf{k}\,\partial/\partial z)$. Of course, this operator
by itself has no meaning, but its interpretation is that it is always
to be followed by some other quantity, and the differentiations
are to be performed on this quantity. Thus if we have a scalar
V, the quantity ∇V is a vector, equal to
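$$\nabla V = \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)V = \mathbf{i}\frac{\partial V}{\partial x} + \mathbf{j}\frac{\partial V}{\partial y} + \mathbf{k}\frac{\partial V}{\partial z} = \operatorname{grad} V. \tag{10}$$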
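Similarly, if we have a vector F, the vector product (∇ × F) is
equal to
$$\nabla\times\mathbf{F} = \mathbf{i}\left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right) + \mathbf{j}\left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right) + \mathbf{k}\left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) = \operatorname{curl}\mathbf{F}. \tag{11}$$
In the course of time, we shall meet several other vector
operations, which can be expressed in terms of ∇. We shall
merely define them now, though we shall have many applications
later. If we have a vector F, the scalar product of ∇ with F, or
(∇ • F), is a scalar, evidently equal to
$$\nabla\cdot\mathbf{F} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}. \tag{12}$$
This is called the divergence of F, abbreviated div F. Again, if
we have a scalar V, and take two factors ∇ multiplied by V,
or (∇ • ∇)V, the result is
$$\left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)\cdot\left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)V = \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)V = \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2}. \tag{13}$$
This is called the Laplacian of V, and there is no usual abbreviation,
except ∇²V, which evidently is equivalent to the method
of writing above. Clearly ∇²V = div grad V. Finally we can
take the Laplacian of a vector: if F is a vector,
$$\nabla^2\mathbf{F} = \mathbf{i}\left(\frac{\partial^2 F_x}{\partial x^2} + \frac{\partial^2 F_x}{\partial y^2} + \frac{\partial^2 F_x}{\partial z^2}\right) + \mathbf{j}\left(\frac{\partial^2 F_y}{\partial x^2} + \frac{\partial^2 F_y}{\partial y^2} + \frac{\partial^2 F_y}{\partial z^2}\right) + \mathbf{k}\left(\frac{\partial^2 F_z}{\partial x^2} + \frac{\partial^2 F_z}{\partial y^2} + \frac{\partial^2 F_z}{\partial z^2}\right).$$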
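The statement ∇²V = div grad V can be illustrated numerically. The sketch below assumes the simple scalar $V = x^2 + y^2 + z^2$, whose gradient (2x, 2y, 2z) is written out by hand; the finite-difference helpers `divergence` and `laplacian` and their step sizes are our own choices.

```python
def divergence(F, x, y, z, h=1e-4):
    """Central-difference estimate of dFx/dx + dFy/dy + dFz/dz."""
    return ((F(x + h, y, z)[0] - F(x - h, y, z)[0])
            + (F(x, y + h, z)[1] - F(x, y - h, z)[1])
            + (F(x, y, z + h)[2] - F(x, y, z - h)[2])) / (2 * h)

def laplacian(V, x, y, z, h=1e-3):
    """Central-difference estimate of the sum of the three second derivatives of V."""
    return (V(x + h, y, z) + V(x - h, y, z)
            + V(x, y + h, z) + V(x, y - h, z)
            + V(x, y, z + h) + V(x, y, z - h)
            - 6.0 * V(x, y, z)) / (h * h)

def V(x, y, z):
    """Illustrative scalar function."""
    return x * x + y * y + z * z

def grad_V(x, y, z):
    """Gradient of V, worked out by hand."""
    return (2 * x, 2 * y, 2 * z)

print(laplacian(V, 1.0, 2.0, 3.0))          # close to 6
print(divergence(grad_V, 1.0, 2.0, 3.0))    # close to 6 as well
```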
Problems
1. Find the angle between the diagonal of a cube and one of the edges.
(Hint: regard the diagonal as a vector i + j + k.)
2. Given a vector i + 2j + 3k, and a second i − 2j + ak, find a so that
the two vectors are at right angles to each other.
3. Let $F_x = y$, $F_y = -x$, $F_z = 0$. Prove that this vector field represents
a force tangent to circles about the origin in the xy plane. Compute $\int \mathbf{F}\cdot d\mathbf{s}$
around such a circle.
4. Find the curl of the force in the preceding problem. Discuss the
question as to whether it is a conservative field or not.
5. In the gravitational field of a mass m, the potential is given by −m/r,
where r is the distance from the mass, given by $r^2 = x^2 + y^2 + z^2$, if the
mass is at the origin. Obtain the components of the force vector by direct
differentiation. Find the curl of the force, and show that it is zero.
6. Find which ones of the following forces are derivable from potentials,
and describe the physical nature of the force fields. Set up the potential
in cases where that can be done :
(a) $F_x = \dfrac{y}{x^2 + y^2}$, $F_y = \dfrac{-x}{x^2 + y^2}$, $F_z = 0$.
(b) $F_x = \dfrac{y}{\sqrt{x^2 + y^2}}$, $F_y = \dfrac{-x}{\sqrt{x^2 + y^2}}$, $F_z = 0$.
(c) $F_x = xf(r)$, $F_y = yf(r)$, $F_z = zf(r)$, where f(r) is an arbitrary function
of the distance from the origin.
(d) $F_x = f_1(x)$, $F_y = f_2(y)$, $F_z = f_3(z)$.
7. Prove that lx + my + nz = k, where l, m, n, k are constants, and
$l^2 + m^2 + n^2 = 1$, is the equation of a plane whose normal has the direction
cosines l, m, n, and whose shortest distance from the origin is k.
8. Taking the potential field from Prob. (5), find the line integral $\int \mathbf{F}\cdot d\mathbf{s}$
around a square of arbitrary size in the xy plane, with the origin at its
center. Show by direct calculation that the integral always vanishes. Do
the same for a path made up as follows: the part of the square of side 2a,
made of lines at x = −a, y = ±a, which lies at negative values of x, and
the part of the circle of radius a, center at the origin, which joins onto and
completes the figure for positive x's.
9. Prove that A • (B × C) = B • (C × A) = C • (A × B), where A, B, C
are any vectors. Show that these are equal to the determinant
$$\begin{vmatrix} A_x & A_y & A_z \\ B_x & B_y & B_z \\ C_x & C_y & C_z \end{vmatrix}$$
10. Prove that A × (B × C) = B(A • C) − C(A • B), where A, B, C
are any vectors.
11. Prove that div aF = a div F + (F • grad a), where a is a scalar, F a
vector.
12. Prove that curl aF = a curl F + [(grad a) × F], where a is a scalar,
F a vector.
13. Prove that div (F × G) = (G • curl F) − (F • curl G), where F, G are
vectors.
14. Prove that div curl F = 0, where F is any vector.
15. Prove that curl curl F = grad div F − ∇²F, where F is any vector.
CHAPTER VII
LAGRANGE'S EQUATIONS AND PLANETARY MOTION
In considering mechanical problems with several variables, it
is seldom very convenient to use ordinary rectangular coordinates.
In working with problems in the motion of particles, we often
wish to introduce curvilinear coordinates, as for instance polar
coordinates. With rigid dynamics, we often use rather com-
plicated quantities to give the orientation of a rigid body in
space. For instance, with a top or gyroscope, we may use
Euler's angles, namely, the latitude and longitude angles of the
axis of the top with reference to a fixed north pole, and the angle
of rotation of the top about its own axis. All these coordinates
come under the general description of generalized coordinates.
Any quantities which are capable of describing the positions
of the parts of a system, whether they be distances, angles, or
any other quantities, can serve as generalized coordinates.
Now when we begin to examine the equations of motion in
generalized coordinates, we naturally find that they can be
very complicated. In a later section we shall start with the
ordinary equations of motion in rectangular coordinates, intro-
duce new coordinates as functions of the old, and find the new
equations of motion by direct change of variables. We find
many new terms coming in, as soon as the change of variables
is at all complicated. But we shall find that there are several
fairly simple ways of writing the equations of motion, different
in form from Newton's equations, though essentially identical,
which preserve their simple form even in generalized coordinates.
The most elementary of these methods is that of Lagrange's
equations, and we consider them in this chapter.
41. Lagrange's Equations. — We start our discussion of
Lagrange's equations merely by restating Newton's second law
of motion, in a slightly different way. For the moment, we
consider only problems where there is a potential energy function.
Since $F_x = -\partial V/\partial x$, etc., the equations of motion, written
in terms of momenta, are
$$\frac{d(mv_x)}{dt} = -\frac{\partial V}{\partial x}, \tag{1}$$
etc. There is an interesting way in which these equations can
be written. Let the kinetic energy be called T, so that
$$T = \frac{m}{2}(v_x^2 + v_y^2 + v_z^2), \tag{2}$$
if it is written in terms of the velocity components. If we keep
this form, we observe that $mv_x = \partial T/\partial v_x$, which we note is the
x component of momentum. Hence we can write our equations
$$\frac{d}{dt}\left(\frac{\partial T}{\partial v_x}\right) + \frac{\partial V}{\partial x} = 0, \tag{3}$$
etc. But this can be put in another form, if we let T − V = L,
called the Lagrangian function (and different from the total
energy, which is T + V). T is to be considered a function
of the velocity components, and V of the coordinates, so that L
is a function of all these six variables. Since T depends only
on the v's, and V only on the x's, we have $\partial T/\partial v_x = \partial L/\partial v_x$,
$\partial V/\partial x = -\partial L/\partial x$, etc. Hence the equations of motion are
$$\frac{d}{dt}\left(\frac{\partial L}{\partial v_x}\right) - \frac{\partial L}{\partial x} = 0, \tag{4}$$
with similar equations for y and z. In this form, the equations
are called Lagrange's equations of motion, and they are simply
convenient ways of writing Newton's second law of motion.
As we have stated, the importance of Lagrange's equations is
that they hold in any sort of coordinates, not merely in rectangular
coordinates. Thus, if the coordinates are $q_1 \ldots q_n$, and
their time derivatives are $\dot q_1 \ldots \dot q_n$, the equations are
$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) - \frac{\partial L}{\partial q_i} = 0. \tag{5}$$
Here as before L = T − V, but now it is no longer true, as
before, that T depends only on the velocities, V only on the
coordinates. Instead, T generally involves the coordinates
as well, so that the term $\partial L/\partial q_i$ has some contributions coming
from $\partial T/\partial q_i$, which are evidently absent in rectangular coordinates.
We shall see by an example that these terms are a sort
of fictitious force introduced by using the generalized coordinates,
and of which the centrifugal force in polar coordinates is a typical
case. We postpone a proof of Lagrange's equations to a later
section, giving first an example of their usefulness by discussing
the motion of a particle in a central field, as a planet about the
sun.
42. Planetary Motion. — As an example of two-dimensional
motion, and of the Lagrangian equations, we consider the case
where V = V(r), a function only of the distance r from a given
point. This problem is almost impossible to discuss completely
if we use rectangular coordinates, but if we take polar coordinates,
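r, θ, we find that we can separate variables, and that the problem
is then easily solved. To apply Lagrange's method to this case,
we write L as a function of r, θ, $\dot r$, and $\dot\theta$. Then we have
$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot r}\right) - \frac{\partial L}{\partial r} = 0, \qquad \frac{d}{dt}\left(\frac{\partial L}{\partial \dot\theta}\right) - \frac{\partial L}{\partial \theta} = 0. \tag{6}$$
First we find L. The velocity is made up of two vector components
at right angles, along the radius and along the tangent to
a circle. The first is $\dot r$, the second $r\dot\theta$, so that $v^2 = \dot r^2 + r^2\dot\theta^2$, and
$$L = T - V = \frac{m}{2}(\dot r^2 + r^2\dot\theta^2) - V(r).$$
Differentiating,
$$\frac{\partial L}{\partial \dot r} = m\dot r, \qquad \frac{\partial L}{\partial \dot\theta} = mr^2\dot\theta, \qquad \frac{\partial L}{\partial r} = mr\dot\theta^2 - \frac{dV}{dr}, \qquad \frac{\partial L}{\partial \theta} = 0. \tag{7}$$
Then the equations are
$$\frac{d}{dt}(m\dot r) - mr\dot\theta^2 + \frac{dV}{dr} = 0, \qquad \frac{d}{dt}(mr^2\dot\theta) = 0. \tag{8}$$
The second may be immediately integrated: $mr^2\dot\theta$ = constant.
This has a simple interpretation, for $mr^2\dot\theta$ is simply the angular
momentum, since $mr^2$ is the moment of inertia, $\dot\theta$ the angular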
velocity, and our equation states that it is constant, since no
torque is acting. As a matter of fact, $\partial L/\partial \dot q_i$ is called the
generalized momentum associated with the generalized coordinate
$q_i$, and linear and angular momenta are special cases of the
generalized momentum. Let then $mr^2\dot\theta = p$, where p is a
constant (momenta are conventionally called p, as coordinates
are called q). Next we may consider the first equation,
$m\,d^2r/dt^2 = mr\dot\theta^2 - dV/dr$. The first term on the right-hand
side is at first unexpected. But when we look at it, we see that
it is the centrifugal force, which must be added to the external
force to produce the radial acceleration.
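The constancy of $mr^2\dot\theta$ can be watched in a numerical integration. The sketch below is only an illustration: it assumes an attractive inverse-square field V = −k/r with k = m = 1, integrates Newton's equations in rectangular coordinates with a fourth-order Runge-Kutta step (our choice of method), and evaluates the angular momentum as $m(xv_y - yv_x)$, which equals $mr^2\dot\theta$.

```python
def accel(x, y, k=1.0, m=1.0):
    """Acceleration in the assumed field V = -k/r."""
    r3 = (x * x + y * y) ** 1.5
    return (-k * x / (m * r3), -k * y / (m * r3))

def rk4_step(state, dt):
    """One Runge-Kutta step for state = (x, y, vx, vy)."""
    def deriv(s):
        x, y, vx, vy = s
        ax, ay = accel(x, y)
        return (vx, vy, ax, ay)

    def shifted(s, d, f):
        return tuple(si + f * di for si, di in zip(s, d))

    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def angular_momentum(state, m=1.0):
    """m (x vy - y vx), the same quantity as m r^2 (d theta/dt)."""
    x, y, vx, vy = state
    return m * (x * vy - y * vx)

state = (1.0, 0.0, 0.0, 0.8)          # arbitrary starting conditions
p_start = angular_momentum(state)
for _ in range(5000):
    state = rk4_step(state, 0.001)
p_end = angular_momentum(state)
print(p_start, p_end)                  # the two values agree closely
```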
We can now solve our equations. Setting $mr^2\dot\theta = p$, we have
$\dot\theta = p/mr^2$, so that
$m\,d^2r/dt^2 = p^2/mr^3 - dV/dr = -\frac{d}{dr}(V + p^2/2mr^2)$.
We have separated the variable r from θ, and the
result is just like the equation for a one-dimensional problem with
a potential $V + p^2/2mr^2$, the latter being a sort of fictitious potential
energy coming from the centrifugal force. For example, if
the force is a gravitational one, $V = -Gmm'/r$, where m' is the
mass of the attracting body, G the gravitational constant, so
that we have the problem of the apparent potential
$-Gmm'/r + p^2/2mr^2$. Except for the constants, this is the case of the
potential $-(1/x) + (1/x^2)$, which we have already taken up in Probs. 3
and 4, Chap. V. We showed there that motions of negative
energy are oscillatory in r, so that the orbit is concentrated in a
finite region, and motions of positive energy go to infinity. We
leave the exact discussion to a problem, but it proves to be true
that the finite orbits are periodic and are ellipses with the attract-
ing center at one focus, while the open orbits are hyperbolas.
This is, however, a special case, and we proceed to a qualitative
discussion of the general central motion, by the method of energy.
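The identity behind the separation of variables, that $p^2/mr^3 - dV/dr$ is the negative derivative of the apparent potential $V + p^2/2mr^2$, can be checked at a sample point. The sketch assumes V = −k/r with k = m = 1 and arbitrary values of r and p; all names are our own.

```python
def V(r, k=1.0):
    """Assumed attractive potential V = -k/r."""
    return -k / r

def dV_dr(r, k=1.0):
    return k / (r * r)

def V_eff(r, p, m=1.0):
    """Apparent potential V + p^2 / (2 m r^2)."""
    return V(r) + p * p / (2.0 * m * r * r)

def radial_force(r, p, m=1.0):
    """Right-hand side of the radial equation: centrifugal term plus external force."""
    return p * p / (m * r ** 3) - dV_dr(r)

r, p, h = 1.5, 0.8, 1e-6
numeric = -(V_eff(r + h, p) - V_eff(r - h, p)) / (2 * h)
print(numeric, radial_force(r, p))     # the two numbers agree closely
```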
43. Energy Method for Radial Motion in Central Field. — We
have seen that the radial motion of a particle in a central field
is just like the one-dimensional motion of a particle in a potential
$V + (p^2/2mr^2)$, where p is the constant angular momentum.
This problem can be discussed as in Chap. V, plotting the curve
$V + (p^2/2mr^2)$ as a function of r, and drawing the horizontal
line at height E, as in Fig. 6. Aside from this, we can make no
general statement. But in many important physical cases, the
curve resembles A or B in Fig. 9, the rise at r = 0 arising from
the centrifugal force, and the potential V representing attraction
in A, repulsion in B. With energy $E_1$ in either case, the motion
would come in from infinity to a smallest distance (c or d), called
the perihelion, from the astronomical analogy, perihelion meaning
near the sun. It would then reverse, and travel outward for
infinite time. The energy $E_2$, however, would represent no
possible motion with the curve B, but with the attractive poten-
tial A, which resembles the gravitational attraction mentioned
in the preceding section, there would be oscillatory motion
between the perihelion a and the aphelion b. This motion
would be periodic, and the radius as function of time, and likewise
the period, could be computed by the method of the energy
integral discussed in Chap. V.

Fig. 9. Curves of $V + p^2/2mr^2$ as functions of r. Case: A, attraction; B,
repulsion. With energy $E_1$, motion goes to infinity with either potential; with
$E_2$, motion impossible with curve B, oscillatory between limits a and b with curve
A.
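For a given curve the limits a and b are the two values of r at which the horizontal line at height E meets $V + p^2/2mr^2$, and they can be located by bisection. The sketch below assumes V = −k/r with k = m = 1, p = 0.8, and a negative energy E = −0.68; these numbers, and the brackets on either side of the minimum of the curve at $r = p^2/mk$, are illustrative choices.

```python
def V_eff(r, p=0.8, m=1.0, k=1.0):
    """Apparent potential for the assumed attraction V = -k/r."""
    return -k / r + p * p / (2.0 * m * r * r)

def turning_point(E, lo, hi, iterations=100):
    """Bisect for a root of E - V_eff(r) between lo and hi (a sign change is assumed)."""
    f_lo = E - V_eff(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = E - V_eff(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

E = -0.68
r_bottom = 0.64                        # minimum of the curve, p*p/(m*k)
perihelion = turning_point(E, 0.1, r_bottom)
aphelion = turning_point(E, r_bottom, 5.0)
print(perihelion, aphelion)            # close to 0.4706 and 1.0
```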
44. Orbits in Central Motion. — The best picture of central
motion is obtained by considering the orbit in space, as in Fig. 10.
Suppose we consider a motion oscillatory in r, as the case $E_2$
of Fig. 9. Then we may draw two circles, of radii equal to
the perihelion and aphelion distances, respectively, and the
motion will take place between the circles. The orbit must be
tangential to both circles, as shown. If the motion starts
on the outer circle, the particle will move with continually
decreasing radius until it touches the inner circle. At the same
time, however, on account of the angular momentum, it will
be turning around, and the angle made by the radius vector
will have turned through a definite amount between the points
of contact with outer and inner circles. After touching the inner
circle, the whole procedure is reversed, r increasing to the maximum
value, so that after a certain time the point will touch the
outer circle again.

Fig. 10. Orbit of a particle in central motion.
Now between the two successive points where the orbit touches
the outer circle, there will be a certain length of arc. It may be
that this is a rational fraction, say m/n, of the circumference,
where m and n are integers. In that case, after n excursions
to the center and out again, the aphelion point will have gone
around the circle m times, and will have come back to the starting
point. Thus the motion is periodic, repeating itself after a
certain length of time. For example, if the particle is attracted
to the center according to the inverse square, m/n is just 1,
and the particle always comes back to the same point on the
circle. But if the length of arc is an irrational fraction of the cir-
cumference, as in Fig. 10, the motion is not periodic, and will
never repeat itself. Nevertheless, it is what is called doubly
periodic. The motion resembles a slowly rotating ellipse,
rotating so that successive aphelion points, instead of lying on
top of each other, are displaced with respect to each other by a
given angle. This slow rotation is called precession, and one
can find the frequency, and angular velocity, of the precessional
motion. If now we imagined a turntable to rotate with the
precessional frequency, and traced out the motion on this turn-
table, the path would be closed, somewhat like an ellipse. In
other words, the whole motion is a combination of a periodic
motion, superposed on a rotation. These two motions have in
general entirely independent frequencies, and that is the origin
of the statement that the motion is doubly periodic.
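The angle turned through between successive aphelia can be estimated by integrating the radial and angular equations together. The sketch below is a rough illustration under our own assumptions: a Runge-Kutta step, p = 0.8, m = 1, a start at an aphelion r = 1, and two force laws, the inverse square and a hypothetical one with a small added inverse-cube attraction. The function returns the angle as a fraction of the full circumference.

```python
import math

def apsidal_fraction(force, p=0.8, m=1.0, r0=1.0, dt=1e-3, max_steps=200000):
    """Angle between successive aphelia, as a fraction of 2*pi.

    force(r) is the radial force -dV/dr; the particle starts at an aphelion r0.
    """
    def deriv(s):
        r, rdot, theta = s
        return (rdot, p * p / (m * m * r ** 3) + force(r) / m, p / (m * r * r))

    def step(s):
        k1 = deriv(s)
        k2 = deriv(tuple(a + 0.5 * dt * b for a, b in zip(s, k1)))
        k3 = deriv(tuple(a + 0.5 * dt * b for a, b in zip(s, k2)))
        k4 = deriv(tuple(a + dt * b for a, b in zip(s, k3)))
        return tuple(a + dt / 6.0 * (b + 2 * c + 2 * d + e)
                     for a, b, c, d, e in zip(s, k1, k2, k3, k4))

    s = (r0, 0.0, 0.0)
    for _ in range(max_steps):
        new = step(s)
        if s[1] > 0.0 and new[1] <= 0.0:       # dr/dt passes from + to -: an aphelion
            w = s[1] / (s[1] - new[1])         # linear interpolation within the step
            theta = s[2] + w * (new[2] - s[2])
            return theta / (2.0 * math.pi)
        s = new
    raise RuntimeError("no second aphelion found")

def inverse_square(r):
    return -1.0 / (r * r)

def perturbed(r):
    """Hypothetical law: inverse square plus a small inverse-cube attraction."""
    return -1.0 / (r * r) - 0.04 / r ** 3

print(apsidal_fraction(inverse_square))   # close to 1: the orbit closes on itself
print(apsidal_fraction(perturbed))        # somewhat greater than 1: the ellipse precesses
```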
45. Justification of Lagrange's Method. — We shall now show
in our special case of polar coordinates how Lagrange's method
could be justified, using this as a model for the general treatment.
Surely the equations of motion are
$$m\frac{d^2x}{dt^2} = -\frac{\partial V}{\partial x}, \qquad m\frac{d^2y}{dt^2} = -\frac{\partial V}{\partial y}.$$
We introduce the polar coordinates, x = r cos θ, y = r sin θ.
Then dx/dt = cos θ dr/dt − r sin θ dθ/dt,
$$\frac{d^2x}{dt^2} = \frac{d^2r}{dt^2}\cos\theta - 2\sin\theta\,\frac{dr}{dt}\frac{d\theta}{dt} - r\sin\theta\,\frac{d^2\theta}{dt^2} - r\cos\theta\left(\frac{d\theta}{dt}\right)^2, \tag{9}$$
$$\frac{d^2y}{dt^2} = \frac{d^2r}{dt^2}\sin\theta + 2\cos\theta\,\frac{dr}{dt}\frac{d\theta}{dt} + r\cos\theta\,\frac{d^2\theta}{dt^2} - r\sin\theta\left(\frac{d\theta}{dt}\right)^2. \tag{10}$$
Using these, we can obtain the equations of motion in x and y.
But now multiply Eq. (9) by cos θ, Eq. (10) by sin θ, and add.
The result on the left is $m\,d^2r/dt^2 - mr(d\theta/dt)^2$, and on the right
$-(\partial V/\partial x\,\cos\theta + \partial V/\partial y\,\sin\theta)$, which is just $-\partial V/\partial r$, since
the latter should be $-(\partial V/\partial x\;\partial x/\partial r + \partial V/\partial y\;\partial y/\partial r)$, and
$\partial x/\partial r = \cos\theta$, $\partial y/\partial r = \sin\theta$. Thus we have the first of
Lagrange's equations. Next, multiply Eq. (9) by −r sin θ,
Eq. (10) by r cos θ, and add. On the left, we have $2mr\,(dr/dt)(d\theta/dt)
+ mr^2\,d^2\theta/dt^2$, which equals $m\,d/dt(r^2\,d\theta/dt)$, and on the right we
have $r\,\partial V/\partial x\,\sin\theta - r\,\partial V/\partial y\,\cos\theta = -\partial V/\partial\theta$. Thus the
second equation becomes $m\,d/dt(r^2\,d\theta/dt) = -\partial V/\partial\theta$, the second
of Lagrange's equations (whose right member is zero in the case
of a central field).
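The equivalence just shown can be spot-checked with numbers. The sketch below assumes the central potential V = −k/r with k = m = 1 and an arbitrary state (r, θ, dr/dt, dθ/dt). It takes the polar accelerations from Eq. (8) of Sec. 42, inserts them in Eqs. (9) and (10), and compares the result with the rectangular Newtonian accelerations; the function names are our own.

```python
import math

def dV_dr(r, k=1.0):
    """Derivative of the assumed potential V = -k/r."""
    return k / (r * r)

def polar_accelerations(r, theta, rdot, thetadot, m=1.0):
    """Second derivatives of r and theta from Lagrange's equations, Eq. (8)."""
    rddot = r * thetadot ** 2 - dV_dr(r) / m
    thetaddot = -2.0 * rdot * thetadot / r
    return rddot, thetaddot

def cartesian_accelerations(r, theta, rdot, thetadot, m=1.0):
    """Second derivatives of x and y from Eqs. (9) and (10)."""
    rdd, thdd = polar_accelerations(r, theta, rdot, thetadot, m)
    c, s = math.cos(theta), math.sin(theta)
    xdd = rdd * c - 2 * s * rdot * thetadot - r * s * thdd - r * c * thetadot ** 2
    ydd = rdd * s + 2 * c * rdot * thetadot + r * c * thdd - r * s * thetadot ** 2
    return xdd, ydd

r, theta, rdot, thetadot, m = 1.3, 0.7, -0.2, 0.9, 1.0    # arbitrary state
xdd, ydd = cartesian_accelerations(r, theta, rdot, thetadot, m)
newton_x = -dV_dr(r) * math.cos(theta) / m    # -(1/m) dV/dx for a central field
newton_y = -dV_dr(r) * math.sin(theta) / m    # -(1/m) dV/dy
print(xdd - newton_x, ydd - newton_y)         # both differences are negligible
```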
Just such a change of variables can be carried out in the general
case. Suppose that, for the sake of simplicity, we still take only
two dimensions; the general proof goes through in just the same
way, except with more complicated expressions. We start with
two rectangular coordinates x and y, in terms of which we have
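the ordinary Newtonian equations $m\,d^2x/dt^2 = -\partial V/\partial x$,
$m\,d^2y/dt^2 = -\partial V/\partial y$, and two generalized coordinates $q_1$ and
$q_2$, given as functions of x and y, so that $q_1 = q_1(x, y)$,
$q_2 = q_2(x, y)$, or conversely we can write x and y as functions of
$q_1$ and $q_2$: $x = x(q_1, q_2)$, $y = y(q_1, q_2)$. We must remember
carefully what these quantities are functions of, in taking partial
derivatives. Now we have
$$\frac{dx}{dt} = \frac{\partial x}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial x}{\partial q_2}\frac{dq_2}{dt},$$
$$\begin{aligned}
\frac{d^2x}{dt^2} &= \frac{\partial x}{\partial q_1}\frac{d^2q_1}{dt^2} + \frac{\partial x}{\partial q_2}\frac{d^2q_2}{dt^2}\\
&\quad + \frac{dq_1}{dt}\left(\frac{\partial^2 x}{\partial q_1^2}\frac{dq_1}{dt} + \frac{\partial^2 x}{\partial q_1\,\partial q_2}\frac{dq_2}{dt}\right)\\
&\quad + \frac{dq_2}{dt}\left(\frac{\partial^2 x}{\partial q_1\,\partial q_2}\frac{dq_1}{dt} + \frac{\partial^2 x}{\partial q_2^2}\frac{dq_2}{dt}\right),
\end{aligned}$$
with a similar equation for $d^2y/dt^2$. In terms of these, we set up
the equations $m\,d^2x/dt^2 = -\partial V/\partial x$, etc. Then we multiply the
first by $\partial x/\partial q_1$, the second by $\partial y/\partial q_1$, and add. We have
$$\begin{aligned}
m\Bigg\{&\left[\left(\frac{\partial x}{\partial q_1}\right)^2 + \left(\frac{\partial y}{\partial q_1}\right)^2\right]\frac{d^2q_1}{dt^2} + \left(\frac{\partial x}{\partial q_1}\frac{\partial x}{\partial q_2} + \frac{\partial y}{\partial q_1}\frac{\partial y}{\partial q_2}\right)\frac{d^2q_2}{dt^2}\\
&+ \left(\frac{\partial x}{\partial q_1}\frac{\partial^2 x}{\partial q_1^2} + \frac{\partial y}{\partial q_1}\frac{\partial^2 y}{\partial q_1^2}\right)\left(\frac{dq_1}{dt}\right)^2\\
&+ 2\left(\frac{\partial x}{\partial q_1}\frac{\partial^2 x}{\partial q_1\,\partial q_2} + \frac{\partial y}{\partial q_1}\frac{\partial^2 y}{\partial q_1\,\partial q_2}\right)\frac{dq_1}{dt}\frac{dq_2}{dt}\\
&+ \left(\frac{\partial x}{\partial q_1}\frac{\partial^2 x}{\partial q_2^2} + \frac{\partial y}{\partial q_1}\frac{\partial^2 y}{\partial q_2^2}\right)\left(\frac{dq_2}{dt}\right)^2\Bigg\}\\
&= -\left(\frac{\partial V}{\partial x}\frac{\partial x}{\partial q_1} + \frac{\partial V}{\partial y}\frac{\partial y}{\partial q_1}\right) = -\frac{\partial V}{\partial q_1}.
\end{aligned}$$
It will next be shown that the rather complicated expression
on the left is equal to
$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot q_1}\right) - \frac{\partial T}{\partial q_1},$$
where T is the kinetic energy. To do this, we first have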
2[^agri dz "^ a? 2 d< J "*" \d qi dt + ag 2 dt )
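Then by differentiation, remembering that $\dot{q}_1 = dq_1/dt$,
$$
\begin{aligned}
\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_1}\right)
= {} & m\frac{d}{dt}\left[\left(\frac{\partial x}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial x}{\partial q_2}\frac{dq_2}{dt}\right)\frac{\partial x}{\partial q_1}
+ \left(\frac{\partial y}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial y}{\partial q_2}\frac{dq_2}{dt}\right)\frac{\partial y}{\partial q_1}\right] \\
= {} & m\Bigg\{\left(\frac{\partial x}{\partial q_1}\frac{d^2q_1}{dt^2} + \frac{\partial x}{\partial q_2}\frac{d^2q_2}{dt^2}\right)\frac{\partial x}{\partial q_1}
+ \left(\frac{\partial y}{\partial q_1}\frac{d^2q_1}{dt^2} + \frac{\partial y}{\partial q_2}\frac{d^2q_2}{dt^2}\right)\frac{\partial y}{\partial q_1} \\
& + \left(\frac{dq_1}{dt}\right)^2\left[\frac{\partial}{\partial q_1}\left(\frac{\partial x}{\partial q_1}\right)^2 + \frac{\partial}{\partial q_1}\left(\frac{\partial y}{\partial q_1}\right)^2\right] \\
& + \frac{dq_1}{dt}\frac{dq_2}{dt}\left[\frac{\partial}{\partial q_2}\left(\frac{\partial x}{\partial q_1}\right)^2 + \frac{\partial}{\partial q_2}\left(\frac{\partial y}{\partial q_1}\right)^2
+ \frac{\partial}{\partial q_1}\left(\frac{\partial x}{\partial q_1}\frac{\partial x}{\partial q_2}\right) + \frac{\partial}{\partial q_1}\left(\frac{\partial y}{\partial q_1}\frac{\partial y}{\partial q_2}\right)\right] \\
& + \left(\frac{dq_2}{dt}\right)^2\left[\frac{\partial}{\partial q_2}\left(\frac{\partial x}{\partial q_1}\frac{\partial x}{\partial q_2}\right) + \frac{\partial}{\partial q_2}\left(\frac{\partial y}{\partial q_1}\frac{\partial y}{\partial q_2}\right)\right]\Bigg\}.
\end{aligned}
$$
Also
$$
\begin{aligned}
\frac{\partial T}{\partial q_1} = m\Bigg[&\left(\frac{\partial x}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial x}{\partial q_2}\frac{dq_2}{dt}\right)\left(\frac{\partial^2 x}{\partial q_1^2}\frac{dq_1}{dt} + \frac{\partial^2 x}{\partial q_1\,\partial q_2}\frac{dq_2}{dt}\right) \\
&+ \left(\frac{\partial y}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial y}{\partial q_2}\frac{dq_2}{dt}\right)\left(\frac{\partial^2 y}{\partial q_1^2}\frac{dq_1}{dt} + \frac{\partial^2 y}{\partial q_1\,\partial q_2}\frac{dq_2}{dt}\right)\Bigg].
\end{aligned}
$$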
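Combining these two expressions, it is easy to see that we have
just the quantity which we desired. We have then the equation
$$
\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_1}\right) - \frac{\partial T}{\partial q_1} = -\frac{\partial V}{\partial q_1}.
$$
If we set $L = T - V$, and remember that, since $V$ does not
depend on the velocities, $\partial V/\partial \dot{q}_1 = 0$, this becomes
$$
\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_1}\right) - \frac{\partial L}{\partial q_1} = 0,
$$
or Lagrange's equation for $q_1$. Similarly we can prove the
equation for $q_2$.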
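As a numerical illustration of the result, the following minimal sketch integrates the Newtonian equations in rectangular coordinates for a central field and checks that the quantity $m r^2\,d\theta/dt$ appearing in the second Lagrange equation stays constant. The inverse-square potential $V = -k/r$, the unit values of $m$ and $k$, the initial conditions, and the Runge-Kutta integrator are all assumptions made for the example, not part of the text.

```python
import math

def accel(x, y, k=1.0, m=1.0):
    """Newtonian acceleration for the assumed central potential V = -k/r."""
    r3 = (x * x + y * y) ** 1.5
    return -k * x / (m * r3), -k * y / (m * r3)

def rk4_step(state, dt):
    """Advance (x, y, vx, vy) by one fourth-order Runge-Kutta step."""
    def deriv(s):
        x, y, vx, vy = s
        ax, ay = accel(x, y)
        return (vx, vy, ax, ay)

    def shift(s, d, h):
        return tuple(si + h * di for si, di in zip(s, d))

    k1 = deriv(state)
    k2 = deriv(shift(state, k1, dt / 2))
    k3 = deriv(shift(state, k2, dt / 2))
    k4 = deriv(shift(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def angular_momentum(state, m=1.0):
    """m r^2 dtheta/dt, written in rectangular coordinates as m (x vy - y vx)."""
    x, y, vx, vy = state
    return m * (x * vy - y * vx)

state = (1.0, 0.0, 0.0, 0.8)       # illustrative bound, non-circular orbit
p0 = angular_momentum(state)
max_drift = 0.0
for _ in range(5000):
    state = rk4_step(state, 0.002)
    max_drift = max(max_drift, abs(angular_momentum(state) - p0))

print("angular momentum:", p0, "largest drift:", max_drift)
```

The drift that remains comes only from the finite step of the integrator; the equations themselves make the right member of the $\theta$ equation vanish.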
It is worth remarking that the method which we have used
for proving Lagrange's equations, though straightforward and
simple in principle, is not the one usually employed. More often
a derivation is given using the calculus of variations, which avoids
most of the algebraic complications, but which on the other hand
is more difficult in the fundamental ideas involved.
Problems
1. A particle of mass $m$ is attracted to a center by a force $-Gmm'/r^2$. Find
perihelion and aphelion distances as a function of energy and angular
momentum. Assuming that the orbit is an ellipse, prove that its major
axis is $-Gmm'/E$.
2. In Prob. 1, show that it is possible for perihelion and aphelion distances
to be equal, so that the orbit is circular. Find the necessary relation
between energy and angular momentum for this to happen, and check this
relation by elementary discussion, balancing the centrifugal force in the
circular motion against the attraction.
3. A particle in an inverse square field executes an elliptical motion with
the center of attraction as a focus. Find the period of this motion, by
considering the radial motion, proceeding as in Prob. 5, Chap. V, using the
results of that problem if you wish, but finding the period in terms of energy
and angular momentum.
4. Discuss in detail the motion of a planet about a sun, proving that, if
the energy is negative, the orbit is elliptical with the sun at a focus, and
finding the relations between the major and minor axes of the ellipse and the
energy and angular momentum. A procedure for the discussion is
suggested as follows:
Assuming the angular momentum to be $p = mr^2\dot{\theta}$ = constant, show
that the energy is
$$
\frac{p^2}{2m}\left[\left(\frac{du}{d\theta}\right)^2 + u^2\right] - Gmm'u, \qquad \text{where } u = \frac{1}{r}.
$$
Find $du/d\theta$ from the equation of an ellipse in polar coordinates,
with one focus as a pole, which is
$$
u = \frac{1 - e\cos\theta}{a(1 - e^2)},
$$
where $a$ is the semi-major axis, $e$ the eccentricity,
so that $b$, the semi-minor axis, is given by $b^2/a^2 = 1 - e^2$. Substituting
your value of $du/d\theta$ into the expression for energy, show that the result is a
constant, independent of $\theta$, and equal to $E$, if the major axis and eccentricity
are properly chosen.
5. Suppose a particle of mass $m$, charge $e$, collides with a very heavy
particle which has charge $e'$, so that it repels with a potential energy $ee'/r$.
The first particle is moving with a velocity $v_0$ at a great distance, and is
aimed so that, if it continued in a straight line, it would pass by the center
of repulsion at a minimum distance R. Note that this determines the
angular momentum. Using the energy method, find the perihelion distance
as a function of R and the velocity of the particle.
6. Discuss in detail the motion of the particle of Prob. 5, showing that it
will be deflected so that after the collision the line of travel will make an
angle $\phi$ with the initial direction, where
$$
\tan\frac{\phi}{2} = \frac{ee'}{m v_0^2 R}.
$$
Such deflections are observed in collisions between alpha particles and
atomic nuclei, in Rutherford's scattering experiments.
Suggestions: the particle executes a hyperbolic orbit, and the desired
angle is the angle between the asymptotes. Now the equation of a
hyperbola in polar coordinates is just like that of an ellipse, as given in Prob. 4,
except that the eccentricity is greater than 1, so that the term $1 - e\cos\theta$
can become zero, and $r$ infinite, giving the angles of the asymptotes in
terms of $e$. We need then only determine $e$ in terms of energy and angular
momentum, from the equations found in Prob. 4.
7. A two-dimensional linear oscillator is attracted to a center by a force
proportional to the distance, or $F_x = -ax$, $F_y = -ay$. Solve in
rectangular coordinates, separating variables, showing that $x$ and $y$ execute
independent simple harmonic vibrations of the same frequency. Prove that the
resulting orbit is an ellipse, with its center at the center of attraction.
8. Taking the solution of Prob. 7 in rectangular coordinates, find the
angular momentum vector by ordinary vector formulas from the displace-
ment and velocity, and prove by direct computation that it remains con-
stant. Find the angular momentum as a function of the dimensions of the
elliptical orbit, and show its connection with the area of the orbit.
9. Set up the problem of the two-dimensional linear oscillator, as in Prob.
7, using polar coordinates. Separate variables, solve the radial problem
by the energy method, compute the period in this way, and show that it is in
agreement with the period as found in Prob. 7.
CHAPTER VIII
GENERALIZED MOMENTA AND HAMILTON'S
EQUATIONS
In the last chapter we have found the equations of motion
in generalized coordinates, but we have not considered the mean-
ing in these coordinates of the simple concepts of momentum and
force. We shall accordingly examine these questions, and shall
see that the equations can be interpreted in the form that the
force equals the time rate of change of momentum, which as we
know is a more fundamental statement than that it is the mass
times acceleration. Using the momentum, we can then restate
the equations in a form called Hamilton's equations, equivalent
to Lagrange's equations, but more powerful in some applications
to advanced mechanics.
46. Generalized Forces. — In many mechanical problems we
have to deal with forces which cannot be derived from a potential.
Let us see how such forces may be included in the Lagrangian
scheme. For simplicity we take a two-dimensional problem,
and let the $x$ and $y$ components of force be $F_x$ and $F_y$, which may
depend on time, velocity, etc., as well as position. For
generality, we assume that part of the force can be derived from a
potential, the rest not, so that we have $F_x = -(\partial V/\partial x) + F_x'$,
etc., where $F_x'$ is the part of the force not derivable from a
potential. Now if we proceed with the proof of Lagrange's
equations as in the last chapter, we easily find
$$
\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_1}\right) - \frac{\partial T}{\partial q_1} = -\frac{\partial V}{\partial q_1} + F_x'\frac{\partial x}{\partial q_1} + F_y'\frac{\partial y}{\partial q_1},
$$
with a similar equation for $q_2$. We may introduce as before a
Lagrangian function, containing the part of the external forces
derivable from a potential: $L = T - V$. Then
$$
\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_1}\right) - \frac{\partial L}{\partial q_1} = F_x'\frac{\partial x}{\partial q_1} + F_y'\frac{\partial y}{\partial q_1} = Q_1, \qquad (1)
$$
with a similar equation for $q_2$, where $Q_1$, $Q_2$ are called the
generalized forces connected with the coordinates $q_1$, $q_2$. The
equation in this form may be used to discuss any arbitrary problem,
for example of damped motion, in generalized coordinates.
It is worth noting that these generalized forces are closely
related to the work done in an arbitrary displacement, just as
ordinary forces are in rectangular coordinates. For imagine the
generalized coordinates changed by amounts $dq_1$, $dq_2$. There
will be a certain amount of work done on the system, equal to
$-dV + dW$, where $dW$ is the work done by the external
nonconservative force $F'$ (a force is spoken of as conservative if it is
derivable from a potential, nonconservative otherwise). Now
in general we have
$$
\begin{aligned}
dW &= F_x'\,dx + F_y'\,dy \\
&= \left(F_x'\frac{\partial x}{\partial q_1} + F_y'\frac{\partial y}{\partial q_1}\right)dq_1 + \left(F_x'\frac{\partial x}{\partial q_2} + F_y'\frac{\partial y}{\partial q_2}\right)dq_2 \\
&= Q_1\,dq_1 + Q_2\,dq_2, \qquad (2)
\end{aligned}
$$
or the sum of products of generalized forces by generalized
displacements. It is, of course, plain that all these arguments work
equally well with more than two generalized coordinates.
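The relation between generalized forces and work can be checked numerically. The sketch below takes plane polar coordinates as the generalized coordinates (an illustrative choice), forms $Q_r$ and $Q_\theta$ from arbitrary rectangular components of a nonconservative force, and compares the work computed in both ways for a small displacement. The function name `generalized_forces` and all numerical values are assumptions for the example.

```python
import math

def generalized_forces(r, theta, fx, fy):
    """Q_r and Q_theta for x = r cos(theta), y = r sin(theta)."""
    dx_dr, dy_dr = math.cos(theta), math.sin(theta)
    dx_dth, dy_dth = -r * math.sin(theta), r * math.cos(theta)
    q_r = fx * dx_dr + fy * dy_dr          # F_x' dx/dr + F_y' dy/dr
    q_theta = fx * dx_dth + fy * dy_dth    # F_x' dx/dtheta + F_y' dy/dtheta
    return q_r, q_theta

r, theta = 2.0, 0.7
fx, fy = 0.3, -1.1                         # arbitrary force components
q_r, q_theta = generalized_forces(r, theta, fx, fy)

# Work done in a small displacement (dr, dtheta), computed both ways.
dr, dth = 1e-6, -2e-6
x0, y0 = r * math.cos(theta), r * math.sin(theta)
x1, y1 = (r + dr) * math.cos(theta + dth), (r + dr) * math.sin(theta + dth)
work_rect = fx * (x1 - x0) + fy * (y1 - y0)
work_gen = q_r * dr + q_theta * dth

torque = x0 * fy - y0 * fx                 # moment of the force about the origin
print(work_rect, work_gen, q_theta, torque)
```

The two values of the work agree to first order in the displacement, and in these coordinates $Q_\theta$ coincides with the moment of the force about the origin.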
The forces $Q$ which we have just introduced were the external
applied forces not derivable from a potential. But we may well
consider all the forces together. We could write Lagrange's
equations as
$$
\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_i}\right) = Q_i - \frac{\partial V}{\partial q_i} + \frac{\partial T}{\partial q_i}. \qquad (3)
$$
The three terms on the right of Eq. (3) may be taken to be three
terms of the force. The first is the generalized force not derivable
from a potential, the second the force derivable from a potential,
the third the fictitious force, like a centrifugal force, arising from
the fact that the coordinate system is not rectangular. Equation
(3) states that this total force equals the time rate of change of a
certain quantity, and it seems reasonable to consider this quantity
as a generalized momentum.
47. Generalized Momenta. — In simple cases the quantity
$\partial L/\partial \dot{q}_i$ plays the part of a momentum. Thus in rectangular
coordinates, we have $\partial L/\partial \dot{x} = m\dot{x}$, or exactly the momentum
associated with the coordinate $x$. Similarly in polar coordinates
the quantities associated with $r$ and $\theta$ are $m\dot{r}$, the radial
momentum, and $mr^2\dot{\theta}$, the angular momentum, respectively. These
are but examples of a general rule, and as a matter of fact we
define $\partial L/\partial \dot{q}_i$ to be the generalized momentum associated with
the coordinate $q_i$, denoting it by $p_i$. We note that generalized
momenta are not of the same dimensions as ordinary momenta,
in general; they are not simply components of the momentum
referred to other coordinates. Similarly generalized forces are
not simply components of forces. For instance, it is easily shown
that in polar coordinates the generalized force $Q_r$ is the
component of force along $r$, but $Q_\theta$ is the moment of force, or torque,
which by Eq. (3) above equals the time rate of change of the
angular momentum.
48. Hamilton's Equations of Motion. — Assuming no external
forces Q, we could evidently write Lagrange's equations in the
form $dp_i/dt - \partial L/\partial q_i = 0$, or $dp_i/dt = \partial L/\partial q_i$, which, taken
together with the definitions $p_i = \partial L/\partial \dot{q}_i$, would form a complete
system. But there is a neater method, known as Hamilton's
method, which we use instead. We can first see how Hamilton's
equations are set up in rectangular coordinates. There we have
$T = (m/2)(\dot{x}^2 + \dot{y}^2 + \dot{z}^2)$. Then it is true that we have, for
instance,
$$
p_x = \frac{\partial L}{\partial \dot{x}} = \frac{\partial T}{\partial \dot{x}} = m\dot{x}.
$$
We can also write $T$, not in terms of the velocities $\dot{x}$, $\dot{y}$, $\dot{z}$, but
in terms of the momenta $p_x$, $p_y$, $p_z$. Since $\dot{x} = p_x/m$, we have
$$
T(p_x, p_y, p_z) = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right),
$$
where we must specify that $T$ is a function of the $p$'s. Then we
have
$$
\frac{\partial T(p_x, p_y, p_z)}{\partial p_x} = \frac{p_x}{m} = \frac{m\dot{x}}{m} = \dot{x},
$$
and similarly $\partial T(p)/\partial p_y = \dot{y}$, $\partial T(p)/\partial p_z = \dot{z}$. These take the
place of the equations $p_x = \partial T(\dot{x}, \dot{y}, \dot{z})/\partial \dot{x}$, etc.
Now in Hamilton's method we set up what is called the
Hamiltonian function $H$. This is in all ordinary cases simply the total
energy $T + V$, in which $T$ is expressed in terms of the momenta,
rather than the velocities. Thus we have $H = H(q_i, p_i)$,
meaning that it is a function of the coordinates and momenta. Then
$$
\frac{\partial H}{\partial q_i} = \frac{\partial T}{\partial q_i} + \frac{\partial V}{\partial q_i},
$$
which in rectangular coordinates, where $T$ does not depend on the
coordinates, gives $\partial H/\partial q_i = \partial V/\partial q_i = -\partial L/\partial q_i$, so
that in this case Lagrange's equation becomes $dp_i/dt = -\partial H/\partial q_i$.
Similarly
$$
\frac{\partial H}{\partial p_i} = \frac{\partial T}{\partial p_i} = \dot{q}_i = \frac{dq_i}{dt}.
$$
The resulting equations are called Hamilton's equations:
$$
\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad
\frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}. \qquad (4)
$$
It is evident that they show a symmetry between $p_i$ and $q_i$, which
is one reason for preferring them over Lagrange's equations.
For a given problem, there are twice as many Hamiltonian
equations as Lagrangian equations, but they are only first-order
rather than second-order differential equations, so that it comes
down essentially to the same thing.
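The remark that two first-order equations come to the same thing as one second-order equation can be illustrated with a short sketch. It integrates Hamilton's equations for a linear oscillator with $H = p^2/2m + kq^2/2$ and compares the result with the familiar solution of $m\,d^2q/dt^2 = -kq$. The oscillator, the constants `M` and `K`, and the Runge-Kutta stepping are assumptions chosen for the example.

```python
import math

M, K = 1.0, 4.0                      # illustrative mass and force constant

def hamiltonian(q, p):
    return p * p / (2 * M) + 0.5 * K * q * q

def hamilton_rhs(q, p):
    """dq/dt = dH/dp and dp/dt = -dH/dq for the oscillator."""
    return p / M, -K * q

def rk4(q, p, dt):
    """One fourth-order Runge-Kutta step of the pair of first-order equations."""
    k1q, k1p = hamilton_rhs(q, p)
    k2q, k2p = hamilton_rhs(q + dt / 2 * k1q, p + dt / 2 * k1p)
    k3q, k3p = hamilton_rhs(q + dt / 2 * k2q, p + dt / 2 * k2p)
    k4q, k4p = hamilton_rhs(q + dt * k3q, p + dt * k3p)
    return (q + dt / 6 * (k1q + 2 * k2q + 2 * k3q + k4q),
            p + dt / 6 * (k1p + 2 * k2p + 2 * k3p + k4p))

q, p = 1.0, 0.0                      # start at rest, displaced by one unit
e0 = hamiltonian(q, p)
dt, steps = 0.001, 3000
for _ in range(steps):
    q, p = rk4(q, p, dt)

omega = math.sqrt(K / M)
q_exact = math.cos(omega * dt * steps)   # solution of the second-order equation
print(q, q_exact, hamiltonian(q, p) - e0)
```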
49. General Proof of Hamilton's Equations. — Our proof holds
only in rectangular coordinates, and we must next give a general
proof. As before, we start with Lagrange's equations, which we
assume are correct, and we define the momenta as derivatives of
the Lagrangian function with respect to the velocities. Then we
set up the Hamiltonian function in terms of the Lagrangian
function, by the equation
$$
H = \sum_i p_i \dot{q}_i - L. \qquad (5)
$$
This seems at first quite different from our elementary definition
of $H$ as the energy, but we shall show in the next paragraph that
it is equivalent. We express the Hamiltonian in terms of
coordinates and momenta, writing the velocities $\dot{q}_j$, where they appear
both in $\sum_j p_j \dot{q}_j$ and $L$, in terms of the momenta, so that we have
$$
H = \sum_j p_j\,\dot{q}_j(p_k, q_k) - L[\dot{q}_j(p_k, q_k), q_j].
$$
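Then we have
$$
\frac{\partial H}{\partial p_i} = \dot{q}_i + \sum_j p_j \frac{\partial \dot{q}_j}{\partial p_i} - \sum_j \frac{\partial L}{\partial \dot{q}_j}\frac{\partial \dot{q}_j}{\partial p_i}.
$$
But since by definition $p_j = \partial L/\partial \dot{q}_j$, the last two terms cancel,
leaving
$$
\frac{\partial H}{\partial p_i} = \dot{q}_i = \frac{dq_i}{dt}.
$$
Similarly,
$$
\frac{\partial H}{\partial q_i} = \sum_j p_j \frac{\partial \dot{q}_j}{\partial q_i} - \sum_j \frac{\partial L}{\partial \dot{q}_j}\frac{\partial \dot{q}_j}{\partial q_i} - \frac{\partial L}{\partial q_i}.
$$
This time the first two terms cancel, leaving $\partial H/\partial q_i = -\partial L/\partial q_i$,
so that by Lagrange's equations
$$
\frac{\partial H}{\partial q_i} = -\frac{dp_i}{dt}.
$$
Thus we have proved both of Hamilton's equations in the general
case.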
It remains to be shown that the Hamiltonian function, as we
have defined it, is the same as the total energy. First we consider
the kinetic energy expressed in terms of the velocities. This is a
homogeneous quadratic function of the velocities:
$$
T = \sum_{jk} A_{jk}\,\dot{q}_j \dot{q}_k, \qquad (6)
$$
where the $A$'s are coefficients depending in general on the
coordinates, and we are to sum over all possible values $j$ and $k$. In
particular, for rectangular coordinates, $A_{jk} = m/2$ if $j = k$, $0$ if
$j \neq k$. In cases where the coordinates are orthogonal, that is,
the coordinate surfaces intersect at right angles, as they do, for
instance, in spherical polar coordinates, or in fact in all the
coordinate systems in common use, only square terms come in,
all coefficients $A_{jk}$ being zero if $j \neq k$. But in oblique coordinate
systems, this is not true. Now for such a homogeneous
quadratic expression we have the theorem
$$
\sum_i \dot{q}_i \frac{\partial T}{\partial \dot{q}_i} = 2T,
$$
which we can immediately prove. For
$$
\frac{\partial T}{\partial \dot{q}_i} = \sum_j (A_{ij} + A_{ji})\,\dot{q}_j,
$$
so that
$$
\sum_i \dot{q}_i \frac{\partial T}{\partial \dot{q}_i} = \sum_i \sum_j (A_{ij} + A_{ji})\,\dot{q}_i \dot{q}_j.
$$
The double sum is now just twice the sum of Eq. (6) which we
previously gave as $T$, proving the theorem. Hence, using
$\partial T/\partial \dot{q}_i = \partial L/\partial \dot{q}_i = p_i$, we have $T = \frac{1}{2}\sum_i p_i \dot{q}_i$, so that our
definition in Eq. (5) of $H$ gives $H = 2T - L = 2T - T + V =
T + V$ = total energy, as we wished to prove.
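The theorem on homogeneous quadratic functions is easy to check numerically. The sketch below fills a small coefficient array with arbitrary numbers and compares $\sum_i \dot{q}_i\,\partial T/\partial \dot{q}_i$ with $2T$. The random coefficients are an assumption for illustration only: they need not be symmetric or correspond to any physical kinetic energy, since the identity does not require it.

```python
import random

def kinetic_energy(a, v):
    """T = sum over j, k of A_jk v_j v_k, as in Eq. (6)."""
    n = len(v)
    return sum(a[j][k] * v[j] * v[k] for j in range(n) for k in range(n))

def dT_dv(a, v, i):
    """Derivative of T with respect to v_i: sum over j of (A_ij + A_ji) v_j."""
    return sum((a[i][j] + a[j][i]) * v[j] for j in range(len(v)))

random.seed(0)
n = 3
a = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
v = [random.uniform(-1, 1) for _ in range(n)]

t = kinetic_energy(a, v)
euler_sum = sum(v[i] * dT_dv(a, v, i) for i in range(n))
print(euler_sum, 2 * t)
```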
In advanced work, one sometimes meets cases where H is not
equivalent to the total energy. Such cases are found, for
instance, where magnetic forces are present. But even here,
the following general rules are correct :
First, set up a Lagrangian function, so that the equations of
motion can be written in Lagrangian form. This can sometimes,
as in the magnetic case, be done, even if we cannot interpret
the Lagrangian function as T — V; for in the magnetic case, the
forces are not derivable from a potential, depending rather on the
velocity, and yet vary in such a way that we can use a Lagrangian
function.
Next, define the momenta as $p_i = \partial L/\partial \dot{q}_i$.
Set up the Hamiltonian function $\sum_i p_i \dot{q}_i - L$, expressing it in
terms of coordinates and momenta.
Then Hamilton's equations hold, using this Hamiltonian.
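The rules just stated can be followed step by step in a small numerical sketch. The example system is a simple pendulum, with $L = \frac{1}{2}ml^2\dot{\theta}^2 + mgl\cos\theta$; this choice, the constants `M`, `ELL`, `G`, and the use of finite differences for the derivatives are all assumptions made for illustration. The inversion of the momentum for the velocity is done by hand, which is possible here because $p$ is linear in $\dot{\theta}$.

```python
import math

M, ELL, G = 1.0, 2.0, 9.8     # illustrative pendulum mass, length, gravity

def lagrangian(theta, theta_dot):
    return 0.5 * M * ELL**2 * theta_dot**2 + M * G * ELL * math.cos(theta)

def partial(f, args, i, h=1e-6):
    """Central-difference partial derivative of f with respect to args[i]."""
    up = list(args)
    down = list(args)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)

def velocity_from_momentum(p):
    """Invert p = M ELL^2 theta_dot (worked out by hand for this Lagrangian)."""
    return p / (M * ELL**2)

def hamiltonian(theta, p):
    """H = p * theta_dot - L, expressed in terms of theta and p."""
    theta_dot = velocity_from_momentum(p)
    return p * theta_dot - lagrangian(theta, theta_dot)

theta, theta_dot = 0.4, -0.3
p = partial(lagrangian, (theta, theta_dot), 1)        # p = dL/d(theta_dot)
dH_dp = partial(hamiltonian, (theta, p), 1)           # should equal theta_dot
dH_dtheta = partial(hamiltonian, (theta, p), 0)       # should equal -dL/dtheta
dL_dtheta = partial(lagrangian, (theta, theta_dot), 0)
print(p, dH_dp, dH_dtheta, dL_dtheta)
```

Since Lagrange's equation gives $dp/dt = \partial L/\partial\theta$, the equality $\partial H/\partial\theta = -\partial L/\partial\theta$ found here is the second of Hamilton's equations for this system.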
50. Example of Hamilton's Equations. — Let us by way of
illustration work out Hamilton's equations for the problem of
planetary motion, discussed in the previous chapter by Lagrange's
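method. In terms of the coordinates $r$ and $\theta$, we found that
$$
L = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2\right) - V(r).
$$
Then the momenta are $p_r = \partial L/\partial \dot{r} = m\dot{r}$, the ordinary
momentum along the radius, and $p_\theta = \partial L/\partial \dot{\theta} = mr^2\dot{\theta}$, the angular
momentum. Next we have
$$
\begin{aligned}
\sum_i p_i \dot{q}_i - L &= (m\dot{r})\dot{r} + (mr^2\dot{\theta})\dot{\theta} - L \\
&= m\left(\dot{r}^2 + r^2\dot{\theta}^2\right) - \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2\right) + V(r) \\
&= \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2\right) + V(r) \\
&= \text{total energy}.
\end{aligned}
$$
Solving the equations for $\dot{r}$ and $\dot{\theta}$ in terms of $p_r$ and $p_\theta$, we have
$\dot{r} = p_r/m$, $\dot{\theta} = p_\theta/mr^2$, and substituting these in the Hamiltonian,
we have
$$
H = \frac{1}{2m}\left(p_r^2 + \frac{p_\theta^2}{r^2}\right) + V(r).
$$
Then Hamilton's equations are
$$
\frac{\partial H}{\partial p_r} = \frac{p_r}{m} = \frac{dr}{dt}, \qquad
\frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mr^2} = \frac{d\theta}{dt},
$$
both of which we already knew. Also
$$
-\frac{\partial H}{\partial r} = \frac{p_\theta^2}{mr^3} - \frac{dV(r)}{dr} = \frac{dp_r}{dt},
$$
showing that the time rate of change of radial momentum equals
the external force $-dV/dr$ in the $r$ direction, plus the centrifugal
force $p_\theta^2/mr^3$ (which evidently equals $mr\dot{\theta}^2 = m\omega^2 r = mv^2/r$).
Finally
$$
-\frac{\partial H}{\partial \theta} = 0 = \frac{dp_\theta}{dt},
$$
showing that the time rate of change of angular momentum is
zero, on account of the absence of torques.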
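The four equations of this example can be integrated directly. The following sketch does so for the assumed potential $V(r) = -k/r$, with unit values of $m$ and $k$, an arbitrary bound orbit, and a Runge-Kutta step; all of these are illustrative choices rather than anything fixed by the text. It records the drift in $H$, the value of $p_\theta$, and the smallest radius reached.

```python
M, K = 1.0, 1.0                     # illustrative mass and strength of V = -K/r

def hamiltonian(state):
    r, theta, p_r, p_theta = state
    return (p_r**2 + p_theta**2 / r**2) / (2 * M) - K / r

def hamilton_rhs(state):
    r, theta, p_r, p_theta = state
    return (p_r / M,                               # dr/dt = dH/dp_r
            p_theta / (M * r**2),                  # dtheta/dt = dH/dp_theta
            p_theta**2 / (M * r**3) - K / r**2,    # dp_r/dt = -dH/dr
            0.0)                                   # dp_theta/dt = -dH/dtheta

def rk4_step(f, s, dt):
    """One fourth-order Runge-Kutta step for ds/dt = f(s)."""
    def shifted(d, h):
        return tuple(si + h * di for si, di in zip(s, d))

    k1 = f(s)
    k2 = f(shifted(k1, dt / 2))
    k3 = f(shifted(k2, dt / 2))
    k4 = f(shifted(k3, dt))
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

state = (1.0, 0.0, 0.0, 0.8)        # r, theta, p_r, p_theta at the start
e0 = hamiltonian(state)
r_min = state[0]
for _ in range(5000):
    state = rk4_step(hamilton_rhs, state, 0.002)
    r_min = min(r_min, state[0])

energy_drift = abs(hamiltonian(state) - e0)
print("p_theta:", state[3], "energy drift:", energy_drift, "smallest r:", r_min)
```

For these starting values the energy method gives the turning points as the roots of $E = p_\theta^2/2mr^2 - k/r$, so the smallest radius found by the integration can be compared with the smaller root.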
51. Applications of Lagrange's and Hamilton's Equations. —
From our discussion one might get the impression that the only
use of Lagrange's and Hamilton's equations was in introducing
curvilinear coordinates in problems of the dynamics of a particle.
This is, however, far from the case. For example, one may have
a particle moving subject to certain constraints, as a bead sliding
along a frictionless wire, or a particle constrained to move on the
surface of a sphere or other surface, as the bob of a spherical
pendulum must move in a sphere. Then we may often satisfy
the conditions of constraint by suitable choice of the generalized
coordinates. Thus, with the spherical pendulum, we may take
spherical polar coordinates $r$, $\theta$, $\phi$. We may then arbitrarily
set $r$ constant, equal to $R$, the radius of the sphere, and write
Lagrange's equations for $\theta$ and $\phi$. To justify this, we note that
the component of the external and centrifugal force normal to
the sphere will be exactly balanced by the reaction of the con-
straint, just as the weight of a body resting on a table is exactly
balanced by the upward push of the table. Thus the generalized
force acting in the direction of r will be zero, so that a constant
value for R, leading to a constant and vanishing generalized
momentum along r, is a solution of the equations. For a
particle on a wire, similarly, if the wire happened to be a circle,
we could take polar coordinates, set r constant, and have but
one equation of motion, stating that the torque acting on the
particle equaled the time rate of change of its angular momen-
tum. We note that these two problems are essentially equiva-
lent to the spherical and ordinary pendulum, which are rigid
bodies, suggesting that Lagrange's equations are of use in dis-
cussing the motion of a rigid body. But we can go even further.
An Atwood's machine, for instance, is a special case of coupled
systems, two weights being hung by a string over a pulley. This
can be described very easily by a single generalized coordinate.
In the general problem of coupled systems, and in fact in all
problems of interaction of different particles or systems,
Lagrange's method is very suitable, as we shall see. In fact,
there is hardly a mechanical problem where generalized coordi-
nates are not applied.
For the actual solution of problems, Hamilton's equations are
generally not so convenient as Lagrange's equations. Their
importance comes in the insight they give into the situation,
by bringing the momenta directly into the statement of the
equations, and for their relation to more advanced mechanics.
The applications are principally to three fields: celestial
mechanics, statistical mechanics, and quantum theory. We
shall indicate in the next chapter the nature of some of these
applications of Hamiltonian methods, taking up some of the
general properties of the motion of particles, but postponing
until later in the book the discussion of statistics and of quantum
mechanics.
Problems
1. An Atwood's machine is built as follows: A string of length $l_1$ passes
over a light fixed pulley, supporting a mass $m_1$ on one end and a pulley of
mass $m_2$ (negligible moment of inertia) on the other. Over this second
pulley passes a string of length $l_2$ supporting a mass $m_3$ on one end and $m_4$ on
the other, where $m_3 \neq m_4$. Set up Lagrange's equations of motion for this
system, using two appropriate generalized coordinates. From these show
that the mass $m_1$ remains in equilibrium if
$$
m_1 = m_2 + m_3 + m_4 - \frac{(m_4 - m_3)^2}{m_3 + m_4}.
$$
2. A particle slides on the inside of a smooth paraboloid of revolution
whose axis is vertical. Use the distance from the axis, $r$, and the azimuth $\theta$
as generalized coordinates. Find the equations of motion. Find the
angular momentum necessary for the particle to move in a horizontal circle.
If this latter motion is disturbed slightly, show that the particle will perform
small oscillations about this circular path, and find the period of these
oscillations.
3. Set up the kinetic energy, Lagrange's equations, and Hamilton's
equations in spherical polar coordinates. Set up expressions for the
generalized forces acting on $r$, $\theta$, and $\phi$, and for the generalized momenta, explaining
the physical meaning of these quantities.
4. Set up the problem of a spherical pendulum subject to gravity and
to a resisting force proportional to the velocity and opposite in direction.
Use spherical polar coordinates. Show that for small amplitudes and no
damping this problem reduces to the two-dimensional linear oscillator of
Prob. 9, Chap. VII.
5. Derive the Hamiltonian equations for Prob. 4, in the general case,
showing that the damping forces give extra terms in the equations propor-
tional to the momenta. Show that these equations in general cannot be
separated. Derive a solution, however, for the special case in which the
instantaneous motion would be a rotation about the lowest point of the
sphere if damping were absent. Assume small damping, so that the actual
motion is a gradual spiralling in toward the lowest point.
6. The force on an electron of charge $e$, moving with a velocity $\mathbf{v}$ in a
magnetic field $\mathbf{H}$, is given by $\mathbf{F} = (e/c)(\mathbf{v} \times \mathbf{H})$, where $c$ is the velocity of light.
This corresponds to the ordinary motor law, in which the force on a circuit
is proportional to the current (here $e\mathbf{v}/c$) and to the field, and at right angles
to both. In addition, the magnetic field $\mathbf{H}$ can be given as the curl of a
vector $\mathbf{A}$, called the vector potential. Show that the equations of motion of
an electron moving in such a magnetic field, and in addition in a potential
field of potential $V$, can be described by Lagrange's equations, with the
Lagrangian function $L = T - V + (e/c)(\mathbf{v} \cdot \mathbf{A})$. Assume the vector
potential, and magnetic field, to be independent of time, but note that
$$
\frac{d\mathbf{A}}{dt} = \frac{\partial \mathbf{A}}{\partial t} + \frac{\partial \mathbf{A}}{\partial x}\frac{dx}{dt} + \frac{\partial \mathbf{A}}{\partial y}\frac{dy}{dt} + \frac{\partial \mathbf{A}}{\partial z}\frac{dz}{dt}, \qquad \text{where } \frac{\partial \mathbf{A}}{\partial t} = 0.
$$
7. For the particle of Prob. 6, set up the momentum and the Hamiltonian
function. Show that the momenta do not equal mass times velocity, and
the Hamiltonian does not have the form p 2 /2m + V.
8. In the relativity theory, the equations of motion of a particle are
different from what they are in classical mechanics, though they reduce to
the same thing for small velocities. In particular, the mass of a particle
increases with velocity. If a particle has a mass $m_0$ when at rest, its mass
at speed $v$ is given by
$$
m = \frac{m_0}{\sqrt{1 - v^2/c^2}},
$$
where $c$ is the velocity of light; reducing to $m_0$ in the limit $v/c = 0$, but
becoming infinite when the particle moves with the speed of light.
Show that the equations of motion are correctly given from the
Lagrangian function $-m_0c^2\sqrt{1 - v^2/c^2} - V$, when we remember that the
momentum equals the velocity times the (variable) mass.
Derive the Hamiltonian function from the Lagrangian function. Setting
the Hamiltonian function equal to $T + V$, where $T$ is the kinetic energy,
show that the Lagrangian function is not equal to $T - V$, as is natural
from the fact that the kinetic energy is not a homogeneous quadratic
function of the velocities. Taking the kinetic energy, expand in power series
in the quantity $v/c$, showing that for low speeds the kinetic energy
approaches its ordinary classical value, except for an additive constant $m_0c^2$.
This additive constant, which always appears in relativistic energy
expressions, is interpreted as meaning that the mass of the particle is really
equivalent to energy, 1 gm. being convertible into $c^2$ ergs of energy.
CHAPTER IX
PHASE SPACE AND THE GENERAL MOTION OF
PARTICLES
As in one-dimensional motion, we can make a great deal of
use of the energy in discussing motion in two and three dimen-
sions. In a conservative system with potential energy V, the
motion can occur only in those regions of space where E — V
is positive, if E is the total energy, and we can thus divide up
our possible problems into those occurring within a finite region
and those going to infinity. As with one-dimensional motion,
there are sometimes periodicity properties associated with the
finite motions, which we discuss in the present chapter. With
two-dimensional motion, we can visualize the use of the energy
very easily, plotting $V$ as a height in a three-dimensional graph,
the result looking like a relief map, or else drawing equipotentials,
which represent the potential as the contour lines represent
height on a map. For a total energy E, we imagine the map
filled with water up to a level E, so that the submerged parts,
lakes and oceans, represent the regions where the motion occurs.
We may also use the analogy of the rolling ball in two dimensions
as well as in one, imagining that a ball starts rolling down the side
of a hill in our relief map, climbing up the valley on the other
side, and oscillating back and forth. From physical intuition
as to the motion of such a ball, we can derive much information
about complicated forms of motion.
There is one great complication present in motion in several
dimensions which was absent in one-dimensional motion. In
that simpler case, the velocity of a particle was determined at
each point of space in a conservative motion, only the direction,
forward or back, being arbitrary. Here, however, while the
magnitude of the velocity is still determined, there are an infinite
number of possible directions associated with the same
magnitude. To describe a motion completely, then, even if we know
its energy, we must give as well the velocity components, or
else the momenta, at each point of the path. This is
accomplished by describing the motion, not in ordinary space, but in
the so-called phase space, in which there are dimensions
associated with both the generalized coordinates and the generalized
momenta. And the importance of Hamilton's equations arises
from the fact that they are peculiarly suited to a discussion of
motion by means of the phase space.
52. The Phase Space. — For a system with n degrees of freedom
and $n$ generalized coordinates $q_1 \ldots q_n$, the phase space is a
$2n$-dimensional space in which $q_1 \ldots q_n$ and $p_1 \ldots p_n$ are
plotted as variables. A single point in this space, often called a
representative point, then determines all coordinates and
velocities of the system. As time goes on, the representative point
moves about the space, as both coordinates and momenta change
with time. It is here that we make connection with Hamilton's
equations, for these equations, $dq_i/dt = \partial H/\partial p_i$, $dp_i/dt =
-\partial H/\partial q_i$, give just the components of the velocity of the
representative point in the phase space. The problem of
dynamics is to investigate the path of the representative point in the
phase space.
We can easily see some properties of this motion. In the first
place, it takes place with constant energy, assuming that we are
dealing only with conservative forces. To prove this, we have
for the time rate of change of $H$
$$
\begin{aligned}
\frac{dH}{dt} &= \frac{\partial H}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial H}{\partial q_2}\frac{dq_2}{dt} + \cdots + \frac{\partial H}{\partial p_1}\frac{dp_1}{dt} + \frac{\partial H}{\partial p_2}\frac{dp_2}{dt} + \cdots \\
&= \frac{\partial H}{\partial q_1}\frac{\partial H}{\partial p_1} + \frac{\partial H}{\partial q_2}\frac{\partial H}{\partial p_2} + \cdots + \frac{\partial H}{\partial p_1}\left(-\frac{\partial H}{\partial q_1}\right) + \frac{\partial H}{\partial p_2}\left(-\frac{\partial H}{\partial q_2}\right) + \cdots \\
&= 0.
\end{aligned}
$$
Now the energy $H$ is a function of the coordinates and momenta,
and hence a function of position in the phase space. Thus the
equation $H$ = constant determines a single relation between all
the $p$'s and $q$'s, and hence is the equation of a $(2n - 1)$-dimensional
hypersurface in the $2n$-dimensional space. The
representative point now moves about, but always stays on a single
energy surface. If in addition there are other quantities which
stay constant, as, for instance, an angular momentum, each one
of these quantities gives an additional equation between the $p$'s
and $q$'s, so that the representative point can move only in the
intersection of all the various surfaces represented by these
equations. Thus in some cases the region in which the motion
occurs is of smaller dimensionality than $2n - 1$. The extreme
case is purely periodic motion; in that case there are enough
quantities staying constant so that the motion of the
representative point is in a single closed line in phase space, or a
one-dimensional region. This fits in with the fact that all
one-dimensional conservative motions not extending to infinity are
periodic: for these have $n = 1$, $2n - 1 = 1$, so that the energy
"surface" itself reduces to a line. Motions are possible in all
the intermediate cases between the periodic motion, and the
other extreme in which the representative point comes eventually
arbitrarily close to every point of the energy surface. The latter
type of motion is called quasi-ergodic (ergodic motion being a
nonexistent type in which the point passes through every point
of the surface). Some of the intermediate types are multiply
periodic motions, like our doubly periodic motion in the central
field. We shall investigate some of these typical cases by means
of examples.
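The statement that the representative point stays on a single energy surface can be watched numerically. The sketch below takes an arbitrary Hamiltonian with two degrees of freedom, builds the velocity of the representative point from Hamilton's equations using finite-difference derivatives, and follows the point through the four-dimensional phase space. The particular $H$, the starting point, and the step sizes are assumptions made purely for illustration.

```python
def hamiltonian(q1, q2, p1, p2):
    """Illustrative H with two coupled, anharmonic degrees of freedom."""
    return (0.5 * (p1**2 + p2**2) + 0.5 * q1**2 + q2**2
            + 0.3 * q1**2 * q2 + 0.1 * q2**4)

def partial(f, point, i, h=1e-5):
    """Central-difference partial derivative of f along coordinate i."""
    up = list(point)
    down = list(point)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)

def phase_velocity(point):
    """(dq1/dt, dq2/dt, dp1/dt, dp2/dt) from Hamilton's equations."""
    dH = [partial(hamiltonian, point, i) for i in range(4)]
    return (dH[2], dH[3], -dH[0], -dH[1])

def rk4_step(point, dt):
    def shifted(d, h):
        return tuple(x + h * dx for x, dx in zip(point, d))

    k1 = phase_velocity(point)
    k2 = phase_velocity(shifted(k1, dt / 2))
    k3 = phase_velocity(shifted(k2, dt / 2))
    k4 = phase_velocity(shifted(k3, dt))
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(point, k1, k2, k3, k4))

point = (0.5, 0.2, 0.0, 0.3)        # q1, q2, p1, p2 at the start
e0 = hamiltonian(*point)
max_drift = 0.0
for _ in range(2000):
    point = rk4_step(point, 0.005)
    max_drift = max(max_drift, abs(hamiltonian(*point) - e0))

print("energy:", e0, "largest departure from the energy surface:", max_drift)
```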
Fig. 11. — Phase space for a linear oscillator, with line of constant energy E.
53. Phase Space for the Linear Oscillator. — As an illustration
of one-dimensional motion, we may take a linear oscillator (see
Fig. 11). The phase space is two-dimensional, so that the
energy surface is really a line. If the energy is $\frac{1}{2}mv^2 + 2\pi^2 m\nu^2 x^2$,
where we readily see that $\nu$ is the frequency of oscillation, the
Hamiltonian function is $p^2/2m + 2\pi^2 m\nu^2 x^2$. Setting this equal
to a constant, $E$, the equation of the line of constant energy is
$p^2/2m + 2\pi^2 m\nu^2 x^2 = E$, or
$$
\frac{p^2}{\left(\sqrt{2mE}\right)^2} + \frac{x^2}{\left(\sqrt{E/2\pi^2 m\nu^2}\right)^2} = 1, \qquad (1)
$$
the equation of an ellipse, having semi-axes $\sqrt{2mE}$ and
$\sqrt{E/2\pi^2 m\nu^2}$.
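A short check of Eq. (1): the sketch below computes the two semi-axes for illustrative values of $m$, $\nu$, and $E$ (assumed numbers, not from the text), verifies that points of the simple harmonic motion lie on the ellipse, and evaluates the enclosed area, which for these semi-axes works out to $E/\nu$.

```python
import math

def semi_axes(m, nu, energy):
    """Semi-axes of the constant-energy ellipse in the (x, p) plane."""
    p_max = math.sqrt(2 * m * energy)
    x_max = math.sqrt(energy / (2 * math.pi**2 * m * nu**2))
    return x_max, p_max

m, nu, energy = 0.5, 3.0, 2.0      # illustrative values
x_max, p_max = semi_axes(m, nu, energy)

# Points of the motion x = x_max cos(2 pi nu t), with p = m dx/dt.
residuals = []
for n in range(12):
    t = n / (12 * nu)
    x = x_max * math.cos(2 * math.pi * nu * t)
    p = -m * x_max * 2 * math.pi * nu * math.sin(2 * math.pi * nu * t)
    residuals.append((p / p_max) ** 2 + (x / x_max) ** 2 - 1)

area = math.pi * x_max * p_max     # area enclosed by the ellipse
print(x_max, p_max, area, energy / nu)
```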
54. Phase Space for Central Motion. — As an illustration of a
two-dimensional problem, we may take a central motion. The
phase space is four-dimensional, so that we cannot directly plot
it: the axes represent $r$, $\theta$, $p_r$, $p_\theta$. But we recall that in central
motion $p_\theta$ stays constant, so that we may choose a particular
value of $p_\theta$, and use a three-dimensional section of the phase
space, the axes representing $r$, $\theta$, and $p_r$. We imagine $r$ and $\theta$ as
rectangular coordinates in a plane, and $p_r$ as a coordinate at
right angles to the plane. Now the energy surface is given by
$$
\frac{p_r^2}{2m} + \frac{p_\theta^2}{2mr^2} + V(r) = E = \text{constant}, \qquad (2)
$$
or solving for $p_r$, $p_r = \pm\sqrt{2m(E - V - p_\theta^2/2mr^2)}$. That is, for
each value of $r$ and $\theta$ ($E$ and $p_\theta$ being fixed), two values of $p_r$ are
given by the equation. If we plot these values, we get the
surface of Fig. 12, on which the representative point moves
in a spiral around the cylindrical surface, $\theta$ continually increasing,
while $r$ increases and decreases as the point spirals round and
round. Although the orbit criss-crosses on itself, as we saw in
Fig. 10, Chap. VII, still in the phase space the two different
possible directions of motion at a given point of space are on
opposite sides of the energy surface.
Fig. 12. — Surface of constant energy and constant angular momentum in
phase space of a particle moving in a central field. The spiral represents the path
of a particle.
In Fig. 12, we have plotted only the part of the energy surface
between $\theta = 0$ and $\theta = 2\pi$. The spiral, however, continues
indefinitely. Since the regions from $\theta = 2\pi$ to $4\pi$, $4\pi$ to $6\pi$, etc.,
all represent the same regions of space as $0$ to $2\pi$, it is reasonable
to telescope these sections of the surface on to the one we have
drawn. Each one will have its own segment of spiral, so that
we shall have an infinite number of pieces all drawn on the surface
shown in Fig. 12. There are now two possibilities. First, the
motion can be periodic, as mentioned in Chap. VII. Then
infinitely many segments of the spiral will lie on top of each
other, resulting in one or a finite number of segments only. This
is then a one-dimensional line in phase space, as we expect for
periodic motion. Or second, the motion can be doubly periodic,
the general case for this problem. In that case, the infinite
number of segments of the spiral will not coincide, and instead
they will fill the whole surface densely. In other words, in this
case the path of the representative point fills a two-dimensional
region. This is characteristic of doubly periodic motions.
55. Noncentral Two-dimensional Motion.— Let us consider
motion in a field only slightly different from a central one, as, for
instance, if we had a central field and a small external field of
some other sort superposed. Then there will be slight torques
acting on the particle, so that its angular momentum will change
slowly. Now a given angular momentum corresponds to a given
surface in Fig. 12. Hence in this motion the representative
point does not confine itself to the surface we have drawn, but
moves also on larger and smaller surfaces. If the motion without
the additional torques were doubly periodic, and if no new
regularities were introduced, the path of the representative point
would now fill densely a whole set of surfaces with continuously
varying sizes; that is, it would fill up a three-dimensional volume,
the most general thing possible. In most cases this volume would
be the whole region consistent with the energy, so that the
motion would be quasi-ergodic. The motion itself, in two-
dimensional coordinate space, would resemble for a short time
the orbit of Fig. 10, Chap. VII, but the circles to which the orbit
is tangent would slowly increase or decrease in size, the loops of
the orbit simultaneously getting less or more rounded. If the
departure from a central field were large, we could not use this
approximate description, but should have to say simply that
successive loops of the orbit were not merely oriented differently,
but were of different size and shape.
56. Configuration Space and Momentum Space. — It is not
always so easy to reduce a four-dimensional phase space to three
dimensions as it was with the central field. We can always
visualize the phase space, however, by imagining a separate
n-dimensional momentum space associated with each point of
the n-dimensional coordinate or configuration space. If we
assume that the n coordinates are the rectangular or Cartesian
coordinates, then we have a simple interpretation of the condition
that the representative point move on an energy surface. For
this states that kinetic energy $= E - V$, or
$p_x^2 + p_y^2 + p_z^2 = 2m(E - V)$. But $p_x^2 + p_y^2 + p_z^2$ is simply the square of the
radius in momentum space, so that to a given energy and a given
point of space corresponds a sphere (or, with two dimensions, a
circle) in the momentum space, on the surface of which the
representative point must move. In quasi-ergodic motion, the
representative point at one time or another comes arbitrarily
close to each point of the surface of these spheres. We note that
the spheres exist, and have real radii, only in that part of con-
figuration space where E — V is positive, and where, therefore,
according to the energy principle, the motion can occur. But
now in the more specialized types of motion, all points of the
surface of the spheres are not available for representative points.
Thus, in central motion, where the sphere degenerates to a circle
in the two-dimensional momentum space, only those velocities
are allowed which correspond to a given angular momentum.
That is, $p_x$ and $p_y$ must satisfy at the same time the equations
$p_x^2 + p_y^2 = 2m(E - V)$, $xp_y - yp_x = \text{angular momentum} = \text{constant}$,
the equations of a circle and a straight line, respectively,
in momentum space. These intersect in two points or in no
points, so that for some parts of the configuration space corre-
sponding to positive kinetic energy there are two possible values
of the momentum, and for other parts there is none, and the
motion cannot occur. These excluded regions are those within
the small circle in Fig. 10, Chap. VII, and outside the large circle,
but within the circle on which the kinetic energy becomes zero.
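The intersection of the circle and the straight line in momentum space can be worked out explicitly. The sketch below is an illustration only; the function name `allowed_momenta` and its arguments are hypothetical, and `kinetic` stands for the value of $E - V$ at the point $(x, y)$ considered.

```python
import math

def allowed_momenta(x, y, kinetic, L, m=1.0):
    """Intersect the circle p_x^2 + p_y^2 = 2 m (E - V) with the line
    x p_y - y p_x = L in momentum space.
    `kinetic` stands for E - V at the point (x, y).
    Returns an empty list if there is no intersection, otherwise a list of
    two momentum vectors (p_x, p_y), which coincide in the tangent case."""
    r2 = x * x + y * y
    radius2 = 2.0 * m * kinetic          # squared radius of the circle
    # The line lies at distance |L| / r from the origin of momentum space,
    # so the squared radial momentum is what remains of the squared radius.
    radial2 = radius2 - L * L / r2
    if radius2 < 0.0 or radial2 < 0.0:
        return []                        # the motion cannot occur here
    r = math.sqrt(r2)
    # Foot of the perpendicular from the origin to the line.
    fx, fy = -L * y / r2, L * x / r2
    # Step along the line (the radial direction) by plus or minus p_r.
    p_r = math.sqrt(radial2)
    return [(fx + p_r * x / r, fy + p_r * y / r),
            (fx - p_r * x / r, fy - p_r * y / r)]
```

The two solutions differ only in the sign of the radial momentum, which is the same pair of values $\pm p_r$ obtained from Eq. (2); where the list is empty the point lies in one of the excluded regions.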
57. The Two-dimensional Oscillator. — A second example of
two-dimensional motion is provided by the two-dimensional
oscillator. In Chap. VII, Probs. 7, 8, and 9, it was shown that
the motion of a particle attracted to a center by the forces $F_x = -ax$,
$F_y = -ay$ could be solved by separation of variables, each coordinate
vibrating like a separate oscillator, and the combined motions
producing an elliptical orbit with the center at the center of
attraction. The motion is periodic, with the same period which
the corresponding one-dimensional motion would have. To
obtain a nonperiodic motion, we make $F_x = -cx$, $F_y = -ky$,
the force components being proportional to the displacements,
but with different coefficients. It is easily seen that in this case
the total force, regarded as a vector, is not in the same direction
as the displacement. An example is found in the vibration of a
rectangular stick, if one end is clamped and the other vibrates,
since the stick is stiffer for bending in one direction than the
other, unless it is square. The variables are still separated in
the equations of motion, and the solution is
$$x = A_1 \cos\left(\sqrt{c/m}\,t - \alpha_1\right), \qquad y = A_2 \cos\left(\sqrt{k/m}\,t - \alpha_2\right). \qquad (3)$$
Fig. 13. — Lissajous figure for the orbit of a two-dimensional oscillator. The
ellipse surrounding the rectangle represents the equipotential corresponding to
the energy of the motion.
The motion is in general no longer periodic, for after one period of the $x$
motion, the $y$ motion will not have traversed just a full period,
but will be in a different phase. By plotting (see Fig. 13), one
can see that the orbit is always within the rectangle bounded by
$x = \pm A_1$, $y = \pm A_2$, that it is often tangent to the edges of this
rectangle, and that in time it comes arbitrarily close to any point
within the rectangle. The resulting figure is called a Lissajous
figure, and this sort of motion is typical of many examples which
one meets. The orbit in central motion is, in fact, a sort of
Lissajous figure, as Fig. 10, Chap. VII, shows.
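The behavior of the Lissajous figure can be checked by sampling Eq. (3) directly. The following is a minimal sketch under assumed values: the amplitudes, the force constants (chosen so that the two frequencies are in the ratio $\sqrt{2}$), the sampling interval, and the function names are all hypothetical choices made for the illustration.

```python
import math

def lissajous_orbit(A1, A2, c, k, m=1.0, alpha1=0.0, alpha2=0.0,
                    t_end=2000.0, steps=200000):
    """Sample Eq. (3): x = A1 cos(sqrt(c/m) t - alpha1),
    y = A2 cos(sqrt(k/m) t - alpha2), at equally spaced times."""
    wx, wy = math.sqrt(c / m), math.sqrt(k / m)
    points = []
    for i in range(steps + 1):
        t = t_end * i / steps
        points.append((A1 * math.cos(wx * t - alpha1),
                       A2 * math.cos(wy * t - alpha2)))
    return points

def closest_approach(points, target):
    """Smallest distance between the sampled orbit and a target point."""
    return min(math.hypot(x - target[0], y - target[1]) for x, y in points)

# Frequencies in the ratio sqrt(2): the orbit never closes on itself.
orbit = lissajous_orbit(A1=1.0, A2=0.5, c=1.0, k=2.0)
inside = all(abs(x) <= 1.0 + 1e-12 and abs(y) <= 0.5 + 1e-12 for x, y in orbit)
gap = closest_approach(orbit, (0.3, -0.2))
```

Here `inside` confirms that every sampled point lies within the rectangle $x = \pm A_1$, $y = \pm A_2$, and `gap` measures how near the orbit has come to an arbitrarily chosen interior point; lengthening `t_end` makes `gap` smaller, in keeping with the statement that the orbit comes arbitrarily close to any point of the rectangle.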
The two-dimensional oscillator is a typical doubly periodic
motion, the periods being just those of the two separate degrees
of freedom. The displacements x and y are singly periodic,
but if we wished to express, for example, the displacement in an
arbitrary direction as a function of time, we should have $ax + by$,
which would be a sum of two terms, one periodic with the one
frequency, the other with the other. Inspection of Fig. 13
shows that at a given point of space, there are just two branches
of the orbit, corresponding to two definite values of the momen-
tum, rather than having all momenta consistent with the given
kinetic energy, in all directions, permitted as in quasi-ergodic
motion.
A small perturbation applied to the two-dimensional oscillator
would destroy the double periodicity, and make the motion
quasi-ergodic. Thus we might have a small central field added
to the linear restoring force. If the perturbation were small,
we might apply what is called the method of variation of con-
stants. That is, we could consider the coordinates to be given
by Eq. (3), but regard the A's and a's as slowly varying functions
of time rather than constants. Substituting these expressions
in the differential equations, we should find that the perturba-
tion produced such changes of amplitude and phase, at rates
proportional to the magnitude of the perturbation. Considered
from the standpoint of Fig. 13, this means that the rectangle is
gradually changing its dimensions, subject always, however,
to the condition that it is at least approximately inscribed in
the same ellipse, since the total energy is only slightly changed
by the perturbation. The result then is a slowly changing
Lissajous figure, looking therefore like a superposition of many
such figures, filling up the ellipse, and giving at a point of space
not two possible momenta only, but a continuous range of
momenta, in all directions, leading, therefore, to quasi-ergodic
motion. A similar discussion can be given for the simpler
problem of the almost periodic oscillator. In the exactly
periodic case, the orbit is a single ellipse inscribed in a rectangle
like that of Fig. 13, which in turn is inscribed in an ellipse.
If the problem is made slightly different, by introducing only
a very small difference between the force constants in the two
directions, the dimensions of the ellipse can be considered to
change slowly, though it always remains inscribed in the rec-
tangle. The actual Lissajous figure, as one sees at once by
inspection, is very similar to what one would obtain by drawing
a great many ellipses, all inscribed in the same rectangle.
58. Methods of Solution. — We have seen one method of solving
mechanical problems in several dimensions, that of separation
of variables, by which the problem is reduced essentially to
several independent one-dimensional problems. There are
several problems which can be solved by this method, in addition
to the oscillator and the central field problems which we have
treated. The problem of a particle in the field of two attracting
centers, both attracting according to the inverse square law, can
be solved by separation in ellipsoidal coordinates, with the two
centers as foci. In three dimensions, the central or the
axially symmetrical fields can be solved by separation. The
solutions in all these cases are multiply periodic, as we can see
at once from the fact that each coordinate, acting like a one-
dimensional problem, must be singly periodic. It is thus
obvious that no problems except multiply periodic ones can be
solved by separation, and it seems likely that the small list we
have just given includes practically all the multiply periodic
mechanical problems which exist in two or three dimensions.
59. Contact Transformations and Angle Variables. — Hamil-
ton's equations can be applied to multiply periodic motions by
making certain transformations of coordinates which are called
contact transformations, because it can be shown that they
transform two curves which are in contact with each other in
the original space into curves in contact in the new space.
An ordinary transformation of coordinates, of the sort which
we have discussed in connection with Lagrange's equations, is
a transformation in which new coordinates are written as functions
of the old ones: $q_i' = q_i'(q_1 \cdots q_n)$, if the $q$'s are the old
coordinates, the $q'$'s the new ones. The new momenta, derived
from the new Lagrangian function, are then functions of the old
coordinates and momenta: $p_i' = p_i'(q_1 \cdots q_n, p_1 \cdots p_n)$.
Such a transformation is called a point transformation. But
in a contact transformation, the new coordinates as well as the
new momenta are functions of both the old coordinates and
momenta:
$$q_i' = q_i'(q_1 \cdots q_n, p_1 \cdots p_n),$$
$$p_i' = p_i'(q_1 \cdots q_n, p_1 \cdots p_n). \qquad (4)$$
There must naturally be restrictions on the functions, just as
in ordinary point transformations we require that the new
momenta be derived from the new Lagrangian function. When
these restrictions are applied, however, it proves that Hamilton's
equations are still satisfied in the new coordinates, though
Lagrange's are not. Such contact transformations can often
be very useful in complicated problems, reducing them to forms
which can be handled mathematically. A contact transforma-
tion can be most easily visualized simply as a change of variables
in the phase space. For instance, suppose we have the phase
space for a linear oscillator, as in Fig. 11. We can easily choose
the scale so that the line of constant energy is a circle, rather
than an ellipse. Then it is often useful to introduce polar
coordinates in the phase space, so that the motion is represented
by a constant value of $r$, and a value of $\theta$ increasing uniformly
with time. The angle $\theta$, or rather $\theta/2\pi$, in this case, is often
called the angle variable, and is used as the coordinate. This
is from analogy with the rotation of a body acted on by no
torques, where the angular momentum stays constant, and the
angle increases linearly with the time. The momentum conjugate
to the angle variable, which stays constant with time, is
not simply the radius, as we should expect from the simple use
of polar coordinates, but proves to be proportional to the square
of $r$; in fact, it is just $\pi r^2$, or the area of the circle. This momentum
is called the action variable, or phase integral, denoted by
$J$, and the angle variable is denoted by $w$.
Since Hamilton's equations hold in the transformed coordinates,
and since evidently the energy $H$ depends only on $J$, being
independent of $w$, Hamilton's equations become
$$-\frac{\partial H}{\partial w} = 0 = \frac{dJ}{dt}, \qquad (5)$$
verifying the fact that $J$ is a constant of the motion; and
$$\frac{\partial H}{\partial J} = \frac{dw}{dt}, \qquad (6)$$
a quantity independent of time, and of $w$, verifying the fact that
$w$ increases uniformly with time. Now since $w = \theta/2\pi$, it
increases by unity in one period, so that $dw/dt$ is just $1/T$,
where $T$ is the period, or is $\nu$, the frequency of motion. Hence
we have the important relation that
$$\nu = \frac{\partial H}{\partial J}, \qquad (7)$$
giving the frequency of motion in terms of the derivative of the
energy with respect to the action variable $J$.
It can be shown in a similar way that action and angle variables
can be introduced in general in one-dimensional periodic motions.
In every case the w's increase uniformly with time, the frequency
being given by Eq. (7). It also proves to be true in general
that the action variable $J$ is given by the area of the path of
the representative point in phase space, which is the reason why
it is called a phase integral. This area can be written $\int p\,dq$,
where this is analogous to $\int y\,dx$, the area under the curve $y(x)$.
In Fig. 11, for instance, we integrate from the minimum to the
maximum $q$ along the upper branch of the ellipse, obtaining the
part of the area above the $q$ axis; then integrate back along
the lower branch, where both $p$ and $dq$ are negative, obtaining
the area below the $q$ axis, so that the complete integral about the
whole curve, which may be written $\oint p\,dq$, gives the whole area,
or $J$. Connected with this is the criterion which a transformation
of the $p$'s and $q$'s must satisfy if it is to be a contact
transformation: it can be proved that it is a transformation in which
areas in the phase space are preserved, or are not affected by the
transformation, though the shape of an area in the new coordi-
nates may be very different from what it was in the old. An
immediate result of this is that the J's are the same no matter
what coordinates we may use for computing them.
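The relation between the phase integral and the frequency can be tried out numerically on a one-dimensional periodic motion that is not simply harmonic. The sketch below is an illustration under assumptions of our own: the potential $V(q) = \tfrac{1}{2}kq^2 + \tfrac{1}{4}bq^4$, the numerical values, the midpoint-rule quadrature, and the function names are not taken from the text. It computes $J = \oint p\,dq$ and the period $T$ independently, and then estimates $\nu$ from the derivative of the energy with respect to $J$.

```python
import math

def action_and_period(q_max, m=1.0, k=1.0, b=0.5, n=2000):
    """Return (E, J, T) for V(q) = k q^2/2 + b q^4/4 at amplitude q_max.
    J is the phase integral (closed integral of p dq) and T the period,
    both by the midpoint rule after substituting q = q_max sin(phi)."""
    E = 0.5 * k * q_max**2 + 0.25 * b * q_max**4
    h = math.pi / n
    J_half = 0.0
    T_half = 0.0
    for i in range(n):
        phi = -0.5 * math.pi + (i + 0.5) * h
        # E - V factors as q_max^2 cos^2(phi) * g, with g smooth and positive.
        g = 0.5 * k + 0.25 * b * q_max**2 * (1.0 + math.sin(phi)**2)
        p_over_cos = math.sqrt(2.0 * m * g) * q_max           # p / cos(phi)
        J_half += p_over_cos * q_max * math.cos(phi)**2 * h   # p dq
        T_half += m * q_max / p_over_cos * h                  # dq / (p/m)
    # Factor 2: the upper and lower branches contribute equally.
    return E, 2.0 * J_half, 2.0 * T_half

def frequency_from_action(q_max, rel_step=1e-4, **kw):
    """Estimate nu = dE/dJ by a central difference in the amplitude."""
    dq = rel_step * q_max
    E_hi, J_hi, _ = action_and_period(q_max + dq, **kw)
    E_lo, J_lo, _ = action_and_period(q_max - dq, **kw)
    return (E_hi - E_lo) / (J_hi - J_lo)

E, J, T = action_and_period(1.0)
nu = frequency_from_action(1.0)
```

The product `nu * T` should come out equal to unity within the accuracy of the differencing, which is the content of Eq. (7). Setting `b=0.0` recovers the linear oscillator, for which the area of the ellipse is simply the energy divided by the frequency.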
Angle variables can also be introduced in cases with several
degrees of freedom, provided the motion is multiply periodic,
by using a separate angle variable for each coordinate. It is
evident that the method could not be used with motions which
were not multiply periodic, for we have seen that it is only in
the multiply periodic motions that there are quantities, as, for
example, angular momenta, which stay constant. Yet the
action variables, or J's, must stay constant, and consequently
cannot be introduced, for example, in quasi-ergodic motions,
where by hypothesis constants of the motion of this sort do not
exist.
We shall meet angle variables and phase integrals again in
Chap. XXX, where it is seen that they have close connection
with the quantum theory. In that theory, the phase integrals
prove to be quantized; that is, they take on only discrete values,
$J$ being limited to integral multiples of a fundamental physical
constant, Planck's constant $h$; and Eq. (7) for frequencies is
replaced by an equation of finite differences, $\nu$ being a difference
of energy in two energy levels, divided by the corresponding
difference of J (which can be simply h). These two formulas,
which we elaborate later, form the basis of much of quantum
theory.
60. Methods of Solution for Nonperiodic Motions. — When
we meet a problem whose solution is quasi-ergodic, we are facing
a branch of mathematics which offers no explicit or exact solutions.
The only solutions are obtained by various series
methods, for instance by the method of perturbations, which
can be used if the motion is almost multiply periodic. We
indicated an example of this in discussing the two-dimensional
oscillator, where we treated the problem as a Lissajous figure with
slowly varying amplitudes and phases. In general, the method
of perturbations consists in developing the various quantities
which appear in the problem in power series in the small quan-
tities measuring the deviation from the multiply periodic case.
If, for instance, that case has been discussed by the method of
angle variables, we regard the J's as slowly varying functions
of time, their rate of variation being proportional in the first
order to the magnitude of the perturbation. But in all these
methods there is great difficulty in the matter of the convergence
of the series; as time goes on, or as we consider larger and larger
perturbations, they converge worse and worse, as is natural from
the physical fact that often a slight change in initial conditions
may, after the lapse of enough time, cause a profound change
in the motion. These difficulties, as well as these methods of
solution, are met particularly in celestial mechanics.
Problems
1. Given a linear oscillator of mass $m$, frequency $\nu$, displacement $x$,
momentum $p$, we can introduce a new coordinate $w$ and momentum $J$, by
the transformation
$$x = \sqrt{J/2\pi^2 m\nu}\, \cos 2\pi w$$
$$p = -\sqrt{2mJ\nu}\, \sin 2\pi w.$$
This change of variables can be shown to be a contact transformation. Find
the Hamiltonian in terms of the new variables, by substituting these values
of $x$ and $p$ in the total energy. Show that this resulting Hamiltonian
depends on $J$ alone, being independent of $w$, and show that $w$ is an angle
variable. Verify that $J$ is the phase integral, or area enclosed by the orbit
in the phase space, and that $\nu = \partial H/\partial J$. Show the geometrical interpretation
of the contact transformation in the phase space.
2. An electron of charge $-e$, mass $m$, moves about a nucleus of charge $Ze$,
and very large mass. The potential energy is $-Ze^2/r$. Assuming the
energy to be $E$, angular momentum $p_\theta$, separate variables, and consider
the radial motion as a one-dimensional problem, as in Chap. VII. Take a
two-dimensional phase space in which $r$ and $p_r$ are variables, and plot the
path of the representative point in this space.
3. Find the area of the path of the representative point in Prob. 2, and
show that it is $\sqrt{2\pi^2 m Z^2 e^4/(-E)} - 2\pi p_\theta$. Set this equal to $J_r$, the action
variable connected with the radial motion. Find the energy in terms of
$J_r$, and by differentiation find the frequency of motion. Verify this result
in the special case of circular motion, where you can compute the rotational
frequency by elementary methods.
4. If $F_x = -cx$, $F_y = -ky$, prove by direct calculation that the force,
regarded as a vector, is at right angles to the equipotential. Show that
the force is not in the direction of the displacement.
5. Suppose in a two-dimensional oscillator that the force constants along
the two axes are only slightly different from each other. Prove that the
orbit resembles an ellipse, of slowly changing shape and size. (Hint: show
that $x = A \cos(\omega t - \alpha)$, $y = B \cos(\omega t - \beta)$, where $A$, $B$, $\alpha$, and $\beta$ are
constants, is the equation of an ellipse. Then show that the equation of
the path of the oscillator can be written in this form, if $\alpha$ and $\beta$ are slowly
changing functions of time.)
6. A particle moves as if it were executing simple harmonic motion about
the center of a turntable, and at the same time the turntable were rotating
with uniform angular velocity. Compute the x coordinate of the particle
as a function of time, and show that the motion is doubly periodic.
7. Sketch the orbits in Prob. 6, for several different ratios between the
frequencies of oscillation and rotation, including some cases of irrational
ratios, and also simple rational ratios, as 1/1, 1/2, 2/1.
8. A particle moving in two dimensions is attracted by two centers, of
the same strength, attracting with a force proportional to the inverse square
of the distance. Compute and plot a number of equipotentials, showing
that for some energies the motion must be entirely confined to the region
around one or the other center, while for larger energies it can surround both
centers.
9. A particle moves in three dimensions under the action of a force of
attraction to a center, depending only on the distance. Set up the problem
in spherical coordinates, using the results of Probs. 3 and 4, Chap. VIII.
Show that the variables can be separated, so that the problem is multiply
periodic. Show that energy, total angular momentum, and the component
of angular momentum along the axis of coordinates, all remain constant,
showing the connection of these quantities with the generalized momenta
of the problem. Using the obvious fact that the motion occurs in a plane
and is just like two-dimensional central motion in that plane, show that
the periods of the motions in $\theta$ and $\phi$ are the same, so that the motion is only
doubly, not triply, periodic.
CHAPTER X
THE MOTION OF RIGID BODIES
In the preceding chapters we have been treating the mechanics
of particles. Then we have passed on to the general methods of
Lagrange and Hamilton, which can be applied to all sorts of
mechanical problems. The present chapter will take up the
motion of rigid bodies.
In elementary work, one learns the main outlines of the
problem of the motion of a rigid body. We know that its motion
is a superposition of a translation and a rotation. There are two
fundamental laws of motion: the force equals the time rate of
change of linear momentum, and the torque equals the time
rate of change of angular momentum. To make our ideas more
precise, the translational motion generally refers to the motion
of the center of gravity, and the rotational to rotation about the
center of gravity. The motion of the center of gravity is essen-
tially like the motion of a particle, which we have already treated.
In order to leave that out in the present chapter, we shall assume
that no net forces act, or that the body is pivoted, rotating about
a fixed point.
61. Elementary Theory of Precessing Top. — A torque is a
vector, equal in magnitude to the force acting times its lever arm
(that is, the perpendicular distance from the center of rotation
to the line of action of the force), and at right angles to force
and lever arm. That is, in vector notation, the torque on a single
particle is $\mathbf{r} \times \mathbf{F}$, where $\mathbf{r}$ is the radius vector to the particle,
$\mathbf{F}$ the force acting, and the torque on the whole body is the vector
sum of the separate torques on its parts. Similarly the angular
momentum is a vector, defined in an analogous way: the angular
momentum of a particle is equal in magnitude to the momentum
times its lever arm, and at right angles to both, so that it is
$\mathbf{r} \times (m\mathbf{v})$, or $m(\mathbf{r} \times \mathbf{v})$, and the total angular momentum of the
body is the vector sum of the angular momenta of its parts. We
see then that the equation "torque equals time rate of change of
angular momentum" is a vector equation. As a result, a torque
can produce two separate sorts of effect.
For we can analyze the torque into two components, one parallel
to the angular momentum, the other at right angles. The first
component of torque produces an increase or decrease of angular
momentum in the same direction as the angular momentum
already existing; that is, it produces a speeding up or slowing
down of the rotation, or an ordinary angular acceleration. This
is the effect seen in the speeding up or slowing down of wheels
on fixed axles. The component of torque at right angles to the
angular momentum, on the other hand, produces a rotation or
precession of the angular momentum vector, without change
of length, and hence a change in the axis of rotation. This
is the effect considered in the simple theory of the symmetrical
top: if $\mathbf{p}$ represents the angular momentum of the top
at a given instant (see Fig. 14), which is in approximately the
same direction as the axis of figure of the top, the torque of
gravity on the top will be $mgl \sin\theta$ in magnitude, where $l$ is
the distance from the point of support to the center of gravity.
The torque will act at right angles to the axis and $\mathbf{p}$, so
that the change of momentum in time $dt$ will be $d\mathbf{p}$, as shown.
Thus the angular momentum after time $dt$ will be the vector
$\mathbf{p} + d\mathbf{p}$, obtained from the old vector by a precession, as if
the whole figure were rotated about the axis through the angle
$d\phi$. We can easily find the rate of precession. For $d\phi$
evidently equals $dp$ divided by the radius of the circle, or is
$\frac{dp}{|\mathbf{p}| \sin\theta}$; on the other hand, $dp = mgl \sin\theta\, dt$. Hence
$\frac{mgl \sin\theta\, dt}{|\mathbf{p}| \sin\theta} = d\phi$, or $\frac{d\phi}{dt} = \frac{mgl}{|\mathbf{p}|}$, a precession
increasing with increasing torque, but decreasing with increasing
angular momentum. We note that if we regard the precessional
velocity as a vector, say $\boldsymbol{\omega}$, along the vertical direction, and having
a magnitude $\frac{mgl}{|\mathbf{p}|}$, we have
$$\frac{d\mathbf{p}}{dt} = \boldsymbol{\omega} \times \mathbf{p}. \qquad (1)$$
This is a general relation for a precessing vector, as we readily see.
Fig. 14. — Angular momentum vectors for precessing top. The increment
of angular momentum $d\mathbf{p}$, proportional to and in the same direction as the
torque of gravity, changes the total angular momentum from $\mathbf{p}$ to $\mathbf{p} + d\mathbf{p}$,
corresponding to a precession through the angle $d\phi$.
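Equation (1) and the rate of precession $mgl/|\mathbf{p}|$ can be checked by a short numerical integration. This is only a sketch: the mass, lever arm, angular momentum, and inclination are hypothetical values, and the fourth-order Runge-Kutta integrator is a choice made for the illustration rather than anything prescribed by the text.

```python
import math

def cross(a, b):
    """Vector product a x b of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def precess(p, omega, t_end, steps=2000):
    """Integrate dp/dt = omega x p, Eq. (1), by fourth-order Runge-Kutta."""
    h = t_end / steps

    def shifted(v, k, s):
        return tuple(vi + s * ki for vi, ki in zip(v, k))

    for _ in range(steps):
        k1 = cross(omega, p)
        k2 = cross(omega, shifted(p, k1, 0.5 * h))
        k3 = cross(omega, shifted(p, k2, 0.5 * h))
        k4 = cross(omega, shifted(p, k3, h))
        p = tuple(pi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                  for pi, a, b, c, d in zip(p, k1, k2, k3, k4))
    return p

# Hypothetical top: mass (kg), g (m/s^2), lever arm (m), |p| (kg m^2/s).
m, g, l = 0.1, 9.8, 0.05
p_mag, theta = 0.02, math.radians(30.0)

rate = m * g * l / p_mag                 # d(phi)/dt = mgl / |p|
omega = (0.0, 0.0, rate)                 # precessional velocity, vertical
p0 = (p_mag * math.sin(theta), 0.0, p_mag * math.cos(theta))
p_quarter = precess(p0, omega, 0.25 * 2.0 * math.pi / rate)
```

After a quarter of the precession period the vector has turned through a right angle about the vertical, its horizontal projection having passed from the $x$ to the $y$ direction, while its length and its vertical component are unchanged.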
The elementary ideas of torque and angular momentum do not
permit us to go much farther than we have indicated here,
without further analysis. With a body in the absence of torques,
for instance, we know at once that the angular momentum stays
constant, both in direction and in magnitude. But this tells us
little about the actual complicated motion. We must then
examine the problem more in detail. In the succeeding sections
we consider the angular momentum, kinetic energy, etc., of solid
bodies of arbitrary shape, with arbitrary axes of rotation, though
we always assume that they rotate about a fixed point, as the
center of gravity.
62. Angular Momentum, Moment of Inertia, and Kinetic
Energy. — Let a body rotate about the origin as a center, the axis
of rotation having direction cosines $\lambda$, $\mu$, $\nu$ with the three axes.
We may regard the angular velocity as a vector, whose direction
is the axis of rotation, and whose magnitude is the magnitude of
angular velocity. Thus, if the vector is $\boldsymbol{\omega}$, its magnitude $\omega_0$, we
have $\omega_x = \lambda\omega_0$, $\omega_y = \mu\omega_0$, $\omega_z = \nu\omega_0$. Now we can easily find
the linear velocity of any point of the body. This is numerically
$\rho\omega_0$, where $\rho$ is the perpendicular distance from the point to the
axis of rotation, and is at right angles to the axis of rotation and
the perpendicular distance. In other words, the velocity is given
by the vector product $\boldsymbol{\omega} \times \mathbf{r}$; a little consideration shows that
the vector product has the right direction. Now that we know
the velocity of each point, we can compute the angular momentum.
We have already seen that this is the sum of terms
$m(\mathbf{r} \times \mathbf{v})$ for all particles of the body. But $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$, so that
angular momentum $= \sum m[\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})]$. This can be easily
expanded. Thus the $x$ component, for example, is
$$m[y(\boldsymbol{\omega} \times \mathbf{r})_z - z(\boldsymbol{\omega} \times \mathbf{r})_y]$$
$$= m[y(\omega_x y - \omega_y x) - z(\omega_z x - \omega_x z)]$$
$$= m\omega_x(y^2 + z^2) - m\omega_y xy - m\omega_z xz,$$
with corresponding formulas for the other components. If now
we sum over all particles of the body, remembering that $\boldsymbol{\omega}$ is
the same for all, we have, if $p_x$, $p_y$, $p_z$ are the components of
angular momentum,
$$p_x = A\omega_x - F\omega_y - E\omega_z,$$
$$p_y = -F\omega_x + B\omega_y - D\omega_z,$$
$$p_z = -E\omega_x - D\omega_y + C\omega_z, \qquad (2)$$
where for abbreviation we set $A = \sum m(y^2 + z^2)$, $B = \sum m(z^2 + x^2)$,
$C = \sum m(x^2 + y^2)$, $D = \sum myz$, $E = \sum mzx$, $F = \sum mxy$. The
quantities $A$, $B$, and $C$ are called the moments of inertia, and
$D$, $E$, $F$ are the products of inertia; the first three are obviously
the moments of inertia of the body in the ordinary sense, about
the $x$, $y$, and $z$ axes, respectively. We note one thing at the
outset: the angular momentum vector is not in general parallel
to the angular velocity vector. Thus if $\omega_y = \omega_z = 0$, so that
the angular velocity is along the $x$ axis, we have all three components
of $\mathbf{p}$ in general different from zero.
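Equation (2) can be tried on a small collection of point masses. The sketch below is illustrative only: the three masses and their positions are hypothetical, and the function names are our own. It evaluates the sums $A, \ldots, F$, forms $\mathbf{p}$ from Eq. (2), and compares the result with the direct sum of $m[\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})]$.

```python
def cross(a, b):
    """Vector product a x b of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def inertia_sums(masses, positions):
    """Moments A, B, C and products D, E, F of inertia, as defined after Eq. (2)."""
    A = sum(m * (y * y + z * z) for m, (x, y, z) in zip(masses, positions))
    B = sum(m * (z * z + x * x) for m, (x, y, z) in zip(masses, positions))
    C = sum(m * (x * x + y * y) for m, (x, y, z) in zip(masses, positions))
    D = sum(m * y * z for m, (x, y, z) in zip(masses, positions))
    E = sum(m * z * x for m, (x, y, z) in zip(masses, positions))
    F = sum(m * x * y for m, (x, y, z) in zip(masses, positions))
    return A, B, C, D, E, F

def angular_momentum(sums, w):
    """Eq. (2): components of p from the inertia sums and the angular velocity w."""
    A, B, C, D, E, F = sums
    wx, wy, wz = w
    return (A * wx - F * wy - E * wz,
            -F * wx + B * wy - D * wz,
            -E * wx - D * wy + C * wz)

def angular_momentum_direct(masses, positions, w):
    """Sum of m r x (w x r) over the particles, for comparison with Eq. (2)."""
    px = py = pz = 0.0
    for m, r in zip(masses, positions):
        t = cross(r, cross(w, r))
        px, py, pz = px + m * t[0], py + m * t[1], pz + m * t[2]
    return (px, py, pz)

# Hypothetical body of three point masses, rotating about the x axis.
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 1.0, 1.0), (1.0, 2.0, -1.0)]
w = (2.0, 0.0, 0.0)
p = angular_momentum(inertia_sums(masses, positions), w)
```

Although the angular velocity here lies wholly along the $x$ axis, the computed $\mathbf{p}$ has $y$ and $z$ components as well, arising from the products of inertia $F$ and $E$.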
Next we find the kinetic energy. For a single particle, this
is $\frac{1}{2}mv^2$, or $\frac{1}{2}m(\boldsymbol{\omega} \times \mathbf{r})^2$. Again expanding, this is
$$\tfrac{1}{2}m[(\omega_y z - \omega_z y)^2 + (\omega_z x - \omega_x z)^2 + (\omega_x y - \omega_y x)^2]$$
$$= \tfrac{1}{2}m[\omega_x^2(y^2 + z^2) + \omega_y^2(z^2 + x^2) + \omega_z^2(x^2 + y^2) - 2\omega_x\omega_y xy - 2\omega_y\omega_z yz - 2\omega_z\omega_x zx].$$
Summing over all particles, and using the abbreviations above,
this is
$$T = \tfrac{1}{2}(A\omega_x^2 + B\omega_y^2 + C\omega_z^2 - 2D\omega_y\omega_z - 2E\omega_z\omega_x - 2F\omega_x\omega_y). \qquad (3)$$
The quantity $T$ can be written as $\frac{1}{2}I\omega_0^2$, where
$$I = \lambda^2 A + \mu^2 B + \nu^2 C - 2\mu\nu D - 2\nu\lambda E - 2\lambda\mu F. \qquad (4)$$
It is easily shown that $I$ is simply $\sum m\rho^2$, where $\rho$, as before, is
the perpendicular distance from the point to the axis of rotation,
so that $I$ agrees with the elementary definition. If we imagine
$\lambda$, $\mu$, and $\nu$ varied in any manner, the quantities $A$, $B$, . . . $F$
do not change. As a variation in $\lambda$, $\mu$, $\nu$ means a variation of
direction of the rotation axis through the center of rotation $O$,
we see that the sums $A$ . . . $F$ completely determine the moment
of inertia of the body about any axis through the same center of
rotation.
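The agreement of Eq. (4) with the elementary definition $\sum m\rho^2$ can be verified numerically for any chosen axis. The following is a minimal sketch with hypothetical point masses and a hypothetical axis; the function names are our own.

```python
import math

def inertia_sums(masses, positions):
    """Moments A, B, C and products D, E, F of inertia, as defined after Eq. (2)."""
    A = sum(m * (y * y + z * z) for m, (x, y, z) in zip(masses, positions))
    B = sum(m * (z * z + x * x) for m, (x, y, z) in zip(masses, positions))
    C = sum(m * (x * x + y * y) for m, (x, y, z) in zip(masses, positions))
    D = sum(m * y * z for m, (x, y, z) in zip(masses, positions))
    E = sum(m * z * x for m, (x, y, z) in zip(masses, positions))
    F = sum(m * x * y for m, (x, y, z) in zip(masses, positions))
    return A, B, C, D, E, F

def moment_about_axis(sums, axis):
    """Eq. (4): I for an axis whose direction cosines are proportional to `axis`."""
    A, B, C, D, E, F = sums
    norm = math.sqrt(sum(c * c for c in axis))
    lam, mu, nu = (c / norm for c in axis)
    return (lam * lam * A + mu * mu * B + nu * nu * C
            - 2.0 * mu * nu * D - 2.0 * nu * lam * E - 2.0 * lam * mu * F)

def moment_direct(masses, positions, axis):
    """Elementary definition: sum of m rho^2, rho being the distance from the axis."""
    norm = math.sqrt(sum(c * c for c in axis))
    n = tuple(c / norm for c in axis)
    total = 0.0
    for m, (x, y, z) in zip(masses, positions):
        along = x * n[0] + y * n[1] + z * n[2]      # projection on the axis
        total += m * (x * x + y * y + z * z - along * along)
    return total

# Hypothetical body and axis through the origin.
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 1.0, 1.0), (1.0, 2.0, -1.0)]
axis = (1.0, 2.0, 2.0)
I_from_eq4 = moment_about_axis(inertia_sums(masses, positions), axis)
I_direct = moment_direct(masses, positions, axis)
```

The six sums are computed once, after which the moment of inertia about any other axis through the same center follows from Eq. (4) without returning to the individual particles.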
63. The Ellipsoid of Inertia; Principal Axes of Inertia. — The
Eq. (4) for the moment of inertia $I$ may be interpreted geometrically
in a very simple manner. The equation
$Ax^2 + By^2 + Cz^2 - 2Dyz - 2Ezx - 2Fxy = \text{constant}$ represents a surface
of second degree. If we denote by $r$ the radius vector drawn
from $O$ to a point on this surface, having direction cosines $\lambda$, $\mu$, $\nu$,
this equation becomes
$$r^2(A\lambda^2 + B\mu^2 + C\nu^2 - 2D\mu\nu - 2E\nu\lambda - 2F\lambda\mu) = \text{constant}. \qquad (5)$$
The expression inside the parentheses is just $I$, the moment of
inertia, so that we have $r^2 = \text{constant}/I$, or $I = \text{constant}/r^2$.
Now, since the moment of inertia is always positive and can
never vanish, $r^2$ cannot become infinite and our surface is a closed
surface. Since it is of second degree it is an ellipsoid with its
center at $O$, and is called the ellipsoid of inertia at the point $O$.
The ellipsoid of inertia has the simple physical significance that
the moment of inertia of the body about any axis through $O$ is
measured by the inverse square of the radius vector from $O$,
drawn parallel to the rotation axis and terminating on the surface
of the ellipsoid.
Every ellipsoid has three principal axes which are mutually
orthogonal. These axes are known as the principal axes of
inertia at $O$. Just as in the case of an ellipse, when coordinate
axes are chosen coincident with the principal axes, the equation
of the ellipsoid reduces to a sum of squares, so that the coefficients
of the terms in yz, zx, and xy disappear and we have D = E =
F = 0. We shall often use coordinate axes coincident with the
principal axes, but since these axes are fixed with respect to
the rigid body, we must always remember that they are rotating
axes in space, and we must describe their motion with respect to a
system of axes fixed in space. Referred to the principal axes,
the moment of inertia becomes simply $\lambda^2 A_0 + \mu^2 B_0 + \nu^2 C_0$,
where these $A_0$, $B_0$, and $C_0$ are now computed with respect to
axes fixed in the body, and so do not change with rotation of the
body, as do the ordinary moments and products of inertia computed
with respect to fixed axes. The kinetic energy of rotation
is then $T = \frac{1}{2}I\omega_0^2 = \frac{1}{2}(\lambda^2\omega_0^2 A_0 + \mu^2\omega_0^2 B_0 + \nu^2\omega_0^2 C_0)$, which is
also $T = \frac{1}{2}(A_0\omega_1^2 + B_0\omega_2^2 + C_0\omega_3^2)$, where $\omega_1$, $\omega_2$, and $\omega_3$ are
the components of $\boldsymbol{\omega}$ taken about the principal axes.
64. The Equations of Motion. — Suppose the moment, or tor-
que, of the external force is M, with components M x , M y , M z .
Then the equations of motion are obtained by setting the torque
equal to the time rate of change of angular momentum:
$\mathbf{M} = d\mathbf{p}/dt$, or for the $x$ component
$$M_x = \frac{d}{dt}(A\omega_x - F\omega_y - E\omega_z), \qquad (6)$$
where, of course, we are using arbitrary $x, y, z$ coordinates, not
the principal axes. In performing the differentiation, we must
remember that not only are $\omega_x, \omega_y, \omega_z$ changing, but
also $A, F, E$, since the body is rotating, and these moments and
products of inertia are defined with respect to a particular fixed
coordinate system. Thus we have
$$M_x = A\dot\omega_x - F\dot\omega_y - E\dot\omega_z + \dot A\omega_x - \dot F\omega_y - \dot E\omega_z.$$
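The last three terms can be rewritten, using, for instance,
$$\dot A = \frac{d}{dt}\sum m(y^2 + z^2) = 2\sum m(y\dot y + z\dot z),$$
along with $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$, so that
$\dot x = \omega_y z - \omega_z y$, etc. Without
trouble we find that the equations can be written
$$M_x = A\dot\omega_x - F\dot\omega_y - E\dot\omega_z - (B - C)\omega_y\omega_z - D(\omega_y^2 - \omega_z^2) + F\omega_x\omega_z - E\omega_x\omega_y, \qquad (7)$$
with equivalent equations for the $y$ and $z$ components. The
latter terms seem very complicated; but we readily see that they
can be written as a vector product, giving
$$M_x = \frac{dp_x}{dt} = A\dot\omega_x - F\dot\omega_y - E\dot\omega_z + (\boldsymbol{\omega} \times \mathbf{p})_x. \qquad (8)$$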
The equation for time rate of change of angular momentum,
in the form above, has a simple interpretation. Suppose we
have any vector $\mathbf{G}$, and that we consider it with respect to
rotating coordinates, rotating with the angular velocity
$\boldsymbol{\omega}$. If we were rotating with the coordinates, the
vector would seem to have a certain time rate of change, which we may
call $\partial\mathbf{G}/\partial t$. But this will not be its actual
time rate of change, when looked at from a stationary system of
coordinates. For even a vector which remained constant in the rotating
system would actually be changing, just on account of its rotation. In
fact, the rate of change of the vector for this latter reason, using
the same sort of argument which we met in Eq. (1) in describing the
precessing top, is $(\boldsymbol{\omega} \times \mathbf{G})$, and the
total rate of change of $\mathbf{G}$ is the sum of these two effects, or
$$\frac{d\mathbf{G}}{dt} = \frac{\partial\mathbf{G}}{\partial t} + (\boldsymbol{\omega} \times \mathbf{G}). \qquad (9)$$
In particular, then, with the angular momentum, we evidently
have two terms of the sort considered above. We conclude
therefore that
$$A\dot\omega_x - F\dot\omega_y - E\dot\omega_z = \left(\frac{\partial\mathbf{p}}{\partial t}\right)_x,$$
so that these terms represent the rate of change of angular
momentum, with respect to the rotating axes.
One result of the theorem we have just worked out is interesting.
Let the vector $\mathbf{G}$ be the angular velocity. Then
$d\boldsymbol{\omega}/dt = \partial\boldsymbol{\omega}/\partial t$,
since the vector product $(\boldsymbol{\omega} \times \boldsymbol{\omega})$
is zero. Hence the components of time rate of change of angular
velocity are the same in fixed as in rotating axes.
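The rule that the true rate of change of a vector is its apparent rate in the rotating system plus $\boldsymbol{\omega} \times \mathbf{G}$ can be tested on a simple case. In this sketch the rotation is about the $z$ axis at a constant rate `W`, and the body-frame components `g_body` are arbitrary made-up functions of time. Both are assumptions chosen only to keep the check short.

```python
import math

W = 0.7                      # hypothetical constant angular speed about z
OMEGA = (0.0, 0.0, W)

def rotate_z(angle, v):
    """Components in the fixed frame of a vector given in the rotated frame."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def g_body(t):
    """Made-up components of G as seen in the rotating system."""
    return (1.0 + 0.3 * t, 0.5 * math.sin(t), 0.2 * t * t)

def g_body_rate(t):
    """Apparent rate of change of G for an observer rotating with the axes."""
    return (0.3, 0.5 * math.cos(t), 0.4 * t)

def G_fixed(t):
    return rotate_z(W * t, g_body(t))

def fixed_rate_numeric(t, h=1e-5):
    """Actual rate of change in the fixed frame, by central differences."""
    hi, lo = G_fixed(t + h), G_fixed(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(hi, lo))

def fixed_rate_formula(t):
    """Apparent rate (expressed in fixed components) plus omega x G."""
    apparent = rotate_z(W * t, g_body_rate(t))
    turning = cross(OMEGA, G_fixed(t))
    return tuple(a + b for a, b in zip(apparent, turning))

mismatch = max(abs(a - b) for a, b in
               zip(fixed_rate_numeric(1.3), fixed_rate_formula(1.3)))
```

The mismatch is at the level of the finite-difference error, so the two ways of computing the rate agree for this example.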
65. Euler's Equations. — The equations of motion, (7) or (8)>
take on a particularly simple form when expressed in terms of
the principal axes. Let us first take our fixed axes $xyz$ so that
they coincide with the instantaneous values of the rotating,
principal axes. Then $D, E, F$ are instantaneously zero, and the
equations (7) are
$$M_x = A_0\dot\omega_x - (B_0 - C_0)\omega_y\omega_z,$$
with two similar equations. But now let $\omega_1, \omega_2, \omega_3$
be the components of angular velocity with respect to the rotating
principal axes. Momentarily these equal $\omega_x, \omega_y, \omega_z$.
But also $\dot\omega_1$ is the same thing as
$(\partial\boldsymbol{\omega}/\partial t)_x$, the $x$ component of the
time rate of change of angular velocity with respect to the rotating
axes. We have just shown, however, that this equals
$(d\boldsymbol{\omega}/dt)_x$, or $\dot\omega_x$. Hence we can rewrite
our equations entirely in terms of the moving axes,
$$M_1 = A_0\dot\omega_1 - (B_0 - C_0)\omega_2\omega_3$$
$$M_2 = B_0\dot\omega_2 - (C_0 - A_0)\omega_3\omega_1$$
$$M_3 = C_0\dot\omega_3 - (A_0 - B_0)\omega_1\omega_2, \qquad (10)$$
where $M_1, M_2, M_3$ are the components of torque with respect
to the rotating axes. These equations are called Euler's equations.
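Euler's equations are easy to integrate numerically once they are solved for the three rates $\dot\omega_1, \dot\omega_2, \dot\omega_3$. The following is a minimal sketch, not a production integrator: it uses a classical fourth-order Runge-Kutta step, and the principal moments, initial angular velocity, and step size are hypothetical values. With zero torque, the kinetic energy and the magnitude of the angular momentum should stay constant, which gives a convenient check.

```python
def euler_rates(omega, moments, torque=(0.0, 0.0, 0.0)):
    """Solve Eq. (10) for the time derivatives of (w1, w2, w3)."""
    A0, B0, C0 = moments
    w1, w2, w3 = omega
    M1, M2, M3 = torque
    return (
        (M1 + (B0 - C0) * w2 * w3) / A0,
        (M2 + (C0 - A0) * w3 * w1) / B0,
        (M3 + (A0 - B0) * w1 * w2) / C0,
    )

def rk4_step(omega, dt, moments, torque=(0.0, 0.0, 0.0)):
    """Advance omega by one classical Runge-Kutta step of size dt."""
    def nudge(k, scale):
        return tuple(w + scale * ki for w, ki in zip(omega, k))
    k1 = euler_rates(omega, moments, torque)
    k2 = euler_rates(nudge(k1, dt / 2), moments, torque)
    k3 = euler_rates(nudge(k2, dt / 2), moments, torque)
    k4 = euler_rates(nudge(k3, dt), moments, torque)
    return tuple(
        w + dt * (a + 2 * b + 2 * c + d) / 6
        for w, a, b, c, d in zip(omega, k1, k2, k3, k4)
    )

def kinetic_energy(omega, moments):
    """T = (A0 w1^2 + B0 w2^2 + C0 w3^2) / 2."""
    return 0.5 * sum(inertia * w * w for inertia, w in zip(moments, omega))

def angular_momentum_squared(omega, moments):
    """Squared magnitude of p, with components A0 w1, B0 w2, C0 w3."""
    return sum((inertia * w) ** 2 for inertia, w in zip(moments, omega))

moments = (1.0, 2.0, 3.0)   # hypothetical A0, B0, C0
omega = (0.3, 1.0, 0.2)     # hypothetical initial components on the body axes
T_start = kinetic_energy(omega, moments)
p2_start = angular_momentum_squared(omega, moments)
for _ in range(2000):       # torque-free motion for 2 time units
    omega = rk4_step(omega, 1e-3, moments)
T_end = kinetic_energy(omega, moments)
p2_end = angular_momentum_squared(omega, moments)
```

The individual components of $\boldsymbol{\omega}$ change during the run, while `T_end` and `p2_end` match their starting values to the accuracy of the integrator.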
66. Torque-free Motion of a Symmetric Rigid Body.— We
shall now apply Euler's equations to the motion of a rigid body
symmetric about an axis, subject to the action of no external
torques (either the external forces are zero, or act at the center
of mass). The earth provides a good example, if we neglect
the torques due to sun and moon. We choose the center of mass
as an origin, and take the axis of symmetry as principal axis
3. The principal moments of inertia are then $A_0$, $A_0$, $C_0$.
Euler's equations for this case are
$$A_0\dot\omega_1 + \omega_2\omega_3(C_0 - A_0) = 0$$
$$A_0\dot\omega_2 + \omega_1\omega_3(A_0 - C_0) = 0$$
$$C_0\dot\omega_3 = 0.$$
The last equation integrates at once, giving $\omega_3 = \text{constant}$.
This means that the resultant angular velocity has a constant
component along the axis of symmetry. If we now place
$a = \omega_3\dfrac{C_0 - A_0}{A_0}$, the two other equations are
$\dot\omega_1 + a\omega_2 = 0$ and $\dot\omega_2 - a\omega_1 = 0$.
Differentiating the first of these, we find
$\ddot\omega_1 + a\dot\omega_2 = \ddot\omega_1 + a^2\omega_1 = 0$,
which has as its solution $\omega_1 = \alpha\cos(at + \epsilon)$, and
putting this value of $\omega_1$ in the second equation we find
$\omega_2 = \alpha\sin(at + \epsilon)$, where $\alpha$ and $\epsilon$
are integration constants. From these equations we see that the
resultant angular velocity
$\omega = \sqrt{\omega_1^2 + \omega_2^2 + \omega_3^2} = \sqrt{\alpha^2 + \omega_3^2}$
is constant, and that the projection of $\boldsymbol{\omega}$ on the
plane perpendicular to the axis of symmetry and fixed in the body
describes a circle of radius $\alpha$ with a period given by
$\tau = \dfrac{2\pi}{a} = \dfrac{2\pi}{\omega_3}\dfrac{A_0}{C_0 - A_0}$.
In the case of the earth, $\omega_3 = 2\pi$ per day, so that $\tau$
becomes $A_0/(C_0 - A_0)$ days, which is about 300 days and is known as
the Euler period. This period is not observed, but there is one of 427
days known as the Chandler period giving rise to a variation of
latitude. When the imperfect rigidity of the earth is taken into
account, it is possible to identify these two as the same.
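The circular solution and the Euler period can be checked with a few lines of arithmetic. In this sketch the moments `A0 = 300` and `C0 = 301` are not measured values: they are placeholders chosen only so that $A_0/(C_0 - A_0)$ equals the round figure of 300 quoted in the text, and the amplitude and phase are arbitrary.

```python
import math

def body_cone_rate(A0, C0, w3):
    """a = w3 (C0 - A0) / A0, the rate at which omega circles the figure axis."""
    return w3 * (C0 - A0) / A0

def transverse_omega(t, alpha, a, eps):
    """The solution w1 = alpha cos(at + eps), w2 = alpha sin(at + eps)."""
    return alpha * math.cos(a * t + eps), alpha * math.sin(a * t + eps)

def residuals(t, alpha, a, eps, h=1e-4):
    """Left sides of w1' + a w2 = 0 and w2' - a w1 = 0, by central differences."""
    w1_hi, w2_hi = transverse_omega(t + h, alpha, a, eps)
    w1_lo, w2_lo = transverse_omega(t - h, alpha, a, eps)
    w1, w2 = transverse_omega(t, alpha, a, eps)
    return ((w1_hi - w1_lo) / (2 * h) + a * w2,
            (w2_hi - w2_lo) / (2 * h) - a * w1)

A0, C0 = 300.0, 301.0        # placeholder moments; only the ratio matters
w3 = 2 * math.pi             # one rotation per day, in radians per day
a = body_cone_rate(A0, C0, w3)
period_days = 2 * math.pi / a
res = residuals(12.0, 1.0, a, 0.4)   # arbitrary time, amplitude, and phase
```

With these placeholders `period_days` comes out as 300, and both residuals vanish to within the finite-difference error, confirming that the cosine and sine pair satisfies the two coupled equations.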
We can get an idea of the actual motion most clearly from a
diagram. In Fig. 15, we show an oblate spheroid, to represent
the symmetrical body. There is a circular conical hole Obd
cut out surrounding the north pole a, and a fixed cone touching
the inside of this hole, and centered on the line Oc. The motion
is now as if one cone rolled on the other. We see at once that,
since the axis Ob is instantaneously at rest, it is the instantaneous
axis of rotation co. As time goes on, this axis of rotation traces
out the cone Obd with respect to the body, and at the same time
traces out the cone Obe fixed in space. The axis of the fixed
cone, Oc, is the direction of the constant total angular momen-
tum vector. Other properties of the motion are discussed in a
problem.
67. Euler's Angles. — If we wish information about the general
motion of the top, we must introduce some set of coordinates
capable of describing its position. So far, we have not had any
set of coordinates at all. We have worked with angular velocities,
and angular momenta, which were vectors, and all the equations
came out very neatly and symmetrically in terms of them. But
there is a peculiar thing about the three components of angular
velocity: there are no corresponding angles to serve as coordinates.

Fig. 15. Space and body cones for the torque-free rotation of a
symmetrical body. The cone $Odb$, fixed in the body, rolls on the cone
$Obe$, fixed in space. The line $Oa$ is the axis of symmetry of the
body, $Ob$ is the instantaneous axis of rotation, $Oc$ the fixed axis
of total angular momentum.

This is not true in plane motions. If a body rotates
with angular velocity $\omega$ about a fixed axis, we can regard
$\omega$ as $\dot\theta$, where $\theta$ is the angle through which the
body has turned about the fixed axis, and which can be used as a
coordinate. Then we can say that the component of angular momentum
$I\omega$ is the momentum conjugate to $\theta$, and the whole
Lagrangian and Hamiltonian methods go through perfectly. As soon as we
have three
dimensions, however, and the possibility of different axes of
rotation, we no longer have such angles. It is readily seen, for
instance (we leave it for a problem), that one cannot use the
angles through which the body has turned about the three
coordinate axes as variables. The fact is that, though angular
momentum is a vector, finite angular rotations are not, and do
not have three components which can be used as coordinates.
We are forced, then, by the peculiar nature of angular rota-
tions, to look for some set of three angles to describe the position
of the body, which unfortunately cannot have the symmetrical
nature of the x, y, z components of angular velocity. The usual
set of angles are called Euler's angles, and are shown in Fig. 16.
We ordinarily use these angles for discussing a symmetrical
body. Then $Oz$ is a fixed axis, for example, the vertical in the
top problem. $OC_0$ is the axis of figure of the body, taken as the
third principal axis. $\theta$ measures the angle between axis of
figure and fixed axis, $\phi$ measures the angle of precession of the
axis of figure, so that $d\phi/dt$ is the angular velocity of
precession, and $\psi$ measures the rotation of the body about its
axis of figure measured from the line $ON$, called the nodal line.
Thus we see that, though the Eulerian angles do not have symmetry,
they are very natural ones for the problem in hand.

Fig. 16. Euler's angles. For a symmetrical body, $OC_0$ is the axis of
symmetry, $OA_0$ and $OB_0$ two axes fixed in the body at right angles.
$\theta$ and $\phi$ measure colatitude and longitude of the direction
of the principal axis; $\psi$ measures the rotation of the body about
the principal axis.

Let us set up the components of angular velocity, and the
kinetic energy, in terms of the Eulerian angles. The motion
of the body may be thought of as consisting of a rotation of the
body about $OC_0$ and the motion of $OC_0$ relative to the fixed
frame of reference. The former is described by the angular
velocity $\dot\psi$ which has the components $0, 0, \dot\psi$ (referred
to the principal axes). The latter motion consists of (a) a rotation
$\dot\theta$ about the nodal line $ON$ as an axis, which was zero in
the steady motion of the top considered above; and (b) of a precession
$\dot\phi$ about the $z$ axis. The components of these angular
velocities along the principal axes $OA_0$, $OB_0$, and $OC_0$ are
(a) $\dot\theta\cos\psi$; $-\dot\theta\sin\psi$; $0$
(b) $\dot\phi\sin\theta\sin\psi$; $\dot\phi\sin\theta\cos\psi$; $\dot\phi\cos\theta$.
Adding these angular velocity components, we have
$$\omega_1 = \dot\theta\cos\psi + \dot\phi\sin\theta\sin\psi$$
$$\omega_2 = -\dot\theta\sin\psi + \dot\phi\sin\theta\cos\psi$$
$$\omega_3 = \dot\psi + \dot\phi\cos\theta.$$
Here $\dot\psi$ corresponds to the quantity $\omega$ used for
discussing the steady motion, and $\dot\phi$ to $\omega_1$. From these
components of angular velocity, of course, we can at once get the
angular momenta. The kinetic energy, as we have seen, is
$\frac{1}{2}(A_0\omega_1^2 + B_0\omega_2^2 + C_0\omega_3^2)$. But in
our case of a symmetrical top this simplifies, since $A_0 = B_0$, and
substituting we have
$$T = \tfrac{1}{2}[A_0(\omega_1^2 + \omega_2^2) + C_0\omega_3^2] = \tfrac{1}{2}[A_0(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) + C_0(\dot\psi + \dot\phi\cos\theta)^2]. \qquad (11)$$
Using the kinetic energy, or corresponding Lagrangian function,
in terms of the Eulerian angles, we can easily derive the Lagrangian
equations of motion, and find them to be the same Euler
equations which we have already obtained. For instance,
using $L = T$,
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot\psi}\right) - \frac{\partial L}{\partial\psi} = \frac{d}{dt}[C_0(\dot\psi + \dot\phi\cos\theta)] = C_0\dot\omega_3 = M_3,$$
which is the third of Euler's equations, when we remember that
$A_0 - B_0 = 0$.
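The passage from the components $\omega_1, \omega_2, \omega_3$ to the compact form (11) relies on the cross terms in $\dot\theta\dot\phi$ cancelling. A short numerical sketch makes this concrete. The angles, rates, and moments below are arbitrary illustrative values, and the function names are introduced here only for the example.

```python
import math

def body_omega(theta, psi, theta_dot, phi_dot, psi_dot):
    """Components of angular velocity on the principal axes, from Euler's angles."""
    w1 = theta_dot * math.cos(psi) + phi_dot * math.sin(theta) * math.sin(psi)
    w2 = -theta_dot * math.sin(psi) + phi_dot * math.sin(theta) * math.cos(psi)
    w3 = psi_dot + phi_dot * math.cos(theta)
    return w1, w2, w3

def energy_from_components(A0, C0, omega):
    """T = [A0 (w1^2 + w2^2) + C0 w3^2] / 2 for a symmetrical top."""
    w1, w2, w3 = omega
    return 0.5 * (A0 * (w1 * w1 + w2 * w2) + C0 * w3 * w3)

def energy_from_angles(A0, C0, theta, theta_dot, phi_dot, psi_dot):
    """Eq. (11), written directly in terms of the Eulerian angles."""
    transverse = theta_dot ** 2 + math.sin(theta) ** 2 * phi_dot ** 2
    spin = (psi_dot + phi_dot * math.cos(theta)) ** 2
    return 0.5 * (A0 * transverse + C0 * spin)

# Arbitrary illustrative state and moments.
A0, C0 = 2.0, 0.5
theta, psi = 0.7, 1.1
theta_dot, phi_dot, psi_dot = 0.4, -1.3, 9.0

omega = body_omega(theta, psi, theta_dot, phi_dot, psi_dot)
T_components = energy_from_components(A0, C0, omega)
T_angles = energy_from_angles(A0, C0, theta, theta_dot, phi_dot, psi_dot)
```

The two energies agree for any choice of $\psi$, which reflects the fact that $\psi$ does not appear in (11) and its Lagrangian equation therefore has no $\partial L/\partial\psi$ term.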
68. General Motion of a Symmetrical Top under Gravity —
We are now ready to proceed with the general discussion of the
top under gravity, for which we have already considered the
steady precession. We note first that the torque is at right
angles to the axis of figure. Hence by the third of Euler's
equations, $\dot\omega_3 = 0$, or $\omega_3$ is constant. Instead of
using the other two of Euler's equations, it is somewhat more
convenient to use the conservation of energy and of angular momentum
to discuss the motion, much as we did in our earlier chapter on
central motion. For the kinetic energy we have
$T = \frac{1}{2}[A_0(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) + C_0\omega_3^2]$,
and the potential energy is $Mgl\cos\theta$, where $l$ is the distance
from $O$ to the center of mass. Thus the energy equation becomes
$$E = \tfrac{1}{2}[A_0(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) + C_0\omega_3^2] + Mgl\cos\theta, \qquad (12)$$
where $E$ is the total energy. We now can eliminate $\dot\phi$ from the
equation above by utilizing the fact that there are no torques
taken about the $z$ axis. This means that the component of angular
momentum along this axis is constant. The angular momentum due to the
rotation $\omega_3$ of the top about its axis of symmetry
has a vertical component $C_0\omega_3\cos\theta$. The angular velocity
$\dot\theta$ of the axis contributes nothing to the vertical angular
momentum. The other component of the angular velocity of the axis is
$\sin\theta\,\dot\phi$, and this is about an axis perpendicular to
$OC_0$, making an angle of $\pi/2 - \theta$ with the vertical. Thus the
contribution of this term is $A_0\sin^2\theta\,\dot\phi$, so that the
conservation of angular momentum about the $z$ axis yields
$$p_z = C_0\omega_3\cos\theta + A_0\sin^2\theta\,\dot\phi. \qquad (13)$$
We now substitute the value of $\dot\phi$ taken from this equation into
the energy equation and get a differential equation for $\theta$ alone,
so that we may discuss the time variations of $\theta$, or the
variations of the inclinations of the axis of figure of the top with
the vertical. When we make this substitution, and solve for
$\dot\theta$, we have
$$\dot\theta = \sqrt{2(E - V')/A_0}, \qquad (14)$$
where $V'$, which plays the part of a fictitious potential energy for
the motion of this coordinate, has the value
$$V' = Mgl\cos\theta + \tfrac{1}{2}C_0\omega_3^2 + \frac{(p_z - C_0\omega_3\cos\theta)^2}{2A_0\sin^2\theta}. \qquad (15)$$
The first term is the gravitational energy, decreasing as the angle
increases, showing that gravity tends to make the top fall. The
second is a constant, the energy of the spinning motion. The
third term is a dynamic term, reminding us of the centrifugal
force term in the effective energy for the radial motion in a
central field. It becomes infinite when $\theta = 0$ or $\pi$, since at
those angles the rate of precession $\dot\phi$ would have to be
infinitely rapid in order to conserve the angular momentum component
$p_z$, contributing therefore an infinite amount to the energy.
Between these angles this dynamic term has a single minimum.
In other words, it exerts a stabilizing influence, quite apart from
any external forces which may act, and leads to a stable oscillation
of $\theta$ about a certain minimum of $V'$, whose position is
determined by the external torque.
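A numerical look at the fictitious potential shows the single minimum between the two infinite walls. The sketch below assumes the three-term form described above (gravitational, spin, and dynamic terms) and uses made-up values for a small laboratory top. The ternary search is a stand-in chosen for brevity and presumes that $V'$ has only one minimum inside the bracket. As a cross-check, it evaluates the balance $\dot\phi(C_0\omega_3 - A_0\dot\phi\cos\theta) = Mgl$, which is what setting $dV'/d\theta = 0$ reduces to when $\dot\phi$ is taken from Eq. (13).

```python
import math

def effective_potential(theta, p_z, A0, C0, w3, Mgl):
    """Gravitational term + spin energy + dynamic term, as described above."""
    dynamic = (p_z - C0 * w3 * math.cos(theta)) ** 2 / (2 * A0 * math.sin(theta) ** 2)
    return Mgl * math.cos(theta) + 0.5 * C0 * w3 ** 2 + dynamic

def minimize(f, lo, hi, iterations=200):
    """Ternary search; assumes f has a single minimum between lo and hi."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Hypothetical top: moments in kg m^2, spin in rad/s, Mgl in N m.
A0, C0, w3, Mgl = 2e-4, 1e-4, 200.0, 0.05
# Hypothetical p_z, set from theta = 0.5 with a precession rate of 2.5 rad/s.
p_z = C0 * w3 * math.cos(0.5) + A0 * math.sin(0.5) ** 2 * 2.5

def V(theta):
    return effective_potential(theta, p_z, A0, C0, w3, Mgl)

theta_min = minimize(V, 0.01, math.pi - 0.01)
# Precession rate at the minimum, from Eq. (13).
phi_dot = (p_z - C0 * w3 * math.cos(theta_min)) / (A0 * math.sin(theta_min) ** 2)
# Torque balance at the minimum; should vanish up to the search accuracy.
balance = phi_dot * (C0 * w3 - A0 * phi_dot * math.cos(theta_min)) - Mgl
```

For these values the minimum lies close to the starting inclination of 0.5 rad, and `balance` is small compared with `Mgl`, limited only by how finely the search can resolve a flat minimum.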
69. Precession and Nutation. — The minimum of V can be
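determined by differentiating with respect to $\theta$, setting the
result equal to zero. This gives
$$0 = -Mgl\sin\theta + \frac{p_z - C_0\omega_3\cos\theta}{A_0\sin^2\theta}\left\{C_0\omega_3\sin\theta - A_0\cos\theta\sin\theta\left[\frac{p_z - C_0\omega_3\cos\theta}{A_0\sin^2\theta}\right]\right\}$$
or, from Eq. (13),
$$\dot\phi[C_0\dot\psi - (A_0 - C_0)\dot\phi\cos\theta] = Mgl. \qquad (16)$$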
If the energy is equal to the effective potential $V'$ at this angle,
$\dot\theta$ will be zero, and the motion is a pure precession of the
sort described in Sec. 61. If we assume that the rate of precession
is small compared with the rate of rotation, which is the only case in
which the angular momentum, the angular velocity, and the axis of
figure are nearly enough in the same straight line so that the
arguments of that section are valid, we have $\dot\phi \ll \dot\psi$.
In that case the equation becomes $\dot\phi(C_0\dot\psi) = Mgl$,
$\dot\phi = Mgl/(C_0\dot\psi)$, in agreement with the result of
Sec. 61, when we recall that in this limit $C_0\dot\psi$ is
approximately the total angular momentum. This condition, or rather
the accurate condition (16), determines the rate of steady precession
$\dot\phi$ for any total angular momentum, a rate independent of
$\theta$ to this approximation, but depending on $\theta$ if we must
consider terms in $\dot\phi^2$.
If the energy $E$ is greater than the minimum of $V'$, the curve
of $E$ will cut that for $V'$ at two values of $\theta$, one greater
and one less than the inclination of the axis for the purely
precessional motion which we have just discussed. In this case,
$\theta$ will oscillate between these two limits. This oscillation is
called nutation. The complete motion then consists of a combination of
this nutation with a precession, as indicated in Fig. 17, where we draw
the intersection of the axis of the top with a sphere. The angles
$\theta_1$ and $\theta_2$ are the two angles for which $E = V'$, so
that the minimum of $V'$, or the angle for the pure precessional motion
corresponding to the same angular momentum, lies between these
two values. In the problems, the frequency of the nutational
motion is discussed. We also discuss, in Prob. 9, the special
case of the "sleeping" top, in which the top starts spinning
vertically. In this special case, the dynamic term in $V'$ is finite
at $\theta = 0$, so that under certain circumstances oscillations about
the vertical can occur.

Fig. 17. Nutation of a top. The sinusoidal curve is the projection of
the axis of the top on a sphere. $\theta_1$ and $\theta_2$ are angular
limits of the nutational motion.
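The two limiting angles of the nutation are simply the roots of $E = V'$, and they can be located numerically. The sketch below takes one particular starting condition as an assumption: the spinning top is released with its axis momentarily at rest at $\theta_1$, so that $\dot\theta = \dot\phi = 0$ there. The numbers describing the top are hypothetical, and the scan-then-bisect root finder is only one simple way to do it.

```python
import math

def effective_potential(theta, p_z, top):
    """Fictitious potential: gravity + spin energy + dynamic term."""
    A0, C0, w3, Mgl = top
    dynamic = (p_z - C0 * w3 * math.cos(theta)) ** 2 / (2 * A0 * math.sin(theta) ** 2)
    return Mgl * math.cos(theta) + 0.5 * C0 * w3 ** 2 + dynamic

def released_at_rest(theta_start, top):
    """E and p_z when the axis starts at theta_start with no motion of the axis."""
    A0, C0, w3, Mgl = top
    p_z = C0 * w3 * math.cos(theta_start)
    energy = 0.5 * C0 * w3 ** 2 + Mgl * math.cos(theta_start)
    return energy, p_z

def far_turning_point(theta_start, top, scan_steps=20000):
    """Second angle at which E = V', found by scanning and then bisecting."""
    energy, p_z = released_at_rest(theta_start, top)

    def excess(theta):
        return energy - effective_potential(theta, p_z, top)

    step = (math.pi - theta_start) / scan_steps
    lo = theta_start + step
    if excess(lo) <= 0.0:
        return theta_start      # nutation smaller than the scan resolution
    hi = lo + step
    while excess(hi) > 0.0:     # V' grows without limit near pi, so this stops
        lo, hi = hi, hi + step
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical top: A0, C0 in kg m^2, spin w3 in rad/s, Mgl in N m.
top = (2e-4, 1e-4, 200.0, 0.05)
theta_1 = 0.5
energy, p_z = released_at_rest(theta_1, top)
theta_2 = far_turning_point(theta_1, top)
```

For this fast top the axis dips only slightly below its starting inclination before rising again, so `theta_2` exceeds `theta_1` by a few hundredths of a radian. Reducing the spin `w3` widens the gap between the two limits.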
Problems
1. Prove directly that the moment of inertia $I$, equal to
$\sum m p^2$, is equal to
$\lambda^2 A + \mu^2 B + \nu^2 C - 2\mu\nu D - 2\nu\lambda E - 2\lambda\mu F$,
where $\lambda, \mu, \nu$ are the direction cosines of the axis of
rotation.
2. Show that, if $T$ is the kinetic energy of a rotating body,
$\mathbf{p}$ its angular momentum, $\boldsymbol{\omega}$ its angular
velocity, $p_x = \partial T/\partial\omega_x$, and
$2T = p_x\omega_x + p_y\omega_y + p_z\omega_z$.
3. In Fig. 15, show that $\tan\angle aOb = \alpha/\omega_3$, and
$\tan\angle aOc = \dfrac{A_0\alpha}{C_0\omega_3}$, where $\omega_3$,
$\alpha$ represent the components of the angular velocity along and at
right angles to the figure axis. Knowing that the time required for
the axis $Ob$ of angular velocity to perform a complete rotation with
respect to the body is
$\tau = \dfrac{2\pi}{\omega_3}\dfrac{A_0}{C_0 - A_0}$, show that the
time for it to perform a complete rotation in space is approximately
$\dfrac{2\pi}{\omega_3}\dfrac{A_0}{C_0}$ if angles $aOb$ and $aOc$ are
small. Hence show that for the earth the axis of angular velocity is
not fixed, but rotates about a fixed direction approximately once a day.
4. The earth is acted on by torques exerted by the sun and moon, and as a
consequence its angular momentum precesses about a fixed direction in
space. This is entirely separate from the effect of Fig. 15 and Prob. 3,
which we now neglect. This precession has a period of 25,800 years, and
carries the angular momentum about a cone of semi-vertical angle 23° 27',
so that the pole in succession points to different parts of the heavens, result-
ing in the precession of the equinoxes, and in the fact that different stars
act as pole star at different periods of history. Show that the motion can
be represented by the rolling of a cone fixed in the earth, of diameter 21 in.
at the north pole, on a cone of angle 23° 27' fixed in the heavens.
5. A system of electrons moving about a center of attraction has a
certain angular momentum, equal to
$\sum m(\mathbf{r} \times \mathbf{v})$, and also a magnetic moment,
equal to $\dfrac{e}{2c}\sum(\mathbf{r} \times \mathbf{v})$, where $e$
is the charge and $m$ the mass of an electron, $c$ the velocity of
light. This magnetic effect results because the electrons
in rotation act like little currents, which in turn have magnetic
fields like bar magnets. An external magnetic field $\mathbf{H}$ exerts
a torque on the system, equal to the vector product of the magnetic
moment and $\mathbf{H}$. Show that
under the action of the field, the system of electrons precesses with
angular velocity $eH/2mc$ about the direction of the field. This
precession, which, as we see, is independent of the velocities of the
electrons, is called Larmor's precession.
6. One reason why finite rotations do not act as vectors is that they
do not commute, that is, the same two rotations applied in one order
lead to one answer, but in the opposite order to a very different
answer. Demonstrate this by diagrams, imagining that we have a cube
(label its faces by different letters or numbers on a diagram),
originally in one position, with its edges parallel to the coordinate
axes (position a). First rotate through 90 deg. about the $x$ axis
(position b), then through 90 deg. about the $y$ axis (position
c), drawing diagrams of each step. Then, starting again from position
a, rotate first through 90 deg. about the $y$ axis (position d) and
then through 90 deg. about the $x$ axis (position e). Show that (c)
and (e) are entirely different orientations.
7. Write down the kinetic energy of a nonsymmetrical body, in terms of
Euler's angles. Derive the Lagrangian equation for $\psi$, and show
that it reduces to one of Euler's equations.
8. In the same way as in Prob. 7, set up the other two Lagrangian equa-
tions, showing that they lead to the other two of Euler's equations.
9. A top is started spinning vertically, with no other motion, so that
initially $\theta = 0$, $d\theta/dt = 0$. Show that $p_z = C_0\omega_3$,
$E = \frac{1}{2}C_0\omega_3^2 + Mgl$. Substituting these in the
expression of Eq. (14) for $\dot\theta$, show that if
$\omega_3 > \omega'$, where $(\omega')^2 = 4MglA_0/C_0^2$, the angle
$\theta$ must remain equal to zero, but that if $\omega_3$ falls below
$\omega'$, $\theta$ will oscillate between $0$ and the angle
$\cos^{-1}[2(\omega_3/\omega')^2 - 1]$.
Experimentally, if a top is started as we have described, with
$\omega_3 > \omega'$, there will be a frictional torque decreasing
$\omega_3$, and as soon as the torque reduces $\omega_3$ below
$\omega'$, the top will begin to wobble.
10. For a nutation of small amplitude about the steady precessional
motion of a top, the angle $\theta$ oscillates sinusoidally about the
equilibrium angle. Find the frequency of the nutation, by expanding the
potential $V'$ in power series in $\theta - \theta_0$, where
$\theta_0$ is the angle of steady precession with the
same angular momentum. Retain only the constant and the term in
$(\theta - \theta_0)^2$, and get the frequency by comparing with the
corresponding expression for the linear oscillator.
CHAPTER XI
COUPLED SYSTEMS AND NORMAL COORDINATES
The mechanical problems which we have treated so far have
been those where just one particle moved around, sometimes in a
potential field, sometimes subject to forces not derivable from a
potential. In many problems, however, there are several
particles exerting forces on each other and influencing each other's
motion. As examples, we have the actual solar system, where
the sun, planets, and moons all act on each other; an atom, with
the various electrons reacting; a molecule, with the atoms vibrat-
ing under the action of their mutual forces. A more familiar
case is that of several electric circuits coupled together by
induction or some other method. Another is that in which
several pendulums or springs can react on each other, as through
their supports, and affect each other's motion. There is evi-
dently a very wide variety of problems; we shall treat only the
simplest, in which two linear oscillators, or electric circuits, are
coupled together by a force depending linearly on both the
displacements.
70. Coupled Oscillators.— Suppose we have two undamped
one-dimensional oscillators, whose displacements are $y_1$ and $y_2$
respectively, and whose equations of motion, if uncoupled, would
be
$$m_1\frac{d^2y_1}{dt^2} + k_1y_1 = 0,$$
$$m_2\frac{d^2y_2}{dt^2} + k_2y_2 = 0.$$
Now let them be acted on by equal and opposite forces proportional
to the distance apart, $-a(y_1 - y_2)$ and $-a(y_2 - y_1)$
respectively, as if there were a spring stretched between them.
The equations then become
$$m_1\frac{d^2y_1}{dt^2} + (k_1 + a)y_1 - ay_2 = 0,$$
$$m_2\frac{d^2y_2}{dt^2} + (k_2 + a)y_2 - ay_1 = 0.$$
As a matter of convenience in the calculation, we shall introduce
changes of notation: let $y_1\sqrt{m_1} = x_1$, $y_2\sqrt{m_2} = x_2$,
$(k_1 + a)/m_1 = \omega_1^2$, $(k_2 + a)/m_2 = \omega_2^2$,
$a/\sqrt{m_1m_2} = c$. Then the equations are
$$\frac{d^2x_1}{dt^2} + \omega_1^2x_1 - cx_2 = 0,$$
$$\frac{d^2x_2}{dt^2} + \omega_2^2x_2 - cx_1 = 0. \qquad (1)$$
These are two simultaneous differential equations, and there are
several ways of solving them. First we may take advantage of
their property of being linear with constant coefficients, and see
if we cannot get exponential solutions. We assume
$x_1 = Ae^{i\omega t}$, $x_2 = Be^{i\omega t}$, where $A$, $B$,
$\omega$ are to be determined. Substituting, we have
$$(-\omega^2 + \omega_1^2)A - cB = 0,$$
$$-cA + (-\omega^2 + \omega_2^2)B = 0. \qquad (2)$$
If we regarded $\omega$ as being known, these would be two
simultaneous equations for the two constants $A$ and $B$. Evidently
they are linear homogeneous equations. Now it is a theorem of
algebra that in general two such equations do not have any
solutions other than $A = B = 0$, unless the determinant of
coefficients,
$$\begin{vmatrix} (-\omega^2 + \omega_1^2) & -c \\ -c & (-\omega^2 + \omega_2^2) \end{vmatrix} \qquad (3)$$
is equal to zero. Let us see what this means. We could solve
the first equation for $A$ in terms of $B$:
$A = Bc/(-\omega^2 + \omega_1^2)$.
But we could do the same with the second,
$A = B(-\omega^2 + \omega_2^2)/c$.
If these solutions are to be consistent, it must be that the two
factors on the right are equal,
$c/(-\omega^2 + \omega_1^2) = (-\omega^2 + \omega_2^2)/c$,
or $(-\omega^2 + \omega_1^2)(-\omega^2 + \omega_2^2) - c^2 = 0$. But
this is just the equation obtained by setting the determinant equal to
zero, so that we have verified the result of algebra. Now the equation
which we have obtained, called the secular equation, can be satisfied,
for we still have $\omega$ at our disposal. Solving the quadratic, this
gives
$$\omega^2 = \frac{\omega_1^2 + \omega_2^2}{2} \pm \sqrt{\left(\frac{\omega_1^2 - \omega_2^2}{2}\right)^2 + c^2}. \qquad (4)$$
This gives two values for $\omega^2$, or two different possible
frequencies of motion for the system. This is natural, since we should
have two frequencies if they were uncoupled, one for the one particle,
the other for the other. Suppose the first, with the $+$ sign, is
called $\omega'$, and the second, with the $-$ sign, $\omega''$. It is
interesting to find $\omega'$ and $\omega''$, in the case where $c$,
measuring the interaction between the particles, is small. Then we can
expand by the binomial theorem, obtaining
$$\omega'^2 = \omega_1^2 + \frac{c^2}{\omega_1^2 - \omega_2^2} + \cdots,$$
$$\omega''^2 = \omega_2^2 + \frac{c^2}{\omega_2^2 - \omega_1^2} + \cdots, \qquad (5)$$
showing that the frequencies approach the natural frequencies
of the separate systems when the coupling goes to zero, but that
they differ from them by quantities which increase as $c$ increases.
It is interesting to see that the frequencies are always spread
apart by the interaction: if $\omega_1^2 > \omega_2^2$, then
$\omega'^2 > \omega_1^2$, $\omega''^2 < \omega_2^2$,
and correspondingly if the situation is reversed. There are
several relations between $\omega'$ and $\omega''$ which we shall
need, and which we write for reference; they are easily proved from the
solutions already found, and hold independently of the size of $c$:
$$\omega'^2\omega''^2 = \omega_1^2\omega_2^2 - c^2$$
$$\omega'^2 + \omega''^2 = \omega_1^2 + \omega_2^2$$
$$(-\omega'^2 + \omega_1^2)(-\omega''^2 + \omega_1^2) = -c^2. \qquad (6)$$
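The roots (4), the small-coupling expansion (5), and the three relations (6) can all be confirmed with a short calculation. The values of $\omega_1^2$, $\omega_2^2$, and $c$ below are arbitrary illustrative numbers with weak coupling, and the function name is introduced only for this sketch.

```python
import math

def normal_frequencies_squared(w1_sq, w2_sq, c):
    """Both roots of the secular equation, Eq. (4): (omega'^2, omega''^2)."""
    mean = 0.5 * (w1_sq + w2_sq)
    root = math.sqrt((0.5 * (w1_sq - w2_sq)) ** 2 + c * c)
    return mean + root, mean - root

# Illustrative values with weak coupling (c small next to w1_sq - w2_sq).
w1_sq, w2_sq, c = 4.0, 1.0, 0.3
wp_sq, wpp_sq = normal_frequencies_squared(w1_sq, w2_sq, c)

# Small-coupling estimates from Eq. (5).
wp_sq_approx = w1_sq + c * c / (w1_sq - w2_sq)
wpp_sq_approx = w2_sq + c * c / (w2_sq - w1_sq)

# The three relations of Eq. (6), each written as (left side, right side).
relations = [
    (wp_sq * wpp_sq, w1_sq * w2_sq - c * c),
    (wp_sq + wpp_sq, w1_sq + w2_sq),
    ((w1_sq - wp_sq) * (w1_sq - wpp_sq), -c * c),
]
```

Here the larger root lies a little above $\omega_1^2$ and the smaller a little below $\omega_2^2$, illustrating the spreading apart of the frequencies, and each pair in `relations` agrees to rounding error.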
Having determined the two possible frequencies of vibration
of the system, we next find the amplitudes $A'$ and $B'$ corresponding
to $\omega'$, and $A''$ and $B''$ corresponding to $\omega''$. These
are evidently given by
$$\frac{A'}{B'} = \frac{c}{(-\omega'^2 + \omega_1^2)},$$
$$\frac{A''}{B''} = \frac{c}{(-\omega''^2 + \omega_1^2)}. \qquad (7)$$
That is to say, the ratios of $A$'s to $B$'s are determined, but not
the values themselves. The situation is then the following: we
have one possible solution, $x_1 = A'e^{i\omega't}$,
$x_2 = B'e^{i\omega't}$, where the ratio of the amplitudes of $x_1$
and $x_2$ is fixed, but the magnitudes are otherwise arbitrary. Of
course, there is a similar solution with $-i\omega't$ in the exponent,
so that combining these in the usual way we have an arbitrary phase
and amplitude, or two arbitrary constants. Next we have also the
solutions $x_1 = A''e^{i\omega''t}$, $x_2 = B''e^{i\omega''t}$, of the
same sort. And now, on account of the linear
nature of the equations, we can make linear combinations of
these, obtaining
$$x_1 = A'e^{i\omega't} + A''e^{i\omega''t},$$
$$x_2 = B'e^{i\omega't} + B''e^{i\omega''t}. \qquad (8)$$
That is, each coordinate has two periods in its motion, or is
doubly periodic. Since the amplitudes are to a certain extent
arbitrary, it is possible for only one frequency to be excited at a
time, or for both to go simultaneously.
It is interesting to consider the physical nature of the motions
described by these equations. Let us assume that the two sys-
tems are only loosely coupled together (c is small). Then one
possible mode of vibration has frequency $\omega'$, only slightly
greater than the frequency $\omega_1$ which the first oscillator would
have had without coupling. It is not a vibration of the first
oscillator alone; both are vibrating at the same time. However, if we
examine the coefficients $A'$, $B'$ in this case, we find that $B'$ is
small compared with $A'$, meaning that the amplitude of the
second oscillator is small compared with that of the first. Thus,
using $B'/A' = (\omega_1^2 - \omega'^2)/c$, and
$\omega'^2 = \omega_1^2 + c^2/(\omega_1^2 - \omega_2^2) + \cdots$,
we have approximately $B'/A' = c/(\omega_2^2 - \omega_1^2)$. This is
as if the first oscillator, vibrating with frequency $\omega'$, which
is approximately $\omega_1$, and amplitude $A'$, were forcing the
second oscillator by virtue of the coupling, with a force $cx_1$, or
$cA'e^{i\omega't}$, or approximately $cA'e^{i\omega_1t}$. This would
produce a forced amplitude of
$(cA'e^{i\omega_1t})/(\omega_2^2 - \omega_1^2)$, which is just what we
have found. Similarly the second oscillator can vibrate almost by
itself, with the frequency $\omega''$ which almost equals $\omega_2$,
but it reacts back on the first and produces a small forced amplitude.
It is now in the further approximations to the interaction that the
differences between $\omega_1$ and $\omega'$, $\omega_2$ and
$\omega''$, come in.
We have considered the types of vibrations separately. But
there is no reason why both cannot be simultaneously excited,
so that each particle will be vibrating with both periods at once.
Then the phenomenon of beats can easily come in; for the sum
of two sinusoidal vibrations of different frequencies is equivalent
to a single vibration of varying amplitude, as we see from the
equation
$$\cos\omega't + \cos\omega''t = \left(2\cos\frac{\omega' - \omega''}{2}t\right)\cos\frac{\omega' + \omega''}{2}t,$$
where the first expression, in parentheses, represents an amplitude
oscillating with the slow frequency $(\omega' - \omega'')/2$, and
modulating the latter term, a rapid vibration of frequency
$(\omega' + \omega'')/2$. If $\omega'$ and $\omega''$ are
approximately equal, the effect gets most
marked, the frequency of the beats approaching zero. There is
in this case a pulsation of amplitude and energy from one of the
oscillators to the other. This is often seen in other similar
problems. Thus, if a weight is hung from a spiral spring and is
set vibrating up and down, it will be observed that after a certain
lapse of time the vertical motion will decrease, but there will be a
torsional motion of considerable amplitude. As time goes on,
these two forms of motion will alternately take up large ampli-
tudes. The reason is that there is a coupling between the two
forms of oscillation, and the beat phenomenon we have just
described comes into play.
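The identity behind the beats is easy to confirm by sampling both sides. The two angular frequencies below are arbitrary nearby values picked for illustration, and `envelope_period` is the period of the slowly varying factor in parentheses.

```python
import math

def sum_of_cosines(w_a, w_b, t):
    """Left side: two vibrations of different frequencies added together."""
    return math.cos(w_a * t) + math.cos(w_b * t)

def modulated_form(w_a, w_b, t):
    """Right side: a slow envelope multiplying a rapid vibration."""
    envelope = 2.0 * math.cos(0.5 * (w_a - w_b) * t)
    carrier = math.cos(0.5 * (w_a + w_b) * t)
    return envelope * carrier

w_a, w_b = 10.0, 9.0                         # illustrative nearby frequencies
times = [0.01 * k for k in range(2000)]      # samples covering 0 <= t < 20
worst = max(abs(sum_of_cosines(w_a, w_b, t) - modulated_form(w_a, w_b, t))
            for t in times)
# Period of the envelope factor cos((w_a - w_b) t / 2).
envelope_period = 2.0 * math.pi / (0.5 * (w_a - w_b))
```

As `w_b` is moved closer to `w_a`, `envelope_period` grows without limit, which is the statement that the frequency of the beats approaches zero.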
71. Normal Coordinates. — We have just seen that the general
oscillation of two coupled particles is a sum of two vibrations of
different frequencies. If only one of these vibrations is excited,
both particles oscillate with the same frequency but different
amplitudes. It now proves to be possible to introduce new
coordinates $X$ and $Y$, called normal coordinates, given by linear
combinations of the displacements $x_1$ and $x_2$ of the two
particles, which have the following properties: the generalized force
acting on $X$ is proportional to $X$ alone, independent of $Y$, so
that the equations of motion are separated, and $X$ and $Y$ execute
independent simple harmonic vibrations, of different frequencies.
When one of the coordinates alone is different from zero, the
other remaining equal to zero, just one of the two vibrations is
excited. The existence of such coordinates is made plausible
by the following fact: if one vibration alone is excited, $x_1$ is
proportional say to $\alpha$ times a sinusoidal function of time,
$x_2$ to $\beta$ times the same sinusoidal function. In this case
$\beta x_1 - \alpha x_2$ will be always zero. This linear combination
of $x_1$ and $x_2$ will be
proportional, then, to the normal coordinate associated with the
second type of vibration, which is not excited in the case men-
tioned. By assuming that the second vibration alone is excited,
we can in a similar way infer the form of the first normal coordi-
nate. We proceed in the next paragraph to the general formula-
tion of the normal coordinates.
Suppose we set up quantities $X$, $Y$, defined by the equations
$$x_1 = \alpha'X + \alpha''Y,$$
$$x_2 = \beta'X + \beta''Y, \qquad (9)$$
where $C\alpha' = A'$, $C\beta' = B'$, $D\alpha'' = A''$,
$D\beta'' = B''$, $C$ and $D$ being constants. Since only the ratios
of the $\alpha$'s and $\beta$'s are so far determined, we may demand
that the magnitudes be so fixed in this case that
$\alpha'^2 + \beta'^2 = 1$, $\alpha''^2 + \beta''^2 = 1$. This is
called the condition of normalization, and we shall see its
significance a little later. Our quantities $X$ and $Y$ can now be
treated as generalized coordinates, and we can easily see that the
equations of motion, in terms of them, have the variables separated.
Let us set up the equations of motion in these new variables.
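We have
$$T = \tfrac{1}{2}\left[\left(\frac{dx_1}{dt}\right)^2 + \left(\frac{dx_2}{dt}\right)^2\right]$$
$$= \tfrac{1}{2}\left[\left(\alpha'\frac{dX}{dt} + \alpha''\frac{dY}{dt}\right)^2 + \left(\beta'\frac{dX}{dt} + \beta''\frac{dY}{dt}\right)^2\right]$$
$$= \tfrac{1}{2}\left[(\alpha'^2 + \beta'^2)\left(\frac{dX}{dt}\right)^2 + (\alpha''^2 + \beta''^2)\left(\frac{dY}{dt}\right)^2 + 2(\alpha'\alpha'' + \beta'\beta'')\frac{dX}{dt}\frac{dY}{dt}\right].$$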
Using the relations (6) and (7), the last term can be shown to be
zero. This is called the condition of orthogonality, for reasons
which will later be evident. Using the normalization conditions
mentioned above, we have finally
$$T = \tfrac{1}{2}\left[\left(\frac{dX}{dt}\right)^2 + \left(\frac{dY}{dt}\right)^2\right]. \qquad (10)$$
Next for the potential energy we have, from the original
equations,
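$$V = \tfrac{1}{2}(\omega_1^2 x_1^2 + \omega_2^2 x_2^2 - 2c\,x_1 x_2)$$
$$= \tfrac{1}{2}\{\omega_1^2(\alpha' X + \alpha'' Y)^2 + \omega_2^2(\beta' X + \beta'' Y)^2 - 2c(\alpha' X + \alpha'' Y)(\beta' X + \beta'' Y)\}$$
$$= \tfrac{1}{2}\{(\omega_1^2\alpha'^2 + \omega_2^2\beta'^2 - 2c\,\alpha'\beta')X^2 + (\omega_1^2\alpha''^2 + \omega_2^2\beta''^2 - 2c\,\alpha''\beta'')Y^2$$
$$\quad + 2[\omega_1^2\alpha'\alpha'' + \omega_2^2\beta'\beta'' - c(\alpha'\beta'' + \alpha''\beta')]XY\}.$$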
Here it can be shown by a little manipulation that the first
parenthesis equals w' 2 , the second co" 2 , and the third is zero, so
that
$$V = \tfrac{1}{2}(\omega'^2 X^2 + \omega''^2 Y^2). \qquad (11)$$
In terms of the new variables, the variables are separated, and
Lagrange's equations become simply $d^2X/dt^2 + \omega'^2 X = 0$,
$d^2Y/dt^2 + \omega''^2 Y = 0$, whose solutions are $X = \text{constant} \times e^{i\omega' t}$,
$Y = \text{constant} \times e^{i\omega'' t}$. Thus each of the generalized coordinates
executes a simple harmonic motion, which of course can have
arbitrary amplitude and phase, and our final result, if we set the
first constant equal to $C$, the second to $D$, is
$$x_1 = \alpha' X + \alpha'' Y = \alpha'(Ce^{i\omega' t}) + \alpha''(De^{i\omega'' t}) = A'e^{i\omega' t} + A''e^{i\omega'' t},$$
etc., agreeing with the results already found.
It may be proved in general that for any mechanical problem
in which the potential is a quadratic function of the coordinates,
coordinates of this kind (called normal coordinates) can be set
up, having the property that they have no cross terms between
different coordinates in either the kinetic or the potential energy,
so that the Lagrangian function is a sum of squares of coordinates
and velocities, with constant coefficients, and the variables are
separated in the Lagrangian equations. The general method of
setting up these normal coordinates follows exactly the model
we have found for our simple problem. This is one of the few
sorts of mechanical problems in which a general solution is possi-
ble, for no such theorem holds with other laws of force. The
equations of motion for the normal coordinates are just like those
for harmonic oscillators, so that their solutions are sinusoidal
vibrations. In general, there are then as many fundamental
periods in the motion as there are constants, so that the motion
is multiply periodic.
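
As a numerical illustration of the statements above (a sketch only, not part of the original treatment), the following Python fragment finds the two normal frequencies and the normalized coefficients for the two-particle problem. It assumes unit masses and the potential $V = \tfrac{1}{2}(\omega_1^2 x_1^2 + \omega_2^2 x_2^2 - 2c\,x_1 x_2)$ used in this section, and it obtains the frequencies from the determinantal condition $(\omega_1^2 - \omega^2)(\omega_2^2 - \omega^2) - c^2 = 0$ that follows from the equations of motion. The function name `normal_modes` and the numerical values are chosen here purely for illustration.

```python
import math

def normal_modes(w1_sq, w2_sq, c):
    """Return [(w_sq, alpha, beta), ...] for the two normal modes.

    Assumes unit masses and V = (w1_sq*x1**2 + w2_sq*x2**2 - 2*c*x1*x2)/2
    with c != 0 (for c == 0 the coordinates are already separated).
    """
    mean = 0.5 * (w1_sq + w2_sq)
    root = math.sqrt(0.25 * (w1_sq - w2_sq) ** 2 + c ** 2)
    modes = []
    for w_sq in (mean - root, mean + root):
        # From (w1_sq - w_sq)*alpha - c*beta = 0: alpha : beta = c : (w1_sq - w_sq)
        alpha, beta = c, w1_sq - w_sq
        norm = math.hypot(alpha, beta)  # enforce alpha**2 + beta**2 = 1
        modes.append((w_sq, alpha / norm, beta / norm))
    return modes

# Illustrative values only (not taken from the text).
W1_SQ, W2_SQ, C = 1.0, 2.0, 0.5
(wp_sq, a1, b1), (wpp_sq, a2, b2) = normal_modes(W1_SQ, W2_SQ, C)

# Orthogonality condition: alpha'*alpha'' + beta'*beta'' = 0
orthogonality = a1 * a2 + b1 * b2
# The three parentheses in the expansion of the potential energy
first_paren = W1_SQ * a1**2 + W2_SQ * b1**2 - 2 * C * a1 * b1
second_paren = W1_SQ * a2**2 + W2_SQ * b2**2 - 2 * C * a2 * b2
cross_term = W1_SQ * a1 * a2 + W2_SQ * b1 * b2 - C * (a1 * b2 + a2 * b1)
```

Under these assumptions `first_paren` and `second_paren` reproduce the two squared frequencies, while `orthogonality` and `cross_term` vanish to rounding error, which is the separation of variables asserted in Eqs. (10) and (11).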
The normal coordinates are of particular value when we come
to discuss the action of external forces on the coupled systems.
For suppose there are external forces $F_1$ and $F_2$ acting on the two
particles respectively, in addition to the elastic forces already
considered. Then we can set up the generalized forces acting on
the two normal coordinates, by the method described in Chap.
VIII. If these are $F_X$ and $F_Y$, we have
$$F_X = F_1\frac{\partial x_1}{\partial X} + F_2\frac{\partial x_2}{\partial X} = \alpha' F_1 + \beta' F_2,$$
$$F_Y = F_1\frac{\partial x_1}{\partial Y} + F_2\frac{\partial x_2}{\partial Y} = \alpha'' F_1 + \beta'' F_2.$$
Then the equations of motion are simply
$$\frac{d^2X}{dt^2} + \omega'^2 X = F_X,$$
$$\frac{d^2Y}{dt^2} + \omega''^2 Y = F_Y, \qquad (12)$$
showing that these normal coordinates have the same sort of
equations of motion, under the action of external forces, as single
oscillators. Thus the complete solution will be a sum of a partic-
ular solution of the inhomogeneous equations, consisting of
vibrations of the same nature as the external force, capable,
therefore, of showing resonance phenomena, and of a general
solution of the homogeneous equations, of the sort we have found.
Fig. 18. — Rotation of coordinates. The distances OA and PA are the x
and y coordinates of the point P, and OB and PB are the X and Y coordinates.
Under certain circumstances, a damping force proportional to the
velocity will also be expressed in terms of normal coordinates as
a constant times the time rate of change of the normal coordinate,
but this is not always true. We shall discuss this question in
Chapter XIII.
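
The use of Eq. (12) can be sketched numerically. The fragment below is an illustration under stated assumptions, not the authors' procedure: it takes forces $F_1 = f_1\cos\omega t$, $F_2 = f_2\cos\omega t$, unit masses, no damping, and $c \neq 0$, forms the generalized force on each normal coordinate, treats each as a single undamped oscillator, and transforms back to $x_1$, $x_2$. The name `forced_amplitudes` and the numbers are hypothetical.

```python
import math

def forced_amplitudes(w1_sq, w2_sq, c, f1, f2, w):
    """Steady-state amplitudes (x1, x2) for forces f1*cos(w t), f2*cos(w t).

    Assumes unit masses, no damping, c != 0, and w not equal to a
    normal frequency (otherwise the amplitude is infinite: resonance).
    """
    mean = 0.5 * (w1_sq + w2_sq)
    root = math.sqrt(0.25 * (w1_sq - w2_sq) ** 2 + c ** 2)
    x1 = x2 = 0.0
    for w_mode_sq in (mean - root, mean + root):
        alpha, beta = c, w1_sq - w_mode_sq        # unnormalized coefficients
        norm = math.hypot(alpha, beta)
        alpha, beta = alpha / norm, beta / norm   # normalization condition
        gen_force = alpha * f1 + beta * f2        # generalized force on this coordinate
        amp = gen_force / (w_mode_sq - w ** 2)    # single-oscillator response
        x1 += alpha * amp                         # back to particle displacements
        x2 += beta * amp
    return x1, x2

# Illustrative values only: force applied to the first particle alone.
x1_amp, x2_amp = forced_amplitudes(1.0, 2.0, 0.5, 1.0, 0.0, 0.3)
```

The result can be checked by substitution in the original coupled equations, $(\omega_1^2 - \omega^2)x_1 - c\,x_2 = f_1$ and $(\omega_2^2 - \omega^2)x_2 - c\,x_1 = f_2$; the amplitudes grow without limit as $\omega$ approaches either normal frequency.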
72. Relation of Problem of Coupled Systems to Two-dimen-
sional Oscillator. — Our problem of two coupled one-dimensional
oscillators reminds us strongly of the case of two-dimensional
oscillators encountered in Chap. IX. Here, as there, we have
two coordinates ($x_1$ and $x_2$ here, $x$ and $y$ there), and linear
restoring forces. But the difference is that here the restoring force
acting on each coordinate depends on the values of both. The
corresponding problem in the two-dimensional case would be
that where $F_x = -ax + cy$, $F_y = -by + cx$, where $a = \omega_1^2$,
$b = \omega_2^2$. And obviously the problem can be solved just as we
have treated our case of the coupled oscillators. That is, we
introduce new variables $X$, $Y$, defined by the equations
$x = \alpha' X + \alpha'' Y$, $y = \beta' X + \beta'' Y$, where the $\alpha$'s and $\beta$'s have the values
found above, and in terms of the new variables $X$ and $Y$ we have
separation, and get a solution in which $X$ and $Y$ execute periodic
vibrations of different frequencies.
But now we can get a very simple geometrical interpretation
of our change of variables: it is merely a rotation of coordinates.
To see this, let us first consider what a rotation of coordinates
means analytically. In Fig. 18, we see old coordinates xy, and
new, rotated ones, XY. The xy and XY coordinates of a point
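$P$ are indicated. Now there is a very simple vector way of
writing the coordinates. Let $\mathbf{i}$, $\mathbf{j}$ be the unit vectors along $x$ and
$y$, respectively, and $\mathbf{I}$, $\mathbf{J}$ along $X$, $Y$. Further, let $\mathbf{r}$ be the radius
vector from the origin to point $P$. Then evidently we have
$x = (\mathbf{i}\cdot\mathbf{r})$, $y = (\mathbf{j}\cdot\mathbf{r})$, $X = (\mathbf{I}\cdot\mathbf{r})$, $Y = (\mathbf{J}\cdot\mathbf{r})$. But we can
express $\mathbf{i}$ and $\mathbf{j}$ in terms of $\mathbf{I}$ and $\mathbf{J}$, or vice versa:
$$\mathbf{i} = (\mathbf{i}\cdot\mathbf{I})\mathbf{I} + (\mathbf{i}\cdot\mathbf{J})\mathbf{J},$$
$$\mathbf{j} = (\mathbf{j}\cdot\mathbf{I})\mathbf{I} + (\mathbf{j}\cdot\mathbf{J})\mathbf{J}.$$
Hence we have
$$x = (\mathbf{i}\cdot\mathbf{r}) = (\mathbf{i}\cdot\mathbf{I})(\mathbf{I}\cdot\mathbf{r}) + (\mathbf{i}\cdot\mathbf{J})(\mathbf{J}\cdot\mathbf{r}) = (\mathbf{i}\cdot\mathbf{I})X + (\mathbf{i}\cdot\mathbf{J})Y, \qquad (13)$$
and
$$y = (\mathbf{j}\cdot\mathbf{I})X + (\mathbf{j}\cdot\mathbf{J})Y.$$
These are linear equations of just the sort already found and
agree if
$$(\mathbf{i}\cdot\mathbf{I}) = \alpha', \quad (\mathbf{i}\cdot\mathbf{J}) = \alpha'', \quad (\mathbf{j}\cdot\mathbf{I}) = \beta', \quad (\mathbf{j}\cdot\mathbf{J}) = \beta''. \qquad (14)$$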
We may not assume, however, that any linear transformation of
this sort corresponds to a rotation; the general transformation
would be to a stretched, oblique set of axes. For the new coordi-
nates to be obtained from the old by merely rotating, we must
have two conditions: (1) the vectors $\mathbf{I}$ and $\mathbf{J}$ must be at right
angles, or orthogonal, to each other; (2) $\mathbf{I}$ and $\mathbf{J}$ must be of unit
length, or, as we say, normalized. That is, in vector notation,
$(\mathbf{I}\cdot\mathbf{J}) = 0$, $I^2 = J^2 = 1$. Now we can express these equations
by taking components along the $x$, $y$ axes: since
$\mathbf{I} = (\mathbf{i}\cdot\mathbf{I})\mathbf{i} + (\mathbf{j}\cdot\mathbf{I})\mathbf{j}$ and $\mathbf{J} = (\mathbf{i}\cdot\mathbf{J})\mathbf{i} + (\mathbf{j}\cdot\mathbf{J})\mathbf{j}$, we have
$$\mathbf{I}\cdot\mathbf{J} = 0 = (\mathbf{i}\cdot\mathbf{I})(\mathbf{i}\cdot\mathbf{J}) + (\mathbf{j}\cdot\mathbf{I})(\mathbf{j}\cdot\mathbf{J}) = \alpha'\alpha'' + \beta'\beta'', \qquad (15)$$
or the orthogonality conditions, which we have already seen to be
satisfied, and whose significance we now see. Also
$$I^2 = 1 = (\mathbf{i}\cdot\mathbf{I})^2 + (\mathbf{j}\cdot\mathbf{I})^2 = \alpha'^2 + \beta'^2,$$
$$J^2 = 1 = \alpha''^2 + \beta''^2, \qquad (16)$$
or the normalization conditions, which we satisfied by proper
choice of arbitrary constants. We can, in conclusion, make the
following statement: any linear transformation in which the
transformation coefficients satisfy the orthogonality and
normalization conditions corresponds to a rotation of coordinates.
The advantage of making our rotation is seen when we con-
sider the mechanical problem. In the original problem we have
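force components $F_x = -ax + cy$, $F_y = -by + cx$. We can
find the components of force in the new variables. Evidently
$F_x = (\mathbf{F}\cdot\mathbf{i})$, $F_y = (\mathbf{F}\cdot\mathbf{j})$, and similarly
$$F_X = (\mathbf{F}\cdot\mathbf{I}) = (\mathbf{F}\cdot\mathbf{i})(\mathbf{i}\cdot\mathbf{I}) + (\mathbf{F}\cdot\mathbf{j})(\mathbf{j}\cdot\mathbf{I}) = \alpha' F_x + \beta' F_y$$
$$= \alpha'(-ax + cy) + \beta'(-by + cx)$$
$$= -(\alpha' a - \beta' c)x - (-\alpha' c + \beta' b)y$$
$$= -(\alpha' a - \beta' c)(\alpha' X + \alpha'' Y) - (-\alpha' c + \beta' b)(\beta' X + \beta'' Y)$$
$$= -(\alpha'^2 a - 2\alpha'\beta' c + \beta'^2 b)X - (\alpha'\alpha'' a - \alpha''\beta' c - \alpha'\beta'' c + \beta'\beta'' b)Y.$$
But by results already proved, we easily see that the first
parenthesis equals $\omega'^2$ (or a corresponding expression in terms of $a$ and
$b$), and the second is zero, so that $F_X = -\omega'^2 X$, and similarly $F_Y$
turns out to be $-\omega''^2 Y$. In other words, by this rotation of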
axes, we have got each component of force to depend on displace-
ment in that direction alone. Incidentally, the method of finding
the components of a vector in rotated coordinates which we
have used is of general application.
The object of the rotation becomes even clearer when we
consider the potential energy. This is the quantity whose $x$
derivative is $ax - cy$, and $y$ derivative is $by - cx$. First we
note that $\partial F_x/\partial y = \partial F_y/\partial x = c$, so that the curl of the force is
zero, and the potential exists. Then we easily see that
$V = \tfrac{1}{2}(ax^2 + by^2 - 2cxy)$, or $\tfrac{1}{2}(\omega_1^2 x^2 + \omega_2^2 y^2 - 2cxy)$. An
equipotential, obtained by setting this expression equal to a constant,
is an ellipse with its center at the origin, but with its major and
minor axes inclined at an angle to the $xy$ axes, unless $c = 0$.
But now we have seen that the potential in the new coordinates
has the expression $V = \tfrac{1}{2}(\omega'^2 X^2 + \omega''^2 Y^2)$. If this is equal to a
constant, the result is the equation of an ellipse whose principal
axes are along the X and Y axes. In other words, our whole
change of variables has been a rotation of the coordinate axes to
point along the principal axes of the elliptical equipotentials.
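
The rotation can be made concrete with a short computation. The sketch below is an illustration, not taken from the text: it writes the rotation as $x = X\cos\theta - Y\sin\theta$, $y = X\sin\theta + Y\cos\theta$ (so that, in the notation of Eq. (14), $\alpha' = \cos\theta$, $\beta' = \sin\theta$, $\alpha'' = -\sin\theta$, $\beta'' = \cos\theta$), and chooses $\theta$ from $\tan 2\theta = 2c/(b - a)$, the condition that the $XY$ term of $ax^2 + by^2 - 2cxy$ vanish. The function name `principal_axes` and the numerical values are assumptions made for the example.

```python
import math

def principal_axes(a, b, c):
    """Rotate axes so that a*x**2 + b*y**2 - 2*c*x*y has no XY term.

    Uses x = X cos(theta) - Y sin(theta), y = X sin(theta) + Y cos(theta).
    Returns (theta, coeff_XX, coeff_YY, coeff_XY) of the rotated form.
    """
    theta = 0.5 * math.atan2(2.0 * c, b - a)   # tan(2 theta) = 2c / (b - a)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    coeff_xx = a * cos_t**2 + b * sin_t**2 - 2.0 * c * sin_t * cos_t
    coeff_yy = a * sin_t**2 + b * cos_t**2 + 2.0 * c * sin_t * cos_t
    coeff_xy = (b - a) * math.sin(2.0 * theta) - 2.0 * c * math.cos(2.0 * theta)
    return theta, coeff_xx, coeff_yy, coeff_xy

# Illustrative values only: a = omega_1^2, b = omega_2^2, and coupling c.
A, B, C = 1.0, 2.0, 0.5
theta, w_prime_sq, w_dprime_sq, cross = principal_axes(A, B, C)
```

With these values the cross coefficient vanishes to rounding error, and the two remaining coefficients play the part of $\omega'^2$ and $\omega''^2$; their sum equals $a + b$ and their product equals $ab - c^2$, as must be the case for a pure rotation.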
The process of rotating axes to coincide with the principal axes
of an ellipse or ellipsoid is a common thing in mathematical
physics. We have already seen one example in the last chapter,
where we had the ellipsoid of inertia, and used the principal axes
as coordinates. Other illustrations come from the theory of
elasticity, where there is an ellipsoid of stress at each point, and
we often use the principal axes of stress as coordinates. Again,
in wave mechanics, examples of the same sort of process are
constantly found.
73. The General Problem of the Motion of Several Particles. —
The present problem is the first one we have met in which there
are several particles interacting with each other, and it has illus-
trated one of the useful methods of attack on such a problem.
This is to take all the coordinates, whether they refer to one or
another particle, and imagine them all plotted in a many-
dimensional space, like the phase space which we discussed in
connection with the Hamiltonian method, but with only enough
dimensions to take care of coordinates, not of momenta. Such
a space is often called a configuration space. Then the motion
of the system is given by the motion of a point in configuration
space. If there is a potential, it is a function of position in
configuration space. We can then apply many of the same ideas
to the motion of the point in many-dimensional space that we
would to the motion of a single particle in three-dimensional
space. Thus there will be parts of configuration space where
$E - V$ is positive; there the point can go, but it cannot enter the
regions where $E - V$ is negative. In some cases, changes of
variables in configuration space can simplify the problem
enough so that we can separate variables, or at least go far
toward a solution. The present chapter has supplied one
instance. Another is found in the problem of two particles, as
the earth and sun, exerting forces on each other but not being
acted on by outside bodies. There we can introduce new coordi-
nates: first, the three coordinates of the center of gravity of the
system; second, the coordinates of one particle relative to the
other. And in terms of these new coordinates, the three coordi-
nates of the center of gravity become separated from the others,
resulting in a uniform motion of the center of gravity in a straight
line, and the relative motion reduces to a problem mathematically
equivalent to the motion of a single particle in three-dimensional
space. The changes of variables used in these cases generally
have the property, which we have noted in the present case, of
mixing up the coordinates of two or more particles in a single
generalized coordinate.
Problems
1. Two balls, each of mass m, and three weightless springs, one of length
2d, the others of length d, are connected together in the arrangement spring
d — ball — spring 2d — ball — spring d, and the whole thing is stretched in a
straight line between two points, with a given tension in the springs. Grav-
ity is neglected. Investigate the small vibrations of the balls at right angles
to the straight line, assuming motion only in one plane. Show in general
that there are two modes of vibration, one having the lower frequency, in
which both balls oscillate to the same side at one time, then the other, and
the second mode, with higher frequency, where they oscillate to opposite
sides. (Hint: if the first is displaced $x_1$, and the second $x_2$, and if these
displacements are so small that the tension $t$ is unchanged, then there will
be two forces acting on the first ball: a force $t$ toward the point of support,
making an angle whose tangent is $x_1/d$, and another directed toward the
second ball, at an angle whose tangent is $(x_2 - x_1)/2d$. The component at
right angles to the straight line, and thus producing the motion, is then
$-x_1(t/d) + (x_2 - x_1)(t/2d)$. Similarly the force on the second is
$-x_2(t/d) + (x_1 - x_2)(t/2d)$.)
2. Assume two resistanceless circuits, one with $L_1$, $C_1$, the other with
$L_2$, $C_2$, coupled together by having a mutual inductance $M$ between the two
inductances (that is, back e.m.f. of self- and mutual inductance is
$-L_1\,di_1/dt - M\,di_2/dt$ in the first circuit, and $-L_2\,di_2/dt - M\,di_1/dt$ in the second
circuit, where $i_1$, $i_2$ are the currents in the circuits). Find the frequencies
of the natural oscillations of the coupled system.
3. In Prob. 2, assume that the circuits have small resistances $R_1$ and $R_2$,
respectively, so small that the logarithmic decrements of the separate
circuits are small. Discuss the damped oscillations, showing that the solution
can be carried out if squares of resistances are small enough to be neglected,
but that it leads to a biquadratic equation for the frequency for large $R$.
(Hint: write the frequency as the sum of a real and an imaginary part.)
4. Two identical pendulums hang from a support which is slightly yield-
ing, so that they can interchange energy. Assume that coupling is linear.
Now suppose one pendulum is set into motion, the other being at rest.
Show that gradually the first pendulum will come to rest, the second taking
up the motion, and that there is a periodic pulsation of the energy from one
pendulum to the other. Show that the frequency of this pulsation gets
smaller as the coupling becomes smaller, until with an infinitely rigid
support the energy remains always in the first pendulum (this is all without
damping forces).
5. One simple pendulum is hung from another; that is, the string of the
lower pendulum is tied to the bob of the upper one. Discuss the small
oscillations of the resulting system, assuming arbitrary lengths and masses.
Use the angles which each string makes with the vertical as generalized
coordinates. In the special case of equal masses and equal lengths of
strings, show that the frequencies of the motion are given by $\sqrt{g(2 \pm \sqrt{2})/l}$.
6. Show that if the mass of the upper pendulum becomes very great
compared with the lower one, the solution of Prob. 5 approaches that of
Prob. 8, Chap. IV. Show in the other limiting case, where the upper mass
is small compared with the lower one, that the motion consists approxi-
mately of an oscillation of the large mass with a period derived from the
combined length of both pendulums, and a more rapid oscillation of the
small mass back and forth with respect to the line connecting point of
support and large mass.
7. Given an ellipse $ax^2 + bxy + cy^2 = d$, perform a rotation of axes so
that the new coordinates will lie along the major and minor axes of the
ellipse. From this rotation, find the angle between the major axis and the
$x$ axis, in terms of the coefficients $a$, $b$, $c$, $d$. It is simplest to write
the transformation directly in terms of the angle $\theta$: $x' = x\cos\theta + y\sin\theta$, etc.
8. Show that if the equations
$$x' = a_{11}x + a_{12}y + a_{13}z,$$
$$y' = a_{21}x + a_{22}y + a_{23}z,$$
$$z' = a_{31}x + a_{32}y + a_{33}z$$
represent a rotation of coordinates, the $a$'s satisfy orthogonality and
normalization relations, both of the form $a_{11}a_{12} + a_{21}a_{22} + a_{31}a_{32} = 0$,
$a_{11}^2 + a_{21}^2 + a_{31}^2 = 1$, and of the form $a_{11}a_{21} + a_{12}a_{22} + a_{13}a_{23} = 0$,
$a_{11}^2 + a_{12}^2 + a_{13}^2 = 1$.
9. In the rotation of coordinates above, show that the inverse
transformation is given by
$$x = a_{11}x' + a_{21}y' + a_{31}z',$$
$$y = a_{12}x' + a_{22}y' + a_{32}z',$$
$$z = a_{13}x' + a_{23}y' + a_{33}z'.$$
Prove that the determinant of the $a$'s is equal to unity.
10. Find the components of an arbitrary vector in the rotated set of
coordinates given in Prob. 8. Show that the components of grad V, where
$V$ is a scalar, in the rotated axes, are $\partial V/\partial x'$, $\partial V/\partial y'$, $\partial V/\partial z'$; that is, that
the gradient is invariant under a rotation of axes (has the same form in the
new axes as in the old).
11. Prove that the divergence, curl, and Laplacian are invariant under a
rotation.
12. Set up a method for getting the direction cosines of the principal
axes of inertia of a body, and the values of the principal moments of inertia,
if the moments and products of inertia are known in a particular coordinate
system.
CHAPTER XII
THE VIBRATING STRING, AND FOURIER SERIES
In this chapter we turn to the discussion of the motion of a
continuous medium. There are examples of such motion in one,
two, or three dimensions; as a vibrating string in one dimension,
a membrane in two, and an elastic solid, or gas, in three dimen-
sions. We first consider the motion of a one-dimensional body,
or string. Suppose we have a string of length $L$, mass $\mu$ per unit
length (constant), with a tension $T$, set into transverse vibrations.
From our elementary work, we know that an infinite number of
modes of vibration, or overtones, are possible. For the $n$th
overtone, if it is present alone, the shape of the string at any time
is given by $\sin(n\pi x/L)$, where $x$ is the coordinate of a point on the
string measured from one end, and the function is proportional
to the displacement transverse to the string. The frequency of
this overtone is $\omega_n/2\pi$, where $\omega_n = (n\pi/L)\sqrt{T/\mu}$. Thus if $A_n$
is the complex amplitude of this overtone, and $u$ is the
displacement of the point $x$, we have
$$u = \text{real part of } \sum_{n=1}^{\infty} A_n e^{i\omega_n t}\sin\frac{n\pi x}{L},$$
where we sum over all the possible overtones. Our first task is
to derive these results from fundamental principles.
74. Differential Equation of the Vibrating String. — Assume
that at a given time the string is displaced so that its shape is
given by $u(x)$. We consider how this curve will change with
time, and consider transverse displacements so small that the
tension $T$ may be considered constant throughout the string.
Take a short element of the string of length $dx$ and mass $\mu\,dx$.
Its acceleration is $\partial^2 u/\partial t^2$ ($x$ kept constant), so that its mass
times its acceleration is $\mu\,dx\,\partial^2 u/\partial t^2$. This must be equal to the
force acting on this element which arises from the tensions.
These tensions (which we take equal to each other in magnitude)
would cancel each other exactly if the string were straight, but
when it is curved, they each give rise to components approximately
perpendicular to the string which vary with the curvature
of the string (see Fig. 19). At any point $x$, this component is
approximately $T\,\partial u/\partial x$, and we work only to the approximation
to which this is true.

Fig. 19. — Tensions on an element of string. Vertical component at $x$ is
$-T\sin\theta$. If we approximate $\sin\theta$ by $\tan\theta$, this is $-T\,\partial u/\partial x$. Similarly at
$x + dx$, the component is $+T\,\partial u/\partial x$, but now computed at $x + dx$.

Thus the total force on the element of string is
$$T\left(\frac{\partial u}{\partial x}\right)_{x+dx} - T\left(\frac{\partial u}{\partial x}\right)_{x} = T\frac{\partial^2 u}{\partial x^2}\,dx,$$
if we expand the first term in a Taylor's series and retain only
the first two terms of the expansion. Thus our equation of
motion is
$$\mu\,dx\,\frac{\partial^2 u}{\partial t^2} = T\frac{\partial^2 u}{\partial x^2}\,dx,$$
or
$$\frac{\partial^2 u}{\partial t^2} = \frac{T}{\mu}\frac{\partial^2 u}{\partial x^2}. \qquad (1)$$
This is a partial differential equation, since it contains partial
derivatives. This appearance of partial derivatives is charac-
teristic of all equations of motion of continuous media. Since
the equation is linear, with constant coefficients, let us try to
solve it by the exponential method, assuming u = e i(ut+kx \ as
would be suggested by the solution in terms of overtones. The
equation of motion leads immediately to —/j,io 2 /T = —k 2 ,
determining « in terms of k. Combining two exponential solu-
122 INTRODUCTION TO THEORETICAL PHYSICS
tions, allowable since the equation is linear and homogeneous,
we have
u = Ae iut sin kx,
or
u = Be 1 " 1 cos kx. (2)
Now we must introduce the boundary conditions, which tell us
that the string is held fixed at both ends, so that $u = 0$ when
$x = 0$, and when $x = L$. From the first of these conditions
$B = 0$, and we take only the sine function. From the second,
we must have $\sin kL = 0$, or $kL = n\pi$,
$$k = \frac{n\pi}{L},$$
where $n = 1, 2, 3, \ldots$
Hence the solution is
$$u = A_n e^{i\omega_n t}\sin\frac{n\pi x}{L}, \qquad (3)$$
where
$$\omega_n = \frac{n\pi}{L}\sqrt{\frac{T}{\mu}}.$$
Superposing the solutions of all the different n's, as we may from
the nature of the differential equation, we obtain the solution
mentioned at the beginning of the chapter.
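
The result (3) may be checked numerically. The following sketch is an illustration only: it evaluates $\omega_n$ and tests, by central differences, that the real standing wave $\cos\omega_n t\,\sin(n\pi x/L)$ satisfies $\mu\,\partial^2 u/\partial t^2 = T\,\partial^2 u/\partial x^2$ and vanishes at both ends. The function names, the step `h`, and the numerical values of $L$, $T$, $\mu$ are assumptions made for the example.

```python
import math

def overtone_angular_frequency(n, length, tension, mu):
    """omega_n = (n*pi/L) * sqrt(T/mu) for the n-th overtone."""
    return (n * math.pi / length) * math.sqrt(tension / mu)

def standing_wave(n, x, t, length, tension, mu):
    """Real part of Eq. (3) with A_n = 1: cos(omega_n t) * sin(n pi x / L)."""
    w_n = overtone_angular_frequency(n, length, tension, mu)
    return math.cos(w_n * t) * math.sin(n * math.pi * x / length)

def wave_equation_residual(n, x, t, length, tension, mu, h=1e-4):
    """Central-difference estimate of mu*u_tt - T*u_xx (should be near zero)."""
    def u(xx, tt):
        return standing_wave(n, xx, tt, length, tension, mu)
    u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h**2
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h**2
    return mu * u_tt - tension * u_xx

# Illustrative values only (not from the text): L = 1, T = 4, mu = 1.
residual = wave_equation_residual(3, 0.3, 0.1, 1.0, 4.0, 1.0)
```

The residual is not exactly zero, since finite differences only approximate the derivatives, but it is small compared with the separate terms $\mu\,\partial^2 u/\partial t^2$ and $T\,\partial^2 u/\partial x^2$.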
Our differential equation is a linear homogeneous partial
differential equation of second order. As such, any linear com-
bination of solutions is itself a solution. But now we have, not a
small number of arbitrary constants, but the doubly infinite set
$A_n$, $n = 1, 2, \ldots$ ($A_n$ is complex). This is characteristic of
all partial differential equations. Sometimes instead of having
an infinite set of arbitrary constants, we have an arbitrary
function. In our case the $A$'s are determined by giving the amplitude
and phase of each overtone. These must be determined from
the initial conditions; that is, from the values of $u(x)$ and $\dot{u}(x)$
at $t = 0$. The essential point is that our partial differential
equation is equivalent to an infinite number of ordinary differ-
ential equations, so that we need an infinite number of constants.
75. The Initial Conditions for the String. — Suppose we wish
to satisfy initial conditions of the following sort for a vibrating
string: at $t = 0$, the displacement and velocity are given
functions of $x$. That is, if the displacement is $u(x, t)$, then $u(x, 0) = f(x)$,
$\frac{\partial u}{\partial t}(x, 0) = F(x)$, where $f(x)$, $F(x)$ are arbitrary functions.
Now we may write
$$u(x, t) = \sum_n (C_n\cos\omega_n t + D_n\sin\omega_n t)\sin\frac{n\pi x}{L}, \qquad (4)$$
using the real form of the function of time, and
$$\dot{u}(x, t) = \sum_n (-C_n\omega_n\sin\omega_n t + D_n\omega_n\cos\omega_n t)\sin\frac{n\pi x}{L}.$$
Thus we must have
$$u(x, 0) = f(x) = \sum_n C_n\sin\frac{n\pi x}{L},$$
$$\dot{u}(x, 0) = F(x) = \sum_n D_n\omega_n\sin\frac{n\pi x}{L}. \qquad (5)$$
To satisfy either of these conditions, we must be able to expand
our arbitrary function in series of sines, and to find the coeffi-
cients C n or D n of these expansions. Having found the coeffi-
cients, we can at once set up the series for u(x, t). This is a special
case of Fourier expansion, and we now proceed to consider the
general problem of Fourier series, a question of general interest
apart from the application to a string.
76. Fourier Series. — We shall state Fourier's theorem.
Given an arbitrary function $\phi(x)$. Then [unless $\phi(x)$ contains
an infinite number of discontinuities in a finite range, or similarly
misbehaves itself], we can write
$$\phi(x) = \frac{A_0}{2} + \sum_{n=1}^{\infty}\left(A_n\cos\frac{2n\pi x}{X} + B_n\sin\frac{2n\pi x}{X}\right),$$
where
$$A_n = \frac{2}{X}\int_{-X/2}^{X/2}\phi(x)\cos\frac{2n\pi x}{X}\,dx, \qquad B_n = \frac{2}{X}\int_{-X/2}^{X/2}\phi(x)\sin\frac{2n\pi x}{X}\,dx. \qquad (6)$$
This equation holds for values of $x$ between $-X/2$ and $X/2$, but
not in general outside this range. The series of sines and cosines
is called Fourier's series. Obviously a special case of it could be
used in our problem of the string, the case where only the coeffi-
cients of the sine terms were different from zero.
There are two sides to the proof of Fourier's theorem. First,
we may prove that, if a series of sines and cosines of this sort can
represent the function, then it must have the coefficients we have
given. That is simple, and we shall carry it through. But,
second, we could show that the series we so set up actually
represents the function. That is, we should investigate the
convergence of the series, show that it does converge and that its
sum is the function $\phi(x)$. This second part we shall omit, merely
stating the results of the discussion.
77. Coefficients of Fourier Series. — Let us suppose that
$\phi(x)$ is given by the series above, and ask what values of $A$'s and
$B$'s we must have if the equation is to be true. Multiply both
sides of the equation by $\cos(2m\pi x/X)$, where $m$ is an integer, and
integrate from $-X/2$ to $X/2$. We have then
$$\int_{-X/2}^{X/2}\phi(x)\cos\frac{2m\pi x}{X}\,dx = \int_{-X/2}^{X/2}\frac{A_0}{2}\cos\frac{2m\pi x}{X}\,dx$$
$$\quad + \sum_n\int_{-X/2}^{X/2}\left(A_n\cos\frac{2n\pi x}{X}\cos\frac{2m\pi x}{X} + B_n\sin\frac{2n\pi x}{X}\cos\frac{2m\pi x}{X}\right)dx.$$
But now we shall show in the next paragraph that
$$\int_{-X/2}^{X/2}\cos\frac{2n\pi x}{X}\cos\frac{2m\pi x}{X}\,dx = 0, \qquad (7)$$
if $n$ and $m$ are integers, unless $n = m$, and that
$$\int_{-X/2}^{X/2}\sin\frac{2n\pi x}{X}\cos\frac{2m\pi x}{X}\,dx = 0,$$
if $n$ and $m$ are integers. Thus all terms on the right are zero but
one, for which $n = m$. The first term falls in with the rule,
when we remember that $\cos 0 = 1$. This one term then gives us
$$A_m\int_{-X/2}^{X/2}\cos^2\frac{2m\pi x}{X}\,dx = A_m\frac{X}{2},$$
as we can readily show. Hence
$$A_m = \frac{2}{X}\int_{-X/2}^{X/2}\phi(x)\cos\frac{2m\pi x}{X}\,dx.$$
In a similar way, multiplying by $\sin(2\pi mx/X)$, we can prove the
formula for $B_m$.
In our derivation of coefficients, we have used the following
results: $\int_{-X/2}^{X/2}\cos\frac{2n\pi x}{X}\cos\frac{2m\pi x}{X}\,dx = 0$, if $n$, $m$ are different
integers, and similar relations with sines. We can prove these very
easily from trigonometry. Thus
$$\cos a\cos b = \frac{\cos(a + b) + \cos(a - b)}{2},$$
so that our quantity is the integral of this, or
$$\frac{X}{2}\left[\frac{\sin\dfrac{2\pi(n + m)x}{X}}{2\pi(n + m)} + \frac{\sin\dfrac{2\pi(n - m)x}{X}}{2\pi(n - m)}\right]_{-X/2}^{X/2}.$$
But the quantity in brackets is zero at both limits, if $n$, $m$ are
integers, and the result is zero. Such proofs hold in the other
cases. The exception, of course, is the case $n = m$, in which the
integrand is $\tfrac{1}{2}[\cos(4\pi mx/X) + 1]$, so that, while the first term gives
no contribution to the result, the second gives $\tfrac{1}{2}\int_{-X/2}^{X/2}dx = \frac{X}{2}$.
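
These integrals are easily verified by numerical quadrature. The sketch below is an illustration, not part of the proof: it uses the midpoint rule over one full period, with $X = 2$ chosen arbitrarily, and the helper names `integrate`, `cos_cos`, and `sin_cos` are introduced only for this example.

```python
import math

def integrate(f, a, b, steps=2000):
    """Midpoint-rule approximation to the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

def cos_cos(n, m, X=2.0):
    """Integral of cos(2 n pi x / X) * cos(2 m pi x / X) over (-X/2, X/2)."""
    return integrate(
        lambda x: math.cos(2 * math.pi * n * x / X) * math.cos(2 * math.pi * m * x / X),
        -X / 2, X / 2)

def sin_cos(n, m, X=2.0):
    """Integral of sin(2 n pi x / X) * cos(2 m pi x / X) over (-X/2, X/2)."""
    return integrate(
        lambda x: math.sin(2 * math.pi * n * x / X) * math.cos(2 * math.pi * m * x / X),
        -X / 2, X / 2)

different = cos_cos(2, 3)   # n != m: expected to vanish
same = cos_cos(3, 3)        # n == m: expected to equal X/2
mixed = sin_cos(3, 3)       # sine times cosine: expected to vanish
```

For different integers the computed integral vanishes to rounding error, while for $n = m$ the cosine integral comes out equal to $X/2$, in agreement with the result just derived.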
78. Convergence of Fourier Series. — In this section we shall
merely quote results. In the first place, the series cannot in
general represent the function, except in the region between
$-X/2$ and $X/2$. For the series is periodic, repeating itself in
every period $X$, while the function in general is not. Only
periodic functions of this period can be represented in all their
range by Fourier series. If we try to represent a nonperiodic
function, the representation will be correct within the range from
$-X/2$ to $X/2$, but the same thing will automatically repeat
outside the range. Incidentally, we can easily change the range
in which the function is correct. If we merely change the range
of integration so as to be from $x_0$ to $x_0 + X$, where $x_0$ is arbitrary,
the series will represent the function within this range. The
case we have used above corresponds to $x_0 = -X/2$; another
choice frequently made is $x_0 = 0$. Then again, if we change the
value of X, we can change the length of the range in which the
series is correct. To represent a function through a large range
of x, we may use a large value of X.
Although the range within which a Fourier series converges
to the value of the function it is supposed to represent is limited,
as we have seen, there is a compensation, in that within this range
a Fourier series can be used to represent much worse curves than
a power series. Thus the convergence of the series is not impaired
if the function has a finite number of discontinuities. It can
consist, for example, of one function in one part of the region,
another in another (in this case, to carry out the integrations,
we must break up the integral into separate integrals over these
parts, and add them). The less serious the discontinuities,
however, the better the convergence. Thus if the function itself
has discontinuities, the coefficients will go off as 1/n, while if
only the first derivative has discontinuities the coefficients go off
as $1/n^2$, and so on. Differentiating a function makes the
convergence of a series worse, as we can see, for example, if a function
is continuous but its first derivative discontinuous. Then the
coefficients go off as $1/n^2$, but if we differentiate, the coefficients
of the resulting series will go off as $1/n$. There is an interesting
point connected with the series for a discontinuous function.
If the function jumps from one value $u_1$ to another $u_2$ at a given
value of $x$, then the series at this point converges to the mean
value, $(u_1 + u_2)/2$.
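
The two rates of decrease can be seen in a simple numerical experiment. The sketch below is an illustration under stated assumptions: it evaluates the coefficients of Eq. (6) by the midpoint rule for two functions chosen here as examples, a square wave (discontinuous at $x = 0$) and the function $|x|$ (continuous, with a discontinuous first derivative), taking $X = 2$. The name `fourier_coefficient` and the choice of functions are not from the text.

```python
import math

def fourier_coefficient(phi, n, X, kind, steps=20000):
    """A_n (kind='cos') or B_n (kind='sin') of Eq. (6), by the midpoint rule."""
    trig = math.cos if kind == "cos" else math.sin
    h = X / steps
    total = 0.0
    for k in range(steps):
        x = -X / 2 + (k + 0.5) * h
        total += phi(x) * trig(2 * math.pi * n * x / X)
    return 2.0 / X * total * h

def square(x):
    """Discontinuous example: -1 for x < 0, +1 for x > 0."""
    return 1.0 if x > 0 else -1.0

def triangle(x):
    """Continuous example whose first derivative jumps at x = 0."""
    return abs(x)

X = 2.0
odd_n = (1, 3, 5, 7, 9)
# If B_n falls off as 1/n, then n * B_n should be constant.
square_scaled = [n * fourier_coefficient(square, n, X, "sin") for n in odd_n]
# If A_n falls off as 1/n^2, then n^2 * A_n should be constant.
triangle_scaled = [n**2 * fourier_coefficient(triangle, n, X, "cos") for n in odd_n]
```

For the odd overtones the products $nB_n$ of the square wave are all close to $4/\pi$, and the products $n^2 A_n$ of $|x|$ are all close to $-4/\pi^2$, showing the $1/n$ and $1/n^2$ behavior respectively. At the jump $x = 0$ every sine term of the square-wave series vanishes, so that the series gives 0, the mean of $-1$ and $+1$.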
79. Sine and Cosine Series, with Application to the String. —
In the special problem of the vibrating string, the series we
require is somewhat different from the general case, in that there
are only sines, and not cosines. We are therefore led to
investigate series of sines only, or of cosines only. Suppose we take
the series $\frac{A_0}{2} + \sum_{n=1}^{\infty} A_n\cos\frac{2n\pi x}{X}$, the series formed by taking
the cosine part of the general Fourier series. Now each one
of the terms is even in $x$; that is, if we interchange $x$ with $-x$,
the function is not changed. A cosine series represents
therefore an even function. Similarly the sine series $\sum_n B_n\sin\frac{2n\pi x}{X}$,
of which each term is odd, represents an odd function (one for
which, if $x$ is interchanged with $-x$, the function changes its
sign but not its magnitude). It is well known that any function
$\phi(x)$ can be written as the sum of an even and an odd function:
$\phi(x) = \tfrac{1}{2}[\phi(x) + \phi(-x)] + \tfrac{1}{2}[\phi(x) - \phi(-x)]$, of which the first
term is even, the second odd. Thus the cosine part of a Fourier
series represents the even part of the function, the sine series
the odd part. As a corollary, any even function can be
represented by a cosine series alone, an odd function by a sine series.
Now suppose we are really interested in a function only
between 0 and $X/2$, and that we do not care what the series does
outside that region. Then we may define an even function
$\phi_e(x)$ as follows: it equals the given function $\phi(x)$ between 0 and
$X/2$, but has just the same value for $-x$ that it has for $x$ (see
Fig. 20). Outside the range from $-X/2$ to $X/2$, it repeats itself.
The Fourier representation of $\phi_e$ will be a cosine series, but will
represent our given function $\phi$ correctly between 0 and $X/2$.
Evidently it is the series $\frac{A_0}{2} + \sum_{n=1}^{\infty} A_n\cos\frac{2n\pi x}{X}$, where we write
the coefficients as the sum of two integrals,
$$A_n = \frac{2}{X}\int_{-X/2}^{X/2}\phi_e(x)\cos\frac{2n\pi x}{X}\,dx$$
$$= \frac{2}{X}\left[\int_{-X/2}^{0}\phi(-x)\cos\frac{2n\pi x}{X}\,dx + \int_{0}^{X/2}\phi(x)\cos\frac{2n\pi x}{X}\,dx\right]$$
$$= \frac{4}{X}\int_{0}^{X/2}\phi(x)\cos\frac{2n\pi x}{X}\,dx.$$

Fig. 20. — A function, with even and odd periodic functions made from it.
The even and odd functions, $\phi_e(x)$ and $\phi_o(x)$, agree with the original function
$\phi(x)$ between 0 and $X/2$. Between 0 and $-X/2$, $\phi_e(x)$ is the mirror image of
$\phi(x)$, while $\phi_o(x)$ has the opposite sign. Outside the region from $-X/2$ to $X/2$,
both functions repeat periodically with period $X$.
Similarly we may define an odd function $\phi_o(x)$, which equals $\phi(x)$
between 0 and X/2, but at $-x$ has the negative of its value at $+x$.
This function is represented by a sine series

$$\sum_{n=1}^{\infty} B_n \sin\frac{2\pi n x}{X},$$

where we readily see that

$$B_n = \frac{4}{X}\int_{0}^{X/2} \phi(x)\sin\frac{2\pi n x}{X}\,dx.$$

Hence, between 0 and X/2, the same function can be represented
by either a cosine or a sine series. But outside this range, the
series represent quite different functions.
Our sine series can now be applied to the string problem.
We are interested in the string between 0 and L. Let us then
set L = X/2. The expression then becomes

$$\phi(x) = \sum_{n=1}^{\infty} B_n \sin\frac{n\pi x}{L}, \tag{8}$$

where

$$B_n = \frac{2}{L}\int_{0}^{L} \phi(x)\sin\frac{n\pi x}{L}\,dx.$$

This can be used first to find the coefficients $C_n$, from Eq. (5),
substituting $u(x, 0)$ for $\phi(x)$, and next to find the quantities
$D_n\omega_n$, substituting $\dot u(x, 0)$ for $\phi(x)$, $D_n\omega_n$ for $B_n$, and obtaining
$D_n$ by dividing through by $\omega_n$. These formulas then suffice to
find the constants $C_n$ and $D_n$ of the motion of the string, knowing
the initial displacement and velocity of every point of it.
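
As a rough numerical illustration of Eq. (8), the sketch below evaluates the coefficients $B_n$ for a string pulled aside at its midpoint and sums the sine series again. The function names, the triangular test shape, and the midpoint-rule quadrature are all assumptions made for this example and do not come from the text.

```python
import math

def sine_coefficients(phi, L, n_max, steps=2000):
    """Return [B_1, ..., B_n_max], B_n = (2/L) * integral_0^L phi(x) sin(n pi x/L) dx.

    The integral is approximated by the midpoint rule (an assumption of this
    sketch; any quadrature rule would serve).
    """
    h = L / steps
    coeffs = []
    for n in range(1, n_max + 1):
        total = 0.0
        for i in range(steps):
            x = (i + 0.5) * h
            total += phi(x) * math.sin(n * math.pi * x / L)
        coeffs.append(2.0 / L * total * h)
    return coeffs

def sine_series(coeffs, L, x):
    """Partial sum of Eq. (8) at the point x."""
    return sum(b * math.sin(n * math.pi * x / L)
               for n, b in enumerate(coeffs, start=1))

L = 1.0

def triangle(x):
    """Hypothetical initial shape: straight lines rising to height L/2 at the midpoint."""
    return x if x <= L / 2 else L - x

B = sine_coefficients(triangle, L, 25)
print("B_1 =", B[0], " B_2 =", B[1])
print("series at x = 0.25:", sine_series(B, L, 0.25), " shape:", triangle(0.25))
```

Because this shape is symmetric about the midpoint, the even-numbered coefficients come out as zero to within rounding, and the partial sum tracks the original shape away from the corner.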
80. The String as a Limiting Problem of Vibration of Particles.
An excellent insight into the problem of the vibration of a string
is obtained by regarding it as a limiting case of mechanical
systems with a finite number of particles, having therefore a finite
set of arbitrary constants in the solution. This is the method
followed by Lagrange. Suppose we have N equal masses m
at the points $x = d/2,\ 3d/2,\ \ldots,\ (N - \tfrac{1}{2})d$, separated by
massless springs, the whole being stretched with a tension T
between supports at $x = 0$ and $x = Nd = L$. This forms an
approximation to the continuous string, if $\mu = m/d$, the mass
per unit length. We again investigate the transverse vibrations,
letting the displacement of the $i$th particle be $u_i$. The problem
is similar to Problem 1, Chap. XI. The force on the $i$th particle
is

$$F_i = -(u_i - u_{i-1})\frac{T}{d} - (u_i - u_{i+1})\frac{T}{d},$$

except for the first particle, where we have

$$F_1 = -u_1\frac{2T}{d} - (u_1 - u_2)\frac{T}{d},$$

and for the $N$th,

$$F_N = -(u_N - u_{N-1})\frac{T}{d} - u_N\frac{2T}{d}.$$

Then, assuming a solution $u_j = C_j e^{i\omega t}$, we have the N equations
of motion in the form

$$\begin{aligned}
\left(-m\omega^2 + \frac{3T}{d}\right)C_1 - \frac{T}{d}C_2 &= 0 \\
-\frac{T}{d}C_1 + \left(-m\omega^2 + \frac{2T}{d}\right)C_2 - \frac{T}{d}C_3 &= 0 \\
-\frac{T}{d}C_2 + \left(-m\omega^2 + \frac{2T}{d}\right)C_3 - \frac{T}{d}C_4 &= 0 \\
\cdots \\
-\frac{T}{d}C_{N-1} + \left(-m\omega^2 + \frac{3T}{d}\right)C_N &= 0.
\end{aligned} \tag{9}$$

Such a set of equations, all alike in form (except here for the first
and last), are called difference equations. As in the last chapter,
these have a solution only for certain values of $\omega$, given by setting
the determinant of coefficients equal to zero. The determinant
is now too complicated to handle simply, wherefore we adopt
another method of procedure. Suppose we let $C_j = e^{ikj}$, where
k is to be determined from the equations above. All the
equations except the first and last take the form

$$-\frac{T}{d}e^{ik(j-1)} + \left(-m\omega^2 + \frac{2T}{d}\right)e^{ikj} - \frac{T}{d}e^{ik(j+1)} = 0,$$

or

$$-\frac{2T}{d}\cos k + \left(-m\omega^2 + \frac{2T}{d}\right) = 0,$$

whence

$$m\omega^2 = \frac{2T}{d}(1 - \cos k). \tag{10}$$

That is, for any $\omega$, we can choose a value of k by this relation,
so that all the equations except the first and last are satisfied.
These fall into line as well if we set up $C_0 = -C_1$ and $C_{N+1} = -C_N$,
so that if these conditions are satisfied we have $e^{ikj}$, or
equally well $e^{-ikj}$, or $\sin kj$ or $\cos kj$, as solutions of the equations
for $C_j$. These conditions on $C_0$ and $C_{N+1}$ are essentially boundary
conditions, one at each end of the string, and we readily see that
they are satisfied if we make our function zero for $x = 0$, $x = L$,
as we do if it is $\sin(n\pi x/L)$, where n is an integer. That is, since
x is $(j - \tfrac{1}{2})d$ for the $j$th particle, we have

$$C_{jn} = \sin\frac{n\pi}{N}\left(j - \frac{1}{2}\right), \tag{11}$$

so that $k = n\pi/N$. We see from this form of $C_{jn}$ that $C_{0n} = -C_{1n}$,
and that only those values of n up to N give us different sets of
C's. If n is greater than N, then for each integral j we get just
the same value of $C_{jn}$ that we had for a certain n less than N,
so that the whole scheme repeats itself over and over as n
increases, and we really have only N distinct solutions. Similarly
in the expression for the frequency, the term $1 - \cos k = 1 - \cos(n\pi/N)$
is periodic, so that as soon as n becomes greater than N
we repeat the frequencies already found. There are, then,
just N solutions, each with its frequency and its complex amplitude
for each particle. This fits in with the single frequency
for one particle and the two which we have found for two coupled
particles. For each of the N particles there is an arbitrary
amplitude and phase, or arbitrary complex amplitude, so
that there are just 2N arbitrary constants. The whole solution
is the sum, as n goes from 1 to N, of the real parts of
$A_n e^{i\omega_n t}\sin(n\pi x/L)$, or

$$u = \sum_{n=1}^{N} B_n \sin\frac{n\pi x}{L}\cos(\omega_n t - \epsilon_n). \tag{12}$$

Each one of these terms represents the amplitudes of all the
particles when vibrating with a particular mode of motion,
analogous to an overtone of the string. To get the amplitude
of the $j$th particle, we set $x = (j - \tfrac{1}{2})d$. The angular velocity
$\omega_n$ of the nth overtone is given by

$$m\omega_n^2 = \frac{2T}{d}\left(1 - \cos\frac{n\pi}{N}\right). \tag{13}$$

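
The claim that the amplitudes of Eq. (11) together with the frequencies of Eq. (13) satisfy all N difference equations (9), including the first and last, can be checked numerically. The sketch below is only an illustration: the function names and the sample values of N, T, d, and m are assumptions chosen for the example, and the code supposes N is at least 2.

```python
import math

def mode_shape(n, N):
    """Amplitudes C_jn = sin((n pi / N)(j - 1/2)) for j = 1..N, as in Eq. (11)."""
    return [math.sin(n * math.pi / N * (j - 0.5)) for j in range(1, N + 1)]

def omega_squared(n, N, T, d, m):
    """omega_n^2 from m omega_n^2 = (2T/d)(1 - cos(n pi / N)), Eq. (13)."""
    return 2.0 * T / (m * d) * (1.0 - math.cos(n * math.pi / N))

def residuals(n, N, T, d, m):
    """Left-hand sides of Eqs. (9) for mode n; each should vanish (N >= 2 assumed)."""
    C = mode_shape(n, N)
    w2 = omega_squared(n, N, T, d, m)
    t = T / d
    out = []
    for j in range(N):
        # The end particles sit d/2 from a support, which contributes 2T/d,
        # so their diagonal coefficient is 3T/d instead of 2T/d.
        diag = 3.0 * t if j in (0, N - 1) else 2.0 * t
        r = (-m * w2 + diag) * C[j]
        if j > 0:
            r -= t * C[j - 1]
        if j < N - 1:
            r -= t * C[j + 1]
        out.append(r)
    return out

# Hypothetical sample values, chosen only for illustration.
N, T, d, m = 6, 2.0, 0.5, 0.3
for n in range(1, N + 1):
    worst = max(abs(r) for r in residuals(n, N, T, d, m))
    print(n, omega_squared(n, N, T, d, m), worst)
```

Evaluating `omega_squared` for n = N + 1 returns the same value as for n = N - 1, which is the repetition of frequencies described in the text.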
81. Lagrange's Equations for the Weighted String. — The
equations of motion which we have discussed above may also
be obtained readily from Lagrange's method, and we shall set up
expressions for the kinetic and potential energies. For the
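kinetic energy $T_1$ we have simply

$$T_1 = \frac{m}{2}\left(\dot u_1^2 + \dot u_2^2 + \cdots + \dot u_N^2\right),$$

and for the potential energy

$$V = \frac{T}{2d}\left[2u_1^2 + (u_2 - u_1)^2 + (u_3 - u_2)^2 + \cdots + 2u_N^2\right],$$

and the Lagrangian equations

$$\frac{d}{dt}\left[\frac{\partial (T_1 - V)}{\partial \dot u_j}\right] - \frac{\partial (T_1 - V)}{\partial u_j} = 0$$
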
lead to the equations already used.
82. Continuous String as Limiting Case. — The solution we
have found for the set of particles differs in two ways from the
solution for the continuous string. First, there is only a finite
set of overtones, and secondly, the frequencies are determined
by different formulas. Both these differences disappear when
the number of particles in the fixed length L becomes infinite.
To determine the limiting form of the expressions for the frequency,
we develop $\cos(n\pi/N)$ in a power series for large N. We
thus obtain

$$\cos\frac{n\pi}{N} = 1 - \frac{1}{2}\left(\frac{n\pi}{N}\right)^2 + \cdots,$$

so that $\omega_n$ becomes

$$\omega_n = \frac{n\pi}{N}\sqrt{\frac{T}{md}} = \frac{n\pi}{L}\sqrt{\frac{T}{\mu}},$$

using $Nd = L$ and $m = \mu d$. This agrees with our former result.
In this limiting case of infinite N (and infinitesimal d) the expressions
for the kinetic and potential energies become

$$T_1 = \frac{\mu}{2}\int_0^L \dot u^2\,dx, \qquad V = \frac{T}{2}\int_0^L \left(\frac{\partial u}{\partial x}\right)^2 dx, \tag{14}$$

which may also be derived directly for the case of a continuous
string.
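
The approach to the continuous limit can also be seen numerically. The following sketch compares the fundamental frequency of the weighted string, Eq. (13), with the continuous-string value $(n\pi/L)\sqrt{T/\mu}$ as N grows at fixed L, T, and $\mu$. The function names and the unit values of L, T, and $\mu$ are assumptions made for illustration.

```python
import math

def omega_weighted(n, N, L, T, mu):
    """Mode n of N particles on a string of length L: d = L/N, m = mu*d, Eq. (13)."""
    d = L / N
    m = mu * d
    return math.sqrt(2.0 * T / (m * d) * (1.0 - math.cos(n * math.pi / N)))

def omega_continuous(n, L, T, mu):
    """Limiting value (n pi / L) sqrt(T / mu) for the continuous string."""
    return n * math.pi / L * math.sqrt(T / mu)

# Hypothetical unit values for the string constants.
L, T, mu = 1.0, 1.0, 1.0
ratios = {}
for N in (4, 16, 64, 256):
    ratios[N] = omega_weighted(1, N, L, T, mu) / omega_continuous(1, L, T, mu)
    print(N, ratios[N])
```

The ratio stays below unity and rises toward it as N increases, so the weighted string always vibrates a little more slowly than the continuous string of the same total mass and tension.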
Problems
1. Taking the case of four particles on a string, derive their displacements
in the four possible normal vibrations, and compute their frequencies.
Compare these frequencies with the first four frequencies of the correspond-
ing continuous string. Put in n = N + 1, and show how the solution
reduces to one already found.
2. An actual string is composed of atoms, rather than being continuous,
so that it has only a finite number of possible overtones. Assume that it
consists of a single string of atoms, spaced $10^{-8}$ cm. apart. Let the string
be 1 m. long, and at such tension that its fundamental is 100 cycles per
second. Find the frequency of the highest possible harmonic, and show
that it is in the infra-red region of the spectrum. Show that in this highest
harmonic, successive atoms vibrate in opposite phases. Substances actually
have such natural frequencies in the infra-red, and they are important in
connection with their specific heat.
3. Prove that $u = \sin \omega[t - (x/v)]$ is a solution of the partial differential
equation for the vibrating string, if v is chosen properly, although it does not
satisfy the boundary condition that the string be held at the ends. Con-
sider the physical meaning of this solution, and show that it represents a
wave traveling down the string with velocity v.
4. Superpose the wave of Prob. 3, traveling along the +x axis, and a
similar one traveling in the opposite direction, and show that the sum repre-
sents a standing wave of the type discussed in this chapter.
5. Find the wave length of the waves in the string, in the solution we have
found in this chapter, and verify the relation $v = n\lambda$ between wave length $\lambda$,
frequency n, and velocity v.
6. Proceeding as in Prob. 5, find the velocity of a wave along the weighted
string, showing that it varies with frequency. Find a formula for the
variation.
7. An artificial electric line can be constructed according to Fig. 21,
consisting of N identical resistanceless circuits, each containing inductance
L, capacitance C, and coupled to each of its neighbors with mutual inductance
M. Set up the differential equations for the currents in the various
circuits, showing that they reduce to the same form as with the weighted
string.
Fig. 21. Artificial electric line.
8. Neglecting boundary conditions at the two ends of the line in Prob. 7,
show that a disturbance can be propagated along the line with a definite
velocity, as in Prob. 6.
9. A string of length L is pulled aside at a point a distance D from the end,
and then released. Thus its initial shape is given by a curve made of two
straight lines, and its initial velocity is zero. Find the solution for its
motion, and find the amplitude of the nth harmonic.
10. Taking the solution of Prob. 9, for the special case where D = L/2,
compute the first five terms of the Fourier series, when t = 0. Add them
and plot the sum, showing how good an approximation they make to the
correct curve.
11. A string initially at rest is struck at a distance D from the end, at
t = 0. Find the intensity in each overtone. Approximate the initial
conditions as follows: the initial displacement is zero, and the initial velocity
is a constant in a small region of length d about the point D, zero elsewhere.
CHAPTER XIII
NORMAL COORDINATES AND THE VIBRATING STRING
In the preceding chapter, we have worked out the elementary
theory of the vibrating string, finding the nature of the possible
vibrations and the method of getting the amplitude of the over-
tones in terms of the initial conditions. When we begin to ask
about slightly more complicated problems, however, we find
that it is necessary to go further into the theory. For example,
we might be interested in the nature of the forced vibrations
under the action of an external sinusoidal force, or the effect of
damping on the oscillations. Such questions are easily answered
by introducing normal coordinates, much as we did with the two
coupled oscillators. These are generalized coordinates, which
prove to be closely connected with the various overtones, so that
if just one normal coordinate is vibrating, that means that the
string is vibrating with the corresponding pure overtone. When
we write Lagrange's equations in terms of the normal coordinates,
we find that we can introduce external forces easily, and solve
such problems.
At the same time, the general theory of normal coordinates
for vibrating strings, which we shall get into, has particularly
interesting relations to many other branches of mathematical
physics. We shall gain much more insight into Fourier expan-
sion, finding a general theory of expansion of which this is a
special case, but which, as we shall find later, includes expansions
in Bessel's functions, spherical harmonics, and many other sorts
of functions. Such problems are met not only in vibrations,
but also in heat flow (for which Fourier series were originally
developed), potential theory, hydrodynamics, and in the newest
branch of mathematical physics, wave mechanics, or the quantum
theory, used in studying atomic structure.
83. Normal Coordinates. — In Chap. XI, we investigated the
vibrations of two coupled particles and set up normal coordinates
to describe the motion. Since we must make a considerable
extension of the idea of normal coordinates in the present chapter,
it will be best to review the results we have already found. We
started with two coordinates, $x_1$ and $x_2$, describing the displacement
of the two particles. The normal coordinates X and Y
were introduced by a linear transformation

$$\begin{aligned}
x_1 &= \alpha' X + \alpha'' Y \\
x_2 &= \beta' X + \beta'' Y,
\end{aligned}$$

which proved to be merely a rotation of axes in the $x_1 x_2$ space,
so that X and Y were new orthogonal axes in that space. To
express the fact that the transformation was just a rotation, we
had certain conditions holding between the coefficients: orthogonality
conditions, as $\alpha'\alpha'' + \beta'\beta'' = 0$, and normalization
conditions, as $\alpha'^2 + \beta'^2 = 1$. We saw that the quantities $\alpha'$,
$\alpha''$, $\beta'$, $\beta''$, had a geometrical meaning: $\alpha'$ was $(\mathbf{i} \cdot \mathbf{i}')$, and similar
relations for the other quantities, showing that $\alpha'$, $\beta'$, were the
components along the $x_1$ and $x_2$ axes, respectively, of unit vector
along X, and $\alpha''$, $\beta''$ similarly were components of unit vector
along Y. The object of the rotation to normal coordinates was
to separate the variables of the equation of motion, so that each
normal coordinate executed a vibration of its own, as $X = Ce^{i\omega' t}$,
$Y = De^{i\omega'' t}$. This was equivalent to rotation so that the new
axes in the $x_1 x_2$ space lay along the principal axes of the elliptical
equipotentials of the problem.
We can now follow exactly the same model in our problem of
the string. We start with the case of N weights separated by
springs. By analogy, the displacement of the first weight should
be a linear combination of normal coordinates, the coefficients
(corresponding to the $\alpha$'s and $\beta$'s) being the displacements of
the first weight when only one overtone is excited, or $\sin(n\pi x_1/L)$
for the nth overtone. The displacements of these weights are
taken to be $u_1 \ldots u_N$. Then we set up N normal coordinates,
$\phi_1 \ldots \phi_N$, by the equations

$$\begin{aligned}
u_1 &= \sum_{n=1}^{N} a_n \sin\frac{n\pi x_1}{L}\,\phi_n, \\
u_2 &= \sum_{n=1}^{N} a_n \sin\frac{n\pi x_2}{L}\,\phi_n, \quad \text{etc.,}
\end{aligned} \tag{1}$$

where $x_j = (j - \tfrac{1}{2})d$, and the numbers $a_n$ are determined by a
condition soon to be described. Here the quantities $a_n \sin(n\pi x_j/L)$
correspond to the $\alpha$'s and $\beta$'s of Chap. XI.
But not only that: the coefficients still satisfy orthogonality
and normalization conditions. The orthogonality conditions
will be of the form

$$a_n a_m\left(\sin\frac{n\pi x_1}{L}\sin\frac{m\pi x_1}{L} + \sin\frac{n\pi x_2}{L}\sin\frac{m\pi x_2}{L} + \cdots + \sin\frac{n\pi x_N}{L}\sin\frac{m\pi x_N}{L}\right) = 0, \tag{2}$$

where n, m are any two different indices. This is true, as can be shown by
trigonometrical manipulation, though we shall not stop to do it.
Similarly the normalization will be

$$a_n^2\left(\sin^2\frac{n\pi x_1}{L} + \sin^2\frac{n\pi x_2}{L} + \cdots + \sin^2\frac{n\pi x_N}{L}\right) = 1. \tag{3}$$

We can satisfy this by proper choice of $a_n$, since the parenthesis is a
definitely determined, positive quantity. This is then the condition,
called the normalization condition, for determining the
constants $a_n$. Since we have as before an orthogonal transformation,
we can again get a geometrical interpretation. We imagine
an N-dimensional space, in which the quantities $u_1 \ldots u_N$ are
plotted as coordinates. Now our transformation of axes is
equivalent to a rotation of coordinates in this N-dimensional
space. The normal coordinates $\phi_1 \ldots \phi_N$ represent new orthogonal
axes in the space, in the sense that if $\phi_1 = 1$, all the other
$\phi$'s being zero, the corresponding point is displaced from the origin
unit distance along the $\phi_1$ axis. The quantities like $a_n \sin(n\pi x_j/L)$
represent the components in the direction of the old axes
of unit vectors along the new axes. Thus the one written is the
cosine of the angle between the $\phi_n$ and the $u_j$ axes. The equations
of motion are separated in the new coordinates, the solutions
being $\phi_n = \text{constant} \times e^{i\omega_n t}$. Finally, the equipotentials, which
are ellipsoids in the N-dimensional space, have principal axes
in the directions which we have chosen for the normal coordinates.
Thus the analogy with the two-dimensional problem is
complete. The statements we have made without proof here
are not very difficult to demonstrate, and some of them are taken
up in problems.
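
The orthogonality and normalization conditions, Eqs. (2) and (3), are stated above without proof, and a numerical check is easy to set up. In the sketch below the function names and the choice N = 5 are assumptions made for illustration. The positions are taken as $x_j/L = (j - \tfrac{1}{2})/N$, which follows from $x_j = (j - \tfrac{1}{2})d$ and $L = Nd$.

```python
import math

def inner(n, m, N):
    """Sum over the N particles of sin(n pi x_j/L) sin(m pi x_j/L), x_j/L = (j - 1/2)/N."""
    return sum(math.sin(n * math.pi * (j - 0.5) / N) *
               math.sin(m * math.pi * (j - 0.5) / N)
               for j in range(1, N + 1))

def a(n, N):
    """Normalizing constant a_n fixed by Eq. (3): a_n^2 * inner(n, n) = 1."""
    return 1.0 / math.sqrt(inner(n, n, N))

N = 5  # hypothetical number of weights
# Table of a_n a_m * (sum of products); it should be the unit matrix.
gram = [[a(n, N) * a(m, N) * inner(n, m, N) for m in range(1, N + 1)]
        for n in range(1, N + 1)]
for row in gram:
    print(["%.3f" % value for value in row])
print("a_n^2 for n = 1..N:", [a(n, N) ** 2 for n in range(1, N + 1)])
```

The table comes out as the unit matrix to within rounding. The sum of squares in Eq. (3) evaluates to N/2 for every n below N but to N for n = N, so the last constant $a_N$ differs from the others.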
We can now go one step farther, to the continuous string.
Here the displacement of a point of the string is given by $u(x)$,
where x measures the coordinate of the point, corresponding to
the $u_i$ for the problem of discrete weights. We introduce normal
coordinates $\phi_1, \ldots, \phi_n, \ldots$ (an infinite set, as there are an
infinite number of points on the string), by the equation

$$u(x) = \sum_{n=1}^{\infty} a_n \sin\frac{n\pi x}{L}\,\phi_n. \tag{4}$$

The orthogonality conditions for the coefficients $a_n \sin(n\pi x/L)$ must
now be written in terms of integrals, rather than sums; for we
have terms for each value of x, from 0 to L, differing by infinitesimal
amounts. Thus these conditions are

$$\int_0^L a_n a_m \sin\frac{n\pi x}{L}\sin\frac{m\pi x}{L}\,dx = 0, \tag{5}$$

where n and m are different integers. This can be immediately
proved by evaluating the integral. Similarly the normalization
condition is

$$\int_0^L a_n^2 \sin^2\frac{n\pi x}{L}\,dx = 1, \tag{6}$$

which as before serves to determine $a_n$.
84. Normal Coordinates and Function Space. — We must now
imagine a space, not of N dimensions, but of an infinite number.
We cannot get an idea of what the coordinates mean, except by
passing to the limit from the case of a finite number of mass
points. With N points, and N dimensions, the first coordinate
measures the displacement of the first mass point, and so on.
Thus a point in the N-dimensional space determines all the
coordinates, or in other words gives the displacements of all the
masses. Now as N gets larger and larger, and we have more
and more dimensions, it still remains true that a particular
coordinate measures the displacement at a particular part of the
string. We see that this interpretation persists to the limit of
infinitely many variables: each coordinate is connected with a
point of the string, and its value gives the displacement at that
point. But there is now an interesting side light on the situation.
A point in our infinitely many-dimensional space gives complete
information about the displacement of each point of the string.
That is, it gives $u(x)$, a function of x. Each point of this space
is connected with a particular function, and each possible function
is represented by a point of the space (of course, many points
of the space refer to discontinuous functions and, therefore, are
not suitable for describing a string). On account of this property,
our space is often called a function space.
The normal coordinates now represent a set of rectangular
axes in function space, rotated with respect to the original coordinates.
Each normal coordinate refers to a particular mode of
vibration, or overtone. If just one of the normal coordinates is
excited, say if $\phi_n = 1$, all the other $\phi$'s being zero, the situation is
represented by a certain point in function space; that is, by a
certain function, giving the shape of the string. We can take
the radius vector out to the point $\phi_n = 1$, all other $\phi$'s = 0, and
project it on one of our original coordinates. Thus the projection
on the coordinate connected with the point x is $a_n \sin(n\pi x/L)$,
showing that that is the displacement of this particular point of
the string when this overtone alone is excited with unit amplitude.
The expression $a_n \sin(n\pi x/L)$ is now a function; it is the function
represented by a unit vector along the $\phi_n$ axis, in function space.
Since the $\phi$ axes are orthogonal, we see that the scalar product
of two such vectors along different axes is zero:

$$\int_0^L a_n a_m \sin\frac{n\pi x}{L}\sin\frac{m\pi x}{L}\,dx = 0,$$

where by analogy the scalar product takes this form, so that we
have the orthogonality conditions and their geometrical meaning.
Similarly the square of the unit vector, which is unity, is

$$\int_0^L a_n^2 \sin^2\frac{n\pi x}{L}\,dx = 1,$$

the normalization condition. This immediately gives $a_n^2 L/2 = 1$,
$a_n = \sqrt{2/L}$.
Now as before, when we introduce the normal coordinates,
we have rotated axes in function space to make the new coordinates
lie along the principal axes of the ellipsoidal equipotentials.
And the equations of motion are separated, each normal coordinate
vibrating with simple harmonic motion: $\phi_n = A_n e^{i\omega_n t}$.
Finally, then, the motion is represented by

$$u(x) = \sum_{n=1}^{\infty} \sqrt{2/L}\,\sin\frac{n\pi x}{L}\,\phi_n = \sum_{n=1}^{\infty} \sqrt{2/L}\,A_n \sin\frac{n\pi x}{L}\,e^{i\omega_n t},$$

agreeing with the value found previously. In Section 86, we
carry out the demonstration that the equations of motion are
separated in the normal coordinates, and then we apply them to
the discussion of forced motion.
85. Fourier Analysis in Function Space. — When we come to
the question of satisfying initial conditions, and of Fourier's
series, we meet immediately close connections with function
space. Fourier's theorem, stated for sine series, can be put in
the following form, by introducing factors $\sqrt{2/L}$:

$$f(x) = \sum_{n} \left(\sqrt{L/2}\,B_n\right)\sqrt{2/L}\,\sin\frac{n\pi x}{L},$$

where

$$\sqrt{L/2}\,B_n = \int_0^L f(x)\sqrt{2/L}\,\sin\frac{n\pi x}{L}\,dx. \tag{7}$$

Now the functions $\sqrt{2/L}\sin(n\pi x/L)$ are the unit vectors in function
space along the directions representing the overtones of the
problem, functions which are often called the normal functions,
or characteristic functions, of the problem. Thus Eq. (7) is
just like a vector equation, stating that a vector ($f(x)$) is the sum
of unit vectors [$\sqrt{2/L}\sin(n\pi x/L)$] each multiplied by the component
of the vector along the corresponding axis ($\sqrt{L/2}\,B_n$ is
the component of $f(x)$ along the nth axis). To find these components,
we need only project the vector $f(x)$ on the corresponding
unit vector, which means taking the scalar product.
But the scalar product, as we have seen, is an integral,

$$\left(f(x) \cdot \sqrt{2/L}\,\sin\frac{n\pi x}{L}\right) = \sqrt{L/2}\,B_n = \int_0^L f(x)\sqrt{2/L}\,\sin\frac{n\pi x}{L}\,dx. \tag{8}$$

Thus the formulas of Fourier's method have the simplest possible
vector interpretation in function space. But we can also see
that, if we had some other set of normalized orthogonal functions,
we could proceed with an expansion in an analogous way. It is
worth noting that by using Fourier's method, we can solve for
the normal coordinates in terms of $u(x)$: since
$u(x) = \sum_{n=1}^{\infty} \sqrt{2/L}\,\sin(n\pi x/L)\,\phi_n$,
it is obvious that $\phi_n = \int_0^L u(x)\sqrt{2/L}\,\sin(n\pi x/L)\,dx$,
the component of $u(x)$ along the nth axis.
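
The vector picture can be tried out numerically: build a shape from known components along two of the unit functions, then recover those components by taking scalar products. The sketch below does this with a midpoint-rule integral. The helper names, the length L = 2, and the component values 0.7 and -0.2 are all assumptions made for illustration.

```python
import math

def unit_function(n, L):
    """Normalized normal function sqrt(2/L) sin(n pi x / L)."""
    return lambda x: math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)

def scalar_product(f, g, L, steps=4000):
    """Integral of f(x) g(x) from 0 to L, approximated by the midpoint rule."""
    h = L / steps
    return sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(steps)) * h

L = 2.0
e1, e2, e3 = (unit_function(n, L) for n in (1, 2, 3))

def u(x):
    """Hypothetical string shape with known components 0.7, 0, and -0.2."""
    return 0.7 * e1(x) - 0.2 * e3(x)

components = [scalar_product(u, e, L) for e in (e1, e2, e3)]
print("recovered components:", components)
print("e1.e1 =", scalar_product(e1, e1, L), " e1.e2 =", scalar_product(e1, e2, L))
```

The projections return the components that were put in, which is all that the formula for $\phi_n$ asserts.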
86. Equations of Motion in Normal Coordinates. — To find the
equations of motion, we must set up the Lagrangian function.
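Let us first write for the velocity of the string

$$\dot u = \dot\phi_1\sqrt{2/L}\,\sin\frac{\pi x}{L} + \dot\phi_2\sqrt{2/L}\,\sin\frac{2\pi x}{L} + \cdots + \dot\phi_n\sqrt{2/L}\,\sin\frac{n\pi x}{L} + \cdots$$

and proceed to the expressions for the potential and kinetic
energies. We have

$$T_1 = \frac{\mu}{2}\int_0^L \left(\sum_{n=1}^{\infty} \dot\phi_n\sqrt{2/L}\,\sin\frac{n\pi x}{L}\right)^2 dx = \frac{\mu}{2}\sum_{n=1}^{\infty} \dot\phi_n^2 \int_0^L \left(\sqrt{2/L}\,\sin\frac{n\pi x}{L}\right)^2 dx$$

since all the product terms disappear because of the orthogonal
properties of the normal functions. Thus $T_1$ becomes reduced
to a sum of squares in the generalized velocities and the integration
over x leads to the result

$$T_1 = \frac{\mu}{2}\sum_{n=1}^{\infty} \dot\phi_n^2. \tag{9}$$

In a similar manner we set up the expression for V, the potential
energy. We have

$$V = \frac{T}{2}\int_0^L \left(\frac{\partial u}{\partial x}\right)^2 dx = \frac{T}{2}\int_0^L \left(\sum_{n=1}^{\infty} \phi_n\sqrt{2/L}\,\frac{n\pi}{L}\cos\frac{n\pi x}{L}\right)^2 dx,$$

which we treat exactly as in the case of the kinetic energy and
obtain V as a sum of squares of the generalized coordinates $\phi_n$,
namely

$$V = \frac{T}{2}\sum_{n=1}^{\infty} \left(\frac{n\pi}{L}\right)^2 \phi_n^2. \tag{10}$$
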
Using the Lagrangian equations of motion, we have

$$L_1 = T_1 - V,$$

$$\frac{d}{dt}\left(\frac{\partial L_1}{\partial \dot\phi_n}\right) = \mu\ddot\phi_n, \qquad \frac{\partial L_1}{\partial \phi_n} = -\left(\frac{n\pi}{L}\right)^2 T\phi_n,$$

so that the equations of motion become

$$\mu\ddot\phi_n + \left(\frac{n\pi}{L}\right)^2 T\phi_n = \Phi_n, \qquad n = 1, 2, \ldots, \tag{11}$$

where $\Phi_n$ is the generalized external force corresponding to the
coordinate $\phi_n$. Up to this point we have considered only free
vibrations for which $\Phi_n$ is zero, but now we have generalized our
problem to include such things as forced oscillations. We solve
the equations above for the case of free vibrations, obtaining

$$\phi_n = A_n e^{i\omega_n t}$$

with

$$\omega_n = \frac{n\pi}{L}\sqrt{\frac{T}{\mu}},$$

so that our expression for u is just the one originally found, in
Eq. (3), Chap. XII. It is thus clear that we have essentially
used normal coordinates in our first discussion of the vibrating
string.
The generalized force $\Phi_n$ is defined so that the work done by
the external force during a displacement $d\phi_n$ is $\Phi_n\,d\phi_n$. During
a displacement $d\phi_n$, the corresponding displacement of the string
is $a_n \sin(n\pi x/L)\,d\phi_n = du$, so that if the force acting on a length of
string dx at time t is $f\,dx$, we find for $\Phi_n$,

$$\Phi_n = \sqrt{2/L}\int_0^L f \sin\frac{n\pi x}{L}\,dx. \tag{12}$$

In function space, this is evidently simply the component of the force
along the nth axis. An interesting case occurs when the force
acts practically at a point $x = a$, such as when a violin string is
plucked or bowed. We then write

$$\Phi_n = \sqrt{2/L}\,\sin\frac{n\pi a}{L}\int f\,dx = F\sqrt{2/L}\,\sin\frac{n\pi a}{L}.$$

This expression brings out the advantage of the concept of
generalized force. For example, if a string is struck or bowed at
its center, then $a = L/2$, and $\Phi_n = 0$ when n is an even integer.
This means that this force can have no effect on the even over-
tones and can only affect the odd overtones. If the string is
originally at rest, no matter what kind of force is applied at the
center, only odd overtones appear in the resultant vibration. No
even overtones ever occur, as the normal coordinates are
uncoupled and each normal coordinate behaves just as if all the
others were absent. These conclusions, immediately obvious
from the expression for $\Phi_n$, are not at all obvious if one considers
only the usual force acting at a point of the string.
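
A short numerical sketch makes the vanishing of the even components concrete, and also compares the point-force formula with Eq. (12) applied to a narrow but finite force distribution. The function names, the box-shaped force of width w, and the sample values are assumptions introduced for illustration only.

```python
import math

def generalized_force(n, F, a, L):
    """Phi_n = F sqrt(2/L) sin(n pi a / L) for a total force F concentrated at x = a."""
    return F * math.sqrt(2.0 / L) * math.sin(n * math.pi * a / L)

def generalized_force_distributed(n, f, L, steps=10000):
    """Eq. (12): sqrt(2/L) * integral_0^L f(x) sin(n pi x / L) dx, by the midpoint rule."""
    h = L / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += f(x) * math.sin(n * math.pi * x / L)
    return math.sqrt(2.0 / L) * total * h

L, F = 1.0, 1.0
# Force applied at the center: components for n = 1..6.
center = [generalized_force(n, F, L / 2, L) for n in range(1, 7)]
print("center:", center)

# Hypothetical box-shaped force of total strength F and small width w about x = a.
a, w = 0.3, 0.01
box = lambda x: F / w if abs(x - a) <= w / 2 else 0.0
narrow = generalized_force_distributed(1, box, L)
point = generalized_force(1, F, a, L)
print("narrow box:", narrow, " point formula:", point)
```

The components for even n vanish to rounding error when the force acts at the center, and the narrow box gives nearly the same component as the point formula.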
Another case of interest occurs when a periodic force acts on a
point $x = a$ of the string. We then have

$$\Phi_n = F\sqrt{2/L}\,\sin\frac{n\pi a}{L}\cos\omega t$$

and the equation of motion for $\phi_n$ is very much like the equation
of forced motion of a one-dimensional oscillator. The solution
of this equation is then

$$\phi_n = A_n \cos(\omega_n t - \epsilon_n) + \frac{F\sqrt{2/L}\,\sin(n\pi a/L)}{\mu(\omega_n^2 - \omega^2)}\cos\omega t.$$

The first term is the solution of the homogeneous equation and
represents the free vibration of this mode, and the second repre-
sents the forced vibration indicating all the characteristics of
resonance which we have previously studied.
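
The forced term can be checked by substituting it back into Eq. (11). The sketch below does so using the exact second derivative of $\cos\omega t$, and then evaluates the forced amplitude on either side of the first resonance. The function names and the sample values of F, a, L, T, and $\mu$ are assumptions made for illustration.

```python
import math

def natural_frequency(n, L, T, mu):
    """omega_n = (n pi / L) sqrt(T / mu)."""
    return n * math.pi / L * math.sqrt(T / mu)

def forced_amplitude(n, F, a, L, T, mu, omega):
    """Coefficient of cos(omega t) in the forced part of phi_n."""
    w_n = natural_frequency(n, L, T, mu)
    return (F * math.sqrt(2.0 / L) * math.sin(n * math.pi * a / L)
            / (mu * (w_n ** 2 - omega ** 2)))

def residual(n, F, a, L, T, mu, omega, t):
    """mu*phi'' + (n pi/L)^2 T phi - Phi_n(t), evaluated for phi = C cos(omega t)."""
    C = forced_amplitude(n, F, a, L, T, mu, omega)
    phi = C * math.cos(omega * t)
    ddphi = -omega ** 2 * phi  # exact second derivative of C cos(omega t)
    Phi = F * math.sqrt(2.0 / L) * math.sin(n * math.pi * a / L) * math.cos(omega * t)
    return mu * ddphi + (n * math.pi / L) ** 2 * T * phi - Phi

# Hypothetical sample values; with these the first natural frequency is pi.
F, a, L, T, mu = 1.0, 0.3, 1.0, 1.0, 1.0
print("residual:", residual(1, F, a, L, T, mu, 2.0, 0.7))
for omega in (1.0, 3.0, 3.1, 3.2, 3.3):
    print(omega, forced_amplitude(1, F, a, L, T, mu, omega))
```

The amplitude grows without limit and changes sign as $\omega$ passes through $\omega_n$, which is the undamped resonance behavior referred to in the text.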
87. The Vibrating String with Friction. — Thus far we have
neglected friction forces which must act in real cases. Let us
assume that the motion of our string is opposed by a frictional
force such that the force on each element of the string is proportional
to its velocity. The partial differential equation of the
free motion of the string becomes

$$\frac{\partial^2 u}{\partial t^2} + k\frac{\partial u}{\partial t} = \frac{T}{\mu}\frac{\partial^2 u}{\partial x^2}. \tag{13}$$

We can treat this problem rather simply by noting that there is a
function G, called the dissipation function, which is one-half the
rate at which energy disappears from the system and which has
just the same form as the kinetic energy $T_1$. In fact, we have

$$G = \frac{1}{2}\mu k\int_0^L \dot u^2\,dx.$$

One can easily show that the Lagrangian equations when there
is a dissipation function become

$$\frac{d}{dt}\left(\frac{\partial L_1}{\partial \dot q_i}\right) - \frac{\partial L_1}{\partial q_i} + \frac{\partial G}{\partial \dot q_i} = Q_i. \tag{14}$$

According to the special law of friction we have assumed, the
dissipation function has the same form as $T_1$, so that if we introduce
the normal coordinates $\phi_1, \phi_2, \ldots$ etc., which we found
reduced the expressions for $T_1$ and V to sums of squares, they will
also do the same for G, so that we can separate the equations of
motion for each coordinate $\phi_n$. Proceeding as in the last paragraph,
we find

$$G = \frac{\mu k}{2}\sum_{n} \dot\phi_n^2.$$

The equation for $\phi_n$ then becomes

$$\ddot\phi_n + k\dot\phi_n + \omega_n^2\phi_n = \frac{\Phi_n}{\mu}, \tag{15}$$

which is the same form as in the case of a one-dimensional damped
oscillator. From this we see that each of the overtones is damped
at the same rate in time, its amplitude falling off as $e^{-kt/2}$, so that
in a free vibration the various overtones maintain their relative amplitudes.
In the case of a forced vibration caused by a periodic force
$F\cos\omega t$ acting at the point $x = a$, we have

$$\Phi_n = F\sqrt{2/L}\,\sin\frac{n\pi a}{L}\cos\omega t,$$

and the steady-state vibrations are given by

$$\phi_n = \frac{F}{\mu}\sqrt{2/L}\,\sin\frac{n\pi a}{L}\left\{\frac{(\omega_n^2 - \omega^2)\cos\omega t + k\omega\sin\omega t}{(\omega_n^2 - \omega^2)^2 + k^2\omega^2}\right\}. \tag{16}$$

This is, of course, essentially the same solution we obtained in
the discussion of a one-dimensional oscillator.
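
As a check on Eq. (16), the sketch below substitutes the steady-state expression into Eq. (15), using the exact derivatives of the sine and cosine terms, and evaluates the amplitude at resonance. The symbol P is shorthand introduced here for $(F/\mu)\sqrt{2/L}\sin(n\pi a/L)$; the function names and numerical values are likewise assumptions made for illustration.

```python
import math

def coefficients(P, omega_n, omega, k):
    """Coefficients A, B of cos(omega t) and sin(omega t) in Eq. (16)."""
    delta = omega_n ** 2 - omega ** 2
    denom = delta ** 2 + (k * omega) ** 2
    return P * delta / denom, P * k * omega / denom

def steady_state(t, P, omega_n, omega, k):
    """phi_n(t) from Eq. (16)."""
    A, B = coefficients(P, omega_n, omega, k)
    return A * math.cos(omega * t) + B * math.sin(omega * t)

def residual(t, P, omega_n, omega, k):
    """phi'' + k phi' + omega_n^2 phi - P cos(omega t), with exact derivatives."""
    A, B = coefficients(P, omega_n, omega, k)
    c, s = math.cos(omega * t), math.sin(omega * t)
    phi = A * c + B * s
    dphi = omega * (-A * s + B * c)
    ddphi = -omega ** 2 * phi
    return ddphi + k * dphi + omega_n ** 2 * phi - P * c

def amplitude(P, omega_n, omega, k):
    """Magnitude sqrt(A^2 + B^2) of the steady-state vibration."""
    A, B = coefficients(P, omega_n, omega, k)
    return math.hypot(A, B)

# Hypothetical sample values.
P, omega_n, k = 1.3, 2.0, 0.4
for t in (0.0, 0.5, 2.3):
    print(t, steady_state(t, P, omega_n, 1.7, k), residual(t, P, omega_n, 1.7, k))
print("amplitude at resonance:", amplitude(P, omega_n, omega_n, k))
```

With friction present the amplitude at $\omega = \omega_n$ stays finite, equal to $P/(k\omega_n)$, in contrast with the undamped case.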
The particularly simple solutions just obtained depend entirely
on the simple form of the law of friction we have assumed. In
general, for vibrating systems, the presence of frictional forces
does not prevent us from setting up the kinetic and potential
energies as a sum of squares. But this transformation will in
general not transform the dissipation function G to a sum of
squares. Only in very special cases, such as the law of friction
assumed above, does the transformation also reduce G to a sum
of squares. The general equation of motion for the coordinate
$\phi_1$, for example, will be of the form

$$a_1\ddot\phi_1 + c_1\phi_1 + \sum_j b_{1j}\dot\phi_j = \Phi_1$$

instead of the simpler form obtained above, Eq. (15),
in which we have only $\phi_1$ appearing. Thus in the general case
of frictional forces there is coupling between the various coordi-
nates so that we have much more complicated types of motion.
In Chap. XI, Prob. 3, we had such a case with two coupled
circuits with resistance and found that we could get a simple
solution only for very small frictional forces.
Problems
1. Write down the Hamiltonian function for a vibrating string, using
normal coordinates. Set up Hamilton's equations, and show that they
are satisfied for the solution we have found.
2. A sinusoidal force of constant amplitude but adjustable frequency
acts on an arbitrary point of a string. The string is in addition damped by
a frictional force proportional to the velocity. Discuss the resonance of
the string to the force, computing, for example, the total energy of the
string as a function of the applied frequency, and showing that the resulting
resonance curve goes through maxima corresponding to the various over-
tone frequencies. Find approximate heights and breadths of the maxima.
Neglect the transient vibrations.
3. Prove the orthogonality relations for the normal functions for the
weighted string; that is, prove

$$\sin\frac{n\pi x_1}{L}\sin\frac{m\pi x_1}{L} + \cdots + \sin\frac{n\pi x_N}{L}\sin\frac{m\pi x_N}{L} = 0.$$

4. Using the orthogonality relations of Prob. 3, and the analogy of the
continuous string, set up a method for finding the amplitudes of the various
overtones of the weighted string, in terms of the initial displacements and
velocities of the particles.
5. Apply the method of Prob. 4 to the special case of two coupled par-
ticles, as taken up in Prob. 1, Chap. XI.
6. Apply Prob. 4 to the case of four particles, as in Prob. 1, Chap. XII.
7. Consider two coupled mechanical vibrating systems, with friction.
In general, a dissipative function cannot be set up, and the problem of the
motion cannot be solved exactly. Show what relations the frictional forces
must satisfy in order to have a dissipative function. Write down the
corresponding relations also for the electrical case.
8. What sort of force must be applied to a string in order that the forced
motion should be a pure vibration of the nth harmonic?
9. Consider the case of two coupled particles as in Prob. 1, Chap. XI.
Show that if equal external forces act on both, the overtone in which they
vibrate in opposite directions can never be excited.
10. In the case of the two coupled particles of Prob. 1, Chap. XI, assume
that at t = 0 both particles are at rest, but that one particle is displaced a
distance d, the other not being displaced at all. Find the amplitudes of the
two overtones, writing down the formulas for the displacements of each
particle as functions of time.
CHAPTER XIV
THE STRING WITH VARIABLE TENSION AND DENSITY
In the last two chapters, we have considered the problem of
the vibration of a string of constant density and uniform tension.
These results may now be extended for the more general case of
variable tension and density. We shall not be able to carry
through the results in complete detail; for, as we shall see, we are
led to a more complicated differential equation, which we cannot
solve in general. But we shall find that the theory of expansion
in orthogonal functions, and all the general relations, go through
just as with the uniform string, so that we can derive a good deal
of information. We shall also develop perturbation methods,
which can be used when the tension and density have only small
deviations from constancy.
The importance of the problems considered in this chapter
arises more from what they suggest than from the specific
problems considered. Strings of variable density are of small
practical importance. But the string is the simplest case of a
vibrating continuum. Waves in three dimensions resemble
waves on a string. A string of variable density resembles an
optical medium of variable index of refraction, and we meet
problems of reflection and refraction. Many three-dimensional
problems can actually be reduced to one-dimensional cases, and
these are all likely then to take on just the character of our string
of variable density. It forms, so to speak, the type for much of
our more complicated work. In wave mechanics, for instance,
most of our problems reduce to a mathematical form which is
identical with that of the present chapter. The perturbation
theory we develop in this chapter is one set up originally for use
with variable strings, yet it has had most important effects in
the development of the quantum theory.
88. Differential Equation for the Variable String. — We set up
the differential equation of motion exactly as we have done in
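Chap. XII. In calculating the resultant force on an element $dx$ of
our string we found
$\left(T\,\partial u/\partial x\right)_{x+dx} - \left(T\,\partial u/\partial x\right)_{x}$,
and this is $\frac{\partial}{\partial x}\left(T\frac{\partial u}{\partial x}\right)dx$,
which reduces as before to $T\frac{\partial^2 u}{\partial x^2}\,dx$ for constant tension. The
remainder of the derivation proceeds as before, and the equation
of motion becomes:
$$\mu\frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x}\left(T\frac{\partial u}{\partial x}\right), \tag{1}$$
where both $T$ and $\mu$ are now functions of $x$. If we assume that
$u$ is proportional to a function of $x$ times $e^{i\omega t}$, we find that we get
an equation for the function of $x$ alone:
$$\frac{d}{dx}\left(T\frac{du}{dx}\right) + \omega^2\mu u = 0, \tag{2}$$
where this $u(x)$ is the part of $u$ depending on $x$.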
89. Approximate Solution for Slowly Changing Density and
Tension. — The above Eq. (2) is a linear second-order differential
equation with variable coefficients, on account of the functions
$T$ and $\mu$, which depend on $x$. We can give no general method of
exact solution, except the power series method. To apply that,
of course, $T$ and $\mu$ must be expressed as power series in $x$. But
it turns out that the solutions of the equation are not very differ-
ent from sines and cosines of x, and a very useful approximate
method of solution is based on this fact, good when the density
and tension do not change by a large fraction of themselves in
one wave length. This approximate solution is simple, and
forms a convenient method for discussing the equation qualita-
tively. The effect of the variable density and tension comes in
two ways: first, the wave length depends on the position, and
second the amplitude depends on $x$. Thus, instead of
$A\sin(2\pi x/\lambda)$, as with the uniform string, the actual solution for the
function of $x$ can be at least approximately written in the form
$u = A(x)\sin B(x)$. We can see easily the form which $B$ must have
for the nonuniform string. For plainly $[B(x_2) - B(x_1)]/2\pi$ must
measure the number of wave lengths between $x_1$ and $x_2$, on
account of the way in which $B$ appears in the sine function.
But now if $\lambda$ is the wave length, regarded as a function of $x$,
$dx/\lambda$ is just the number of wave lengths in distance $dx$, so that
$$\frac{B(x_2) - B(x_1)}{2\pi} = \int_{x_1}^{x_2}\frac{dx}{\lambda},$$
from which evidently $B(x) = 2\pi\int dx/\lambda$. Since the wave length can also be written
$2\pi/\lambda = \omega\sqrt{\mu/T}$, this is equivalent to $B(x) = \omega\int\sqrt{\mu/T}\,dx$. It is
not hard to show that if we set $A = \text{constant}/\sqrt[4]{\mu T}$, the resulting
expression
$$A e^{iB} = \frac{\text{constant}}{\sqrt[4]{\mu T}}\,e^{i\omega\int\sqrt{\mu/T}\,dx}, \tag{3}$$
or the corresponding real quantity
$$\frac{\text{constant}}{\sqrt[4]{\mu T}}\cos\left(\omega\int\sqrt{\mu/T}\,dx\right),$$
forms an approximate solution of the differential equation.
To prove this equation, we may proceed as follows : we assume
the solution
$$u = A\,e^{i\omega\int\sqrt{\mu/T}\,dx},$$
where $A$ is an undetermined function of $x$, and substitute in the
differential equation. When the necessary differentiations and
substitutions are performed, we obtain a differential equation
for $A$, which may be written, after a little manipulation,
$$\lambda^2\left(\frac{1}{A}\frac{d^2A}{dx^2} + \frac{1}{T}\frac{dT}{dx}\frac{1}{A}\frac{dA}{dx}\right) + \pi i\lambda\left(\frac{d\ln\mu}{dx} + \frac{d\ln T}{dx} + 4\frac{d\ln A}{dx}\right) = 0, \tag{4}$$
where $\lambda = \frac{2\pi}{\omega}\sqrt{\frac{T}{\mu}}$, the wave length of the disturbance. Now
we are assuming that $\mu$, $T$, and consequently $A$, do not change by
a large fraction of themselves in a wave length. Thus the quantities
like $\lambda\frac{1}{A}\frac{dA}{dx}$, measuring the fractional change in $A$ in a wave
length, are numbers small compared with 1. Their squares,
then, and their rates of change in one wave length, can be
neglected, and that means that the first set of terms above, in
$\lambda^2$, can be neglected in comparison with the second set, in $\lambda$.
Considering only the latter terms, we can rewrite the Eq. (4)
$$\frac{d\ln\mu}{dx} + \frac{d\ln T}{dx} + 4\frac{d\ln A}{dx} = 0, \qquad \frac{d\ln\left[A(\mu T)^{1/4}\right]}{dx} = 0, \qquad A(\mu T)^{1/4} = \text{constant},$$
giving the solution we wished to prove.
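As a rough numerical check of the approximate solution (3), the following sketch integrates Eq. (2) step by step for a string of constant tension and slowly varying density, and compares the result with the approximate formula. The density profile, the numerical values, and all function names are assumptions chosen only for illustration; they do not come from the text.

```python
import math

# Illustrative assumptions (not from the text): constant tension, a density
# that changes by only a few percent in one wave length, a fixed frequency.
TENSION = 1.0
OMEGA = 20.0
SLOPE = 0.1


def density(x):
    """Assumed slowly varying density mu(x) = (1 + SLOPE*x)^2."""
    return (1.0 + SLOPE * x) ** 2


def approximate_u(x):
    """Real form of Eq. (3): (mu*T)^(-1/4) * sin(omega * integral sqrt(mu/T) dx).

    For the assumed density the integral from 0 to x is (x + SLOPE*x^2/2)/sqrt(T).
    """
    phase = OMEGA * (x + 0.5 * SLOPE * x * x) / math.sqrt(TENSION)
    return (density(x) * TENSION) ** -0.25 * math.sin(phase)


def max_deviation(x_end, steps):
    """Integrate T u'' + omega^2 mu u = 0 by fourth-order Runge-Kutta, starting
    from the value and slope of the approximate solution at x = 0, and return
    the largest difference found between the two solutions."""
    def accel(x, u):
        return -OMEGA ** 2 * density(x) / TENSION * u

    h = x_end / steps
    x = 0.0
    u = approximate_u(0.0)
    # Slope of the approximate solution at x = 0, where the sine vanishes.
    p = (density(0.0) * TENSION) ** -0.25 * OMEGA * math.sqrt(density(0.0) / TENSION)
    worst = 0.0
    for _ in range(steps):
        k1u, k1p = p, accel(x, u)
        k2u, k2p = p + 0.5 * h * k1p, accel(x + 0.5 * h, u + 0.5 * h * k1u)
        k3u, k3p = p + 0.5 * h * k2p, accel(x + 0.5 * h, u + 0.5 * h * k2u)
        k4u, k4p = p + h * k3p, accel(x + h, u + h * k3u)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6.0
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        x += h
        worst = max(worst, abs(u - approximate_u(x)))
    return worst


if __name__ == "__main__":
    print("largest deviation over 0 <= x <= 5:", max_deviation(5.0, 5000))
```

With these assumed values the fractional change of the density in one wave length is a few percent, and the two solutions should differ only by a small fraction of the amplitude. Making SLOPE larger, or OMEGA smaller, should make the agreement worse.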
90. Progressive Waves and Standing Waves. — In the problems
of Chap. XII, we noted that there were two sorts of waves
possible in a uniform string: progressive waves, and standing
waves. The progressive waves traveled along with a velocity v;
an example was $\cos\omega(t - x/v)$, in which the displacement has the
same value at all points for which $t - x/v$ = constant, or $x = vt$ +
constant, points traveling along with velocity $v$. Similarly
in our general case, we can set up a complex solution
$$\frac{\text{constant}}{\sqrt[4]{\mu T}}\,e^{i\omega\left(t - \int dx/v\right)},$$
where $v = \sqrt{T/\mu}$. The real part is
$$\frac{\text{constant}}{\sqrt[4]{\mu T}}\cos\omega\left(t - \int\frac{dx}{v}\right),$$
where the equation $t - \int dx/v$ = constant gives, by differentiation,
$dx/dt = v$, verifying that the velocity of propagation of the
progressive wave is $v = \sqrt{T/\mu}$, varying from point to point
along the string. Thus in the general case we can have a progres-
sive wave along the nonuniform string. We shall see later in the
chapter, however, that this is only approximately true for strings
with slowly varying density and tension. At a rapid variation
of constants, a reflected wave is set up, traveling in the opposite
direction, and the superposition of direct and reflected waves
eventually produces something more like a standing wave.
An example of a standing wave with a uniform string is
$\sin\omega t\,\sin(\omega x/v)$, or in the general case
$$\frac{\text{constant}}{\sqrt[4]{\mu T}}\sin\omega t\,\sin\left(\omega\int\frac{dx}{v}\right).$$
This is a product of a function of t and a function of x, so that such
a wave has nodes, values of x for which the function of x is always
zero, so that the vibration always has zero amplitude. We have
seen that by combination of two progressive waves we can build
up a standing wave; similarly by adding two standing waves
we can get a progressive wave, as we see from the relation
$$\cos\omega t\,\cos\left(\omega\int\frac{dx}{v}\right) + \sin\omega t\,\sin\left(\omega\int\frac{dx}{v}\right) = \cos\omega\left(t - \int\frac{dx}{v}\right).$$
Thus either sort of wave satisfies the differential equation, and
we can add solutions as we always can with homogeneous linear
differential equations.
Now suppose a string is held at one point. That means that
we must limit ourselves to a particular set of solutions of the
differential equation: the standing waves which have a node at
that point. Thus in our approximate solution, we must take
the space function
$$\frac{\text{constant}}{\sqrt[4]{\mu T}}\sin\left(\omega\int_{x_0}^{x}\frac{dx}{v}\right),$$
where $x_0$ is the point where the string is held. Suppose we
imagine a semi-infinite string, held at one point, with a wave
train of finite length approaching the end. The wave is reflected
from the end, travels back, and the superposition of the two
trains, in opposite directions, forms the standing wave. This
wave will have nodes at definite points on the string. It may
have any arbitrary frequency, but the nodes will be differently
spaced with different frequencies.
If now the string is held at two points, instead of one, we meet
a difficulty: with an arbitrary frequency, the string will not
have a node at the second point. We must limit our frequency to
one of the discrete set for which there are nodes at both ends.
Thus the fact of having the string held at both ends automatically
sets up a discrete set of possible frequencies of vibration, the
overtones, with a particular form of vibration for each. We let
the nth overtone have a wave form represented by $u_n(x)$, an
angular frequency $\omega_n$. Thus the whole solution may be written
$$u = \sum_n (A_n\cos\omega_n t + B_n\sin\omega_n t)\,u_n(x), \tag{5}$$
where the constants $A_n$ and $B_n$ are chosen to satisfy the initial
conditions at $t = 0$. If our analytic approximation to the
function is good, we have
$$u_n(x) = \frac{\text{constant}}{\sqrt[4]{\mu T}}\sin\left(\omega_n\int_{x_0}^{x}\frac{dx}{v}\right), \tag{6}$$
with $v = \sqrt{T/\mu}$. Since the displacement is zero not only at $x_0$,
but also at the other end $x_1$, we must have
$$\omega_n\int_{x_0}^{x_1}\frac{dx}{v} = n\pi, \tag{7}$$
where $n$ is an integer, which as we readily see equals 1 when there
are no nodes between the ends, 2 when there is one node, etc.
This leads at once to the condition
$$\omega_n = \frac{n\pi}{\int_{x_0}^{x_1} dx/v} \tag{8}$$
for the angular velocities, which for the uniform string reduces to
$\omega_n = \frac{n\pi}{L}\sqrt{\frac{T}{\mu}}$, where $L = x_1 - x_0$ is the length of the string. If
our analytic approximation to the functions $u_n$ is not good, we
must simply choose those particular functions for our $u_n$'s which
have nodes at $x_0$ and $x_1$, labeling them in order, the one with
$n - 1$ nodes between the ends being called $u_n$, and then must
find the angular frequencies connected with these particular
functions. We meet such a case, for example, in some of the problems,
where the functions $u_n$ are Bessel's functions, and where we
simply must look up the nodes in tables of the roots of Bessel's
functions. The particular functions $u_n$ satisfying both differential
equation and boundary conditions are called normal functions,
or characteristic functions, or wave functions, and the
frequencies $\omega_n$ are sometimes called characteristic numbers.
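The frequency condition (8) lends itself to direct numerical evaluation. The sketch below computes the integral of $dx/v$ by Simpson's rule and returns the first few angular frequencies; the function names and the sample density and tension are hypothetical, introduced only for this illustration.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def overtone_frequencies(density, tension, x0, x1, count):
    """Angular frequencies omega_n = n*pi / integral(dx/v), with v = sqrt(T/mu).

    This is only as good as the approximate normal functions of Eq. (6).
    """
    travel_time = simpson(lambda x: math.sqrt(density(x) / tension(x)), x0, x1)
    return [n * math.pi / travel_time for n in range(1, count + 1)]


if __name__ == "__main__":
    # Uniform string (assumed values): should reproduce (n*pi/L)*sqrt(T/mu).
    print(overtone_frequencies(lambda x: 0.01, lambda x: 100.0, 0.0, 2.0, 3))
    # Assumed example of a slowly varying density, mu = 0.01*(1 + 0.2*x).
    print(overtone_frequencies(lambda x: 0.01 * (1 + 0.2 * x),
                               lambda x: 100.0, 0.0, 2.0, 3))
```

In this approximation the overtones remain integral multiples of the fundamental, whatever the density; departures from that rule appear only when the approximation (6) itself fails.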
91. Orthogonality of Normal Functions. — We can now prove
easily, and quite generally, that the normal functions $u_n$ are
orthogonal. For this purpose we consider two normal functions
$u_n$ and $u_m$, which are solutions of the differential equation. We
then have the identities
$$\frac{d}{dx}\left(T\frac{du_n}{dx}\right) + \omega_n^2\mu u_n = 0$$
and
$$\frac{d}{dx}\left(T\frac{du_m}{dx}\right) + \omega_m^2\mu u_m = 0.$$
We multiply the first equation by $u_m$, the second by $u_n$, subtract
one from the other, and then integrate over the string, which we
assume to extend from $x = 0$ to $x = L$. We thus obtain
$$\int_0^L\left[u_m\frac{d}{dx}\left(T\frac{du_n}{dx}\right) - u_n\frac{d}{dx}\left(T\frac{du_m}{dx}\right)\right]dx = (\omega_m^2 - \omega_n^2)\int_0^L\mu(x)\,u_n u_m\,dx.$$
The left side integrated by parts yields immediately
$$\left[T\left(u_m\frac{du_n}{dx} - u_n\frac{du_m}{dx}\right)\right]_0^L - \int_0^L T\left(\frac{du_m}{dx}\frac{du_n}{dx} - \frac{du_n}{dx}\frac{du_m}{dx}\right)dx.$$
The integral obviously vanishes, and the integrated part vanishes
since both $u_n$ and $u_m$ are zero for $x = 0$ and $x = L$. In general
the integrated part would vanish if either $u$ or $du/dx$ vanished
at the boundaries, or if an expression of the form $u + a\,du/dx$
vanished at each boundary. Thus the right side of the equation
above yields us as the analogue of our former orthogonality
relation
$$\int_0^L\mu(x)\,u_n u_m\,dx = 0, \quad \text{if } n \neq m, \tag{9}$$
since, when $n = m$, the integral need not vanish to satisfy the
original equation. We shall assume the functions to be normalized
so that
$$\int_0^L\mu(x)\,u_n^2\,dx = 1. \tag{10}$$
In the previous chapter, where the density $\mu$ was independent
of $x$, we could simply omit that factor in the integrals, changing
the normalization condition to $\int u^2\,dx = 1$, without any error
other than a change of a constant factor in the functions $u_n$.
Here, however, the density factor must be kept in. We can see
the analogy to the corresponding situation with the two coupled
particles. There, if the masses of the particles were $m_1$, $m_2$,
and their displacements were $y_1$, $y_2$, we had to set up new quantities
$x_1$, $x_2$, equal to $\sqrt{m_1}\,y_1$ and $\sqrt{m_2}\,y_2$, respectively. We
could give the normalization conditions by stating, for example,
that the unit vector along $X$ has unit magnitude. The coordinates
of the extremity of this vector, in the notation of Chap. XI,
were $x_1 = \alpha'$, $x_2 = \beta'$. Squaring the magnitude of this vector,
we had the normalization condition
$$x_1^2 + x_2^2 = \alpha'^2 + \beta'^2 = 1.$$
But this is equal to $m_1 y_1^2 + m_2 y_2^2$, where the $y$'s are the actual
displacements. Thus in that case, just as here, we must weight
the squares or products of displacements, where they appear
in the orthogonality or normalization conditions, with the respective
masses. Here the term $\mu(x)\,dx$ is just the mass of the element
$dx$, so that the analogy is complete.
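The need for the density factor can be seen numerically in a case where the normal functions are known in closed form. The sketch below uses the string of Prob. 3 of this chapter, with density $a/x^2$ and normal functions $\sqrt{x}\,\sin[n\pi\ln(x/x_1)/\ln(x_2/x_1)]$; the numerical values of $a$, $x_1$, $x_2$ and the function names are assumptions made for the example.

```python
import math

# Assumed illustrative values for the string of Prob. 3: density a/x^2,
# held at x = X1 and x = X2.
A_DENSITY = 2.0
X1, X2 = 1.0, 3.0
SPAN = math.log(X2 / X1)


def density(x):
    return A_DENSITY / (x * x)


def normal_function(n, x):
    """sqrt(x) * sin(n*pi*ln(x/X1)/ln(X2/X1)), scaled to satisfy Eq. (10)."""
    scale = math.sqrt(2.0 / (A_DENSITY * SPAN))
    return scale * math.sqrt(x) * math.sin(n * math.pi * math.log(x / X1) / SPAN)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def weighted_product(n, m):
    """Integral of mu(x) u_n u_m dx over the string, as in Eqs. (9) and (10)."""
    return simpson(lambda x: density(x) * normal_function(n, x) * normal_function(m, x),
                   X1, X2)


def unweighted_product(n, m):
    """The same integral with the density factor wrongly omitted."""
    return simpson(lambda x: normal_function(n, x) * normal_function(m, x), X1, X2)


if __name__ == "__main__":
    print("weighted   (1,2):", weighted_product(1, 2))
    print("weighted   (2,2):", weighted_product(2, 2))
    print("unweighted (1,2):", unweighted_product(1, 2))
```

The weighted integral for $n \neq m$ should come out zero to within the error of the quadrature, and the weighted integral for $n = m$ should be unity, while the unweighted integral does not vanish.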
92. Expansion of an Arbitrary Function Using Normal Func-
tions. — We have seen that we can write our solution
$$u = \sum_n (A_n\cos\omega_n t + B_n\sin\omega_n t)\,u_n(x).$$
If the initial conditions are $u(x, 0) = f(x)$ and $\frac{\partial u}{\partial t}(x, 0) = F(x)$,
where $u(x, t)$ is the function of coordinate and time, we have,
substituting in our general solution,
$$f(x) = \sum_n A_n u_n, \qquad F(x) = \sum_n B_n\omega_n u_n, \tag{11}$$
and we have the general problem of expanding an arbitrary function
in a series of normal functions, very much like our previous
problem of expanding an arbitrary function in a Fourier series.
As before, we shall content ourselves with showing that we can
find expressions for the coefficients $A_n$ and $B_n$ which formally
satisfy this type of expansion. The remainder of the problem,
showing that the series so built up really represents the function
and that it converges, will not be taken up here. It is sufficient
to say that such proofs can be given.
Let us multiply each of Eqs. (11) on both sides by $\mu(x)u_m$, and
integrate from $x = 0$ to $x = L$. We thus have
$$\int_0^L\mu(x)\,u_m f(x)\,dx = \sum_n A_n\int_0^L\mu(x)\,u_m u_n\,dx$$
and
$$\int_0^L\mu(x)\,u_m F(x)\,dx = \sum_n\omega_n B_n\int_0^L\mu(x)\,u_m u_n\,dx.$$
On the right side of each of these equations each term for which
$m \neq n$ vanishes because of our orthogonality relations. The
remaining term contains an integral which has the value unity
if the functions $u_n$ are normalized. Thus the whole sum reduces
to $A_m$ (or in the second equation to $\omega_m B_m$), and we have found
expressions for our coefficients:
$$A_m = \int_0^L\mu(x)\,f(x)\,u_m\,dx,$$
and
$$B_m = \frac{1}{\omega_m}\int_0^L\mu(x)\,F(x)\,u_m\,dx. \tag{12}$$
It is clear that our discussion of the Fourier expansion is but
a special case of the general one here discussed. The most convenient
point of view to take is to define the scalar product of
two functions $f(x)$ and $\phi(x)$ as
$$\int_0^L\mu(x)\,f(x)\,\phi(x)\,dx.$$
Then clearly our orthogonality and normalization conditions
are just what we should expect from our discussion of orthogonal
vectors in function space, in the last chapter. The rotation
of coordinates in function space again separates variables,
as it did in the case of the uniform string; but now the separate
normal or characteristic functions are more complicated in
form, as we see from the more complicated differential equations
they satisfy, though they still vibrate sinusoidally with time.
When we carry out an expansion of a function f(x) in terms
of the characteristic functions, the coefficients, as with the
Fourier expansion, are just the scalar products of the correspond-
ing characteristic functions with the given function, or
$$\int_0^L\mu(x)\,f(x)\,u_n\,dx,$$
as we wrote above.
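To illustrate the expansion, the sketch below computes the coefficients $A_m$ as weighted scalar products and sums the series again. It uses the normal functions quoted in Prob. 3 of this chapter (density $a/x^2$); the initial shape $f(x)$, the numerical constants, and the function names are assumptions made for the example, and no claim about convergence in general is intended.

```python
import math

A_DENSITY = 2.0           # assumed constant a in the density a/x^2 of Prob. 3
X1, X2 = 1.0, 3.0         # assumed end points of the string
SPAN = math.log(X2 / X1)


def density(x):
    return A_DENSITY / (x * x)


def normal_function(n, x):
    """Normal function of Prob. 3, normalized with the density as weight."""
    scale = math.sqrt(2.0 / (A_DENSITY * SPAN))
    return scale * math.sqrt(x) * math.sin(n * math.pi * math.log(x / X1) / SPAN)


def initial_shape(x):
    """Assumed initial displacement f(x), vanishing at both ends."""
    s = math.log(x / X1)
    return math.sqrt(x) * s * (SPAN - s)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def coefficient(m):
    """A_m = integral of mu(x) f(x) u_m dx, the first of Eqs. (12)."""
    return simpson(lambda x: density(x) * initial_shape(x) * normal_function(m, x),
                   X1, X2)


def partial_sum(x, terms):
    """Sum of the first `terms` terms of the series for f(x)."""
    return sum(coefficient(n) * normal_function(n, x) for n in range(1, terms + 1))


if __name__ == "__main__":
    x_mid = math.sqrt(X1 * X2)
    print("f(x_mid)      :", initial_shape(x_mid))
    print("25-term series:", partial_sum(x_mid, 25))
```

For this smooth initial shape the series should converge rapidly, and the coefficients of even order vanish by symmetry.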
93. Perturbation Theory. — One approximate method of integrating
the differential equation of the nonuniform vibrating
string has already been indicated, making use of the resemblance
of the actual functions to sines and cosines. An entirely differ-
ent approximate method, the method of perturbations, is also
frequently useful. This is a method which applies if the problem
is very nearly a soluble one, the density and tension varying
only slightly from their values in the soluble case. The usual
application is to an almost uniform string. For simplicity we
consider only the case where the tension T is a constant, while
the density is a function $\mu(x)$, almost equal to $\mu_0(x)$, for which
the problem can be solved. We assume that we know the characteristic
functions $u_n^0$ and frequencies $\omega_n^0$ for the soluble case,
satisfying, therefore, the differential equations
$$T\frac{d^2u_n^0}{dx^2} + (\omega_n^0)^2\mu_0(x)\,u_n^0 = 0. \tag{13}$$
We now remember that the functions $u_n^0$ form an orthogonal
set, and that any arbitrary function can be expanded in series of
such functions. Thus in particular the nth characteristic function
$u_n$ of the real problem can be so expanded:
$$u_n = \sum_k A_{nk}u_k^0. \tag{14}$$
We may regard our problem as that of determining the constants
$A_{nk}$. Considered in function space, this problem is very simple.
The functions $u_k^0$ form one set of orthogonal unit vectors, the
$u_n$'s another, and these equations merely express one set in
terms of the other; they are the equations for a rotation of
coordinates in function space, from the axes characteristic
of the "unperturbed" problem with density $\mu_0$ to the "perturbed"
problem with density $\mu$.
The easiest way of getting at the conditions for rotation
is simply to substitute $u_n$ in the differential equation which we
wish it to satisfy,
$$T\frac{d^2u_n}{dx^2} + \omega_n^2\mu\,u_n = 0.$$
If we do so, and use the differential equations which the $u_n^0$'s satisfy,
we have easily
$$\sum_k A_{nk}\left[(\omega_k^0)^2\mu_0 - \omega_n^2\mu\right]u_k^0 = 0.$$
Now we may multiply by an arbitrary $u_m^0$, and integrate from
0 to $L$. Remembering that the $u^0$'s are orthogonal, the result
is
$$\sum_k A_{nk}\left[(\omega_k^0)^2\mu^0_{mk} - \omega_n^2\mu_{mk}\right] = 0, \tag{15}$$
where $\mu^0_{mk} = \int_0^L\mu_0(x)\,u_m^0u_k^0\,dx = 1$ if $m = k$, 0 if $m \neq k$, and
$\mu_{mk} = \int_0^L\mu(x)\,u_m^0u_k^0\,dx$, a quantity differing from $\mu^0_{mk}$ only by
small quantities of the order of the deviation between $\mu$ and $\mu_0$.
We have here an infinite set of simultaneous homogeneous linear
equations ($m$ can take on any value) for the unknown constants
$A_{nk}$. These can be written, for a given $n$,
$$\begin{aligned}
A_{n1}\left[(\omega_1^0)^2 - \omega_n^2\mu_{11}\right] + A_{n2}\left(-\omega_n^2\mu_{12}\right) + A_{n3}\left(-\omega_n^2\mu_{13}\right) + \cdots &= 0\\
A_{n1}\left(-\omega_n^2\mu_{21}\right) + A_{n2}\left[(\omega_2^0)^2 - \omega_n^2\mu_{22}\right] + \cdots &= 0\\
A_{n1}\left(-\omega_n^2\mu_{31}\right) + \cdots &= 0\\
\cdots &= 0.
\end{aligned} \tag{16}$$
In general these will have no solutions; the condition for existence
of a solution is that the determinant of coefficients vanish. This
forms an equation for $\omega_n^2$, called a secular or determinantal
equation, and just analogous to that which we found with the
problem of two coupled vibrations, when we made a rotation
of coordinates, and we recognize it as the general type met in
such problems. In this case, the equation has an infinite number
of roots, one near each unperturbed frequency.
It is hardly feasible to solve the determinantal equation
directly, though it is not hard to make an approximation to it.
It is easiest, however, to proceed directly from the linear equations.
If the $u^0$'s are nearly the same as the $u$'s, it is plain that we
shall have $A_{nk} = 1$ almost, if $n = k$, or $= 0$ almost, if $n \neq k$.
The only term in the equations which is large and need be
considered is then that for which $n = k$ (so that $A_{nk}$ will be
large) and simultaneously $m = k$ (so that $\mu^0_{mk}$ and $\mu_{mk}$ will
be large). This term gives
$$A_{nn}\left[(\omega_n^0)^2 - \omega_n^2\mu_{nn}\right] = 0, \quad\text{or}\quad \omega_n^2 = \frac{(\omega_n^0)^2}{\mu_{nn}}.$$
If now $\mu = \mu_0 + \mu_1$, where $\mu_1$ is small compared with $\mu_0$, we have
$\mu_{nn} = 1 + \int_0^L\mu_1(u_n^0)^2\,dx$, so that, using the first term of a binomial
expansion,
$$\omega_n^2 = (\omega_n^0)^2\left(1 - \int_0^L\mu_1(u_n^0)^2\,dx\right), \tag{17}$$
correct to the first order of small quantities, but neglecting terms
of the order of the square of the integral of $\mu_1$. It is not hard
to get expressions of the same order of accuracy for the $A$'s.
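The first-order formula (17) is easily evaluated for an almost uniform string, for which the unperturbed normal functions are sines. The sketch below does this by numerical quadrature; the function names and the sample perturbation are hypothetical. A uniform increase of density is chosen as the test case because the exact frequency is then known.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def unperturbed_frequency_squared(n, length, tension, mu0):
    """Square of the nth angular frequency of the uniform string."""
    return (n * math.pi / length) ** 2 * tension / mu0


def perturbed_frequency_squared(n, length, tension, mu0, mu1):
    """Eq. (17): omega_n^2 = omega0_n^2 * (1 - integral of mu1 * u0_n^2 dx)."""
    def u0(x):
        # Unperturbed normal function, with integral of mu0 * u0^2 dx = 1.
        return math.sqrt(2.0 / (mu0 * length)) * math.sin(n * math.pi * x / length)

    shift = simpson(lambda x: mu1(x) * u0(x) ** 2, 0.0, length)
    return unperturbed_frequency_squared(n, length, tension, mu0) * (1.0 - shift)


if __name__ == "__main__":
    eps = 0.01  # assumed small uniform fractional increase of density
    first_order = perturbed_frequency_squared(1, 1.0, 1.0, 1.0, lambda x: eps)
    exact = unperturbed_frequency_squared(1, 1.0, 1.0, 1.0 + eps)
    print("first order:", first_order)
    print("exact      :", exact)
```

The two results should differ only in the second order of the small quantity, in agreement with the remark following Eq. (17).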
94. Reflection of Waves from a Discontinuity. — We mentioned
earlier that a progressive wave striking a discontinuity of density
would be partly reflected, and only partly transmitted. It is
easy to solve exactly the problem of propagation of the wave
over the discontinuity, and as this is one of the exactly soluble
cases of the vibration of the nonuniform string, and is the simplest
problem of reflection, it is worth carrying its discussion through.
Let us assume two uniform strings of different densities attached
to each other and subject to the same tension $T$. Let the first
string have a linear density $\mu_1$ and the second a density $\mu_2$.
We shall take the point of junction as $x = 0$. We thus have
different velocities of propagation $v_1 = \sqrt{T/\mu_1}$ and $v_2 = \sqrt{T/\mu_2}$
in the two strings. We may also define an "index of refraction"
of one medium with respect to the other as $n = v_1/v_2 = \sqrt{\mu_2/\mu_1}$.
At $x = 0$ we must satisfy certain conditions at every instant
of time. First, the displacement $u$ must be continuous across
the boundary if the strings remain joined together, and secondly,
the slope $\partial u/\partial x$ must also vary continuously across the boundary.
Were the latter condition not fulfilled, we would have the impossi-
ble situation of a finite force acting on an infinitesimal piece
of the strings at the junction.
Let us consider a harmonic progressive wave in the first
string ($\mu_1$) impinging on the junction. In the second string we
shall have a wave traveling in the same direction as the impinging
wave, but in order to satisfy the boundary conditions, we must
assume a reflected wave in the first string. Thus
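$$u_1 = A\,e^{2\pi i(\nu t - x/\lambda_1)} + B\,e^{2\pi i(\nu t + x/\lambda_1)}$$
and
$$u_2 = C\,e^{2\pi i(\nu t - x/\lambda_2)}.$$
The frequency is a fixed characteristic of the wave, independent
of the medium in which the wave is propagated. The wave
lengths $\lambda_1$ and $\lambda_2$ are related by the condition
$$\nu = \frac{v_1}{\lambda_1} = \frac{v_2}{\lambda_2}, \quad\text{or}\quad \frac{\lambda_1}{\lambda_2} = n.$$
At the junction, where $x = 0$, we have
$$(u_1)_0 = A\,e^{2\pi i\nu t} + B\,e^{2\pi i\nu t}, \qquad (u_2)_0 = C\,e^{2\pi i\nu t},$$
and
$$\left(\frac{\partial u_1}{\partial x}\right)_0 = -\frac{2\pi i}{\lambda_1}A\,e^{2\pi i\nu t} + \frac{2\pi i}{\lambda_1}B\,e^{2\pi i\nu t}, \qquad \left(\frac{\partial u_2}{\partial x}\right)_0 = -\frac{2\pi i}{\lambda_2}C\,e^{2\pi i\nu t}.$$
Thus the conditions of continuity give
$$A + B = C,$$
and
$$\frac{A}{\lambda_1} - \frac{B}{\lambda_1} = \frac{C}{\lambda_2},$$
whence
$$\frac{B}{A} = \frac{\lambda_2 - \lambda_1}{\lambda_1 + \lambda_2} = \frac{1 - n}{1 + n},$$
giving the ratio of the amplitude of the reflected to the incident
wave. Two limiting cases are interesting: if $\mu_2 = \infty$, so that
the junction is held fast, we have $n = \infty$, $B = -A$, or the wave
is entirely reflected, with a change of phase. The other case is
$\mu_2 = 0$, the junction is free, and we have $n = 0$, $B = A$, reflection
again being complete, but with no change of phase. In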
both these cases the incident and reflected waves combine to
give standing waves.
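The two continuity conditions determine the reflected and transmitted amplitudes for any pair of densities. The sketch below evaluates $B/A = (\lambda_2 - \lambda_1)/(\lambda_1 + \lambda_2)$, written in terms of $n = \lambda_1/\lambda_2 = \sqrt{\mu_2/\mu_1}$, together with $C/A$ from $A + B = C$; the function name and the special handling of an infinite density are assumptions of this illustration.

```python
import math


def amplitude_ratios(mu1, mu2):
    """Return (B/A, C/A) for a wave passing from density mu1 to density mu2."""
    if math.isinf(mu2):
        return -1.0, 0.0           # junction held fast: complete reflection
    n = math.sqrt(mu2 / mu1)       # index of refraction, n = v1/v2 = lambda1/lambda2
    # (lambda2 - lambda1)/(lambda1 + lambda2), with numerator and
    # denominator both divided by lambda2.
    reflected = (1.0 - n) / (1.0 + n)
    transmitted = 1.0 + reflected  # continuity of displacement: A + B = C
    return reflected, transmitted


if __name__ == "__main__":
    for mu2 in (0.0, 0.25, 1.0, 4.0, math.inf):
        print(mu2, amplitude_ratios(1.0, mu2))
```

Equal densities give no reflection; a heavier second string reverses the phase of the reflected wave, a lighter one does not.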
Problems
1. A heavy uniform flexible chain hangs freely from one end. The chain
performs small lateral vibrations. Show that the normal functions are
$u_n = J_0\!\left(2\omega_n\sqrt{x/g}\right)$, where $J_0$ represents the Bessel function of order zero;
$x$ is the distance from the bottom of the chain to any point, $g$ the acceleration
of gravity and $\omega_n$ is the angular frequency of the nth mode of vibration.
For a chain 8 feet long, find the periods of the first few modes of vibration
(use Jahnke Emde's tables to get the roots of the Bessel functions).
2. One end of a uniform flexible chain of length $l$ is attached to a vertical
rod which rotates at a constant angular velocity $\Omega_0$. Neglect the effect of
gravity, so that the chain stands out horizontally under the tension of
centrifugal force. Show that the differential equation for small vibrations
transverse to the length of the chain is
$$\frac{d}{dx}\left[(l^2 - x^2)\frac{du}{dx}\right] + \frac{2\omega^2}{\Omega_0^2}\,u = 0.$$
Introduce the variable $y = x/l$, and solve the resulting equation by the
power series method. The boundary conditions are $u(0) = 0$ and $u$ for
y = 1 must remain finite. Note that the latter condition can only be
fulfilled if the series breaks off to form a polynomial. Calculate the first
three polynomials and derive a relation for the frequency of the nth mode
of vibration. The polynomials so found are the Legendre polynomials of
odd order.
3. A string stretched with a uniform tension $T$, and with a density $a/x^2$,
is held at the points $x = x_1$ and $x = x_2$. Solve the equation, using the form
$u = \sqrt{x}\,z$, and show that the general solution is
$$u = Ax^{\frac{1}{2}+ik} + Bx^{\frac{1}{2}-ik},$$
where $k$ is defined by $k^2 + \frac{1}{4} = \omega^2a/T$, and $\omega$ is the angular velocity.
Show from this that the general form of the normal function is
$$\sqrt{x}\,\sin\frac{n\pi\ln(x/x_1)}{\ln(x_2/x_1)}, \qquad n = 1, 2, 3, \ldots,$$
and that
$$\omega_n^2 = \frac{T}{a}\left[\frac{1}{4} + \frac{n^2\pi^2}{(\ln x_2/x_1)^2}\right].$$
4. Solve the differential equation of Prob. 3 by the approximate method
described in this chapter, and show that the solution has the same form as
the exact solution. Show that the two solutions coincide in the limit of
large a.
5. A progressive wave travels on a uniform string which at $x = 0$ is
connected to a string whose density is $\mu = \mu_0 + \alpha x$. This second string
is connected to a third at $x = l$ which has the constant density $\mu = \mu_0 + \alpha l$,
and the whole is stretched with a uniform tension $T$. Using the approximate
method, find the ratio of the amplitude of the wave transmitted in the third
string to the original amplitude of the incident wave in the first string.
6. Consider a string of uniform density $\mu$, length $L$, but with a tension $T$
which varies slightly from its average tension $T_0$. Show with the help of a
perturbation calculation that the angular frequency of the nth mode is
given approximately by
$$\omega_n^2 = \frac{n^2\pi^2T_0}{L^2\mu}\left(1 - \frac{1}{n\pi T_0}\int_0^L\frac{dT}{dx}\sin\frac{n\pi x}{L}\cos\frac{n\pi x}{L}\,dx\right).$$
7. A uniform string of density $\mu_0$, tension $T$, has a small load $m$ placed
at $x = a$. Show that the frequency of the nth mode of vibration is approximately
given by
$$\omega_n^2 = \frac{n^2\pi^2T}{L^2\mu_0}\left(1 - \frac{2m}{\mu_0L}\sin^2\frac{n\pi a}{L}\right).$$
Show that the effect of the additional load vanishes if it is placed at a node,
and is biggest when at an antinode.
8. Show that the differential equation of Bessel's function $J_m$ is the same
as that for a string of tension $T = x$, $\mu\omega^2 = x - m^2/x$. Using the approximate
method developed for the vibration problem, show that approximately
$$J_m(x) = \frac{\text{constant}}{\sqrt[4]{x^2 - m^2}}\cos\left(\int\sqrt{1 - m^2/x^2}\,dx - \alpha\right),$$
where $x > m$.
9. Using the approximation of Prob. 8 for $J_0$ and $J_1$, compute the approximation
functions for a number of values of $x$, and show by a table of values
how well these agree with the correct functions. Choose the arbitrary
amplitude and phase factors to make the functions agree with the values of
$J_0$ and $J_1$ in the tables, for example making the zeros agree by adjusting $\alpha$,
and the maxima by adjusting the amplitude, taking such values as to get
the best agreement possible for large $x$'s.
10. Derive the differential Eq. (4) for $A$, in the approximate solution
$$u = A\,e^{i\omega\int\sqrt{\mu/T}\,dx}.$$
CHAPTER XV
THE VIBRATING MEMBRANE
The problem of a vibrating membrane is very little more
difficult in principle than the string. Let us take two coordinates,
x and y, in the plane of the membrane, writing u for the displace-
ment at right angles to the plane, so that we wish a, relation
u = u(x, y, t). Consider a small element of the membrane,
bounded by dx and dy. Let the mass per unit area be n, so
that the mass of the element is ndxdy. Then its mass, times
acceleration normal to the membrane, is n dx dy d 2 u/dt 2 . This
is equal to the force arising from the tension. Let the tension
be T. That is, if we cut the membrane along any line, the
material on one side of the cut exerts a force on the material
on the other, normal to the cut and equal to T for each unit
of length of the cut. We assume that T is constant over the mem-
brane. If the membrane were plane, the tension on its opposite
edges would cancel, and we should have .no resultant force.
If it is curved, however, we may proceed as follows. Along
the edge at x + dx, the tension is at right angles to the y axis,
almost along the x axis, but with a small component along the
u direction, equal approximately to T[ — ) per unit of length,
\OX/ x+dx
or this times dy for the actual length dy. Similarly along the
-<£).
edge at x the component is — Ti-^-j dy, so that the sum is
approximately T d 2 u/dx 2 dx dy. The forces acting along the
edges at y and y -\- dy similarly add to T d 2 u/dy 2 dx dy, and the
total force, the sum of these, is T (d 2 u/dx 2 + d 2 u/dy 2 ) dx dy.
Thus the differential equation, dividing by dxdy, is
d 2 u T fd 2 u d 2 u\ r
^ = T \dx~ 2 + W (1)
95. Boundary Conditions on the Rectangular Membrane. —
A membrane is ordinarily held fast around a certain curve.
In this way one can get a great variety of problems, by taking
different curves. The two simplest are the rectangular membrane,
and the circular membrane, or ordinary drum, and in
the present section we consider the rectangular case, assuming
the membrane to be held at $x = 0$, $x = X$, $y = 0$, $y = Y$.
We solve first by the exponential method, assuming
$$u = e^{i(\omega t + kx + ly)}.$$
Then the differential equation becomes $-\mu\omega^2 = -T(k^2 + l^2)$,
$\omega = \sqrt{T(k^2 + l^2)/\mu}$, giving the angular velocity of the vibration
in terms of the quantities $k$ and $l$. Instead of the exponential
solution we can equally well use sines or cosines. For example,
with a given $\omega$, $k$, and $l$, we can take
$$\begin{aligned}
u &= e^{i\omega t}\left(e^{ikx+ily} - e^{-ikx+ily} - e^{ikx-ily} + e^{-ikx-ily}\right)\\
&= e^{i\omega t}(2i\sin kx)\left(e^{ily} - e^{-ily}\right)\\
&= -4e^{i\omega t}\sin kx\,\sin ly.
\end{aligned}$$
As a matter of fact, this solution with sines is the one we want,
since it reduces to zero when $x = 0$ and $y = 0$. To apply the
condition when $x = X$ and $y = Y$, we must make the sines zero
at these points, or must have $\sin kX = 0$, $\sin lY = 0$, or $k =
n\pi/X$, $l = m\pi/Y$, where $n$, $m$ are integers. In terms of these
constants, we can then write
$$\omega_{nm} = \pi\sqrt{\frac{T}{\mu}\left(\frac{n^2}{X^2} + \frac{m^2}{Y^2}\right)}, \tag{2}$$
so that instead of having overtones whose frequencies are integral
multiples of a fundamental, the frequencies are given by a much
more complicated relation. There is one interesting result of
this. Pleasing musical notes depend on having the frequencies
of the overtones related in simple ways to the fundamental, so
that they sound well together, as with a vibrating string. In a
membrane or drum, in which these relations do not hold, the
sound is far less musical than with a string. This suggests other
cases, which do not exactly fall within the category of the present
chapter. For example, a vibrating bell acts as a two-dimensional
vibrating system, a little like a membrane, and has complicated
overtones which in general are not harmonics. But it has been
found by trial that if bells are made in their conventional shape,
overtones are so adjusted that the loud ones are actually in tune
with each other, though a slight change of shape would destroy
the quality.
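The lack of harmonic relations among the overtones of the rectangular membrane can be seen from a short table of frequency ratios. The sketch below evaluates $\omega = \sqrt{T(k^2 + l^2)/\mu}$ with $k = n\pi/X$, $l = m\pi/Y$; the function names and the choice of a square membrane are assumptions made for illustration.

```python
import math


def membrane_frequency(n, m, tension, density, x_len, y_len):
    """omega_nm = pi * sqrt((T/mu) * (n^2/X^2 + m^2/Y^2))."""
    return math.pi * math.sqrt(
        tension / density * ((n / x_len) ** 2 + (m / y_len) ** 2)
    )


def overtone_ratios(x_len, y_len, max_index=3):
    """Frequencies of the lowest overtones divided by the fundamental omega_11.

    The ratios do not depend on T or mu, so both are set equal to 1 here.
    """
    base = membrane_frequency(1, 1, 1.0, 1.0, x_len, y_len)
    return sorted(
        membrane_frequency(n, m, 1.0, 1.0, x_len, y_len) / base
        for n in range(1, max_index + 1)
        for m in range(1, max_index + 1)
    )


if __name__ == "__main__":
    for ratio in overtone_ratios(1.0, 1.0):
        print(round(ratio, 4))
```

For the square membrane only the overtones with $n = m$ are integral multiples of the fundamental; the others are irrational multiples of it.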
96. The Nodes in a Vibrating Membrane. — If the membrane
is vibrating with one overtone, the amplitude will be zero along
certain lines, which will stay at rest. These nodal lines form a
rectangular network, coming when $nx/X = 1, 2, \ldots, n - 1$,
and for $my/Y = 1, 2, \ldots, m - 1$. At any instant, if the
membrane is displaced upward in one rectangle, it will be displaced
downward in all adjacent rectangles. Such a nodal
arrangement is characteristic of all sorts of standing wave
problems.
97. Initial Conditions. — At t = 0, we may wish to fix the shape
and velocity of our membrane, obtaining initial conditions of
the sort found with the string, and leading as before to Fourier
series. For example, suppose the initial velocity is zero, the
initial displacement a function $f(x, y)$. Then we must have
$$u = \sum_{n,m}A_{nm}\cos\omega_{nm}t\,\sin\frac{n\pi x}{X}\sin\frac{m\pi y}{Y}, \tag{3}$$
where $\omega_{nm}$ is given in Eq. (2), and where we have
$$f(x, y) = \sum_{n,m}A_{nm}\sin\frac{n\pi x}{X}\sin\frac{m\pi y}{Y}. \tag{4}$$
To find the coefficients $A$, the amplitudes of the various overtones
necessary to satisfy the conditions, we must expand the function
$f(x, y)$ in a series of products of sines, a double Fourier series, as
it is called. As in the last chapter, we assume that the expansion
can be carried through, and ask only for the values of the coefficients.
Multiplying both sides of the equation by
$\sin\frac{n'\pi x}{X}\sin\frac{m'\pi y}{Y}$, where $n'$, $m'$ are definite integers, we integrate with respect
to $x$ from 0 to $X$, and with respect to $y$ from 0 to $Y$. We find as
before that $\int_0^X\sin\frac{n\pi x}{X}\sin\frac{n'\pi x}{X}\,dx$ is zero unless $n = n'$, and is
$X/2$ if $n = n'$. Thus the final result is
$$A_{n'm'} = \frac{4}{XY}\int_0^X\!\!\int_0^Y f(x, y)\sin\frac{n'\pi x}{X}\sin\frac{m'\pi y}{Y}\,dx\,dy. \tag{5}$$
It is worth noting that this is the first time we have had to use a
double integral. If $f(x, y)$ is a complicated function of the
coordinates, it can, of course, be a very difficult problem actually
to evaluate the integral.
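When the double integral for the coefficients cannot be evaluated in closed form, it may be approximated numerically. The sketch below uses the midpoint rule on a rectangular grid; the function name, the grid size, and the test shape are assumptions made for illustration.

```python
import math


def fourier_coefficient(f, n, m, x_len, y_len, cells=100):
    """A_nm = 4/(X*Y) * double integral of f(x, y) sin(n pi x/X) sin(m pi y/Y),
    approximated by the midpoint rule on a cells-by-cells grid."""
    dx, dy = x_len / cells, y_len / cells
    total = 0.0
    for i in range(cells):
        x = (i + 0.5) * dx
        sin_x = math.sin(n * math.pi * x / x_len)
        for j in range(cells):
            y = (j + 0.5) * dy
            total += f(x, y) * sin_x * math.sin(m * math.pi * y / y_len)
    return 4.0 / (x_len * y_len) * total * dx * dy


if __name__ == "__main__":
    X, Y = 2.0, 3.0  # assumed dimensions of the membrane

    def shape(x, y):
        # Assumed initial displacement: a single overtone, n = 1, m = 2.
        return math.sin(math.pi * x / X) * math.sin(2 * math.pi * y / Y)

    print(fourier_coefficient(shape, 1, 2, X, Y))
    print(fourier_coefficient(shape, 1, 1, X, Y))
```

An initial shape consisting of a single overtone should give unit amplitude for that overtone and zero for all others, which serves as a check on the normalizing factor $4/XY$.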
98. The Method of Separation of Variables.— To solve our
differential equation, we may adopt a slightly different method,
called the method of separation of variables, which does not
directly depend on the use of exponentials. It is a method for
reducing the partial differential equation to a set of ordinary
equations, and we shall find it very useful. In fact, it is so valu-
able that practically the only partial differential equations which
can be solved at all are those for which this method can be used.
We wish to solve $\frac{\partial^2u}{\partial t^2} = \frac{T}{\mu}\left(\frac{\partial^2u}{\partial x^2} + \frac{\partial^2u}{\partial y^2}\right)$. Suppose we try to
find a solution $u$ which is the product of a function of $x$, a function
of $y$, and a function of $t$; say $u = P(x)Q(y)R(t)$, where $P$ is a
function of $x$ to be determined, and so on. Of course, it is not
obvious that one can find such a solution, but our experience
would lead us to try it. If we substitute, we have, for example,
$\partial u/\partial t = PQ\,dR/dt$, and so on. If we denote $dR/dt$ by $R'$, with
corresponding notation, we then have $PQR'' = (T/\mu)(P''QR +
PQ''R)$. Next we divide by $PQR$, obtaining
$$\frac{R''}{R} = \frac{T}{\mu}\left(\frac{P''}{P} + \frac{Q''}{Q}\right). \tag{6}$$
We now make the step characteristic of the method of separation
of variables: we observe that the function $R''/R$ on the left
of Eq. (6) is a function of t alone, the quantity on the right a
function of x and y alone. The equation then states that a
certain function of t equals a function of x and y, whatever x, y,
and t may be. But this is clearly impossible in general. If, for
example, we keep x and y constant, and vary t, the left side would
change, the right remaining constant, and the equation would
not be satisfied. The only exception, as this example shows, is if
the left side is a constant, independent of t, and similarly if the
right side is a constant, independent of x and y. Let us then
impose these conditions, letting the constant be $-\omega^2$ (an arbitrary
constant so far, but later to be identified with our other $\omega$).
We have then two equations,
$$\frac{R''}{R} = -\omega^2,$$
or
$$R'' + \omega^2 R = 0,$$
and
$$\frac{P''}{P} + \frac{Q''}{Q} = -\omega^2\frac{\mu}{T}. \tag{7}$$
Taking the latter equation, we may again separate. We write it
$$\frac{P''}{P} = -\frac{Q''}{Q} - \omega^2\frac{\mu}{T}. \tag{8}$$
The left side is a function of x, the right side of y, and by the same
argument each is a constant, say $-k^2$. Then we have $P'' + k^2 P = 0$,
and $-k^2 = -(Q''/Q) - \omega^2(\mu/T)$. We can rewrite
this last $Q''/Q = -l^2$, where
$$-l^2 = k^2 - \omega^2\frac{\mu}{T}, \quad\text{or}\quad \omega^2 = \frac{T}{\mu}(k^2 + l^2), \tag{9}$$
and it becomes $Q'' + l^2 Q = 0$. We now have three ordinary
differential equations for P, Q, and R, whose solutions are
evidently
$$P = e^{ikx}\ (\text{or } e^{-ikx}, \text{ or } \sin kx, \text{ or } \cos kx), \qquad Q = e^{ily}, \qquad R = e^{i\omega t},$$
so that the final solution is as we found before, with the same
relation between $\omega$, $k$, and $l$.
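The result of the separation can be checked numerically. The sketch below takes the real product solution sin kx sin ly cos ωt, with ω fixed by the relation ω² = (T/μ)(k² + l²), and evaluates both sides of the wave equation by central differences. The numerical values of T/μ, k, l and the sample point are arbitrary assumptions made only for the check.

```python
import math

T_over_mu = 4.0          # T/mu, illustrative value (assumed)
k, l = 1.3, 0.7          # separation constants (assumed)
omega = math.sqrt(T_over_mu * (k * k + l * l))   # omega^2 = (T/mu)(k^2 + l^2)

def u(x, y, t):
    # Product solution P(x) Q(y) R(t), using the real sine and cosine forms.
    return math.sin(k * x) * math.sin(l * y) * math.cos(omega * t)

def second_difference(g, s, h=1e-3):
    """Central-difference estimate of the second derivative g''(s)."""
    return (g(s + h) - 2.0 * g(s) + g(s - h)) / (h * h)

x0, y0, t0 = 0.4, 0.9, 0.25   # arbitrary sample point
u_tt = second_difference(lambda t: u(x0, y0, t), t0)
u_xx = second_difference(lambda x: u(x, y0, t0), x0)
u_yy = second_difference(lambda y: u(x0, y, t0), y0)

# Should vanish up to the finite-difference error.
residual = u_tt - T_over_mu * (u_xx + u_yy)
print(residual)
```

If ω is changed so that it no longer satisfies the relation, the residual ceases to be small, which is the numerical counterpart of the statement that the three separation constants are not independent.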
99. The Circular Membrane. — The differential equation for
the circular membrane is the same as for the rectangular one, but
the boundary condition is different: the displacement u is always
zero on a circle of radius $\rho$ about the origin. To solve the
problem, the simplest method is to introduce polar coordinates, $r$,
$\theta$; for then the boundary condition is that $u = 0$ when $r = \rho$,
which is a condition easy to apply. Let us then write our equa-
tion in polar coordinates. Before doing it, we shall give the
conventional names of the equations and symbols we meet. Our
equation, which is often written
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} - \frac{1}{v^2}\frac{\partial^2 u}{\partial t^2} = 0, \tag{10}$$
where $v = \sqrt{T/\mu}$ is the velocity of the wave, is called the wave
equation, for u represents waves, either progressive or standing.
The special case $\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2 = 0$, where u is independent
of t, is called Laplace's equation. And the expression
$\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2$, which we have already seen can be written in vector
notation $\nabla^2 u$, is called the Laplacian of u. Our present problem
is to find the Laplacian in polar coordinates.
100. The Laplacian in Polar Coordinates. — Let us introduce
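$r$ and $\theta$ by the equations $x = r\cos\theta$, $y = r\sin\theta$, $r = \sqrt{x^2 + y^2}$,
$\theta = \tan^{-1}(y/x)$, so that
$$\frac{\partial r}{\partial x} = \frac{x}{r}, \quad \frac{\partial r}{\partial y} = \frac{y}{r}, \quad \frac{\partial\theta}{\partial x} = -\frac{y}{r^2}, \quad \frac{\partial\theta}{\partial y} = \frac{x}{r^2},$$
and
$$\frac{\partial^2 r}{\partial x^2} = \frac{1}{r} - \frac{x^2}{r^3}, \quad \frac{\partial^2 r}{\partial y^2} = \frac{1}{r} - \frac{y^2}{r^3}, \quad \frac{\partial^2\theta}{\partial x^2} = \frac{2xy}{r^4}, \quad \frac{\partial^2\theta}{\partial y^2} = -\frac{2xy}{r^4}.$$
Then we have
$$\frac{\partial u}{\partial x} = \frac{\partial u}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial u}{\partial\theta}\frac{\partial\theta}{\partial x}.$$
If we apply this process again, we find without difficulty
$$\frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 u}{\partial r^2}\left(\frac{\partial r}{\partial x}\right)^2 + 2\frac{\partial^2 u}{\partial r\,\partial\theta}\left(\frac{\partial r}{\partial x}\frac{\partial\theta}{\partial x}\right) + \frac{\partial^2 u}{\partial\theta^2}\left(\frac{\partial\theta}{\partial x}\right)^2 + \frac{\partial u}{\partial r}\frac{\partial^2 r}{\partial x^2} + \frac{\partial u}{\partial\theta}\frac{\partial^2\theta}{\partial x^2}.$$
Proceeding similarly with y, and adding, we have
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{\partial^2 u}{\partial r^2}\left[\left(\frac{\partial r}{\partial x}\right)^2 + \left(\frac{\partial r}{\partial y}\right)^2\right] + 2\frac{\partial^2 u}{\partial r\,\partial\theta}\left(\frac{\partial r}{\partial x}\frac{\partial\theta}{\partial x} + \frac{\partial r}{\partial y}\frac{\partial\theta}{\partial y}\right) + \frac{\partial^2 u}{\partial\theta^2}\left[\left(\frac{\partial\theta}{\partial x}\right)^2 + \left(\frac{\partial\theta}{\partial y}\right)^2\right] + \frac{\partial u}{\partial r}\left(\frac{\partial^2 r}{\partial x^2} + \frac{\partial^2 r}{\partial y^2}\right) + \frac{\partial u}{\partial\theta}\left(\frac{\partial^2\theta}{\partial x^2} + \frac{\partial^2\theta}{\partial y^2}\right).$$
Substituting, this becomes
$$\frac{\partial^2 u}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2} + \frac{1}{r}\frac{\partial u}{\partial r},$$
which can also be written
$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2}. \tag{11}$$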
This is the expression for the Laplacian in polar coordinates.
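A quick numerical comparison makes the result plausible. The sketch below evaluates the Laplacian of one smooth test function twice by central differences, once in rectangular and once in polar coordinates, and compares both with the value found by differentiating by hand. The test function and the sample point are arbitrary assumptions.

```python
import math

def u_cart(x, y):
    # Arbitrary smooth test function (assumed for illustration).
    return x**3 * y + math.exp(x) * math.sin(2.0 * y)

def u_polar(r, theta):
    # The same function expressed through x = r cos(theta), y = r sin(theta).
    return u_cart(r * math.cos(theta), r * math.sin(theta))

def d1(g, s, h=1e-3):
    """Central-difference first derivative."""
    return (g(s + h) - g(s - h)) / (2.0 * h)

def d2(g, s, h=1e-3):
    """Central-difference second derivative."""
    return (g(s + h) - 2.0 * g(s) + g(s - h)) / (h * h)

r0, th0 = 1.2, 0.7
x0, y0 = r0 * math.cos(th0), r0 * math.sin(th0)

# Rectangular form: u_xx + u_yy
lap_cart = d2(lambda x: u_cart(x, y0), x0) + d2(lambda y: u_cart(x0, y), y0)

# Polar form: u_rr + (1/r) u_r + (1/r^2) u_theta_theta
lap_polar = (d2(lambda r: u_polar(r, th0), r0)
             + d1(lambda r: u_polar(r, th0), r0) / r0
             + d2(lambda th: u_polar(r0, th), th0) / r0**2)

# Value obtained by differentiating the test function by hand.
lap_exact = 6.0 * x0 * y0 - 3.0 * math.exp(x0) * math.sin(2.0 * y0)
print(lap_cart, lap_polar, lap_exact)
```

The three numbers should agree to within the finite-difference error, which with the step used here is of the order of one part in a million.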
101. Solution of the Differential Equation by Separation. —
Our differential equation is now
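$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2} = \frac{1}{v^2}\frac{\partial^2 u}{\partial t^2}. \tag{12}$$
Let us solve by separation of variables, assuming
$u = R(r)\Theta(\theta)T(t)$. Then, substituting, and dividing by $R\Theta T$, the
result is
$$\frac{1}{R}\frac{1}{r}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \frac{1}{r^2}\frac{1}{\Theta}\frac{d^2\Theta}{d\theta^2} = \frac{1}{v^2}\frac{1}{T}\frac{d^2 T}{dt^2}. \tag{13}$$
The problem is separated: the left side depends only on $r$ and $\theta$, the
right on $t$. Each must then be a constant, which we shall call
$-\omega^2/v^2$, giving $d^2T/dt^2 + \omega^2 T = 0$, $T = A\cos\omega t + B\sin\omega t$,
and
$$\frac{1}{R}\frac{1}{r}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \frac{1}{r^2}\frac{1}{\Theta}\frac{d^2\Theta}{d\theta^2} = -\frac{\omega^2}{v^2}.$$
We multiply by $r^2$, and transfer the first term to the right,
obtaining
$$\frac{1}{\Theta}\frac{d^2\Theta}{d\theta^2} = -r^2\left[\frac{1}{R}\frac{1}{r}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \frac{\omega^2}{v^2}\right].$$
Again the variables are separated, the left side depending only on
$\theta$, the right on $r$. Let each equal $-m^2$. Then $d^2\Theta/d\theta^2 + m^2\Theta = 0$,
$\Theta = C\cos m\theta + D\sin m\theta$, and the equation for $r$ can be
immediately changed to
$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \left(\frac{\omega^2}{v^2} - \frac{m^2}{r^2}\right)R = 0.$$
This is just like Bessel's equation (see Prob. 13, Chap. II), except
that it has the constant $\omega^2/v^2$ in place of 1. A simple change
of variables removes this discrepancy, however. Let $x = \omega r/v$.
Then the equation becomes
$$\frac{\omega^2}{v^2}\frac{1}{x}\frac{d}{dx}\left(x\frac{dR}{dx}\right) + \frac{\omega^2}{v^2}\left(1 - \frac{m^2}{x^2}\right)R = 0,$$
or cancelling $\omega^2/v^2$, it is exactly Bessel's equation
$$\frac{1}{x}\frac{d}{dx}\left(x\frac{dR}{dx}\right) + \left(1 - \frac{m^2}{x^2}\right)R = 0. \tag{15}$$
The solution is then $R = \text{constant} \times J_m(x)$, a Bessel's function
of the $m$th order, whose expansion in power series we have already
considered, for integral values of $m$, and for which we have found
an approximation in the preceding chapter (Chap. XIV, Prob. 8).
We shall see in the next section that only integral $m$'s must be
used in the present problem.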
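As a rough check on the statement that $J_m(x)$ satisfies Eq. (15), the sketch below sums the standard power series for $J_m$ with integral $m$ and substitutes it into the equation, written out as $R'' + R'/x + (1 - m^2/x^2)R = 0$, using central differences. The function names, the number of terms kept, and the sample point are assumptions made for the illustration; the truncated series is adequate only for moderate $x$.

```python
import math

def bessel_j(m, x, terms=40):
    """Power series J_m(x) = sum_j (-1)^j (x/2)^(2j+m) / (j! (j+m)!), integer m >= 0."""
    total = 0.0
    for j in range(terms):
        total += (-1) ** j * (x / 2.0) ** (2 * j + m) / (
            math.factorial(j) * math.factorial(j + m))
    return total

def bessel_residual(m, x, h=1e-3):
    """Left side of Bessel's equation, R'' + R'/x + (1 - m^2/x^2) R, for R = J_m."""
    R = lambda s: bessel_j(m, s)
    first = (R(x + h) - R(x - h)) / (2.0 * h)
    second = (R(x + h) - 2.0 * R(x) + R(x - h)) / (h * h)
    return second + first / x + (1.0 - m * m / (x * x)) * R(x)

# Residuals should be small compared with J_m itself.
for m in (0, 1, 2):
    print(m, bessel_j(m, 2.5), bessel_residual(m, 2.5))
```

The residual is not exactly zero only because the derivatives are replaced by finite differences.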
102. Boundary Conditions. — Consider in the first place the
solution for $\Theta$. At a given point of the membrane, the value of $\theta$
is determined, but not in a single-valued way. Thus if the point
corresponds to $\theta$ = 47 deg., it would equally well correspond to
47 deg. + 360 deg., or 47 deg. + 720 deg., etc. Now $\Theta$ must
surely have a definite value at each point of the membrane. Thus
it must have the same value for $\theta$, $\theta + 2\pi$, $\theta + 4\pi$, etc. In
other words, $\Theta$ is periodic in $\theta$ with period $2\pi$. But this is true
if, and only if, $m$ is an integer. Hence our first condition,
necessary to make the function single valued, is that $m$ be an integer.
Next consider the solution for $r$: $R = J_m(\omega r/v)$, where now
$m$ is an integer. At the edge of the membrane, $u = 0$, which
means that $R = 0$, or $J_m(\omega\rho/v) = 0$. Now $J_m(x)$ is zero only
for certain definite values of $x$, say $x = x_1, x_2, x_3, \ldots$. From
the properties of Bessel's functions, we have seen that there
are an infinite number of such roots. Thus, to satisfy our
boundary conditions, we must let $\omega\rho/v = x_1, x_2, \ldots$. The
only adjustable quantity is $\omega$, so that it must be determined
by one or another of the values $\omega = vx_1/\rho$, $vx_2/\rho$, $\ldots$.
Suppose in particular that $\omega = vx_k/\rho$, determined by the $k$th root
of $J_m$. Then we should properly label it $\omega_{mk}$, since it depends
on both these indices. We have thus determined our solution
completely, except for the remaining arbitrary constants. These
can be easily expressed in the following form:
$$u = (A\cos\omega_{mk}t + B\sin\omega_{mk}t)\cos(m\theta - \alpha_{mk})\,J_m(\omega_{mk}r/v).$$
This is a particular solution. The general solution is the sum
of such terms, taken over all $m$'s and $k$'s.
Fig. 22. — Nodes of circular membrane. Shaded segments are displaced in
opposite phase to unshaded. [Panels: m = 0, k = 0; m = 1, k = 0;
m = 2, k = 0; m = 0, k = 1; m = 1, k = 1; m = 1, k = 2.]
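The roots $x_k$ of $J_m$, and hence the frequencies $\omega_{mk} = vx_k/\rho$, are easily found numerically. The following sketch locates sign changes of the power series for $J_m$ and refines each by bisection. The helper names, the search step, and the values taken for the wave velocity and the radius are assumptions for illustration only.

```python
import math

def bessel_j(m, x, terms=60):
    """Power series for J_m(x), integer m >= 0 (adequate for x up to about 15)."""
    return sum((-1) ** j * (x / 2.0) ** (2 * j + m)
               / (math.factorial(j) * math.factorial(j + m))
               for j in range(terms))

def bessel_zeros(m, count, step=0.1):
    """First `count` positive roots of J_m: scan for a sign change, then bisect."""
    zeros = []
    a = step                      # start above x = 0, where J_m vanishes for m > 0
    fa = bessel_j(m, a)
    while len(zeros) < count:
        b = a + step
        fb = bessel_j(m, b)
        if fa * fb < 0.0:
            lo, hi, flo = a, b, fa
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                fmid = bessel_j(m, mid)
                if flo * fmid <= 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            zeros.append(0.5 * (lo + hi))
        a, fa = b, fb
    return zeros

v, rho = 100.0, 0.25   # wave velocity and membrane radius (assumed values)
for m in (0, 1, 2):
    omegas = [v * x / rho for x in bessel_zeros(m, 3)]
    print(m, [round(w, 1) for w in omegas])
```

Dividing each frequency by the lowest one shows that the ratios are not whole numbers, so the overtones of the circular membrane are not harmonics of the fundamental.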
103. Physical Nature of the Solution. — A single term corresponds
to a single standing wave. Its nodes are concentric
circles, values of $r$ for which $J_m(\omega_{mk}r/v)$ is zero, of which of
course the boundary is one; and radii, determined by
$\cos(m\theta - \alpha) = 0$, as in Fig. 22. It is readily seen that there are $m$ radial
nodes, $k$ circular nodes without counting the boundary. The
arbitrary constant $\alpha_{mk}$ determines the angles at which the radial
nodes are; changing it simply rotates the whole nodal pattern.
The constants $A$ and $B$ determine the amplitude and phase
of the disturbance as a function of time. We may, if we choose,
consider that there are two separate waves possible for each
frequency, $\cos m\theta\,J_m$ and $\sin m\theta\,J_m$. Such a case is called degenerate;
we shall see in a problem that the same thing is true of the
square membrane. In a degenerate case, with two or more
possible vibrations of the same frequency, it is plain that any
linear combination of these vibrations gives a possible vibration
of this same frequency. As with the rectangular membrane,
the set of frequencies $\omega_{mk}$ does not form a simple set of overtones
with pitches in harmonic relation to each other.
104. Initial Condition at t = 0. — Suppose we know that at
t = 0, the displacement of the membrane is given by $F(r, \theta)$,
and the velocity by $G(r, \theta)$. Now we can write the whole solution,
in a slightly more general way than before,
$$u = \sum_{m,k}\left[(A_{mk}\cos\omega_{mk}t + B_{mk}\sin\omega_{mk}t)\cos m\theta + (C_{mk}\cos\omega_{mk}t + D_{mk}\sin\omega_{mk}t)\sin m\theta\right]J_m\!\left(\frac{\omega_{mk}r}{v}\right). \tag{16}$$
Thus, writing displacement and velocity at t = 0, we have
$$F(r, \theta) = \sum_{m,k}(A_{mk}\cos m\theta + C_{mk}\sin m\theta)\,J_m\!\left(\frac{\omega_{mk}r}{v}\right),$$
$$G(r, \theta) = \sum_{m,k}\omega_{mk}(B_{mk}\cos m\theta + D_{mk}\sin m\theta)\,J_m\!\left(\frac{\omega_{mk}r}{v}\right).$$
The $A$'s, $B$'s, $C$'s, $D$'s must be chosen to fit these conditions.
Both conditions are of the same sort. They require us to find
the coefficients for expanding a function of $r$ and $\theta$ in series of
products of sines and cosines and Bessel's functions. Now it
proves to be true that both the sines or cosines and the Bessel's
functions are orthogonal, and as a result of this we can make the
expansions we desire in the usual way, as with Fourier series.
Let us take the first equation, multiply by $\cos n\theta\,J_n(\omega_{nl}r/v)$, and
integrate over the area of the drum. That is, we integrate with
respect to $r$ from 0 to $\rho$, and with respect to $\theta$ from 0 to $2\pi$,
and the element of area is $r\,dr\,d\theta$. Then we have
$$\int_0^{\rho}\!\!\int_0^{2\pi} F(r, \theta)\cos n\theta\,J_n\!\left(\frac{\omega_{nl}r}{v}\right) r\,dr\,d\theta = \sum_{m,k}\int_0^{2\pi}(A_{mk}\cos m\theta + C_{mk}\sin m\theta)\cos n\theta\,d\theta \int_0^{\rho} r\,J_m\!\left(\frac{\omega_{mk}r}{v}\right)J_n\!\left(\frac{\omega_{nl}r}{v}\right)dr. \tag{17}$$
By the orthogonal property of the sine and cosine, the right side
is zero unless $m = n$, giving $\pi\sum_k A_{nk}\int_0^{\rho} r\,J_n(\omega_{nk}r/v)\,J_n(\omega_{nl}r/v)\,dr$.
But now we shall prove in the next section that the $J$'s are
orthogonal in the sense that $\int_0^{\rho} r\,J_n(\omega_{nk}r/v)\,J_n(\omega_{nl}r/v)\,dr = 0$, if $k \ne l$.
Using this fact, our sum reduces to the single term
$$\pi A_{nl}\int_0^{\rho} r\,J_n^2\!\left(\frac{\omega_{nl}r}{v}\right)dr.$$
If the last integral, which could be easily computed if we knew
the properties of Bessel's functions better, were denoted by $c_{nl}$,
then we should have
$$A_{nl} = \frac{1}{\pi c_{nl}}\int_0^{\rho}\!\!\int_0^{2\pi} F(r, \theta)\cos n\theta\,J_n\!\left(\frac{\omega_{nl}r}{v}\right) r\,dr\,d\theta, \tag{18}$$
determining the coefficients A in terms of a single integral.
Similarly we could get formulas for the B's, C's, D's. Of course,
in an actual case, these integrals might be very difficult to com-
pute, but nevertheless we have a general solution of our problem.
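105. Proof of Orthogonality of the J's. — We can prove the
orthogonality of the $J$'s directly from the differential equation,
as was done in the last chapter for the nonuniform vibrating
string. We wish to prove that
$$\int_0^{\rho} r\,J_n\!\left(\frac{\omega_{nk}r}{v}\right)J_n\!\left(\frac{\omega_{nl}r}{v}\right)dr = 0, \quad k \ne l.$$
Now we have
$$\frac{1}{r}\frac{d}{dr}\left[r\frac{dJ_n(\omega_{nl}r/v)}{dr}\right] + \left(\frac{\omega_{nl}^2}{v^2} - \frac{n^2}{r^2}\right)J_n\!\left(\frac{\omega_{nl}r}{v}\right) = 0,$$
$$\frac{1}{r}\frac{d}{dr}\left[r\frac{dJ_n(\omega_{nk}r/v)}{dr}\right] + \left(\frac{\omega_{nk}^2}{v^2} - \frac{n^2}{r^2}\right)J_n\!\left(\frac{\omega_{nk}r}{v}\right) = 0.$$
Multiply the first by $r\,J_n(\omega_{nk}r/v)$, the second by $r\,J_n(\omega_{nl}r/v)$,
subtract, and integrate from 0 to $\rho$. The result is
$$\int_0^{\rho}\left\{J_n\!\left(\frac{\omega_{nk}r}{v}\right)\frac{d}{dr}\left[r\frac{dJ_n(\omega_{nl}r/v)}{dr}\right] - J_n\!\left(\frac{\omega_{nl}r}{v}\right)\frac{d}{dr}\left[r\frac{dJ_n(\omega_{nk}r/v)}{dr}\right]\right\}dr = \left(\frac{\omega_{nk}^2 - \omega_{nl}^2}{v^2}\right)\int_0^{\rho} r\,J_n\!\left(\frac{\omega_{nl}r}{v}\right)J_n\!\left(\frac{\omega_{nk}r}{v}\right)dr.$$
Just as in the last chapter, the left side can be shown to be zero,
by integrating by parts. Then the right side must be zero,
and either $\omega_{nk}^2 - \omega_{nl}^2$ is zero, which is not true unless $k$ and $l$
refer to the same overtone, or $\int_0^{\rho} r\,J_n(\omega_{nl}r/v)\,J_n(\omega_{nk}r/v)\,dr = 0$,
which we wished to prove. The orthogonality is not quite
of the form discussed in the last chapter, for the differential
equation is of slightly different form, the quantity
$(\omega^2/v^2 - n^2/r^2)\,r$ appearing in place of $\omega^2\mu$, so that the final result is not
just like integrating $\mu$ times the product of the functions to get
zero.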
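The orthogonality can also be seen numerically. The sketch below takes $n = 0$ and $\rho = v = 1$, so that the $\omega_{0k}$ are simply the roots of $J_0$, and evaluates the weighted integrals by Simpson's rule. The root values are standard tabulated numbers; the helper names and the test shape are assumptions for illustration. The normalizing integral is compared with $\tfrac{1}{2}J_1^2(x_k)$, a standard property of Bessel's functions that is quoted here and not derived in the text.

```python
import math

def bessel_j(m, x, terms=40):
    """Power series for J_m(x), integer m >= 0 (adequate for x up to about 10)."""
    return sum((-1) ** j * (x / 2.0) ** (2 * j + m)
               / (math.factorial(j) * math.factorial(j + m))
               for j in range(terms))

def radial_integral(g, steps=1000):
    """Simpson's rule for the integral of r * g(r) over 0 <= r <= 1 (steps even)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * (i * h) * g(i * h)
    return total * h / 3.0

# Tabulated roots of J_0; with rho = v = 1 these are the omega_0k.
x1, x2, x3 = 2.404825557695773, 5.520078110286311, 8.653727912911013

# Different overtones: the weighted integral should vanish.
cross = radial_integral(lambda r: bessel_j(0, x1 * r) * bessel_j(0, x2 * r))
# Same overtone: the normalizing integral, called c in the text.
norm1 = radial_integral(lambda r: bessel_j(0, x1 * r) ** 2)

# Radial part of the coefficient recipe, applied to a known test shape.
shape = lambda r: 2.0 * bessel_j(0, x1 * r) - 0.3 * bessel_j(0, x3 * r)
a1 = radial_integral(lambda r: shape(r) * bessel_j(0, x1 * r)) / norm1
print(cross, norm1, a1)
```

The projection returns the amplitude with which the first overtone was put into the test shape, the third overtone contributing nothing because of the orthogonality.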
Problems
1. A rectangular drum is 20 by 40 cm., its whole mass is 100 gm., the
total pull on the faces 50 and 100 kg., respectively. Find the frequencies,
in cycles per second, of the five lowest modes of vibration, and sketch the
nodes for each.
2. The special case of degeneracy arises when a rectangular membrane is
square. Then the two modes of vibration $e^{i\omega t}\sin(n\pi x/X)\sin(m\pi y/X)$ and
$e^{i\omega t}\sin(m\pi x/X)\sin(n\pi y/X)$ have the same frequency (where we let X = Y).
Thus any linear combination of these is a solution, again with this frequency.
Consider the combinations
$$e^{i\omega t}\left[A\sin\frac{n\pi x}{X}\sin\frac{m\pi y}{X} + B\sin\frac{m\pi x}{X}\sin\frac{n\pi y}{X}\right].$$
Work out the nodes in the case n = 1, m = 2, for (1) B = A; (2) B = -A;
(3) B = 2A.
3. A rectangular membrane is struck at its center, starting from rest, in
such a way that at t = 0 a small rectangular region about the center may be
considered to have a velocity v, and the rest has no velocity. Find the
amplitudes of the various overtones.
4. Imagine n and m plotted as two rectangular coordinates. Show that
a curve of constant $\omega$, plotted in these coordinates, is an ellipse. Each
integral value of n and m corresponds to an overtone, so that if we draw the
point corresponding to each overtone, the number of points within such an
ellipse gives the number of overtones with angular velocity less than $\omega$.
Note that the number of such points per unit area of the plane is just one,
and so find an approximate formula, using the area of the ellipse, for the
number of overtones of frequency less than $\omega$, and also for the number
between $\omega$ and $\omega + d\omega$. Check up this approximation by the exact values
of Prob. 1.
5. In the circular membrane, suppose that m = 0, and that k is very large,
so that there are many circular nodes. Consider a small region near the
edge of the membrane. The few nodes in this neighborhood will be almost
straight lines, as if we were near the edge of a rectangular membrane. Find
the asymptotic wave length, using the fact that $J_m(x)$ approaches
$\cos(x - \alpha)$ at large x, and show that the wave length is connected with the
velocity and frequency in the usual manner.
6. Set up the wave equation in three-dimensional spherical coordinates,
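in which $x = r\sin\theta\cos\phi$, $y = r\sin\theta\sin\phi$, $z = r\cos\theta$. Show that it is
$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial u}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial u}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 u}{\partial\phi^2} = \frac{1}{v^2}\frac{\partial^2 u}{\partial t^2}.$$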
7. Separate variables in the preceding equation. Show that the function
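of $\phi$ is $\sin m\phi$ or $\cos m\phi$, where m is an integer. Show that the equations
for $r$ and $\theta$ are respectively
$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left(\frac{\omega^2}{v^2} - \frac{C}{r^2}\right)R = 0,$$
where $\omega$, $C$ are constants;
$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \left(C - \frac{m^2}{\sin^2\theta}\right)\Theta = 0.$$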
8. The equation for $\theta$ in Prob. 7 is called Legendre's equation. Let
$\Theta = \sin^m\theta\,F(\cos\theta)$. Find the differential equation for F, solving in power series
in $\cos\theta$, and show that the series breaks off if $C = l(l + 1)$, where $l$ is an
integer. The resulting functions are called $P_l^m(\cos\theta)$, and are known
as associated Legendre functions. Compute the first few Legendre
functions.
9. In the equation for r in Prob. 7, prove that $R = J_{l+\frac{1}{2}}(x)/\sqrt{x}$, where $x = \omega r/v$.
10. Prove that two functions $u_n$ and $u_m$, satisfying differential equations
of the form
$$\frac{d}{dx}\left[T(x)\frac{du_n}{dx}\right] + \left[\omega_n^2\mu(x) - f(x)\right]u_n = 0$$
with different $\omega_n$'s, but chosen so that both $u_n$ and $u_m$ are zero at x = 0 and
x = L, satisfy the orthogonality condition $\int_0^L \mu(x)u_n(x)u_m(x)\,dx = 0$.
CHAPTER XVI
STRESSES, STRAINS, AND VIBRATIONS OF AN ELASTIC
SOLID
In the preceding chapters, we have been treating the vibra-
tions of elastic strings and membranes, one- and two-dimensional
bodies, and now we pass to the three-dimensional case, or the
elastic solid. Of course, the strings and membranes were really
elastic solids, of particular shapes. But there are several ways
in which we must give a more general treatment than we have
previously done. First, in the strings and membranes, the
rigidity of the material itself was not great enough to affect
the vibration, whereas in the problems we now take up this
rigidity, or the elastic properties of the material in general, will
be important. Thus we may imagine all gradations of the prob-
lem of a stretched wire, from the limiting case of a very thin
long wire under large tension, when our previous theory is
applicable, down to a short thick bar under small tension or
even with no tension at all, when the restoring force on a particle,
far from coming from the tension on the ends, comes from the
distortion of the bar itself. Secondly, with the strings and
membranes, we considered only transverse vibrations, while
here we discuss longitudinal vibrations as well. Of course,
strings can vibrate longitudinally, but we have so far neglected
this phase of their motion. Thirdly, a very important part
of the problems of strings and membranes has arisen from
the fact that they were limited in space, the membranes being
very thin pieces of material, the strings thin in two dimensions.
But while some of the problems of the present chapter have this
property, we shall also consider vibrations and waves in extended
media going, in the limiting case, to infinity in all dimensions,
as sound waves in an infinite gas or solid. It is these sound waves
which show the best analogy to our one- and two-dimensional
wave equations.
106. Stresses, Body and Surface Forces.— The first step in
discussing the vibrations of an elastic solid, as with the string and
membrane, is to find the force acting on an infinitesimal volume
element, and to set this equal to mass times acceleration. The
forces may be divided into two classes: (1) volume or body forces,
such as gravity, which act on each volume element of the body,
and which for the present we neglect, since we shall not use them
in our applications; and (2) surface forces, with which neighboring
parts of the medium act on each other, and which are transmitted
across surfaces, or the forces transmitted across the bounding
surface of the whole body. The tensions which we have met with
string and membrane are examples of such forces, or pressures in
a gas, or shearing forces in a twisted rod. To specify such a force,
we imagine a surface element dA to be drawn somewhere in the
body, with a normal n. The material on either side of dA exerts
a force on the material on the other side; thus this force is a push
normal to the surface if there is a pressure in the body, it is a
tension if that is the form of stress, or it may be a shearing force.
The force exerted by the material on one side, on the material on
the second side, and the other force exerted by the material on
the second side back on the first side, are action and reaction, and
are equal and opposite, so that one always has an ambiguity of
sign in dealing with these forces, or as we call them stresses. We
adopt the following convention : We imagine dA to be part of the
surface bounding a volume, and n to be the outer normal. Then
the force we deal with is the force exerted by the outside on the
material inside the volume, over dA. Now this force will be a
vector, and proportional to dA; we call its x, y, and z components
$X_n\,dA$, $Y_n\,dA$, $Z_n\,dA$, respectively. The capital letters indicate
the force components, and the subscript n denotes not a com-
ponent but the direction of the surface normal.
The properties of a stress can be completely specified if we
choose three unit areas at a point, one normal to each of the
three coordinate axes, and give the components of the force acting
across each. Thus for the surfaces normal to the x, y, and z axes,
we have the three force vectors, or nine quantities,
$$\begin{matrix} X_x & Y_x & Z_x \\ X_y & Y_y & Z_y \\ X_z & Y_z & Z_z \end{matrix} \tag{1}$$
We see in Fig. 23 the significance of the three components $X_x$,
$Y_x$, $Z_x$. This set of nine quantities forms the so-called stress
tensor. The diagonal terms of the array, $X_x$, $Y_y$, $Z_z$, are called
the normal stresses or pressures, since the force components act
normal to the surface, and the remaining terms are called shearing
or tangential stresses. It is easily shown that the force across an
arbitrary surface which has direction cosines $l$, $m$, $n$ for its normal
has an x component $lX_x + mX_y + nX_z$, with corresponding
formulas for the other components.
Fig. 23. — Components of force acting across dy dz.
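The rule for the force across an arbitrary surface is simply a sum of products of the stress components with the direction cosines. A minimal sketch follows, in which the function name `traction`, the storage layout of the nine components, and the numerical values are all assumptions chosen for illustration.

```python
def traction(stress, normal):
    """Force per unit area across a surface with direction cosines (l, m, n).

    stress[0] = (X_x, X_y, X_z), stress[1] = (Y_x, Y_y, Y_z),
    stress[2] = (Z_x, Z_y, Z_z), so that the x component of the force
    is l*X_x + m*X_y + n*X_z, and similarly for y and z.
    """
    return tuple(sum(row[j] * normal[j] for j in range(3)) for row in stress)

# Illustrative symmetric stress (arbitrary units, assumed values).
stress = [[3.0, 1.0, 0.0],
          [1.0, 2.0, 0.0],
          [0.0, 0.0, 1.0]]

# Unit normal lying in the xy plane: l = 0.6, m = 0.8, n = 0.
force = traction(stress, (0.6, 0.8, 0.0))
print(force)
```

Taking the normal along a coordinate axis, as (1, 0, 0), simply picks out the three components $X_x$, $Y_x$, $Z_x$ acting across the face normal to x.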
107. Examples of Stresses. — The simplest stress is probably
the hydrostatic pressure. There the force acting across a square
centimeter is always at right angles to the area, and its magnitude
is by definition the pressure P. The force acts into the body,
and hence is of negative sign. We thus have $X_x = Y_y = Z_z = -P$,
all other components = 0. A second example is a tension,
say in the x direction. Then the unit area perpendicular to x
has a force T exerted across it, normal to the area, but there is no
force exerted across faces perpendicular to y or z. In other words,
$X_x = T$, all other components of the stress are zero. A third
example is a shear. In Fig. 24a, we have a cube of material,
with equal and opposite tangential forces exerted across the
faces normal to x, the forces acting in the y direction. Over the
right face, the force exerted on the material is in the $-y$ direction,
so that for this face we have $Y_x = -S$, a constant, and
$X_x = Z_x = 0$. Over the opposite face, both force and direction of
normal are reversed, so that the stress components are unchanged.
But now we notice an important feature of shearing stress: the
two forces we have mentioned exert a torque or couple on the
cube, and if they were the only forces acting, it could not be in
equilibrium. To get equilibrium, it proves necessary to have at
the same time tangential forces exerted across the faces
perpendicular to the y axis, as in Fig. 24b. These forces are equal
in magnitude to the others, so that the torques obviously balance,
and we have $X_y = Y_x = -S$, all other components equal to
zero. This property, that $X_y = Y_x$, proves to be general: the
stress tensor is symmetrical about its diagonal.
By making a proper rotation of axes, it is always possible to
reduce a stress to diagonal form, in which no shearing stresses
appear. Thus, in the case we have just considered, the problem
is obviously symmetrical about the diagonal of the cube. In
Fig. 24c, we take a surface element whose normal has direction
cosines $l = -1/\sqrt{2}$, $m = 1/\sqrt{2}$, $n = 0$, giving a force exerted
across it of components $-S/\sqrt{2}$, $S/\sqrt{2}$, 0, or a force of
magnitude $S$ normal to the surface. Similarly in Fig. 24d, we have a
surface at right angles, and find again a force normal to the
surface, but now of magnitude $-S$. Thus, if we take as new axes
the two 45-deg. diagonals in the xy plane, and the z axis, the
stress consists of a tension S along one axis, negative tension (or
pressure-like force) at right angles, and zero stress across the
face normal to z. Axes of this sort, in which each face has a pure
pressure- or tension-like force across it, and no shear, are called
principal axes of stress.
Fig. 24. — Diagram of shearing stress.
(a) Shear over the faces perpendicular to the x axis.
(b) Additional shear over faces perpendicular to the y axis, necessary to
balance the turning moment of the shear indicated in (a).
(c) and (d) Stress system of (b) referred to principal axes, tension in (c),
pressure in (d).
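The reduction of the shear to principal axes can be verified by direct arithmetic. The sketch below sets up the stress with $X_y = Y_x = -S$, computes the force across surfaces whose normals lie along the two 45-deg. diagonals, and splits each force into a part along the normal and a remainder. The value of S and the helper names are assumptions for illustration.

```python
import math

S = 5.0  # magnitude of the shear, arbitrary illustrative value

# Rows are (X_x, X_y, X_z), (Y_x, Y_y, Y_z), (Z_x, Z_y, Z_z) with X_y = Y_x = -S.
stress = [[0.0, -S, 0.0],
          [-S, 0.0, 0.0],
          [0.0, 0.0, 0.0]]

def traction(stress, normal):
    """Force per unit area across the surface with the given direction cosines."""
    return tuple(sum(row[j] * normal[j] for j in range(3)) for row in stress)

def normal_and_shear(stress, normal):
    """Component of the force along the normal, and magnitude of what is left over."""
    f = traction(stress, normal)
    along = sum(fi * ni for fi, ni in zip(f, normal))
    rest = math.sqrt(max(0.0, sum(fi * fi for fi in f) - along * along))
    return along, rest

c = 1.0 / math.sqrt(2.0)
tension = normal_and_shear(stress, (-c, c, 0.0))   # surface of Fig. 24c
pressure = normal_and_shear(stress, (c, c, 0.0))   # surface of Fig. 24d
print(tension, pressure)
```

The normal components come out as +S and -S, with no tangential remainder beyond rounding error, in agreement with the statement that the diagonals are principal axes.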
108. The Equation of Motion. — Let us find the force on a small
element of volume, having sides dx, dy, dz. Over the face at
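x + dx, there will be a force $X_x(x + dx)$, $Y_x(x + dx)$, $Z_x(x + dx)$
per unit area. Similarly exerted over the face at x there will be
a force $-X_x(x)$, $-Y_x(x)$, $-Z_x(x)$. The x component of the
resulting force is $X_x(x + dx) - X_x(x) = \frac{\partial X_x}{\partial x}dx$ per unit area,
or $\frac{\partial X_x}{\partial x}dx\,dy\,dz$ for the area dy dz. The y and z components are
$\frac{\partial Y_x}{\partial x}dx\,dy\,dz$ and $\frac{\partial Z_x}{\partial x}dx\,dy\,dz$, respectively. In the same way we
can find the three components of force exerted over each of the
two other pairs of faces. Adding, we have for the total x
component of force $\left(\frac{\partial X_x}{\partial x} + \frac{\partial X_y}{\partial y} + \frac{\partial X_z}{\partial z}\right)dx\,dy\,dz$. Thus, if $v_x$, $v_y$, $v_z$
are the components of velocity of the solid at the point in
question, the equations of motion, remembering that the mass of our
small volume is $\rho\,dx\,dy\,dz$, are
$$\begin{aligned} \frac{\partial X_x}{\partial x} + \frac{\partial X_y}{\partial y} + \frac{\partial X_z}{\partial z} &= \rho\frac{\partial v_x}{\partial t}, \\ \frac{\partial Y_x}{\partial x} + \frac{\partial Y_y}{\partial y} + \frac{\partial Y_z}{\partial z} &= \rho\frac{\partial v_y}{\partial t}, \\ \frac{\partial Z_x}{\partial x} + \frac{\partial Z_y}{\partial y} + \frac{\partial Z_z}{\partial z} &= \rho\frac{\partial v_z}{\partial t}. \end{aligned} \tag{2}$$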
These equations are evidently simply the generalization of those
used previously with the string and membrane. Thus with the
membrane let the z axis be normal to the plane of the membrane.
We consider then only the third equation, giving velocity along
z. The stress is a tension along the membrane, and if we cut the
membrane with a surface perpendicular to x, we see that, if
the membrane is inclined so that it makes an angle $\alpha$ with the x
axis, there will be a component $Z_x$, a force in the direction to
produce acceleration, equal to $T\alpha$. If then $\alpha = \partial u/\partial x$, where
u is the displacement along z, the first term becomes $T(\partial^2 u/\partial x^2)$,
as we found before. Similarly the second term is $T(\partial^2 u/\partial y^2)$,
and the third is zero, yielding the equation of vibration which
we have already used.
109. Transverse Waves. — Two sorts of waves are possible in
an elastic solid: transverse waves, in which the displacement is
at right angles to the direction of propagation of the waves, and
longitudinal waves, as the sound waves in a gas, in which the
displacement is in the direction of propagation. We consider
first transverse waves. Rather than taking the general case,
which involves rather complicated formulas, we assume that our
wave is being propagated along the x axis, and that the displace-
ment of the particles is in the y direction. We shall expect to
get a wave equation involving only x derivatives, not y or z, and
having as solutions either progressive or standing waves. Let
the displacement of a particle in the y direction be $\eta$; since the
wave is being propagated along the x axis, we assume that it has
wave fronts normal to x, such that every point on a wave front
has the same displacement, and this means that $\eta$ is a function of
x only. We may then consider a thin sheet or lamina, as that
between x and x + dx in Fig. 25. Let us suppose that the two
points which in the unstrained medium were at x and x + dx,
y = 0, are displaced to the points P and P', at distances $\eta(x)$ and
$\eta(x + dx)$, respectively, from the axis. Then evidently the
lamina has been sheared, and we must find the relation between
the shearing stress and the strain (that is, displacement) which
it has produced. The type of stress is evidently the sort described
in Fig. 24. The material to the right of x + dx exerts across unit
cross section of the face a force in the y direction, equal to $Y_x$
(or $X_y$). But now Hooke's law says that the actual deformation
of the material, or the strain, is proportional to the stress acting.
In this particular case, the deformation is a shearing one, and is
opposed by the rigidity of the medium (which is the reason why
a liquid, having no rigidity, cannot have transverse waves).
The deformation is given in terms of the coefficient of rigidity $\mu$
as follows: the strain, measured by the tangent of the angle which
the line PP' makes with the x axis, is equal to the shearing stress
divided by $\mu$. In other words, $Y_x = \mu\,\partial\eta/\partial x$. Substituting
this relation between stress and strain in the equations of motion,
we have at once
$$\frac{\partial}{\partial x}\left(\mu\frac{\partial\eta}{\partial x}\right) = \rho\frac{\partial v_y}{\partial t},$$
or, writing $v_y = \partial\eta/\partial t$,
$$\frac{\partial^2\eta}{\partial x^2} = \frac{\rho}{\mu}\frac{\partial^2\eta}{\partial t^2}, \tag{3}$$
the one-dimensional wave equation, representing transverse
waves propagated with the velocity $\sqrt{\mu/\rho}$, or the square root of
elastic modulus divided by density. Of course, we should have
got the three-dimensional wave equation if we had considered
propagation in an arbitrary direction.
Fig. 25. — Shear in a transverse plane wave.
110. Longitudinal Waves. — Here again we consider propaga-
tion along the x direction. In Fig. 26, let the displacement of a
particle in the x direction be $\xi(x)$, a function of x only. Evidently
the stress in this case is a pure tension, positive or negative, so
that the force across unit cross section is a pull in the x direction,
equal to $X_x$. Hooke's law now states that the tension is
proportional to the strain; and in particular, that it is proportional to
the change in thickness of the lamina [which is evidently
$\xi(x + dx) - \xi(x)$] divided by the thickness. The constant of
proportionality in this case is not one of the simple elastic constants; it
proves to be written $(\lambda + 2\mu)$, where $\lambda$ is an elastic constant
whose physical meaning is not easy to state. Perhaps as good an
interpretation of $\lambda$ as any is simply to define it from this
particular sort of deformation. We now have $X_x = (\lambda + 2\mu)\,\partial\xi/\partial x$, all
other components of stress = 0, so that from the equations of
motion we at once have
$$\frac{\partial^2\xi}{\partial x^2} = \frac{\rho}{\lambda + 2\mu}\frac{\partial^2\xi}{\partial t^2}, \tag{4}$$
again a wave equation, representing a longitudinal wave traveling
with velocity $\sqrt{(\lambda + 2\mu)/\rho}$, different from the velocity of the
transverse wave.
Fig. 26. — Compression and rarefaction in a longitudinal plane wave.
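To give an idea of the magnitudes involved, the two velocities can be computed for assumed elastic constants. In the sketch below the numbers are round values of the order of those for steel, inserted only for illustration and not taken from the text. A finite-difference check is added that a progressive wave sin k(x - vt) satisfies the one-dimensional wave equation with the corresponding velocity.

```python
import math

def transverse_speed(mu, rho):
    """Velocity of transverse waves, sqrt(mu / rho)."""
    return math.sqrt(mu / rho)

def longitudinal_speed(lam, mu, rho):
    """Velocity of longitudinal waves, sqrt((lam + 2 mu) / rho)."""
    return math.sqrt((lam + 2.0 * mu) / rho)

# Assumed round numbers of the order of those for steel, in SI units.
mu, lam, rho = 8.0e10, 1.1e11, 7.8e3
v_t = transverse_speed(mu, rho)
v_l = longitudinal_speed(lam, mu, rho)
print(v_t, v_l)   # meters per second

def wave_residual(v, x=0.3, t=1e-4, k=2.0, h=1e-3):
    """d2f/dx2 - (1/v^2) d2f/dt2 for f = sin k(x - v t), by central differences."""
    f = lambda xx, tt: math.sin(k * (xx - v * tt))
    ht = h / v   # time step chosen so that v * ht equals the space step
    d2x = (f(x + h, t) - 2.0 * f(x, t) + f(x - h, t)) / (h * h)
    d2t = (f(x, t + ht) - 2.0 * f(x, t) + f(x, t - ht)) / (ht * ht)
    return d2x - d2t / (v * v)

print(wave_residual(v_t), wave_residual(v_l))
```

Since $\lambda + 2\mu$ exceeds $\mu$ for these constants, the longitudinal wave comes out the faster of the two.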
111. General Wave Propagation. — In the two preceding sec-
tions, we have derived two very specialized waves which can
be propagated in an elastic solid, plane longitudinal and trans-
verse waves traveling along the x axis. Of course, much more
complicated waves are possible, and if we were discussing the
problem completely, we should set up the three-dimensional
wave equation, of the form $\nabla^2 u = \frac{1}{v^2}\frac{\partial^2 u}{\partial t^2}$, and derive general
wave solutions. We should have separate equations for the
longitudinal and transverse waves, generalizations of Eqs.
(3) and (4). As we shall learn later when discussing optical
problems, such a wave equation has as solutions not merely
plane waves traveling in all arbitrary directions, but also
spherical waves diverging from
point sources, and many more complicated types of waves. All
these are possible in an elastic solid. In our discussion of the
plane waves, we separated the longitudinal and transverse waves
entirely, allowing one type to exist without the other, but
unfortunately in general this cannot be done. For instance,
when a wave of one type is reflected from a surface, then
unless the reflection is at normal incidence, longitudinal motion
will generally be partly converted into transverse, and vice
versa. In Fig. 27, we show diagrammatically how this could
be, the transverse motion in the incident wave evidently being
in such a direction as to be partly transformed into longitudinal
motion in the reflected wave. For this reason, the complete
treatment of the vibrations of an elastic solid is a very complicated
problem. An example is found in geophysical problems, where
one is interested in the propagation of earthquake waves through
the earth. This case is made even more difficult by the fact
that the elastic properties of the earth change as a function of
depth, so that one must use solutions of the form we have
discussed in Chap. XIV, in connection with strings whose
properties depend on position.
Fig. 27. — Incident transverse wave, with longitudinal reflected wave.
There is one application of the theory of the waves in an elastic
solid which has at least historical interest. When it was dis-
covered that light was a transverse wave motion, it was attempted
to identify these waves with the transverse vibrations of an elastic
solid, the ether. The general properties, and even some of the
details, as the quantitative laws giving the fraction of light
reflected and transmitted at a boundary, were correctly worked
out, the reflection being treated by analogy with our discussion
of reflection of waves in strings at a point of discontinuity of
density, in Chap. XIV. But the difficulty, which could not
be overcome, was that of eliminating the longitudinal waves,
which certainly do not occur in optics, but which were inherent
in the elastic solid theory. This difficulty does not occur in
the present electromagnetic theory, where only transverse waves
are allowed by the fundamental differential equations. This
lack of longitudinal waves makes the problem of optical wave
motion on the whole simpler than that of elastic waves.
112. Strains and Hooke's Law. — In discussing transverse and
longitudinal elastic waves, we had to introduce certain elastic
constants, measuring the ratio between stress components,
and certain quantities measuring the strain or deformation
of the substance. The fact that these strains were proportional
to the stresses is Hooke's law, the fundamental law of elasticity,
holding for sufficiently small strains. It is now worth while to
state the general relation between stress and strain, though we
shall not go through the proof.
To begin with, we imagine the body unstrained. Then in the
process of deformation, we imagine that the particle originally
at x, y, z has been displaced to a point $x + \xi$, $y + \eta$, $z + \zeta$.
The three quantities $\xi$, $\eta$, $\zeta$ are functions of x, y, z, and are the
three components of a vector. We meet, in other words, a
vector (which we may call the displacement), which is a function
of position. Such a vector field reminds us of a force field, as
a gravitational- or electric-force field, where the force vector
on unit mass or charge, respectively, is a function of position.
We shall meet such vector fields often in the future. Now, the
displacement is not the same thing as the strain; the body might
be displaced bodily, without involving any stress or strain at
all. It is only when the displacement of one side of a small
element of volume is different from the other, so that the element
is distorted in size or shape, that we have a strain. In other
words, the essential quantities in determining the strain are the
derivatives of $\xi$, $\eta$, $\zeta$ with respect to x, y, z. We have already
seen two examples: with the shear in the transverse wave, the
strain was $\partial\eta/\partial x$, and in the compressional wave the strain was
$\partial\xi/\partial x$. In the two cases mentioned, the stress was proportional
to the corresponding partial derivative, and Hooke's law means
that this is true in general, in the form that the components
of stress are linear functions of the partial derivatives of the
components of displacement. There are nine components of
stress, of which six are independent (remembering that $X_y = Y_x$,
etc.), and similarly there are nine partial derivatives of displace-
ment, of which it can be proved that six again are independent.
This would mean six linear equations, with thirty-six coefficients,
which would act as elastic constants. In the most general type
of substance, a completely anisotropic crystal, it can be shown
that twenty-one of these really are independent, giving a tre-
mendous number of elastic constants. With isotropic substances
showing no crystalline structure, however, most of these con-
stants are either zero or can be written in terms of each other,
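and there are only two independent constants, the $\lambda$ and $\mu$
which we have already met; all other elastic constants, as Young's
modulus and the compressibility, can be written in terms of
them. Using these constants, the relations between stress and
strain prove to have the following form:
$$
\begin{aligned}
X_x &= (2\mu + \lambda)\frac{\partial\xi}{\partial x} + \lambda\frac{\partial\eta}{\partial y} + \lambda\frac{\partial\zeta}{\partial z}, & X_y &= \mu\left(\frac{\partial\xi}{\partial y} + \frac{\partial\eta}{\partial x}\right), \\
Y_y &= \lambda\frac{\partial\xi}{\partial x} + (2\mu + \lambda)\frac{\partial\eta}{\partial y} + \lambda\frac{\partial\zeta}{\partial z}, & Y_z &= \mu\left(\frac{\partial\eta}{\partial z} + \frac{\partial\zeta}{\partial y}\right), \\
Z_z &= \lambda\frac{\partial\xi}{\partial x} + \lambda\frac{\partial\eta}{\partial y} + (2\mu + \lambda)\frac{\partial\zeta}{\partial z}, & Z_x &= \mu\left(\frac{\partial\zeta}{\partial x} + \frac{\partial\xi}{\partial z}\right).
\end{aligned}
\tag{5}
$$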
In the cases we have taken up already, we have seen two illustrations
of these equations: with transverse waves, $\partial\eta/\partial x$ was the
only partial derivative different from zero, and we had
$X_y = \mu\,\partial\eta/\partial x$; with the longitudinal wave, $\partial\xi/\partial x$ was the only term
different from zero, and as we see this gives $X_x = (2\mu + \lambda)\,\partial\xi/\partial x$,
as we had before, but also $Y_y = Z_z = \lambda\,\partial\xi/\partial x$. These latter
stress components, however, since they do not depend on y or z,
do not contribute to the equations of motion, as we see by referring
back to these.
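As a numerical illustration of Eq. (5), the following Python sketch evaluates the stress components of an isotropic solid from the nine partial derivatives of the displacement. The function name `isotropic_stress`, the nested-list layout of the derivatives, and the values chosen for the two elastic constants are assumptions made only for this example.

```python
def isotropic_stress(grad_u, lam, mu):
    """Return the 3x3 stress tensor of an isotropic solid, as in Eq. (5).

    grad_u[i][j] is the partial derivative of displacement component i
    (xi, eta, zeta) with respect to coordinate j (x, y, z).
    """
    dilatation = grad_u[0][0] + grad_u[1][1] + grad_u[2][2]
    stress = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            if i == j:
                # Normal stresses X_x, Y_y, Z_z
                stress[i][j] = lam * dilatation + 2.0 * mu * grad_u[i][i]
            else:
                # Shearing stresses X_y, Y_z, Z_x (symmetric)
                stress[i][j] = mu * (grad_u[i][j] + grad_u[j][i])
    return stress


lam, mu = 2.0, 1.0  # illustrative elastic constants, arbitrary units

# Longitudinal wave: only d(xi)/dx differs from zero.
longitudinal = isotropic_stress([[0.01, 0, 0], [0, 0, 0], [0, 0, 0]], lam, mu)

# Transverse wave: only d(eta)/dx differs from zero.
transverse = isotropic_stress([[0, 0, 0], [0.01, 0, 0], [0, 0, 0]], lam, mu)
```

With these inputs the longitudinal case gives a normal stress along x together with the two lateral tensions, and the transverse case gives a pure shearing stress, in agreement with the two illustrations quoted in the text.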
113. Young's Modulus. — To illustrate the use of the equations
connecting stress and strain, we shall discuss the stretching of
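a wire. Let the wire be stretched along the x axis, and let the
stress be a pure tension T, so that $X_x = T$, and all other stress
components are zero. The x, y, z axes are principal axes for
this stress, and it can be shown that the strain has principal
axes, too, parallel to those of stress, so that the last three equations,
for $X_y$, etc., do not enter. We are left, then, with the three
equations
$$
\begin{aligned}
T &= (2\mu + \lambda)\frac{\partial\xi}{\partial x} + \lambda\frac{\partial\eta}{\partial y} + \lambda\frac{\partial\zeta}{\partial z}, \\
0 &= \lambda\frac{\partial\xi}{\partial x} + (2\mu + \lambda)\frac{\partial\eta}{\partial y} + \lambda\frac{\partial\zeta}{\partial z}, \\
0 &= \lambda\frac{\partial\xi}{\partial x} + \lambda\frac{\partial\eta}{\partial y} + (2\mu + \lambda)\frac{\partial\zeta}{\partial z}.
\end{aligned}
$$
Subtracting the third from the second, we have $\partial\eta/\partial y = \partial\zeta/\partial z$.
Using this relation, either the second or third gives
$$\frac{\partial\eta}{\partial y} = -\sigma\frac{\partial\xi}{\partial x},$$
where $\sigma = \dfrac{\lambda}{2(\lambda + \mu)}$, and is called Poisson's ratio.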
Since $\lambda$ and $\mu$ are always positive, it is obvious that Poisson's
ratio is never greater than 1/2. We have found, then, that as
the wire is stretched (positive $\partial\xi/\partial x$), it contracts sidewise
(negative $\partial\eta/\partial y$ and $\partial\zeta/\partial z$), and the ratio of sidewise contraction
per unit width, to lengthwise stretch per unit length, is given by
Poisson's ratio. Actual materials have Poisson's ratio of the
order of magnitude of 1/3. Now we put this expression back
in the first equation, obtaining $T = (2\mu + \lambda - 2\lambda\sigma)\,\partial\xi/\partial x$.
The elastic modulus $(2\mu + \lambda - 2\lambda\sigma)$, giving the tension, or
force per unit area, divided by the elongation per unit length,
is called Young's modulus, and is denoted by E. In the prob-
lems we find other ways of writing the relations between Young's
modulus, Poisson's ratio, and the other elastic constants.
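The relations of this section are easily checked numerically. The sketch below computes Poisson's ratio and Young's modulus from the two elastic constants, using only the formulas derived above; the function names and the sample values of the constants are assumptions chosen for illustration, not data from the text.

```python
def poisson_ratio(lam, mu):
    """Poisson's ratio sigma = lam / (2 (lam + mu))."""
    return lam / (2.0 * (lam + mu))


def youngs_modulus(lam, mu):
    """Young's modulus E = 2 mu + lam - 2 lam sigma."""
    sigma = poisson_ratio(lam, mu)
    return 2.0 * mu + lam - 2.0 * lam * sigma


# Illustrative constants (same units as a stress), assumed for this example.
lam, mu = 1.2e11, 0.8e11

sigma = poisson_ratio(lam, mu)
E = youngs_modulus(lam, mu)

# Lateral strain produced by a unit lengthwise strain in the stretched wire.
lengthwise_strain = 1.0e-3
lateral_strain = -sigma * lengthwise_strain
tension = E * lengthwise_strain
```

Since both constants are taken positive, the computed ratio necessarily falls below 1/2, and the lateral strain comes out negative, representing the sidewise contraction.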
It is worth noticing that Young's modulus was not the elastic
constant which entered into the velocity of compressional waves.
If we had longitudinal waves traveling down a wire, the wire
would contract laterally at those points where it was under
tension, expand when it was under compression, as given by
Poisson's ratio, and for such a wave the velocity would be
determined from Young's modulus. But in our extended
medium, we did not allow the possibility of the lateral motion
connected with such a contraction and expansion, since in a
medium of large dimensions compared with the wave length
this would amount to a very large transverse motion. We
assumed instead that the motion was purely longitudinal, and
found that we had to assume the existence of other lateral
stresses, tensions $Y_y$ and $Z_z$, to counteract the tendency to
expansion and contraction. These stresses changed the condi-
tions of the problem, and in particular the elastic modulus
concerned in the velocity of propagation of the wave.
Problems
1. In Fig. 28, let the normal to the inclined face of the prism have direction
cosines l, m, n. Compute the total forces exerted by an arbitrary stress
on the prism, and prove that the net force is zero, and the prism is in equilibrium,
only if the force per unit area over the face perpendicular to n has x
component $lX_x + mX_y + nX_z$, etc.
Fig. 28. — Prism for computing force exerted by stresses across a face with
arbitrary normal n.
2. Rotate coordinates to reduce an arbitrary stress to principal axes.
Carry through the problem of the pure shear, discussed in Fig. 24, as an
illustration of the general method.
3. Prove that in terms of Young's modulus and Poisson's ratio we have
$$\lambda = \frac{E\sigma}{(1 + \sigma)(1 - 2\sigma)}, \qquad \mu = \frac{E}{2(1 + \sigma)}.$$
4. Assume a body is under pure hydrostatic pressure P. Show that the
distortion is a decrease of all dimensions by a fixed fraction. Show that
the fractional change in volume is $\partial\xi/\partial x + \partial\eta/\partial y + \partial\zeta/\partial z$. Using this,
show that the compressibility $\kappa$ of a solid under hydrostatic pressure, which
by definition is the fractional decrease of volume divided by the pressure,
equals $3(1 - 2\sigma)/E$.
5. Show that the velocity of a longitudinal wave in a fluid, for which $\mu$ is
zero, is $1/\sqrt{\kappa\rho}$, where $\kappa$ is the compressibility.
6. A rectangular beam held at one end is bent into an arc of a circle,
the radius of curvature of its central section being R. Find the stress
distribution throughout the beam, showing that the beam will be kept in
equilibrium by a torque or couple of the sort indicated. Show that for a
given torque the curvature of the beam is inversely proportional
to $ab^3E$, where E is Young's modulus (see Fig. 29).
Fig. 29. — Bent beam.
7. A circular cylinder of height h rests in
equilibrium under the action of gravity. Take
a coordinate system with the xy plane in the
top base of the cylinder and the positive z axis
pointing downward. Show that the only component
of stress different from zero is $Z_z = -\rho g z$,
if $\rho$ is the density of the cylinder. Using Hooke's
law show that the strains are $\partial\xi/\partial x = \partial\eta/\partial y = (\sigma/E)\rho g z$,
and $\partial\zeta/\partial z = -(1/E)\rho g z$, and find
the other partial derivatives. Integrate these expressions to find the com-
ponents of the displacement of any point of the medium, remembering that
the strains are partial derivatives. Show that a horizontal plane section
of the cylinder becomes a paraboloid of rotation due to the deformation.
Show that the radius of the cylinder increases from top to bottom when it
is thus deformed.
8. A spherical shell of inner radius $R_1$, outer radius $R_2$, contains a fluid
of pressure $P_1$, and is immersed in a second fluid of lower pressure $P_2$. It
can be shown that the displacements of points on account of the pressure
are given by $\xi = x(A + B/r^3)$, $\eta = y(A + B/r^3)$, $\zeta = z(A + B/r^3)$.
Verify these values by computing the stresses at any point, substituting in
the equations of motion, and showing that they result in equilibrium. Show
further that the force across an area normal to the radius is itself normal to
the surface, so that the stress within the sphere can be balanced by hydro-
static pressures within and without.
9. In the shell of Prob. 8, determine A and B so that the pressure will
have the proper values at $R_1$ and $R_2$. Discuss the stress within the shell,
showing that the principal axes at any point are along the radius and two
arbitrary directions at right angles, and find the tension or pressure along
the directions at right angles, discussing the final result physically, with
special reference to possible breaking of the shell under excessive pressure
inside.
CHAPTER XVII
FLOW OF FLUIDS
In the last chapter we discussed the equation of motion of an
elastic body where there was no mass motion or flow. Now we
pass to hydrodynamics and the flow of fluids. Much of what we
say, however, applies to flow in general— such as heat flow, which
we shall take up in the next chapter— and even to such a different
subject as electrostatics. The feature in common in all these prob-
lems is the existence of a vector field. By that we mean a vector
defined at each point of space. We have already met such a field
in our general discussion of forces and potentials in Chap. VI,
for the force is defined at every point of space and forms a vector
field. In the present case the vector
is the velocity of the flowing fluid,
or the closely related flux density.
With heat flow it is again a flux
density for the flowing heat, and for
electricity the electric field. All
these problems, though so different
physically, are thus mathematically
similar and can be treated by the
same analytical methods.
114. Velocity, Flux Density, and
Lines of Flow. — At every point of a
flowing medium, we can define the velocity, a vector (the
time rate of change of the displacement, which we used in the
last chapter, and to which we assigned components $\xi$, $\eta$, $\zeta$). Also
we can give the density $\rho$, and both $\rho$ and v are in general functions
of position (x, y, z) and of time. We may now ask, How
much material will flow across any area per second? This
total flow across a surface is called a flux. In Fig. 30, we consider
an infinitesimal surface element dS. With dS as a base we
erect a prism, the slant height being the velocity v, which in
general is not normal to dS. Evidently the material in the
prism will just be that which crosses dS in one second, since in
this time it will move a distance v, and fill the dotted prism.
Fig. 30. — Flux through an area dS.
But this is $\rho$ (the density) times the volume of the prism (the base dS
times the altitude $v_n$, where n is the normal to the surface), or
$\rho v_n\,dS$. The quantity $\rho\mathbf{v}$ is called the flux density, and we may
denote it by $\mathbf{f}$. Then for a finite area, the total flux will be the
sum of the contributions from all the surface elements, or a surface
integral $\iint f_n\,dS = \iint \rho v_n\,dS$. In some kinds of flow, such as
heat flow, there is an analogue to the flux, but not to the density
and velocity separately, so that one regards the flux density as
being the more fundamental vector field.
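The flux across a single flat surface element can be computed directly from the prism construction. The following sketch is only an illustration: the function name `flux_through_element` and the numerical values (a water-like density, a velocity along x, and an element inclined at 45 degrees) are assumptions, not taken from the text.

```python
import math


def flux_through_element(rho, v, n, dS):
    """Mass crossing the element dS per second: rho * v_n * dS.

    v is the velocity vector, n any vector along the normal to the
    element (it is normalized here), and v_n the component of v along n.
    """
    norm = math.sqrt(sum(c * c for c in n))
    v_n = sum(vc * nc for vc, nc in zip(v, n)) / norm
    return rho * v_n * dS


# Velocity along x, element whose normal lies 45 degrees from the flow.
flux = flux_through_element(1000.0, (2.0, 0.0, 0.0), (1.0, 1.0, 0.0), 0.01)

# An element whose normal is perpendicular to the flow passes no material.
flux_tangential = flux_through_element(1000.0, (2.0, 0.0, 0.0), (0.0, 0.0, 1.0), 0.01)
```

Only the normal component of the velocity contributes, which is why the altitude of the prism, and not its slant height, enters the result.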
We can draw lines through the medium, tangent at every point
to the direction of flow at that point. These are called lines of
flow. Similarly we can set up tubes of flow, the elements of their
surfaces being lines of flow. We can imagine the substance to
flow through these tubes, as water flows through a pipe, never
passing outside, since the velocity is always tangential to the
surface of the tube. In hydrodynamics these lines of flow are
called streamlines, and the sort of flow in which they are inde-
pendent of time is called streamline flow.
115. The Equation of Continuity. — Consider a fixed volume
in a flowing fluid. The amount of fluid in the volume is $\iiint \rho\,dv$,
and this can change in two ways. First, liquid can flow into the
volume over the surfaces. Secondly, it may be possible for
liquid to be produced within the volume without having flowed
in. For instance, in a swimming pool, for all practical purposes
we may consider the opening of the inlet pipe as a region where
fluid is appearing, and the outlet as a place where it is disappear-
ing. Such regions are called sources and sinks, respectively.
Then we have
$$\iiint \frac{\partial\rho}{\partial t}\,dv = \text{rate of inflow over the surface} + \text{rate of production inside}.$$
Now we have just seen that the rate of flow over any surface,
or flux, is $\iint f_n\,dS$. This represents outflow if n is the outer normal
to a closed surface, so that we must change sign to get inflow.
If in addition we assume that the rate of production of material
per unit volume is P, we have
$$\iiint \frac{\partial\rho}{\partial t}\,dv = -\iint f_n\,dS + \iiint P\,dv,$$
the volume integrals being over the whole region we are considering,
the surface integral over the surface enclosing this volume.
If we now apply our equation to an infinitesimal volume in the
form of a rectangular parallelepiped, bounded by x, x + dx,
y, y + dy, z, z + dz, we can put the equation in a form not involving
integrals. The flow to the right (into the volume) over the
face at x is $f_x(x)\,dy\,dz$. The flow to the right (out of the volume)
over the face at x + dx is
$$f_x(x + dx)\,dy\,dz = f_x(x)\,dy\,dz + \frac{\partial f_x}{\partial x}\,dx\,dy\,dz + \cdots.$$
Thus the total inflow over these two faces is $-\dfrac{\partial f_x}{\partial x}\,dx\,dy\,dz$. Adding
similar contributions from the other faces we have for the total
inflow
$$-\iint f_n\,dS = -\left(\frac{\partial f_x}{\partial x} + \frac{\partial f_y}{\partial y} + \frac{\partial f_z}{\partial z}\right)dx\,dy\,dz = -(\nabla\cdot\mathbf{f})\,dv = -\operatorname{div}\mathbf{f}\,dv,$$
where the divergence is a vector operator discussed in Chap. VI.
Hence
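$$\frac{\partial\rho}{\partial t} = -\operatorname{div}\mathbf{f} + P. \tag{1}$$
This is often called the equation of continuity. We may note
several special cases. If there is no production of fluid in dv, it
becomes
$$\frac{\partial\rho}{\partial t} + \operatorname{div}\mathbf{f} = 0,$$
or using $\mathbf{f} = \rho\mathbf{v}$,
$$\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) = 0. \tag{2}$$
Again, in a steady state, where density is independent of time,
$$\operatorname{div}\mathbf{f} = P. \tag{3}$$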
This equation shows the physical meaning of the divergence of a
vector: it measures the rate of production of the flowing sub-
stance, per unit volume. Finally, if no substance is being pro-
duced at the point in question, and density is independent of
time, $\operatorname{div}\mathbf{f} = 0$, and we have a divergenceless flow.
116. Gauss's Theorem. — We have proved that the amount of
substance flowing out of a small volume dx dy dz = dv per second
equals $\operatorname{div}\mathbf{f}\,dv$ in steady flow. Suppose now that we have a large
volume and that we wish to find the total amount flowing out of
it per second. This is simply the sum of the amounts flowing
from each element. Thus it is a volume integral, $\iiint \operatorname{div}\mathbf{f}\,dv$.
On the other hand, the material all flows through the surface, so
that the rate of outflow is $\iint f_n\,dS$. These two expressions must
be equal:
$$\iiint \operatorname{div}\mathbf{f}\,dv = \iint f_n\,dS. \tag{4}$$
This is Gauss's theorem, and it holds for any vector $\mathbf{f}$ which is a
function of position.
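Gauss's theorem can be verified numerically for a simple case. The sketch below compares the volume integral of the divergence with the outward flux through the six faces of a unit cube. The particular vector field, the cube, the finite-difference step, and all function names are assumptions chosen for the illustration; the divergence is found by central differences rather than analytically.

```python
def flux_density(x, y, z):
    """An arbitrary sample vector field f = (x^2, y z, z)."""
    return (x * x, y * z, z)


def divergence(f, x, y, z, h=1e-5):
    """Central-difference estimate of df_x/dx + df_y/dy + df_z/dz."""
    dfx = (f(x + h, y, z)[0] - f(x - h, y, z)[0]) / (2 * h)
    dfy = (f(x, y + h, z)[1] - f(x, y - h, z)[1]) / (2 * h)
    dfz = (f(x, y, z + h)[2] - f(x, y, z - h)[2]) / (2 * h)
    return dfx + dfy + dfz


def volume_integral_of_divergence(f, n=20):
    """Midpoint-rule integral of div f over the unit cube."""
    d = 1.0 / n
    mids = [(i + 0.5) * d for i in range(n)]
    total = sum(divergence(f, x, y, z) for x in mids for y in mids for z in mids)
    return total * d ** 3


def outward_flux(f, n=20):
    """Midpoint-rule integral of the outward normal component f_n
    over the six faces of the unit cube."""
    d = 1.0 / n
    mids = [(i + 0.5) * d for i in range(n)]
    total = 0.0
    for a in mids:
        for b in mids:
            total += f(1.0, a, b)[0] - f(0.0, a, b)[0]  # faces x = 1, x = 0
            total += f(a, 1.0, b)[1] - f(a, 0.0, b)[1]  # faces y = 1, y = 0
            total += f(a, b, 1.0)[2] - f(a, b, 0.0)[2]  # faces z = 1, z = 0
    return total * d ** 2


volume_side = volume_integral_of_divergence(flux_density)
surface_side = outward_flux(flux_density)
```

For this field the divergence is 2x + z + 1, whose integral over the unit cube is 5/2, and the two numerical results agree with that value to within the rounding error of the differences.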
117. Lines of Flow to Measure Rate of Flow. — Let us set up a
definite number of lines of flow, so that the number crossing a
unit area perpendicular to the flow is numerically equal to the
magnitude of the flux density. We could surely do this, but we
might have the necessity of sometimes letting lines start or stop,
to keep the right number. We can prove, however, that with a
divergenceless flow this would not be necessary. The lines start
or stop only at places where the divergence is different from zero :
that is, they start at sources, stop at sinks. For an elementary
proof, let us take a short section of a tube of flow, bounded by
two surfaces normal to the flow. Let one of them have an area
$A_1$, the other $A_2$, and let the magnitude of the flux density over the one
face be $f_1$, over the other $f_2$. Then the total current in over one
face is $f_1A_1$, and out over the other is $f_2A_2$. If the flow is divergenceless,
these are equal. But the number of lines per unit area
on the first is $f_1$, so that the number cutting the one end of the
tube is $f_1A_1$, and the number emerging at the other end is $f_2A_2$.
Since these are equal, no lines are lost or start within. In other
words, in a divergenceless flow, lines never start or stop except at
sources or sinks. For a more general proof we note that the
number of lines crossing a surface element dS, by definition,
is $f_n\,dS$. Then the number emerging from a closed surface, and
which therefore have started within the surface, is $\iint f_n\,dS$. But
by Gauss's theorem this is $\iiint \operatorname{div}\mathbf{f}\,dv$, and is zero if the flow is
divergenceless.
118. Irrotational Flow and the Velocity Potential. — In Chap.
VI we studied vector fields like our flux vector; we were interested
then in forces. We saw that under certain conditions, a force
could be written as a gradient of a potential function. The
condition was that the work done in taking a particle around any
closed path should be zero, or that the field should be conservative:
$\oint \mathbf{F}\cdot d\mathbf{s} = 0$ around any contour. We had another way
of stating the condition: it was $\operatorname{curl}\mathbf{F} = 0$ everywhere. In a
similar way, if the curl of our velocity vector is zero, we can
introduce a potential function here. It is now to be regarded as
a purely mathematical device, used simply by analogy with our
previous cases, and having nothing to do with potential energy.
A flow whose curl is everywhere zero is called an irrotational flow.
It is easy to prove that in a whirlpool the curl is different from
zero (see for instance Prob. 4, Chap. VI), a nonvanishing curl
indicating in fact exactly a whirlpool.
Fig. 31. — Lines of flow and equipotentials for flow about a cylinder. Full
lines indicate lines of flow, dotted lines equipotentials. In a corresponding
electrical problem with charges distributed within the cylinder, and placed in a
uniform external electric field, the dotted lines would be lines of force, full lines
equipotentials.
Now, physically, we are
acquainted with two sorts of fluid flow: streamline flow and
turbulent flow. In the latter, eddies or whirlpools form, and the
curl of the velocity is not zero. But in the former, there are no
eddies, the curl of the velocity is zero, and the flow is irrotational.
In a streamline flow, then, we can introduce a potential function,
called the velocity potential $\phi$, defined by $\mathbf{v} = -\operatorname{grad}\phi$. The
velocity potential, of course, is not a potential energy; its analogy
with potential energies is mathematical rather than physical.
Nevertheless, we can draw surfaces of constant velocity potential,
or equipotentials, and the lines of flow will cut the equipotentials
at right angles. Using the equation of continuity, and assuming
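that $\rho$ is constant, we have as the general equation for the velocity
potential
$$\operatorname{div}(\rho\mathbf{v}) = -\rho\operatorname{div}\operatorname{grad}\phi = -\rho\nabla^2\phi = -\frac{\partial\rho}{\partial t} + P, \tag{4}$$
reducing to Laplace's equation $\nabla^2\phi = 0$ for a steady state where
there are no sources or sinks.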
The introduction of a velocity potential satisfying Laplace's
equation makes it possible in many cases to solve hydrodynamic
problems by analogy with similar problems in other branches of
physics, as electrostatics. In Chap. XIX we shall find that the
electrostatic potential satisfies Laplace's equation, the lines of
force being normal to the equipotentials, so that any set of electro-
static equipotentials can be used for a suitable hydrodynamic
problem. For instance, in Fig. 31, we show the lines of flow and
equipotentials for flow of a liquid about a cylinder. The same
lines, however, represent lines of force resulting from a certain
distribution of charges in the center of the cylinder, superposed on
a uniform electric field.
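The flow about a cylinder may serve as a numerical check on these statements. The sketch below assumes the standard potential for a uniform stream of speed U passing a cylinder of radius a, namely $\phi = -Ux(1 + a^2/r^2)$ with $r^2 = x^2 + y^2$; this expression is not given in the text and is introduced here only as an assumed example, as are the function names and the values of U and a. The code verifies by finite differences that the potential satisfies Laplace's equation, that no fluid crosses the surface of the cylinder, and that the flow becomes uniform at large distances.

```python
import math

U, a = 1.0, 1.0  # assumed stream speed and cylinder radius


def phi(x, y):
    """Assumed velocity potential for a uniform stream past a cylinder."""
    r2 = x * x + y * y
    return -U * x * (1.0 + a * a / r2)


def velocity(x, y, h=1e-5):
    """v = -grad(phi), estimated by central differences."""
    vx = -(phi(x + h, y) - phi(x - h, y)) / (2 * h)
    vy = -(phi(x, y + h) - phi(x, y - h)) / (2 * h)
    return vx, vy


def laplacian(x, y, h=1e-3):
    """Five-point finite-difference estimate of the Laplacian of phi."""
    return (phi(x + h, y) + phi(x - h, y) + phi(x, y + h) + phi(x, y - h)
            - 4.0 * phi(x, y)) / (h * h)


def radial_velocity_on_surface(theta):
    """Component of v along the outward normal at a point on r = a."""
    x, y = a * math.cos(theta), a * math.sin(theta)
    vx, vy = velocity(x, y)
    return vx * math.cos(theta) + vy * math.sin(theta)


lap = laplacian(1.5, 0.7)
v_normal = radial_velocity_on_surface(0.3)
v_far = velocity(100.0, 50.0)
```

The same potential, read as an electrostatic potential, would describe the corresponding electrical problem, which is the analogy the text describes.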
119. Euler's Equations of Motion for Ideal Fluids. — The equa-
tion of continuity serves to determine the velocity of flow of a
liquid, but does not determine the pressures, or make any
connection with forces. It is essentially a kinematical rather
than a dynamical law. It is one of two fundamental equations
governing fluid motion. The other is essentially the Newtonian
law, force equals mass times acceleration. For a continuous
medium, we have already seen how this is to be formulated in
the preceding chapter, where we wrote the force on an element
of volume in terms of the stresses. As was mentioned in the last
chapter, an ideal fluid is characterized by the fact that it supports
no shear and hence $\mu = 0$. For this case the six stress components
reduce to one, namely $X_x = Y_y = Z_z = -p$ and
$X_y = Y_z = Z_x = 0$, if p denotes the pressure in the fluid. Furthermore,
if there is flow of the fluid one must consider the velocity
of each particle as a function of x, y, z, and t, and hence
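$$\frac{dv_x}{dt} = \frac{\partial v_x}{\partial t} + v_x\frac{\partial v_x}{\partial x} + v_y\frac{\partial v_x}{\partial y} + v_z\frac{\partial v_x}{\partial z}$$
and two similar expressions for $v_y$ and $v_z$. Written in vector
form with the help of our symbolic vector $\nabla$ = grad,
$$\frac{dv_x}{dt} = \frac{\partial v_x}{\partial t} + (\mathbf{v}\cdot\nabla)v_x = \frac{\partial v_x}{\partial t} + (\mathbf{v}\cdot\operatorname{grad})v_x,$$
i.e., we form the scalar product of v and $\nabla$ and then operate
on $v_x$. Our general equations of motion become in this case:
$$
\begin{aligned}
X - \frac{1}{\rho}\frac{\partial p}{\partial x} &= \frac{\partial v_x}{\partial t} + (\mathbf{v}\cdot\operatorname{grad})v_x, \\
Y - \frac{1}{\rho}\frac{\partial p}{\partial y} &= \frac{\partial v_y}{\partial t} + (\mathbf{v}\cdot\operatorname{grad})v_y, \\
Z - \frac{1}{\rho}\frac{\partial p}{\partial z} &= \frac{\partial v_z}{\partial t} + (\mathbf{v}\cdot\operatorname{grad})v_z,
\end{aligned}
$$
where X, Y, Z represent the body force (as gravitation) per
unit mass, which we neglected in the last chapter. Combined
into one vector equation this gives
$$\mathbf{F} - \frac{1}{\rho}\operatorname{grad} p = \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\operatorname{grad})\mathbf{v}, \tag{5}$$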
where F is the body force. These are the Euler equations of
hydrodynamics. In them $\rho$ (the density) is considered a known
function of the pressure as given by the equation of state of
the substance. We then have p, $v_x$, $v_y$, $v_z$ as functions of x, y, z,
and t. The three equations above and the continuity equation
provide the necessary four equations to give a unique solution.
For the case of hydrostatic equilibrium, these equations reduce
to the form $\mathbf{F} = (1/\rho)\operatorname{grad} p$, from which such familiar things as
Archimedes' principle immediately follow.
120. Irrotational Flow and Bernoulli's Equation. — If there is
irrotational flow, and the velocity is derived from a velocity
potential, Euler's equations take a particularly simple form.
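If $\mathbf{v} = -\operatorname{grad}\phi$, then we have
$$
\begin{aligned}
(\mathbf{v}\cdot\operatorname{grad})v_x &= -\left(\mathbf{v}\cdot\operatorname{grad}\frac{\partial\phi}{\partial x}\right)
= \frac{\partial\phi}{\partial x}\frac{\partial^2\phi}{\partial x^2} + \frac{\partial\phi}{\partial y}\frac{\partial^2\phi}{\partial x\,\partial y} + \frac{\partial\phi}{\partial z}\frac{\partial^2\phi}{\partial x\,\partial z} \\
&= \frac{1}{2}\frac{\partial}{\partial x}\left[\left(\frac{\partial\phi}{\partial x}\right)^2 + \left(\frac{\partial\phi}{\partial y}\right)^2 + \left(\frac{\partial\phi}{\partial z}\right)^2\right],
\end{aligned}
$$
so that
$$(\mathbf{v}\cdot\operatorname{grad})\mathbf{v} = \operatorname{grad}\left(\frac{v^2}{2}\right),$$
in the special case where curl v = 0. Further, we introduce a
function $\Pi = \int dp/\rho$, whose gradient is
$$\operatorname{grad}\Pi = \frac{d\Pi}{dp}\operatorname{grad} p = \frac{1}{\rho}\operatorname{grad} p.$$
Euler's equation for the steady state, where v is independent
of time, then becomes
$$\mathbf{F} = \operatorname{grad}\left(\Pi + \frac{v^2}{2}\right).$$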
As a result of this equation, we see that for irrotational flow to
occur, F must be the gradient of a certain quantity, or F must be
a conservative force, derivable from a potential. We may then
set $\mathbf{F} = -\operatorname{grad} V$, and Euler's equation becomes
$$\operatorname{grad}\left(V + \Pi + \frac{v^2}{2}\right) = 0,$$
or, integrated,
$$V + \Pi + \frac{v^2}{2} = \text{constant}.$$
This is Bernoulli's equation. For the special case of an incompressible
fluid, $\rho$ is independent of p, so that $\Pi$ is equal to $p/\rho$. In
that case the equation may be written
$$\rho V + p + \tfrac{1}{2}\rho v^2 = \text{constant}.$$
Bernoulli's equation is essentially an energy integral, the term
$\rho V$ representing the potential energy per unit volume, p the
contribution to the energy resulting from the pressure, and
$\tfrac{1}{2}\rho v^2$ the kinetic energy per unit volume. As we have stated,
Bernoulli's equation, supplemented for a compressible fluid
by the relation giving density as function of pressure, determines
the pressure at each point of space, when the velocity and external
potential are known. For instance, if there is no external force
field (V = 0), we see that the pressure decreases at points
where the velocity is high, which means at points where the tubes
of flow narrow down.
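The fall of pressure in a narrowing tube of flow is easily put in numbers. The sketch below combines Bernoulli's equation for an incompressible fluid with no external force field (V = 0) with the condition that the same flux passes every cross section of a tube of flow. It assumes, for simplicity, that the velocity is uniform over each cross section; the function name and the numerical values are likewise assumptions made for the example.

```python
def downstream_state(p1, v1, area1, area2, rho):
    """Velocity and pressure at the second cross section of a tube of flow.

    Incompressible, steady, irrotational flow with no body force:
      area1 * v1 = area2 * v2                     (same flux through both ends)
      p1 + rho v1^2 / 2 = p2 + rho v2^2 / 2       (Bernoulli's equation)
    """
    v2 = v1 * area1 / area2
    p2 = p1 + 0.5 * rho * (v1 ** 2 - v2 ** 2)
    return v2, p2


# Water-like density; the tube narrows to one quarter of its area.
v2, p2 = downstream_state(p1=2.0e5, v1=1.0, area1=4.0e-4, area2=1.0e-4, rho=1000.0)
```

Here the velocity rises fourfold where the tube narrows, and the pressure is correspondingly lower at the constriction than upstream.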
121. Viscous Fluids. — In Sec. 119 we mentioned the fact that
ideal fluids support no shearing stresses. This, however, is not
true of viscous fluids. Imagine a viscous liquid flowing horizontally,
the lower layers dragging along the bottom, and the
velocity increasing with height, so that $v_x = v_x(y)$, other components
of v are zero, if the xz plane is horizontal, y is vertical.
Then if we imagine a horizontal element of area in the liquid
at a certain height, the material above the element of area will
pull tangentially on the material below it on account of viscosity,
thus exerting a shearing stress. Experimentally, this stress,
which is $X_y$, is proportional to the rate of increase of horizontal
component of velocity with height: if k is the coefficient of
viscosity, $X_y = k\,\partial v_x/\partial y$. This is a special case of the general laws
governing stresses in a viscous medium, connecting the stresses
with the rates of change of the velocity components with position.
In the last chapter we have given the general form of Hooke's
law, the law giving stresses in an elastic medium in terms of the
strains. By analogy we can set up the relations for a viscous
fluid, but now the stresses are proportional, not to the strain
components themselves, but to their time derivatives. By
comparison with Eq. (5), Chap. XVI, we see that k takes
the place of the shear modulus, and that the component of
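strain $\partial\xi/\partial y + \partial\eta/\partial x$ must be replaced by its time derivative,
$\partial v_x/\partial y + \partial v_y/\partial x = \partial v_x/\partial y$ in our special case, since $v_y = 0$.
This tells us how in general we are to change Hooke's law for
the case of viscous incompressible fluids. We place
$\partial v_x/\partial x + \partial v_y/\partial y + \partial v_z/\partial z = \operatorname{div}\mathbf{v} = 0$, corresponding to
$\partial\xi/\partial x + \partial\eta/\partial y + \partial\zeta/\partial z = 0$ in the strains, replace $\mu$ by k and insert
the time derivatives of the strain components. Thus we have the following
relations between the stress and strain components for liquids:
$$
\begin{aligned}
X_x &= -p + 2k\frac{\partial v_x}{\partial x}, & X_y &= k\left(\frac{\partial v_x}{\partial y} + \frac{\partial v_y}{\partial x}\right), \\
Y_y &= -p + 2k\frac{\partial v_y}{\partial y}, & Y_z &= k\left(\frac{\partial v_y}{\partial z} + \frac{\partial v_z}{\partial y}\right), \\
Z_z &= -p + 2k\frac{\partial v_z}{\partial z}, & Z_x &= k\left(\frac{\partial v_z}{\partial x} + \frac{\partial v_x}{\partial z}\right),
\end{aligned}
\tag{6}
$$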
addition to the viscous stresses. Inserting the values of the
stress components in the equations of motion (2) of the previous
chapter and remembering that for an incompressible fluid we
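have the continuity equation $\operatorname{div}\mathbf{v} = \partial v_x/\partial x + \partial v_y/\partial y + \partial v_z/\partial z = 0$,
there follow the general equations of motion for
viscous liquids:
$$
\begin{aligned}
\rho X - \frac{\partial p}{\partial x} + k\nabla^2 v_x &= \rho\frac{dv_x}{dt}, \\
\rho Y - \frac{\partial p}{\partial y} + k\nabla^2 v_y &= \rho\frac{dv_y}{dt}, \\
\rho Z - \frac{\partial p}{\partial z} + k\nabla^2 v_z &= \rho\frac{dv_z}{dt},
\end{aligned}
$$
or in vector form: $\rho\mathbf{F} - \operatorname{grad} p + k\nabla^2\mathbf{v} = \rho\,d\mathbf{v}/dt$, differing from
Eq. (5) by the term $k\nabla^2\mathbf{v}$.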
122. Poiseuille's Law. — Suppose we have an incompressible
liquid flowing in a steady state in a horizontal cylinder of radius
R parallel to the long axis of the cylinder (x axis). We have
$v_y = v_z = 0$, and since there are no body forces X = Y = Z = 0.
The equation of continuity becomes $\partial v_x/\partial x = 0$, so that $v_x$ is a
function of y and z alone. Then
$dv_x/dt = v_x\,\partial v_x/\partial x + v_y\,\partial v_x/\partial y + v_z\,\partial v_x/\partial z = 0$.
Furthermore, if we take the divergence of the
fundamental equations of motion, we have:
$$\rho\operatorname{div}\mathbf{F} - \operatorname{div}\operatorname{grad} p + k\nabla^2(\operatorname{div}\mathbf{v}) = \rho\frac{d}{dt}(\operatorname{div}\mathbf{v}).$$
Now by the equation of continuity div v = 0, and in our case
of no external forces this reduces to
$$\operatorname{div}\operatorname{grad} p = \nabla^2 p = 0.$$
In our problem $\partial p/\partial y = \partial p/\partial z = 0$, so that $\partial^2 p/\partial x^2 = 0$.
The pressure is thus a linear function of x, so that we have a
constant pressure gradient in the tube. Of the three equations,
only the first is left:
$$\frac{\partial p}{\partial x} = k\left(\frac{\partial^2 v_x}{\partial y^2} + \frac{\partial^2 v_x}{\partial z^2}\right),$$
and since $\partial p/\partial x$ is constant = a, and we have cylindrical symmetry,
this reduces to
$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dv_x}{dr}\right) = \frac{a}{k},$$
where r is the distance from the axis of the cylinder. Integrated,
this yields $v_x = \dfrac{a}{4k}r^2 + b\ln r + c$, and since $v_x$ is finite for
r = 0, b = 0. If the liquid clings to the walls of the cylinder,
$v_x = 0$ when r = R, so that we find
$$v_x = \frac{a}{4k}(r^2 - R^2). \tag{8}$$
Thus the liquid flows in cylindrical tubes of constant velocity.
This type of motion is called "laminar" motion. The velocity
varies parabolically across a diameter of the cylinder.
The amount of liquid flowing per second through a cylindrical
ring of thickness dr, radius r, is
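$$dQ = 2\pi r v_x\,dr,$$
so that the total discharge rate of such a cylinder is
$$Q = 2\pi\int_0^R r v_x\,dr = -\frac{\pi a R^4}{8k} = \frac{\pi R^4}{8kL}(p_1 - p_2), \tag{9}$$
where we have placed the constant pressure gradient $a = -(p_1 - p_2)/L$,
the pressure falling from $p_1$ to $p_2$ over the length L of the tube.
This law, known as Poiseuille's law, furnishes a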
very nice experimental method of determining the coefficient
of viscosity of liquids.
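The integration leading to Eq. (9) can be checked by summing the contributions of thin cylindrical rings numerically. In the sketch below the function names, the number of rings, and the sample values (roughly those of water in a narrow tube, in SI units) are assumptions made for the illustration.

```python
import math


def velocity_profile(r, R, k, p1, p2, L):
    """Parabolic velocity v_x(r) = (a / 4k) (r^2 - R^2), with a = -(p1 - p2)/L."""
    a = -(p1 - p2) / L
    return a / (4.0 * k) * (r * r - R * R)


def discharge_numeric(R, k, p1, p2, L, n=1000):
    """Sum dQ = 2 pi r v_x dr over n rings, using the midpoint of each ring."""
    dr = R / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += 2.0 * math.pi * r * velocity_profile(r, R, k, p1, p2, L) * dr
    return total


def discharge_poiseuille(R, k, p1, p2, L):
    """Closed form of Eq. (9): Q = pi R^4 (p1 - p2) / (8 k L)."""
    return math.pi * R ** 4 * (p1 - p2) / (8.0 * k * L)


# Assumed sample values: tube radius 1 mm, length 0.5 m,
# viscosity 1.0e-3, pressure difference 1000 between the ends.
R, k, L = 1.0e-3, 1.0e-3, 0.5
p1, p2 = 1000.0, 0.0

Q_sum = discharge_numeric(R, k, p1, p2, L)
Q_formula = discharge_poiseuille(R, k, p1, p2, L)
```

The two results agree closely, and the strong dependence on the fourth power of the radius is evident on changing R. In an experimental determination one would instead measure Q and solve Eq. (9) for k.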
Problems
1. Liquid is confined between two parallel plates, so that it flows in two
dimensions. At a certain point, a pipe discharges liquid at a constant rate
into the region. Find the velocity potential, and velocity, as a function of
position. Show by direct calculation that the flow outward over any
circle about the source is the same.
2. A shallow tray containing fluid has a source at one point, an equal
sink at another, so that liquid flows in two dimensions from source to sink.
Find the equation of the equipotentials and the lines of flow, prove they are
circles and plot them. (Suggestion: since the equations are linear, the
potential or flux due to two sources is the sum of the solution for the separate
sources.)
3. Prove that $\dfrac{\partial}{\partial x}(1/r)$ is a solution of Laplace's equation. Investigate
the lines of flow connected with this as a potential. Draw the lines, in the
xy plane. What sort of physical situation would be described by this case?
4. Consider an ideal fluid at rest. It is subjected to an impulsive pressure
$(p) = \int_0^\tau p\,dt$, where $\tau$ indicates the interval of time during which the pressure
is applied. If no body forces act on the fluid, prove by integrating Euler's
equations, that the impulsive pressure divided by the density of the fluid
equals the velocity potential of the ensuing motion. This is the physical
significance of a velocity potential.
5. Show for a liquid in equilibrium under the action of gravity that the
pressure varies linearly with the depth below the surface. Calculate the
total force exerted on the surface of a submerged body by the liquid
and show that the resultant force is directed upwards and is given in
magnitude by Archimedes' principle. [Hint: If a vector has only one
component different from zero, e.g., A x , then Gauss's theorem becomes
$\iiint \dfrac{\partial A_x}{\partial x}\,dV = \iint A_x\cos(n, x)\,dS$.]
6. The free surface of a liquid is one of constant pressure. If an incom-
pressible fluid is placed in a cylindrical vessel and the whole rotated with
constant angular velocity $\omega$, show that the free surface becomes a paraboloid
of revolution. (Hint: Introduce a fictitious potential energy to take care of
centrifugal force and use the hydrostatic equations.)
7. A gas maintained at constant pressure p, flows steadily out of a small
hole into the atmosphere, pressure $p_0$. Assume the density constant. Find
the expressions for the velocity of efflux and for the force exerted on the gas
container due to the efflux. If the gas is oxygen at a pressure of 4 atmos-
pheres in the tank, calculate the efflux velocity (1) with the density constant,
and (2) taking into account the variation of density with pressure, assuming
an adiabatic expansion.
8. With the help of Gauss's theorem prove the theorem of the last chapter
that the stress tensor is symmetric.
9. Calculate the rate of discharge of a cylindrical pipe standing vertically,
the liquid flowing in laminar flow under the action of gravity only.
10. A perfect gas at constant temperature is in equilibrium under the
action of gravity. Find the relation between the pressure of the gas and
the height above the surface of the earth.
11. Carry through the derivation of the laws of motion of viscous fluids
using the modified form of Hooke's law and the general equations of motion
of an elastic medium.
CHAPTER XVIII
HEAT FLOW
The problem of heat flow, although of quite different physical
nature from elasticity and hydrodynamics, involves similar
mathematics. Indeed, Fourier was concerned with problems
of heat flow when he developed the series known by his name
which we have used so much in our study of vibrations. First
we set up the differential equation governing heat flow in a
manner similar to the reasoning of the preceding chapters.
123. Differential Equation of Heat Flow. — The fundamental
physical fact is that when there is a difference of temperature in a
material body, heat will flow, and the rate of flow is proportional
to the temperature gradient. Suppose we have a slab of thickness
L, area a, with a difference of temperature $T_2 - T_1$ between
the faces. Then the amount of heat flowing per second across
the face is $-ka\,(T_2 - T_1)/L$, where k is the thermal conductivity,
the negative sign meaning that if $T_2 > T_1$ the flow will be backward
toward low temperature. In the limit of an infinitely thin
slab, this is simply $-ka\,\partial T/\partial x$, if x is the coordinate measured in the
direction of the heat flow. Next, there is the fact that if heat
flows into a region, its temperature rises, the amount of rise being
given by the relation that the amount of heat flowing in equals
the change of temperature times the heat capacity, which in turn
is the specific heat c times the mass. Putting these together, we
obtain an equation which states the following: the rate of heat
flow into a body is proportional to the time rate of change of its
temperature; or, looking at it in another way, it is proportional
to the temperature gradient around its boundaries. By eliminat-
ing the heat flow, we obtain a differential equation for the
temperature.
Our first principle, which we have stated in the form that
$-ka\,\partial T/\partial x$ measures the heat flow across the area a perpendicular
to the x axis, is evidently a special case of the general law that
the flux density of heat flow is $f = -k\,\operatorname{grad} T$. This incidentally
shows us at once that, if k is a constant, f is derivable from a
potential, in this case kT, so that the curl of the flux is zero. The
surfaces of constant temperature are called isothermals, and they
serve as equipotentials, the lines of flow being at right angles to
the isothermals. The equation of continuity now states that the
time rate of increase of heat per unit volume equals the rate at
which the heat flows in over the surface, plus the rate at which
heat is produced inside. To raise the temperature of unit volume
one degree requires an amount of heat equal to the heat capacity,
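or $c\rho$, if c is the specific heat, $\rho$ the density of matter. Thus the
time rate of increase of heat is $c\rho$ times the time rate of increase
of temperature. We have then
$$\iiint c\rho\,\frac{\partial T}{\partial t}\,dv = -\iint f_n\,dS + \iiint P\,dv,$$
where P means the rate of production of heat per unit volume.
By Gauss's theorem, the second term becomes $-\iiint \operatorname{div} f\,dv$, so
that for a small volume we have
$$c\rho\,\frac{\partial T}{\partial t} = -\operatorname{div} f + P.$$
Substituting,
$$c\rho\,\frac{\partial T}{\partial t} = k\,\operatorname{div}\operatorname{grad} T + P = k\nabla^2 T + P. \qquad (1)$$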
This is the equation of heat flow. At a point where heat is not
being produced, it reduces to
$$\nabla^2 T = \frac{c\rho}{k}\,\frac{\partial T}{\partial t}, \qquad (2)$$
an equation similar to the wave equation as far as the dependence
on space is concerned. It contains, however, a first rather than
a second time derivative, and this results in solutions which are
exponentially damped, like a particle with resistance but no
restoring force, rather than oscillating solutions. The particular
case where the temperature is independent of the time, the steady
state, leads simply to Laplace's equation, the term in time
vanishing.
124. The Steady Flow of Heat. — The isothermals and lines of
flow for the steady flow of heat are determined from Laplace's
equation, and in some elementary cases we can find them with
great ease. First let us consider a one-dimensional flow, which we
obtain with a slab of a substance, like a window pane, assuming
that the temperature varies only with the coordinate x normal to
the surface, being independent of y and z. Laplace's equation
becomes $d^2T/dx^2 = 0$, so that $T = a + bx$, with a constant
temperature gradient. Thus if a face at $x = 0$ is kept at temperature
$T_0$, the other face at $x = L$ at $T_1$, the temperature at
intermediate points is given by $T = T_0 + (x/L)(T_1 - T_0)$. It
is this simple case which furnishes the basis for the usual defini-
tion of thermal conductivity.
The cylinder forms a slightly more difficult problem in steady
flow. For instance, let us ask for the steady state of temperature
within a pipe formed of two concentric cylinders, whose inside
and outside faces are kept at fixed temperatures. The tempera-
ture will depend only on r, and will be determined, on account of
the divergenceless nature of the flow, by the condition that the
same amount of heat flows across the surface of any cylinder with
radius intermediate between $r_0$ and $r_1$, the minimum and maximum
radii of the pipe. This amount of heat is the product of
the normal component of the flow, which is $f_r = -k\,(dT/dr)$,
by the area of the cylinder, which for unit length along the pipe
is $2\pi r$. In other words, $2\pi r f_r = -2\pi k r\,(dT/dr) = \text{constant}$,
$dT/dr = a/r$, $T = a \ln r + b$. The two constants can be
determined by fitting the temperatures at the two surfaces of the
pipe. This example is interesting in showing that the tempera-
ture gradient is not always a constant in the steady state. The
reason is very simple : the tubes of flow are not of constant cross-
sectional area, and thus with a divergenceless flow the number of
lines of flow per square centimeter, and consequently the magni-
tude of the temperature gradient and flux vector, must change
from point to point. The same thing is evident in the flow of
heat in a sphere, where the flow through concentric spheres must
be the same. Hence, since the areas of these spheres increase
proportionally to the squares of the radii, the temperature gradi-
ent must be inversely proportional to the square of the distance
from the center, and the temperature inversely as the first power.
These relations are just like those of the field and potential of a
point charge in electrostatics, and as we shall later see, for just
the same reason: both are solutions of Laplace's equation.
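The three elementary steady states of this section can be put side by side in a short numerical sketch. The function names and the sample numbers in the usage note are illustrative assumptions, not part of the text; the formulas are the ones derived above ($T = a + bx$ for the slab, $T = a \ln r + b$ for the pipe, and $T = a/r + b$ for the sphere), with the constants fitted to the two surface temperatures.

```python
import math

def slab_T(x, L, T0, T1):
    """Slab: T varies linearly from T0 at x = 0 to T1 at x = L."""
    return T0 + (x / L) * (T1 - T0)

def pipe_T(r, r0, r1, T0, T1):
    """Pipe wall: T = a ln r + b, fitted to T0 at r0 and T1 at r1."""
    a = (T1 - T0) / math.log(r1 / r0)
    return T0 + a * math.log(r / r0)

def pipe_heat_per_length(r, r0, r1, T0, T1, k):
    """Heat per second crossing the cylinder of radius r, per unit length:
    2*pi*r*f_r with f_r = -k dT/dr. It should not depend on r."""
    dT_dr = (T1 - T0) / (math.log(r1 / r0) * r)
    return -2.0 * math.pi * r * k * dT_dr

def shell_T(r, r0, r1, T0, T1):
    """Spherical shell: T = a/r + b, fitted to T0 at r0 and T1 at r1."""
    a = (T1 - T0) / (1.0 / r1 - 1.0 / r0)
    return T0 + a * (1.0 / r - 1.0 / r0)
```

With, say, r0 = 1, r1 = 2, T0 = 100, T1 = 20 and k = 0.5 (arbitrary values in arbitrary units), `pipe_heat_per_length` returns the same number at every intermediate radius, which is the divergenceless condition used to derive the logarithmic profile.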
125. Flow Vectors in Generalized Coordinates. — Complicated
problems in the steady flow of heat, as in hydrodynamics and
electrostatics, are best approached by introducing curvilinear
coordinates, so that the boundaries of the bodies are expressed
by coordinate surfaces, as with the cylinder and sphere. This
suggests the formulation of the equation of steady flow, or
Laplace's equation, in such general coordinates. Let the coordinates
be $q_1, q_2, q_3$ and let them be orthogonal coordinates, so
that the three sets of coordinate surfaces, $q_1$ = constant, $q_2$ =
constant, $q_3$ = constant, intersect at right angles. Now let us
move a distance $ds_1$ normal to a surface $q_1$ = constant. By doing
so, $q_2$ and $q_3$ do not change, but we reach another surface on which
$q_1$ has increased by $dq_1$, which in general is different from $ds_1$.
Thus, with polar coordinates, if the displacement is along the
radius, so that r is changing, $ds = dr$; but if it is along a tangent
to a circle, so that $\theta$ is changing, $ds = r\,d\theta$. In general, we have
$$dq_1 = h_1\,ds_1, \qquad dq_2 = h_2\,ds_2, \qquad dq_3 = h_3\,ds_3, \qquad (3)$$
where in polar coordinates the h connected with r is unity, but
that connected with $\theta$ is 1/r. The first step in setting up vector
operations in any set of coordinates is to derive these h's, which
can be done by elementary geometrical methods.
126. Gradient in Generalized Coordinates. — The component
of the gradient of a scalar S in any direction is its directional
derivative in that direction. Thus the component in the direction
1 (normal to the surface $q_1$ = constant) is $\partial S/\partial s_1 = h_1\,\partial S/\partial q_1$. For
instance, in polar coordinates, the r component is $\partial S/\partial r$, and the $\theta$
component $\frac{1}{r}\,\frac{\partial S}{\partial \theta}$.
Fig. 32. Element of volume for vector operations in curvilinear coordinates.
127. Divergence in Generalized
Coordinates. — Let us apply
Gauss's theorem to a small volume
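element $dV = ds_1\,ds_2\,ds_3$, bounded
by coordinate surfaces at $q_1$, $q_1 + dq_1$, etc., as in Fig. 32. If
we have a vector A, of components $A_1, A_2, A_3$ along the three
curvilinear axes, the flux into the volume over the face at $q_1$,
whose area is $ds_2\,ds_3$, is $(A_1\,ds_2\,ds_3)_{q_1}$, and the corresponding flux
out over the opposite face is $(A_1\,ds_2\,ds_3)_{q_1 + dq_1}$, where we note that
the area $ds_2\,ds_3$ changes with $q_1$ as well as the flux density $A_1$.
Thus the flux out over these two faces is
$$\frac{\partial}{\partial q_1}(A_1\,ds_2\,ds_3)\,dq_1 = \frac{\partial}{\partial q_1}\left(\frac{A_1}{h_2 h_3}\right) dq_1\,dq_2\,dq_3 = h_1 h_2 h_3\,\frac{\partial}{\partial q_1}\left(\frac{A_1}{h_2 h_3}\right) dV.$$
Proceeding similarly
with the other pairs of faces, and setting the whole outward flux
equal to div A dV, we have
$$\operatorname{div} A = h_1 h_2 h_3\left[\frac{\partial}{\partial q_1}\left(\frac{A_1}{h_2 h_3}\right) + \frac{\partial}{\partial q_2}\left(\frac{A_2}{h_3 h_1}\right) + \frac{\partial}{\partial q_3}\left(\frac{A_3}{h_1 h_2}\right)\right]. \qquad (4)$$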
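128. Laplacian. — Writing the Laplacian as div grad $\phi$, and
placing $A_1 = \operatorname{grad}_1 \phi$, etc., in the expression for div A, we have
$$\nabla^2\phi = \operatorname{div}\operatorname{grad}\phi = h_1 h_2 h_3\left[\frac{\partial}{\partial q_1}\left(\frac{h_1}{h_2 h_3}\frac{\partial\phi}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\left(\frac{h_2}{h_3 h_1}\frac{\partial\phi}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\left(\frac{h_3}{h_1 h_2}\frac{\partial\phi}{\partial q_3}\right)\right]. \qquad (5)$$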
It can easily be verified that this formula leads to the same values
for the Laplacian in special cases which we have already obtained
by direct differentiation in Chap. XV. But now we can understand
the formula better, for we see that the terms like $h_1/h_2 h_3$
appearing inside the first differentiation arise from the fact that
the flux through the opposite sides of a volume may differ not
only on account of variation of the flux density, but also because
the opposite sides can have different areas, as they do in the small
volume element determined by coordinate surfaces with curvi-
linear coordinates.
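As a numerical check on the general formula for the Laplacian, the following sketch evaluates Eq. (5) by central differences for any orthogonal system whose h's are supplied. It keeps the convention of this chapter, $dq_i = h_i\,ds_i$, so that for spherical coordinates $(r, \theta, \phi)$ the h's are 1, 1/r, and $1/(r\sin\theta)$. The function names, the step size `eps`, and the test point are assumptions made only for illustration.

```python
import math

def spherical_h(r, theta, phi):
    """h's for spherical polar coordinates in the convention dq_i = h_i ds_i."""
    return (1.0, 1.0 / r, 1.0 / (r * math.sin(theta)))

def laplacian_curvilinear(scalar, h, q, eps=1e-3):
    """Evaluate Eq. (5) at the point q = (q1, q2, q3) by central differences.

    scalar(q1, q2, q3) is the function whose Laplacian is wanted;
    h(q1, q2, q3) returns the triple (h1, h2, h3).
    """
    def flux_term(i, point):
        # (h_i / (h_j h_k)) * d(scalar)/dq_i, evaluated at `point`
        h1, h2, h3 = h(*point)
        weight = (h1, h2, h3)[i] ** 2 / (h1 * h2 * h3)
        up, dn = list(point), list(point)
        up[i] += eps / 2
        dn[i] -= eps / 2
        return weight * (scalar(*up) - scalar(*dn)) / eps

    total = 0.0
    for i in range(3):
        fwd, bwd = list(q), list(q)
        fwd[i] += eps / 2
        bwd[i] -= eps / 2
        total += (flux_term(i, fwd) - flux_term(i, bwd)) / eps
    h1, h2, h3 = h(*q)
    return h1 * h2 * h3 * total
```

Applied to $r\cos\theta$ (which is simply z) the result should be close to zero, and applied to $r^2$ it should be close to 6, the values obtained by direct differentiation in rectangular coordinates.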
129. Steady Flow of Heat in a Sphere. — Having obtained
Laplace's equation in arbitrary coordinate systems, the problem
of solving for the steady flow of heat becomes that of solving
Laplace's equation in a suitable system, subject to certain bound-
ary conditions. For instance, suppose we know that the surface
of a sphere, radius $r_0$, is kept at a temperature independent of
time, though depending on the angles $\theta$ and $\phi$. We then can
set up the steady distribution of temperature within the sphere
by solving Laplace's equation in spherical coordinates. The
problem is mathematically like that of Problems 6, 7, and 8,
Chap. XV, the vibration of a sphere, if we seek a solution inde-
pendent of time. Just as in those problems, we separate vari-
ables in Laplace's equation, obtaining solutions of the form
$\sin m\phi\,P_l^m(\cos\theta)\,R$, where the P's are called associated Legendre
polynomials, and where R satisfies the equation
$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{l(l+1)}{r^2}\,R = 0,$$
which can be immediately solved by setting $R = r^n$, where n is an
integer to be determined. Substituting, this leads at once to the
equation $n(n + 1) = l(l + 1)$, which has two solutions, $n = l$
or $n = -(l + 1)$. In the present case, where the function must
stay finite within the sphere, at r = 0, we cannot have inverse
powers, so that the only allowable functions are $r^l$. Other
problems solved by the same method, however, as for instance
those of the electrostatic fields of distributions of charges, often
involve functions which may become infinite at r = 0 but remain
finite at large r's, and they must be expanded in the series of
inverse powers. We now have for a general solution
$$\sum_l \sum_m (A_{ml}\sin m\phi + B_{ml}\cos m\phi)\,P_l^m(\cos\theta)\,r^l.$$
To get the coefficients of the various terms in the sum, we set
$r = r_0$, and determine the coefficients so that the resulting function
of $\theta$ and $\phi$ is the assumed temperature distribution. This
amounts to an expansion of the assumed function in series in the
orthogonal functions $(\sin m\phi \text{ or } \cos m\phi)\,P_l^m(\cos\theta)$, and can be
done by the usual methods for such expansions.
130. Spherical Harmonics. — To understand the physical
meaning of the various terms of the expansion, we should con-
sider the spherical harmonics, or functions of angles. Solving
for these as in the problems quoted above, we find for the first
few functions the following values:
l = 0, m = 0: constant
l = 1, m = ±1: $(\sin\phi \text{ or } \cos\phi)\sin\theta$
       m = 0: $\cos\theta$
l = 2, m = ±2: $(\sin 2\phi \text{ or } \cos 2\phi)\sin^2\theta$
       m = ±1: $(\sin\phi \text{ or } \cos\phi)\sin\theta\cos\theta$
       m = 0: $3\cos^2\theta - 1$.
These functions are shown graphically in Fig. 33, where the
intersections of the nodal planes or cones with unit sphere are
drawn. Thus the functions with l = 1 have one nodal plane,
which may be perpendicular to any one of the three coordinate
axes. This is seen most easily by remembering that $x =
r\sin\theta\cos\phi$, $y = r\sin\theta\sin\phi$, $z = r\cos\theta$, so that the three
solutions of the problem corresponding to l = 1 (r times the
functions of angle) are simply x, y, z. These are obviously
solutions of Laplace's equation, and have the nodal planes
x = 0, y = 0, z = 0, respectively. Similarly by making linear
combinations of these three functions, we obtain solutions having
any desired nodal plane. This is analogous to the degeneracy
in the circular membrane, discussed in Sec. 103. With l = 2,
there are two nodal surfaces, and so on. For discussing the
vibrations of a sphere, of course these nodes would represent
the regions of no displacement, the material on one side being
displaced one way, the material on the other side in the opposite
direction. With heat flow, the separate terms represent simple
types of steady temperature distribution. For instance, the
terms with l = 1 represent spheres in which the surface temperature
varies as the cosine of the colatitude angle, or as the distance
in a direction along the axis, and our solution tells us that in
this case the temperature within the body varies linearly with
distance, as in a flat slab. Higher terms represent more complicated
solutions, and by superposing them any desired steady
heat flow can be built up.
Fig. 33. Spherical harmonics. Figures represent nodal lines on the surface of
a sphere, for the functions $\sin m\phi\,P_l^m(\cos\theta)$ and $\cos m\phi\,P_l^m(\cos\theta)$. Upper line,
l = 1; lower line, l = 2.
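The statement that $r^l$ times each of the listed functions of angle satisfies Laplace's equation can be tested numerically. The sketch below builds these products from the table of Sec. 130 and applies a second-difference Laplacian in rectangular coordinates. The function names, the step `d`, and the sample point are illustrative assumptions; the point must not lie at the origin.

```python
import math

def solid_harmonic(l, m, kind, x, y, z):
    """r**l times the angular functions tabulated for l = 0, 1, 2.

    kind is "sin" or "cos" and selects sin(m*phi) or cos(m*phi);
    it has no effect when m = 0.
    """
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r)
    phi = math.atan2(y, x)
    trig = math.sin if kind == "sin" else math.cos
    s, c = math.sin(theta), math.cos(theta)
    angular = {
        (0, 0): 1.0,
        (1, 1): trig(phi) * s,
        (1, 0): c,
        (2, 2): trig(2 * phi) * s * s,
        (2, 1): trig(phi) * s * c,
        (2, 0): 3 * c * c - 1,
    }[(l, m)]
    return r ** l * angular

def cartesian_laplacian(f, x, y, z, d=1e-3):
    """Sum of the three second differences of f, divided by d**2."""
    center = f(x, y, z)
    total = 0.0
    for dx, dy, dz in ((d, 0, 0), (0, d, 0), (0, 0, d)):
        total += f(x + dx, y + dy, z + dz) + f(x - dx, y - dy, z - dz) - 2 * center
    return total / (d * d)
```

For l = 1 the three products reduce to x, y, and z, as remarked in the text, and for every entry of the table the computed Laplacian should vanish to within rounding error.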
131. Fourier's Method for the Transient Flow of Heat. — The
simplest type of problem in the transient flow of heat is the
following: At t = 0, a body has a temperature which is an
arbitrary function of position. At that instant, it is plunged
into a cooling bath of some sort, which instantly cools its sur-
faces to a fixed distribution of surface temperature which is
maintained after that. The problem is to find the temperature
throughout the body as a function of time as it cools from its
initial to its final steady state. This can be easily reduced to a
simpler case. We write the temperature at any time as the sum
of two terms, the transient solution, and the steady-state solu-
tion. The latter is the temperature distribution set up by the
cooling baths around the surface, and is discussed as in the last
few sections in which steady flow of heat has been considered.
The transient solution starts off with a temperature distribution
which, added to the steady-state solution, gives the assumed
initial temperature distribution of the body, and then gradually
damps down to zero, finally leaving the steady-state solution
only. Since at any instant after t = 0 the steady-state solution
by itself gives the correct boundary temperature about the sur-
face of the body, we see that the transient must give zero tempera-
ture at all points of the surface, independent of time. Thus
the transient by itself is the solution of the problem in which a
body is heated to an arbitrary temperature distribution at t = 0,
after that is plunged into a cooling bath maintaining its whole
surface at temperature zero, and gradually cools down to this
temperature. We investigate this transient problem.
First we take the one-dimensional case, again of a slab, in
which the initial temperature is an arbitrary function of x, but
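at all times after t = 0 the two faces, at x = 0 and x = L, are
maintained at T = 0. The heat-flow equation becomes
$$\frac{\partial^2 T}{\partial x^2} = \frac{c\rho}{k}\,\frac{\partial T}{\partial t} = A\,\frac{\partial T}{\partial t}, \qquad A = \frac{c\rho}{k}.$$
We solve this equation by separation of variables. If $T =
X(x)\,\Theta(t)$, and if we substitute in the equation and divide by T,
we have
$$\frac{1}{X}\frac{d^2X}{dx^2} = \frac{A}{\Theta}\frac{d\Theta}{dt} = -C^2.$$
Then separating we have
$$\frac{d\Theta}{dt} + \frac{C^2}{A}\,\Theta = 0, \qquad \frac{d^2X}{dx^2} + C^2 X = 0.$$
The solutions are
$$\Theta = e^{-C^2 t/A}, \qquad X = \sin Cx \ \text{or} \ \cos Cx.$$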
We see that the temperature decreases exponentially with the
time, approaching a constant value, a very reasonable behavior.
The boundary condition is now T = 0 when x = 0, x = L,
and we satisfy this as we would with the vibrating string: we
take only sines, and only those which reduce to zero at x = L;
that is, we take $\sin(n\pi x/L)$, where n is an integer. In other
words, $C = n\pi/L$, so that the function is constant $\times\ e^{-(n^2\pi^2/AL^2)t}
\sin(n\pi x/L)$, and the whole solution, writing in the value of A, is
$$T = \sum_n K_n\,e^{-(n^2\pi^2 k/c\rho L^2)\,t}\,\sin\frac{n\pi x}{L}. \qquad (6)$$
Let us assume that the temperature distribution at t = 0
is T = f(x). Then we wish to find the coefficients $K_n$, determining
the temperature at later times. At t = 0 the exponentials
go to 1, so that we have $f(x) = \sum_n K_n \sin\frac{n\pi x}{L}$. We can then find
the coefficients $K_n$ by Fourier's method, so that the problem is
solved. The qualitative nature of the solution is easy to see.
The original shape of the temperature curve will be distorted
as time goes on, since the terms with high n damp down more
rapidly than the others. After a certain lapse of time the whole
slab will have become cooler, but also with a more simple tem-
perature distribution, approximating the single term with n = 1.
Thus, for instance, if it is originally all at a constant high tem-
perature, and then is cooled, the original temperature curve
would rise discontinuously from 0 at the edge to a constant
value T inside. But after a time the curve would be like a single
loop of a sine curve, showing that the edges would cool more
rapidly than the middle.
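The cooling of a uniformly heated slab can be followed numerically from Eq. (6). In the sketch below the initial temperature is taken as a constant `T_init`, for which the Fourier sine coefficients are $K_n = 4T_{\text{init}}/n\pi$ for odd n and zero for even n. The name `a2` stands for $k/c\rho$ (that is, 1/A), and the function name and the number of terms kept are assumptions for illustration only.

```python
import math

def slab_transient(x, t, L, a2, T_init, terms=200):
    """Partial sum of Eq. (6) for a slab initially at the uniform
    temperature T_init, with both faces held at zero for t > 0.

    a2 = k/(c*rho). Only odd n contribute for a uniform initial state.
    """
    total = 0.0
    for n in range(1, 2 * terms, 2):  # n = 1, 3, 5, ...
        K_n = 4.0 * T_init / (n * math.pi)
        decay = math.exp(-((n * math.pi / L) ** 2) * a2 * t)
        total += K_n * decay * math.sin(n * math.pi * x / L)
    return total
```

At t = 0 the partial sum reproduces the flat initial temperature only approximately, because of the discontinuity at the faces. After a short time the terms with high n have died out and the profile is nearly the single sine loop with n = 1, as described above.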
The transient flow of heat in bodies of other shape may be
considered by extensions of the same method. Thus the transient
flow in the cylinder or sphere can be handled by introducing
cylindrical or spherical polar coordinates, and separating vari-
ables just as for the vibration problems. The solutions, as far
as the coordinates are concerned, come out as with vibrations,
leading, for example, to sines and cosines of the angle, and Bessel's
functions of r, in the case of two-dimensional flow in a circle or
cylinder, but the time enters as a real exponential damping down
to zero, rather than a complex exponential or sinusoidal function.
Special cases are discussed in the problems.
132. Integral Method for Heat Flow.— There is another, differ-
ent, method of great use in discussing the transient flow of heat.
This method is based on an important particular solution of the
heat-flow equation. If we consider again the one-dimensional
flow, and let $a^2 = k/c\rho$, we can easily show that the function
$$f(x - x', t) = \frac{1}{2a\sqrt{\pi t}}\,e^{-(x - x')^2/4a^2 t} \qquad (7)$$
is a solution of the equation, where x' is an arbitrary constant.
To prove this, it is only necessary to substitute in the differential
equation. The graph of the function f, plotted against x for
different values of t, as in Fig. 34, has a sharp maximum at x = x',
looking like the familiar Gauss curve for probability distributions.
At t = 0 the curve is coincident with the x axis everywhere
except at x = x', where it forms an infinitely high and narrow
mountain, so that the area under the curve is finite. As time
goes on, this mountain becomes flatter and broader, until finally
the function is zero everywhere.
Fig. 34. Function f(x - x', t) of Eq. (7), as function of x, for different t's.
The function represents temperature distribution at different times resulting from
initial conditions where the temperature is infinite at x', zero elsewhere.
The function f can be used to discuss the following problem:
At t = 0 the temperature throughout an infinite body is given
by a function $T_0(x)$, and we are interested in the way in which this
temperature distribution changes with time. We can break up
the problem into a sum of other simpler problems, by dividing up
the distance x into small intervals, by a succession of points $x_1$,
$x_2, \ldots, x_n$. We set up the following problems:
1. The initial temperature is $T_0(x_0)$ between $x_0$ and $x_1$, but
is zero elsewhere;
2. The initial temperature is $T_0(x_1)$ between $x_1$ and $x_2$, but
is zero elsewhere;
n. The initial temperature is $T_0(x_{n-1})$ between $x_{n-1}$ and $x_n$,
but is zero elsewhere.
The initial temperature distribution connected with one of
these problems would be similar to the curve of Fig. 34, for very
small value of t, in that it would be large in a very small region,
negligible or zero elsewhere. To make the maximum come at the
right place, we must choose x' for the ith problem equal to $x_i$.
As time goes on, the function f gives a good approximation to the
way in which the temperature in this simple problem changes.
Now if, at t = 0, we add together all the temperatures of Probs. 1
to n, we get the correct initial distribution of temperature.
Therefore, if we add all the solutions at a later time, we again
get the solution for the whole problem. This, of course, actually
becomes an integral, the element of the integrand connected with
the interval $dx_i$, which equals $x_{i+1} - x_i$, being proportional to
$T_0(x_i)\,f(x - x_i, t)\,dx_i$. As a matter of fact, the constant of
proportionality in f is so chosen that this gives just the right
answer:
$$T(x, t) = \int_{-\infty}^{\infty} T_0(x')\,f(x - x', t)\,dx'. \qquad (8)$$
To prove this, we need to do two things: first, prove that it is a
solution of the heat-flow equation; secondly, show that it
approaches the correct value at t = 0. The first is obvious, for
the integrand, regarded as a function of x and t, has already been
shown to be a solution of the equation, and on account of the linear
nature of the differential equation a sum of solutions is a solution.
For the second, we note that at t = 0 the function $f(x - x', t)$ has
appreciable values only at x = x'. The whole integral will then
come from the immediate neighborhood of x' = x, so that we
may insert this value in $T_0$, and take it outside the integral sign,
obtaining
$$T(x, 0) = T_0(x)\int_{-\infty}^{\infty} f(x - x', 0)\,dx'.$$
The integral is $\frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-u^2}\,du$, where $u = \frac{x - x'}{2a\sqrt{t}}$, and this equals
unity. Hence we have shown that $T(x, 0) = T_0(x)$, so that we
have verified our solution.
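The integral of Eq. (8) is easily evaluated numerically. The following sketch sums $T_0(x')\,f(x - x', t)\,dx'$ over small intervals by the midpoint rule, exactly in the spirit of the subdivision into problems 1 to n, and compares the result with a closed form for one simple case: initial temperature unity between x = -1 and x = 1, zero elsewhere. That closed form, written with the error function, follows from carrying out the integral with the substitution u above. The function names and the number of steps are assumptions for illustration.

```python
import math

def kernel(x, xp, t, a):
    """Eq. (7): exp(-(x - x')**2 / (4 a**2 t)) / (2 a sqrt(pi t))."""
    return math.exp(-((x - xp) ** 2) / (4 * a * a * t)) / (2 * a * math.sqrt(math.pi * t))

def temperature(x, t, a, T0, lo, hi, steps=4000):
    """Eq. (8) by the midpoint rule; T0 is assumed zero outside [lo, hi]."""
    dxp = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        xp = lo + (i + 0.5) * dxp
        total += T0(xp) * kernel(x, xp, t, a)
    return total * dxp

def box_exact(x, t, a):
    """Closed form for T0 = 1 on -1 < x < 1 and zero elsewhere."""
    s = 2 * a * math.sqrt(t)
    return 0.5 * (math.erf((x + 1) / s) - math.erf((x - 1) / s))
```

Taking $T_0 = 1$ over a range much wider than $2a\sqrt{t}$ gives a temperature of unity at interior points, which is the numerical counterpart of the statement that the integral of f over all x' equals unity.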
By a slight variation, it is possible to solve the problem in which
the temperature of a semi-infinite slab bounded by x = 0 is
initially any desired value, and in which the surface is kept at
T = 0 at all subsequent times. Let the initial temperature be
$T_0(x)$, where this function is defined only for positive x's, inside
the slab. We now define an odd function equal to $T_0(x)$ for
positive x's, equal therefore to $-T_0(-x)$ for negative x's. If
we set up an infinite slab with this temperature distribution,
then on account of symmetry the temperature at x = 0 will
always be zero, and our boundary condition is satisfied, the part
of the solution for positive x's being the desired function.
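The device of the odd function can be tried numerically. With the odd extension, the integral over all x' folds into an integral over positive x' alone, with integrand $T_0(x')\,[f(x - x', t) - f(x + x', t)]$. The sketch below evaluates this by the midpoint rule. For a uniform initial temperature of unity the result should agree with $\operatorname{erf}(x/2a\sqrt{t})$, a closed form obtained by carrying out the same integral analytically. The function names, the cutoff `x_max`, and the step count are illustrative assumptions.

```python
import math

def kernel(x, xp, t, a):
    """The particular solution f(x - x', t) of the one-dimensional equation."""
    return math.exp(-((x - xp) ** 2) / (4 * a * a * t)) / (2 * a * math.sqrt(math.pi * t))

def half_space_T(x, t, a, T0, x_max, steps=20000):
    """Temperature in the semi-infinite slab x > 0 with the face held at zero.

    T0(x') is the initial temperature for x' > 0; the odd extension
    contributes the subtracted image term f(x + x', t). The integral is
    cut off at x_max, which should be large compared with 2 a sqrt(t).
    """
    dxp = x_max / steps
    total = 0.0
    for i in range(steps):
        xp = (i + 0.5) * dxp
        total += T0(xp) * (kernel(x, xp, t, a) - kernel(x, -xp, t, a))
    return total * dxp
```

At x = 0 the two terms cancel for every x', so the boundary condition is satisfied at all times regardless of the initial distribution.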
Integral methods similar to that described can be used also
to discuss the problem in which the surface of a semi-infinite
slab is kept at a temperature which varies in an arbitrary way
with time. Two- and three-dimensional problems can also be
treated, though the principles are not essentially different from
those already considered.
One interesting feature of heat flow is brought out by the
integral solution which we have just used. That is its irreversi-
ble nature. Thermodynamically, heat conduction is a typical
irreversible process, and this is shown in the fact that heat
always flows from the warmer to the cooler body, never in the
opposite direction. With reversible processes, as for instance
vibration problems, one can change the sign of the time where it
appears in the solution and still have a possible solution of the
equation; a vibration running backward is not essentially
different from one running forward. But that is not the case
in the heat-flow equation, as we see easily from Eq. (7), where,
if we attempt to give t a negative value, the solution becomes
imaginary. The essential mathematical difference between the
two cases is that in heat flow a first time derivative appears,
while in vibration problems and wave equations there is a second
time derivative. This second time derivative is unchanged
when t is changed to —t, whereas the first time derivative in
the heat-flow equation changes sign with t, so that, if a given
function satisfies the equation, it will no longer satisfy it if time
is reversed.
Problems
1. Derive the divergence, gradient, and Laplacian in spherical polar
coordinates by the general method of this chapter.
2. Discuss the steady flow of heat in a spherical shell contained between
two concentric spheres, the temperature being an arbitrary function of
position over both surfaces.
3. Discuss the steady two-dimensional flow of heat in a semi-infinite
rectangular bar bounded by x = 0, x = L, y = 0, extending to infinity
along the y axis, subject to the boundary condition that the temperature
is zero along the two infinite sides of the bar, but that it is an arbitrary
function of x along the end from x = 0 to x = L. Build up the solution
out of individual solutions varying sinusoidally with x, and exponentially
with y, noting that they must decrease rather than increase exponentially
as y increases.
4. Discuss the steady flow of heat in a semi-infinite cylindrical rod with a
flat end, if the temperature is kept at zero along the cylindrical face, but is
an arbitrary function of position on the end.
5. A slab is heated to a uniform temperature $T_1$, then plunged in a bath
which keeps its temperature at $T_0$. Find the interior temperature as a
function of the time, computing and drawing several graphs, so chosen as
to show the progress of the cooling process.
6. For small times after the cooling process has commenced in Prob. 5,
the interior temperature will not have changed appreciably, and the slab
will act practically like a semi-infinite slab. Compare the solution of
Prob. 5, using Fourier's method, with the corresponding solution by the
integral method, computing both curves and comparing.
7. In an infinite body the temperature is initially unity between the
planes x = -1 and x = 1, and is zero everywhere else. Plot the temperature
as a function of x for several instants of time, and finally for $t = \infty$.
(… $e^{-u^2}\,du$.)
8. Prove that the integral $\int_{-\infty}^{\infty} e^{-u^2}\,du = \sqrt{\pi}$. (Suggestion: Multiply this
integral by the equal integral $\int_{-\infty}^{\infty} e^{-v^2}\,dv$, and consider u and v as Cartesian
coordinates in a plane. Introduce polar coordinates in the plane, carrying
out the integration in those coordinates.)
9. Show that a particular integral of the equation for heat flow in an
infinite medium is constant $\times\ t^{-3/2}\,e^{-r^2/4a^2 t}$, where r is the distance from the origin.
Discuss the initial temperature distribution corresponding to this solution.
10. Show that the integral
$$T = \frac{1}{(2a\sqrt{\pi t})^3}\iiint T_0(x', y', z')\,e^{-r^2/4a^2 t}\,dx'\,dy'\,dz'$$
is a general solution of the heat-flow equation in three dimensions corresponding
to an initial temperature distribution of $T_0(x, y, z)$, where $r^2 =
(x - x')^2 + (y - y')^2 + (z - z')^2$.
CHAPTER XIX
ELECTROSTATICS, GREEN'S THEOREM, AND
POTENTIAL THEORY
The problems of electrostatics are practically identical mathe-
matically with those of flow, which we have been considering
in the last few chapters. The fundamental physical law is
very simple. Electric charges exert forces on each other, given
by Coulomb's law, which states that the force is directed along
the line of centers, and equal to $ee'/r^2$, where e and e' are the
strengths of the charges, r the distance between. The force
on a particular charge is then given as the sum of the individual
attractions and repulsions exerted by all the other charges.
The force per unit charge at any point is the intensity of the
electric field, a vector function of position. The lines tangent
to the force vector, similar to the lines of flow in the last two
chapters, are called the lines of force.
133. The Divergence of the Field.— Consider the field of a
point charge at the origin of coordinates. The field intensity
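E is a vector of magnitude $e/r^2$, pointing out along the radius;
its components are thus
$$E_x = \frac{ex}{r^3}, \qquad E_y = \frac{ey}{r^3}, \qquad E_z = \frac{ez}{r^3}.$$
We then have
$$\operatorname{div} E = \frac{\partial}{\partial x}\left(\frac{ex}{r^3}\right) + \frac{\partial}{\partial y}\left(\frac{ey}{r^3}\right) + \frac{\partial}{\partial z}\left(\frac{ez}{r^3}\right) = e\left[\frac{3}{r^3} - \frac{3(x^2 + y^2 + z^2)}{r^5}\right] = 0.$$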
We thus see that the field of a point charge is divergenceless.
In other words, if we represent the field strength by the number
of lines of force per square centimeter, these lines will never start
or stop in empty space. They will, of course, start or stop on
charges. We cannot see this directly, but we can prove it by
using Gauss's theorem. Take a small sphere of radius R about
the origin. Then we know that the volume integral of the
divergence of E over the volume equals the surface integral
of the normal component of E. This component is $e/R^2$, and
the surface area is $4\pi R^2$, so that the surface integral in question
is $4\pi e$. Thus the volume integral of the divergence over our
small volume is $4\pi e$, which is different from zero. Since the
number of lines emerging across an area equals the field strength,
the total number of lines of force diverging from the charge e
is also $4\pi e$.
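The result that $4\pi e$ lines diverge from a charge does not depend on the charge sitting at the center of the sphere. The sketch below integrates the outward normal component of E numerically over a sphere centered at the origin, for a point charge placed anywhere. The function name, grid sizes, and sample positions are assumptions made for illustration; the field is the Coulomb field $e/r^2$ of the text, measured from the position of the charge.

```python
import math

def flux_through_sphere(e, charge_pos, R, n_theta=200, n_phi=200):
    """Surface integral of the outward normal component of E over a sphere
    of radius R centered at the origin, for a point charge e at charge_pos.
    Midpoint rule in the colatitude theta and the azimuth phi."""
    cx, cy, cz = charge_pos
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # outward unit normal at this surface point
            nx, ny, nz = sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t
            # vector from the charge to the surface point
            dx, dy, dz = R * nx - cx, R * ny - cy, R * nz - cz
            dist3 = (dx * dx + dy * dy + dz * dz) ** 1.5
            E_n = e * (dx * nx + dy * ny + dz * nz) / dist3
            total += E_n * R * R * sin_t * d_theta * d_phi
    return total
```

For a charge anywhere inside the sphere the sum should come out close to $4\pi e$, and for a charge outside it should come out close to zero, since lines entering the sphere also leave it.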
Now consider the field of many point charges. The field of
each charge separately has zero divergence. Therefore, since
the divergence of the sum of several functions is the sum of the
divergences, it is plain that the divergence of the whole field
vanishes: div E = 0 in general. The only exception is for those
points where there is charge, for there we have seen that the
divergence does not vanish. Let us see what does happen there.
In the first place we introduce $\rho$, the volume density of charge.
Now take a small volume dv, containing a charge $\rho\,dv$. Surely
if dv is small enough this field will be just as if the same charges
were concentrated at a point. Thus $4\pi\rho\,dv$ lines will diverge
from the charge, or $\iint E_n\,dS = \operatorname{div} E\,dv = 4\pi\rho\,dv$. Dividing by
dv, we have
$$\operatorname{div} E = 4\pi\rho. \qquad (1)$$
This is the general equation for the divergence of the field, and
we see that it reduces to div E = 0 at points where the charge
density vanishes. This equation, $\operatorname{div} E = 4\pi\rho$, is mathematically
equivalent to the continuity equation
$$\frac{\partial\rho}{\partial t} = -\operatorname{div} f + P,$$
if we set the time derivative equal to zero, and consider $4\pi\rho$ as
the quantity analogous to the rate of production of material.
Here, of course, there is no actual idea of flow, the analogy being
merely mathematical.
134. The Potential. — We can immediately show that the curl
of the field of a point charge vanishes. And unlike the divergence
equation, this is true everywhere, even right at the charge. Then,
if we superpose many charges, the curl still is zero, so that we
have the general equation curl E = 0. This holds in all static
cases (we shall later have a term to add to the equation, containing
a time derivative). Thus we can always set up an electrostatic
potential $\phi$, such that $E = -\operatorname{grad}\phi$. Taking the divergence,
we find the equation which the potential satisfies: it is
212 INTRODUCTION TO THEORETICAL PHYSICS
-div grad <j> = -V 2 <f> = 4rp, (2)
which is called Poisson's equation. Laplace's equation V 2 <£ =
is the special case which holds in those regions of space that
contain no charge.
If we form the line integral of the electric field intensity along
a given curve between two points of the field, A and B, then
$\int E \cdot ds$ along this curve is called the electromotive force along the
path. It is obviously the work per unit charge done by the field
when a charge is moved along the given path from A to B.
In the electrostatic case, since E can be obtained from a potential,
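$E = -\operatorname{grad}\phi$ and
$$\text{E.m.f.} = \int_A^B E \cdot ds = -\int_A^B \operatorname{grad}\phi \cdot ds
= -\int_A^B \left(\frac{\partial\phi}{\partial x}\,dx + \frac{\partial\phi}{\partial y}\,dy + \frac{\partial\phi}{\partial z}\,dz\right)
= -\int_A^B d\phi = \phi_A - \phi_B,$$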
so that in this case the e.m.f. is equal to the potential difference
between the points A and B. The distinction between e.m.f.
and potential difference is of importance in cases where $\operatorname{curl} E \neq 0$
and hence there is no potential. Even in this case we may still
use the idea of e.m.f.
135. Electrostatic Problems without Conductors. — There are
two principal sorts of electrostatic problems. The first is that
in which we know the distribution of charge, and wish to compute
the field. We could always do this by direct summation of the
fields due to the individual charges, but often that is very difficult,
and we can simplify greatly by using the potential and Laplace's
equation. Thus suppose we have charge uniformly distributed
over an infinite plane, the amount per unit area being $\sigma$, and
suppose we wish the field at a distance R from that plane. We
may get this by a direct calculation. Thus we take a set of polar
coordinates in the plane, which center at the point directly
beneath the place where we wish the potential, as in Fig. 35.
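Between the circles of radius $r$ and $r + dr$, and between $\theta$ and
$\theta + d\theta$, will be an amount of charge $\sigma r\,d\theta\,dr$. This will be at a
distance $\sqrt{R^2 + r^2}$ from the point we are interested in, so that its
field will have the magnitude $\dfrac{\sigma r\,d\theta\,dr}{R^2 + r^2}$. The component normal
to the plane, which is all that we need, is this times $\dfrac{R}{\sqrt{R^2 + r^2}}$, or
$$\frac{\sigma R r\,d\theta\,dr}{(R^2 + r^2)^{3/2}}.$$
The total field is then
$$\int_0^{2\pi} d\theta \int_0^\infty \frac{\sigma R r\,dr}{(R^2 + r^2)^{3/2}}
= 2\pi\sigma \int_0^\infty \frac{x\,dx}{(1 + x^2)^{3/2}}, \qquad \text{where } x = \frac{r}{R}.$$
Letting $1 + x^2 = y$, so that $x\,dx = dy/2$, this is
$$\frac{2\pi\sigma}{2} \int_1^\infty \frac{dy}{y^{3/2}}
= \frac{2\pi\sigma}{2} \left(-2y^{-1/2}\right)\Big|_1^\infty = 2\pi\sigma.$$

Fig. 35. Field of a charged plane. From the charge between $r$ and $r + dr$, $\theta$ and $\theta + d\theta$: $E = \dfrac{\sigma r\,d\theta\,dr}{R^2 + r^2}$, $E_n = \dfrac{\sigma R r\,d\theta\,dr}{(R^2 + r^2)^{3/2}}$.
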
Thus the field is a constant, independent of position. Similarly
on the other side of the plane it is $-2\pi\sigma$, so that there is a discontinuity
in $E$ of $4\pi\sigma$ in crossing the surface.
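
The ring-by-ring summation above can be checked numerically. The following is only a sketch under stated assumptions: the function name `disk_field_normal`, the finite disk radius `a` standing in for the infinite plane, and the midpoint rule are all illustrative choices, not part of the text. For a disk much larger than $R$ the result should approach $2\pi\sigma$ whatever the height $R$.

```python
import math

def disk_field_normal(sigma, R, a, n=200_000):
    """Normal field at height R above the center of a uniformly charged
    disk of radius a (Gaussian units), by a midpoint sum over rings."""
    dr = a / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * dr
        # A ring holds charge 2*pi*sigma*r*dr; the normal component of its
        # field carries the factor R / (R^2 + r^2)^(3/2).
        total += 2.0 * math.pi * sigma * r * dr * R / (R * R + r * r) ** 1.5
    return total

if __name__ == "__main__":
    sigma = 1.0
    for R in (1.0, 2.0):
        # Compare the large-disk sum with the infinite-plane value 2*pi*sigma.
        print(R, disk_field_normal(sigma, R, a=1000.0), 2.0 * math.pi * sigma)
```

The small remaining difference from $2\pi\sigma$ comes from the finite radius of the disk, and shrinks as `a` is increased.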
We have seen that it is possible in such a simple case to compute
the field directly. But it is done far more easily by using our
general principles. Thus the potential can depend only on the
coordinate normal to the plane, which we denote by x. Its
differential equation, outside the charged sheet, is then
$$\frac{d^2\phi}{dx^2} = 0, \qquad \phi = ax + b,$$
showing that the field is constant everywhere, and in the x direc-
tion. To investigate conditions on the surface, we set up a thin
flat volume, with its broad sides parallel
to the charged plane, and enclosing just
1 sq. cm. of this plane. It will then hold
charge $\sigma$, so that $4\pi\sigma$ lines will diverge
from it. By symmetry, these will leave
it at right angles, and an equal number
over each face. Hence $2\pi\sigma$ will leave over
each face, or the field strength is $2\pi\sigma$ on
the one side, $-2\pi\sigma$ on the other. We
have the same result as before, with
very much simpler calculation.
Similar problems are met in the theory of the condenser. Take,
for example, the parallel plate condenser, as in Fig. 36, two
charged plates of area $A$, so large in proportion to their
separation $d$ that they can almost be treated as infinite. Let the
charge per square centimeter be $\sigma$ on one plate, $-\sigma$ on the other.
Then we must find the potential difference between the plates, for
by definition the capacity $C$ is the charge divided by the potential
difference. But now, just as in the last case, the field must be constant
and perpendicular to the plates. It can have different values in
the three regions to the left of the plates, between, and to the
right. And it has a discontinuity of $4\pi\sigma$ in passing through a
plate of surface density $\sigma$. These conditions are all satisfied by
having no field outside the condenser, and by having a field $4\pi\sigma$
within, pointing from the positive plate to the negative. Thus
the potential difference, being the field times the distance, is $4\pi\sigma d$,
so that
$$C = \frac{\sigma A}{4\pi\sigma d} = \frac{A}{4\pi d}, \tag{3}$$
the familiar formula for a parallel plate condenser. It should
be noticed that capacitance has the dimensions of a length in
the electrostatic system of units.

Fig. 36. Field of parallel plate condenser: $E = 0$ outside the plates, $E = 4\pi\sigma$ between them, plate separation $d$.
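
As a rough numerical illustration of Eq. (3), the sketch below evaluates $A/4\pi d$ in centimeters and compares it with the practical-unit formula. The function names and the sample plate dimensions are hypothetical, and the two conversion constants (1 cm of capacitance taken as about $1.11265 \times 10^{-12}$ farad, and the SI constant $\epsilon_0$) are brought in from outside the text purely for the cross-check.

```python
import math

STATFARAD_IN_FARAD = 1.112650e-12  # 1 cm of capacitance (e.s.u.) in farads
EPSILON_0 = 8.8541878e-12          # F/m, used only for the SI cross-check

def capacitance_esu(area_cm2, separation_cm):
    """Eq. (3): C = A / (4 pi d), a length in centimeters."""
    return area_cm2 / (4.0 * math.pi * separation_cm)

def capacitance_si(area_cm2, separation_cm):
    """SI formula C = eps0 * A / d in farads, for comparison only."""
    return EPSILON_0 * (area_cm2 * 1e-4) / (separation_cm * 1e-2)

if __name__ == "__main__":
    c_cm = capacitance_esu(100.0, 0.1)          # 100 sq. cm. plates, 1 mm apart
    print(c_cm, "cm")
    print(c_cm * STATFARAD_IN_FARAD, "farad")   # converted from e.s.u.
    print(capacitance_si(100.0, 0.1), "farad")  # direct SI evaluation
```

The two farad values should agree to within the rounding of the constants, which is one way of seeing that the "length" of Eq. (3) is an ordinary capacitance in different units.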
136. Electrostatic Problems with Conductors. — The second
sort of electrostatic problem is more difficult. It is that in which
there are conductors as well as charges. Now in the presence of
a charge, induced charges are set up on conductors, and it is
usually a difficult problem to find how they are distributed, and
hence to find their field. In this case it is practically indis-
pensable to make use of the methods of potential theory. To see
how to proceed, let us imagine the train of events which would
occur when a charge was brought near a conductor. The charge
would carry with it a field, which in general would be such that
different parts of the conductor were at different potentials.
Now a conductor has the peculiarity that if there is a field in it,
a current flows, and continues to flow as long as the field remains.
Thus charge will start to flow through the conductor, being
attracted or repelled by the external charge. This will continue
until just such a charge distribution has been set up in the con-
ductor that the field resulting from it plus the external charges
reduces to zero within the conductor, or the potential throughout
the conductor is constant, for this is the condition for no current
flow. In other words, the whole of a conductor, surface and
inside, is part of a single equipotential. We then solve such a
problem in the following way: we look for a solution of Poisson's
equation, holding in the region outside the conductors, and reduc-
ing to constants on the boundaries. This solution thus gives the
potential of the problem, and its gradient gives the field.
We can illustrate better by a problem. Consider an infinite
conducting plane, uncharged as a whole, with a charge e in front
of it at a distance d. Now we wish a solution of Poisson's equa-
tion, reducing to a constant over the face of the plane. We set
this up by a device, called the method of images. We imagine
the plate removed, its face replaced by an imaginary plane, and
at a distance d behind the plane we put a charge —e, as if it were
e's image in a mirror, as shown in Fig. 37. Then these two
charges together would keep the whole plane just at potential
zero. For any point of the plane is equidistant from both charges;
one gives the potential $e/r$, and the other $-(e/r)$, and they just
cancel. The potential at any point of space can be easily found,
now, in the field of these charges. It is simply
$$\phi = \frac{e}{r_1} - \frac{e}{r_2},$$
if $r_1$ is the distance from the charge $e$, $r_2$ the distance from its
mirror image. The lines of force and equipotentials look like
those of a bar magnet, and it is perfectly true that the plane
bisecting the magnet is an equipotential. In our actual problem,
now, the potential in the empty space is just that given by our
field of two charges; in the metal the potential is zero.

Fig. 37. Lines of force for charge $e$ in front of conducting plane, by method of images.

We might naturally inquire what induced distribution of
charge would be set up in the conducting plane, to produce this
final field. In the first place, in a steady state, the charge within
a conductor is always zero. For the field is zero within it, there-
fore its divergence is zero. Thus all charge is concentrated on
the surface. Next, as we showed before, the normal component
of the electric field has a discontinuity of $4\pi\sigma$ at a surface carrying
a surface charge $\sigma$. Thus if we can compute the discontinuity,
we can in turn get the surface density of charge. In our case the
field is normal to the plate, by symmetry, so that the discontinuity
of $E_n$ in crossing the surface is just equal to the total $E$ outside.
This may be found at once from our known potential function,
so that we could get the necessary surface charge.
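
The procedure just described can be tried numerically. The sketch below is an illustration under stated assumptions, not the book's calculation: the plane is taken as $z = 0$ with the charge $e$ at height $d$ on the $z$ axis, the normal field is obtained by a central difference of the two-charge potential, and the function names, step size `h`, and cutoff radius `a` are hypothetical choices. The surface density is $E_n/4\pi$, and its integral over the plane should come out close to $-e$.

```python
import math

def potential(x, y, z, e=1.0, d=1.0):
    """Potential of charge e at (0, 0, d) and image -e at (0, 0, -d)."""
    r1 = math.sqrt(x * x + y * y + (z - d) ** 2)
    r2 = math.sqrt(x * x + y * y + (z + d) ** 2)
    return e / r1 - e / r2

def surface_density(s, e=1.0, d=1.0, h=1e-5):
    """Induced charge per unit area at distance s from the foot of the
    perpendicular: sigma = E_n / (4 pi), with E_n = -d(phi)/dz at z = 0.
    The derivative is that of the two-charge potential, taken numerically."""
    e_n = -(potential(s, 0.0, h, e, d) - potential(s, 0.0, -h, e, d)) / (2.0 * h)
    return e_n / (4.0 * math.pi)

def induced_charge(e=1.0, d=1.0, a=1000.0, n=100_000):
    """Integrate sigma over a disk of radius a in the plane (midpoint rule)."""
    ds = a / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * ds
        total += surface_density(s, e, d) * 2.0 * math.pi * s * ds
    return total

if __name__ == "__main__":
    print(potential(3.0, -2.0, 0.0))  # a point of the plane
    print(surface_density(0.0))       # density directly beneath the charge
    print(induced_charge())           # total induced charge within radius a
```

The density is negative everywhere for positive $e$ and is largest in magnitude directly beneath the charge, as one would expect from the crowding of the lines of force in Fig. 37.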
137. Green's Theorem. — The fundamental theorem of
potential theory is a mathematical relation called Green's
theorem. It is a result of Gauss's theorem, and is easily proved.
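Gauss's theorem states that $\iiint \operatorname{div} E\,dv = \iint E_n\,dS$ for any
vector $E$. Now let $E = \phi \operatorname{grad}\psi$, where $\phi$ and $\psi$ are two scalar
functions; then $\operatorname{div} E = \operatorname{div}(\phi \operatorname{grad}\psi) = \phi\nabla^2\psi + \operatorname{grad}\phi \cdot \operatorname{grad}\psi$,
as we can easily prove. Also $E_n = \phi\,\dfrac{\partial\psi}{\partial n}$, where $\dfrac{\partial\psi}{\partial n}$ is the normal
derivative, the component of the gradient along $n$. Hence we
have
$$\iiint (\phi\nabla^2\psi + \operatorname{grad}\phi \cdot \operatorname{grad}\psi)\,dv = \iint \phi\,\frac{\partial\psi}{\partial n}\,dS. \tag{4}$$
This is one form of Green's theorem. To get the more familiar
form, we next write just the same expression with $\phi$ and $\psi$
interchanged:
$$\iiint (\psi\nabla^2\phi + \operatorname{grad}\phi \cdot \operatorname{grad}\psi)\,dv = \iint \psi\,\frac{\partial\phi}{\partial n}\,dS.$$
Now we subtract, obtaining
$$\iiint (\phi\nabla^2\psi - \psi\nabla^2\phi)\,dv = \iint \left(\phi\,\frac{\partial\psi}{\partial n} - \psi\,\frac{\partial\phi}{\partial n}\right) dS. \tag{5}$$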
This is the common form of Green's theorem. We shall now
consider a number of applications of this mathematical theorem.
These applications come mostly in the discussion of methods of
solving Poisson's and Laplace's equations. Of course, these can
be solved by the method of separation of variables, and develop-
ment in series of orthogonal functions. But the present method,
called Green's method, is quite different, and almost more useful
in a general discussion, though perhaps not in particular problems.
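
The identity for the divergence of $\phi \operatorname{grad}\psi$ used in deriving Eq. (4) follows from the ordinary rule for differentiating a product. A short sketch of the step, writing $x_1, x_2, x_3$ for the rectangular coordinates (a notation assumed here for brevity):

```latex
\operatorname{div}(\phi \operatorname{grad}\psi)
  = \sum_{k=1}^{3} \frac{\partial}{\partial x_k}
      \left( \phi\, \frac{\partial \psi}{\partial x_k} \right)
  = \sum_{k=1}^{3} \left(
      \frac{\partial \phi}{\partial x_k} \frac{\partial \psi}{\partial x_k}
      + \phi\, \frac{\partial^2 \psi}{\partial x_k^2} \right)
  = \operatorname{grad}\phi \cdot \operatorname{grad}\psi + \phi\, \nabla^2 \psi .
```

Interchanging $\phi$ and $\psi$ leaves the term in the two gradients unchanged, which is why it drops out on subtraction.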
138. Proof of Solution of Poisson's Equation. — We can easily
see how to solve Poisson's equation, $\nabla^2\phi = -4\pi\rho$. For this
gives the potential $\phi$ due to a charge distribution. Now if we
divide space into small elements of volume $dv$, the charge $\rho\,dv$
will produce a potential $\rho\,dv/r$, if $r$ is the distance from $dv$ to
the point where we wish the potential. Thus the whole potential is
$$\phi = \iiint \frac{\rho}{r}\,dv.$$
But $\rho = -\dfrac{1}{4\pi}\nabla^2\phi$, so that we have
$$\phi = -\frac{1}{4\pi} \iiint \frac{\nabla^2\phi}{r}\,dv, \tag{6}$$
giving the solution of Poisson's equation. In this integral,
we must integrate over all space, so as to include all charges.
We have derived our solution rather intuitively from the known
solution for a point charge. But we can derive it rigorously
from Green's theorem.
In the last form of Green's theorem, let $\psi = 1/r$, where $r$
is the distance from a point $P$, and let $\phi$ be the potential.
Thus we have
$$\iiint \left(\phi\,\nabla^2\frac{1}{r} - \frac{1}{r}\,\nabla^2\phi\right) dv
= \iint \left(\phi\,\frac{\partial(1/r)}{\partial n} - \frac{1}{r}\,\frac{\partial\phi}{\partial n}\right) dS.$$
This is true no matter what volume we use. Let us choose as
our volume the whole of space, except for a tiny sphere of radius
$R$ surrounding the point $P$ where we wish to compute the potential.
Now $\nabla^2(1/r) = 0$, except where $r = 0$, so that it is zero
throughout the whole of our volume, and the left side becomes
$-\iiint \dfrac{\nabla^2\phi}{r}\,dv$. Let us compute the right side. The integral
is to be taken over the surface of our volume, which consists
of our tiny sphere, and a surface at infinity, which for the present
we neglect. Over the surface of the tiny sphere, the direction
n is simply the radial direction, pointing in toward P (because
it is directed out of the volume). We have
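$$\frac{\partial(1/r)}{\partial n} = -\frac{\partial(1/r)}{\partial r} = \frac{1}{r^2}, \qquad
\frac{\partial\phi}{\partial n} = -\frac{\partial\phi}{\partial r}.$$
Then the right side is
$$\iint \left(\frac{\phi}{r^2} + \frac{1}{r}\,\frac{\partial\phi}{\partial r}\right) dS.$$
But on the surface of the sphere, $r = R$, so that this is
$$\frac{1}{R^2} \iint \phi\,dS + \frac{1}{R} \iint \frac{\partial\phi}{\partial r}\,dS.$$
Now $\dfrac{\iint \phi\,dS}{\iint dS}$ is just the mean value $\bar\phi$ of $\phi$ over the surface,
and $\dfrac{\iint (\partial\phi/\partial r)\,dS}{\iint dS}$ is the mean value of $\partial\phi/\partial r$. But $\iint dS$ is the
area of the sphere, $4\pi R^2$, so that our integral is
$4\pi\bar\phi + 4\pi R\,\overline{\partial\phi/\partial r}$, and the whole
relation is, changing sign,
$$\iiint \frac{\nabla^2\phi}{r}\,dv = -4\pi\bar\phi - 4\pi R\,\overline{\frac{\partial\phi}{\partial r}}.$$
If now $R$ approaches zero, the last term vanishes, and $\bar\phi$
approaches $\phi$, the value at the point $P$. Hence we have
$$\phi = -\frac{1}{4\pi} \iiint \frac{\nabla^2\phi}{r}\,dv,$$
the solution of Poisson's equation which we wished to prove.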
There are several points to be mentioned in connection with
this proof. In the first place, the volume integral is taken over
all space, except an infinitely small sphere surrounding P : a point
charge exerts an effect on all other charges, but not on itself.
Secondly, we neglected entirely the fact that our volume has
a surface at infinity, which we should take into account in
calculating our surface integrals. Suppose that the volume were
not really infinite, but merely very large, being bounded, say,
by a second large sphere of radius $R'$. Then the surface integral
over the large sphere is similar to that over the small one, but
with opposite sign: it is $-\left(4\pi\bar\phi' + 4\pi R'\,\overline{\partial\phi/\partial r}'\right)$, where now $\bar\phi'$ is the
mean over the large sphere, etc. To neglect these terms, as
we have done, their limits must be zero as $R'$ becomes infinite.
That is, $\bar\phi'$ must go to zero at infinite distance, and $R'\,\overline{\partial\phi/\partial r}'$ must
also go to zero. These are both satisfied if $\phi$ is the potential
of a set of charges at finite points, for then $\phi$ will go as $1/r$,
$\partial\phi/\partial r$ will go as $1/r^2$, and $r\,\partial\phi/\partial r$ will fall off as $1/r$, becoming
zero as $r$ becomes infinite.
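
The integral solution $\phi = \iiint (\rho/r)\,dv$ lends itself to a direct numerical check. The sketch below is an illustration under assumptions of our own choosing: a uniform ball of unit density and unit radius, a cubic grid of cells, and a field point outside the ball, with hypothetical function names. Outside a spherical distribution the sum should be close to the total charge divided by the distance from the center.

```python
import math

def potential_from_density(rho, point, half_width=1.0, n=40):
    """Sum rho*dv/r over an n x n x n grid of cells filling the cube
    [-half_width, half_width]^3; returns (potential, total charge)."""
    h = 2.0 * half_width / n
    dv = h ** 3
    px, py, pz = point
    phi = 0.0
    charge = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            y = -half_width + (j + 0.5) * h
            for k in range(n):
                z = -half_width + (k + 0.5) * h
                q = rho(x, y, z) * dv
                if q == 0.0:
                    continue  # empty cell, nothing to add
                charge += q
                phi += q / math.sqrt((px - x) ** 2 + (py - y) ** 2 + (pz - z) ** 2)
    return phi, charge

def uniform_ball(x, y, z):
    """Unit density inside the unit sphere, zero outside (illustrative choice)."""
    return 1.0 if x * x + y * y + z * z <= 1.0 else 0.0

if __name__ == "__main__":
    phi, q = potential_from_density(uniform_ball, (3.0, 0.0, 0.0))
    print(phi, q / 3.0)  # summed potential against (total charge) / distance
```

The field point is kept outside the charge so that no cell has $r = 0$; inside the distribution the same sum applies, but the cell containing $P$ must be left out, in line with the tiny sphere excluded in the proof.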
139. Solution of Poisson's Equation in a Finite Region. —
Suppose now that instead of extending our integral over all
space, we integrate only over a finite volume V, with surface S,
excluding in each case our infinitesimal sphere of radius R.
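Then plainly we have
$$\phi = -\frac{1}{4\pi} \iiint \frac{\nabla^2\phi}{r}\,dv
+ \frac{1}{4\pi} \iint \frac{1}{r}\,\frac{\partial\phi}{\partial n}\,dS
- \frac{1}{4\pi} \iint \phi\,\frac{\partial(1/r)}{\partial n}\,dS, \tag{7}$$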
where the volume integral is taken over the whole volume V,
excluding the infinitesimal sphere, and the surface integral
is taken over S.
We can explain this important formula in words much better
than by mathematics. The potential at a given point can be
written as the sum of two parts : the potential of all the charges
within a certain finite volume surrounding the point, and another
part, which, of course, must represent the potential of the other
charges outside our volume. But the second term appears as
a surface integral, not a volume integral. This is an example
of the usual sort of application of Green's theorem: the replace-
ment of a volume integral by a surface integral.
There is one interesting way of regarding the solution. Suppose
first that $\rho$ were zero all through our volume, though not
outside. Then $\nabla^2\phi$ will be zero inside, and the volume integral
will vanish. Further, $\phi$ will satisfy Laplace's equation within
the region. The surface integral, in other words, represents a
solution of Laplace's equation within our region, in terms of an
integral over the boundary of the region. As a matter of fact,
any solution of Laplace's equation in this region can be written
in this way, by using the proper boundary values of $\phi$ and $\partial\phi/\partial n$
at the surface. The last two terms in our solution, in other
words, represent a general solution of the homogeneous equation
$\nabla^2\phi = 0$, the arbitrary functions (which with partial differential
equations replace the arbitrary constants) being the boundary
values of $\phi$ and $\partial\phi/\partial n$. The volume integral, on the other hand,
represents a particular solution of the inhomogeneous equation
$\nabla^2\phi = -4\pi\rho$, satisfying the equation but not its boundary
values. Thus we have the familiar case in which the solution
of an inhomogeneous equation is the sum of a particular solution
and the general solution of the related homogeneous equation.
And this general solution is to be so chosen that the sum of both
terms satisfies the boundary values, on the surface of the volume.
140. Green's Distribution. — When we examine the surface
integral of Eq. (7) more in detail, we can see what it represents.
The term $\dfrac{1}{4\pi} \iint \dfrac{1}{r}\,\dfrac{\partial\phi}{\partial n}\,dS$ represents evidently the potential arising
from a certain surface charge, of surface density $\dfrac{1}{4\pi}\,\dfrac{\partial\phi}{\partial n}$. The
other term, $-\dfrac{1}{4\pi} \iint \phi\,\dfrac{\partial(1/r)}{\partial n}\,dS$, is a little complicated. The
term $\dfrac{\partial(1/r)}{\partial n}$ is the difference between the potential of two unit
charges, spaced at a distance $dn$ along the normal, divided by
$dn$; that is, it is the potential of two charges, one of strength
$1/dn$, the other $-1/dn$, at distance $dn$,
as in Fig. 38. Such a combination of an
equal and opposite positive and negative
charge very close together is called
a dipole. The strength of a dipole, or
the dipole moment, is the strength of
one of the charges times the distance of
separation. Thus in our case the
strength is $(1/dn)\,dn$, so that we have
the potential of a unit dipole. Then
the integral is the potential of a dipole
distribution of moment $\phi/4\pi$ per unit
area. Such a distribution is called a
double layer, since it consists of layers of positive and negative
charges close together. We then see that by spreading on the
surface of our region a suitable layer of surface charge, and a
double layer of dipoles, we produce just the same field inside
that the external charges would give. This distribution of
charge and double layer is called Green's distribution.

Fig. 38. Potential of unit dipole, consisting of charges $\pm\dfrac{1}{dn}$ at distance $dn$ apart.
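
The limiting process that defines the unit dipole can be watched numerically. In the sketch below (function names, the choice of the $z$ axis for the normal, and the sample point are assumptions made for illustration) two charges $\pm 1/dn$ are placed a distance $dn$ apart, and their combined potential is compared with the limiting form $\cos\theta/r^2$ as $dn$ is made smaller.

```python
import math

def two_charge_potential(point, dn):
    """Potential at `point` of charges +1/dn at (0, 0, +dn/2) and -1/dn at
    (0, 0, -dn/2): a dipole of unit moment along z (Gaussian units)."""
    x, y, z = point
    r_plus = math.sqrt(x * x + y * y + (z - 0.5 * dn) ** 2)
    r_minus = math.sqrt(x * x + y * y + (z + 0.5 * dn) ** 2)
    return (1.0 / dn) * (1.0 / r_plus - 1.0 / r_minus)

def ideal_dipole_potential(point):
    """Limit dn -> 0: cos(theta) / r^2 = z / r^3 for unit moment along z."""
    x, y, z = point
    r = math.sqrt(x * x + y * y + z * z)
    return z / r ** 3

if __name__ == "__main__":
    p = (1.0, 2.0, 2.0)
    for dn in (0.1, 0.01, 0.001):
        # The two columns should agree more closely as dn shrinks.
        print(dn, two_charge_potential(p, dn), ideal_dipole_potential(p))
```

The moment $(1/dn)\,dn$ stays equal to unity throughout, which is why the potential settles to a definite limit even though each charge grows without bound.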
Suppose that we know that a given function $\phi$ satisfies Laplace's
equation within a given region. Suppose further that
we know its boundary value $\phi$, and its normal derivative $\partial\phi/\partial n$,
at all points of the surface of the region. Then we can at once
write the solution of Laplace's equation having these boundary
values. It is
$$\phi = -\frac{1}{4\pi} \iint \left(\phi\,\frac{\partial(1/r)}{\partial n} - \frac{1}{r}\,\frac{\partial\phi}{\partial n}\right) dS,$$
integrated over the boundary. This is obviously a very simple
way of getting a solution of a differential equation satisfying
given boundary values. In particular it is simpler than the
methods we have used so far, in that we can apply it to any
form of surface.
There is a simple interpretation of Green's distribution.
Suppose that the field within our volume were just what it is,
but that outside the volume the field and potential were everywhere
zero. Then at the boundary there would be a discontinuity
of potential and field. Now we have already seen that
at a surface charge $\sigma$ there is a discontinuity of field, $4\pi\sigma$, so
that at a discontinuity of the field there is a surface charge equal
to $1/4\pi$ times the discontinuity of the normal component of
the field. Thus if the normal component of the field is zero
outside, $\partial\phi/\partial n$ inside, the surface charge is $\dfrac{1}{4\pi}\,\dfrac{\partial\phi}{\partial n}$. This is
just the surface charge concerned in Green's distribution.
Similarly, at a boundary where there is a discontinuity of potential,
there must be a double layer, of moment per unit area
equal to $1/4\pi$ times the discontinuity of the potential, as we see
from a condenser of charge $\sigma$, dipole moment $\sigma d$ per unit area,
potential difference $4\pi\sigma d$. This gives the double layer of Green's
distribution. In other words, these surface charges and layers,
plus the charges within the region, are just those necessary to
give the potential its actual values within the volume, and to
reduce it to zero outside.
141. Green's Method of Solving Differential Equations. —
We have seen in the present chapter a method, called Green's
method, for solving differential equations, quite different from
any we have met before, except the integral method of treating
heat flow, which is very similar. The most characteristic
part of the method is in the solution of Poisson's equation, as
an integral of $\rho/r$ over all space. Here we had an inhomogeneous
equation, $\nabla^2\phi = -4\pi\rho$. Suppose we let $\rho = \rho_1 + \rho_2 + \rho_3 + \cdots$,
where $\rho_i$ is equal to $\rho$ in the $i$th volume element $dv_i$, but is zero
elsewhere. Then we can write the equations $\nabla^2\phi_1 = -4\pi\rho_1$,
$\nabla^2\phi_2 = -4\pi\rho_2$, . . . , for each of these, where $\rho_i$ is different from
zero only in a very small region, so that the problem is practically
that of a point charge, which we can solve. We add these functions
to get the whole solution, according to Sec. 26, Chap. IV.
This is the essence of Green's method, the separation of the
inhomogeneous part of the equation into simple parts, each of
which we can solve. The function 1/r, which is the solution
for one of these problems, is called the Green's function. As a
matter of fact, a general method of solving differential equations
by means of Green's functions has been worked out, and it lies
at the basis of much of the more advanced work on the theory
of differential equations, particularly of the second order.
Problems
1. Given a spherical distribution of charge, in which the density is a
function of r. Prove that the field at any point is what would be obtained
by imagining a sphere drawn through the point, with its center at the origin,
all the charge within the sphere concentrated at the center, and all the
charge outside removed. Apply to gravitation, showing that the earth acts
on bodies at its surface as if its mass were concentrated at the center.
2. Given a sphere filled with charge of constant density. Prove that at
points within the sphere, the field is directly proportional to the distance
from the center.
3. A condenser consists of two concentric spheres, holding equal and
opposite charges. Find its capacity. Similarly find the capacity of a
condenser consisting of two long concentric circular cylinders.
4. Compute the surface density induced by a charge on a plane conductor.
5. In a certain spherical distribution of charge, the potential is given by
$\dfrac{e^{-ar}}{r}$. Find the charge density as a function of $r$. Also find the charge
contained between r and r + dr. This represents roughly the charge
distribution within an atom.
6. Prove $\operatorname{div}(\phi \operatorname{grad}\psi) = \phi\nabla^2\psi + \operatorname{grad}\phi \cdot \operatorname{grad}\psi$.
7. There are certain charges and conductors in an electrostatic field,
whose potential is $\phi$. Show that the surface density of charge on the surface
of a conductor is $-\dfrac{1}{4\pi}\,\dfrac{\partial\phi}{\partial n}$, where $n$ is the normal pointing out of the conductor.
Show that the electric field is normal to the surface of a conductor.
8. It requires several volts energy to remove an electron from the interior
of a metal to the region outside. Find how many volts, if the double layer
at the surface consists of two parallel sheets of charge, a sheet of negative
electricity, of density as if there were electrons of charge $4.77 \times 10^{-10}$ e.s.u.,
spread out uniformly with a density of one to a square $4 \times 10^{-8}$ cm. on a
side, and inside that at a distance of $0.5 \times 10^{-8}$ cm. a similar sheet of positive
charges. Remember that 300 volts = 1 e.s.u. of potential.
9. Discuss the potential and field of a dipole.
10. An uncharged metallic sphere of radius R is placed in a homogeneous
electric field of intensity $E_0$. Calculate the potential at any point of space,
and sketch the equipotential curves. (Hint: Solve Laplace's equation in
polar coordinates taking the $z$ axis as the direction of $E_0$. Note that there
is symmetry about the $z$ axis. Try a solution of the form
$$\phi = F_1(r) + F_2(r)\cos\theta$$
with the conditions that
$$F_1(r) \to 0, \quad \text{as } r \to \infty,$$
$$F_2(r) \to -E_0 r, \quad \text{as } r \to \infty,$$
and that $\phi$ must be constant all over the sphere of radius $R$.) Solve the
problem for the case that the sphere carries a total charge $e$.
11. The equipotentials due to two point charges e and e' are given by
e/r + e'/r' = C. Show that the surface becomes spherical if e is of opposite
sign to e' and C = 0. Consider a spherical conductor coinciding with this
surface which is grounded. This does not disturb the field, so that these
charges give the field we would have if one of the charges were removed and
the metallic sphere left there. Show that if a is the radius of the sphere and
L the distance from the charge (outside the sphere) to the center of the
sphere, the image charge inside the sphere lies a distance L' from the center
such that $a^2 = LL'$ and has a charge $e' = -ea/L$. Show that the surface
density of induced charge varies inversely as the cube of the distance from
the charge outside the surface to the point of the surface under consideration.
CHAPTER XX
MAGNETIC FIELDS, STOKES'S THEOREM, AND VECTOR
POTENTIAL
The static magnetic field resembles the electrostatic field in
many ways. The intensity of the field due to a magnetic pole is
equal to the pole strength divided by the square of the distance
of the point at which the intensity is measured, so that magnetic
poles display close analogy to electric charges. The intensity
of this field H is defined as the force per unit magnetic pole, and
this is measured in the system of units known as the electro-
magnetic, as distinct from the electrostatic. We shall discuss
the relation between these systems of units in a later section.
The vector H satisfies the equation
$\operatorname{div} H = 4\pi \times$ density of magnetic poles,
but here a very important difference appears; north and south
magnetic poles never can exist alone. No matter how small one
takes a volume element, the north and south poles just cancel,
so that the total density of magnetic poles is zero. Hence we
have
div H = 0. (1)
Thus we must always deal with at least a pair of opposite poles,
and here we always have a magnetic dipole, whose behavior is
just like that of an electric dipole. The magnetic moment of a
bar magnet is defined as the product of the strength of one of the
poles times the distance of separation, and magnetic fields are
measured by measuring the torque exerted on a suspended magnet
(magnetometer). Exactly as we have defined the electromotive
force $\int_A^B E \cdot ds$, we can now define the
magnetomotive force as $\int_A^B H \cdot ds$. This is the work per unit pole
done by the magnetic field as the pole is moved along a path from
$A$ to $B$. There is also a magnetic potential $\phi = -\int H \cdot ds$,
and in the field of permanent magnets $\int H \cdot ds$ taken around any
closed path is zero.
142. The Magnetic Field of Currents. — It is when we come to
consider the magnetic fields due to currents that we meet differ-
ences from the electrostatic case. Suppose that we have a
straight wire in which a steady current flows. The magnetic
lines of force are concentric circles around the wire and it is clear
that if we calculate the integral JH • ds following one of these
circles, we shall not find that its value is zero for such a closed
path. on the other hand if we evaluate JH • ds around any
closed path which does not encircle the wire, it does vanish, and
the situation is then analogous to the electrostatic case. These
considerations hold for any closed circuit carrying a current.
Fig. 39. Magnetic shell and multiple valued potential. The potential difference between $a$ and $b$ is $4\pi m_0$, or $4\pi i$, where $m_0$ is the strength of the double layer producing the same magnetic field as the current $i$ in the wire encircling the shell.

We can reduce our problem to an ordinary magnetostatic one
by the following device: suppose that
we construct a surface bounded by
the wire carrying the current and do
not allow any of the curves along
which we calculate $\int H \cdot ds$ to cut
through this surface. Then no closed
paths are possible which encircle the
current, $\oint H \cdot ds = 0$ around every
path, and everywhere in space there
is a magnetic potential $\phi$. Suppose
we evaluate $\int H \cdot ds$ along a curve
starting at $a$ on one side of the surface
and following a line of force
around to a point $b$ on the other side
of the surface, as in Fig. 39. The difference of magnetic potential
between $a$ and $b$ is given by
$$\phi_a - \phi_b = \int_a^b H \cdot ds = -\int_a^b \frac{\partial\phi}{\partial s}\,ds,$$
and the potential difference does not approach zero as we let
$a$ approach $b$ since then the curve would cut our surface. This
must mean that there is a jump in potential as we cross the sur-
face. We have already seen in the last chapter that a surface
distribution of dipoles (a double layer) produces a discontinuity
in potential, so that we can replace our current by a surface
layer of magnetic dipoles on a surface whose boundary is the
current-carrying wire, and produce exactly the same magnetic
field as the current. Suppose that we have a surface of area
$A$ on which we have a dipole layer of constant moment $m_0$
per unit area. (This may be either an electric or magnetic
dipole layer.) Consider a point $P$ outside the surface. If one
looks from $P$ to the surface, the surface subtends a solid angle
$\Omega$. It is easy to show that the potential at $P$ is equal to $m_0$
times $\Omega$. The proof of this is left to a problem. In particular
if $P$ approaches the surface, $\Omega$ approaches $2\pi$ so that the potential
at a point just one side of the surface is $2\pi m_0$. Similarly on the
other side of the surface the potential is $-2\pi m_0$, so that there is
a discontinuity of potential equal to $4\pi m_0$ as one crosses the
double layer. Thus in our case we have
$$\phi_a - \phi_b = \pm 4\pi m_0$$
(the $\pm$ depends on which way we go around the curve $ab$), so
that
$$\oint H \cdot ds = \pm 4\pi m_0$$
around a closed curve which cuts through the double layer surface
and is zero for every other closed curve. In the following
we shall always go around the curve in such a direction that
$\oint H \cdot ds = 4\pi m_0$.
If we now ask how $m_0$ depends on the current, we must get the
answer from experiment and the relation turns out to be exceedingly
simple; the magnetic moment per unit area $m_0$ is proportional
to the current. If we have not as yet defined the unit
of current we may place $m_0 = i$, and this equation defines the
unit current in the electromagnetic system of units. Thus
$$\oint H \cdot ds = 4\pi i, \tag{2}$$
where the integration is carried once around a path encircling
the wire. If we go around again the value of the integral
increases again by $4\pi i$, and so on for every complete circuit of
the path. This unit of current which we have introduced is
called the abampere and is ten times as large as the practical
unit, the ampere. On the other hand, we might wish to utilize
the electrostatic unit of current, defined as the current in which
one electrostatic unit of charge passes a given point per second.
It is necessary to determine experimentally the proportionality
constant between $m_0$ and $i$. This has been done and turns out
to be $1/c$, where $c = 3 \times 10^{10}$ cm. per second. If we express
our current in electrostatic measure, the work done in carrying
a unit north pole around a circuit enclosing the current is
$$\oint H \cdot ds = \frac{4\pi i}{c}. \tag{3}$$
The e.s.u. of current is $\tfrac{1}{3} \times 10^{-9}$ ampere.
143. Field of a Straight Wire. — We can illustrate these ideas
easily in the case of a straight wire carrying a constant current.
Since the lines of force are circles, let us calculate the work done
in carrying a unit pole around such a circle of radius r. In this
case H has the same value all along the circle and is tangent to it.
Thus
$$\oint H \cdot ds = H \oint ds = 2\pi r H = 4\pi i,$$
so that the magnetic field intensity at a distance $r$ from a straight
wire carrying a current $i$ is
$$H = \frac{2i}{r}. \tag{4}$$
We can now set up the potential for
this case.
Thus, let the wire be along the z
axis, as in Fig. 40, so that the field is
given by
Fig. 40. — Magnetic lines of „ _ ~2iy „ _ 2ix rr _ n
iorce (circles) and equipoten- tLx — ' 2 ' "v ~ ~~^' n z — \).
tials (radii) for the field of a
wire carrying a current (at Th y H = 0, as we can im-
right angles to the paper). A ^ '
H is perpendicular to radius, mediately prove by substitution.
Therefore #*: H y =-y:x. rj^^ for examp le, CUT \ Z H =
d(2ix/r*) _ d(-2iy/r*) = 2t _ W 2t _ 4^ =
dx dy r 2 r 4 ^ r 2 r 4
can have a potential, and it is easy to see that it must have as its
equipotentials the lines 6 = constant, where 6 is the polar angle
in the xy plane, since these are at right angles to the lines
of force. If we set $ = — ~2id = — 2i tan -1 (y/x), we have
-d$/dx = -2iy/r 2 = H x , -d<f>/dy = 2ix/r 2 = H y , so that we
have actually exhibited the potential.
But now we see that the potential is not single-valued. For a
given value of x and y, the angle $\tan^{-1}(y/x)$ can have an infinite
number of values, differing by $2\pi$, and the potential can have
an infinite number of values differing by $4\pi i$, in agreement with
what we found before. Thus the potential is not defined in
as simple and definite a way as in electrostatics. The interpretation
of this situation comes from a theorem called Stokes's
theorem.
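The statements of this section can be checked numerically. The following is a minimal sketch, not part of the original argument: it takes the field components $H_x = -2iy/r^2$, $H_y = 2ix/r^2$ given above (e.m.u.), and estimates the work done in carrying a unit pole around a circle. The function names, the choice of circles, and the unit current are assumptions made for illustration.

```python
import math

def H_wire(x, y, i=1.0):
    """Field of a straight wire along z (e.m.u.): H_x = -2iy/r^2, H_y = 2ix/r^2."""
    r2 = x * x + y * y
    return (-2.0 * i * y / r2, 2.0 * i * x / r2)

def line_integral_circle(cx, cy, radius, i=1.0, n=2000):
    """Midpoint-rule estimate of the integral of H.ds around a counterclockwise circle."""
    total = 0.0
    dt = 2.0 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        hx, hy = H_wire(x, y, i)
        # element of path: ds = (-sin t, cos t) * radius * dt
        total += (-hx * math.sin(t) + hy * math.cos(t)) * radius * dt
    return total

enclosing = line_integral_circle(0.0, 0.0, 2.0)      # circle around the wire
not_enclosing = line_integral_circle(5.0, 0.0, 2.0)  # circle that misses the wire
print(enclosing / (4.0 * math.pi), not_enclosing)
```

The first integral should come out as $4\pi i$ and the second as zero, which is the numerical counterpart of the statement that the potential changes by $4\pi i$ on each circuit of the wire and is single-valued on paths that do not link it.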
144. Stokes's Theorem. — Stokes's theorem states that if we
have any closed curve, and integrate the tangential component
of a vector F around it, the result is equal to what we obtain if
we take some surface bounded by the curve, and integrate the
normal component of the curl of F over this surface:
$$\oint F_s\, ds = \iint \operatorname{curl}_n \mathbf{F}\, dS. \tag{5}$$
To prove it, we first divide up the surface into small surface
elements, of area dS. For one of these the surface integral is
$\operatorname{curl}_n \mathbf{F}\, dS$. Now suppose we choose
the axes so that the z axis is normal
to dS, and the area dS is bounded by
x, y, x + dx, and y + dy as in Fig. 41.
[Fig. 41. Circuit for proving Stokes's theorem.]
Then the surface integral is
$\left(\dfrac{\partial F_y}{\partial x} - \dfrac{\partial F_x}{\partial y}\right) dx\, dy$. Let us next compute
$\oint F_s\, ds$ for the element of area. It is
evidently
$$F_x(x,y)\,dx + F_y(x+dx,y)\,dy - F_x(x,y+dy)\,dx - F_y(x,y)\,dy = \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) dx\, dy,$$
if we go
around so as always to keep the surface on the left. Thus the
theorem is true for such an infinitesimal surface. But now, if we
put the whole surface together out of its elements, the total
surface integral will be the sum of the parts, or $\iint \operatorname{curl}_n \mathbf{F}\, dS$. Also
the total line integral will be the sum of the integrals over the
separate elements. To see this, we note that in making the sum,
all boundaries except the outside edge of the area are shared by
two elements of the area, and the line integral from one traverses
the boundary in one direction, from the other in the opposite
direction, so that the contributions all cancel, leaving only the
integral over the outer boundary, which is then $\oint F_s\, ds$. Thus
Stokes's theorem is proved.
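The cancellation of interior boundaries used in the proof can be seen in a small numerical experiment. The sketch below is an illustration under assumed choices: the test field $F = (-y^2,\ xy)$, whose curl has z component $3y$, and a rectangle of sides a and b cut into m by m cells. None of these choices come from the text.

```python
def F(x, y):
    """Assumed test field; its curl has z component dFy/dx - dFx/dy = 3y."""
    return (-y * y, x * y)

def edge_integral(p, q, n=50):
    """Midpoint-rule line integral of F along the straight segment p -> q."""
    (x0, y0), (x1, y1) = p, q
    dx, dy = (x1 - x0) / n, (y1 - y0) / n
    total = 0.0
    for k in range(n):
        fx, fy = F(x0 + (k + 0.5) * dx, y0 + (k + 0.5) * dy)
        total += fx * dx + fy * dy
    return total

def circulation(x0, y0, x1, y1):
    """Line integral around a rectangle, keeping the surface on the left."""
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    return sum(edge_integral(corners[k], corners[(k + 1) % 4]) for k in range(4))

a, b, m = 2.0, 1.0, 8
outer = circulation(0.0, 0.0, a, b)
# Sum of the circulations of the m*m small cells: interior edges cancel in pairs.
cells = sum(circulation(i * a / m, j * b / m, (i + 1) * a / m, (j + 1) * b / m)
            for i in range(m) for j in range(m))
surface = 3.0 * a * b * b / 2.0   # integral of 3y over the rectangle
print(outer, cells, surface)
```

All three numbers should agree: the sum over cells equals the integral around the outer boundary, and both equal the surface integral of the normal component of the curl.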
145. The Curl in Curvilinear Coordinates. — It is often useful
to have the curl, and Stokes's theorem, in curvilinear coordinates.
We refer back to Chap. XVIII, using methods analogous to those
used there in discussing the divergence and gradient. Consider
an approximately rectangular area, similar to that in Fig. 38,
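bounded by $q_1$, $q_1 + dq_1$, $q_2$, $q_2 + dq_2$. The line integral about
the circuit is
$$F_1(q_1,q_2)\,ds_1 + F_2(q_1+dq_1,q_2)\,ds_2 - F_1(q_1,q_2+dq_2)\,ds_1 - F_2(q_1,q_2)\,ds_2$$
$$= \left[\frac{F_2(q_1+dq_1,q_2)}{h_2} - \frac{F_2(q_1,q_2)}{h_2}\right] dq_2 - \left[\frac{F_1(q_1,q_2+dq_2)}{h_1} - \frac{F_1(q_1,q_2)}{h_1}\right] dq_1 = \left[\frac{\partial}{\partial q_1}\left(\frac{F_2}{h_2}\right) - \frac{\partial}{\partial q_2}\left(\frac{F_1}{h_1}\right)\right] dq_1\, dq_2,$$
where $ds_1 = dq_1/h_1$, $ds_2 = dq_2/h_2$, and each h is evaluated at the same point as the F it divides.
Since this must be $\operatorname{curl}_3 \mathbf{F}\, ds_1\, ds_2$, we have
$$\operatorname{curl}_3 \mathbf{F} = h_1 h_2\left[\frac{\partial}{\partial q_1}\left(\frac{F_2}{h_2}\right) - \frac{\partial}{\partial q_2}\left(\frac{F_1}{h_1}\right)\right], \tag{6}$$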
with analogous relations for the two other components.
We can illustrate the formulas by showing that the curl of
the field of a straight wire is zero. Let us take cylindrical coordinates,
in which $r = q_1$, $\theta = q_2$, $z = q_3$, $h_1 = 1$, $h_2 = 1/r$, $h_3 = 1$.
The assumed magnetic field, along the tangent, is $H_r = 0$,
$H_\theta = 2i/r$, $H_z = 0$. We then have $H_\theta/h_\theta = 2i$, a constant, so
that its derivative is zero, and the curl vanishes.
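Equation (6) can be tried out numerically in cylindrical coordinates. The sketch below is illustrative only: it evaluates the z component of the curl by central differences, using $h_1 = 1$, $h_2 = 1/r$ as above. The function names, the step size, and the rigid-rotation comparison field $F_\theta = \omega r$ are assumptions added here for contrast.

```python
def curl_z_cylindrical(F_r, F_theta, r, theta, h=1.0e-5):
    """curl_3 F = h1*h2*[d(F_2/h_2)/dq1 - d(F_1/h_1)/dq2] with q1 = r, q2 = theta,
    h1 = 1, h2 = 1/r, estimated by central differences."""
    d_dr = ((r + h) * F_theta(r + h, theta) - (r - h) * F_theta(r - h, theta)) / (2.0 * h)
    d_dtheta = (F_r(r, theta + h) - F_r(r, theta - h)) / (2.0 * h)
    return (1.0 / r) * (d_dr - d_dtheta)

def zero(r, theta):
    return 0.0

def wire(r, theta, i=1.0):
    """Tangential field of the straight wire, H_theta = 2i/r."""
    return 2.0 * i / r

def rotation(r, theta, omega=1.0):
    """Assumed comparison field F_theta = omega * r, whose curl is 2*omega."""
    return omega * r

curl_wire = curl_z_cylindrical(zero, wire, 1.5, 0.3)
curl_rotation = curl_z_cylindrical(zero, rotation, 1.5, 0.3)
print(curl_wire, curl_rotation)
```

The wire field gives a curl indistinguishable from zero at any point off the axis, while the rotating field gives 2, so the vanishing is a property of the 1/r dependence and not of the method.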
146. Applications of Stokes's Theorem. — Let us apply Stokes's
theorem in a few cases. First, if the curl is everywhere zero,
the line integral of the vector is zero around a closed path. It
follows that the line integral from one point to another along
any path is the same. This is the condition for the existence
of a potential, and we now see that the vanishing of the curl
is just the condition that we must have in order to set up a
potential. But in the magnetic case, it is not true that the line
integral around any path is zero. Any contour including the
current has an integral different from zero. The whole situation
is then explained if inside the wire carrying the current the curl
of H is not zero, but is a vector pointing along the direction of
the current, of such a magnitude that the total surface integral
over the cross section of the wire is $4\pi i$. Thus, for example,
a contour going once around the current has a surface integral
of the curl equal to $4\pi i$, which therefore must be the value of the
line integral of the tangential component of H.
To find the exact relation between the current and curl H,
we imagine the current in the wire to be spread out through the
actual material of the wire, as in fact it is. We set up u, the
current density, or flux of electricity, satisfying the equation of
continuity $\partial\rho/\partial t + \operatorname{div}\mathbf{u} = 0$. Then $i = \iint u_n\, dS$, where the
integration is over the cross section of the wire. We must have,
then, $4\pi\iint u_n\, dS = \iint \operatorname{curl}_n \mathbf{H}\, dS$, and since this must hold for
any size wire, the natural assumption is that the same relation
holds between the small elements of current, so that $4\pi u_n = \operatorname{curl}_n \mathbf{H}$,
or more generally
$$\operatorname{curl}\mathbf{H} = 4\pi\mathbf{u}. \tag{7}$$
Here u is in e.m.u. If it is in e.s.u., the equation is $\operatorname{curl}\mathbf{H} = 4\pi\mathbf{u}/c$.
We can see one result of these equations. If the current
instead of being in a single wire, is distributed through space,
the curl is different from zero everywhere, and there is no possi-
bility of writing a potential at all.
147. Example: Magnetic Field in a Solenoid. — Suppose we
have an infinite solenoid, of finite radius, with n turns per centi-
meter, carrying current i, and that we wish to calculate the
magnetic field inside it. We assume that it is in no external
magnetic field, so that the field outside is zero. By symmetry,
the field inside will point in the direction of the axis. Now let
us apply Stokes's theorem to a path as follows: (1) Inside, along
a line parallel to the axis, for 1 cm. The integral of H will be
$H_i$, the H inside, times unit distance. (2) Straight out, radially,
to the outside of the solenoid. Since H is at right angles, the
integral of H will be zero. (3) Outside, back for 1 cm. along a
line parallel to the axis. The integral is zero since H is zero
outside. (4) Straight in again, closing the figure, and contributing
nothing to the integral. Thus we have $\oint H_s\, ds = H_i$. Now
$\iint \operatorname{curl}_n \mathbf{H}\, dS = 4\pi\iint u_n\, dS = 4\pi \times$ total current flowing through
the contour $= 4\pi n i$. Hence we have $H_i = 4\pi n i$, the formula
for the magnetic field inside a solenoid, showing that it is constant
independent of position.
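As a rough independent check of $H_i = 4\pi n i$, one may build the solenoid out of separate circular turns and add their fields on the axis. The sketch below assumes the standard on-axis field of a single circular loop in e.m.u., $2\pi i R^2/(R^2 + z^2)^{3/2}$ (the subject of Prob. 6 at the end of this chapter, not derived in the text), and a long but finite solenoid; the numerical values are arbitrary.

```python
import math

def loop_axis_field(i, R, z):
    """Assumed standard result (e.m.u.): field on the axis of a circular loop
    of radius R carrying current i, at distance z from its plane."""
    return 2.0 * math.pi * i * R * R / (R * R + z * z) ** 1.5

def solenoid_axis_field(i, R, turns_per_cm, half_length):
    """Field at the center of a finite solenoid, summed turn by turn."""
    n_half = int(round(turns_per_cm * half_length))
    spacing = 1.0 / turns_per_cm
    return sum(loop_axis_field(i, R, k * spacing) for k in range(-n_half, n_half + 1))

i, R, n, L = 1.0, 1.0, 50.0, 100.0   # current, radius (cm), turns per cm, half length (cm)
ratio = solenoid_axis_field(i, R, n, L) / (4.0 * math.pi * n * i)
print(ratio)
```

The ratio approaches 1 as the half length grows compared with the radius, in agreement with the result found from Stokes's theorem for the infinite solenoid.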
148. The Vector Potential. — In magnetic fields coming from
permanent magnets, where there is no current, we can write an
ordinary potential, letting $\mathbf{H} = -\operatorname{grad}\phi$. But this is only
possible when curl H = 0, which is not true in the presence of
currents. On the other hand, it can be shown that if the divergence
of a vector is zero, as div H = 0, it is always possible to
set up a vector A, called the vector potential (to distinguish it
from $\phi$, which is called a scalar potential), such that H = curl A.
This is often a useful thing to do. We can prove readily that
div curl A = 0 always, so that we have div H = 0.
The vector potential satisfies a simple differential equation.
We know that curl A = H, but this does not determine A
uniquely. In fact, to determine a vector field uniquely we must
specify both its curl and its divergence, and we can find a vector
whose curl and divergence are any desired functions. Let us
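then demand that div A = 0. We now have $\operatorname{curl}\mathbf{H} = 4\pi\mathbf{u}/c = \operatorname{curl}\operatorname{curl}\mathbf{A}$.
It can be proved that $\operatorname{curl}\operatorname{curl}\mathbf{A} = \operatorname{grad}\operatorname{div}\mathbf{A} - \nabla^2\mathbf{A} = -\nabla^2\mathbf{A}$,
since div A = 0. Hence
$$\nabla^2\mathbf{A} = -\frac{4\pi\mathbf{u}}{c}, \tag{8}$$
similar to Poisson's equation for the scalar potential in terms of
the charge density,
$$\nabla^2\phi = -4\pi\rho.$$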
These two equations, expanded to include terms depending on
time, prove to be very important in general electrical theory.
Let us set up the vector potential for a current in a straight
wire. Take cylindrical coordinates, with the wire pointing along
the z axis. Poisson's equation for A is a vector equation, but
since u has only a z component, A will likewise have only a z
component, which will depend only on r. Thus we have
$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dA_z}{dr}\right) = -\frac{4\pi u}{c} \quad \text{for } r < R, \qquad = 0 \quad \text{for } r > R,$$
where R is the radius of the wire.
The solutions of this equation are
$$A_z = -\frac{\pi u r^2}{c} + a\ln r + b \quad \text{for } r < R, \qquad A_z = d\ln r + e \quad \text{for } r > R.$$
Since A cannot become infinite at r = 0, we must have a = 0.
We may choose b = 0. Then d and e must be chosen to make A
and its derivative with respect to r continuous at r = R. Noting
that $\pi R^2 u = i$, the total current, this easily leads to
$$A_z = -\frac{2i}{c}\ln r + \text{constant} \quad \text{for } r > R.$$
The only component of H is then $H_\theta$, which is
$$H_\theta = -\frac{\partial}{\partial r}\left(\frac{A_z}{h_z}\right) = \frac{2i}{cr}.$$
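The matching of the inside and outside solutions can be followed numerically. The sketch below is an illustration under stated assumptions: uniform current density $u = i/\pi R^2$, the choices a = 0 and b = 0 made above, constants d and e fixed by continuity at r = R, and a finite-difference derivative for $H_\theta = -dA_z/dr$. The numerical values of i and R are arbitrary.

```python
import math

def A_z(r, i, R, c=3.0e10):
    """Vector potential of a straight wire of radius R (current i in e.s.u.),
    with a = b = 0 inside and continuity of A_z and dA_z/dr at r = R."""
    u = i / (math.pi * R * R)              # uniform current density
    if r < R:
        return -math.pi * u * r * r / c
    return -(2.0 * i / c) * math.log(r / R) - i / c

def H_theta(r, i, R, c=3.0e10):
    """H_theta = -dA_z/dr, estimated by a central difference."""
    h = 1.0e-6 * r
    return -(A_z(r + h, i, R, c) - A_z(r - h, i, R, c)) / (2.0 * h)

i, R, c = 3.0e10, 0.1, 3.0e10              # chosen so that i/c = 1
H_inside = H_theta(0.05, i, R, c)          # expect 2*i*r/(c*R^2) = 10
H_outside = H_theta(0.5, i, R, c)          # expect 2*i/(c*r) = 4
print(H_inside, H_outside)
```

Outside the wire the result reproduces $2i/cr$, which is Eq. (4) written with the current in e.s.u.; inside, the field rises linearly with r, which is the answer to Prob. 10 below under the same assumption of constant current density.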
149. The Biot-Savart Law. — In the case of a linear conductor
carrying a current i, the expression for the vector potential,
using the solution of Poisson's equation from Chap. XIX, becomes
$$\mathbf{A} = \frac{i}{c}\int \frac{d\mathbf{s}}{r},$$
where ds is the vector element of length taken along the con-
ductor, and pointing in the direction of the current. To find the
intensity of the magnetic field, we take the curl, finding
$$\mathbf{H} = \operatorname{curl}\mathbf{A} = \frac{i}{c}\int \operatorname{curl}\frac{d\mathbf{s}}{r}.$$
In this equation, ds is a vector and r a scalar. In general, if S
is a scalar and B an arbitrary vector, it is easy to show that
curl (SB) = S curl B + (grad S) X B.
Applying this relation to our case, B = ds, and S = 1/r, and we
must remember that in taking the curl we differentiate only with
respect to the coordinates which fix the point at which we wish
the value of H (the field point). Now these coordinates appear
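only in r and not in ds, which depends on the circuit only. Thus
the first term vanishes and we have
$$\mathbf{H} = \frac{i}{c}\int \operatorname{grad}\left(\frac{1}{r}\right) \times d\mathbf{s} = \frac{i}{c}\int \frac{d\mathbf{s}\times\mathbf{r}}{r^3}, \tag{9}$$
where $\mathbf{r}$ is the vector from ds to the field point, and r is the length
of this vector. If we imagine that the resultant H is made up of a
sum of contributions from each conductor element ds, we may
write the law in its differential form
$$d\mathbf{H} = \frac{i}{c}\,\frac{d\mathbf{s}\times\mathbf{r}}{r^3}. \tag{10}$$
This is known as the Biot-Savart law. The magnitude of dH is
obviously
$$|d\mathbf{H}| = \frac{i\, ds}{c\, r^2}\sin\theta, \tag{11}$$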
where $\theta$ is the angle between the direction of ds and r; the direction
of dH is perpendicular to the plane of ds and r. Applied to
closed circuits it always yields the same results as the integral
law. For open circuits this is not obvious, since we can add to
the expressions for dH a differential $d\psi$ provided $\oint d\psi$ around a
closed curve is zero. In this way we leave the law for closed
circuits unaltered, but for open circuits change the value of H
so calculated. Thus the integral law must be looked upon as the
more fundamental.
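The differential law (10) lends itself to direct numerical summation. The following sketch is illustrative and rests on assumptions not in the text: a straight wire along the z axis of finite half length, cut into equal elements, with the field point on the x axis. It adds the contributions $d\mathbf{H} = (i/c)\, d\mathbf{s}\times\mathbf{r}/r^3$ and compares the result with $2i/cr$.

```python
import math

def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def biot_savart_straight(field_point, half_length, i, c=3.0e10, n=100000):
    """Sum dH = (i/c) ds x r / r^3 over a wire on the z axis from -L to +L."""
    hx = hy = hz = 0.0
    dz = 2.0 * half_length / n
    for k in range(n):
        z = -half_length + (k + 0.5) * dz
        ds = (0.0, 0.0, dz)
        # r points from the element ds to the field point
        r_vec = (field_point[0], field_point[1], field_point[2] - z)
        r = math.sqrt(r_vec[0] ** 2 + r_vec[1] ** 2 + r_vec[2] ** 2)
        cx, cy, cz = cross(ds, r_vec)
        factor = i / (c * r ** 3)
        hx += factor * cx
        hy += factor * cy
        hz += factor * cz
    return (hx, hy, hz)

i, c, d = 3.0e10, 3.0e10, 1.0
H = biot_savart_straight((d, 0.0, 0.0), 500.0, i, c)
ratio = H[1] / (2.0 * i / (c * d))
print(H, ratio)
```

Only the tangential component (here $H_y$) survives, and for a wire long compared with the distance d the ratio tends to 1, so the Biot-Savart law agrees with the field found earlier from the integral law.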
Problems
1. Prove that a double layer of moment m per unit area leads to a potential
$\phi$ at point P equal to $m\Omega$, where $\Omega$ is the solid angle subtended by the
area from the point P.
2. Show that in the electrostatic system of units, charge has the dimensions
$m^{1/2} l^{3/2} t^{-1}$, current the dimensions $m^{1/2} l^{3/2} t^{-2}$, voltage (e.m.f.) the dimensions
$m^{1/2} l^{1/2} t^{-1}$, resistance the dimensions $l^{-1} t$, and capacity the dimensions $l$.
3. Derive the dimensions of charge, current, voltage, resistance, and
capacity in the electromagnetic system of units.
4. Prove that if S is a scalar and B a vector
curl (SB) = S curl B + grad S X B.
5. Prove div curl F = 0; curl curl F = grad div F $- \nabla^2$F, where F is any
vector.
6. Using the Biot-Savart law, find the magnetic field at any point on the
axis of symmetry of a circular loop of wire of radius R carrying a current i.
7. A current flows in a circular loop of wire, of radius R. Find the vector
potential of the resulting magnetic field, at large distances compared with R,
by adding the contributions to the vector potential due to the separate
elements of current.
8. Compute the field, from the potential of the last problem, and show
that it is approximately the field of a single dipole. Find the strength of
the dipole, in terms of current and radius R.
9. Two parallel straight wires carry equal currents. Work out the
magnetic fields due to the two together, in the two cases where the currents
flow in the same or in opposite directions, drawing diagrams of the lines of
force.
10. Find the magnetic field at points inside a wire carrying a current,
assuming the wire is straight and of circular cross section and that the
current has constant density throughout the wire.
11. Compute the curl in spherical polar coordinates. Verify directly
that the divergence of a curl is zero in these coordinates.
CHAPTER XXI
ELECTROMAGNETIC INDUCTION AND MAXWELL'S
EQUATIONS
We now leave the restriction of the steady state and inquire
into the extensions of the theory necessary to have it hold for
nonstationary phenomena. The fundamental fact concerning
electromagnetic induction may be stated as follows: If a set of
circuits carrying current (or magnets and circuits) are set in
relative motion with respect to each other, the currents in the
circuits change during the relative motion. Instead of formulat-
ing a law for the induced currents, it is simpler to consider the
induced electromotive force. Take a closed circuit in the neigh-
borhood of a moving magnet (or moving circuit), and let N be
the number of magnetic lines of force through the circuit. Then
the induced electromotive force is $-\dfrac{dN}{dt}$, expressed in electromagnetic
units, if N is in these units. If the e.m.f. is expressed
in electrostatic units it is equal to $-\dfrac{1}{c}\dfrac{dN}{dt}$. The minus sign
expresses what is commonly termed Lenz's law and indicates that
if $\dfrac{dN}{dt}$ is represented by a vector going through the circuit, the
induced current flows in a clockwise fashion.
150. The Differential Equation for Electromagnetic Induction.
We can now state this law in more analytical form. Consider
the closed curve formed by the circuit, and any surface whose
boundary is this curve, so that the surface forms a sort of cap
over the curve. Then the magnetic flux
$$N = \iint H_n\, dS,$$
where the integral is carried out over the whole surface. Further-
more the electromotive force is by definition the work done in
carrying a unit charge once around the circuit. This work may
be done either by the electric field or by chemical forces in a
battery. Since the latter are considered absent we have
$$\text{e.m.f.} = \oint E_s\, ds,$$
where the integral is taken completely around the circuit. The
fact that this line integral does not vanish shows us at once that
we shall not be able to introduce a potential, as we have done
in the electrostatic case. Thus we have
$$\oint E_s\, ds = -\frac{d}{dt}\iint H_n\, dS. \tag{1}$$
It should be noticed that the flux of the magnetic field through
the circuit may change in several ways, either by changing $H_n$,
or by changing the shape of the circuit, thus causing a change in
the enclosed area, or by moving the undeformed circuit to other
parts of space where $H_n$ is different. In general dN/dt is composed
of several terms. In the case of fixed circuits, we may
replace the total time derivative by the partial derivative so that
$dN/dt = \partial N/\partial t$. With the help of Stokes's theorem we rewrite
the induction law as
$$\iint \operatorname{curl}_n \mathbf{E}\, dS = -\iint \frac{\partial H_n}{\partial t}\, dS.$$
This holds for any fixed circuit, and hence for any fixed area of
integration. Thus it must hold for an infinitesimal area dS, so
that the integrands must be equal and we obtain
$$\operatorname{curl}\mathbf{E} = -\frac{\partial\mathbf{H}}{\partial t}.$$
This is the differential form of the induction law. In it, E and H
are both expressed in e.m.u. If E is expressed in e.s.u. and H in
e.m.u., the law takes the form
$$\operatorname{curl}\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t}. \tag{2}$$
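The equivalence of the integral and differential forms can be illustrated for a simple assumed case: a uniform field $H_z = Kt$ and a fixed circular circuit of radius a in the xy plane. The electric field used below, $\mathbf{E} = (Ky/2c,\ -Kx/2c,\ 0)$, is one field satisfying Eq. (2) for this H; it is chosen for illustration and is not the only possibility. All names and numerical values are assumptions.

```python
import math

C = 3.0e10        # ratio of the units, cm per second
K = 6.0e10        # assumed rate of growth of the field, H_z = K * t
RADIUS = 2.0      # assumed radius of the circular circuit, cm

def E_field(x, y):
    """Assumed field with curl_z E = -K/C = -(1/c) dH_z/dt (E in e.s.u.)."""
    return (K * y / (2.0 * C), -K * x / (2.0 * C))

def emf_line_integral(a, n=1000):
    """Line integral of E around a counterclockwise circle of radius a."""
    total = 0.0
    dt = 2.0 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        ex, ey = E_field(a * math.cos(t), a * math.sin(t))
        total += (-ex * math.sin(t) + ey * math.cos(t)) * a * dt
    return total

def flux(t, a):
    """N = integral of H_n over the circle, for H_z = K * t."""
    return math.pi * a * a * K * t

def emf_from_flux(a, t=1.0e-9, dt=1.0e-12):
    """-(1/c) dN/dt, with the derivative taken by a central difference."""
    return -(flux(t + dt, a) - flux(t - dt, a)) / (2.0 * dt) / C

emf_line = emf_line_integral(RADIUS)
emf_flux = emf_from_flux(RADIUS)
print(emf_line, emf_flux)
```

Both numbers equal $-\pi a^2 K/c$, so the line integral of E around the circuit agrees with the rate of decrease of flux divided by c.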
151. The Displacement Current.— We have now derived four
fundamental electromagnetic equations:
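$$\operatorname{div}\mathbf{E} = 4\pi\rho,$$
$$\operatorname{div}\mathbf{H} = 0,$$
$$\operatorname{curl}\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t},$$
$$\operatorname{curl}\mathbf{H} = \frac{4\pi\mathbf{u}}{c}, \tag{3}$$
where E, $\rho$, and u are in e.s.u. and H in e.m.u. These are almost
the Maxwell equations, but there is difficulty with the last of
them. Of course, we have derived it on the basis of steady closed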
currents and for this case it is surely correct. The difficulty
occurs when we try to apply this result to nonstationary cases.
In the nonsteady state we have the new possibility of current
flowing in "open" circuits. The simplest example is that of the
discharge of a condenser. Here the current starts at the posi-
tively charged plate, whose charge diminishes as the current flows
to the negatively charged plate and annuls the charge there.
Thus we can look upon the condenser plate as a source (or sink)
of current. Now if we take the divergence of the last equation,
we have
$$\operatorname{div}\operatorname{curl}\mathbf{H} = \frac{4\pi}{c}\operatorname{div}\mathbf{u},$$
and since the divergence of any curl is zero, we find that div u
equals zero, which means that the current is always closed and
there are no sources or sinks. Thus open circuits lead to a
contradiction to this equation. We have derived the equation
from steady-state considerations, however, and if we are to extend
it to hold under all conditions, it is clear that there must be some
term which vanishes for the steady state which we must add.
The equation of continuity applied to electric charge and current
tells us that
$$\operatorname{div}\mathbf{u} + \frac{\partial\rho}{\partial t} = 0,$$
expressing the fact that the flow of current out of a volume
results in a decrease of charge in that volume. In the steady
state $\partial\rho/\partial t = 0$, so that div u = 0, and we have no inconsistency
with our fundamental equation. It is certainly clear that if
curl H is to be proportional to a current, this current must be
divergenceless, and u is not. Maxwell made the bold step of
assuming that the whole current consisted of two terms u and u',
where u' was so chosen that div (u + u') = 0. In this way the
distinction between open and closed circuits vanishes and a unity
hitherto lacking was given to the laws. Maxwell saw at once
that we must set $\mathbf{u}' = \dfrac{1}{4\pi}\dfrac{\partial\mathbf{E}}{\partial t}$. For then we have
$$\operatorname{div}(\mathbf{u} + \mathbf{u}') = \operatorname{div}\left(\mathbf{u} + \frac{1}{4\pi}\frac{\partial\mathbf{E}}{\partial t}\right) = \operatorname{div}\mathbf{u} + \frac{1}{4\pi}\frac{\partial}{\partial t}(\operatorname{div}\mathbf{E}) = \operatorname{div}\mathbf{u} + \frac{\partial\rho}{\partial t} = 0,$$
and this is the equation of continuity which we have been trying
to satisfy. In other words, Maxwell assumed the correct equation
to be
$$\operatorname{curl}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi\mathbf{u}}{c}. \tag{4}$$
The new term $\dfrac{1}{4\pi}\dfrac{\partial\mathbf{E}}{\partial t}$ is called the displacement current, in
contrast to the convection current u.
Actually the real advance of Maxwell over his predecessors
lies in the introduction of this displacement current. The physi-
cal meaning of this current can be obtained by considering the
charging of a condenser. Current flows from one plate through
the wire to the other plate. If the current is i, this equals the
rate of increase of charge on the plate. Suppose the plates are of
area A, separation d, then the field between them is
$$E = 4\pi\sigma = \frac{4\pi}{A} \times \text{total charge},$$
and the displacement current density in the region between the
plates is
$$\frac{1}{4\pi}\frac{\partial E}{\partial t} = \frac{\partial\sigma}{\partial t} = \frac{1}{A}\frac{d}{dt}(\text{total charge}) = \frac{i}{A}.$$
Thus the displacement current is $\dfrac{A}{4\pi}\dfrac{\partial E}{\partial t} = i$, and is equal to the
convection current in the wire, so that the current becomes
continuous throughout the circuit. The fundamental assumption
of Maxwell was that the displacement current is always
present when an electric field varies in time and produces the
same magnetic effects as convection currents.
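The continuity of the total current through the condenser can be checked with numbers. The sketch below is a hedged illustration: it assumes a particular charging law $q(t) = Q(1 - e^{-t/\tau})$, which is not specified in the text, and arbitrary values for Q, $\tau$ and the plate area. It compares the current in the wire with $(A/4\pi)\,\partial E/\partial t$ between the plates.

```python
import math

Q_MAX = 5.0      # assumed final charge, e.s.u.
TAU = 2.0e-3     # assumed charging time constant, seconds
AREA = 10.0      # assumed plate area, square centimeters

def charge(t):
    """Assumed charging law for the plate."""
    return Q_MAX * (1.0 - math.exp(-t / TAU))

def conduction_current(t):
    """Current in the wire, i = dq/dt."""
    return (Q_MAX / TAU) * math.exp(-t / TAU)

def field(t):
    """Field between the plates, E = 4*pi*sigma = 4*pi*q/A."""
    return 4.0 * math.pi * charge(t) / AREA

def displacement_current(t, dt=1.0e-8):
    """(A / 4 pi) dE/dt, with the derivative taken by a central difference."""
    return AREA / (4.0 * math.pi) * (field(t + dt) - field(t - dt)) / (2.0 * dt)

t0 = 1.0e-3
i_wire = conduction_current(t0)
i_gap = displacement_current(t0)
print(i_wire, i_gap)
```

The two currents agree at every instant, whatever charging law is assumed, since both are simply the rate of change of the charge on the plate.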
It is clear that a test of Maxwell's hypothesis can only be made
with very rapidly varying fields, since we must make $\dfrac{1}{4\pi}\dfrac{\partial E}{\partial t} \gg u$
in order to keep the convection current effects from masking
1888, performed the experiments on electric waves which con-
firmed this assumption of Maxwell. There is an interesting
connection between the displacement current and the Biot-
Savart law. All the attempts before Maxwell were to find a
correct form of the Biot-Savart law for "open" circuits. As we
pointed out in the last chapter, the addition of a total differential
to this law would yield nothing when it was applied to closed
circuits, and the hope was that the correct form to be added to
this law could be found so as to account for open circuit
phenomena.
152. Maxwell's Equations. — We can now write the correct
Maxwell equations
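$$\operatorname{curl}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi\mathbf{u}}{c}, \qquad \operatorname{curl}\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t},$$
$$\operatorname{div}\mathbf{H} = 0, \qquad \operatorname{div}\mathbf{E} = 4\pi\rho.$$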
These are the fundamental equations of electromagnetic theory.
They need extension in but one way. If there are dielectric and
magnetic bodies present, in them Coulomb's law and its analogue
for the magnetic-field become
$$F = \frac{ee'}{\epsilon r^2}$$
and
$$F = \frac{mm'}{\mu r^2},$$
where $\epsilon$ is the dielectric constant and $\mu$ the magnetic permeability.
We now introduce a new vector called the electric displacement
D, defined by $\mathbf{D} = \epsilon\mathbf{E}$, where E is the intensity of the electric
field. Similarly, we introduce the magnetic induction vector
$\mathbf{B} = \mu\mathbf{H}$. It is easy to see from our previous work that we now
have the relation $\operatorname{div}\mathbf{D} = 4\pi\rho$. Furthermore, Faraday's induction
law refers to the rate of change of magnetic flux through a
circuit and hence H must be replaced by B in this relation.
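Finally, we have $\operatorname{div}\operatorname{curl}\mathbf{E} = 0 = -\dfrac{1}{c}\dfrac{\partial}{\partial t}\operatorname{div}\mathbf{B}$, so that div
B = 0, rather than div H = 0. The final equations are thus
found to be:
$$\operatorname{curl}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t} + \frac{4\pi\mathbf{u}}{c}, \qquad \operatorname{curl}\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t},$$
$$\operatorname{div}\mathbf{B} = 0, \qquad \operatorname{div}\mathbf{D} = 4\pi\rho,$$
$$\mathbf{B} = \mu\mathbf{H}, \qquad \mathbf{D} = \epsilon\mathbf{E}. \tag{5}$$
In these equations, E, D, $\rho$, u are in electrostatic units, H and
B in electromagnetic units. In Chap. XXIV we discuss in detail
the significance of B and D, and the interpretation of $\epsilon$ and $\mu$.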
Maxwell's equations suffice to determine the field, when we
are given the charges and currents. To make a complete set of
dynamical principles, however, we need two more relations.
First is the formula giving the force acting on a charge and
current. The electrical force per unit volume is simply pE, the
force on unit charge multiplied by the charge per unit volume.
The magnetic force is that acting on the current, as observed in
the ordinary action of the electric motor. This force acts at
right angles both to the current and to the magnetic field, and
is proportional, as is shown in the elementary study of electricity,
to the current (in electromagnetic units) times the component
of magnetic field at right angles to the current. For unit volume,
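this is just given by the vector product $\mathbf{u} \times \mathbf{H}$. If u is in electrostatic
units, it is $\dfrac{\mathbf{u}}{c} \times \mathbf{H}$. Thus we have for the force vector
$$\mathbf{F} = \rho\mathbf{E} + \frac{1}{c}(\mathbf{u} \times \mathbf{H}).$$
If the current density is produced by the motion of charge, we
have $\mathbf{u} = \rho\mathbf{v}$, where v is the velocity vector of the charge. In this
case
$$\mathbf{F} = \rho\left[\mathbf{E} + \frac{1}{c}(\mathbf{v} \times \mathbf{H})\right].$$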
This relation has been particularly used by Lorentz.
Finally, one must have a law, such as Newton's law stating
that the force is equal to mass times acceleration, determining
the motion of charge in terms of the force acting. With such a
law, we find the field from the charge, the force from the field,
and the motion from the force, obtaining therefore a complete
system of dynamics.
Let us now summarize the various steps gone through in build-
ing up Maxwell's equations. Consider first the static case.
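Here $\partial\mathbf{D}/\partial t = \partial\mathbf{B}/\partial t = 0$ and u = 0. The equations become
$$\operatorname{curl}\mathbf{H} = 0, \qquad \operatorname{curl}\mathbf{E} = 0,$$
$$\operatorname{div}\mathbf{B} = 0, \qquad \operatorname{div}\mathbf{D} = 4\pi\rho,$$
$$\mathbf{B} = \mu\mathbf{H}, \qquad \mathbf{D} = \epsilon\mathbf{E}.$$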
The three equations on the left are those of magnetostatics, and
the remaining three are those of electrostatics. Each system is
completely independent of the other. The equations curl H = 0,
and curl E = 0, show that scalar potentials exist.
In the stationary case, we still have $\partial\mathbf{B}/\partial t = \partial\mathbf{D}/\partial t = 0$, but
now $\mathbf{u} \neq 0$. The only one of the equations above which is
modified is $\operatorname{curl}\mathbf{H} = 4\pi\mathbf{u}/c$, the others remaining unchanged.
It is usual to include Ohm's law in the statement of the equations,
however. This law is easily stated in differential form by
considering a small volume, having length L in the direction
of the current flow, and cross-sectional area A normal to the
current. We apply Ohm's law in the form p.d. = iR. Here
the potential difference is the field E times the length L of the
volume, the current is the area times the current density u,
and the resistance is the specific resistance times L/A. Hence
we have
$$EL = Au\,\frac{L}{A} \times \text{specific resistance},$$
or
$$\mathbf{u} = \sigma\mathbf{E}, \tag{6}$$
where $\sigma$, the specific conductivity, is the reciprocal of the specific
resistance. This equation is Ohm's law in the form suitable
for Maxwell's equations, and it is commonly included along with
$\mathbf{D} = \epsilon\mathbf{E}$ and $\mathbf{B} = \mu\mathbf{H}$.
If we now proceed to the nonstationary state we must strictly
use the correct Maxwell relations. But there is a case of utmost
practical importance, in which $\partial D/\partial t \ll 4\pi u$, and hence for
which the effects of displacement can be neglected in comparison
with those of the convection currents. The Maxwell equations
with the displacement current omitted apply to the so-called
"quasi-stationary" processes, and these form practically the
whole domain of electrical engineering. The magnetic field
inside and outside conductors is calculated as if produced only
by the convection currents, but the induction law is not left
out as in the stationary state. Here we have a double coupling
of electric and magnetic fields, first, as in the stationary case,
where electric currents produce magnetic fields, and, secondly,
by the induction law. Since the essentially new contribution
of Maxwell, the displacement current, is neglected in quasi-
stationary calculations, it is clear that no study in that field
can give experimental confirmation of Maxwell's idea.
153. The Vector and Scalar Potentials. — We observe that, if
H depends on time, $\operatorname{curl}\mathbf{E} \neq 0$, so that there is no potential
for E. The ordinary electrical potential is thus confined to static
problems. Further, if u or $\dfrac{\partial\mathbf{E}}{\partial t} \neq 0$, there is no potential for H.
We have seen in the last chapter how a potential can be introduced
for H: one uses a vector potential A, possible because
div H = 0. That is, we let
H = curl A. (7)
We can do this even in the general case. And it proves that we
can use a scalar potential $\phi$, reducing to the electrostatic potential
in the case of a steady state, but different in other cases,
by a special device. The relation which proves to be satisfied
is that
$$\mathbf{E} = -\operatorname{grad}\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \tag{8}$$
reducing to the familiar $\mathbf{E} = -\operatorname{grad}\phi$ when everything is
independent of time. These relations are written for the case
of empty space, where $\epsilon = \mu = 1$, and we shall give the discussion
only for that case.
To verify our statements about the vector potential A and
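the scalar potential $\phi$ we substitute the expressions for E and H
in Maxwell's equations, and see if they can be satisfied by the
proper choice of A and $\phi$. First, we notice that div H = div
curl A = 0, so that this equation is automatically satisfied.
Next we take
$$\operatorname{div}\mathbf{E} = -\operatorname{div}\operatorname{grad}\phi - \frac{1}{c}\frac{\partial}{\partial t}\operatorname{div}\mathbf{A} = -\nabla^2\phi - \frac{1}{c}\frac{\partial}{\partial t}\operatorname{div}\mathbf{A}.$$
This must equal $4\pi\rho$. Now we consider the curl
equations. We have
$$\operatorname{curl}\mathbf{E} = -\operatorname{curl}\operatorname{grad}\phi - \frac{1}{c}\frac{\partial}{\partial t}\operatorname{curl}\mathbf{A}.$$
Since the curl of any gradient is zero, this is $-\dfrac{1}{c}\dfrac{\partial}{\partial t}\operatorname{curl}\mathbf{A} = -\dfrac{1}{c}\dfrac{\partial\mathbf{H}}{\partial t}$,
verifying another of Maxwell's equations. Finally
$\operatorname{curl}\mathbf{H} = \operatorname{curl}\operatorname{curl}\mathbf{A} = \operatorname{grad}\operatorname{div}\mathbf{A} - \nabla^2\mathbf{A}$. This must equal
$$\frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi\mathbf{u}}{c} = -\frac{1}{c}\frac{\partial}{\partial t}\operatorname{grad}\phi - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} + \frac{4\pi\mathbf{u}}{c}.$$
Hence, in order to satisfy Maxwell's equations, we must have
$$-\nabla^2\phi - \frac{1}{c}\frac{\partial}{\partial t}\operatorname{div}\mathbf{A} = 4\pi\rho,$$
$$\operatorname{grad}\operatorname{div}\mathbf{A} - \nabla^2\mathbf{A} + \frac{1}{c}\frac{\partial}{\partial t}\operatorname{grad}\phi + \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = \frac{4\pi\mathbf{u}}{c}.$$
But now let us choose A and $\phi$ subject to the condition that
$\operatorname{div}\mathbf{A} + \dfrac{1}{c}\dfrac{\partial\phi}{\partial t} = 0$. Since div A is so far arbitrary, we can do
this. Then the first equation becomes
$$\nabla^2\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = -4\pi\rho,$$
and the second
$$\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = -\frac{4\pi\mathbf{u}}{c}. \tag{9}$$
These are the equations for the potentials. If A and $\phi$ satisfy
them, then, as we stated before, the fields determined from them
by the equations $\mathbf{E} = -\operatorname{grad}\phi - \dfrac{1}{c}\dfrac{\partial\mathbf{A}}{\partial t}$, $\mathbf{H} = \operatorname{curl}\mathbf{A}$, satisfy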
Maxwell's equations. The equations for the potentials are
of the form called D'Alembert's equation, and as can be seen
are extensions of Poisson's equation, obtained by adding the time
derivatives. We observe that in regions where there is no charge
and current density, the potential satisfies the wave equation,
which is the homogeneous equation obtained by setting the right
side of D'Alembert's equation equal to zero. That is, $\phi$ and A
are given by functions representing waves traveling with velocity
c. Hence the same thing must be true of the fields E and H.
This is the origin of the theory of electromagnetic waves, and
of the electromagnetic theory of light, and the proof that c,
the ratio of the units, is at the same time the velocity of light.
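That a disturbance traveling with velocity c satisfies the homogeneous equation can be verified by finite differences. The sketch below is an illustration only: it assumes a one-dimensional potential $\phi(x, t) = f(x - vt)$ with a Gaussian profile f, and evaluates $\partial^2\phi/\partial x^2 - (1/c^2)\,\partial^2\phi/\partial t^2$ once for v = c and once for v = c/2. The profile, the step size and the sample point are arbitrary choices.

```python
import math

C = 3.0e10  # velocity of the waves, cm per second

def profile(s):
    """Assumed wave shape f(s); any twice-differentiable function would do."""
    return math.exp(-s * s)

def wave_residual(x, t, speed, h=1.0e-3):
    """phi_xx - phi_tt / c^2 for phi(x, t) = f(x - speed * t), by central differences."""
    def phi(xx, tt):
        return profile(xx - speed * tt)
    dt = h / C
    phi_xx = (phi(x + h, t) - 2.0 * phi(x, t) + phi(x - h, t)) / (h * h)
    phi_tt = (phi(x, t + dt) - 2.0 * phi(x, t) + phi(x, t - dt)) / (dt * dt)
    return phi_xx - phi_tt / (C * C)

x0, t0 = 0.3, 1.0e-11
residual_c = wave_residual(x0, t0, C)           # wave traveling with velocity c
residual_half = wave_residual(x0, t0, 0.5 * C)  # wave traveling with velocity c/2
print(residual_c, residual_half)
```

The residual vanishes (to rounding error) only for the wave traveling with velocity c; the same profile moving at half that speed leaves a residual of order unity, so the constant c in the equations fixes the velocity of propagation.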
In regard to our condition imposed on the potentials, that
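$\operatorname{div}\mathbf{A} + \dfrac{1}{c}\dfrac{\partial\phi}{\partial t} = 0$, we can readily show that if the potentials
satisfy Eqs. (9) above, this condition can also be satisfied. For
take 1/c times the time derivative of the first, and the divergence
of the second, and add. Using the fact that $\operatorname{div}\nabla^2\mathbf{A} = \nabla^2\operatorname{div}\mathbf{A}$,
where A is any vector, the result is
$$\nabla^2\left(\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\phi}{\partial t}\right) - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\left(\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\phi}{\partial t}\right) = -\frac{4\pi}{c}\left(\operatorname{div}\mathbf{u} + \frac{\partial\rho}{\partial t}\right) = 0,$$
the right side vanishing by the equation of continuity.
That is, the quantity $\operatorname{div}\mathbf{A} + \dfrac{1}{c}\dfrac{\partial\phi}{\partial t}$ satisfies the wave equation
everywhere. It can be proved that no function, other than
zero, can satisfy the wave equation everywhere, unless its value
at infinity is different from zero. Hence in an ordinary problem
of charges at finite points, where certainly the potentials must
vanish at infinity, it must be that $\operatorname{div}\mathbf{A} + \dfrac{1}{c}\dfrac{\partial\phi}{\partial t} = 0$, and in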
other cases we can certainly choose the potentials so that this
condition will be satisfied.
Problems
1. Show that E and H satisfy the wave equations $\nabla^2\mathbf{E} - \dfrac{1}{c^2}\dfrac{\partial^2\mathbf{E}}{\partial t^2} = 0$,
with a similar equation for H, in empty space, where u and $\rho$ are zero, and
$\epsilon = \mu = 1$. (Suggestion: for the first, take the equation for curl E, and
take its curl, then substitute for curl H in terms of E. Proceed in an analogous
way with the other equation.)
2. In a region where u and $\rho$ are zero, but $\epsilon$ and $\mu$ are different from 1,
show that the velocity of light is $c/\sqrt{\epsilon\mu}$.
3. A magnetic field points along the z axis, and its magnitude is propor-
tional to the time, and independent of position. Find the vector potential.
Assuming that the scalar potential is zero, find the induced electric field.
Prove by direct integration using a circular circuit, that the law of induction
holds.
4. Describe the magnetic field between the plates of a condenser while
it is charging up.
5. Starting from the induction law, show that the line integral of
$\left(\mathbf{E} + \dfrac{1}{c}\dfrac{\partial\mathbf{A}}{\partial t}\right)$ around a closed path is zero, where A is the vector potential.
From this show that the curl of the above vector vanishes and hence that
$\mathbf{E} = -\operatorname{grad}\phi - \dfrac{1}{c}\dfrac{\partial\mathbf{A}}{\partial t}$, where $\phi$ is the scalar potential.
& C dt
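6. In conductors where $\mu = 1$ and $\rho = 0$ show that E and H both satisfy
differential equations of the form
$$\nabla^2\mathbf{E} - \frac{4\pi\sigma}{c^2}\frac{\partial\mathbf{E}}{\partial t} - \frac{\epsilon}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = 0.$$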
7. Derive the differential equations satisfied by E and H for quasi-
stationary processes.
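8. Show that if a voltage is induced in a circuit (2) by a changing magnetic
field due to a circuit (1), the induced e.m.f. in (2) is given by
$$\oint \mathbf{E}\cdot d\mathbf{s}_2 = -\frac{1}{c}\frac{d}{dt}\oint \mathbf{A}_1\cdot d\mathbf{s}_2,$$
where $\mathbf{A}_1$ is the vector potential at the element $d\mathbf{s}_2$ due to the current in
circuit (1). For quasi-stationary processes we can write
$$\mathbf{A}_1 = \frac{1}{c}\iiint \frac{\mathbf{u}_1\, dv_1}{r},$$
where $\mathbf{u}_1$ is the current density in circuit (1) and $dv_1$ a volume element
thereof. For linear currents show that the induced e.m.f. is then given by
$$\oint \mathbf{E}\cdot d\mathbf{s}_2 = -\frac{1}{c^2}\frac{d}{dt}\left(I_1 \oint\oint \frac{d\mathbf{s}_1\cdot d\mathbf{s}_2}{r_{12}}\right),$$
where $I_1$ is the current in the first circuit, $r_{12}$ is the distance between $d\mathbf{s}_1$ and
$d\mathbf{s}_2$.
The coefficient of mutual induction $M_{12}$ is defined as
$$M_{12} = \oint\oint \frac{d\mathbf{s}_1\cdot d\mathbf{s}_2}{r_{12}},$$
so that the above relation becomes
$$(\text{E.m.f.})_2 = -\frac{1}{c^2}\frac{d}{dt}(M_{12} I_1).$$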
CHAPTER XXII
ENERGY IN THE ELECTROMAGNETIC FIELD
The idea of energy is as useful in electromagnetic theory
as in mechanics. Maxwell's equations correspond in a general
way to the equations of motion, and in the present chapter we
introduce electrical and magnetic energies analogous to the
potential and kinetic energies. The analogy is particularly
close with the mechanical energy in a vibrating medium, since
electrical oscillations in free space, as in a light wave, are similar
to mechanical oscillations in sound. The energy of an elastic
solid is distributed throughout the body, each volume element
having a potential energy on account of its strain, and a kinetic
energy on account of its velocity. Correspondingly we shall
find that the electromagnetic energy can be considered as
localized throughout the field, with a definite density of electrical
and magnetic energy. Finally, the potential energy is propor-
tional to the square of the stress or strain, and kinetic energy
proportional to the square of velocity or momentum, and in a
similar way here we shall find electrical energy proportional
to the square of E or D, and the magnetic energy to the square
of H or B. The analogy can be carried out completely, Maxwell's
equations, for instance, being written in the form of Lagrangian
equations; however, we shall not do this. We start the discus-
sion by deriving the electrical and magnetic energy by elementary
means from the condenser and solenoid, and then pass to general
theorems involving energy density and energy flow.
154. Energy in a Condenser. — Given a condenser of capacity
C, let its charge at a given moment be q. Assume that we are
charging up the condenser, and that we wish to know how much
work we shall have to do on it to charge it. To take a small
additional charge dq around the circuit, against the difference
of potential q/C, will require an amount of work (q/C)dq. Thus
the whole work done in setting up a charge Q is
$$\int_0^Q\frac{q}{C}\,dq = \frac{1}{2}\frac{Q^2}{C}.$$
This is the expression for the energy in a condenser which we
found in Chap. V, Prob. 6.
But now there is an interesting way in which we may consider
this. We may imagine that the energy resides directly in the
electromagnetic field, between the condenser plates. Let the
area of the plates be A, the distance of separation d, and the dielectric
constant $\epsilon$, so that $C = \epsilon A/4\pi d$. Also the field between
the plates will be $E = q/Cd$, the difference of potential between
the plates divided by the distance. Hence we have $\frac{1}{2}q^2/C = \frac{1}{2}E^2Cd^2 = (\epsilon E^2/8\pi)(Ad)$. But Ad is simply the volume of the
condenser, or of the region of space where the field is E. Hence
we may consider the energy to be located in the electromagnetic
field, with a volume density $\epsilon E^2/8\pi$, and the integral of this over
the condenser will give precisely the total energy.
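As a numerical illustration of this argument, the following Python sketch evaluates the energy of a parallel-plate condenser in two ways: as one half of q^2/C, and as the density eps E^2/8 pi multiplied by the volume Ad. The function name and the sample values (in Gaussian units) are assumptions chosen for illustration and do not come from the text.

```python
import math

def condenser_energy_two_ways(area, gap, eps, charge):
    """Energy of a parallel-plate condenser, Gaussian units (hypothetical helper)."""
    capacity = eps * area / (4.0 * math.pi * gap)       # C = eps A / (4 pi d)
    from_charge = 0.5 * charge**2 / capacity            # (1/2) q^2 / C
    field = charge / (capacity * gap)                   # E = q / (C d)
    from_field = eps * field**2 / (8.0 * math.pi) * area * gap   # density times volume A d
    return from_charge, from_field

# Illustrative values: A = 100 sq. cm., d = 0.1 cm., eps = 2.5, q = 30 e.s.u.
w_charge, w_field = condenser_energy_two_ways(100.0, 0.1, 2.5, 30.0)
print(w_charge, w_field)
```

The two printed numbers should agree to within rounding, whatever plate area, separation, and charge are chosen.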
155. Energy in the Electric Field. — It is not difficult to show
that in an arbitrary electrostatic field the energy is given by
$\frac{\epsilon}{8\pi}\iiint E^2\,dv$. Let us consider two point charges $e_1$ and $e_2$
in a medium of dielectric constant $\epsilon$ separated by a distance $r_{12}$.
The force acting on each is given by Coulomb's law as
$$F = \frac{e_1e_2}{\epsilon r_{12}^2}$$
and the potential energy of the system by
$$V = \frac{e_1e_2}{\epsilon r_{12}} = \frac{1}{2}\left(e_1\frac{e_2}{\epsilon r_{12}} + e_2\frac{e_1}{\epsilon r_{12}}\right).$$
We have written this in two terms and notice that the first is
just the charge $e_1$ times the potential at the point where the charge
is due to the charge $e_2$. Similarly the second term is $e_2$ times the
potential at $e_2$ due to $e_1$. Thus we can write
$$V = \frac{1}{2}(e_1\varphi_1 + e_2\varphi_2)$$
where $\varphi_1$ and $\varphi_2$ are the potentials. In general for n charges we
have
$$V = \frac{1}{2}\sum_k e_k\varphi_k \qquad (2)$$
and if the charges are distributed in space instead of being point
charges, this becomes an integral
$$V = \frac{1}{2}\iiint\rho\varphi\,dv \qquad (3)$$
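where $\rho$ is the density of charge. Now, by Poisson's equation
we know that $\nabla^2\varphi = -4\pi\rho/\epsilon$, so that the integral can be written
$$V = -\frac{\epsilon}{8\pi}\iiint\varphi\,\nabla^2\varphi\,dv.$$
We now make use of Green's theorem in its first form
$$\iiint\psi\,\nabla^2\phi\,dv + \iiint\operatorname{grad}\psi\cdot\operatorname{grad}\phi\,dv = \iint\psi\,\operatorname{grad}_n\phi\,dS$$
where $\phi$ and $\psi$ are any two scalar quantities. Place $\psi = \phi = \varphi$
and this becomes
$$\iiint\varphi\,\nabla^2\varphi\,dv + \iiint E^2\,dv = \iint\varphi\,\operatorname{grad}_n\varphi\,dS$$
since $\mathbf{E} = -\operatorname{grad}\varphi$.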
Now since we integrate over all space, we must examine the
behavior of the surface integral as the surface (a sphere of radius
R, for example) gets larger and larger. The potential $\varphi$ varies
as 1/R for large R, $\operatorname{grad}_n\varphi$ as $1/R^2$ and dS is proportional to $R^2$,
so the whole surface integral vanishes as $R \to \infty$. Thus substituting
in our expression for V, we find
$$V = \frac{\epsilon}{8\pi}\iiint E^2\,dv \qquad (4)$$
which is the equation we set out to derive. From our derivation
it is easy to show that if $\epsilon$ is not constant
$$V = \frac{1}{8\pi}\iiint\mathbf{E}\cdot\mathbf{D}\,dv$$
where D is the electric displacement vector. This shows us the
origin of the name for D. If we think of D as an ordinary dis-
placement (per unit volume) of electricity, then the work done
per unit volume is the scalar product of the force times the dis-
placement. In an infinitesimal displacement dD, the work per
unit volume is proportional to
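$$\mathbf{E}\cdot d\mathbf{D} = \epsilon\,\mathbf{E}\cdot d\mathbf{E}$$
and for a finite displacement D we get something proportional to
$$\int\mathbf{E}\cdot d\mathbf{D} = \int\epsilon\,\mathbf{E}\cdot d\mathbf{E} = \frac{\epsilon E^2}{2} = \frac{\mathbf{E}\cdot\mathbf{D}}{2}.$$
Thus, except for the numerical factor $1/4\pi$, we have the potential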
energy per unit volume.
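The equivalence of the charge-potential form of the energy and the field form can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions: a charge q spread uniformly over a spherical surface of radius R in a medium of dielectric constant eps, so that the field vanishes inside and equals q/(eps r^2) outside. The function name, the cutoff radius, and the sample values are hypothetical choices made for the example.

```python
import math

def field_energy_outside_sphere(q, radius, eps=1.0, cutoff=1.0e6, steps=20000):
    """Midpoint-rule estimate of (eps/8 pi) * integral of E^2 dv for r > radius."""
    x0 = math.log(radius)
    dx = math.log(cutoff) / steps          # integrate in x = ln r up to r = cutoff * radius
    total = 0.0
    for j in range(steps):
        r = math.exp(x0 + (j + 0.5) * dx)
        field = q / (eps * r * r)                          # E = q / (eps r^2)
        density = eps * field * field / (8.0 * math.pi)    # eps E^2 / (8 pi)
        total += density * 4.0 * math.pi * r * r * (r * dx)   # shell volume, with dr = r dx
    return total

q, radius, eps = 3.0, 2.0, 1.5
from_field = field_energy_outside_sphere(q, radius, eps)
from_potential = 0.5 * q * (q / (eps * radius))   # (1/2) q phi, with phi = q / (eps R) on the surface
print(from_field, from_potential)
```

The small remaining difference comes from cutting the integral off at a finite radius, which is the numerical counterpart of the vanishing surface integral in the derivation.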
156. Energy in a Solenoid. — In a similar way, we may con-
sider the magnetic energy in a solenoid to reside in the magnetic
field within the coil. We have found earlier that the energy
in a solenoid of self-induction L, in which a current i was flowing,
was $\frac{1}{2}Li^2$. But now we can easily write this in terms of the field
H within the solenoid. We have seen that this field is $4\pi ni$,
where n is the number of turns of the coil per centimeter. The
coefficient of self-induction L for a coil is easily found. By definition,
it is the e.m.f. induced when there is unit time rate of change
of current through the coil. The e.m.f. per turn $= \frac{d}{dt}(B \times \text{cross-sectional area}) = \pi r^2\mu\,\frac{dH}{dt}$,
if r is the radius of the coil, $\mu$ the
permeability. Thus the e.m.f. for the whole N turns is $N\pi r^2\mu\,\frac{dH}{dt}$.
Since $H = 4\pi ni = \frac{4\pi Ni}{d}$, if N is the whole number of turns,
d the length, the e.m.f. is $N\pi r^2\mu\,\frac{4\pi N}{d}\frac{di}{dt}$, so that $L = \frac{4\pi^2N^2r^2\mu}{d}$.
Hence we have
$$\frac{1}{2}Li^2 = \frac{2\pi^2N^2r^2\mu}{d}\left(\frac{Hd}{4\pi N}\right)^2 = \frac{\mu H^2r^2d}{8} = \frac{\mu H^2(\pi r^2d)}{8\pi}.$$
Since $\pi r^2d$ is the volume, this indicates a volume density of magnetic
energy of $\mu H^2/8\pi$.
The proof that the total magnetic energy in a magnetostatic
field is $\frac{\mu}{8\pi}\iiint H^2\,dv$ or $\frac{1}{8\pi}\iiint\mathbf{H}\cdot\mathbf{B}\,dv$ is carried out in
exactly the same manner as the one for the electrostatic energy
given in the last paragraph.
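The solenoid result can be verified in the same spirit as the condenser. The following Python sketch uses the expressions of this section, L = 4 pi^2 N^2 r^2 mu / d and H = 4 pi N i / d, and compares one half of L i^2 with the density mu H^2 / 8 pi times the volume pi r^2 d. The helper name and the sample dimensions are assumptions for illustration only.

```python
import math

def solenoid_energy_two_ways(turns, radius, length, mu, current):
    """Magnetic energy of a long solenoid from the circuit and from the field (hypothetical helper)."""
    induction = 4.0 * math.pi**2 * turns**2 * radius**2 * mu / length   # L
    from_circuit = 0.5 * induction * current**2                         # (1/2) L i^2
    field = 4.0 * math.pi * turns * current / length                    # H = 4 pi N i / d
    volume = math.pi * radius**2 * length                               # pi r^2 d
    from_field = mu * field**2 / (8.0 * math.pi) * volume               # (mu H^2 / 8 pi) * volume
    return from_circuit, from_field

# Illustrative values: N = 500 turns, r = 1.5 cm., d = 20 cm., mu = 1, i = 0.2
u_circuit, u_field = solenoid_energy_two_ways(500, 1.5, 20.0, 1.0, 0.2)
print(u_circuit, u_field)
```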
157. Energy Density and Energy Flow. — The examples we
have considered suggest that in a combined electric and magnetic
field there should be a volume density $(1/8\pi)(\epsilon E^2 + \mu H^2)$ of
electromagnetic energy. As a matter of fact, it proves to be
quite possible to make this assumption, and to carry it out in a
logical way. One can regard the electromagnetic energy almost
as a fluid, having a certain density, flowing from place to place
in the field. Thus, there is a flow vector associated with it,
called Poynting's vector, which we shall show in the next section
to be equal to $(c/4\pi)(\mathbf{E}\times\mathbf{H})$. We shall prove that there is an
equation of continuity for the energy:
$$\operatorname{div}\left[\frac{c}{4\pi}(\mathbf{E}\times\mathbf{H})\right] + \frac{\partial}{\partial t}\left[\frac{1}{8\pi}(\epsilon E^2 + \mu H^2)\right] = 0. \qquad (5)$$
This is only true, however, in regions where electromagnetic
energy is not being produced. Of course, energy as a whole is
conserved, but there can easily be sources and sinks of electro-
magnetic energy. Thus batteries are sources, in which chemical
energy is converted into electrical energy, and resistances are
sinks, in which the electrical energy is converted into heat. We
imagine the field as being worked on by the battery, and as doing
work against the frictional resistance. Hence our whole relation
is that d/dt (electromagnetic energy) = rate of production of
energy from e.m.f. per unit volume — rate of dissipation of energy
into heat — div (energy flow). This equation, put in mathemati-
cal form, is Poynting's theorem.
158. Poynting's Theorem.— Let us compute the quantity
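$$\operatorname{div}\left[\frac{c}{4\pi}(\mathbf{E}\times\mathbf{H})\right] + \frac{\partial}{\partial t}\left[\frac{1}{8\pi}(\epsilon E^2 + \mu H^2)\right].$$
It can be shown in general that
$$\operatorname{div}(\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot\operatorname{curl}\mathbf{A} - \mathbf{A}\cdot\operatorname{curl}\mathbf{B}.$$
Also $\frac{\partial A^2}{\partial t} = 2\mathbf{A}\cdot\frac{\partial\mathbf{A}}{\partial t}$. Hence the expression is equal to
$$\frac{c}{4\pi}\left(\mathbf{H}\cdot\operatorname{curl}\mathbf{E} - \mathbf{E}\cdot\operatorname{curl}\mathbf{H} + \frac{\epsilon}{c}\mathbf{E}\cdot\frac{\partial\mathbf{E}}{\partial t} + \frac{\mu}{c}\mathbf{H}\cdot\frac{\partial\mathbf{H}}{\partial t}\right) = \frac{c}{4\pi}\left[\mathbf{H}\cdot\left(\operatorname{curl}\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}\right) - \mathbf{E}\cdot\left(\operatorname{curl}\mathbf{H} - \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t}\right)\right].$$
But by Maxwell's equations $\operatorname{curl}\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} = 0$, $\operatorname{curl}\mathbf{H} - \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t} = \frac{4\pi\mathbf{u}}{c}$,
so that the result is $-\mathbf{E}\cdot\mathbf{u}$. Hence Poynting's
theorem is
$$\operatorname{div}\left[\frac{c}{4\pi}(\mathbf{E}\times\mathbf{H})\right] + \frac{\partial}{\partial t}\left[\frac{1}{8\pi}(\epsilon E^2 + \mu H^2)\right] = -\mathbf{E}\cdot\mathbf{u}. \qquad (6)$$
From the analysis of the last section, we see that $-\mathbf{E}\cdot\mathbf{u}$ must
represent the total rate of production of electromagnetic energy
by e.m.f.s minus the rate of dissipation into heat. The latter is
simple: in regions where Ohm's law holds, $\mathbf{u} = \sigma\mathbf{E}$, so that here
we have the contribution $-\sigma E^2$ to the right side. The quantity
$\sigma E^2$ represents the ordinary dissipation of energy into heat. We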
must examine the other sort of term, the external e.m.f., rather
more carefully.
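The vector identity div (A X B) = B . curl A - A . curl B, on which the proof rests, can be spot-checked numerically. The sketch below is only a check at one point, not a proof: it differentiates two arbitrary smooth test fields by central differences. The fields, the evaluation point, and all helper names are assumptions introduced for the example.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def partial(field, point, axis, h=1.0e-5):
    """Central-difference derivative of a vector field with respect to one coordinate."""
    up, down = list(point), list(point)
    up[axis] += h
    down[axis] -= h
    return tuple((u - d) / (2.0 * h) for u, d in zip(field(*up), field(*down)))

def div(field, point):
    return sum(partial(field, point, axis)[axis] for axis in range(3))

def curl(field, point):
    dx, dy, dz = (partial(field, point, axis) for axis in range(3))
    return (dy[2] - dz[1], dz[0] - dx[2], dx[1] - dy[0])

# Two arbitrary smooth test fields (assumptions for illustration only).
def field_a(x, y, z):
    return (y * z, x * x, z * math.sin(x))

def field_b(x, y, z):
    return (z, x * y, y * y)

def a_cross_b(x, y, z):
    return cross(field_a(x, y, z), field_b(x, y, z))

point = (0.3, -0.7, 1.1)
lhs = div(a_cross_b, point)
rhs = dot(field_b(*point), curl(field_a, point)) - dot(field_a(*point), curl(field_b, point))
print(lhs, rhs)
```

The two sides should agree up to the small error of the finite-difference derivatives.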
159. The Nature of an E.M.F. — In a conductor carrying a
current, there will be a current u set up, equal to the total force
per unit charge, times $\sigma$. The force is ordinarily simply the
electrical force E. But sometimes there are other sorts of force
acting. For example, in a battery, the various concentrations of
electrolytes produce a definite pressure on the ions, forcing them
mechanically in one direction, and this force would not ordinarily
be considered as being electrical in nature. Inside a battery,
the electric field is actually opposite to the flow of current, pointing
from positive pole to negative, while the current flows from
negative to positive. But the additional force acting on the
charges counteracts the electric field, and does enough more so
that it can push the current through the internal resistance of the
battery. This latter part is already taken care of in computing
the work done by the resistance. The former part, just equal
and opposite to the E in the battery, is the force responsible for
the applied e.m.f. of the battery. Thus it is $-\mathbf{E}$ per unit charge.
And the rate of working of the force on unit charge is the force
times the velocity of the charge. We actually wish the rate of
working per unit volume, so that we must multiply by the charge
per unit volume. This is $\rho$, and its product with the velocity
is just the current density u, so that we have $-\mathbf{E}\cdot\mathbf{u}$ as the rate
of working of the e.m.f. on the electrical system. This is just
the contribution to the right side of Poynting's theorem which
we should get inside the batteries.
160. Examples of Poynting's Vector. — The conception of the
energy of the electromagnetic field as residing in the medium is a
very fundamental one, which has had great influence in the devel-
opment of the theory. Thus Maxwell thought of the medium
as resembling an elastic solid, the electrical energy representing
the potential energy of strain of the medium, the magnetic energy
the kinetic energy of motion. Such a definite view is no longer
held. Nevertheless, the energy is always believed to travel
through space. Thus, in a light wave, there is a certain energy
per unit volume, proportional to the square of the amplitude
(E or H). This energy travels along, and Poynting's vector is
the vector which measures the rate of flow, or the intensity of the
wave. We shall show that the vector actually points along the
ray of light, the direction of flow. If, for example, we have a
source of light, and we wish to find at what rate it is emitting
energy, we surround it by a closed surface, and integrate the
normal component of Poynting's vector over the surface. The
whole conception of energy being transported in the medium is
evidently quite fundamental to the electromagnetic theory of
light.
When we come back to charges and currents, however, it is a
little harder to see the significance of the energy in the medium.
For example, in a circuit consisting of a battery, and a wire con-
necting the plates, Poynting's vector indicates that the energy
flows out of the battery, through the space surrounding the wire,
and finally flows into the wire at the point where it will be trans-
formed into heat. This seems to have small physical significance.
In a moving electron, the situation is somewhat more reasonable.
Suppose that the electron at rest is to be represented by a sphere
of radius R, on the surface of which the charge is distributed.
Then the field will be $e/r^2$ at any point outside the sphere. The
total electrical energy is the volume integral of $\frac{1}{8\pi}\frac{e^2}{r^4}$ over all
space outside the sphere, or
$$\frac{1}{8\pi}\int_R^\infty\frac{e^2}{r^4}\,4\pi r^2\,dr = \frac{e^2}{2R}.$$
In the theory of the electron, it is this quantity which is inter-
preted as being the actual constitutive energy of the electron;
although a correction must be made of an additional energy
required to keep the sphere in equilibrium. Neglecting this
correction, we can compute the mass of the electron. For a
relation of Einstein says that a given energy has a mass, given
by the relation, energy $= mc^2$. Hence $mc^2 = e^2/2R$. Solving
for the radius, we have $R = e^2/2mc^2$, a familiar formula for the
radius of the electron. The correct formula, inserting the correction
we omitted, differs only by a small factor. Inserting the
correct values of $e = 4.774 \times 10^{-10}$ e.s.u., $m = 9.00 \times 10^{-28}$
gm., $c = 3 \times 10^{10}$ cm. per second, we have $R = 1.41 \times 10^{-13}$
cm. Now if this electron moves, it will have a magnetic field,
as a current would, and hence will have a certain magnetic energy.
Since the magnetic field is proportional to the velocity (or the
current), the magnetic energy is proportional to the square of the
velocity. This can be shown to be the kinetic energy. Further,
there will be a Poynting vector, pointing in general in the direc-
tion of travel of the electron, and representing the flow of energy
associated with the electron. All these relations prove on closer
examination to be more complicated than they seem at first sight;
but they lead to a consistent theory of the nature of the electron.
It should be stated, however, that this theory does not fit in with
the quantum theory, and that its correct form on the basis of
that theory is not known at present.
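The arithmetic behind the electron radius quoted in this section can be reproduced directly. The short Python sketch below uses only the numerical values given in the text, which are those current when the book was written. The constant and function names are hypothetical labels chosen for the example.

```python
E_CHARGE = 4.774e-10   # e.s.u., value quoted in the text
E_MASS = 9.00e-28      # gm., value quoted in the text
C_LIGHT = 3.0e10       # cm. per second, value quoted in the text

def electron_radius(charge, mass, c):
    """R = e^2 / (2 m c^2), from setting m c^2 equal to the field energy e^2 / 2R."""
    return charge**2 / (2.0 * mass * c**2)

radius = electron_radius(E_CHARGE, E_MASS, C_LIGHT)
rest_energy = E_MASS * C_LIGHT**2                # m c^2 in ergs
field_energy = E_CHARGE**2 / (2.0 * radius)      # e^2 / 2R in ergs
print(radius, rest_energy, field_energy)
```

With these inputs the radius comes out close to the 1.41 x 10^-13 cm. stated above, and the field energy reproduces mc^2 by construction.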
161. Energy in a Plane Wave. — Let us compute the flow of
energy in a plane wave of light. It was shown in the last chapter
and its problems that the potentials and fields satisfy a wave
equation of the form
$$\nabla^2\mathbf{E} - \frac{\epsilon\mu}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = 0, \qquad (7)$$
corresponding to propagation with the velocity v = c/n, where
$n = \sqrt{\epsilon\mu}$. Here n, the ratio of the velocity of light in empty
space to the velocity in the medium we are interested in, is
the index of refraction. It is easy to set up a plane wave solution
of the wave equation. Thus a wave of frequency $\nu$,
propagated in a direction whose direction cosines are f, g, and h,
is represented by
$$\mathbf{E} = \mathbf{E}_0\,e^{2\pi i\nu\left[t - \frac{n}{c}(fx + gy + hz)\right]}. \qquad (8)$$
$\mathbf{E}_0$ is a constant vector, measuring the amplitude of the wave.
The exponent is constant, representing constant phase, or a
wave front, when fx + gy + hz = (c/n)t = vt. Now fx + gy +
hz is the projection of the radius vector x, y, z on the direction
f, g, h, so that, as we see from Fig. 42, all points for which fx +
gy + hz is constant lie on a plane whose normal is f, g, h, and
whose distance from the origin is given by the constant. If this
constant is vt, the plane travels out with a velocity v, as a wave
front should. To have a wave of arbitrary phase, $\mathbf{E}_0$ would have
to be a complex vector. We can immediately show by substitution
that the wave as we have written it is a solution of the wave
equation. For instance, $\partial\mathbf{E}/\partial x = -(2\pi i\nu nf/c)\mathbf{E}$, and carrying
out the various differentiations and substitutions, and making
use of the relation $f^2 + g^2 + h^2 = 1$, the result follows at once.
Fig. 42. — Plane wave front AB, satisfying equation fx + gy + hz = constant = distance OB.
Having the form of the solutions for E and H, we may apply
Maxwell's equations. We note that the wave equations separate
E and H completely, but Maxwell's equations prescribe rela-
tions between them, so that actually Maxwell's equations are
more restrictive than the wave equation. First, we cannot hope
to satisfy the relations unless E and H both have the same
exponential factor, corresponding to the same frequency and
wave normal. Assuming this to be true, we can apply the equations
in succession. Let us first take div D = 0. This leads at
once to $-\frac{2\pi i\nu n\epsilon}{c}(fE_x + gE_y + hE_z) = 0$, showing that the scalar
product of unit vector along the wave normal, which we may
call $\mathbf{k}$, and E, is zero. In other words, E and D have no component
along the wave normal, or are in the plane of the wave front.
Similarly div B = 0 shows that B and H are in the plane of the
wave front. Next take the curl equations, beginning with
$\operatorname{curl}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t}$. This gives for its x component
$$-\frac{2\pi i\nu n}{c}(gH_z - hH_y) = \frac{\epsilon}{c}(2\pi i\nu)E_x,$$
which is the x component of
$$\mathbf{E} = -\frac{n}{\epsilon}(\mathbf{k}\times\mathbf{H}) = -\sqrt{\frac{\mu}{\epsilon}}\,(\mathbf{k}\times\mathbf{H}),$$
showing that E is at right angles both to H and the wave normal,
these three then forming a set of three orthogonal directions.
Further, since k and H are at right angles, the magnitude of E
equals $\sqrt{\mu/\epsilon}$ times the magnitude of H. The fourth equation
can be easily shown to lead to the same condition.
Now we find the energy density. It is evidently
$$\frac{1}{8\pi}(\epsilon E^2 + \mu H^2) = \frac{\epsilon E^2}{4\pi},$$
as we see from the relations between E and H. Setting
$E = E_0\cos 2\pi\nu\left[t - \frac{n}{c}(fx + gy + hz)\right]$, and squaring, we have
a quantity oscillating with time, but its time average, which
alone has physical significance, is $E_0^2/2$. Hence the mean energy
density is $\epsilon E_0^2/8\pi$. Next, Poynting's vector, being at right
angles to E and H, is along k, as it should be. Its magnitude is
$(c/4\pi)E \times \sqrt{\epsilon/\mu}\,E$, so that its mean is $(c/8\pi)\sqrt{\epsilon/\mu}\,E_0^2$, or
$c/\sqrt{\epsilon\mu}$ times the energy density. But this is the result we
should expect. This energy would be contained in a volume 1 sq.
cm. in cross section, and of length $v = c/\sqrt{\epsilon\mu}$ cm. But if the
light moves with a velocity v along the long axis of the volume,
this energy will cross the 1 sq. cm. in one second, so that it should
represent the flow vector, or Poynting's vector.
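The relation between mean energy flow and mean energy density can be checked by averaging over one period numerically. The Python sketch below assumes the plane-wave relation |H| = sqrt(eps/mu) |E| found above. The function name, the value taken for c, and the sample amplitude and constants are illustrative assumptions.

```python
import math

C_LIGHT = 2.998e10   # cm. per second (assumed value for the velocity of light)

def plane_wave_averages(e0, eps, mu, samples=1000):
    """Time averages of energy density and Poynting magnitude over one period."""
    ratio = math.sqrt(eps / mu)                   # |H| = sqrt(eps/mu) |E| in the plane wave
    density_sum = 0.0
    flow_sum = 0.0
    for j in range(samples):
        phase = 2.0 * math.pi * (j + 0.5) / samples
        e = e0 * math.cos(phase)
        h = ratio * e
        density_sum += (eps * e * e + mu * h * h) / (8.0 * math.pi)
        flow_sum += C_LIGHT / (4.0 * math.pi) * e * h
    return density_sum / samples, flow_sum / samples

eps, mu, e0 = 2.25, 1.0, 2.0
mean_density, mean_flow = plane_wave_averages(e0, eps, mu)
velocity = C_LIGHT / math.sqrt(eps * mu)          # v = c / sqrt(eps mu)
print(mean_density, mean_flow, mean_flow / mean_density, velocity)
```

The ratio of the two averages should reproduce the velocity of propagation, as the argument about the column of unit cross section requires.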
162. Plane Waves in Metals. — Let us consider the propagation
of a plane wave in a metallic conductor, where for simplicity
we shall take $\mu = 1$, $\rho = 0$, but $\mathbf{u} = \sigma\mathbf{E}$. Rather than satisfying
the wave equation first and then substituting in Maxwell's
equations, as we did in the preceding case, we shall vary the
procedure by assuming a wave with undetermined velocity,
and satisfying all four of Maxwell's equations (in the preceding
case only three of Maxwell's equations, and the wave equation,
were actually used, Maxwell's fourth equation being auto-
matically satisfied). Let us then assume that E and H are given
by expressions of the form
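$$\mathbf{E}_0\,e^{2\pi i\nu\left[t - \frac{\alpha}{c}(fx + gy + hz)\right]}, \qquad (9)$$
where $\alpha$ is to be determined. The divergence equations show as
before that E and H are both in the plane of the wave front.
The equation for curl E leads to $\alpha(\mathbf{k}\times\mathbf{E}) = \mathbf{H}$, showing as
before that E and H are at right angles to each other, and that
the magnitude of H is $\alpha$ times the magnitude of E. The equation
$\operatorname{curl}\mathbf{H} = \frac{\epsilon}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi\sigma}{c}\mathbf{E}$ gives a new condition,
$$-\frac{2\pi i\nu\alpha}{c}(\mathbf{k}\times\mathbf{H}) = \left(\frac{2\pi i\nu\epsilon}{c} + \frac{4\pi\sigma}{c}\right)\mathbf{E}.$$
This condition likewise shows that E and H are at right angles to
each other, but now gives the magnitude of H equal to $\frac{1}{\alpha}\left(\epsilon - \frac{2i\sigma}{\nu}\right)$
times the magnitude of E. These conditions are only consistent
if
$$\alpha = \frac{1}{\alpha}\left(\epsilon - \frac{2i\sigma}{\nu}\right), \qquad \alpha^2 = \epsilon - \frac{2i\sigma}{\nu}. \qquad (10)$$
We see, in other words, that $\alpha$, the quantity corresponding to
the index of refraction, is complex. Let us write $\alpha = n - ik$,
where n and k are real, so that, as we can easily see,
$$n^2 - k^2 = \epsilon, \qquad nk = \sigma/\nu,$$
and
$$n = \left[\tfrac{1}{2}\left(\sqrt{\epsilon^2 + 4\sigma^2/\nu^2} + \epsilon\right)\right]^{1/2}, \qquad (11)$$
$$k = \left[\tfrac{1}{2}\left(\sqrt{\epsilon^2 + 4\sigma^2/\nu^2} - \epsilon\right)\right]^{1/2}.$$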
To understand the meaning of n and k, we substitute in the original
expression for the plane wave, Eq. (9). This can be written
$$\mathbf{E}_0\,e^{-\frac{2\pi\nu k}{c}(fx + gy + hz)}\,e^{2\pi i\nu\left[t - \frac{n}{c}(fx + gy + hz)\right]}.$$
The second factor is just like an ordinary plane wave, with index
of refraction n, though since n depends on frequency, we find the
Maxwell theory predicting dispersion of electromagnetic waves
in metals. But the first factor, a pure exponential term decreasing
as fx + gy + hz increases, means that there is a decrease of
amplitude and energy as the wave travels along, or an absorption,
as we can easily see from an application of Poynting's theorem,
computing the Joule heating within the metal. For this reason
k is called the absorption coefficient.
We have found that the magnitude of H is $\alpha$, or n − ik, times
the magnitude of E. If we write the complex number n − ik
in the exponential form, we have
$$n - ik = \sqrt{n^2 + k^2}\,e^{-2\pi i\nu\delta},$$
where $\delta = \frac{1}{2\pi\nu}\tan^{-1}\frac{k}{n}$, and
$$|H| = E_0\sqrt{n^2 + k^2}\,e^{-\frac{2\pi\nu k}{c}(fx + gy + hz)}\,e^{2\pi i\nu\left[t - \frac{n}{c}(fx + gy + hz) - \delta\right]},$$
so that there is a phase difference between E and H in a conductor,
whereas in an insulator they are in phase. The details of the
calculation of electric and magnetic energies are left to a problem.
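The complex index of this section is easy to evaluate numerically. The Python sketch below takes the square root of eps - 2i sigma/nu with the standard complex-math library and compares the result with the closed forms of Eq. (11). The function names are hypothetical, and the values of sigma and nu are illustrative assumptions of the order of magnitude appropriate to a good conductor and to visible light, not data from the text.

```python
import cmath
import math

def metal_index(eps, sigma, nu):
    """Return (n, k) with alpha = n - i k and alpha**2 = eps - 2 i sigma / nu."""
    alpha = cmath.sqrt(complex(eps, -2.0 * sigma / nu))   # principal root has real part >= 0
    return alpha.real, -alpha.imag

def metal_index_closed_form(eps, sigma, nu):
    """Eq. (11): the same n and k written out with real square roots."""
    root = math.hypot(eps, 2.0 * sigma / nu)              # sqrt(eps^2 + 4 sigma^2 / nu^2)
    return math.sqrt(0.5 * (root + eps)), math.sqrt(0.5 * (root - eps))

eps, sigma, nu = 1.0, 5.0e17, 5.0e14      # sigma and nu in sec^-1; illustrative values only
n, k = metal_index(eps, sigma, nu)
n_cf, k_cf = metal_index_closed_form(eps, sigma, nu)
phase_lag = math.atan2(k, n)              # 2 pi nu delta, the phase by which H lags E
print(n, k, phase_lag)
```

When sigma/nu is large compared with eps, n and k come out nearly equal and the phase lag approaches 45 degrees.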
Problems
1. If the generation of heat per cubic centimeter in a conductor carrying
a current is $\sigma E^2$, prove that for a cylindrical conductor of resistance R,
carrying a current i, the rate of generation is $i^2R$.
2. Given a cylindrical wire carrying a current. Find the values of E and
H on the surface of the wire, computing Poynting's vector, and show that it
represents a flow of energy into the wire. Show that the amount flowing
into a given length of wire is just enough to supply the energy which appears
as heat in the length. Note that the surface of a wire carrying current is
not an equipotential so that there can be a component of electric field
parallel to it.
3. Prove div (A X B) = B • curl A - A • curl B.
4. The maximum electric field in a light wave is 0.1 volt per centimeter.
Find how much energy is transported by the beam across 1 sq. cm. per
second.
5. Given a 40-watt lamp, and suppose that all its energy is dissipated in
radiation of one wave length or another. Take a sphere of radius 1 m.
surrounding it, and suppose the radiation is of equal intensity in all direc-
tions. Find the maximum electric field in the radiation at this distance,
in volts per centimeter, and the maximum magnetic field in gauss. Find the
energy per cubic centimeter at this distance, in ergs per cubic centimeter.
6. Apply Poynting's theorem to the case of a plane wave traveling in a
conductor and show that the rate of dissipation of electromagnetic energy
just equals the Joule heating.
7. Calculate the electric and magnetic energies in a plane wave traveling
in a metal and show by direct comparison that they are different from each
other. What happens in the limiting cases $\sigma \to 0$ and $\sigma \to \infty$, i.e., insulators
and perfect conductors?
8. Investigate the behavior of n and k for a metal as functions of frequency,
drawing curves. Take $\epsilon = 1$, and take the conductivity of copper.
Note that the conductivity in electrostatic units has the dimensions of a
frequency, and find in what part of the spectrum this frequency lies. Show
that the value of $\epsilon$ is only significant when the frequency becomes greater
than $\sigma$.
9. The significance of $\sigma$ as a frequency is found from the relaxation time,
the time taken for a volume charge set up within a metal to die down to $1/e$th
of its original value. Derive this in the following manner. Set up the
equation of continuity for the current density u and charge density $\rho$. In
this, write u in terms of E by Ohm's law, and write the result in terms of $\rho$
by the relation $\epsilon\operatorname{div}\mathbf{E} = 4\pi\rho$. Solve the resulting differential equation
for $\rho$, showing that the solution is $\rho = \rho_0e^{-t/\tau}$, where $\tau$, the relaxation time,
is $\epsilon/4\pi\sigma$, so that $\sigma$ is, as far as its order of magnitude is concerned, the frequency
connected with the relaxation time.
CHAPTER XXIII
REFLECTION AND REFRACTION OF ELECTROMAGNETIC
WAVES
According to the electromagnetic theory of light, light con-
sists of electromagnetic waves, propagated according to Maxwell's
equations. We have already seen how we are led to the wave
equation for E and H, or for the potentials, and we have investi-
gated the plane wave solutions of these equations, showing that
E and H are at right angles to each other and to the direction of
propagation, the latter being the same as the direction of Poyn-
ting's vector, giving the energy flow. We shall now investigate
the electromagnetic theory of some simple optical phenomena,
beginning with reflection and refraction.
163. Boundary Conditions at a Surface of Discontinuity. —
We have seen in the last chapter the conditions that hold for a
wave in a refracting medium, whose index of refraction is con-
stant. In the problem of reflection and refraction at a boundary
between two media, however, the index changes suddenly from
one medium to the other, and we must investigate what happens
there. Let us assume that the boundary is a plane normal to the
z axis. Then we shall apply Maxwell's equations, in the inte-
grated form, to small regions containing the boundary. Thus
take a thin flat volume, its faces parallel to the boundary and
containing it. Let the area of the face be A. Apply to the
above the divergence theorem, $\operatorname{div}\mathbf{D} = 4\pi\rho$, or $\iiint\operatorname{div}\mathbf{D}\,dv = \iint D_n\,dS = 4\pi q$,
where q is the total charge within the volume.
The surface integral comes almost wholly from the flat faces; it
is $A(D_{n2} - D_{n1})$, if $D_2$ is the value of D in the upper medium, $D_1$
in the lower. If now the surface is uncharged, q gets smaller
and smaller as the volume becomes thinner, so that in the limit
$A(D_{n2} - D_{n1}) = 0$, or $D_{n2} = D_{n1}$. That is, the normal component
of D is continuous at an uncharged surface.
Next let us apply the curl equations, to contours of the follow-
ing sort: infinitesimal contours of long thin shape, in which one
long side is in one medium, the other in the other, parallel to
the surface, and the parts of the contour which cross over from
one medium to the other are of negligible length compared with
the long sides. Consider $\operatorname{curl}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t} + \frac{4\pi\mathbf{u}}{c}$, or integrated,
$\oint H_s\,ds = \iint\left(\frac{1}{c}\frac{\partial D_n}{\partial t} + \frac{4\pi u_n}{c}\right)dS$. If there is no surface current,
D and u are finite vectors, so that as the contour gets narrower
and narrower, and the area smaller and smaller, the right side
of this equation will vanish. The left side approaches $(H_{s2} - H_{s1})L$,
where L is the length of the contour, $H_{s1}$ and $H_{s2}$ are
the tangential components of H in the media 1 and 2, respectively.
Thus finally we have $H_{s2} = H_{s1}$, or the tangential component
of H is continuous. Similarly we show that the tangential
component of E is continuous.
Now we can see how to solve a problem involving two media
separated by a plane surface, as air and glass. In one medium,
we assume a plane wave approaching the boundary. But it
must stop at the boundary, for the same plane wave, with the
same wave length, would not be a solution of the problem for the
second medium. There must be some wave in the second
medium, however, for otherwise the boundary conditions could
not be satisfied. Thus we are led to the existence of the refracted
ray. As a matter of fact, we find that we cannot satisfy the
boundary conditions without an incident, refracted, and also
a reflected ray. By using all these, with proper relations between
direction, amplitude, etc., we can actually satisfy the boundary
conditions at the surface of separation of two media.
164. The Laws of Reflection and Refraction. — Assume a plane
wave in the first medium, striking the surface of separation.
This wave will have the form $e^{2\pi i\nu\left(t - \frac{lx + my + nz}{v}\right)}$. Let the surface
of separation be given by z = 0, the xy plane. Further let
the axis be so chosen that the wave normal is in the xz plane,
as in Fig. 43, so that m = 0. Then at points of the surface of
separation the disturbance is given by $e^{2\pi i\nu\left(t - \frac{lx}{v}\right)}$. It is this
disturbance which, taken together with the corresponding
expressions from the reflected and refracted waves, must satisfy
certain boundary conditions.
Next we consider a possible refracted wave. It will be in
general of the form $e^{2\pi i\nu'\left(t - \frac{l'x + m'y + n'z}{v'}\right)}$, so that in the surface
of separation it will reduce to the value of this with z = 0.
The boundary conditions must be satisfied for all values of x, y,
and t, and yet we have only one constant at our disposal, an
amplitude, in addition to the frequency and direction. It is
obvious that the only possibility of satisfying the conditions
will come if we make $\nu' = \nu$, $l'/v' = l/v$, $m' = 0$. For then we
shall have just the same function of x, y, and t for both incident
and refracted waves, at all points of the boundary. First,
then, the refracted wave must have the same frequency as the
incident one. Next, if the incident wave normal is in the xz
plane, this must also be true of the refracted wave. Finally,
there is a relation between the angle of incidence and the angle
of refraction. We have l = cosine of the angle between the
wave normal and the x axis = sine of the angle between the
wave normal and the normal to the surface = sin i, where i is
the angle of incidence. Similarly, l' = sin r, where r is the angle
of refraction. Thus we have
$\frac{\sin i}{\sin r} = \frac{v}{v'}$ = index of refraction of the second medium with
respect to the first. In other words, we have the ordinary law of
refraction, as a necessary consequence of the boundary conditions.
Fig. 43. — Law of refraction.
Similarly, for the reflected wave, moving in the first medium,
we see that m must be equal to zero, and l equal to the value
for the incident wave, showing that the angle of reflection equals
the angle of incidence. Now the reflected wave must be different
from the incident wave, and to do this we must have the n for
the reflected wave the negative of the value for the incident
one, showing that the reflected wave travels away from the
surface rather than towards it.
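The law of refraction obtained above, sin i / sin r = v / v', translates directly into a small calculation. The Python sketch below is an illustration only. The function name is hypothetical, and the choice of returning None when no real angle of refraction exists is a convention of the example, not something discussed in this section.

```python
import math

def refraction_angle(incidence_deg, v_first, v_second):
    """Angle of refraction r, in degrees, from sin i / sin r = v / v'."""
    sin_r = math.sin(math.radians(incidence_deg)) * v_second / v_first
    if abs(sin_r) > 1.0:
        return None                      # no real refracted direction for these inputs
    return math.degrees(math.asin(sin_r))

# Illustrative case: first medium of index 1, second of index 1.5, so v / v' = 1.5.
c = 3.0e10
r_deg = refraction_angle(30.0, c, c / 1.5)
print(r_deg)
```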
165. Reflection Coefficient at Normal Incidence. — After prov-
ing the laws of reflection and refraction, we still have much
more to do to apply the boundary conditions. For we must
compute the values of the various vectors at the surface, and
actually satisfy the conditions. Let us take first the simple
case of normal incidence, where l = 0, and all waves travel
along the z axis. Let us suppose that in the incident beam we
have E along the x axis, H along y. For simplicity we assume
the first medium to have the index of refraction unity, the second
the index $n = \sqrt{\epsilon}$. Then in the refracted wave we assume
that E is along the x axis, H along y, and that the value of E
is E', so that H' = nE'. In the reflected wave, assume that
E has a changed phase, H not, so that E is along −x, H along y,
and each numerically equal to E''. The change of phase of one
vector and not the other is necessary to reverse the direction
of Poynting's vector.
Now we may apply the boundary conditions. All normal
components are zero, so that these conditions are automatically
satisfied. For the tangential component of E, we have E − E'' = E';
for the tangential component of H, H + H'' = H'.
The latter is then E + E'' = nE'. Combining the two, we have
at once $E' = \frac{2E}{n + 1}$ (by adding), and $E'' = \frac{E'(n - 1)}{2}$ (by subtracting),
leading to $\frac{E''}{E} = \frac{n - 1}{n + 1}$. This gives us directly the
reflection coefficient at normal incidence. The ratio of reflected
to incident intensity is proportional to the ratio of the squares of
the amplitudes, or $\frac{(n - 1)^2}{(n + 1)^2}$. This shows that the reflected
intensity is never so great as the incident, but that the ratio
approaches closer and closer to unity as n becomes larger. It
is interesting to compute the reflection coefficient for familiar
substances. For instance, for glass, n is about 1.5, so that the
coefficient is (0.5/2.5) 2 = 1/25, showing that only a few per
cent of the intensity is reflected from a glass plate at normal
incidence.
We can check the energy relations: the amount of energy
brought to the surface per unit time in the incident wave should
equal the amount carried away in the refracted and reflected
waves. The first is $\frac{c}{4\pi}(\mathbf{E}\times\mathbf{H})$, whose magnitude is $\frac{c}{4\pi}E^2$. The
reflected energy is $\frac{c}{4\pi}\frac{(n - 1)^2}{(n + 1)^2}E^2$. The refracted intensity is
$\frac{c}{4\pi}(\mathbf{E}'\times\mathbf{H}') = \frac{c}{4\pi}nE'^2 = \frac{c}{4\pi}\frac{4n}{(n + 1)^2}E^2$. The sum of the
refracted and reflected intensities is
$$\frac{c}{4\pi}\left[\frac{(n - 1)^2 + 4n}{(n + 1)^2}\right]E^2 = \frac{c}{4\pi}E^2,$$
equal to the incident intensity.
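The reflection coefficient and the energy balance at normal incidence can be tabulated for a few indices. The Python sketch below evaluates the reflected fraction (n - 1)^2 / (n + 1)^2 and the refracted fraction 4n / (n + 1)^2 of the incident intensity, with the first medium taken to have index unity as in the text. The function name and the list of sample indices are assumptions for illustration.

```python
def normal_incidence_fractions(n):
    """Fractions of the incident intensity reflected and refracted at normal incidence."""
    reflected = (n - 1.0)**2 / (n + 1.0)**2
    refracted = 4.0 * n / (n + 1.0)**2
    return reflected, refracted

glass_reflected, glass_refracted = normal_incidence_fractions(1.5)
print(glass_reflected, glass_refracted)

# The two fractions add up to unity for any index (sample indices chosen arbitrarily).
for index in (1.0, 1.33, 1.5, 2.4, 10.0):
    r_frac, t_frac = normal_incidence_fractions(index)
    print(index, r_frac, t_frac, r_frac + t_frac)
```

For glass with n about 1.5 the reflected fraction is 1/25, in agreement with the figure worked out above.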
166. Fresnel's Equations. — Now we pass to Fresnel's equations,
the extension of the last section to an arbitrary angle of incidence.
Here, for the first time, we meet the question of polarization.
The vector E is at right angles to the direction of propagation,
but that does not fix the direction uniquely, and it is said that
the wave is polarized in a particular direction if its electric vector
points in that direction. Let us then consider the two extreme
cases. We take the wave normal of the incident wave to be in
the xz plane, as before. Then we consider the case where the
electric vector is along the y axis, and the case where it is in the
xz plane, as in Fig. 44.

Fig. 44. Vectors in reflection and refraction.
Case 1. y axis points down into the paper. E and E' point down, E'' points up.
Case 2. H, H', H'' all point down.

Case 1. Electric vector along the y axis. All vectors depend
on space in the following way, rewriting l, m, n in terms of the
angles of incidence and refraction: for the incident wave,
$$e^{2\pi i\nu\left(t - \frac{x \sin i + z \cos i}{c}\right)} \qquad (1)$$
for the refracted wave (v being the velocity in the second medium),
$$e^{2\pi i\nu\left(t - \frac{x \sin r + z \cos r}{v}\right)} \qquad (2)$$
for the reflected wave,
$$e^{2\pi i\nu\left(t - \frac{x \sin i - z \cos i}{c}\right)} \qquad (3)$$
We take E and E' to be along the y axis. Then H is in the xz
plane, at right angles to the wave normal. That is, for the
incident wave, $H_x = -E \cos i$, $H_z = E \sin i$. Similarly, in
the refracted wave, $H_x' = -nE' \cos r$, $H_z' = nE' \sin r$, and for
the reflected ray $H_x'' = -E'' \cos i$, $H_z'' = -E'' \sin i$. Hence,
we have the following relations:
Normal component of D: nothing, since D is tangential.
Normal component of B: $E \sin i - E'' \sin i = nE' \sin r$.
Tangential component of E: $E - E'' = E'$.
Tangential component of H: $-E \cos i - E'' \cos i = -nE' \cos r$.
Remembering that $\frac{\sin i}{\sin r} = n$, the first two equations reduce to
the same equation, $E - E'' = E'$. The last is
$$E + E'' = \frac{n \cos r}{\cos i}E' = \frac{\tan i}{\tan r}E'.$$
From this at once, multiplying the first
by $\frac{\tan i}{\tan r}$ and subtracting, we have
$$E\left(\frac{\tan i}{\tan r} - 1\right) = E''\left(\frac{\tan i}{\tan r} + 1\right),$$
$$\frac{E''}{E} = \frac{\tan i - \tan r}{\tan i + \tan r} = \frac{\sin i \cos r - \cos i \sin r}{\sin i \cos r + \cos i \sin r},$$
$$\frac{E''}{E} = \frac{\sin (i - r)}{\sin (i + r)}. \qquad (4)$$
This gives the amplitude of the reflected wave, and is one of
Fresnel's equations. We note that as i and r become zero, the
law of refraction becomes $i/r = n$, $i = nr$. Thus in the limit
of normal incidence, the ratio approaches $(nr - r)/(nr + r) =
(n - 1)/(n + 1)$, as we found above. We also note, in the
other extreme of tangential or grazing incidence, that i = 90
deg., so that the ratio is $\frac{\sin (90^\circ - r)}{\sin (90^\circ + r)} = 1$. That is, the
reflection coefficient equals unity for grazing incidence. The
formula gives a gradual increase of amplitude as the angle of
incidence increases.
Case 2. Electric vector in the xz plane. Let H be along the
y axis in all the waves: $H_y = E$, $H_y' = nE'$, $H_y'' = E''$. Then
we take $E_x = E \cos i$, $E_z = -E \sin i$, $E_x' = E' \cos r$, $E_z' = -E' \sin r$,
$E_x'' = -E'' \cos i$, $E_z'' = -E'' \sin i$. Then we have:
Normal component of D: $-E \sin i - E'' \sin i = -n^2 E' \sin r$.
Normal component of B: nothing.
Tangential component of E: $E \cos i - E'' \cos i = E' \cos r$.
Tangential component of H: $E + E'' = nE'$.
Using the law of refraction, the first and last are the same,
$E + E'' = nE'$. The other is $E - E'' = E'\frac{\cos r}{\cos i}$. Multiplying
the first by $\frac{\cos r}{\cos i}$, the second by $n = \frac{\sin i}{\sin r}$, and subtracting, we
have
$$E\left(\frac{\cos r}{\cos i} - \frac{\sin i}{\sin r}\right) = -E''\left(\frac{\cos r}{\cos i} + \frac{\sin i}{\sin r}\right),$$
or
$$\frac{E''}{E} = -\frac{\cos r \sin r - \cos i \sin i}{\cos r \sin r + \cos i \sin i}.$$
Now we see at once that
$$\sin (i \pm r) \cos (i \mp r) = (\sin i \cos r \pm \cos i \sin r)(\cos i \cos r \pm \sin i \sin r)$$
$$= \sin i \cos i (\cos^2 r + \sin^2 r) \pm \sin r \cos r (\sin^2 i + \cos^2 i) = \sin i \cos i \pm \sin r \cos r.$$
Hence we have
$$\frac{E''}{E} = \frac{\sin (i - r) \cos (i + r)}{\sin (i + r) \cos (i - r)} = \frac{\tan (i - r)}{\tan (i + r)}. \qquad (5)$$
This is the other of Fresnel's equations.
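The two Fresnel amplitude ratios can be tabulated as functions of the angle of incidence. The sketch below is an illustration only; the function names are hypothetical, angles are in radians, and the limit of normal incidence, where both formulas become 0/0, is handled separately using the value (n - 1)/(n + 1) found earlier.

```python
import math


def refraction_angle(i, n):
    """Angle of refraction r from the law of refraction sin i = n sin r."""
    return math.asin(math.sin(i) / n)


def fresnel_case1(i, n):
    """E''/E for the electric vector along y: sin(i - r) / sin(i + r)."""
    if i == 0.0:
        return (n - 1.0) / (n + 1.0)   # limit of normal incidence
    r = refraction_angle(i, n)
    return math.sin(i - r) / math.sin(i + r)


def fresnel_case2(i, n):
    """E''/E for the electric vector in the xz plane: tan(i - r) / tan(i + r)."""
    if i == 0.0:
        return (n - 1.0) / (n + 1.0)   # limit of normal incidence
    r = refraction_angle(i, n)
    return math.tan(i - r) / math.tan(i + r)


n = 1.5  # glass
for degrees in (0, 15, 30, 45, 60, 75, 90):
    i = math.radians(degrees)
    print(f"{degrees:3d} deg   case 1: {fresnel_case1(i, n):+.4f}"
          f"   case 2: {fresnel_case2(i, n):+.4f}")
```

The reflected intensities are the squares of these ratios; both tend to unity in magnitude at grazing incidence.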
167. The Polarizing Angle. — In Case 2 of Sec. 166, where the
electric vector is in the plane of incidence, or the xz plane, we
notice an interesting fact. If i + r = 90 deg., a perfectly possible
situation, we have $\tan (i + r) = \infty$, so that $E''/E = 0$.
That is, the amount of reflected light, at this angle, is zero.
There is no such situation for the other sort of polarization. Suppose,
then, that we take an unpolarized beam, such as would be
emitted by any ordinary source, and reflect it from a mirror at
this angle, called the polarizing angle. The reflected light will
consist entirely of the light polarized with the electric vector
at right angles to the plane of incidence. It was by this phe-
nomenon that polarized light was first discovered. Light was
reflected from one mirror at this angle. Then its polarization
was found by reflecting from a second mirror at the same angle.
As the second mirror was rotated about the beam as an axis,
so that the polarization changed from being at right angles to
the plane of incidence to being in the plane, the doubly reflected
beam changed from a maximum intensity to zero.
The polarizing angle i' is fixed by i' + r' = 90 deg., and this
occurs when cos i' = sin r'. Using the law of refraction,
$\sin i' = n \sin r' = n \cos i'$, we find
tan i' = n, thus fixing the definite angle i'. For glass, with n about
1.5, the angle of polarization is about 56 deg.
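As a quick numerical check of tan i' = n, the sketch below computes the polarizing angle for glass and confirms that the refracted ray is then at right angles to the reflected one. The function name is hypothetical and the value n = 1.5 is the one used in the text.

```python
import math


def polarizing_angle(n):
    """Polarizing angle i' in radians, from tan i' = n."""
    return math.atan(n)


n = 1.5
i_p = polarizing_angle(n)
r_p = math.asin(math.sin(i_p) / n)   # law of refraction

print(f"polarizing angle i' = {math.degrees(i_p):.2f} deg")
print(f"refraction angle r' = {math.degrees(r_p):.2f} deg")
print(f"i' + r'             = {math.degrees(i_p + r_p):.2f} deg")
```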
168. Total Reflection. — For light passing from a dense medium
with index of refraction n to a vacuum of index 1, the law of
refraction is n sin i = sin r. For the angle of incidence given by
sin i = 1/n, we have sin r = 1, r = 90 deg., and the refracted
ray emerges at grazing incidence. For larger angles of incidence,
sin r is greater than 1, and there is no real angle r. Physically
we know that at these angles, greater than the critical angle,
there is total reflection, with no transmitted beam. We can
easily investigate the situation mathematically.
In the first place, let us consider the disturbance in the second
medium, for we find there is a disturbance, even though no transmitted
beam is observed. This is given by an exponential
$$e^{2\pi i\nu\left(t - \frac{x \sin r + z \cos r}{c}\right)},$$
where we remember that the second medium has index 1, velocity
c. But $\cos r = \pm\sqrt{1 - \sin^2 r} = \pm\sqrt{-1}\,\sqrt{n^2 \sin^2 i - 1}$, a
pure imaginary. Thus the exponential becomes
$$e^{2\pi i\nu\left(t - \frac{x \sin r}{c}\right)}\, e^{-\frac{2\pi\nu z}{c}\sqrt{n^2 \sin^2 i - 1}},$$
where we have used the negative square root. The first factor
represents a wave propagated along the x axis, or parallel to the
surface of the medium, with an apparent velocity c/sin r, a value
less than c. The second factor indicates that the amplitude of
this wave is damped out as z increases, or as we go away from the
surface, so that the wave fronts (surfaces of constant phase)
are at right angles to the surfaces of constant amplitude. This
disturbance ordinarily damps out in a very short distance. Thus
if $n^2 \sin^2 i$ is decidedly greater than 1, the exponential becomes
small when z is a few wave lengths ($\nu z/c$ a reasonably large number).
Consequently the disturbance is not observed. It is
easily shown that Poynting's vector for this wave has no com-
ponent normal to the surface, so that it does not carry any energy
away.
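To see how quickly the disturbance in the second medium dies out, one can evaluate the damping factor with z measured in vacuum wave lengths c/ν. This is a minimal sketch under that assumption; the function name and the sample values (n = 1.5, i = 60 deg.) are chosen only for illustration.

```python
import math


def evanescent_amplitude(z_over_wavelength, n, i):
    """Damping factor exp(-2 pi (z / wavelength) sqrt(n^2 sin^2 i - 1)).

    z_over_wavelength is the depth z in units of the wave length c / nu,
    n the index of the dense medium, i the angle of incidence in radians.
    """
    s2 = (n * math.sin(i)) ** 2 - 1.0
    if s2 <= 0.0:
        raise ValueError("angle of incidence is not beyond the critical angle")
    return math.exp(-2.0 * math.pi * z_over_wavelength * math.sqrt(s2))


n = 1.5
i = math.radians(60.0)
for depth in (0.0, 0.5, 1.0, 2.0):
    print(f"z = {depth:.1f} wave lengths: amplitude factor = "
          f"{evanescent_amplitude(depth, n, i):.2e}")
```

With these values the amplitude has already fallen below one per cent of its surface value at a depth of one wave length.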
The reflected wave may be treated by Fresnel's equations.
Thus, in Case 1 we have
$$\frac{E''}{E} = \frac{\cos i \sin r - \sin i \cos r}{\cos i \sin r + \sin i \cos r} = \frac{a - ib}{a + ib},$$
where $a = \cos i \sin r$, $b = -\sin i\sqrt{n^2 \sin^2 i - 1}$. This ratio
can now be written as $e^{-2i \tan^{-1}(b/a)}$, so that $E''$ and $E$ are of the
same magnitude, showing that all the light is reflected, but they
differ in phase. We may write
$$\frac{E''}{E} = e^{i\delta_1},$$
where
$$\tan \frac{\delta_1}{2} = \frac{\sqrt{\sin^2 i - 1/n^2}}{\cos i}.$$
Similarly for Case 2 we have
$$\frac{E''}{E} = \frac{\cos i \sin i - \cos r \sin r}{\cos i \sin i + \cos r \sin r} = \frac{c - id}{c + id},$$
where $c = \cos i \sin i$, $d = -\sin r\sqrt{n^2 \sin^2 i - 1}$. Again all the
light is reflected, but with a change of phase $\delta_2$ given by
$$\frac{E''}{E} = e^{i\delta_2},$$
where
$$\tan \frac{\delta_2}{2} = \frac{n^2\sqrt{\sin^2 i - 1/n^2}}{\cos i}.$$
Thus, in the general case, where E has components both in the
xz plane and along the y axis, there is a difference of phase between
these components upon total reflection, and linearly polarized
light in general will become elliptically polarized upon total
reflection. To see this, we note that two vibrations at right
angles, with the same frequency and phase, produce a resultant
vector whose extremity moves in a line (plane polarization), but
if the two components are in different phases the extremity of the
vector traces out an ellipse. If the phases differ by 90 deg.,
and the amplitudes are equal, the polarization is circular.
It follows from our expressions for $\delta_1$ and $\delta_2$ that the difference
between these phase angles, which we denote by $\delta$, is given by
the relation
$$\tan \frac{\delta}{2} = \frac{\cos i\sqrt{\sin^2 i - 1/n^2}}{\sin^2 i}.$$
Only at grazing incidence, $i = \pi/2$, or at the critical angle itself,
where $\sin i = 1/n$, does $\delta = 0$, so
that our above remarks hold valid except in these cases. It is
clear that by causing an elliptically polarized beam to be totally
reflected at the correct angle, it can be transformed into a beam
of linearly polarized light.
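The relation between the two phase changes and their difference can be verified numerically. The sketch below evaluates the half-angle tangent formulas for the two polarizations and compares the difference of the resulting phases with the closed expression for tan δ/2; the function names and the sample angle are illustrative assumptions.

```python
import math


def total_reflection_phases(i, n):
    """Phase changes (delta_1, delta_2) on total reflection, in radians.

    i is the angle of incidence inside the dense medium of index n and
    must exceed the critical angle, sin i > 1/n.
    """
    root = math.sqrt(math.sin(i) ** 2 - 1.0 / n ** 2)
    delta1 = 2.0 * math.atan(root / math.cos(i))
    delta2 = 2.0 * math.atan(n ** 2 * root / math.cos(i))
    return delta1, delta2


def phase_difference(i, n):
    """delta from tan(delta/2) = cos i sqrt(sin^2 i - 1/n^2) / sin^2 i."""
    root = math.sqrt(math.sin(i) ** 2 - 1.0 / n ** 2)
    return 2.0 * math.atan(math.cos(i) * root / math.sin(i) ** 2)


n = 1.5
i = math.radians(50.0)   # beyond the critical angle of about 41.8 deg
d1, d2 = total_reflection_phases(i, n)
print(f"delta_1           = {math.degrees(d1):.3f} deg")
print(f"delta_2           = {math.degrees(d2):.3f} deg")
print(f"delta_2 - delta_1 = {math.degrees(d2 - d1):.3f} deg")
print(f"closed formula    = {math.degrees(phase_difference(i, n)):.3f} deg")
```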
169. The Optical Behavior of Metals. — We shall now examine
the law of reflection for light falling on metals, restricting the
discussion to the case of normal incidence. In the last chapter
we have already shown that in the case of metals we must
introduce a "complex index of refraction," n' = n — ik, where k
is the extinction coefficient, and in so doing we retain the identical
form of the relations which we have been using in this chapter.
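We have already found ($\mu = 1$)
$$n = \sqrt{\tfrac{1}{2}\left(\sqrt{\epsilon^2 + 4\sigma^2/\nu^2} + \epsilon\right)},$$
$$k = \sqrt{\tfrac{1}{2}\left(\sqrt{\epsilon^2 + 4\sigma^2/\nu^2} - \epsilon\right)},$$
where $\sigma$ is the conductivity, $\nu$ the frequency, and $\epsilon$ the dielectric
constant. $\epsilon$ is unknown for metals, but since $\sigma$ for metals is
so large (in e.s.u. $\sigma \approx 10^{18}$), we can neglect $\epsilon$ at least for light of
sufficiently long period. Thus we find
$$n = k = \sqrt{\sigma/\nu} \qquad (7)$$
relations first found by Drude. For the infra-red, $\sqrt{\sigma/\nu} \gg 1$.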
We may still use Fresnel's equations as we have done for total
reflection. For normal incidence these are simply
$$\frac{E''}{E} = \frac{n' - 1}{n' + 1},$$
and we must insert a complex value of n' for reflection from
metals. Thus we have
$$E'' = E\,\frac{n - 1 - ik}{n + 1 - ik},$$
and taking the square of the absolute value, we find for the ratio of the reflected
to the incident intensities,
$$R = \frac{n^2 + k^2 - 2n + 1}{n^2 + k^2 + 2n + 1} = \frac{(n - 1)^2 + k^2}{(n + 1)^2 + k^2}. \qquad (8)$$
R is known as the reflective power of the metal. Since n = k,
we may write
$$R = 1 - \frac{4n}{2n^2 + 2n + 1},$$
and since $n = \sqrt{\sigma/\nu} \gg 1$, this becomes
$$R = 1 - \frac{2}{n} = 1 - \frac{2}{k},$$
or
$$R = 1 - \frac{2}{\sqrt{\sigma/\nu}}. \qquad (9)$$
This relation holds experimentally in the far infra-red, down to
$\lambda \approx 5\mu$. The reflective power varies with the color of the incident
light, and colors which are strongly absorbed are also
strongly reflected.
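The size of the correction 1 - R in the infra-red is easy to estimate. The sketch below compares the exact expression with n = k against the approximation for large n, using the conductivity of copper quoted in the problems that follow; the function names are hypothetical and the value of the velocity of light, 2.998 × 10^10 cm per second, is supplied here as an assumption.

```python
import math

C_LIGHT = 2.998e10  # velocity of light in cm/sec (supplied, not from the text)


def reflective_power_exact(sigma, nu):
    """R = ((n-1)^2 + k^2) / ((n+1)^2 + k^2) with n = k = sqrt(sigma/nu)."""
    n = math.sqrt(sigma / nu)
    return ((n - 1.0) ** 2 + n ** 2) / ((n + 1.0) ** 2 + n ** 2)


def reflective_power_approx(sigma, nu):
    """R = 1 - 2 / sqrt(sigma/nu), valid when sqrt(sigma/nu) >> 1."""
    return 1.0 - 2.0 / math.sqrt(sigma / nu)


sigma_copper = 5.0e17  # e.s.u., per second
for wavelength_microns in (12.0, 25.5):
    nu = C_LIGHT / (wavelength_microns * 1.0e-4)   # wave length in cm
    exact = 1.0 - reflective_power_exact(sigma_copper, nu)
    approx = 1.0 - reflective_power_approx(sigma_copper, nu)
    print(f"{wavelength_microns:5.1f} microns: 1 - R = {100 * exact:.2f} per cent "
          f"(exact), {100 * approx:.2f} per cent (approximate)")
```

The two expressions differ only in the second significant figure at these wave lengths, which is the sense in which the approximation is adequate for the far infra-red.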
Problems
1. Light is reflected from glass of index of refraction 1.5. Compute and
plot curves for the reflected intensity as a function of angle, for both sorts of
plane polarization.
2. Find the intensity of light in the refracted medium, for arbitrary angle
of incidence and both types of polarization. Show that the amount of
energy striking the surface is just equal to the amount carried away from it.
Note that the amount striking the surface is computed, not from the whole
of Poynting's vector, but from its normal component.
3. Show that the reflection coefficient from glass to air at normal incidence
is the same as for air to glass, but that the phases of the reflected beams are
opposite.
4. Light passes normally through a glass plate. Find the weakening in
intensity on account of the reflection at the faces.
5. Ten plates of glass of index 1.5 are placed together and used as a
polarizer. Light strikes the plates at the polarizing angle, and the transmitted
light is used. Since all the reflected light is of one polarization, and
the reflections at both surfaces of all plates are enough to remove practically
all of the light of this polarization, the transmitted light will be practically
polarized in the other direction. Find the intensity of both sorts of light
in the transmitted beam, assuming initially unpolarized light, and hence
show how much polarization is introduced. You may have to consider
multiple internal reflection.
6. Derive the expressions for $\tan \delta_1/2$ and $\tan \delta_2/2$ in the paragraph on
total reflection.
7. Derive the formulas for the phase difference $\delta$ of the two reflected
components of E in the case of total reflection.
8. The conductivity of copper in e.s.u. is $5 \times 10^{17}$ per second. Calculate
the reflective power of copper for wave lengths of light $\lambda = 12\mu$ and $\lambda =
25.5\mu$. The observed values of 1 - R are 1.6 per cent and 1.17 per cent at
these wave lengths.
9. Consider light linearly polarized so that the incident electric vector
has equal components in the plane of the wave normal and the perpendicular
thereto. If this light falls on a metal, using Fresnel's equations find the
ratio of the reflected components of E. If this ratio is written as $\rho e^{i\delta}$, show
that
$$\frac{1 - \rho e^{i\delta}}{1 + \rho e^{i\delta}} = \frac{\sin i \tan i}{\sqrt{n'^2 - \sin^2 i}}$$
where i is the angle of incidence and n' the complex index of refraction of
the metal.
CHAPTER XXIV
ELECTRON THEORY AND DISPERSION
Maxwell's theory and Maxwell's equations are based on the
assumption of dielectrics with dielectric constant $\epsilon$, magnetic
substances with permeability $\mu$, conductors with conductivity $\sigma$.
These assumptions are unsatisfactory for two reasons. First,
cases are known, and in fact are usual rather than exceptional,
in which the three constants mentioned are not really constants.
Thus the permeability of iron depends on the field strength.
The dielectric constant of almost all substances depends on the
frequency; as we have seen, the index of refraction n is given
by the relation $n = \sqrt{\epsilon}$, and the well-known phenomenon of
dispersion shows a dependence of refractivity on wave length
or frequency. An extreme case is water, whose index of refraction
in the visible is about 1.33, and whose dielectric constant is 80,
a result of the fact that the dielectric constant is measured
for static fields, and that n as a function of frequency goes from
$\sqrt{80}$ at zero frequency, through a region in short radio or long
infra-red waves in which the index greatly decreases, so that
with the very high frequency of visible light it is reduced to 1.33.
The second reason why Maxwell's assumptions are unsatisfactory
is that, since matter is known to be composed of electric charges,
electrons with negative charges and atomic nuclei with positive
charges, it ought to be possible to explain these typically electrical
properties of matter directly in terms of the electronic structure,
without having to resort to empirical relations of the sort implied
by a constant or variable dielectric constant. The attempt
to derive the electrical properties of matter from the electron
theory was first made by H. A. Lorentz, and he was successful
not only in explaining the physical meaning of the dielectric
constant, permeability, and conductivity, but in deriving their
dependence on frequency, field strength, etc. Further developments
of the theory, making use particularly of wave mechanics,
have carried the subject much further than Lorentz was able to,
and in our later chapters on wave mechanics we return to these
questions.
170. Polarization and Dielectric Constant. — The fundamental
physical fact about a dielectric is that, when placed in an electric
field, it acquires surface charges on its faces, proportional to the
strength of the field. Thus in Fig. 45, a slab of dielectric is
shown with positive and negative surface charges, as if the posi-
tive had actually been pulled along to the face by the action
of the field, the negative pushed to the other face. These surface
charges, of course, contribute to the field, just as do other charges,
which we actually have control over.
The essence of the electron theory is that it
treats these induced surface charges in the same
way as any other charges, applying the ordinary
Maxwell's equations to all charges in existence,
and not considering dielectrics as being essen-
tially different from free space, except in so far
as they contain these polarizable electrons.
Thus, if $\rho$ and u are charge density and current
density, respectively, of the so-called "real
charge" which we can move about at will, and
$\rho_p$ and $u_p$ the charge and current density of the
charge arising from polarization, we assume Maxwell's equations
for a nonmagnetic medium are
$$\operatorname{curl} H = \frac{1}{c}\frac{\partial E}{\partial t} + \frac{4\pi}{c}(u + u_p), \qquad \operatorname{curl} E = -\frac{1}{c}\frac{\partial H}{\partial t},$$
$$\operatorname{div} E = 4\pi(\rho + \rho_p), \qquad \operatorname{div} H = 0. \qquad (1)$$

Fig. 45. Polarization of dielectric.

In other words, we assume that the field E is the field of all
charge, both "real" and polarization charge, and that the total
current resulting from both sources produces the magnetic field.
The polarization charge must be produced, from the originally
uncharged dielectric, by the motion of positive charges in the
direction of E, and of negative charge in the opposite direction.
Suppose that in equilibrium two equal charges of opposite sign
lie so near together that they exert no appreciable external effect.
By means of an external field these charges may be displaced
relative to each other by a distance r. The charges then form a
dipole of moment
$$p = er.$$
In producing such a dipole there is clearly a current
$$e\frac{dr}{dt} = ev = \frac{dp}{dt}.$$
If we add the dipole moments of all the polarization electrons
in a unit volume we obtain the polarization vector, or the dipole
moment per unit volume
$$P = \sum p, \qquad (2)$$
and a current density due to these electrons equal to
$$u_p = \rho_p v_p = \frac{\partial P}{\partial t}. \qquad (3)$$
In producing dielectric polarization, charges cross a surface in
the body. In fact all the charges pass across the surface which
originally were contained in a cylinder of base equal to the
surface and length r. If $r_n$ is the component of r normal to
the surface, then we have as the charge passing through the
end dS
$$\sum e r_n\, dS = P_n\, dS \qquad (4)$$
which is the surface charge appearing on dS if this is an element
of the outer surface of the body. If we consider a closed surface,
the enclosed volume loses the charge
$$\iint P_n\, dS = \iiint \operatorname{div} P\, dv$$
according to Gauss's theorem. The density of polarization
electrons remaining is given by $\rho_p = -\operatorname{div} P$, since these have
the opposite sign to those which have crossed the surface. We
thus can write both polarization charge and current in terms
of the polarization vector P.
We have seen that the field E is that resulting from all charge,
including the polarization charge. The displacement D, however,
is simply the field resulting from the real charge $\rho$, so that
$\operatorname{div} D = 4\pi\rho$. To get Maxwell's equations in terms of D, we
take Eqs. (1), and make the substitutions
$$\rho_p = -\operatorname{div} P, \qquad u_p = \frac{\partial P}{\partial t} \qquad (5)$$
which, as we note, obviously satisfy the continuity equation
for polarization charge and current. Then we have at once,
for the only two equations affected by the change,
$$\operatorname{curl} H = \frac{1}{c}\frac{\partial}{\partial t}(E + 4\pi P) + \frac{4\pi u}{c}, \qquad (6)$$
$$\operatorname{div}(E + 4\pi P) = 4\pi\rho.$$
If we set $D = E + 4\pi P$, these become the ordinary Maxwell
equations.
171. The Relations of P, E, and D .— We have seen that E
measures the field of all charge, D that due to the "real" charge,
and that P is the polarization per unit volume. To understand
P better, we may take a unit cube of dielectric, one pair of faces
being perpendicular to the field. Since the polarization surface
charge is $P_n$, one of these faces will have a charge on its unit
area of |P|, the other of -|P|, so that the dipole moment of the
cube, coming from these two charges at unit distance apart,
would be P. Similarly, if the volume had had length L parallel
to the field, area A in the plane at right angles, the charges
on the ends would be ±PA, and the moment, remembering that
these are a distance L apart, is PAL, or P times the volume,
showing that the moment is proportional to the volume, so
that it is really correct to regard P as the moment per unit
volume.

Fig. 46. Condenser containing dielectric. Condenser plates a and d have
surface charges ±σ. Induced surface charges are shown on faces b and c of
dielectric. The force on unit charge within cavity e is E, and within cavity f is D.

The relations of the three quantities are perhaps best under-
stood from a simple illustration in the theory of the condenser.
In Fig. 46 we have a condenser consisting of two parallel plates
a and d with surface charges $\pm\sigma$, respectively. Between them
there is a slab of dielectric bc, with surface charges $\pm P$ on the
faces c and b, respectively. The field E now is determined from
the whole charge; that is, using the relation between the
discontinuity of the field and the surface charge, the field within
the dielectric is given by
$$E = 4\pi(\sigma - P).$$
The displacement D, however, is determined only from $\sigma$, so
that
$$D = 4\pi\sigma = E + 4\pi P. \qquad (7)$$
The capacity of the condenser, per unit area, is given by the charge, $D/4\pi$,
divided by the potential difference, E times the distance L
between the plates, or is $\frac{D}{4\pi EL}$. If we define the dielectric
constant $\epsilon$ as the ratio D/E, this leads correctly to the relation
that the capacity of the condenser is $\epsilon$ times the capacity of the
same condenser with vacuum in place of the dielectric.
Let us now consider the meaning of the field within the dielec-
tric. Actually, on account of the atomic and electronic struc-
ture, the field will change rapidly from point to point, so that
it is not so easy as it might seem to define it. The usual method
is to set up a long needle-shaped cavity e, pointing in the direction
of the field. A point charge placed within the cavity would now
be acted on by just the field of real and polarization charges,
so that the field E is the force on unit charge in such a cavity.
The necessity of choosing that particular shape of cavity is
shown by considering the cavity f, which is supposed to be disk-shaped,
with its flat face perpendicular to the field. This
cavity will have surface charges ±P set up on its two faces,
and it is evident that the lines of force starting from the polarization
surface charges on faces b and c will terminate on these
faces of the cavity, not crossing it at all, so that the field within
it will come wholly from the real surface charges on a and d,
or will be $E + 4\pi P = D$. Thus if we choose we may define E
as the field in a cavity shaped like e, in which the effect of the
charges on its faces is negligible because the faces are of negligibly
small area and arbitrarily far from the point where we are
finding the field, while we may define D as the field in a cavity
shaped like f. These definitions were originally used for the
corresponding magnetic case, by Kelvin. It is interesting to
notice that the fields in cavities of other shapes are different,
depending on the shape of the cavity. Thus in a later section
we shall see that the field in a spherical cavity is $E + (4\pi P/3)$.
We notice finally that since $D = \epsilon E = E + 4\pi P$, we have
$\epsilon = 1 + (4\pi P/E)$, a constant if the polarization is proportional
to the field. To compute the dielectric constant, or refractive
index, we have then to find the polarization, per unit field, and
we proceed to do this for gases, and later for solids.
172. Polarizability and Dielectric Constant of Gases. — In
gases the molecules or atoms are relatively so far apart that we
can neglect the interactions between them. Each molecule
contains charges which can be displaced under the action of an
external field, and these charges act as if they were held to posi-
tions of equilibrium by restoring forces proportional to the
displacement. Thus in a static case an electron of charge e is acted on
by the force eE of the external electric field, and by the linear
restoring force $-cx$, where c is here the force constant. The
displacement is then $x = (e/c)E$, and
the induced dipole moment $ex = (e^2/c)E$. The ratio $e^2/c$, giving
the dipole moment set up by unit field, is called the polarizability,
denoted by $\alpha$. Thus the dipole moment per molecule is
$\alpha E$, and if there are N molecules per unit volume the polarization
P is $N\alpha E$, so that $\epsilon = 1 + 4\pi N\alpha$.
A very simple model of an atom will give us the order of
magnitude of the polarizability. The atom consists of a nucleus
of charge Ze, where Z is an integer, e the magnitude of the charge
on the electron, surrounded by a distribution of negative charge
equal to — Ze. In the external field the negative charge will
be displaced with respect to the nucleus. The restoring force
may be computed as if the negative charge filled a sphere of
radius R with uniform charge density. Then the positive
charge Ze, at distance r from the center, would be acted on by
a force as if the negative charge within a sphere of radius r
were concentrated at the center, all other negative charge being
neglected. This charge would be $r^3/R^3$ times the total charge,
so that the force would be $\frac{(Ze)^2 r}{R^3}$. The polarizability is then
found to be $R^3$, proportional to the volume of the molecule.
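The order of magnitude implied by this model can be worked out directly. The sketch below is an illustration under stated assumptions: the atomic radius of 10^-8 cm, the nuclear charge Z = 2, and the gas density of 2.69 × 10^19 molecules per cubic centimeter (roughly that of a gas at standard conditions) are sample values supplied here, not taken from the text, and the function names are hypothetical.

```python
import math

E_CHARGE = 4.803e-10   # electronic charge in e.s.u. (supplied value)


def sphere_model_polarizability(Z, R):
    """Polarizability of the uniform-sphere atom model.

    The restoring force on the nucleus is (Ze)^2 r / R^3, so the force
    constant is (Ze)^2 / R^3 and the polarizability (Ze)^2 / constant = R^3.
    """
    charge = Z * E_CHARGE
    force_constant = charge ** 2 / R ** 3
    return charge ** 2 / force_constant


def gas_dielectric_constant(N, alpha):
    """epsilon = 1 + 4 pi N alpha for N molecules per unit volume."""
    return 1.0 + 4.0 * math.pi * N * alpha


alpha = sphere_model_polarizability(Z=2, R=1.0e-8)   # assumed radius in cm
eps = gas_dielectric_constant(N=2.69e19, alpha=alpha)
print(f"polarizability = {alpha:.3e} cm^3")
print(f"epsilon - 1    = {eps - 1.0:.3e}")
```

The result, a dielectric constant exceeding unity by a few parts in ten thousand, is of the size typical of gases, and it does not depend on Z in this model.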
173. Dispersion in Gases. — We now assume a sinusoidal
external field of frequency $\nu$, as in a light wave. The magnetic
force on the electron on account of its motion can be neglected.
In addition to the external electric force, and the elastic restoring
force, we introduce a damping force proportional to velocity,
to account for absorption. The equation of motion for the
electron is then, for the x coordinate,
$$m\ddot{x} + mg\dot{x} + \omega_0^2 mx = eE_x^0 e^{i\omega t} \qquad (8)$$
where we have placed $\omega = 2\pi\nu$. Thus we have the problem of
the damped linear oscillator in forced oscillation. We have
solved this problem in Chap. IV, and can write for the steady-state
solution
$$x = \frac{\frac{e}{m}E_x^0 e^{i\omega t}}{\omega_0^2 - \omega^2 + i\omega g} = \frac{\frac{e}{m}E_x}{\omega_0^2 - \omega^2 + i\omega g} \qquad (9)$$
in complex form. This shows that the electron vibrates at
the same frequency as the light wave but with an amplitude
depending on the frequency and out of phase with the light
wave. If we have $N_k$ electrons per unit volume characterized
by the constants $\omega_k$ and $g_k$ (electrons of the kth kind) we get for
the dipole moment per unit volume:
$$P = \sum_k N_k e x_k = E\sum_k \frac{N_k e^2/m}{\omega_k^2 - \omega^2 + i\omega g_k},$$
whence we get for the displacement vector
$$D = E + 4\pi P = E\left[1 + 4\pi\sum_k \frac{N_k e^2/m}{\omega_k^2 - \omega^2 + i\omega g_k}\right],$$
and if we introduce a "dynamic" refractive index $(n - ik)$
defined by $D = \epsilon E = (n - ik)^2 E$, we find
$$(n - ik)^2 = 1 + 4\pi\sum_k \frac{N_k e^2/m}{\omega_k^2 - \omega^2 + i\omega g_k} \qquad (10)$$
so that the index of refraction is a function of the frequency of
the light, and different colors travel with different velocities. This
is known as the dispersion of light. Furthermore, in general,
the index of refraction is a complex quantity and, as we have seen
in our discussion of electromagnetic waves in metals, this indicates
absorption and is not surprising in view of our introduction
of a damping force.
In the limit of low frequencies (long wave lengths of light,
where $\omega \ll \omega_k$) we may neglect the last two terms in the
denominator and find
$$\epsilon = 1 + 4\pi\sum_k \frac{N_k e^2}{m\omega_k^2}$$
as the static value of the dielectric constant of insulators, agreeing
with the value found in the last section.
If the frequency of the light does not lie near any of the
natural frequencies of the electrons, we may neglect the frictional
force and find a real index of refraction given by
$$n = 1 + 2\pi\sum_k \frac{N_k e^2/m}{\omega_k^2 - \omega^2}$$
if we remember that the index of refraction for gases varies but
slightly from unity. Thus there is no absorption and we have
the case of normal dispersion. Let us consider the index of
refraction as a function of frequency of the light in the visible
region of the spectrum. If the natural frequencies of the electrons
lie in the ultra-violet (and also for the case that they lie
in the infra-red) the index of refraction increases with increasing
frequency, the normal behavior.
In case the frequency of the light lies near a natural frequency,
we obtain the phenomenon of anomalous dispersion. In this
case the frictional term becomes important and we find an
absorption band in the neighborhood of $\omega_0$. The whole discussion
is similar to the case of a resonant electric circuit. For
simplicity let us assume only one resonant frequency. Remembering
that for gases n is very nearly unity, we have:
$$n - ik = 1 + 2\pi\frac{e^2 N}{m}\frac{1}{\omega_0^2 - \omega^2 + i\omega g}$$
and if we separate into real and imaginary parts, we obtain,
$$n = 1 + 2\pi\frac{e^2 N}{m}\frac{\omega_0^2 - \omega^2}{(\omega_0^2 - \omega^2)^2 + \omega^2 g^2} \qquad (11)$$
and
$$k = 2\pi\frac{e^2 N}{m}\frac{\omega g}{(\omega_0^2 - \omega^2)^2 + \omega^2 g^2}. \qquad (12)$$
n is known as the principal index of refraction and k the absorption
coefficient. If we plot n - 1 and k against the light frequency,
we get curves of the form shown in Fig. 47. Such
curves have already been considered in Prob. 10, Chap. IV.
In the neighborhood of the absorption region we see that the
index of refraction decreases with increasing frequency and this
is the anomalous behavior giving rise to the term anomalous
dispersion.
Fig. 47. Anomalous dispersion, showing index of refraction and absorption
coefficient as function of frequency.
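Curves of this form can be generated from the real and imaginary parts of the single-resonance expression. The sketch below works in arbitrary units; the function name, the symbol `strength` standing for e²N/m, and the sample values of the resonant frequency and damping constant are all assumptions made for illustration.

```python
import math


def index_and_absorption(omega, omega0, g, strength):
    """Return (n, k) near a single resonance, for a gas with n close to 1.

    strength stands for e^2 N / m; omega0 is the natural frequency and
    g the damping constant, all in the same arbitrary units.
    """
    detuning = omega0 ** 2 - omega ** 2
    denom = detuning ** 2 + omega ** 2 * g ** 2
    n = 1.0 + 2.0 * math.pi * strength * detuning / denom
    k = 2.0 * math.pi * strength * omega * g / denom
    return n, k


omega0, g, strength = 1.0, 0.1, 1.0e-3   # illustrative values only
for omega in (0.80, 0.90, 0.95, 1.00, 1.05, 1.10, 1.20):
    n, k = index_and_absorption(omega, omega0, g, strength)
    print(f"omega = {omega:.2f}   n - 1 = {n - 1.0:+.5f}   k = {k:.5f}")
```

The printed values show n - 1 positive below the resonance and negative above it, falling with increasing frequency through the absorption region, while k peaks near the natural frequency.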
174. Dispersion of Solids and Liquids. — In the case of solids
and liquids we may no longer make the approximation that the
force acting on an electron is simply the electric vector of the light
wave in free space, but must take into account the added force
on the electron due to the polarization of the body. We can
calculate this force as follows: we imagine a small sphere of
radius R (with its center at the position of the electron in ques-
tion) cut out of the medium. If we do this, we have induced
charges on the surface of this spherical volume from which we
calculate the force at the center of the sphere. We have for
the surface density of induced charge on a spherical ring at an
angle $\theta$, $\sigma = P_n = P \cos\theta$, as in Fig. 48. The area of the ring
is $2\pi R \sin\theta \cdot R\,d\theta = 2\pi R^2 \sin\theta\,d\theta$, so that the charge on this
ring is
$$2\pi PR^2 \cos\theta \sin\theta\,d\theta.$$
This charge produces a field at the center of the sphere whose
component parallel to E is
$$dE_1 = \frac{2\pi PR^2 \cos^2\theta \sin\theta\,d\theta}{R^2},$$
so that the total charge on the sphere produces a field at the center
equal to
$$E_1 = 2\pi P\int_0^\pi \cos^2\theta \sin\theta\,d\theta = \frac{4\pi P}{3}.$$
The total electric field at the center of this sphere is then
$$E + \frac{4\pi P}{3}. \qquad (13)$$
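The integral over the rings can be checked by a simple numerical quadrature. The sketch below sums the ring contributions by the midpoint rule and returns the field at the center per unit polarization; the function name and the number of steps are arbitrary choices.

```python
import math


def cavity_field_per_unit_polarization(steps=10000):
    """Approximate 2 pi * integral_0^pi cos^2(theta) sin(theta) d(theta)."""
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        theta = (j + 0.5) * h              # midpoint of the j-th ring
        total += math.cos(theta) ** 2 * math.sin(theta) * h
    return 2.0 * math.pi * total


factor = cavity_field_per_unit_polarization()
print(f"numerical E_1 / P = {factor:.6f}")
print(f"4 pi / 3          = {4.0 * math.pi / 3.0:.6f}")
```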
Of course, there is still the contribution to the force by the atoms
inside the little sphere we have cut out, but in an isotropic medium
this averages zero. We can now carry over our calculations for
gases if we replace E by $E + (4\pi P/3)$ in the expression for x.
Thus we get
$$P = \left(E + \frac{4\pi P}{3}\right)\sum_k \frac{N_k e^2/m}{\omega_{0k}^2 - \omega^2 + i\omega g_k},$$
and using the relations $D = \epsilon E = E + 4\pi P$, we have
$$E + \frac{4\pi P}{3} = \frac{\epsilon + 2}{3}E,$$
and we find for $\epsilon$
$$\frac{\epsilon - 1}{\epsilon + 2} = \frac{(n - ik)^2 - 1}{(n - ik)^2 + 2} = \frac{4\pi}{3}\sum_k \frac{N_k e^2/m}{\omega_{0k}^2 - \omega^2 + i\omega g_k}.$$

Fig. 48. Field in spherical cavity in dielectric.

N k =f k N
280 INTRODUCTION TO THEORETICAL PHYSICS
and fk gives the number of electrons of the kth kind per atom,
the so called "oscillator strength," and we have
(n — ik) 2 — 1 1 471--%-^ , e 2 /m
471-^-71, e* m /1>t v
= T >/*"! V_i_ • ( 14 )
{n-iky + 2 N 3 ^r*co 2 OJk - co 2 + iwflf*
In all cases of transparent substances, where we can neglect the
damping force, and the index of refraction is real, we have for a
given frequency of light :
n 2 — 1 1
, n — = constant.
n 2 + 2 po
where p is the density of the body, obviously proportional to N.
This law, known as the Lorenz-Lorentz law, is surprisingly well
obeyed for many substances. Of course, in the limit of very long
electromagnetic waves, and for the electrostatic case,
— — - • — = constant,
€ + 2 po
giving us a relation between dielectric constant and density.
If we use the expression $E + (4\pi P/3)$ instead of E in the equation
of motion of an electron, we find similarly to our equation
for gases:
$$(n - ik)^2 = 1 + 4\pi\sum_k \frac{N_k e^2/m}{\omega_k^2 - \omega^2 + i\omega g_k} \qquad (15)$$
with the only difference that instead of the natural frequency
$\omega_{0k}$ of the electrons we find the apparent natural frequencies
$$\omega_k^2 = \omega_{0k}^2 - \frac{4\pi}{3}\frac{N_k e^2}{m}. \qquad (16)$$
Thus we have the same type of anomalous dispersion phenomena
in solids and liquids that we have in gases.
175. Dispersion of Metals. — In metals we picture free electrons
wandering about among fixed ions, and these electrons are the
conduction electrons. On the average there is no resultant force
on the electrons, so that under the influence of an external field
we can place the force on an electron equal to eE. If we imagine
the ions as rigid structures possessing no polarizability, we then
have the simplest possible picture of a metal. Thus, in contrast
to the bound electrons of the previous sections, we have no restor-
ing force on these electrons. We must, however, introduce a
damping force, so that steady-state motion becomes possible.
Thus we have as the equation of motion of conducting electrons:
$$m\ddot{x} + mg\dot{x} = eE. \tag{17}$$
This equation must allow an atomistic calculation of the conductivity,
and if the external field E is constant and in the x
direction, we get as the steady-state solution of this equation
$$x = \frac{eE}{mg}\,t + \text{constant}.$$
Thus the velocity is ẋ = eE/mg, and if N is the number of conducting
electrons per unit volume, we get for the current density
$$u = Ne\dot{x} = \frac{Ne^2 E}{mg}.$$
Now by Ohm's law u = σE, we find
$$\sigma = \frac{Ne^2}{mg}, \tag{18}$$
so that we are led to an expression for conductivity from an
atomic point of view. It is interesting to note that dimensionally
σ and g are both of the dimensions of frequencies. We have
already seen in Prob. 10, Chap. XXII, that the period associated
with σ is the relaxation time, the time taken for any irregularity
in charge distribution within the metal to decrease to 1/eth
of its value, and have seen that this frequency, for good conduc-
tors, is in the ultra-violet part of the spectrum. The meaning of
g is similar, as one could see by imagining an electron initially
with a given velocity, and finding the time taken for its velocity
to decrease to 1/eth of its initial value, the result being essentially
the period associated with g. It seems very reasonable to sup-
pose that approximately equal times would be required for the
velocity of electrons to be damped down, and for charge irregularities
to be ironed out, and, as a matter of fact, g is found to
be of the same order of magnitude as σ. One can estimate g
by making a guess as to the value of N, the number of free electrons
per unit volume, assuming, for instance, that there is one
free electron per atom, and then computing g from the equation
g = Ne²/mσ. One has, then, two independent constants characterizing
the optical behavior of a metal, so that complicated
results are not surprising. In addition to this, metals like other
substances contain polarizable electrons, which make additional
complications.
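The role of the damping constant g in Eq. (17) can be illustrated with a short numerical sketch. It is only a sketch under stated assumptions: the units are chosen so that eE/m = 1 and g = 1, and the function names are hypothetical, introduced here for illustration.

```python
import math

def drift_velocity(t, g=1.0, a=1.0, v0=0.0):
    """Exact solution of dv/dt + g*v = a, where a stands for eE/m (assumed units)."""
    v_inf = a / g                          # steady-state velocity eE/(m g)
    return v_inf + (v0 - v_inf) * math.exp(-g * t)

def euler_velocity(t_end, g=1.0, a=1.0, v0=0.0, steps=100_000):
    """Forward-Euler integration of the same equation, as an independent check."""
    dt = t_end / steps
    v = v0
    for _ in range(steps):
        v += dt * (a - g * v)
    return v

# Field on: the velocity tends to the steady value a/g = eE/(m g).
v_steady = drift_velocity(20.0)
# Field off (a = 0): an initial velocity falls to 1/e of its value in the time 1/g.
v_decay = drift_velocity(1.0, a=0.0, v0=1.0)
print(v_steady, v_decay, euler_velocity(20.0))
```

The second call corresponds to the statement in the text that 1/g is essentially the time for the velocity to decrease to 1/eth of its initial value.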
The formulas for the optical constants of a metal may be found
simply by including the free or conduction electrons as a class of
bound electrons whose binding force, and natural frequency, are
zero. Thus
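$$(n - ik)^2 = 1 + \frac{4\pi Ne^2/m}{-\omega^2 + i\omega g} + 4\pi\sum_k \frac{N_k e^2/m}{\omega_k^2 - \omega^2 + i\omega g_k},$$
$$n^2 - k^2 = 1 - \frac{4\pi\sigma}{g}\,\frac{1}{1 + \omega^2/g^2} + 4\pi\sum_k \frac{N_k e^2}{m}\,\frac{\omega_k^2 - \omega^2}{(\omega_k^2 - \omega^2)^2 + (\omega g_k)^2},$$
$$nk = \frac{\sigma}{\nu}\,\frac{1}{1 + \omega^2/g^2} + 2\pi\sum_k \frac{N_k e^2}{m}\,\frac{\omega g_k}{(\omega_k^2 - \omega^2)^2 + (\omega g_k)^2}, \tag{19}$$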
where in the last two we have written Ne²/m as σg. The summations
are over the bound electrons. We notice that as the
frequency becomes low compared with σ, the first term in the
product nk becomes very large compared with unity, masking
the effect of the bound electrons. The difference n² − k² does
not become correspondingly large, so that in the limit, as we
stated in Chap. XXII, n becomes equal to k, and both approach
√(σ/ν), neglecting ω compared to g. It is easy to see that at low
frequency n² − k² approaches ε, if in the dielectric constant we
include a contribution −4πσ/g from the free electrons. However,
it is only at low frequencies that these simplifications enter. As
the frequency enters the near infra-red or visible region, it
becomes of the same magnitude as σ and g, so that the contributions
of the free electrons become complicated, and at the same
time nk decreases so that the contributions of the bound electrons
become important. It is thus natural that experimentally the
curves of n² − k² and nk throughout the visible part of the
spectrum are very complicated, though they can be fitted fairly
accurately with formulas of the type we have derived, assuming
bound as well as free electrons. In the ultra-violet, the frequency
becomes too high for the free electrons to follow, the contributions
of the free electrons become small compared to those of the bound
electrons having resonance in that region, and a metal does not
behave essentially differently from an insulator.
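The separation of Eq. (19) into its real and imaginary parts can be checked numerically. The sketch below evaluates the complex expression for (n − ik)² and compares it with the two real formulas; all parameter values are dimensionless and purely illustrative (assumptions, not data for any metal), and the function names are hypothetical.

```python
import math

def complex_index_squared(omega, sigma, g, bound):
    """(n - ik)^2 from the first line of Eq. (19), using N e^2/m = sigma*g.

    bound is a list of (N_k e^2/m, omega_k, g_k) tuples for the bound electrons.
    """
    total = 1 + 4 * math.pi * sigma * g / (-omega**2 + 1j * omega * g)
    for s_k, w_k, g_k in bound:
        total += 4 * math.pi * s_k / (w_k**2 - omega**2 + 1j * omega * g_k)
    return total

def n2_minus_k2(omega, sigma, g, bound):
    """Real part, second line of Eq. (19)."""
    val = 1 - (4 * math.pi * sigma / g) / (1 + omega**2 / g**2)
    for s_k, w_k, g_k in bound:
        d = (w_k**2 - omega**2) ** 2 + (omega * g_k) ** 2
        val += 4 * math.pi * s_k * (w_k**2 - omega**2) / d
    return val

def nk_product(omega, sigma, g, bound):
    """Product nk, third line of Eq. (19), with nu = omega / (2 pi)."""
    nu = omega / (2 * math.pi)
    val = (sigma / nu) / (1 + omega**2 / g**2)
    for s_k, w_k, g_k in bound:
        d = (w_k**2 - omega**2) ** 2 + (omega * g_k) ** 2
        val += 2 * math.pi * s_k * omega * g_k / d
    return val

# Illustrative, dimensionless parameters (assumed for this sketch only).
params = dict(omega=0.7, sigma=2.0, g=1.5, bound=[(0.3, 4.0, 0.2)])
z = complex_index_squared(**params)
# (n - ik)^2 = (n^2 - k^2) - 2i nk, so Re z = n^2 - k^2 and -Im z / 2 = nk.
print(z.real, n2_minus_k2(**params))
print(-z.imag / 2, nk_product(**params))
```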
In conclusion, we should mention that the introduction of a
frictional force proportional to the velocity of the electrons is at
best an extremely rough approximation. In metals the steady
state is made possible by collisions of the electrons with the ions
of the lattice, and the energy of the electrons gained from the
external field is thus transmitted to the lattice, excites lattice
vibrations, and appears as heat, as we shall describe more in
detail later. All in all, when one considers the approximate
nature of this classical electron theory, it is gratifying that it
checks as well as it does with experiment and assures us that a
more refined atomic picture will lead to an exact theory.
Problems
1. Show that in the case of normal dispersion for the visible spectrum
where there is an absorption band in the ultra-violet, the index of refraction
can be written as
$$n^2 = A + \frac{B}{\lambda^2} + \frac{C}{\lambda^4} + \cdots,$$
where λ is the wave length in vacuum and A, B, C are constants.
If there is also absorption in the infra-red show that the index of refraction
is then given by
$$n^2 = A + \frac{B}{\lambda^2} + \frac{C}{\lambda^4} + \cdots - A'\lambda^2 - B'\lambda^4.$$
2. Measurements of H₂ gas give the following values of the index of
refraction:
λ in Å       (n − 1) × 10⁷
5,462.260    1,396
4,078.991    1,426
3,342.438    1,461
2,894.452    1,499
2,535.560    1,547
2,302.870    1,594
1,935.846    1,718
1,854.637    1,760
Using the expression in Prob. 1 for n² in reciprocal powers of λ, calculate
the best values of A, B, and C. If the measurements are made at room
temperature and atmospheric pressure, calculate the resonant frequency
ω₀ and wave length from these constants.
3. Prove that in the case of anomalous dispersion for gases the maximum
and minimum values of n occur at the positions where the absorption coeffi-
cient reaches half its maximum value. Show that the half width of the
absorption band equals the damping constant divided by the mass of an
electron. Assume g/ω₀ ≪ 1.
4. For the D line of sodium the following values of the constants in the
dispersion formula are found:
$$\omega_0 = 3 \times 10^{15}; \quad g = 2 \times 10^{10}; \quad 4\pi Ne^2/m = 10^{23}.$$
Plot the index of refraction n and the absorption coefficient k as a function
of the frequency of light. Find the maximum and minimum values of the
index of refraction n. Find the maximum value of the absorption coefficient
k and the half width of the absorption band in Angstrom units.
5. Show that for gases the Lorenz-Lorentz law takes the approximate
form
$$\frac{2}{3}\,\frac{n - 1}{\rho_0} = \text{constant}.$$
The following measurements have been made on air (ρ₀ given in arbitrary units):
ρ₀        n
1.00      1.0002929
14.84     1.004338
42.13     1.01241
69.24     1.02044
96.16     1.02842
123.04    1.03633
149.53    1.04421
176.27    1.05213
Calculate $\frac{2}{3}\frac{n - 1}{\rho_0}$ and $\frac{n^2 - 1}{n^2 + 2}\frac{1}{\rho_0}$ for each of these measurements and compare
the constancy of the results (calculate to four significant figures).
6. The indices of refraction for the sodium D line, and densities in grams
per cubic centimeter of some liquids at 15°C. are
                     ρ₀        n
Water                0.9991    1.3337
Carbon bisulphide    1.2709    1.6320
Ethyl ether          0.7200    1.3558
Calculate the indices of refraction for the vapors at 0°C. and 760 mm. pressure.
The observed values for the vapors are 1.000250, 1.00148, and
1.00152, respectively.
7. The quantity $\frac{(n^2 - 1)\,m}{(n^2 + 2)\,\rho_0}$ is called the refractivity of a substance if m
denotes its mass. Prove that the refractivities of mixtures of substances
equal the sum of the refractivities of the constituents. (Neglect damping
forces from the start.)
8. Show that the molecular refractivity of a compound, defined as
$\frac{n^2 - 1}{n^2 + 2}\,\frac{M}{\rho_0}$, where M is the molecular weight, is equal to the sum of the atomic
refractivities of the atoms of which the compound is formed. (Neglect
damping forces.)
9. Prove that the apparent natural frequencies $\omega_k$, in the equation for
the index of refraction for a solid or a liquid, are related to the natural
frequencies $\omega_{0k}$ for the electrons in isolated atoms by the equation
$$\omega_k^2 = \omega_{0k}^2 - \frac{4\pi}{3}\,\frac{N_k e^2}{m}.$$
10. For the following gases we have the following values of $(n - 1)_\infty$,
extrapolated to long wave lengths:
Gas      (n − 1)∞ × 10⁶
H₂       136.35
N₂       294.5
O₂       265.3
Calculate the values of $(n - 1)_\infty$ for the following gases: H₂O, NH₃, NO,
N₂O₄, O₃. The measured values are 245.6, 364.6, 288.2, 496.5, 483.6, all
times 10⁻⁶. Find the percentage discrepancy between the calculated and
observed values.
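As an illustration of the comparison asked for in Prob. 5, the following sketch evaluates both forms of the Lorenz-Lorentz expression for the air data printed there. The numbers are copied from the problem as given; the function names and the use of "largest minus smallest" as a measure of constancy are assumptions made for this sketch.

```python
# Air data as printed in Prob. 5 (density in arbitrary units, index of refraction).
RHO = [1.00, 14.84, 42.13, 69.24, 96.16, 123.04, 149.53, 176.27]
N_INDEX = [1.0002929, 1.004338, 1.01241, 1.02044, 1.02842, 1.03633, 1.04421, 1.05213]

def approximate_form(n, rho):
    """(2/3)(n - 1)/rho, the small-(n - 1) limit of the Lorenz-Lorentz expression."""
    return (2.0 / 3.0) * (n - 1.0) / rho

def lorenz_lorentz(n, rho):
    """(n^2 - 1)/(n^2 + 2) divided by the density."""
    return (n * n - 1.0) / (n * n + 2.0) / rho

def spread(values):
    """Largest minus smallest value, a crude measure of constancy."""
    return max(values) - min(values)

approx = [approximate_form(n, r) for n, r in zip(N_INDEX, RHO)]
exact = [lorenz_lorentz(n, r) for n, r in zip(N_INDEX, RHO)]

for r, a, x in zip(RHO, approx, exact):
    print(f"rho0 = {r:7.2f}   approximate = {a:.4e}   Lorenz-Lorentz = {x:.4e}")
print("spread:", spread(approx), spread(exact))
```

With these data the full expression should come out somewhat more nearly constant than the approximate one, which is the point of the comparison.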
CHAPTER XXV
SPHERICAL ELECTROMAGNETIC WAVES
Suppose that we have an electrical charge oscillating back and
forth sinusoidally with the time. This charge will send out a
spherical electromagnetic wave, radiating in all directions.
There are several physical problems connected with such a wave.
First, the phenomenon may be on a large scale, as in a radio
antenna. Radiation from a vertical antenna, as a matter of
fact, can be approximately treated by replacing the antenna by
such an oscillating charge. But also on a smaller scale we can
treat the radiation of short electromagnetic waves, or in other
words light, from an atom which contains oscillating electrons.
The electrons may have been set in motion by heat or bombard-
ment, in which case we have the treatment of the emission of
light from a luminous body; or they may be in forced motion
under the action of another light wave, as in the case of the scat-
tering of light. As a first step in the discussion of these problems,
we consider spherical solutions of the wave equation, then passing
on to the special case of electromagnetic fields.
176. Spherical Solutions of the Wave Equation. — The wave
equation can be solved by separation in spherical coordinates, as
we have seen in Probs. 6, 7, and 8 of Chap. XV, and in Sec. 130,
Chap. XVIII. The solutions are of the form $e^{\pm i\omega t}\,\sin m\phi\,P_l^m(\cos\theta)\,R(r)$,
where R satisfies a differential equation which, by
a slight transformation of the results quoted above, can be written
$$\frac{d^2(rR)}{dr^2} + \left[\frac{\omega^2}{v^2} - \frac{l(l+1)}{r^2}\right] rR = 0. \tag{1}$$
The solution of the equation for R was shown in the problem
quoted above to be expressible in terms of Bessel's functions, of
half integral order, divided by √r. It proves to be possible,
however, to express these functions in an alternative manner in
terms of exponential or trigonometric functions, and we shall use
that more elementary method in the present chapter. Further,
we shall find that we have to consider only the very simplest
types of spherical waves, for the purposes we are interested in.
The simplest solution in spherical coordinates is the one independent
of angle, for which l = 0. In this case, solving Eq. (1),
we have $rR = e^{\pm i\omega r/v}$, giving as the solution of the wave equation
the functions $e^{i\omega(t \pm r/v)}/r$, reducing to 1/r for the static case where
ω = 0. This represents a sinusoidal wave, traveling out along r
(if we have t - r/v) or in along r (if we have t + r/v), with a
velocity v, and with an amplitude which decreases as 1/r. This
decrease of amplitude is necessary if equal amounts of energy
are to flow across all concentric spheres, since the intensity,
proportional to the square of the amplitude, must be proportional
to 1/r 2 so that its product with the area of the sphere may be
constant.
A more general spherical wave can be obtained if we are not
limited to sinusoidal vibrations. Thus the wave equation in
spherical coordinates, neglecting terms in θ and φ which are zero
for solutions independent of angle, is
$$\frac{\partial^2(ru)}{\partial r^2} - \frac{1}{v^2}\frac{\partial^2(ru)}{\partial t^2} = 0, \tag{2}$$
which has a general solution
$$u = \frac{1}{r}\left[f\left(t - \frac{r}{v}\right) + g\left(t + \frac{r}{v}\right)\right],$$
as can be proved by direct substitution, where f, g are arbitrary
functions. This represents one wave traveling out from the
center, another traveling in, with arbitrary wave form, and
corresponds to the solution f(t − (lx + my + nz)/v) for the wave
equation in rectangular coordinates, expressing a plane wave of
arbitrary wave form.
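The "direct substitution" mentioned above can also be done numerically. The sketch below takes an outgoing wave u = f(t − r/v)/r with an arbitrarily chosen wave form and estimates the left side of Eq. (2) by central differences. The Gaussian pulse, the value of v, and the step size are all assumptions made for illustration.

```python
import math

V = 2.0  # wave velocity, an arbitrary illustrative value

def pulse(s):
    """An arbitrary smooth wave form f(s); a Gaussian is assumed here."""
    return math.exp(-s * s)

def u(r, t):
    """Outgoing spherically symmetric wave u = f(t - r/v) / r."""
    return pulse(t - r / V) / r

def wave_residual(r, t, h=1e-3):
    """Central-difference estimate of d2(ru)/dr2 - (1/v^2) d2(ru)/dt2."""
    def ru(rr, tt):
        return rr * u(rr, tt)
    d2r = (ru(r + h, t) - 2 * ru(r, t) + ru(r - h, t)) / h**2
    d2t = (ru(r, t + h) - 2 * ru(r, t) + ru(r, t - h)) / h**2
    return d2r - d2t / V**2

residual = wave_residual(r=3.0, t=1.2)
print(residual)  # small compared with the individual second derivatives
```

The residual is limited only by the finite-difference error, whatever smooth function is substituted for the pulse.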
More complicated waves are those which are not spherically
symmetrical, but instead depend on the angles. We have seen
in Sec. 140, Chap. XIX, that if 1/r is a solution of Laplace's
equation, then ^-( -J is a solution, where n is an arbitrary direc-
tion. This solution represents the potential of unit dipole, the
differentiation giving the difference of the potentials of two oppo-
site charges infinitesimally close together. If 6 is the angle
between n and the direction in which we are finding the potential,
, d/l\ d/l\ 1
we nave ~{-\ = — ( -\ cos 6 = -- cos 6. This is a solution
u =
of Laplace's equation, and in terms of our standard solution in
spherical coordinates it is the solution corresponding to l = 1,
m = 0. The function of r is $r^{-(l+1)}$, in accordance with the
results of Sec. 130. As a matter of fact, it can be shown that all
the solutions of Laplace's equation, and therefore all the spherical
harmonics, can be derived in this way by differentiations of the
simple solution 1/r with respect to different directions.
In a similar way, if we are given the solution $e^{i\omega(t - r/v)}/r$ of the
wave equation, we may differentiate with respect to n and again
obtain a solution. This gives
$$\frac{\partial}{\partial r}\left(\frac{e^{i\omega(t - r/v)}}{r}\right)\cos\theta = \left(-\frac{1}{r^2} - \frac{i\omega}{rv}\right)e^{i\omega(t - r/v)}\cos\theta. \tag{3}$$
This is the solution corresponding to l = 1, as before, and the
function of r in Eq. (3) is the alternative way mentioned above of
writing the Bessel's function obtained by direct solution of Eq.
(1). For values of r small compared with a wave length, the
term 1/r² is large compared with ω/rv, remembering that ω/v =
2π/λ. Thus at short distances the second term in Eq. (3) can
be neglected, and the function falls off as 1/r², as in the static case.
Further, at short distances, the quantity r/v in the exponent
represents a time lag which is only a short fraction of the period
of oscillation, so that we may neglect it, obtaining $-\frac{1}{r^2}\cos\theta\,e^{i\omega t}$,
the potential we should expect from a dipole of variable
moment $e^{i\omega t}$ from a quasi-stationary argument in which we
supposed that the variation of the moment was so slow that we
could treat the dipole instantaneously as if it were constant.
On the other hand, at large distances, the other term predominates,
and the solution of the wave equation falls off as 1/r. This
part of the field is called the radiation field, and we see from it
that this solution for a dipole persists to large r's just as does the
spherically symmetrical solution, the intensity falling off as
1/r², and the field representing a wave traveling out with velocity
v. This radiation field is a characteristic feature of solutions of
the wave equation, and is not present in the limiting case of
Laplace's equation.
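The relative size of the two terms in Eq. (3) can be tabulated in a few lines. In the sketch below distances are measured in wave lengths (an assumed choice of units, so that ω/v = 2π), and the function name is hypothetical.

```python
import math

def dipole_radial_terms(r_over_lambda):
    """Magnitudes of the two terms in Eq. (3), in units where the wave length is 1.

    Returns (1/r^2, omega/(r v)), the second being 2*pi/r in these units.
    """
    r = r_over_lambda
    near = 1.0 / r**2
    far = 2.0 * math.pi / r
    return near, far

for x in (0.01, 1.0 / (2.0 * math.pi), 10.0):
    near, far = dipole_radial_terms(x)
    print(f"r/lambda = {x:.4f}: (1/r^2) / (omega/rv) = {near / far:.4g}")
```

The two terms are equal at r = λ/2π; well inside that distance the static-like term dominates, and well outside it the radiation term does.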
177. Scalar Potential for Oscillating Dipole. — Let a charge e
oscillate up and down along the z axis, its displacement being
given by the real part of $Ce^{i\omega t}$. We shall assume an equal
and opposite charge to be always at the origin, so that the whole
thing is electrically neutral, and constitutes a dipole of moment
$eCe^{i\omega t} = Me^{i\omega t}$. We wish to find its field. We shall do this
by finding the scalar and vector potentials, first computing
these directly, then in a later section showing that they can be
easily obtained from another vector, called the Hertz vector.
The scalar and vector potentials are solutions of D'Alembert's
equations, in which the charge and current densities, respectively,
appear on the right sides of the equations. These are different
from zero only at the dipole, which is assumed to be of infinitesimal
dimensions, so that, except at the origin, the potentials satisfy
the wave equation. We must then look for solutions of the wave
equation satisfying the one condition that they reduce to the
correct value at the origin, or at the dipole itself. The solution
(3) is a function reducing to the scalar potential of a dipole in the
limiting case of a static field, and we have seen that it also reduces
to the value we should expect for points close to the dipole, even
in a variable field. It corresponds to the scalar potential of a
unit oscillating dipole. We expect, therefore, that for the dipole
of moment $Me^{i\omega t}$ the scalar potential will be
$$\phi = -M\frac{\partial}{\partial r}\left(\frac{e^{-i\omega r/c}}{r}\right)\cos\theta\,e^{i\omega t}, \tag{4}$$
where now we write the velocity equal to c, for the case of light
waves.
178. Vector Potential.— Next we may find the vector potential,
using two facts: first, div A + (1/c) ∂φ/∂t = 0; second, since the
current is always along the axis, the vector potential must also
be in this direction. If now A is along the z axis, we easily
have $A_r = A\cos\theta$, $A_\theta = -A\sin\theta$, $A_\phi = 0$, if A is the magnitude
of the vector. Let us suppose tentatively that A is a function
only of r (being prepared to reject this if it does not work).
Then, using the equations for the divergence in spherical coordinates,
we have
$$\operatorname{div} A = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 A\cos\theta\right) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}\left(-A\sin^2\theta\right) = \frac{\partial A}{\partial r}\cos\theta + \frac{2A}{r}\cos\theta - \frac{2A\cos\theta}{r} = \frac{\partial A}{\partial r}\cos\theta.$$
Also
$$\frac{1}{c}\frac{\partial\phi}{\partial t} = \frac{i\omega\phi}{c}.$$
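Hence we have
$$\frac{\partial A}{\partial r}\cos\theta - \frac{i\omega}{c}M\frac{\partial}{\partial r}\left(\frac{e^{-i\omega r/c}}{r}\right)\cos\theta\,e^{i\omega t} = 0.$$
This can be satisfied by
$$A = \frac{i\omega}{c}M\frac{e^{i\omega(t - r/c)}}{r}.$$
We note that this, which represents $A_z$, satisfies the wave equation,
as, of course, it must. Then we have
$$A_r = \frac{i\omega}{c}M\frac{e^{-i\omega r/c}}{r}\cos\theta\,e^{i\omega t},$$
$$A_\theta = -\frac{i\omega}{c}M\frac{e^{-i\omega r/c}}{r}\sin\theta\,e^{i\omega t},$$
$$A_\phi = 0. \tag{5}$$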
179. The Fields. — Let us first find the magnetic field H = curl
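A. We at once have
$$H_r = H_\theta = 0, \quad H_\phi = \frac{1}{r}\left[\frac{\partial}{\partial r}\left(-\frac{i\omega}{c}Me^{-i\omega r/c}\sin\theta\,e^{i\omega t}\right) - \frac{\partial}{\partial\theta}\left(\frac{i\omega}{c}M\frac{e^{-i\omega r/c}}{r}\cos\theta\,e^{i\omega t}\right)\right] = M\frac{\omega^2}{c^2}e^{-i\omega r/c}\sin\theta\,e^{i\omega t}\left[-\frac{1}{r} - \frac{c}{i\omega r^2}\right]. \tag{6}$$
From this we see that the magnetic field always goes in circles
around the axis, as we should expect from the resemblance of the
problem to that of a linear current. At large distances, the
second term vanishes compared with the first, leaving
$$H_\phi = -M\frac{\omega^2}{c^2}\frac{e^{-i\omega r/c}}{r}\sin\theta\,e^{i\omega t}. \tag{7}$$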
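Next we find the electric field E = −grad φ − (1/c) ∂A/∂t. We have
$$E_r = \frac{\partial}{\partial r}\left[M\frac{\partial}{\partial r}\left(\frac{e^{-i\omega r/c}}{r}\right)\right]\cos\theta\,e^{i\omega t} + M\frac{\omega^2}{c^2}\frac{e^{-i\omega r/c}}{r}\cos\theta\,e^{i\omega t} = M\cos\theta\,e^{i\omega t}e^{-i\omega r/c}\left(\frac{2i\omega}{cr^2} + \frac{2}{r^3}\right),$$
$$E_\theta = -\frac{1}{r}M\frac{\partial}{\partial r}\left(\frac{e^{-i\omega r/c}}{r}\right)\sin\theta\,e^{i\omega t} - \frac{\omega^2}{c^2}M\frac{e^{-i\omega r/c}}{r}\sin\theta\,e^{i\omega t} = -M\frac{\omega^2}{c^2}\frac{e^{-i\omega r/c}}{r}\sin\theta\,e^{i\omega t}\left(1 + \frac{c}{i\omega r} - \frac{c^2}{\omega^2 r^2}\right),$$
$$E_\phi = 0. \tag{8}$$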
From these results we see that at large r's (large compared
with a wave length), E and H become equal to each other, at
right angles, and at right angles to the direction of propagation,
just as with a plane wave. They equal
$$-M\frac{\omega^2}{c^2}\frac{e^{-i\omega r/c}}{r}\sin\theta\,e^{i\omega t},$$
the amplitudes, apart from the sinusoidal parts, being $M\frac{\omega^2}{c^2}\frac{\sin\theta}{r}$.
On the other hand, at small distances, the electrical field
approaches that calculated for the dipole by electrostatics,
falling off as 1/r 3 , while the magnetic field, 90 deg. out of phase
with the moment, or in phase with the current, is proportional
to 1/r 2 . At intermediate distances, the transition from the one
situation to the other is of a complicated form. For discus-
sion of radiation fields, it is only the result at large r that inter-
ests us.
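The passage from the near field to the radiation field can be followed numerically from the bracketed factors of Eqs. (6) and (8), since the common prefactor cancels in the ratio of $E_\theta$ to $H_\phi$. The sketch below is illustrative only; the dimensionless variable x = ωr/c and the function name are introduced here and are not part of the original text.

```python
def field_ratio(x):
    """E_theta / H_phi for the oscillating dipole, with x = omega*r/c = 2*pi*r/lambda.

    Uses only the bracketed factors of Eqs. (6) and (8); the common prefactor cancels.
    """
    h_bracket = 1 + 1 / (1j * x)
    e_bracket = 1 + 1 / (1j * x) - 1 / x**2
    return e_bracket / h_bracket

for x in (0.01, 1.0, 1000.0):
    ratio = field_ratio(x)
    print(f"omega*r/c = {x:g}: |E_theta / H_phi| = {abs(ratio):.6g}")
```

At large x the ratio tends to unity, as for a plane wave, while at small x the electric field exceeds the magnetic field by a factor of roughly 1/x, in line with the 1/r³ and 1/r² behavior described above.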
The law giving the electric field at large r's can be put in an
interesting form. First we take the acceleration, $-\omega^2 Me^{i\omega t}$.
We imagine this to be a vector along the axis. Now if we wish
the field at a certain point, we take a plane normal to r passing
through this point, and project the acceleration vector on that
plane, using, however, not the instantaneous value but the value
at the previous time (t − r/c). The result, dividing by rc², gives
$$-M\frac{\omega^2}{c^2}\frac{e^{i\omega(t - r/c)}}{r}\sin\theta,$$
the correct value for the field. We see
from this that the dipole sends out maximum radiation to the
sides, none along the line of its motion. There is an interesting
extension of this to the case of a particle vibrating, not in a line,
but in an arbitrary ellipse (the most general sinusoidal motion).
To get the field, we again project the acceleration vector, which
is proportional to the displacement, on the normal plane. Thus
the vector E in general traces out an ellipse, and the wave is
elliptically polarized. An interesting case is that in which the
charge rotates in a circle. Then at a point along the axis, the
resulting light is circularly polarized; at a point in the plane
of the circle, it is linearly polarized; between, it is elliptically
polarized.
180. The Hertz Vector. — There is another interesting way of
considering the dipole solution, due to Hertz. The scalar and
vector potentials satisfy the relation
$$\operatorname{div} A + \frac{1}{c}\frac{\partial\phi}{\partial t} = 0.$$
It would be convenient to have only one quantity from which the
electromagnetic field can be derived. It is possible to find such
a quantity, a vector Π, called the Hertz vector. The above
relation can be satisfied identically if we place
$$A = \frac{1}{c}\frac{\partial\Pi}{\partial t}, \qquad \phi = -\operatorname{div}\Pi. \tag{9}$$
This vector Π satisfies the wave equation with no subsidiary
conditions such as are imposed on the vector and scalar potentials.
If any solution Π of the wave equation is found, then this represents
an electromagnetic field, and the electric and magnetic
fields are given by
$$E = \operatorname{grad}\operatorname{div}\Pi - \frac{1}{c^2}\frac{\partial^2\Pi}{\partial t^2}, \qquad H = \frac{1}{c}\operatorname{curl}\frac{\partial\Pi}{\partial t}. \tag{10}$$
It turns out that the Hertz vector representing the field of an
oscillating dipole is simply a spherically symmetrical solution
of the wave equation. The correct solution, representing an
outgoing wave, is
$$\Pi = \frac{p(t - r/c)}{r}, \tag{11}$$
so that, if p represents the dipole moment of our oscillating charge
(including the time variation) pointing along the z axis, it is easy
to show that the vector and scalar potentials derived from this
Hertz vector are just those derived in the previous sections. For
example, the vector potential
$$A = \frac{1}{cr}\frac{\partial p(t - r/c)}{\partial t},$$
and if
$$p = Me^{i\omega(t - r/c)},$$
the dipole moment, then
$$\frac{\partial p}{\partial t} = i\omega Me^{i\omega(t - r/c)},$$
giving for A the value we have already found.
In finding the vector potential, we must remember that Π is a
vector pointing along the z axis and has the components:
$$\Pi_r = \Pi\cos\theta; \quad \Pi_\theta = -\Pi\sin\theta; \quad \Pi_\phi = 0.$$
If we take the divergence of this vector with a negative sign we
are led to our first result for the scalar potential. The fields
E and H must, of course, be the same as those discussed, since
the vector and scalar potentials are identical. For convenience
we write the expressions for them in vector notation. From the
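above equations relating E and H to Π, we have
$$E = \operatorname{grad}\operatorname{div}\left[\frac{p(t - r/c)}{r}\right] - \frac{1}{c^2}\frac{p''(t - r/c)}{r}$$
and
$$H = \frac{1}{c}\operatorname{curl}\left[\frac{p'(t - r/c)}{r}\right], \tag{12}$$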
where the dashes denote differentiation with respect to t. These
expressions lead to the same values we have been using when p
varies sinusoidally with its argument. They are somewhat more
general since they hold for any periodic motion of the dipole.
181. Intensity of Radiation from a Dipole. — We can easily
compute Poynting's vector, and find the total rate at which the
dipole is radiating energy. Poynting's vector is evidently
$$\frac{c}{4\pi}\,\frac{M^2\omega^4\cos^2\omega(t - r/c)}{c^4 r^2}\,\sin^2\theta,$$
the time average being
$$\frac{M^2\omega^4\sin^2\theta}{8\pi c^3 r^2}.$$
Let us now integrate over the surface of a sphere of radius r, to
get the total radiation. The element of area is r² sin θ dθ dφ, so
that the result is
$$\frac{M^2\omega^4}{8\pi c^3}\int_0^{2\pi}\!\!\int_0^{\pi}\sin^3\theta\,d\theta\,d\phi = \frac{M^2\omega^4}{3c^3} = \frac{16\pi^4\nu^4 M^2}{3c^3}, \tag{13}$$
if ν = ω/2π is the frequency. This is a well-known formula for
the radiation from a dipole. The two essential features are that
the radiation is proportional to the square of the amplitude
of the dipole, and to the fourth power of the frequency.
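The angular integration leading to Eq. (13) can be verified with a simple quadrature. The sketch below integrates the time-averaged Poynting vector over a sphere by the midpoint rule and compares the result with the closed form; the numerical values of M, ω, and c are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

def total_radiated_power(M, omega, c, n_theta=20_000):
    """Integrate the time-averaged Poynting vector over a sphere (midpoint rule).

    The r^2 of the area element cancels the 1/r^2 of the flux, so r drops out.
    """
    d_theta = math.pi / n_theta
    angular = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        angular += math.sin(theta) ** 3 * d_theta   # integral of sin^3 over 0..pi
    prefactor = M**2 * omega**4 / (8 * math.pi * c**3)
    return prefactor * 2 * math.pi * angular        # the phi integral gives 2*pi

def closed_form(M, omega, c):
    """Eq. (13): M^2 omega^4 / (3 c^3)."""
    return M**2 * omega**4 / (3 * c**3)

numeric = total_radiated_power(M=1.0, omega=2.0, c=3.0)
exact = closed_form(M=1.0, omega=2.0, c=3.0)
print(numeric, exact)
```

Doubling ω in either function multiplies the result by sixteen, which is the fourth-power dependence on frequency noted in the text.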
182. Scattering of Light. — In addition to direct radiation, it
is important to consider the process of scattering of light. Sup-
pose that a wave, for example a plane wave, falls on a dipole of
the sort we have considered. Let the dipole have an equation of
motion
$$m(\ddot{x} + \omega_0^2 x) = eE,$$
if m is the mass, e the charge, of the vibrating particle, E the
external field, and x the displacement. Then, letting $E = E_0 e^{i\omega t}$,
we have ex, the moment, equal to
$$\frac{e^2}{m}\,\frac{E_0 e^{i\omega t}}{\omega_0^2 - \omega^2}.$$
This is the oscillating dipole moment produced by the field.
Then the dipoles set into motion by the wave will emit light,
which is scattered. The rate of emission by a single dipole is
$$\frac{\omega^4}{3c^3}\left(\frac{e^2}{m}\,\frac{E_0}{\omega_0^2 - \omega^2}\right)^2.$$
Often the scattering is measured by the amount of light scattered
per cubic centimeter of material, divided by the intensity of the
incident light. The latter is (c/4π)(E × H), its mean value
being cE₀²/8π. Further, the amount scattered per cubic centimeter
is N times that scattered by a single dipole, if there are N
dipoles and they scatter independently (as the molecules of a
gas do). Hence for the scattering we have
$$\frac{8\pi Ne^4}{3c^4 m^2}\,\frac{1}{\left[(\omega_0/\omega)^2 - 1\right]^2}. \tag{14}$$
There are three important special cases of this scattering
formula :
(a) The Rayleigh Scattering Formula. — This is what we have in
the case where ω is small compared with ω₀. Since for ordinary
atoms ω₀ is a frequency in the ultra-violet, we have this condition
in the visible range of the spectrum. Then we may neglect 1
compared with (ω₀/ω)², obtaining for the scattering
$$\frac{8\pi Ne^4\omega^4}{3c^4 m^2\omega_0^4}. \tag{15}$$
The scattering is here proportional to ω⁴, or to 1/λ⁴, where λ
is the wave length. This proportionality to the inverse fourth
power of the wave length means that the short blue and violet
waves will be scattered much more than the long red ones. An
example is the scattering of light by the sky. The air molecules
scatter, and on account of the law they scatter much more blue
light, resulting in the blue color. The transmitted light thus
has the blue removed and looks red, explaining the color near the
sun at sunset.
(b) The Thomson Scattering Formula. — In the other limiting
case of x-rays, when the frequency is large compared with ω₀, the
scattering becomes
$$\frac{8\pi Ne^4}{3c^4 m^2}. \tag{16}$$
This formula gives a scattering independent of the wave length,
and is very important in discussing x-ray scattering by substances.
(c) Resonant Scattering. — If ω is nearly equal to ω₀, it is evident
that the denominator can become very small (of course, if we
consider damping, it will not vanish), resulting in a very large
scattering. This phenomenon can be much more conspicuous
than the two other cases. Thus a bulb filled with sodium vapor,
which has a natural frequency in the visible region, illuminated
with light of this color, will scatter so much light that it appears
luminous. This phenomenon is called resonance scattering.
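The three special cases can be compared directly with the general formula (14). The following sketch works in units where N, e, m, and c are all 1 (an assumption made only to keep the numbers simple), and the two wave lengths used for the blue-to-red comparison are illustrative values, not taken from the text.

```python
import math

def scattering(omega, omega0, N=1.0, e=1.0, m=1.0, c=1.0):
    """Eq. (14): light scattered per unit volume divided by the incident intensity."""
    return (8 * math.pi * N * e**4 / (3 * c**4 * m**2)) / ((omega0 / omega) ** 2 - 1) ** 2

def rayleigh(omega, omega0, N=1.0, e=1.0, m=1.0, c=1.0):
    """Eq. (15): the limit omega << omega0, proportional to omega^4."""
    return 8 * math.pi * N * e**4 * omega**4 / (3 * c**4 * m**2 * omega0**4)

def thomson(N=1.0, e=1.0, m=1.0, c=1.0):
    """Eq. (16): the limit omega >> omega0, independent of wave length."""
    return 8 * math.pi * N * e**4 / (3 * c**4 * m**2)

# Low- and high-frequency limits of the general formula.
low = scattering(0.01, 1.0) / rayleigh(0.01, 1.0)
high = scattering(100.0, 1.0) / thomson()
# Near resonance the general formula grows very large (no damping included).
near_resonance = scattering(0.999, 1.0) / thomson()
# Rayleigh ratio for blue (4500 A) to red (6500 A) light; wave lengths are illustrative.
blue_to_red = (6500.0 / 4500.0) ** 4
print(low, high, near_resonance, blue_to_red)
```

With these illustrative wave lengths the blue light is scattered a little more than four times as strongly as the red.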
183. Polarization of Scattered Light. — We observe that, if
the incident light is plane polarized, the dipoles will all vibrate
along the direction of its electric vector. Thus there will be no
intensity in the scattered light along this direction. The scat-
tered light will have a maximum intensity at right angles, and it
will be plane polarized. It was by experiments based on these
facts that the polarization of x-rays was first found.
184. Coherence and Incoherence of Light. — In the previous
paragraphs we calculated the scattering by N molecules which
scatter independently by adding the intensities of the scattered
radiation from each. The justification for this requires closer
consideration. Since the Maxwell equations are linear the field
vectors E and H satisfy the superposition principle, so that we
should expect the total amplitude to be the sum of the amplitudes
in the various waves, in which case the total intensity, being the
square of the amplitude, would certainly not be the sum of
the separate intensities. The key to this situation is found in the
relations between the phases of the various waves which we are
adding: if they are all in the same phase, they are said to be
coherent, and the amplitudes add, while if they are in phases
having random relations to each other they are incoherent and
the intensities add.
To be more precise, let us consider the sum of a number of
sinusoidal waves, all of the same frequency, but of different
amplitude and phase:
$$\sum_k A_k\cos(\omega t - \alpha_k) = \left(\sum_k A_k\cos\alpha_k\right)\cos\omega t + \left(\sum_k A_k\sin\alpha_k\right)\sin\omega t.$$
If all the phases should be the same, say $\alpha_k = 0$, then the
amplitudes of the cosine and sine terms will be $\sum_k A_k$ and 0,
respectively, so that the amplitudes add, and the intensity is
proportional to $(\sum_k A_k)^2$, or if, for instance, there are N terms of
equal amplitude, proportional to N² times the intensity of a
single wave. On the other hand, the α's may be completely
independent of each other, meaning that each α is equally likely
to have any value between 0 and 2π, independent of the others.
Then we can see that $\sum_k A_k\cos\alpha_k$ will be far less than $\sum_k A_k$, since
we shall have just about as many terms with positive values of
$\cos\alpha_k$ as with negative, and the terms will just about cancel.
The cancellation will not be complete, however, as we see if we
compute the squares of the summations, which we must add to
get the intensity. The square of the first summation, for
instance, is
$$\left(\sum_k A_k\cos\alpha_k\right)^2 = \sum_k A_k^2\cos^2\alpha_k + \sum_{k \neq l} A_k A_l\cos\alpha_k\cos\alpha_l.$$
We must find the average of this, taking the α's as independent.
That is, we must perform the operation of integrating each α
from 0 to 2π and dividing by 2π. When we do this, the terms
$\cos^2\alpha_k$ average to ½, while the products of two independent
α's average to zero, leaving $\frac{1}{2}\sum_k A_k^2$. The other summation
gives an equal term, so that we find that the mean square ampli-
tude, or mean intensity, averaged over phases, is the sum of the
individual intensities. This is the state of complete incoherence,
in which for N waves the intensity is N times the intensity of
a single wave, rather than N² as for the coherent case. The
cancellation of waves, then, while not complete, is more and more
perfect as N increases, for N becomes a smaller and smaller fraction
of N² as N increases.
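The averaging over phases described above can be imitated by a small Monte Carlo experiment. The sketch below adds N waves of unit amplitude with phases drawn at random and averages the squared resultant amplitude over many trials; the number of waves, the number of trials, and the seed are arbitrary choices made for illustration.

```python
import math
import random

def mean_intensity(n_waves, trials=4000, seed=0):
    """Average squared amplitude of n unit-amplitude waves with random phases.

    The squared amplitude is (sum cos a_k)^2 + (sum sin a_k)^2, following the
    decomposition into cosine and sine terms given in the text.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        c = s = 0.0
        for _ in range(n_waves):
            a = rng.uniform(0.0, 2.0 * math.pi)
            c += math.cos(a)
            s += math.sin(a)
        total += c * c + s * s
    return total / trials

N = 50
incoherent = mean_intensity(N)   # should come out near N, within sampling error
coherent = float(N) ** 2         # all phases equal: amplitudes add, intensity N^2
print(incoherent, coherent)
```

The incoherent average stays close to N, while the coherent value is N², so the ratio of the two shrinks as 1/N, as stated above.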
We can now apply the idea of coherence to the scattering of
light from a gas. The phase of the wave at a point P, scattered
by an atom at a (Fig. 49), depends on the total path the light has
traveled from the source to a, and from a to P. Since the mole-
cules of a gas have no fixed positions with respect to each other,
these paths are in a random relation to each other, the phases are
incoherent, and we are justified in adding intensities. Such a
procedure would not be allowed for example in discussing the
scattering of x-rays by crystals, where the various atoms are in
fixed lattice positions. Indeed, here we do get interference, and
it is just by studying the interference patterns so obtained that
we obtain our information about the lattice structure of crystals.
Fig. 49. Scattering from atoms.
(a) At right angles to the incident beam, where the paths of the scattered light
from the atoms a, b, c are of different and random lengths, so that there is no
regular interference, and we add intensities.
(b) Scattering straight ahead, where the paths are approximately equal, and
the beams interfere to produce the refracted beam.
Neither would the procedure be allowed in discussing the scatter-
ing from a gas in the same direction as the incident radiation,
as in (b). For then the paths of the beams scattered from the
various atoms are approximately equal, the waves are in phase,
and they produce a resultant field at P proportional to the ampli-
tude, rather than the intensity, of the incident wave. This
scattered field can be shown to interfere with the incident wave
in such a way that the resultant produces the refracted wave.
The close relation of our scattering formulas to the formulas for
the index of refraction, therefore, becomes clear, and it is evident
that our two problems of refraction and scattering, though we
have treated them separately, are really parts of the same sub-
ject. The scattering straight ahead produces refraction, and
does not depend on the exact placing of the molecules. Scatter-
ing to the sides, on the other hand, does not occur unless the
molecules have a random arrangement, and then the intensity,
not the amplitude, is proportional to the number of molecules.
185. Coherence and the Spectrum. — The amplitude of a wave,
as a function of time, is never exactly sinusoidal, but is really a
much more complicated function. It is often desirable, how-
ever, to resolve such a function into a spectrum; that is, write it
as a sum of sinusoidal waves of different frequency. This can
be conveniently done by Fourier series. To do this, we take a
Fourier series with an extremely long period T, so long that all
the phenomena we are interested in take place in a time short
compared with T, so that we are not bothered by the periodicity
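of the series. Then, if our function is $f(t)$, we have
$$f(t) = \sum_n \left(A_n \cos \omega_n t + B_n \sin \omega_n t\right),$$
where
$$A_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t) \cos \omega_n t \, dt, \qquad B_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t) \sin \omega_n t \, dt, \qquad \omega_n = \frac{2\pi n}{T}.$$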
This gives an analysis into an infinite number of sine waves, with
frequencies spaced very close together (on account of the very
small size of $2\pi/T$). No actual, physical wave is then perfectly
sinusoidal, in the sense of having but one term in this expansion
with an amplitude different from zero. We shall show in a prob-
lem that even a perfectly sinusoidal wave which persists for only
a finite length of time will have appreciable amplitudes for all
those frequencies within a range $\Delta\omega$, equal in order of magnitude
to the reciprocal of the time during which the wave persists, so
that a sine wave of long lifetime will correspond to a sharp line
in the spectrum, while a rapidly interrupted wave will give a
broad line. This is observed experimentally in the fact that
increasing the pressure of a gas, thereby making collisions more
frequent and interrupting the radiating of the atoms, broadens
the spectral lines.
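The inverse relation between lifetime and line breadth can be checked numerically. The sketch below assumes a sine wave of frequency `omega0` that lasts from t = 0 to t = `duration` and is zero otherwise; it evaluates the Fourier amplitude by a simple midpoint rule and measures the frequency offset at which the amplitude falls to half its peak. The names, the step sizes and the half-maximum criterion are all illustrative assumptions.

```python
import cmath
import math

def spectral_amplitude(omega, omega0, duration, steps=4000):
    """|integral_0^duration sin(omega0 t) exp(-i omega t) dt| by the midpoint rule."""
    dt = duration / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * dt
        total += math.sin(omega0 * t) * cmath.exp(-1j * omega * t)
    return abs(total * dt)

def half_width(omega0, duration):
    """Frequency offset at which the amplitude first falls to half its peak value."""
    peak = spectral_amplitude(omega0, omega0, duration)
    step = 0.05 / duration  # scan in steps small compared with 1/duration
    offset = 0.0
    while spectral_amplitude(omega0 + offset, omega0, duration) > 0.5 * peak:
        offset += step
    return offset
```

Doubling `duration` should roughly halve the value returned by `half_width`, in agreement with the statement that the breadth is of the order of the reciprocal of the lifetime.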
The intensity is proportional to $f^2(t)$, or to the square of the
summation over frequencies. Just as before, this square consists
of terms like $A_n^2 \cos^2 \omega_n t$, and cross terms like
$A_n A_m \cos \omega_n t \cos \omega_m t$. Instantaneously none of these terms are necessarily zero.
But if we average over time, the terms of the first sort average to
$A_n^2/2$, while those of the second sort average to zero. The final
result, then, is that the time average intensity is the sum of the
intensities of the various frequencies:
$$\overline{f^2(t)} = \sum_n \tfrac{1}{2}\left(A_n^2 + B_n^2\right).$$
We are justified in considering the terms connected with a given
n to be the intensity of light of that particular frequency in the
spectrum, so that we have the theoretical method of determining
the spectral analysis of any disturbance. And we see that the
following statement is true: on a time average, sinusoidal waves
of different frequencies are always incoherent, and never interfere.
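This result is easily verified for a finite sum. The sketch below assumes a function built from a few chosen coefficients $A_n$, $B_n$ with $n \geq 1$ (no constant term), and averages $f^2$ over one period T by uniform sampling; the dictionary layout and function names are illustrative.

```python
import math

def mean_square(coeffs, period=1.0, samples=4096):
    """Time average of f(t)^2 over one period, with f built from {n: (A_n, B_n)}."""
    total = 0.0
    for k in range(samples):
        t = period * k / samples
        f = sum(a * math.cos(2 * math.pi * n * t / period) +
                b * math.sin(2 * math.pi * n * t / period)
                for n, (a, b) in coeffs.items())
        total += f * f
    return total / samples

def sum_of_intensities(coeffs):
    """Sum over n of (A_n^2 + B_n^2)/2: the intensities of the separate frequencies."""
    return sum((a * a + b * b) / 2.0 for a, b in coeffs.values())
```

The two functions should agree to within rounding error for any choice of coefficients, since every cross term averages to zero over the period.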
186. Coherence of Different Sources. — It is known experi-
mentally that light from two different sources never interferes;
to get interference we must take light from a single source, split
it into two beams, and allow these beams to recombine. If we
regarded the sources as being monochromatic, it would be hard
to see why this should be, for the amplitudes of two waves of the
same frequency should add, rather than the intensities, and this
is the essence of interference. But when we observe that each
source really is represented by a Fourier series, the situation
becomes plain. For two sources are always so different that their
Fourier series will be entirely different. If we analyze both of
them, the phase of the radiation of frequency $\omega_n$ from one will
be entirely independent of the phase of the corresponding fre-
quency from the other. Thus if we add the disturbances, square,
and average over this random relation between the phases of the
two sources, the cross terms will cancel, and the intensities add.
The randomness comes in this case, not in adding a great many
terms of the same frequency, but in combining the terms of
different frequencies, which are related in entirely independent
ways in the two sources.
Problems
1. Discuss the weakening of sunlight on account of scattering, as the light
passes through the atmosphere. Assume that the molecules of the atmosphere
have a natural frequency at 1,800 Å (where absorption is observed).
Let each molecule contain an electron of this frequency. Assume that the
number of molecules is such as to give the normal barometric pressure.
Find the fractional weakening of a beam due to scattering in passing through
a sheet of thickness ds, and from this set up the differential equation for
intensity as a function of the distance. Solve for the ratio of intensity to
the intensity before striking the atmosphere, for the sun shining straight
down, and for it shining at an angle of incidence of 60 deg. Constants:
$e = 4.774 \times 10^{-10}$ e.s.u., $m = 9.00 \times 10^{-28}$ gm., number of molecules in
1 gm.-mol $= 6.06 \times 10^{23}$.
2. A vibrating dipole radiates energy, and therefore its own energy
decreases. Noting that the rate of radiation is proportional to the energy,
set up the differential equation for the energy of the dipole as a function
of the time. Find how long it takes the dipole to lose half its energy. Work
out numerical values for the sort of dipole considered in Prob. 1.
3. Using the results of Prob. 2, find the equivalent damping term which
would make the dipole lose energy at the same rate as the radiation. This
damping is called the radiation resistance.
4. Show that the values for E and H, which we have found, satisfy Max-
well's equations, by direct calculations in polar coordinates.
5. Derive the expressions for E and H in terms of the Hertz vector $\Pi$ from
the equations defining $\Pi$.
6. Show that the fields E and H in terms of p(t - r/c) and its time deriva-
tives reduce to the values in terms of the dipole moment M.
7. Show that near an oscillating dipole the magnetic field is given by
$$\mathbf{H} = -\frac{1}{c r^3}\left[\mathbf{r} \times \mathbf{p}'(t)\right]$$
and thus can be derived from the Biot-Savart law when we place
p'(t) = I(t)ds,
where I(t) is the current and ds an element of length in the direction of the
dipole.
8. Show from the Hertz vector for the dipole case, that at large distances
from the dipole,
and
$$\mathbf{H} = -\frac{1}{c^2 r^2}\left[\mathbf{r} \times \mathbf{p}''\!\left(t - \frac{r}{c}\right)\right].$$
9. Suppose we have an alternating current of maximum value $I$ (measured
in e.m.u.) in a vertical antenna of length $l$. Treating this as a dipole,
show that the total radiation is
$$\frac{4\pi^2 c\, l^2 I^2}{3\lambda^2}.$$
Show that the equivalent resistance necessary to produce the same power
loss (the radiation resistance) is
$$R = 80\pi^2 \frac{l^2}{\lambda^2}$$
if R is measured in ohms, and if we place $c = 3 \times 10^{10}$ cm. per second.
10. Find the spectrum of a disturbance which is zero up to t = 0, is
sinusoidal until $t = T_0$, then is zero permanently. (Hint: make the period T
of the Fourier series indefinitely large compared with $T_0$.)
11. Find the spectrum of a disturbance which starts at t = 0, and is a
sinusoidal damped wave after that. Show that the curve for intensity as a
function of frequency has the same form as a resonance curve, in general,
and that its breadth is connected with the logarithmic decrement in the
same way. This illustrates an important principle : the emission and absorp-
tion spectrum of the same substance are essentially equivalent. The
resonance curve represents the absorption curve, on account of the relation
of forced oscillators and dispersion, while the damped wave is the emission.
(Hint: make the period T indefinitely large compared with the time taken
for the oscillation to fall to $1/e$ of its value.)
CHAPTER XXVI
HUYGENS' PRINCIPLE AND GREEN'S THEOREM
Huygens' principle is a well-known elementary method for
treating the propagation of waves, and in this chapter we shall
consider its mathematical background, showing its close connec-
tion with Green's theorem. The method is this: From each
point of a given wave front, at t = 0, we assume that spherical
wavelets start out. At time t, each wavelet will have a radius
ct, and the envelope of these wavelets will form a new surface,
which according to Huygens is simply the resulting wave front
at this later time t. Thus, if the original wave front was a
plane, it is easy to see that the final one will be a plane distant
by the amount ct, while, if it is a sphere, the final wave front
will be a concentric sphere whose radius is larger by ct. In
either case this construction gives us the correct answer, agreeing
with the more usual methods of computation. The one diffi-
culty is that our construction would give a wave traveling back-
ward, as well as one traveling forward; the solution of this
difficulty appears when we use the methods of this chapter.
We may look at our process in a slightly different way, not
used by Huygens, but developed later when the interference
of light was being worked out. Suppose that, instead of taking
the envelope of all the spherical wavelets, we consider that each
of these wavelets has a certain amplitude, consisting of a sinu-
soidal vibration. We then add these vibrations, just as if
the wavelets were being sent out by interfering sources of light,
and the resulting amplitude is taken to be that in the actual
wave. This process can be shown to lead to essentially the same
result, and it is this which can be justified theoretically. As
a further generalization, it is not necessary to take the original
surface to be a wave front; it can be any surface, so long as we
allow the scattered wavelets to have the suitable phase and
amplitude.
Our final result, then, is this: The disturbance at a point P
of a wave field may be obtained by taking an arbitrary surface,
and performing an integration over this surface. The contribu-
tion of a small element of area dS of this surface equals the
amplitude at P of a spherical wave starting from dS at such a
time that it reaches P at time t. That is, if the distance from
dS to P is denoted by r, this wave is of the form $f(t - r/c)/r$.
Now the contribution, for a given wavelet, must surely be proportional
to the disturbance at dS, which we may call $f$ (a function
of time and position), and to dS. Hence we have something
like $\iint \frac{f(t - r/c)}{r}\, dS$ for the final result. We are thus led to
a formula of this sort:
$$f \ (\text{at a point } P) = \text{constant} \times \iint \frac{f(t - r/c)}{r}\, dS,$$
where the surface integral is over a surface surrounding P.
This suggests the solution of Laplace's equation by Green's
method, where we had the value of a function $\phi$ at an interior
point of a region where $\nabla^2 \phi$ was zero as a surface integral over
the boundary. As a matter of fact, an analogue to Green's
theorem is the correct statement of Huygens' principle, and
replaces the formula which we have derived intuitively above,
and which is not quite correct.
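187. The Retarded Potentials. — In Chap. XXI, we have
introduced scalar and vector potentials $\phi$ and $\mathbf{A}$, giving the
electric and magnetic fields by the relations
$$\mathbf{E} = -\operatorname{grad} \phi - \frac{1}{c}\frac{\partial \mathbf{A}}{\partial t}, \qquad \mathbf{H} = \operatorname{curl} \mathbf{A}.$$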
For these potentials we found the equations
or D'Alembert's equation. We ask first how to get a solution
of D'Alembert's equation analogous to the simple solution
of Poisson's equation. We shall not carry through the proof of
the solution, for that is rather complicated. But the essence
of Poisson's equation is that we divide up all space into volume
elements dv, and that $\rho\, dv/r$ is the potential of the point charge
$\rho\, dv$ at a distance r. This potential, of course, is a solution of
Laplace's equation, as is 1/r, at all points except for r = 0,
where the charge is located.
In a similar way, to solve D'Alembert's equation, we divide
up our charge into small elements, and write the potential
as the sum of the separate potentials of these small charges.
The separate potentials must now be, except at r = 0, solutions
of the wave equation. This means that, since any change of
the charge will be propagated outward with the velocity c,
the potential at a given point of space resulting from a particular
charge cannot be derived from the instantaneous value of the
charge, but must be determined, instead, by what the charge
was doing at a previous instant, earlier by the time r/c required
for the light to travel out from the charge to the point we are
interested in. In other words, if $\rho(x, y, z, t)$ is the charge density
at x, y, z at the time t, and r is the distance from x, y, z to x',
y', z', where we are finding the field, we shall expect the potential
of the charge in dv to be
$$\frac{\rho(x, y, z, t - r/c)\, dv}{r},$$
and for the whole potential we shall have
$$\phi = \iiint \frac{\rho(x, y, z, t - r/c)}{r}\, dv. \qquad (3)$$
This solution is, as a matter of fact, correct. We have already
seen that $f(t - r/c)/r$ is a solution of the wave equation, where
$f$ is any function, so that the integrand actually satisfies the
wave equation, as in the earlier case $1/r$ satisfied Laplace's
equation. The potential $\phi$ determined by this equation is
called a retarded potential, since any change in the charge is not
instantaneously observable in the potential at a distant point,
but its effect is retarded on account of the finite velocity of
light. The solution for the vector potential is determined in
an analogous manner.
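The statement that $f(t - r/c)/r$ satisfies the wave equation for any function $f$ can be tested by finite differences. The following sketch assumes a purely radial disturbance, for which the Laplacian reduces to $(1/r)\,\partial^2 (r u)/\partial r^2$; the step size `h` and the function names are illustrative.

```python
import math

def spherical_wave(f, r, t, c):
    """Retarded spherical wave f(t - r/c)/r."""
    return f(t - r / c) / r

def wave_equation_residual(f, r, t, c, h=1e-3):
    """Central-difference estimate of laplacian(u) - (1/c^2) d^2u/dt^2 for radial u."""
    u = lambda rr, tt: spherical_wave(f, rr, tt, c)
    # For a function of r alone, laplacian(u) = (1/r) d^2(r u)/dr^2.
    laplacian = ((r + h) * u(r + h, t) - 2.0 * r * u(r, t)
                 + (r - h) * u(r - h, t)) / (h * h * r)
    d2u_dt2 = (u(r, t + h) - 2.0 * u(r, t) + u(r, t - h)) / (h * h)
    return laplacian - d2u_dt2 / (c * c)
```

For any smooth `f` the residual should be small compared with either term separately, and should decrease as `h` is made smaller, so long as r is not zero.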
188. Mathematical Formulation of Huygens' Principle. — In
discussing the application of Green's theorem to the solution of
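Poisson's equation in a finite region of space, we have proved
that
$$\phi = \iiint \frac{\rho}{r}\, dv - \frac{1}{4\pi}\iint \left[\phi\, \frac{\partial (1/r)}{\partial n} - \frac{1}{r}\frac{\partial \phi}{\partial n}\right] dS,$$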
the result of the last paragraph being the special case where the
region of integration is infinite and the surface integral drops
out. We now wish to find an analogous theorem for use with
D'Alembert's equation. Here again we shall not give a real
derivation, for this is very complicated, but shall merely describe
the formula which results, and show that it is plausible. We have
already discussed the volume integral. In the surface integral,
the first term gave the potential of a double layer of strength
$\phi/4\pi$, the second the potential of a surface charge of magnitude
$\frac{1}{4\pi}\frac{\partial \phi}{\partial n}$. Each of the terms, $\phi\, \frac{\partial (1/r)}{\partial n}$ and $\frac{1}{r}\frac{\partial \phi}{\partial n}$, is a solution
of Laplace's equation since it represents the potential of certain
charges.
In our case of the wave equation, the formula has two corre-
sponding terms: one giving the potential of a double layer,
the other of a surface charge. But now the charges change with
time, so that we must use solutions of the wave equation in
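the integral. We have already seen that the solution of the wave
equation corresponding to $1/r$ is $f(t - r/c)/r$; hence we expect the
second term to be replaced by $\frac{1}{r}\left(\frac{\partial \phi}{\partial n}\right)_{(t - r/c)}$, where this means
that the partial derivative, which is now a function of time as
well as of position on the surface, is to be computed, not at t,
but at $t - r/c$. Similarly, corresponding to $\frac{\partial (1/r)}{\partial n}$, the difference
of the potentials of two equal and opposite point charges at
neighboring points of space, we have $\frac{\partial}{\partial n}\left[\frac{f(t - r/c)}{r}\right]$. Remembering
that in differentiating with respect to n we must regard r
as a variable each time it occurs, this is
$$f(t - r/c)\,\frac{\partial (1/r)}{\partial n} + \frac{1}{r}\frac{\partial f(t - r/c)}{\partial n} = -\frac{\cos(n, r)}{r}\left[\frac{f(t - r/c)}{r} + \frac{1}{c}\frac{\partial f(t - r/c)}{\partial t}\right],$$
where in the last term we have used the relation
$$\frac{\partial f(t - r/c)}{\partial n} = \frac{\partial f(t - r/c)}{\partial (t - r/c)}\,\frac{\partial (t - r/c)}{\partial n} = \frac{\partial f(t - r/c)}{\partial t}\left(-\frac{1}{c}\frac{\partial r}{\partial n}\right) = -\frac{1}{c}\cos(n, r)\,\frac{\partial f(t - r/c)}{\partial t}.$$
We should, therefore, expect to have
$$\phi = \iiint \frac{\rho(t - r/c)}{r}\, dv + \frac{1}{4\pi}\iint \left\{\frac{\cos(n, r)}{r}\left[\frac{\phi(t - r/c)}{r} + \frac{1}{c}\frac{\partial \phi(t - r/c)}{\partial t}\right] + \frac{1}{r}\left(\frac{\partial \phi}{\partial n}\right)_{t - r/c}\right\} dS. \qquad (4)$$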
This, as a matter of fact, is the correct formula. The first term
represents the potential due to all the charge within the volume ;
if there are no sources of light within this volume, the volume
integral is then zero, and that is the usual case with optical
applications. The surface integral represents the remaining
potential as arising from a distribution of charge and double
distribution about the surface, each surface element sending
out a wavelet which on closer examination proves to be the
Huygens' wavelet we are interested in. Thus, starting from
Green's theorem and D'Alembert's equation, we have arrived
at a mathematical formulation of Huygens' theorem.
To give a suggestion of the rigorous proof of this formula,
we could proceed as follows: First, we notice that $\phi$ defined by
this integral satisfies the wave equation; for since each term
of the integrand separately is a solution, the sum must also
be. Now it follows from this, although we have not proved it,
that if the solution reduces to the correct boundary values at
all points of the boundary, the solution must be the correct one,
the reason being essentially that the boundary values determine
a solution uniquely, so that, if we have one solution of the
equation with the right boundary values, it must be the only
correct solution. We must then show that the $\phi$ defined by the
integral actually has the correct boundary values. This could
be done by a more careful treatment, and we should then have a
demonstration of the formula. The more conventional proof,
however, is a fairly direct though complicated application of
Green's theorem.
189. Application to Optics. — We shall now take our general
formula (4), and apply it to the cases we meet in optics, showing
that it reduces to something like the formula which we had earlier
derived intuitively. We suppose that light is emitted by a point
source, and that the value of some quantity connected with, and
satisfying, the wave equation (one of the components of the fields
or potentials — they all satisfy the same relations) has the form
$\frac{A}{r_1} e^{2\pi i \nu (t - r_1/c)}$, where $r_1$ is the distance from the source to the point
where we wish to find the disturbance. Then we wish to get
the disturbance at P, not by direct calculation, but by using
Huygens' principle. Suppose we take a closed surface. This
surface can either surround the source, or the point P where we
wish the disturbance. In any case, we have n as the normal
pointing out of the part of space in which P is located. At a
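point of the surface, $\phi = \frac{A}{r_1} e^{2\pi i \nu (t - r_1/c)}$, where $r_1$ is the distance
from the source to the point on the surface. We then have,
if r is the distance from P to a point on the surface,
$$\phi(t - r/c) = \frac{A}{r_1}\, e^{2\pi i \nu [t - (r + r_1)/c]},$$
$$\frac{\partial \phi(t - r/c)}{\partial t} = \frac{2\pi i \nu A}{r_1}\, e^{2\pi i \nu [t - (r + r_1)/c]},$$
$$\frac{\partial \phi(t - r/c)}{\partial n} = -A \cos(n, r_1)\left(\frac{1}{r_1} + \frac{2\pi i \nu}{c}\right)\frac{e^{2\pi i \nu [t - (r + r_1)/c]}}{r_1}.$$
Thus finally
$$\phi = \frac{1}{4\pi}\iint \frac{A}{r r_1}\, e^{2\pi i \nu [t - (r + r_1)/c]}\left[\left(\frac{1}{r} + \frac{2\pi i \nu}{c}\right)\cos(n, r) - \left(\frac{1}{r_1} + \frac{2\pi i \nu}{c}\right)\cos(n, r_1)\right] dS. \qquad (5)$$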
In this formula, as in Chap. XXV, we have two sorts of terms,
some significant at small values of r and $r_1$, others at large.
We easily see that, if r and $r_1$ are large compared with a wave
length, as is always the case in optics, the only terms we need
retain are those in $2\pi i \nu/c = 2\pi i/\lambda$. Hence to this approximation
$$\phi = \iint \frac{iA}{2\lambda r r_1}\, e^{2\pi i \nu [t - (r + r_1)/c]}\left[\cos(n, r) - \cos(n, r_1)\right] dS. \qquad (6)$$
This final form suggests our earlier, intuitive formulation of
Huygens' principle. The incident amplitude at dS is $\frac{A}{r_1} e^{2\pi i \nu (t - r_1/c)}$.
Now we set up, starting from dS, a wavelet whose amplitude
is this value, retarded by the amount r/c, divided by r, and
multiplied by the factor $\frac{i}{2\lambda}\left[\cos(n, r) - \cos(n, r_1)\right] dS$. This
is just what we should expect, except for the last factor. The
term i introduces a change of phase of 90 deg., not present in
Huygens' form of the principle, but necessary. The term
$\cos(n, r) - \cos(n, r_1)$ makes the wavelets have an amplitude
which depends on angle. When r and $r_1$ are in opposite directions,
which is the case when the surface is between the source
and P, the factor approaches 2, while when r and $r_1$ are parallel,
and the surface is beyond P, it becomes zero. This means that
the wavelets do not travel backwards, thus removing the diffi-
culty noticed earlier in Huygens' method. The wavelets have
an amplitude depending on their wave length, decreasing for
the longer wave lengths.
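The behavior of the angular factor can be illustrated for the two limiting geometries. In the sketch below we assume that $r_1$ is the vector from the source to the surface element, that r is the vector from P to the element (the direction consistent with $\partial r/\partial n = \cos(n, r)$), and that n points out of the region containing P. The function names and the coordinates are illustrative.

```python
import math

def cos_angle(u, v):
    """Cosine of the angle between two vectors given as coordinate tuples."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def obliquity_factor(source, element, point, normal):
    """cos(n, r) - cos(n, r1) for one surface element.

    r1 runs from the source to the element and r from P to the element;
    `normal` points out of the region that contains P.
    """
    r1 = [e - s for e, s in zip(element, source)]
    r = [e - p for e, p in zip(element, point)]
    return cos_angle(normal, r) - cos_angle(normal, r1)
```

With the source at z = -1, the element at the origin and P at z = +1, the factor is 2; with the element lying beyond P on the same line it is 0, so that no wavelet is sent backward.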
190. Integration for a Spherical Surface by Fresnel's Zones. —
Let us now carry out our integration, and verify Huygens'
method, in a simple case. We take the surface to be a sphere,
surrounding the source, and therefore a wave front. We note
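that n is the inner normal of the sphere. Thus $r_1$ is constant
all over the sphere, and $\cos(n, r_1) = -1$ at all points, so that
the formula simplifies to
$$\phi = \frac{iA}{2\lambda r_1}\, e^{2\pi i \nu (t - r_1/c)}\iint \frac{e^{-2\pi i r/\lambda}}{r}\left[\cos(n, r) + 1\right] dS.$$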
Now suppose we introduce, as a coordinate on the sphere, the
distance r from the point P; that is, we cut the sphere with
spheres concentric with P, laying off zones between them, as in
Fig. 50. We can easily get the area between r and r + dr, and
hence the element of area. Take as an axis the line joining
the source and the point P, and consider a zone making an angle
between $\theta$ and $\theta + d\theta$ with the axis. The area of the zone is
$2\pi r_1^2 \sin\theta\, d\theta$. But now by the law of cosines, if R is the distance
from the source to P, $r^2 = R^2 + r_1^2 - 2Rr_1 \cos\theta$, and differentiating,
$2r\, dr = 2Rr_1 \sin\theta\, d\theta$. Hence for the area of the zone
we have $\frac{2\pi r_1 r}{R}\, dr$. Introducing this, we have
$$\phi = \frac{\pi i A}{\lambda R}\, e^{2\pi i \nu (t - r_1/c)}\int_{r_{\min}}^{r_{\max}} e^{-2\pi i r/\lambda}\left[\cos(n, r) + 1\right] dr,$$
where $r_{\min} = R - r_1$, $r_{\max} = R + r_1$.
To carry out this integration, we use a device called Fresnel's
zones, giving us an approximate value in a very elementary way.
Fig. 50.— Construction for Fresnel's zones on a sphere surrounding the source.
Beginning with $r_{\min}$, we take a set of zones such that the outer
edge of each corresponds to a value of r just half a wave length
greater than the inner edge. The contributions of successive
zones will almost exactly cancel. The integral, then, consists
of a sum of terms, say $s_1 - s_2 + \cdots + s_n$, where the magnitudes of
$s_1, s_2, \ldots$ vary only very slightly from one to the next. Now
it is true in general that in such a series the sum is approximately
half the sum of the first and last terms. We can see this as
follows. We group the terms
$$\frac{s_1}{2} + \left(\frac{s_1}{2} - s_2 + \frac{s_3}{2}\right) + \cdots + \left(\frac{s_{n-2}}{2} - s_{n-1} + \frac{s_n}{2}\right) + \frac{s_n}{2}.$$
Now, on account of the slow variation of magnitude, we have very nearly
$s_k = \frac{s_{k-1} + s_{k+1}}{2}$. If this
were so, however, each of the parentheses would vanish, leaving
only $\frac{s_1 + s_n}{2}$. In our case, the contribution of the first zone is to be
considered, but that of the last zone is practically zero, on account
of the factor cos (n, r) + 1, so that the result is half the first zone.
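The rule for the sum of a slowly varying alternating series can be tried on a simple example. The sketch below assumes an odd number of terms, as in the grouping above, and uses the illustrative magnitudes $s_k = 1/(100 + k)$; the function names are likewise illustrative.

```python
def alternating_sum(terms):
    """s1 - s2 + s3 - ... for a list of magnitudes."""
    return sum(s if k % 2 == 0 else -s for k, s in enumerate(terms))

def half_first_plus_last(terms):
    """The estimate (s1 + sn)/2, appropriate for an odd number of terms."""
    return 0.5 * (terms[0] + terms[-1])

# 201 slowly varying magnitudes, s_k = 1/(100 + k) for k = 1, ..., 201.
terms = [1.0 / (100 + k) for k in range(1, 202)]
exact = alternating_sum(terms)
estimate = half_first_plus_last(terms)
```

Here `exact` and `estimate` should agree to well within one per cent, the small discrepancy arising from the curvature of $s_k$, which keeps the parentheses from vanishing exactly.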
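Now, in the first zone, $\cos(n, r) + 1$ is so nearly equal to 2
that we can take it outside the integral, obtaining
$$\phi = \frac{\pi i A}{\lambda R}\, e^{2\pi i \nu (t - r_1/c)}\int_{R - r_1}^{R - r_1 + \lambda/2} e^{-2\pi i r/\lambda}\, dr = \frac{A}{R}\, e^{2\pi i \nu (t - r_1/c)}\, e^{-2\pi i (R - r_1)/\lambda} = \frac{A}{R}\, e^{2\pi i \nu (t - R/c)}, \qquad (7)$$
the correct value.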
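As a check on the zone argument, the integral over r can also be evaluated numerically and compared with the direct wave. The sketch below assumes the expression $\cos(n, r) = (R^2 - r_1^2 - r^2)/(2 r_1 r)$, which follows from the law of cosines with n the inner normal, drops the common time factor $e^{2\pi i \nu t}$, and uses Simpson's rule; the function names and the number of steps per wave length are illustrative.

```python
import cmath
import math

def zone_integral(R, r1, lam, steps_per_wavelength=60):
    """Simpson's rule for the integral of exp(-2 pi i r/lam) [cos(n,r) + 1] dr
    from R - r1 to R + r1."""
    a, b = R - r1, R + r1
    n = int((b - a) / lam * steps_per_wavelength)
    n += n % 2  # Simpson's rule needs an even number of intervals
    h = (b - a) / n

    def g(r):
        cos_nr = (R * R - r1 * r1 - r * r) / (2.0 * r1 * r)
        return cmath.exp(-2j * math.pi * r / lam) * (cos_nr + 1.0)

    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def huygens_amplitude(R, r1, lam, A=1.0):
    """Amplitude at P from the surface integral, time factor omitted (nu/c = 1/lam)."""
    prefactor = 1j * math.pi * A / (lam * R)
    return prefactor * cmath.exp(-2j * math.pi * r1 / lam) * zone_integral(R, r1, lam)

def direct_amplitude(R, lam, A=1.0):
    """Amplitude of the spherical wave arriving directly from the source."""
    return (A / R) * cmath.exp(-2j * math.pi * R / lam)
```

When the wave length is small compared with $r_1$ and $R - r_1$, the two amplitudes should agree closely in both magnitude and phase, the remaining difference being of the order of the ratio of the wave length to these distances.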
191. The Use of Huygens' Principle. — In the derivations of
this chapter we have traveled in a very roundabout way to reach
a very obvious result. We naturally ask, what is Huygens'
principle good for, aside from a mathematical exercise? The
answer is found in the problem of diffraction. There one has
certain opaque screens, with holes in them, and a light wave fall-
ing on them. If the light comes from a point source, geometrical
optics would tell us that the shadow of the screen would have
perfectly sharp edges. But actually this is not true; there are
light and dark fringes around the edge of the shadow. If the
shadow is observed at a greater and greater distance, these fringes
get proportionally larger and larger, until they entirely fill the
image of the hole. Finally at great distances the fringes grow in
size until the resulting pattern has no resemblance at all to the
geometrical image. There are then two general sorts of diffrac-
tion: first, that in which the pattern is like the geometrical image,
but with diffuse edges, and which is called Fresnel diffraction;
secondly, that in which the pattern is so extended that it has no
resemblance to the geometrical image, and which is called Fraun-
hofer diffraction. Both types of diffraction, as well as the inter-
mediate cases, can be treated by using Huygens' principle.
192. Huygens' Principle for Diffraction Problems. — Suppose
that light from a point source falls on a screen containing aper-
tures, and that we wish the amplitude at points behind the screen.
Then we surround the point P, where we wish the field, by a
surface consisting of the screen, and of a large surface, perhaps
hemispherical, extending out beyond P, and enclosing a volume
completely. We apply Huygens' principle to the surface. In
doing so, we assume (1) that the amplitude of the incident wave,
at points on the apertures, is the same that it would be if the
screen were absent; and (2) that immediately behind the screen,
and at points of the hemispherical surface as well, the amplitude
is zero, the wave being entirely cut off by the screen. This is,
of course, an approximation, since at the edge of a slit, for exam-
ple, the amplitude of the wave does not suddenly jump from zero
to a finite value. The exact treatment is exceedingly difficult,
but in the one case for which it has been worked out, it substanti-
ates our approximations.
To find the disturbance at P, then, we integrate over the sur-
face, but set the integrand equal to zero, except at the openings
of the screen, obtaining
$$\phi = \iint \frac{iA}{2\lambda r r_1}\, e^{2\pi i \nu [t - (r + r_1)/c]}\left[\cos(n, r) - \cos(n, r_1)\right] dS,$$
the integral being over the openings. We note that only the
edges of the openings are significant, the shape of the screen
away from the opening being unimportant. Now let us assume,
as is almost always true in practice, that the distances $r_1$ and r,
from source to screen and from the screen to P, are large compared
with the dimensions of the holes. Then $1/rr_1$ and $[\cos(n, r) - \cos(n, r_1)]$
are so nearly constant over the aperture that we may
take them outside the integral, replacing r and $r_1$ by mean values
$\bar{r}$ and $\bar{r}_1$. If in addition we write $r + r_1$ in the exponential as
$\bar{r} + \bar{r}_1 + r' + r_1'$, where $r'$ and $r_1'$ are the small differences
between r and $r_1$ and their values at some mean point of the aperture,
we have finally
$$\phi = \frac{iA}{2\lambda \bar{r} \bar{r}_1}\left[\cos(n, \bar{r}) - \cos(n, \bar{r}_1)\right] e^{2\pi i \nu [t - (\bar{r} + \bar{r}_1)/c]}\iint e^{-2\pi i (r' + r_1')/\lambda}\, dS. \qquad (8)$$
The whole factor outside the integral may be taken as a constant
factor so that, if we are interested only in relative intensities,
we may leave it out of account. We finally have a sinusoidal
vibration of which the amplitudes of the components of the two
phases are proportional to $C' = \iint \cos \frac{2\pi}{\lambda}(r' + r_1')\, dS$, and
$S' = \iint \sin \frac{2\pi}{\lambda}(r' + r_1')\, dS$. Hence the intensity is proportional to
$C'^2 + S'^2$, and our task is to compute this value.
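To show how $C'$ and $S'$ are used, the sketch below evaluates them for one simple case. It assumes a single slit of width `width`, a plane wave at normal incidence, and a point P so distant that the path difference $r' + r_1'$ for an element at position x across the slit is $-x \sin\theta$; this far-field form is an assumption made for illustration, as are the function name and the reduction of the surface integral to one dimension.

```python
import math

def slit_intensity(sin_theta, width, lam, samples=2000):
    """C'^2 + S'^2 per unit length of slit, by a midpoint sum across the slit."""
    dx = width / samples
    c_sum = 0.0
    s_sum = 0.0
    for k in range(samples):
        x = -width / 2.0 + (k + 0.5) * dx
        # Assumed far-field path difference r' + r1' = -x sin(theta).
        phase = 2.0 * math.pi / lam * (-x * sin_theta)
        c_sum += math.cos(phase) * dx
        s_sum += math.sin(phase) * dx
    return c_sum * c_sum + s_sum * s_sum
```

In the forward direction all elements are in phase and the value is the square of the width; at $\sin\theta = \lambda/\text{width}$ the contributions cancel in pairs and the intensity falls to zero.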
193. Qualitative Discussion of Diffraction, Using Fresnel's
Zones. — By using Fresnel's zones, one can see qualitatively the
explanation of the diffraction fringes, particularly in Fresnel
diffraction. Suppose that we join the source S and a point P
with a straight line, as in Fig. 51, and consider the point of the
screen cut by this line, a point for which $r + r_1$ has a minimum
value. Let us surround this point by successive closed curves in
which $r + r_1$ differs from its minimum value by successive whole
numbers of half wave lengths. It is not hard to see that these
curves will be the intersections with the screen of a set of ellipsoids
of revolution, whose foci are S and P. Hence if the line SP is
approximately normal to the screen, the curves will be approxi-
mately circles. Successive zones included between successive
Fig. 51. — Fresnel's zones on a plane. (Figure label: ellipsoid, $r_1 + r$ = constant.)
curves will propagate light differing by a half wave length from
their neighbors. Now on the screen we may imagine the pattern
of zones, and also the apertures. The whole nature of the diffrac-
tion depends on what zones are uncovered, and can transmit light,
and what ones are obscured by the screen. We may distinguish
three cases, shown in Fig. 52:
1. The center of the system of zones lies well inside the aper-
ture. The central zone is entirely uncovered, as are a number of
the others. As we get to larger zones, we shall come to one of
which a small part is covered; then one which is more covered;
and so on, until finally we come to one only slightly uncovered;
and then the rest are entirely obscured. Now we can write our
integral, as in paragraph 190, as a sum of integrals over the
successive zones. As before, these contributions will decrease
very gradually from one zone to the next. When we reach the
zones that are obscured, the decrease will become a little more
rapid, but not so much as to interfere with the argument. We
can still write the whole thing as half the sum of the first and
the last zones. In our case, the last zone which contributes has
a negligibly small area exposed, so that it contributes practically
nothing, and the whole integral is half the first zone. But this
gives just the intensity we should have in the absence of the
screen.
2. The center of the zone system is well behind the screen (P
is in the geometrical shadow). Then the first few zones are
Fig. 52. — Fresnel's zones and rectangular aperture.
(1) Directly in path of light.
(2) In geometrical shadow.
(3) On edge of shadow.
some zones are uncovered to a considerable extent. Large zones
become obscured again, however. Thus in our sum, while there
are terms different from zero, both the first and the last terms
are zero, so that the sum is zero. The intensity well inside the
geometrical shadow is zero.
3. The center of the zone system is near the edge of the screen.
Then the first zone may be partly obscured, so that there is some
intensity, but not so great as without the screen. Or the first
zone may be entirely uncovered, but the next ones partly
obscured. In these cases, the contributions from the successive
zones may differ so much that our rule of taking the first and last
terms is no longer correct. It is possible for the whole amplitude
to be more than half the first zone, so that the intensity is actually
greater than without the screen. As we move into the geometri-
cal image from the shadow, it turns out that there is a periodic
fluctuation, on account of the uncovering of successive zones,
and this explains the diffraction fringes.
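The zone construction of this section can be sketched numerically. We assume a plane screen normal to the line SP, with the source at distance `d1` and the point P at distance `d2` from the screen, and find by bisection the radius at which $r + r_1$ exceeds its minimum by m half wave lengths; the small-angle estimate $\sqrt{m \lambda d_1 d_2/(d_1 + d_2)}$ is included for comparison. The function names and the geometry are illustrative assumptions.

```python
import math

def path_excess(rho, d1, d2):
    """(r1 + r) minus its minimum d1 + d2, for a screen point at distance rho
    from the line joining source and P."""
    return math.hypot(d1, rho) + math.hypot(d2, rho) - (d1 + d2)

def zone_radius(m, lam, d1, d2):
    """Radius at which the path excess equals m half wave lengths (bisection)."""
    target = m * lam / 2.0
    lo, hi = 0.0, 1.0
    while path_excess(hi, d1, d2) < target:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if path_excess(mid, d1, d2) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def zone_radius_paraxial(m, lam, d1, d2):
    """Small-angle estimate sqrt(m * lam * d1 * d2 / (d1 + d2))."""
    return math.sqrt(m * lam * d1 * d2 / (d1 + d2))
```

Since the radii grow as the square root of m, successive zones have nearly equal areas, which is why their contributions vary so slowly from one zone to the next.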
Problems
1. Try to carry out exactly the integration which we did approximately
by using Fresnel's zones.
2. The source is at infinity, so that a wave front is a plane. Set up Fres-
nel's zones, and find the breadth of the nth zone, and its area.
3. A plane wave falls on a screen in which there is a circular hole. Inves-
tigate the amplitude of the diffracted wave at a point on the axis, showing
that there is alternate light and darkness as either the radius of the hole
increases, or as the point moves toward or away from the screen. (Sugges-
tion: the integral consists of a finite number of zones.)
4. A plane wave falls on a circular obstacle. Show that at a point
behind the obstacle, precisely on the axis, there is illumination of the same
intensity which we should have if the obstacle were not there. Explain
why this would not hold for other shapes of the obstacle.
5. Take a few simple alternating series, as $1/2 - 1/3 + 1/4 - 1/5 \cdots$,
$1/2 - 1/4 + 1/8 \cdots$, $1/2^2 - 1/3^2 + 1/4^2 \cdots$, etc., and find
whether our theorem about the sum of a number of terms is verified for
them. In doing this, it may be necessary to start fairly well out in the
series, so as to satisfy our condition that successive terms differ only slightly
in magnitude.
6. Prove the statement that the boundaries of Fresnel's zones are the
intersection of the screen with ellipsoids of revolution whose foci are the
source and the point P. What happens to these ellipsoids as the source is
removed to infinity?
CHAPTER XXVII
FRESNEL AND FRAUNHOFER DIFFRACTION
In the present chapter we proceed to the mathematical dis-
cussion of Fresnel and Fraunhofer diffraction, based on the
methods of Huygens' principle derived in Chap. XXVI. The
problems which we take up are Fresnel and Fraunhofer diffrac-
tion through a slit; Fraunhofer diffraction through a circular
aperture; and the diffraction grating, an example of Fraunhofer
diffraction. In Eq. (8) of the last chapter, we have seen that the
essential step in computing the diffraction pattern is the evaluation
of the integral
$$\iint e^{-2\pi i (r + r_1)/\lambda}\, dS,$$
where the integration is over the aperture of the screen, dS is an
element of surface in the aperture, r is the distance from the
source to the element dS, and $r_1$ the distance from the element
to the point P where the field is being found. If the incident
wave is a plane wave, and the plane of the aperture is a wave
front, then r is the same for all elements, and the factor $e^{-2\pi i r/\lambda}$
can be cancelled out of the integral. The remaining integral,
$\iint e^{-2\pi i r_1/\lambda}\, dS$, represents the sum at P of the amplitudes of
spherical waves of equal intensity and phase starting from all
points of the aperture. It is the interference of these waves which
produces the diffraction pattern.
194. Comparison of Fresnel and Fraunhofer Diffraction. —
The two types of diffraction, Fresnel and Fraunhofer, arise
from observing the pattern near to, or far from, the screen.
Let the normal to the screen be the z axis, as in Fig. 53, and let
the screen containing the aperture be at z = 0. The light
passing through the aperture is caught on a second screen at
z = R. Physically, the diffraction pattern has the following
nature: close to the aperture, the light passes along the z axis
as a column or cylinder of illumination, of cross section identical
with the aperture, so that, if the screen at R is close to the
aperture, the illuminated region will have the same shape as
the aperture, and we speak of rectilinear propagation of the light.
As R increases, however, the column of light begins to acquire
fluctuations of intensity near its boundaries, so that the pattern
on the screen has fringes around the edges. This phenomenon
is the Fresnel diffraction. The size of the Fresnel fringes
increases proportionally to the square root of the distance R.
Thus Fig. 54 shows, in its upper diagram, the slit, parallel
column of light, and parabolic lines starting from the edges
of the slit, indicating the position of the outer bright fringe
of the Fresnel pattern, if we are sufficiently near to the slit.

Fig. 53. — Aperture and screen for diffraction through rectangular slit.

As R becomes larger, the fringes become so large that there are
only one or two in the pattern of the aperture, and the pattern
shows but small resemblance to the shape of the aperture, though
it still is of roughly the same dimensions. With further increase
of R, we finally enter the region of Fraunhofer diffraction. Here
the beam of light, instead of consisting of a luminous cylinder,
resembles more a luminous cone (indicated by the diverging
dotted lines in the top diagram of Fig. 54). Thus the Fraunhofer
pattern becomes larger and larger as R increases, being in fact
proportional to R, so that we can describe it by giving the angles
rather than distances between different fringes. Often Fraun-
hofer diffraction is observed, not by placing the screen at a great
distance, but by passing the light through a telescope focused
on infinity. Such a telescope brings the light in a given direction
to a focus at a given point of the field. Thus it separates the
different Fraunhofer fringes, since each of these goes out from
the source in a particular direction. In Fig. 54, diffraction
patterns are shown indicating the transition from Fresnel to
Fraunhofer diffraction. The pattern a illustrates the Fresnel
pattern for one edge of an infinitely wide slit. The patterns
b to g represent the actual diffraction patterns from the slit,
at distances indicated in the upper diagram. These patterns
are all drawn to the same scale. They are drawn for a slit
five wave lengths wide, for the sake of getting the figure on a
diagram of reasonable scale. If the wave length were shorter,
then for the same slit the distances would be stretched out to
the right, and the Fraunhofer pattern would correspond to
smaller angular deflections. This would be necessary to bring
the Fresnel cases far enough from the slit so that our approximations
would be really applicable. Finally, in h, we give the
limiting Fraunhofer pattern, not drawn to scale.

Fig. 54. — Transition from Fresnel to Fraunhofer diffraction for a slit.
(a) Fresnel pattern for edge of infinitely wide slit.
(b)-(g) Actual diffraction patterns from slit, at distances indicated in upper
diagram.
(h) Fraunhofer pattern.

Let coordinates in the plane of the aperture be x, y, and in
the plane of the screen at R let the coordinates be $x_0$, $y_0$, as in
Fig. 53. Then, if the element of area is at (x, y, 0), and the point
P at ($x_0$, $y_0$, R), the distance $r_1$ between them is
$$r_1 = \sqrt{(x_0 - x)^2 + (y_0 - y)^2 + R^2}.$$
The integration cannot be performed with this expression for $r_1$,
and Fresnel and Fraunhofer diffraction lead to two different
approximate methods of rewriting $r_1$, leading to different methods
of evaluating the integral.

Fig. 55. — $r_1$ as function of $x_0 - x$: $r_1 = \sqrt{(x_0 - x)^2 + R^2}$. $r_1$ is the distance
from a point of the aperture to a point on the screen; $x_0 - x$ is the difference
between the x coordinates of the points.

We can see the relation of these two
methods most clearly from Fig. 55, in which $r_1$ is plotted as a
function of $x_0 - x$, for the special case where $y_0 - y = 0$.
The resulting curve is a hyperbola. Now in all ordinary cases,
R is large compared with the dimensions of the aperture. That
is, the range of abscissas representing the dimensions of the
aperture (from $x_0 - x_1$ to $x_0 - x_2$, if $x_1$ and $x_2$ are the extreme
coordinates of the aperture), is small compared with the distance
R, the intercept of the hyperbola on the axis of ordinates.
The two cases are now represented by the ranges ab and cd of
abscissas, respectively. In the first, $x_0 - x_1$ and $x_0 - x_2$ are
separately small, as well as their difference, and this means that
the point P is almost straight behind the aperture, in the region
where the Fresnel diffraction pattern occurs. In the second,
$x_0$ is large, of the same order of magnitude as R, showing that
we are examining the pattern at a considerable angle to the
normal, as we do in the Fraunhofer case. The two approximate
methods can now be simply described from the curve: for
Fresnel diffraction, we approximate the hyperbola near its
minimum by a parabola; for Fraunhofer diffraction, we approximate
it farther out by a straight line. In the first case, assuming
R to be large compared with $(x_0 - x)$, we have by the binomial
expansion
$$r_1 = R + \frac{1}{2}\frac{(x_0 - x)^2}{R} + \cdots,$$
or including the terms in y,
$$r_1 = R + \frac{1}{2}\frac{(x_0 - x)^2 + (y_0 - y)^2}{R} + \cdots. \qquad (1)$$
In this case, in the notation of Eq. (8) of the previous chapter,
we take $\bar{r} = R$, so that $r'$ is the remaining term of Eq. (1).
For Fraunhofer diffraction, on the other hand, we have $x_0 \gg x$.
Then we write $r_1^2 = (x_0^2 + y_0^2 + R^2) - 2(x x_0 + y y_0) + x^2 + y^2$,
and we can neglect the terms $x^2 + y^2$. If we let $R_0^2 = x_0^2 + y_0^2 + R^2$,
where $R_0$ measures the distance from the center of the
aperture to the point P, we can use a binomial expansion,
obtaining
$$r_1 = R_0 - \frac{x x_0 + y y_0}{R_0} + \cdots. \qquad (2)$$
In this case we take $\bar{r} = R_0$, so that $r'$ is the remaining term of
Eq. (2). Letting $x_0/R_0 = l$, $y_0/R_0 = m$, the direction cosines
of the direction from the center of the aperture to P, we have
$r' = -(lx + my) + \cdots$, involving the position on the screen
only through the angles, so that we see at once that the pattern
will travel outward radially from the aperture.
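
The two approximations of $r_1$ can be compared numerically with the exact hyperbola. The following is a minimal sketch for the special case $y_0 = y = 0$; the function names and the numerical values (a slit 0.1 mm wide, a screen 1 m away) are illustrative assumptions, not taken from the text.

```python
import math

def r1_exact(x0, x, R):
    """Exact distance from aperture point (x, 0, 0) to screen point (x0, 0, R)."""
    return math.sqrt((x0 - x) ** 2 + R ** 2)

def r1_fresnel(x0, x, R):
    """Parabolic approximation, Eq. (1), with y0 - y = 0."""
    return R + 0.5 * (x0 - x) ** 2 / R

def r1_fraunhofer(x0, x, R):
    """Straight-line approximation, Eq. (2), with y0 = y = 0."""
    R0 = math.sqrt(x0 ** 2 + R ** 2)
    return R0 - x * x0 / R0

# Illustrative values (assumptions), all lengths in millimetres.
R = 1000.0       # aperture-to-screen distance
x_edge = 0.05    # edge of a slit 0.1 mm wide
x0_near = 0.2    # screen point almost straight behind the slit
x0_far = 300.0   # screen point at a considerable angle to the normal

err_fresnel_near = abs(r1_fresnel(x0_near, x_edge, R) - r1_exact(x0_near, x_edge, R))
err_fresnel_far = abs(r1_fresnel(x0_far, x_edge, R) - r1_exact(x0_far, x_edge, R))
err_fraunhofer_far = abs(r1_fraunhofer(x0_far, x_edge, R) - r1_exact(x0_far, x_edge, R))

print(err_fresnel_near, err_fresnel_far, err_fraunhofer_far)
```

For these assumed numbers the parabola is essentially exact near the axis but is wrong by about a millimetre at the distant point, whereas the straight line there is in error by only about $10^{-6}$ mm, a small fraction of a visible wave length.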
195. Fresnel Diffraction from a Slit. — Let the aperture be a
slit, extending from $x = -a/2$ to $x = a/2$, and from $y = -b/2$
to $y = b/2$. We assume a to be small, b comparatively
large, as in Fig. 53, so that it is a long narrow slit. Using the
results of Eq. (1), our integral is
$$\iint e^{-2\pi i r'/\lambda}\, dS = \iint e^{-\pi i [(x - x_0)^2 + (y - y_0)^2]/(R\lambda)}\, dS.$$
This can be immediately factored into
$$\int_{-b/2}^{b/2} e^{-\pi i (y - y_0)^2/(R\lambda)}\, dy \int_{-a/2}^{a/2} e^{-\pi i (x - x_0)^2/(R\lambda)}\, dx.$$
Since these two integrals are of the same form, we can treat just
one of them. This will prove to give fringes parallel to one
set of axes. The whole pattern is then simply the combination
of the two sets of fringes. The single integral, for instance the
one in x, has a real part, and an imaginary part (with sign
changed), equal to
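$$\int_{-a/2}^{a/2} \cos\frac{\pi (x - x_0)^2}{R\lambda}\, dx \quad \text{and} \quad \int_{-a/2}^{a/2} \sin\frac{\pi (x - x_0)^2}{R\lambda}\, dx. \qquad (3)$$
It is customary in these integrals to make a change of variables:
$\frac{\pi (x - x_0)^2}{R\lambda} = \frac{\pi}{2} u^2$. Then the integrals become $\sqrt{R\lambda/2}$ times C
and S, respectively, where
$$C = \int_{u_1}^{u_2} \cos\frac{\pi}{2} u^2\, du, \qquad S = \int_{u_1}^{u_2} \sin\frac{\pi}{2} u^2\, du,$$
and where
$$u_1 = \frac{x_0 - a/2}{\sqrt{R\lambda/2}}, \qquad u_2 = \frac{x_0 + a/2}{\sqrt{R\lambda/2}}.$$
These integrals are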
called Fresnel's integrals. They cannot be explicitly evaluated,
but their values have been computed by series methods.
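
As an illustration of the series methods mentioned, the integrals taken from 0 to u can be summed term by term after expanding the cosine and sine in powers of $u^2$. This is a minimal sketch; the function name `fresnel_series` and the number of terms are assumptions, and the direct power series is only practical for moderate values of u (say up to about 3).

```python
import math

def fresnel_series(u, terms=40):
    """Power-series values of C(u) and S(u), the Fresnel integrals from 0 to u.

    C(u) = sum (-1)^n (pi/2)^(2n)   u^(4n+1) / ((2n)!   (4n+1))
    S(u) = sum (-1)^n (pi/2)^(2n+1) u^(4n+3) / ((2n+1)! (4n+3))
    """
    half_pi = math.pi / 2
    c = 0.0
    s = 0.0
    for n in range(terms):
        sign = -1.0 if n % 2 else 1.0
        c += sign * half_pi ** (2 * n) * u ** (4 * n + 1) / (
            math.factorial(2 * n) * (4 * n + 1))
        s += sign * half_pi ** (2 * n + 1) * u ** (4 * n + 3) / (
            math.factorial(2 * n + 1) * (4 * n + 3))
    return c, s

c_one, s_one = fresnel_series(1.0)
print(c_one, s_one)
```

The quantities C and S of the text, taken between the limits $u_1$ and $u_2$, are then differences of two such values.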
196. Cornu's Spiral. — Let us plot the indefinite integral
$\int_0^u \cos\frac{\pi}{2} u^2\, du$ as abscissa, $\int_0^u \sin\frac{\pi}{2} u^2\, du$ as ordinate, of a graph,
as in Fig. 56. Then it is not hard to see that the resulting curve
is a spiral, which is known as Cornu's spiral. To see this, we
can first compute the slope. This is the differential of the ordinate,
over the differential of the abscissa, or
$$\frac{\sin\frac{\pi}{2} u^2}{\cos\frac{\pi}{2} u^2} = \tan\frac{\pi}{2} u^2.$$
Thus, when $u^2$ increases by 4, the tangent of the curve swings
around a complete cycle, and comes back to its initial value.
Each point of the spiral corresponds to a particular value of u.
We can show at once that the difference of u between two points
is simply the length of the curve between the points. We show
this for an infinitesimal element of the curve. The square
of the element of length, $ds^2$, is equal to the sum of the squares
of the differentials of abscissa and ordinate, or is
$\cos^2\left(\frac{\pi}{2} u^2\right) du^2 + \sin^2\left(\frac{\pi}{2} u^2\right) du^2$.
Hence $ds = du$, and we can integrate to get
$s = u_2 - u_1$. From this fact we can make sure of the spiral
nature of the curve. For one turn of the curve corresponds to
an increase of $u^2$ by 4. That is, if $u'$, $u''$ are the values at the two
ends, $u''^2 = u'^2 + 4$. This is $u''^2 - u'^2 = 4$, $(u'' - u')(u'' + u') = 4$,
$u'' - u' = 4/(u'' + u')$. The difference $u'' - u'$ is, however,
simply the length of the turn, so that we see that, as
we go farther along, the turns become smaller and smaller, so
that they eventually become zero, which is characteristic of a
spiral. It is plain that the spiral is symmetric in the origin,
having two points, for $u = \pm\infty$, for which it winds up on itself.

Fig. 56. — Cornu's spiral. The abscissa is $\int \cos\frac{\pi}{2} u^2\, du$ and the ordinate
$\int \sin\frac{\pi}{2} u^2\, du$. The points of the spiral marked by cross bars correspond
to increments of 0.1 unit in u.

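
The two properties just derived, that arc length along the curve equals the change in u and that the curve winds up on a limiting point, can be checked by tracing the spiral numerically. This is a rough sketch using a midpoint rule; the function names, the step size and the upper limit u = 10 are arbitrary assumptions.

```python
import math

def cornu_points(u_max, du=0.001):
    """Trace Cornu's spiral from u = 0 to u_max in steps of du (midpoint rule)."""
    steps = int(round(u_max / du))
    x = y = 0.0
    points = [(x, y)]
    for k in range(steps):
        u_mid = (k + 0.5) * du
        phase = 0.5 * math.pi * u_mid ** 2
        x += math.cos(phase) * du   # differential of the abscissa
        y += math.sin(phase) * du   # differential of the ordinate
        points.append((x, y))
    return points

def polyline_length(points):
    """Total length of the broken line joining successive points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

points = cornu_points(10.0)
arc_length = polyline_length(points)
end_x, end_y = points[-1]
print(arc_length, end_x, end_y)
```

The traced length comes out equal to the range of u, and the end of the arc lies close to the point (1/2, 1/2), about which the curve is still circling in ever smaller turns.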
Let us take our spiral, mark on it the positions $u_1$ and $u_2$
corresponding to the limits of our integral, and draw the straight
line connecting these points. The length of this line will then
be proportional to the amplitude of the disturbance, and its
square to the intensity. This is easy to see: the horizontal
component of the line is just C, and the vertical component S,
so that the square of its length is $C^2 + S^2$. Knowing this, we
can easily discuss the fluctuations of intensity, as seen in Fig. 54.
As $x_0$ changes, it is plain that $u_1$ and $u_2$ increase together, their
difference remaining fixed and equal to $a/\sqrt{R\lambda/2}$. Thus essentially
we have an arc of this length, sliding along the spiral,
and the intensity is measured by the square of the chord between
the ends of this arc. Now when $x_0$ is large and negative, the arc
is wound up on itself, so that its ends practically meet, and the
intensity is zero. This is the situation in the shadow. As $x_0$
approaches the value $-a/2$, however, $u_2$ approaches zero, so
that one end of the arc has reached the center of the figure.
There are two quite different cases, depending on whether
$u_2 - u_1$ is large or small. If it is large (a large slit and relatively
short distance R and small wave length), then $u_1$ will still not
be unwound much at this point. The chord will then be half
the value between the two end points of the spiral, and the
intensity will be one-fourth its value without the screen, and
will have increased uniformly in coming out of the shadow.
As we go farther along the x direction, however, the arc will begin
to wind up on the other half of the spiral, producing alternations
of intensity at the edge of the shadow. Then for a while $u_2$
will be nearly at one end of the spiral, $u_1$ at the other, so that
the intensity for some distance will be nearly constant, and the
same that we should have without the slit. This is the illuminated
region directly behind the slit. Finally we approach
the other boundary, and $u_1$ commences to unwind. We then
go through the same process in the opposite order. The other
quite different case comes when $u_2 - u_1$ is small, which is the
case for small slit, or large wave length or distance. Then there
is never a time when $u_1$ is on one branch of the spiral and $u_2$
on the other. All through the central part of the pattern,
therefore, there are no fluctuations of intensity. Such fluctuations
come only far to one side or the other. They come about
in this way: At some places in the pattern, the arc is long enough
to wind up for a whole number of turns, and the chord is practically
zero, while at other places it winds up for a whole number
plus a half, and the chord has a maximum. The resulting fringes
are the Fraunhofer fringes which we shall now discuss by a
different method.
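
The sliding-arc picture can be turned into numbers by computing the chord directly. The sketch below integrates between $u_1$ and $u_2$ by Simpson's rule and divides the squared chord by 2, the squared distance between the two end points of the spiral, so that the unobstructed wave has intensity 1. The function names and the sample values (a = 1 mm, R = 10 mm, wave length 0.0005 mm, giving $u_2 - u_1 = 20$) are assumptions chosen to represent the case of a wide slit.

```python
import math

def fresnel_difference(u1, u2, steps_per_unit=1000):
    """Simpson's-rule values of C and S taken between the limits u1 and u2."""
    n = 2 * max(1, math.ceil(abs(u2 - u1) * steps_per_unit / 2))  # even count
    h = (u2 - u1) / n
    c = s = 0.0
    for k in range(n + 1):
        u = u1 + k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        phase = 0.5 * math.pi * u * u
        c += weight * math.cos(phase)
        s += weight * math.sin(phase)
    return c * h / 3, s * h / 3

def relative_intensity(x0, a, R, wavelength):
    """Squared chord on Cornu's spiral, relative to the unobstructed wave."""
    scale = math.sqrt(R * wavelength / 2)
    u1 = (x0 - a / 2) / scale
    u2 = (x0 + a / 2) / scale
    c, s = fresnel_difference(u1, u2)
    return 0.5 * (c * c + s * s)

# Assumed sample values, lengths in millimetres.
a, R, wavelength = 1.0, 10.0, 5.0e-4

in_shadow = relative_intensity(-2.0, a, R, wavelength)
at_edge = relative_intensity(-a / 2, a, R, wavelength)
at_center = relative_intensity(0.0, a, R, wavelength)
print(in_shadow, at_edge, at_center)
```

With these values the intensity is practically zero in the shadow, close to one-fourth at the geometrical edge, and near unity behind the middle of the slit.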
197. Fraunhofer Diffraction from Rectangular Slit. — Using
the approximation (2), our integral for Fraunhofer diffraction
is $e^{-2\pi i R_0/\lambda} \iint e^{2\pi i (lx + my)/\lambda}\, dS$. The first term, as in Fresnel
diffraction, contributes nothing to the relative intensities, and
may be neglected. We then have $\iint e^{2\pi i (lx + my)/\lambda}\, dS$, as the
integral whose absolute value measures the amplitude of the
disturbance.
Let us suppose that the aperture is the same sort of rectangle
considered above, extending from — a/2 to a/2 along x, from
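$-b/2$ to $b/2$ along y. Then the integral is
$$\int_{-a/2}^{a/2} e^{2\pi i l x/\lambda}\, dx \int_{-b/2}^{b/2} e^{2\pi i m y/\lambda}\, dy
= \left(\frac{e^{\pi i l a/\lambda} - e^{-\pi i l a/\lambda}}{2\pi i l/\lambda}\right)
\left(\frac{e^{\pi i m b/\lambda} - e^{-\pi i m b/\lambda}}{2\pi i m/\lambda}\right)
= \frac{\sin(\pi l a/\lambda)}{\pi l/\lambda}\, \frac{\sin(\pi m b/\lambda)}{\pi m/\lambda}. \qquad (4)$$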
The intensity is the square of this quantity. Let us consider
its dependence on the position of the point P on the screen.
The coordinates of this point enter only in the expressions
l, m, showing that the pattern increases in size proportionally
to the distance, as if it consisted of rays traveling out in straight
lines from the small aperture, rather than having an approxi-
mately constant size as with the Fresnel diffraction (see Fig.
54). When we consider the detailed behavior of the intensity
as a function of the angle, we find that this can be written as
$$a^2\, \frac{\sin^2(\pi l a/\lambda)}{(\pi l a/\lambda)^2}$$
times a similar function of m, giving a curve of
the form $\sin^2\alpha/\alpha^2$, where $\alpha = \pi l a/\lambda$. This function becomes unity
when $\alpha = 0$, goes to zero for $\alpha = \pi, 2\pi, 3\pi, \ldots$, with maxima of
intensity approximately midway between. The maxima decrease
rapidly in intensity. Thus at the points $3\pi/2$, $5\pi/2$, ..., which
are approximately at the second and third maxima, the intensities
are only $(2/3\pi)^2$, $(2/5\pi)^2$, ..., or 0.045, 0.016, ..., compared
with the central maximum of 1. Let us see how the size of
the fringes depends on the dimensions of the slit. The minima
come for $\alpha = n\pi$, or $la/\lambda = n$, $l = n\lambda/a$. Thus we see that
the greater the wave length, or the smaller the dimensions of the
slit, the larger the pattern becomes.
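
The quoted heights of the secondary maxima are easily checked, together with how close the points $3\pi/2$, $5\pi/2$ lie to the true maxima. The true maxima satisfy $\alpha = \tan\alpha$, which the sketch below solves by bisection on the equivalent equation $\sin\alpha - \alpha\cos\alpha = 0$; the function names are assumptions made for illustration.

```python
import math

def slit_intensity(alpha):
    """Single-slit Fraunhofer intensity sin^2(alpha)/alpha^2, central maximum 1."""
    if alpha == 0.0:
        return 1.0
    return (math.sin(alpha) / alpha) ** 2

def bisect_root(f, lo, hi, tol=1e-12):
    """Root of f between lo and hi, where f changes sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def maximum_condition(alpha):
    """Vanishes where alpha = tan(alpha), without the poles of the tangent."""
    return math.sin(alpha) - alpha * math.cos(alpha)

approx_heights = [slit_intensity(1.5 * math.pi), slit_intensity(2.5 * math.pi)]
true_positions = [bisect_root(maximum_condition, math.pi, 1.5 * math.pi),
                  bisect_root(maximum_condition, 2 * math.pi, 2.5 * math.pi)]
true_heights = [slit_intensity(alpha) for alpha in true_positions]
print(approx_heights, true_positions, true_heights)
```

The true maxima lie slightly inside the midway points and are slightly higher than the values at $3\pi/2$ and $5\pi/2$, but the difference is only a few per cent.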
The positions of the minima can be immediately found by a
very elementary argument. Assume for convenience that we
are investigating the pattern at a point in the xz plane, so that
$m = 0$. Then draw a plane normal to the direction l, passing
through one edge of the aperture, as in Fig. 57. This represents
a wave front of the diffracted wave, just as it passes one edge of
the aperture. From the geometry of the system, this wave front
is a distance $la$ from the other edge, or $la/2$ from the middle of the
aperture. Now, if the distance of the middle is just a whole
number of half wave lengths different from the distance from the
edge, the contributions of these two points to the amplitude will
just cancel, being just out of phase. The other points of one
half of the aperture can all be paired against corresponding points
of the other half whose contributions are just out of phase, finally
resulting in zero intensity. This situation comes about when
$la/2 = n\lambda/2$, where n is an integer, or $l = n\lambda/a$, the same condition
found above. Since most of the intensity falls within the
first minimum, and since l is the sine of the angle between the
ray and the normal to the surface, we may say that by Fraunhofer
diffraction the ray is spread out through an angle $\lambda/a$.

Fig. 57. — Elementary construction for Fraunhofer diffraction.

198. The Circular Aperture. — The problem of Fraunhofer
diffraction through a circular aperture is slightly more complicated
mathematically. Here we must evaluate $\iint e^{2\pi i (lx + my)/\lambda}\, dS$
over a circle. Let us introduce polar coordinates in the plane of
the aperture, so that $x = \rho \cos\theta$, $y = \rho \sin\theta$. Further, on
account of symmetry, we may take the point P to be in the xz
plane, so that $m = 0$. Then if $\rho_0$ is the radius of the aperture,
the final result is $\int_0^{2\pi} d\theta \int_0^{\rho_0} e^{2\pi i \rho \cos\theta\, l/\lambda}\, \rho\, d\rho$. We can integrate with
respect to $\rho$ by parts, obtaining for the integral
$$\int_0^{2\pi} d\theta \left[ \frac{\rho_0\, e^{2\pi i \rho_0 \cos\theta\, l/\lambda}}{2\pi i \cos\theta\, l/\lambda}
- \frac{e^{2\pi i \rho_0 \cos\theta\, l/\lambda} - 1}{(2\pi i \cos\theta\, l/\lambda)^2} \right].$$
For the integration with respect to $\theta$, it is necessary to expand
the exponentials in series. If we do this, the integrals are in
each case integrals of a power of $\cos\theta$, from 0 to $2\pi$. These are
easily evaluated, and the result, combining terms, proves to be
$$\pi\rho_0^2 \left[ 1 - \frac{1}{2}\left(\frac{k}{1!}\right)^2 + \frac{1}{3}\left(\frac{k^2}{2!}\right)^2
- \frac{1}{4}\left(\frac{k^3}{3!}\right)^2 + \frac{1}{5}\left(\frac{k^4}{4!}\right)^2 - \cdots \right],$$
where k is an abbreviation for $\pi\rho_0 l/\lambda$. If we recall the formulas for
Bessel's functions, we can see without difficulty that this is equal to
$\frac{\rho_0 \lambda}{l} J_1\left(\frac{2\pi\rho_0 l}{\lambda}\right)$. It is not hard, using some of the properties of
Bessel's functions, to prove this formula directly, without the
use of series. From the series, we see that the intensity has a
maximum for $l = 0$, the center of the pattern. As l increases,
we can see the behavior most easily from the expression in terms
of Bessel's functions. Since $J_1$ has an infinite number of zeros,
there are an infinite number of light and dark fringes. The
first dark band comes at the first zero of $J_1$, which from tables
is at $2\pi\rho_0 l/\lambda = 1.2197\pi$, $\rho_0 l/\lambda = 0.61$. The next is at $\rho_0 l/\lambda = 1.12$,
and so on, with maxima between. We see that, except for
a numerical factor, the pattern from a circular aperture has about
the same dimensions as that from a square aperture. Thus if
the side of the square were equal to the diameter of the circle,
$2\rho_0$, the first dark fringe would be at $2\rho_0 l/\lambda = 1$, $\rho_0 l/\lambda = 0.5$, and
the next one at 1.0.
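
The positions of the dark rings can be recovered without tables from the power series for $J_1$. The sketch below is an illustration only: the function names are assumptions, the series is summed directly (adequate for the small arguments needed here), and the zeros are located by bisection inside brackets chosen by inspection.

```python
import math

def bessel_j1(z, terms=40):
    """J_1(z) from its power series, sum (-1)^n (z/2)^(2n+1) / (n! (n+1)!)."""
    total = 0.0
    for n in range(terms):
        sign = -1.0 if n % 2 else 1.0
        total += sign * (z / 2) ** (2 * n + 1) / (
            math.factorial(n) * math.factorial(n + 1))
    return total

def bisect_root(f, lo, hi, tol=1e-12):
    """Root of f between lo and hi, where f changes sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

first_zero = bisect_root(bessel_j1, 3.0, 4.5)
second_zero = bisect_root(bessel_j1, 6.5, 7.5)

# Dark rings in terms of rho_0 * l / lambda, since the argument is 2 pi rho_0 l / lambda.
first_dark = first_zero / (2 * math.pi)
second_dark = second_zero / (2 * math.pi)
print(first_zero / math.pi, first_dark, second_dark)
```

The first zero comes out at about $1.2197\pi$, giving $\rho_0 l/\lambda$ close to 0.61 for the first dark ring and about 1.117 for the second.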
199. Resolving Power of a Lens. — Whenever light passes
through a lens, it is not only refracted, but it has passed through a
circular aperture, the size of the lens itself or of the diaphragm
which stops it down, and as a result it is diffracted. Suppose,
for example, that the lens is the objective of a telescope, and that
parallel light falls on it, as from an infinitely small or distant
star. Then after passing through the diaphragm, the light will
no longer be a plane wave, but will have intensity in different
directions, as shown in the last section. The central maximum
will have an angular radius of $0.61\lambda/\rho_0$, where $\rho_0$ is now the
radius of the telescope objective. The resulting waves are just
as if the light came from an object of this angular size, but passed
through no diaphragm. When the telescope focuses the radia-
tion, the result will be not a single point of light, but a circular
spot surrounded by fringes, as of a star of finite diameter. For
this reason, the telescope is not a perfect instrument, and one
would say that its resolving power was only enough to resolve
the angle $0.61\lambda/\rho_0$. This is usually taken to mean the following:
if two stars had an actual angular separation of this amount, the
center of the image of one star would lie on the first dark fringe
of the other, and the patterns would run into each other so that
they could be just resolved. We see that the larger the aperture
of the telescope, or the smaller the wave length, the better is the
resolution. The same general situation holds for microscope
lenses.
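
To give an idea of the magnitudes involved, the limiting angle can be evaluated for a particular instrument. The numbers below (an objective of 5 cm radius and light of wave length 6,000 A.) are assumed for illustration only, as is the function name.

```python
import math

def resolving_angle(wavelength, radius):
    """Limiting angle 0.61 * lambda / rho_0, in radians."""
    return 0.61 * wavelength / radius

# Assumed example: wavelength and objective radius in metres.
angle_rad = resolving_angle(6.0e-7, 0.05)
angle_arcsec = math.degrees(angle_rad) * 3600
print(angle_rad, angle_arcsec)
```

Under these assumptions two stars closer together than about one and a half seconds of arc would not be separated, and doubling the radius of the objective halves this angle.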
200. Diffraction from Several Slits; the Diffraction Grating. —
Suppose we have a number N of equal, parallel slits, equally
spaced. Let each have the width a along the x axis, and let the
spacing on centers be d, so that the centers come at $x = 0, d, \ldots, (N - 1)d$.
Now let us find the Fraunhofer pattern. The part
of the integral depending on y will be just as with the single slit,
and we leave it out of account. We are left with
$$\int_{-a/2}^{a/2} e^{2\pi i l x/\lambda}\, dx + \int_{d-a/2}^{d+a/2} e^{2\pi i l x/\lambda}\, dx + \cdots
+ \int_{(N-1)d-a/2}^{(N-1)d+a/2} e^{2\pi i l x/\lambda}\, dx.$$
But this is, as we can immediately see, simply
$$\int_{-a/2}^{a/2} e^{2\pi i l x/\lambda}\, dx\, \left(1 + e^{2\pi i l d/\lambda} + e^{2\pi i l\, 2d/\lambda} + \cdots + e^{2\pi i l (N-1) d/\lambda}\right).$$
By the formula for the sum of a geometric series, this is
$$\int_{-a/2}^{a/2} e^{2\pi i l x/\lambda}\, dx\; \frac{1 - e^{2\pi i l N d/\lambda}}{1 - e^{2\pi i l d/\lambda}}.$$
Let the first term be A, the amplitude due
to a single slit, which we have already evaluated. Now to find
the intensity we multiply this by its conjugate, which gives
$$A^2\, \frac{1 - \cos(2\pi l N d/\lambda)}{1 - \cos(2\pi l d/\lambda)} = A^2\, \frac{\sin^2(\pi l N d/\lambda)}{\sin^2(\pi l d/\lambda)}. \qquad (5)$$
That is, with N slits the actual intensity is that with one slit,
but multiplied by a certain factor. This factor goes through
zero when $lNd/\lambda$ is an integer, so that l equals an integer multiplied
by $\lambda/Nd$. This gives fringes with a narrow spacing, characteristic
of the whole distance Nd occupied by the set of apertures,
crossing the other pattern, and they are what are usually called
interference fringes, since they are due, not to diffraction from
a single aperture, but to interference between different apertures.
But in addition to this, the denominator results in having these
fringes of different heights. The minimum height occurs when
the denominator equals unity, when the fringes are of height $A^2$,
and the most intense fringes come when the denominator is zero.
Here the ratio of numerator to denominator is evidently finite,
and gives fringes of height $N^2A^2$. Thus the greater N is, the
greater the disparity in height between the largest and smallest
maximum. Evidently every Nth maximum will be high, and
the high ones will be spaced according to the law $ld/\lambda = k$, an
integer.
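
The interference factor of Eq. (5) can be checked against a direct sum of the N phase factors of the geometric series. This is a small sketch; the function names and the abbreviation $\beta = \pi l d/\lambda$ are assumptions made for the example.

```python
import cmath
import math

def factor_from_sum(N, beta):
    """|1 + e^{2 i beta} + ... + e^{2 i (N-1) beta}|^2, with beta = pi l d / lambda."""
    total = sum(cmath.exp(2j * beta * n) for n in range(N))
    return abs(total) ** 2

def factor_closed_form(N, beta):
    """sin^2(N beta) / sin^2(beta), with the limit N^2 where sin(beta) vanishes."""
    s = math.sin(beta)
    if abs(s) < 1e-12:
        return float(N * N)
    return (math.sin(N * beta) / s) ** 2

N = 6
general = (factor_from_sum(N, 0.3), factor_closed_form(N, 0.3))
principal = factor_from_sum(N, math.pi)          # l d / lambda = 1
first_zero = factor_from_sum(N, math.pi / N)     # l N d / lambda = 1
print(general, principal, first_zero)
```

The two expressions agree, the principal maximum has the height $N^2$ (here 36) relative to a single slit, and the factor vanishes at the first of the intervening zeros.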
Now suppose N becomes very great, as in a diffraction grating.
Then the small maxima will become so weak compared with the
strong ones that only the latter need be considered. The latter
will seem to consist of a set of sharp lines, with darkness between.
These sharp lines come, as we have seen, at angles to the normal
given by $k\lambda = d \sin\theta$, where k is an integer, and $\sin\theta = l$. This
is the ordinary diffraction grating formula, where k is 0 for the
central image, 1 for the first-order spectrum, 2 for the second
order, etc. But we cannot entirely neglect the fact that there are
other small maxima near the important ones. Thus for $ld/\lambda = k$,
the intensity is $N^2A^2$. This comes for $lNd/\lambda = Nk$. But
for $lNd/\lambda = Nk + 3/2$, we again have a secondary maximum, whose
height is now
$$\frac{A^2}{\sin^2(\pi l d/\lambda)} = \frac{A^2}{\sin^2\left[\frac{\pi}{N}\left(Nk + \frac{3}{2}\right)\right]}
= \frac{A^2}{\sin^2\left[\pi\left(k + \frac{3}{2N}\right)\right]}.$$
Now $\sin^2[\pi(k + 3/2N)] = (3\pi/2N)^2$ approximately, if N is large,
so that the height of the maximum is $4N^2A^2/9\pi^2$, or about 0.045
of the height of the highest maximum. Thus the first few secondary
maxima cannot be neglected. To get an idea of the width of
the region through which the intensity is considerable, we may
take the width of the first maximum. From the center to the
first dark fringe, this is given by the fact that at the center
$lNd/\lambda = Nk$, at the dark fringe $= Nk + 1$, so that $\Delta l = \lambda/Nd$.
This is closely connected with the resolving power of a grating.
For a single frequency gives not a sharp set of lines, one for each
order, but a set broadened by the amount we have found. Thus
two neighboring frequencies, differing by $\Delta\lambda$, could not be resolved
if the first minimum of one lay opposite the maximum of the
other. Since $l = \lambda k/d$, this would be the case if $\Delta l = \Delta\lambda\, k/d = \lambda/Nd$,
or if $\Delta\lambda/\lambda = 1/Nk$. The resolving power thus increases
as the number of lines in the grating increases, and as the order
of the spectrum increases.
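
Both results of this section lend themselves to a quick numerical check: the relative height of the first secondary maximum, and the smallest difference of wave length that can be separated. The grating assumed below (10,000 lines, second order, wave length 6,000 A.) and the function names are illustrative only.

```python
import math

def secondary_to_principal(N, k=1):
    """Height at l N d / lambda = N k + 3/2 relative to the principal maximum N^2 A^2."""
    beta = math.pi * (k + 1.5 / N)        # beta = pi l d / lambda
    return (math.sin(N * beta) / math.sin(beta)) ** 2 / N ** 2

def smallest_resolvable_difference(wavelength, N, k):
    """Delta lambda = lambda / (N k), in the units of the wavelength given."""
    return wavelength / (N * k)

ratio = secondary_to_principal(1000)
delta_lambda = smallest_resolvable_difference(6000.0, 10000, 2)   # angstroms
print(ratio, delta_lambda)
```

For N = 1000 the ratio is already indistinguishable from $4/9\pi^2$, or 0.045, and the assumed grating separates lines about 0.3 A. apart in the second order.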
Problems
1. Carry through a discussion of Fresnel diffraction from a slit, when the
source is at a finite distance, directly behind the center of the slit. In what
ways will the result differ from the case we have discussed?
2. Light of wave length 6,000 A. falls in a parallel beam on a slit 0.1 mm.
broad. Work out numerical values for the intensity distribution across the
slit, at three distances, first, in which the Fresnel fringes are small compared
with the size of the pattern, second in which they are of the same order of
magnitude, and third, in which they are Fraunhofer fringes. Either con-
struct Cornu's spiral yourself, from tables of Fresnel's integrals, or use the
one of Fig. 56.
3. Find the coordinates of the points at which Cornu's spiral winds up on
itself. From the chord between these points, compute the intensity behind
an infinitely broad slit, which essentially means no slit at all. Find whether
this agrees with what you should expect it to be.
4. Prove that the maxima of the function
$$\frac{\sin^2(\pi l a/\lambda)}{(\pi l a/\lambda)^2} = \frac{\sin^2\alpha}{\alpha^2}$$
are determined by the equation $\alpha = \tan\alpha$. Find the first three solutions
of this transcendental equation and compare them with the approximate
solutions $\alpha = 3\pi/2$, $5\pi/2$, $7\pi/2$.
5. Discuss the Fresnel diffraction pattern caused by an edge coincident
with the y axis, the screen occupying one-half the xy plane. The diffraction
pattern is obtained in a plane parallel to the xy plane and a distance R from
it. Plot the variation of intensity of light along the x direction from a
region inside the shadow to well into the directly illuminated area. Prove
that the intensity of light just at the edge of the geometrical shadow is
one-fourth of its value if there were no diffraction edge.
6. Evaluate the Fresnel integrals $\int_0^u \cos\frac{\pi}{2} u^2\, du$ and $\int_0^u \sin\frac{\pi}{2} u^2\, du$ in a power
series. What is the range of convergence of these series?
7. Evaluate the Fresnel integrals in series of the form
$$\cos\left(\frac{\pi}{2} u^2\right) S_1 + \sin\left(\frac{\pi}{2} u^2\right) S_2,$$
where $S_1$ and $S_2$ are power series in u. What is the range of convergence
of these series?
8. Find a semiconvergent series for the Fresnel integrals of the same form
as in Prob. 7 where the power series are now in inverse powers of u. (Hint:
Write $\int_x^\infty \cos x^2\, dx = \int_x^\infty x \cos x^2\, \frac{dx}{x}$ and integrate by parts, repeating the
process.) Calculate the remainder in these series after the nth term. Show
that this is smallest when n is about $x^2/2$.
CHAPTER XXVIII
WAVES, RAYS, AND WAVE MECHANICS
The beautiful success of the wave theory in explaining diffrac-
tion patterns, which we have been discussing in the last chapter,
has been the best proof of the correctness of this theory. But
the proof has not always gone unchallenged. Ever since the
time of Newton, at least, there has been a rival theory, the corpuscular
theory. Newton imagined light to consist of a stream
of particles. These particles, or corpuscles, traveled in straight
lines in empty space, and were reflected by mirrors as billiard
balls would be by walls, making equal angles of incidence and
reflection. Refraction was explained by supposing that different
media had different attractions for the corpuscles. Thus glass
would attract them more than air, the potential energy of a
corpuscle being constant within any one medium, but being lower
in glass than in air, so that the corpuscles would have a normal
component of acceleration toward the glass, without correspond-
ing tangential acceleration, and would be bent toward the normal
on entering the glass. By working out this idea, the law of
refraction easily follows. Newton was aware of the wave theory;
Huygens was advocating it at the time. But his objection was
that light travels in straight lines, whereas the waves he was
familiar with, waves of sound or water waves, certainly are bent
out in all directions on passing through apertures. Newton
considered this to be a fatal objection to the wave theory.
The answer to this objection, of course, came later with the
quantitative investigation of diffraction. In the preceding
chapter, we have seen that a plane parallel wave, falling on a small
aperture of dimension a, does not form a perfectly parallel ray
after emerging from the hole. On the contrary, it spreads out,
first by forming fringes on the edges of the ray (Fresnel diffraction),
then at greater distance by developing a conical form with
definitely diverging rays (Fraunhofer diffraction). The angle
of this cone is of the order of magnitude of $\lambda/a$, where $\lambda$ is the
wave length. Newton was tacitly assuming that the wave
length, as with sound, was large, that $\lambda/a$ would be large for a
small slit, and there would be large spreading out and a com-
pletely undefined ray. But it was found early in the nineteenth
century that the wave length was really so small that, with
apertures of ordinary size, we can neglect diffraction, and obtain
an almost perfectly sharp ray, a band of light separated from the
darkness by sharp, straight edges.
201. The Quantum Hypothesis. — More recently, in the present
century, a more serious argument for a corpuscular theory has
appeared. This is the hypothesis of quanta, originated by Planck
in discussing the radiation from a heated black body. The most
graphic application of this hypothesis was made by Einstein to
the theory of the photoelectric effect. It is known that light of
frequency $\nu$, falling on a metal surface, liberates electrons, as for
example in the photoelectric cell. Now the law of emission is
remarkable: the energy of each emitted electron, independent
of the intensity of the light, is a definite amount proportional to
the frequency, $h\nu$, where h is Planck's constant, equal to
$6.54 \times 10^{-27}$ in c.g.s. units, introduced by him in his first discussion.
This energy of the emitted electron is really decreased by the
amount of energy it loses in penetrating the surface, so that
$h\nu$ will act as a maximum energy, rather than the energy of each
electron. Of course, the total emission is proportional to the
intensity of the light, but increasing the intensity increases the
number of electrons, not their energies.
Einstein's hypothesis to explain the photoelectric effect was
that the energy of the wave was not to be computed in a continu-
ous manner by Poynting's vector, but that it was localized in
little particles or corpuscles (now called photons), each of energy
$h\nu$. Then it would be perfectly obvious that if no photon fell on
a spot of the metal, no electron would be ejected; but that a
photon which happened to fall on a given place would transfer
all its energy to an electron, being absorbed, and ceasing to exist
as light. The intensity of light would be measured simply by
the number of photons crossing an arbitrary surface per second,
times the energy carried by each photon.
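As a rough numerical illustration of this bookkeeping, the following Python sketch computes the energy of one photon and the number of photons per second carried by a beam. The wave length of 5,000 A. and the beam power of 1 erg per second are assumed values chosen only for illustration, and the function names are hypothetical; the constants are those quoted in this chapter.

```python
# Photon bookkeeping in c.g.s. units (a sketch; the inputs are illustrative assumptions).
H = 6.54e-27   # Planck's constant in erg-seconds, the value quoted in the text
C = 3.0e10     # velocity of light in cm per second


def photon_energy(wavelength_cm):
    """Energy h*nu of one photon, using nu = c / wavelength."""
    nu = C / wavelength_cm
    return H * nu


def photons_per_second(power_erg_per_s, wavelength_cm):
    """Number of photons per second = beam power / energy per photon."""
    return power_erg_per_s / photon_energy(wavelength_cm)


if __name__ == "__main__":
    lam = 5.0e-5  # 5,000 A. expressed in cm (assumed example)
    print(photon_energy(lam))            # roughly 3.9e-12 erg
    print(photons_per_second(1.0, lam))  # roughly 2.5e11 photons per second
```

Doubling the power in this sketch doubles the number of photons but leaves the energy of each unchanged, which is the behavior the photoelectric effect requires.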
Einstein's hypothesis found many supports. One of these
comes from the structure of atoms. Atoms emit monochromatic
spectrum lines, falling often into regular series. Bohr was able
to explain this, at least in hydrogen, the simplest atom, by assuming
that the atom was capable of existing only in certain definite
stationary states, each of a definite energy. He supposed that
radiation was not emitted continuously, as the electromagnetic
field from a rotating or vibrating particle would be, but that the
atom stayed in one energy level until it suddenly made a jump
to a second, lower, level, with emission of a photon. If the higher
energy is $E_2$, the lower $E_1$, the energy of the photon would be
$E_2 - E_1$, so that its frequency would be $E_2/h - E_1/h$. This
formula has proved to be justified by great amounts of experi-
mental material. First, it states that the frequencies emitted by
atoms should be the differences of "terms" E/h, each referring
to an energy level of the atom. This is found to be true in spec-
troscopy, and has been the most fruitful idea in the development
of that science. Even tremendously complicated spectra can
now be analyzed to give a set of terms, and the number of terms
is much less than the number of lines, since any pair of terms,
subject to certain restrictions, gives a line. But also, Bohr was
able to set up a system of mechanics to govern the hydrogen
atom, very simple in its fundamentals, though different from
classical mechanics, which gives a very simple formula for the
energy levels, agreeing perfectly with the extremely accurate
experimental values. Bohr's idea of stationary states, in turn,
was tested by experiments on electron bombardment. It was
found that an atom in a state of energy $E_1$ could be bombarded
by an electron. If the electron's energy, as determined from the
electrical difference of potential through which it had fallen, was
less than $E_2 - E_1$, where $E_2$ is the energy of the upper state (we
consider only one), it would bounce off elastically, without loss
of energy. But if its energy was $E_2 - E_1$, or greater, it would
often raise the atom to the upper state, which could be proved
by subsequent radiation by the atom, and would lose this amount
of energy itself. This definitely verified the existence of sharp
energy levels in the atom. At the same time, it furnishes an
example of a very interesting phenomenon. An electron bombards
an atom, loses energy $E_2 - E_1$. This energy is emitted
as a photon $h\nu$. The photon falls on a metal, is absorbed, ejects
a photoelectron of energy $E_2 - E_1$ (minus a little, for the work
of coming through the surface). The photoelectron bombards an
atom, loses its energy, which goes off as a photon. Energy, in
other words, passes back and forth from electrons to photons
indiscriminately. If electrons are particles, surely photons are
too.
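The statement that a few terms account for many lines can be sketched in Python. The energy levels below are made-up numbers used only to show the counting, the function name is hypothetical, and the sketch ignores the restrictions on which pairs of terms may combine.

```python
from itertools import combinations

H = 6.54e-27  # Planck's constant in erg-seconds, as quoted in the text


def line_frequencies(levels_erg):
    """Frequencies (E_upper - E_lower) / h for every pair of energy levels.

    Every pair is counted; the restrictions mentioned in the text are ignored here.
    """
    return sorted(abs(a - b) / H for a, b in combinations(levels_erg, 2))


if __name__ == "__main__":
    levels = [0.0, 1.0e-11, 1.5e-11, 1.8e-11]  # hypothetical levels, in ergs
    lines = line_frequencies(levels)
    print(len(levels), "terms give", len(lines), "lines")
    for nu in lines:
        print(nu)
```

With n terms there are n(n - 1)/2 pairs, so the number of possible lines grows much faster than the number of terms.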
202. The Statistical Interpretation of Wave Theory. — All
these phenomena suggesting photons, and a corpuscular structure
for light, must not cause one to forget that light still shows inter-
ference, and that the arguments for the wave theory are as strong
as ever. Various attempts were made to set up laws of motion
for the photons, which would lead to the correct laws of interfer-
ence and diffraction (Newton had already done it for refraction),
but without success. We can see easily why this should be so.
Consider very weak light, so weak that we only have a photon
every minute, for example, going through a diffraction grating.
Such weak light, we know experimentally, is diffracted just like
stronger light. But that means, as we saw in the last chapter,
that the resolving power depends on a cooperation of the whole
grating; if half of it were shut off, its resolving power would be
decreased, and the intensity distribution changed. Even the
single photon shows evidence of the full resolving power, in that
if we make a large enough exposure to have many photons, so
that we can develop the photograph and measure the blackening,
which surely measures the number of photons which have struck
the plate, we find the full resolving power of the grating in the
final photograph. But it is difficult to imagine any law of motion
of a photon which will depend on rulings over the whole face of a
grating, if the photons went through only one point of it.
After such difficulties, the theory that has emerged is a com-
bination of wave theory and corpuscular theory. It is assumed
that atoms emit wave fields as in the electromagnetic theory,
emitted by certain oscillators connected with the atom, and
vibrating with the emitted frequencies. These waves do not
carry energy, but serve merely to determine the probable motion
of the photons. The rate of emission of waves by the oscillator
determines the probability of emission of photons. The Poyn-
ting's vector at any point of the radiation field determines the
probability that a photon will cross unit cross section normal to
the radiation, per second. If the oscillator is damped with time,
that indicates that the probability of emission of a photon
decreases with time; that is, that the probability that the atom is
in its upper, excited state, from which it could emit the radiation,
is decreasing with time. One can carry such a probability connection
through in detail.
Probably the most graphic picture of the probability relation
between photons and waves is obtained if we imagine very weak
light, in which photons come along one in several seconds, forming
a diffraction pattern. The diffraction pattern is assumed to be
on a screen which is capable of registering the individual photons
as they come along. This screen might be a photographic plate,
in which a single photon is enough to make a grain developable,
or it might be a screen having slits opening into Geiger counters
or other devices for registering individual photons. Of course,
the only way of detecting that there was light falling on the
screen would be to detect the photons. First, one photon would
strike the screen, in one spot, then another photon in another
spot, and so on. So long as there were only a few photons, the
arrangement might seem to be haphazard. But as more and
more photons were present, we could find where they were densely
distributed, and where there were only a few. It would then
prove to be the case that the places where photons were dense
were just those places where the wave theory predicted a large
intensity, and the places where there were no photons were those
where the wave theory indicated darkness.
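The gradual build-up of the pattern from individual photons can be imitated with a short Python sketch. It assumes the standard single-slit Fraunhofer intensity, proportional to (sin πu / πu)² with u measured in units in which the first dark band falls at u = 1, and it draws arrival positions at random with that intensity as the probability. The function names, the seed, and the number of photons are illustrative choices.

```python
import math
import random


def slit_intensity(u):
    """Relative single-slit intensity (sin(pi u) / (pi u))^2, equal to 1 at u = 0."""
    if u == 0.0:
        return 1.0
    x = math.pi * u
    return (math.sin(x) / x) ** 2


def sample_photons(n, u_max=3.0, seed=0):
    """Draw n arrival positions with probability proportional to the wave intensity.

    Rejection sampling: propose a position uniformly, keep it with probability
    equal to the relative intensity there.
    """
    rng = random.Random(seed)
    hits = []
    while len(hits) < n:
        u = rng.uniform(-u_max, u_max)
        if rng.random() < slit_intensity(u):
            hits.append(u)
    return hits


def count_near(hits, center, half_width=0.25):
    """Number of photons registered within half_width of a given position."""
    return sum(1 for u in hits if abs(u - center) <= half_width)


if __name__ == "__main__":
    hits = sample_photons(5000)
    print("near the central maximum:", count_near(hits, 0.0))
    print("near the first dark band:", count_near(hits, 1.0))
```

A handful of samples looks haphazard, while several thousand reproduce the bright and dark bands of the wave theory.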
203. The Uncertainty Principle for Optics. — It is characteristic
of the theory that no law of motion of photons is assumed beyond
this probability; according to the present view, no such detailed
laws exist. Given a plane monochromatic wave of light, we
know exactly the energy of each photon ($h\nu$), and its momentum
(this proves to be $h\nu/c = h/\lambda$, pointing in the direction of the
wave normal), but, if the intensity is uniform over space, we have
no information as to the position of the photon. If we let the
plane wave fall on a slit of width a, the light passing through will
be more defined as to its position in space. It will be in the form
of a small ray or beam, spreading by diffraction, but still, in the
region of Fresnel diffraction, of width approximately a. Thus,
if x is the coordinate along the wave normal, y the coordinate at
right angles, the photon will surely be in a beam whose length
along the x axis is infinite, but of width only about a along the
y axis, as in Fig. 58. That is, the uncertainty in the y coordinate
has been reduced to a: $\Delta y = a$, if $\Delta y$ is the uncertainty. At the
same time, however, a compensating uncertainty in the momentum
has appeared. The wave is now spreading, the wave normals
making angles up to about $\lambda/a$ with the x axis, as shown
in Sec. 197. Thus, if the whole momentum remains $p = h/\lambda$,
this will have a component along y, equal to p times the sine of
the angle between the momentum and the x axis, or approximately
$p\lambda/a = h/a$. But we do not know which angle, up to the
maximum, the actual deviation will make, for all we know is that
the photon is somewhere in the diffraction pattern. Hence the
uncertainty in y momentum is of this order of magnitude of
$h/a$. If we call it $\Delta p_y$, we have the relation
$$\Delta y \, \Delta p_y = a \cdot \frac{h}{a} = h. \qquad (1)$$
Fig. 58. Uncertainty principle in diffraction through slit. $\Delta p / p = \lambda / \Delta q$, $\Delta p \, \Delta q = p\lambda = h$. (Compare Fig. 54, top diagram.)
This is an example of the uncertainty principle, concerning the
amount of uncertainty inherent in the description of the motion
of photons by the probability relations with wave theory.
Further examination indicates that this law is very general:
where a beam is limited to acquire more accurate information
about the coordinates of the photon, we make a corresponding
loss in our knowledge as to its momentum, and vice versa.
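The slit argument can be put into numbers with a short Python sketch. The slit widths and the wave length are assumed values, the function name is hypothetical, and the small-angle estimate of the spreading angle is used, as in the text.

```python
H = 6.54e-27  # Planck's constant in erg-seconds, as quoted in the text


def slit_uncertainties(slit_width_cm, wavelength_cm):
    """Return (delta_y, delta_p_y) for a photon passing a slit of width a.

    delta_y is the slit width; delta_p_y is p times the spreading angle,
    taken as lambda / a (small-angle estimate).
    """
    p = H / wavelength_cm                    # whole momentum, h / lambda
    angle = wavelength_cm / slit_width_cm    # spreading angle, about lambda / a
    return slit_width_cm, p * angle


if __name__ == "__main__":
    for a in (1.0e-1, 1.0e-2, 1.0e-3):       # assumed slit widths in cm
        dy, dp = slit_uncertainties(a, 5.0e-5)
        print(a, dp, dy * dp)                # the product stays equal to h
```

Narrowing the slit tenfold raises the momentum uncertainty tenfold, so that the product remains h.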
A similar relation holds between energy and time. Suppose
we have a shutter over our hole, and open it and close it very
rapidly, so as to allow light to pass through for only a very short
interval of time $\Delta t$. Then the wave on the far side is an interrupted
sinusoidal train of waves, and we know by our Fourier
analysis, as in Sec. 185, that the frequency is no longer a definitely
determined value, but is spread out through a frequency band
of breadth $\Delta\nu$, given by $\Delta\nu/\nu = 1/(\text{number of waves in train})$.
Now the number of waves in the train is $c\,\Delta t$, the length of the
train, divided by $\lambda$. Hence $\Delta\nu/\nu = \lambda/(c\,\Delta t)$, $\Delta\nu\,\Delta t = 1$. Using
$E = h\nu$, we have
$$\Delta E \, \Delta t = h, \qquad (2)$$
an uncertainty relation between E and t, showing that energy
and time are roughly equivalent to momentum and coordinate:
if we try to measure exactly when the photons go through the
hole, their energy becomes slightly indeterminate. Further,
here we know that the x coordinate is now determined, at any
instant of time, with an accuracy $c\,\Delta t$: the photon must be in
the little puff of light, or wave packet, sent through the pinhole
while the shutter was open. Thus $\Delta x = c\,\Delta t$. But now the x
component of momentum, which to the first order is the momentum
itself, is uncertain. For $p_x = p = h\nu/c$, $\Delta p_x = (h/c)\,\Delta\nu = h/(c\,\Delta t) = h/\Delta x$,
so that
$$\Delta x \, \Delta p_x = h, \qquad (3)$$
again the uncertainty relation. We can, in other words, make
our wave packet smaller and smaller, until it seems almost like
a particle itself, and its path is the path of the photon. The
wave packet will be reflected and refracted, just as large waves
would be, giving the laws of motion of photons in refracting
media. But if we try to go too far, making the wave packet
too small, we defeat our purpose, and make it spread out by
diffraction. We cannot, that is, get exactly accurate knowledge
about the laws of the photon's motion from the probability
relation. In some cases, this is even more obvious than here.
Thus, if a wave packet is sent through a diffraction grating, it
will spread out much as a plane wave would, into the various
orders of the diffraction pattern. We cannot, then, make any
prediction at all, except a statistical one, as to which order
of the pattern a given photon will go to. We completely lose
track of the paths of individual photons in a diffraction pattern.
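The frequency spread of an interrupted wave train, on which relations (2) and (3) rest, can be checked numerically. The Python sketch below computes, by direct numerical integration, the Fourier amplitude of a sine wave that is switched on for a time Δt. The frequency of 50 and the duration of 1 (in arbitrary units) are assumptions, as is the function name. The amplitude is largest at the nominal frequency and falls to zero at a distance 1/Δt from it, so that the breadth of the band satisfies Δν Δt = 1 in the order-of-magnitude sense used above.

```python
import math


def train_amplitude(nu, nu0, duration, steps=20000):
    """Magnitude of the Fourier component at frequency nu of the wave train
    sin(2 pi nu0 t), present only for 0 <= t <= duration (midpoint rule)."""
    dt = duration / steps
    re = 0.0
    im = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        s = math.sin(2.0 * math.pi * nu0 * t)
        re += s * math.cos(2.0 * math.pi * nu * t) * dt
        im += s * math.sin(2.0 * math.pi * nu * t) * dt
    return math.hypot(re, im)


if __name__ == "__main__":
    nu0, duration = 50.0, 1.0   # 50 waves in the train (assumed values)
    for offset in (0.0, 0.25, 0.5, 0.75, 1.0):
        nu = nu0 + offset / duration
        print(offset, train_amplitude(nu, nu0, duration))
```

Halving the duration in this sketch doubles the distance to the first zero, which is the trade between sharpness in time and sharpness in frequency.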
204. Wave Mechanics. — It is now a remarkable fact that many
indications point out that there is the same dualism between
waves and particles in mechanics that there is in optics. We have
seen one in the way energy passes from electrons to photons,
and back again. We can paraphrase our earlier remark by
saying that surely if photons are connected with waves, electrons
are connected with waves too. But there are more substantial
reasons. In discussing the statistical relation of waves and
photons, we mentioned that the electromagnetic waves were
produced by oscillators, and it appears that these oscillators
have only a statistical relation to the atoms. Thus we noted
that the oscillators connected with radiating atoms would be
exponentially damped, while the atoms were discontinuously
jumping from an excited state to a lower state from which they
did not radiate. This suggests a statistical connection between
the oscillators and the atoms or electrons, the number of atoms
in the excited state at any instant being related to the instan-
taneous amplitude of the corresponding oscillators, as the number
of photons is related to the amplitude of the electromagnetic
wave. But there are two compelling reasons which have led
to the acceptance of the connection between the motion of
particles and waves. The first was the experimental proof,
by Davisson and Germer, G. P. Thomson, and others, that
electrons can show the same sort of diffraction effects that light
shows, being diffracted by crystals, and even by ruled gratings.
The second was the fact, discussed by de Broglie and developed
by Schrödinger, that the stationary states of atoms and molecules
correspond to the various overtones of a standing wave system.
Thus the waves associated with particles not only can have
progressive form, connected with particles traveling along, but
can also exist as standing waves, and these are precisely the
oscillators which are statistically connected with the atoms,
and which represent the stationary states of Bohr's theory.
We shall elaborate the theory of these stationary states in
succeeding chapters.
It is definitely settled, then, that mechanics is just as much
a wave phenomenon as optics is. The wave mechanics leads
to Newtonian mechanics as a limiting case, just as the wave
theory of light leads to geometrical optics, where one treats
rays only, and where one can assume that the light consists of
particles following fixed paths and moving according to fixed
laws. Our work, so far in this book, has been divided roughly
into two sections, mechanics, and the electromagnetic theory
and optics. We now commence a third section, of equivalent
importance, on wave mechanics. But as the standing waves
of wave mechanics are often the atoms themselves, it is natural
that our treatment should be intimately bound up with the struc-
ture of matter, a subject which one can mostly leave out in
speaking of mechanics or optics, but which is of the very essence
of the problem with wave mechanics.
205. Frequency and Wave Length in Wave Mechanics. — If
we are considering a mechanical particle of energy E, momentum
p in a given direction, we assume that associated with it is a wave
(of course, not a light wave or a vibrational wave of a material
medium; we are now accustomed in physics to the idea of purely
mathematical waves, without reference to any medium) whose
frequency $\nu$ and wave length $\lambda$ are given by the equations
$$E = h\nu, \qquad p = \frac{h}{\lambda}, \qquad (4)$$
the wave normal being in the direction of motion of the particle.
The reason why one ordinarily is not conscious of the wave
nature of mechanics is the extraordinarily small wave length
involved. A particle of mass 1 gm., moving with velocity
1 cm. per second, has a wave length given by $h/\lambda = mv = 1$,
$\lambda = h/1 = 6.54 \times 10^{-27}$ cm., exceedingly small compared with
all ordinary dimensions. If such a particle passed through a
pinhole, the corresponding wave would be diffracted, but the
angle of spreading would be extremely small. With other
magnitudes for the mass, however, the diffraction effect can
become important. Thus an electron, of mass $9 \times 10^{-28}$ gm.,
moving, for example, with a velocity of $10^{8}$ cm. per second, has a
wave length of $\frac{6.54 \times 10^{-27}}{9 \times 10^{-28} \times 10^{8}} = 7.3 \times 10^{-8}$ cm., a quantity
of atomic dimensions. Thus if the electron passed through an
aperture of atomic size, as a hole between atoms, it could be
diffracted through a large angle. It is then evident that diffrac-
tion of electrons on an atomic scale is important; in fact, we
shall see in the next chapter that this is just why the atomic scale
is what it is.
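The two wave lengths worked out in this section can be reproduced with a few lines of Python. The function name is a hypothetical choice; the masses, velocities, and the value of h are those used in the text.

```python
H = 6.54e-27  # Planck's constant in erg-seconds, as quoted in the text


def de_broglie_wavelength(mass_g, velocity_cm_s):
    """Wave length lambda = h / p, with p = m v (non-relativistic)."""
    return H / (mass_g * velocity_cm_s)


if __name__ == "__main__":
    # A particle of 1 gm. moving at 1 cm. per second.
    print(de_broglie_wavelength(1.0, 1.0))        # 6.54e-27 cm.
    # An electron of mass 9e-28 gm. moving at 1e8 cm. per second.
    print(de_broglie_wavelength(9.0e-28, 1.0e8))  # about 7.3e-8 cm.
```

The same function shows how quickly the wave length shrinks with mass: every factor of ten in mass, at fixed velocity, removes a factor of ten from the wave length.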
206. Wave Packets and the Uncertainty Principle. — Just as
with light, we assume a statistical relation between the intensity
of the wave and the probability of finding the particle at the
corresponding point. A uniform infinite monochromatic plane
wave corresponds to a particle traveling with a definite energy
and momentum in a definite direction, but whose position is
entirely unknown. Such a mechanical system would be approxi-
mated by electrons which had been all accelerated to the same
speed in a vacuum tube, but whose individual positions we did not
know. If we wished to fix the positions, we could let the beam
of electrons fall on a screen containing a pinhole. Then any
electron found on the far side would have gone through the pinhole,
so that we would know its y coordinate with an uncertainty
$\Delta y$ (using the same coordinates as with the optical case, x normal
to the screen, y in the plane of the screen). After passing
through, the electrons would travel practically in a straight
line; but the ray will be deviated on account of diffraction, and
since the law of motion of the electron is not definitely fixed,
but is merely a probability law connecting it with the wave,
there will be an uncertainty in its y momentum, given by
$\Delta y\,\Delta p_y = h$. Similarly if we try to determine the x coordinate of the
electron by opening and closing a shutter, so that we know exactly
when it went through the hole, we thereby introduce a broadening
into the spectrum of the wave, hence an uncertainty in wave
length of the particle, and finally in its x component of momentum,
given by $\Delta x\,\Delta p_x = h$. Thus the principle of uncertainty
operates with particles as with photons.
The wave packet, as set up in this way, may be made extremely
small without diffraction, if the wave length is as small as it often
is. Thus with a particle of the mass of familiar objects, the wave
function representing the motion of its center of gravity can be
concentrated in a region much smaller than atomic dimensions,
without being troubled by diffraction. This packet would then,
in a force field, travel around in a certain way without appreciable
spreading. We know at each instant that the particle is within
the packet. Thus for all practical purposes the law of motion
of the packet is the same as the law of motion of the particle.
This then is the direction in which we look for the derivation
of Newtonian mechanics from wave mechanics. We at once
see that the motion of a wave packet in mechanics will be more
complicated than in optics, for the wave length in mechanics,
$\lambda = h/p$, changes continuously from place to place. If we
have a conservative motion, for which alone it is easy to formulate
wave mechanics, we have $p^2/2m + V = E$, $\lambda = h/p = h/\sqrt{2m(E - V)}$,
a function of position on account of V. E
stays constant, as usual, so that the frequency is constant, as
in optics. But the variable $\lambda$ corresponds to a variable index
of refraction. There are only a few optical cases where this is
true. Generally the index changes sharply from one medium
to another, and the ray of light consists of segments of straight
lines. In refraction by the atmosphere, however, as in astron-
omy, or in the refraction by heated air over the surface of the
earth, as in mirages, the path of the light rays is curved instead
of sharply bent, and this corresponds to the usual mechanical
case, where the paths or orbits are curved. To proceed further
with the connection between wave mechanics and Newtonian
mechanics, we must first investigate the shape of a ray in a case
where the index changes with position. The general principle
governing this is called Fermat's principle.
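The dependence of wave length on position can be made concrete with a small Python sketch of the formula λ = h/√(2m(E − V)). The electron mass and energy are the values used earlier in the chapter (an electron moving at 10^8 cm. per second has kinetic energy 4.5 × 10^-12 erg); the potential values and the function names are assumptions. The ratio of the wave length at V = 0 to the local wave length is used here as a stand-in for a relative index of refraction.

```python
import math

H = 6.54e-27  # Planck's constant in erg-seconds, as quoted in the text


def local_wavelength(mass_g, energy_erg, potential_erg):
    """Wave length h / sqrt(2 m (E - V)) at a point where the potential is V."""
    kinetic = energy_erg - potential_erg
    if kinetic <= 0.0:
        raise ValueError("E - V must be positive for a real wave length")
    return H / math.sqrt(2.0 * mass_g * kinetic)


def relative_index(energy_erg, potential_erg):
    """lambda(V = 0) / lambda(V): the analogue of a relative refractive index."""
    return math.sqrt((energy_erg - potential_erg) / energy_erg)


if __name__ == "__main__":
    m, E = 9.0e-28, 4.5e-12   # electron with velocity 1e8 cm. per second at V = 0
    for fraction in (0.0, 0.25, 0.5, 0.75):
        V = fraction * E      # assumed potential energies
        print(fraction, local_wavelength(m, E, V), relative_index(E, V))
```

As V rises toward E the particle slows, the wave length grows, and the effective index falls, which is what bends the ray.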
207. Fermat's Principle. — Assume that we have an optical
system, with a ray traveling from $P_1$ to $P_2$. We may start the
ray by letting parallel light fall on a pinhole, so that really the
light travels in a narrow beam, eventually reaching $P_2$. We
assume that the dimensions are so large that diffraction can
be neglected. Then suppose we compute the time taken for
light to pass from the point $P_1$ to $P_2$ along the actual ray. This
is $\int_{P_1}^{P_2} \frac{ds}{v}$, where the integral is a line integral, computed
along the ray from $P_1$ to $P_2$, ds is the element of length
along the ray, and v is the velocity, a function of position if
the index of refraction changes from point to point. Next,
suppose that we compute the same integral for other paths
joining $P_1$ and $P_2$, but differing in between. Since in general
the integral is not independent of path, we shall get different
answers. In general, if we go from one path to another, the
difference of the integral between the paths will be of the same
order of small quantities as the displacement of the path. But
Fermat's principle says that if one path is the correct ray, and
the other is slightly displaced from it, the difference in the integral
is of a higher order of small quantities. This is a sort of condition
met in the calculus of variations. In that subject the symbol
$\delta \int_{P_1}^{P_2} \frac{ds}{v}$ denotes the variation
of the integral, and it means the difference between the integral
over one path, and over another infinitely near to it. Fermat's
principle says that the variation of the integral is zero for the
actual path; meaning that the actual variation is infinitesimal
of a higher order than the variation of path, so that it vanishes
in the limit of small variation of path. The idea of the varia-
tion of an integral is closely analogous to that of the differential
of a function in ordinary calculus. Thus, if the variation of an
integral is zero, for a given path, that means that the integral
itself is a maximum or minimum with respect to variations of
path; or, more generally, that it is stationary, not changing with
small variations of path. Setting the variation equal to zero
corresponds to setting the derivative of a function equal to
zero in calculus.
Let us verify Fermat's principle in two simple cases.
First, we assume that v is everywhere constant, so that there
are no mirrors or lenses. Then we can take v outside the
integral, dividing through by it, and having $\delta \int_{P_1}^{P_2} ds = 0$. That
is, the true path of light between $P_1$ and $P_2$ is that line which has
minimum (or maximum) length, and joins $P_1$ and $P_2$. Obviously
the minimum is desired in this case; and the shortest line
between $P_1$ and $P_2$ is a straight line, which then is the ray. Let
us compute the variation of path, to check the variation
principle. In Fig. 59 (a), we show the straight line joining $P_1$ and
$P_2$, and also a varied path, $P_1BP_2$. The length of this second path is
$$2\sqrt{(P_1A)^2 + (AB)^2} = 2(P_1A)\left[1 + \frac{1}{2}\frac{(AB)^2}{(P_1A)^2} + \cdots\right] = (P_1P_2) + \frac{2(AB)^2}{(P_1P_2)} + \cdots,$$
differing from the direct path $P_1P_2$ by an infinitesimal
of the second order, if (AB), the deviation of the path $P_1BP_2$
from $P_1AP_2$, is regarded as small of the first order. In other
words, the path $P_1AP_2$ satisfies the condition that the variation
of its length is zero (that is, small of the second order). On the
other hand, if we started with a crooked path, as $P_1AP_2$ in (b),
then the path $P_1BP_2$ differs from it approximately by the amount
(BC) + (BD), or approximately $2(AB)\sin\theta$, an infinitesimal
of the same order as (AB), so that in this case the variation is not
zero, and the crooked path is not the correct one.
Fig. 59. Variation of length of path.
(a) The straight line $P_1AP_2$ differs in length from the varied path $P_1BP_2$ by a small quantity of the order of the square of AB.
(b) The broken line $P_1AP_2$ differs from $P_1BP_2$ by a quantity of the order of AB itself. Hence the straight line of (a), rather than the broken one of (b), is the one for which the variation of length is zero.
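The difference between a second-order and a first-order variation can be seen numerically. In the Python sketch below the end points are placed a distance 2 apart, so that $P_1A = 1$, and the point B is displaced sideways from A; the particular displacements, the amount of bend in the crooked path, and the function names are assumptions made for illustration.

```python
import math


def path_length(half_length, sideways):
    """Length of a path P1 -> B -> P2, where B lies a distance `sideways`
    off the midpoint of the straight line joining P1 and P2."""
    return 2.0 * math.hypot(half_length, sideways)


def excess_over_straight(displacement, half_length=1.0):
    """Extra length when the straight path is varied by `displacement`."""
    return path_length(half_length, displacement) - path_length(half_length, 0.0)


def excess_over_crooked(displacement, bend=0.5, half_length=1.0):
    """Extra length when a path already bent by `bend` is varied further."""
    return (path_length(half_length, bend + displacement)
            - path_length(half_length, bend))


if __name__ == "__main__":
    for d in (0.04, 0.02, 0.01):
        print(d, excess_over_straight(d), excess_over_crooked(d))
```

Halving the displacement cuts the excess over the straight path to about a quarter, but cuts the excess over the crooked path only to about a half, in agreement with the orders of magnitude stated above.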
As a second example, we take the case of reflection. In Fig.
60, consider the path $P_1AP_2$, connecting $P_1$ and $P_2$, satisfying the
law of reflection on the mirror OA. This path evidently equals
$P_1'AP_2$ in length, where $P_1'$ is the image of $P_1$. Similarly a
slightly different path $P_1BP_2$ equals $P_1'BP_2$, which is therefore
longer, since $P_1'AP_2$ is the straight line connecting $P_1'$ and $P_2$.
In other words, $P_1AP_2$ makes the integral a minimum, and is the
correct path. In this case we could again easily show that the
integral along $P_1BP_2$ differed from that along $P_1AP_2$ by
quantities in the square of AB, verifying our statement that if
the path is displaced by small quantities of the first order
(AB) the integral is changed only in the second order ($AB^2$).
A similar proof can be carried through for the case of refraction,
showing that the law of refraction is given by Fermat's principle.
Fig. 60. Fermat's principle for reflection. The path $P_1AP_2$, equal to $P_1'AP_2$, differs in length from its neighbor $P_1BP_2$ by a small quantity of the order of the square of AB.
A fundamental proof of Fermat's principle can be given
directly from the determination of the ray from diffraction theory.
The condition that a point $P_2$ lie in the ray, if we discuss diffraction
through the aperture by Huygens' principle as in the last chapter,
is that the various paths leading from $P_1$ to $P_2$, by going to various
points of the aperture, and then being scattered in Huygens'
wavelets from there to $P_2$, should be approximately the same, so
that the light can interfere constructively at $P_2$. This means
that such paths, as measured in wave lengths, are all approximately
the same length. In other words, for constructive interference,
$\int_{P_1}^{P_2} \frac{ds}{\lambda}$, the number of wave lengths between $P_1$ and $P_2$,
must be independent of slight variations in the path, or $\delta \int_{P_1}^{P_2} \frac{ds}{\lambda} = 0$.
This clearly is the condition whether $\lambda$ is independent of position
or not, for, even if the waves change in length from point to point,
we must still have the waves interfere to get the ray, and this
still demands the same number of wave lengths along neighboring
paths. Now $\lambda = v/\nu$, and since $\nu$, the frequency, is a constant
throughout the path of the light, we may then write the variation
as $\nu\,\delta \int_{P_1}^{P_2} \frac{ds}{v} = 0$, from which, dividing by $\nu$, we have Fermat's
principle. This interpretation in terms of the interference of
the waves along the ray is the fundamental meaning of Fermat's
principle.
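The statement made in this section that Fermat's principle yields the law of refraction can be tested numerically. The Python sketch below places a plane boundary along y = 0, with velocity v1 above it and v2 below, searches for the crossing point that makes the time of travel least, and then compares sin θ1/v1 with sin θ2/v2. The positions of the end points, the two velocities, the function names, and the use of a ternary search are all assumptions of the sketch, not a method taken from the text.

```python
import math


def travel_time(x, p1, p2, v1, v2):
    """Time along P1 -> (x, 0) -> P2, with velocity v1 on P1's side of the
    boundary y = 0 and velocity v2 on P2's side."""
    (x1, y1), (x2, y2) = p1, p2
    return math.hypot(x - x1, y1) / v1 + math.hypot(x2 - x, y2) / v2


def fastest_crossing(p1, p2, v1, v2, iterations=200):
    """Ternary search for the crossing point of least time.

    The time is a convex function of x, and its minimum lies between the
    x coordinates of the two end points.
    """
    lo, hi = min(p1[0], p2[0]), max(p1[0], p2[0])
    for _ in range(iterations):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if travel_time(a, p1, p2, v1, v2) < travel_time(b, p1, p2, v1, v2):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)


def sines_of_angles(x, p1, p2):
    """Sines of the angles the two segments make with the normal to the boundary."""
    (x1, y1), (x2, y2) = p1, p2
    return ((x - x1) / math.hypot(x - x1, y1),
            (x2 - x) / math.hypot(x2 - x, y2))


if __name__ == "__main__":
    p1, p2 = (0.0, 1.0), (1.0, -1.0)   # assumed end points on opposite sides
    v1, v2 = 1.0, 0.75                 # assumed velocities in the two media
    x = fastest_crossing(p1, p2, v1, v2)
    s1, s2 = sines_of_angles(x, p1, p2)
    print(x, s1 / v1, s2 / v2)         # the last two numbers should agree
```

Agreement of the two ratios is the law of refraction, sin θ1 / sin θ2 = v1/v2, recovered from the condition of stationary time alone.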
208. The Motion of Particles and the Principle of Least Action.
We shall now show that if we use the analogue to Fermat's principle
in mechanics, it leads to the correct motion of the particle
according to Newtonian mechanics. As we have seen, the wave
problem representing the motion of a single particle whose variables
we know is a ray. And the path of this ray is given by
Fermat's principle, which we may write in the form $\delta \int ds/\lambda = 0$.
But now in wave mechanics, $h/\lambda = p$, the momentum, so that,
canceling out the constant factor h, this becomes $\delta \int p\,ds = 0$.
But this is a well-known equation of ordinary mechanics: the
integral $\int p\,ds$, or $\int p\,dq$, if q is the coordinate in a one-dimensional
motion, is called the action, and the principle $\delta \int p\,dq = 0$, showing
that the action is a maximum or more often a minimum, is called
the principle of least action. And by the calculus of variations
we can show that the principle of least action leads to Lagrange's
equations, as the equations giving the motion of a particle which
obeys the principle. This principle, or a closely related one called
Hamilton's principle, also stated in terms of the calculus of varia-
tions, is often considered a fundamental formulation of the whole
of mechanics, more fundamental than Newton's laws of motion,
since these, in the form of Lagrange's equations, follow from it.
As a matter of fact, the derivation of Lagrange's equations from
the variation principle is the simplest way of deriving them, for
one familiar with the calculus of variations, and leads to the
equations directly in any arbitrary coordinate system. But
here we have gone even farther: we have sketched the derivation
of the principle of least action from wave mechanics, as the law
giving the shape of a ray, determined from interference of the
waves. As we see from this, wave mechanics is the fundamental
branch of mechanics, and ordinary Newtonian mechanics, the
mechanics of particles, is derived from it.
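The principle of least action can be checked numerically for a familiar Newtonian motion. The Python sketch below takes a projectile in a uniform gravitational field, with m = g = 1 and E = 1 in arbitrary units and a launch angle of 30°, all assumed for illustration, and computes $\int p\,ds$ with $p = \sqrt{2m(E - V)}$ along paths joining the point of launch to the point of landing. Each path is varied by a small multiple of a sine curve that vanishes at both ends, and the first-order change of the action is estimated by a central difference. The function names are hypothetical.

```python
import math


def action(y, dy, x_end, energy, steps=20000):
    """Integral of p ds along the curve y(x), 0 <= x <= x_end, for m = g = 1:
    p = sqrt(2 (E - y)) and ds = sqrt(1 + y'^2) dx (midpoint rule)."""
    dx = x_end / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx
        p = math.sqrt(2.0 * (energy - y(x)))
        total += p * math.sqrt(1.0 + dy(x) ** 2) * dx
    return total


def first_order_change(y, dy, x_end, energy, eps=1.0e-3):
    """Central-difference estimate of d(action)/d(eps) for the varied path
    y(x) + eps * sin(pi x / x_end), which leaves both end points fixed."""
    def bump(x):
        return math.sin(math.pi * x / x_end)

    def bump_slope(x):
        return math.pi / x_end * math.cos(math.pi * x / x_end)

    def varied(sign):
        return action(lambda x: y(x) + sign * eps * bump(x),
                      lambda x: dy(x) + sign * eps * bump_slope(x),
                      x_end, energy)

    return (varied(1.0) - varied(-1.0)) / (2.0 * eps)


# Newtonian trajectory for E = 1, launched from the origin at 30 degrees.
E = 1.0
THETA = math.radians(30.0)
V0_SQ = 2.0 * E                             # square of the launch speed
RANGE = V0_SQ * math.sin(2.0 * THETA)       # horizontal distance to landing


def parabola(x):
    return x * math.tan(THETA) - x * x / (2.0 * V0_SQ * math.cos(THETA) ** 2)


def parabola_slope(x):
    return math.tan(THETA) - x / (V0_SQ * math.cos(THETA) ** 2)


if __name__ == "__main__":
    print(first_order_change(parabola, parabola_slope, RANGE, E))
    print(first_order_change(lambda x: 0.0, lambda x: 0.0, RANGE, E))
```

The first number, for the true parabola, is zero to within the accuracy of the integration; the second, for a straight line between the same two points at the same energy, is not, so that only the Newtonian path makes the action stationary.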
Problems
1. Assume in Fig. 61 that POP' is the path of the optically correct ray
passing from one medium into a second one of different refractive index.
Prove Fermat's principle for this case, showing that the time for the ray to
pass along a slightly different path, as PAP', differs from that along POP' by
a small quantity of higher order than the distance AO. The figure is drawn
so that AB, CO, are arcs of circles with centers at P and P', respectively,
and it is to be noted that for small AO, the figures AOB, AOC, are almost
exactly right triangles.
Fig. 61. — Fermat's principle for refraction.
2. An electron of charge $e = 4.774 \times 10^{-10}$ electrostatic units falls
through a difference of potential of V volts (1 volt = 1/300 e.s.u.) and
bombards a target, converting all its energy into radiation, which travels
out as one photon. Using the relations that the energy of the photon =
$h\nu$, $\nu = c/\lambda$, where c, the velocity of light, is $3 \times 10^{10}$ cm. per second, find
the wave length of the resulting radiation. Find the number of volts necessary
to produce visible light of wave length 5,000 A. (1 A. is $10^{-8}$ cm.);
x-rays of wave length 1 A.; gamma rays of wave length 0.001 A.
3. Assume that light falls on a metal and ejects photoelectrons, the energy
required to pull an electron through the surface being at least 2 volts. Find
the photoelectric threshold frequency, the longest wave length which can
eject electrons, remembering that the long wave lengths have small photons
which have not enough energy. Discuss the effect of work function (the
energy required to pull the electron out) on photoelectric threshold.
4. Newtonian mechanics becomes inaccurate when the wave length of the
particle becomes of the same order of magnitude as the dimensions involved.
Consider the accuracy of Newtonian mechanics in the problem of an electron
in an atom. Assume for purposes of calculation that the electron moves
in a circular orbit of radius 0.5 A., with an angular momentum $h/2\pi$ (determine
its speed, and hence wave length, from this fact).
5. Consider as in Prob. 4 the accuracy of Newtonian mechanics for a
hydrogen atom in a hydrogen molecule. The hydrogen atom weighs about
1,800 times as much as an electron. Assume the speed of the atom to be
such that its energy is the mean kinetic energy of a one-dimensional oscillator
in temperature equilibrium at temperature 300° abs., or $\frac{1}{2}kT$, where
$k = 1.31 \times 10^{-16}$, T is the absolute temperature. Compare the wave length
with the amplitude of oscillation of the atom. To find this, assume that it
oscillates with simple harmonic motion, and that its frequency of oscillation
is 3,000 cm$^{-1}$. (The unit of frequency, cm$^{-1}$, is the frequency associated
with a wave length of 1 cm.) Knowing the frequency, mass, and total energy, it
is then possible to find the amplitude.
6. Consider, as in Prob. 5, the same hydrogen molecule at 10° abs.; an
atom of atomic weight 100, in a diatomic molecule of two like atoms, similar
to the hydrogen molecule, with the same restoring force acting between the
atoms (therefore with a much slower speed of vibration, on account of the
larger mass), at 300° abs.; at 10° abs.
7. Consider whether the uncertainty principle is important in phenomena
of astronomical magnitude. Assume a body of the mass of the earth (found
from its radius of 4,000 miles, mean density 5.5), moving with a speed of
20 km. per second. Now a measurement of the position is considered, in
wave mechanics, to introduce an uncertainty in the velocity, determined in
terms of the uncertainty in the measurement of position by the relation
$\Delta p\,\Delta q = h$. Suppose that the position of the body was determined in space
with an error of only 1 m. (a much greater accuracy, of course, than could
be really obtained). Find the corresponding uncertainty in momentum,
and the angle $\theta$ through which the path is deviated by the measurement.
Find how far from its original path the deviation would carry the body in a
year.
8. Conjugate foci in optics are points connected by an infinite number of
possible correct paths. Thus by Fermat's principle the optical path, or
length of time taken to traverse the ray, is stationary for each of these paths,
meaning that the optical path is the same for each. Discuss this, showing
that for the conjugate foci of a simple lens the optical path is the same for
each ray, carrying out the actual calculation of time.
9. Using the properties of conjugate foci mentioned in Prob. 8, prove
that if a hollow ellipsoid of revolution is silvered, to form a mirror, the
foci of the ellipsoid are optical conjugate foci. Prove that a paraboloidal
mirror forms a perfect image of a parallel plane wave coming along its axis.
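The arithmetic asked for in Prob. 7 can be sketched numerically. The following is only an illustration under stated assumptions: the values of Planck's constant, the mile-to-centimeter conversion, and the length of the year are supplied here and do not come from the text, and the momentum uncertainty is assumed to act at right angles to the motion, so that the deviation angle is simply the ratio of the uncertainty to the momentum.

```python
import math

# Constants supplied for this sketch (not given in the text), in c.g.s. units.
H = 6.626e-27                       # Planck's constant, erg s
CM_PER_MILE = 1.609344e5
SECONDS_PER_YEAR = 3.156e7

radius = 4000.0 * CM_PER_MILE       # radius of the body, cm
density = 5.5                       # mean density, g / cm^3
mass = density * (4.0 / 3.0) * math.pi * radius ** 3   # g
speed = 20.0e5                      # 20 km per second, in cm / s

delta_q = 100.0                     # position error of 1 m, in cm
delta_p = H / delta_q               # from delta_p * delta_q = h
# Assumption: delta_p is transverse, so the small deviation angle is delta_p / p.
theta = delta_p / (mass * speed)    # radians
drift_per_year = theta * speed * SECONDS_PER_YEAR      # cm

print(f"mass = {mass:.3e} g, theta = {theta:.3e} rad, drift = {drift_per_year:.3e} cm")
```

With these inputs the deviation is many orders of magnitude below anything observable, which is the point the problem is driving at.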
CHAPTER XXIX
SCHRODINGER'S EQUATION IN ONE DIMENSION
The mathematical treatment of wave mechanics starts with a
wave equation, similar to those of mechanical vibrations or of
light. We shall not try to derive this equation from more
fundamental principles, as we derive the equation of mechanical
vibration from Newton's equations, or the wave equation of
optics from Maxwell's equations; there are some ways of stating
wave mechanics apparently somewhat more fundamental than
the wave equation, but they are not the best methods to start
one's study with. We shall thus commence by postulating the
wave equation, though arriving at its form by analogy with
other cases. In this chapter we take only the form not involving
the time, since this has a close analogy to optics. The form
including the time is more remarkable, in that it involves com-
plex quantities explicitly in its statement. We shall later
treat it, separate variables in it, and show that the part inde-
pendent of time is the equation treated in this chapter. This
equation was first given by Schrodinger, and is called Schrodin-
ger's equation.
As we recall, the index of refraction, and wave length, of the
waves vary from point to point. This means that the differential
equation is very much like that of the nonuniform string, which
we discussed in Chap. XIV. We shall be able to use the same
approximate solution developed for that problem. We shall
also get the condition for stationary waves, corresponding to
the string held at both ends. This is the so-called quantum
condition, and it now determines, not the overtones of a vibrating
string, but the energy levels and stationary states of atoms and
other systems. The problem, as in the string, leads to expansion
in orthogonal functions, and we shall consider this theory in
later chapters.
209. Schrodinger's Equation. — The wave equation of optics,
after the time is eliminated, can be written $\nabla^2 u + (4\pi^2/\lambda^2)u = 0$,
where u is the displacement. In the mechanical problem,
$h/\lambda = p$ = momentum. We assume a potential function V
(wave mechanics is very difficult to formulate when there is no
potential). Then the kinetic energy is $p^2/2m$, so that
$p^2/2m + V = E$, the total energy, and $p = \sqrt{2m(E - V)}$.
Thus we have the equation
$$\nabla^2 u + \frac{8\pi^2 m}{h^2}(E - V)u = 0,$$
or
$$-\frac{h^2}{8\pi^2 m}\nabla^2 u + Vu = Eu. \qquad (1)$$
These are two forms of Schrodinger's equation in the form not
involving the time.
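The substitution that leads from the optical equation to Eq. (1) can be written out step by step. This is only a sketch of the argument just given, using the symbols already defined and nothing further:

```latex
\begin{align*}
\frac{4\pi^2}{\lambda^2}
  &= \frac{4\pi^2 p^2}{h^2}            && \text{since } \lambda = h/p, \\
  &= \frac{8\pi^2 m}{h^2}\,(E - V)     && \text{since } p^2 = 2m(E - V),
\end{align*}
so that
\begin{align*}
\nabla^2 u + \frac{4\pi^2}{\lambda^2}\,u = 0
\quad\Longrightarrow\quad
\nabla^2 u + \frac{8\pi^2 m}{h^2}\,(E - V)\,u = 0 .
\end{align*}
```

Multiplying the last equation through by $-h^2/8\pi^2 m$ and carrying the term in Eu to the right-hand side gives the second form, Eq. (1).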
Suppose that a solution of this equation is u(x, y, z). Then the
corresponding solution of the problem involving the time is this
times an exponential function of the time. Since the frequency
$\nu$ is $E/h$, this is $e^{2\pi i E t/h}\,u(x, y, z)$. We note that the differential
equation for u, and hence the resulting solution, depend on the
energy E, just as the function describing the shape of a vibrating
string depends on the frequency. Hence we should properly
use a subscript, $u_E(x, y, z)$. The general solution would now
be a sum of such solutions for all different values of E,
$$\sum_E A_E\, e^{2\pi i E t/h}\, u_E(x, y, z), \qquad (2)$$
as we had a sum of solutions as the general solution for the vibrat-
ing string.
210. One-dimensional Motion in Wave Mechanics. — For
one-dimensional motion, where u is a function of x alone, Schrodinger's
equation becomes
$$\frac{d^2 u}{dx^2} + \frac{8\pi^2 m}{h^2}(E - V)u = 0. \qquad (3)$$
Since in general V is a function of x, this is an equation very much
like that of the string with variable density but constant tension.
Just as with that problem, we can easily set up an approximate
solution of the problem, if the quantity E — V, corresponding
to the density, does not change by too large a fraction of itself
in a wave length, though the exact solution is generally difficult,
and has been worked out in only a few special cases. The
approximate solution is easily shown, by the method used in
Chap. XIV, to be
$$\frac{\text{constant}}{\sqrt[4]{E - V}}\, e^{\pm\frac{2\pi i}{h}\int p\,dx}, \qquad (4)$$
where p has the value $\sqrt{2m(E - V)}$, as before. This method
of solution, as applied to wave mechanics, is often known as
the Wentzel-Kramers-Brillouin method. It immediately leads
to one result of physical interest, when we consider the amplitude
of the wave.
We have seen in the last chapter that the intensity of the
wave measured the probability of finding the particle at the corre-
sponding point, just as in optics the intensity of the light-wave
measures the probability of finding the photon. Now, if we use
the wave function given above, with its complex exponential,
we must evidently multiply by its conjugate to get the intensity,
or the square of its amplitude:
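$$u\bar{u} = \frac{\text{constant}}{\sqrt[4]{E - V}}\, e^{\pm\frac{2\pi i}{h}\int p\,dx} \times \frac{\text{constant}}{\sqrt[4]{E - V}}\, e^{\mp\frac{2\pi i}{h}\int p\,dx} = \frac{\text{constant}}{\sqrt{E - V}} = \frac{\text{constant}}{p}. \qquad (5)$$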
To get the probability that the particle is in a small element of
length ds, we must multiply through by ds, obtaining a constant
$\times\; ds/p$. But now suppose a particle were moving along
the x axis according to the Newtonian mechanics, with the
same energy E, in the same potential field V. The length of
time which it would spend in any small element of length ds
would be ds/v, or m ds/p. Apart from the arbitrary constant,
which could be determined to bring agreement, this is just
like the quantum expression. If we knew that the classical
particle was moving in this way, but did not know when it
started, all we could say would be that the probability of find-
ing the particle in a given region at any time was proportional
to the length of time which it would have to spend in that region.
In other words, our solution, of constant energy, corresponds
to a classical particle whose energy is determined but whose
initial time of starting is undetermined, and we can find from
our wave function the probability of finding it in any region.
To the approximation to which the Wentzel-Kramers-Brillouin
solution is correct, the classical and quantum probabilities agree
exactly, but they do not to a higher approximation. At any rate,
however, we can say that the wave function is large in regions
where the particle is likely to be, or is moving slowly, and is small
where the particle is moving rapidly and is unlikely to be. It
should be stated that sometimes, instead of the wave function
with complex exponential, we use the corresponding real wave
function
$$\frac{\text{constant}}{\sqrt[4]{E - V}} \cos \frac{2\pi}{h}\int p\,dx. \qquad (6)$$
In this case, the probability function has a factor of $\cos^2 \frac{2\pi}{h}\int p\,dx$,
introducing a sinusoidal fluctuation of probability which must be
ignored in making comparisons with the classical probability.
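The classical probability used in this comparison, proportional to ds/p, is easy to tabulate. The sketch below is a minimal illustration and not part of the text's development: the function name, the choice of units (m = 1), and the sample potential $V = x^2/2$ are all assumptions made for the example.

```python
import math

def classical_density(V, E, x_left, x_right, n=2000, m=1.0):
    """Return cell midpoints and the normalized classical probability density,
    proportional to 1/p = 1/sqrt(2m(E - V)), between the turning points."""
    dx = (x_right - x_left) / n
    xs = [x_left + (i + 0.5) * dx for i in range(n)]   # midpoints avoid p = 0
    weights = [1.0 / math.sqrt(2.0 * m * (E - V(x))) for x in xs]
    norm = sum(weights) * dx                            # numerical normalization
    return xs, [w / norm for w in weights]

# Hypothetical example: V = x^2 / 2 with E = 2, so the turning points are x = -2, +2.
V = lambda x: 0.5 * x * x
E = 2.0
a = math.sqrt(2.0 * E)
xs, rho = classical_density(V, E, -a, a)

print("density near the center   :", rho[len(rho) // 2])
print("density near turning point:", rho[0])
```

The density is smallest at the center, where the particle moves fastest, and rises toward the turning points, where it moves slowly, in line with the statement about where the wave function is large.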
In the preceding paragraph, we have tacitly assumed that
the kinetic energy E — V was always positive, so that p was
real. But in many problems, as we have seen from our discus-
sion of classical mechanics, this is true only in limited regions,
and outside these regions p becomes imaginary. Even in this
case, the method of Wentzel, Kramers, and Brillouin is still
formally correct. But there are two physical differences. First,
$\pm\frac{2\pi i}{h}\int p\,dx$ is now real, so that we have a real exponential,
either increasing or decreasing with x, depending on the sign.
Secondly, to keep the whole function real, we must make the
first factor constant/$\sqrt[4]{V - E}$, which amounts to changing
the constant by multiplying by $\sqrt[4]{-1}$. The approximate
solution does not hold at all in the neighborhood of the point
where the kinetic energy is zero, for there the wave length is
infinite, and the assumption that E — V changes only a little
in the distance of a wave length cannot be true. But we can
easily see how to construct an approximate solution in this
region, for the differential equation here is simply $d^2u/dx^2 = 0$,
the equation of a straight line; the actual curve of u against x,
as we readily see, has a point of inflection at the point where
E = V, being concave downward where the kinetic energy is
positive, concave upward where it is negative. We can then
take the exponential solution in the region of negative kinetic
energy, and the oscillatory one in the region of positive kinetic
energy, and join them by a line which is approximately straight.
It is obvious, as we see for instance in Fig. 62, that, if we know
beforehand the constants of the exponential solution (as for
instance the amplitudes of the two terms, one increasing and
the other decreasing exponentially, which we must add to get the
complete solution) the initial value and slope of the sinusoidal
solution must be definitely determined to make the two join
smoothly. That is, the phase of the sinusoidal solution, or the
amplitudes of sine and cosine functions which we add together,
Fig. 62. — Joining of exponential and sinusoidal functions at point where
p = 0. Upper curve shows potential and total energy against x, lower curve
shows wave function. The exponential part of the function is so chosen that
the amplitude of the term increasing exponentially with decreasing x is zero;
otherwise the function would go to infinity instead of asymptotically to zero as
x became negatively infinite.
are determined. The same thing is true at every such boundary
that we cross; if we once determine the two arbitrary constants
in one part of the region, the whole function is determined, to
make exponential and sinusoidal curves join smoothly. This
must naturally be true, since the differential equation is one of
the second order, with just two arbitrary constants.
211. Boundary Conditions in One-dimensional Motion. —
Suppose, first, that we consider a mechanical problem where the
kinetic energy is always positive. Then there are no regions
where the wave function is exponential; it is always sinusoidal,
of finite amplitude. For any energy we have two solutions,
which, bringing in the time but writing in exponential form, are
$$\frac{\text{constant}}{\sqrt[4]{E - V}}\, e^{\frac{2\pi i}{h}\left(Et \pm \int p\,dx\right)}, \qquad (7)$$
of which the real parts represent progressive waves traveling to
left or right along the x axis. This corresponds to the fact that
the corresponding mechanical particles can travel in either direc-
tion, and, as we have seen, the intensity of the wave at any point
properly agrees with the probability that the particle should
be in that region, as computed classically on the assumption
that we do not know when the particle started.
Next let us assume that E — V remains positive to infinity in
one direction, say to the right, but becomes negative to the left
of a certain point, say $x = x_1$, as in Fig. 62. The solution will
then be exponential to the left of $x = x_1$. But in general it will
be a linear combination of two exponential functions, one increasing
exponentially in magnitude to $\infty$ as x approaches $-\infty$, the
other decreasing exponentially to zero. If the amplitude of the
former is different from zero, then the intensity of the wave will
be infinite at $-\infty$, meaning that the probability of finding the
particle at $-\infty$ is infinitely greater than of finding it anywhere
else. This is ordinarily not the physical situation we wish to
describe; hence we must assume that the amplitude is zero, and
that the solution to the left of $x_1$ has just the one term
$$\frac{A}{\sqrt[4]{V - E}}\, e^{\frac{2\pi}{h}\int_{x_1}^{x}\sqrt{2m(V - E)}\,dx}, \qquad (8)$$
which goes to zero at $x = -\infty$. But now this comes up to the
point $x = x_1$ with a definitely determined slope (or rather, the
ratio of slope to function, in which the arbitrary constant factor
cancels out, is definitely determined). Then there is just a very
definite sinusoidal function which joins onto this, as Fig. 62
suggests: the approximate solution for $x > x_1$ is given by
$$\frac{\text{constant}}{\sqrt[4]{E - V}} \sin\left(\frac{2\pi}{h}\int_{x_1}^{x} p\,dx + \alpha\right). \qquad (9)$$
It can be shown that for continuity of Eqs. (8) and (9) at $x_1$ we
must have $\alpha = \pi/4$ = 45 deg. This statement means that the
sine curve, instead of having a node at $x_1$, has already at that
point passed through an eighth of a wave length. It is as if this
eighth wave length were stretched out to infinity to form the
exponential part of the curve.
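The eighth-wave-length rule can be checked numerically in the simplest case. The sketch below rests on assumptions that are not in the text: the kinetic energy is taken to vary linearly through the turning point, the turning point is placed at x = 0, and units are chosen so that the wave equation reads $d^2u/dx^2 + xu = 0$. In these units the phase $\frac{2\pi}{h}\int p\,dx$ becomes $\frac{2}{3}x^{3/2}$. The equation is integrated by an ordinary fourth-order Runge-Kutta rule, starting on the decaying exponential, and the nodes found are compared with those predicted by the sine function with $\alpha = \pi/4$.

```python
import math

def integrate_linear_turning_point(x_start=-8.0, x_end=12.0, step=1e-3):
    """Integrate u'' = -x u by classical RK4, starting deep in the region of
    negative kinetic energy (x < 0) on the solution that decays toward
    x = -infinity. Returns lists of x and u."""
    def deriv(x, u, du):
        return du, -x * u

    x, u = x_start, 1.0
    du = math.sqrt(-x_start) * u    # slope/function ratio of the decaying exponential
    xs, us = [x], [u]
    n_steps = int(round((x_end - x_start) / step))
    for _ in range(n_steps):
        k1u, k1d = deriv(x, u, du)
        k2u, k2d = deriv(x + step / 2, u + step / 2 * k1u, du + step / 2 * k1d)
        k3u, k3d = deriv(x + step / 2, u + step / 2 * k2u, du + step / 2 * k2d)
        k4u, k4d = deriv(x + step, u + step * k3u, du + step * k3d)
        u += step / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        du += step / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        x += step
        xs.append(x)
        us.append(u)
    return xs, us

def nodes(xs, us):
    """Locate the sign changes of u by linear interpolation."""
    found = []
    for i in range(1, len(us)):
        if us[i - 1] * us[i] < 0:
            t = us[i - 1] / (us[i - 1] - us[i])
            found.append(xs[i - 1] + t * (xs[i] - xs[i - 1]))
    return found

def wkb_node(n, alpha=math.pi / 4):
    """n-th node of sin((2/3) x^(3/2) + alpha), i.e. where the phase equals n*pi."""
    return (1.5 * (n * math.pi - alpha)) ** (2.0 / 3.0)

xs, us = integrate_linear_turning_point()
numerical_nodes = nodes(xs, us)
wkb_nodes = [wkb_node(n) for n in range(1, len(numerical_nodes) + 1)]
for num, approx in zip(numerical_nodes, wkb_nodes):
    print(f"numerical node {num:.4f}   approximate node {approx:.4f}")
```

Even the first node, which lies within one wave length of the turning point, is reproduced closely with $\alpha = \pi/4$, and the agreement improves for the later nodes. With $\alpha = 0$ (a node placed at the turning point itself) the predicted nodes are visibly displaced.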
We have seen, then, that a boundary where E = V imposes a
definite boundary condition on the solution. In our problem
where the motion extends to infinity in one direction, the condi-
tion can be always satisfied, by proper choice of phase and ampli-
tude of the sinusoidal function, as we have seen. But there are
two interesting results of our calculation. First, the wave in
the region where kinetic energy is positive becomes now a real
function of position, or correspondingly a real function of time.
In other words, it is a standing wave, not a progressive wave. It
corresponds to superposed progressive waves traveling with equal
intensity in both directions. The progressive wave approaching
from the right is reflected at the boundary, and turns back with-
out diminution of intensity on the reflection. The mechanical
situation is that the particle, approaching the point where kinetic
energy is negative, is reflected and turned back, just as it would
be in the same problem in classical mechanics. But the other
interesting result is that, on account of the exponential terms to
the left of $x = x_1$, the particles can slightly penetrate the region
where kinetic energy is negative. On account of the rapid dying
out of the exponential, this effect is not large, but we shall see in
the next section that there can be cases where it is very important
physically. This penetration by an exponential wave has an
analogy in optics: a wave of light approaching an optically rarer
medium at an angle greater than the critical angle is totally
reflected, but at the same time, as we have seen in Sec. 168,
Chap. XXIII, there is a disturbance, dying out exponentially, in
the rarer medium, almost exactly equivalent to what we have
here.
212. The Penetration of Barriers. — The exponential penetra-
tion of particles into the region of negative kinetic energy has as
a result that in wave mechanics, unlike classical mechanics, a
particle can go from one region of positive kinetic energy to
another, even though there is a barrier of negative kinetic energy
between. Such barriers are found, for example, in some cases
at the surface of a metal, where the electrons in emerging from
the metal, for example at high temperature in thermionic emis-
sion, may find a surface layer of atoms, exerting on them such
a strong repulsive force that they would be unable to penetrate
on classical mechanics, but can in quantum theory. Suppose
that we have a simple barrier of the sort shown in Fig. 63, where
the potential has one constant value to the left of $x_0$, a second
high value between $x_0$ and $x_1$, and a third lower value to the right
of $x_1$, and where $E - V$ is negative only between $x_0$ and $x_1$. The
corresponding problem in metals is that where the region to the
left of $x_0$ represents the interior of the metal, that to the right of
$x_1$ the space outside, that between $x_0$ and $x_1$ the surface layer or
Fig. 63. — Potential barrier. The barrier is between $x_0$ and $x_1$. Motion with
the energy $E_1$ would have a wave function large only to the left of $x_0$, rapidly
decreasing to the right of $x_0$. With the energy $E_2$, the wave function would be
large on both sides of the barrier, small but not zero within it, giving the
possibility of penetrating the barrier. The wave function of energy $E_3$ would be
large everywhere.
barrier. An electron of low energy, as $E_1$, will be confined, except
for a small exponential term, to the region to the left of $x_0$, or
the interior of the metal, and will never escape. An electron of
the very high energy $E_3$ will be able to escape, either on classical
or quantum mechanics. But an electron of intermediate energy
$E_2$ can penetrate the barrier and escape on quantum mechanics,
but not in Newtonian mechanics. These electrons of high
energy, as $E_2$ or $E_3$, are met only at high temperatures, so that
we see the connection with thermionic emission.
Consider an electron of energy $E_2$, and a solution which to the
right of $x_1$ is a progressive wave traveling to the right. Then
within the barrier we should have a combination of the two kinds
of exponential functions, one increasing exponentially to the left,
the other decreasing, with amplitudes properly chosen to satisfy
the boundary condition of continuity of the function and its
slope at $x_1$. These in turn will join onto two progressive waves
to the left of $x_0$, one traveling to the right, one to the left. The
final result may be described as follows: An incident progressive
wave falling on $x_0$ from the left; a reflected wave in the region to
the left; a transmitted wave to the right of $x_1$, the transmission
through the barrier being of the real exponential form. We can
tell something without much trouble about the amount transmitted.
For within the barrier the term
$$\frac{\text{constant}}{\sqrt[4]{V - E}}\, e^{-\frac{2\pi}{h}\int \sqrt{2m(V - E)}\,dx},$$
increasing exponentially to the left, is the important one. And
we readily see that its amplitudes at $x_1$ and $x_0$ measure, at least
in order of magnitude, the relative amplitudes of transmitted and
incident waves. Thus the fraction transmitted depends on the
square of the quantity $e^{-\frac{2\pi}{h}\int_{x_0}^{x_1}\sqrt{2m(V - E)}\,dx}$. We work out examples
of this integral in the problems, showing that there can be barriers
of atomic size small enough so that appreciable penetration takes
place, though in general this is not true, since a small increase in
the height or breadth of a barrier can, on account of the exponen-
tial, make an enormous difference in the ease of penetration.
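The sensitivity of the penetration to the height and breadth of the barrier can be illustrated with a few numbers. The sketch below evaluates only the squared exponential factor named above, which the text offers as an order-of-magnitude measure and not as an exact transmission coefficient. The physical constants (in m.k.s. units), the function name, and the rectangular barriers of 1 and 4 electron volts are assumptions chosen for the example.

```python
import math

# Constants supplied for this sketch (not given in the text).
H = 6.626e-34          # Planck's constant h, joule sec
M_E = 9.109e-31        # mass of the electron, kg
EV = 1.602e-19         # joules per electron volt

def barrier_factor(V_minus_E, x0, x1, n=1000, m=M_E):
    """Square of exp(-(2 pi / h) * integral from x0 to x1 of sqrt(2m(V - E)) dx).
    V_minus_E is a function of x giving V - E in joules (positive in the barrier).
    The integral is evaluated by the midpoint rule."""
    dx = (x1 - x0) / n
    integral = sum(math.sqrt(2.0 * m * V_minus_E(x0 + (i + 0.5) * dx))
                   for i in range(n)) * dx
    return math.exp(-2.0 * (2.0 * math.pi / H) * integral)

# Hypothetical rectangular barriers of atomic dimensions.
for height_ev in (1.0, 4.0):
    for width_angstrom in (1.0, 2.0, 5.0):
        f = barrier_factor(lambda x: height_ev * EV, 0.0, width_angstrom * 1e-10)
        print(f"V - E = {height_ev} ev, width = {width_angstrom} A: factor ~ {f:.2e}")
```

For a barrier 1 A. wide and 1 electron volt high the factor is an appreciable fraction of unity, while a fivefold increase in width reduces it by roughly two orders of magnitude.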
213. Motion in a Finite Region, and the Quantum Condition. —
Assume next that the kinetic energy is positive only in a finite
region, so that classically the motion would be limited to that
region. Then there will be a boundary condition on the wave
function at each boundary of the region. Just as with the string
held at both ends, this condition cannot in general be satisfied;
it can be satisfied only for certain energies (corresponding to
certain frequencies with the string). Using the approximate
method of Wentzel, Kramers, and Brillouin, it is easy to see the
nature of this condition. For each boundary must have essen-
tially the treatment of Fig. 62, only the exponential decreasing
toward infinity being allowed, whereas with an arbitrary energy
the exponential would increase toward infinity in at least one
direction. We have seen that the exponential part of the curve
corresponds to $\tfrac{1}{8}$ wave length of the sinusoidal part. The number
of wave lengths between $x_1$ and $x_2$ is $\int_{x_1}^{x_2} \frac{p}{h}\,dx$. Thus the whole
number of waves between $-\infty$ and $\infty$, taking account of the two
exponential ends, is $\int_{x_1}^{x_2} \frac{p}{h}\,dx + \tfrac{1}{4}$. Since the function goes to zero
at both limits, this must be a whole number of half waves, or
twice it must be a whole number. Hence
$$2\left[\int_{x_1}^{x_2} \frac{p}{h}\,dx + \frac{1}{4}\right] = 1, 2, 3, \ldots,$$
$$2\int_{x_1}^{x_2} p\,dx = \left(n + \tfrac{1}{2}\right)h, \qquad n = 0, 1, 2, \ldots \qquad (10)$$
This is the so-called quantum condition, developed particularly
by Sommerfeld. We must remember that, since it is based on
the approximation of Wentzel, Kramers, and Brillouin, it is not
necessarily an exact condition. In some cases, as the linear
oscillator, taken up in Prob. 5, it proves to be exactly true. In
other cases, as a particle moving freely between two reflecting
walls, as considered in Prob. 11, a similar condition holds, except
that the quantum number, which here is $(n + \tfrac{1}{2})$, a half integer,
is instead a whole integer. There are still other problems, as
the hydrogen atom, in which a modified form of the condition
is correct. In most cases, however, the quantum condition gives
only an approximation, though a good one.
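The quantum condition (10) can be applied numerically to any potential for which the turning points are known. The sketch below is a minimal illustration for the linear oscillator, with $V = 2\pi^2 m\nu^2 x^2$ as in the problems at the end of the chapter. The function names, the units h = m = ν = 1, and the use of the midpoint rule with bisection are assumptions of the example.

```python
import math

def phase_integral(E, m=1.0, nu=1.0, n_steps=2000):
    """2 * integral of p dx between the turning points of V = 2 pi^2 m nu^2 x^2,
    by the midpoint rule (the integrand goes to zero at the end points)."""
    k = 2.0 * math.pi ** 2 * m * nu ** 2
    a = math.sqrt(E / k)                      # classical turning points at -a, +a
    dx = 2.0 * a / n_steps
    total = 0.0
    for i in range(n_steps):
        x = -a + (i + 0.5) * dx
        total += math.sqrt(max(2.0 * m * (E - k * x * x), 0.0))
    return 2.0 * total * dx

def wkb_level(n, h=1.0, lo=1e-6, hi=50.0):
    """Solve phase_integral(E) = (n + 1/2) h for E by bisection."""
    target = (n + 0.5) * h
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if phase_integral(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

levels = [wkb_level(n) for n in range(4)]     # in units where h = nu = 1
print(levels)
```

In these units the levels found lie very close to n + 1/2, which is the case noted above in which the condition proves to be exactly true. For another potential only the expressions for V and the turning points in `phase_integral` would need to be changed.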
A number of problems can be solved exactly when the motion
is confined to a finite region, and it is by comparison with these
exact solutions that one can check the method of Wentzel, Kramers,
and Brillouin, and the quantum conditions. Thus, in Prob.
6 we show that the wave equation for the linear oscillator can be
solved as an exponential times a power series. This power
series in general diverges for large x, indicating a function which
goes to infinity as x becomes infinite. But if we give the energy
a value for which $2\int_{x_1}^{x_2} p\,dx = (n + \tfrac{1}{2})h$, the series breaks off to form a polynomial, and the
function goes to zero at infinity. These are the only solutions
we can use, and they give just the quantum condition we found
before, though by a quite different method. Again, a rotator,
a solid of fixed moment of inertia and constant angular momen-
tum rotating on an axis in the absence of torques, has a wave
function $e^{\frac{2\pi i}{h}p\theta}$, where p, $\theta$ are angular momentum and angular
rotation. Since p is constant, the real forms of this are sin (or
cos) $(2\pi p\theta/h)$. For this to represent a single-valued function
of position, it is necessary, as with the circular membrane, to
have the function periodic with period $2\pi$ in $\theta$. Thus we must
have $2\pi p/h$ = integer = m, giving whole integral quantum
numbers in this case, and determining the angular momentum
as $mh/2\pi$.
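The breaking off of the oscillator series can be seen directly from its recursion relation. The sketch below is offered under assumptions not spelled out in the text: with the substitution and change of variable of the oscillator problem at the end of the chapter, the equation for v is taken to be $v'' - 2yv' + (\epsilon - 1)v = 0$ with $\epsilon = 2E/h\nu$, so that a power series $v = \sum a_k y^k$ obeys $a_{k+2} = \frac{2k + 1 - \epsilon}{(k+1)(k+2)}\,a_k$. The symbol $\epsilon$ and the function name are introduced here for illustration only.

```python
from fractions import Fraction

def series_coefficients(eps, n_terms=12, parity=0):
    """Coefficients a_k of v(y) = sum a_k y^k for v'' - 2 y v' + (eps - 1) v = 0,
    where eps = 2E/(h nu). parity=0 starts the even series (a_0 = 1),
    parity=1 starts the odd series (a_1 = 1). Exact rational arithmetic."""
    eps = Fraction(eps)
    a = [Fraction(0)] * n_terms
    a[parity] = Fraction(1)
    for k in range(parity, n_terms - 2, 2):
        a[k + 2] = (Fraction(2 * k + 1) - eps) * a[k] / ((k + 1) * (k + 2))
    return a

terminating = series_coefficients(5)                    # eps = 5, E = (2 + 1/2) h nu
non_terminating = series_coefficients(Fraction(9, 2))   # an energy between two levels

print([str(c) for c in terminating])
print([str(c) for c in non_terminating])
```

For $\epsilon = 5$ the even series stops after the term in $y^2$, leaving a polynomial, while for a value of $\epsilon$ that is not an odd integer no coefficient of the even series vanishes and the series runs on indefinitely.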
214. Motion in Two or More Finite Regions. — In classical
mechanics, we do not have to discuss specially the case where
there are two separated regions where the kinetic energy is
positive, separated and bounded by regions where it is negative;
the motion occurs in one or the other of these regions, and that
is the end of it. But in wave mechanics, the barrier between
regions is not entirely impenetrable. We shall not go into the
mathematical details of the solution, for, while they involve
no new ideas, they are rather tedious. But the result is that
the particles can penetrate the barrier and go from one region
to the other, just as we have seen in a previous section in consider-
ing a barrier between two regions each extending to infinity
in one direction. There are some new situations, however.
Each region by itself would have stationary states of its own,
if the other were not there. But with the two, no one of these
states refers to motion wholly in the one region; the particle
can go back and forth from one to the other. However, if the
energy level is one that is characteristic of the one region and
not of the other, the particle spends almost all of its time in
that region of which its energy is characteristic. Once in a while
it leaks over to the other side, but it soon finds its way back.
It may be, however, that a given energy level will be char-
acteristic of both regions; this is surely true if they are identical
regions. Then the particle will travel back and forth from one
to the other, spending equal lengths of time in each. This is
an important physical case. For instance, in the hydrogen
molecule, both atoms are just alike, and an electron finds a
potential field which has two minima, one at each nucleus. It
then can oscillate back and forth, spending half its time about
one nucleus, half on the other. These problems are closely
analogous to that of coupled oscillators, which we have already
taken up. There we found that one oscillator would not move
without setting the other into vibration, and similarly here the
wave function cannot be large in one region without having a
value in the other also. And here we have a special case if the
two regions are identical, as we did before if the oscillators were
equivalent. We shall find that the whole mathematical treatment
is closely analogous.
We can finally have motion in two regions, one finite, the other
reaching to infinity. Then, if the particle starts in the finite
range, it is able in time to leak across the boundary, and go off to
infinity. The present explanation of radioactivity is based on
this idea. An alpha particle is supposed to be held in an atomic
nucleus by a restoring force pulling it to a position of equilibrium.
But if it were outside, then being positively charged, it would
be repelled from the positive nucleus, the repulsion going to
zero at infinity. Thus we should have a potential curve as in
Fig. 64, where potential is drawn as function of r, the distance
Fig. 64.— Potential curve for radioactive disintegration. A wave function
of energy approximately E, starting out as a wave packet within the valley of
the potential curve, would gradually leak out through the barriers.
from the center of the nucleus to the escaping alpha particle.
If now the alpha particle has energy E, and is originally within
the nucleus, it will eventually leak out, going off to infinity
with a large kinetic energy, as the ejected alpha particles are
actually found to have.
Problems
1. Prove that the function $\frac{\text{constant}}{\sqrt[4]{E - V}}\, e^{\pm\frac{2\pi i}{h}\int p\,dx}$, where $p = \sqrt{2m(E - V)}$,
is an approximate solution of Schrodinger's equation, becoming more and
more accurate as V changes by a smaller and smaller amount in a wave
length.
2. Note that in Bessel's equation for $J_m$, when m > 0, there is a region
near the origin where the Wentzel-Kramers-Brillouin approximate solution
is exponential rather than sinusoidal. Discuss the solution qualitatively
for x < m, where m is fairly large, showing how this solution joins onto the
sinusoidal one found in Prob. 9, Chap. XIV.
3. Note that the solution of Schrodinger's equation is sinusoidal or expo-
nential in a region of constant potential. Discuss the one-dimensional
problem of particles going from one region with constant potential $V_1$ to a
second region of constant potential $V_2$, when the energy is great enough so
that the kinetic energy is positive in both regions. Satisfy boundary condi-
tions at the surface, making u and its derivative continuous, joining the two
sinusoidal functions together at the boundary. Show that some of the
incident particles travel across the boundary, but that some are reflected
back, contrary to classical mechanics. Find the fraction reflected.
4. Assuming the potential function of Fig. 63, consider particles striking
the barrier from the left with energy $E_2$. Set up the solution, satisfying
the boundary conditions at $x_0$ and $x_1$, and get an expression for the reflection
coefficient as a function of the height of the barrier. Show that the reflec-
tion coefficient approaches unity if the barrier is infinitely high, or infinitely
broad.
5. Show that $2\int_{x_1}^{x_2} p\,dx$ for an oscillator of natural frequency $\nu$,
energy E, equals $E/\nu$. Show that therefore the quantum condition leads
to the energy levels $E = (n + \tfrac{1}{2})h\nu$ for the oscillator.
6. Solve the problem of the linear oscillator of frequency $\nu$, where
$V = 2\pi^2\nu^2 m x^2$. To do this, set
$$u = e^{-\frac{2\pi^2 m\nu}{h}x^2}\, v(x),$$
and set up the differential equation for v(x). For convenience, introduce
the change of variables $y = 2\pi\sqrt{m\nu/h}\;x$. Solve in series, and show that
the resulting series breaks off only if $E = (n + \tfrac{1}{2})h\nu$, where n is an integer.
7. Using the series of Prob. 6, investigate the behavior for large x if the
series does not break off. Show that for very large x, v approaches the
series for $e^{\frac{4\pi^2 m\nu}{h}x^2}$, so that the whole function u increases exponentially with
$x^2$, and cannot be used as a wave function.
8. Compute and plot wave functions of the linear oscillator corresponding
to n = 0, 1, 2, 3, 4. From the graphs find the region in which the solution
is oscillatory (that is, the region between the points of inflection). Draw
the potential curve and the values of E corresponding to these five stationary
states, and show that the motion is oscillatory in the region where the
kinetic energy is positive.
9. Set up the approximate solution for the linear oscillator problem by
the Wentzel-Kramers-Brillouin method, getting expressions for the functions
in both the sinusoidal and the exponential ranges. Investigate to see how
well these functions join on at the point of inflection.
10. Compute and plot the approximation of Prob. 9 corresponding to
n = 4, and compare with the exact solution.
11. A particle executes one-dimensional motion in a container, having
constant potential inside, and with the potential becoming suddenly infinite
at the walls, so that the particle never gets out. Show that the boundary
condition is that the wave function must be zero at the walls, as the dis-
placement of a stretched string is zero at its ends. Find the wave functions
of the problem, and find the energy of the particle in the nth stationary state.
CHAPTER XXX
THE CORRESPONDENCE PRINCIPLE AND STATISTICAL
MECHANICS
The quantum condition has a close connection with the phase
space and the Hamiltonian method, which we have discussed in
Chap. IX. Hamiltonian methods have, in fact, been the guiding
principle in the development of the quantum theory. At the
same time, the phase space is fundamental to statistical mechan-
ics, the mathematical foundation of thermodynamics. For that
reason, we may profitably treat these subjects together, though,
of course, statistical mechanics can be developed entirely from
the basis of classical theory. Nevertheless, on account of the
essentially statistical nature of the quantum theory, it yields
an almost more natural approach to statistical mechanics than
is possible in Newtonian mechanics, and by developing the two
together we can illustrate the correspondence between classical
and quantum mechanics which must hold, since the classical
theory is a correct limiting form of quantum theory for large-
scale problems.
215. The Quantum Condition in the Phase Space. — In Fig. 11,
Chap. IX, we show the phase space for a linear oscillator, with
a line of constant energy E, an ellipse of semiaxis $\sqrt{E/2\pi^2 m\nu^2}$
along the q axis, and $\sqrt{2mE}$ along the p axis. These quantities
measure the maximum coordinate and momentum, respectively,
which a particle of energy E attains during its motion. For such an
oscillator, the quantum condition (10), Chap. XXIX, equates
twice the integral of p dq between the minimum and maximum q
values to $(n + \tfrac{1}{2})h$. Just as $\int y\,dx$ measures the area under the
curve y(x), so $\int p\,dq$ measures the area under the curve p(q)
in the phase space. The integral $\int_{q_1}^{q_2} p\,dq$ is that part of the area
of the ellipse above the q axis, and to get the whole integral we
double this, obtaining also the integral below the q axis. This
may be written as an integral around the contour, from $q_1$ to $q_2$
around the upper branch of the curve, then back to $q_1$ along the
lower part of the curve, in which p and dq are both negative,
so that we contribute an equal positive term to the integral.
In other words, the quantum condition may be written
$$\oint p\,dq = \left(n + \tfrac{1}{2}\right)h, \qquad (1)$$
where $\oint$ indicates an integral around the contour. And the
physical interpretation is that the quantum integral is the area
of the ellipse. Since this is $\pi ab$, where a and b are the two
semiaxes, it is $\pi\sqrt{2mE}\sqrt{E/2\pi^2 m\nu^2} = E/\nu$, giving from Eq. (1)
$E = (n + \tfrac{1}{2})h\nu$, in agreement with the result of Prob. 5, Chap.
XXIX.
The results of the last paragraph are general: with any one-dimensional
motion the quantum integral $\oint p\,dq$ represents
the area of phase space enclosed by the path of the representative
point, and the quantum condition says that this area is
$(n + \tfrac{1}{2})h$, approximately. If we take successive stationary states,
connected with successive quantum numbers n, each will have
a curve in phase space, the path of a representative point of the
corresponding energy, and the area between successive curves
will, by the quantum condition, be h. Thus the phase space is
divided up by these paths into a set of cells, each of area h,
one for each stationary state.
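
As a numerical illustration of these statements, the following sketch computes the area of the oscillator's energy ellipse from its two semiaxes and expresses it in units of h. The function names and the values chosen for the mass and frequency are assumptions made only for the example; any other values would serve as well.

```python
import math

H_PLANCK = 6.626e-27  # Planck's constant in erg sec (c.g.s.)

def ellipse_area(energy, mass, nu):
    """Area pi*a*b of the constant-energy ellipse of a linear oscillator."""
    a = math.sqrt(energy / (2.0 * math.pi**2 * mass * nu**2))  # maximum coordinate
    b = math.sqrt(2.0 * mass * energy)                          # maximum momentum
    return math.pi * a * b

def oscillator_energy(n, nu, h=H_PLANCK):
    """Energy of the nth state from the quantum condition, E = (n + 1/2) h nu."""
    return (n + 0.5) * h * nu

if __name__ == "__main__":
    mass, nu = 1.0e-24, 1.0e13  # assumed illustrative values (g, 1/sec)
    for n in range(4):
        area = ellipse_area(oscillator_energy(n, nu), mass, nu)
        print(n, area / H_PLANCK)  # enclosed area in units of h
```

The areas come out as half integers times h, so that the ring between successive ellipses has area h, whatever mass and frequency are assumed.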
216. Angle Variables and the Correspondence Principle. — We
have seen in Chap. IX, Sec. 59, that a change of variables, called
a contact transformation, can be set up, in which the new coordi-
nate w increases uniformly with time, and the momentum J
stays constant. To visualize this transformation in the case
of the oscillator, we may imagine the phase space plotted with
such scales of coordinates and momenta that the ellipses of
constant energy become concentric circles. Then the new
variables are essentially polar coordinates in phase space, the
coordinate being the angle divided by $2\pi$, the momentum being
proportional to the square of the radius, so that obviously the
angle variable increases uniformly with time, the momentum
staying constant. The momentum J, called the action variable,
proves in fact to be precisely the area of the ellipse, or circle, or
the same integral $\oint p\,dq$ which we meet in the quantum condition.
In terms of the action variable, often called the phase integral,
we saw that Hamilton's equation
$$\frac{dH}{dJ} = \frac{dw}{dt} = \nu \qquad (2)$$
gave the frequency in terms of a simple calculation. This
formula permits us to make an extremely interesting connection
between the classical frequency of motion of a system and the
frequency of the light emitted in a transition between two states
of energy $E_2$ and $E_1$ according to Bohr's frequency condition
$$E_2 - E_1 = h\nu, \qquad (3)$$
described in Sec. 201. On the quantum theory, most energies H
of the system are not allowed; we may have rather only those
satisfying the quantum condition (1). Thus H cannot be
regarded as a continuous function of J. We may, however,
replace the derivative dH/dJ of (2) by the difference ratio
$\Delta H/\Delta J$, in which $\Delta H$ is the energy difference between two states,
$\Delta J$ the difference between their phase integrals. If we choose
two states whose quantum numbers differ by unity, we have
$\Delta J = h$, so that the difference ratio is
$$\frac{\Delta H}{h} = \nu,$$
giving precisely the quantum frequency according to Eq. (3).
Hence we have the following relation: the derivative dH/dJ
gives the classical frequency of motion of a system; the difference
ratio $\Delta H/\Delta J$, where the difference of J is one unit, gives the
frequency of emitted light according to the quantum theory,
or the frequency of oscillation of the oscillator mentioned in
Sec. 202. We shall consider later the significance of transi-
tions of more than one unit in J.
For the oscillator, as one can immediately see from the fact
that its energy in the nth state is $(n + \tfrac{1}{2})h\nu$, the classical and
quantum frequencies are exactly equal, the derivative equaling
the difference ratio. This is plain from the fact that here $E = J\nu$,
so that the curve of E against J is a straight line, and the ratio
of a finite increment in ordinate, divided by a finite increment
in abscissa, equals the slope or derivative. But for any other
case the curve of E against J is really curved, so that the deriva-
tive and difference ratio are different, and classical and quantum
frequencies do not agree. Thus in Fig. 65 we show an energy
curve for an anharmonic oscillator, in which the tightness of
binding decreases with increasing amplitude, the frequency
decreases, and therefore the slope decreases with large quantum
numbers. Here the classical frequency, as given by the slope
of the curve, does not agree with the quantum frequency connected
with the transition indicated, from 5/2 to 3/2, for the
quantum frequency is the slope of the straight line connecting
$E_2$ and $E_1$. We may assume, however, that if we go to a very
high quantum number, so that we are far out on the axis of
abscissas, any ordinary energy curve will become asymptotically
fairly smooth and straight, so that the chord and tangent to
the curve will more and more nearly coincide. This certainly
happens in the important physical applications we shall make.
In these cases, we may state Bohr's correspondence principle :
[Fig. 65. Energy curve for anharmonic oscillator. Slope of curve gives
classical frequency; slope of straight line connecting $E_2$ and $E_1$ gives quantum
frequency.]
in the limit of high quantum numbers, the classical and quantum
frequencies become equal. This is essentially simply a special
case of the general result stated in Chap. XXVIII, that in the
limit of small wave lengths (which for most practical purposes
is the same as the limit of high quantum numbers) the classical
and quantum theories become essentially equivalent.
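
The comparison of chord and tangent can be tried numerically. In the sketch below we work in units in which h = 1, and take two assumed energy curves: the straight line E = Jν of the linear oscillator, and a curved example E proportional to J², chosen only for illustration and not taken from Fig. 65. The function names are likewise our own.

```python
def classical_frequency(energy, J, dJ=1.0e-3):
    """Slope dE/dJ of the energy curve, estimated by a central difference."""
    return (energy(J + dJ) - energy(J - dJ)) / (2.0 * dJ)

def quantum_frequency(energy, n, h=1.0):
    """Difference ratio (E_n - E_{n-1}) / h, with phase integrals J = (n + 1/2) h."""
    return (energy((n + 0.5) * h) - energy((n - 0.5) * h)) / h

def harmonic(J, nu=2.0):
    """Linear oscillator: E = J nu, a straight line."""
    return nu * J

def quadratic(J):
    """Assumed curved example: E proportional to J squared."""
    return J * J

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        J_upper = n + 0.5  # phase integral of the upper state, h = 1
        ratio = quantum_frequency(quadratic, n) / classical_frequency(quadratic, J_upper)
        print(n, ratio)  # approaches unity as n increases
```

For the straight line the two frequencies agree for every n; for the curved example the ratio of quantum to classical frequency is 2n/(2n + 1), which differs appreciably from unity for small n but approaches it in the limit of high quantum numbers.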
217. The Quantum Condition for Several Degrees of Freedom.
In classical mechanics, we have seen that certain problems, like
the two-dimensional oscillator and the central field motion, are
separable, so that they can be broken up into several one-dimen-
sional motions. Since each of these motions was periodic, the
whole motion is multiply periodic in these cases. In these partic-
ular problems with several degrees of freedom, separation of
variables can also be carried out in the quantum theory. In
phase space we can pick out the two-dimensional space represent-
ing one coordinate and its conjugate momentum, and the projec-
tion of the representative point on this plane will trace out a
closed curve. There is a quantum condition associated with this
coordinate, the area enclosed by the curved path in the two-
dimensional space being a half integer times h. Thus we have
a quantum number associated with each degree of freedom in
such a problem. Further, we can introduce angle and action
variables connected with each of the coordinates, just as if each
formed a problem of one degree of freedom. The various fre-
quencies of the multiple periodicity can be found by differen-
tiating the energy with respect to the various J's, and the
correspondence principle can be applied to connect these classical
frequencies with the quantum frequencies associated with various
possible transitions.
It can be shown in general that any coordinate of, say, a doubly
periodic motion, can be analyzed into a sort of generalized Fourier
series in the time, in which terms appear of frequencies
$$\tau_1\nu_1 + \tau_2\nu_2, \qquad (4)$$
where $\tau_1$, $\tau_2$ are arbitrary integers. This is the generalization
of the ordinary Fourier representation for a purely periodic
motion, in which all frequencies $n\nu_1$ will in general appear which
are integral multiples of the fundamental frequency. Now we
can carry out a general correspondence between any one of these
overtone or combination frequencies and a corresponding transition.
Thus let us consider the transition in which $J_1$ changes
by $\tau_1$ units, $J_2$ by $\tau_2$ units, where $J_1$ and $J_2$ are the two action
variables. The quantum frequency emitted will be
$$\frac{E(J_1, J_2) - E(J_1 - \tau_1 h, J_2 - \tau_2 h)}{h}, \qquad (5)$$
where E is the energy, written as function of the J's. But if we
are allowed to replace differences by derivatives, as we assume we
are in the correspondence principle, this becomes
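$$\frac{1}{h}\left(\frac{\partial E}{\partial J_1}\tau_1 h + \frac{\partial E}{\partial J_2}\tau_2 h\right),$$
in agreement with Eq. (4), if $\nu_1 = \partial E/\partial J_1$, $\nu_2 = \partial E/\partial J_2$. Thus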
we have a one to one correspondence between all possible over-
tone vibrations of the classical motion and all possible quantum
transitions. This correspondence is of great importance, for
instance, in discussing intensities of radiation, as we shall see
later. For each component of the Fourier representation is
a sinusoidal vibration, with frequency (4), and a certain amplitude
$A_{\tau_1,\tau_2}$. This oscillation, if it were the oscillation of an
electric charge, would send out a radiation of frequency (4),
with an intensity proportional to the square of the amplitude,
as we have seen in Chap. XXV, where we found a rate of radiation
$16\pi^4 A^2\nu^4/3c^3$. Thus the Fourier component A would directly
determine the intensity of classical radiation. It then seems
very reasonable that, at least in the limit of high quantum num-
bers, this intensity would agree approximately with the intensity
of the corresponding quantum transition given by Eq. (5). Thus
one can derive from the correspondence principle definite information
about probabilities of quantum transitions, for the rate of radia-
tion of energy in a particular transition is proportional to the
number of transitions occurring per unit time, or the probability
of transition. We shall return to this question in a later chapter.
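
The correspondence between a combination frequency and a quantum transition can be sketched numerically. Below, the quantum frequency is computed as the energy difference of Eq. (5) divided by h, and the classical frequency as the combination τ1ν1 + τ2ν2 with each ν found by differentiating the energy with respect to the corresponding J. The energy function, the units (h = 1), and all names are assumptions made for illustration only.

```python
def quantum_frequency(energy, J1, J2, tau1, tau2, h=1.0):
    """Energy difference of the transition divided by h, as in Eq. (5)."""
    return (energy(J1, J2) - energy(J1 - tau1 * h, J2 - tau2 * h)) / h

def classical_frequency(energy, J1, J2, tau1, tau2, dJ=1.0e-3):
    """Combination tau1*nu1 + tau2*nu2, with nu_i the partial derivative dE/dJ_i."""
    nu1 = (energy(J1 + dJ, J2) - energy(J1 - dJ, J2)) / (2.0 * dJ)
    nu2 = (energy(J1, J2 + dJ) - energy(J1, J2 - dJ)) / (2.0 * dJ)
    return tau1 * nu1 + tau2 * nu2

def example_energy(J1, J2):
    """Assumed separable energy function, curved in both J's (not from the text)."""
    return J1 * J1 + 2.0 * J2 * J2

if __name__ == "__main__":
    for J in (10.0, 100.0, 1000.0):
        q = quantum_frequency(example_energy, J, J, 2, 1)
        c = classical_frequency(example_energy, J, J, 2, 1)
        print(J, q, c, q / c)  # the ratio approaches unity for large J
```

With this assumed energy the two frequencies differ by a fixed amount, which becomes a negligible fraction of either as the J's grow large.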
The results which we have mentioned are all for multiply
periodic, separable problems in several dimensions. With an
n-dimensional problem, and a 2n-dimensional phase space, there
are n J's which stay constant during the motion. Thus we may
set up n sets of surfaces, $J_1$ = constant, $J_2$ = constant, . . . ,
$J_n$ = constant, in the phase space, and the representative point
moves so that it stays on an intersection of all n surfaces, or in
an n-dimensional region, instead of all through the (2n - 1)-dimensional
energy surface, as it would in quasi-ergodic motion.
The particular surfaces $J_1 = (n_1 + \tfrac{1}{2})h$, $J_2 = (n_2 + \tfrac{1}{2})h$, etc.,
divide up the phase space into cells, each of which is seen to have
the volume $h^n$, at least in simple cases, and a little examination
shows that there is just one stationary state per cell. Of course,
the path of a representative point is always on an energy surface,
and if we take only the quantized J values, the corresponding
representative points lie only on the energy surfaces correspond-
ing to quantized energy values. In many cases it proves to be
true that a number of different stationary states have the same
energy. Such a problem is called degenerate, and the number of
different states connected with the energy level is called the a
priori probability of the level. In such a case the volume of
phase space between this energy surface and the next adjacent
one proves to be $h^n$ times the a priori probability.
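
A simple case in which the a priori probabilities can be counted directly is the two-dimensional isotropic oscillator. We assume for this sketch that its energy is (n1 + n2 + 1)hν, so that all states with the same sum n1 + n2 have the same energy; the function name is our own.

```python
from collections import Counter

def level_degeneracies(max_total):
    """Count the states (n1, n2) of a two-dimensional isotropic oscillator
    having each value of n1 + n2, up to max_total."""
    counts = Counter()
    for n1 in range(max_total + 1):
        for n2 in range(max_total + 1 - n1):
            counts[n1 + n2] += 1  # energy is (n1 + n2 + 1) h nu
    return dict(counts)

if __name__ == "__main__":
    for total, g in sorted(level_degeneracies(5).items()):
        print(total, g)  # a priori probability g of the level
```

Each state occupies a cell of volume h² in the four-dimensional phase space, so that on this assumption the level with a priori probability g accounts for a volume g h².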
For a quasi-ergodic system, as we have said, there are no
quantities like the J's which stay constant, other than the energy.
There are still stationary states in the quantized problem, though
they are not determined by ordinary quantum conditions. They
are derived from solutions of the Schrodinger equation, however,
and the boundary conditions lead to definite stationary states,
as with one-dimensional motion. Thus we can always introduce
energy surfaces in the phase space, corresponding to the quantized
states. Generally quasi-ergodic systems are not degenerate, all
energy levels being distinct, and the volume of phase space
between successive energy levels will always be, at least to an
approximation, equal to $h^n$. These relations prove to be of
importance in investigating the statistical mechanics of collections
of systems in the phase space.
218. Classical Statistical Mechanics in the Phase Space.— In
Chap. IX we have investigated the motion of a representative
point in the phase space. Statistical mechanics, however, like
any statistical science, deals not with single points but with an
enormous number of individuals, investigating their average
behavior. In its applications to thermodynamic problems, there
are two principal methods, both of which are frequently used.
In the first of these, we deal, for instance, with a gas composed
of a great many identical molecules. These molecules them-
selves form the individuals whose average properties we investi-
gate. Thus the phase space we use is one in which there are
enough coordinates and momenta to describe a single molecule.
Such a space is often called a $\mu$ space. The second method is
more powerful but more abstruse: the individuals with which we
deal are whole systems, as whole samples of gas, and we imagine
a large collection, often called an ensemble, or assembly, of such
samples, all just alike in such gross properties as volume, tem-
perature, and density, but differing in their finer details, as the
positions and velocities of individual atoms or molecules. These
might represent different pictures of the same gas at different
times; or they might represent different repetitions of the same
experiment, all controllable conditions being held fixed. Finding
averages over such ensembles means then finding the time average,
or finding the average obtained by repeating the experiment
many times. The phase space required for this second method
has as many coordinates and momenta as there are in the whole
system, a very great number if the system contains many molecules.
This space is often called the $\Gamma$ space. As to the distinction
between the methods of the $\mu$ and the $\Gamma$ spaces, the
general situation is that they are equivalent when applied to
perfect gases; but if the molecules interact, they can no longer be
treated as independent systems and described by separate points
in the $\mu$ space, but one must instead consider the whole system
together, and use the $\Gamma$ space. The latter method is then the
one which we shall use more often. Both methods are alike,
however, in using phase spaces, and in considering the motion of
a swarm of points in such a space.
We imagine an ensemble of a great many, or even an infinite
number, of points in a phase space. As time goes on, with the
points moving, the effect is as of the whole swarm flowing, like a
liquid or gas composed of atoms. In fact, many of the ideas of
hydrodynamics can be applied in this case, as we shall show in
the next section. We introduce first the density of points as a
function of the p's and q's:
$$f(p_1 \cdots p_n, q_1 \cdots q_n)\,dp_1 \cdots dp_n\,dq_1 \cdots dq_n$$
gives the number of points in the 2n-dimensional volume element
$dp_1 \cdots dp_n\,dq_1 \cdots dq_n$. The velocity of points in the phase
space is then given by Hamilton's equations, $dq_i/dt = \partial H/\partial p_i$,
$dp_i/dt = -\partial H/\partial q_i$, as we pointed out in Sec. 52, Chap. IX.
Thus we have the necessary quantities to describe the motion of
the points as a flow, and in the next section we apply the equation
of continuity and investigate its consequences.
219. Liouville's Theorem. — Consider the steady flow of a
fluid of density $\rho$, velocity v. The equation of continuity is
$$\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) = 0, \quad\text{or}\quad \frac{\partial\rho}{\partial t} + \rho\,\operatorname{div}\mathbf{v} + \mathbf{v}\cdot\operatorname{grad}\rho = 0, \qquad (6)$$
if the density varies from point to point. We are interested
particularly in a divergenceless flow, for which div v = 0, for it
turns out that the flow of points in the phase space is of this sort.
It is easy to see that this corresponds to the flow of an incompressible
fluid. For let us find $d\rho/dt$, the time rate of change of
density. This is given by
$$\frac{d\rho}{dt} = \frac{\partial\rho}{\partial t} + \frac{\partial\rho}{\partial x}\frac{dx}{dt} + \frac{\partial\rho}{\partial y}\frac{dy}{dt} + \frac{\partial\rho}{\partial z}\frac{dz}{dt} = \frac{\partial\rho}{\partial t} + \mathbf{v}\cdot\operatorname{grad}\rho, \qquad (7)$$
where $d\rho/dt$ is the rate at which density changes if we follow
along with a particle of fluid. But now if div v = 0, Eq. (6)
becomes $d\rho/dt = 0$, showing that the density following the
particle does not change with time, which is to be expected if the
fluid is incompressible. This does not imply, however, that
the density of the fluid is at all points the same. Let us imagine a
fluid composed of large droplets of one sort of fluid suspended in
another. If the fluids are chosen so that they do not mix, and
the surfaces of separation remain sharp, then the density will
change from point to point, as we go from the one fluid to the
other. Further, if the whole fluid is moving, the density at any
point of space will change with time, as first the one sort of fluid,
then the other, will be carried past this point. But if the fluid
is incompressible, the density of a particular part of the fluid, as
we follow it in its motion, will be constant. That is, $\mathbf{v}\cdot\operatorname{grad}\rho$
and $\partial\rho/\partial t$ are separately different from zero, but their sum
vanishes.
The situation we have just described holds for the motion of
points in the phase space. The 2n-dimensional velocity of points,
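as we have seen in the last section, has components $dq_i/dt$, $dp_i/dt$,
where i goes from 1 to n. Then the analogue to the divergence is
$$\operatorname{div}\mathbf{v} = \frac{\partial}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial}{\partial q_2}\frac{dq_2}{dt} + \cdots + \frac{\partial}{\partial p_1}\frac{dp_1}{dt} + \cdots$$
$$= \frac{\partial}{\partial q_1}\frac{\partial H}{\partial p_1} + \frac{\partial}{\partial q_2}\frac{\partial H}{\partial p_2} + \cdots - \frac{\partial}{\partial p_1}\frac{\partial H}{\partial q_1} - \cdots = 0. \qquad (8)$$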
Thus on account of Hamilton's equations the flow is divergence-
less. Then we see that the flow is an incompressible flow, the
density of points remaining constant as we follow a particle.
This is Liouville's theorem.
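
The vanishing of the divergence can be checked numerically for a particular case. The sketch below takes as an assumed example a pendulum, with H = p²/2 - cos q in units of our own choosing, finds the phase-space velocity from Hamilton's equations by finite differences, and then forms the divergence of that velocity. The names and step sizes are illustrative assumptions.

```python
import math

def hamiltonian(q, p):
    """Assumed example: a pendulum, H = p^2/2 - cos q (unit mass and length)."""
    return 0.5 * p * p - math.cos(q)

def velocity(q, p, eps=1.0e-5):
    """Phase-space velocity (dq/dt, dp/dt) from Hamilton's equations."""
    dH_dp = (hamiltonian(q, p + eps) - hamiltonian(q, p - eps)) / (2.0 * eps)
    dH_dq = (hamiltonian(q + eps, p) - hamiltonian(q - eps, p)) / (2.0 * eps)
    return dH_dp, -dH_dq

def divergence(q, p, eps=1.0e-3):
    """d(dq/dt)/dq + d(dp/dt)/dp, by central differences."""
    dqdot_dq = (velocity(q + eps, p)[0] - velocity(q - eps, p)[0]) / (2.0 * eps)
    dpdot_dp = (velocity(q, p + eps)[1] - velocity(q, p - eps)[1]) / (2.0 * eps)
    return dqdot_dq + dpdot_dp

if __name__ == "__main__":
    for q, p in ((0.7, 1.3), (2.0, -0.4), (-1.1, 0.0)):
        print(q, p, divergence(q, p))  # zero within rounding error
```

The result is zero to within the rounding error of the differences, though the velocity itself varies from point to point.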
220. Distributions Independent of Time. — The principal use
of distributions in the phase space is for thermodynamic pur-
poses, and here we are interested in thermal equilibrium, and
in distributions independent of time. An ensemble independent
of time is one for which $\partial\rho/\partial t = 0$. To get that, we see from
Eq. (7) that we must have $\mathbf{v}\cdot\operatorname{grad}\rho = 0$. This means that
the rate of change of density along the direction of flow, or
along the streamline, is zero. In other words, all along a single
line of flow, or through a single tube of flow of infinitesimal cross
section, the density is constant. We may imagine the whole
phase space divided up into tubes of flow, and then any distribution
in which each tube has its own constant density through
its whole volume, no matter how this density may change from
one tube to another, will be independent of time.
In a multiply periodic motion, the lines of flow will be given
by $J_1$ = constant, $J_2$ = constant, . . . . Thus if we make the
density any arbitrary function of the J's, we shall have a distribution
independent of time. Remembering that the density is
the function f, this is
$$f(p_1 \cdots p_n, q_1 \cdots q_n) = F(J_1, J_2, \cdots, J_n). \qquad (9)$$
On the other hand, in a quasi-ergodic motion, a single line of flow
will in time come arbitrarily near to every point of an energy
surface. Thus the only distribution which will be independent
of time in this case is one in which the density is constant all
over an energy surface:
$$f(p_1 \cdots p_n, q_1 \cdots q_n) = F(E). \qquad (10)$$
Of course, the ensemble (10) would be independent of time even
in a multiply periodic motion, but it is more specialized than is
necessary in that case. For instance, in an ensemble of systems
each consisting of a particle in central motion, we could make an
ensemble in which all parts of the phase space corresponding
to the same energy had the same density, and this would be
constant. But we could equally well make the density in differ-
ent parts of the space corresponding to the same energy but
different angular momentum different, and still, as long as
the angular momentum was conserved, this distribution would
be constant. Any perturbation which involved slow changes
of angular momentum, however, would destroy the constancy
of this distribution, whereas if we had started with one which
depended only on energy, it would not be affected by such a
perturbation.
The ordinary systems which we deal with thermodynamically
are assumed to be so complicated that they are quasi-ergodic.
Thus the only type of ensemble independent of time is that of
Eq. (10), in which the density is a function of the energy. This
is the sort which we shall consider in thermodynamic applications.
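
That a density depending only on the energy is independent of time can be verified numerically in a simple case. In the sketch below we assume a linear oscillator in units in which the mass and angular frequency are unity, and compare a density which is a function of H alone with one which is not. The particular densities and all names are assumptions made for the illustration.

```python
import math

def hamiltonian(q, p):
    """Assumed example: linear oscillator with unit mass and angular frequency."""
    return 0.5 * p * p + 0.5 * q * q

def energy_density(q, p):
    """A density depending on the coordinates and momenta only through the energy."""
    return math.exp(-hamiltonian(q, p))

def shifted_density(q, p):
    """A density not constant over an energy surface, for comparison."""
    return math.exp(-(q - 1.0) ** 2 - p * p)

def partial(func, q, p, wrt, eps=1.0e-5):
    """Central-difference partial derivative with respect to 'q' or 'p'."""
    if wrt == "q":
        return (func(q + eps, p) - func(q - eps, p)) / (2.0 * eps)
    return (func(q, p + eps) - func(q, p - eps)) / (2.0 * eps)

def v_dot_grad(density, q, p):
    """Rate of change of density along the line of flow, v . grad(density)."""
    qdot = partial(hamiltonian, q, p, "p")
    pdot = -partial(hamiltonian, q, p, "q")
    return qdot * partial(density, q, p, "q") + pdot * partial(density, q, p, "p")

if __name__ == "__main__":
    print(v_dot_grad(energy_density, 0.4, -0.9))   # zero within rounding error
    print(v_dot_grad(shifted_density, 0.4, -0.9))  # distinctly different from zero
```

The first density is constant along every line of flow and so gives an ensemble independent of time; the second is carried around by the flow and changes with time at a fixed point of the phase space.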
221. The Microcanonical Ensemble. — A particularly impor-
tant ensemble is that called the microcanonical ensemble, in
which all the systems of the ensemble have practically the same
energy. More precisely, we have
$$f(p_1 \cdots p_n, q_1 \cdots q_n) = F(E) = \begin{cases} \text{constant} & \text{for } E_0 < E < E_0 + \Delta E, \\ 0 & \text{otherwise.} \end{cases} \qquad (11)$$
It is evident that an arbitrary ensemble can be made up by
superposing microcanonical ensembles, the ensemble whose
systems lie between $E_0$ and $E_0 + \Delta E$ having a constant density
so chosen as to give the proper density in that particular part
of the energy space. In thermodynamics the microcanonical
ensemble is often used, when we wish to deal with the statistical
properties of systems at a given temperature, for energy content
is correlated with temperature in such a way that systems of the
same temperature have just about the same energy, and therefore
are represented at least approximately by a microcanonical
ensemble.
222. The Canonical Ensemble. — More suitable than the
microcanonical ensemble for discussing temperature equilibrium
proves to be a slightly different one called the canonical ensemble.
In this distribution the density function is given by
$$f = F(E) = \text{constant}\; e^{-E/kT}, \qquad (12)$$
where E is the energy, k a constant, called Boltzmann's constant,
equal to $1.37 \times 10^{-16}$ c.g.s. units, and T is the absolute temperature.
We shall discuss in a later chapter the particular properties
of this ensemble, and its advantages. This ensemble has not
only the property of remaining unchanged with time, if the
system is left to itself, but also of remaining unchanged if the
system can interchange energy with another of the same tempera-
ture. This is evidently necessary for thermal equilibrium,
and the canonical ensemble is the only one in general which has
this property. From this ensemble we can derive interesting
results, though we mention only a few. We may, for instance,
use the $\mu$ space, each system being a molecule. The energy
of such a molecule is $\frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + V$, so that the
probability of finding a molecule having its coordinates and
momenta within the limits x and x + dx, y and y + dy,
z and z + dz, $p_x$ and $p_x + dp_x$, etc., is proportional to
$$e^{-\left[\frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + V\right]/kT}\,dx\,dy\,dz\,dp_x\,dp_y\,dp_z. \qquad (13)$$
This law is ordinarily called the Maxwell-Boltzmann distribu-
tion law. From it we can easily find that the velocities are
distributed according to Maxwell's distribution of velocities,
and that the density in ordinary space at different points is
proportional to $e^{-V/kT}$. We leave these proofs for problems.
If on the other hand we use the $\Gamma$ space, E represents the energy
of the whole sample of gas, and we can prove easily that the
energy of an individual sample in the ensemble is very nearly
the same as that of any other sample. Thus for such a system
the canonical ensemble is very similar to the microcanonical
ensemble. One gets the same thermodynamic results using
either ensemble, but the canonical ensemble is both more correct
theoretically and decidedly simpler for most of its applications.
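
Without anticipating the proofs left for the problems, one may check numerically that the distribution of speeds following from the Maxwell-Boltzmann law behaves as expected. The sketch below assumes the normalized form 4π(m/2πkT)^(3/2) v² e^(-mv²/2kT) for the fraction of molecules per unit range of speed, uses units in which m = kT = 1 by default, and integrates by a simple midpoint rule. The names and the cutoff of the integration are our own choices.

```python
import math

def maxwell_speed_density(v, m=1.0, kT=1.0):
    """Assumed normalized form: 4 pi (m / 2 pi kT)^(3/2) v^2 exp(-m v^2 / 2 kT)."""
    norm = 4.0 * math.pi * (m / (2.0 * math.pi * kT)) ** 1.5
    return norm * v * v * math.exp(-m * v * v / (2.0 * kT))

def integrate(func, lower, upper, steps=20000):
    """Composite midpoint rule, adequate for this smooth integrand."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))

def total_fraction(m=1.0, kT=1.0):
    """Integral of the distribution over all speeds; should be unity."""
    v_max = 12.0 * math.sqrt(kT / m)  # the integrand is negligible beyond this
    return integrate(lambda v: maxwell_speed_density(v, m, kT), 0.0, v_max)

def mean_kinetic_energy(m=1.0, kT=1.0):
    """Average of m v^2 / 2 over the distribution."""
    v_max = 12.0 * math.sqrt(kT / m)
    return integrate(
        lambda v: 0.5 * m * v * v * maxwell_speed_density(v, m, kT), 0.0, v_max
    )

if __name__ == "__main__":
    print(total_fraction())       # close to 1
    print(mean_kinetic_energy())  # close to 1.5, in units of kT
```

The integral of the distribution is unity, and the mean kinetic energy comes out as three halves of kT, independent of the mass assumed.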
223. The Quantum Theory and the Phase Space.— In Sec. 210,
Chap. XXIX, we have seen that a stationary state of a one-
electron problem corresponds to a classical particle whose
energy is determined, but whose initial time of starting is undeter-
mined. More accurately, it corresponds to an ensemble of
particles, all of the same energy, but with phases distributed
in such a way that the properties of the ensemble are independent
of time. This, however, is exactly a microcanonical ensemble.
This may be connected with the uncertainty principle for energy,
Eq. (2) of Chap. XXVIII, which states that the uncertainty
of energy multiplied by the uncertainty of time is equal to h.
If then we set up an ensemble of particles all of exactly the same
energy, it must necessarily be true that the uncertainty of time
of one of the particles is infinite. That is, we know nothing at
all as to its phase, or the ensemble consists of particles in all
possible phases. And since it is a stationary state we are dealing
with, nothing depends on time. In other words, with the
quantum theory, the mere process of setting up a stationary state
automatically sets up a microcanonical ensemble. We need not
do that specially, and we need not prove Liouville's theorem
to find out how to get ensembles independent of time. In this
way the quantum mechanics is more convenient for statistical
purposes than classical mechanics.
With problems with several variables, the stationary state
certainly represents an ensemble independent of time. If the
problem is multiply periodic, it will represent an ensemble of
states all of the same J values (that is, the same set of quantum
numbers), but arbitrary phases. On the other hand, if it is
quasi-ergodic, it will represent a microcanonical ensemble.
And even in a multiply periodic, degenerate case, where there
are several stationary states of the same energy, we can always
set up a microcanonical ensemble, by combining all the various
states of the same energy. Each one of these states will correspond
to a volume $h^n$ of the phase space. Then if the microcanonical
ensemble is to have a constant density of points over
a region between two energy surfaces, it will have a definite
number of points for each element of volume $h^n$, and hence a
constant and equal number of points for each of these substates
of the same energy. We may say that in this ensemble the
number of systems in any group of substates is proportional
to the a priori probability of this group of states; that is, simply
proportional to the number of substates in the group.
The distribution function $f(p_1 \cdots p_n, q_1 \cdots q_n)$ for the quantum
theory involves us in rather complicated considerations, which
we shall take up in the next chapter. The reason is that the
probability function which we are given directly is the square
of the wave function, $\psi$, and that is a function of the coordinates
only, giving the probability of finding the coordinates within
certain limits, independent of the momenta. In Sec. 210,
Chap. XXIX, we have shown that this probability function
approximately agrees with that found in classical mechanics.
We postpone other comparisons between the quantum and
classical distributions. But there is one feature of the quantum
distribution function which should be mentioned at the outset.
We have spoken above as if one could draw the paths of particles,
and set up distribution functions, in the phase space, for the
quantum theory as for the classical theory. But this is really
not possible, as we can see from the uncertainty principle.
This says that the uncertainty in the coordinate of a particle,
multiplied by the uncertainty of its momentum, is of the order
of magnitude of h. This product of uncertainties is simply an
area in phase space. Instead of representing the particle by
a sharp point, we can visualize it as a region in phase space, of
dimensions $\Delta q$ and $\Delta p$ along the two axes. By the uncertainty
principle, the area of this region is h. If we had the same
thing in a number of dimensions, as n variables, the 2n-dimensional
volume associated with the uncertain position and momentum
of the particle or representative point would be $h^n$, just the
volume associated with a stationary state. As a result of this
uncertainty, we must always be cautious about using the ideas
of definite paths of representative points in the phase space.
It would perhaps be more accurate to think of the paths, and
energy surfaces, as having definite thicknesses, as if the point
carried along its volume $h^n$, and allowed that to trace out a finite
region of phase space.
The canonical ensemble can be set up in quantum theory as in
classical mechanics. In the classical theory, it is the ensemble
in which the number of points per unit volume is proportional
to $e^{-E/kT}$. In quantum theory, the number of points in volume
$h^n$, or the number in a given stationary state, is proportional
to $e^{-E/kT}$, or this exponential is proportional to the probability
of finding a system, chosen at random from the ensemble, in the
stationary state in question. If we group together a number of
degenerate substates all of energy E, and if there are g of them,
so that the a priori probability of the group is g, the number of
systems in the group is proportional to $g e^{-E/kT}$.
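
The rule just stated is easily put into numerical form. The sketch below takes a list of energy levels, each with its a priori probability g, and returns the fraction of the systems of a canonical ensemble in each level. The function name, the representation of a level as a pair (energy, g), and the numerical values in the example are assumptions for illustration.

```python
import math

def populations(levels, kT):
    """Fractions of systems in each level, proportional to g * exp(-E / kT).

    levels is a list of (energy, g) pairs, g being the a priori probability;
    energy and kT must be in the same units.
    """
    weights = [g * math.exp(-energy / kT) for energy, g in levels]
    total = sum(weights)
    return [w / total for w in weights]

if __name__ == "__main__":
    # Assumed example: a single lower state and a threefold degenerate upper level.
    levels = [(0.0, 1), (1.0, 3)]
    for kT in (0.1, 1.0, 10.0):
        print(kT, populations(levels, kT))
```

At low temperature practically all systems are in the lowest state; at high temperature the exponentials approach unity, and the populations approach the ratio of the a priori probabilities.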
Problems
1. Take the problem of a particle executing one-dimensional motion in a
container with constant potential inside, but impenetrable walls, as in Prob.
11, Chap. XXIX. Plot the path of the representative point in phase space,
find the phase integral, and show that the quantum condition leads to the
same stationary states and energy levels that were determined previously,
except that it leads to half rather than whole quantum numbers.
2. For the system of Prob. 1, compare (a) the frequency of oscillation of
the particle back and forth between the walls, as determined classically by
elementary argument; (b) the same frequency as determined by the formula
dH/dJ; (c) the emitted frequency on the quantum theory.
3. Draw the phase space for a rotator, as described in Sec. 213, and verify
the quantum condition stated there.
4. Apply the correspondence principle to the radiation from a linear
oscillator. Show that the Fourier components of the classical motion are
zero corresponding to all transitions except those in which the quantum
number changes by one unit only. From this one may infer that in the
quantum theory only this particular transition can occur, the probability
of any other sort of transition being zero. Such a result is called a selection
principle.
5. Consider the motion of Prob. 6, Chap. IX, in which a particle executed
simple harmonic motion on a rotating turntable. Assume that one quantum
number, and phase integral, is associated with the rapid frequency of oscilla-
tion, and the other phase integral with the slower frequency of rotation of
the turntable. From the Fourier analysis of the x component of motion,
show that the only allowed transitions are those in which each quantum
number changes by ±1 unit. Show further that both must change together,
there being no transitions of one quantum number alone, but that a transition
of +1 unit in one of the quantum numbers is equally likely to occur in
connection with either +1 or −1 of the other.
6. Find Maxwell's distribution of velocities, stating that the number of
molecules of a gas for which the velocity is between v and v + dv is proportional to
$$v^2 e^{-mv^2/2kT}\,dv.$$
To do this, use $\mu$ space, assume the Maxwell-Boltzmann distribution law.
Consider a fixed point of space, so that x, y, z are constant, and we need only
consider the three-dimensional momentum space. Note that the velocity
is proportional to the radius $\sqrt{p_x^2 + p_y^2 + p_z^2}$ in the momentum space.
The number of molecules between v and v + dv is then proportional to the
density of molecules in the momentum space, which from the Maxwell-
Boltzmann law is constant for constant v, times the volume of momentum
space between v and v + dv, which can be computed from the ordinary
geometrical relations of a sphere. Determine the constant factor in the
law so that your formula will give directly the fraction of all molecules in
the range dv.
7. Find the mean kinetic energy of a molecule at temperature T. Note
that the mean of any quantity F(p, q) is given by
$$\bar{F} = \frac{\int F(p, q)\,f(p, q)\,dp \cdots dq \cdots}{\int f(p, q)\,dp \cdots dq \cdots},$$
where f(p . . . q . . . ) is the density function in the phase space, and the
integration is over all parts of the phase space. Note also that since in this
case F depends only on the momentum, the integrals in numerator and
denominator can be factored into one integral over the momenta, one over
the coordinates, and that the latter cancel out.
8. By integrating over all momenta, show that the space density of
molecules in a gas is proportional to $e^{-V/kT}$. Apply this to the density of
the atmosphere in the earth's gravitational field, assuming constant tem-
perature. Find from this the rate of decrease of barometric pressure with
altitude, at the earth's surface, assuming a reasonable atmospheric
temperature.
9. In the $\Gamma$ space, consider a canonical ensemble of N identical molecules,
where N is very large. Assume that no forces act. Find the number of
systems of the ensemble for which the total energy is between definite limits
E and E + dE. To do this, note that the energy is proportional to
$p_{x1}^2 + p_{y1}^2 + \cdots + p_{zN}^2$, or the square of the N-dimensional radius of the
momentum space, so that the part of the space between E and E + dE is the region
between two corresponding hyperspheres. Note that the "volume" of a
two-dimensional "sphere" (a circle) is $\pi r^2$; of a three-dimensional one, $\frac{4}{3}\pi r^3$;
of an N-dimensional one, constant times $r^N$. Also note that the volume
between r and r + dr is $\frac{d(\text{volume})}{dr}\,dr$.
10. Show that the fraction of all systems in a canonical ensemble for
which the energy is between E and E + dE is approximately given by a Gaussian
error curve, $A e^{-c(E - a)^2}$. Find c and a. (Hint: The function found in
Prob. 9 has a very sharp maximum, to be approximated by the error curve
above. Expand the logarithm of the function in power series about its
maximum, a, so that the logarithm equals $\text{constant} - c(E - a)^2 + \cdots$,
where there is no first power term because the expansion is about the maximum,
and higher power terms than the second are to be neglected. Then
the function is the exponential of this power series, giving the value above.
Show that the third and higher power terms are negligible unless $E - a$ is so
large that the function itself has sunk to a negligible value.)
11. In the distribution of Prob. 10, show that the mean energy of the
systems of the ensemble is just N times the energy of a single molecule, as
found in Prob. 7. To get an idea of the range of distribution of energy
about this mean, find the energy for which the Gaussian distribution curve
falls to half its maximum value. Show that the energy difference between
this value and the mean increases proportionally to $\sqrt{N}$, but that the
percentage deviation of the energy, or the deviation divided by the total
energy, goes down as $1/\sqrt{N}$, so that for large N the percentage deviation
is extremely small.
CHAPTER XXXI
MATRICES
Suppose we have a problem, like the linear oscillator, in which
there are no motions which go to infinity; that is, in which every
motion is quantized, so that only discrete energy values are
allowed. Let the nth energy value be $E_n$, the corresponding
wave function $u_n$. Then a general solution of the wave problem,
involving the time, is
$$\psi = \sum_n c_n e^{-\frac{2\pi i E_n t}{h}} u_n(x, y, z), \qquad (1)$$
where we choose the negative exponential for reasons which
will appear later. This function will shortly be derived as the
solution of a wave equation involving the time, though we have
not yet written down that equation. Now let us recall the meaning
of $\psi$. It is the amplitude of a wave whose intensity gives
the probability that the particle be found at a given place at a
given time. Since $\psi$ is complex, this intensity is given by
multiplying by its conjugate; hence $\bar\psi\psi$ gives the desired probability.
Or more precisely, the probability that the particle,
at time t, is in the volume element dx dy dz, is $\bar\psi\psi\,dx\,dy\,dz$.
One result appears at once from this: the probability that the
particle be somewhere is unity, and this must be the sum of
the probabilities that it be in all separate parts of space, or
the integral of the probability over all space:
$$\iiint \bar\psi\psi\,dx\,dy\,dz = 1. \qquad (2)$$
Now having the probability, we can proceed to get statistical
information about the behavior of the particle.
224. Mean Value of a Function of Coordinates.— As we have
seen in the last chapter, the first step in a statistical investiga-
tion is to find a distribution function. There we were interested
in functions of coordinates and momenta of a particle or system,
and we had a function $f(q_1, \ldots, q_n, p_1, \ldots, p_n)$, such that
$f\,dq_1 \cdots dp_n$ gave the number of systems having coordinates
and momenta in the range $dq_1 \cdots dp_n$. To find the average
of any quantity, given such a distribution, we proceed as follows:
if the quantity is $F(q_1, \ldots, p_n)$, a function of coordinates and
momenta, we multiply the function by the fraction of systems
having those particular q's and p's, and integrate over all q's
and p's. This fraction is $\frac{f\,dq_1 \cdots dp_n}{\int f\,dq_1 \cdots dp_n}$, so that the result is
$$\bar{\bar F} = \frac{\int F f\,dq_1 \cdots dp_n}{\int f\,dq_1 \cdots dp_n}, \qquad (3)$$
where we denote the average of F by $\bar{\bar F}$, to avoid confusion with
the single bar indicating complex conjugates. Similarly in the
present case we have a function $\bar\psi\psi$ which is a distribution function
as far as coordinates are concerned: $\bar\psi\psi\,dx\,dy\,dz$ is the
probability (directly, since $\int \bar\psi\psi\,dx\,dy\,dz = 1$) that the particle
have coordinates within dx dy dz. Thus if we have a function
F(x, y, z) of the coordinates, and wish its mean value, we have
$$\bar{\bar F} = \int F\,\bar\psi\psi\,dx\,dy\,dz = \int \bar\psi F \psi\,dx\,dy\,dz, \qquad (4)$$
where we prefer the latter method of writing it because it fits
in with formulas which we shall have later. This does not tell
us how to find averages of functions of the momenta, such as
for example the energy; that is more complicated, and will be
discussed in a later section. But we may wish, for instance
with an atom or molecule, to find the mean value of the center
of gravity, or moment of inertia, or some such function of posi-
tion alone, and the formula suffices to determine it.
It is now very interesting to substitute our expansion of $\psi$
in the expression for a mean value. That gives
$$\bar{\bar F} = \sum_{n,m} \bar c_n c_m e^{\frac{2\pi i}{h}(E_n - E_m)t} \int \bar u_n F u_m\,dx\,dy\,dz = \sum_{n,m} \bar c_n c_m F_{nm}\,e^{\frac{2\pi i}{h}(E_n - E_m)t}, \qquad (5)$$
where by definition $F_{nm} = \int \bar u_n F u_m\,dx\,dy\,dz$. The quantities
$F_{nm}$ form a two-dimensional array of numbers, of the sort known
in mathematics as a matrix, and the individual $F_{nm}$'s are called
matrix components.
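A minimal numerical sketch of such an array may make the definition concrete. It assumes, purely for illustration, the real wave functions of a particle in a one-dimensional box of unit length (not a problem treated in this section) and takes F = x; the names `box_state` and `matrix_component` are hypothetical, and the integral is approximated by the midpoint rule.

```python
import math

def box_state(n, x, length=1.0):
    """Normalized particle-in-a-box wave function u_n(x), n = 1, 2, ..."""
    return math.sqrt(2.0 / length) * math.sin(n * math.pi * x / length)

def matrix_component(F, n, m, length=1.0, steps=2000):
    """Approximate F_nm = integral of u_n F u_m dx by the midpoint rule.

    The u's are real here, so no complex conjugate is needed.
    """
    dx = length / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += box_state(n, x, length) * F(x) * box_state(m, x, length)
    return total * dx

def position(x):
    """The function F(x) = x whose matrix is wanted."""
    return x

# 3 x 3 corner of the matrix of x; rows are n, columns are m.
x_matrix = [[matrix_component(position, n, m) for m in (1, 2, 3)]
            for n in (1, 2, 3)]
for row in x_matrix:
    print(["%8.5f" % value for value in row])
```

In this assumed example the diagonal components all equal one-half, the mean position in the box, while the nondiagonal components connect pairs of states.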
225. Physical Meaning of Matrix Components. — Suppose we
have an electron in an atom, and try to find its electric moment
as a function of time; that is, its charge e times the displacement
of the electron, x. In other words, we wish the mean
value of ex, which is
$$\overline{\overline{ex}} = \sum_{n,m} \bar c_n c_m e^{\frac{2\pi i}{h}(E_n - E_m)t}\,(ex)_{nm}. \qquad (6)$$
We observe that in the mean moment the terms depend on time
through the expression $e^{\frac{2\pi i}{h}(E_n - E_m)t}$, having the frequency
$(E_n - E_m)/h$. But this is just the frequency which by the quantum
theory the atom should emit in jumping between the energy
levels m and n. Hence we connect this particular matrix compo-
nent with this transition. By the correspondence principle,
in Sec. 217, Chap. XXX, we have already seen that associated
with each transition there is a classical frequency of oscillation,
and a corresponding Fourier component of the motion. It
can now be shown that this Fourier component, in the limit of
large quantum numbers, becomes equal to the matrix component
$(ex)_{nm}$ of the electric moment, which appears in Eq. (6). The
individual terms of Eq. (6) act like oscillators, radiating energy,
and it proves to be true, though it requires a difficult analysis
to show it, that the rate of radiation of the oscillator determines
exactly the quantum probability of transition. For example,
if a matrix component is zero, there will be no radiation of the
corresponding frequency, no transitions are possible between
the stationary states concerned, and we have what is called a
selection principle, a principle selecting out certain transitions
which can occur, the rest being forbidden.
The matrix components which we have noticed have been those
where m and n were different. If we make a scheme of matrix
components like
$$\begin{matrix} F_{11} & F_{12} & F_{13} & \cdots \\ F_{21} & F_{22} & F_{23} & \cdots \\ F_{31} & \cdots & & \\ \cdots & & & \end{matrix} \qquad (7)$$
we see that the components $F_{11}$, $F_{22}$, etc., along the principal
diagonal all have m = n, so that our components with $m \neq n$
are just the nondiagonal components. The diagonal components,
however, have a different meaning. They refer to time
average properties of the system, rather than to the sinusoidal
properties which are connected with radiation. Thus if we
take the time average of $\bar{\bar F}$ (where the averaging in $\bar{\bar F}$ refers to
averaging over the probability distribution, not over time),
the exponential term $e^{\frac{2\pi i}{h}(E_n - E_m)t}$ averages to zero, unless n = m,
in which case it is unity. Hence we have
$$\text{time average of } \bar{\bar F} = \sum_n \bar c_n c_n F_{nn}, \qquad (8)$$
the double sum reducing to a single sum. Here, as we said above,
only the diagonal components of the matrix of F appear.
We can understand this formula better if we notice the mean-
ing of the c's. To get at this, we observe that the c's are the
amplitudes by which the various overtones are multiplied, in
order to get the whole wave function. Thus the quantities $\bar c_n c_n$,
the squares of these amplitudes (taking account of the fact that
they may be complex by multiplying by the conjugate) are
quantities proportional to the intensities of the various overtones;
and the interpretation of this is that they are proportional
to the probability that the particle be in a given stationary state.
As a matter of fact, we shall soon show that $\bar c_n c_n$ represents just
the probability itself, the sum of all the probabilities, $\sum_n \bar c_n c_n$,
being unity. Thus the formula
$$\text{time average of } \bar{\bar F} = \sum_n \bar c_n c_n F_{nn}$$
means that $F_{nn}$ is the time average of F over the nth stationary
state, and $\bar c_n c_n$ the probability of finding the system in this
stationary state, so that we multiply together and add to get the
average over all stationary states.
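The statement that only the diagonal components survive a time average can be checked on a small made-up case. The sketch below assumes a system with just two stationary states; the energies, amplitudes, and matrix components are hypothetical numbers chosen for illustration, and h is set to unity.

```python
import cmath
import math

H = 1.0                              # Planck's constant, arbitrary units
energies = [1.0, 2.5]                # hypothetical E_1, E_2
c = [0.6 + 0j, 0.8 + 0j]             # amplitudes with |c_1|^2 + |c_2|^2 = 1
F = [[0.5, -0.18], [-0.18, 0.5]]     # hypothetical matrix components F_nm

def mean_value(t):
    """Double sum of conj(c_n) c_m F_nm exp(2 pi i (E_n - E_m) t / h)."""
    total = 0j
    for n in range(2):
        for m in range(2):
            phase = cmath.exp(2j * math.pi * (energies[n] - energies[m]) * t / H)
            total += c[n].conjugate() * c[m] * F[n][m] * phase
    return total.real

# Average over one full period of the beat frequency (E_2 - E_1) / h.
period = H / abs(energies[0] - energies[1])
samples = 1000
time_average = sum(mean_value(k * period / samples) for k in range(samples)) / samples

# Single sum over diagonal components only.
diagonal_sum = sum(abs(c[n]) ** 2 * F[n][n] for n in range(2))
print(mean_value(0.0), time_average, diagonal_sum)
```

The instantaneous mean value oscillates, but its average over a period agrees with the single sum over the diagonal components.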
226. Initial Conditions, and Determination of c's. — Just as
with the problem of the vibrating string, we may have initial
conditions: we may know that the distribution $\psi$ has a certain
value at t = 0. Let us take a specific example: we may know
that at t = 0 the particle is inside a given small volume, though
we do not know where in that volume. Then we may ask as
to its probable later motion. That is, we know that $\psi(x, y, z, t)$
is zero, at t = 0, outside the small volume, and has a constant
value, or at least a sinusoidal form with constant amplitude,
inside the volume. Now at t = 0, the exponentials become
unity, so that we have $\psi(x, y, z, 0) = \sum_n c_n u_n(x, y, z)$. But this
is just the familiar problem of expanding an arbitrary function
of x, y, z in a series of functions $u_n$. These are orthogonal functions;
they are solutions of Schrodinger's equation, which is
of the type already discussed in Prob. 10, Chap. XV, where we
showed in general that the solutions were orthogonal. We
assume them to be also normalized. Thus the c's are simply the
coefficients of expansion, determined directly by multiplying by
the corresponding normal function and integrating. We must be
careful of only one thing: our functions are now often complex,
and when we multiply two such functions together, in such cases,
it proves to be necessary always to multiply so that a function
and a conjugate appear together. Thus we have
$$\int \psi(x, y, z, 0)\,\bar u_m(x, y, z)\,dx\,dy\,dz = \sum_n c_n \int u_n \bar u_m\,dx\,dy\,dz.$$
But now the orthogonality is such that $\int u_n \bar u_m\,dx\,dy\,dz$ is
unity if n = m, zero if $n \neq m$, so that we have
$$c_m = \int \psi\,\bar u_m\,dv. \qquad (9)$$
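The determination of the c's can be sketched numerically. The example assumes, for illustration only, a particle in a one-dimensional box of unit length whose initial wave function is constant inside the interval 0.4 to 0.6 and zero outside; the box wave functions are real, so the conjugate plays no part, and the helper names are hypothetical.

```python
import math

def box_state(n, x, length=1.0):
    """Normalized particle-in-a-box wave function u_n(x), n = 1, 2, ..."""
    return math.sqrt(2.0 / length) * math.sin(n * math.pi * x / length)

def initial_psi(x, left=0.4, right=0.6):
    """Constant inside [left, right], zero outside, normalized to unity."""
    if left <= x <= right:
        return 1.0 / math.sqrt(right - left)
    return 0.0

def coefficient(m, length=1.0, steps=4000):
    """c_m = integral of psi(x, 0) u_m(x) dx, by the midpoint rule."""
    dx = length / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += initial_psi(x) * box_state(m, x, length)
    return total * dx

coefficients = [coefficient(m) for m in range(1, 201)]
total_probability = sum(value * value for value in coefficients)
print("c_1 = %.4f, c_2 = %.4f" % (coefficients[0], coefficients[1]))
print("sum of c_m^2 over 200 states = %.4f" % total_probability)
```

Many overtones are excited by so sharply bounded an initial distribution, and the partial sum of the squared coefficients approaches unity only slowly as more states are included.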
The physical situation is then this. If we know initially
the distribution of coordinates, we can find a $\psi$ satisfying the
conditions, and in general all the c's will be different from zero.
That is, all overtones will be excited, or the system will be
partly in each stationary state. We may say, if we choose,
that we have an ensemble, and that a system of this ensemble
has a probability $\bar c_n c_n$ of being in the nth state. If now we ask
how $\psi$ changes with time, we can see that the particle will no
longer have the initial distribution of probability, but that the
probability will change with time. For example, if we originally
know it is in a small volume, this will not continue to be true
as time goes on; it will have a chance of moving out of the volume.
The reason is that the different waves cooperate to give just
the right function at t = 0, but they vibrate with different
frequencies, and soon they get out of step, and can no longer
cooperate properly. Thus a general wave function, made by
superposing many stationary states, does not represent an ensem-
ble independent of time, though a single wave function does.
Though the probabilities as functions of the coordinates change
with time, it is significant that the c's, being constants, do not.
Thus the probability of finding the atom in a given stationary
state does not change. The atoms do not go from one to another,
and the states are really stationary. This is all true only
if we neglect radiation, or external forces. If there is radiation,
the whole situation will be altered, the c's will change with
time, and the time rate of change of any $\bar c c$ will be interpreted
as being connected with a corresponding probability that atoms
are having transitions to or from this state. It is much as with
vibrating strings: if the string is started off with a complicated
shape, this shape will be soon changed, but if there is no friction
we can analyze the motion into overtones, and each overtone
preserves its amplitude. If friction is present, however, the
overtones change their amplitudes.
227. Mean Values of Functions of Momenta. — The method of
finding mean values of functions of the coordinates is perfectly
straightforward, but the treatment of the momenta is peculiar,
and is one of the characteristic features of wave mechanics. The
momentum shows itself in the wave function through the wave
length of the wave, and in order to get information about wave
length, it turns out that the proper procedure is to differentiate
the wave function. We can find the correct formulas from a very
simple case; and since we are setting up a theory which is not
derived from any other, we can do nothing but postulate the
general formulas, which prove to be the same ones that we find
in this special case. Thus suppose we have a free particle in
empty space, traveling with a momentum p, energy E. Its
wave function, if it travels along the x axis, will be $e^{\frac{2\pi i}{h}(px - Et)}$,
corresponding to the wave length $1/\lambda = p/h$. More generally, if
its components of momentum along the three axes are $p_x$, $p_y$,
$p_z$, its wave function will be
$$\psi = e^{\frac{2\pi i}{h}(p_x x + p_y y + p_z z - Et)},$$
a plane wave. If we wanted to find the mean x momentum of this
particle, we should multiply $p_x$ by the probability, and integrate;
we should get $p_x$, of course, since the mean value of a constant is
the same constant. But the question is, how is this to be generalized
so that it can be used in more complicated cases, where the
momentum does not appear explicitly, and is not constant? The
answer proves to be the following: If our function is $\psi$, we observe
that $\frac{h}{2\pi i}\frac{\partial\psi}{\partial x}$ equals $p_x\psi$. Thus if we form the expression
$\bar\psi\,\frac{h}{2\pi i}\frac{\partial\psi}{\partial x}$
and integrate, the answer will be the same as integrating $\bar\psi p_x \psi$.
Similarly, integrating $\bar\psi\left(\frac{h}{2\pi i}\frac{\partial}{\partial x}\right)^2\psi$
would give $p_x^2$, and so on. In other words, the operator $\frac{h}{2\pi i}\frac{\partial}{\partial x}$,
operating on $\psi$, and averaged, can be taken to stand for the x
component of momentum.
It is now assumed that this process can be applied in general.
Thus with any wave function $\psi$, the mean value of the x component
of momentum is $\int \bar\psi\left(\frac{h}{2\pi i}\frac{\partial}{\partial x}\right)\psi\,dv$. Or more generally, if we
have any function of momenta and coordinates, say $F(x, y, z, p_x, p_y, p_z)$,
we have for the mean value
$$\bar{\bar F} = \int \bar\psi\,F\!\left(x, y, z, \frac{h}{2\pi i}\frac{\partial}{\partial x}, \frac{h}{2\pi i}\frac{\partial}{\partial y}, \frac{h}{2\pi i}\frac{\partial}{\partial z}\right)\psi\,dv. \qquad (11)$$
This is the general rule, reducing to our former one when F
involves only coordinates. There is one difficulty connected with
this, however. It turns out that if there are any terms in F
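involving products of coordinates and momenta, the answer will
depend on the order in which they occur. The best example is
the case of the product $p_x x$. We have
$$\overline{\overline{p_x x}} = \int \bar\psi\,\frac{h}{2\pi i}\frac{\partial}{\partial x}(x\psi)\,dv = \frac{h}{2\pi i} + \overline{\overline{x p_x}},$$
or
$$p_x x - x p_x = \frac{h}{2\pi i}. \qquad (12)$$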
This is the so-called commutation rule ; it states that interchange,
or commutation, of the order of a coordinate and momentum
operator changes the value, since the difference is not zero. In
most actual cases that we meet, we shall not be troubled by this
difficulty of noncommutability of coordinates and momenta, but
it is something against which we must be on our guard.
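The commutation rule can be verified numerically on a test function. In the sketch below the momentum operator is represented by a central-difference derivative multiplied by h/2πi; the Gaussian test function, the step size, and the choice h = 1 are assumptions made only for illustration.

```python
import cmath
import math

H = 1.0  # Planck's constant in arbitrary units (illustrative assumption)

def momentum(f, step=1e-5):
    """Return the function (h / 2 pi i) df/dx, using a central difference."""
    factor = H / (2j * math.pi)
    return lambda x: factor * (f(x + step) - f(x - step)) / (2.0 * step)

def times_x(f):
    """Return the function x f(x)."""
    return lambda x: x * f(x)

def psi(x):
    """A hypothetical test function: a Gaussian packet with a complex phase."""
    return cmath.exp(-x * x + 2j * x)

px_psi = momentum(times_x(psi))   # momentum operator applied to x psi
xp_psi = times_x(momentum(psi))   # x multiplied into the momentum of psi

x0 = 0.7
commutator_value = px_psi(x0) - xp_psi(x0)
expected = H / (2j * math.pi) * psi(x0)
print(commutator_value, expected)
```

The two printed numbers agree to within the error of the finite difference, whatever smooth test function and point are chosen.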
We notice by analogy with what we have done that, taking the
wave function of the form given above, $-\frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = E\psi$. This
again is taken to be a general method of finding the energy of a
wave function:
$$\bar{\bar E} = \int \bar\psi\left(-\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi\,dv.$$
If $\psi = \sum_m c_m e^{-\frac{2\pi i}{h}E_m t}u_m(x, y, z)$, we evidently have
$$-\frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = \sum_m c_m E_m e^{-\frac{2\pi i}{h}E_m t}u_m(x, y, z).$$
Multiplying by $\bar\psi$, we have
$$\sum_{n,m} \bar c_n c_m E_m e^{\frac{2\pi i}{h}(E_n - E_m)t}\,\bar u_n u_m.$$
Integrating over the coordinates, the nondiagonal terms drop
out on account of orthogonality of the u's, and the rest reduces to
$$\bar{\bar E} = \sum_n \bar c_n c_n E_n, \qquad (13)$$
a weighted mean of the energy of the various states.
228. Schrodinger's Equation Including the Time. — We are
now able to give a more general interpretation of Schrodinger's
equation than was possible in Chap. XXIX. We start with the
classical expression
$$H(q_1 \cdots q_n, p_1 \cdots p_n) = E,$$
where H is the Hamiltonian function, E is the total energy, and
the equation represents the conservation of energy. But now
suppose we try to replace each side by the corresponding quantum
theory expression, so that we shall be able to allow each
side to act on $\psi$, and if we wish multiply by $\bar\psi$ and integrate to
get averages. The first step is
$$H\!\left(q_1 \cdots q_n, \frac{h}{2\pi i}\frac{\partial}{\partial q_1}, \ldots, \frac{h}{2\pi i}\frac{\partial}{\partial q_n}\right)\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}. \qquad (14)$$
But this is just Schrodinger's equation, in the form involving the
time (which we have not so far met). To show that it reduces
to the form we have previously met, let us take the case of rectangular
coordinates x, y, z. There
$$H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + V,$$
so that the equation becomes
$$-\frac{h^2}{8\pi^2 m}\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}\right) + V\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}.$$
In this, let us assume a solution $\psi = e^{-\frac{2\pi i}{h}Et}u(x, y, z)$, where E
is a constant, to be identified at the proper time with the energy.
Then the equation becomes
$$\left(-\frac{h^2}{8\pi^2 m}\nabla^2 + V\right)u = Eu, \qquad (15)$$
which leads immediately to the form of Schrodinger's equation
with which we are familiar.
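The reduction can be written out as a short worked step, assuming only a trial solution consisting of the factor $e^{-\frac{2\pi i}{h}Et}$ multiplied by a function u of the coordinates:

```latex
\begin{aligned}
\psi &= e^{-\frac{2\pi i}{h}Et}\,u(x, y, z), \\
-\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}
  &= -\frac{h}{2\pi i}\left(-\frac{2\pi i}{h}E\right)\psi = E\psi, \\
-\frac{h^2}{8\pi^2 m}\nabla^2\psi + V\psi
  &= e^{-\frac{2\pi i}{h}Et}\left(-\frac{h^2}{8\pi^2 m}\nabla^2 u + Vu\right).
\end{aligned}
```

Equating the last two expressions and dividing by the common exponential factor leaves an equation for u alone, in which the time no longer appears.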
229. Some Theorems Regarding Matrices. — Suppose that we
have an operator F, formed from a function of the q's and p's
by replacing the p's by differentiations, in the manner we have
described. Then we have by definition $F_{nm} = \int \bar u_n F u_m\,dv$. But
we can look at this in the following way. The $u_n$'s form a set of
orthogonal unit vectors in function space. $Fu_m$ is a function
different in general from any of the u's, and hence a different
vector. The quantity $\int \bar u_n F u_m\,dv$ is the scalar product of $Fu_m$
with $u_n$; that is, it is the component of $Fu_m$ along the nth axis.
But this suggests writing a vector equation:
$$Fu_m = \sum_n F_{nm}u_n, \qquad (16)$$
expressing $Fu_m$ as a sum of unit vectors, each multiplied by the
corresponding component. To prove this, we need only multiply
by $\bar u_n$ and integrate, when the right side, on account of orthogonality,
leaves only $F_{nm}$. An example of such an expression is Schrodinger's
equation not involving the time, which can be written
$$Hu_n = E_n u_n, \qquad (17)$$
if $E_n$ is the energy in the nth state. This obviously expresses
the fact that the matrix of H has only diagonal components
($H_{nm} = E_n$ if n = m, zero if $n \neq m$), so that, since H has no
nondiagonal components, it has no terms depending on time, or
is a constant.
It is interesting to write down the matrix of a constant, for
example a number C. Evidently $C_{nm} = \int \bar u_n C u_m\,dv = C$ if
n = m, 0 if $n \neq m$. A particular case is the matrix of unity,
$\int \bar u_n u_m\,dv = 1$ if n = m, 0 if $n \neq m$, simply the orthogonality and
normalization conditions. This matrix is often called $\delta_{nm}$; by
definition $\delta_{nm} = 1$ if n = m, 0 if $n \neq m$. In terms of this, we
have $C_{nm} = C\delta_{nm}$. And we can write the matrix of the energy as
$$H_{nm} = E_n\delta_{nm}. \qquad (18)$$
This matrix equation, stating that the matrix of the energy is a
diagonal matrix with the characteristic values E n , may be taken
as a matrix statement of Schrodinger's equation; we readily see
that it is just what would be obtained by multiplying Schrod-
inger's equation by an arbitrary u m , and integrating. We shall
actually use this matrix equation later in discussing perturbation
theory. It is to be noted that a matrix depends on two things:
first, the operator, and secondly, the set of orthogonal functions
with respect to which it is computed. Thus a given operator, as
energy or angular momentum or x coordinate, can have its
matrix computed in any set of orthogonal functions. The prob-
lem of solving Schrodinger's equation with a given energy opera-
tor may be considered as that of finding the particular set of
orthogonal functions which makes that operator diagonal. In
a similar way we can find a set of orthogonal functions which
would make any other desired operator have a diagonal matrix.
We shall see in the next chapter that this involves us essentially
in a rotation of axes in function space, similar to what we found
in introducing normal coordinates in vibration problems.
From our expansion of $Fu_m$ in series in the $u_n$'s, we can easily
get the method of multiplying matrices, which is very useful in
matrix manipulation. Suppose that we have two operators F
and G, and know the matrix components $F_{nm}$ and $G_{nm}$. We can
then find easily the matrix components of the product operator
FG. For we have
$$Gu_n = \sum_m G_{mn}u_m,$$
$$FGu_n = \sum_m G_{mn}Fu_m = \sum_{m,k} G_{mn}F_{km}u_k = \sum_k\left(\sum_m F_{km}G_{mn}\right)u_k.$$
But also
$$(FG)u_n = \sum_k (FG)_{kn}u_k,$$
by the earlier formula. Hence
$$(FG)_{kn} = \sum_m F_{km}G_{mn}, \qquad (19)$$
the formula for multiplying matrices.
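The multiplication rule is easily tried on a small case. The sketch below assumes the matrix of x for the linear oscillator in the form quoted in the problems at the end of this chapter, cut off after six states, and forms the matrix of $x^2$ by the rule; the truncation, the units h = m = ν = 1, and the function names are assumptions for illustration.

```python
import math

def multiply(F, G):
    """Matrix product by the rule (FG)_kn = sum over m of F_km G_mn."""
    size = len(F)
    return [[sum(F[k][m] * G[m][n] for m in range(size)) for n in range(size)]
            for k in range(size)]

def oscillator_x(size, h=1.0, mass=1.0, nu=1.0):
    """Truncated matrix of x for the linear oscillator, states n = 0 .. size-1."""
    x = [[0.0] * size for _ in range(size)]
    for n in range(size - 1):
        value = math.sqrt((n + 1) * h / (8.0 * math.pi ** 2 * mass * nu))
        x[n + 1][n] = value      # component x_{n+1,n}
        x[n][n + 1] = value      # component x_{n,n+1}
    return x

x = oscillator_x(6)
x_squared = multiply(x, x)

# Express the diagonal of x^2 in units of h / (8 pi^2 m nu).
unit = 1.0 / (8.0 * math.pi ** 2)
print([round(x_squared[n][n] / unit, 6) for n in range(6)])
```

In these units the first five diagonal components of $x^2$ come out as 1, 3, 5, 7, 9, that is, 2n + 1; the last is spoiled because the cut-off matrix lacks the neighboring state n + 1.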
It is a rather remarkable fact that the method of operating
with matrices was discovered before the wave mechanics. This
multiplication rule, and the commutation rule, were both devel-
oped. They were used for a number of complicated calculations,
without use of wave functions, for example for finding the energy-
levels of the linear oscillator, its intensities of radiation, and even
the energy levels of the hydrogen atom. For a few problems, as
perturbation theory, the matrix method is still more convenient
than the wave method, as we shall see.
Problems
1. Prove that a coordinate commutes with another coordinate; a momen-
tum commutes with another momentum; and a coordinate commutes with
a momentum conjugate to another coordinate.
2. Write down the operators for the three components of angular momen-
tum in rectangular coordinates.
3. If F is any operator, prove that $\frac{h}{2\pi i}\frac{dF}{dt} = (HF - FH)$, where H is
the Hamiltonian operator, the equation above to be regarded as either an
operator or matrix equation. To prove it, take average values of the
operators. Find the average value of F, differentiate it with respect to
time, to get the left side of the expression. On the right, in computing
the average values, use the multiplication rule to compute the matrices
of HF and FH, noting that H has a diagonal matrix. Finally identify terms
on both sides of the equation.
4. Using the result of Prob. 3, prove that the time rate of change of the
energy is zero; prove that H and t satisfy a commutation relation like a
momentum and coordinate.
5. Show that for the linear oscillator the assumptions
$$E_n = (n + \tfrac{1}{2})h\nu$$
$$x_{nn} = 0$$
$$x_{n+1,n} = x_{n,n+1} = \sqrt{\frac{(n + 1)h}{8\pi^2 m\nu}}$$
$$x_{nm} = 0 \text{ if } m \neq n \pm 1$$
satisfy the quantum mechanics. To do this, compute the matrix components
of $\dot x_{nm}$, and find the matrix of the energy expression $(m/2)(\dot x^2 + 4\pi^2\nu^2 x^2)$,
computing the matrices of $\dot x^2$ and $x^2$ by the multiplication rule. Show that
this matrix is diagonal, its diagonal components being the energy values
given above.
6. By comparing with the wave functions of the linear oscillator in Chap.
XXIV, Prob. 6, verify that the values of matrix components in Prob. 5
are correct. If you cannot give a general proof, take the actual wave func-
tions you have worked out, in Prob. 6, Chap. XXIX, using them for n = 0,
1, 2, normalizing, and calculating the matrix components by direct
integration.
7. Show that a linear oscillator radiating from the nth stationary state
cannot jump except to the (n − 1)st state, so that there is a selection
principle on its radiation. Compute the rate of radiation of the oscillator
in the nth state, on the assumption that it is the same as that of a classical
oscillator whose charge is e, displacement is
$$x_{n,n-1}\,e^{\frac{2\pi i}{h}(E_n - E_{n-1})t} + x_{n-1,n}\,e^{\frac{2\pi i}{h}(E_{n-1} - E_n)t}.$$
Compare this displacement with the displacement of
a classical oscillator of energy $E_n$, showing that in the limit of large quantum
numbers both amplitude and frequency of the classical oscillator agree
with the quantum values. This is an example of the correspondence
principle.
8. Solve Schrodinger's equation for a rotator, whose kinetic energy is
$\tfrac{1}{2}I\dot\theta^2$, in the absence of an external force. Find wave functions, showing
that the angular momentum is an integral multiple of $h/2\pi$. Compute the
matrix of $R\cos\theta$, one component of displacement of a point attached to the
rotator at a distance R from the axis. Show that all matrix components are
zero except those in which the angular momentum changes by ±1 unit.
9. Find what $p^2 q - qp^2$ is equal to, using the commutation rule for $pq - qp$.
10. Show that $e^{\frac{2\pi i}{h}\alpha p}u(x)$, where p is the x component of momentum, $\alpha$ is
a constant, is equal to $u(x + \alpha)$. Use Taylor's expansion of the exponential
operator.
11. Write down Schrodinger's equation in spherical polar coordinates,
by using the Laplacian in these coordinates, assuming a potential V(r).
Discuss the method of deriving the equation from the Hamiltonian by
replacing the momenta by differentiations, showing that the former method
is consistent with the latter, but that the latter method does not lead to
unique results.
CHAPTER XXXII
PERTURBATION THEORY
There are many problems in wave mechanics which, though
they cannot be exactly solved, are approximated by soluble
problems. Thus a nonlinear oscillator can be approximated
by a linear one; or a system, as an atom, in an external electric
or magnetic field can be approximated by the same system
without the field. The perturbation theory is adapted to the
solution of such problems, starting with the known approximate
solution, and expanding in power series in the perturbation.
At the same time, there are some problems of more general
nature treated by perturbation theory. Thus the radiation
of an atom can be examined by treating the interaction of the
atom and a radiation field as a perturbation. We shall be led
by such questions to a discussion of the transitions between
stationary states. The actual method we shall use is closely
analogous to the perturbation theory used with the nonuniform
vibrating string.
230. The Secular Equation of Perturbation Theory. — Suppose
that we wish to solve Schrodinger's equation $Hu_n = E_n u_n$,
where H is the given Hamiltonian. Let us start with a set of
orthogonal functions $u_n^0$, which often are solutions of a similar
problem approximating the real one, and let us expand the
correct functions $u_n$ in series in the $u_n^0$'s:
$$u_n = \sum_m S_{mn}u_m^0. \qquad (1)$$
Then the problem may be regarded as that of finding the expansion
coefficients $S_{mn}$, which are really coefficients of a linear
transformation in function space transforming from the original
set of orthogonal functions to the final, correct, ones, so that we
may expect the S's to satisfy orthogonality and normalization
conditions. We substitute this expression for $u_n$ in Schrodinger's
equation, and get the condition for the coefficients. If we
substitute, multiply by $\bar u_k^0$, and integrate, we shall have only
one term on the right, on account of orthogonality of the $u^0$'s;
on the left, we shall have a linear combination, each term involving
a matrix component of H with respect to the $u^0$'s, for example
$H_{km} = \int \bar u_k^0 H u_m^0\,dv$. We recall that, since the $u^0$'s are not
solutions of the problem, this matrix will not be diagonal.
Carrying out the substitution, we have
$$\sum_m (H_{km} - E_n\delta_{km})S_{mn} = 0, \qquad (2)$$
or an infinite set of equations for the infinite set of $S_{mn}$'s. Writing
them for the nth stationary state, we have
$$\begin{aligned}(H_{11} - E_n)S_{1n} + H_{12}S_{2n} + H_{13}S_{3n} + \cdots &= 0 \quad (k = 1)\\ H_{21}S_{1n} + (H_{22} - E_n)S_{2n} + H_{23}S_{3n} + \cdots &= 0 \quad (k = 2)\end{aligned} \qquad (3)$$
These equations are all homogeneous, of the same sort found
whenever we have introduced normal coordinates or rotated
axes, as, for example, in discussing coupled systems or the vibrating
string. As usual, the equations in general do not have a
solution; they have one only if the determinant of coefficients
$$\begin{vmatrix} H_{11} - E_n & H_{12} & H_{13} & \cdots \\ H_{21} & H_{22} - E_n & H_{23} & \cdots \\ \cdots & & & \end{vmatrix} \qquad (4)$$
is zero. This secular equation determines the energy levels.
231. The Power Series Solution. — If the $u^0$'s were solutions
of the problem, H would have a diagonal matrix, the diagonal
terms being the energy levels. Though this is not true, let us
assume that the $u^0$'s are not far from solutions. Then by arguments
of continuity the nondiagonal terms of H, though not
zero, are small, and the diagonal terms, though not exactly
the energy values $E_n$ of the exact solution, are not far from the
correct values. Thus $E_n$ is approximately $H_{nn}$. We assume
the problem is nondegenerate, by which we mean that only one
state has even approximately this same value. Now let us
recall how to expand a determinant. We take products of
terms, choosing just one from each row, one from each column.
There are N! ways of doing this, if the determinant has N rows
and columns. We give each a sign + or − according to its
requiring an even or an odd number of interchanges of rows
or columns to bring the desired term to the principal diagonal.
Finally we add. In this case, since we are dealing with small
quantities, we look first for the largest product. This is plainly
the principal diagonal, for the only large terms are those on the
principal diagonal. For a first approximation we may set this
equal to zero. It is already factored: $(H_{11} - E_n)(H_{22} - E_n) \cdots = 0$.
One of the factors must be, to this approximation, zero.
Plainly it must be $H_{nn} - E_n$, since this is the only
term which is even small, assuming the system is nondegenerate.
This then is the first order approximation to the energy: $E_n = H_{nn}$,
the diagonal component of the matrix of the energy with
respect to the approximate wave function.
Using the first-order approximation to the energy, we can
easily get the corresponding linear transformation and wave
functions. If the $u^0$'s were the correct wave functions, we should
have $S_{nn} = 1$, all the other S's = 0. To a first approximation,
in the actual case, we may set $S_{nn} = 1$, but regard the other
S's as small quantities. Then we have, for example, for the
first equation
$$(H_{11} - H_{nn})S_{1n} + \cdots + H_{1n} + \cdots = 0,$$
where the terms we do not write are of a smaller order than
those we write. Hence
$$S_{1n} = -\frac{H_{1n}}{H_{11} - H_{nn}}. \qquad (5)$$
The other equations are of the same form, so that the approximate
wave function is
$$u_n = u_n^0 - \sum_{k \neq n}\frac{H_{kn}}{H_{kk} - H_{nn}}\,u_k^0. \qquad (6)$$
For the second approximation to the energy, we must consider
further terms in the determinant. We can proceed by analogy
with the case of a determinant of two rows and two columns,
which we should have if there were only two stationary states
to consider. In this case the secular equation would be
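$$\begin{vmatrix} H_{11} - E_n & H_{12} \\ H_{21} & H_{22} - E_n \end{vmatrix} = (H_{11} - E_n)(H_{22} - E_n) - H_{12}H_{21} = 0. \qquad (7)$$
This is only a quadratic for $E_n$, and can be immediately solved
explicitly:
$$E_n = \frac{H_{11} + H_{22}}{2} \pm \sqrt{\left(\frac{H_{11} + H_{22}}{2}\right)^2 - (H_{11}H_{22} - H_{12}H_{21})} = \frac{H_{11} + H_{22}}{2} \pm \sqrt{\left(\frac{H_{11} - H_{22}}{2}\right)^2 + H_{12}H_{21}}. \qquad (8)$$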
This explicit formula is analogous to the formula for the
frequency of a system of two coupled oscillators, obtained in Eq. 4,
Chap. XI. Here as there, if the nondiagonal matrix component
$H_{12}$ of the energy is small, we can expand the radical by the
binomial theorem, obtaining without trouble the two
solutions as power series in $H_{12}$,
$$E_1 = H_{11} + \frac{H_{12}H_{21}}{H_{11} - H_{22}} + \cdots, \qquad E_2 = H_{22} + \frac{H_{12}H_{21}}{H_{22} - H_{11}} + \cdots, \tag{9}$$
analogous to Eq. (5) of Chap. XI. Here, as there, the effect
of the second-order perturbation terms is to push apart the two
levels. Thus the first-order calculation alone gives $E_1 = H_{11}$.
The numerator of the fraction giving the second-order calculation,
$H_{12}H_{21}$, is really a perfect square, for it can be shown that
$H_{21} = H_{12}$ (similar theorems hold in general for all the matrices of
real quantities which we meet). Thus the numerator is positive.
If $H_{11}$ is greater than $H_{22}$, so that the first-order level 1 lies above
2, the denominator is also positive, so that the level is still
further raised by this perturbation. On the other hand, for the
other level, the denominator is negative, and the level is further
depressed.
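As a numerical illustration of Eqs. (8) and (9), the following minimal sketch compares the exact roots of the two-level secular equation with the second-order series. The matrix components (2.0, 1.0, and 0.1, in arbitrary energy units) and the function names are assumptions chosen only for illustration, and $H_{12} = H_{21}$ is taken to be real.

```python
import math

def exact_levels(h11, h22, h12):
    """Exact roots of the 2x2 secular equation, Eq. (8), for real H12 = H21."""
    mean = 0.5 * (h11 + h22)
    root = math.sqrt((0.5 * (h11 - h22)) ** 2 + h12 * h12)
    return mean + root, mean - root   # upper root, lower root

def second_order_levels(h11, h22, h12):
    """Power series of Eq. (9), kept through the second-order term."""
    e1 = h11 + h12 * h12 / (h11 - h22)
    e2 = h22 + h12 * h12 / (h22 - h11)
    return e1, e2

# Illustrative numbers, with H11 above H22 so that level 1 is the upper one.
h11, h22, h12 = 2.0, 1.0, 0.1
upper, lower = exact_levels(h11, h22, h12)
e1, e2 = second_order_levels(h11, h22, h12)
print("upper level: exact", upper, "series", e1)
print("lower level: exact", lower, "series", e2)
```

With these numbers the series pushes the upper level up and the lower level down, and each series value differs from the exact root by about $10^{-4}$, the size expected of the neglected fourth-order term.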
The exact solution which we have obtained in Eq. (8) is only
possible when the secular equation is simple enough to handle
algebraically. The approximations (9), however, can be found
directly from the secular equation (7). Thus let us consider
$E_1$. We assume that the equation is not degenerate, so that
$H_{22} - E_1$ is not a small quantity, and we may divide Eq. (7)
by it. Thus we have
$$H_{11} - E_1 = \frac{H_{12}H_{21}}{H_{22} - E_1}.$$
Replacing the $E_1$ in the denominator by its value $H_{11}$, which is
correct to the first order, this becomes
$$E_1 = H_{11} + \frac{H_{12}H_{21}}{H_{11} - H_{22}},$$
agreeing with Eq. (9). By a little consideration of the
determinant, exactly a similar discussion can be given in the general
case. And the result proves to be simply
$$E_n = H_{nn} - \sum_{k \neq n} \frac{H_{nk}H_{kn}}{H_{kk} - H_{nn}}, \tag{10}$$
in agreement with the special case solved above. It is very
rarely that further approximations than we have given are used,
for either the energy or the wave function.
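The general second-order formula, Eq. (10), can be tried on a small example. The sketch below is only an illustration under stated assumptions: the 3 x 3 matrix is invented, it is real and symmetric, and its diagonal terms are well separated. Rather than computing exact roots, it evaluates the secular determinant at the first-order and at the second-order energies, which should be much closer to zero for the latter.

```python
def second_order_energies(h):
    """Eq. (10): E_n = H_nn - sum over k != n of H_nk H_kn / (H_kk - H_nn)."""
    size = len(h)
    energies = []
    for n in range(size):
        shift = sum(h[n][k] * h[k][n] / (h[k][k] - h[n][n])
                    for k in range(size) if k != n)
        energies.append(h[n][n] - shift)
    return energies

def secular_determinant(h, energy):
    """det(H - E) for a 3x3 matrix, by cofactor expansion along the first row."""
    a = [[h[i][j] - (energy if i == j else 0.0) for j in range(3)] for i in range(3)]
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

# Hypothetical energy matrix (arbitrary units), nondegenerate.
H = [[1.00, 0.05, 0.02],
     [0.05, 2.00, 0.03],
     [0.02, 0.03, 3.50]]

approx = second_order_energies(H)
for n, e2 in enumerate(approx):
    first = secular_determinant(H, H[n][n])   # residual at the first-order energy
    second = secular_determinant(H, e2)       # residual at the second-order energy
    print(n + 1, H[n][n], e2, first, second)
```

A residual of the secular determinant is used here only as a convenient check that needs no eigenvalue routine; it is not part of the method described in the text.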
232. Perturbation Theory for Degenerate Systems. — We shall
often meet cases in which the unperturbed problem is degenerate;
that is, where the diagonal energies $H_{nn}$ of several states are
almost exactly equal to each other. In this case, the power
series method evidently does not work; the differences of energy
which appear in the denominator of the terms in Eq. (9) or (10)
become zero, or very small, and the series diverge and even have
infinite terms. If there were only two levels, as in the special
case taken up in the last section, we could solve the problem
explicitly, not using the power series at all. Thus if $H_{11} - H_{22} = 0$,
Eq. (8) gives
$$E = H_{11} \pm H_{12}, \tag{11}$$
an important formula for perturbations of degenerate systems.
With a finite number of degenerate levels, we have a secular
equation of finite degree, and while we cannot solve it as con-
veniently as the quadratic, still we can approximate its solutions,
even for the degenerate case where the differences of diagonal
energies are smaller than the nondiagonal energy terms. Now
it fortunately happens that in many problems in which degen-
eracy enters, as in atomic spectra, the levels fall into groups,
the energies of all the levels in a group being about the same,
but the different groups being well separated in energy. Such
groups of levels are the multiplets in atomic spectra. In these
cases we first solve the problem of the levels within a group,
finding an exact solution for the finite secular equation. This
solution gives us not only energy levels, but also coefficients
of linear combinations transforming the original wave functions
of the group into a new set which has the property that it makes
the matrix components of the energy diagonal, with respect
to the states of this group. We then use these transformed
functions as the starting point of a new perturbation calculation,
in which perturbations between adjacent groups are considered.
In terms of these transformed functions, the energy will have no
nondiagonal components between levels which lie close to each
other, in the groups, but only between levels in different groups,
at a considerable energy distance apart. Thus we may use
the series method of Eq. (10), and the second-order terms will
be small, since the only terms of the summation for which the
denominator is small will have numerators equal to zero. It
is to be particularly noted in this discussion that the difficulty
in applying the power series method to degenerate systems arises,
not on account of any unusual size of the nondiagonal energy
components, but on account of the unusually small energy differ-
ences between diagonal terms. The method converges only
if the nondiagonal component between any two levels is small
compared with the difference of diagonal energies of the two
terms. This demands that before applying the power series
method the nondiagonal terms between degenerate levels be
removed, but it imposes no such requirement on the terms
between levels of quite different energy.
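The two-step procedure just described can be sketched numerically. In the example below, which rests on invented numbers, levels 1 and 2 form a degenerate group and level 3 lies far away. The group is first solved exactly by a rotation of its two wave functions, and only then is the series of Eq. (10) applied. The matrix, the function names, and the use of a real rotation angle are assumptions for illustration.

```python
import math

def rotate_group(h, theta):
    """Return R^T H R, where R mixes levels 1 and 2 (indices 0, 1) by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    r = [[c, -s, 0.0],
         [s, c, 0.0],
         [0.0, 0.0, 1.0]]
    hr = [[sum(h[i][k] * r[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    return [[sum(r[k][i] * hr[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def second_order_energies(h):
    """Eq. (10) applied to the matrix h."""
    size = len(h)
    return [h[n][n] - sum(h[n][k] * h[k][n] / (h[k][k] - h[n][n])
                          for k in range(size) if k != n)
            for n in range(size)]

# Hypothetical matrix: H11 = H22, so Eq. (10) applied directly would divide by zero.
H = [[1.00, 0.20, 0.05],
     [0.20, 1.00, 0.03],
     [0.05, 0.03, 3.00]]

# Step 1: solve the 2x2 group exactly; this angle removes the component H'_12.
theta = 0.5 * math.atan2(2.0 * H[0][1], H[0][0] - H[1][1])
H_group = rotate_group(H, theta)

# Step 2: apply the series to the transformed matrix, whose only nondiagonal
# components connect levels far apart in energy.
energies = second_order_energies(H_group)
print(H_group[0][0], H_group[1][1], H_group[0][1])
print(energies)
```

After the rotation the group energies are $H_{11} \pm H_{12}$, as in Eq. (11), and the remaining second-order shifts are small because every surviving nondiagonal term has a large energy denominator.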
We can see more clearly what is happening from a mechanical
analogy. Suppose we have a large number of mechanical
oscillators coupled together, all having different natural fre-
quencies, except the first two, which have unperturbed fre-
quencies exactly, or almost exactly, the same. In considering
the interaction, the effect of the two of equal frequency on each
other will be large, since each one resonates with the other,
but the others will have much less effect. We, therefore, first
solve only the interaction of these two resonating oscillators,
introducing normal coordinates for them. Then we can proceed
with the discussion of the interaction, treating the effect of the
other oscillators, not on these two oscillators individually, but
on the two normal coordinates representing them. Of course,
if there are several groups of degenerate levels, we introduce
changes of variables inside each group first, then apply the
ordinary perturbation theory. We shall have many examples
of degenerate systems in our discussion of atomic structure,
where nearly every energy level of an unperturbed atom is
degenerate, and is split up by an external perturbing field,
as an electric or magnetic field. In more complicated atoms, the
perturbing fields come from within the atom itself, being inter-
actions of one part on another, producing the multiplet structure.
In actual practice, we shall find the study of degenerate systems
very important.
233. The Method of Variation of Constants. — A slightly
different point of view in perturbations is obtained by considering
the time variation. Let us expand $\psi$, the correct wave
function depending on time, in series in the unperturbed functions
$u^0$: $\psi = \sum_m C_m(t)\,u_m^0(x)$, where the $C$'s, functions of time,
would be pure exponentials, $c_n e^{-(2\pi i/h)E_n t}$, if the $u^0$'s were the correct
solutions of the problem. Whether correct or not, we can always
make the expansion above, for at any instant $\psi$ can be expressed
in series in the orthogonal functions $u^0$, the coefficients being
functions of time. Now let us try to satisfy the equation
$H\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}$. We have
$$\sum_m C_m H u_m^0 = -\frac{h}{2\pi i}\sum_m \frac{dC_m}{dt}\,u_m^0.$$
Multiplying by $\bar{u}_k^0$ and integrating, the result is
$$\frac{dC_k}{dt} = -\frac{2\pi i}{h}\sum_m H_{km}C_m. \tag{12}$$
These equations for the time derivatives of the $C$'s in terms of
their instantaneous values are enough to determine the complete
solution of the problem.
To make connection with the ordinary method, we need only
assume $C_m = S_{mn}\,e^{-(2\pi i/h)E_n t}$, an exponential solution. Then
immediately we have, canceling the exponential, and the factor
$-2\pi i/h$,
$$E_n S_{kn} = \sum_m H_{km}S_{mn},$$
or exactly the equation we have previously used. In more
general cases, however, it is not always possible to make this
assumption. An example is that in which the perturbative
force depends on the time.
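Equation (12) can be integrated step by step when no exponential solution is assumed. The sketch below does this for two degenerate levels, in units where $h/2\pi = 1$, using a fourth-order Runge-Kutta step. The choice of integrator, the numbers, and the function names are assumptions made for illustration and are not taken from the text.

```python
import math

def evolve(h_matrix, c0, t_end, n_steps=2000):
    """Integrate dC_k/dt = -i * sum_m H_km C_m (Eq. 12 with h/(2*pi) = 1) by RK4."""
    size = len(c0)

    def rhs(c):
        return [-1j * sum(h_matrix[k][m] * c[m] for m in range(size))
                for k in range(size)]

    c = [complex(x) for x in c0]
    dt = t_end / n_steps
    for _ in range(n_steps):
        k1 = rhs(c)
        k2 = rhs([x + 0.5 * dt * k for x, k in zip(c, k1)])
        k3 = rhs([x + 0.5 * dt * k for x, k in zip(c, k2)])
        k4 = rhs([x + dt * k for x, k in zip(c, k3)])
        c = [x + dt / 6.0 * (a + 2.0 * b + 2.0 * d + e)
             for x, a, b, d, e in zip(c, k1, k2, k3, k4)]
    return c

# Two degenerate levels of energy E0 coupled by a nondiagonal component V.
E0, V = 1.0, 0.1
H = [[E0, V], [V, E0]]
t_end = math.pi / (2.0 * V)          # a quarter of the beat period
C = evolve(H, [1.0, 0.0], t_end)     # start with the system entirely in state 1
prob = [abs(x) ** 2 for x in C]
print(prob, sum(prob))
```

For this degenerate pair the probability of the second state should follow $\sin^2 Vt$, so at the chosen time the system is found almost entirely in state 2, while the sum of the probabilities stays equal to unity.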
234. External Radiation Field. — The most interesting example
of the method of variation of constants is the perturbation by an
external radiation field, for this actually produces transitions
between stationary states. First let us look a moment at the
physical side of the problem, so as to understand what we expect
to obtain from the calculations. An ordinary radiation field is
never exactly sinusoidal; its amplitude, at a given point of space,
as function of time, may be analyzed in Fourier series of very
long period, as in Sec. 185, Chap. XXV. If the field is approximately
monochromatic of frequency $\nu$, that means that only
frequencies in the neighborhood of $\nu$ will have large amplitudes
in the Fourier representation. On the other hand, if it is
continuous radiation, as the radiation from hot solids, there will be
considerable amplitude in all frequencies, at least over a certain
region. We assume the latter case. The electric field in the $x$
direction at a given point will then be $\sum_\nu E_\nu \cos 2\pi(\nu t - \alpha_\nu)$,
where $E_\nu$, $\alpha_\nu$ are amplitude and phase of the component of
frequency $\nu$, and where we have components of frequencies differing
by small increments $d\nu = 1/T$, where $T$ is the fundamental
period. The phases $\alpha_\nu$ of successive components may be treated
as being statistically independent of each other; that is, if we
take any two components, the chance that the phase angle
between them at any instant should have one value is just equal
to the chance that it have another value. The values of $E_\nu$ will
be treated as functions of $\nu$, though a somewhat more general
treatment subjects them to probability laws too. Now we are
interested in finding $\rho_\nu\,d\nu$, the energy per unit volume in the
frequency range $d\nu$. Since one component of the series is
associated with the range $d\nu = 1/T$, we can simply find the energy of
this component. For the $x$ component of electric field, this is
$\frac{1}{8\pi}\left[2E_\nu^2 \cos^2 2\pi(\nu t - \alpha_\nu)\right]$, the factor 2 taking account of the
magnetic field as well as the electric field. The time average of this
term is $E_\nu^2/(8\pi)$. If we are dealing with radiation having equal
intensities in all directions, the mean energy per unit volume
associated with $x$, $y$, and $z$ coordinates will be equal. Hence we
have
$$\rho_\nu\,d\nu = \frac{3}{8\pi}E_\nu^2. \tag{13}$$
235. Einstein's Probability Coefficients. — Now suppose a
radiation field of the type we have described is allowed to act on
an atomic system. Einstein was the first to solve this problem.
He assumed that, if the atom is in its $m$th state, there will be the
following probabilities of transition to other states, induced by
the radiation field:
1. A probability $A_{mn}$ of radiating spontaneously to each state
$n$ which is of lower energy than the $m$th, with emission of the
corresponding photon of frequency $\nu_{mn}$, given by $E_m - E_n = h\nu_{mn}$.
This spontaneous emission corresponds to the ordinary
radiation of an oscillating dipole in classical electromagnetic
theory.
2. A probability $B_{mn}\rho_{mn}$ of absorbing a photon of frequency
$\nu_{mn}$ from the radiation field, where now the state $n$ has higher
energy than $m$, and of jumping up to the state $n$. This probability
is proportional to the energy density $\rho_{mn}$ at the particular
frequency $\nu_{mn}$ in the external radiation field.
3. A probability $B_{mn}\rho_{mn}$, where now the $n$th state lies below the
$m$th, of emitting a photon of frequency $\nu_{mn}$, and falling to the
lower state, under action of the radiation. This is called induced
or forced emission.
Einstein assumed that the following relations held between
the $A$'s and $B$'s corresponding to any transition between $m$ and $n$, where
$E_m > E_n$: $B_{mn} = B_{nm}$, and $A_{mn}/B_{mn} = 8\pi h\nu_{mn}^3/c^3$. Assuming
these simple laws, he could then give a very elementary derivation
of Planck's law of black-body radiation. Let us assume that
we have a piece of matter containing many kinds of atoms, so
as to have some capable of emitting and absorbing each
frequency. Consider a particular set of atoms having a lower
state 1, an upper state 2, and assume that at temperature $T$ the
number of atoms in the upper state is to the number in the lower
state as $e^{-E_2/kT}$ is to $e^{-E_1/kT}$, or the Maxwell-Boltzmann distribution
law. Now we ask, what intensity, or energy density, in
the external radiation field must we have to be in equilibrium
with these atoms? If we can find this for each frequency of
radiation, we shall necessarily have the distribution of intensity
in radiation in equilibrium with matter at temperature $T$, which
is what Planck's law gives. Let $N_2$ be the number of atoms in
the second state, $N_1$ in the first, so that $N_2/N_1 = e^{-(E_2 - E_1)/kT} = e^{-h\nu/kT}$,
where $\nu$ is the frequency emitted or absorbed by the atom
in its transition. Now we know that the number of atoms leaving
the second state per second is equal to the sum of the following:
1. The number leaving on account of spontaneous radiation,
or $N_2A_{21}$.
2. The number entering on account of absorption from the
lower state, or $-N_1B_{12}\rho_{12}$.
3. The number leaving on account of induced emission, or
$N_2B_{21}\rho_{12}$.
This sum must be zero, in a steady state where the
$N$'s are constant. Hence
$$N_2(A_{21} + B_{21}\rho_{12}) = N_1B_{12}\rho_{12}.$$
Using the relation between the $A$'s and $B$'s, this is
$$N_2B_{12}\left(\rho_{12} + \frac{8\pi h\nu^3}{c^3}\right) = N_1B_{12}\rho_{12}.$$
Setting $N_2/N_1 = e^{-h\nu/kT}$, canceling $B_{12}$, and solving for $\rho_{12}$, we
have
$$\rho_{12} = \frac{8\pi h\nu^3}{c^3}\,\frac{1}{e^{h\nu/kT} - 1}, \tag{14}$$
which is Planck's law of black-body radiation.
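The balance argument can be checked numerically: with the energy density of Eq. (14) and populations in the Maxwell-Boltzmann ratio, the number of atoms leaving the upper state should equal the number entering it. The sketch below assumes Gaussian (cgs) units, an arbitrary value of $B$, and an illustrative frequency and temperature; the function names are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-27   # erg s
K_BOLTZ = 1.380649e-16      # erg / K
C_LIGHT = 2.99792458e10     # cm / s

def planck_density(nu, temperature):
    """Energy density per unit frequency range, Eq. (14)."""
    x = H_PLANCK * nu / (K_BOLTZ * temperature)
    return 8.0 * math.pi * H_PLANCK * nu ** 3 / C_LIGHT ** 3 / math.expm1(x)

def net_rate_out_of_upper(nu, temperature, b_coeff=1.0, n_lower=1.0):
    """Atoms leaving state 2 minus atoms entering it, per second."""
    a_coeff = 8.0 * math.pi * H_PLANCK * nu ** 3 / C_LIGHT ** 3 * b_coeff
    n_upper = n_lower * math.exp(-H_PLANCK * nu / (K_BOLTZ * temperature))
    rho = planck_density(nu, temperature)
    leaving = n_upper * (a_coeff + b_coeff * rho)   # spontaneous plus induced emission
    entering = n_lower * b_coeff * rho              # absorption from the lower state
    return leaving - entering, entering

# Illustrative values: visible light and a hot solid.
net, entering = net_rate_out_of_upper(5.0e14, 5000.0)
print(net, entering)
```

The net rate vanishes to rounding error for any choice of $B$, since $B$ cancels from the balance exactly as in the derivation above.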
236. Method of Deriving the Probability Coefficients.— Ein-
stein's coefficient A is often derived by analogy with classical
theory as follows: In Chap. XXXI we have seen that the matrix
components of electric moment are connected with probabilities
of radiation. Thus, if the amplitude of the component of
moment of the atom corresponding to the transition $2 \to 1$ is $C$,
the corresponding classical rate of radiation is $\frac{16\pi^4\nu^4C^2}{3c^3}$. We can
write this component in terms of the matrices as follows:
corresponding to this frequency, we have the terms
$(ex)_{12}\,e^{(2\pi i/h)(E_1 - E_2)t} + (ex)_{21}\,e^{(2\pi i/h)(E_2 - E_1)t} = 2(ex)_{12}\cos 2\pi\nu t$, where $h\nu = E_2 - E_1$. Thus
$C = 2(ex)_{12}$, and the rate of radiation is $\frac{64\pi^4\nu^4(ex)_{12}^2}{3c^3}$. But an
atom with a probability $A_{21}$ of radiating a photon of energy $h\nu$
is radiating on the average at the rate of $A_{21}h\nu$ per second. Hence
we must set this equal to the rate of radiation above, giving
$$A_{21} = \frac{64\pi^4\nu^3(ex)_{12}^2}{3hc^3}, \qquad B_{21} = A_{21}\frac{c^3}{8\pi h\nu^3} = \frac{8\pi^3(ex)_{12}^2}{3h^2}. \tag{15}$$
The argument given above is hardly a derivation; it is merely
suggestive. To get a real derivation of the probabilities, we use
the method of perturbations. We shall find, for a reason to be
discussed in a later section, that we can only obtain the $B$'s by
this method. We shall assume that at $t = 0$ the atom is definitely
in the $m$th state; that is, $c_m^0 = 1$, all other $c^0$'s are zero, where
the $c$'s are the coefficients in the expansion of the wave function
$\psi$ in terms of the unperturbed stationary states, so that $\bar{c}_nc_n$ is the
probability of finding the system in the $n$th state, and the $c^0$'s
are the values when $t = 0$. Then we shall investigate the time
variation of the $c$'s by the method of variation of constants,
and it will appear that the $\bar{c}_nc_n$'s for $n$ different from $m$ increase
linearly with time, so long as we consider only small intervals
of time and small perturbations, the term $\bar{c}_mc_m$ correspondingly
decreasing. This we interpret as a definite probability that the
system will leave the $m$th state and go to the $n$th; in fact we
shall find $\bar{c}_nc_n$ equal exactly to $B_{mn}\rho_{mn}t$, as far as the variation
is linear with time. By comparing this expression with the
derived values of $\bar{c}_nc_n$, we can evaluate the $B$'s directly from
perturbation theory.
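To give a sense of scale for the coefficients of Eq. (15), the sketch below evaluates $A_{21}$ and $B_{21}$ in Gaussian units. The dipole matrix component (the electronic charge times a length of half an angstrom) and the ultraviolet frequency are assumptions chosen only as plausible atomic magnitudes; they do not describe any particular transition.

```python
import math

H_PLANCK = 6.62607015e-27   # erg s
C_LIGHT = 2.99792458e10     # cm / s

def einstein_a(nu, dipole):
    """A_21 = 64 pi^4 nu^3 (ex)_12^2 / (3 h c^3), from Eq. (15)."""
    return 64.0 * math.pi ** 4 * nu ** 3 * dipole ** 2 / (3.0 * H_PLANCK * C_LIGHT ** 3)

def einstein_b(dipole):
    """B_21 = 8 pi^3 (ex)_12^2 / (3 h^2), from Eq. (15)."""
    return 8.0 * math.pi ** 3 * dipole ** 2 / (3.0 * H_PLANCK ** 2)

# Illustrative assumptions, not data for a real spectral line.
dipole = 4.803e-10 * 0.5e-8   # esu cm: electronic charge times 0.5e-8 cm
nu = 2.5e15                   # 1/s, an ultraviolet frequency
a21 = einstein_a(nu, dipole)
b21 = einstein_b(dipole)
print(a21, b21, a21 / b21)
```

With these assumed magnitudes $A_{21}$ comes out of the order of $10^9$ per second, and the ratio $A_{21}/B_{21}$ reproduces $8\pi h\nu^3/c^3$, as Einstein's relation requires.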
237. Application of Perturbation Theory. — Let the Hamiltonian
of the system without radiation be $H^0$, and assume that the
unperturbed problem can be solved exactly:
$$H^0u_n^0 = H_{nn}^0u_n^0.$$
Let the perturbed Hamiltonian be $H^0 - ex\sum_\nu E_\nu\cos 2\pi(\nu t - \alpha_\nu)$,
the second term representing the potential of the force of the
field represented by the summation, on the charge $e$. Under
the action of the perturbation, let the perturbed wave function
be $\psi = \sum_m C_m(t)\,u_m^0(x)$. Our task now is to find the $C$'s. Using
the method of variation of constants, noting that $H^0$ has a
diagonal matrix, we have
$$\frac{dC_n}{dt} = -\frac{2\pi i}{h}\sum_k H_{nk}C_k = -\frac{2\pi i}{h}H_{nn}^0C_n + \frac{2\pi i}{h}\sum_k (ex)_{nk}C_k\sum_\nu E_\nu\cos 2\pi(\nu t - \alpha_\nu).$$
Now let $C_n = c_n(t)\,e^{-(2\pi i/h)H_{nn}^0t}$, where $c_n(t)$ would be constant in
the absence of an external field. Writing the field in exponential
form, and letting $H_{nn}^0 - H_{kk}^0 = h\nu_{nk}$, this gives
$$\frac{dc_n}{dt} = \frac{\pi i}{h}\sum_k (ex)_{nk}\,c_k\sum_\nu E_\nu\left[e^{2\pi i[(\nu + \nu_{nk})t - \alpha_\nu]} + e^{-2\pi i[(\nu - \nu_{nk})t - \alpha_\nu]}\right].$$
If the external field were not present, we plainly would have
$dc_n/dt = 0$; if there is a small field, the time derivative will be
small, or, in other words, the $c$'s will be approximately constant.
To a first approximation we may assume on the right side that
the $c$'s are exactly constant, having the values $c^0$ which they had
at $t = 0$. If this is so, we may integrate directly, obtaining
$$c_n - c_n^0 = \sum_k \frac{(ex)_{nk}\,c_k^0}{2h}\sum_\nu E_\nu\left[e^{-2\pi i\alpha_\nu}\,\frac{e^{2\pi i(\nu + \nu_{nk})t} - 1}{\nu + \nu_{nk}} - e^{2\pi i\alpha_\nu}\,\frac{e^{-2\pi i(\nu - \nu_{nk})t} - 1}{\nu - \nu_{nk}}\right]. \tag{16}$$
Now let us take the case we have discussed, where at $t = 0$
we have $c_m^0 = 1$, all the other $c^0$'s zero. Then for any $n \neq m$, we
have only the single term of the summation above for which
$k = m$. Next we find $\bar{c}_nc_n$. In this, we have a product of two
sums over $\nu$, which is, therefore, a double sum. Each such
term for which we have different frequencies in the two factors
has a factor $e^{-2\pi i(\alpha_\nu - \alpha_{\nu'})}$, which, on account of the random nature
of the phases, is as likely to be positive as negative, and on the
average cancels. Thus we are left with only the squares of the
individual terms, in which the $\alpha$'s drop out. Further, each of
these squares has terms whose denominators are respectively
$(\nu + \nu_{nm})^2$, $(\nu + \nu_{nm})(\nu - \nu_{nm})$, and $(\nu - \nu_{nm})^2$. The frequency
$\nu_{nm}$ is so defined that it is positive if the $n$th state lies above the
$m$th, which we assume to be the case for the moment. When
$\nu$ becomes nearly equal to $\nu_{nm}$, the term $(\nu - \nu_{nm})^2$ is very small,
the term with this as denominator very large. Since $\nu$ is always
positive, it is not possible for the other terms, involving $\nu + \nu_{nm}$
in the denominator, to become so large. To an approximation,
then, we neglect all terms except the last, obtaining
$$\bar{c}_nc_n = \frac{(ex)_{nm}^2}{4h^2}\sum_\nu E_\nu^2\,\frac{\left|e^{-2\pi i(\nu - \nu_{nm})t} - 1\right|^2}{(\nu - \nu_{nm})^2} = \frac{(ex)_{nm}^2}{2h^2}\sum_\nu E_\nu^2\,\frac{1 - \cos 2\pi(\nu - \nu_{nm})t}{(\nu - \nu_{nm})^2} = \frac{(ex)_{nm}^2}{h^2}\sum_\nu E_\nu^2\,\frac{\sin^2\pi(\nu - \nu_{nm})t}{(\nu - \nu_{nm})^2}. \tag{17}$$
The formula we have just derived is decidedly significant.
It gives essentially the probability that the system will go, in
time $t$, from state $m$ to state $n$, under the action of the radiation.
For a particular frequency $\nu$, this probability is seen to be
proportional to $E_\nu^2$, that is, to the intensity of the incident radiation;
and to $(ex)_{nm}^2$, the square of the matrix component of the electric moment
connected with this particular transition, which we should expect.
But in addition, there is a dependence on frequency. If we plot
$\bar{c}_nc_n$, at time $t$, against $\nu$, the impressed frequency, we get a narrow
peak with small side bands, centering at $\nu_{nm}$, just like the pattern
found in Fraunhofer diffraction. Thus, if the impressed
frequency is close to the absorption frequency $\nu_{nm}$, there will be a
large probability of transition, while if it is farther away, the
probability will be smaller. If the perturbation acts only for a
small time, the band will be broad, indicating that many
frequencies can cause the transition, but if the time is long enough,
practically only the frequency $\nu_{nm}$ can cause the transition; the
absorption curve of the substance, in other words, will have
sharp absorption lines corresponding to the various transitions
from the state $m$ to other states $n$, as calculated by the quantum
theory.
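The frequency factor $\sin^2\pi(\nu - \nu_{nm})t/(\nu - \nu_{nm})^2$ of Eq. (17) can be examined directly. The sketch below, with hypothetical function names and with time and frequency in arbitrary reciprocal units, evaluates the factor at the line center and at its first zero, and locates the offset at which it falls to half its peak value.

```python
import math

def line_shape(offset, t):
    """sin^2(pi * offset * t) / offset^2, where offset = nu - nu_nm."""
    if offset == 0.0:
        return (math.pi * t) ** 2        # limiting value at the line center
    return math.sin(math.pi * offset * t) ** 2 / offset ** 2

def half_width(t, samples=100000):
    """Smallest positive offset at which the factor drops to half its peak value."""
    peak = line_shape(0.0, t)
    step = (1.0 / t) / samples           # scan from the center out to the first zero
    for i in range(1, samples + 1):
        if line_shape(i * step, t) <= 0.5 * peak:
            return i * step
    return None

for t in (1.0, 4.0):
    print(t, line_shape(0.0, t), line_shape(1.0 / t, t), half_width(t))
```

The peak height grows as $t^2$ while the half width shrinks as $1/t$, with the first zero at an offset of $1/t$, which is the narrowing of the band with increasing time described above.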
In carrying out the summation over $\nu$, it is evident that the
essential contributions will come for frequency $\nu$ very close to
$\nu_{nm}$. In this region, we may replace $E_\nu^2$ by its value at $\nu_{nm}$, which
we have already seen to be given by $3E_{\nu_{nm}}^2/8\pi = \rho_{nm}\,d\nu$. Hence
the summation reduces to an integration,
$$\bar{c}_nc_n = \frac{8\pi(ex)_{nm}^2}{3h^2}\,\rho_{nm}\int\frac{\sin^2\pi(\nu - \nu_{nm})t}{(\nu - \nu_{nm})^2}\,d\nu. \tag{18}$$
The integration should properly be taken from $\nu = 0$ to infinity.
But since the integrand is large only in the immediate
neighborhood of $\nu = \nu_{nm}$, we shall make a negligible error if we integrate
from $-\infty$ to $\infty$, so that the integral becomes
$\pi t\int_{-\infty}^{\infty}\frac{\sin^2 z}{z^2}\,dz$, where
$z = \pi(\nu - \nu_{nm})t$. This can be easily evaluated, giving $\pi^2t$.
Thus we have finally
$$\bar{c}_nc_n = \frac{8\pi^3(ex)_{nm}^2}{3h^2}\,\rho_{nm}t, \tag{19}$$
or $B_{nm}\rho_{nm}t$, where $B_{nm}$ is as given before. Thus we have verified
our earlier statement regarding the probability coefficients $B$.
A simple variation of the argument applies to states $n$ of lower
energy than the state $m$, resulting in the probability of forced
emission, and if we compute $\bar{c}_mc_m$, we find that the number of
systems in the $m$th state decreases at a rate to compensate the
increase in the other states. This can be shown easily on
general grounds as well as by direct computation, for it can be
shown that the sum of the quantities $\bar{c}_nc_n$ for all states remains
constant.
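The value $\pi^2t$ used for the integral in Eq. (18) can be confirmed by a direct numerical quadrature. The sketch below uses a simple midpoint rule over a wide but finite range of $\nu - \nu_{nm}$; the range, the step, and the function name are assumptions chosen so that the neglected tails contribute only a few parts in a thousand.

```python
import math

def transition_integral(t, half_range=500.0, step=0.01):
    """Midpoint-rule estimate of the integral of sin^2(pi x t)/x^2 over |x| < half_range."""
    n = int(round(2.0 * half_range / step))
    total = 0.0
    for i in range(n):
        x = -half_range + (i + 0.5) * step   # x stands for nu - nu_nm
        if x == 0.0:
            total += (math.pi * t) ** 2 * step
        else:
            total += math.sin(math.pi * x * t) ** 2 / (x * x) * step
    return total

for t in (1.0, 2.0):
    print(t, transition_integral(t), math.pi ** 2 * t)
```

The estimate is proportional to $t$, which is the origin of the linear growth of $\bar{c}_nc_n$ with time in Eq. (19).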
238. Spontaneous Radiation and Coupled Systems. — The
calculation we have just given did not lead to the probability
of spontaneous emission $A_{nm}$. An attempt might be made to
include it by adding to the external force a radiation resistance
term, depending on the velocity of the electron, but this method
proves not to lead to the right answer. The proper treatment,
as a matter of fact, must be sought in a different direction.
We treat the radiation field, not as a perturbation, but as part
of the system. It is possible to apply the quantum theory
directly to the field by itself. For instance, if the radiation is
confined in a rectangular box with perfectly reflecting walls, the
electromagnetic field inside consists of a set of standing waves,
of all the wave lengths allowed for a vibrating solid of the corre-
sponding size, and with corresponding frequencies. We can
now introduce normal coordinates, each corresponding to one
mode of vibration, and the classical equations of motion of these
normal coordinates are just like those of a linear oscillator.
In a corresponding way, in wave mechanics, we treat these
normal coordinates, set up a wave equation for each, and find
that each one is quantized, with energy $(n + \tfrac{1}{2})h\nu$, where $\nu$ is the
frequency of the wave, $n$ a quantum number associated with
this particular mode of vibration. A change of this quantum
number by unity corresponds to an increase or decrease of the
energy of the radiation field by one unit $h\nu$, and this we identify
with the creation or destruction of a photon of this energy, by
interaction with matter.
Next we treat the atomic system just as if the radiation were
not present. In this case, the atom will continuously stay
in the same stationary state, and similarly the radiation field
will always keep the same quantum numbers, meaning that no
photons are being created or destroyed. But finally we introduce
into the complete system of atoms and radiation a perturbation,
corresponding to the potential of the atom in the radiation
field (including the vector as well as scalar potential). This
couples the two systems together, and under the influence of
the perturbation transitions are possible, in which the atoms
gain or lose energy in passing between stationary states, and
the radiation field loses or gains an equal energy, which appears
as destruction or creation of corresponding photons, or decrease
or increase of the quantum number of the proper normal vibration
of the radiation. When the probability of these processes is
investigated, by the method of variation of constants, it is found
that we obtain not only the probability of forced absorption
or emission, $B\rho$, but also the probability of spontaneous emission
A. It is not hard by this method to investigate other questions
as well, as for instance the breadths of absorption or emission
lines — the question of just what frequencies of light can interact
with a given atomic system. The general result is that, the
shorter the life of an atom in either the upper or lower state
associated with a transition, the broader the corresponding
absorption or emission line.
It is interesting to look a little more closely at the sort of
perturbation problem we meet in considering spontaneous
radiation, for example. Suppose we start with the atom in an
excited state, and with no energy in the radiation field. Then,
after the transition, the atom will be in its normal state, having
lost energy, and the radiation will be in an excited state, having
gained the corresponding energy. The total energy of the sys-
tem will be the same in either case. Now neither one of these
situations is a steady state, for neither one persists indefinitely.
Both are approximate steady states, corresponding to the same
energy. The perturbation problem, then, is one in perturbations
of a degenerate system, having two equal energy levels. We have
seen that such a perturbation problem leads to mathematics
just like two coupled mechanical systems, as two pendulums, and
it is convenient to use the mechanical language in describing what
happens. Our present problem is like two pendulums of equal
period (corresponding to the equal energy levels), coupled
together. If the first pendulum vibrates alone, that corresponds
to the state in which the atom is excited; if the second vibrates,
it corresponds to the radiation being excited. But neither of
these mechanical motions can occur by itself; if we start one
pendulum vibrating, in time it comes to rest, and the other
takes up all the energy. This corresponds to the fact that
the system gradually changes so that the atom is in its normal
state, the radiation excited. There is a flaw in our analogy,
however: the energy in the mechanical case goes back to the first
pendulum, while the atom does not come back to the excited
state. The answer to this difficulty is easily given. The radia-
tion field actually has not one mode of motion only, but many,
all of about the same energy, all capable of interacting with
the atom. Thus the emitted photon can travel in any direction,
and not only that, photons of many different energies, all in
the neighborhood of the energy ordinarily emitted by the atom,
can interact, on account of the finite breadth of the spectral
line. Thus while the situation where the atom is excited, and
the radiation is in its normal state, is just one state, there are
a great number of states corresponding to the other situation.
It is as if our one pendulum corresponding to the excited atom,
interacted with a great, or even infinite, number corresponding
to the excited radiation. In these circumstances, the mechanical
energy originally in the first pendulum would soon become
dissipated, scattered through the others, and it will never happen
all to come back to the first one, though a little might. Physi-
cally, the radiation emitted by the atom travels to a great
distance, and is very unlikely ever to find its way back to the
atom which sent it out. But if the whole thing is enclosed in
a box with reflecting walls, there will be a certain chance, finite
though small, that the radiation will be eventually reflected
back to the atom and absorbed.
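The pendulum analogy can be imitated with a small model, offered only as a sketch. One state, standing for the excited atom, is coupled with equal strength either to a single state of the same energy or to many states of closely spaced energies, standing for the modes of the radiation field. The model, its parameters, the units ($h/2\pi = 1$), and the Runge-Kutta integration are all assumptions for illustration and are not taken from the text.

```python
import math

def survival_probability(n_field, coupling, spacing, t_end, n_steps=1000):
    """Probability that the 'atom' state is still excited at time t_end.

    The atom state has energy 0; the n_field 'radiation' states have energies
    evenly spaced by `spacing` and centered on 0. Returns (probability, norm).
    """
    energies = [(j - (n_field - 1) / 2.0) * spacing for j in range(n_field)]

    def rhs(a, b):
        # da/dt = -i V sum_j b_j ;  db_j/dt = -i (E_j b_j + V a)
        da = -1j * coupling * sum(b)
        db = [-1j * (e * bj + coupling * a) for e, bj in zip(energies, b)]
        return da, db

    a, b = 1.0 + 0.0j, [0.0j] * n_field
    dt = t_end / n_steps
    for _ in range(n_steps):
        k1a, k1b = rhs(a, b)
        k2a, k2b = rhs(a + 0.5 * dt * k1a, [x + 0.5 * dt * k for x, k in zip(b, k1b)])
        k3a, k3b = rhs(a + 0.5 * dt * k2a, [x + 0.5 * dt * k for x, k in zip(b, k2b)])
        k4a, k4b = rhs(a + dt * k3a, [x + dt * k for x, k in zip(b, k3b)])
        a = a + dt / 6.0 * (k1a + 2.0 * k2a + 2.0 * k3a + k4a)
        b = [x + dt / 6.0 * (p + 2.0 * q + 2.0 * r + s)
             for x, p, q, r, s in zip(b, k1b, k2b, k3b, k4b)]
    norm = abs(a) ** 2 + sum(abs(x) ** 2 for x in b)
    return abs(a) ** 2, norm

# Two pendulums: after one full beat period the energy is back in the first one.
p_two, norm_two = survival_probability(1, 0.1, 0.05, math.pi / 0.1)
# One pendulum coupled to 81 others: the energy is scattered and does not return
# within the time followed here.
p_many, norm_many = survival_probability(81, 0.05, 0.05, 10.0)
print(p_two, p_many)
```

With a single partner the excitation returns completely, as for two coupled pendulums, while with many partners the probability of finding the first state excited falls well below one tenth for these parameters. Because the model has a finite number of states, a partial return would eventually occur at much later times, corresponding to the reflecting box mentioned above.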
One significant feature of the situation is that there are real
stationary states for the system of atom plus radiation. This
follows directly from the fact that we can solve the perturbation
problem. Just as with the coupled pendulums, there are normal
coordinates, consisting of combinations of the various separate
coordinates. Thus, there is some combination of all the various
probabilities of the atom being in various states of excitation and
the radiation field being in corresponding states which could
persist indefinitely, and is thus a stationary state. The things
we ordinarily think of as stationary states are combinations of
these, just as the state where one pendulum is excited, the other
at rest, is a combination of the two normal coordinates, with
definite amplitudes and phases. These are really not stationary
states at all, for they change with time. In any such problem,
there are two equally good methods of treatment: first, we may
use the unperturbed states which physically seem like stationary
states, treating the perturbations between them by variation of
constants, and so introducing apparent transitions into the
problem; or secondly, we may introduce the real stationary states,
by the ordinary perturbation theory, introducing the correct
initial conditions, and following what happens as time goes on,
without having any transitions at all between these real stationary
states. This point of view is very illuminating, for it shows
us that the distinction between stationary states and transitions
is largely artificial, determined by the original unperturbed
wave functions in which we choose to discuss the system.
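The second method of treatment can be illustrated for two degenerate levels coupled by a nondiagonal component $V$. In the sketch below the initial state is written as a combination of the two real stationary states, whose amplitudes never change; only their phases advance with time. The numbers, the units ($h/2\pi = 1$), and the function name are assumptions for illustration.

```python
import cmath
import math

def probability_in_state_2(e0, v, t):
    """|C_2(t)|^2 for a start in state 1, built from the two real stationary states.

    Stationary states of the degenerate pair (Eq. 11): energies e0 + v and e0 - v,
    with wave functions (u1 + u2)/sqrt(2) and (u1 - u2)/sqrt(2).
    """
    amp_plus = 1.0 / math.sqrt(2.0)    # amplitudes of the initial state u1 in each
    amp_minus = 1.0 / math.sqrt(2.0)   # stationary state; constant in time
    phase_plus = cmath.exp(-1j * (e0 + v) * t)
    phase_minus = cmath.exp(-1j * (e0 - v) * t)
    # Component of the combined wave function along the unperturbed function u2.
    c2 = (amp_plus * phase_plus - amp_minus * phase_minus) / math.sqrt(2.0)
    return abs(c2) ** 2

e0, v = 1.0, 0.1
for t in (0.0, 5.0, math.pi / (2.0 * v)):
    print(t, probability_in_state_2(e0, v, t), math.sin(v * t) ** 2)
```

No transition between the real stationary states ever occurs in this description, yet the probability of the unperturbed state 2 follows $\sin^2 Vt$, the same apparent transition that the method of variation of constants gives for this pair of levels.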
239. Applications of Coupled Systems to Radioactivity and
Electronic Collisions. — Many other problems of transitions can
be looked at from the same point of view we have just used in
discussing radiation. One example is the radioactive
disintegration, which we have considered in Chap. XXIX, Fig. 64. We
might take as approximate stationary states first the discrete
states of a particle within the finite depression, second the con-
tinuous states of the particle outside. If the barriers were
infinitely high, there would be no transitions between them, but
if the barrier is finite, we may start with a particle within the
nucleus, and consider that it has a certain probability of a transi-
tion to a state of equal energy outside the barrier. This could
be treated by the perturbation theory of degenerate systems,
where we could find the probability of leaking out by variation
of constants, or alternatively could get approximations to the
actual stationary states of the system. In this case, as with
radiation, the probability of the particle coming back and getting
back into the nucleus again, though small, is finite, if the system
is enclosed in a finite box. Here the stationary states which are
combinations of solutions for the discrete and continuous regions
are perfectly reasonable and natural, and the more accurate way
of solving the problem would be to determine these stationary
states by the Wentzel-Kramers-Brillouin method, and build up a
wave packet at $t = 0$ corresponding to having all the distribution
inside the nucleus, and ask how this packet spreads out as
time goes on, though without change of real stationary states.
Another similar problem is that of collisions, either elastic
or inelastic. Suppose that an electron collides with an atom,
being scattered either without change of energy, or with decrease
or increase of energy corresponding to raising or lowering the
energy of the atom. We can start with a number of unperturbed,
not quite stationary, states: first, the electron approaching the
atom, with the atom in its original state; secondly, an electron
being scattered, say in a definite direction, or better with some
function of angle represented by a spherical harmonic, with its
initial energy, the atom being unchanged; thirdly, an electron
scattered with a decrease of energy corresponding to a transition
of the atom, with the atom in the correspondingly excited state;
fourthly, an electron scattered with increase of energy, the atom
being in a lower state, after what is called a collision of the second
kind. All these states have the same energy, so that the pertur-
bation problem between them, resulting from the fact that they
are not solutions of the problem in the region where the electron
is in the atom, is one of transition between systems of the same
energy. Here, as before, it is often convenient to proceed by the
method of variation of constants, and from this we get the
probabilities of the various elastic and inelastic impacts. One thing
is worth noting in all these problems: in the method of variation
of constants, the quantity determining the probability of transi-
tion is the nondiagonal matrix component of the perturbing
energy between the different approximately stationary states.
Thus the calculation resolves itself into a computation of these
matrix components, and transitions are likely for which the matrix
components are large. In our radiation problem, the matrix
components in question were those of the electrical energy, involv-
ing directly the matrix components of electric moment of the atom.
While the perturbation method can be used for discussing
collisions, it is not very accurate, on account of the large perturba-
tions which the colliding electron exerts on the atom during the
instant of collision. Fortunately, at least in the case of elastic
collisions, much better approximation methods are available.
As we shall see later, an atom acts on an electron very much
like a central field of force, and the problem of the scattering of
an electron by a central field is merely the special case of the
central field problem, discussed in the next chapter, which we
meet if the electron is in a continuous rather than a quantized
energy level. By analogy with the results which we shall obtain
in Sec. 241, the wave function of an electron in a central field is
a product of spherical harmonics of angle, times a certain func-
tion of r, and for an electron coming from infinity, this function of
r is of the form shown in Fig. 62, satisfying a definite boundary
condition at the center of the atom, but becoming sinusoidal
for large values of r. By combining an infinite number of such
solutions, all corresponding to the same energy, but with different
functions of angles, it can be shown that we can make the result-
ant wave at large distances approach a plane wave, representing
a stream of electrons traveling in a definite direction. But the
functions are such that, if the central field is not vanishingly
small, it is not possible to build up exactly a plane wave.
Instead, there are certain terms left over representing spherical
waves traveling outward from the center of force, with amplitudes
proportional to 1/r, so that they are negligible compared to the
plane waves at sufficiently large distances. These spherical
waves represent the elastically scattered electrons.
Two particularly interesting features of the elastic scattering
can be investigated by the method just described. First, one
may find the total intensity in the scattered wave, which can be
proved to be equal to the total intensity removed from the plane
wave by its passage over the atom. This gives the probability
that an electron will be scattered by the atom, and it proves to
increase as the atomic number of the atom increases, and to
depend in a complicated way on the speed of the electron. This
dependence is so complicated that in some cases, called the Ram-
sauer effect, very slow electrons have abnormally small probabil-
ities of being scattered, and practically pass through the atom
without hindrance. The probability of scattering is often
described by defining an effective cross section for the atom, a
cross section such that if all electrons striking it were scattered,
and all passing around it were not, the probability of scattering
would agree with the observed value. Plainly the effective
cross section depends on electron velocity and on the nature of
the atom. The second interesting feature of elastic scattering
is the angular distribution of the scattered electrons, determined
by the relative probabilities of scattering with the various spher-
ical harmonic functions of angle. This again can show a com-
plicated dependence on electron velocity and atomic constitution.
Problems
1. Prove that if both unperturbed and perturbed functions, $u_n^0$ and $u_n$,
are orthogonal and normalized, the transformation coefficients $S_{mn}$ satisfy
the orthogonality and normalization conditions.
2. Show that if we expand the correct wave functions in a series of functions
which are not exactly orthogonal or normalized, the equations for
the transformation coefficients $S_{mn}$ are
$$\sum_m (H_{km} - E_n d_{km}) S_{mn} = 0,$$
where $d_{km} = \int \bar{u}_k^0 u_m^0\,dv$, which now is not diagonal and is not equal to $\delta_{km}$.
3. Consider a degenerate system in which there are two unperturbed
wave functions, having equal diagonal energies $H_{11} = H_{22}$, which are normalized
but not orthogonal to each other, so that $\int \bar{u}_1^0 u_2^0\,dv = d_{12} \neq 0$.
Show that the two energy levels are
$$\frac{H_{11} + H_{12}}{1 + d_{12}}, \qquad \frac{H_{11} - H_{12}}{1 - d_{12}}.$$
4. Show that the two correct wave functions in Prob. 3 are
$$\frac{u_1^0 + u_2^0}{\sqrt{2(1 + d_{12})}}, \qquad \frac{u_1^0 - u_2^0}{\sqrt{2(1 - d_{12})}},$$
respectively. Prove them to be normalized and orthogonal.
5. Solve the problem of a system with two degenerate unperturbed levels
of the same energy, by the method of variation of constants. Show that
the equations for the time derivatives of the c's can be solved by assuming
an exponential or sinusoidal solution. Show that the final solution is a
pulsation from one state to the other, the frequency of pulsation being
$H_{12}/h$.
6. Prove by perturbation theory that the energy levels of a linear oscil-
lator are not affected by a constant external field, except in absolute value,
all being shifted up or down together. Why should this be expected
physically?
7. Find whether a rotator's energy is affected, to the first or higher orders
of approximation, by a constant external field in the plane of the rotator.
8. Prove in Einstein's derivation of Planck's radiation law that $B_{12} = B_{21}$,
by considering equilibrium in the limiting case of extremely high tempera-
ture, noting that in this limit the probability of forced transition is large
compared with that of spontaneous transition, on account of the large
density of radiation.
9. Prove directly from Schrödinger's equation that the sum $\sum_n \bar{c}_n c_n$ always
remains constant.
10. For the problem of interaction of atoms and radiation, when the atom
starts in the mth state, work out $\bar{c}_m c_m$ as a function of time, and show that
this, added to the other $\bar{c}_n c_n$'s, gives a constant.
CHAPTER XXXIII
THE HYDROGEN ATOM AND THE CENTRAL FIELD
In the preceding chapters we have been discussing the general
principles and methods of wave mechanics. We have seen that
from wave mechanics one can derive ordinary Newtonian mechan-
ics as a special case. But by far the most interesting mechanical
problem which demands wave mechanics for its solution is the
structure of atoms, molecules, and matter in general. We shall
accordingly devote the remaining chapters of this book to the
structure of matter. This is a problem which is doubly interest-
ing; first, as a most important subject in itself, secondly, as the
finest illustration of wave mechanics.
240. The Atom and Its Nucleus. — An atom consists of a
nucleus, and a number of electrons. All electrons are alike,
electrified particles of negative charge $-e = -4.774 \times 10^{-10}$
e.s.u., mass of $9.00 \times 10^{-28}$ gm. Nuclei are heavier, and positively
charged. The charges on nuclei are found in every case
to be integral multiples of the charge e. Thus a nucleus may
have a charge Ze, where Z is an integer, and in this case Z is
called the atomic number. If the atom has enough electrons
to be electrically neutral, it is obvious that it must have Z elec-
trons, so that the atomic number measures both the charge on
the nucleus and the number of electrons in the neutral atom.
We shall see that this number Z is the determining quantity in
fixing the properties of the atoms; if all atoms are tabulated in
order of their atomic numbers, they show periodic properties,
for reasons which we shall discuss in the next chapter, and this
arrangement is called the periodic table of the elements. Of
course, the number of electrons on the atom does not always
have to be just the atomic number; violent methods, as bombard-
ment, can knock electrons off, or in some cases extra electrons
can be added, producing positive or negative ions, respectively.
We shall see that some elements, the electropositive or alkaline
ones, have a tendency to lose electrons, and form positive ions,
while the electronegative elements tend to gain electrons and become negative
ions. Atoms often enter chemical compounds as ions, rather
than neutral atoms, so that in our study of atomic structure we
shall have to speak constantly of ions as well as neutral atoms.
The element of atomic number one is hydrogen, the simplest
element. Its nucleus is an elementary particle, called the proton,
with mass 1,846 times that of the electron. The heavier nuclei
appear to be built up from a combination of protons and neutrons,
particles of no charge, but of mass approximately equal to that
of the proton. There are approximately equal numbers of
protons and neutrons in any nucleus, making the atomic weight
(the mass of the nucleus, in multiples of the mass of the proton)
approximately twice the atomic number, though this rule is far
from exact, the heavier atoms containing more neutrons in pro-
portion than the light ones. The forces holding the nucleus
together are presumably largely forces of attraction between
protons and neutrons, more than counterbalancing the repulsions
between protons on account of their like electric charges. By the
action of these forces, stable structures are produced, disintegrat-
ing only in the case of the heavy, radioactive elements, or in the
very light elements under heavy bombardment. The theory of
the structure of the nucleus is still in a preliminary state, and we
shall not consider it; ordinary properties of matter prove to be
almost completely independent of the nuclear structure, depend-
ing only on its charge and mass, with most properties depending
only on its charge, so that two nuclei of the same charge and
different masses, called isotopes, exhibit almost identical prop-
erties. Such isotopes are of very common occurrence, many
ordinary elements being a mixture of several, the chemical atomic
weights being weighted means of the weights of the isotopes,
explaining why many observed atomic weights are far from whole
numbers.
241. The Structure of Hydrogen. — The simplest element is
hydrogen, with but one electron moving about a single nucleus.
Fortunately the problem of its structure, according to wave
mechanics, can be exactly solved, and it serves as a model for the
more complicated elements. In fact, we have already carried out
many of the mathematical steps in problems at one time or
another, so that we shall merely have to summarize results here.
For generality, we shall treat not merely hydrogen, but the prob-
lem of a single electron moving about a nucleus of charge Ze.
The first thing we notice is that the nucleus is very heavy, com-
pared with an electron. Now if we have a single electron and a
single nucleus, exerting forces on each other, we find, in wave
mechanics as in classical mechanics, that the center of gravity
of the system remains fixed, each particle moving about the
common center of gravity. But the center of gravity is very
close to the nucleus; it divides the vector joining nucleus and
electron in the ratio of 1:1,846. Thus the nucleus executes only
very slight motions, and practically we can treat it as being fixed,
and the electron as moving about a fixed center of attraction.
We shall find that this is a very general method in discussing the
structure of matter: we first assume all nuclei to be fixed, and
discuss the motion of the electrons about them. Only later do
we have to take the motions of the nuclei into account. We
discuss this more in detail in a later chapter.
We have, then, an electron of charge $-e$, mass m, moving in a
central field of force. The attractive force of the nucleus has a
potential energy $-Ze^2/r$. Thus Schrödinger's equation, with
the time eliminated, is
$$Hu = \left[-\frac{h^2}{8\pi^2 m}\nabla^2 - \frac{Ze^2}{r}\right]u = Eu. \qquad (1)$$
We shall find it convenient in all our atomic problems to introduce
at the outset so-called atomic units of distance and energy. The
unit of distance is $a_0 = h^2/4\pi^2 m e^2$, a unit first introduced in
Bohr's theory of the hydrogen atom, but which comes into the
present discussion as well. It is equal to 0.53 Ångström. The
unit of energy most convenient to use is $2\pi^2 m e^4/h^2$, though sometimes
a unit twice as great is used. This is the energy required
to ionize a hydrogen atom from its normal state. It is most
conveniently stated, not in ergs, but in volt-electrons. A volt-
electron by definition is the energy an electron acquires in falling
through a difference of potential of 1 volt, or $eV = 4.774 \times 10^{-10} \times \frac{1}{300}$
ergs. In terms of this, our fundamental unit of energy is
13.54 volt-electrons. Associated with this energy is a frequency,
given by energy $= h\nu$, and a wave length, and its reciprocal a
wave number, given by $1/\lambda = \nu/c$. The wave number associated
with our unit of energy is the so-called Rydberg number, R =
109,737 per centimeter, and the corresponding energy is Rhc.
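As a rough numerical check of the atomic units just defined, the following sketch evaluates $a_0 = h^2/4\pi^2 m e^2$, the energy unit $2\pi^2 m e^4/h^2$, and the associated wave number. The constants are modern Gaussian values and are an assumption of this sketch, not the figures quoted in the text (which give 13.54 volt-electrons); the function names are hypothetical.

```python
import math

# Modern Gaussian (CGS) values, assumed here for illustration only; the text
# quotes the older figures e = 4.774e-10 e.s.u. and m = 9.00e-28 gm.
H_PLANCK = 6.62607015e-27            # Planck's constant h, erg s
E_CHARGE = 4.80320471e-10            # electronic charge e, e.s.u.
M_ELECTRON = 9.1093837e-28           # electron mass m, gm
C_LIGHT = 2.99792458e10              # velocity of light c, cm/s
ERG_PER_VOLT_ELECTRON = 1.602176634e-12


def bohr_radius_cm(h=H_PLANCK, m=M_ELECTRON, e=E_CHARGE):
    """Unit of distance a_0 = h^2 / (4 pi^2 m e^2), in cm."""
    return h**2 / (4.0 * math.pi**2 * m * e**2)


def energy_unit_erg(h=H_PLANCK, m=M_ELECTRON, e=E_CHARGE):
    """Unit of energy 2 pi^2 m e^4 / h^2 (ionization energy of hydrogen), in ergs."""
    return 2.0 * math.pi**2 * m * e**4 / h**2


def rydberg_wave_number(h=H_PLANCK, m=M_ELECTRON, e=E_CHARGE, c=C_LIGHT):
    """Wave number 1/lambda = energy / (h c), in cm^-1."""
    return energy_unit_erg(h, m, e) / (h * c)


if __name__ == "__main__":
    print("a_0 in Angstroms:", bohr_radius_cm() * 1e8)
    print("energy unit in volt-electrons:", energy_unit_erg() / ERG_PER_VOLT_ELECTRON)
    print("Rydberg wave number per cm:", rydberg_wave_number())
```

With these assumed constants the energy unit comes out slightly above the 13.54 volt-electrons of the text, the difference reflecting the older value of e used there.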
In terms of our atomic units, Schrödinger's equation for hydrogen
can be rewritten, eliminating all the dimensional constants.
Thus, if our new distances are the old ones divided by $a_0$, the
new energy the old divided by Rhc, we easily find that
$$\left(-\nabla^2 - \frac{2Z}{r}\right)u = Eu, \qquad (2)$$
where the derivatives are to be taken with respect to the new
x, y, z. The coefficient 2 in the potential energy appears in the
process of changing variables, the potential energy of two elec-
tronic charges being 2/r in these units.
Schrödinger's equation can now be solved, in spherical coordinates,
by separation of variables. Using the results of Chap.
XV, Probs. 6 to 8, the equation can be separated, letting $u = R\Theta\Phi$,
and the differential equations are
$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left[E + \frac{2Z}{r} - \frac{l(l+1)}{r^2}\right]R = 0,$$
$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{d\Theta}{d\theta}\right) + \left[l(l+1) - \frac{m^2}{\sin^2\theta}\right]\Theta = 0,$$
$$\frac{d^2\Phi}{d\phi^2} + m^2\Phi = 0. \qquad (3)$$
The solutions of the second and third are $\Theta = P_l^m(\cos\theta)$, $\Phi = e^{\pm im\phi}$,
or $\cos m\phi$ or $\sin m\phi$, where m must be an integer in order
to have the function single-valued as far as $\phi$ is concerned, and
l must be an integer in order not to have the function P become
infinite for $\cos\theta = 1$. The P's are called associated spherical
harmonics, and are given by
$$P_l^m(\cos\theta) = \sin^{|m|}\theta\,(A_0 + A_1\cos\theta + A_2\cos^2\theta + \cdots),$$
$$A_k = A_{k-2}\,\frac{(k + |m| - 1)(k + |m| - 2) - l(l+1)}{k(k-1)}. \qquad (4)$$
For integral l's, this series breaks off, the last nonvanishing term
being for $k = l - |m|$. For even $l - |m|$, the expansion is in even
powers, and for odd $l - |m|$ in odd powers. The functions R are
discussed in Prob. 3. We use a simple transformation of the
dependent variable, $y = rR$. The equation in this variable is
$$\frac{d^2y}{dr^2} + \left[E + \frac{2Z}{r} - \frac{l(l+1)}{r^2}\right]y = 0. \qquad (5)$$
The solution is
$$y = e^{-r\sqrt{-E}}\,r^{l+1}(A_0 + A_1 r + A_2 r^2 + \cdots),$$
$$A_k = -2A_{k-1}\,\frac{Z - (l+k)\sqrt{-E}}{(l+k)(l+k+1) - l(l+1)}. \qquad (6)$$
This series breaks off if $E = -Z^2/n^2$, where n is an integer.
A simple discussion shows that if it does not break off, the resulting
infinite series becomes infinite as r becomes infinite like
$e^{2r\sqrt{-E}}$, so that y becomes infinite, and is not admissible as a
wave function for a stationary state. We therefore limit ourselves
to integral n's, and n is called the principal or total quantum
number, determining the energy. In terms of it, we have
$$y = e^{-Zr/n}\,r^{l+1}(A_0 + A_1 r + \cdots + A_{n-l-1}r^{n-l-1}),$$
$$A_k = -\frac{2Z}{n}A_{k-1}\,\frac{n - l - k}{(l+k)(l+k+1) - l(l+1)}. \qquad (7)$$
From this recursion formula, we see that l cannot be greater than
n − 1, in order to have any terms to the series; and from the
earlier recursion formula for the function of $\theta$, |m| cannot be
greater than l. The principal quantum number n, and the so-called
azimuthal quantum number l, cannot be negative, the
smallest allowable value of n being 1 and of l zero. The so-called
magnetic quantum number m, however, can be positive, negative,
or zero, so long as its magnitude falls within the allowed limits.
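The recursion formula (7) can be tried out numerically. The sketch below, with hypothetical function names and the arbitrary normalization $A_0 = 1$, builds the coefficients in exact fractions and then checks by finite differences that the resulting y approximately satisfies Eq. (5) with $E = -Z^2/n^2$. It is an illustration under these assumptions, not a normalized wave function.

```python
from fractions import Fraction
import math


def radial_coefficients(n, l, Z=1):
    """Coefficients A_0 .. A_{n-l-1} of Eq. (7), taking A_0 = 1 (unnormalized)."""
    if n < 1 or l < 0 or l > n - 1:
        raise ValueError("need n >= 1 and 0 <= l <= n - 1")
    coeffs = [Fraction(1)]
    for k in range(1, n - l):
        denom = (l + k) * (l + k + 1) - l * (l + 1)
        coeffs.append(-Fraction(2 * Z, n) * coeffs[-1] * Fraction(n - l - k, denom))
    return coeffs


def y(r, n, l, Z=1):
    """y = r R(r) of Eq. (7), in atomic units."""
    poly = sum(float(a) * r**k for k, a in enumerate(radial_coefficients(n, l, Z)))
    return math.exp(-Z * r / n) * r ** (l + 1) * poly


def residual(r, n, l, Z=1, step=1e-3):
    """Left side of Eq. (5), second derivative taken by central differences."""
    energy = -(Z**2) / n**2
    second = (y(r + step, n, l, Z) - 2.0 * y(r, n, l, Z) + y(r - step, n, l, Z)) / step**2
    return second + (energy + 2.0 * Z / r - l * (l + 1) / r**2) * y(r, n, l, Z)


if __name__ == "__main__":
    print(radial_coefficients(2, 0))   # 2s: y proportional to r (1 - r/2) exp(-r/2)
    print(residual(1.5, 3, 1))         # should be close to zero
```

The series has exactly n − l terms, so for l = n − 1 only $A_0$ survives, which is the function $r^n e^{-Zr/n}$ of the circular-orbit case.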
242. Discussion of the Function of r for Hydrogen. — Though
we have an exact solution for hydrogen, a qualitative discussion
is still desirable, using the method of energy. In Chap. VII we
have already discussed motion in a central field in classical
mechanics. We have seen that the motion along the radius is
like a one-dimensional oscillation, in a potential field
$V + p^2/2mr^2$, where V is the potential energy, p the angular momentum.
In our case, the differential equation for y is like a one-dimensional
wave mechanical problem with a potential, in
atomic units, of $-\frac{2Z}{r} + \frac{l(l+1)}{r^2}$, or in ordinary units $-\frac{Ze^2}{r} + \frac{p^2}{2mr^2}$,
where $p = \sqrt{l(l+1)}\,\frac{h}{2\pi}$. It is thus clear in the first
place that the quantum number l determines the angular momentum,
in units of $h/2\pi$, though the values are not l times this
unit, but $\sqrt{l(l+1)}$ times it. We shall further discuss the
angular momentum later on. Now it is interesting to draw
the various potentials, as we do in Fig. 66, where $-\frac{2}{r} + \frac{l(l+1)}{r^2}$
is plotted, for l = 0, 1, 2. We have also plotted $-\frac{2}{r} + \frac{k^2}{r^2}$,
indicated by the dotted lines. The reason for this is that
in Bohr's theory of hydrogen, it was assumed that the electron
moved according to classical mechanics, and that its energy
could have only those particular values for which the quantum
conditions were fulfilled. He assumed that the angular momentum
was $kh/2\pi$, where k was an integer, so that if we discuss
the classical motion with these dotted potential curves, we shall
have precisely Bohr's orbits. He also assumed
$$\oint p_r\,dr = \oint \sqrt{2m\left(E - V - p^2/2mr^2\right)}\,dr = n_r h,$$
where $p = kh/2\pi$.
Fig. 66. Potential and energy levels for hydrogen. Full lines: $-\frac{2}{r} + \frac{l(l+1)}{r^2}$ (potential corrected for centrifugal force, wave mechanics). Dotted lines: $-\frac{2}{r} + \frac{k^2}{r^2}$ (corrected potential, Bohr theory). Horizontal lines represent energy levels.
The energy levels, either on Bohr's theory or wave mechanics,
are $-1/n^2$, where on Bohr's theory $n = k + n_r$, and these are
drawn, at $-1$, $-\frac{1}{4}$, $-\frac{1}{9}$, etc. Now consider the particular
case k = 1. The lowest possible energy level for this is evidently
$-1$; for here E intersects the potential curve at but one point,
giving, therefore, a circular orbit, the perihelion and aphelion
distances being equal. As we see from the diagram, the radius
of the circular orbit is one unit, and the energy minus one unit,
explaining, therefore, the origin of the units. But for this same
k, higher energy levels are connected with elliptical orbits, as,
for example, that for which n = 2, k = 1, with perihelion
smaller, aphelion larger than the circle for n = 1. For n = 2
there is a second Bohr orbit, for k = 2: a circle of radius 4 units.
Similarly for n = 3, there are three orbits, for k = 1, 2, 3,
and so on, the orbit for k = n being in each case a circle. This
question is discussed in a problem, where it is shown that the
orbits are ellipses, of semimajor axis equal to $\frac{n^2}{Z}\frac{h^2}{4\pi^2 m e^2} = \frac{n^2}{Z}a_0$,
and minor axis equal to k/n times the major axis.
In the wave mechanics, where the angular momentum has
the nonintegral value $\sqrt{l(l+1)}$ units, we must use the full
lines. Now we are interested in the region where the kinetic
energy is positive, not as the only place where motion can occur,
but as the region where the wave function is sinusoidal. Out-
side this region, it falls off exponentially. We can see a few
examples in Fig. 67, in which the first few wave functions are
plotted (we plot y, equal to r times the radial part of the wave
function). On each function the limits of the region of classical
motion are determined by the fact that the points of
inflection come here, the tendency of the curves being sinusoidal
between the points, exponential outside. It is plain that the
wave functions are larger where the electron is likely to be found,
small where it is not, as we could prove by deriving the solution
from the Wentzel-Kramers-Brillouin method, a possible, though
not very convenient, method of discussing the hydrogen prob-
lem. As this method would show at once, the wave length
and amplitude both become large as r becomes large, and E — V
becomes small, so that the outermost maximum of the wave
function is in all cases the largest, and contributes most to the
wave function as a whole. One property of the wave function
is evident from Fig. 67: for small r, the behavior is determined
mostly by l, for large r mostly by n. This is natural from the
fact that for small r the quantity $E + \frac{2Z}{r} - \frac{l(l+1)}{r^2}$ approaches
$-\frac{l(l+1)}{r^2}$, and for large r it approaches $E + \frac{2Z}{r} = -\frac{Z^2}{n^2} + \frac{2Z}{r}$.
We note that as l becomes smaller and smaller, the region where
the wave function is large, or the classical orbit, penetrates
closer and closer to the nucleus. For large r, and, as a matter
of fact, for the whole outer maximum, which, as we have seen,
is the most important one, a fairly good approximation to the
wave function is simply $r^n e^{-Zr/n}$, the wave function for the orbit
of maximum azimuthal quantum number (l = n − 1), corresponding
to the circular orbit in Bohr's theory. It is interesting
to note that this function has its maximum at $r = \frac{n^2}{Z}a_0$, just
the radius of the corresponding circular orbit in Bohr's theory.
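The position of this maximum is easy to confirm numerically. The sketch below, in atomic units and with hypothetical function names, scans $r^n e^{-Zr/n}$ on a uniform grid and compares the location of the largest value with $n^2/Z$; the grid size is an arbitrary choice made for illustration.

```python
import math


def circular_orbit_y(r, n, Z=1):
    """Unnormalized y = r^n exp(-Z r / n), the l = n - 1 function, atomic units."""
    return r**n * math.exp(-Z * r / n)


def grid_maximum(n, Z=1, steps=20000):
    """Radius at which y is largest on a uniform grid from 0 to 4 n^2 / Z."""
    r_max = 4.0 * n * n / Z
    best_r, best_y = 0.0, 0.0
    for i in range(1, steps + 1):
        r = r_max * i / steps
        value = circular_orbit_y(r, n, Z)
        if value > best_y:
            best_r, best_y = r, value
    return best_r


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, grid_maximum(n), n * n)   # grid maximum next to the Bohr radius n^2/Z
```

Setting the derivative of $r^n e^{-Zr/n}$ to zero gives $n/r = Z/n$ directly, so the scan only confirms what the one-line calculation already shows.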
243. The Angular Momentum. — We have seen that the
quantity $\sqrt{l(l+1)}\,\frac{h}{2\pi}$ corresponds to the angular momentum of
the orbit. This can be seen by computing the matrix of total
angular momentum, or rather of its square, which is more convenient.
We can most easily get the operator for the angular
momentum, in spherical coordinates, by an indirect method.
Classically, $H = p_r^2/2m + p^2/2mr^2 + V$, where p is the total
angular momentum. Now in wave mechanics we find the wave
equation such that
$$H = \frac{1}{2mr^2}\,\frac{h}{2\pi i}\frac{\partial}{\partial r}\left(r^2\,\frac{h}{2\pi i}\frac{\partial}{\partial r}\right) + \frac{1}{2mr^2}\left[\frac{1}{\sin\theta}\,\frac{h}{2\pi i}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{h}{2\pi i}\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\left(\frac{h}{2\pi i}\right)^2\frac{\partial^2}{\partial\phi^2}\right] + V.$$
By comparison, it is plain that the operator for $p^2$ is
$$p^2 = \frac{1}{\sin\theta}\,\frac{h}{2\pi i}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{h}{2\pi i}\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\left(\frac{h}{2\pi i}\right)^2\frac{\partial^2}{\partial\phi^2}. \qquad (8)$$
But now from the differential equations for $\Theta$ and $\Phi$, we easily
have, using this operator,
$$p^2 u = l(l+1)\left(\frac{h}{2\pi}\right)^2 u. \qquad (9)$$
That is, $p^2$ has a diagonal matrix (since $p^2 u$ is a constant times u,
without any terms in other characteristic functions), and the
diagonal value is $l(l+1)(h/2\pi)^2$, so that the total angular momentum
is constant, as it must be in the absence of torques. We
can also easily find the component of angular momentum along
the z axis. The angular momentum along this axis is the momentum
conjugate to the angle $\phi$ of rotation about the axis, so that
its operator is $\frac{h}{2\pi i}\frac{\partial}{\partial\phi}$. Now take the solutions where $\phi$ enters
into the wave function as the exponential, $e^{\pm im\phi}$. Then
$p_\phi u = \frac{h}{2\pi i}\frac{\partial u}{\partial\phi} = \pm m\frac{h}{2\pi}u$. This again is diagonal, showing that the
component of angular momentum remains constant. Further,
if we use the wave function $e^{im\phi}$, the component equals $m\,h/2\pi$.
The interpretation of these results is best made in terms of
a vector model. Suppose we consider that the angular momentum
of the orbit is $l\,h/2\pi$. This will then be regarded as a vector,
normal to the plane of the orbit, pointing in some arbitrary
direction in space. The component of angular momentum along
the z axis is simply the projection of the vector in that direction.
Now we find that this can have only the quantized values
$m\,h/2\pi$. Hence there are only a finite number of possible
orientations for the orbit, as shown in Fig. 68, for the states
for l = 3. Plainly m can go from a maximum of l to a minimum
of −l, or 2l + 1 values in all, just as one finds from the
discussion of the spherical harmonics. Now this vector
diagram is only suggestive, not strictly true. We see this from
the fact that our vector has length $l\,h/2\pi$, while the actual
angular momentum is $\sqrt{l(l+1)}\,h/2\pi$. The fundamental reason is
that, since the angular momentum and its component are exactly
given, the uncertainty principle
does not allow us to fix definitely the plane of the orbit,
which corresponds to a coordinate. As a matter of fact, the
electron in wave mechanics does not move exactly in a plane, but
strays outside the plane, as the uncertainty principle would
suggest. This is best shown by polar diagrams of the spherical
harmonics, plotting the square of the spherical harmonic, which
gives the density, as function of angle. This is done in Fig. 69,
for l = 1, m = 1 and 0, and l = 2, m = 2, 1, 0. (l = 0 does not
depend on angle.) If we imagine these figures rotated about the
axes, we see that for m = l, the figure indicates that most of the
density is in the plane normal to the axis, but considerable is
out of the plane. For l = 2, m = 1, for instance, the density
lies near a cone, as if the plane of the orbit took up all directions
whose normal made the proper angle with the axis.
Fig. 68. Possible orientations of angular momentum, for l = 3.
Fig. 69. Dependence of wave functions on angle. $\Theta^2$ plotted in polar diagram. Panels, left to right: l = 1, m = ±1; l = 1, m = 0; l = 2, m = ±2; l = 2, m = ±1; l = 2, m = 0.
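The vector model of this section can be tabulated with a few lines of code. The sketch below lists the 2l + 1 allowed values of m for a given l, together with the angle between the momentum vector and the z axis when the vector is given its actual length $\sqrt{l(l+1)}$ rather than l. That choice of length, and the function name, are assumptions of the sketch.

```python
import math


def orientations(l):
    """List of (m, angle in degrees to the z axis) for m = l, l-1, ..., -l.

    The vector length is taken as sqrt(l(l+1)) in units of h/2pi, and its
    z component as m. For l = 0 there is no direction, so the angle is None.
    """
    if l < 0:
        raise ValueError("l must be zero or a positive integer")
    if l == 0:
        return [(0, None)]
    length = math.sqrt(l * (l + 1))
    return [(m, math.degrees(math.acos(m / length))) for m in range(l, -l - 1, -1)]


if __name__ == "__main__":
    for m, angle in orientations(3):
        print(m, angle)
```

Because m can never exceed l, which is less than $\sqrt{l(l+1)}$, the vector never lies exactly along the axis, in line with the remark that the plane of the orbit cannot be fixed definitely.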
244. Series and Selection Principles. — All the states for a given
value of l and n, but different m, have the same function of r, and
the same energy. We shall find that this is still true with an
arbitrary central field, so that even in that problem the solution
is degenerate. Physically, so long as the angular momentum is
determined, it cannot make any difference as far as the energy is
concerned which way the orbit is oriented, on account of the
spherical symmetry. Thus we often group together the various
substates with the same l and n but different m, regarding them
as constituting a single degenerate state, with a (2l + 1)-fold
degeneracy. For hydrogen, the energy as a matter of fact
depends only on n, so that all states of the same n but different
l values are degenerate, but this is not true in general for a central
field. It is convenient, rather, to group all the states of the same
l value but different n together to form a series, since they are
closely connected physically, having the same functions of angle,
while those of the same n merely happen to have the same energy,
but without important physical resemblances. The series of
different l values are conventionally denoted by letters, derived
from spectroscopy. We have the following table.

l value   Letter   States             Order of degeneracy
0         s        1s, 2s, 3s, . . .  1
1         p        2p, 3p, 4p, . . .  3
2         d        3d, 4d, . . .      5
3         f        4f, 5f, . . .      7
4         g        5g, . . .          9

By order of degeneracy we mean simply the number of sublevels
of different m values.
The classification into series becomes important when we consider
the transition probabilities from one level to another. We
recall that these are given by the matrix components of the
electric moment between the states in question. When these
components are computed, it is found that there are certain
selection rules:
1. The component is zero unless the l's of the two states differ
by ±1 unit.
2. The component is zero unless the m's differ by 0 or ±1 unit.
The latter rule is easily proved. For, suppose we compute the
matrix components of x + iy, x — iy, z, which are simple com-
binations of x, y, z, the three components of displacement. If
we find the matrix components of all three of these to be zero for
a given transition, the transition will be forbidden. Now these
three quantities, in polar coordinates, are $r\sin\theta\,e^{i\phi}$, $r\sin\theta\,e^{-i\phi}$,
$r\cos\theta$, respectively. If u is $R\Theta e^{im\phi}$, we have
$(x + iy)u = rR\Theta\sin\theta\,e^{i(m+1)\phi}$, showing that this quantity has a matrix component
only to states having the quantum number m + 1, since the
quantity on the right could be expanded in series of functions with
many values of n and l, but only the one value m + 1. Similarly
$(x - iy)u = rR\Theta\sin\theta\,e^{i(m-1)\phi}$, allowing transitions only from
m to m − 1, and $zu = rR\Theta\cos\theta\,e^{im\phi}$, allowing only transitions
in which m does not change. The proof of the selection principle
for l is slightly more difficult, involving the theorem that
$\sin\theta\,P_l^m(\cos\theta)$ or $\cos\theta\,P_l^m(\cos\theta)$ can be expanded in spherical
harmonics whose lower index is l + 1 or l − 1 only.
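The two selection rules lend themselves to a small enumeration. The sketch below, with hypothetical function names, tests a pair of states against the rules (l changing by ±1, m changing by 0 or ±1) and lists the states of lower principal quantum number that a given state can reach. Restricting to lower n is an assumption made for illustration, appropriate to hydrogen where the energy depends only on n.

```python
LETTERS = "spdfg"


def is_allowed(l_initial, m_initial, l_final, m_final):
    """True if the electric-moment matrix component need not vanish."""
    return abs(l_initial - l_final) == 1 and abs(m_initial - m_final) <= 1


def lower_states(n, l, m):
    """States (n', l', m') with n' < n reachable from (n, l, m) under the rules."""
    reachable = []
    for n_final in range(1, n):
        for l_final in range(0, n_final):
            for m_final in range(-l_final, l_final + 1):
                if is_allowed(l, m, l_final, m_final):
                    reachable.append((n_final, l_final, m_final))
    return reachable


def label(n, l):
    """Spectroscopic label such as '3d' (letters provided up to g only)."""
    return f"{n}{LETTERS[l]}"


if __name__ == "__main__":
    for n_final, l_final, m_final in lower_states(3, 2, 0):
        print(label(3, 2), "->", label(n_final, l_final), "m =", m_final)
```

For the 3d state with m = 0 the only lower states reached are the three 2p substates, which corresponds to a line of the diffuse series.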
The selection rules have the following results: If we arrange
the series in order, spdf . . . , a level of one series can only have
transitions to the immediately adjacent series. This gives us
the transitions indicated in Fig. 70 (all of the transitions between
upper states are not indicated; merely some of the more important
ones down to lower states). The series of lines arising from
transitions of the p states to 1s is called the principal series; from
the s terms to 2p, the sharp series; from the d terms to 2p, the
diffuse series; from the f terms to 3d, the fundamental series.
The letters s, p, d, f are the initials of these series. When the
matrix components are worked out, the strongest lines are those
in which l decreases by one unit (principal, diffuse, and fundamental
series), and those for which l increases (as the sharp
series) are weaker. Of course, on account of the degeneracy in
l in hydrogen, the different series are not separated, but they are
in other atoms, and it is for those that the classification is important.
To see this, we must study the energy levels in the general
central field.
Fig. 70. Energy levels and allowed transitions and series in hydrogen.
245. The General Central Field. — We shall find that in
discussing atomic structure, we shall wish to consider that each
electron moves in a central field, but not an inverse square field.
The field is rather the sort which we should have if there were a
nucleus of charge Z units, surrounded by a spherical ball of
negative charge, having a total charge — (Z — 1) units, corre-
sponding to the remaining electrons of the atom. Such a field
has a potential $-\frac{2Z(r)}{r}$, where Z(r) goes from 1 at large r to Z at
small r. For such a potential, most of our discussion goes through
without alteration. The differential equation can be separated
in the same way, and the functions of angle are just the same,
so that our classification into series, vector model, and selection
principles holds as with hydrogen. The only difference comes in
the function of r, and in the values of the energy levels. We can
no longer solve the equation exactly, and shall use the qualitative
method of discussion. In Fig. 71 we show a diagram, like Fig.
66, in which we plot $-\frac{2Z(r)}{r} + \frac{l(l+1)}{r^2}$. The potential is so
chosen that for r greater than unity, Z(r) is just unity, but for
smaller r's Z(r) = 10 − 9r, so that the charge approaches 10
at r = 0, but joins on smoothly at r = 1. It is obvious that the
s electrons are greatly affected by the change in potential. The
1s wave function is located practically all inside r = 1. Thus
its potential curve is practically $-\frac{2(10 - 9r)}{r} = -\frac{2(10)}{r} + 2(9)$
for the whole range. In other words, it is like a hydrogen problem
of nuclear charge 10 units, but with the constant correction
2(9) to be added to the energy. The energy of such a state would
be $-(10)^2 = -100$, and when we add our constant 18, it is −82
units, showing that this level is very tightly bound. Similarly
the 2s is largely inside, though not so completely, and to a somewhat
poorer approximation its energy is $-\frac{(10)^2}{4} + 18 = -7$ units.
r = 1, where the potential is hydrogen-like with charge 1, and we
shall discuss them in a moment. The p, d, . . . states, on the
other hand, are almost entirely outside the range where the
potential is not hydrogen-like. Their energy levels and wave
functions are almost exactly like those of hydrogen.
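The arithmetic behind the inner-level estimates can be written out as a short sketch. It assumes, as the text does, that a level lying wholly inside r = 1 sees the potential $-2(10)/r + 2(9)$, so that its energy is the hydrogen-like value $-Z^2/n^2$ for Z = 10 plus the constant 18. The function names and default arguments are hypothetical.

```python
def hydrogenic_energy(Z, n):
    """Energy -Z^2 / n^2 of a hydrogen-like level, in the atomic units of the text."""
    return -(Z**2) / n**2


def inner_level_estimate(n, inner_charge=10.0, slope=9.0):
    """Estimate for a level entirely inside r = 1, where Z(r) = inner_charge - slope r.

    There -2 Z(r)/r = -2 inner_charge / r + 2 slope, a hydrogen-like potential
    plus the constant 2 * slope, which is simply added to the energy.
    """
    return hydrogenic_energy(inner_charge, n) + 2.0 * slope


if __name__ == "__main__":
    print("1s estimate:", inner_level_estimate(1))
    print("2s estimate:", inner_level_estimate(2))
    print("hydrogen 2p for comparison:", hydrogenic_energy(1, 2))
```

The estimate is only as good as the assumption that the wave function stays inside r = 1, which is why the text calls the 2s value a somewhat poorer approximation than the 1s.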
It is seen from this discussion that we can divide the levels in
such a case into three classes: (1) those entirely inside the range
of large potential, which will prove to be those inside the atom;
(2) those half in and half out; and (3) those entirely outside.
The levels of larger l values do not penetrate the inside, and
belong to group 3. In this case, we reach this situation with l =
1, but with larger cores of negative charge about the nucleus, and
so larger regions where the potential is much greater numerically
than in hydrogen, the p electrons, or in some cases the d or even
f electrons, are penetrating. For the lowest l values, in any
420 INTRODUCTION TO THEORETICAL PHYSICS
case, the orbits of large n are partly outside but penetrate inside,
and those of small n are entirely inside the core of negative charge.
These penetrating orbits have quite different energy values from
the nonpenetrating ones, so that the different series do not lie
Fig. 71. Potential and energy levels for a central field, with Z(r) = 10 - 9r
from r = 0 to 1, Z(r) = 1 for r greater than unity. Left-hand diagram on
different energy scale.
on top of each other, as in hydrogen. For the orbits which
penetrate in their inner parts only, we get a formula for the
energy, from the quantum condition. This formula is most
conveniently derived using Bohr's form of the azimuthal quantum
condition. We have $\oint p_r\,dr = n_r h$ for the radial quantum
condition. Then for hydrogen, $\oint p_r\,dr + kh = nh = \frac{h}{\sqrt{-E}}$,
where k is Bohr's azimuthal quantum number. Thus $\oint p_r\,dr = \frac{h}{\sqrt{-E}} - kh$.
For a penetrating orbit with our form of potential,
the integral over the outer part of the orbit, where the potential,
and hence $p_r$, are hydrogen-like, will have just the same value as
here, if we use the proper energy. For the inside, however, $p_r$
is much greater, so that there is an additional contribution to the
integral, as we see from Fig. 72. This contribution, moreover,
is roughly the same for all terms of the same k value, since the
Fig. 72. — Phase space and phase integral for r, penetrating and nonpenetrat-
ing orbits. (1) and (2): Nonpenetrating orbits of same k, different n. (3)
combined with (2) : Penetrating orbit, having same energy as (2) , but in a non-
Coulomb field, so that it has a different quantum number and phase integral.
Shaded area represents the quantum defect $\delta$.
inner part of the orbit depends almost entirely on the angular
momentum alone. Thus we have for the general case
$$\oint p_r\,dr = \frac{h}{\sqrt{-E}} - kh + \delta_l h,$$
where $\delta_l$ is a function of k only, to the first approximation. The
result must be $n_r h$, by the radial quantum condition, so that we
have
$$E = -\frac{1}{(n_r + k - \delta_l)^2} = -\frac{1}{(n - \delta_l)^2}, \qquad (10)$$
where n is the total quantum number, and where $\delta_l$ is called the
quantum defect. A more careful discussion, using the Wentzel-Kramers-Brillouin
method, shows that the same formula still
holds when we use $\sqrt{l(l + 1)}$ in place of k, and remember that
we must use half quantum numbers. This formula, which can
be written, in wave numbers, $E = -\frac{R}{(n - \delta_l)^2}$, is called Rydberg's
formula, and was first discovered experimentally by Rydberg.
We see then that the penetrating orbits fall into series as
the nonpenetrating ones do, but that we must subtract the
quantum defect from the quantum numbers. These quantum
defects range from 0 for the nonpenetrating orbits to sometimes
quite large values, even of the order of 5 or 6, for the s electrons
of heavy atoms. From experimental observations of spectral
series, we can find the quantum defects, and so tell which orbits
are penetrating, and which are not. In the next chapter we
shall discuss in more detail the energy levels for the orbits entirely
inside the atom, which are most directly concerned in atomic
structure.
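The step from an observed level to its quantum defect is a one-line inversion of Eq. (10), sketched below in the units where hydrogen has E = -1/n^2. The function names and the sample defect 1.35 are hypothetical, chosen only to show the round trip; they are not values from the text.

```python
import math


def rydberg_energy(n, delta):
    """Energy -1/(n - delta)^2, in the units where hydrogen has E = -1/n^2."""
    return -1.0 / (n - delta) ** 2


def quantum_defect(n, energy):
    """Invert the Rydberg formula: delta = n - 1/sqrt(-E) for a bound level."""
    if energy >= 0:
        raise ValueError("a bound level must have negative energy")
    return n - 1.0 / math.sqrt(-energy)


# Hypothetical defect, used only to illustrate the round trip.
delta_s = 1.35
for n in (3, 4, 5):
    e = rydberg_energy(n, delta_s)
    print(n, round(e, 4), round(quantum_defect(n, e), 4))
```

Applied to measured term values of one series, a defect that stays nearly constant from level to level marks the series as following Rydberg's formula, and a defect near zero marks the orbits as nonpenetrating.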
The wave functions for the central field of the type we are
discussing are not very different in general from those for hydro-
gen. But there are important differences in detail. We note
that a hydrogen-like orbit corresponding to the problem of
nuclear charge Z is 1/Z times as great as that for nuclear charge
1. Hence, in the case of Fig. 71, the 1s and 2s orbits are something
like 1/10 as large as for hydrogen. The penetrating orbits,
like 3s, 4s, etc., will have the inner loops small in proportion, as
the 1s and 2s are, but the outer parts, being in a field of charge 1,
will be large. Thus there will be a much greater disparity
between the size of the inner and outer loops than even for hydro-
gen, the outer ones being much more important in consequence.
We may see this from the Wentzel-Kramers-Brillouin method.
Here both amplitude and wave length decrease as $p_r$ increases, the wave
length as $1/p_r$ and the amplitude as $1/\sqrt{p_r}$. In
the penetrating part of the orbit, $p_r$ is much greater than for
hydrogen, for the same total energy, so that amplitude and wave
length become extremely small. The physical way to say this
is that the electron moves very fast when it penetrates the core
and is exposed to the whole charge of the nucleus, and hence
spends but a very short time there, so that the wave function is
small. For actually computing the wave functions, we can best
use numerical integration of the differential equation, or the
method of Wentzel, Kramers, and Brillouin. We shall discuss
wave functions more in detail in the next chapter.
Problems
1. Work out the spherical harmonics for l = 3, and draw diagrams for
them similar to Fig. 69.
2. Prove from the differential equation that the associated spherical
harmonics are orthogonal. Verify this for the cases of l = 1 and 2.
3. Carry out the solution of the radial wave function for hydrogen, deriving
Eqs. (5), (6), and (7), following the method outlined in the text, and
verifying that if the series does not break off it represents a function which
becomes infinite as r approaches infinity.
4. Show that $y^2\,dr$, where y = rR, is proportional to the probability of
finding the electron between r and r + dr. Compute radial wave functions
for states 1s, 2s, 3s for hydrogen, and draw graphs of $y^2$.
5. Prove that for a radial wave function without nodes (l = n - 1), for
nuclear charge Z, the maximum of y comes at $n^2/Z$.
6. Using the results of Prob. 3, Chap. IX, set up the radial phase integral
for Bohr's model of hydrogen, showing that $E = -1/n^2$. Using the properties
of the ellipse mentioned in Prob. 4, Chap. VII, verify the statements
of Sec. 242 regarding the dimensions of the orbits.
7. Draw an energy level diagram in which the substates of different m's
are shown, drawing them as if slightly separated, including states 1s, 2s,
3s, 2p, 3p, 3d. Indicate all transitions allowed by the selection principles
for l and m, as in Fig. 70.
8. Prove that the potential used in Fig. 71 is what would be found with
a nucleus of 10 units charge, surrounded at distance unity by a hollow sphere,
with 9 units of negative charge uniformly distributed over the surface.
9. A rough model of the inner electrons of the sodium atom can be
obtained by assuming the nucleus of charge 11 units; a shell of radius 0.09
units, with two electronic charges spread over the surface; and a shell of
radius 0.58 units, with 8 electrons spread over it, so that the net charge is
1 unit positive. Set up a diagram like Fig. 71 for such a potential field,
drawing the potential functions for s, p, d electrons. Find which orbits
are nonpenetrating.
10. Using the potential of Fig. 71, and Bohr's azimuthal quantum condi-
tion, compute the positions of 3s, 4s, and 5s levels. To do this, evaluate
the radial quantum integral, computing separately the parts inside and
outside r = 1, set the sum equal to $n_r h$, and solve for the energy, using
numerical methods if necessary to solve the transcendental equation. Find
how closely the result fits with the Rydberg formula, computing quantum
defects for each level.
11. In the field of Fig. 71, the p electrons do not have exactly the hydrogen
energies, for their wave function is not zero in the region inside r = 1,
where the potential is not hydrogen-like. Compute the first-order perturbed
value of the energies of 2p, 3p, 4p, by using hydrogen wave functions as the
starting point of a perturbation calculation, and assuming the difference
between the hydrogen potential and the actual one as perturbative potential.
Compute quantum defects for each level, seeing how well the Rydberg
formula is obeyed. It is to be noted that in such a case as this, the second-
order perturbation is often more important than the first, so that our calcula-
tion is not very accurate.
12. Apply the Wentzel-Kramers-Brillouin method to the wave functions
of hydrogen, computing approximate radial functions for 3p, 4p, and com-
paring with the exact solutions.
CHAPTER XXXIV
ATOMIC STRUCTURE
The electrons in an atom move, to an approximation, in central
fields of force, each in the field produced by the nucleus and the
average charge of the other electrons. Thus, as we have seen
in the last chapter, there are different quantum numbers which
they can have. We can have in an atom 1s, 2s, 2p, . . . electrons.
All electrons of a given total quantum number, inside the
atom, have roughly the same radius for the maximum of their
wave functions, and roughly the same energy, in contrast to the
electrons which are largely outside, in which s and p electrons
are more tightly bound on account of penetration. We can then
group the electrons of the same total quantum number together
into shells, those of n = 1 forming what is called the K shell,
those with n = 2 the L shell, n = 3 the M shell, etc., the letters
K,L,M, . . . coming from x-ray notation. The inner electrons
are the most tightly bound and hardest to remove, and hence
connected with the highest frequencies in the spectrum: the K
series of x-rays, connected with the electrons of the K shell, has
shortest wave length, L series next, and so on. On the other
hand, an outer electron is shielded from the nuclear attraction
by the presence of the other electrons; for the electrical force
acting on a charge in a spherical distribution is what we should
have if we imagined a sphere drawn about the center through the
charge in question, forgot about all charge outside this sphere,
imagined the charge inside the sphere concentrated at the center,
and calculated its attraction by the inverse square law. Thus
for an inner electron we forget about almost all the other electrons
and have practically the unadulterated attraction of the
nucleus, but with an outer electron the number of other electrons
within the sphere is almost equal to the nuclear charge and almost
cancels it, leaving only a small net attraction, and an easily
detached electron. It is convenient in this connection to speak
of an "effective nuclear charge" $Z_e$, and a shielding constant S;
$Z_e$ is the charge which, placed at the center, would produce the
same attraction as the nucleus and electrons, and thus varies
from Z for the inner electrons down to the order of magnitude of
1 for the outer ones, and S is defined by $Z_e = Z - S$, so that S
measures roughly the number of electrons inside the sphere in
question. In general, we see that each electron in an atom, or
at least each shell, will have a different shielding constant. And
now it is an important fact that the energies involved in ordinary
chemical and physical processes are only large enough to remove
or disturb the outer electrons of an atom, and leave the inner
ones unaffected. Only x-rays, very violent bombardment, and
such extreme means can disturb the inner electrons, and as a
result we need not consider them in ordinary chemical and physi-
cal applications.
246. The Periodic Table.— The series K,L, M, . . . of shells
has no obvious end, and yet an atom has but a finite number of
electrons. It is evident, then, that the shells cannot all be filled.
The attraction of the nucleus will pull electrons into the lowest
shells, until they are filled, and then the rest will have to go into
higher ones. The capacity of a shell is strictly limited, according
to a very important principle called the exclusion principle
(excluding more than a certain number of electrons from a shell),
so that a K shell can contain only 2 electrons, an L shell 8, an
M shell 18, an N shell 32, and so on. Using this principle, we
can begin to see how the atoms build up, and in so doing we under-
stand the structure of the periodic table (see Fig. 73), the fact
that when atoms are tabulated according to atomic number their
properties repeat themselves in a regular way. Thus hydrogen
has but one electron, which naturally prefers to go into the K
shell. Of course, it does not have to; it can be in a higher shell,
or level, corresponding to a higher energy, and then it is an
excited electron. But this is not a stable situation: collision with
another atom or molecule, or interaction with radiation, is most
likely to absorb the extra energy and permit the atom to fall to
its lowest and most stable energy level, losing its excitation, so
that this lowest level is the normal state. This situation of the
existence of excited states, but the preference for the normal
state, is characteristic of all the atoms, and for the moment we
are describing the normal states, for they are the ones in which
we ordinarily find the atoms.
To resume, helium has two electrons, and in the normal state
they are both in the K shell. This shell is now completed, no
more electrons can be bound in it, and such a completed shell is
characteristic of the inert gases, of which helium is one. Lithium,
with three electrons, would have two K electrons, and one L, and
the latter would be loosely bound, and could be easily detached.
In connection with this, we observe that lithium is an alkali
metal, very much inclined to form a singly charged positive ion,
which it does by losing the one electron, the loss of unit negative
Fig. 73. Periodic table of the elements, with electron configuration of lowest
states.
charge being the same as gaining unit positive charge. Next,
beryllium with four electrons has two K'& and two L's, and can
easily lose the latter to form a divalent positive ion. Thus we
go through boron with two K's and three L's, carbon with four
L's (forming sometimes the ion with four positive charges) and
nitrogen with five L's. By this time, however, the attractions
between the outer electrons and the nucleus have become rather
large, and they are not easy to detach. The reason for this is
that as we get more electrons in a shell, the effective nuclear
charge gets larger. For the electrons in a shell cannot shield
each other very effectively; off hand we cannot say whether they
are inside or outside the sphere of the last paragraph, and as a
matter of fact the contribution to the shielding constant made
by an electron in the same shell we are considering is only about
0.35 of an electronic unit. Thus if the effective nuclear charge
for lithium's L electron were 1.30 (which is about the right
amount, equal to Z — S where Z = 3, S = 1.70 for the two K
electrons, which do not shield perfectly), then for one of the two L
electrons in beryllium we should have 4.00 — 1.70 — 0.35 = 1.95,
and for an L electron in boron 5.00 - 1.70 - 0.70 = 2.60, increas-
ing 0.65 for each atom, until for nitrogen we have 3.90 and for
oxygen 4.55. Since the electrostatic attractions are proportional to
the nuclear charge, this means that it is much harder to remove an
electron from nitrogen than from lithium. By the time we come
to oxygen and fluorine, we hardly have positive ions formed at
all. But now another situation comes in : the attractions become
so strong that an atom can pull an extra electron or two into its
outer shell, forming a negative ion. Thus oxygen very easily
forms a singly charged negative ion, and sometimes a doubly
charged one. It can not go farther than this, for with two extra
electrons its L shell has eight electrons and is completed. Simi-
larly, fluorine can form a singly charged negative ion, but no
more. And finally neon, with ten electrons, has two K's, eight
L's, and consists of closed shells. It is the next inert gas after
helium. It forms no ions : it would have to hold an extra electron
in the M shell, and this would not be tightly bound, so that it
would not stay; or to form a positive ion, it would have to lose
one of its L electrons, and these are held too tightly to be removed
by ordinary chemical processes. Thus it is inert.
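The run of effective nuclear charges quoted above (1.30 for lithium, rising by 0.65 per element) follows from two numbers in the text: 1.70 for the joint shielding of the two K electrons and 0.35 for each other electron in the L shell. A minimal sketch, with an illustrative function name that is not from the text:

```python
K_SHIELDING = 1.70           # joint shielding by the two K electrons, as quoted
SAME_SHELL_SHIELDING = 0.35  # per other electron in the same (L) shell


def l_shell_effective_charge(z):
    """Effective nuclear charge Z - S for one L electron, Z = 3 (Li) to 10 (Ne)."""
    if not 3 <= z <= 10:
        raise ValueError("only the elements lithium through neon are covered")
    other_l_electrons = (z - 2) - 1  # L electrons other than the one considered
    return z - K_SHIELDING - SAME_SHELL_SHIELDING * other_l_electrons


for symbol, z in [("Li", 3), ("Be", 4), ("B", 5), ("C", 6), ("N", 7), ("O", 8)]:
    print(symbol, f"{l_shell_effective_charge(z):.2f}")
```

Each step adds one unit of nuclear charge but only 0.35 of shielding, which is the 0.65 increase per atom noted in the text.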
After neon, we next come to sodium, with eleven electrons.
This has two K's, eight L's, and the next electron must be an M.
That is, it has one loosely bound electron, just like lithium. It
again has a tendency to form a singly charged positive ion, and
is an alkali metal like lithium. Magnesium, next, has two M
electrons, and is like beryllium. We begin here to see the origin
of the periodic table, for we have advanced by eight in our series
of elements and have come to elements of similar properties.
The similarity persists in this way up through argon, with eighteen
electrons. At that point, we must take account of a further
fact which we have not mentioned. Each of these shells is
really subdivided into subshells, of slightly different size and
energy. The subshells are determined by the azimuthal quantum
numbers, the states s, p, d, . . . of the same total quantum
number becoming less tightly bound as we go out in the series,
on account of decreased penetration. The maximum number of
electrons in a shell of a given designation is invariable: an s group
can have only 2 electrons, a p group 6, a d group 10, an f group
14, and so on (2 X 1, 2 X 3, 2 X 5, 2 X 7, . . . , or in general
2 X the number of subgroups of different m values). Now the
K shell contains only the s group, accounting, therefore, for its
maximum number 2 of electrons. The L shell contains a 2s and
a 2p group, so that its maximum number is 2 + 6 = 8. Similarly
the M has subshells 3s, 3p, 3d, with a maximum number
2 + 6 + 10 = 18, and N has 4s, 4p, 4d, 4f, with a possibility of
2 + 6 + 10 + 14 = 32 electrons. When now we examine the
energies of these various groups, we discover that the differences
of energy between different subgroups of a shell may often be
larger than those between different shells, with a result that the
order of groups is changed. As a matter of fact, beginning with
the most tightly bound shells, the groups are arranged as far as
their energy is concerned approximately as shown in the following
table, in which the first line gives the group, the second the num-
ber of electrons in the group, the third the total number of elec-
trons in that group and all inside it, and the last the element
completing the group, whose atomic number therefore stands
just above it:
Group       1s  2s  2p  3s  3p  4s  3d  4p  5s  4d  5p  6s  4f  5d  6p  7s
Electrons    2   2   6   2   6   2  10   6   2  10   6   2  14  10   6   2
Total        2   4  10  12  18  20  30  36  38  48  54  56  70  80  86  88
Element     He  Be  Ne  Mg   A  Ca  Zn  Kr  Sr  Cd  Xe  Ba  Yb  Hg  Rn  Ra
Within each shell the subshells are arranged in the order stated,
but there is overlapping between the shells.
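The second and third lines of the table follow mechanically from the first. The sketch below takes the order of groups as given, assigns each group its capacity 2(2l + 1), and accumulates the totals; the variable names are illustrative only.

```python
from itertools import accumulate

# Order of groups as listed in the table, most tightly bound first.
ORDER = ["1s", "2s", "2p", "3s", "3p", "4s", "3d", "4p",
         "5s", "4d", "5p", "6s", "4f", "5d", "6p", "7s"]
L_VALUE = {"s": 0, "p": 1, "d": 2, "f": 3}


def capacity(group):
    """Maximum number of electrons in a group: 2 x (number of m values)."""
    return 2 * (2 * L_VALUE[group[-1]] + 1)


capacities = [capacity(g) for g in ORDER]
totals = list(accumulate(capacities))

for group, cap, total in zip(ORDER, capacities, totals):
    print(f"{group:>3} {cap:>3} {total:>3}")
```

The running totals are the atomic numbers of the elements that complete each group, so the inert gases after helium appear wherever a p group closes.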
We now see that at A (argon), although the M shell is not completed,
still the 3p subshell is, and this is enough to form a closed
group and an inert gas. Next we come to K, 19, with one 4s
electron, another alkali, and Ca, 20, with two, an alkaline earth
like Be and Mg. But now instead of forming a group of 8 by
adding p electrons, the next additions go into the 3d shell, and
only after that is filled up do they go into 4p, so that by the time
we come to the next inert gas, Kr, we have added 18 electrons
rather than 8 after A. The series of elements in which the 3d
electrons are being added is the iron group. These have con-
siderable similarity, because although the 3d electrons are less
tightly bound than the 4s, they are farther inside the atom, and
the outside parts of these atoms are quite similar. When we go
beyond Kr, we repeat the same sort of process, having another
group of 18 elements in which the 5s, 4d, and 5p electrons are
being added, before coming to the next inert gas Xe. The
transition group which we go through here is the Pd group.
Next after that, after adding the two 6s electrons to form Ba,
the whole group of 14 4f electrons is added, resulting in a long
group of remarkably similar elements, the rare earths. As a
matter of fact, these elements have one 5d electron each, so that
our scheme is a little misleading in respect to them. After
finishing the 4f group, the normal procedure repeats itself, the
5d and 6p being added to complete the shell of 18 interrupted by
the rare earths and terminated at Rn, and finally the 7-quantum
electrons being added to give the elements of the last, incom-
pleted row of the table.
It is often convenient, in describing an atom in any state, to
give the number of electrons having each quantum number by a
symbol, as $1s^2 2s^2 2p^6 3s$ for the normal state of Na, meaning that
there are two 1s, two 2s, six 2p, one 3s electron. Such an arrangement
is called a configuration. And a transition between two
stationary states can be conveniently denoted by writing the
two configurations. Thus the transition $1s^2 2s^2 2p^6 3p \rightarrow 1s^2 2s^2 2p^6 3s$
for Na is a line of the principal series in the optical spectrum;
the transition Na $1s^2 2s^2 2p^6 3s \rightarrow$ Na+ $1s\,2s^2 2p^6 3s$ represents the process
of ionizing one of the K electrons of Na; and so on.
247. The Method of Self-consistent Fields. We have just
seen that the electrons of an atom act approximately as if they
moved in central fields, rather than under the action of the other
electrons, and have shown that this leads to quantum numbers
for the electrons, to shells resulting from this, and to the periodic
properties of the elements as successive shells are filled up. In
making this idea more precise, we meet the method of self-con-
sistent fields, developed by Hartree. In this method we assume
that
1. The field in which the kth electron moves is obtained by
taking the wave function of each of the other electrons, squaring
to get the average density of charge due to these electrons, averag-
ing over angles to get a spherically symmetrical distribution,
adding all these charge densities together, and finding the poten-
tial, together with that of the nucleus, by electrostatics. This,
of course, will give a nonhydrogenic field, different for each
electron.
2. To get the wave function of the kth electron, we solve
Schrodinger's equation for the field above, using the appropriate
quantum numbers. Since the field is nonhydrogenic, we must
use numerical methods, or the Wentzel-Kramers-Brillouin
method.
The final wave functions found in step 2 must be the
same ones with which we started step 1. It is this fact which
leads to the name "self-consistent." If we started with arbitrary
wave functions, computed a field, solved for the wave functions
in that field, the final functions would not in general agree with
the original ones. If we keep on repeating the process, however,
using in each case the final wave functions of one stage of the
calculation to begin the next, it rapidly converges so that after
a few repetitions the field is approximately self-consistent.
This method has been used for numerical computation of the
wave functions of a number of atoms.
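The repetition described above has the structure of a fixed-point iteration: feed the output of one cycle in as the input of the next until the two agree. The sketch below shows only that loop. A single number stands in for the whole set of wave functions, and `toy_cycle` is an arbitrary made-up contraction so that the loop has something to converge on; it is not Hartree's field calculation, and all names here are hypothetical.

```python
def iterate_to_self_consistency(update, start, tol=1e-10, max_iter=200):
    """Repeat state -> update(state) until input and output agree within tol.

    `update` stands for one full cycle: build the field from the current wave
    functions, then solve for new wave functions in that field.
    Returns the converged state and the number of cycles used.
    """
    state = start
    for cycle in range(1, max_iter + 1):
        new_state = update(state)
        if abs(new_state - state) < tol:
            return new_state, cycle
        state = new_state
    raise RuntimeError("no self-consistency reached within max_iter cycles")


def toy_cycle(zeta, z=2.0):
    """Made-up stand-in for one cycle; NOT Hartree's field calculation."""
    return z - 0.3 * zeta / (1.0 + zeta)


zeta, cycles = iterate_to_self_consistency(toy_cycle, start=1.0)
print(f"self-consistent value {zeta:.6f} after {cycles} cycles")
```

In a real calculation the state would be the set of radial functions on a grid, and the convergence test would compare fields or charge densities rather than a single number.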
248. Effective Nuclear Charges. — The method of self-
consistent fields, though quite accurate, demands numerical
computation, and is not well suited for elementary calculations.
We may instead approximate the wave function of each electron
by a hydrogen wave function, corresponding to an effective
nuclear charge $Z - S_i$. To get $S_i$, we should add up the total
number of electrons within a sphere whose radius is the effective
radius of the ith electron's wave function. It is easier to figure,
not by means of the radius, but from the quantum number, since
to a rough approximation the radius of an orbit is $n^2/(Z - S_i)$,
so that electrons inside a given one are those of smaller total
quantum number. The following table proves to give roughly
the contribution to the shielding constant of a given electron
from each other type of electron, valid for the electrons found
in the light atoms. We see that the shielding of one electron
by a second does not go suddenly from unity to zero as the
shielding electron's quantum number becomes greater than
that of the shielded electron, but instead changes gradually,
in accordance with the fact that each electron really has charge
distributed over all distances, and it is possible for part of the
charge to be inside, part outside, a given radius.
Table 1. Contribution of One Shielding Electron, of Given
Quantum Number, to Shielding Constant of Shielded Electron

Shielded             Shielding electron
electron      1s      2s      2p      3s      3p
   1s        0.35
   2s        0.85    0.35    0.35
   2p        0.85    0.35    0.35
   3s        1.00    0.85    0.85    0.35    0.35
   3p        1.00    0.85    0.85    0.35    0.35
To illustrate the use of this table, let us take the case of
Na, Z = 11, in its normal state $1s^2 2s^2 2p^6 3s$. Evidently we
have three shells, corresponding to the three values of n. Then
we have
n = 1: S = 0.35, radius = $n^2/(Z - S)$ = 1/10.65 = 0.09
n = 2: S = 2(0.85) + 7(0.35) = 4.15, radius = 4/6.85 = 0.58
n = 3: S = 2 + 8(0.85) = 8.80, radius = 9/2.20 = 4.09.
The inner radii are as given in Prob. 9, Chap. XXXIII.
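The sodium example can be reproduced directly from Table 1. The sketch below indexes the table by total quantum number only, since the s and p entries of a shell coincide, and it assumes that the blank entries (shielding electron outside the shielded one) count as zero. The dictionary layout and function names are illustrative choices, not part of the text.

```python
# Table 1 indexed as SHIELDING[n_shielded][n_shielding].
# Assumption: blank entries in the table are treated as zero.
SHIELDING = {
    1: {1: 0.35},
    2: {1: 0.85, 2: 0.35},
    3: {1: 1.00, 2: 0.85, 3: 0.35},
}


def shielding_constant(n, shells):
    """Shielding constant S for one electron in shell n.

    `shells` maps total quantum number -> number of electrons in that shell.
    """
    total = 0.0
    for m, count in shells.items():
        others = count - 1 if m == n else count  # an electron does not shield itself
        total += SHIELDING[n].get(m, 0.0) * others
    return total


def orbit_radius(n, z, shells):
    """Rough radius n^2 / (Z - S), in atomic units."""
    return n ** 2 / (z - shielding_constant(n, shells))


sodium = {1: 2, 2: 8, 3: 1}  # 1s^2 2s^2 2p^6 3s
for n in sodium:
    s = shielding_constant(n, sodium)
    print(f"n = {n}: S = {s:.2f}, radius = {orbit_radius(n, 11, sodium):.2f}")
```

The same two functions apply to any light atom whose electrons all have n of 3 or less, by changing `sodium` and the nuclear charge.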
The calculations we have given so far refer to wave functions,
rather than energy levels. To investigate the latter, we must
make a more careful discussion of the theory of the many-body
problem and its treatment by Schrodinger's equation.
249. The Many-body Problem in Wave Mechanics. — Our
treatment of atomic structure so far has been rather intuitive,
not based directly on Schrodinger's equation at all. We have
not yet set up the problem of many bodies in wave mechanics.
To do so, we proceed as follows: Let the problem have N generalized
coordinates, $q_1, \ldots, q_N$. Then we seek a wave function
$\psi(q_1 \ldots q_N, t)$, such that $\bar{\psi}\psi\,dq_1 \ldots dq_N$ gives the probability
that the coordinates will be found at time t in the region $dq_1 \ldots dq_N$.
To set up Schrodinger's equation, we take the classical
Hamiltonian function, convert it into an operator H by substituting
$\frac{h}{2\pi i}\frac{\partial}{\partial q_i}$ for $p_i$, and write the equation
$H\psi = -\frac{h}{2\pi i}\frac{\partial \psi}{\partial t}$.
We eliminate time as usual, and have a differential equation
for $u(q_1 \ldots q_N)$, which is Hu = Eu, E being the energy of the
whole system.
There is one simple case of the many-body problem: that
where there are many particles, exerting no forces on each other.
That is, we may have n particles, whose coordinates are $x_1 y_1 z_1 \ldots x_n y_n z_n$,
and the potential is $V = V_1(x_1 y_1 z_1) + \cdots + V_n(x_n y_n z_n)$,
without any terms involving coordinates of two particles
simultaneously. For such a potential,
$\frac{\partial V}{\partial x_i} = \frac{\partial V_i}{\partial x_i}(x_i y_i z_i)$,
giving a force on the ith particle depending only on the coordinates
of that particle. In such a case, we can separate variables,
writing $u = u_1(x_1 y_1 z_1) \cdots u_n(x_n y_n z_n)$. For Schrodinger's equation
can be written
$$\left[\left(-\frac{h^2}{8\pi^2 m_1}\nabla_1^2 + V_1\right) + \cdots + \left(-\frac{h^2}{8\pi^2 m_n}\nabla_n^2 + V_n\right)\right]u = Eu, \qquad (1)$$
where $\nabla_i^2$ means $\partial^2/\partial x_i^2 + \partial^2/\partial y_i^2 + \partial^2/\partial z_i^2$. A separation of
variables can be carried through in the usual way, and can be
summarized as follows: if we write u as a product, as above,
then Schrodinger's equation is satisfied if
$$\left(-\frac{h^2}{8\pi^2 m_i}\nabla_i^2 + V_i\right)u_i = E_i u_i, \qquad E_1 + \cdots + E_n = E. \qquad (2)$$
In the case of atomic structure, and in general with the struc-
ture of matter, there are forces between the electrons. But
here it is possible to make an approximation, as we have done:
we replace the actual force between a given electron, say the ith,
and the others, by the average which it would have from the
mean distributions of the other electrons in space. Roughly
we may say that, while the force with any particular arrangement
of the other electrons will differ from this value, it will average
out to give our mean value, and the deviations from the mean
will not be so large as to destroy the approximation. Thus,
using such a method, each electron becomes acted on, not by
the other electrons, but by an averaged field. It is the motion
in this field that we have considered in the present chapter.
250. Schrodinger's Equation and Effective Nuclear Charges. —
The result of the approximate calculation we have made has
been a set of one-electron wave functions, one for each electron
of the atom. These satisfy equations which, in atomic units,
are
$$\left(-\nabla_i^2 - \frac{2(Z - S_i)}{r_i}\right)u_i = -\frac{(Z - S_i)^2}{n_i^2}\,u_i. \qquad (3)$$
Now the potential energy of the whole atom, in atomic units, is
$$-\sum_i \frac{2Z}{r_i} + \sum_{\text{all pairs}} \frac{2}{r_{ij}}, \qquad (4)$$
if $r_{ij}$ is the distance between the ith and jth electrons. Thus the
Hamiltonian is
$$H = \sum_i \left(-\nabla_i^2 - \frac{2Z}{r_i} + \sum_{j \text{ inside } i} \frac{2}{r_{ij}} + \sum_{\substack{j \text{ in same} \\ \text{shell as } i}} \frac{1}{r_{ij}}\right), \qquad (5)$$
where the two summations are the same thing as the sum over
all pairs. If now we assume that $u = u_1 \cdots u_n$, where the u's
are as we have found, and try to see how good an approximation
this forms, we have, substituting for the Laplacians, from Eq. (3),
$$Hu = \left[-\sum_i \frac{(Z - S_i)^2}{n_i^2} + \sum_i \left(-\frac{2S_i}{r_i} + \sum_{j \text{ inside } i} \frac{2}{r_{ij}} + \sum_{\substack{j \text{ in same} \\ \text{shell as } i}} \frac{1}{r_{ij}}\right)\right]u. \qquad (6)$$
If Schrodinger's equation were satisfied, this would be Eu, where
E is a constant. This is not true; the first term is a constant
times u, but the second is a variable function of the r's times u.
The average value of the last term, however, is approximately
zero. For $2/r_{ij}$ is the potential, at the ith electron, of the jth
electron. If the latter is inside, and we average over its position,
and average to make it spherically symmetrical, the potential
will be the same as if it were concentrated at the center, or will
be $2/r_i$. For an electron in the same shell, it turns out that the
average of $1/r_{ij}$ is about $2(0.35)/r_i$. The summation, for all
electrons inside or in the same shell as i, is then essentially
$2S_i/r_i$, just canceling the first term, and leaving as the result,
using this approximate method of averaging,
$$Hu = -\sum_i \frac{(Z - S_i)^2}{n_i^2}\,u, \qquad (7)$$
showing that we have an approximate solution, and that the
energy of the atom is $-\sum_i (Z - S_i)^2/n_i^2$. This represents the negative
of the energy required to remove all the electrons from the
atom. If we wish to find the energy of the atom by first-order
perturbation theory, we recall that we must find the diagonal
term of the energy matrix, or $\int \bar{u} H u\,dv$. This means averaging
the energy over the wave function, or over the motions of the
electrons; and to the same approximation we have just used,
the summations average to zero, leaving the same energy we just
found.
As an example of the calculation of energy, we can again
take the case of Na. The energy of normal Na is, using the
2( —\ — ) + 8( — ~ )
+
= —321.4 units. With one Is electron removed,
making the appropriate changes in shielding constants, the energy
to -|7!y»Y + 8( 7 4°Y + (mX] = -240.6unit, Thedif-
[(^y + s(-y +(*#•)]
ference is 80.8 units, or 1,094 volt-electrons, representing the
ionization potential. Similarly with the 2s removed, the energy
K^M^y + m}=
is -^2\^—J + 7^^ ) + I ^p I J = -318.6, leaving an
ionization potential of 2.8 units, or about 38 volt-electrons.
Finally the ionization potential of the 3s, as we immediately
(¥*)' - •
see, is simply I ~^— J = 0.54 unit = 7.3 volt-electrons.
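These sodium energies can be recomputed from Table 1 with a few lines. The sketch assumes the energy formula $-\sum_i (Z - S_i)^2/n_i^2$ of the preceding section, indexes the table by total quantum number only, and treats blank table entries as zero; the names are illustrative, not from the text. Evaluated this way, the atom comes out near -321.2 and the ion with one 1s electron removed near -240.7, within a few tenths of a unit of the rounded figures quoted above, while the 2s and 3s cases reproduce -318.6 and 0.54.

```python
# Table 1 indexed as SHIELDING[n_shielded][n_shielding].
# Assumption: blank entries in the table are treated as zero.
SHIELDING = {
    1: {1: 0.35},
    2: {1: 0.85, 2: 0.35},
    3: {1: 1.00, 2: 0.85, 3: 0.35},
}


def shielding_constant(n, shells):
    """Shielding constant S for one electron in shell n (shells: n -> occupancy)."""
    total = 0.0
    for m, count in shells.items():
        others = count - 1 if m == n else count  # an electron does not shield itself
        total += SHIELDING[n].get(m, 0.0) * others
    return total


def total_energy(z, shells):
    """Energy -sum over electrons of (Z - S_i)^2 / n_i^2, in the text's units."""
    return -sum(
        count * ((z - shielding_constant(n, shells)) / n) ** 2
        for n, count in shells.items()
    )


Z = 11
atom = {1: 2, 2: 8, 3: 1}
ions = {
    "1s": {1: 1, 2: 8, 3: 1},  # one K electron removed
    "2s": {1: 2, 2: 7, 3: 1},  # one L electron removed
    "3s": {1: 2, 2: 8},        # the M electron removed
}

e_atom = total_energy(Z, atom)
print(f"atom: {e_atom:.1f}")
for removed, shells in ions.items():
    e_ion = total_energy(Z, shells)
    print(f"{removed} removed: {e_ion:.1f}, ionization energy {e_ion - e_atom:.2f}")
```

Because each ionization energy is a small difference of large totals, discrepancies of a few tenths of a unit in the totals matter most for the loosely bound electrons.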
251. Ionization Potentials and One-electron Energies. In
the method of self-consistent fields, each electronic wave function
is the solution of a central field problem, for a single electron.
This one-electron problem has a certain energy, as found in
the preceding chapter, always negative, very large numerically
if the electron is tightly bound, smaller if it is more loosely bound;
and it is natural to ask for the interpretation of this energy.
The connection with tightness of binding suggests directly
that the one-electron energies measure the work required to
remove the electron in question, or the ionization potential, the
negative energies being the negative of the ionization potentials.
This proves in fact to be the case. One can compute these
ionization potentials, by finding the energies of the atom and
ion and subtracting, and the result proves to be, to the first
order of perturbation, just the one-electron energy. Thus the K
ionization potential is given by the distance of the 1s energy
level below zero in the corresponding one-electron problem, and
so on. The connection is not very accurate, but it is close enough
to be very useful.
Our method of effective nuclear charges, being an approxima-
tion to the method of self-consistent fields, should show the same
property, and we can give a simple though not entirely satisfactory
proof. The negative of the ionization potential is
the energy of the atom, minus the energy of the ion. If the $i$th
electron is to be removed, and if $S_j$ represents a shielding
constant in the atom, $S_j'$ for the ion, then the energy of the atom,
minus the energy of the ion, is
$$-\sum_j \frac{(Z - S_j)^2}{n_j^2} + \sum_{j \neq i} \frac{(Z - S_j')^2}{n_j^2}.$$
If we set $S_j' = S_j - (S_j - S_j')$, and expand, this is
$$-\frac{(Z - S_i)^2}{n_i^2} + \sum_{j \neq i} \frac{(Z - S_j)^2 + 2(Z - S_j)(S_j - S_j') + (S_j - S_j')^2 - (Z - S_j)^2}{n_j^2}.$$
Our simple proof holds only in case there is no other electron
in the same shell as the $i$th, and if we assume that each electron
shields by either 1 or 0. Then we have $S_j - S_j' = 0$ if the $j$th
electron is inside the $i$th, 1 if the $j$th is outside the $i$th. Thus
for the ionization energy we have
$$-\frac{(Z - S_i)^2}{n_i^2} + \sum_{j \text{ outside } i} \frac{2(Z - S_j + \frac{1}{2})}{n_j^2}. \qquad (8)$$
In this case we can easily find the energy of the one-electron
problem. The potential energy of the field in which the larger
part of the $i$th wave function is located is
$$-\frac{2(Z - S_i)}{r} + \sum_{j \text{ outside } i} \frac{2}{r_j}, \qquad (9)$$
where $S_i$ represents the number of electrons inside the $i$th, and
the summation is for all outer electrons, assuming constant mean
radii, which are approximated by $1/r_j = (Z - S_j)/n_j^2$. To verify
the correctness of this potential, we note that the corresponding
force, $-2(Z - S_i)/r^2$, is what we should have for the charge
inside the sphere concentrated at the nucleus, and the constant
terms of the summation are added to make the potential
continuous at the outer shells. Thus the wave equation for $u_i$ is
$$\left[-\nabla^2 - \frac{2(Z - S_i)}{r} + \sum_{j \text{ outside } i} \frac{2}{r_j}\right] u_i = E_i u_i,$$
which gives immediately
$$E_i = -\frac{(Z - S_i)^2}{n_i^2} + \sum_{j \text{ outside } i} \frac{2}{r_j},$$
or, using the value of $1/r_j$,
$$E_i = -\frac{(Z - S_i)^2}{n_i^2} + \sum_{j \text{ outside } i} \frac{2(Z - S_j)}{n_j^2},$$
agreeing with the value (8) already found, except that the correction
term $\frac{1}{2}$ in $(Z - S_j + \frac{1}{2})$ is missing. This formula,
moreover, is interesting, in that it shows that the shielding has
two effects on the energy: (1) The energy has the term
$-(Z - S_i)^2/n_i^2$ instead of $-Z^2/n^2$, as we should have with an
electron in the unshielded field of the nucleus. This effect, reducing
the magnitude of the ionization potential, is called the inner shield-
ing, since it comes from the inner electrons. (2) There is also the
summation over the outer electrons, likewise resulting in a
reduction of ionization potential, and called the outer shielding.
As we see from our derivation, the outer shielding results from
the rearrangement of shielding constants of the outer electrons
when an inner electron is removed.
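The algebra leading to (8) can be verified numerically. The following sketch assumes, as the proof does, one electron per shell and shielding of exactly 1 by every inner electron. The function names and the three-electron test case are hypothetical, introduced only for illustration.

```python
# Sketch: compare the direct energy difference (atom minus ion) with formula (8).
# ASSUMPTION: one electron in each shell, each shielded by 1 for every electron inside it.

def atom_minus_ion(Z, ns, i):
    """Direct difference of total energies when the electron of index i is removed.
    ns lists the principal quantum numbers from the innermost shell outward."""
    S = list(range(len(ns)))                  # S_j = number of electrons inside the jth
    e_atom = -sum((Z - S[j]) ** 2 / ns[j] ** 2 for j in range(len(ns)))
    e_ion = 0.0
    for j in range(len(ns)):
        if j == i:
            continue
        S_ion = S[j] - 1 if j > i else S[j]   # outer electrons lose one unit of shielding
        e_ion -= (Z - S_ion) ** 2 / ns[j] ** 2
    return e_atom - e_ion

def formula_8(Z, ns, i):
    """Inner-shielding term plus the outer-shielding sum, including the 1/2."""
    S = list(range(len(ns)))
    outer = sum(2 * (Z - S[j] + 0.5) / ns[j] ** 2 for j in range(i + 1, len(ns)))
    return -(Z - S[i]) ** 2 / ns[i] ** 2 + outer

def one_electron_energy(Z, ns, i):
    """Energy of the one-electron problem: the same sum without the 1/2."""
    S = list(range(len(ns)))
    outer = sum(2 * (Z - S[j]) / ns[j] ** 2 for j in range(i + 1, len(ns)))
    return -(Z - S[i]) ** 2 / ns[i] ** 2 + outer

Z, ns = 3, [1, 2, 3]                          # hypothetical case: shells n = 1, 2, 3
for i in range(len(ns)):
    print(i, atom_minus_ion(Z, ns, i), formula_8(Z, ns, i), one_electron_energy(Z, ns, i))
```

The first two columns agree to rounding error, while the one-electron energy differs from them by the sum of $1/n_j^2$ over the outer electrons, which is the missing correction term.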
Problems
1. The K series in the x-ray spectra comes when a K electron is knocked
out, and an L, M, . . . electron falls into the vacant place in the K shell.
The lines are $K_\alpha$ (if an L electron falls in), $K_\beta$ (an M), etc. Write down the
configurations before and after the $K_\alpha$ and $K_\beta$ transitions of Mo.
2. Show that the frequencies of the lines of the K series are less than the
frequency of light necessary to cause ionization of the K electron. Compute
the K ionization potential and the $K_\alpha$ line for Ca, and show that they fit in
with the general case.
3. Moseley's law is that the square roots of x-ray term values (ionization
potentials) form a linear function of the atomic number. This would
obviously be true if there were just inner shielding, for then the square root
would be simply $(Z - S)/n$. Investigate how closely this is true when there
is outer shielding as well, computing K and L term values for electrons from
$Z = 10$ to $Z = 20$, and seeing how closely the square roots fall on straight
lines.
4. Iso-electronic sequences are sets of ions, all of the same number of
electrons, but with different nuclear charges, and hence different degrees of
ionization. Compute the ionization potentials, or term values, $1s^2 2s^2 2p \to 1s^2 2s^2$,
$1s^2 2s^2 3s \to 1s^2 2s^2$, for the atoms $Z = 5$ to 10, indicating what ions
they are (as $Z = 6$, $1s^2 2s^2 2p$ is C⁺). Investigate to see whether these term
values follow Moseley's law that the square root of the term value is a linear
function of atomic number.
5. Using the approximation that the radius of a shell is $n^2/(Z - S)$, draw
curves giving the radius of each shell as function of $Z$ for all atoms up to
$Z = 20$ (compute only enough values to draw the curves).
6. In a closed shell of p electrons, there are two electrons of m = 1, two of
m = 0, two of m = —1. Using the spherical harmonics for these cases,
compute the squares of the wave functions, treating these as electron densi-
ties. Add the densities of all electrons, showing that the sum is independent
of angle, or that the p shell is spherically symmetrical. The same thing is
also true of any completed shell.
7. Given a spherical distribution of charge, where the potential is $2Z_p/r$,
and the force $2Z_f/r^2$, where $Z_p$, $Z_f$ are both functions of $r$, prove that
$Z_f = Z_p - r(dZ_p/dr)$.
8. Assuming that the electrons are located on the surfaces of spheres of
radius $n^2/(Z - S)$, find and plot $Z_f$ and $Z_p$ for Na⁺ as functions of $r$.
CHAPTER XXXV
INTERATOMIC FORCES AND MOLECULAR STRUCTURE
Atoms by themselves have only a few interesting properties:
their spectra, their dielectric and magnetic properties, hardly any
others. It is when they come into combination with each other
that problems of real physical and chemical interest arise. Atoms
act on each other with forces, in some cases attractive and in
others repulsive, and in this chapter we shall consider the nature
of these forces, how they arise, and what their results are in
their effect on the physical and chemical structure of the sub-
stance. Interatomic forces in the first place hold atoms together
to form molecules; this forms the province of chemistry. But in
turn they hold molecules together into their various states of
aggregation, as solids, liquids, and gases, and this is ordinarily
considered to be part of physics. The distinction, however, is
purely arbitrary, and not at all general. We shall begin by dis-
cussing the most important types of force, with a little considera-
tion of the types of substances in which they are found. All the
interatomic forces of interest in the structure of matter are
electrical or in some cases magnetic; the only other forces, gravi-
tational, are far too small to be of significance. We arrange
the different types according to the way they depend on the
distance of separation of the atoms.
252. Ionic Forces. — If two atoms are ionized, they attract or
repel according to the inverse square. If the net charge on one is
$z_1$ units, on the other $z_2$, the potential energy between them is
$z_1 z_2 e^2/r$, if $r$ is the distance between them.
253. Polarization Force. — Atoms are polarizable, as we have
seen in discussing refractive index, in Sec. 172, Chap. XXIV.
That is, an atom in an electric field $E$ acquires an electric moment
$\alpha E$. Now suppose that we have an atom or an ion in the
presence of another ion. The ion produces a field $ze/r^2$. This in
turn polarizes the first atom or ion, producing a moment $\alpha ze/r^2$.
The resulting dipole reacts back on the ion, attracting it with a
force equal to the field of the dipole (equal to the moment of the
dipole times $2/r^3$) times the charge on the ion, or $-2\alpha z^2e^2/r^5$.
The potential of this force is $-\alpha z^2e^2/2r^4$, giving always an
attraction.
254. Van der Waals' Force. — Ionic and polarization forces
are met only with ions. The forces observable at largest
distances between neutral atoms or molecules, and hence of
importance in the behavior of liquids and imperfect gases, are called
Van der Waals' forces, on account of their appearance in Van
der Waals' equation of state for imperfect gases. They arise as
follows. An atom is generally spherically symmetrical and thus
on the average has no external electric field. But this is only
on the average; instantaneously it is not spherical, but the
electrons are at arbitrary positions, and the result gives a dipole
moment, averaging to zero, but instantaneously different from
zero. This dipole polarizes a second atom or molecule. Thus
the field of a dipole of moment $\mu$ is $\mu/r^3$ × function of angle.
In the two special cases where the dipole points straight toward,
or away from, the atom, the function of angle has the values
$\pm 2$, respectively. In that case, the induced dipole in the second
molecule is $\pm 2\alpha\mu/r^3$. This produces a field back on the first,
equal to $\pm 4\alpha\mu/r^6$. The force by which it acts on the original
dipole is equal to the rate of change of the field with $r$, times the
dipole moment, times a function of angle which is $\pm 1$ in the two
cases considered, or $(\mp 24\alpha\mu/r^7)(\pm\mu) = -24\alpha\mu^2/r^7$, with potential
energy $-4\alpha\mu^2/r^6$. If we had considered all angles, we should have
got a different constant, but in any case an attraction,
$-\text{constant} \times \alpha\mu^2/r^6$.
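The chain of steps in this argument (dipole field, induced moment, field back on the first atom, force, potential) can be followed numerically for the head-on case. The sketch below is only illustrative: the function names and sample values are hypothetical, and units are assumed in which the on-axis field of a dipole $\mu$ is $2\mu/r^3$.

```python
# Sketch: head-on case of the Van der Waals argument, step by step.
# ASSUMPTION: units in which the on-axis field of a dipole mu is 2*mu/r**3.

def induced_dipole(alpha, mu, r):
    """Moment induced in the second atom: polarizability times the on-axis field."""
    return alpha * 2.0 * mu / r ** 3

def back_field(alpha, mu, r):
    """Field of the induced dipole back at the first atom: 4*alpha*mu/r**6."""
    return 2.0 * induced_dipole(alpha, mu, r) / r ** 3

def force(alpha, mu, r, h=1e-5):
    """Rate of change of the back field with r (central difference), times the moment."""
    dfield = (back_field(alpha, mu, r + h) - back_field(alpha, mu, r - h)) / (2 * h)
    return mu * dfield

def potential(alpha, mu, r):
    """Potential energy whose negative gradient is the force above."""
    return -4.0 * alpha * mu ** 2 / r ** 6

alpha, mu, r = 1.2, 0.7, 3.0                   # arbitrary sample values
print(force(alpha, mu, r), -24 * alpha * mu ** 2 / r ** 7)
```

The two printed numbers agree to the accuracy of the finite difference, and the negative sign shows the attraction.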
To calculate the polarization and Van der Waals' forces, we
should have to find $\alpha$ and $\mu$. The calculations for these are
difficult and will not be attempted here, though a derivation
will be given in a later chapter. For the present, however, we
can get some semiempirical formulas which will serve for rough
calculations. First, the polarizability $\alpha$ has the dimensions of
a volume. An argument from a simple model in Sec. 172 showed
that, at least in order of magnitude, the polarizability of a
spherical atom is equal to the cube of its radius. Now the radius of
an electron's orbit can be approximated by $n^2a_0/(Z - S)$, so
that we might imagine that the polarizability of an atom could
be approximated by the sum of such terms, cubed, for all
electrons. Empirically, one finds that this gives about the right
dependence on $Z$, but not very accurately for $n$: the contribution
of an electron to the polarizability proves to be approximately
$$\left(\frac{n^2a_0}{Z - S}\right)^3 \times \begin{cases} 4.5 & \text{if } n = 1 \\ 1.1 & \text{if } n = 2 \\ 0.65 & \text{if } n = 3, \text{ etc.} \end{cases}$$
The total polarizability is the sum of such contributions, for all
electrons. As we readily see, only the electrons in the outer shell
make an appreciable contribution, since they have the largest
values of $n$ and the smallest $Z$'s. Hence we may simply multiply
the number $\nu$ of electrons in this shell by the term above. Thus
$\nu = 2$ for an ion with the same structure as the He atom, 8 for
one built like Ne, 8 for one like A, etc.
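As a rough illustration of this rule, the sketch below evaluates the outer-shell estimate for an ion built like Ne. The choice of Na⁺ with an effective charge $Z - S = 6.85$ is an assumption made for the example, and the function name is hypothetical. The result is in units of $a_0^3$.

```python
# Sketch: semiempirical polarizability from the outer shell only.
# ASSUMPTION: empirical factors 4.5, 1.1, 0.65 for n = 1, 2, 3 as read from the text.

FACTOR = {1: 4.5, 2: 1.1, 3: 0.65}

def polarizability(n, z_eff, nu, a0=1.0):
    """nu outer electrons, each contributing factor * (n^2 a0 / (Z - S))^3."""
    radius = n ** 2 * a0 / z_eff
    return nu * FACTOR[n] * radius ** 3

# Hypothetical example: Na+ (built like Ne), n = 2, Z - S = 6.85, nu = 8.
alpha_na_plus = polarizability(n=2, z_eff=6.85, nu=8)
print(f"alpha(Na+) ~ {alpha_na_plus:.2f} a0^3")
```

Because the radius enters cubed, the estimate is very sensitive to the effective charge used, so it should be read as an order of magnitude only.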
For the Van der Waals' force, we expect an energy
$-\text{constant} \times \alpha\mu^2/r^6$. We shall consider the problem more in detail
in Chap. XLII, Sec. 301, where it is shown that the energy is
$$-\frac{3}{2}\,\frac{\alpha\mu^2}{r^6},$$
and where in addition we have the relation
$$\mu^2 = \frac{\alpha\,\Delta E}{2}.$$
In this formula, $\Delta E$ is the difference of energy of that transition
from the normal state which contributes most to the refractive
index and dispersion. Ordinarily this can be taken to be the same
as the ionization potential of the atom. Thus, since we know how
to find ionization potentials from our effective nuclear charges,
we may use empirical or approximately calculated polariza-
bilities to get coefficients for the Van der Waals' attraction.
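A minimal sketch of this recipe follows, reading the two relations of this section as energy $= -\frac{3}{2}\alpha\mu^2/r^6$ and $\mu^2 = \alpha\,\Delta E/2$, so that the coefficient of $-1/r^6$ is $\frac{3}{4}\alpha^2\Delta E$. The function names and sample numbers are hypothetical, and consistent units are assumed ($\alpha$ as a volume, $r$ as a length, and the energy in the units of $\Delta E$).

```python
# Sketch: Van der Waals attraction between two like atoms from alpha and Delta E.
# ASSUMPTION: energy = -(3/2) alpha mu^2 / r^6 with mu^2 = alpha * Delta E / 2.

def mu_squared(alpha, delta_e):
    """Mean square fluctuating dipole moment from the relation mu^2 = alpha*Delta E/2."""
    return alpha * delta_e / 2.0

def vdw_energy(alpha, delta_e, r):
    """Attractive energy -(3/2) alpha mu^2 / r^6."""
    return -1.5 * alpha * mu_squared(alpha, delta_e) / r ** 6

def vdw_coefficient(alpha, delta_e):
    """Coefficient c in energy = -c / r^6, namely (3/4) alpha^2 Delta E."""
    return 0.75 * alpha ** 2 * delta_e

alpha, delta_e, r = 2.0, 1.5, 4.0              # arbitrary sample values
print(vdw_energy(alpha, delta_e, r), -vdw_coefficient(alpha, delta_e) / r ** 6)
```

Taking $\Delta E$ equal to the ionization potential, as the text suggests, the only remaining input is the polarizability.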
The three types of force we have enumerated all fall off as
inverse powers of the distance. If we inquire further, we find
that there is a whole series of terms, in higher and higher inverse
powers of r. Thus between ions we have a series commencing
with terms in $1/r$ and $1/r^4$, between atoms commencing in $1/r^6$,
but having higher terms arising from interaction of the induced
dipoles of both atoms with each other, interaction of dipoles and
charges with quadrupole moments, etc. The complete series
would be difficult to evaluate. In addition to these forces, there
are other quite different ones, coming when the atoms are so
close that their charge distributions actually begin to overlap.
Since these distributions fall off exponentially with distance,
as in hydrogen functions, these types of force all fall off exponenti-
ally and for that reason cannot be expanded in inverse powers
of r at all (the exponential function possesses a singularity at
infinity and so cannot be expanded in power series in 1/r). The
forces are sometimes grouped together, but we prefer to break
them up into three classes.
255. Penetration or Coulomb Force. — As one atom penetrates
another, there will be forces on account of pure electrostatics,
even if the two atoms do not distort each other. Let the outer
shell of each atom penetrate within that of the other (Fig. 74a).
Then the part of each which penetrates the other finds itself in a
field attracting it toward the nucleus of the other, since it is no
longer shielded by all shells of the other. The result is an
attraction of the charge of each for the other, pulling the whole atoms
together. On the other hand, as the atoms get still closer, the
whole system of inner shells of one would get inside the outer shell
of the other (see Fig. 74b). These inner cores are both positively
charged on the whole and will repel, a repulsion more than enough
to counteract the attraction, in general. Hence at sufficiently
close distance, the penetration force will be repulsive. In
between, there will be some distance at which the force will be
zero and there will be equilibrium.
Fig. 74. — Penetration of one atom by another. Circles represent shells of
electrons. (a) Attraction. Negative charge of each atom penetrates within
the outer shell of the other, being attracted to the positive nucleus.
(b) Repulsion. Nucleus of each atom penetrates the outer shell of the other,
the repulsion of the nuclei for each other outbalancing the attractions.
256. Valence Attraction. — The penetration force acts even
though the atoms are not distorted. The force of attraction
principally concerned in valence, however, is an additional force
resulting from the distortion of one atom by the other. The
distortion produced by ordinary electrostatics is at least approxi-
mately taken care of by computing the polarization, as stated in
Sec. 253, but there is an additional effect, resulting from the
operation of the exclusion principle, and the existence of electron
spins, and which leads to a tendency for electrons to form stable
pairs, agreeing with the ideas of G. N. Lewis regarding homopolar
valence, or valence attraction between uncharged atoms. To
understand this, even approximately, we must look more closely
into the exclusion principle. In addition to their charge, elec-
trons also act like little magnets, having a north and south pole.
This is as if the charge were to rotate, forming a little electric
current around a circle, and corresponding magnetic lines of force.
The result is called electron spin. Now when we have a pair
of electrons, it turns out that their spins can be oriented in just
two possible ways: either parallel to each other, or opposite or
antiparallel. If they are parallel, then the exclusion principle
comes in and says they cannot be in the same shell. But if they
are opposite the principle does not operate. It is a result of this
that the allowed numbers of electrons in the various groups in an
atom are all even numbers. Thus, in the s shell, after we have
one electron, we can add a second if its spin is opposite to the
first, for then the exclusion principle does not act. But if we
now try to add a third, its spin must be parallel to one of the two
already there, and the exclusion forbids it. Similarly a p group
really contains three different subgroups, each of which can
contain but two electrons, with opposite spins. Analogous
results hold for the other groups. We see, then, that the sub-
group of two electrons with opposite spins is a configuration
which electrons like to form, and that only two electrons can
enter such a configuration, so that there is a tendency toward
pairing. But now it appears that such a pair can be formed by
two electrons in different atoms, just as well as by two in the same
atom. Thus if each of two atoms has just one electron, rather
than two, in one of its subgroups, and if these two electrons have
opposite spin, they can form a pair held in common by the two
atoms, actually localized in the space between the atoms, and
tending simply by electrostatic forces to hold the atoms together,
the attractions of this negative concentration of charge for the
nuclei, which must have a net amount of positive charge, more
than counterbalancing the repulsions between like charges at
large distances, though at smaller distances the force becomes
repulsive, on account of the ordinary penetration effect. This
is the origin of homopolar valence. We see that every electron
lacking from a closed shell can be interpreted as giving the possi-
bility of forming a valence bond, so that for example the halogens
have a single valence, oxygen and sulphur have two, hydrogen
has one (one electron missing from 1s), and so on.
257. Atomic Repulsions. — If one brings two atoms close
enough together, they will always repel and resist further
approach. This is what we know physically as the impenetrability
of matter. It is a result of the exclusion principle, again. If
we force two atoms so close together that the shells of the two
atoms overlap, and if these shells are all filled with electrons,
then we are really trying to force more electrons into the same
region of space than the exclusion principle allows. What happens
is that the electrons then move outside of this region, the
atoms become distorted, and the resulting increase of energy is
interpreted as a force of repulsion between the atoms. These
actions commence as soon as closed shells begin to overlap appre-
ciably, and as a result the atoms have rather sharp boundaries,
and for some purposes may be considered as having definite
sizes. We should notice that, if the outer shells of the atoms are
not closed, this repulsion can be altered. Thus, if two lithium
atoms approach, each having a closed K shell but only one elec-
tron in its 2s shell, either of two things can happen. If the two
L electrons happen to have parallel spin, then the exclusion
principle operates between them, and they will repel each other,
as if they had only closed shells. But if the spins are opposite,
then the outer shells can coalesce, forming a shared electron pair,
and resulting in attraction. Even in such a case, however, we
finally meet repulsion as we bring the atoms together. In the
first place, at close enough separation, the K shells would begin
to overlap, and since they are closed shells they would repel in
the usual way. But also the pure electrostatic interaction gives
repulsion at small enough distances. For with more and more
penetration, we get to the point where the nuclei are close
together, in the midst of a combined set of shells of electrons from
both atoms. Increasing closeness will then increase the repulsion
between the nuclei, without much changing anything else, and
this repulsion will finally become great enough to cancel all other
effects.
258. Analytical Formulas for Valence and Repulsive Forces. —
The three types of force which we have just been discussing,
Coulomb penetration force, valence attraction, and repulsion,
all depend on the actual overlapping of the charge distributions
of two atoms. Here again we can find a simple approximate
formula, which is yet accurate enough to be decidedly useful.
Since the charge distribution falls off in general exponentially
with the distance, we may assume that the potential energy also
falls off exponentially: energy $= Ce^{-ar}$, where $r$ is the distance
between nuclei. The constant $C$ is negative for attractions,
positive for repulsions. The value of $a$, of course, will be different
with each type of force, and each type of atom. Nevertheless,
we can give extremely rough rules which yet suffice to give the
order of magnitude of $a$. First we set up, for each of our two
atoms, the "radius" of the outer shell, $n^2/(Z - S)$. We add
these radii for the two, multiply by 1 if the electrons in the outer
shells are p electrons, as in closed shells, but by 1.4 if they are s
electrons in both atoms, as in a molecule made of two alkali
atoms. Let the result be $r_0$. Then as far as order of magnitude
is concerned, the energy is a constant times $e^{-4(r/r_0)}$ for the pure
repulsion between closed shells. In the valence attraction case,
where the curve has a minimum, we can combine the valence and
Coulomb forces, since both behave about the same. Then the
result is approximately
$$C'e^{-6(r/r_0)} - C''e^{-3(r/r_0)},$$
the first term representing the repulsion close in, the second the
attraction farther out. The constants as we have written them
are for the normal state of the atoms and molecules, and in this
case it is found that the equilibrium distance for the valence
attraction comes approximately at $r_0$. This results, as we readily
verify, by writing the formula in the form
$$D\left[e^{-6(r - r_0)/r_0} - 2e^{-3(r - r_0)/r_0}\right],$$
or more generally
$$De^{-2a(r - r_0)} - 2De^{-a(r - r_0)}, \qquad (1)$$
where $a$ is a constant, which we have set approximately equal to
$3/r_0$. This form of potential curve has been used by Morse, and
he has tabulated values of $D$, $a$, and $r_0$ for a number of molecules,
in excited as well as normal states.
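The properties claimed for this curve are easy to confirm numerically. The sketch below takes the Morse form as $De^{-2a(r-r_0)} - 2De^{-a(r-r_0)}$ with $a = 3/r_0$; the values of $D$ and $r_0$ are arbitrary illustrative numbers, not data for any molecule.

```python
import math

# Sketch: Morse potential curve and its equivalent two-exponential form.
# ASSUMPTION: D and r0 below are arbitrary sample values chosen for illustration.

def morse(r, D, a, r0):
    """D exp(-2a(r - r0)) - 2 D exp(-a(r - r0))."""
    x = math.exp(-a * (r - r0))
    return D * x * x - 2.0 * D * x

def two_exponentials(r, D, r0):
    """Same curve written as C' exp(-6 r/r0) - C'' exp(-3 r/r0)."""
    c1 = D * math.exp(6.0)          # C'  = D e^6
    c2 = 2.0 * D * math.exp(3.0)    # C'' = 2 D e^3
    return c1 * math.exp(-6.0 * r / r0) - c2 * math.exp(-3.0 * r / r0)

D, r0 = 4.0, 1.5
a = 3.0 / r0
h = 1e-5
slope_at_r0 = (morse(r0 + h, D, a, r0) - morse(r0 - h, D, a, r0)) / (2 * h)
print(morse(r0, D, a, r0), slope_at_r0)     # energy at the minimum and slope there
print(morse(2.0, D, a, r0), two_exponentials(2.0, D, r0))
```

The energy at $r_0$ equals $-D$ and the slope there vanishes, so the two constants of the exponential form are fixed once $D$ and $r_0$ are known.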
The constant coefficient D, or the corresponding coefficient
in the pure repulsive energy, is not easily given in a general way.
We can easily see its significance, however. In Fig. 75 we plot a
Morse potential curve, observing that it has a minimum at $r_0$,
the energy at this point being — D, while at infinite separation
the energy is zero. Thus D represents the energy required to
pull the atoms apart to infinity if they are initially at rest at the
equilibrium distance, or, in other words, the energy of dissocia-
tion of the pair of atoms. These energies, for actual molecules,
vary between a fraction of a volt-electron and several volt-
electrons, depending on the tightness of binding of the molecule.
A few simple rules help in estimating $D$, as for instance
that the larger $r_0$, the smaller $D$ tends to be (for example, F₂ is
more tightly bound than I₂, the F atom being smaller than I);
molecules with a double or triple valence bond have larger $D$'s
than with single bonds; etc.
The repulsive energy between closed shells, which we have
approximated by $Ce^{-4(r/r_0)}$, is generally associated with an
ionic or Van der Waals' attraction, resulting again in a minimum.
This minimum, however, is ordinarily at much larger distances
than $r_0$, more nearly $2r_0$ or even larger. This
is in consequence of two things:
the attractive forces are rather weaker than the valence attrac-
tions, and second the repulsion between closed shells is naturally
larger, and effective at larger distances, than the repulsion
found in valence compounds. In an actual case, where we
know the Van der Waals' or ionic force, we can then make
an estimate of the distance of separation at the minimum,
and find C from the condition that the correct total potential
has zero slope at this point. To get a number comparable with
those met in valence attraction, we should write the repulsion
in the form De ~ 4 V™ ~ * ) . Then in actual cases D comes out of the
order of a few volt-electrons.
Fig. 75. — Morse potential curve, $De^{-2a(r - r_0)} - 2De^{-a(r - r_0)}$.
Often one finds the repulsive forces of which we have just
spoken approximated by an inverse power of $r$, as $b/r^n$, where $n$
proves to be about 8 or 9. We immediately see that both
functions, exponential and inverse power, behave similarly, being
large for small r, small for large r, so that either form can be
used, though, since the repulsion depends on penetration, which
actually goes off exponentially, we can be sure that the inverse
power term is not so accurate. We can readily find out why $n$
has about the value 9. The repulsive term is of importance,
and can be found experimentally, and $n$ determined, near the
minimum of the energy curve. For Van der Waals' or ionic
forces, as we have mentioned, this proves to come at about
$2r_0$. Then suppose that we choose $b$ and $n$ so that $b/r^n$ has the
same value and slope as $Ce^{-4(r/r_0)}$ when $r = 2r_0$. We have
$$\frac{b}{(2r_0)^n} = Ce^{-4(2r_0/r_0)}$$
and
$$\frac{nb}{(2r_0)^{n+1}} = \frac{4}{r_0}\,Ce^{-4(2r_0/r_0)},$$
from which, dividing one by the other, $2r_0/n = r_0/4$, $n = 8$,
approximately as is found experimentally. Many discussions,
particularly of the structure of ionic crystals, are based on this
inverse power formula, which has been used by Born and others.
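The matching of value and slope described in this section can be carried out in a few lines. The sketch below is a hypothetical helper, not a formula from the text beyond the two matching conditions. It solves them at a general matching distance $R$, so that the dependence of $n$ on the position of the minimum can be seen.

```python
import math

# Sketch: fit b / r**n to C * exp(-k r / r0) in value and slope at r = R.
# ASSUMPTION: k = 4 for the closed-shell repulsion, as taken in this section.

def match_inverse_power(C, r0, R, k=4.0):
    """Return (n, b) such that b/r**n matches value and slope of C*exp(-k*r/r0) at R."""
    n = k * R / r0                           # slope/value: n/R must equal k/r0
    b = C * math.exp(-k * R / r0) * R ** n   # then fix b from the value at R
    return n, b

C, r0 = 1.0, 1.0                             # arbitrary sample values
n, b = match_inverse_power(C, r0, R=2.0 * r0)
print(n, b)
```

Matching at $R = 2r_0$ gives $n = 8$; a minimum somewhat farther out than $2r_0$ would give a slightly larger exponent.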
259. Types of Substances: Valence Compounds. — Now that
we have investigated the types of interatomic forces, we should
consider them with reference to the different types of substances
in which they occur. Broadly speaking, there are two main
types of substances, corresponding to the two principal kinds
of interatomic attractions, the ionic and the valence forces.
Let us arrange our valence compounds roughly in order of melting
or boiling points, starting with the most volatile, and ending
with the most stable. The first substances on the list are not
compounds at all, and indicate valence only in a sort of negative
way: they are the inert gases, He, Ne, A, Kr, Xe. Since the
outer shells of these are already completed, they form no ions,
and they have no electrons to be shared and have no possibility
of valence forces, and form no compounds. Next we come to a
group of diatomic molecules, for example H₂, O₂, N₂, F₂, Cl₂, Br₂,
CO, HCl, HBr, etc. These are held together by valence forces
(HCl and HBr are somewhat ambiguous, and might be considered
to be ionic compounds; this ambiguity is met in almost all H
compounds). For example, each atom in H₂ has one electron;
they share these, making a pair. In O₂, each atom has six
L electrons; but they share two pairs (a double bond). As we go
on, we come next to fairly simple polyatomic molecules. We
have water, ammonia, methane: H₂O, NH₃, CH₄, all rather
plainly valence compounds (though the ambiguity of which we
spoke previously makes an ionic interpretation possible as well),
with each hydrogen held by a single valence bond to the other
atom. We might well include with these the ammonium ion,
NH₄⁺, presumably built like methane. Other simple ones are
CO₂, CS₂, with double bonds. Then we certainly should include
some of the simple organic compounds, as acetylene C₂H₂
(triple bond between the carbons), ethylene C₂H₄ (double bond
between the carbons), ethane C₂H₆ (single bond).
All these molecules of which we have spoken are held together
by valence forces. On the other hand, there are also Van der
Waals' forces between molecules, though of a smaller order of
magnitude than the valence forces, and these hold the substances
together in liquids and solids, all of low boiling points, but of
increasing stability as the molecules become heavier and more
complicated. The very considerable difference in order of
magnitude between the valence and the Van der Waals' forces
is significant, for this brings it about that the separate molecules
preserve their identity, even when crowded close together.
More complicated organic compounds naturally come next
in the list. They still preserve to some extent the property
of existing as separate molecules, in gas, liquid, and solid, so
that they still have both valence forces between atoms, and Van
der Waals' forces between molecules. But as the molecules
get more and more complicated, the Van der Waals' forces get
larger and larger proportionally, so that with the fairly compli-
cated ones they are of the same order of magnitude as the valence
forces. Many complicated organic compounds dissociate when
heated, rather than going through a change of state, since the
heat necessary to melt and boil the substances becomes more
and more nearly equal to that required to break up the molecules.
It becomes, in other words, harder and harder to distinguish
separate molecules, the solid acting more and more like a single
big molecule.
The silicates form a group of compounds slightly suggesting
the organic compounds in their complexity. They contain the
group SiO₄⁻⁴, which can be best described as a pure valence
compound, Si(O⁻¹)₄, held together just like methane, Si being
analogous to C. In many compounds the silicate groups are
joined together, by sharing oxygens, as in the double group
Si₂O₇⁻⁶, or (O⁻¹)₃Si-O-Si(O⁻¹)₃, a neutral O atom being joined
by its two bonds to the two Si atoms. This process of sharing
oxygens may continue, until finally there is a network formed
through the whole crystal, the metallic ions, as Ca++, etc., merely
fitting into empty space in the network, and all traces of molecu-
lar structure being lost. Thus these crystals are held together
by forces so strong that they are not easily broken up. They are
insoluble and refractory, and in fact form a great proportion
of all the minerals.
260. Metals. — The metals form a type of substance more or
less by themselves, but in general resembling valence compounds.
There is a definite indication, at least in some of them, that there
is a network of valence forces between the atoms, running through
the metal, and holding it together to form a solid. At the same
time, the simple Coulomb penetration force seems to account
for a considerable part of the cohesion of metals. The network
of valences seems to be connected with the electrical conductivity:
an electron shared between two atoms can go to either one, and
if the sharing exists through the solid, the electrons can migrate
and carry a current. For many purposes, it is more correct
in a metal to give up the idea that an electron is attached to a
given atom at all, and treat them as free to move from one place
to another, like the molecules of a perfect gas. The typical
metallic states are solid and liquid. When a metal is vaporized,
the tendency toward molecular formation does not seem to
be strong. The vapors of such metals as have been examined
show both monatomic and diatomic molecules; one wonders
if polyatomic ones would not also be found if the experiment were
made, acting simply like little pieces of the large metallic crystal.
261. Ionic Compounds. — The ionic compounds are not so easy
to classify in a definite order as valence compounds, principally
because they are more alike. The primary fact about ionic
compounds is that they are held together by electrostatic forces,
the atoms appearing in the ionized state. The forces between the
atoms depend only on the distance, and are independent of
the presence of other atoms (except in the matter of polarization).
The laws governing the formation of ionic crystals are simple
electrostatic ones, such as that positive and negative ions tend
to approach as closely as possible, ions of the same sign go as far
apart as possible, charges in small volumes tend to equalize them-
selves, and so on. As a particular result of these, there is no
tendency to form molecules. It is almost impossible to build up
out of ions any structure which would not have large electrostatic
fields around it; and further ions would be attracted by these
fields, so that the substance can build up indefinitely. Further,
the electrostatic fields are rather large, compared with the valence
forces. The physical nature of these substances follows from
the principles very easily. Their most characteristic form is the
solid, where they form crystals in which the ions are arranged on
a regular lattice. There is no trace of molecular structure in the
lattice. They are hard and stable, often harder than metals,
and of high melting point, although, of course, there is large
variation from one compound to another. The vapor phase is an
unimportant one for practically all ionic substances. Much
more interesting in general than either liquid or vapor is the ionic
state in water solution. Water, on account of its great dielectric
constant, decreases all electrostatic forces. It thus almost
removes the forces holding such a crystal together, and the solid
breaks up into ions dissolved in the water.
When we ask about individual ionic compounds, we can well
classify according to the ions from which they are made. The
fundamental building stones are in every case ions of atoms; and
the ions are of two sorts, positive and negative. The metals
practically always form positive ions. They easily lose their
valence electrons, as we have seen, so that all the electrons outside
closed shells are removed, giving the alkali ions a charge 1, alka-
line earths 2, the aluminum group 3, and so on. As we go through
the series of elements, we see that even the nonmetals sometimes
form positive ions, as Cl with seven positive charges. Sometimes,
however, their ions are negative, though about the only important
atoms forming negative ions are O and S, forming singly and
doubly charged ions, and the halides F⁻, Cl⁻, Br⁻, I⁻. These
atoms add electrons to make a closed shell, instead of losing them.
It is obvious why there are so few: adding electrons makes an
atom negatively charged, so that it tends to repel other electrons.
It is a process which cannot go on far. The negative halide ions
generally exist by themselves. The oxide ion also exists by itself
in oxides; but it also forms complex negative ions, with positive
nonmetallic ones, which are the most important negative ions
known. There are two alternative explanations of these radicals,
either as pure ionic compounds, or as a combination of this with
valence forces. For example, the sulphate ion can be regarded
as being formed from a completely stripped sulphur ion and
doubly charged oxygens: SO₄⁻² = S⁺⁶(O⁻²)₄. But if we assume
that the oxygens have only single negative charges, we have the
other possible structure S⁺²(O⁻¹)₄. With this structure, the
sulphur has four electrons, as carbon does, and so has four homo-
polar valence bonds; and the oxygens have the same electron
structure as halogens, with a single valence bond. Thus the
sulphur can be bound to the four oxygens by valence bonds,
assisting the electrostatic attraction, and the structure would
have similarity to methane or carbon tetrachloride. This latter
explanation seems to be nearer the truth, since it can be calculated
that the work required to form the completely stripped positive
ion in the ionic model would be much greater than the work
necessary to form the other structure.
Problems
1. Find the potential energy between two helium atoms, using our approxi-
mate methods for calculating Van der Waals' and repulsive forces, and com-
pare with the more accurate value
$$\left[7.7\,e^{-2.43\,r/a_0} - \frac{0.68}{(r/a_0)^6}\right] \times 10^{-10}\ \text{ergs},$$
where $a_0 = 0.53 \times 10^{-8}$ cm. The polarizability of helium is $1.43a_0^3$, and
its ionization potential $1.80\,Rh$. Compare these with simple calculated
values.
2. Using the potential of Prob. 1, compute the equilibrium distance of
separation between two helium atoms, and find the energy of dissociation,
in ergs, and volt-electrons. Compare the equilibrium distance with the
mean distance in the liquid, which has a density of 0.14, assuming atoms to
be spaced on a regular lattice, so that the mean distance will be $1/\sqrt[3]{n}$, if n
is the number of atoms per cubic centimeter.
3. Find a radius of the helium atom for use in kinetic theory, assuming
that two helium atoms at temperature 300° abs., with kinetic energy of
$\tfrac{3}{2}kT$, collide head on. Find how close they come before they stop, and
compare this molecular diameter with the distance $r_0$.
4. Two energy levels of H₂ coincide with the lowest energy of the atoms
at infinite separation, one an attractive level (corresponding to valence
binding, with the spins of the two electrons opposed), and one repulsive
(the spins being parallel, so that the exclusion principle operates). Plot
the energies of both terms as functions of distance, deriving the exponents
according to our approximate laws, and determining the scale from the
fact that the energy of dissociation of the molecule is about 4.3 electron volts,
and that the energy of the repulsive term at the distance of molecular
equilibrium is about 8 electron volts.
5. Compute by our approximate laws the distance of separation $r_0$ of
the atoms in the normal states of the valence compounds given below, and
compare with the experimental values tabulated:

Compound    r_0 (Angstroms)    Compound    r_0 (Angstroms)
C₂          1.31               I₂          2.66
CN          1.17               NO          1.15
CO          1.15               O₂          1.21
H₂          0.76               SiN         1.57
6. Compute by our approximation polarizabilities for the following ions,
and compare with the experimental values tabulated:

Ion      α × 10²⁴    Ion      α × 10²⁴
O⁻⁻      1.60        S⁻⁻      5.91
F⁻       0.868       Cl⁻      3.33
Ne       0.398       A        1.67
Na⁺      0.292       K⁺       1.12
Mg⁺⁺     0.173       Ca⁺⁺     0.785
7. Compare the distance of separation of atoms in the metallic crystals
tabulated below with the sum of the quantities $n^2/(Z - S)$ for the two atoms.

Metal    Distance, Angstroms
Na       3.72
K        4.50
Ca       4.97
8. Compute the interatomic potential energy for NaCl at large distance,
assuming it is composed of Na⁺ and Cl⁻, so that there will be the ionic
force, and at the same time a polarization force, the sodium polarizing the
chlorine. Show that the polarization of sodium by chlorine can be neglected.
Using the polarizabilities of Prob. 6, show that the potential energy is
$$-\left[\frac{\cdots}{r/a_0} + \frac{\cdots}{(r/a_0)^4}\right]\ \text{electron volts}.$$
9. The observed interatomic distance in the NaCl molecule is 2.73 Angstroms.
Compute the constants C and a in the repulsive potential $Ce^{-ar}$.
Find a by the rules we have used, and determine C so that the sum of the
repulsive potential, and the attractive potential of Prob. 8, will have a
minimum at the required distance.
10. Using the value of a found in the preceding problem, find the equivalent
value of n in the repulsive potential $b/r^n$ for the NaCl problem, seeing
how nearly it equals 9.
CHAPTER XXXVI
EQUATION OF STATE OF GASES
In the preceding chapter, we have considered interatomic
forces, and their effect in determining the nature of substances.
When we begin to think more precisely of what we mean by the
nature of substances, we conclude that the equation of state, and
the closely related specific heat, are among the most important
properties. We shall, therefore, take them up, giving necessarily
enough thermodynamics and statistical mechanics to make
calculations possible. Our investigations will be concerned with
the thermal motion of the nuclei, moving under the interatomic
potential which we have investigated. We shall naturally not
be able to treat all sorts of substances; liquids, for instance, are
so complicated that comparatively little progress has yet been
made in understanding their properties. But gases and crystal-
line solids both present features of simplification which we can
make use of.
262. Gases, Liquids, and Solids. — Before passing to our
analysis, let us consider what types of behavior we wish to explain.
We can conveniently divide our discussion into gases, liquids,
and solids. A monatomic gas, as an inert gas, is the simplest
case: we have only to find its pressure and total energy, as a
function of volume and temperature, a task which can be carried
out when we know the law of force between molecules. Gases
of valence compounds, however, are more complicated. Their
equation of state is not much harder to approximate than with
monatomic gases, at least at low density, for on account of the
rotation of the molecules they act on the average as if they were
spherically symmetrical, and we need use only the intermolecular
force averaged over angles in deriving the equation of state. In
the specific heat, however, there are two forms of energy to con-
sider: the translational kinetic energy of the molecules as a whole,
which acts just as in monatomic gases, but also the rotational
and vibrational energy of the individual molecules. This involves
a different sort of calculation. A still further complication
appears in gases of some ionic substances, and of some valence
compounds like I₂ and NO. Here there are several types of
molecule which can be simultaneously present in the gas, as
2I ⇌ I₂, 2Na ⇌ Na₂, and a proper treatment of the equation of
state and specific heat would demand investigation of the equi-
librium concentrations of the constituents, and their change with
pressure and temperature.
A liquid is more complicated than a gas, in that the molecules
are so closely in contact that they can no longer be treated as
points. The liquids of the inert gases are, of course, exceptions,
and there are a few other exceptions, diatomic and polyatomic
substances whose molecules rotate even in the liquid, and so act
like spherical systems. But with most liquids the molecules are
bulky enough so that they do not rotate, and are definitely non-
spherical in their average behavior. In considering the equations
of state, in particular the compressibility, one can no longer, as
with a gas, neglect the change of volume of the molecules with
change of pressure. As the molecules become larger and larger,
as with complicated organic compounds, the distinction between
forces within and forces between molecules becomes lost, and
the whole liquid must be treated as a single complex, the volume
being determined more and more definitely by the space required
to pack the atoms together.
The state of close-packing of atoms which we have just men-
tioned is definitely reached with solids. In fact, with noncrystal-
line solids, there is no sharp distinction between the states, as
glass for instance shows, solidifying perfectly continuously from
the liquid. The solids with definite melting points are the
crystals, which have a definite lattice arrangement of the atoms
which is not met in the liquid. This regularity of arrangement
is the simplifying feature which makes it possible to treat crystals
theoretically. We can here commence our discussion with the
state at absolute zero of temperature, where the atoms are at
rest, and the whole crystal is in a position of equilibrium of the
interatomic forces. The compressibility of such crystals can
be fairly easily found from the forces, and this has been carried
through particularly successfully for some of the ionic crystals.
Then we can treat the crystal in thermal agitation by investigat-
ing the small oscillations of the atoms about their positions of
equilibrium, using the method of normal coordinates. This
makes it possible to consider both equation of state and specific
heat with fair ease and generality.
Out of all the group of topics which we have suggested, there
are a few which can be treated theoretically fairly successfully.
First, there is the equation of state of rare gases, or of polyatomic
gases whose molecules rotate so as to be spherically symmetrical
on the average. This is what we take up in the present chapter.
Secondly, there is the specific heat of rotation and vibration of
molecules. Thirdly, one can consider the equilibrium between
different types of molecules in a gas, the question of chemical
equilibrium. Fourthly, the equation of state and specific heat
of crystalline solids can be investigated. As a preliminary to
these, we must extend our treatment of statistical mechanics,
which we have already considered slightly in Chap. XXX. We
first follow out the ideas of classical statistics a little further,
treat the equation of state of a gas by those methods, and then
go to quantum statistics, asking what changes are introduced.
263. The Canonical Ensemble. — Following Chap. XXX, we
consider a phase space; that is, a space in which each coordinate
and each momentum of the system is plotted as a variable.
Let the coordinates be $q_1 \ldots q_n$, the momenta $p_1 \ldots p_n$,
the Hamiltonian function $H(q_1 \ldots p_n)$. Then the phase
space has 2n dimensions, and a point in this space represents a
whole system (for instance a sample of gas). Next we set up
an ensemble of points in this space, the number in the volume
element $dq_1 \ldots dp_n$ being proportional to $f(q_1 \ldots p_n)\,dq_1 \ldots dp_n$.
We assume all points of the ensemble to be equally likely;
that is, we assume that the probability that the coordinates
and momenta of the system actually lie in the region $dq_1 \ldots dp_n$
is proportional to the number of points of the ensemble in this
region, or is proportional to $f\,dq_1 \ldots dp_n$. Then to find the
average of any function of the coordinates and momenta, as
$F(q_1 \ldots p_n)$, we multiply by $f$, integrate, and divide by the
integral of $f$:
$$\bar{F} = \frac{\int F f\,dq_1 \ldots dp_n}{\int f\,dq_1 \ldots dp_n},$$
as we saw in Chap. XXXI.
Now in particular we set up the canonical ensemble,
$$f(q_1 \ldots p_n) = \text{constant}\; e^{-\frac{H(q_1 \ldots p_n)}{kT}},$$
where T is the absolute temperature. This ensemble gives the
probability that a system in thermal equilibrium at temperature
T will have its coordinates and momenta within given limits.
The essential physical reason for this is the following: Suppose
we have two systems, the first of coordinates and momenta
$q_1 \ldots q_n$, $p_1 \ldots p_n$, the second $q_{n+1} \ldots q_m$, $p_{n+1} \ldots p_m$,
with the separate Hamiltonian functions $H_1(q_1 \ldots p_n)$, $H_2(q_{n+1} \ldots p_m)$.
Then physically we know that, if 1 and 2 are at the
same temperature, and are then allowed to interact slightly,
as by interchanging energy, it will be found that they are already
in equilibrium with each other, and they already form a combined
system in equilibrium at this temperature. This, in fact, is
the definition of equality of temperature. But this is satisfied
for the canonical ensemble. Thus if the separate systems are
in equilibrium, their distribution functions are
$$f_1(q_1 \ldots p_n) = \text{constant}\; e^{-\frac{H_1(q_1 \ldots p_n)}{kT}},$$
$$f_2(q_{n+1} \ldots p_m) = \text{constant}\; e^{-\frac{H_2(q_{n+1} \ldots p_m)}{kT}}.$$
By the laws of probability, then, the probability that simultaneously
the coordinates $q_1 \ldots p_n$ will be in the range $dq_1 \ldots dp_n$,
and that $q_{n+1} \ldots p_m$ will be in $dq_{n+1} \ldots dp_m$, is proportional to
the product of these probabilities, or constant $e^{-\frac{(H_1 + H_2)}{kT}}\,dq_1 \ldots dp_n\,dq_{n+1} \ldots dp_m$.
But now suppose that the two systems are
allowed to interact. The combined system will have an energy
$H_1 + H_2 + H'$, where $H'$ is a small interaction potential,
depending perhaps on all coordinates and momenta, negligibly
small compared with the separate energies (as for instance an
interatomic force between the negligibly small number of
molecules on the boundary between the two systems, which
permits the flow of heat between them). Then according to the
canonical ensemble, the distribution of the whole combined
system in thermal equilibrium should be constant $e^{-\frac{(H_1 + H_2 + H')}{kT}}$.
But we observe that, except for the negligible energy $H'$, this
is just the distribution before the interaction, so that the two
systems were already in equilibrium before the interaction, and
by definition are at the same temperature. This result is true
only with the canonical ensemble, since it depends on the expo-
nential form, adding exponents being equivalent to multiplying
the functions.
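The dependence of this argument on the exponential form can be checked numerically. The Python sketch below is only an illustration under assumed inputs: the Hamiltonians `h1` and `h2`, the value of kT, and the sample points are hypothetical choices, not taken from the text, and the power-law weight is introduced purely as a counterexample. It compares the product of the separate weights with the weight of the summed energy.

```python
import math

K_T = 1.0  # kT in arbitrary energy units (assumed value)

def h1(q, p):
    """Hypothetical Hamiltonian of system 1: a harmonic oscillator."""
    return 0.5 * p * p + 0.5 * q * q

def h2(q, p):
    """Hypothetical Hamiltonian of system 2: a quartic oscillator."""
    return 0.5 * p * p + q ** 4

def canonical(energy):
    """Unnormalized canonical weight exp(-H/kT)."""
    return math.exp(-energy / K_T)

def power_law(energy):
    """A non-exponential weight, used only as a counterexample."""
    return 1.0 / (1.0 + energy) ** 2

def factorization_error(weight, points):
    """Largest relative gap between weight(H1)*weight(H2) and weight(H1+H2)."""
    worst = 0.0
    for q1, p1, q2, p2 in points:
        e1, e2 = h1(q1, p1), h2(q2, p2)
        joint = weight(e1 + e2)
        worst = max(worst, abs(weight(e1) * weight(e2) - joint) / joint)
    return worst

# Assumed sample points (q1, p1, q2, p2) in the combined phase space.
POINTS = [(0.3, -0.2, 0.5, 0.1), (1.0, 0.7, -0.4, 0.9), (-1.5, 0.0, 0.8, -1.1)]
canonical_gap = factorization_error(canonical, POINTS)
power_law_gap = factorization_error(power_law, POINTS)
print(canonical_gap, power_law_gap)
```

For the canonical weight the gap is at the level of rounding error, while for the power-law weight it is not small, which is the sense in which the result holds only for the exponential form.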
Suppose we choose the constant in the definition of the canonical
ensemble so that $\int f\,dq_1 \ldots dp_n = 1$, and avoid having to
bother with the denominator in taking averages; this corresponds
to normalizing a wave function. Further, let us write the constant
in the form $e^{F/kT}/h^n$, so that we have
$$f(q_1 \ldots p_n) = \frac{e^{\frac{F - H}{kT}}}{h^n}$$
and
$$\int f(q_1 \ldots p_n)\,dq_1 \ldots dp_n = 1 = \frac{1}{h^n}\int e^{\frac{F - H}{kT}}\,dq_1 \ldots dp_n. \tag{1}$$
Here F is a quantity of the dimensions of energy, a function of T,
chosen to make the constant have the correct value. Since
$e^{F/kT}$ is dimensionless, and since the function $f$ must have the
dimensions of $1/(dq_1 \ldots dp_n)$, in order to make its integral dimensionless,
we must multiply by a constant of these dimensions.
We have chosen $1/h^n$, which has the correct dimensions, since
h is of the dimensions of pq. It is a purely arbitrary matter
that we have chosen this particular constant, since in all ordinary
physical applications the constant drops out anyway, and it
does not imply the introduction of quantum theory into classical
questions. We shall later see, however, that it simplifies the
comparison with quantum theory to have it there.
264. The Free Energy. — Let us take the factor $e^{F/kT}$ out of the
integral above (it does not depend on the q's and p's), and
divide through by it. Then we have
$$e^{-\frac{F}{kT}} = \frac{1}{h^n}\int e^{-\frac{H}{kT}}\,dq_1 \ldots dp_n. \tag{2}$$
The integral on the right is often called the integral of state (we
shall later see cases where it degenerates to a sum, called the sum
of state). It is fundamental in thermodynamic applications.
The quantity F is the free energy, and we proceed to investigate
its properties. We have seen that it depends on the tempera-
ture; but we must also observe that it depends on the volume.
To see how this comes about, let us think about the Hamiltonian
function H, in particular for a gas. We are considering only
the nuclear motion, so that H includes the kinetic energy of the
nuclei, and the potential energy of the interatomic forces, as
discussed in the last chapter. But it also includes another term,
if the gas is in an enclosure: the repulsion of the wall. The
molecules of the gas, as they strike the wall, are repelled, so
violently that they never penetrate the wall. We may say
approximately that the potential energy becomes rapidly infinite
as any molecule approaches the wall, and is infinite if any
molecule is outside it, so that $e^{-H/kT}$ is zero in that case, and
there is no probability of finding one of the molecules outside.
Now this term in the potential depends on the volume of the
vessel, the rapid rise of potential coming at the edge of the
volume, which is adjustable. Thus we have $H(q_1 \ldots p_n, v)$,
where v is the volume, so that the free energy, which depends
on an integral of this quantity, also depends on v as well as
T.
Let us investigate the rates of change of the free energy with
respect to volume and temperature. We have
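$$-\frac{1}{kT}\frac{\partial F}{\partial v}\,e^{-\frac{F}{kT}} = \frac{1}{h^n}\int\left(-\frac{1}{kT}\frac{\partial H}{\partial v}\right)e^{-\frac{H}{kT}}\,dq_1 \ldots dp_n,$$
or
$$\frac{\partial F}{\partial v} = \overline{\frac{\partial H}{\partial v}},$$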
where we remember the formula for finding the average of any
quantity. Now consider a cylinder filled with gas, closed with
a piston of unit area. If we decrease the volume, the increment
of volume being $-dv$, which therefore equals numerically the
displacement of the piston, and if the pressure, and therefore
the force on the piston is p, we shall do the work $-p\,dv$ on the
system. This will represent the increase in energy of the system,
or $dH$. Hence we have $\partial H/\partial v = -p$. We may consider this
relation as stating that p is the generalized force connected with
a generalized coordinate v, and therefore equal to the negative
derivative of the energy with respect to this coordinate. Performing
the average, we then have
$$\left(\frac{\partial F}{\partial v}\right)_T = -p. \tag{3}$$
Next we can differentiate the free energy with respect to
temperature. We have
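$$\frac{\partial}{\partial T}\left(e^{-\frac{F}{kT}}\right) = \left(\frac{F}{kT^2} - \frac{1}{kT}\frac{\partial F}{\partial T}\right)e^{-\frac{F}{kT}} = \frac{1}{h^n}\int\frac{H}{kT^2}\,e^{-\frac{H}{kT}}\,dq_1 \ldots dp_n,$$
or
$$F - T\frac{\partial F}{\partial T} = \bar{H}.$$
If we define $\bar{H}$, the mean energy of all systems of the ensemble,
as the internal energy E, we have
$$F - T\left(\frac{\partial F}{\partial T}\right)_v = E, \tag{4}$$
the familiar Gibbs-Helmholtz equation.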
From Eqs. (3) and (4), we have
$$dF = -p\,dv - \frac{E - F}{T}\,dT. \tag{5}$$
Now let us define the entropy S by the equation
$$F = E - TS, \qquad S = \frac{E - F}{T}. \tag{6}$$
Differentiating, this leads to $dE = dF + T\,dS + S\,dT$. Similarly,
Eq. (5) becomes $dF = -p\,dv - S\,dT$. Combining, we are
led to
$$dE = T\,dS - p\,dv. \tag{7}$$
Equation (7) is the fundamental equation of thermodynamics,
which we have derived from statistical methods. For the first
law of thermodynamics is dE = dQ - pdv, where dQ is the heat
absorbed in a process, pdv is the work done by the system. And
the second law of thermodynamics is that for a reversible change
(as our change is, since we assume that the distribution is always
given by a canonical ensemble, which means that it is always in
equilibrium), the quantity $dQ/T$ is a perfect differential, $dS$.
Combining these statements, we have Eq. (7).
The specific heat can be found immediately by differentiating
the energy with respect to temperature at constant volume.
Using Eqs. (4) and (7), it is
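$$C_v = \left(\frac{\partial E}{\partial T}\right)_v = T\left(\frac{\partial S}{\partial T}\right)_v = -T\left(\frac{\partial^2 F}{\partial T^2}\right)_v. \tag{8}$$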
Thus we can find the specific heat, as well as the equation of
state, by differentiating the free energy. This makes it a very
useful function, and its calculation, by means of the integral
of state, is the usual method of deriving information about
physical properties of substances. Of course, we could derive
the same information from the energy itself as a function of
volume and temperature, but it is not quite so convenient to
calculate.
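As a check on this procedure, the steps of Eqs. (2), (4), and (8) can be carried through numerically for a system whose answer is known. The sketch below is an illustration only, under assumptions not made in the text: a single one-dimensional harmonic oscillator with m = ω = k = h = 1, a midpoint rule for the integral of state, and finite differences for the temperature derivatives. For this case the classical result should be E = kT and C_v = k.

```python
import math

def integral_of_state(temperature, n=2000, span=12.0):
    """Phase integral of exp(-H/kT) for H = p^2/2 + q^2/2 (m = omega = k = 1)."""
    width = span * math.sqrt(temperature)   # many thermal widths on each side
    step = 2.0 * width / n
    # The integrand factors into equal q and p parts, so one 1-D sum is squared.
    one_dim = sum(
        math.exp(-((-width + (i + 0.5) * step) ** 2) / (2.0 * temperature))
        for i in range(n)
    ) * step
    return one_dim * one_dim

def free_energy(temperature, h=1.0):
    """F from exp(-F/kT) = (1/h^n) * integral of state, with n = 1 and k = 1."""
    return -temperature * math.log(integral_of_state(temperature) / h)

def internal_energy(temperature, dt=1e-4):
    """E = F - T (dF/dT)_v, using a centered finite difference."""
    slope = (free_energy(temperature + dt) - free_energy(temperature - dt)) / (2 * dt)
    return free_energy(temperature) - temperature * slope

def specific_heat(temperature, dt=1e-3):
    """C_v = -T (d^2F/dT^2)_v, using a centered second difference."""
    f_plus, f_mid, f_minus = (free_energy(temperature + d) for d in (dt, 0.0, -dt))
    return -temperature * (f_plus - 2.0 * f_mid + f_minus) / dt ** 2

energy_at_2 = internal_energy(2.0)   # expected close to kT = 2
heat_at_2 = specific_heat(2.0)       # expected close to k = 1
print(energy_at_2, heat_at_2)
```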
265. Properties of Perfect Gases on Classical Theory.— Let us
apply the method of the free energy to the calculation of the
equation of state and specific heat of a perfect gas, on classical
mechanics. Let there be N molecules, each of mass m, so that
$$H = \sum_{i=1}^{N}\frac{p_i^2}{2m} + V,$$
where V, the potential energy, is zero so long as all molecules
are within the volume v, but becomes infinite if even one molecule
strays outside. Then we have
$$\int e^{-\frac{H}{kT}}\,dq_1 \ldots dp_N = \int_{-\infty}^{\infty} e^{-\frac{p_{x1}^2}{2mkT}}\,dp_{x1} \cdots \int_{-\infty}^{\infty} e^{-\frac{p_{zN}^2}{2mkT}}\,dp_{zN} \int \cdots \int e^{-\frac{V}{kT}}\,dx_1 \ldots dz_N. \tag{9}$$
Now by direct integration each of the integrals over the p's
is simply $\sqrt{2\pi mkT}$. The integral over coordinates is the integral
of unity over all regions where the coordinates are inside v, and of zero over
the outside, so that it is
$$\iiint_v dx_1\,dy_1\,dz_1 \cdots \iiint_v dx_N\,dy_N\,dz_N = v^N.$$
Thus we have finally
$$e^{-\frac{F}{kT}} = \left(\frac{\sqrt{2\pi mkT}}{h}\right)^{3N} v^N, \tag{10}$$
a function of temperature and volume as it should be, and the free
energy itself is
$$F = -3NkT \ln\frac{\sqrt{2\pi mkT}}{h} - NkT \ln v.$$
From this we have at once $p = NkT/v$, giving the ordinary law
of perfect gases, and $C_v = \tfrac{3}{2}Nk$, likewise a well-known result.
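The two differentiations can be verified numerically from the perfect-gas free energy found above. The sketch below is a check under assumed values, not part of the original treatment: it works in SI units rather than ergs, and the particle number, the helium-like mass, the temperature, and the volume are example numbers chosen only for illustration. The derivatives are taken by centered finite differences.

```python
import math

K_B = 1.380649e-23         # Boltzmann constant k, joules per kelvin
H_PLANCK = 6.62607015e-34  # Planck constant h, joule seconds

def free_energy(n_molecules, mass, temperature, volume):
    """Perfect-gas free energy F = -3NkT ln(sqrt(2 pi m k T)/h) - NkT ln v."""
    root = math.sqrt(2.0 * math.pi * mass * K_B * temperature) / H_PLANCK
    n_k_t = n_molecules * K_B * temperature
    return -3.0 * n_k_t * math.log(root) - n_k_t * math.log(volume)

def pressure(n_molecules, mass, temperature, volume, rel_step=1e-6):
    """p = -(dF/dv)_T by a centered difference."""
    dv = rel_step * volume
    upper = free_energy(n_molecules, mass, temperature, volume + dv)
    lower = free_energy(n_molecules, mass, temperature, volume - dv)
    return -(upper - lower) / (2.0 * dv)

def specific_heat(n_molecules, mass, temperature, volume, dt=0.1):
    """C_v = -T (d^2F/dT^2)_v by a centered second difference."""
    plus = free_energy(n_molecules, mass, temperature + dt, volume)
    mid = free_energy(n_molecules, mass, temperature, volume)
    minus = free_energy(n_molecules, mass, temperature - dt, volume)
    return -temperature * (plus - 2.0 * mid + minus) / dt ** 2

# Assumed example: about one mole of helium-like atoms at 300 K in 0.0248 m^3.
N, M, T, V = 6.02214076e23, 6.65e-27, 300.0, 0.0248
p_numeric = pressure(N, M, T, V)
cv_numeric = specific_heat(N, M, T, V)
# Ratios to NkT/v and (3/2)Nk; both should be very close to unity.
print(p_numeric / (N * K_B * T / V), cv_numeric / (1.5 * N * K_B))
```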
266. Properties of Imperfect Gases on Classical Theory. —
Next let us consider an imperfect monatomic gas, such as an
inert gas. This differs only in that there is an additional term
in the Hamiltonian, a sum of interaction energies of each pair
of atoms: $H = \text{kinetic energy} + \sum_{\text{pairs } i,j} V_{ij} + \text{repulsion of walls}$,
where $V_{ij}$ may be the sum of a Van der Waals attraction between
the ith and jth atoms at large relative distances, and an exponential
repulsion at small distance. We then have
$$e^{-\frac{F}{kT}} = \left(\frac{\sqrt{2\pi mkT}}{h}\right)^{3N} \iiint dx_1\,dy_1\,dz_1 \cdots \iiint dx_N\,dy_N\,dz_N\; e^{-\frac{\sum V_{ij}}{kT}}. \tag{11}$$
The integration over the coordinates can be carried out in steps.
First we integrate over the coordinates of the Nth molecule. The
quantity $e^{-\frac{\sum V_{ij}}{kT}}$ can be factored: it is equal to
$$e^{-\frac{\sum' V_{ij}}{kT}}\; e^{-\frac{\sum_{i \neq N} V_{iN}}{kT}},$$
where $\sum'$ represents all those pairs which do not include the Nth
molecule. The first factor then does not depend on the coordinates
of the Nth molecule, and may be taken outside the integration
over its coordinates, leaving
$$\iiint_v e^{-\frac{\sum_i V_{iN}}{kT}}\,dx_N\,dy_N\,dz_N.$$
We rewrite this as
$$\iiint_v dx_N\,dy_N\,dz_N - \iiint_v \left(1 - e^{-\frac{\sum_i V_{iN}}{kT}}\right) dx_N\,dy_N\,dz_N = v - W, \tag{12}$$
the first term being simply the volume, the second term being an
integral to be evaluated. To investigate W, let us imagine all
the molecules except the Nth being in definite positions of space.
If the gas is rare, the chances are that they will be well separated
from each other. Now if the point $x_N y_N z_N$ is far from any of these
molecules, the interatomic potentials $V_{iN}$ will all be small, and
the integrand will be practically $1 - e^0 = 0$. Thus we have
contributions to this integral only from the immediate neighborhood
of each molecule. If all are alike, each of these contributions
will be equal to
$$w = \iiint \left(1 - e^{-\frac{V_{iN}}{kT}}\right) dx_N\,dy_N\,dz_N,$$
a quantity which, though it formally involves the index i, actually
is independent of i. In fact, if we imagine the ith molecule to be
located at the origin, and remember that $V_{iN}$ is a function of r,
the distance from the origin, we see at once that
$$w = \int_0^\infty 4\pi r^2\left(1 - e^{-\frac{V}{kT}}\right) dr, \tag{13}$$
where we integrate to infinity instead of to the boundary of the
vessel because the integrand is so small for r's larger than mole-
cular dimensions that it makes no difference. In terms of this,
we then have
$$W = (N - 1)w. \tag{14}$$
Now when we integrate over the coordinates of the $(N - 1)$st
particle, we have just the same situation over again, except that
there are only $(N - 2)$ remaining molecules, and so on. Thus
finally we have for the integral over coordinates
$$[v - (N-1)w][v - (N-2)w] \cdots v.$$
We can easily evaluate this product, by taking its logarithm,
which is what we want anyway. This is
$$\sum_{s=0}^{N-1} \ln(v - sw) = \sum_{s=0}^{N-1} \ln v + \sum_{s=0}^{N-1} \ln(1 - sw/v).$$
The first term is $N \ln v$, which we should have for the perfect
gas. For the second, we note that on account of the rarity of the
gas, sw/v is always small compared with 1. Hence we have
$\ln(1 - sw/v) = -sw/v$ approximately, and the sum is approximately
equal to the integral with respect to s, or
$$-\int_0^{N-1} \frac{sw}{v}\,ds = -\frac{(N-1)^2 w}{2v}.$$
To this order, then, neglecting unity in comparison
with N, we have
$$F = -3NkT \ln\frac{\sqrt{2\pi mkT}}{h} - NkT \ln v + \frac{N^2 wkT}{2v}. \tag{15}$$
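The replacement of the sum of logarithms by the closed form $-N^2w/2v$ can be tested directly. The sketch below is an illustration with assumed numbers only: the particle number and the ratio w/v are hypothetical, chosen so that Nw/v is small, as the argument for a rare gas requires.

```python
import math

def log_coordinate_correction(n_molecules, w_over_v):
    """Exact sum of ln(1 - s*w/v) for s = 0 .. N-1."""
    return sum(math.log1p(-s * w_over_v) for s in range(n_molecules))

def integral_estimate(n_molecules, w_over_v):
    """The closed form -N^2 w / (2v), with unity neglected beside N."""
    return -0.5 * n_molecules ** 2 * w_over_v

N_MOLECULES = 10_000   # assumed, far smaller than a real sample
W_OVER_V = 1.0e-7      # assumed ratio w/v, so that N*w/v = 1e-3

exact = log_coordinate_correction(N_MOLECULES, W_OVER_V)
estimate = integral_estimate(N_MOLECULES, W_OVER_V)
relative_gap = abs(exact - estimate) / abs(exact)
print(exact, estimate, relative_gap)
```

The remaining gap comes from the neglected unity beside N and from the next term in the expansion of the logarithm, both of which are small when Nw/v is small.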
We then have
$$p = \frac{NkT}{v} + \frac{N^2 wkT}{2v^2} + \cdots.$$
This is often written in the form
$$\frac{pv}{RT} = 1 + \frac{Nw}{2v} + \cdots, \tag{16}$$
where R = Nk. This expression pv/RT is called the virial, and
the coefficients of its expansion in inverse powers of the volume
are called the virial coefficients, so that Nw/2 is the second virial
coefficient. The results of experiments on imperfect gases are
ordinarily given as tables of the virial coefficients as functions of
temperature, and by the equation above we can compute the
second coefficient, finding w if necessary by numerical integration
from $V(r)$. In addition to the pressure, we can, of course, find
the specific heat, and it immediately comes out the same as for
the perfect gas. We must remember, however, the rotational
and vibrational specific heats of the polyatomic gases, which
must be added to the translational terms to get the total specific
heat.
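The numerical integration for w mentioned above can be sketched as follows. This is a minimal illustration, not a prescribed method: the midpoint rule, the cutoff `r_max`, and the function names are assumptions, and the Lennard-Jones form in reduced units is used only as a convenient stand-in for an interatomic potential with strong repulsion at small r and a weak attraction at large r. The rigid-sphere case serves as a check, since there w is the volume of a sphere of radius r0.

```python
import math

def w_integral(potential, k_t, r_max=20.0, n=200_000):
    """Midpoint-rule estimate of w = integral of 4 pi r^2 (1 - exp(-V/kT)) dr."""
    step = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * step
        total += r * r * (1.0 - math.exp(-potential(r) / k_t))
    return 4.0 * math.pi * total * step

def hard_sphere(r, r0=1.0):
    """Rigid spheres of diameter r0: infinite inside r0, zero outside."""
    return math.inf if r < r0 else 0.0

def lennard_jones(r):
    """Assumed stand-in potential in reduced units (depth 1, zero at r = 1)."""
    return 4.0 * (r ** -12 - r ** -6)

def second_virial(n_molecules, w):
    """Second virial coefficient N w / 2."""
    return 0.5 * n_molecules * w

w_hard = w_integral(hard_sphere, 1.0)      # should approach 4 pi / 3
w_hot = w_integral(lennard_jones, 10.0)    # repulsion dominates at high kT
w_cold = w_integral(lennard_jones, 1.0)    # attraction dominates at low kT
print(w_hard, w_hot, w_cold)
```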
267. Van der Waals' Equation. — There is a limiting case in
which we can compute w approximately. This is the case where
the attractive part of $V(r)$ varies slowly with r, while the repulsive
part varies so rapidly that it can be considered zero if r is
greater than $r_0$, infinite if r is less. This is what we should have
if the molecules were rigid spheres of diameter $r_0$, attracted by the
Van der Waals' attraction. If we let $V_0(r)$ represent the attraction,
we have
$$e^{-\frac{V(r)}{kT}} = e^{-\frac{V_0(r)}{kT}} \text{ if } r > r_0, \qquad = 0 \text{ if } r < r_0.$$
The integral then is
$$w = \int_0^{r_0} 4\pi r^2 (1 - 0)\,dr + \int_{r_0}^{\infty} 4\pi r^2\left[1 - e^{-\frac{V_0}{kT}}\right] dr.$$
The first term is simply $\tfrac{4}{3}\pi r_0^3$, the volume of a sphere of radius
$r_0$, or eight times the volume of the sphere of diameter $r_0$ which
represents a molecule. In the second integral, we may expand the
exponential as a power series, since $V_0$ is relatively small: it is
$1 - [1 - (V_0/kT) \cdots] = V_0/kT$. Thus this term is
$\frac{1}{kT}\int_{r_0}^{\infty} 4\pi r^2 V_0\,dr + \cdots$.
If, for instance, we have the type of Van der Waals'
force considered in the last chapter, we have $V_0 = -\beta/r^6$, where
$\beta$ = an 2. Then the term is $-(4\pi\beta/3r_0^3kT)$. In this case, the
second virial coefficient becomes
$$\frac{Nw}{2} = \frac{N}{2}\left(\frac{4}{3}\pi r_0^3\right) - \frac{2N\pi\beta}{3r_0^3 kT},$$
the further terms being in higher inverse powers of T. We may
write this
$$b - \frac{A}{RT},$$
where b is four times the volume of all molecules, $A = 2\pi N^2\beta/3r_0^3$.
Actual gases have second virial coefficients which agree well with
this formula. The pressure, in other words, is given by the
result
$$\frac{pv}{RT} = 1 + \frac{b}{v} - \frac{A}{RTv} + \cdots, \tag{17}$$
being greater than for a perfect gas for large T (the b/v term
preponderating), and less for small T. Physically, at high temperature,
the finite size of the molecules, given by b, decreases
the apparent volume, which produces an increase of pressure;
while at lower temperatures the attractions between molecules,
given by A, pull the gas together.
There is a very well-known equation, Van der Waals' equation,
for the pressure of an imperfect gas. This is
$$\left(p + \frac{A}{v^2}\right)(v - b) = RT. \tag{18}$$
This differs from the equation of state of a perfect gas in two
respects: in having the volume $(v - b)$ in place of v, as if the molecules
took up space, and in having the pressure increased by the
amount $A/v^2$. The arguments used to deduce the equation are
not reliable, and it cannot be regarded as more than a very useful
empirical formula. But as far as the second virial coefficient is
concerned, it is correct. If we compute pv/RT from it, and
expand in inverse powers of volume and temperature, we can at
once show that the expansion is what we have already found, as
far as the term in 1/v, the values of b and A agreeing with those
we have already given. The higher terms in the expansion,
however, do not agree with what we should get by correct
calculation.
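The agreement as far as the term in 1/v, and the disagreement beyond it, can be seen in a short numerical comparison. The sketch below uses assumed values of RT, A, and b in arbitrary consistent units, chosen only for illustration. It evaluates pv/RT from Van der Waals' equation and from the expansion kept to the second virial coefficient; the difference should fall off as $(b/v)^2$ for large v.

```python
def compressibility_van_der_waals(v, r_t, a_const, b_const):
    """pv/RT from (p + A/v^2)(v - b) = RT."""
    p = r_t / (v - b_const) - a_const / v ** 2
    return p * v / r_t

def compressibility_virial(v, r_t, a_const, b_const):
    """pv/RT = 1 + (b - A/RT)/v, kept to the second virial coefficient."""
    return 1.0 + (b_const - a_const / r_t) / v

R_T, A_CONST, B_CONST = 2.0, 3.0, 1.0   # assumed values, arbitrary units

# Difference between the two expressions at increasing volume.
gaps = {
    v: compressibility_van_der_waals(v, R_T, A_CONST, B_CONST)
    - compressibility_virial(v, R_T, A_CONST, B_CONST)
    for v in (10.0, 100.0, 1000.0)
}
for v, gap in gaps.items():
    print(v, gap, (B_CONST / v) ** 2)
```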
268. Quantum Statistics. — Distribution functions, and hence
canonical ensembles, have a rather different meaning in quantum
theory from what they have in classical mechanics. For on
account of the uncertainty principle we can no longer specify
both coordinates and momenta, and hence cannot give functions
of the q's and p's. Instead, as we have seen, we deal with a wave
function $\psi$, such that $\bar{\psi}\psi$ gives the probability of finding the
system at a given point of space. We could set up the corresponding
quantity in classical statistics: if $f(q_1 \ldots p_n)$ is the
ordinary distribution function, normalized so that its integral is
unity, then $\int \cdots \int f(q_1 \ldots p_n)\,dp_1 \ldots dp_n$ would give a function
of the q's, giving the probability of finding the system with
given q's. Thus we should have the correspondence
$$\int \cdots \int f(q_1 \ldots p_n)\,dp_1 \ldots dp_n \sim \bar{\psi}\psi,$$
the two quantities agreeing at any rate in the limit of large quan-
tum numbers, where classical and quantum theory approach each
other.
It is not difficult to show that this correspondence holds, at
least with one degree of freedom. First, we consider micro-
canonical ensembles, ensembles in which all systems have the
same energy, but are distributed in phase as if they had started
off at all arbitrary instants of time. In such a case, with one
degree of freedom, the probability of finding a system in a given
range of coordinates is proportional to the length of time a system
would stay in that range, or is inversely as its velocity. But
now the corresponding quantum ensemble is one in which all
systems are in the same stationary state. And using the Wentzel-
Kramers-Brillouin method, we have already seen, in Chap.
XXIX, that $\bar{\psi}\psi$ is approximately proportional to $1/\sqrt{E - V}$, or
inversely as the velocity, so that in this case we actually have the
correspondence we desire between classical and quantum theories.
The same thing can be shown with more than one degree of
freedom.
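This correspondence can be illustrated for one degree of freedom with the linear oscillator. The sketch below is an assumed example, not taken from the text: it uses dimensionless units in which the mass, the angular frequency, and h/2π are all unity, builds the normalized oscillator wave functions by the standard three-term recurrence, and compares the quantum probability of finding the particle in the middle half of the classical range with the classical time-fraction, which is inversely as the velocity. The two should agree closely for a large quantum number and poorly for the lowest state.

```python
import math

def oscillator_density(n, x):
    """Square of the normalized nth oscillator wave function (reduced units)."""
    prev, cur = 0.0, math.pi ** -0.25 * math.exp(-0.5 * x * x)
    for k in range(n):
        # psi_{k+1} = sqrt(2/(k+1)) x psi_k - sqrt(k/(k+1)) psi_{k-1}
        prev, cur = cur, (math.sqrt(2.0 / (k + 1)) * x * cur
                          - math.sqrt(k / (k + 1)) * prev)
    return cur * cur

def quantum_probability(n, x_low, x_high, steps=4000):
    """Midpoint-rule integral of the quantum density over the range."""
    step = (x_high - x_low) / steps
    return step * sum(oscillator_density(n, x_low + (i + 0.5) * step)
                      for i in range(steps))

def classical_probability(amplitude, x_low, x_high):
    """Time-fraction in the range for the density 1/(pi sqrt(a^2 - x^2))."""
    return (math.asin(x_high / amplitude) - math.asin(x_low / amplitude)) / math.pi

def gap(n):
    """Quantum minus classical probability for |x| below half the amplitude."""
    amplitude = math.sqrt(2 * n + 1)   # turning point for energy n + 1/2
    half = 0.5 * amplitude
    return abs(quantum_probability(n, -half, half)
               - classical_probability(amplitude, -half, half))

print(gap(0), gap(30))
```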
Now any kind of classical ensemble which is independent of
time can be made up of microcanonical ensembles; we may regard
it as consisting of a certain distribution on each energy surface.
The corresponding situation is a quantum state in which all
stationary states are excited at once, represented by a wave
function $\sum_k c_k u_k e^{-2\pi i E_k t/h}$. The corresponding density, averaged
over the rapid time fluctuations, is $\sum_k \bar c_k c_k \bar u_k u_k$, corresponding to
a fraction $\bar c_k c_k$ of all the systems being in the kth stationary state,
or belonging to the particular microcanonical ensemble having
energy $E_k$. Let us see what is the classical ensemble corresponding
to this combination. We may approximate it in the following
way. Let us imagine the energy surfaces corresponding to the
stationary states drawn in the classical phase space. Then let
a fraction $\bar c_k c_k$ of the systems of the classical ensemble be uniformly
distributed through the region between the kth and
(k + 1)st energy surfaces, rather than just on the energy surfaces.
We do this to get a continuous function. Then evidently the
density of points between the kth and (k + 1)st surfaces will be
$\bar c_k c_k$ divided by the volume of phase space between these surfaces.
This volume, as we have seen, is $h^n$. Then we have the
approximation
$$f(q_1 \cdots p_n) \sim \frac{\bar c_k c_k}{h^n} \quad \text{between } E_k \text{ and } E_{k+1}.$$
This gives a step-like function for f, which would approach continuity
as the stationary states got closer together. Now it is
plain how we are to set up a canonical ensemble: we are to set
$\bar c_k c_k$ proportional to $e^{-E_k/kT}$, and this will then give the right
variation for f. Of course, our correspondence is not exact, but
we assume that the quantum canonical ensemble is the exactly
correct thing, the classical one the approximation to it. This is
justified by the fact that we can give just the same argument for
the canonical ensemble's representing thermal equilibrium in
the quantum theory that we could in classical theory, and we
know quantum theory to be the correct form in cases where it
differs from classical theory.
Having the canonical ensemble in quantum theory, we can now
proceed to the calculation of the free energy and equation of
state as we did in classical theory. To get exact correspondence,
we should set
$$f(q_1 \cdots p_n) = \frac{1}{h^n}e^{\frac{F-H}{kT}} = \frac{1}{h^n}e^{\frac{F-E_k}{kT}}.$$
Now the integral $\int\cdots\int f\,dq_1 \cdots dp_n$ goes over into a sum over
all stationary states, multiplied by the volume of phase space
associated with each stationary state, or $h^n$. Thus we have
$$1 = \sum_k e^{\frac{F-E_k}{kT}} = e^{\frac{F}{kT}}\sum_k e^{-\frac{E_k}{kT}},$$
and finally
$$e^{-\frac{F}{kT}} = \sum_k e^{-\frac{E_k}{kT}}. \qquad (19)$$
In the case of degeneracy, where there are several stationary
states of the same energy, the sum in Eq. (19) includes a term for
each state, so that for an energy level with g states, we have g
times the contribution from a single level of the same energy.
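As a small numerical illustration of Eq. (19), which reads $e^{-F/kT} = \sum_k e^{-E_k/kT}$ with each degenerate level counted g times, the sketch below evaluates F and the fractions of systems in each level. The function names and the two-level system in the demonstration are hypothetical, chosen only for illustration; energies are assumed to be given in joules.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K

def free_energy(levels, temperature):
    """Return F from Eq. (19); levels is a list of (E_k, g_k) pairs in joules."""
    kT = K_BOLTZMANN * temperature
    e_min = min(energy for energy, _ in levels)
    # Measure energies from the lowest level so the exponentials stay finite.
    z_shifted = sum(g * math.exp(-(energy - e_min) / kT) for energy, g in levels)
    return e_min - kT * math.log(z_shifted)

def occupation_fractions(levels, temperature):
    """Fraction of systems in each level, g_k * exp((F - E_k) / kT)."""
    kT = K_BOLTZMANN * temperature
    F = free_energy(levels, temperature)
    return [g * math.exp((F - energy) / kT) for energy, g in levels]

if __name__ == "__main__":
    demo_levels = [(0.0, 1), (2.0e-21, 3)]  # hypothetical two-level system
    print(free_energy(demo_levels, 300.0))
    print(occupation_fractions(demo_levels, 300.0))
```

The fractions add up to unity by construction, which is the normalization condition from which Eq. (19) was obtained.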
269. Quantum Theory of the Perfect Gas. — We have already
shown the correspondence between the classical and quantum
expressions for free energy, to the approximation to which the
Wentzel-Kramers-Brillouin method is accurate. This shows us
that, for both the perfect and imperfect gases, we may expect to
find about the same equation of state and specific heat on both
theories. The errors in the method are large only when the wave
length is changing very rapidly, and this actually comes, in this
problem, only when two molecules are in collision with each other,
or are colliding with the walls. Accurate discussion shows that
there are appreciable corrections to the classical equation of
state introduced in this way for the lightest gases (which there-
fore have longest wave length for a given velocity), but even
these are small, and difficult to discuss. It is easy, however, to
carry through the exact solution of the quantum theory of the
perfect gas, and this will suffice to show the general situation.
Let the gas be confined in a rectangular volume of sides A, B,
C. Then the wave functions for single molecules satisfying the
boundary conditions of being zero on the boundaries are
$\sin\frac{p\pi x}{A}\sin\frac{q\pi y}{B}\sin\frac{r\pi z}{C}$, where p, q, r are integers. A wave function
for the whole gas can be built up from this by multiplying
together functions for all the molecules, obtaining
$$u = \sin\frac{p_1\pi x_1}{A}\cdots\sin\frac{r_N\pi z_N}{C}.$$
Substituting in Schrödinger's equation,
$$-\frac{h^2}{8\pi^2 m}\sum_{i=1}^{N}\nabla_i^2 u + Vu = Eu,$$
where V is the same potential of repulsion of the walls which
we have considered before, we at once have
$$E = \frac{h^2}{8m}\left(\frac{p_1^2}{A^2} + \cdots + \frac{r_N^2}{C^2}\right). \qquad (20)$$
To get all states, we must take all combinations of the integers
$p_1 \cdots r_N$, each going from one to infinity. Thus we have
$$e^{-\frac{F}{kT}} = \sum_{p_1=1}^{\infty}\cdots\sum_{r_N=1}^{\infty} e^{-\frac{h^2}{8mkT}\left(\frac{p_1^2}{A^2} + \cdots + \frac{r_N^2}{C^2}\right)}$$
$$= \sum_{p_1=1}^{\infty} e^{-\frac{h^2p_1^2}{8A^2mkT}}\cdots\sum_{r_N=1}^{\infty} e^{-\frac{h^2r_N^2}{8C^2mkT}}. \qquad (21)$$
pi=l r w = l
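Now at reasonably high temperatures, T is so large that we have
to go to large values of the integers p, etc., before the exponential
begins to fall off appreciably. Thus the terms of our summation
differ only slightly from each other, and we can replace them by
an integral, one factor being
$$\int_0^\infty e^{-\frac{h^2p^2}{8A^2mkT}}\,dp = \frac{A}{h}\sqrt{2\pi mkT}.$$
Thus we have
$$e^{-\frac{F}{kT}} = \frac{(ABC)^N\left(\sqrt{2\pi mkT}\right)^{3N}}{h^{3N}} = \frac{v^N(2\pi mkT)^{\frac{3N}{2}}}{h^{3N}},$$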
where v = ABC is the volume of the gas, agreeing exactly with
the classical value, so that equation of state and specific heat
are not altered by using the quantum theory. At lower tempera-
tures, where we cannot replace the summation by an integration,
there will be discrepancies; the gas here is said to be "degenerate."
At the same time other features enter the situation, different
sorts of statistics known as the Bose and Fermi statistics, which
we shall discuss later in other connections. We shall not work
out the case of degeneracy here, since practically one cannot
reach such low temperatures without liquefying the gas, and
since we shall meet in the next chapter some corresponding
situations in solids, which are actually attained, and are of much
more physical interest.
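The replacement of the sums in Eq. (21) by integrals can be examined numerically. The sketch below compares one factor, the sum over p of $e^{-ap^2}$, with the corresponding integral $\tfrac12\sqrt{\pi/a}$, where the dimensionless quantity $a = h^2/8A^2mkT$ is introduced here only for convenience and is not a symbol of the text. The function names are likewise illustrative.

```python
import math

def box_sum(a):
    """Sum over p = 1, 2, ... of exp(-a p^2), with a = h^2 / (8 A^2 m k T) > 0."""
    total, p = 0.0, 1
    while True:
        term = math.exp(-a * p * p)
        total += term
        if term <= 1e-17 * total:  # remaining terms no longer matter
            return total
        p += 1

def box_integral(a):
    """Integral from 0 to infinity of exp(-a p^2) dp = (1/2) sqrt(pi / a)."""
    return 0.5 * math.sqrt(math.pi / a)

if __name__ == "__main__":
    for a in (1e-6, 1e-2, 1.0, 10.0):
        print(a, box_sum(a) / box_integral(a))
```

For small a (high temperature or a large container) the ratio is close to unity, while for a of the order of unity or larger the sum falls far below the integral, which is the region of degeneracy referred to above.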
Problems
1. For neon, experimentally, b = 20.6 c.c. for a mol. Find the equivalent
diameter r of the atoms, regarded as rigid spheres. Compare this with the
sum of the quantities $n^2/(Z - S)$ for the two atoms.
2. Using our approximate methods of dealing with Van der Waals'
attraction, and using the value of r from Prob. 1, compute the constant
A for neon. Compare with the experimental value of $0.21 \times 10^{12}$ absolute
units (you cannot expect very good agreement).
3. Using the experimental values of b and A for neon, from Probs. 1 and
2, draw a graph for the second virial coefficient as function of temperature.
At what temperature does the graph cross the T axis, and what does this
mean physically?
4. Carry out the expansion of Van der Waals' equation in virial form,
showing that the second virial coefficient is as we have found, and computing
the third virial coefficient as well.
5. Using Van der Waals' equation, plot a number of isothermals (lines of
constant T, p being plotted against v). Choose both low and high tem-
peratures. Use the constants given in Probs. 1 and 2 for neon. Note
that at low temperatures the isothermals have a maximum and minimum,
while at high temperatures they do not. As is well known, this maximum
and minimum are not really present, but the region in which they occur
is that in which gas and liquid are in equilibrium and exist as a mixture.
6. The critical point is that point where the maximum and minimum of
the isothermals of Van der Waals' equation coincide, or where the first
derivative of pressure with respect to volume at constant T has a double
root. Compute the critical pressure, temperature, and volume, for neon,
using the constants given in Probs. 1 and 2.
7. Hydrogen gas is confined in a container 10 cm. on a side. Find the
order of magnitude of the temperature at which it would become degenerate;
that is, the temperature at which most of the molecules would be in the
lowest quantum state.
8. Compute the internal energy and entropy of a perfect gas by the
classical theory.
CHAPTER XXXVII
NUCLEAR VIBRATIONS IN MOLECULES AND SOLIDS
In the last chapter, we have seen that in addition to the
equation of state of gases, there was another range of phenomena
which we could treat satisfactorily: the phenomena resulting
from nuclear vibrations in molecules, leading to the vibrational
specific heat, and in solids, leading to the equation of state and
specific heat. The mathematical methods used in dealing
with them are similar, so that they can be profitably treated
together. At the same time, the question of the stationary
states of vibrating molecules is of interest in itself, and can be
easily taken up.
We shall begin with the problem of a crystalline solid; the
extension to a molecule, which after all is not very different from
a fragment of such a solid, is not hard to make. Our problem
is to find pressure as function of temperature and volume, and
specific heat. Ordinarily the measurements of the equation
of state of a solid take the form of measuring the compressibility
and thermal expansion: we express volume as a function of
pressure and temperature, and have
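$$\text{compressibility} = \kappa = -\frac{1}{v}\left(\frac{\partial v}{\partial p}\right)_T,$$
$$\text{thermal expansion} = \frac{1}{v}\left(\frac{\partial v}{\partial T}\right)_p.$$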
We shall thus compute these quantities. Now a solid, unlike a
gas, behaves in a perfectly normal way at the absolute zero of
temperature. Its volume is finite and definitely determined,
being given from the equilibrium positions of its own atoms and
molecules, which all pack closely enough to be in their equilibrium
positions, since they have no kinetic energy. If external pres-
sure is applied, the volume will decrease, and we can compute
the compressibility. Temperature will not greatly change these
quantities: temperature agitation slightly increases the volume,
and makes the crystal more compressible, but these effects are
small enough to be treated as perturbations of the state at
absolute zero. Hence we begin by considering the crystal
without temperature agitation.
270. The Crystal at Absolute Zero. — The energy of a crystal
at the absolute zero, when its atoms are in perfectly definite
positions, is simply the sum of the interaction energies for all
pairs of atoms. In the position of equilibrium, this energy must
be a minimum with respect to any possible small deformation
of the crystal. Thus each separate atom is in equilibrium with
respect to a slight displacement, keeping all other atoms fixed,
so that it is at a minimum of potential, and could execute vibra-
tions about this position of equilibrium, which to a first approxi-
mation would be simple harmonic. But there are other sorts of
distortion to consider. For instance, we may decrease the whole
volume slightly, moving the atoms closer together but preserving
their relative arrangement, and the energy must be a minimum
with respect to such a distortion. It is this which particularly
interests us in computing compressibilities. Now in a very
simple crystal lattice, if the volume is decreased, the atoms will
still have just the same arrangement. Thus NaCl has a cubic
lattice, Na and Cl ions being found alternately at the corners
of cubes, and squeezing the whole lattice would merely decrease
the size of the cubes. The same thing is true of the simpler
metals. It is easy to see that it is not always the case; a crystal
composed of molecules rather loosely tied together would, under
compression, have the molecules forced closer together without
much change in the dimensions of each molecule. We do not
consider such complicated cases, however, but rather assume that
all interatomic distances r are proportional to the dimension
$\delta$ of the crystal as a whole.
Let us assume, then, a cube of crystal, of side $\delta$, a quantity
which depends on the pressure. Let this cube contain N atoms.
Now let us assume that the potential energy of the force between
two atoms at distance r is the sum of two terms: an attractive
term, negative in sign, proportional to 1/r for ionic crystals,
or exponential in r for valence crystals; and a repulsive term,
positive, and varying exponentially with r. The total energy
of the crystal is the sum of all interatomic potential terms.
This sum, for the exponential terms, is easy to compute. For
these terms fall off so rapidly that practically only the nearest
neighbors contribute appreciable terms to the energy. Thus
we simply take each of the N atoms, and sum up the exponential
terms to its s nearest neighbors. This, as we readily see, gives
each pair counted twice over, so that the sum is $\tfrac12 NsAe^{-ar}$,
where r is the distance to the nearest neighbor, A and a are
constants, or $\tfrac12 NsAe^{-\alpha\delta} = Ce^{-\alpha\delta}$, where $\alpha = ar/\delta$, since r is
proportional to $\delta$. For the inverse power attraction between
ions, we cannot confine ourselves to nearest neighbors, since the
forces fall off too slowly. Since each term in the energy is
proportional to an inverse interatomic distance, and therefore
to $1/\delta$, however, the energy will likewise be proportional to $1/\delta$,
and the coefficient can be calculated by a proper method of
summation over all ions.
Having the total energy, it is easy to compute the compressi-
bility. We consider the ionic crystal, where the energy has the
form
$$-\frac{K}{\delta} + Ce^{-\alpha\delta}. \qquad (1)$$
We note that $dE = -p\,dv$, where E is the energy, p the pressure,
$v = \delta^3$ is the volume, so that
$$p = -\frac{dE}{dv} = -\frac{dE}{d\delta}\frac{d\delta}{dv} = -\left(\frac{K}{\delta^2} - \alpha Ce^{-\alpha\delta}\right)\frac{1}{3\delta^2}. \qquad (2)$$
To compute the compressibility, we note that
$$\frac{dp}{dv} = \frac{dp}{d\delta}\frac{d\delta}{dv} = \frac{4K}{9\delta^7} - \frac{C(\alpha^2 + 2\alpha/\delta)e^{-\alpha\delta}}{9\delta^4},$$
from which we get the compressibility by the definition
$\kappa = -\frac{1}{v}\frac{dv}{dp}$. Now we are interested in the properties of the solid at
zero pressure. Setting the expression above for p equal to zero,
a particular value $\delta_0$ of $\delta$ is determined, giving the volume at zero
pressure. In turn, we substitute this into the compressibility,
obtaining
$$\kappa_0 = -\frac{9\delta_0^4}{(2 - \alpha\delta_0)K}. \qquad (3)$$
When we remember that $\alpha\delta_0 = ar_0$, and see from Chap. XXXV
that this, for equilibrium, is about 8, we have approximately
$\kappa_0 = \frac{3\delta_0^4}{2K}$. This shows among other things that we may expect
that in a series of similar crystals, as the alkali halides, where K
may be expected to be the same for all, the crystals with larger
grating spaces will be more compressible, since they will have
larger $\delta$'s for the same number of atoms.
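The zero-pressure compressibility can be checked numerically. The sketch below takes the pressure of Eq. (2), fixes C so that the pressure vanishes at a chosen $\delta_0$, and compares a finite-difference value of $-(1/v)\,dv/dp$ with the closed form $9\delta_0^4/(\alpha\delta_0 - 2)K$. The function names and the unit values K = 1, $\delta_0$ = 1 are arbitrary assumptions made only for the illustration; $\alpha\delta_0 = 8$ follows the estimate quoted in the text.

```python
import math

def pressure(delta, K, C, alpha):
    """Eq. (2): p = -(K/delta^2 - alpha C e^{-alpha delta}) / (3 delta^2)."""
    return -(K / delta**2 - alpha * C * math.exp(-alpha * delta)) / (3.0 * delta**2)

def equilibrium_C(delta0, K, alpha):
    """Value of C that makes the pressure vanish at delta0."""
    return K * math.exp(alpha * delta0) / (alpha * delta0**2)

def kappa_closed_form(delta0, K, alpha):
    """Zero-pressure compressibility, written with a positive denominator."""
    return 9.0 * delta0**4 / ((alpha * delta0 - 2.0) * K)

def kappa_numeric(delta0, K, C, alpha, rel_step=1e-5):
    """-(1/v) dv/dp from a central difference of Eq. (2), with v = delta^3."""
    step = rel_step * delta0
    dp_ddelta = (pressure(delta0 + step, K, C, alpha)
                 - pressure(delta0 - step, K, C, alpha)) / (2.0 * step)
    dp_dv = dp_ddelta / (3.0 * delta0**2)
    return -1.0 / (delta0**3 * dp_dv)

if __name__ == "__main__":
    K, alpha, delta0 = 1.0, 8.0, 1.0  # arbitrary units, alpha * delta0 = 8
    C = equilibrium_C(delta0, K, alpha)
    print(kappa_closed_form(delta0, K, alpha), kappa_numeric(delta0, K, C, alpha))
```

With $\alpha\delta_0 = 8$ both values come out as $3\delta_0^4/2K$ in these units.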
271. Temperature Vibrations of a Crystal. — The atoms of a
crystal will, of course, vibrate at temperatures above the absolute
zero. The problem is very similar to that which we have already
considered in Chaps. XI and XII, where we had particles coupled
together, and considered their vibrations by means of normal
coordinates. Here for sufficiently small amplitudes the potential
energy will be quadratic in the displacements of all the particles,
so that we can again introduce normal coordinates, though this
will break down for too high temperature. We confine our discussion
to the case where it can be done. There will be as many
normal vibrations as there are coordinates; thus, with N atoms,
each having 3 coordinates, there will be 3N normal coordinates.
These coordinates can now execute simple harmonic vibrations,
and the superposition of all these vibrations, each with its appropriate
amplitude and phase, is the temperature agitation of the
crystal. On classical mechanics, each of the normal coordinates
can vibrate with any arbitrary amplitude, the actual magnitude
depending in thermal equilibrium on the temperature. As a
result of this, we have what is called equipartition of energy
between the coordinates: each one, on the average, at temperature
T, has kinetic energy $\tfrac12 kT$, potential energy equal to this, so that
the total energy is kT, and the total thermal energy of the crystal
3NkT, leading to a specific heat 3Nk, which is given by Dulong
and Petit's law, an experimental law for the specific heat of solids.
We can easily prove this result. For example, to follow the
method of the last chapter, we may compute the free energy:
$$e^{-\frac{F}{kT}} = \frac{1}{h^{3N}}\int\cdots\int e^{-\frac{H}{kT}}\,dq_1\cdots dp_{3N},$$
where the q's are the normal coordinates, and the p's are their
conjugate momenta. Now from Chap. XIII we see that the
kinetic energy is equal to $T = \sum_{k=1}^{3N}\frac{\mu_k}{2}\dot q_k^2$, where $\mu_k$ is a mass coefficient
connected with the kth normal coordinate, so that $p_k = \mu_k\dot q_k$.
Also, if $\omega_k$ is the angular frequency of this normal coordinate,
$V = \sum_{k=1}^{3N}\frac{\mu_k}{2}\omega_k^2q_k^2$, so that H = T + V is a sum of squares
of momenta and coordinates. Thus we have
$$e^{-\frac{F}{kT}} = \frac{1}{h^{3N}}\int_{-\infty}^{\infty}e^{-\frac{p_1^2}{2\mu_1kT}}dp_1\cdots\int_{-\infty}^{\infty}e^{-\frac{\mu_1\omega_1^2q_1^2}{2kT}}dq_1\cdots$$
The limits of integration for the normal coordinates are given as
$-\infty$ and $\infty$. This of course is not correct; the whole method of
normal coordinates becomes impossible if the q's are even moderately
large. But for low temperature the integrand will be small
before the q's have attained such large values, so that we can
just as well integrate to infinity. We then have, performing the
integration,
$$e^{-\frac{F}{kT}} = \frac{1}{h^{3N}}\left(\sqrt{2\pi\mu_1kT}\right)\cdots\left(\sqrt{2\pi\mu_{3N}kT}\right)\left(\sqrt{\frac{2\pi kT}{\mu_1\omega_1^2}}\right)\cdots\left(\sqrt{\frac{2\pi kT}{\mu_{3N}\omega_{3N}^2}}\right),$$
$$F = -kT\sum_{i=1}^{3N}\ln\frac{kT}{h\nu_i} = -3NkT\ln T + kT\sum_{i=1}^{3N}\ln(h\nu_i/k), \qquad (4)$$
where $\nu_i = \frac{\omega_i}{2\pi}$. From this, $C_v = -T\frac{\partial^2F}{\partial T^2} = 3Nk$, as we have
stated.
We may, however, treat the problem by the wave mechanics.
Here we consider the normal coordinates to be the coordinates in
Schrödinger's equation. On account of the fact that the Hamiltonian
is a sum of terms, one connected with each normal coordinate,
Schrödinger's equation is separable, and the equation
for each coordinate is just like that for a linear oscillator. It
will then have stationary states given by $E_{ik} = (k + \tfrac12)h\nu_i$,
giving the kth energy level of the ith normal coordinate. The
total energy is the sum of all the energies of the oscillators, or
$\sum_{i=1}^{3N}(k_i + \tfrac12)h\nu_i$. Using the quantum expression for free energy,
we then have
$$e^{-\frac{F}{kT}} = \sum_{k_1=0}^{\infty}\cdots\sum_{k_{3N}=0}^{\infty}e^{-\sum_i(k_i+\frac12)\frac{h\nu_i}{kT}}$$
$$= \sum_{k_1=0}^{\infty}e^{-(k_1+\frac12)\frac{h\nu_1}{kT}}\cdots\sum_{k_{3N}=0}^{\infty}e^{-(k_{3N}+\frac12)\frac{h\nu_{3N}}{kT}}.$$
These summations can be easily carried out: we have
$$\sum_{k=0}^{\infty}e^{-(k+\frac12)\frac{h\nu}{kT}} = e^{-\frac{h\nu}{2kT}}\left[1 + e^{-\frac{h\nu}{kT}} + \left(e^{-\frac{h\nu}{kT}}\right)^2 + \cdots\right] = \frac{e^{-\frac{h\nu}{2kT}}}{1 - e^{-\frac{h\nu}{kT}}}.$$
From this we have easily
$$F = \sum_{i=1}^{3N}\tfrac12 h\nu_i + kT\sum_{i=1}^{3N}\ln\left(1 - e^{-\frac{h\nu_i}{kT}}\right),$$
$$E = \sum_{i=1}^{3N}\tfrac12 h\nu_i + \sum_{i=1}^{3N}\frac{h\nu_i}{e^{\frac{h\nu_i}{kT}} - 1},$$
and
$$C_v = k\sum_{i=1}^{3N}\left(\frac{h\nu_i}{kT}\right)^2\frac{e^{\frac{h\nu_i}{kT}}}{\left(e^{\frac{h\nu_i}{kT}} - 1\right)^2}. \qquad (5)$$
Curves for E and $C_v$, its derivative, are shown in Fig. 76. The
energy, for a single normal vibration, approaches the zero point
energy $\tfrac12 h\nu_i$ at low temperature, where the exponential in the
denominator of the second term is very large, so that the whole
term is small. On the other hand, at high temperature, the
exponent is small and the exponential can be expanded in power
series, giving
$$E = \frac{h\nu_i}{2} + \frac{h\nu_i}{1 + \frac{h\nu_i}{kT} + \frac12\left(\frac{h\nu_i}{kT}\right)^2 + \cdots - 1} = \frac{h\nu_i}{2} + \frac{kT}{1 + \frac12\left(\frac{h\nu_i}{kT}\right) + \cdots}$$
$$= kT + \text{terms in } 1/T,\ 1/T^2, \cdots \qquad (6)$$
for one normal vibration, or an energy of 3NkT at high temperatures
for the whole crystal, leading to a specific heat which
approaches zero at the absolute zero, but approaches the classical
value 3Nk at high temperatures.
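The two limits just described can be verified numerically for a single normal vibration. In the sketch below the energy and specific heat are written in terms of the dimensionless ratio $x = h\nu/kT$, a notation introduced here for convenience; the function names are illustrative only.

```python
import math

def oscillator_energy(x):
    """E / (h nu) for one normal vibration, with x = h nu / kT > 0."""
    # 1/2 + 1/(e^x - 1), written with e^{-x} so that large x cannot overflow.
    return 0.5 + math.exp(-x) / (-math.expm1(-x))

def oscillator_heat_capacity(x):
    """C_v / k for one normal vibration, with x = h nu / kT > 0."""
    # x^2 e^x / (e^x - 1)^2 rewritten as x^2 e^{-x} / (1 - e^{-x})^2.
    return x * x * math.exp(-x) / math.expm1(-x) ** 2

if __name__ == "__main__":
    for x in (0.01, 0.1, 1.0, 5.0, 20.0):
        print(x, oscillator_energy(x), oscillator_heat_capacity(x))
```

For small x (high temperature) the energy times x approaches unity, that is E approaches kT, and $C_v/k$ approaches 1; for large x the energy approaches the zero point value and the specific heat falls to zero.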
Fig. 76. — E and $C_v$ for one degree of freedom, quantum theory of linear oscillator.
To get more information about the specific heat, we must
consider the values of the natural frequencies $\nu_i$. It is not
impossible, with simple crystals, to solve the problem of the
secular equation connected with the normal coordinates exactly,
and find the $\nu$'s. Born has done this, and has found the corresponding
specific heat, in general agreement with experiment.
But there are two simpler approximation methods which have
been used, giving fairly good results. First, Einstein assumed
that all natural frequencies were the same, so that $C_v$ was merely
3N times an expression like that above. This gives a specific
heat which is zero at the absolute zero, but rises in the region of
temperature given by $kT = h\nu$ to the value 3Nk, given by the
classical theory. Qualitatively, this is in accord with observations,
but quantitatively at low temperatures it is not accurate.
Obviously at low temperatures Einstein's formula gives
$C_v = 3Nk\left(\frac{h\nu}{kT}\right)^2e^{-\frac{h\nu}{kT}}$, whereas experimentally $C_v$ is proportional to
$T^3$ at low temperatures. Debye has given an explanation of
this, which we shall sketch briefly. In the first place, as we have
seen in Chap. XII, the frequencies of the lower overtones of a
vibrating string agree with those of the equivalent weighted
string. The same thing is true with a vibrating solid, the corre-
sponding three-dimensional case. While the higher overtones
do not agree, still Debye finds that he does not introduce very
serious errors by assuming that the frequencies of the actual
vibrations are those of the continuous vibrating solid, or, by
analogy with the membrane, by a formula of the nature of
$\omega = \pi\sqrt{\frac{T}{\mu}}\sqrt{\frac{k^2}{X^2} + \frac{l^2}{Y^2} + \frac{m^2}{Z^2}}$, where T is an elastic constant, $\mu$ a
density, k, l, m integers, X, Y, Z the dimensions of the solid.
This, being a continuous medium, has an infinite number of
vibrations; but Debye assumed that the first 3N of these frequencies
approximated the 3N normal coordinate frequencies.
Now it is easy, as in Prob. 4, Chap. XV, to show that the number
of overtones, or normal coordinates, with frequency between $\omega$
and $\omega + d\omega$ is proportional to $\omega^2\,d\omega$, up to $\omega_{max}$, the frequency of
the highest overtone, and after that it is zero. Hence the summation
for $C_v$ can be replaced by an integration,
$$C_v = \text{constant}\int_0^{\nu_{max}}\nu^2\left(\frac{h\nu}{kT}\right)^2\frac{e^{\frac{h\nu}{kT}}}{\left(e^{\frac{h\nu}{kT}} - 1\right)^2}\,d\nu,$$
leading to Debye's formula for the specific heat. At high temperature,
it gives results about like Einstein's, with $\nu_{max}$ substituted
for $\nu$. At low temperature, however, we have
$$C_v = \text{constant}\int_0^{\nu_{max}}\frac{\nu^4}{T^2}e^{-\frac{h\nu}{kT}}\,d\nu.$$
For low temperatures, the exponential is very small, even for
frequencies much below $\nu_{max}$, and we can without error integrate
to infinity. Thus we have
$$C_v = \text{constant}\;T^3\int_0^\infty x^4e^{-x}\,dx, \quad \text{where } x = \frac{h\nu}{kT},$$
$$= \text{constant}\;T^3,$$
giving correctly the behavior of the specific heat at low temperatures.
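The behavior of Debye's integral at both ends of the temperature scale can be seen from a short numerical sketch. It assumes the customary normalization, in which the constant is fixed by requiring 3N vibrations in all, so that $C_v/3Nk = 3t^3\int_0^{1/t}x^4e^x/(e^x - 1)^2\,dx$ with $t = T/\theta$ and $\theta = h\nu_{max}/k$. The symbols t and $\theta$ and the function names are not taken from the text and are introduced only for this illustration.

```python
import math

def debye_integrand(x):
    """x^4 e^x / (e^x - 1)^2, written with e^{-x} to avoid overflow."""
    if x == 0.0:
        return 0.0  # the integrand behaves like x^2 near the origin
    return x**4 * math.exp(-x) / math.expm1(-x) ** 2

def debye_heat_capacity(t, n_steps=2000):
    """C_v / (3 N k) at reduced temperature t = T / theta, by Simpson's rule."""
    upper = 1.0 / t
    h = upper / n_steps  # n_steps must be even for Simpson's rule
    total = debye_integrand(0.0) + debye_integrand(upper)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * debye_integrand(i * h)
    return 3.0 * t**3 * total * h / 3.0

if __name__ == "__main__":
    for t in (0.02, 0.1, 0.5, 2.0, 50.0):
        print(t, debye_heat_capacity(t))
```

At small t the result follows $(4\pi^4/5)\,t^3$, the $T^3$ law, while at large t it approaches unity, the classical value.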
272. Equation of State of Solids. — From our free energy, we
can find the equation of state, as well as the specific heat. We
must note two things, however. First, the Hamiltonian we have
used, whether on classical or quantum theory, is not the whole
energy, but merely the part connected with thermal vibrations.
To this we must add $H_0(v)$, the energy when all atoms are stationary,
which we have already computed in finding the compressibility
at the absolute zero. Secondly, we must note that the
frequencies $\nu_i$ of the overtone vibrations will depend on the
volume. For as the crystal is compressed, and the atoms are
forced closer together, they will be in regions of stronger force,
and will vibrate with higher frequency. Remembering both
these facts, we have on classical theory
$$F = -3NkT\ln T + kT\sum_i\ln\frac{h\nu_i}{k} + H_0,$$
$$p = -\left(\frac{\partial F}{\partial v}\right)_T = -\frac{dH_0}{dv} - \sum_i\frac{kT}{\nu_i}\frac{d\nu_i}{dv} = p_0 - 3NkT\,\overline{\left(\frac{1}{\nu_i}\frac{d\nu_i}{dv}\right)}, \qquad (7)$$
where $p_0$ is the pressure at absolute zero, which we have already
computed. The other term is the additional, thermal pressure,
the average being taken over all overtones. Now experimentally
it is found that the frequencies increase with decreasing volume,
something like $\nu \propto 1/v^a$, where a is an exponent of the order of
magnitude of 2, so that $\frac{d\nu}{dv} = -\frac{a\nu}{v}$, $\overline{\left(\frac{1}{\nu_i}\frac{d\nu_i}{dv}\right)} = -\frac{a}{v}$. We
thus have approximately
$$p = p_0 + \frac{3NakT}{v}, \qquad (8)$$
showing that the pressure is the sum of the pressure at zero
temperature, arising from the atomic forces, and an additional
term which is like the pressure of a perfect gas of the same num-
ber of atoms, only several times as great.
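Using wave mechanics, we have
$$p = p_0 - \frac{\partial}{\partial v}\sum_i\left[\tfrac12 h\nu_i + kT\ln\left(1 - e^{-\frac{h\nu_i}{kT}}\right)\right] = p_0 - \sum_i\left(\tfrac12 h + \frac{h}{e^{\frac{h\nu_i}{kT}} - 1}\right)\frac{d\nu_i}{dv}. \qquad (9)$$
The pressure at absolute zero of temperature has a small correction,
on account of the terms $\tfrac12 h\nu_i$. And the thermal term in the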
pressure has quite a different form from what it has in classical
theory. At high temperature, this approaches the classical
value, but at low temperatures it goes down to zero, the thermal
pressure being less than classically. This is a real case of degen-
eracy, met in practice, unlike the degeneracy of gases, which
we met in the preceding chapter, and which cannot be actually
realized.
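The falling off of the thermal pressure can be made quantitative under two simplifying assumptions that are not made in the text: that all frequencies are equal, as in Einstein's approximation, and that each varies as $1/v^a$ with the same exponent. The thermal term of Eq. (9), leaving aside the zero point part, then bears to the classical term of Eq. (8) the ratio $x/(e^x - 1)$, with $x = h\nu/kT$. The function name is illustrative.

```python
import math

def thermal_pressure_ratio(x):
    """Quantum thermal pressure over the classical value, x = h nu / kT > 0."""
    # x / (e^x - 1), written with e^{-x} so that large x cannot overflow.
    return x * math.exp(-x) / (-math.expm1(-x))

if __name__ == "__main__":
    for x in (0.01, 0.5, 1.0, 3.0, 10.0):
        print(x, thermal_pressure_ratio(x))
```

The ratio is nearly unity for small x, the high temperature limit, and decreases steadily toward zero as x grows.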
273. Vibrations of Molecules.— The small vibrations of a
molecule behave just as do those of a crystal, the only difference
being that the number of degrees of freedom, and hence of
normal coordinates, is small. Molecules have vibrational
specific heat, falling off to zero at low temperatures as with
crystals. In fact, the vibrational frequencies are generally
such that at room temperature the oscillations are not excited,
and the vibrational specific heat is zero, becoming apparent
only at rather high temperatures. The particular interest
in the vibrations arises from band spectra. A molecule can
jump from one vibrational state to another, emitting radiation,
generally in the infra-red, whose frequency is given by the differ-
ence of the vibrational energies. Or a molecule in an excited
electronic state can jump to another electronic state, with
simultaneous change of vibrational quantum number, emitting
light in the visible or ultra-violet. Both types of spectra show
the appearance of bands, from which they are named. A general
discussion of band spectra is beyond the scope of this book, but
we can at least consider a little the type of vibrational levels
to be expected, and draw some conclusions about the sort of
spectra we expect.
We have already seen that there will be a number of normal
coordinates, each acting like a single linear oscillator, and
therefore having energy levels $(k_i + \tfrac12)h\nu_i$, so that the whole
energy will be a sum over the normal coordinates of such expressions.
With N atoms, we might suppose that there would be
3N such vibrations. As a matter of fact, however, this is not
the case: three out of the 3N coordinates are taken up in describing
the position of the center of gravity, and either two or three
in describing the orientation of the molecule, two for a linear
molecule, as a diatomic one, and three, as for instance Euler's
angles, for a nonlinear, polyatomic molecule, so that only 3N - 5
or 3N - 6 are left for vibration. Thus a diatomic molecule
(N = 2, 3N - 5 = 1) has only one mode of vibration, that
in which both its atoms simultaneously move toward, or away
from, each other; a nonlinear triatomic molecule has three; and
so on. We could neglect these corrections with crystals without
serious error, on account of the enormous size of N. We
shall commence our discussion with the diatomic molecule, and
then shall indicate briefly the nature of the extension to the poly-
atomic case.
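The counting of vibrational coordinates is easily put in the form of a short routine. The sketch below is only an illustration of the rule 3N - 5 or 3N - 6 stated above; the function name and its arguments are hypothetical.

```python
def vibrational_modes(n_atoms, linear):
    """Number of vibrational normal coordinates of a molecule of n_atoms atoms."""
    if n_atoms < 2:
        raise ValueError("a molecule needs at least two atoms")
    if n_atoms == 2 and not linear:
        raise ValueError("a diatomic molecule is necessarily linear")
    # Three coordinates describe the center of gravity; two (linear) or
    # three (nonlinear) describe the orientation.
    return 3 * n_atoms - (5 if linear else 6)

if __name__ == "__main__":
    print(vibrational_modes(2, linear=True))   # diatomic molecule
    print(vibrational_modes(3, linear=False))  # nonlinear triatomic molecule
    print(vibrational_modes(3, linear=True))   # linear triatomic molecule
```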
274. Diatomic Molecules. — We have already seen, in Fig. 75,
the form of the potential curve of a diatomic molecule, as func-
tion of the distance of separation of the atoms, the one coordinate
with respect to which we have vibration. The quantity $(r - r_0)$
plays the part of a normal coordinate directly, so that to the
extent to which we can use normal coordinates at all, we must
replace the actual potential curve by a parabola, approximating
it as well as possible at the minimum. The vibrational levels
would then be $(n + \tfrac12)h\nu$, where $\nu$ is the vibrational frequency
in the parabolic potential. We can go farther in the solution
in this particular case, solving the vibrational problem exactly
if the potential is given by a curve of Morse's type, though we
shall not do that here. It is necessary, however, to consider
rotation as well as vibration, and we shall carry this discussion
through.
Let us start with the general case of two atoms of masses
$m_1$, $m_2$, with a potential V(r) between them, where r is the distance
between. The Hamiltonian is $(p_1^2/2m_1) + (p_2^2/2m_2) + V(r)$.
Let us, however, separate off the coordinates of the center
of gravity. Let $x_1y_1z_1$ be the coordinates of the first particle,
$x_2y_2z_2$ of the second. Then we introduce new coordinates,
$$X = \frac{m_1x_1 + m_2x_2}{m_1 + m_2}, \text{ etc.},$$
$$\xi = x_2 - x_1, \text{ etc.} \qquad (10)$$
Here X, Y, Z are the coordinates of the center of gravity, $\xi$, $\eta$, $\zeta$
the relative coordinates, so that $\xi^2 + \eta^2 + \zeta^2 = r^2$. In terms
of these coordinates we readily find that
$$T = \tfrac12(m_1 + m_2)(\dot X^2 + \dot Y^2 + \dot Z^2) + \tfrac12\mu(\dot\xi^2 + \dot\eta^2 + \dot\zeta^2),$$
where $\mu = \frac{m_1m_2}{m_1 + m_2}$. Thus the Hamiltonian is
$$\frac{p_X^2 + p_Y^2 + p_Z^2}{2(m_1 + m_2)} + \frac{p_\xi^2 + p_\eta^2 + p_\zeta^2}{2\mu} + V\left(\sqrt{\xi^2 + \eta^2 + \zeta^2}\right). \qquad (11)$$
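The separation of the kinetic energy into a part for the center of gravity and a part for the relative motion, with the reduced mass $m_1m_2/(m_1 + m_2)$, can be confirmed with a few lines of arithmetic. The function names below are illustrative, and the masses and velocities in the demonstration are arbitrary numbers in arbitrary units.

```python
def reduced_mass(m1, m2):
    """Reduced mass m1 m2 / (m1 + m2)."""
    return m1 * m2 / (m1 + m2)

def kinetic_energy_direct(m1, m2, v1, v2):
    """Sum of the kinetic energies of the two particles."""
    return 0.5 * m1 * sum(c * c for c in v1) + 0.5 * m2 * sum(c * c for c in v2)

def kinetic_energy_separated(m1, m2, v1, v2):
    """Kinetic energy of the center of gravity plus that of the relative motion."""
    total = m1 + m2
    v_cm = [(m1 * a + m2 * b) / total for a, b in zip(v1, v2)]
    v_rel = [b - a for a, b in zip(v1, v2)]
    return (0.5 * total * sum(c * c for c in v_cm)
            + 0.5 * reduced_mass(m1, m2) * sum(c * c for c in v_rel))

if __name__ == "__main__":
    m1, m2 = 1.0, 3.0
    v1, v2 = (1.0, 2.0, -1.0), (0.5, -1.0, 2.0)
    print(kinetic_energy_direct(m1, m2, v1, v2))
    print(kinetic_energy_separated(m1, m2, v1, v2))
```

The two printed values agree, whatever masses and velocities are chosen.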
This is already separated, on either classical or quantum theory,
the first term giving the translational energy of the molecule of
mass $(m_1 + m_2)$, the second being like the motion of a particle
of mass $\mu$ in a central field V(r). This latter problem can be
solved in wave mechanics by introducing polar coordinates,
just as we did with the general central field problem. As in
Chap. XXXIII, the solution is
$$u = e^{\pm im\phi}P_l^m(\cos\theta)R(r),$$
where y = rR satisfies the equation
$$\frac{h^2}{8\pi^2\mu}\frac{d^2y}{dr^2} + \left[E - V(r) - \frac{l(l+1)h^2}{8\pi^2\mu r^2}\right]y = 0, \qquad (12)$$
l being an integer. Qualitatively, however, the situation is quite
different as far as the function of r is concerned. With the
electrons in atomic structure, we had a quantum number for
this equation representing the number of radial nodes, which
we might call $n_r$, and then had $n_r + l + 1 = n$, the total quantum
number. We found that the energy, in the hydrogen case,
was proportional to $1/n^2$. That is, terms of the same $n_r$ had
very different energy, depending on the azimuthal quantum
number. Here, however, on account of the fact that $\mu$ is much
larger than the mass of the electron, the term $\frac{l(l+1)h^2}{8\pi^2\mu r^2}$ is very
much smaller than in the problem of atomic structure. As a
result, the energy depends only slightly on l, determining the
rotation. The number $n_r$ plays the part of the vibrational
quantum number, and we have approximately $E = (n_r + \tfrac12)h\nu$,
but with a small correction term. We can get a first approximation
to the correction very easily, from perturbation methods.
We first solve without the term depending on l, then introduce
it as a perturbation in the energy. As we have seen, the perturbed
energy is, to the first approximation, the unperturbed
energy plus the mean value of the perturbing energy over the
unperturbed wave function. Thus the rotational contribution
to the energy is $\frac{l(l+1)h^2}{8\pi^2\mu}\overline{\left(\frac{1}{r^2}\right)}$. If we note that the amplitude of
vibration is not very large, so that r does not vary a great
deal, we see that $\overline{\left(\frac{1}{r^2}\right)}$ is approximately equal to $\frac{1}{r_0^2}$, where $r_0$ is
the equilibrium distance, so that, setting $I = \mu r_0^2$, the moment
of inertia, we have
$$E = \left(n_r + \tfrac12\right)h\nu + \frac{l(l+1)h^2}{8\pi^2I}. \qquad (13)$$
This gives a set of vibrational levels (given by $n_r$), each broken
up into a group of rotational levels (given by l), which are
fairly closely spaced.
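The arrangement of levels given by Eq. (13), $E = (n_r + \tfrac12)h\nu + l(l+1)h^2/8\pi^2I$, can be tabulated with a short sketch. Energies are expressed in units of $h\nu$, and the dimensionless quantity $b = h/8\pi^2I\nu$ is a notation introduced here; the value b = 0.01 in the demonstration is hypothetical and merely stands for a rotational spacing small compared with the vibrational one.

```python
def level_energy(n_r, l, b):
    """Eq. (13) in units of h nu, with b = h / (8 pi^2 I nu)."""
    return (n_r + 0.5) + b * l * (l + 1)

def rotational_spacing(l, b):
    """Energy difference between rotational levels l + 1 and l, in units of h nu."""
    return 2.0 * b * (l + 1)

if __name__ == "__main__":
    b = 0.01  # hypothetical ratio of rotational to vibrational spacing
    for n_r in range(2):
        print([round(level_energy(n_r, l, b), 4) for l in range(5)])
```

Each vibrational level carries a ladder of rotational levels whose spacing grows in proportion to l + 1.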
Electronic band spectra are emitted when we have transitions
between two such sets of levels. To the energy above we must
add the electronic energy; that is to say, we have counted our
energy so far from the minimum of the potential curve, and this
is different for different states. Two such electronic energy
levels, with the corresponding vibrational and rotational levels
indicated, are plotted in Fig. 77. In considering transitions,
there is a selection principle which must be considered, as with
atomic systems, l being able to change only by ±1 or 0 units.
As a result, bands have three branches, called P, Q, R, corresponding
to the three possible changes of l (P corresponding
to decrease of l by one unit, Q no change, R increase). These
branches are indicated in Fig. 77. It should be stated that
actual band spectra are ordinarily much more complicated
than this, on account of a number of factors which we have not
yet considered, as multiplet structure and the existence of elec-
tron spin.
275. Specific Heat of Diatomic Molecules. — Using an approxi-
mate energy expression for a diatomic molecule, we can find the
specific heat of a gas of such molecules. The translational part
of the energy separates from the rotational and vibrational
part, as we have seen, giving a contribution fiVfc to the specific
heat. The whole free energy is given by
20-+P
Fig. 77. Energy levels and transitions for diatomic molecule.
The summation over i is over all molecules, so that each vibrational
quantum number has a factor $\sum_{n_r=0}^{\infty} e^{-(n_r+\frac{1}{2})h\nu/kT}$, and each
rotational one a term $\sum_{l=0}^{\infty} e^{-l(l+1)h^2/8\pi^2 IkT}$. The first we have already
considered, and have seen that its contribution to the specific
heat is appreciable only at high temperatures. The last term,
arising from the rotation, can be easily calculated at high tem-
peratures, where the rotational energy is large, and we must
consider states of large l's. We must first notice that for each
value of l there are (2l + 1) substates, corresponding to different
m's, all of the same energy, so that each term must actually be
counted (2l + 1) times in the summation. Then we replace
the summation by an integration, and neglect unity compared
with l, finding
$$\int_0^\infty 2l\, e^{-l^2 h^2/8\pi^2 IkT}\, dl = \frac{8\pi^2 IkT}{h^2},$$
so that each molecule contributes to the free energy a term
$-kT \ln\left(\frac{8\pi^2 IkT}{h^2}\right)$ on account of its rotation, and to the specific
heat the amount k, just the amount corresponding to two
degrees of freedom according to the equipartition of kinetic
energy. At low temperatures, we may not replace the sum by
an integral, and the rotational specific heat proves when worked
out to decrease to zero at the absolute zero, just as the vibrational
specific heat does. This is not ordinarily observed
for any gas except hydrogen, for it comes at so low a temperature
that the gas condenses, but in hydrogen the phenomenon occurs
at room temperature, on account of the small moment of inertia,
correspondingly wide spacing of the rotational levels, and high
temperature necessary to acquire an energy comparable with
this spacing.
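The two limits described above can be checked by summing the rotational series directly instead of integrating. The sketch below is an illustration, not a calculation from the text: it introduces the shorthand theta = h^2/(8π^2 I k) so that the only input is the ratio T/theta, and it obtains the specific heat per molecule, in units of k, from the fluctuation of the energy, which is a standard identity assumed here.

```python
import math


def rotational_sums(t_over_theta, l_max=200):
    """Return (Z, C/k) for one rotating molecule.

    t_over_theta is T divided by theta = h^2 / (8 pi^2 I k), a shorthand
    introduced for this sketch. Each level l is counted (2l + 1) times.
    """
    z = 0.0
    sum_x = 0.0
    sum_x2 = 0.0
    for l in range(l_max + 1):
        x = l * (l + 1) / t_over_theta      # level energy divided by kT
        w = (2 * l + 1) * math.exp(-x)      # weighted Boltzmann factor
        z += w
        sum_x += w * x
        sum_x2 += w * x * x
    mean_x = sum_x / z
    heat_capacity = sum_x2 / z - mean_x ** 2  # variance of E/kT equals C/k
    return z, heat_capacity


if __name__ == "__main__":
    for ratio in (0.1, 0.5, 1.0, 5.0, 50.0):
        z, c = rotational_sums(ratio)
        print(f"T/theta = {ratio:5.1f}   Z = {z:8.3f}   C/k = {c:.4f}")
```

At large T/theta the sum Z approaches T/theta, which is the integral 8π^2 IkT/h^2 found above, and C/k approaches unity; at small T/theta only the lowest level contributes and C/k falls toward zero.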
276. Polyatomic Molecules. — The theory of polyatomic mole-
cules differs from that for diatomic molecules in several ways,
though we shall not go into the question in detail. First, there
are several different fundamental frequencies of vibration,
corresponding to the various normal coordinates, so that the
system of energy levels and the band spectrum are more com-
plicated. Nevertheless, in some fairly simple cases it is possible
to analyze the band systems, identifying the empirical frequencies
with the corresponding modes of vibration. Next, the rotational
levels are more complicated, on account of the lack of symmetry
of the problem. We can no longer solve the rotational problem
immediately in polar coordinates. We could do it before
because the rotating molecule would act like a symmetrical
top, symmetrical about the internuclear axis, and the energy
levels depended only on the moment of inertia about axes at
right angles to this axis. But with a polyatomic molecule,
unless it happens to be linear, this simplification is not present.
The problem resembles a nonsymmetrical top, and as we saw
in discussing the motion of a rigid body in classical mechanics,
this introduces great complications. It can be approximately
solved, but we shall not do it. One fact, however, can be stated
at once about the rotational levels: their spacing depends in
general on the reciprocal of the moment of inertia, and as we
go to more and more complicated molecules, with larger and
larger moment of inertia, the rotational levels are more and more
closely spaced, so that the classical theory becomes more and
more accurate. This is unfortunate for band spectroscopy,
since the rotational lines in the spectrum are so closely spaced,
in even moderately heavy molecules, that they cannot be resolved
spectroscopically, the band appearing continuous rather than
full of discrete lines. As a final point in connection with non-
linear polyatomic molecules, the specific heat connected with
rotation proves to be $\frac{3}{2}k$ for each molecule, corresponding to k/2
for each of the three coordinates concerned in determining the
orientation of the molecule in space.
Problems
1. Given a crystal whose energy at the absolute zero is $-K/\delta + b/\delta^n$,
where $\delta$ is a linear dimension, b and n are constants, find the compressibility
at zero pressure, as function of K, $\delta_0$, and n.
2. In rock salt, ions of Na$^+$ and Cl$^-$ are arranged alternately on a cubic
lattice, the equilibrium distance between successive Na$^+$ ions being 5.7 ×
10$^{-8}$ cm. If $\delta$ is chosen as the distance between successive Na$^+$ ions, so
that $\delta_0$ = 5.7 × 10$^{-8}$, the energy in the cube of volume $\delta^3$ may be approximated
by $-K/\delta + b/\delta^n$, where $K = 13.94\,e^2$, if e is the charge on an electron,
and n is about 9 units. Find the compressibility at the absolute zero, in
reciprocal dynes per square centimeter, comparing with the observed value
(between 3 and 4 × 10$^{-12}$).
3. Using the figures of Prob. 2, compute the energy required to break up
1 gm. mol. of NaCl into Na$^+$ and Cl$^-$ ions (that is, to make $\delta$ infinite). Compare
with the experimental value, in the neighborhood of 180 kg. cal. per
gram-molecule.
4. Using Dulong and Petit's law, compute the specific heat at ordinary
temperatures of copper and lead, and compare with the experimental values.
5. Compute and plot Einstein's specific heat curve, assuming $h\nu/k$ = 100°
abs.
6. Prove that Einstein's specific heat curve approaches the classical value
at high temperature.
7. Using the expression $p = p_0 + (3\alpha NkT/v)$ for the pressure of a solid at
any temperature in terms of its pressure at the absolute zero, find the thermal
expansion, showing that it is approximately equal to $\alpha C_v \kappa/v$, where $\kappa$ is the
compressibility. (Suggestion: Set the pressure equal to zero, find the
volume as function of temperature, expressing $p_0$ as a function of volume by
means of the compressibility.)
8. By methods like those of Prob. 7, show that the fractional change of
compressibility with temperature, $\frac{1}{\kappa}\frac{\partial \kappa}{\partial T}$, is to our approximation equal to
the thermal expansion. From the figures given above for NaCl, compute
the order of magnitude of these quantities for this crystal.
9. Show that the thermal expansion is given by $\alpha C_v \kappa/v$ even at low
temperatures where $C_v$ must be given by the quantum theory rather than
the classical formula.
10. Solve the problem of the vibrational levels of a molecule whose potential
energy is $De^{-2au} - 2De^{-au}$, where $u = r - r_0$, without rotation. If R
is r times the radial wave function, and $y = e^{-au}$, set up the differential
equation for R, using y as independent variable, showing that it is
$$\frac{1}{y}\frac{d}{dy}\left(y\frac{dR}{dy}\right) + \frac{8\pi^2\mu}{a^2h^2}\left(\frac{E}{y^2} + \frac{2D}{y} - D\right)R = 0.$$
Treat this equation like Schrödinger's equation for the hydrogen atom,
letting $R = e^{-dy}(2dy)^{b/2}F(y)$, where $d = \frac{2\pi\sqrt{2\mu D}}{ah}$, $E = -\frac{a^2h^2b^2}{32\pi^2\mu}$. Obtain
the differential equation for F, showing that the series solution breaks off
to give a finite polynomial if the energy is given by
$$E = -D + \left(n + \tfrac{1}{2}\right)h\nu - \frac{h^2\nu^2}{4D}\left(n + \tfrac{1}{2}\right)^2,$$
where $\nu$ is the frequency of classical vibration about $r_0$, equal to $\frac{a}{2\pi}\sqrt{\frac{2D}{\mu}}$.
CHAPTER XXXVIII
COLLISIONS AND CHEMICAL REACTIONS
We have been considering the interactions of atoms in mole-
cules, gases, and solids, under the action of their interatomic
forces. There are a few special interesting and unusual cases
of interaction, which we shall consider in the present chapter.
Our discussion will be mostly qualitative, since an exact treat-
ment is very difficult and complicated. We can classify the
problems we are taking up by remembering that there is a
potential energy function acting between atoms, depending
on the electronic state. It is this energy which is responsible
for the formation of molecules, and for their mutual attractions
and repulsions, as we have seen. But there is one feature
we have not considered: each different electronic state has a
different interatomic energy connected with it. Thus an excited
atom is much larger than a normal one, so that two such atoms
begin to exert valence and repulsive forces on each other at
much larger distances than atoms in their normal states. For
any given problem, we have an infinite set of electronic energy
levels as functions of the nuclear coordinates, as we have seen
for the diatomic molecule in Fig. 77. Now all the problems we
have spoken of (except for the emission of electronic bands,
discussed in Fig. 77), have been problems in which there was
no change of electronic quantum numbers. Ordinarily the
electrons have been in their lowest stationary states, and the
corresponding potential energy is the one used in discussing
molecular formation and interatomic forces. We now classify
the problems we shall take up in the present chapter into two
groups: (1) those in which the electronic quantum numbers
do not change; (2) those in which the electronic quantum num-
bers do change, with of course a compensating change in some-
thing else. Many chemical reactions are examples of (1),
and we consider them first.
277. Chemical Reactions. — Probably the simplest chemical
reactions to discuss are bimolecular gas reactions, in which
two molecules collide, are transformed into one or more other
molecules, and the resulting molecules separate. The simplest
case of two single atoms colliding does not ordinarily lead to
the formation of a molecule at all, particularly if the atoms are
not excited. For then there is a potential energy curve between
them like Fig. 75. Considering the atoms as moving according
to classical mechanics in this potential field, we see that if they
start toward each other from infinity with finite kinetic energy,
they will approach to a "perihelion" distance, separate, and
finally go to infinity again, with the same kinetic energy as
before. The only way for them to become bound would be
for them to lose vibrational kinetic energy when close together,
at a distance approximately $r_0$, so that they would begin vibrating
about the distance $r_0$. This could conceivably happen if
they radiated while close together, but calculation shows this
to be most improbable. Actually the only important mechanism
by which such binding occurs is a collision by a third particle,
part of the kinetic energy being imparted to this particle, resulting
in the binding of the atoms to form a molecule. Thus this reac-
tion is not of the simple type we wish to consider.
One can really get bimolecular reactions, however, if at least
one of the colliding molecules is diatomic. For example, it
would be possible for HBr colliding with H to change into H$_2$
and Br, or vice versa. And such reactions can occur without
change of electronic quantum number, the electrons always
staying in their lowest state. To understand them, we need
merely consider the potential energy function as it depends
on the coordinates of all three atoms. In discussing the motion
of systems of particles, we have seen that it is helpful to think
of the potential energy as plotted in a many-dimensional space,
and to imagine a ball rolling over this many-dimensional energy
surface. In this case, we have the nine coordinates of our
three particles, and the energy is to be plotted as a function of
these nine variables. While it is impossible to visualize the
whole diagram, we can easily describe some features of it. First,
there will be regions of space corresponding to one H atom
near the Br atom, but with the other H atom far off. Here
the potential will depend only on the relative distance of the
adjacent H and Br, the potential as function of the distance
looking like the familiar energy of a diatomic molecule, and
leading to a low total energy of the system if they are at the
proper distance $r_0$. If the atoms move either closer together
or farther apart the energy will increase. But there is also an
entirely different region of the nine-dimensional space in which
the two H's are near together, and the Br is at a great distance.
This again will give a low potential energy, since the H's can
be bound together at a suitable distance to form a molecule
H$_2$. The larger part of the coordinate space, however, will
correspond to all three atoms being separated from each other,
and will have a constant potential energy, a plateau, with the
two valleys we have spoken of. The two valleys connect with
each other, in the region of space where all three atoms are
near together, but calculation shows that to get from the bottom
of one valley to that of the other, one must go over a considerable
elevation, though not so high as the plateau. Now suppose we
start with HBr and H, the HBr molecule having no vibrational
energy, and the mutual kinetic energy of HBr and H being
small. The corresponding rolling ball will be in the bottom
of the HBr valley, rolling slowly along the valley (this corresponds
to relative velocity of the two molecules, without vibration
of the HBr). The direction of motion is toward the junction
with the other valley, corresponding to H$_2$ + Br. If the ball
is going fast enough, it will be able to rise over the pass separating
the two valleys, and roll down the other side, resulting in the
formation of H$_2$, or in the reaction we are interested in. We
see, in other words, that a necessary condition for the possibility
of such a reaction is a potential energy with two separate valleys,
so that the point representing the system can start in one and
end up in the other. Of course, complicated reactions could have
several valleys, but the principle is the same. And in every
case it turns out to be true that there is an elevated pass between
valleys. This means that molecules must have a certain kinetic
energy before they can react. This is called activation energy,
and it is ordinarily large enough so that only a few of the fastest
molecules can react. Now Maxwell's distribution of velocity
is such that the number of molecules with particularly high
speed increases very rapidly with temperature. Thus, if the
necessary kinetic energy is E, the chance of finding a pair of
molecules with the required energy will depend on a quantity
like $e^{-E/kT}$, a very small quantity, but increasing rapidly
with temperature. It is evident, then, that the reaction velocity
will increase rapidly with temperature. We can estimate this
by assuming the velocity V to be proportional to $e^{-E/kT}$. Then
we at once see by differentiating that $(d \ln V)/dT = E/kT^2$,
the equation of the so-called reaction isochore. In ordinary
reactions at room temperature, E/kT is of the order of 20,
showing that only very fast molecules can react. Then, setting
T = 300° Abs., the result is about $(d \ln V)/dT = 0.07$, showing that
ln V increases by about 0.7 for 10° rise of temperature. Since
0.7 is approximately ln 2, this means that reaction rates approximately
double for 10° rise of temperature.
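The arithmetic of this estimate is easily verified. The short sketch below assumes, as in the text, that the reaction velocity is simply proportional to exp(-E/kT) with a constant factor in front; the function names are introduced here for illustration only.

```python
import math


def log_rate_slope(e_over_kt, temperature):
    """d(ln V)/dT = E / (k T^2), written in terms of the ratio E/kT."""
    return e_over_kt / temperature


def rate_ratio(e_over_kt, temperature, delta_t):
    """Ratio V(T + delta_t) / V(T) for V proportional to exp(-E/kT).

    e_over_kt is the value of E/kT at the starting temperature.
    """
    e_over_k = e_over_kt * temperature          # E/k, in degrees
    exponent = e_over_k / temperature - e_over_k / (temperature + delta_t)
    return math.exp(exponent)


if __name__ == "__main__":
    print("slope of ln V:", log_rate_slope(20.0, 300.0))
    print("ratio for a 10 degree rise:", rate_ratio(20.0, 300.0, 10.0))
```

Carried through without the linear approximation, the ratio for E/kT = 20 at 300° comes out a little under 2, in agreement with the rough doubling stated above.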
Fig. 78. Potential energy of H-H-Br as function of H-Br distance ($r_1$) and
H-H distance ($r_2$) for all three atoms in line. Dotted line indicates reaction
HBr + H → H$_2$ + Br, the height of the pass above the first valley being the
activation energy. (Curve from H. Eyring and M. Polanyi, by permission.)
While the potential energy cannot be plotted in the general
case, we can do so in a special case. Let all three atoms be
in a straight line, in the order H-H-Br, and let the H-H distance
be $r_2$, the H-Br distance $r_1$. Then we may plot energy as function
of $r_2$ and $r_1$, indicating it by contour lines. The result is a
diagram as in Fig. 78, where the dotted line shows the path
of the rolling ball in the reaction we have described.
278. Collisions with Electronic Excitation. — The chemical
reaction we have just considered was of the type in which the
electrons did not change from one stationary state to another.
We can easily change the problem, however, to involve electronic
transitions. Thus let us take even our simple case of two atoms
colliding, but now with rather large kinetic energy. If this
energy is large enough, one or both of the atoms can become
excited, the atoms separating with loss of enough kinetic energy
to compensate for the increased electronic energy. This plainly
demands an initial kinetic energy greater than the lowest excitation
potential of one or the other atom, a situation which is not
ordinarily met in chemical problems, so that this does not
represent a conventional chemical reaction, but which is easily
realized in discharge tubes.
Fig. 79. Potential energy curves for collision. The continuously shaded
regions represent continuous distributions of energy levels, superposed on the
discrete levels.
To understand how the process can
occur, we refer to Fig. 79. There we plot two electronic potential
energy curves, $V_1$ and $V_2$, corresponding to different electronic
energy, and evidently going to different energy levels of the
atoms at infinite separation. The total energy is now taken
to be E, greater than the energy $V_2$ of the excited electronic
state. Now, of course, in the potential curve $V_1$ we have vibrational
quantized levels of energy less than the value of $V_1$ at
infinite r, but we also have continuous levels for higher energy,
corresponding to the classical motions extending to infinity,
and it is such a level which we consider. Similarly the potential
curve $V_2$ has discrete and continuous levels, and in particular
a continuous level corresponding to this same energy E. Now
in Chap. XXXII we have seen that when there are two states
of this sort, both corresponding to the same energy, there is a
certain probability of passing from one to the other. Thus,
proceeding by the perturbation method of variation of constants,
we can start with the system in electronic state $V_1$, but as time
goes on the wave function will commence to take on some of the
properties of electronic state $V_2$. In other words, there is a
certain chance that during the time of collision this transition
will have taken place, and that on separation of the atoms they
will be in the excited level corresponding to $V_2$, with correspondingly
small kinetic energy. As in Chap. XXXII, the rate of
passing from one state to the other will depend on the non-
diagonal matrix component of the energy between the two
states; we shall shortly see how to compute this component.
Several other types of process similar to the one we have just
considered are of importance. First there is the inverse process:
excited atoms, corresponding to the potential curve $V_2$, approach
each other, lose their excitation energy, and separate with greater
kinetic energy than they had before. This is called a collision
of the second kind. Other more complicated collisions of the
second kind are those in which most of the excitation energy of
one atom passes in the collision to excitation energy of the other,
only a relatively small amount being left for the change in kinetic
energy. Such collisions prove to be much more likely than the
ones with large change in kinetic energy. The essential reason
is that if the kinetic energy changes greatly, the wave functions
corresponding to the nuclear motion in the two states will have
very different wave length. Then when we multiply the wave
functions and integrate in finding the nondiagonal energy matrix
component which produces the transition, we shall find that the
two waves of different wave length will have a product which
averages practically to zero, producing a small integral. When
the change of wave length (or of kinetic energy) is small, how-
ever, the two waves can interfere constructively, producing a
large integral, and large probability of transition.
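The cancellation argument can be illustrated with a toy integral. In the sketch below two cosine waves stand in for the nuclear wave functions of the two states, and a Gaussian envelope stands in for the limited region in which the atoms interact; both choices, and all the numbers, are assumptions made only for illustration and are not the actual matrix component.

```python
import math


def overlap(k1, k2, width=1.0, half_range=8.0, n=4000):
    """Trapezoidal estimate of the integral of
    cos(k1 x) cos(k2 x) exp(-x^2 / (2 width^2)) over [-half_range, half_range].

    k1, k2 are wave numbers (2 pi / wave length) of the two toy waves.
    """
    dx = 2.0 * half_range / n
    total = 0.0
    for i in range(n + 1):
        x = -half_range + i * dx
        f = (math.cos(k1 * x) * math.cos(k2 * x)
             * math.exp(-x * x / (2.0 * width ** 2)))
        weight = 0.5 if i in (0, n) else 1.0   # end points count half
        total += weight * f
    return total * dx


if __name__ == "__main__":
    for k2 in (5.0, 5.5, 7.0, 10.0):
        print(f"k1 = 5.0, k2 = {k2:4.1f}: integral = {overlap(5.0, k2):.6f}")
```

When the two wave numbers are equal or nearly so the integral is of the order of the width of the envelope; when they differ by several units the positive and negative lobes of the product cancel and the integral is smaller by many powers of ten.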
Still another sort of process is that involved in predissociation.
Here one has a molecule in state $V_2$, but with a discrete energy
E', so that it is a stable molecule, rather than two colliding atoms.
This energy E', however, corresponds at the same time to a
continuous level of the potential function $V_1$. There will then
be a certain probability that the molecule will change over to this
continuous level. If it does, it will at once dissociate, resulting
in two atoms, with the potential curve $V_1$, and large kinetic
energy. Thus the molecular state E' is inherently unstable, as
is a radioactive nucleus. At the same time, on account of the
finite life time of the state, on account of the probability of dis-
sociation, the energy levels will be broadened, as we saw in Chap.
XXXII. This really means that a finite range of the continuous
distribution of energy levels of the problem in the neighborhood
of E' have properties suggesting the molecule and the state $V_2$,
rather than the colliding atoms and state $V_1$. This broadening
of energy levels is observed in the spectra of predissociating
molecules.
An atomic phenomenon similar to predissociation is found
when an atom has two excited electrons. In this case it is easily
possible for it to have greater energy than the normal state of the
ionized atom. Thus the discrete level we are considering lies
at the same height as the continuous levels above the series limit
of the ordinary spectrum, corresponding to an ionized atom in
its lowest state, with an electron flying off with a positive kinetic
energy. The system in the discrete level then has a certain
probability of changing over to the continuous state, or, in other
words, one of the electrons has a certain probability of escaping
from the atom, the necessary energy being provided by the other
excited electron falling down to a lower state. This spontaneous
ionization of a doubly excited atom is called the Auger effect,
and is the explanation of the fact that not many energy levels of
doubly excited atoms above the series limit are observed. The
levels are unstable, and have short lifetimes.
279. Electronic and Nuclear Energy in Metals. — We have not
so far considered metals in detail. They present, however, a
feature so similar to what we have just discussed that it will pay
to bring it up here. The characteristic feature of metals is that
they contain free electrons, which can be accelerated under
the action of an impressed electric field. Thus the wave function
of a free electron of energy E, momentum components
$p_x$, $p_y$, $p_z$, can be written
$$\psi = e^{\frac{2\pi i}{h}(p_x x + p_y y + p_z z - Et)}.$$
If, however, there is a constant force F acting on the electron
along the x direction, we should expect by classical mechanics
that $p_x$ would change with time, such that its time rate of change
was F, or $p_x = p_{x0} + Ft$, while $p_y$ and $p_z$ would remain constant,
E changing so that at all times it would be $\frac{p_x^2 + p_y^2 + p_z^2}{2m}$.
If then we set up the wave function
$$\psi = e^{\frac{2\pi i}{h}\left[(p_{x0} + Ft)x + p_y y + p_z z - \int \frac{(p_{x0} + Ft)^2 + p_y^2 + p_z^2}{2m}\,dt\right]}, \qquad (1)$$
we readily find on carrying out the differentiations that it satisfies
the differential equation
$$\left(-\frac{h^2}{8\pi^2 m}\nabla^2 - Fx\right)\psi = -\frac{h}{2\pi i}\frac{\partial \psi}{\partial t}, \qquad (2)$$
or Schrödinger's equation corresponding to the potential energy
$-Fx$, or the force F. This solution obviously does not correspond
to a constant energy, but rather to a continuously increasing
energy, as if we were passing continuously from one real
stationary state of a free electron in the absence of field to another.
In other words, the external field, regarded as a perturbation on
the electron in free space, produces continual transitions, amount-
ing to a uniform acceleration.
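The differentiations referred to in connection with Eqs. (1) and (2) can be sketched as follows. The shorthand $p_x(t) = p_{x0} + Ft$ and the symbol S for the bracket in the exponent are introduced here only for convenience and are not notation from the text.

```latex
\begin{align*}
\psi &= e^{\frac{2\pi i}{h} S}, \qquad
S = p_x(t)\,x + p_y y + p_z z - \int \frac{p_x(t)^2 + p_y^2 + p_z^2}{2m}\,dt,
\qquad p_x(t) = p_{x0} + Ft, \\
-\frac{h^2}{8\pi^2 m}\nabla^2 \psi
 &= \frac{p_x(t)^2 + p_y^2 + p_z^2}{2m}\,\psi, \\
-\frac{h}{2\pi i}\frac{\partial \psi}{\partial t}
 &= -\frac{\partial S}{\partial t}\,\psi
  = \left(\frac{p_x(t)^2 + p_y^2 + p_z^2}{2m} - Fx\right)\psi .
\end{align*}
```

Subtracting $Fx\,\psi$ from the second line reproduces the third, so that the two sides of Eq. (2) agree. The term $Fx$ in the time derivative comes entirely from the explicit time dependence of $p_x(t)$ in the first term of S.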
Now if there were nothing to stop it, this process of accelera-
tion of the free electrons would go on for ever, their velocities
and energies increasing without limit. The counteracting influ-
ence is the collisions which the electrons make with the atoms of
the lattice, setting them into vibration, and losing energy them-
selves. These two processes, of acceleration of the electrons and
collision, result in an equilibrium, giving a uniform drift velocity,
and a uniform current as a result, and at the same time resulting
in a continuous process of increasing the lattice vibrations, which
is simply the heating of the metal on account of its resistance.
Investigation shows that the probability of collision with loss of
the kinetic energy is proportional to the velocity of the electron.
This results in an apparent resisting force proportional to the
velocity, and the electrons act like particles subject to an external
force field, and a resisting force. As we have seen in Chap. II,
the resulting motion is a uniform drift with velocity (and hence
current) proportional to the external field. This is the origin
of Ohm's law.
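The approach to a uniform drift can be illustrated by integrating the classical equation of motion m dv/dt = F - γv step by step. This is only a sketch of the classical picture just described: the resisting coefficient gamma, the unit mass, and the step size are hypothetical values chosen for illustration.

```python
def drift_velocity(force, gamma, mass=1.0, dt=1e-3, steps=20000):
    """Integrate m dv/dt = force - gamma * v from rest by Euler steps.

    gamma is an assumed coefficient of the resisting force; the returned
    value is the velocity after steps * dt units of time.
    """
    v = 0.0
    for _ in range(steps):
        acceleration = (force - gamma * v) / mass
        v += dt * acceleration
    return v


if __name__ == "__main__":
    for force in (0.5, 1.0, 2.0):
        print(f"F = {force}: drift velocity = {drift_velocity(force, 2.0):.6f}")
```

After a time long compared with m/γ the velocity settles at F/γ, so that doubling the force doubles the drift velocity, which is the proportionality expressed by Ohm's law.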
Now the process of collision of the electrons with the ions of
the lattice is essentially similar to the other collision processes
we have been considering in the present chapter. Let us forget
for a moment the accelerating action of the field, and simply
assume that it has already raised the electrons to states of large
momentum and kinetic energy, but that it is momentarily not
acting. The crystal as a whole is now in a state of high elec-
tronic energy, but of low nuclear vibrational energy. There
are, however, other states of the crystal of the same energy,
corresponding to smaller electronic energy, but higher nuclear
energy. In particular, these other states mostly correspond to
no net momentum of all the electrons, or no electric current.
There will be a probability of interaction between the original
state and these other states, and in the course of time the system
will pass over completely to these other states, the current being
lost, its momentum dissipated, its energy converted into nuclear
vibrational energy. Of course, some of the kinetic energy will
remain with the electrons, since they have finite though small
specific heat, and the system tends to thermal equilibrium. But
the momentum, and the current, will be lost. In this case, unlike
the case of collisions, the inverse process, in which the nuclei
lose energy and change it into electronic energy, might happen
enough to keep equilibrium, but it would never restore the current
of its own accord. The reason is that this is such an excessively
unlikely process, since of all the electronic states corresponding
to the same energy, only one will correspond to having the elec-
trons all moving in the same direction and cooperating to form
a net current, while in most of them the electrons will be moving
in all directions, and their currents will cancel. This process of
dissipation of the current, an essentially irreversible process, is
similar to the process of radiation of energy, which we have considered
in Chap. XXXII, and in both cases the irreversibility
arose in having a simple state change into a very complicated
one. Such a change is always accompanied by an increase of
entropy, and is a typical irreversible process of thermodynamics.
280. Perturbation Method for Interaction of Nuclei. — So far,
we have not really applied quantum mechanics at all to our
problem of nuclear interaction. We have simply assumed that
potential energy functions of the positions of the nuclei can be
set up, and that these are to be used for discussing the nuclear
motions. In phenomena of our first type, including all our equa-
tion of state problems of the preceding chapters, and simple
chemical reactions as well, the nuclei moved as if only one potential
function had to be considered. On the other hand, we have
just been discussing phenomena of a second type, in which there
was appreciable probability of transition from one electronic
level to another. We shall now analyze this problem more in
detail, showing how the electronic and nuclear motions can be
separated, to a first approximation exactly (leading to the phe-
nomena of the first type), but to a higher approximation not
quite perfectly (leading to the transitions of the second type).
The reason why this very fortunate separation is almost exactly
possible lies essentially in the great difference of mass between
nuclei and electrons, resulting in differences of velocity and wave
length. The electrons move so fast that they go through their
orbits many times while the nuclei move a small distance, so
that we may approximately solve for the motion of the electrons,
assuming the nuclei to be fixed. In this solution, the coordinates
of the nuclei will appear as parameters in the wave function.
Similarly the energy levels of the electronic system will depend on
the position of the nuclei. And it turns out to be actually true,
as we have been assuming, that the electronic energy, as function
of the position of the nuclei, plays the part of a potential for the
nuclear motion.
It is not hard to show how the ideas we have just described
lead to a separation of variables. Let the electronic coordinates
be symbolized by $x_i$, the nuclear ones by $X_j$, the mass of an electron
being m, and of the jth nucleus $M_j$. Then Schrödinger's
equation may be written
$$\left[-\sum_i \frac{h^2}{8\pi^2 m}\frac{\partial^2}{\partial x_i^2} - \sum_j \frac{h^2}{8\pi^2 M_j}\frac{\partial^2}{\partial X_j^2} + V(x_i, X_j)\right]\psi = -\frac{h}{2\pi i}\frac{\partial \psi}{\partial t}.$$
Now assume that $\psi = e^{-\frac{2\pi i}{h}Et}\,u(x_i, X_j)\,v(X_j)$, where $u(x_i, X_j)$ is
the wave function for the electrons, assuming the nuclei fixed,
and containing therefore the $X_j$'s as parameters, and $v(X_j)$ is
the wave function for the nuclei, under the action of the potential
arising from the electrons. That is, u is a solution of the equation
$$\left[-\sum_i \frac{h^2}{8\pi^2 m}\frac{\partial^2}{\partial x_i^2} + V(x_i, X_j)\right]u(x_i, X_j) = \epsilon(X_j)\,u(x_i, X_j). \qquad (3)$$
The Hamiltonian on the left is what we get by assuming the
masses of the nuclei are infinite, so that they stay at rest. Since
the potential function V depends on the $X_j$'s as parameters,
the energy $\epsilon$ must also show this dependence. Now we take
$\epsilon(X_j)$ as the potential for the nuclei; that is, v satisfies the equation
$$\left[-\sum_j \frac{h^2}{8\pi^2 M_j}\frac{\partial^2}{\partial X_j^2} + \epsilon(X_j)\right]v(X_j) = E\,v(X_j), \qquad (4)$$
where E is the whole energy of the system. For example, with
a diatomic molecule, $\epsilon(X_j)$, the electronic energy as function
of the nuclear position, is the potential curve we have often used,
and E is the vibrational energy level, measuring actually the
total energy, electronic and vibrational, and staying constant
during the motion, the electronic energy decreasing when
vibrational energy increases, and vice versa.
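Let us now see what differential equation uv satisfies; it proves
to be this product which approximates a wave function for the
whole system. We have easily
$$\left[-\sum_i \frac{h^2}{8\pi^2 m}\frac{\partial^2}{\partial x_i^2} - \sum_j \frac{h^2}{8\pi^2 M_j}\frac{\partial^2}{\partial X_j^2} + V(x_i, X_j)\right]u(x_i, X_j)\,v(X_j)$$
$$= E\,u(x_i, X_j)\,v(X_j) - \sum_j \frac{h^2}{8\pi^2 M_j}\left(2\frac{\partial u}{\partial X_j}\frac{\partial v}{\partial X_j} + v\frac{\partial^2 u}{\partial X_j^2}\right). \qquad (5)$$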
If it were not for the last summation, this would be exactly the
equation we wish $\psi$ to satisfy. But it is not difficult to show that
these terms are small. Thus for example with the last one, u
depends on the $X_j$'s in very much the same way in which it
depends on $x_i$, since it depends largely on the differences $(x_i - X_j)$,
representing the coordinates of electrons with respect to
the various nuclei, which are the essential things in the electronic
motion. Hence $\partial^2 u/\partial X_j^2$ is of the same order of magnitude as
$\partial^2 u/\partial x_i^2$. But this quantity, multiplied by $h^2/8\pi^2 m$, is of the
order of magnitude of the energy of one of the electrons, an
appreciable fraction of the energy of the system. The term
appears here, however, multiplied by $h^2/8\pi^2 M_j$, smaller in the
ratio of $m/M_j$, and since $M_j$ is thousands of times m, this means
that these terms are much smaller than the others, and can be
neglected. Thus approximately uv forms a solution of the
problem, and we are justified in using the electronic energy as
a potential function for the nuclei. But to a higher approxima-
tion, we cannot neglect these small terms. We can find their
matrix components between the state we are interested in and
all other states, differing in electronic and nuclear motions, and
these components, though small, will be different from zero.
du dv
dXj dX j
It is these components, of the term ^, — D .,, ( 2-^r- -^r- +
u \
7 V >
d 2 u
„y 2 v ), which determine the rate of transitions between different
electronic levels, which we have considered in this chapter, and
which we have seen are important in problems of collisions,
nuclear vibrations in metals, etc.
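As a rough numerical illustration of the mass-ratio argument, the following sketch estimates $m/M_j$ for a few nuclei. It assumes that the nuclear mass is about the mass number times the atomic mass unit, taken as roughly 1822.89 electron masses; the function name and the choice of nuclei are illustrative and do not come from the text.

```python
# Rough size of the neglected terms relative to the electronic energy.
# Assumption: nuclear mass ~ (mass number) * (atomic mass unit),
# with 1 atomic mass unit ~ 1822.89 electron masses.
ELECTRON_MASSES_PER_AMU = 1822.89


def mass_ratio(mass_number):
    """Return m/M_j, the electron mass over the approximate nuclear mass."""
    return 1.0 / (mass_number * ELECTRON_MASSES_PER_AMU)


for name, mass_number in [("H", 1), ("C", 12), ("Cu", 63)]:
    print(f"{name}: m/M ~ {mass_ratio(mass_number):.1e}")
```

Even for the lightest nucleus the ratio is below one part in a thousand, which is the sense in which the extra terms can be neglected in the first approximation.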
Problems
1. If a reaction rate doubles for 10° rise of temperature, at T = 300°,
find the activation energy, in volt-electrons per atom, and in kg.-cal. per
gram-molecule.
2. If three atoms interact by valence forces, it can be proved that the
following formula gives approximately the energy of the lowest state:
$-\sqrt{\tfrac{1}{2}[(\alpha - \beta)^2 + (\beta - \gamma)^2 + (\gamma - \alpha)^2]}$, where $\alpha$, $\beta$, $\gamma$ are the energies of
binding of the pairs 1 and 2, 2 and 3, 1 and 3, respectively, if in each case
the third atom is removed to infinity, so that $\alpha$, etc., are given as functions
of the three interatomic distances $r_{12}$, $r_{23}$, $r_{31}$ by curves of the nature of Fig.
77. This formula is used in constructing Fig. 78. Show that the formula
approaches the correct limit as any one of the three atoms recedes to infinity.
Show that a single atom approaching a molecule is repelled, by assuming
atoms 1 and 2 to be at the equilibrium distance, forming a molecule, so that
$\alpha$ is large and negative, $\beta$ and $\gamma$ much smaller and also negative, increasingly
so as the third atom approaches, and expanding the square root in binomial
series in the small quantities $\beta$ and $\gamma$.
3. Find the energy for three hydrogen atoms on a line at arbitrary dis-
tances apart, using the formula of Prob. 2, and the hydrogen interaction
energy from Prob. 4, Chap. XXXV.
4. Taking the energy expression of Prob. 3, let the distances $r_{12}$ and $r_{23}$ be
equal, so that the corresponding point is on the 45-deg. diagonal of Fig. 78,
on which on account of symmetry the pass is located in this case. Compute
energy as a function of $r_{12}$ or $r_{23}$, and find graphically the energy of the pass
(the minimum of this curve). Compare with the energy at the bottom of
either valley, and so find the activation energy of the reaction in which a
hydrogen atom approaches a hydrogen molecule, knocks off one of the
atoms from the molecule, and itself becomes bound.
5. Suppose that two normal atoms collide, and we are interested in
knowing whether either one can become excited. Let their kinetic energy
be the average value corresponding to T. What would the temperature
T have to be if the energy of excitation was 0.1 electron volt? 1 volt?
10 volts?
6. Show that two normal atoms colliding with sufficient kinetic energy
may be bound to form an excited molecule under certain conditions, by the
inverse process to predissociation, and describe by a diagram the necessary
relations of electronic energy levels and excitation energy.
7. The collision of electrons with ions of a metallic lattice is as if the
electrons were stopped after traveling a certain time T. Treating the
electrons by classical mechanics, assume there are N per unit volume, all
accelerated by the external field, and each stopped after traveling the time T,
being reduced to rest, but immediately starting to speed up again. Find
the mean velocity of each electron, and the electric current carried by all
the electrons. Show that this is proportional to the external field, so that
Ohm's law holds, and compute the specific resistance in terms of $N$ and $T$,
showing that the specific conductivity is $N\,\frac{T}{2}\,\frac{e^2}{m}$, where $e$ and $m$ are the charge
and mass of the electron.
8. Assuming that copper has one electron per atom acting in metallic
conduction, and taking the observed conductivity, find the mean time T
between collisions, using the formula of Prob. 7.
CHAPTER XXXIX
ELECTRONIC INTERACTIONS
For some time we have been treating the motions of atoms
and molecules, under the influence of a potential energy function
which was really simply the energy of the electrons if the nuclei
were considered as held fixed. We have derived many interest-
ing results, but we have merely assumed the form of the potential
energy function, or have discussed it by semi-empirical methods,
rather than deriving it directly from wave mechanics. It is
time now to return to this fundamental problem of electronic
motion, with fixed nuclei, and to try to derive the many results
we have used regarding the energy as a function of nuclear
position. Even with atoms we have given so far only the rough-
est sort of approximation, and there are many features of both
atomic and molecular structure, as for instance the structure of
multiplets, and the proper formulation of the exclusion principle,
which we have not touched. Our discussion will be carried out
in general by means of the perturbation theory, and we shall
apply it in a general way, so that it can be specialized for either
atoms or molecules. A considerable part of our work must be
devoted to the electron spin and its relation to the exclusion
principle, for we have seen that this is fundamental in the theory
of valence. The starting point will be the discussion of the
many-body problem which we have already given in Chap.
XXXIV. There we saw that we could get an approximate
solution of the problem if we assumed that each electron, instead
of being acted upon by all the others, was acted on instead by a
force which depended on its position alone, this force being equal
to the average force which the other electrons would exert,
averaged over their motion. In this case, each electron is
independent of the others, as far as its equations of motion are
concerned, Schrödinger's equation can be separated, and the
final wave function can be written as a product of functions
of the various electrons: $U = u_1(x_1 y_1 z_1) \cdots u_n(x_n y_n z_n)$, where
$u_1(x_1 y_1 z_1)$ is the one-electron function for the first electron, etc.
281. The Exclusion Principle. — The wave function we have
just written down is a solution of the problem connected with
the Schrödinger equation
$$\sum_i\left[-\frac{h^2}{8\pi^2 m}\nabla_i^2 + V_i(x_i y_i z_i)\right] U = EU, \qquad (1)$$
which results from the separate equations
$$\left[-\frac{h^2}{8\pi^2 m}\nabla_i^2 + V_i(x_i y_i z_i)\right] u_i(x_i y_i z_i) = E_i\,u_i(x_i y_i z_i),$$
where $E_1 + \cdots + E_n = E$.
But now there is one special feature when all the particles
1 . . . n are electrons, and hence just alike. This feature is
that each must move in just the same force field. In other words,
interchanging the position of two electrons must leave the
potential energy unchanged. The reason is plain: interchanging
two electrons can make no physical difference, on account of
their complete identity. In other words, we should write for
the potential energy, not $V_i(x_i y_i z_i)$, but merely $V(x_i y_i z_i)$. And
the various one-electron functions $u_i(x_i y_i z_i)$ are simply solutions
of the same problem, but in general connected with different
quantum numbers. Thus in a problem of atomic structure,
each electron moves in the same central field, but different
electrons have different quantum numbers.
An immediate result of the identity of electrons is that, as
soon as we have found one solution of the Schrödinger equation
above, we have likewise found many other solutions. For
suppose that we take one solution, and then permute the quan-
tum numbers in any arbitrary way among the electrons, we shall
still have a solution. By this we mean that, if $u_a(x_1)u_b(x_2)$
is a solution of a two-electron problem, $a$ and $b$ referring to two
quantum numbers, 1 and 2 to the coordinates of two electrons,
then equally well $u_b(x_1)u_a(x_2)$ will be a solution; and so on for
more complicated cases. Further, each of these wave functions
will correspond to the same energy, so that we shall have a
problem of degeneracy. And in such a case, we know that the
correct solution is generally a combination of the degenerate
solutions. From Chap. XXXII we know how to take care of
such a situation. We shall have $n!$ different wave functions;
for the $n$ sets of quantum numbers can be permuted among
the electrons in $n!$ ways (unless some sets of quantum numbers
are counted more than once, in which case there will be fewer
wave functions). Thus we must solve a perturbation problem
between these $n!$ functions, which we may number $1 \ldots N$,
where $N = n!$, and denote by $U_1 \ldots U_N$. Then we find the
matrix components of energy $H$ between these functions, and
in terms of these components, we set up the set of $N$ simultaneous
equations
$$\sum_m (H_{km} - E_p\,\delta_{km})\,S_{mp} = 0, \qquad k = 1 \cdots N, \qquad (2)$$
where $\delta_{km} = 1$ if $k = m$, $0$ if $k \neq m$. The operator $H$ is the
whole energy, involving interactions between electrons, rather
than the approximate one used in defining the $u$'s. These
equations determine the coefficients $S$ such that the linear
combinations
$$\sum_m S_{mp}\,U_m \qquad (3)$$
represent the correct wave functions after applying the perturbation,
the one we have written being the $p$th perturbed wave
function. In order to solve these equations, we have found that
the determinant of $N$ rows and columns formed from the quantities
$(H_{km} - E_p\,\delta_{km})$, taking all $N$ values of $k$ and $m$, has to be
zero. This gives an equation, called a secular equation, for
E p , having N roots giving the N energy levels, which we number
by the index p from 1 to N. We should carry out this process
in this case. Unfortunately it is too difficult to do, but fortu-
nately a simplification is introduced by the exclusion principle
which renders it easy to handle.
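The counting of product functions can be illustrated with a short sketch. It assumes that each set of quantum numbers is represented simply by a label, and the function name is chosen here for illustration. With all labels different there are $n!$ products; a repeated label gives fewer.

```python
from itertools import permutations


def distinct_products(quantum_numbers):
    """Return the distinct ways of assigning the given sets of quantum
    numbers to electrons 1..n; each assignment is one product function."""
    return sorted(set(permutations(quantum_numbers)))


print(len(distinct_products(["a", "b", "c"])))  # all different: 3! = 6
print(len(distinct_products(["a", "a", "b"])))  # one set counted twice: 3
```

Each tuple returned stands for one unperturbed function $U_m$, so the length of the list is the order $N$ of the secular equation that would have to be solved.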
If we were able to solve the problem of degeneracy, we should
find $N$ linear combinations of the original products, $\sum_m S_{mp}\,U_m$,
where $p = 1 \cdots N$, each of which was a solution; that is, $H$
operating on any one of these combinations would give just
a constant times this combination itself. This is not true of
the original products: $HU_m = H_{1m}U_1 + \cdots + H_{Nm}U_N$, involving
all the functions, not just the single one $U_m$, since $H$ does
not have a diagonal matrix with respect to the approximate
functions U. But now if by some other method we can set up a
combination of u's which has the property that H operating on
it gives a multiple of itself, we shall immediately know that this
combination is a solution of Schrödinger's equation. Fortunately
we can do this in one case, and it turns out that this is
the only case we are interested in. Suppose we set up the
determinant
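$$\begin{vmatrix} u_1(x_1) & u_1(x_2) & \cdots & u_1(x_n) \\ u_2(x_1) & u_2(x_2) & \cdots & u_2(x_n) \\ \vdots & \vdots & & \vdots \\ u_n(x_1) & u_n(x_2) & \cdots & u_n(x_n) \end{vmatrix}. \qquad (4)$$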
This determinant by definition is a linear combination of all
possible products of the form $u_1(x_1) \ldots u_n(x_n)$, obtained by
permuting the quantum numbers in all the possible $N$ ways,
each having a coefficient $+1$ or $-1$ according as an even or odd
number of interchanges of rows or columns was necessary to
bring the desired term to the principal diagonal of the determinant.
In other words, it is a linear combination $\sum_m S_{mp}\,U_m$,
in which all coefficients $S$ are $\pm 1$. And we can show that this
particular combination actually has the property that $H$ operating
on it gives a multiple of itself (that is, that it does not have
any nondiagonal matrix component to any other linear combination
of the $U$'s). To do this, we must first note that the determinant
is antisymmetric in the coordinates $x_1 \ldots x_n$. By that
we mean that if we interchange two coordinates with each other,
as $x_i$ and $x_j$, the whole function changes sign, but is otherwise
unchanged, retaining the same magnitude. In other words,
interchanging the position of two electrons makes only a change
of sign in the wave function. The way in which we see that
this particular function is antisymmetric is from a property
of the determinant: to interchange two coordinates means to
interchange the corresponding columns in the determinant,
and there is a theorem of determinants stating that this inter-
change merely multiplies the determinant by — 1. The reason
is simple : each product entering into the expansion of the deter-
minant is unchanged numerically by the interchange of columns,
but in each case one more or one less permutation is required
to reach a given product than was required before, multiplying
all factors by -1, according to the rule of sign stated above.
As a simple example, we start with the determinant
$$\begin{vmatrix} u_1(x_1) & u_1(x_2) \\ u_2(x_1) & u_2(x_2) \end{vmatrix} = u_1(x_1)u_2(x_2) - u_2(x_1)u_1(x_2),$$
and interchange columns, obtaining
$$\begin{vmatrix} u_1(x_2) & u_1(x_1) \\ u_2(x_2) & u_2(x_1) \end{vmatrix} = u_1(x_2)u_2(x_1) - u_1(x_1)u_2(x_2),$$
the same as the other but with opposite sign. Thus our linear
combination is antisymmetric. It is easy to see that it is the
only possible antisymmetric linear combination.
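The sign rule and the change of sign on interchanging two electrons can be checked numerically. The sketch below builds the determinant as a signed sum over permutations of the quantum numbers; the three one-electron functions are hypothetical, chosen only so that the numbers are easy to evaluate, and the function names are not from the text.

```python
from itertools import permutations
import math


def permutation_sign(perm):
    """+1 for an even permutation, -1 for an odd one (counting inversions)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def determinant_function(orbitals, coords):
    """Signed sum over all assignments of the one-electron functions
    to the electrons: sum_P sign(P) u_P(1)(x_1) ... u_P(n)(x_n)."""
    total = 0.0
    for perm in permutations(range(len(orbitals))):
        term = 1.0
        for electron, k in enumerate(perm):
            term *= orbitals[k](coords[electron])
        total += permutation_sign(perm) * term
    return total


# Hypothetical one-electron functions, for illustration only.
orbitals = [
    lambda x: math.exp(-x),
    lambda x: x * math.exp(-x / 2),
    lambda x: (1 - x) * math.exp(-x / 3),
]

d_original = determinant_function(orbitals, [0.3, 1.1, 2.0])
d_swapped = determinant_function(orbitals, [1.1, 0.3, 2.0])  # electrons 1 and 2 interchanged
print(d_original, d_swapped)  # equal magnitude, opposite sign
```

Every product appears in both evaluations with the same magnitude; only the parity of the permutation needed to reach it changes, which is the reason given in the text for the overall change of sign.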
We now know, if we call our antisymmetric determinant D,
that HD must be a linear combination of all functions U. But
HD must be antisymmetric; for D is antisymmetric, and H,
the energy, is symmetric, being entirely unchanged by inter-
change of two electrons, so that HD in turn will be changed only
in sign. In other words, HD is an antisymmetric linear combina-
tion of the products U, and the only such combination, as we
have just seen, is the determinant $D$ itself, or at most a constant
times the determinant. Hence we have shown that
$$HD = \text{constant} \times D,$$
or that D has the property we desired, of having no nondiagonal
matrix components to other linear combinations of the U's.
We must not suppose that we have found an exact solution of
Schrödinger's equation, though our description might indicate
this; for H will have nondiagonal components between D and
other antisymmetric functions formed from one-electron wave
functions u of different quantum number from those used here.
It is only within our group of N unperturbed wave functions
that we have eliminated nondiagonal terms.
Out of all the N linear combinations of the N unperturbed wave
functions, we have found just one which satisfies Schrödinger's
equation. This seems like a rather small beginning toward
the task of finding all N combinations, which we should obtain
by solving the secular equation. But now the exclusion principle
enters; and its statement is at first sight quite different from
what we have become accustomed to. It is:
The only wave functions allowed in nature are those anti-
symmetric in all electrons.
This principle, as we have before pointed out, cannot be at
present deduced from any results of wave mechanics; but must
serve as a separate postulate. We can see at once, however,
that it is a consistent postulate, in the sense that if the universe
were once set up with antisymmetric wave functions, it would
always stay so. For Schrödinger's equation involving the time
is $H\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}$, giving the time rate of change of $\psi$. If
now at a given instant $\psi$ is antisymmetric, then $H\psi$ must also
be antisymmetric, and hence the increment of $\psi$ in time $dt$ is
also antisymmetric. Since the sum of two antisymmetric functions
is itself antisymmetric, $\psi$ at time $t + dt$, which is the original
$\psi$ plus its increment, will also be antisymmetric, or this property
of antisymmetry does not change with time.
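This consistency argument can be tried on a toy model. The sketch below is an illustration under stated assumptions, not the physical problem: two particles on a ring of five sites, an energy operator made of nearest-neighbour hopping for each particle plus a symmetric interaction, and a real step `dt` standing in for the constant factor that multiplies $H\psi$ in the increment. All names are hypothetical.

```python
def apply_h(psi, potential):
    """Apply a toy two-particle energy operator that treats both particles
    alike: nearest-neighbour hopping for each particle on a ring, plus an
    interaction potential(i, j) that must be symmetric in i and j."""
    n = len(psi)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            hop = (psi[(i + 1) % n][j] + psi[(i - 1) % n][j]
                   + psi[i][(j + 1) % n] + psi[i][(j - 1) % n])
            out[i][j] = -hop + potential(i, j) * psi[i][j]
    return out


def is_antisymmetric(psi, tol=1e-12):
    """True if psi[i][j] = -psi[j][i] for every pair of sites."""
    n = len(psi)
    return all(abs(psi[i][j] + psi[j][i]) < tol for i in range(n) for j in range(n))


n_sites = 5
f = [0.2, 1.0, -0.7, 0.4, 0.9]   # arbitrary one-particle amplitudes
g = [1.3, -0.5, 0.8, 0.1, -1.1]
# Antisymmetric starting function: a 2x2 determinant of f and g.
psi = [[f[i] * g[j] - g[i] * f[j] for j in range(n_sites)] for i in range(n_sites)]


def repulsion(i, j):
    """Symmetric toy interaction between the two particles."""
    return 1.0 / (1 + abs(i - j))


h_psi = apply_h(psi, repulsion)
dt = 0.01  # real stand-in for the constant factor in the increment
psi_later = [[psi[i][j] + dt * h_psi[i][j] for j in range(n_sites)]
             for i in range(n_sites)]
print(is_antisymmetric(psi), is_antisymmetric(h_psi), is_antisymmetric(psi_later))
```

The check succeeds only because the operator is unchanged when the two particle indices are interchanged; an interaction that distinguished the particles would break it.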
282. Results of Antisymmetry of Wave Functions. — The
antisymmetry of the wave function, which we have just stated,
results immediately in the ordinary form of the exclusion princi-
ple, the fact that no two electrons may have the same set of
quantum numbers. For suppose that two of the one-electron
wave functions, say $u_i$ and $u_j$, were equal, as they would be if
they corresponded to the same quantum numbers. Then we
should have a determinant with two equal rows, and such a
determinant is always zero, as we can see from the fact that
interchanging these equal rows must surely leave the determi-
nant unchanged, and yet interchanging two rows of any determi-
nant must change its sign, inconsistent requirements unless the
determinant is zero. Hence no antisymmetric wave function
can be set up unless all electrons have different quantum numbers.
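A small numerical sketch of this vanishing follows. The two one-electron functions are hypothetical and the helper names are chosen for illustration; the determinant is expanded by cofactors, which is adequate for the small sizes used here.

```python
import math


def det(matrix):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def wave_function(orbitals, coords):
    """Determinant whose rows are one-electron functions and whose
    columns are the electron coordinates."""
    return det([[u(x) for x in coords] for u in orbitals])


def u_a(x):
    """Hypothetical one-electron function, for illustration only."""
    return math.exp(-x)


def u_b(x):
    """A second hypothetical one-electron function."""
    return x * math.exp(-x / 2)


coords = [0.4, 1.7, 2.5]
repeated = wave_function([u_a, u_b, u_a], coords)   # u_a used for two electrons
distinct = wave_function([u_a, u_b], coords[:2])    # two different functions
print(repeated, distinct)
```

The first value is zero to rounding error for any choice of coordinates, while the second is not, in line with the statement that an antisymmetric function exists only when all the one-electron functions differ.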
As a result of the exclusion principle, any particular set
of electronic quantum numbers, and hence of wave functions
$u_1 \ldots u_n$, is connected with but one wave function for the whole
system, instead of the $n! = N$ functions we at first had to consider.
This greatly simplifies problems of electronic structure.
There is one point connected with it, however, which is at first
paradoxical. We can no longer speak of the quantum numbers
of a particular electron. Each electron behaves just the same
as any other electron. The quantum numbers refer merely to
the one-electron wave functions from which we construct our
antisymmetric wave function. We can visualize this situation
if we think of an atom, with tightly bound K electrons, and a
loosely bound valence electron. The same electron which
acts at one time as valence electron may sometimes go near the
nucleus and act like a K electron, but at the same time another
electron will have changed place with it, and will now be acting
as valence electron. A similar process of interchange takes
place in molecules. For example, in H$_2$, one cannot say that
one definite electron is attached to one atom, the other to the
other, for the electron which at one time is on one atom will at
another time be on the other, with a corresponding change
of the second electron. This process of electronic interchange
is intimately connected with the formation of valence bonds,
and is a very widespread phenomenon.
283. The Electron Spin. — In Chap. XXXV, it was stated that
electrons have spins, as if they were permanent magnets, and
that these magnetic moments are allowed to be oriented in only
certain directions. For the present purposes, we can state the
rule regarding their orientation in the following way: We may
pick out some arbitrary direction in space, and then may postu-
late that each spin can be oriented either parallel or opposite to
this direction, but not at an angle to it. The spin has angular
momentum $\frac{1}{2}\frac{h}{2\pi}$, as if it had $l = \frac{1}{2}$, and correspondingly there are
the two possible orientations $m = \pm\frac{1}{2}$, parallel or opposite to the
axis. The spin may now be considered a little like a coordinate:
four, not three, quantities are needed to describe the situation
of an electron, its x coordinate, its y, its z, and its spin. The
coordinates x, y, z are capable of taking on any value; but the
quantity determining the spin, which we may take as the com-
ponent of spin along the chosen direction, can have only two
values, + or — the magnitude of spin itself. Our one-electron
wave function should now depend not merely on x, y, z, but also
on the spin. Since there are only two possible values which the
spin can take, the wave function needs to be determined only for
these two values, which we can symbolize by + and — . We
have, then, u(x, y, z, spin), defined only when spin is one of the
two values symbolized by + or — . In other words, we have
u(x, y, z, +) and u(x, y, z, — ).
The spin, as we have seen, behaves like a coordinate. But
at the same time, it also acts like a quantum number, and this is
apt to be rather confusing. Let us consider an electron in a
central field. The three quantum numbers with which we are
familiar are the total quantum number $n$, the azimuthal quantum
number $l$, and the quantum number $m$. Of these, $l$ measures the
total angular momentum on account of the rotation of the electron
in its orbit, in units of $h/2\pi$, and $m$ measures the projection
of this angular momentum along a fixed axis, the $z$ axis. But
now the electron has an angular momentum on account of its
spin, which proves to be $\frac{1}{2}\frac{h}{2\pi}$ in magnitude. This spin, as we
have just seen, can be oriented in two ways with respect to a fixed
axis, either along it or opposite to it. It thus appears that this
spin angular momentum should likewise have quantum numbers
similar to the orbital angular momentum, one representing its
total magnitude (which, being always $\frac{1}{2}$, need not be specially
considered, since the spin angular momentum, unlike the orbital
angular momentum, never changes its magnitude), and the other
its projection along the $z$ axis (which can be either $+\frac{1}{2}$ or $-\frac{1}{2}$
units). Suppose this latter quantum number, determining the
projection of spin along the axis, and capable of taking on just
the two values $+\frac{1}{2}$ and $-\frac{1}{2}$, be called $m_s$. Then to specify
the stationary state of an electron, we must give the four quantities
$n$, $l$, $m$, $m_s$. As a matter of notation, it is often convenient
to use the name $m_l$ instead of $m$ for the component of $l$ along the
axis, so that our four numbers are $n$, $l$, $m_l$, $m_s$. And the wave
function should properly carry these four numbers as subscripts:
$u_{n,l,m_l,m_s}(x, y, z, \text{spin})$.
We are now prepared to consider the physical meaning of the
functions $u_{n,l,m_l,m_s}(x, y, z, +)$ and $u_{n,l,m_l,m_s}(x, y, z, -)$. The square
of the first gives the probability that, if the quantum numbers are
$n$, $l$, $m_l$, $m_s$, the coordinates will be $x$, $y$, $z$, $+$. Suppose that
$m_s = \frac{1}{2}$. Then we know that the spin must be along the $+$ axis.
In this case, there is no probability that the spin is along the $-$
axis, for we have information to the contrary. Thus $u^2_{n,l,m_l,m_s}(x, y, z, -)$
must be zero, since it measures the probability that
the spin is along the $-$ axis. On the other hand, there is
certainty that the spin is along the $+$ axis, so that $u^2_{n,l,m_l,m_s}(x, y, z, +)$
merely gives information about the distribution in
$x$, $y$, $z$, or reduces to the ordinary function of $x$, $y$, $z$. A similar
situation holds if $m_s = -\frac{1}{2}$, and the final result is
$$u_{n,l,m_l,m_s}(x, y, z, +) = \begin{cases} u_{n,l,m_l}(x, y, z) & \text{if } m_s = \frac{1}{2} \\ 0 & \text{if } m_s = -\frac{1}{2} \end{cases}$$
$$u_{n,l,m_l,m_s}(x, y, z, -) = \begin{cases} u_{n,l,m_l}(x, y, z) & \text{if } m_s = -\frac{1}{2} \\ 0 & \text{if } m_s = \frac{1}{2}, \end{cases}$$
where $u_{n,l,m_l}(x, y, z)$ is the ordinary solution of Schrödinger's
equation without spin. It is easy to see that these can be combined
in the statement
$$u_{n,l,m_l,m_s}(x, y, z, \text{spin}) = u_{n,l,m_l}(x, y, z)\,\delta(m_s, \text{spin}), \qquad (5)$$
where $\delta(m_s, a) = 1$ if $m_s = a$, $0$ if $m_s \neq a$. The wave function is
then separated, a function of x, y, z times a function of spin, and
the latter has this peculiar form 5. The separation is natural,
since the energy does not depend on the spin. If we were includ-
ing magnetic terms in our energy, which we have so far neglected,
we should find that the magnetic moment of the spin actually
does exert a small force, resulting in a small term in the energy,
and when this is considered the separation is no longer possible.
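The separated form of the wave function lends itself to a direct sketch: an ordinary function of $x$, $y$, $z$ multiplied by a spin factor that is 1 or 0. The orbital used below is a hypothetical, unnormalized example, and the function names are chosen here for illustration.

```python
from fractions import Fraction
import math

HALF = Fraction(1, 2)


def delta(m_s, spin):
    """The spin factor: 1 if the spin coordinate equals m_s, otherwise 0."""
    return 1 if m_s == spin else 0


def spin_orbital(orbital, m_s):
    """Return u(x, y, z, spin) = orbital(x, y, z) * delta(m_s, spin)."""
    return lambda x, y, z, spin: orbital(x, y, z) * delta(m_s, spin)


def orbital(x, y, z):
    """Hypothetical unnormalized orbital function, for illustration only."""
    return math.exp(-math.sqrt(x * x + y * y + z * z))


u_up = spin_orbital(orbital, +HALF)   # quantum number m_s = +1/2
point = (0.5, 0.0, 1.0)
print(u_up(*point, +HALF) ** 2)   # the ordinary distribution in x, y, z
print(u_up(*point, -HALF) ** 2)   # no probability of finding the spin along -
```

Because the spin coordinate takes only two values, the function is fully specified by its two components, one of which vanishes when $m_s$ is given.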
284. Electron Spins and Multiplicity of Levels.— Suppose we
have two electrons. The spin of each is $\frac{1}{2}$ unit of angular
momentum, and can be oriented in either of two ways, parallel
or opposite to the z axis. Thus we can have the following
possibilities:
        1st      2d       Sum
I       +1/2     +1/2     +1
II      +1/2     -1/2      0
III     -1/2     +1/2      0
IV      -1/2     -1/2     -1
In other words, the total angular momentum of spin along the
z axis, the sum of the two, has the possibility of being 1, 0, 0, - 1.
But there is another way of interpreting this. We may consider
that the total angular momentum is the vector sum of the separate
angular momenta of the two spins. If these are parallel, the
sum is 1 ; if they are opposite, the sum is zero. With any inter-
mediate angle, the result is between zero and unity. But such
vector additions of angular momenta prove to occur often in
quantum theory, and when they do, the vector sum is always
quantized; that is to say, it has only the possibility of a discrete
set of values, differing by unity from each other. Thus in this
case the only possibilities for the total angular momentum are
0 and 1. The quantum number $S$ is applied to this total angular
momentum (s for spin, capital letter because it is a sum over
several electrons, rather than for a single electron). Next, any
angular momentum is allowed only certain quantized orientations
in space, as the orbital angular momentum of a hydrogen electron
is. In Fig. 68, we saw how an angular momentum $l$ was allowed
to have only the components $m$ along the $z$ axis, where $m = l$,
$l - 1$, $l - 2$, $\cdots$, $-l$. This law is also general, so that we
see that our angular momentum 0 can have only the component
0 along the axis, whereas the other one, 1, can have components
1, 0, $-1$. For the state with $S = 0$, then, we have but one level,
while for the state with S = 1 we have three levels.
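The bookkeeping of components can be written out as a short sketch. It enumerates the $z$ components of the individual spins, tallies the sums, and then peels off multiplets by the rule that the largest remaining component $M$ belongs to $S = M$ with components $M, M - 1, \ldots, -M$. The function names are illustrative, not from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def component_sums(n_electrons):
    """Count how often each total z component occurs for n spins of 1/2."""
    return Counter(sum(spins) for spins in product((+HALF, -HALF), repeat=n_electrons))


def multiplets(counts):
    """Split the tally into multiplets, returned as (S, number of levels)."""
    counts = Counter(counts)
    result = []
    while +counts:  # unary + keeps only entries with a positive count
        s = max(m for m, c in counts.items() if c > 0)
        m = s
        while m >= -s:
            counts[m] -= 1
            m -= 1
        result.append((s, int(2 * s + 1)))
    return result


sums = component_sums(2)
print(dict(sums))          # components 1, 0, -1 occurring 1, 2, 1 times
print(multiplets(sums))    # one triplet (S = 1) and one singlet (S = 0)
```

For two electrons the tally 1, 0, 0, $-1$ of the table splits into the three components of $S = 1$ and the single component of $S = 0$.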
Now the orientation of the vector S in space is a process involv-
ing but very small energy. The only forces on S prove to be
magnetic forces, since the spin carries with it a magnetic moment,
the motions of the electrons produce a magnetic field, and the
relative orientation of the two affects the magnetic energy. This
energy is small, however, so that the three levels of S = 1 lie
close together, and form what is called a triplet. Similarly for
$S = 0$ we have but one level, a singlet. On the other hand, we
shall soon see that levels with different values of S generally lie
far apart, with large energy separation. The effect of spins,
then, is to produce multiplets, groups of levels close together,
with considerable separation between multiplets. To verify
these facts, let us compute the energy levels in the case of two
electrons.
285. Multiplicity and the Exclusion Principle. — We have
already considered one form of degeneracy inherent in our method
of setting up an approximate solution of the wave equation : the
exchange degeneracy, arising because it made no difference in
the physical situation if two electrons exchanged positions. Now
we must consider a second type : spin degeneracy, arising because
(to the approximation to which we can neglect magnetic energy)
it makes no difference in the energy which way a spin is oriented.
In the last section, we have set up four combinations for the spins
of two electrons. To each of these corresponds an antisymmetric
wave function; for instance, to the second one,
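$$\frac{1}{\sqrt{2}}\begin{vmatrix} u_{n_1 l_1 m_{l_1},\,+\frac{1}{2}}(x_1) & u_{n_1 l_1 m_{l_1},\,+\frac{1}{2}}(x_2) \\ u_{n_2 l_2 m_{l_2},\,-\frac{1}{2}}(x_1) & u_{n_2 l_2 m_{l_2},\,-\frac{1}{2}}(x_2) \end{vmatrix}, \qquad (6)$$
where $x_1$ symbolizes the four quantities $x_1$, $y_1$, $z_1$, spin$_1$, and where the
factor $1/\sqrt{2}$ simplifies the normalization. This determinant
stands for the situation in which one electron is in the state
with orbital quantum numbers $n_1 l_1 m_{l_1}$ and spin $+\frac{1}{2}$, the other in
the state with quantum numbers $n_2 l_2 m_{l_2}$ and spin $-\frac{1}{2}$. There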
determinants, and that is concerning the exclusion principle.
Suppose that our two electrons have the same orbital quantum
numbers, so that $n_2 = n_1$, etc. Then the determinant in which
both electrons have + spins, or in which both have — spins, is
necessarily zero, for both electrons are then entirely alike, and
the two rows of the determinant are alike and the determinant
vanishes. But the determinant corresponding to spins + and —
does not vanish; in this, the two electrons differ in spin, and so
the exclusion principle does not forbid them to have the same
orbital quantum numbers. To see this, we need only expand the
determinant, which, writing the spin wave functions explicitly, is
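$$u_{n,l,m_l}(x_1 y_1 z_1)\,u_{n,l,m_l}(x_2 y_2 z_2)\,[\delta(\tfrac{1}{2}, \text{spin}_1)\,\delta(-\tfrac{1}{2}, \text{spin}_2) - \delta(-\tfrac{1}{2}, \text{spin}_1)\,\delta(\tfrac{1}{2}, \text{spin}_2)]. \qquad (7)$$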
The second factor is not zero; if the two spins are opposite, it is
either +1 or —1, so that the wave function, as far as it depends
on the coordinates, is simply the product of the functions of the
two electronic coordinates. We readily see that the other deter-
minant, corresponding to the first electron having a — spin, the
second +, is just the same, except for a difference of sign, a trivial
matter. Thus in this case of two electrons with the same quan-
tum numbers, only one out of the four levels remains. This is
clearly the singlet level. We then have the following very signif-
icant result:
Two electrons in general lead to a singlet and a triplet; but if
they have the same orbital quantum numbers, they have only a
singlet level.
The exclusion principle, in other words, can act to exclude
certain multiplets, while permitting others. This proves to be a
very important result in spectroscopy, since often a great many
of the multiplets which would be allowed by the formal rules are
excluded, simplifying the spectrum greatly. But the most impor-
tant result is in the periodic table, and in other places where
certain configurations are excluded entirely. Thus suppose we
tried to have three electrons all with the same orbital quantum
numbers. Then we simply could not choose their three spins so
that all three would be different. The best we could do would be
to have one +, two — , or vice versa. We should then inevitably
have two electrons with just the same wave function, two equal
rows in the determinant, and a wave function of zero. In other
words, no more than two electrons can have the same orbital wave
function. And if two have the same wave function, they must have
opposite spins, and hence form a singlet.
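When both electrons share one orbital function, the determinant reduces to that function's product times a spin factor, and the four choices of row spins can be tested directly. The sketch below evaluates the spin factor for every value of the two spin coordinates; the labels "+" and "-" and the function names are conventions adopted here for illustration.

```python
from itertools import product


def delta(m_s, spin):
    """1 if the spin coordinate equals m_s, otherwise 0."""
    return 1 if m_s == spin else 0


def two_electron_spin_factor(ms_first, ms_second, spin_1, spin_2):
    """Spin part of the 2x2 determinant when both rows share one orbital:
    delta(ms_first, spin_1) delta(ms_second, spin_2)
      - delta(ms_second, spin_1) delta(ms_first, spin_2)."""
    return (delta(ms_first, spin_1) * delta(ms_second, spin_2)
            - delta(ms_second, spin_1) * delta(ms_first, spin_2))


def vanishes_identically(ms_first, ms_second):
    """True if the spin factor is zero for every pair of spin coordinates."""
    return all(
        two_electron_spin_factor(ms_first, ms_second, s1, s2) == 0
        for s1, s2 in product("+-", repeat=2)
    )


for row_spins in [("+", "+"), ("+", "-"), ("-", "+"), ("-", "-")]:
    print(row_spins, "vanishes" if vanishes_identically(*row_spins) else "survives")
```

The two surviving choices differ only by an overall sign, so a single level remains; with three electrons in one orbital, at least two rows would carry the same spin label and the whole determinant would vanish.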
286. Spin Degeneracy for Two Electrons. — Let us avoid
difficulty with the exclusion principle by assuming that our two
electrons have different orbital wave functions, which to save
writing we may symbolize by $a$ and $b$. Similarly we shall symbolize
the coordinates by 1 and 2, so that we can write a one-electron
wave function of electron 1 in orbit $a$, with $+$ spin, as $a_+(1)$.
Now we have our four combinations of spins, and each of these
yields a different wave function. We have, then, a problem of
degeneracy between these four functions, and we must set up
the secular equation for this fourfold degeneracy, and solve it.
The first step is to find the matrix components of the energy
between the wave functions. And here a simplifying result
appears, which we shall first prove : the matrix component of the
energy (if we neglect magnetic terms) is zero unless both initial
and final states have the same total spin. We can prove this
most easily from the general method of finding a matrix com-
ponent. Since the spin acts as a coordinate, we must sum over
its two possible values, just as we integrate over each coordinate,
in obtaining matrix components. Thus we have, for the matrix
component between the first and second functions of our
tabulation,
$$\int dv_1 \int dv_2 \sum(\text{spin}_1)\sum(\text{spin}_2)\,\{[a_+(1)b_+(2) - b_+(1)a_+(2)]\,H\,[a_+(1)b_-(2) - b_-(1)a_+(2)]\}. \qquad (8)$$
Now $a_+(1) = a(1)\,\delta(+, \text{spin}_1)$, etc.; further, since $H$ does not
include the spin, it leaves the $\delta$ functions unchanged when it
operates. Hence we may write our matrix component as
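$$\int dv_1 \int dv_2 \sum(\text{spin}_1)\sum(\text{spin}_2)\,\{[\delta(+, \text{spin}_1)\delta(+, \text{spin}_1)\delta(+, \text{spin}_2)\delta(-, \text{spin}_2)]\,a(1)b(2)Ha(1)b(2)$$
$$- [\delta(+, \text{spin}_1)\delta(-, \text{spin}_1)\delta(+, \text{spin}_2)\delta(+, \text{spin}_2)]\,a(1)b(2)Hb(1)a(2)$$
$$- [\delta(+, \text{spin}_1)\delta(+, \text{spin}_1)\delta(+, \text{spin}_2)\delta(-, \text{spin}_2)]\,b(1)a(2)Ha(1)b(2)$$
$$+ [\delta(+, \text{spin}_1)\delta(-, \text{spin}_1)\delta(+, \text{spin}_2)\delta(+, \text{spin}_2)]\,b(1)a(2)Hb(1)a(2)\}.$$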
But now $\sum(\text{spin}_2)\,\delta(+, \text{spin}_2)\,\delta(-, \text{spin}_2)$ is zero; for it equals
$\delta(+, +)\delta(-, +) + \delta(+, -)\delta(-, -)$, each term containing a
factor zero. Similarly each of the four terms is zero, and the
matrix component vanishes. The same thing is readily seen to
occur always if the total angular momentum is different in the
two configurations.
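The vanishing of the spin sums can be verified term by term. In the sketch below each product term of a wave function is represented only by the spin labels it attaches to electrons 1 and 2; this representation and the function names are assumptions made for illustration.

```python
from itertools import product

SPINS = ("+", "-")


def delta(m_s, spin):
    """1 if the spin coordinate equals m_s, otherwise 0."""
    return 1 if m_s == spin else 0


def spin_sum(left, right):
    """Sum over both spin coordinates of the product of the spin factors.
    left and right give (label on electron 1, label on electron 2) for one
    product term of the left and right wave functions."""
    return sum(
        delta(left[0], s1) * delta(left[1], s2)
        * delta(right[0], s1) * delta(right[1], s2)
        for s1, s2 in product(SPINS, repeat=2)
    )


# Function I: a+(1)b+(2) and b+(1)a+(2) both put "+" on electrons 1 and 2.
terms_I = [("+", "+"), ("+", "+")]
# Function II: a+(1)b-(2) gives (+, -); b-(1)a+(2) gives (-, +).
terms_II = [("+", "-"), ("-", "+")]

cross_sums = [spin_sum(left, right) for left in terms_I for right in terms_II]
print(cross_sums)                          # every entry is 0
print(spin_sum(("+", "-"), ("+", "-")))    # matching labels give 1
```

A spin sum survives only when the two terms attach the same label to each electron, which is why the energy has no component between functions of different total spin component.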
The only components of H which are different from zero are
then the diagonal ones, and the component between the second
and third states. Let us compute these. We number our four
levels from I to IV, as in the table in Sec. 284, so that for instance
the function (6) is labeled II. We denote the matrix component
of the energy between states I and II as (I/H/II), with corresponding
symbols for other components, and the matrix component
of unity between the same states is (I/1/II). Then we
have
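$$(\mathrm{I}/H/\mathrm{I}) = \int dv_1 \int dv_2 \sum(\text{spin}_1)\sum(\text{spin}_2)\,\tfrac{1}{2}[\delta(+, \text{spin}_1)\delta(+, \text{spin}_1)\delta(+, \text{spin}_2)\delta(+, \text{spin}_2)] \begin{vmatrix} a(1) & a(2) \\ b(1) & b(2) \end{vmatrix} H \begin{vmatrix} a(1) & a(2) \\ b(1) & b(2) \end{vmatrix}$$
$$= \tfrac{1}{2}\int dv_1 \int dv_2 \begin{vmatrix} a(1) & a(2) \\ b(1) & b(2) \end{vmatrix} H \begin{vmatrix} a(1) & a(2) \\ b(1) & b(2) \end{vmatrix} = [(ab/H/ab) - (ab/H/ba)], \qquad (9)$$
where by definition
$$\int dv_1 \int dv_2\, a(1)b(2)Ha(1)b(2) = (ab/H/ab), \text{ etc.} \qquad (10)$$
Similarly we have
$$(\mathrm{II}/H/\mathrm{II}) = (\mathrm{III}/H/\mathrm{III}) = (ab/H/ab)$$
$$(\mathrm{II}/H/\mathrm{III}) = (\mathrm{III}/H/\mathrm{II}) = -(ab/H/ba)$$
$$(\mathrm{IV}/H/\mathrm{IV}) = (\mathrm{I}/H/\mathrm{I}) \qquad (11)$$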
We may now write down our secular equation; but we note first
that our functions I ... IV may not be normalized and orthog-
onal. Thus we have
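
$$
(\mathrm{I}/1/\mathrm{I}) = \frac{1}{2} \int dv_1 \int dv_2
\begin{vmatrix} a(1) & a(2) \\ b(1) & b(2) \end{vmatrix}^2
= (ab/1/ab) - (ab/1/ba), \tag{12}
$$

where

$$
\int dv_1 \int dv_2\; a(1)b(2)\,a(1)b(2) = (ab/1/ab), \text{ etc.} \tag{13}
$$
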
Similarly the other components are like those of H, but with
1 substituted in place of H. We see that, if a and b are separately
normalized and are orthogonal, (ab/1/ab) = 1, (ab/1/ba) = 0,
so that
(I/1/I) = (II/1/II) = (III/1/III)
= (IV/1/IV) = 1
(II/1/III) = (III/1/II) = 0,
so that our functions are normalized and orthogonal; but we
shall sometimes meet cases where this is not true.
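Now we can write the secular equation for the energy. This is

$$
\begin{vmatrix}
I - J - E & 0 & 0 & 0 \\
0 & I - E & -J & 0 \\
0 & -J & I - E & 0 \\
0 & 0 & 0 & I - J - E
\end{vmatrix} = 0, \tag{14}
$$

where I = (ab/H/ab), J = (ab/H/ba), E is the energy. This
determinant can be at once factored:

$$
(I - J - E)\,[(I - E)^2 - J^2]\,(I - J - E) = 0, \tag{15}
$$
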
giving a double root E = I − J, corresponding to the two states
with components +1 and −1 of spin along the axis, and giving
the two roots of the quadratic, I − E = ±J, or E = I ± J, for
the two states corresponding to no spin along the axis. We have,
then, three roots equal to each other, E = I − J; and one different
root, I + J. Evidently the first, having three wave functions,
with components of spin +1, 0, −1, forms the triplet,
and the other is the singlet. To the order of accuracy to which
we are working, neglecting magnetic terms in the energy, the
three levels of the triplet fall exactly together, but they are
separated by a considerable amount from the singlet, the separa-
tion being 2J, where J is an integral depending on the electro-
static forces in H, and therefore of considerable size. This
verifies our earlier statement that the energy depended in an
important way on the total spin S, but only very slightly on
its orientation.
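
The factoring of the secular determinant can be verified numerically. The sketch below is only an illustration: the values given to I and J are arbitrary (J is taken negative, as it later proves to be), exact fractions are used so that the roots come out exactly zero, and the helper names `det` and `secular` are our own.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row (ample for 4 x 4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def secular(I, J, E):
    """Value of the determinant of the secular equation for a trial energy E."""
    return det([
        [I - J - E, 0, 0, 0],
        [0, I - E, -J, 0],
        [0, -J, I - E, 0],
        [0, 0, 0, I - J - E],
    ])


# Illustrative values only, not taken from any actual atom.
I, J = Fraction(-2), Fraction(-1, 2)

triplet, singlet = I - J, I + J
print(secular(I, J, triplet), secular(I, J, singlet))   # both roots give 0

# The determinant agrees with (I - J - E)^3 (I + J - E) for every trial E,
# showing the triple root I - J and the single root I + J.
for E in (Fraction(k, 3) for k in range(-9, 10)):
    assert secular(I, J, E) == (I - J - E) ** 3 * (I + J - E)
```

The difference `singlet - triplet` equals 2J, the separation of singlet from triplet.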
287. Effect of Exclusion Principle and Spin. — The present
chapter has been devoted to the mathematical formulation
of the exclusion principle, and the effect on it of the spin, and
to the method of finding energy levels subject to the complications
introduced by these features of electronic interactions. In the
next chapter we shall make several physical applications of these
ideas. For the present, we shall merely summarize what we
have done, and briefly point out its importance. We first
showed that the identity of electrons produced a degeneracy
if we made approximate wave functions out of products of one-
electron functions, on account of the possibility of exchanging
electrons without making physical change in the system. We
discovered, however, that out of the many perturbed wave
functions allowed mathematically as linear combinations of
these unperturbed ones, but one occurred in nature, the function
which was antisymmetric in the coordinates of all electrons.
This function had the property that it allowed no two electrons
to have the same quantum numbers, the ordinary exclusion
principle, but its importance extends much farther. Next we
considered the spin, which had two possible orientations for
each electron. This led to a new degeneracy, since each electron
could have two possible spins, so that n electrons had $2^n$ possibilities
(four possibilities for two electrons). We found that
these $2^n$ levels broke up into groups, or multiplets, characterized
by the total spin angular momentum, and such that all levels
of a multiplet had the same energy, if we neglected magnetic
effects, while different multiplets were separated widely from
each other. This separation of multiplets is a result of the
antisymmetry of the wave function, as we see if we look back
over the argument, a result quite apart from the actual exclusion
of certain levels, but equally or perhaps even more important.
As we shall see when we analyze it mathematically, we have
spoken of it in an earlier chapter as the effect of the exclusion
principle on valence. The term ±J, depending on this effect,
will prove to be the term in the molecular energy which gives
the valence binding or the repulsion, depending on whether
the spins of the shared electrons are antiparallel (singlet state,
energy I + J, binding, since J proves to be negative), or parallel
(triplet, I − J, repulsion). It is somewhat paradoxical that
this large and important effect of the spin on the energy can
occur, when the spin exerts but negligible magnetic forces.
These effects are not magnetic at all, but purely electrical, and
they result simply because, on account of exclusion, the spin
can exert a large influence on the wave function, the grouping
of the electric charges, and the electrical energy. We can see
this most clearly from one property of the antisymmetric wave
function which so far we have not pointed out: if we set the
coordinates of two electrons equal in the determinant, the wave
function vanishes, since interchange of the coordinates must
change the sign, and yet can make no change when they are
equal. This includes not merely space coordinates, but also
spin. The result means the following physically: the probability
is zero that two electrons of the same spin will be found at the
same point of space (and small that they will be found near
each other). On the other hand, the antisymmetry makes no
objection to two electrons of opposite spin being close to each
other. In other words, on account of this part of the exclusion
principle, an electron of a given spin drives other electrons
of the same spin away. And while this is not directly attributa-
ble to any force at all, still it can have a powerful effect on the
electronic motions, and on the energy.
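
This property of the determinant is easily exhibited on a small example. The sketch below assumes two hypothetical one-dimensional orbitals, an even and an odd Gaussian function, chosen only for convenience; they are not the atomic functions of the text, and the names `psi`, `a`, `b` and `delta` are introduced for illustration.

```python
import math


def a(x):
    """Hypothetical one-electron orbital (even Gaussian)."""
    return math.exp(-x * x)


def b(x):
    """Hypothetical second orbital (odd), orthogonal to a."""
    return x * math.exp(-x * x)


def delta(m, s):
    """Spin function: 1 if the spin coordinate s equals m, else 0."""
    return 1.0 if m == s else 0.0


def psi(x1, s1, x2, s2, ma, mb):
    """Two-electron determinant built from a with spin ma and b with spin mb.

    (x1, s1) and (x2, s2) are the space and spin coordinates of the electrons.
    """
    return (a(x1) * delta(ma, s1) * b(x2) * delta(mb, s2)
            - a(x2) * delta(ma, s2) * b(x1) * delta(mb, s1))


# Parallel spins, both electrons at the same point: the function vanishes.
same_point = psi(0.3, +1, 0.3, +1, ma=+1, mb=+1)

# Parallel spins, electrons close together: the function is small.
near = psi(0.30, +1, 0.31, +1, ma=+1, mb=+1)

# Opposite spins, same point of space: no such restriction.
opposite = psi(0.3, +1, 0.3, -1, ma=+1, mb=-1)

print(same_point, near, opposite)
```

Interchanging the two sets of coordinates, as in `psi(x2, s2, x1, s1, ...)`, reverses the sign of the function in every case.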
Problems
1. If the one-electron wave functions $u_1 \ldots u_n$ are orthogonal and
normalized, prove that

$$
\frac{1}{\sqrt{n!}}
\begin{vmatrix}
u_1(x_1) & \cdots & u_1(x_n) \\
\vdots & & \vdots \\
u_n(x_1) & \cdots & u_n(x_n)
\end{vmatrix}
$$

is normalized.
2. Show that a system containing an odd number of electrons always
has even multiplets, as doublets, quartets, etc., while one with an even
number of electrons has odd multiplets, singlets, triplets, etc.
3. Show that three electrons lead to a quartet and two doublets, on
account of spin degeneracy. In case two of the electrons have the same
orbital wave functions (that is, are equivalent), show that the quartet and
one of the doublets are excluded.
4. In the problem of spin degeneracy of two electrons with orthogonal
one-electron functions, as we have worked out, find the four final wave
functions resulting from perturbation theory. Express these in terms of the
$\delta$ functions for spin, and show that each one factors into a product of a
function of coordinates and a function of spin. Show that the function of
coordinates is proportional to a(1)b(2) + a(2)b(1) for the singlet, a(1)b(2) −
a(2)b(1) for each of the levels of the triplet.
5. Discuss the spin degeneracy of two electrons, in the case where the
one-electron functions a and b are not orthogonal, showing that the energies
are

$$
\frac{(ab/H/ab) \pm (ab/H/ba)}{1 \pm (ab/1/ba)},
$$

the + signs being for the singlet, the − for the triplet.
6. Set up the perturbation problem of spin degeneracy for three non-
equivalent electrons, and find the energy of the quartet terms, in the general
case where the one-electron functions are not orthogonal.
7. If all wave functions are normalized and orthogonal, prove that the
sum of all diagonal terms in the energy matrix equals the sum of all
the perturbed energy values. To do this, expand the secular equation in the
form $E^N - E^{N-1}(\text{coefficient}) + \cdots = 0$, and also obtain a similar
expansion for the factored solution of this equation,
$(E - E_1)(E - E_2) \cdots (E - E_N) = 0$, where $E_1 \cdots E_N$ are the roots.
Identify coefficients of the term in $E^{N-1}$ in the two expressions.
8. Verify the result of Prob. 7 for the two-electron solution obtained in
the text.
9. Using the method of Prob. 7, applied to the three-electron case with
orthogonal one-electron functions, and the solution for the quartet energy
found in Prob. 6, find the sum of the energies of the two doublet terms (or
the mean energy of the two doublets).
CHAPTER XL
ELECTRONIC ENERGY OF ATOMS AND MOLECULES
In the last chapter we have seen how to set up the wave
function of a system, subject to the exclusion principle and to
spin, and how to find its energy levels by perturbation theory,
taking account of the various degeneracies introduced. Space
does not permit us to make complete applications of these
methods to problems of atomic and molecular structure, but
we shall indicate descriptively in the present chapter how the
calculations are made, and the results they lead to.
288. Atomic Energy Levels. — In Chap. XXXIV, we have seen
that the energy of an atom is primarily fixed by its configuration;
that is, by the values of n and l for each electron. We have
found approximate formulas for getting this energy, in any
configuration, finding which electrons were tightly and which
loosely bound, etc. And we have seen that in the low states
of an atom, the electrons tend to be in the lowest levels possible,
with two in the 1s, two in 2s, six in 2p, etc., so that only the levels
near the outside of the atom are unoccupied or only partly
filled under ordinary conditions. Now we are prepared to go
farther in understanding atomic structure.
An electron actually has not merely the two quantum numbers
n and l, but also $m_l$ and $m_s$, giving the orientation of orbital
angular momentum and of spin, respectively. These only
slightly affect the energy directly, only through small magnetic
terms, which we have neglected. It was for this reason that
in our earlier calculation of energy we could neglect them entirely.
But these now introduce a degeneracy into the system, which
we must consider in making a more accurate calculation of
energy. We have already considered in the preceding chapter
the spin degeneracy arising from $m_s$, connected with the different
orientations of spin, but there is likewise an orbital degeneracy
associated with $m_l$, arising from different orientations of orbital
angular momentum. These two types of degeneracy become
closely associated in atomic structure, and we must consider them
together. It is best to think of them first in terms of the vector
model which we have used for discussing $m_l$ for a single electron,
and for $m_s$. Let us consider an atom with two electrons. Suppose,
for illustration, that the electrons are nonequivalent p
electrons (both having l = 1, but with different values of n).
Then each has a total angular momentum of $h/2\pi$ on account of
its orbital motion, as given by l. These two vectors now form
a vector sum, which we call L. If they are parallel, L = 2;
if they are opposite, L = 0. As before, only values of L differing
by integral values are allowed, or L = 2, 1, 0. Next, the two
spin vectors also form a vector sum, S, which as we have seen
can be in this case 1 or 0. Finally, L and S can be oriented
in different ways with respect to each other, giving a resultant J,
taking on all values with integral differences between L + S and
|L − S|. We have already seen that the energy depends in an
important way on the value of S; it also proves to depend in a
similar way on L. The value of J, however, is unimportant,
only the magnetic energy of coupling between spin and orbital
motion depending on J. Hence the group of levels of the same
L and S lie close together, and form a multiplet, but different
multiplets lie comparatively far apart. It should be stated
that this is a fairly special case, though an important one; cases
may exist in which the magnetic energy is large, even sometimes
larger than the energies dependent on L and S. It is only because
here the coupling energies involved in forming L and S are large
that it is correct first to set up these vectors, then to combine
them to form J; this case is then called L-S coupling. But
other types of coupling exist, and must often be considered.
In the matter of selection rules, and various other particulars,
a multiplet of a many-electron atom with a given L behaves like
a hydrogen level with the same value of l. For this reason, L
is regarded as azimuthal quantum number, and there is a notation
for levels like that in hydrogen, levels with L equal respectively
to 0, 1, 2, . . . , being called S, P, D, F, . . . (large letters
for complete atoms). A multiplet is indicated by a symbol as
$^3P$, meaning a triplet P, with L = 1 (P), and S = 1 (triplet).
And the separate levels of the multiplet, with J = 2, 1, 0, are
denoted by $^3P_2$, $^3P_1$, $^3P_0$. To specify a level completely, one
can give the configuration as well as the description of the term;
as $1s\,2p\;{}^3P_2$, a level of a two-electron atom with one 1s and one
2p electron. Now we can see from our vector rules what multi-
plets arise from a given configuration. For instance, an s and a
p electron give $^1P$ and $^3P$; two p's give $^1S$, $^1P$, $^1D$, $^3S$, $^3P$, $^3D$,
since L can be 2, 1, or 0, and S is 1 or 0. But here as in the case
of the spin degeneracy alone, if the two electrons are equivalent,
certain multiplets are excluded. With two equivalent p's,
for instance, the only multiplets which remain are the $^1S$, $^1D$, $^3P$.
By classification of the levels, as we shall do in the next paragraph,
this can be easily proved.
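
The vector rules for nonequivalent electrons lend themselves to a short enumeration. The following sketch assumes L-S coupling as described above; the function names `couple`, `multiplets` and `label` are hypothetical, and term symbols are printed in plain text, with the multiplicity written before the letter.

```python
from fractions import Fraction

LETTERS = "SPDFGH"


def couple(j1, j2):
    """Resultants of two angular momenta: j1 + j2 down to |j1 - j2| by unit steps."""
    top, bottom = j1 + j2, abs(j1 - j2)
    return [top - k for k in range(int(top - bottom) + 1)]


def multiplets(l1, l2):
    """Pairs (S, L) for two nonequivalent electrons with orbital numbers l1, l2."""
    half = Fraction(1, 2)
    return [(S, L) for S in couple(half, half) for L in couple(l1, l2)]


def label(S, L, J=None):
    """Plain-text term symbol such as 3P, or 3P2 when J is given."""
    text = f"{int(2 * S + 1)}{LETTERS[L]}"
    return text if J is None else f"{text}{J}"


# Two nonequivalent p electrons (l = 1 for each).
for S, L in multiplets(1, 1):
    levels = [label(S, L, J) for J in couple(L, S)]
    print(label(S, L), "->", ", ".join(levels))
```

For an s and a p electron, `multiplets(0, 1)` yields only the triplet and singlet P. The exclusion of certain multiplets for equivalent electrons is not contained in these rules and must be found by the classification of levels.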
289. Spin and Orbital Degeneracy in Atomic Multiplets.—
As an illustration, we take the case of two p electrons, which to
begin with we shall assume are not equivalent. The necessary
information is given in the following table. In this we have
indicated the orbital degeneracy completely, but not the spin
degeneracy, since this follows the same arrangement as in the
preceding chapter. In the first two columns we give the $m_l$'s
of the two electrons, and in the third $M_L$, the sum of these two,
giving the component of total orbital angular momentum along
the z axis.
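Orbital Degeneracy for Two p Electrons

  m_l1   m_l2   M_L   Non-equivalent   Equivalent
    1      1      2       1 + 3            1
    1      0      1       1 + 3        }  1 + 3
    0      1      1       1 + 3        }
    1     -1      0       1 + 3        }  1 + 3
   -1      1      0       1 + 3        }
    0      0      0       1 + 3            1
    0     -1     -1       1 + 3        }  1 + 3
   -1      0     -1       1 + 3        }
   -1     -1     -2       1 + 3            1

(1 denotes a singlet, 3 a triplet; a brace joins two arrangements which
are identical for equivalent electrons.)
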
These are evidently the proper $M_L$'s to account for a D level
($M_L$ = 2, 1, 0, −1, −2), a P ($M_L$ = 1, 0, −1), and an S ($M_L$ =
0), as demanded by the vector model. But now as a result of
spin degeneracy, each one of these problems results in a singlet
and a triplet, as indicated in the next column, so that we have
just the set of multiplets already described as resulting from
the configuration. In the case of equivalence of the electrons,
however, we have already seen that the triplet is not allowed
if the two electrons have the same orbital wave functions. Thus,
as indicated in the last column, for the two $m_l$'s equal respectively
to 1, 1, or 0, 0, or −1, −1, only the singlet is allowed for equivalent
electrons. Further, since the electrons are equivalent, the
two arrangements 1, 0 and 0, 1 mean exactly the same thing,
and yield only one singlet and triplet instead of two. Thus
as we see from the last column we have a singlet level with
$M_L$ = 2, a singlet and triplet for 1, two singlets and triplet for 0,
etc. And the only arrangement of multiplets which would yield
this arrangement is $^1S$, $^1D$, $^3P$, as we have already stated.
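
The classification of levels may also be carried out by counting states directly. The sketch below is not the tabular method of the text but an equivalent bookkeeping: it counts the two-electron states by their components $M_L$ and $M_S$, and then removes complete multiplets one at a time, beginning with the largest $M_L$. The function names are hypothetical, and term symbols are printed in plain text.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product

HALF = Fraction(1, 2)
LETTERS = "SPDFGH"


def microstate_table(l, equivalent):
    """Count the states of two electrons of orbital number l by (M_L, M_S)."""
    spin_orbitals = [(ml, ms) for ml in range(-l, l + 1) for ms in (HALF, -HALF)]
    if equivalent:
        # Same n and l: the two electrons must differ in m_l or m_s,
        # and the order in which they are named is immaterial.
        pairs = combinations(spin_orbitals, 2)
    else:
        # Different n: every ordered pair is a distinct state.
        pairs = product(spin_orbitals, repeat=2)
    table = Counter()
    for (ml1, ms1), (ml2, ms2) in pairs:
        table[(ml1 + ml2, ms1 + ms2)] += 1
    return table


def extract_terms(table):
    """Remove multiplets from the table, largest M_L first, then largest M_S."""
    remaining = Counter(table)
    terms = []
    while True:
        occupied = [key for key, count in remaining.items() if count > 0]
        if not occupied:
            return terms
        L, S = max(occupied)
        terms.append(f"{int(2 * S + 1)}{LETTERS[L]}")
        # A multiplet contributes one state to each (M_L, M_S) within its range.
        for ML in range(-L, L + 1):
            for k in range(int(2 * S) + 1):
                remaining[(ML, S - k)] -= 1


print(extract_terms(microstate_table(1, equivalent=True)))
print(extract_terms(microstate_table(1, equivalent=False)))
```

For equivalent p electrons there are 15 states, which exhaust themselves in the multiplets 1D, 3P, 1S; for nonequivalent electrons the 36 states give all six multiplets.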
We shall not carry out the computation of the energy of the
multiplets, since this is rather a long and complicated task. It is
not hard, however, to outline the process that must be used.
In the first place, if the magnetic energy is neglected, it can be
proved that the energy has no matrix components between states
of different $M_L$. Let us assume that we have already solved,
as in the last chapter, the separate problems of spin perturbation.
Then in the case we are using for illustration, the degenerate
perturbation problem of orbital degeneracy can be broken up
into ten smaller problems: one for each of the five values of $M_L$,
and each of the two allowable multiplicities. No one of these
has more than a secular determinant of three rows and columns,
and all are easily solved. The calculation of the integrals
involved in the matrix components can be carried out. Since
the one-electron wave functions are solutions of central field
problems, they are functions of r times spherical harmonics of
angle. The term in the energy which makes complication in the
calculation of the matrix components is that in $1/r_{12}$, where $r_{12}$
is the distance between the electrons. To handle this, we expand
$1/r_{12}$ in spherical harmonics and inverse powers of r, and the
integration over angles then resolves itself into an integration
over products of spherical harmonics, which can be performed,
leaving only an integral over functions of r. These integrals
cannot be evaluated without knowing the functions of r contained
in the wave function, and it is often convenient to leave them as
undetermined parameters in the solution. Then we can get all
matrix components in terms of a few of these parameters, and can
solve the perturbation problem. When we do this, we find in
the first place that all the levels of a multiplet come out auto-
matically to have the same energy, as they should so long as we
neglect magnetic energies. Further, we find that the various
multiplets are displaced from the center of gravity of all multi-
plets associated with the same configuration by amounts which
are simple rational multiples of the various integrals, or param-
eters, which enter the problem. Thus, for instance, for two
equivalent p electrons, there is but one parameter, and the $^1S$
has energy 10 times the parameter, $^1D$ 1 times it, and $^3P$ −5 times
it, all referred to the energy which we should obtain by the elementary
theory neglecting degeneracy and multiplet structure.
That is, $^3P$ lies lowest, $^1D$ next, $^1S$ highest, and the energy separations
are in the ratio of 2 to 3, a prediction which can be tested
experimentally, even without knowing the numerical value of the
parameter. In other cases, there are often several unknown
parameters, so that we cannot predict immediately the relative
values of the various separations, but still can get considerable
information.
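
The ratio of separations for two equivalent p electrons follows from the three coefficients alone. A minimal sketch of the arithmetic, assuming only that the parameter is positive, so that the ordering of the levels is the one stated in the text:

```python
from fractions import Fraction

# Energies of the three multiplets, in units of the single parameter,
# as quoted in the text for two equivalent p electrons.
coefficients = {"1S": 10, "1D": 1, "3P": -5}

ordered = sorted(coefficients, key=coefficients.get)       # lowest level first
lower_gap = coefficients["1D"] - coefficients["3P"]        # 1D above 3P
upper_gap = coefficients["1S"] - coefficients["1D"]        # 1S above 1D

print(ordered, Fraction(lower_gap, upper_gap))
```

The two separations are 6 and 9 times the parameter, so that their ratio is 2 to 3 whatever the numerical value of the parameter.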
When now the magnetic energy of interaction between the
magnetic moments of the spin and the orbital motion is included,
this produces a further perturbation between the degenerate
levels of each multiplet. For small magnetic energies this is not
hard to work out. In this case, the energies of the various states
of a multiplet follow simple rules, so that multiplets are groups
of levels spaced in a regular fashion. When the magnetic energy
is so large that multiplets spread enough to overlap other multiplets,
however, the L-S coupling no longer holds, and the
situation becomes very complicated. It is no longer possible to
classify into multiplets at all; an individual level may take on
some of the properties of several different multiplets, its wave
function being a linear combination of functions of these various
multiplets. And even a greater complication often is present
in actual cases: the multiplets connected with a given configura-
tion may spread out so much that they overlap other configura-
tions. Then even the distinction between configurations can
become partly lost. We must solve a perturbation problem in
which we take into account many configurations, not just one,
and the final wave functions will be mixtures of these different
configurations. The actual atomic structure, then, and the
calculation necessary to describe it in detail, are very complicated.
290. Energy Levels of Diatomic Molecules.— Orbital
degeneracy is of much less importance in molecules than in atoms,
on account of the lack of spherical symmetry. The origin of
atomic orbital degeneracy is found in the fact that the energy of
an electron is independent of the orientation of its orbit in space,
so that a number of levels, having different orientations, corre-
spond to the same energy. But with a molecule, different orien-
tations, with different space arrangements of charge, will interact
differently with the other atoms of a molecule, and hence will
have different energies, and are to be counted as different con-
figurations. The only special case is in a diatomic molecule,
where we may take the axis of the molecule to be our preferred
axis in space. Then each state is characterized by the com-
ponent of angular momentum along this axis; and the other
state with component just the negative of this will have exactly
the same energy, corresponding in Bohr's theory to an orbit in
which the electron was merely rotating in the opposite direction.
Thus we shall have levels with a two-fold degeneracy, but no
more. And in polyatomic molecules, even this degeneracy will
generally be lost. It is worth noting also that molecules in which
the atoms are in s states, and have no orbital angular momentum
anyway, will necessarily have no orbital degeneracy. This
includes many important actual cases.
Since it is unimportant, we shall neglect orbital degeneracy
in molecular structure. The real complication comes in a differ-
ent direction, as far as the theory is concerned. This is the
question of the type of unperturbed one-electron functions to use.
Two different methods have been used, each having some advan-
tages, and we shall describe these methods, and point out their
relations. The first is the method of Heitler and London, which
starts out by assuming that the one-electron wave functions are
just as in separated atoms, and applying perturbation theory to
these functions. We shall begin by sketching the treatment of
the lowest state of the $H_2$ molecule by this method, its best known
application.
291. Heitler and London Method for $H_2$. — Let us assume that
we have two normal hydrogen atoms at distance R apart. We
shall assume, with Heitler and London, that the one-electron
wave functions are those of an electron moving about either
hydrogen atom in the absence of the other, or simply the 1s
functions of a hydrogen atom. Let the function representing
an electron about the first nucleus be a, and around the second b.
Now if we have two electrons, one in each of these two wave
functions, we have a perturbation problem leading to spin
degeneracy, as we saw in the last chapter. There is no difficulty
about equivalence of the electrons; for, while both wave functions
are 1s, they are about different nuclei, so that they correspond
to different functions of coordinates. Hence we have a singlet
and triplet level arising from the interaction. We can get their
energies immediately, from the methods of the preceding chapter.
It is easy to see that the one-electron functions are not orthogonal;
a and b are both everywhere positive, so that it is impossible
that $\int a(x)\,b(x)\,dv$ should be zero. Hence we must use the method
for nonorthogonal functions developed in Prob. 5, Chap. XXXIX,
and we find for the energies

$$
\frac{(ab/H/ab) \pm (ab/H/ba)}{1 \pm (ab/1/ba)}, \tag{1}
$$

where the + signs are for the singlet, the − for the triplet.
These integrals are functions of the distance of separation R,
and evaluation of them leads to the interatomic potential
energy curves.
It is interesting actually to work out the values of the energy
integrals. To do this, we note that H for the diatomic molecule
(including all terms except nuclear kinetic energy, which in
this method we leave out), is

$$
H = -\frac{h^2}{8\pi^2 m}\left(\nabla_1^2 + \nabla_2^2\right) + \frac{e^2}{R}
 - \frac{e^2}{r_{a1}} - \frac{e^2}{r_{a2}} - \frac{e^2}{r_{b1}} - \frac{e^2}{r_{b2}}
 + \frac{e^2}{r_{12}}, \tag{2}
$$

where first we have the kinetic energy of the two electrons, then
the repulsion of the nuclei, the attraction of the electrons for the
nuclei ($r_{a1}$ representing for instance the distance between electron
1 and nucleus a), and finally the repulsion of the electrons for
each other ($r_{12}$ being the interelectronic distance). We also
remember that, since a and b are solutions of atomic problems,
we have

$$
\left(-\frac{h^2}{8\pi^2 m}\nabla_1^2 - \frac{e^2}{r_{a1}}\right) a(1) = E_0\,a(1),
$$

with a similar equation for b, where $E_0$ = −13.54 volt-electrons
is the binding energy of the hydrogen atom. Using these
equations, we may compute Hab, obtaining

$$
Hab = 2E_0\,ab + \left(\frac{e^2}{R} - \frac{e^2}{r_{a2}} - \frac{e^2}{r_{b1}} + \frac{e^2}{r_{12}}\right) ab, \tag{3}
$$

where by Hab we mean Ha(1)b(2). Now to find (ab/H/ab),
we must multiply this expression through by a(1)b(2), and
integrate over the coordinates. We assume a and b are normalized,
though not orthogonal. Thus $\int a(1)b(2)\,a(1)b(2)\,dv_1\,dv_2 = 1$.
We then have, from the first term of Hab, simply $2E_0$, the energy
of the unperturbed atoms. The term in $e^2/R$ represents the
repulsive energy between the two nuclei. The next term is
$-\int a^2(1)\,b^2(2)\,\frac{e^2}{r_{a2}}\,dv_1\,dv_2 = -\int b^2(2)\,\frac{e^2}{r_{a2}}\,dv_2$, integrating over the
coordinates of the first electron. But this is just the potential
at the nucleus a of a charge −e distributed according to the
density function $b^2$, around the nucleus b, multiplied by the charge
e of nucleus a. In other words, this term (and the next one,
which is analogous) represents the attractions of the nuclei of
each atom for the electron of the other. Finally the last term is
$\int a^2(1)\,b^2(2)\,\frac{e^2}{r_{12}}\,dv_1\,dv_2$, or the repulsive interaction between a charge e
distributed over the first nucleus according to the density $a^2$,
and another charge on the second nucleus with density $b^2$.
The four terms taken together, then, represent the electrostatic
interaction between the two atoms, each regarded as a nucleus
and a charge distribution of negative charge surrounding it.
They are, in other words, the penetration or Coulomb interaction
which we have discussed in a previous chapter. There by
qualitative arguments we showed that this interaction would
give a net attraction at moderate distances, though at sufficiently
small distances, on account of the nuclear repulsion $e^2/R$, the
interaction always becomes repulsive. Our first result, then, is
a formula for this penetration interaction.
The other term entering into the energy is the integral
(ab/H/ba). This is the same as (ba/H/ab), and is obtained by
multiplying Ha(1)b(2) by b(1)a(2), and integrating. In doing
this, we encounter at once the integral $\int a(1)b(1)\,dv_1 \int a(2)b(2)\,dv_2$,
or the square of $\int a(1)b(1)\,dv_1$. Now the function a is large
near the first nucleus, and falls down exponentially as we go
away from it, becoming small at the second nucleus. Similarly
b is large near the second, small at the first. The product then
is small, since everywhere one or the other factor is small.
The largest values may be assumed to come in between the two
nuclei, where the two atoms overlap to an appreciable amount if
they are near enough together, so that both factors, representing
the densities of each atom at such a point, are fairly large. The
integral, then, will be a quantity small compared with unity,
the principal contributions coming from the region of overlapping
between the atoms. We can now examine the individual terms.
The one in $2E_0$ will be simply
$2E_0 \int a(1)b(1)\,a(2)b(2)\,dv_1\,dv_2 = 2E_0\,(ab/1/ba)$.
This, then, is just such that in the whole expression
for energy we have the term

$$
\frac{2E_0 \pm 2E_0\,(ab/1/ba)}{1 \pm (ab/1/ba)} = 2E_0,
$$

the unperturbed energy. Next, the term in $e^2/R$ is equal to
$(\alpha e)^2/R$, if we write $\alpha = \int a(1)b(1)\,dv_1$, so that $\alpha^2 = (ab/1/ba)$,
and as we have seen $\alpha$ is a number small compared with unity.
This is the repulsion of two charges $\alpha e$, one on each nucleus, for
each other. The next term is
$-\int a(1)b(1)\,dv_1 \int a(2)b(2)\,\frac{e^2}{r_{a2}}\,dv_2 = -\alpha \int a(2)b(2)\,\frac{e^2}{r_{a2}}\,dv_2$,
which is the potential energy of interaction
between a charge $\alpha e$ on nucleus a, and a charge of density
$-e\,a(2)b(2)$ distributed in the region of overlapping. This latter
charge has the total amount $-\alpha e$. Similarly the next term is
the attraction between a charge $\alpha e$ on the nucleus b, and $-\alpha e$
spread out in the region of overlapping, and the last term is the
repulsion between two charges $-\alpha e$, each distributed over the
region between the atoms. When now we compute these terms,
we find that the attractions between the charges on the nuclei
and the distributed charges between are more than enough
to balance the repulsions between the charges on the nuclei,
and between the distributed charges, and the net effect is a
negative integral, at least at fairly large distances, giving attraction.
If then in our energy expression we have the + sign,
the term (ab/H/ba), being negative, will result in binding between
the atoms. This is the valence attraction of which we have
spoken before. We see that its physical interpretation is as
follows: part of the electronic distribution from each atom
moves into the region between the two atoms, forming an electron
pair there. The amount of charge moving from each atom is
$-\alpha e$, leaving a corresponding positive charge on each nucleus.
The cause of the valence binding is the electrostatic energy of
attraction between this concentration of negative charge in
the region between the atoms, and the residual positive charge
left on each nucleus. If we use the − sign instead, we obtain
the repulsive energy level, discussed in Prob. 4, Chap. XXXV,
the energy being repulsive because the term −(ab/H/ba),
which is a positive energy, is more than enough to counterbalance
the Coulomb attraction. It is easy to see that this repulsive
level is the triplet, corresponding to having the spins of the two
electrons parallel, while the attractive level is the singlet, with
the spins opposite. As we have pointed out before, this may
be qualitatively connected with the exclusion principle, in
that if the spins are opposite, the exclusion principle does not
operate and the charges can overlap, while if they are parallel
the charges cannot overlap, and a repulsion results.
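
The structure of the energy formula can be illustrated numerically. The sketch below makes several assumptions which are not part of the text: atomic units are used (distances in Bohr radii, $E_0 = -1/2$); the overlap integral of two 1s functions is taken from the standard closed form $e^{-R}(1 + R + R^2/3)$, which the code checks by a one-dimensional Simpson integration in prolate spheroidal coordinates; and the values given to Q and A, the interaction parts of (ab/H/ab) and (ab/H/ba), are mere placeholders of plausible sign, not evaluated integrals.

```python
import math

E0 = -0.5  # energy of one hydrogen atom, atomic units (assumed choice of units)


def overlap(R):
    """Overlap integral of two normalized 1s functions a distance R apart."""
    return math.exp(-R) * (1.0 + R + R * R / 3.0)


def overlap_numeric(R, steps=4000):
    """The same integral by Simpson's rule (R > 0).

    With mu = (r_a + r_b)/R and nu = (r_a - r_b)/R, a*b = exp(-R*mu)/pi and
    dv = (R/2)^3 (mu^2 - nu^2) dmu dnu dphi. Integrating over phi and nu
    leaves (R^3/4) * integral from 1 to infinity of exp(-R*mu) (2 mu^2 - 2/3).
    """
    upper = 1.0 + 40.0 / R          # integrand has fallen by about exp(-40)
    h = (upper - 1.0) / steps

    def f(mu):
        return math.exp(-R * mu) * (2.0 * mu * mu - 2.0 / 3.0)

    total = f(1.0) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(1.0 + i * h)
    return R ** 3 / 4.0 * total * h / 3.0


def energies(Q, A, alpha):
    """Singlet and triplet energies from Eq. (1).

    Q and A are what is left of (ab/H/ab) and (ab/H/ba) after the terms in
    2*E0 are split off: (ab/H/ab) = 2*E0 + Q, (ab/H/ba) = 2*E0*alpha^2 + A.
    """
    s2 = alpha ** 2
    direct = 2.0 * E0 + Q
    exchange = 2.0 * E0 * s2 + A
    return (direct + exchange) / (1.0 + s2), (direct - exchange) / (1.0 - s2)


R = 2.0                      # illustrative separation
alpha = overlap(R)
Q, A = -0.02, -0.10          # placeholder values, not computed integrals
singlet, triplet = energies(Q, A, alpha)
print(alpha, overlap_numeric(R))
print(singlet - 2.0 * E0, triplet - 2.0 * E0)
```

With these placeholder values the singlet lies below the energy $2E_0$ of the separated atoms and the triplet above it, and in each case the terms in $2E_0$ combine to give exactly the unperturbed energy, the remainder being $(Q \pm A)/(1 \pm \alpha^2)$.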
292. The Method of Molecular Orbitals. — The other method
which has been used for the discussion of molecular energy levels
is one in which we take account of the fact that the actual elec-
trons in a molecule are in a different field from that of a single
atom, and try to find their one-electron wave functions subject
to this field. Thus for $H_2$, either electron may be near either
nucleus. When it is near one, it is attracted by that nucleus,
and when near the other by the other, so that it moves in a field
with two attracting centers. The method we are now describing
tries to find the wave functions of an electron in such a field,
and uses these (sometimes called molecular orbitals) to build
up a solution for the whole problem. It is not very easy to
get exact solutions for the problem of two centers, but we can
find an approximation fairly simply, by perturbations from the
problem of one center. When the electron is near nucleus a,
the actual field acting on it, in the molecule, is not very different
from the field if the other atom were absent. Thus the wave
function for a single electron, near nucleus a, must resemble our
function a(1) which we have used previously. On the other
hand, when the same electron is near nucleus b, its wave function
resembles b(1). Thus the whole molecular orbital which we
are seeking must have both properties at once of resembling
a near the first nucleus, b near the second. We can try to build
up a wave function as linear combination of these two, a and b,
and in doing this we are led to a perturbation problem, each
of these acting as an unperturbed wave function. Further,
on account of the identity of the two nuclei, these two unper-
turbed functions correspond to the same energy, and the problem
is degenerate. When we solve the problem of degeneracy, we
find easily that the two final wave functions are, except for a
factor, a + b and a — b. The first of these is symmetric in
the two nuclei, while the second is antisymmetric. We show
graphs of these two functions, taken by plotting values along a
line joining the nuclei, in Fig. 80, where we see that the anti-
symmetric function has a node between the nuclei. Calculation
of the one-electron energies of these two functions shows that
the function without the node is more tightly bound. It is
easy to see that the symmetrical function corresponds to an
extra amount of charge between the nuclei, since the wave
function there is twice as big as for either atom separately,
and therefore its square, the density, is four times as great, or
twice as much as for the two atoms separately. Thus there is
a sort of interference effect between the waves, as in optics,
where the amplitudes add, but the intensity does not. On the
other hand, with the antisymmetric function, with its node in
the center, there is less charge between the nuclei than for the two
atoms separately.
Fig. 80. — Symmetric and antisymmetric molecular orbitals. Figures represent
values of the wave function at points on the line joining the centers of the
atoms. Curve a + b is symmetric, a − b antisymmetric, where a and b are
one-electron wave functions about the two nuclei.
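The factor of two at the midpoint can be checked numerically. The following is a minimal sketch, assuming for illustration that a and b are unnormalized hydrogen 1s functions, exp(−r) with r in Bohr radii, and neglecting the normalization of a ± b. The function names and the separation R = 1.5 are illustrative choices, not taken from the text.

```python
import math

def a(x, R):
    """Unnormalized 1s function about nucleus a at x = -R/2 (assumed form)."""
    return math.exp(-abs(x + R / 2))

def b(x, R):
    """Unnormalized 1s function about nucleus b at x = +R/2 (assumed form)."""
    return math.exp(-abs(x - R / 2))

def densities(x, R):
    """Return (symmetric, antisymmetric, separate-atom) densities on the axis."""
    sym = (a(x, R) + b(x, R)) ** 2
    anti = (a(x, R) - b(x, R)) ** 2
    separate = a(x, R) ** 2 + b(x, R) ** 2
    return sym, anti, separate

R = 1.5  # internuclear distance in Bohr radii (illustrative)
sym_mid, anti_mid, sep_mid = densities(0.0, R)
print(sym_mid / sep_mid)  # ratio of symmetric density to that of two separate atoms
print(anti_mid)           # density at the node of the antisymmetric orbital
```

At the midpoint the two functions are equal, so the ratio printed first is the interference factor discussed above, and the antisymmetric density vanishes there.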
Now that we have found the molecular orbitals, we remember
that we have two electrons, each of which must go into one of
the orbitals. In the lowest state, both electrons will be in the
lowest orbital, with opposite spins (so as not to contradict the
exclusion principle). Then there will be the extra concentration
of charge between the nuclei which we have already pointed
out in the symmetrical orbital, and which produces a binding
between the atoms, as in the method of Heitler and London.
On the other hand, if both electrons have the same spin, so that
we have a triplet level, they cannot both be in the same orbital,
but one must be in the higher, antisymmetric level, and this
counteracts the effect of the attraction, and results in repulsion.
It is a rather complicated thing actually to compute the energy
by the method of molecular orbitals, and its use is more in
qualitative discussion of the types of molecular structure, rather
than for numerical computation. It can be shown, however,
that it must lead to essentially the same results as the method
of Heitler and London.
As an illustration of the sort of case where molecular orbitals
are particularly useful, we may discuss the structure of the
two molecules CO and N₂. Each of the atoms, in either of these
molecules, contains two K electrons. In addition, either of the
molecules has ten more electrons. Of these ten, two presumably
act as 2s electrons about each of the nuclei, but the six others
behave like molecular orbitals about the two nuclei. In these
particular molecules, the nuclei are fairly close together, and 2p
electrons are sufficiently extended in space so that they practically
surround both nuclei. In this case, the 2p molecular orbitals
in the field of two nuclei are not very different from 2p atomic
orbits, there seem still to be six of them, and in the completed
molecule there is one electron in each, resulting in a molecule
which has a complete shell of 2p's surrounding the whole thing,
and therefore rather like a neon atom, small, inert chemically.
This as a matter of fact is characteristic of the two molecules in
question. In a case of this sort, evidently it would not be accu-
rate to find the molecular orbitals by perturbation methods
from atomic ones, as we did for H 2 , but they should be found
specially for the problem under discussion.
There is one problem for which the method of molecular
orbitals is decidedly more convenient than that of Heitler and
London, and that is the structure of metals. There a molecular
orbital represents an electron which moves in the field of all
other electrons and nuclei. Now this field, while it has great
variations from point to point as we go from a position near
a nucleus to one farther from nuclei, still is in general constant
throughout the metal, showing only local fluctuations, unless
an electric current is flowing, in which case, by Ohm's law, the
field has a potential which varies slowly as we go through the
metal. A molecular orbital in such a field, as we shall see in
the next chapter, while it shows the behavior of an atomic wave
function near an individual atom, still varies through large
distances as the wave function for a free electron would, moving
in the averaged-out field. Now the interesting point is that
we can easily set up such orbitals corresponding to electrons
carrying a current, and it is by this method that electrical
conductivity is described. The corresponding process is very
difficult to treat by the method of Heitler and London.
Problems
1. Prove by the vector diagram that an S level, no matter what the multi-
plicity, has only one sublevel (one J value), and that a P level never has
more than three sublevels.
2. Discuss by the vector diagram the levels arising from a p and a d elec-
tron; two nonequivalent d's.
3. Prove that a closed shell of p electrons has no orbital or spin angular
momentum, so that its state is $^1S$. Show that the same thing is true of any
closed shell.
4. Prove that any configuration of electrons outside closed shells leads to
the same set of multiplets that it would if the closed shells were absent.
Thus prove that all alkali spectra are similar except for magnitude of the
terms.
5. Work out the problem of orbital degeneracy of two equivalent d elec-
trons, showing that the only allowed levels are $^1S$, $^3P$, $^1D$, $^3F$, $^1G$.
6. Prove that the vector diagram and the method of orbital degeneracy
lead to the same set of levels for three nonequivalent p electrons. (Hint: in
the vector method, first couple two of the l's together to form a vector,
and then couple the remaining l to this to form L. Proceed similarly with
the spins.)
7. Prove that three equivalent p electrons lead to $^4S$, $^2D$, $^2P$.
8. Using the Heitler and London method, find an expression for the den-
sity of charge in the normal state of H₂, as a function of position. Show
that the density is greater in the region between the atoms than if we simply
added the densities of the two atoms.
9. Using the molecular orbital (a + b)/2 (neglecting the fact that this is
not exactly normalized), for H₂, and an internuclear distance of 0.8 Å,
find the charge density at points in a plane containing the nuclei. Draw
a diagram with lines of constant charge density, which would be circles
surrounding the nucleus for a single atom, but show that in this case some
of the lines surround both nuclei.
10. Draw a diagram similar to that of Prob. 9 for the charge density of
the repulsive orbital (a − b)/2 for H₂.
CHAPTER XLI
FERMI STATISTICS AND METALLIC STRUCTURE
For several chapters we have been dealing with the electronic
structure of atoms and molecules, treating them by the perturba-
tion theory applied to Schrödinger's equation, with the addition
of the exclusion principle (antisymmetry of wave functions),
and the electron spin. There is an alternative method, based
directly on statistics, which in its present form is not capable
of giving exact results, but which is very useful for qualitative
discussions, and is not greatly in error when used numerically.
This is the method of Fermi statistics. It is a method in which
the exclusion principle is properly taken care of, but which
treats the electronic motions, and Schrödinger's equation,
only approximately. We may begin its discussion by treating
the free electron theory of metals, one of the simplest applications.
293. The Exclusion Principle for Free Electrons. — Let us
consider free electrons in a box, subject to the boundary condi-
tion that the wave function goes to zero on the boundaries.
The problem is that of a perfect gas, if we neglect forces on the
electrons, and we have already treated it in Chap. XXXV.
We found there that the wave function for a single electron was
$$\sin\frac{n_1\pi x}{A}\,\sin\frac{n_2\pi y}{B}\,\sin\frac{n_3\pi z}{C},$$
where $n_1$, $n_2$, $n_3$ were integers, A, B, C the sides of the box.
The corresponding energy is
$$E = \frac{h^2}{8\pi^2 m}\left(\frac{n_1^2\pi^2}{A^2} + \frac{n_2^2\pi^2}{B^2} + \frac{n_3^2\pi^2}{C^2}\right). \tag{1}$$
Let us consider the same electron in the phase space, and investi-
gate its quantum conditions. The variables can be separated,
and suppose we take the x coordinate and its momentum,
drawing a phase space for these two variables (as in Fig. 81).
In such a phase space, the path of a particle is represented by
a line as abcd, where along ab the momentum is positive and
constant, corresponding to motion along the box from left
to right; bc, in which the momentum changes suddenly without
change of coordinate, represents the collision with the wall; cd
represents the motion back in the opposite direction, and there-
fore with negative momentum, to the other wall, and da repre-
sents the collision with this wall. The ordinary quantum
condition would state that $\oint p\,dq = nh$, or area abcd = integer
times h. If $p_x$ is the positive momentum along the path ab,
this evidently gives $2Ap_x$ = integer times h, $p_x$ = integer times
$h/2A$. For the particular path indicated, the integer is evidently 2,
the particle being in the second quantum state. It is easy to
see that this relation between momentum and quantum number
holds in the same way, with integral quantum numbers, for the
wave mechanical solution. There the function $\sin(n_1\pi x/A)$ can
be written as $\frac{1}{2i}\left(e^{i n_1\pi x/A} - e^{-i n_1\pi x/A}\right)$, and when we multiply
by the time factor, this gives the sum of two progressive waves
traveling in opposite directions, representing the stream of
particles going across the box and returning. For each of these
streams, we can find the momentum; thus for the first term,
$$\frac{h}{2\pi i}\frac{\partial}{\partial x}\left(e^{i n_1\pi x/A}\right) = \frac{h}{2A}\,n_1\,e^{i n_1\pi x/A},$$
showing that the x momentum is $n_1 h/2A$, as before. Similarly the energy is at once seen, in
terms of these values for $p_x$, $p_y$, $p_z$, to be $(p_x^2 + p_y^2 + p_z^2)/2m$, the
classical value.
Fig. 81. — Phase space for free electron in box. The limits of the box are 0 and
A, along the x axis. The rectangle abcda represents the path of a particle, for
which $\oint p\,dx = 2h$.
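The agreement between the wave mechanical energy and the classical value can be verified with a few lines of arithmetic. The following is a minimal sketch, assuming SI values for the constants; the function names, quantum numbers, and box dimensions are illustrative and do not come from the text.

```python
import math

H = 6.62607015e-34      # Planck's constant, J s
M_E = 9.1093837015e-31  # electron mass, kg

def box_energy(n, sides, m=M_E):
    """Energy of Eq. (1): (h^2 / 8 pi^2 m) * sum of n_i^2 pi^2 / L_i^2."""
    return H**2 / (8 * math.pi**2 * m) * sum(
        (ni * math.pi / side) ** 2 for ni, side in zip(n, sides))

def momenta(n, sides):
    """Momentum components p_i = n_i h / (2 L_i) from the quantum condition."""
    return [ni * H / (2 * side) for ni, side in zip(n, sides)]

n = (2, 1, 3)                    # illustrative quantum numbers
sides = (1e-9, 2e-9, 1.5e-9)     # illustrative box sides A, B, C in metres

e_wave = box_energy(n, sides)
e_classical = sum(p**2 for p in momenta(n, sides)) / (2 * M_E)
print(e_wave, e_classical)
```

The two printed numbers should agree to rounding error, since substituting $p_x = n_1 h/2A$ and its analogues into $(p_x^2 + p_y^2 + p_z^2)/2m$ reproduces Eq. (1) term by term.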
Now let us see how the exclusion principle operates, in the
phase space. We may represent each electron by a separate
point in the same six-dimensional phase space, and each one
will move in such a path as abcd, at least when projected into
the two-dimensional space associated with one coordinate.
Now the exclusion principle states that only two electrons can
have the same set of orbital quantum numbers, and these two
must have opposite spin. In other words, only two electrons
can have the same values of $n_1$, $n_2$, $n_3$. This can be formulated
in a statement that there is a certain maximum density of
electrons in the phase space, which may not be exceeded. For
associated with the quantum number $n_1$, there is a definite area
of the $x$-$p_x$ plane: the area between the path abcd and that
corresponding to the next smaller path, which from the quantum
condition is h. Similarly associated with each of the two other
quantum numbers is a two-dimensional area h, and when we
put these together, we find a six-dimensional volume $h^3$ associated
with a given set of n's. In other words, the exclusion principle
states that the maximum density of points corresponding to
free electrons in the six-dimensional phase space is two electrons
per volume $h^3$, or $2/h^3$ electrons per unit volume, or a charge
$-2e/h^3$ per unit volume. Since the six-dimensional density in
phase space is the product of the ordinary density in space,
by the density in momentum space, we see that this product
cannot exceed a definite value. The denser the electrons are
in ordinary space, the smaller must be the density in momentum
space, therefore the greater must be the volume of momentum
space occupied, and consequently the larger the maximum
momentum, and kinetic energy, of the particles. Crowding
electrons together in three-dimensional space, therefore, neces-
sarily increases their kinetic energy, therefore requires work, and
this effect, depending on the exclusion principle, is the one we
have spoken of in connection with the repulsion of atoms and
molecules.
294. Maximum Kinetic Energy and Density of Electrons.—
Two conclusions may be easily derived from the formulation
of the exclusion principle in the phase space. First, let the
density of electrons in ordinary space be determined. Then,
even though the electrons are all in their lowest energy levels,
the density of electrons in the momentum space cannot exceed
the maximum allowed by the exclusion principle. As a result, the
electrons must have kinetic energy, even at absolute zero of
temperature, and we can easily compute the maximum kinetic
energy which any of them must have. To do this, let us consider
a three-dimensional momentum space. In this space, where
$p_x$, $p_y$, $p_z$ are the three variables, a surface of constant kinetic
energy is a sphere: $E_k = (p_x^2 + p_y^2 + p_z^2)/2m$, a sphere of radius
$p = \sqrt{2mE_k}$. Now we assume that electrons are in the lowest
possible energy levels, so that they have the smallest possible
kinetic energy. In other words, the part of the momentum
space within a sphere corresponding to the maximum kinetic
energy, $E_{\max}$, will be filled up to its maximum allowable density
with points, and no points will be found outside. We may then
make the following equation: the actual density of particles in
the ordinary space equals the integral of the density in phase
space, integrated over the momenta. In other words,
$$\rho = \iiint (\text{density in phase space})\, dp_x\, dp_y\, dp_z.$$
But the maximum density of charge in the phase space, as
we have seen, is $-2e/h^3$. Further, in place of the integration
$\iiint dp_x\, dp_y\, dp_z$, we may simply multiply by the volume of momen-
tum space which is occupied, since the integrand is constant.
This is the volume of the sphere of radius $\sqrt{2mE_{\max}}$, or
$\frac{4}{3}\pi(2mE_{\max})^{3/2}$. In other words, we have
$$\rho = -\frac{8\pi e}{3h^3}\,(2mE_{\max})^{3/2}. \tag{2}$$
Solving for $E_{\max}$, we have
$$E_{\max} = \frac{1}{2m}\left(-\frac{3h^3\rho}{8\pi e}\right)^{2/3}. \tag{3}$$
This gives the maximum kinetic energy as a function of density
of electricity, and shows as we expected that this kinetic energy
increases with density. In other words, as the electrons are
forced closer together, the kinetic energy increases.
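The relation between density and maximum kinetic energy is easily put into numbers. The following is a minimal sketch, assuming SI constants and writing the charge density as ρ = −ne, with n the number of electrons per unit volume, so that Eq. (3) becomes $E_{\max} = (h^2/2m)(3n/8\pi)^{2/3}$. The function names and the sample density (roughly one conduction electron per atom in copper) are assumptions introduced for illustration.

```python
import math

H = 6.62607015e-34      # Planck's constant, J s
M_E = 9.1093837015e-31  # electron mass, kg
EV = 1.602176634e-19    # joules per electron volt

def max_kinetic_energy(n):
    """E_max in joules for n electrons per cubic metre (Eq. (3) with rho = -n e)."""
    return H**2 / (2 * M_E) * (3 * n / (8 * math.pi)) ** (2 / 3)

def max_density(e_max):
    """Inverse statement (Eq. (2)): largest number density allowed for a given E_max."""
    return 8 * math.pi / (3 * H**3) * (2 * M_E * e_max) ** 1.5

n_cu = 8.5e28  # assumed: about one conduction electron per copper atom, per m^3
e_max = max_kinetic_energy(n_cu)
print(e_max / EV)            # maximum kinetic energy in electron volts
print(max_density(e_max))    # recovers the density we started from
```

Doubling the density in this sketch multiplies the energy by $2^{2/3}$, which is the statement that crowding the electrons together raises their kinetic energy.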
The second way of stating our result is the inverse: if the
maximum allowable kinetic energy of electrons is somehow
determined, then the density cannot be greater than the maxi-
mum value given by our equation. We shall shortly investigate
a simple model of atomic structure based on this use of the
theorem. There we assume electrons to be bound in the field
of the nucleus, and apply this result to them. Surely no electron
can be bound if its kinetic energy is great enough to allow it to
escape from the center of attraction. This condition places
a maximum on the possible kinetic energy of an electron in the
atom. In turn, this determines a maximum density of electricity
at every point of the atom, a density which is approximately
reached in actual atoms.
295. The Fermi-Thomas Atomic Model. — We have just
described a relation between maximum kinetic energy and maxi-
mum density of electrons. This has been proved only for free
electrons; but since the law that the maximum density of elec-
trons in phase space is $-2e/h^3$ holds even in a force field, we may
expect our relation to be general (though not exact, since it
does not take account of the fact that the quantum conditions
are not the exact formulation of wave mechanics). Fermi and
Thomas have applied this result to the problem of atomic
structure. Suppose that we have a nucleus, surrounded by a
cloud of electrons. Let the electrostatic potential at any point
within this cloud be V(r), so that the potential energy of an
electron at the corresponding point is —eV(r). The potential
will go off to zero at large distances, since the atom is uncharged
as a whole. Thus, if the total energy of an electron is negative,
it cannot escape from the field, but if it is positive it can escape.
We may be sure, then, that all electrons bound in the atom have
negative energies, so that the maximum kinetic energy allowable
for an electron at any point is eV(r). This means that the
density of charge at distance r from the center cannot exceed
$-\frac{8\pi e}{3h^3}(2meV)^{3/2}$. Let us assume that the density has just this
value. That gives us, then, a relation between charge density
p and potential V. But of course there is another relation:
the potential V is assumed to be produced by the charge itself,
according to electrostatics, and in that case we have Poisson's
equation $\nabla^2 V = -4\pi\rho$. Since V is spherically symmetrical,
the Laplacian can be written $\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dV}{dr}\right)$. Equating the two
expressions for density, we finally have
$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dV}{dr}\right) = \frac{32\pi^2 e}{3h^3}\,(2meV)^{3/2}. \tag{4}$$
This equation must, of course, be solved subject to the condition
that V for very small r behaves like the potential of the nucleus
alone (thus bringing in the atomic number) and for large r
approaches zero (thus determining that the electrons should be
sufficient in number to balance the nuclear charge). It is a
nonlinear differential equation for V, and cannot be solved
except by numerical methods. It is easy to show, however,
that by making a change of scale we can reduce the problem
of arbitrary nuclear charge to a single equation, so that the
problem can be solved once for all for all atoms. This has
been done, and it is found that the resulting potential and charge
density, while by no means accurately equal to the ones found
by more elaborate methods, still are approximately correct, and
good enough for a number of kinds of calculation. Unfortu-
nately, the method is least accurate for the outer parts of the
atom, where the density is small and actually the electrons do
not have the maximum density allowed by the exclusion princi-
ple, and since these outer parts are most important in many
applications, this method of Fermi and Thomas has not had as
wide use as it otherwise might.
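The change of scale can be made explicit. Writing $V = (Ze/r)\,\phi(x)$ with $r = bx$, Eq. (4) becomes $\phi'' = \phi^{3/2}/x^{1/2}$ provided $b = 2\pi^2 (3/32\pi^2)^{2/3}\, a_0 Z^{-1/3}$, where $a_0 = h^2/4\pi^2 m e^2$ is the Bohr radius. The following is a minimal sketch that integrates this single equation by a fourth-order Runge-Kutta method. The substitution $x = t^2$ (used to remove the singularity at the origin), the function names, and the initial slope −1.588071 (the commonly tabulated value for the neutral atom, taken as given here rather than derived) are assumptions not found in the text.

```python
import math

SLOPE0 = -1.588071  # assumed tabulated initial slope phi'(0) for the neutral atom

def scale_length(Z, a0=1.0):
    """Length b such that r = b x reduces Eq. (4) to phi'' = phi^1.5 / sqrt(x)."""
    return 2 * math.pi**2 * (3 / (32 * math.pi**2)) ** (2 / 3) * a0 / Z ** (1 / 3)

def derivs(t, phi, w):
    """With x = t^2 and w = dphi/dx: dphi/dt = 2 t w and dw/dt = 2 phi^1.5."""
    return 2 * t * w, 2 * max(phi, 0.0) ** 1.5

def thomas_fermi_phi(x_end, steps=2000):
    """Integrate from x = 0 to x_end by classical RK4 and return phi(x_end)."""
    t, phi, w = 0.0, 1.0, SLOPE0
    h = math.sqrt(x_end) / steps
    for _ in range(steps):
        k1p, k1w = derivs(t, phi, w)
        k2p, k2w = derivs(t + h / 2, phi + h / 2 * k1p, w + h / 2 * k1w)
        k3p, k3w = derivs(t + h / 2, phi + h / 2 * k2p, w + h / 2 * k2w)
        k4p, k4w = derivs(t + h, phi + h * k3p, w + h * k3w)
        phi += h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        w += h / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
        t += h
    return phi

print(scale_length(1))        # b in units of the Bohr radius for Z = 1
print(thomas_fermi_phi(1.0))  # universal screening function at x = 1
```

Since the atomic number enters only through the length b, the same function φ serves for every atom. Forward integration of this kind becomes unstable at large x, so the sketch is meant only for moderate distances from the nucleus.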
296. Electrons in Metals. — We have pointed out in another
chapter that the electrons in a metal, though they are in a field
which has intense local irregularities at the various atoms,
still on the average are in an approximately field free space.
The attractions of the nuclei are, of course, balanced by the
repulsions of other electrons, and the electrons are largely free.
We may thus approximately apply the ideas of maximum density
of electrons, etc., developed in the present chapter. Let us see
what picture of the electrons in a metal results from this.
There are an infinite number of possible stationary states for
a free electron in a metal, corresponding to different values of
momentum and kinetic energy. The energy levels are spaced
in such a way that the energy is a quadratic function of the
quantum numbers. At absolute zero, the electrons naturally
settle into the lowest energy levels they can, but just as in a
single atom, the exclusion principle operates to prevent more
than two electrons (with opposite spins) going into one stationary
state. Thus some of the electrons will have to be in very high
quantum states, and will have much kinetic energy. We have
found the value of the maximum kinetic energy, and when we
put in numerical values, assuming the number of electrons to
be of the order of magnitude of the number of atoms in the metal,
we find that this kinetic energy is of the order of magnitude of
10 volt-electrons. The electrons in a metal, then, are con-
tinuously moving about with these large energies, having speeds
far in excess of those which would result from thermal agitation
even at enormous temperatures, on ordinary statistics. They
do not ordinarily carry any current, however, since as many go
in one direction as in the opposite direction, so that the current
cancels. In the presence of an external electric field, however,
the electrons are accelerated, as shown in Chap. XXXVIII.
As we saw there, this results in an electric current. The electrons
gain only a small amount of energy and momentum in this
acceleration, since they collide with the nuclei and lose their
excess energy very soon, and the small drift velocity they acquire
in the process corresponds to the net current we observe, which
by Ohm's law is proportional to the field producing it.
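To see how far these energies exceed thermal ones, we may compare them directly. The following is a minimal sketch, assuming SI constants and nonrelativistic motion; the function names and the choice of 300 degrees as the comparison temperature are illustrative.

```python
import math

EV = 1.602176634e-19     # joules per electron volt
K_B = 1.380649e-23       # Boltzmann's constant, J/K
M_E = 9.1093837015e-31   # electron mass, kg

def equivalent_temperature(energy_ev):
    """Temperature at which kT equals the given energy."""
    return energy_ev * EV / K_B

def speed(energy_ev):
    """Nonrelativistic speed of an electron with the given kinetic energy."""
    return math.sqrt(2 * energy_ev * EV / M_E)

e_max_ev = 10.0                      # order of magnitude quoted in the text
thermal_ev = 1.5 * K_B * 300.0 / EV  # mean thermal energy (3/2)kT at 300 K

print(equivalent_temperature(e_max_ev))             # kelvins
print(speed(e_max_ev) / speed(thermal_ev))          # ratio of speeds
```

The equivalent temperature comes out above a hundred thousand degrees, which is the sense in which thermal agitation on ordinary statistics could not account for such speeds.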
It is interesting next to consider what happens to the metal
as the temperature is raised from the absolute zero. The first
effect is on the nuclear vibrations, and this affects directly the
electrical conductivity. We have really not been entirely
accurate so far in our description of the process of collision
between the electrons and the nuclei. Since the electrons can
be replaced by waves, the process is like the scattering of waves
by a set of particles. Such a problem is met in optics, and there
it can be shown, as in Sec. 184, that a medium with uniformly
spaced particles does not scatter at all; the particles, like the
atoms of a homogeneous transparent solid, affect the refractive
index, but do not scatter the light in random directions. It is
only when the particles show fluctuations of density, like the
molecules of a gas, that they scatter, and scattering is propor-
tional to the mean square deviations of the particles from
positions of uniform spacing. In a similar way, electron waves
are not scattered by uniformly spaced atoms, but only are
deflected if the atoms show nonuniformity in their arrangement.
Now at the absolute zero, the atoms of a metallic crystal are
uniformly arranged, so that the electrons are not scattered, and
the resistance goes to zero, as it experimentally does. The zero
point lattice vibration produces a nonuniform arrangement, it
is true, but it can be proved that this does not add to the resist-
ance. As the temperature increases, however, temperature
agitation results in deviations from uniform arrangement
of the atoms. The mean square deviations, and hence the
scattering and the resistance, are proportional to the tempera-
ture, explaining in a very simple way the well-known experi-
mental law giving the temperature coefficient of resistance.
In addition to this effect of temperature on nuclear vibrations,
however, we may ask if there is any effect on electronic motions.
There is such an effect, a small one. The electrons try to take
up thermal energy of agitation. Those lying in low levels of
the Fermi distribution, however, with kinetic energy much below
the maximum kinetic energy of the electrons, cannot take up
temperature energy. The reason is that in order to take any,
their energies would have to increase enough so that they would
have more kinetic energy than the maximum we have computed,
since all energy levels below that are filled already, and this
would require more energy than is available. The few highest
electrons, however, have unfilled levels directly above them,
and they can be raised to these levels by comparatively small
additions of energy, which they can secure from temperature
energy at ordinary temperatures. The situation is much as
it is in atoms. The inner electrons of an atom cannot take part
in temperature agitation, since the levels slightly above them are
already occupied, and the least energy the electrons could take
up would be enough to remove them entirely from the atom, an
energy enormous relative to the amounts available at ordinary
temperatures. On the other hand, the outer electrons can in
some cases have temperature energy, for sometimes there are
unoccupied levels lying only slightly above the highest occupied
ones, which can be reached by addition of small amounts of
energy.
It is interesting to draw diagrams of the change of electron
distribution with temperature. At the absolute zero, we have
seen that the momentum space is filled with electrons at a
uniform density up to a maximum energy $E_{\max}$. The number
of electrons with energy between E and E + dE is then propor-
tional to the volume of momentum space between these energies,
or proportional to $p^2\,dp$, the shell between p and p + dp. Sub-
stituting E proportional to $p^2$, this gives a shell of volume
$\sqrt{E}\,dE$. In other words, the number of electrons per unit
energy range is proportional to $\sqrt{E}$, up to $E_{\max}$, and above
that it is zero. This is shown in Fig. 82, curve oabc, rising accord-
ing to a square root law to $E_{\max}$, then dropping suddenly. At
a higher temperature, there are fewer electrons with energy
almost up to $E_{\max}$, but these now have energy slightly above
$E_{\max}$, as in odec. A still higher temperature would be repre-
sented by ofg, and at sufficiently high temperature so that
almost all the electrons were excited, the distribution would
finally approach a Maxwell distribution law, the exclusion
principle no longer being of importance when energies are so
great that there is small chance of finding more than one electron
in a state anyway. This situation would never be reached,
however, in an actual metal, on account of the enormous tempera-
ture required.
Fig. 82. — Fermi distribution in energy. Curve oabc is the energy distribution
at absolute zero, the other curves odec and ofg being for higher temperatures.
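The square root law at absolute zero can be illustrated by computing an average over it. The following is a minimal sketch, assuming the distribution is taken only up to a constant factor and integrated by the midpoint rule; the function names and the value chosen for the maximum energy are illustrative. The result it approaches, a mean kinetic energy of three-fifths of the maximum, follows from integrating $E\sqrt{E}$ and $\sqrt{E}$ from zero to the maximum energy.

```python
import math

def zero_temperature_distribution(e, e_max):
    """Electrons per unit energy at absolute zero, up to a constant factor."""
    return math.sqrt(e) if 0.0 <= e <= e_max else 0.0

def mean_energy(e_max, steps=100000):
    """Average of E weighted by sqrt(E) on 0..e_max, by the midpoint rule."""
    de = e_max / steps
    num = den = 0.0
    for i in range(steps):
        e = (i + 0.5) * de
        g = zero_temperature_distribution(e, e_max)
        num += e * g * de
        den += g * de
    return num / den

e_max = 10.0  # illustrative maximum energy, in any convenient unit
print(mean_energy(e_max) / e_max)  # approaches 3/5
```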
The change in distribution of the electrons is not great enough
to make a very important effect in conductivity. It is of great
importance in thermal conductivity, on the other hand, but
the place where it is most directly observed is in thermionic
emission. It has been observed experimentally that hot metals
emit electrons, the number increasing enormously with tempera-
ture, as if there were a factor $e^{-(\text{energy}/kT)}$ in the law giving emis-
sion as function of temperature, where the energy must be taken
to be of the order of magnitude of several volts, and is called
the work function. This is explained as follows. A simple
model of a metal is shown in Fig. 83, where we plot the potential
in which the electrons may be assumed to move. This is con-
stant, equal to A, throughout the metal, and the horizontal
lines symbolize the Fermi levels, filled with electrons at the
absolute zero. At the boundary, however, the potential must
rise to a value C, higher than the maximum kinetic energy of
the electrons at absolute zero, since it is observed that the elec-
trons do not escape. This is symbolized by the rise AC, and
outside the metal the potential retains the constant value C.
If now the temperature is raised sufficiently high so that there
is an appreciable number of electrons of energy C or greater,
these electrons can escape, forming those observed in thermionic
emission. The number will evidently increase very rapidly
with temperature, since those which can escape come from the
parts e and g of the curves of Fig. 82, which increase greatly
with temperature.
Fig. 83. — Potential at surface of metal. The shaded lines between A and B
represent the energy levels filled with electrons at absolute zero, according to the
Fermi distribution. The energy BC is the work function.
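The rapidity of this increase is seen at once from the exponential factor. The following is a minimal sketch, assuming a work function of 3 volts (an illustrative value of the order mentioned in the text) and ignoring every factor in the emission law other than the exponential; the function name and the two temperatures are likewise illustrative.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann's constant in electron volts per kelvin

def emission_factor(work_function_ev, temperature):
    """Exponential factor exp(-energy / kT) in the emission law."""
    return math.exp(-work_function_ev / (K_B_EV * temperature))

w = 3.0  # assumed work function in electron volts (illustrative)
ratio = emission_factor(w, 2000.0) / emission_factor(w, 1500.0)
print(ratio)  # growth of the factor between 1500 K and 2000 K
```

A rise of one-third in the absolute temperature thus multiplies the factor several hundred times in this example.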
297. The Fermi Distribution.— It is not hard to derive the
equation of the curves of Fig. 82, giving the distribution in
energy in Fermi statistics. There are many ways to do it, but
perhaps the simplest is by a reversal of the argument of Chap.
XXXII, in which we derived the Planck black-body radiation
law from Einstein's probabilities of transition. Let us first
review that argument, in the reverse direction, and then see
how it is applied in the present case. In Chap. XXXII we
assumed the Maxwell-Boltzmann distribution, and Einstein's
probabilities, and derived from them the Planck law. We
could equally well have assumed the Planck law, however, and
have derived the distribution of molecular velocities from this,
and the corresponding method will work for the Fermi as well
as the Boltzmann distribution. Let us assume then that the
density in black-body radiation of frequency $\nu$ is
$$\rho_\nu = \frac{8\pi h\nu^3}{c^3}\,\frac{1}{e^{h\nu/kT} - 1}. \tag{5}$$
Further, we assume that the probability of transition from a
state of energy $E_1$ to one of energy $E_2$, by absorption of radiation
of frequency $\nu$, where $E_2 - E_1 = h\nu$, is $B\rho_\nu$, and the probability
of emission of radiation and of transition from the state 2 to
1 is $(A + B\rho_\nu)$, where A measures spontaneous emission, the
other term forced emission, and where $A/B = 8\pi h\nu^3/c^3$. Then,
if $N_1$ are in the state 1, $N_2$ in the state 2, the number of transitions
from 1 to 2 per second, under the action of the radiation, will
be $N_1 B\rho$, and from 2 to 1 will be $N_2(A + B\rho)$. These must
be equal in thermal equilibrium, so that the net number of
systems in the various states will not change with time. Hence
we have
$$\frac{N_2}{N_1} = \frac{B\rho}{A + B\rho} = e^{-h\nu/kT}, \tag{6}$$
using the relations between A, B, and $\rho$. Thus we have
$$N_2 : N_1 = e^{-E_2/kT} : e^{-E_1/kT}, \tag{7}$$
which is the Maxwell-Boltzmann distribution.
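This reversal of the argument can be checked numerically: inserting the Planck density and the ratio $A/B = 8\pi h\nu^3/c^3$ into the balance of transitions should return the Boltzmann factor $e^{-h\nu/kT}$ whatever the value of B. The following is a minimal sketch, assuming SI constants; the function names, the frequency, and the temperature are illustrative.

```python
import math

H = 6.62607015e-34   # Planck's constant, J s
C = 2.99792458e8     # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann's constant, J/K

def planck_density(nu, temperature):
    """Black-body energy density per unit frequency (the Planck law)."""
    return 8 * math.pi * H * nu**3 / C**3 / math.expm1(H * nu / (K_B * temperature))

def population_ratio(nu, temperature, b=1.0):
    """N2/N1 from the balance N1 * B rho = N2 * (A + B rho)."""
    a = b * 8 * math.pi * H * nu**3 / C**3   # Einstein relation A/B
    rho = planck_density(nu, temperature)
    return b * rho / (a + b * rho)

nu, temperature = 5.0e14, 3000.0   # illustrative frequency (Hz) and temperature (K)
print(population_ratio(nu, temperature))
print(math.exp(-H * nu / (K_B * temperature)))
```

The two printed numbers should coincide, and changing the argument b leaves the ratio unaltered, since only A/B enters.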
Let us now ask in what way our situation with the Fermi
statistics is different. In the first place, we have a continuous
distribution of energy, rather than discrete states. This is
taken account of by taking an energy range dE instead of the
state of energy $E_1$, and another energy range dE' instead of
the second state. Let the number of electrons with energy in
dE be f(E)dE, and in dE' be f(E')dE'. Further, let the maxi-
mum possible number of electrons in dE be F(E)dE, and in
dE' be F(E')dE'. Here F represents evidently the total number
of stationary states in the interval (counting states of opposite
spin as different), since each state can have but one electron.
Now in computing the number of electrons going per second
from dE to dE', we note that we are grouping together many
possible transitions, since we have many possible pairs of sta-
tionary states. The probability of a transition in unit time
from one level in dE to one in dE' is $B\rho$ (if the level in dE' is
unoccupied) or 0 (if the level is occupied). It is in this differ-
entiation between occupied and unoccupied levels that the
Fermi statistics enter. But now we have f(E)dE levels in dE,
so that the probability of a transition in unit time from any
one of these to one particular unoccupied level in dE' is the sum
of the separate probabilities, or is $B\rho\, f(E)dE$. Finally, there are
[F(E')dE' − f(E')dE'] unoccupied levels in dE', and the chance
of having a transition to any one of these is the sum of the chances
to the individual ones, or $B\rho\, f(E)dE\,[F(E') - f(E')]dE'$. Simi-
larly the probability of a transition from dE' to dE per unit
time is $(A + B\rho)\, f(E')dE'\,[F(E) - f(E)]dE$. Equating these,
and using the same relations as before for A and B, we have
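$$f(E)\,[F(E') - f(E')]\,e^{-E'/kT} = f(E')\,[F(E) - f(E)]\,e^{-E/kT}. \tag{8}$$
Divide through by f(E)f(E'). Then we have
$$\left(\frac{F(E')}{f(E')} - 1\right)e^{-E'/kT} = \left(\frac{F(E)}{f(E)} - 1\right)e^{-E/kT} = \text{constant} = A.$$
Solving for f, then, we have
$$f(E) = \frac{F(E)}{A\,e^{E/kT} + 1}. \tag{9}$$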
This is the distribution function for the Fermi distribution,
where A is a constant independent of E (but which may depend
on temperature).
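It is instructive to confirm that the function $f(E) = F(E)/(A e^{E/kT} + 1)$ does balance the transitions between any two energy ranges, that is, that $f(E)[F(E') - f(E')]e^{-E'/kT}$ equals $f(E')[F(E) - f(E)]e^{-E/kT}$. The following is a minimal sketch, assuming for illustration a number of states F proportional to $\sqrt{E}$ and arbitrary values of A and kT; the function names and numbers are not taken from the text.

```python
import math

def number_of_states(e):
    """Assumed form of F(E), proportional to sqrt(E) as for free electrons."""
    return math.sqrt(e)

def fermi(e, a, kt):
    """Distribution f(E) = F(E) / (A exp(E/kT) + 1)."""
    return number_of_states(e) / (a * math.exp(e / kt) + 1.0)

def balance_sides(e, e_prime, a, kt):
    """Return the two sides of the balance condition for ranges at E and E'."""
    f, f_p = fermi(e, a, kt), fermi(e_prime, a, kt)
    big_f, big_f_p = number_of_states(e), number_of_states(e_prime)
    left = f * (big_f_p - f_p) * math.exp(-e_prime / kt)
    right = f_p * (big_f - f) * math.exp(-e / kt)
    return left, right

left, right = balance_sides(0.3, 1.7, a=0.5, kt=1.0)  # illustrative values
print(left, right)
```

The equality holds for any choice of F, since F cancels from the ratio F/f − 1 on each side; the square root form is used only to have definite numbers.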
The general properties of the distribution can be seen at
once from the distribution function. First, at low temperatures,
let us write A in the form $e^{-E_0/kT}$. Then we have
$$f(E) = \frac{F(E)}{e^{(E-E_0)/kT} + 1}. \tag{10}$$
For very small T, the exponential is e raised to a very large
quantity. If $E - E_0$ is positive, this quantity is e to a large
positive power, or an enormous value, so that f(E) is practically
zero. On the other hand, if $E - E_0$ is negative, it is e to a large
negative power, or practically zero, and can be neglected com-
pared with 1, leaving f(E) = F(E). In other words, $E_0$ is
our value $E_{\max}$, and as in Fig. 82 the distribution is F(E) for
energies less than $E_{\max}$, zero for greater energies. On the other
hand, for extremely high temperatures, the particles become
distributed through a wide range of high energies, so that they are
so spread out that the chance of a given level being filled is small.
Hence $f(E) \ll F(E)$, or $A e^{E/kT} \gg 1$. Then the distribution
approaches $f(E) = \text{constant} \times F(E)\,e^{-E/kT}$, or the Maxwell-Boltz-
mann law. At intermediate temperatures, however, such as
those concerned in thermionic emission, we are interested in
the distribution for energies large compared with kT. Here
we have $(E - E_0)$ large compared with kT, so that $e^{(E-E_0)/kT}$ is
large compared with unity, and we can write
$$f(E) = F(E)\,e^{-(E-E_0)/kT},$$
again the Maxwell-Boltzmann law. This is the form of distribu-
tion which, as we stated above, is indicated by the observations
on thermionic emission, and in particular the energy $E - E_0$,
where E is the minimum energy necessary to escape from the
metal, or BC in Fig. 83, is the work function. We see, there-
fore, that this is the work required to remove the most loosely
bound electron of the Fermi distribution from the metal. This
is found experimentally to be of the order of magnitude of two
or three volt-electrons, so that in Fig. 83 the energies AB and
BC are of the same order of magnitude, but AB (being about
10 volts or more) is the larger.
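The limiting forms of Eq. (10) discussed above can be checked numerically. What follows is only a minimal sketch and is not part of the original derivation: the function name `occupancy`, the value chosen for $E_0$, and the sample temperatures are all assumptions made for illustration. It evaluates the ratio $f(E)/F(E) = 1/(e^{(E - E_0)/kT} + 1)$.

```python
import math

def occupancy(E, E0, kT):
    """Fraction f(E)/F(E) of the levels at energy E that are occupied, Eq. (10)."""
    x = (E - E0) / kT
    # Guard against overflow of exp() for levels very far above E0.
    if x > 700.0:
        return 0.0
    return 1.0 / (math.exp(x) + 1.0)

E0 = 7.0  # assumed value of E_0 in volt-electrons, for illustration only

# Very low temperature: the occupancy is a sharp step at E0.
low_T = [occupancy(E, E0, kT=0.001) for E in (6.9, 7.0, 7.1)]

# Energies many kT above E0: compare with the Boltzmann form exp(-(E - E0)/kT).
kT = 0.1
E = E0 + 10 * kT
fermi = occupancy(E, E0, kT)
boltzmann = math.exp(-(E - E0) / kT)
```

With these assumed numbers the low-temperature list falls from practically 1 below $E_0$ to practically 0 above it, and ten times kT above $E_0$ the Fermi and Boltzmann values differ by less than one part in ten thousand.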
Problems
1. Taking the dimensions of a copper crystal lattice, and assuming one
conduction electron per atom, apply the Fermi method to find the maximum
kinetic energy, in volts, of the electrons at absolute zero.
2. To compress a metal at the absolute zero, we must squeeze the elec-
trons into smaller volume, therefore increase the kinetic energy, and do
work. This accounts for the larger part of the resistance to larger compres-
sions. Find the formula for pressure as a function of volume, assuming
only this repulsive effect.
3. Show that the formula for pressure as function of volume, found in
Prob. 2, is the same that one would get by ordinary gas theory for the adia-
batic compression of a gas with the same kinetic energy as the electron gas.
4. A simple model of a metal may be made by assuming the repulsion of
Prob. 2, and an ionic attraction, giving a potential inversely proportional
to the grating space. Using such a potential, determine its arbitrary coeffi-
cient by making the grating space agree with the observed value, and com-
pute the compressibility in terms of the constants of the system.
5. Apply the method of Prob. 4 to the case of copper. Compare the
resulting compressibility with experiment.
6. At a distance 0.04 atomic units (1 unit = 0.53 X 10^-8 cm.) from the nucleus,
in a rubidium atom, the potential energy of an electron is about -1540
atomic units (1 unit = 13.54 volt-electrons), and the density of charge is given by
saying that the number of electrons per unit increase of r (measuring again
in atomic units) is 42. Find how nearly this agrees with the maximum
density allowed by the Fermi-Thomas method, if the maximum total energy
of the electrons is zero.
7. Find the distribution function for the number of electrons whose x component
of momentum is between $p_x$ and $p_x + dp_x$, at any temperature, using
Fermi statistics.
8. Of all the electrons striking the surface of a metal, only those whose
momentum normal to the surface is connected with a term in the kinetic
energy greater than the work function can escape. Derive the expression
for the number per second escaping. In doing this, note that the number
striking 1 sq. cm. of the wall per second is the number contained in a cylinder
of base 1 sq. cm., slant height the vector velocity of the electron.
9. Show that if atoms and molecules obeyed Fermi statistics, the maximum
kinetic energy at absolute zero, and consequent departure from the
Maxwell-Boltzmann law at higher temperatures, would be so small that
they would not ordinarily be observed. Actually molecules do not obey
Fermi statistics, but another sort, called Bose statistics, which involves
deviations from the Maxwell-Boltzmann law of about the same amount,
though in the opposite direction, resulting in a reduction rather than an
increase of mean kinetic energy and gas pressure.
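As a rough numerical sketch of the kind of estimate called for in Prob. 1 (not a worked solution taken from the text), one may assume a face-centered cubic copper lattice with cube edge 3.61 X 10^-8 cm., four atoms per cube, one conduction electron per atom, and the free-electron expression $E_{max} = (h^2/8m)(3n/\pi)^{2/3}$ for the maximum kinetic energy at absolute zero. The lattice constant, the physical constants, and the name `fermi_energy_ev` are assumptions supplied here.

```python
import math

H_PLANCK = 6.626e-27      # Planck's constant, erg sec
M_ELECTRON = 9.109e-28    # electronic mass, gm
ERG_PER_EV = 1.602e-12    # ergs in one volt-electron

def fermi_energy_ev(n_per_cc):
    """Maximum kinetic energy (h^2/8m)(3n/pi)^(2/3), converted to volt-electrons."""
    e_erg = (H_PLANCK**2 / (8.0 * M_ELECTRON)) * (3.0 * n_per_cc / math.pi) ** (2.0 / 3.0)
    return e_erg / ERG_PER_EV

a = 3.61e-8          # assumed cube edge of the copper lattice, cm
n = 4.0 / a**3       # four atoms per cube, one conduction electron per atom
e_max = fermi_energy_ev(n)
```

Under these assumptions `e_max` comes out at about 7 volts, the order of magnitude quoted in the text for the depth AB of the occupied band.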
CHAPTER XLII
DISPERSION, DIELECTRICS, AND MAGNETISM
The most important properties of atoms are connected with
their interaction with each other, to form molecules and solids
and all sorts of material systems. But another important set
of properties is connected with the behavior of substances in
external fields, electric and magnetic. We shall consider these
in the present chapter. First is the question of electric fields,
both varying and constant. A periodically fluctuating electric
field is the same thing as a light wave, and the first problem is
dispersion, the question of the forced dipole set up in an atom
by the light wave. We shall find that this has just the form
of the forced motion of a linear oscillator under an external
sinusoidal field, which we have used earlier as a model for the
electrons in dielectrics and transparent media. This agreement
of form between the displacement of a forced dipole and of an
atom on quantum theory is the basis of the whole classical
theory of dielectrics and dispersion. Since the individual atoms
behave in the manner we have already assumed, we shall not
have to repeat the earlier analysis of the reaction of the dipoles
back on the field, and their effect in determining the index of
refraction. From dispersion we can at once pass to the behavior
of dielectrics, remembering that the polarizability and dielectric
constant are derived as limiting cases of dispersion for zero
frequency. In connection with the polarizability, we can verify
the results mentioned in Chap. XXXV about polarization
and Van der Waals' forces between atoms, questions answered
by similar mathematical methods. We then pass on to a
general discussion of dielectrics. These show dielectric prop-
erties for two reasons: because the individual atoms show
polarization, the effect mentioned before, and because molecules
can possess permanent dipole moments, which become oriented
under the action of an external electric field, producing a polariza-
tion and a contribution to the dielectric constant. This second
type of dielectric action decreases as the temperature increases,
since temperature agitation tends to prevent the necessary
orientation of the dipoles. Finally we shall pass briefly to some
magnetic properties of substances. Paramagnetic substances
have atoms and molecules with permanent magnetic dipoles,
just like the permanent electric dipoles mentioned above, and
under the action of a magnetic field these orient, so that the
theory we develop for dielectrics can be used without change for
paramagnetism.
298. Dispersion and Dispersion Electrons. — In Chap. XXIV,
we have seen that an electron of charge e, pulled to a position
of equilibrium by a linear restoring force such that its natural
frequency is $\nu_0$, is set into vibration by an electric field $E \cos 2\pi\nu t$
in the x direction. If x is the resulting displacement of the
electron, ex will be its electric moment, and this is given by
\[ ex = \frac{e^2}{4\pi^2 m}\, \frac{E \cos 2\pi\nu t}{\nu_0^2 - \nu^2}. \qquad (1) \]
The quantity $\alpha = \frac{e^2}{4\pi^2 m \nu_0^2}$, by which the field must be multiplied
to get the electric moment, in the special case of zero frequency,
is the polarizability of the vibrator. Now we have seen that
the contribution of the induced dipoles to the external field
results in a changed velocity of propagation of the wave, and
hence a refractive index. We need not go through the argument
again.
If now an atom has several types of vibrating electrons, rather
than one type only, we must add the electric moments due to
each. Thus if there are $f_1$ electrons in the atom with natural
frequency $\nu_1$, $f_2$ of frequency $\nu_2$, etc., the total moment is
\[ \frac{e^2}{4\pi^2 m} \sum_i \frac{f_i}{\nu_i^2 - \nu^2}\, E \cos 2\pi\nu t. \qquad (2) \]
Experimentally, such a formula gives a good value of the index
of refraction as function of frequency, except for the fact that
we have neglected damping, so that this formula goes to infinity
at each natural frequency, instead of merely going to large
values. To get agreement with experiment, we must assume
that the various natural frequencies $\nu_i$ are the frequencies of
light which the atom can absorb in going from the state it is in
(usually the normal state) to some other; that is, they are the
frequencies determined by the quantum theory, including not
merely the optical absorption frequencies in which a loosely
bound electron becomes excited, but also the x-ray absorption
frequencies. For instance, a sodium atom has absorption at
the various discrete frequencies connected with absorption
of the lines of the principal series (its 3s valence electron being
excited to a p state); it has continuous absorption beyond the
limit of this series, in the ultra-violet, the 3s electron being
entirely removed. But also it has absorption if one of the
2-quantum electrons is ejected (the L absorption edge), or if a
1-quantum electron is ejected (K absorption). These latter
are in the x-ray region. To make the classical formula fit the
observations, we must assume that all these frequencies, some
discrete and some distributed continuously, are connected with
oscillators of suitable frequencies $\nu_i$.
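The behavior of the classical sum of Eq. (2) can be illustrated with a short numerical sketch. It is not taken from the text: the two oscillators below (a strong one near the sodium D line and a weak one standing for the rest of the series), the constants in Gaussian units, and the name `polarizability` are assumptions chosen only to show the static limit and the change of sign on passing through a natural frequency when damping is neglected.

```python
import math

E_CHARGE = 4.803e-10    # electronic charge, e.s.u.
M_ELECTRON = 9.109e-28  # electronic mass, gm

def polarizability(nu, oscillators):
    """Moment per unit field from Eq. (2): (e^2 / 4 pi^2 m) * sum of f_i / (nu_i^2 - nu^2)."""
    prefactor = E_CHARGE**2 / (4.0 * math.pi**2 * M_ELECTRON)
    return prefactor * sum(f / (nu_i**2 - nu**2) for f, nu_i in oscillators)

# (f_i, nu_i) pairs; assumed illustrative values, not measured data.
oscillators = [(0.98, 5.09e14), (0.02, 9.0e14)]

static = polarizability(0.0, oscillators)     # zero-frequency limit
below = polarizability(5.0e14, oscillators)   # just below the first natural frequency
above = polarizability(5.2e14, oscillators)   # just above it
```

The value just below the resonance is much larger than the static value, and just above it the sign reverses, which is the undamped behavior described above.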
With this interpretation of the frequencies $\nu_i$, it is obvious
that the numbers $f_i$ of dispersion electrons of the various frequencies
cannot all be integers. For there are infinitely many
lines, and yet but a finite number of electrons in the atom. As
a matter of fact, in the principal series of sodium, the experimental
f values associated with the various lines have been
determined. For the first line (the well-known D line), the f
is almost, but not quite, unity; but the other lines of the series
are much weaker, and decrease rapidly in dispersive power as
we go down the series, so that the sum of all other f's, for both
the discrete members of the series and the continuous absorption
at the end, is only a few per cent of unity. The x-ray dispersion
terms correspond to fairly large f values, though on account
of the factor $\nu^2$ in the denominator, the effect on the refractive
index is very small in the x-ray region. As a matter of fact,
the sum of f values for the L absorptions is of the order of magnitude
of the number of electrons in the L shell, and for the K
absorptions of the order of magnitude of the number in the K
shell. The total sum of all f's for the atom, then, is of the order
of magnitude of the number of electrons in the atom. As a
matter of fact, the sum of f's proves to be exactly equal to the
number of electrons. This was first found experimentally
in the following way: the forced dipole moment determines
scattering as well as dispersion, as we have seen in Chap. XXV,
and at frequency large compared with any natural frequency,
the moment ex is equal to $-\frac{e^2}{4\pi^2 m}\, \frac{1}{\nu^2} \sum_i f_i\, E \cos 2\pi\nu t$, leading
to the Thomson scattering formula, for x-rays. Now experimentally
the Thomson formula is found to hold, if we put
in the actual number of electrons in place of $\sum_i f_i$; as a matter
of fact, this experiment was one of the earliest ways of deter-
mining the number of electrons in an atom. That number,
then, must equal the sum of all / values. The result can now
be proved theoretically as well; we shall shortly derive values
for the f's in terms of quantum theory, and the theorem can be
proved by quantum theory from these values.
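The high-frequency argument can be sketched numerically as follows. This is an illustration under stated assumptions, not data from the text: the three oscillators (roughly an optical, an L, and a K group, with f values adding to 11 as for sodium) and the function names are hypothetical. The sketch evaluates the full sum of Eq. (2) at a frequency far above every natural frequency and reads off the number of electrons that the limiting form would imply.

```python
import math

E_CHARGE = 4.803e-10    # electronic charge, e.s.u.
M_ELECTRON = 9.109e-28  # electronic mass, gm

def moment_per_field(nu, oscillators):
    """(e^2 / 4 pi^2 m) * sum of f_i / (nu_i^2 - nu^2), as in Eq. (2)."""
    prefactor = E_CHARGE**2 / (4.0 * math.pi**2 * M_ELECTRON)
    return prefactor * sum(f / (nu_i**2 - nu**2) for f, nu_i in oscillators)

def effective_electron_number(nu, oscillators):
    """Number N for which -(e^2 / 4 pi^2 m nu^2) N equals the full sum at frequency nu."""
    prefactor = E_CHARGE**2 / (4.0 * math.pi**2 * M_ELECTRON)
    return -moment_per_field(nu, oscillators) * nu**2 / prefactor

# (f_i, nu_i) pairs; assumed illustrative values only.
oscillators = [(1.0, 5.0e14), (8.0, 1.0e16), (2.0, 2.6e17)]
total_f = sum(f for f, _ in oscillators)
n_eff = effective_electron_number(1.0e19, oscillators)
```

At the assumed frequency of 10^19 per second the effective number agrees with the sum of the f's to about one part in ten thousand, which is the sense in which x-ray scattering counts the electrons.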
299. Quantum Theory of Dispersion. — We shall now derive
from wave mechanics the forced dipole moment set up by a
vibrating field, and show that it has the same form as the classical
value. The first problem is to consider the perturbation of the
atomic wave function produced by the external field. This has
already been investigated in Chap. XXXII, in connection with
the absorption of radiation, which, of course, is intimately connected
with the dispersion. In that chapter the following problem
was solved: assuming that $u_m^0$ was the wave function, $E_m^0$
the energy level, of a problem with the Hamiltonian $H^0$, we added
an external field, of which we here take the term of one frequency
only, so that the whole Hamiltonian is $H = H^0 - exE_\nu \cos 2\pi\nu t$
(where here the phase factor in the external field is neglected).
We showed that the perturbed wave function was $\psi = \sum_m C_m(t)\, u_m^0(x)$,
where $C_m = c_m(t)\, e^{-\frac{2\pi i}{h} E_m^0 t}$, and where expressions were
derived for the c's as functions of time [see Eq. (16), Chap.
XXXII]. Rather than writing these formulas in the general
case, we make several specializations. First, it will be assumed
that the atoms are practically all in the state 0, so that squares
of C's connected with other states can be neglected. Further,
we neglect certain constant terms in the c's, which do not vibrate
in the same frequency as the external field, and hence cannot
contribute to the dispersion. Lastly we give formulas for the
C's rather than the c's. When this is done, we have
\[ C_m = \frac{(ex)_{m0} E_\nu}{2} \left( \frac{e^{2\pi i\nu t}}{h\nu + E_m^0 - E_0^0} - \frac{e^{-2\pi i\nu t}}{h\nu - E_m^0 + E_0^0} \right) e^{-\frac{2\pi i}{h} E_0^0 t}, \qquad (3) \]
where $(ex)_{m0}$ is the matrix component of the electric moment in
the x direction associated with the transition from state 0 to m.
Using these values of the $C_m$'s, and the value unity for $C_0$, we
can compute $\psi$, and from it find the mean value of the induced
electric moment in the x direction, which is $\int \bar\psi\, (ex)\, \psi\, dv$. In the
double sum coming from multiplying $\psi$ by its conjugate, there
are three types of terms: first, the term unity coming from the
terms $C_0$ of both $\psi$ and $\bar\psi$, which gives nothing because (ex) has
no diagonal matrix component corresponding to the state 0;
second, terms proportional to the C's, coming from one factor of
unity and one other C; these are the essential terms which we
consider; and third, terms in $C^2$, which we neglect. In each
term, we must integrate a quantity like $\bar u_0 (ex) u_m$, giving an additional
factor $(ex)_{m0}$. When we carry out these integrations, and
combine conjugate exponentials to get cosines, we have easily
\[ \int \bar\psi\, (ex)\, \psi\, dv = \sum_m (ex)_{m0}^2 \left( \frac{1}{h\nu + E_m^0 - E_0^0} - \frac{1}{h\nu - E_m^0 + E_0^0} \right) E_\nu \cos 2\pi\nu t \]
\[ = \sum_m \frac{2 (ex)_{m0}^2\, \nu_{m0}}{h}\, \frac{1}{\nu_{m0}^2 - \nu^2}\, E_\nu \cos 2\pi\nu t, \qquad (4) \]
where $h\nu_{m0} = E_m^0 - E_0^0$. This is the same as the classical formula,
if we set $f_{m0} = \frac{8\pi^2 m}{e^2 h} (ex)_{m0}^2\, \nu_{m0}$.
It is interesting to note that this expression for the number of
dispersion electrons can be written in terms of the probability
coefficient B determining the probability of absorption. This, as
found in Chap. XXXII, is equal to $B_{m0} = \frac{8\pi^3}{h^2} (ex)_{m0}^2$. Thus we
have $f_{m0} = \frac{m}{\pi e^2} B_{m0}\, h\nu_{m0}$. It is reasonable that we should be able
to do this, on account of the intimate relation between absorption
and dispersion. As a matter of fact, measurements on dispersion
are often used to determine the B's, particularly in the visible
range, and the converse is sometimes done as well, as in the x-ray
region, where the absorption, and B, are easy to measure, but the
dispersion is so small as to be difficult to get accurately.
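The quantum expression for the number of dispersion electrons, $f = \frac{8\pi^2 m}{e^2 h}(ex)^2 \nu$, can be tried on the one case where the classical and quantum pictures should agree exactly, the linear oscillator. The sketch below is an illustration under assumptions of our own: units in which m, $h/2\pi$, and the angular frequency are all unity, the standard oscillator wave functions for the two lowest states, and a simple trapezoidal integration in the hypothetical helper `matrix_element_x`. It computes the matrix component of x between states 0 and 1 and the resulting f.

```python
import math

def u0(x):
    """Normal-state oscillator wave function in units m = hbar = omega = 1."""
    return math.pi ** -0.25 * math.exp(-0.5 * x * x)

def u1(x):
    """First excited oscillator wave function in the same units."""
    return math.sqrt(2.0) * math.pi ** -0.25 * x * math.exp(-0.5 * x * x)

def matrix_element_x(ua, ub, x_max=10.0, steps=20000):
    """Integral of ua(x) * x * ub(x) dx from -x_max to x_max, trapezoidal rule."""
    dx = 2.0 * x_max / steps
    total = 0.0
    for i in range(steps + 1):
        x = -x_max + i * dx
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * ua(x) * x * ub(x)
    return total * dx

m = hbar = omega = 1.0
h = 2.0 * math.pi * hbar
nu_10 = omega / (2.0 * math.pi)      # h * nu_10 = E_1 - E_0
x_10 = matrix_element_x(u0, u1)      # matrix component of x; the charge e cancels in f
x_00 = matrix_element_x(u0, u0)      # diagonal component, expected to vanish
f_10 = 8.0 * math.pi**2 * m * nu_10 * x_10**2 / h
```

Within the accuracy of the integration, `f_10` is unity and the diagonal component vanishes, so that a single oscillator carries exactly one dispersion electron, as the classical model assumes.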
300. Polarizability. — Setting the external frequency equal to
zero, we have
\[ \alpha = 2 \sum_m \frac{(ex)_{m0}^2}{h\nu_{m0}} = 2 \sum_m \frac{(ex)_{m0}^2}{E_m^0 - E_0^0} \qquad (5) \]
as the polarizability of the atom. This formula can easily be
found directly by second-order perturbation theory, without
going through all the calculation we have made. First we notice
that the energy of a dipole of polarizability $\alpha$, in a field E, is
$-(\alpha/2)E^2$. To prove this, we note that it requires no work to
put the unpolarized dipole in the field. When it is polarized,
the internal energy of the dipole, connected with the restoring
force, becomes $(\alpha/2)E^2$, or half the product of force (eE) and
displacement ($\alpha E/e$). But at the same time the potential energy
in the external field becomes $-\alpha E^2$, equal to the force eE times
displacement $\alpha E/e$, with negative sign because the charge is being
pulled in the direction of the field, and therefore to a lower potential.
Adding the two, the result is $-\frac{\alpha}{2}E^2$, as we have stated.
We can now directly compute the same energy for the atom by
perturbation theory, and by comparison derive the polarizability.
The perturbative energy from a field E is $-eEx$. We now
have, from Chap. XXXII, the expression
\[ E_n = H_{nn} - \sum_{k \neq n} \frac{H_{nk} H_{kn}}{H_{kk} - H_{nn}} \]
for the energy of the system, correct to perturbations of the
second order, where $H_{nk}$, etc., are matrix components of the whole
energy. Assuming that the unperturbed energy has a diagonal
matrix, and that its value connected with the normal state (whose
polarizability we compute) is $E_0^0$, we have
\[ E_0 = E_0^0 - \sum_k \frac{H_{k0}^2}{E_k^0 - E_0^0}. \]
In this expression, we have noted that the perturbative energy
has no diagonal terms, so long as the atom has no dipole moment.
This is always true with the normal states of atoms. In excited
states, it is in some cases possible to set up wave functions which
have a dipole moment, the electrons spending more time on one
side of the nucleus than on the other, and in these cases there can
be a first-order effect of the electric field on the energy levels.
This is observed in the spectrum, as a displacement of the lines
under the action of an electric field, and is called a first-order
Stark effect. Most atoms, however, show only a second-order
Stark effect, the term we have found proportional to the square
of the small matrix components $H_{k0}$, and therefore to the square
of the field. The only exceptions prove to be hydrogen, and
hydrogen-like states of other atoms.
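The second-order formula can be tested on the simplest possible case, a system with only two levels, where the energy can also be found exactly. The following sketch is an illustration under assumptions of our own: the level energies, the matrix component `d` (standing for $(ex)_{10}$), and the field are arbitrary numbers in arbitrary units, and the function names are hypothetical. The perturbative energy $-eEx$ has then the nondiagonal component $-dE$ and no diagonal components.

```python
import math

def exact_ground_energy(E0, E1, d, field):
    """Lower root of the 2x2 energy matrix [[E0, -d*field], [-d*field, E1]]."""
    mean = 0.5 * (E0 + E1)
    half_gap = 0.5 * (E1 - E0)
    return mean - math.sqrt(half_gap**2 + (d * field) ** 2)

def second_order_energy(E0, E1, d, field):
    """E0 minus the square of the nondiagonal component over the energy difference."""
    return E0 - (d * field) ** 2 / (E1 - E0)

E0, E1, d = 0.0, 1.0, 0.3   # assumed values, arbitrary units
field = 0.01                # small compared with (E1 - E0) / d

exact = exact_ground_energy(E0, E1, d, field)
approx = second_order_energy(E0, E1, d, field)

# Reading the shift as -(alpha/2) E^2 gives the one-term form of Eq. (5).
alpha = 2.0 * d**2 / (E1 - E0)
```

For the small field assumed, the exact and second-order energies differ only in the fourth order of the field, and the shift is $-(\alpha/2)E^2$ with $\alpha = 2d^2/(E_1 - E_0)$.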
Inserting now the value of the matrix of the perturbative
energy, our expression becomes
\[ E_0 = E_0^0 - \sum_k \frac{(ex)_{k0}^2\, E^2}{E_k^0 - E_0^0} = E_0^0 - \frac{\alpha}{2} E^2, \qquad (6) \]
leading to the same value of polarizability as before. It is inter-
esting to write the polarizability in terms of the number of dis-
persion electrons. Then it becomes
\[ \alpha = \frac{e^2}{m} \left( \frac{h}{2\pi} \right)^2 \sum_k \frac{f_k}{(E_k^0 - E_0^0)^2}. \qquad (7) \]
In this formula, the polarizability is expressed in terms of quanti-
ties all of which can be determined from experiment. It can be
used, therefore, as a direct test of the theory; or it can be used to
find unknown polarizabilities from known numbers of dispersion
electrons and energy levels, or vice versa.
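As a consistency check on Eq. (7), we may take the case of a single transition with f = 1 and energy difference $h\nu_0$, for which the formula should reduce to the classical polarizability $e^2/4\pi^2 m \nu_0^2$ of Sec. 298. The sketch below assumes Gaussian units, a natural frequency near that of the sodium D line, and function names of our own choosing.

```python
import math

E_CHARGE = 4.803e-10    # electronic charge, e.s.u.
M_ELECTRON = 9.109e-28  # electronic mass, gm
H_PLANCK = 6.626e-27    # Planck's constant, erg sec

def alpha_from_f(levels):
    """Eq. (7): (e^2/m) (h/2 pi)^2 * sum of f_k / (E_k - E_0)^2, levels = [(f_k, dE_k)]."""
    hbar = H_PLANCK / (2.0 * math.pi)
    return (E_CHARGE**2 / M_ELECTRON) * hbar**2 * sum(f / dE**2 for f, dE in levels)

def alpha_classical(nu0):
    """Static polarizability of the classical vibrator, e^2 / (4 pi^2 m nu0^2)."""
    return E_CHARGE**2 / (4.0 * math.pi**2 * M_ELECTRON * nu0**2)

nu0 = 5.09e14                                  # assumed natural frequency, per second
alpha_q = alpha_from_f([(1.0, H_PLANCK * nu0)])
alpha_c = alpha_classical(nu0)
```

The two values agree identically, and with the assumed frequency both are a few times 10^-23 cc., the usual order of magnitude of atomic polarizabilities.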
301. Van der Waals' Force. — The Van der Waals' force which
we have discussed in Chap. XXXV can be derived by second-
order perturbation theory, in a way similar to that which we have
used in discussing polarizability. In this case, instead of con-
sidering the perturbation of an atom by an external field, we
consider its perturbation by another atom. Let us imagine two
atoms at distance R, and investigate their mutual perturbations.
If the first forms instantaneously a dipole of moment $\mu$, this will
produce at the second a field proportional to $\mu/R^3$, times a function
of the angle between $\mu$ and the line joining the atoms. This
will produce a perturbative energy proportional to the instantaneous
dipole $\mu'$ of the second, times the field, or $\mu\mu'/R^3$ times
functions of the angles of both dipoles. This then is the perturbative
energy which we must use in our perturbation problem.
As with the polarization by a constant field, the average value
vanishes, and we must take second-order perturbations. For
this, we need the nondiagonal matrix components of the quantity
$\mu\mu'$, with the functions of angle. If we consider the transition
from the normal state to the state in which the first atom is in
the kth state, the second the l'th state, the matrix component is
the product of the 0k component of $\mu$, and the 0l' component of
$\mu'$, together with the result of integrating the functions of angle.
On account of the spherical symmetry of each atom, the matrix
component of $\mu$ can be determined from that for (ex), and similarly
that of $\mu'$ can be found from that for (ex'), the numerical
factors coming from the angles. The final result, then, of taking
the matrix component and averaging over angles is to replace the
matrix component of $\mu\mu'/R^3$ by constant times $(ex)_{k0}(ex')_{l'0}/R^3$,
where the constant can be determined by carrying through the
computation. Next we need the diagonal elements of the total
energy; these are simply the sums of the unperturbed energies of
the two atoms. We finally have for the perturbed energy, then,
\[ -\sum_{k,l'} \frac{\text{constant}}{R^6}\, \frac{(ex)_{k0}^2\, (ex')_{l'0}^2}{E_k^0 + E_{l'}^0 - E_0^0 - E_{0'}^0}. \]
When the whole computation is carried out, the constant proves
to be 6.
In most cases, one transition, or one group of transitions, proves
to make the major contribution to the summations for either the
polarizability or Van der Waals' force. Thus for sodium we have
pointed out that the D line, the first line of the principal series,
has an f value far greater than the other lines; we should make
small error if we left the others entirely out of account. Simi-
larly for an inert gas, or an ion in the form of a closed shell, the
first line of the absorption series has such a large frequency that
it is nearly as great as the ionization potential, and all the essen-
tial terms are concentrated close to the ionization potential. In
either case, then, we may approximately replace the summation
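by its one principal term. If we let $\Delta E$ represent the energy
difference between the lowest state and this particular excited
state, we have for the polarizability
\[ \alpha = \frac{2 (ex)^2}{\Delta E}. \]
In this case, the Van der Waals' potential is
\[ -\frac{6}{R^6}\, \frac{\alpha\alpha'\, \Delta E\, \Delta E'}{4(\Delta E + \Delta E')} = -\frac{3}{2}\, \frac{1}{R^6}\, \frac{\Delta E\, \Delta E'}{\Delta E + \Delta E'}\, \alpha\alpha'. \qquad (8) \]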
This is a useful formula expressing the Van der Waals' potential
between any pair of atoms in terms of the principal energy differ-
ences of the two (which can be estimated from the spectrum)
and the polarizabilities, which can be observed or calculated.
In case the two atoms are alike, this reduces to
\[ -\frac{3}{4}\, \frac{1}{R^6}\, \alpha^2\, \Delta E, \qquad (9) \]
a useful expression which proves to be fairly accurate experimentally.
We may rewrite this last formula in terms of the expression
for $\alpha$, obtaining
\[ -\frac{3}{R^6}\, \frac{(ex)^4}{\Delta E}, \]
a formula equivalent to the one $-\text{constant}/R^6$ of Sec. 254,
Chap. XXXV.
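Equations (8) and (9) lend themselves to direct computation. The sketch below is a hedged illustration: the function names are our own, and the numerical values (a polarizability of 1.64 X 10^-24 cc., an energy difference of about 15.8 volts, and a separation of 3.8 X 10^-8 cm., roughly appropriate to an inert gas) are assumed for illustration rather than taken from the text.

```python
def london_potential(alpha1, alpha2, dE1, dE2, R):
    """Eq. (8): -(3/2) (1/R^6) [dE1 dE2 / (dE1 + dE2)] alpha1 alpha2."""
    return -1.5 * alpha1 * alpha2 * dE1 * dE2 / ((dE1 + dE2) * R**6)

def london_potential_like(alpha, dE, R):
    """Eq. (9), two like atoms: -(3/4) (1/R^6) alpha^2 dE."""
    return -0.75 * alpha**2 * dE / R**6

# Assumed illustrative values in Gaussian units.
alpha = 1.64e-24          # polarizability, cc.
dE = 15.8 * 1.602e-12     # principal energy difference, ergs
R = 3.8e-8                # separation of the atoms, cm.

general = london_potential(alpha, alpha, dE, dE, R)
like = london_potential_like(alpha, dE, R)
ratio_at_double_distance = london_potential_like(alpha, dE, 2.0 * R) / like
```

For like atoms the general formula reduces to Eq. (9), and doubling the distance reduces the potential by the factor 64, as the inverse sixth power requires.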
302. Types of Dielectrics. — All atoms are polarizable, and as
a result every substance acquires a dipole moment under the
action of an external electric field, and shows dielectric properties.
These properties can be found from the polarizabilities we have
computed, the dielectric constant being given by $\epsilon = 1 + 4\pi \sum \alpha$,
where the summation is over all atoms in a cubic centimeter, or
more accurately by $\frac{\epsilon - 1}{\epsilon + 2} = \frac{4\pi}{3} \sum \alpha$. This dielectric constant is
obviously independent of temperature. But in addition, as we
have mentioned above, there is a dielectric effect on account of
orientations of permanent dipoles in the field. While atoms never
have permanent dipole moments, molecules often do, as for
instance NaCl, in which naturally the Na in general is positively
charged, the Cl negatively. The moments tend to set themselves
parallel to the field. In a solid this is hardly possible (with a
few exceptions in which water molecules are free to rotate within
the solid), but in liquids and gases the molecules are free to orient
themselves, and a polarization results from the process. There
are several features which distinguish this form of dielectric
action from the other. First, it is dependent on temperature, as
we shall prove, decreasing on account of temperature agitation
at high temperature. Second, it shows quite a different depend-
ence on frequency from the other sort. The molecules are fairly
heavy and hard to rotate. Thus while low frequency alternating
fields as well as static fields are able to orient the dipoles, higher
frequency fields are too fast for them, and this type of dielectric
action drops out, usually becoming negligible in the wave length
range of a few centimeters, so that for visible light there is no
trace of it. An example is water, with an enormously large
static dielectric constant, on account of the large moments of its
molecules, but with a perfectly normal refractive index, the dipole
effect being entirely ineffective at optical frequencies.
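The difference between the simple formula $\epsilon = 1 + 4\pi \sum \alpha$ and the more accurate one $(\epsilon - 1)/(\epsilon + 2) = (4\pi/3) \sum \alpha$ for the part of the dielectric constant arising from atomic polarizability can be seen in a short computation. This is a sketch only: the function names and the two sample values of $\sum \alpha$ per cubic centimeter (one of the order met in a gas at ordinary pressure, one of the order met in a dense substance) are assumptions of ours.

```python
import math

def epsilon_simple(n_alpha):
    """epsilon = 1 + 4 pi * (sum of alpha over the atoms in one cc.)."""
    return 1.0 + 4.0 * math.pi * n_alpha

def epsilon_accurate(n_alpha):
    """Solve (eps - 1)/(eps + 2) = (4 pi / 3) * n_alpha for eps."""
    x = 4.0 * math.pi * n_alpha / 3.0
    return (1.0 + 2.0 * x) / (1.0 - x)

gas = 2.69e19 * 1.64e-24   # assumed: atoms per cc. times polarizability
dense = 0.05               # assumed value for a dense substance

gas_pair = (epsilon_simple(gas), epsilon_accurate(gas))
dense_pair = (epsilon_simple(dense), epsilon_accurate(dense))
```

For the gas the two formulas agree to about one part in ten million, while for the dense case the more accurate formula gives a distinctly larger dielectric constant.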
303. Theory of Dipole Orientation. — The mean electric
moment resulting from dipole orientation is easily found by
application of the Maxwell-Boltzmann distribution. If it
were not for the field, the number of molecules with their dipole
moments pointing along the directions contained within a
certain solid angle would simply be proportional to the solid
angle, there being no preferred direction. That is, if $\theta$ is the
angle between the axis of the moment and the z axis, the number
between $\theta$ and $\theta + d\theta$ would be proportional to the solid angle
between these two angles, or to $2\pi \sin\theta\, d\theta$. If, however, there
is an electric field along the z axis, there will be a potential energy
depending on $\theta$: a given molecule will have a potential energy
equal to the negative of the field times the projection of the
moment along the field, or $-\mu E \cos\theta$, where E is the field.
Then by the Maxwell-Boltzmann law, the number having direction
between $\theta$ and $\theta + d\theta$ will no longer be proportional merely
to the solid angle, but to this times $e^{-(\text{energy}/kT)}$. In other words,
the number between $\theta$ and $\theta + d\theta$ is proportional to
\[ e^{\frac{\mu E}{kT} \cos\theta}\, 2\pi \sin\theta\, d\theta. \qquad (11) \]
We may now use this distribution function to find the mean
value of $\mu \cos\theta$, the component of moment along the z axis.
This is evidently
\[ \mu\, \frac{\int e^{\frac{\mu E}{kT} \cos\theta} \sin\theta \cos\theta\, d\theta}{\int e^{\frac{\mu E}{kT} \cos\theta} \sin\theta\, d\theta}, \]
where the integrations are from 0 to $\pi$. To carry out the integration,
let $\mu E/kT = c$, $\cos\theta = y$, so that $\sin\theta\, d\theta = -dy$. Then,
noting that y goes from 1 to -1, we have for the mean moment
along z,
\[ \mu\, \frac{\int_{-1}^{1} e^{cy}\, y\, dy}{\int_{-1}^{1} e^{cy}\, dy} = \mu \left( \frac{e^c + e^{-c}}{e^c - e^{-c}} - \frac{1}{c} \right) = \mu \left( \coth c - \frac{1}{c} \right). \qquad (12) \]
Ordinarily we are interested only in temperatures high enough,
or electric fields small enough, so that c is a small number. Then
approximately the mean moment is the value found by expanding
our expression, which proves easily to be $\frac{1}{3}\mu c = (\mu^2/3kT)E$.
This then gives a contribution to the total polarizability, and
to the dielectric constant, inversely proportional to the tempera-
ture, as we have said. This temperature variation can be used
experimentally to separate the two sorts of dielectric action;
a plot of $(\epsilon - 1)/(\epsilon + 2)$ against 1/T should give a straight line,
the intercept at 1/T = 0 giving the contribution of the polarizability
of the molecules, and the slope determining the permanent
dipole moment. Such measurements are used to find dipole
moments experimentally, and in turn this gives information
about the structure of the molecules.
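The result of Eq. (12) and its expansion for small c can be verified numerically. The sketch below is illustrative only, with function names of our own: it compares the closed form $\coth c - 1/c$ with a direct numerical evaluation of the ratio of integrals over $y = \cos\theta$, and with the small-c value c/3 that leads to the term $(\mu^2/3kT)E$.

```python
import math

def langevin(c):
    """coth(c) - 1/c, the mean value of cos(theta) from Eq. (12)."""
    if abs(c) < 1e-4:
        return c / 3.0  # leading term of the series; avoids loss of accuracy
    return 1.0 / math.tanh(c) - 1.0 / c

def mean_cos_numeric(c, steps=20000):
    """Ratio of the integrals of y*exp(c*y) and exp(c*y) over -1 < y < 1, midpoint rule."""
    dy = 2.0 / steps
    num = 0.0
    den = 0.0
    for i in range(steps):
        y = -1.0 + (i + 0.5) * dy
        w = math.exp(c * y)
        num += w * y
        den += w
    return num / den

closed_form = langevin(2.0)
numeric = mean_cos_numeric(2.0)
small_c = langevin(0.01)     # should be close to 0.01 / 3
large_c = langevin(50.0)     # approaches 1: all dipoles lined up
```

For small c the mean moment is proportional to the field, while for large c it approaches the full moment $\mu$, the saturation value.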
304. Magnetic Substances.— We shall give only a short
summary of the magnetic properties of substances. In the first
place, there are two fundamentally different types of magnetic
behavior, as there are for dielectrics. The first is diamagnetism;
any system containing electrons has induced currents set up
in it when a magnetic field is impressed, and these currents in
turn act like magnetic dipoles, always opposing the original field,
and therefore producing a small negative contribution to the
magnetic susceptibility, the magnetic analogue to polarizability,
defined as magnetic moment per unit volume per unit field.
The other is paramagnetism, the result of the orientation of
permanent magnetic dipoles. We shall not work out the theory
of diamagnetism, but shall merely state that the diamagnetic
susceptibility of an atom is proportional to the mean square
radius of its electrons, and measurement of this quantity is
useful in checking calculations of atomic structure, since the
mean square radius can be easily computed from an atomic wave
function. As we have stated, all substances show diamagnetism,
and only those containing permanent magnetic dipoles show
paramagnetism. Where the latter occurs, however, it is almost
always large enough to mask the diamagnetism, and leave a
net positive susceptibility, and a magnetic permeability greater
than unity.
Permanent magnetic dipoles result from two things, orbital
motion, and electron spin. In single atoms, both are of impor-
tance. We have seen that the orbital angular momentum
$L\frac{h}{2\pi}$ and spin angular momentum $S\frac{h}{2\pi}$ of an atom unite to form
the vector sum $J\frac{h}{2\pi}$. Each of these vectors carries with it a
magnetic moment, but the magnetic moment associated with
the orbital angular momentum proves to be e/2mc times the
angular momentum, while that associated with spin is e/mc
times its angular momentum. Hence the total magnetic moment
is not simply proportional to J. It turns out that the atom may
be considered to rotate about J, and only the component of
magnetic moment in this direction is of significance. This
component can then be written $Jg\frac{h}{2\pi}\frac{e}{2mc}$, where g is 1 if the
angular momentum arises entirely from orbital motion, 2 if
it is entirely from spin, and in between for intermediate cases.
Now J is quantized in space, so that its component along the
preferred direction, in this case the direction of the magnetic
field, is $M\frac{h}{2\pi}$. Hence the component of magnetic moment of the
atom in the direction of the field is $Mg\frac{eh}{4\pi mc}$. It is this component
which contributes to the magnetic susceptibility. It is also
responsible for an observable effect in the spectrum: in the field
H, the energy shift is $Mg\frac{eh}{4\pi mc}H$, and this shift in energy levels
shifts and splits up the lines, removing the degeneracy in the
matter of orientation. This effect of magnetic fields on spectra
is the Zeeman effect, and it has been of great use in developing
the theory of spectra, principally on account of the complicated
but definite way in which g depends on L and S.
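The size of the Zeeman shift is easily estimated. The following sketch assumes that the energy shift is $Mg(eh/4\pi mc)H$, with the constants in Gaussian units; the field of 10,000 gauss, the case of a single spin (g = 2, M = 1/2 and -1/2), and the name `zeeman_shift` are assumptions made for illustration.

```python
import math

E_CHARGE = 4.803e-10    # electronic charge, e.s.u.
M_ELECTRON = 9.109e-28  # electronic mass, gm
H_PLANCK = 6.626e-27    # Planck's constant, erg sec
C_LIGHT = 2.998e10      # velocity of light, cm. per sec.

def zeeman_shift(M, g, H):
    """Energy shift M g (e h / 4 pi m c) H in ergs, with the field H in gauss."""
    magneton = E_CHARGE * H_PLANCK / (4.0 * math.pi * M_ELECTRON * C_LIGHT)
    return M * g * magneton * H

field = 1.0e4  # assumed field, gauss
shifts = [zeeman_shift(M, 2.0, field) for M in (-0.5, 0.5)]
splitting_ev = (shifts[1] - shifts[0]) / 1.602e-12
```

With these assumed values the two levels are separated by roughly 10^-4 volt, very small compared with the energy differences of a few volts between optical levels, which is why the effect appears as a fine splitting of the lines.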
Since the magnetic energy of atoms depends on the orientation
of J in space, there will be a preference for the atoms to line up
with the field, just as there was with dipoles in the dielectric case.
Again we can use the Maxwell-Boltzmann law to find the mean
moment parallel to the field. Since the orientation is quantized,
we should properly use the quantum statistics, but careful
analysis shows that the final result comes out in important cases
the same as the classical case we have already investigated.
Hence we can simply use the same formula we did for dielectrics.
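The statement that the quantized orientations lead to the same result as the classical calculation can be illustrated for weak fields. The sketch below rests on assumptions that are not stated in the text: that the state of lower energy is the one with its moment parallel to the field, and that the classical moment to be compared is given by $\mu^2 = g^2 J(J+1)(eh/4\pi mc)^2$, which is the standard identification. Moments are measured in units of $eh/4\pi mc$, x denotes $(eh/4\pi mc)H/kT$, and the function names are our own.

```python
import math

def mean_moment_quantized(J, g, x):
    """Mean of g*M over M = -J, ..., J with Boltzmann weights exp(g*M*x)."""
    count = int(round(2 * J)) + 1
    Ms = [-J + i for i in range(count)]
    weights = [math.exp(g * M * x) for M in Ms]
    return sum(g * M * w for M, w in zip(Ms, weights)) / sum(weights)

def mean_moment_classical(mu, x):
    """Weak-field classical mean moment mu^2 x / 3, as for dipole orientation."""
    return mu**2 * x / 3.0

J, g = 2.5, 2.0                      # assumed values for illustration
x = 1.0e-3                           # weak field or high temperature
mu = g * math.sqrt(J * (J + 1.0))    # assumed identification of the classical moment

quantized = mean_moment_quantized(J, g, x)
classical = mean_moment_classical(mu, x)
saturated = mean_moment_quantized(J, g, 50.0)   # very strong field: approaches g*J
```

In the weak-field case the quantized and classical means agree closely, while in a very strong field the quantized mean approaches gJ, the largest component allowed by the space quantization.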
In the magnetic properties of solids, there are various possi-
bilities. The general situation is that the orbital motions are
so interfered with by neighboring atoms that there is no longer an
orbital angular momentum which stays constant and is quantized.
The only remaining magnetic moment then arises from the spins,
and this produces the paramagnetism. Thus in the iron group,
the ions of the metals show large multiplicity in their spectra.
As a result, they must have large spins, and these substances
show paramagnetism, both when the ions are in water solution,
and when they are in crystalline salts. The magnitude of the
paramagnetic moment can be found from magnetic measure-
ments, and it is found to agree with the hypothesis that it arises
entirely from spins. On the other hand, in the rare earths,
we recall that the group of 4f electrons which is being built
up lies far beneath the surface of the atom. These electrons,
then, are relatively unaffected by other atoms, and it appears
that they have quantized orbital as well as spin angular momen-
tum, and that this takes part in the paramagnetism.
The most spectacular magnetic property is ferromagnetism,
as exhibited in iron. In this case the spins alone are oriented.
But here a new feature enters, quite different from what is found
in paramagnetism. It appears from experimental evidence
that iron, even when not magnetized as a whole, still consists
of a great many small grains, small compared with ordinary
dimensions but large compared with atomic sizes, perhaps a
few hundred atoms on a side, each of which acts like a permanent
magnet. The process of magnetizing the iron as a whole con-
sists of bringing the moments of these grains into parallelism
with the field; for it appears that while a grain always has a
magnetic moment, this moment can be rotated with respect
to the grain. Magnetic saturation appears when all the grains
have their moments parallel. Hysteresis is a result of the fact
that the grains do not like to change their orientation, interposing
a sort of friction to rotation, so that they can retain a magnetic
moment even in the absence of a field. These conspicuous
properties, then, are a result of the grain-like structure of the
metal. One of the most convincing pieces of evidence for this
structure is found in the Barkhausen effect, an effect observed
when the magnetization is measured very accurately as a func-
tion of the magnetizing field, while the field is being applied.
It is found that the magnetic moment increases in jumps, rather
than continuously, each jump corresponding to the orientation
of the moment of a whole grain.
The more difficult problem connected with ferromagnetism
proves to be, not the explanation of the magnetization curve,
but the question why the individual grains are permanently
magnetized. Qualitatively it seems to be because the spins in
adjacent atoms are coupled together, so that they wish to set
themselves parallel to each other. But the magnetic forces
between spins, which of course would tend to produce just such
a lining up, prove to be many times too small to account for the
effect. The order of magnitude of the necessary energy can
be computed, and it proves to be comparable with the electro-
static binding energies observed in molecular structure. It
has therefore been suggested by Heisenberg that the problem is
essentially like one in molecular binding. We recall that in
discussing the structure of H₂ by the method of Heitler and
London we found two levels, a singlet and a triplet, the singlet
lying lower because the exchange integral (ab/H/ba) was nega-
tive. Heisenberg assumes that in the case of the magnetically
active electrons of iron the corresponding integral is positive,
so that the triplet lies below the singlet. In other words, it is
more stable for the two spins to be parallel than for them to
be opposite. If the same sort of thing were true all through a
grain of the crystal, all the spins would tend to line up, and the
energies involved would be large, comparable with binding
energies, as they are experimentally. There is difficulty in
supposing that the explanation is as simple as this, but it seems
certain in any case that the forces making the spins of a single
grain parallel to each other are in general of the nature of elec-
trostatic, molecular interactions.
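The dependence on the sign of the exchange integral can be put schematically. As a rough sketch only, neglecting the overlap of the two atomic wave functions, and writing C for the Coulomb-type part of the energy and A for the exchange integral (both symbols are introduced here for illustration and are not notation taken from the text), the two levels of the two-electron problem are approximately

```latex
E_{\text{singlet}} \approx C + A, \qquad
E_{\text{triplet}} \approx C - A, \qquad
E_{\text{triplet}} - E_{\text{singlet}} \approx -2A .
```

With A negative, as in the hydrogen molecule, the singlet lies lower; with A positive, as Heisenberg assumes for iron, the triplet, with parallel spins, lies lower by about 2A.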
Problems
1. Compute the index of refraction of sodium gas, assuming that it has
one dispersion electron connected with the D line (5,890 A.), six connected
with the L absorption edge, and two with the K absorption edge. Carry
the calculation down to x-ray wave lengths, and show that the index in the
x-ray region differs only slightly from unity.
2. Show that a gas consisting entirely of excited atoms shows "negative
dispersion" about the possible emission lines, a contribution to the index
of refraction of the opposite sign to the normal contribution. Show that
this may be of importance in a real gas in that the ordinary dispersion con-
nected with a transition up from the normal state may be diminished by
excited atoms.
3. Assuming that the D line of sodium has one dispersion electron, com-
pute the transition probabilities A and B, and find the mean life of an atom
in the excited state before it radiates to the normal state.
4. The polarizability of the hydrogen atom is 4.5 a₀³. Using the formula
for polarizability in terms of number of dispersion electrons and energy
levels, and remembering that hydrogen has one dispersion electron, find the
range of E_k − E making the important contribution to the sum, and find
where E_k lies in the term system.
5. Helium has a polarizability 0.20 × 10⁻²⁴, and two dispersion electrons.
Show that the most important terms in its dispersion come from the con-
tinuous spectrum beyond the series limit.
6. Using the polarizability and principal energy difference from Prob. 5,
find the Van der Waals' potential between two helium atoms.
7. The molecule of water has a permanent dipole moment of 1.8 × 10⁻¹⁸
e.s.u. Compute its dielectric constant at room temperature.
8. Using the selection rule that M changes by only ± 1 or 0, in a transi-
tion, show that a spectral line for which g = 1 is broken up by a magnetic
field into three lines, one undisplaced, the others displaced by the frequency
eH/4πmc in either direction from the original line.
9. Compute and draw the Zeeman patterns for the sodium D lines,
²P → ²S, using the information that g = 2/3 for ²P₁/₂, 4/3 for ²P₃/₂, 2 for ²S₁/₂.
10. Set up the problem of finding the mean component of magnetic
moment of an atom along the direction of the field, using quantum statistics.
11. Using the classical formula of the derivation in the text, plot magnetic
moment of a dipole in the direction of the field, for large fields, showing that
it approaches a constant value, or saturation, at sufficiently large fields.
Prove that this corresponds to having all dipoles oriented along the field.
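As an aid to Prob. 11, the saturation behavior can be checked numerically. The following is a minimal sketch in Python, assuming that the classical formula referred to is the Langevin expression, in which the mean moment along the field is μL(a), with L(a) = coth a − 1/a and a = μH/kT. The function names and the sample values of a are illustrative choices, not taken from the text.

```python
import math


def langevin(a: float) -> float:
    """Classical Langevin function L(a) = coth(a) - 1/a, with a = mu*H/(k*T)."""
    if abs(a) < 1e-4:
        # Series expansion avoids the cancellation between coth(a) and 1/a near zero.
        return a / 3.0 - a**3 / 45.0
    return 1.0 / math.tanh(a) - 1.0 / a


def mean_moment_fraction(mu: float, H: float, T: float, k: float = 1.380649e-16) -> float:
    """Mean moment along the field divided by mu (hypothetical helper).

    Gaussian units are assumed: mu in erg/gauss, H in gauss, k in erg/K.
    """
    return langevin(mu * H / (k * T))


if __name__ == "__main__":
    # Tabulate L(a) from weak to strong fields; the values rise toward unity.
    for a in (0.1, 1.0, 10.0, 100.0):
        print(f"a = {a:6.1f}   L(a) = {langevin(a):.4f}")
```

For small a the function rises as a/3, and for large a it approaches unity as 1 − 1/a, which is the saturation asked for in the problem.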
SUGGESTED REFERENCES
In a single volume like the present one, it is impossible to do justice to
many branches of theoretical physics, and some are hardly touched on.
The student following the subject further will wish to refer to textbooks in
the older parts of the field, original papers in more modern parts. The refer-
ences which we give in the present section are far from a complete list, and
many good books are not included, but it seems worth while to suggest a few
texts to which the student who is familiar with the present book can refer,
without too great difficulty.
First, there are a number of other texts which, like the present one, give
a general survey of theoretical physics. Since they show a wide variety in
their approach and emphasis, they are often worth consulting. Among
these may be mentioned "Introduction to Theoretical Physics," by L. Page
(Van Nostrand), which gives a good account of classical physics; "Introduc-
tion to Mathematical Physics," by Houstoun (Longmans), containing discus-
sion of potential theory, hydrodynamics, electromagnetic theory, wave
motion, and thermodynamics; "Introduction to Theoretical Physics," by
W. Wilson (Methuen), of which Vol. I, covering mechanics and thermo-
dynamics, has appeared at the date of writing this; and "Introduction to
Theoretical Physics," by A. Haas (Van Nostrand), treating modern as well as
classical theoretical physics. Two longer treatises on theoretical physics,
in several volumes, may also be mentioned: "Introduction to Theoretical
Physics," by M. Planck (Macmillan), an English translation of a well-known
German text, and "Einführung in die theoretische Physik," by C. Schaefer
(W. de Gruyter). These last two works go a good deal more into detail than
is possible in the present book.
In addition to these works on general theoretical physics, the student
will doubtless have occasion to consult books on mathematical analysis.
A good book on advanced calculus, such for instance as "Advanced Calculus,"
by E. B. Wilson (Ginn), will be helpful. At the same time a more advanced
book on analysis, such as "Mathematical Analysis," by Goursat and Hedrick
(Ginn), or "Partielle Differentialgleichungen der Physik," by Riemann and
Weber (Vieweg), will furnish much useful information. In these one will
find treatment of a number of branches of mathematics which we have
merely touched on, such as the theory of functions of a complex variable,
theory of special functions, calculus of variations, etc. In addition to these
works on analysis, a book on algebra, such as "Modern Algebra," by Bôcher
(Macmillan), will be found helpful, particularly in studying the properties
of determinants and linear transformations. Finally, "A Short Table of
Integrals," by B. O. Peirce (Ginn), and "Funktionentafeln," by Jahnke and
Emde (Teubner), will be found invaluable for detailed assistance in calcu-
lation. For definite integrals which are not given in these books, "Tables
des Intégrales Définies," by Bierens de Haan (Amsterdam), will be found a
source of much information.
Next we come to a number of specific references on the various
chapters. "Dynamics," by A. G. Webster (Teubner), is one of the most
useful references on the material of the first part of the book. This treats
the dynamics of particles, generalized coordinates, dynamics of rigid bodies,
potential theory, elasticity, and hydrodynamics. "Electric Oscillations and
Electric Waves," by G. W. Pierce (McGraw-Hill), takes up the material on
oscillating electric circuits which we give in the first few chapters, and also
material on Maxwell's equations and electromagnetic waves, which we treat
in Chaps. XIX to XXVI. "The Dynamical Theory of Sound," by Lamb
(Arnold), treats oscillations of particles, and vibrations of strings and
membranes. "Theory of Sound," by Rayleigh (Macmillan), is the standard
treatise on sound, and interprets the field in such a broad way that it is
practically an introduction to theoretical physics in itself. The vibrations
of particles, coupled systems, strings, membranes, and vibrating solids,
elasticity, wave motion, all are carefully treated. In the mechanics of
particles and in rigid dynamics "Elementare Mechanik," by Hamel (Teubner),
will be found a useful reference. "Gyrostatics and Rotational Motion," by
A. Gray (Macmillan), gives useful and detailed discussion of the dynamics
of rigid bodies. "Hydromechanics," by Ramsey (Bell), may be recommended
for this subject, and also "Physics of Solids and Fluids," by Ewald, Pöschl,
and Prandtl (Blackie). For the more mathematical side of mechanics,
potential theory, vector analysis, Fourier series, etc., the following are sug-
gested: "Vector Analysis," by H. B. Phillips (Wiley); "Fourier Series and
Spherical Harmonics," by W. E. Byerly (Ginn); "Newtonian Potential
Function," by B. O. Peirce (Ginn).
For the chapters on electrodynamics and optics, there are a number of
good references in addition to the chapters from the various general texts.
"Classical Electricity and Magnetism," by Abraham and Becker (Blackie),
and "Electricity and Magnetism," by J. H. Jeans (Cambridge), contain
detailed treatment of the electromagnetic side of the subject. "Lehrbuch
der Optik," by Försterling (Hirzel), and "Optik," by M. Born (Springer), are
excellent treatments of optics from the standpoint of the electromagnetic
theory. "Theory of Electrons," by H. A. Lorentz (Teubner), contains
important material on electrodynamics and the electronic structure of
matter, though since it was written in the early days of electron theory,
before the time of quantum theory, there are many parts of it which cannot
be accepted at present. Finally, "Physical Optics," by R. W. Wood (Mac-
millan), gives an excellent treatment of the more experimental side of optics.
For the last chapters, on wave mechanics and the structure of matter,
there are in the first place a number of general texts. "Atombau und
Spektrallinien," by Sommerfeld (Vieweg), is a standard work on the older
forms of quantum theory, and "Vorlesungen über Atommechanik," by M.
Born, Vol. I (Springer), contains a rather complete mathematical develop-
ment of the older quantum mechanics. "Atoms, Molecules, and Quanta,"
by Ruark and Urey (McGraw-Hill), deals with general quantum theory as
well as wave mechanics. "Quantum Mechanics," by Condon and Morse
(McGraw-Hill), treats the more elementary methods of wave mechanics, as
does "Atombau und Spektrallinien, Ergänzungsband," by Sommerfeld
(Vieweg). "Wave Mechanics," by N. F. Mott (Cambridge), is a short but
readable account of the elementary principles of wave mechanics, and
"Wave Mechanics, Elementary Theory," by J. Frenkel (Oxford), the first of
three projected volumes, furnishes a more detailed treatment of general
principles, with particular emphasis on the statistical side of the theory.
The older "Einführung in die Wellenmechanik," also by Frenkel, gives a gen-
eral survey of the field, and includes details of some of the soluble problems.
For spectroscopic purposes, in addition to the references mentioned, "Struc-
ture of Line Spectra," by Pauling and Goudsmit (McGraw-Hill), and "Linien-
spektren," by F. Hund (Springer), contain good treatments. Chemical
applications are not adequately dealt with in any texts at present, but
"Quantum Mechanics of Chemical Reactions," by H. Eyring (Chemical
Reviews, February, 1932), contains a survey of material on reactions. Sta-
tistical mechanics and thermodynamics are treated in many places, but
perhaps the most useful one from the standpoint of the structure of matter
is "Kinetische Theorie der Wärme," by K. F. Herzfeld (Vieweg). For quantum
statistics, with particular application to the structure of metals, "Quan-
tenstatistik," by L. Brillouin (Springer), is to be recommended. One can
hardly pass over this subject, however, without mentioning some of the
standard texts: "Elementary Principles in Statistical Mechanics," by Gibbs
(Longmans); "Vorlesungen über Gastheorie," by L. Boltzmann (Barth); and
several more modern works, such as "Dynamical Theory of Gases," by J. H.
Jeans (Cambridge), and "Statistical Mechanics," by R. H. Fowler (Cam-
bridge). For the theory of dielectrics, "Polar Molecules," by P. Debye
(Chemical Catalog Co.), is to be recommended, and for both dielectric and
magnetic properties, "Theory of Electric and Magnetic Susceptibilities," by
J. H. Van Vleck (Oxford), gives a detailed and excellent discussion.
Finally, in addition to specific books, the student will find it advantage-
ous to make liberal use of the two extensive reference works, "Handbuch der
Physik" (Springer) and "Handbuch der Experimentalphysik" (Akademische
Verlagsgesellschaft). Both of these sets of books provide a convenient and
useful source of reference in both experimental and theoretical physics and
cover practically every subject in these fields.
INDEX
A
A priori probability of a group of
states, 370-371
Absolute zero, crystals at, 472ff.
Absorption coefficient, optical, 256
Absorption probabilities, relation to
dispersion electrons, 549
Action, principle of least, 342
Action variable, and contact trans-
formation, 88
relation to correspondence princi-
ple, 359
Activation energy, 490
Angle variable (see Action variable)
Angles, Euler's, 100-101
Angular momentum, in quantum
mechanics, 410ff.
of rotating rigid body, 92ff.
Angular rotation, lack of vector
character, 100
Anharmonic oscillator, 38
Anomalous dispersion (see Dis-
persion)
Antisymmetry of electronic wave
function, 504ff.
Aphelion, 62
Approximate solution for non-uni-
form string, 147-148
(See also Wentzel-Kramers-Bril-
louin method)
Archimedes' principle, 196
Artificial electric line, analogy to
weighted string, 132
Associated Legendre polynomials, or
associated spherical harmonics,
171, 409
Atom, general discussion, 406-437
model of Fermi and Thomas, 535-
536
Atom, perturbation theory applied to
multiplets, 390, 518-522
repulsion and attraction between,
439-452
Atomic refractivity, 284
Atwood's machine, 76
Auger effect, 494
Average values, of functions of
coordinates and momenta, on
wave mechanics, 375, 380
in phase space, 372
Axis of rotation, instantaneous, 99
Azimuthal quantum number, 410,
519
B
Band spectra, 480, 483
Barkhausen effect, 557
Beats, with coupled oscillators, 110
between transient and steady
motion, 34
Bent beam, 184
Bernoulli's equation, 191-192
Bessel's equation and function, 18,
166, 169, 170
Bimolecular reactions, 489-491
Binomial theorem, 4
Biot-Savart law, 233
Black-body radiation, Planck's law,
395
Body forces, in elasticity, 173
in hydrodynamics, 191
Bohr, correspondence principle, 361
frequency condition, 360
quantum rules, 331
theory of hydrogen, 411
Boltzmann distribution law, 369
Bose statistics, 544
Boundary conditions, circular mem-
brane, 166
Boundary conditions, electromag-
netic field, 258-259
heat flow, 205
rectangular membrane, 161
string, 122
wave mechanics, one-dimensional
motion, 350-351
C
Canonical ensemble, 368-369, 456-
458
in quantum theory, 467
Capacity of parallel plate condenser,
215
Center of gravity, separation of
coordinates for diatomic mole-
cule, 481
Central field, 61-63
electron motion in, 418ff.
phase space for, 82
Chandler period, 99
Characteristic functions and num-
bers, 139, 151
(See also Wave functions)
Charge density, 211-212
Chemical compounds, 447-451
Chemical reactions, 488ff.
Circuit, electric (see Electric circuit)
Circular membrane, 164-168
Circular polarization, 266
Classical statistical mechanics in
phase space, 364ff.
Coefficients, of Fourier series, 124
of viscosity, 193
Coherence of light, 295-299
Collisions, quantum-theory treat-
ment, 402-404, 488ff.
Combination tones, 38
Commutation rule, 380
Complex exponentials and complex
numbers, 22-26
Compounds, chemical, 447-451
Compressibility, of crystal lattice,
471ff.
of elastic solid, 183-184
Condenser, energy in, 246-247
theory of parallel plate, 214
Conduction of electricity in metals,
electron theory, 281
quantum theory, 449, 495-496,
536ff.
Conductivity, specific electrical, 241
thermal, 197
Conductor of electricity, as an
equipotential, 215
Configuration, atomic, 430
Configuration space, 83-84, 117
Conjugate foci, 344
Conservative system, condition for,
55
Constants of integration, 11
Constraints, 75-76
Contact transformation, 87-90
and correspondence principle, 359-
361
Continuity, equation of, 186-187
for electric flux, 237
for flow of electricity, 231
relation to divergence of elec-
tric field, 211
Continuity conditions, electromag-
netic wave, 258-259
vibrating string, 156-157
Continuous medium, 120
Convection current, 238
Convergence, 5-8
of Fourier series, 125
Coordinates, curvilinear, curl in, 229
gradient, divergence, Laplacian
in, 199-201
generalized, equations of motion
in, 59ff., 69ff.
Cornu's spiral, 320ff.
Corpuscular theory of light, 329
Correspondence principle, 359-363
and quantum statistics, 466
Coulomb's law, 210
Coupled systems, 107ff.
application to radioactivity and
collisions, 402-404
Cross section of atoms, collision, 404
Crystals, 472ff.
valence binding and, 449
Curl, in curvilinear coordinates, 229
of electric field, 211
of a vector, 55
Current, density of, 231
displacement, 236-239
Curvilinear coordinates (see Coor-
dinates, curvilinear)
D
D'Alembert's equation, 243
solution of, 304
Damping, critical, 28
logarithmic, 28
of vibrating string, 142-143
Davisson and Germer, experiment
of, 336
De Broglie, 336
Debye's theory of specific heats, 478
Decrement, logarithmic, 28
Deformation of elastic solid, 180
Degeneracy, gas, 469-470
in multiply periodic motion, 363-
364
orbital, in atomic multiplets, 521-
522
perturbation theory for, 390-391
spin, 510, 521-522
in square membrane, 168-170
Density, charge, 211-212
current, 231
of energy in electromagnetic field,
249-250
Derivative, directional and partial,
54
Determinant, expansion of, 387
Determinant form of electronic wave
functions, 504
Determinantal equation for string,
155
Diamagnetism, 555
Diatomic molecules, electronic
energy, 522-523
nuclear motion, 481-485
Dielectric, force in an inhomogene-
ous, 257
Dielectric constant, 239
relation to polarization, 275
temperature dependence, 555
Dielectrics, types of, 553-554
Difference equations, 129
Differential equations, general prop-
erties, 10ff.
linear, properties, 36
solution by Green's method, 222-
223
Diffraction, 311-328
Dipole, and dipole moment, 221
oscillating, 288ff.
Dipole orientation, temperature de-
pendence, 554-555
Direction cosines, 49
Directional derivative, 54
Discontinuities in functions, Fourier
representation, 126
Discontinuity, in electric field, 214,
222
in electromagnetic field, boundary
conditions, 258-259
in electrostatic potential, 222
Dispersion, of electromagnetic waves
in metals, 256
electron theory, 270ff.
quantum theory, 546-549
Displacement, electric, 239, 274
Displacement current, 236-239
Dissipation function, 142-143
Dissipative forces, 40
Divergence, in curvilinear coordi-
nates, 200-201
of electric field, 210-211
of a vector, 56
physical meaning of, 187
Double Fourier series, 162
Double layer, 221
Double pendulum, 119
Doubly periodic motion, 63-64, 84-
86
in quantum theory, 362-364
Dulong and Petit's law, 474
Dynamic stability, 103
E
Effective nuclear charge, 425, 431-
432
Einstein, formula for specific heat
of solids, 477
photoelectric law, 330
Einstein, probability coefficients for
radiation, 393-399
relation between energy and mass,
252
Elastic constants, 181
Elastic electronic collisions, 402-403
Elastic solid, 172-183
Elastic waves, 179-180
Electric circuit, with inductance and
resistance, 16
oscillations, 20, 28ff.
Electric conductivity (see Conduc-
tion of electricity in metals)
Electric displacement, 239, 274
Electric field, 210-214
in spherical cavity, 279
Electric moment of an atom, matrix
component, 376
Electromagnetic field, energy in,
246ff.
of oscillating dipole, 290-291
quantization of, 399
wave equation for, 243-244, 253
Electromagnetic induction, 235-236
Electromagnetic units, 227
Electromagnetic waves, in metals,
256
polarization of, 262
reflection and refraction of, 258ff.
spherical, 286ff.
Electromotive force, 212, 250-251
Electron, radius of, 252
Electron collisions, 402-404
Electron emission from metals, 352-
353, 539-540
Electron energy, atoms, 435-437,
518-522
metals, 494-497, 536ff.
molecules, 483, 522-530
Electron equivalence and exclusion
principle, 519-520
Electron excitation, collisions of
atoms with, 491ff.
Electron interactions, 501ff.
Electron pair valence bond, 443-444
Electron shells in atoms, 425
Electron spin, 443, 507ff.
Electron theory and dispersion,
270ff.
Electron wave functions, determi-
nant form and antisymmetry,
504-505
Electrostatic field, energy in, 247
Electrostatic potential, 211-212, 222
Electrostatic problems, and poten-
tial theory, 210-217
Ellipsoid of inertia, 95-96
Elliptical polarization, 266
Emission, thermionic, 352-353, 539-
540
Energy, of activation, 490
of atoms, 435-437, 518-522
electrical, 246-256
internal, 460
of ionic crystals, 473
mechanical, 39-46, 52
of metals, 494-497, 536ff.
of molecules, 483, 522-530
Energy density of radiation, mean
value, 393
Energy levels, Bohr's hypothesis,
331
Energy surface in phase space, 80
Ensembles, canonical, 368-369, 456-
458
in statistical mechanics, 365-369
Entropy, 460
Equation, of continuity (see Con-
tinuity, equation of)
of motion, of elastic solid, 175-176
of fluid, ideal, 190-191
viscous, 194
mechanical, generalized coordi-
nates, 59ff., 69ff.
of membrane, 160
of rigid body, 96ff.
of string, 120-121
in normal coordinates, 140
variable, 146-147
of state of gases, 454-470
of solids, 478-480
Equations, difference, 129
Equilibrium, stable, 45
Equinoxes, precession of, 105
Equipotential surfaces, 54, 215
Equivalence of electrons, and exclu-
sion principle, 519-520
Ergodic motion, 81
Euler, angles, 100-101
equations of hydrodynamics, 190-
191
equations for rigid body, 98
period, 99
Even functions, 126
Exclusion principle, free electrons,
531-533
general discussion, 502-516
periodic table, 426
valence attraction, 443-444
Expansion, Fourier, 123-128
in normal functions for variable
string, 153
Taylor's, 4-5
in wave functions, 382-383
Exponential, complex, 22
Exponential integral function, 18
Exponential solution, vibrating par-
ticle, 21
vibrating rectangular membrane,
161
vibrating string, 121
External forces, on coupled oscilla-
tors, 113
generalized coordinates, 69-70
motion under, 35-36
External radiation field, perturba-
tion of atoms by, 392-393
F
Falling body, 11ff.
Faraday's induction law, 239
Fermat's principle, 339-342
Fermi statistics, 531ff.
Fermi-Thomas atomic model, 535-
536
Ferromagnetism, 557-558
Field, central (see Central field)
electric (see Electric field)
electromagnetic (see Electromag-
netic field)
electrostatic (see Electrostatic
field)
vector, 51
First law of thermodynamics, 460
Flow, of fluids, 185ff.
of heat, 197ff.
Flow, lines of, 186
Fluids, flow of, 185ff.
Flux, 185
magnetic, 235
Flux density, 185-186
of heat flow, 198
Force, on charge and current, 240
external (see External forces)
interatomic, 439ff.
Forced vibrations, of particle, 29
of string, 142-143
Fourier series, 120-128
double, 162
in function space, 139
generalization for multiply periodic
motion, 362
Fourier's method for transient heat
flow, 203-205
Fraunhofer diffraction, 315ff.
Free electrons in quantum theory,
531-535
Free energy, 458ff.
of crystals, 476
Fresnel diffraction, 315ff.
Fresnel equations for reflection,
262-264
Fresnel integrals, 320, 328
Fresnel zones, 308ff.
Function space, 137ff.
Functions, odd and even, 126
representation by power series, 2ff.
scalar product in function space,
153
f-values for dispersion electrons,
547-548
G
Γ space, 365
Gases, dispersion in, 275-278
equation of state and general
properties, 454ff.
Gauss error curve, 372
Gauss's theorem, 187-188
General solution of differential equa-
tion, 15
Generalized coordinates, curl in, 229
equations of motion in, 59ff., 69ff.
gradient, divergence, Laplacian
in, 199-201
Generalized force, 69-70
in vibrating string problem, 141
Generalized momentum, 61, 69ff.
Geophysical problems with elastic
waves, 179-180
Gibbs-Helmholtz equation, 460
Gradient, in curvilinear coordinates,
200
of a scalar, 54, 56
Green's distribution, 221-222
Green's method for differential equa-
tions, 222-223
Green's theorem, 217
H
Half-breadth of resonance band, 37
Hamiltonian function, 71-72
Hamilton's equations of motion,
71-76
Hamilton's principle, 342
Hartree's method of self-consistent
fields, 430-431
Heat flow, 197-209
Heitler-London method for molec-
ular energy, 523-527
Hertz, electric waves, 238
vector, 291ff.
Homogeneous quadratic functions,
73-74
Homopolar valence, 443, 523-527
Hooke's law, 177-182
modified for viscous fluids, 193
Huygens' principle, 302-311
Hydrogen atom, 406ff.
Hydrogen molecule, 523-527
Hydrostatic pressure, 174
I
Ideal fluid, 190
Images, method of, 215-216
Impedance, 33
Imperfect gases, 462ff.
Index of refraction, 253, 270-283
(see also Dispersion)
Induced emission, probability of, 394
Induction, electromagnetic, 235-236
electrostatic, 215
Induction vector, magnetic, 239
Inelastic electronic collisions, 403
Inertia, moment and products of,
95-96
Infinite series (see Series)
Initial conditions, for circular mem-
brane, 168
for quantum-mechanical motion,
377ff.
for rectangular membrane, 162
for string, 122
for transient vibrations of particle,
29
Inner shielding, 437
Instantaneous axis, 99
Integral, line, 52-53
phase or quantum, 358-359
of state, in kinetic theory, 458-459
Integral method for transient heat
flow, 205-208
Intensity, of electric field, 210
of magnetic field, 225
of radiation, and correspondence
principle, 363
and selection principles, 417
Interaction, of electrons, 501ff.
of nuclei, perturbation method,
497-499
Interatomic forces and molecular
structure, 439ff.
Internal energy of a system, 460
Ionic compounds, 449-451
Ionic crystals, 473
Ionic forces, 439
Ionization potentials, 435-437
Iron group of elements, 430
paramagnetism of, 557
Irreversibility of heat flow, 208
Irrotational flow, 188-192
Iso-electronic sequences, 438
Isotopes, 407
K
Kinetic energy, 41
of rigid bodies, 95
L
Lagrange's equations, 58-67, 75
for weighted string, 131
Lagrangian function, 59
with magnetic field, 77
in relativity, 78
Laplace's equation, for electrostatics,
212
for heat flow, 198
solution as surface integral, 220
for velocity potential, 190
for vibrating membrane, 164
Laplacian, 56
in curvilinear coordinates, 201
in polar coordinates, 164-165
Larmor precession, 106
Legendre polynomials, 158
associated, 171, 409
Legendre's equation, 171
Lenz's law, 235
Level, energy, 331
Lewis, electron pair bond, 443
Line integrals, 52-53
Linear differential equation, proper-
ties, 36
Linear oscillator, phase space, 81
Linear polarization, 266
Linear restoring force, 19
Linear transformation, 115
Lines of flow, 186-188
Liouville's theorem, 365-366
Liquids, dispersion in, 278-280
flow of, 185-195
gases, and solids, comparison, 454
Lissajous figures, 85
Longitudinal waves in elastic solid,
178-179
Lorentz force, 240
Lorenz-Lorentz law, 280
L-S coupling, 519
M
Magnetic properties, quantum
theory, 555-558
Magnetic quantum number, 410
Magnetism, 225-234
Many-body problem in wave me-
chanics, 432-433
Matrices in quantum mechanics,
374-381
Maxwell-Boltzmann distribution
law, 369
Maxwell's distribution of velocities,
372
Maxwell's equations, 235ff.
Mean values (see Average values)
Mechanical energy, 39-46, 52
Mechanics, statistical, 364-371, 454-
470
wave nature of, 335-336
Membrane, vibrations of, 160ff.
Metals, classical theory, electron
theory, 280-283
plane waves of light in, 255-256
reflection of light from, 267-268
quantum theory, electrons in,
531-543
nature of conduction process,
494-497
relation to molecular orbitals,
529-530
relation to valence compounds,
449
Method of images, 215-216
Microcanonical ensemble, 367-368
and quantum statistics, 466-467
Molecular refractivity, 284
Molecules, electronic energy, 518,
522-530
general structure and interatomic
forces, 439-451
nuclear motions, 480-486
Moment of inertia, 95
Momentum, angular (see Angular
momentum)
generalized, 61, 69ff.
Momentum operator in wave me-
chanics, 380
Momentum space, 83-84
Morse potential curve for molecules,
445-446
Moseley's law, 437
Motion of rigid bodies, 92ff.
in several dimensions, 46
of several particles, 117
μ-space, 364
Multiplets, 390, 509-527
Multiply periodic motion, 63-64,
84-86
Multiply periodic motion, in quan-
tum theory, 362-364
Multivalued potential, 228
Mutual induction, coefficient of, 245
N
Negative dispersion, 558
Neutrons, 407
Newton's law of motion, 11, 70-71
Nodal line, Euler's angles, 101
Nodes in vibrating membrane, 162,
167
Non-central two-dimensional mo-
tion, 83
Nonconservative systems, 41
Nonuniform string, 146-158
Normal coordinates, 107-114
general theory, 134-142
thermal vibrations of crystals,
474-475
Normal dispersion, 277
Normal functions, for vibrating
string, 139, 151-153
(See also Wave functions)
Normal incidence, reflection coeffi-
cient, 260-262, 267
Normal stresses, 174
Normalization, coupled systems, 112,
116
nonuniform string, 152
quantum theory, 374-375, 382-
383
weighted string, 136
Nuclear atom, 406-407
Nuclear charge, effective, 425, 431-
432
Nuclear motions in molecules and
solids, 471-499
Nutation, 104
O
Odd functions, 126
Ohm's law in differential form, 241
One-electron energies and wave
functions, 433-437
Open circuits, 237
Operators in wave mechanics, 380-
382
Optics, 258-333
Orbital degeneracy, 416, 520-521
Orbits, central motion, 62-63
hydrogen, 412
Orthogonality, Bessel's functions,
169-170
coupled systems, 116
nonuniform string, 151
quantum theory, 378
sine and cosine, 125
weighted string, 136
Oscillating dipole, radiation from,
288ff.
Oscillations, of electric circuit, 20
simple harmonic, 19ff.
Oscillator, anharmonic, 38
coupled, 107ff.
linear, classical theory, 19ff.
quantum theory, 354-357, 384
two-dimensional, 84-86
Oscillator strength, 280
Outer shielding, 437
Overtone, 120
P
Parallel plate condenser, 214-215
Paramagnetism, 555-557
Palladium group of elements, 430
Partial derivative, 54
Partial differential equation, 121
Particular solution of differential
equation, 15
Penetrating orbits, 419-422
Penetration of potential barriers,
351-353
Penetration force between atoms,
442
Penetration interaction for H₂, 525
Perfect gas, classical theory, 461
quantum theory, 468ff.
Perihelion, 62
Periodic force in vibrating string
problem, 142
Periodic motion, 44
multiply (see Multiply periodic
motion)
Periodic system of the elements,
426ff.
Permeability, magnetic, 239
Perturbation theory, nonuniform
string, 154ff.
quantum mechanics, 386-404
Phase change, on reflection, waves
on string, 158
in total reflection, 266
Phase integral, 88-91, 359-360
Phase space, 79ff., 358ff.
for free electrons, 532
Photoelectric emission, 330, 343, 540
Photons, 330, 332
Planck's constant, 330
Planck's law of black-body radia-
tion, 395
Plane waves, elastic, 176-178
optical, 253-256, 258-268
Planetary motion, 60
Point transformation, 87
Poiseuille's law, 194-195
Poisson's equation, 212, 217ff.
Poisson's ratio, 182
Polarizability of atoms and ions,
275, 439-441, 549-555
Polarization, of dielectric, 270-275
of light, 264-268, 295
Pole of function, 5
Polyatomic molecules, 485-486
Polynomials, Legendre, 158
Potential, electrostatic, 211-212
magnetostatic, 225
retarded, 303-305
vector, 231
velocity, 188ff.
Potential barriers, penetration, 351-
353
Potential energy, 41ff., 52-55
Power series, 1-8
Power-series solution, for differen-
tial equations, 10-20
for secular equation of perturba-
tion theory, 387-390
Poynting's vector, 249-251
Precession, 92-93, 104-106
Predissociation, 494
Pressure, elasticity, 175ff.
hydrodynamics, 191ff.
kinetic theory, 459ff.
of solid, 472
Principal axes, coupled systems, 117
of inertia, 96
of stress, 175
Principal quantum number, 410
Principle of least action, 342
Probability, a priori, 370-371
Probability relations in wave
mechanics, 333-337
Products, of inertia, 95
of vectors, 49-51
Progressive waves, 149
Protons, 407
Q
Quantum condition, 353-361
Quantum defect, 422
Quantum derivation of Einstein
probability coefficients, 395-399
Quantum hypothesis of Planck, 330
Quantum number, 354, 410, 519
Quantum statistics, 466ff.
Fermi, 531ff.
Quantum theory and phase space,
369-371
Quasi-ergodic motion, 81
quantum theory, 364
statistical application, 367
Quasi-stationary processes, 241
R
Radial motion in central field, 61-62
Radial wave function, 409ff.
Radiation, electromagnetic, 286-300
perturbation of atoms by, 392-393
quantization of, 399
Radiation intensities and corre-
spondence principle, 363
Radioactivity, quantum theory, 356,
402
Ramsauer effect, 404
Rare earths, paramagnetism of, 557
structure of, 430
Rayleigh scattering, 294
Reactance of electric circuit, 33
Reactions, chemical, 488-494
Rectangular membrane, 160-164
Reflection, elastic waves, 179
electromagnetic waves, 258-268
waves on strings, 156-158
Refraction, electromagnetic waves,
258-268
index of (see Dispersion)
Relaxation time, 257
Resistance, specific, 241
(See also Metals)
Resolving power, grating, 328
lens, 325-326
Resonance, 29-33
Resonance scattering, 295
Retarded potentials, 303-305
Rigid bodies, 92-105
Rolling-ball analogy, 45-46
Rotating system of axes, vectors in,
97-98
Rotation, of coordinates, 114-116
of diatomic molecule, 482
Rotator, quantum condition for, 354
Rydberg formula, 422
Rydberg number, 408
S
Scalar potential, 241-244
for oscillating dipole, 288-289
Scalar product, of two functions, 153
of two vectors, 49
Scalar quantities, 48
Scattering, of electrons, 404
of light, 293-299
Schrodinger's equation, 345-346
including the time, 381-382
many-body problems, 432-433
Second law of thermodynamics, 460
Secular equation, coupled oscillators,
108
perturbation theory, string, 155
wave mechanics, 386-390
Selection principles for spectra, 417
Self-consistent fields, 430-431
Separation of variables, method of,
163-165
Series, Fourier, 120ff.
power, 1ff.
Series spectra, 416-418
Several particles, general problem
of motion, 117
Shearing stress and strain, 174, 177
Shells, electronic, in atoms, 425
Shielding of electrons in atoms, 425-
437
Simple harmonic vibrations, 19ff.
Singularity of function, 5
Sleeping top, 105
Solenoid, energy in, 249
magnetic field in, 231
Solids, dispersion in, 278-280
elastic, 172-183
physical properties, 471-480
Sources and sinks, 186
Specific conductivity, 241
Specific heat, of gases, 460-461
of molecules, 483
of solids, 476-478
Specific resistance, 241
Spectra, of atoms, 407-418, 435-437,
509-520
of molecules, 480, 483
Spectral analysis of a light wave,
298-299
Spectral series, nomenclature, 416-
418
Spectral terms, 331
Spherical electromagnetic waves,
286-299
Spherical harmonics, 202-203
associated, or associated Legendre
polynomials, 171, 409
Spin of electron, 443, 507ff.
Spin degeneracy, 510-514
Spontaneous radiation, 393, 399-402
Square membrane, degeneracy, 170
Stability, dynamic, 103
Stable equilibrium, 45
Standing waves, 149
Stark effect, 550
Stationary states, 331
and perturbation theory, 400-402
Statistical interpretation of wave
theory, 332-333
Statistical mechanics, 364-371, 454-
470
Steady flow, of fluids, 187
of heat, 198-202
Stokes's theorem, 229-230
Strains in elastic solid, 172-183
Streamlines, 186
Stresses in elastic solid, 172-183
String, vibrations, 120-158
Structure of atoms, 425ff.
Subshells of electrons in atoms, 429
Sum of states, 468
Superposition of transient and forced
motion, 33-35
Surface charge density, 212
Surface forces, 173
Symmetrical top, 102-104
Symmetry of stress tensor, 175
T
Temperature dependence, of chemi-
cal reactions, 491
of electron energy in metals, 538-
540
of polarizability and dielectric
constant, 555
Temperature gradient, 197
Temperature vibrations of a crystal,
474-480
Thermal conductivity, 197
effect of electron distribution on,
539
Thermal equilibrium and canonical
ensemble, 457
Thermal expansion, definition of
coefficient, 471
Thermal pressure in solids, 479
Thermionic emission, 352-353, 539-
540
Thermodynamics, laws of, 460
Thomson, G. P., electron diffraction,
336
Thomson, J. J., scattering of light,
295, 548
Top, precession, 92, 102-104
Torque, 92-93
due to shearing stresses, 174-175
Torque-free motion of symmetrical
rigid body, 98-99
Total reflection, 265-267
Transformation, contact and point,
87
Transient flow of heat, 203-208
Transients, initial conditions for, 29
Transition probabilities, 393-397
and selection principles, for atoms,
416
for molecules, 483
Transverse waves in elastic solid,
176-178
Traveling waves, 132, 149
Tubes of flow, 186
Turbulent flow, 189
Two-center problem, wave func-
tions, 527-528
Two-dimensional oscillator, 84-86
relation to coupled systems, 114-
117
Types of substances, classification,
U7ff.
U
Uncertainty principle, 333-339
relation to phase space, 370-371
Unit vectors, 48
in function space, 138
Units, electrical, 227
V
Valence, homopolar, 442-449
in hydrogen molecule, 526
Van der Waals' equation, 464-466
Van der Waals' force, 440-441, 551-
553
Variable mass, relativistic, 78
Variable tension and density of
string, 146ff.
Variables, separation of, 163
in quantum theory, 362
Variation, of constants, method of,
391-392
of an integral, 339
Vector, 48-56
Vector model for angular momen-
tum, 415
Vector operations in generalized
coordinates, 199-201, 229
Vector potential, 231-232, 241-243
for oscillating dipole, 289-290
Velocity of light, 243-244, 270ff.
Velocity field of flowing fluid, 185-
186
Velocity potential, 188ff.
Vibrations, of coupled systems, 107-
118
of crystals, 474-478
of elastic solids, 172-180
of membranes, 160-170
of molecules, 480-481
of particles, 19-36
of strings, 120-158
Virial coefficients, 464
Viscosity, 192-194
Volt-electron, definition, 408
W
Wave equation, for electromagnetic
field, 243-244, 253
for membrane, 164
in polar coordinates, 171
in wave mechanics (see Schrod-
inger's equation)
Wave functions, central-field prob-
lem, 418-423
determinant form, 504-505
hydrogen, 418-423
linear oscillator, 357
two-center problem, 527-528
Wave mechanics, general principles,
335-343
many-body problem in, 423-433
one-dimensional motion in, 346-
356
Wave normal, 253-254
Wave packet, 337-338
Waves, elastic, 176-180
electromagnetic (see Electromag-
netic waves)
progressive and standing, 149
on strings, 132, 149-151, 156
Weighted string, 131, 136
Wentzel-Kramers-Brillouin method,
347-356
Work, 41
as a scalar product, 52
Work function for metals, 543
X
X-ray series, nomenclature, 425, 437
Y
Young's modulus, 182
Z
Zeeman effect, 556