Download - Part-II Electrodynamics Michaelmas Term 2014adc1000/Electrodynamics.pdf · Part-II Electrodynamics 3 replaced by new relativity principles, while Maxwell’s equations remain form-invariant

Transcript

Part-II Electrodynamics Michaelmas Term 2014

Lecture notes

Anthony [email protected]

1 Introduction

We begin with a reminder of Maxwell’s equations, which were discussed at length inthe IB course Electromagnetism:

∇ · E =ρ

ε0(1.1)

∇× E = −∂B

∂t(Faraday) (1.2)

∇ ·B = 0 (1.3)

∇×B = µ0J + µ0ε0∂E

∂t. (Ampere) (1.4)

Here E(x, t) and B(x, t) are the electric and magnetic fields, respectively, ρ(x, t) is theelectric charge density, and J(x, t) is the electric current density (defined so that J · dSis the charge per time passing through a stationary area element dS). The constantsε0 = 8.854187 · · · × 10−12 kg−1 m−3 s4 A2 (defined) is the electric permittivity of freespace, and µ0 = 4π × 10−7 kg m s−2 A−2 (definition) is the permeability of free space.Via the force laws (see below), they serve to define the magnitudes of SI electrical unitsin terms of the units of mass, length and time. They are not independent since µ0ε0 hasthe units of 1/speed2 – indeed, as we review shortly, µ0ε0 = 1/c2. By convention, wechoose µ0 to take the value given here, which then defines the unit of current (ampere)and hence charge (coulomb or ampere second). Since we define the metre in terms ofthe second (the latter defined with reference to the frequency of a particular hyperfinetransition in caesium) such that c = 2.99792458 m s−1, the value of ε0 is then fixed as1/(µ0c

2).

With the inclusion of the displacement current µ0ε0∂E/∂t in Eq. (1.4), the continuityequation is satisfied, which expresses the conservation of electric charge:

∂ρ

∂t+ ∇ · J = 0 . (1.5)

Maxwell’s equations tell us how charges and currents generate electric and magneticfields. We also need to know how charged particles move in electromagnetic fields, and

Part-II Electrodynamics 2

this is governed by the Lorentz force law :

F = q(E + v ×B) , (1.6)

where q is the charge on the particle and v is its velocity. The continuum form of theforce equation follows from summing over a set of charges; the force density (force pervolume) is then

f = ρE + J×B , (1.7)

in terms of the charge and current densities.

Maxwell’s equations are a remarkable triumph of 19th century theoretical physics. Inparticular, they unify the phenomena of not only electricity and magnetism but alsolight. To see this, consider Maxwell’s equations in free space (ρ = 0 and J = 0). Takingthe curl of Eq. (1.2), and using Eqs (1.1) and (1.4), gives1

∇2E = ε0µ0∂2E

∂t2. (1.8)

This is a wave equation for the electric field with wave speed c = 1/√µ0ε0. The

equivalent equation for B follows from Eq. (1.4). We see that electromagnetic wavespropagate in free space with speed that is the speed of light in vacuum.

Despite these successes, there is an apparent issue with Maxwell’s equations in that theyare not invariant under Galilean transformations. Recall that these are transformationsof the form

t→ t , x→ x− ut , (1.9)

and so v → v − u, where u is the relative velocity of two reference frames. Newton’slaw dp/dt = F (where p = mv with m the mass of the particle) is invariant undersuch transformations since forces are frame-invariant in Newtonian dynamics. TheLorentz force law can be made Galilean invariant if it is assumed that the electricand magnetic fields transform under Galilean transformations as E→ E + u×B andB → B. However, there is a problem in that Maxwell’s equations predict waves thatpropagate at the fixed speed c, but under Galilean transformations the velocity of awave must change in the same way as particle velocities under a change of frame.One possible way out of this is to assume that Maxwell’s equations are not Galileaninvariant in that they only hold in one particular reference frame. The problem withthis is that, unlike the equations governing the propagation of sound waves in a gassay, it is not clear what defines the preferred frame. This interpretation began to lookincreasingly untenable with experiments (by Fizeau and Michelson and Morley in thesecond half of the 19th century) on the velocity of light in moving fluids, and this was amajor motivation for Einstein introducing the framework of special relativity in 1905.Within special relativity, it is Galilean invariance and Newtonian mechanics that are

1Recall the identity ∇× (∇×E) = ∇(∇ ·E)−∇2E.

Part-II Electrodynamics 3

replaced by new relativity principles, while Maxwell’s equations remain form-invariantunder these new (Lorentz) transformations.

In this course we shall establish the form-invariance of Maxwell’s equations underLorentz transformations. In doing so, we shall learn how to rewrite electromagnetismin a more covariant formalism that makes manifest its invariance under Lorentz trans-formations. With this streamlined formalism, we shall discuss the generation of elec-tromagnetic radiation by accelerating charges and develop action principles for theelectromagnetic field, which is an important step towards developing a quantum ver-sion of the theory (see Part-III courses in Quantum Field Theory).

2 Review of electrostatics

We begin with a brief review of electrostatics. Here, we deal with static charge distri-butions, i.e., ∂/∂t = 0 and J = 0. Maxwell’s equations for the electric and magneticfields decouple in static situations, and those for the electric field become

∇ · E =ρ

ε0, ∇× E = 0 . (2.1)

Since E has vanishing curl, we can write it in terms of the gradient of the electrostaticpotential φ(x) as E = −∇φ. Combining with ∇ ·E = ρ/ε0 gives the Poisson equation:

∇2φ = −ρ/ε0 . (2.2)

A typical problem in electrostatics is to determine the field within some region S dueto specified charges within S, and subject to appropriate boundary conditions on the(closed) boundary ∂S. In such situations, Poisson’s equation has a unique solutionsubject to Dirichlet boundary conditions (φ specified on ∂S), or a unique solution upto an irrelevant constant for Neumann boundary conditions (the normal derivative of φspecified on ∂S). Of course, these boundary conditions must result from other chargesoutside the region of interest.

Here, we are interested in the field of localised charged distributions with no boundarysurfaces. In this case, we can assume that the potential decays to zero at spatial infinity.Since the Poisson equation is linear, we can solve it with the help of an appropriateGreen’s function, G(x; x′):

φ(x) = − 1

ε0

∫G(x; x′)ρ(x′) d3x′ . (2.3)

The Green’s function has to satisfy ∇2G(x; x′) = δ(3)(x − x′), where the Dirac delta

Part-II Electrodynamics 4

function appears on the right, and tend to zero as |x| → ∞. Homogeneity and isotropy2

demand that the Green’s function only depends on |x− x′|. Taking x′ = 0, G is thena function of only r ≡ |x|. Away from the origin, we have

1

r2

∂r

(r2∂G

∂r

)= 0 , (2.4)

for which the solution that decays at infinity is G = −A/r. We can fix the integrationconstant A by integrating ∇2G = δ(3)(x) over some region S that encloses the origin.Using the divergence theorem, we have∫

∂S

∇G · dS = 1 . (2.5)

Taking the region S to be a sphere of radius R, the integral on the left evaluates to(A/R2)× 4πR2 and so A = 1/(4π). The appropriate Green’s function is thus

G(x; x′) =−1

4π|x− x′|, (2.6)

and the solution of Poisson’s equation is

φ(x) =1

4πε0

∫ρ(x′)

|x− x′|d3x′ . (2.7)

For a point charge at the origin, ρ(x) = q1δ(3)(x), and the potential φ = q1/(4πε0r).

The associated electric field isE =

q1

4πε0r2x , (2.8)

where x = x/r is a unit vector in the direction of x. The Lorentz force law (1.6) thengives the force on a charge q2 in the field of q1 as

F21 =q1q2

4πε0|x1 − x2|3(x2 − x1) , (2.9)

where we have generalised to have q1 at x1 and q2 at x2. This is simply Coulomb’s(inverse-square) law. Note that the force is a power-law and therefore has no preferredscale. Ultimately, this reflects the fact that, in a quantum treatment, the gauge bosonthat carries the electromagnetic force (the photon) is massless. If the photon did havemass, this would introduce a scale into the problem and the potential of a point chargewould typically have the Yukawa form, φ ∝ e−r/l/r for some scale-length l inverselyproportional to the mass. As an example of such force carriers, the gauge bosons thatcarry the weak and strong forces are massive with sub-atomic associated length-scales.

2Under rigid transformations of the charge density, i.e., translations and rotations, the potentialshould transform similarly. For example, if the charges are all translated by a, the potential shouldtranslate to φ(x)→ φ(x− a).

Part-II Electrodynamics 5

2.1 Multipole expansions

We now ask the question how does the potential and field due to a localised chargedistribution vary at distances large compared to the size of the distribution? We mightexpect that if there is a non-zero total charge Q, asymptotically the field will look likethat due to a point charge Q. This is indeed correct, but what happens if the totalcharge vanishes? To answer this, we consider the multipole expansion of Eq. (2.7).

Consider a charge distribution localised within a region of extent a. We aim to calculatethe fields at distances from the charges that are large compared to a. As we shallsee, these are generally given in terms of simple low-order multipoles of the chargedistribution. Choose the origin close to the charges, so that all charges are within a ofthe origin. Observe at x where r ≡ |x| a. We now deal with the 1/|x− x′| factor inEq. (2.7) by expanding in the small quantity |x′|/r. Using

|x− x′|2 = x · x− 2x · x′ + x′ · x′ , (2.10)

we have

|x− x′|−1 =1

r

(1− 2

x · x′

r2+|x′|2

r2

)−1/2

≈ 1

r

[1 +

x · x′

r2− 1

2

|x′|2

r2+

3

8

(−2x · x′

r2

)2

+ · · ·

]

≈ 1

r+

x · x′

r3+

3(x · x′)2 − r2|x′|2

2r5+ · · · . (2.11)

Note that the second term on the right is O(a/r) times the first, and the secondO(a/r)2 times the first. Writing 3(x ·x′)2− r2|x′|2 = xixj(3x

′ix′j − δij|x′|2) (summation

convention), we can write the potential in the form

φ(x) =1

4πε0

(Q

r+

x ·Pr3

+1

2

Qijxixjr5

+ · · ·), (2.12)

where the multipole moments of the charge distribution are

Q ≡∫ρ(x′) d3x′ (Total charge or monopole) (2.13)

P ≡∫ρ(x′)x′ d3x′ (Dipole moment) (2.14)

Qij ≡∫ρ(x′)(3x′ix

′j − δij|x′|2) d3x′ (Quadrupole moment) . (2.15)

The quadrupole moment is a symmetric, trace-free tensor and therefore has five inde-pendent components. The potential and fields due to these multipole moments vary

Part-II Electrodynamics 6

as

Monopole: φ ∼ r−1 E ∼ r−2 (2.16)

Dipole: φ ∼ r−2 E ∼ r−3 (2.17)

Quadrupole: φ ∼ r−3 E ∼ r−4. (2.18)

As r →∞, the field is dominated by that of the lowest order non-vanishing multipolemoment. For example, if the total charge vanishes, asymptotically the field will varylike a dipole (unless the dipole moment vanishes too).

A simple charge distribution with no total charge but a non-vanishing dipole momentresults from taking a pair of oppositely charged particles, q and −q, and separatingthem by some vector d. The dipole moment is then P = qd. The dipole moment iscritical to inter-molecular bonding in chemistry. The strong hydrogen bonds betweenwater molecules arise from interactions of the electric dipole moments of the polarmolecules.

We can make a simple charge distribution with vanishing total charge and dipole mo-ment by adding two dipoles end-to-end. This gives charges q at ±d along the x-axisand a charge of −2q at the origin, say. The dipole vanishes by symmetry, but thequadrupole moment is

Qij = 2qd2 diag(2,−1,−1) . (2.19)

It is worth noting that, generally, the multipole moments of the charge distributiondepend on the choice of origin. Specifically, if we shift the origin by −a (or equivalentlymove all the charges by a), the total charge is invariant but the dipole changes to

P→ P +Qa . (2.20)

We see that only if the total charge vanishes is the dipole independent of the choiceof origin. A little thought should convince you that the multipole moments have tochange to preserve the value of the potential and field at a given physical location,since shifting the origin also changes x. Note that if Q 6= 0, we can always choose theorigin so that the dipole moment vanishes, i.e., take a = −P/Q.

Exercise: Show that the quadrupole moment transforms under a change of origin as

Qij → Qij + [3(aiPj + ajPi)− 2δija ·P] +Q(3aiaj − δija2

). (2.21)

It is therefore invariant if both the total charge and dipole vanish. More generally, the lowestorder non-vanishing multipole moment is independent of the choice of origin.

Part-II Electrodynamics 7

2.2 Electrostatic energy

In building up a charge distribution, we generally have to do work to overcome repulsiveforces. We can interpret this work as electrostatic potential energy of the chargedistribution. If we bring up a test charge q from infinity to a point y in an electrostaticfield, we have to do work W = −q

∫ yE · dx. The integral is independent of the path

taken since the field is conservative (i.e., curl free). In terms of the potential,

W = q

∫ y

∇φ · dx = qφ(y) . (2.22)

Consider now a charge distribution ρ(x) with associated potential φ(x). We want tofigure out how much work it took to build up this distribution starting with infinitelyseparated charges. We can imagine doing this by incrementing the charge densityfrom λρ(x) to (λ + dλ)ρ(x), i.e., by bringing in infinitesimal charges ρ(x)d3xdλ toeach volume d3x. Repeating this from λ = 0 to λ = 1 builds up the entire chargedistribution. At each stage, the new charge that we bring in feels a potential λφ(x) bylinearity. The increment of work we have to do for each volume is thus λφ(x)ρ(x)d3xdλ.Integrating over λ from 0 to 1 gives the total work

W =1

2

∫ρ(x)φ(x) d3x . (2.23)

This is the required electrostatic potential energy.

An alternative, and very useful, way of reinterpreting the electrostatic potential energyis to think of it being stored in the field itself. To see how this works, we use ρ = ε0∇·Ein Eq. (2.23) to find

W =1

2ε0

∫φ∇ · E d3x

=1

2ε0

∫[∇ · (φE)− E ·∇φ] d3x

=1

2ε0

∫|E|2 d3x , (2.24)

where we used the divergence theorem in the last step (noting that the product of thefield and potential go to zero at least as fast as 1/r3 as r → 0 for a localised chargedistribution). We see that the energy density of the electrostatic field can be taken tobe ε0|E|2/2.

3 Review of magnetostatics

In magnetostatics, we are interested in steady distributions of current hence ∂/∂t = 0.Since current is just the flow of charge, the charges themselves are not static but the

Part-II Electrodynamics 8

charge density is time independent. This means that we must have ∇·J = 0. Maxwell’sequation for the magnetic field now give

∇ ·B = 0 , ∇×B = µ0J . (3.1)

Since the magnetic field is divergence free, it can always be written as the curl of somemagnetic vector potential A, i.e., B = ∇×A.

At this point, we have to face the issue of gauge freedom. The physical magnetic fielddoes not uniquely determine the vector potential since both A and A + ∇χ have thesame curl.3 We can use this gauge freedom to our advantage with a judicious choiceof gauge. For magnetostatics it is convenient to adopt the Coulomb gauge in which∇ · A = 0. (We shall generalise this to electrodynamics later.) We can always findsuch a vector potential since if we start with some A in a general gauge, a gaugetransformation with ∇2χ = −∇ ·A will yield a divergence-free vector potential. ThePoisson equation for χ can always be solved (it is just the same problem as solving forφ given ρ in electrostatics).

Adopting the Coulomb gauge, and using B = ∇ × A in the second of Eq. (3.1), wehave

∇× (∇×A) = µ0J

⇒ ∇(∇ ·A)−∇2A = µ0J

⇒ −∇2A = µ0J Coulomb gauge) . (3.2)

This is just the vector form of Poisson’s equation, and the solution with A tending tozero at infinity for a localised current distribution is

A(x) =µ0

∫J(x′)

|x− x′|d3x′ . (3.3)

We can check that the gauge condition is satisfied as follows:

∇ ·A =µ0

∫J(x′) ·∇

(1

|x− x′|

)d3x′

= −µ0

∫J(x′) ·∇x′

(1

|x− x′|

)d3x′

= −µ0

∫ [∇x′ ·

(J(x′)

|x− x′|

)− ∇x′ · J(x′)

|x− x′|

]d3x′

=µ0

∫ ∇x′ · J(x′)

|x− x′|d3x′ = 0 , (3.4)

3The related problem in electrostatics is that the potential φ is only determined up to a constantby the electric field. We fixed this gauge freedom by taking φ = 0 at infinity.

Part-II Electrodynamics 9

where we used the divergence theorem in the step leading to the last line, noting thatthe current vanishes at infinity for a localised current distribution, and ∇ · J = 0 inthe last equality. Note that ∇x′ is the gradient with respect to x′. The magnetic fieldfollows from taking the curl in Eq. (3.3):

B = ∇×A = −µ0

∫J(x′)×∇

(1

|x− x′|

)d3x′

=µ0

∫J(x′)× (x− x′)

|x− x′|3d3x′ . (3.5)

Finally, for a current I localised in a thin loop of wire we can replace J(x′)d3x′ → Idx′,and so

B(x) =µ0I

∮dx′ × (x− x′)

|x− x′|3. (3.6)

This is the Biot–Savart law.

3.1 Multipole expansion

As for electrostatics, we can ask what is the form of the magnetic field at a largedistance from a localised current distribution. Using the first two terms of Eq. (2.11)in Eq. (3.3), we have

A(x) =µ0

4πr

∫ (1 +

x · x′

r2+ · · ·

)J(x′) d3x′ . (3.7)

To make further progress, we establish the following useful result. Since ∇ ·J = 0 here,we have

0 =

∫S

(xjxk . . . xl)∇ · J d3x

=

∫S

∂xi[Ji(xjxk . . . xl)] d

3x−∫S

Ji∂

∂xi(xjxk . . . xl) d

3x

=

∫∂S

(xjxk . . . xl)J · dS−∫S

[(Jjxk . . . xl) + · · ·+ (xjxk . . . Jl)] d3x , (3.8)

where we used the generalisation of the divergence theorem in going to the last line.Extending the region of integration S to be all space, the surface term then vanishesfor a localised current distribution and so must the last term on the right of the finalequality. Special cases of this result that are useful for our purposes here are∫

J d3x = 0 ,

∫(Jixj + Jjxi) d

3x = 0 . (3.9)

The first of these results means that the first (monopole) term in Eq. (3.7) vanishes:there is no magnetic analogue of the total charge Q of electrostatics. This is because

Part-II Electrodynamics 10

there are no magnetic monopoles since ∇ · B = 0. For the second term in Eq. (3.7),we have

Ai(x) =µ0

4πr3

∫xjx

′jJi(x

′) d3x′

=µ0

4πr3

1

2

∫xj[x′jJi(x

′)− x′iJj(x′)]d3x′ , (3.10)

where we used the second of Eq. (3.9). It follows that

A(x) =µ0

4πr3

1

2

∫[x · x′J(x′)− x′x · J(x′)] d3x′

=µ0

4πr3

(1

2

∫x′ × J(x′) d3x′

)× x . (3.11)

Defining the magnetic dipole moment,

m ≡ 1

2

∫x′ × J(x′) d3x′ , (3.12)

we can write the vector potential at great distance from a localised current distributionas

A(x) =µ0

4πr3m× x + · · · , (3.13)

where the remaining terms are smaller than the dipole term by factors of a/r, where ais the spatial extent of the current distribution.

The magnetic moment is invariant under a change of origin since under x′ → x′ + a,the dipole transforms as

m→m +1

2

∫a× J(x′) d3x′ = m , (3.14)

since∫

J d3x = 0.

The magnetic field of a dipole follows from taking the curl of Eq. (3.13):

B(x) =µ0

4π∇×

(m× x

r3

)=µ0

[∇(

1

r3

)× (m× x) +

1

r3∇× (m× x)

]=µ0

[− 3

r5x× (m× x) +

1

r3(3m−m)

]=

µ0

4πr5

(3m · xx− r2m

). (3.15)

Part-II Electrodynamics 11

This has exactly the same form as the electric field of a dipole. Indeed, noting that

∇(m · x

r3

)=

m

r3− 3

m · xx

r5, (3.16)

we can write the magnetic field of a dipole as a gradient of a dipole potential:

B(x) = −µ0

4π∇(m · x

r3

). (3.17)

For the case of current flowing in a thin wire confined to a plane, but otherwise ofarbitrary shape, the magnetic moment can be written as the product of the currentand the vector area (in the right-handed sense with respect to positive current) of thecurrent loop, S. This follows since, replacing Jd3x with Idx, we have

m =1

2

∫x× J d3x

=I

2

∮x× dx

= IS . (3.18)

For more complicated current distributions, for which the dipole may vanish, it maybe necessary to consider higher multipoles of the current to obtain the field at greatdistances. A simple example is current flowing in a figure-of-eight loop. This is twoopposite dipoles displaced from each other and so has no net magnetic dipole moment.Instead, the quadrupole will dominate the asymptotic field.

3.2 Magnetic energy

In establishing a current distribution, work must be done against induced electromotiveforces. Incrementing the current density by dλJ(x), as we did in electrostatics, the fluxlinked by any current loop changes and the current-generating sources must do work.For a set of current loops, with final current Ii and each linking total flux Φi,

4 the workdone in establishing the current (in addition to any Ohmic losses in the wires) is

W =1

2

∑i

IiΦi , (3.19)

where the factor of 1/2 follows from linearity. The sum here is over all current loops.We can write this is in a more convenient form by noting that

Φ =

∫(∇×A) · dS =

∮A · dx (3.20)

4The magnetic flux linked by a current loop is Φ =∫

B · dS, where the integral is over any surfacebounded by the loop.

Part-II Electrodynamics 12

which follows from Stokes theorem. Making the usual replacement Idx → Jd3x, wehave

W =1

2

∑i

Ii

∮i

A · dx

=1

2

∫A · J d3x , (3.21)

where the volume integral can be taken over all space.

Exercise: Show that the expression on the right of Eq. (3.21) is gauge invariant as a conse-quence of ∇ · J = 0.

As with electrostatics, we can fruitfully think of this work as being potential energy ofthe magnetic field. Eliminating J from Eq. (3.21) with ∇×B = µ0J, we have

W =1

2µ0

∫A · (∇×B) d3x

=1

2µ0

εijk

∫Ai∂Bk

∂xjd3x

=1

2µ0

εijk

∫∂(AiBk)

∂xjd3x− 1

2µ0

εijk

∫Bk∂Ai∂xj

d3x

=1

2µ0

∫B · (∇×A) d3x

=1

2µ0

∫|B|2 d3x , (3.22)

where we used the generalisation of the divergence theorem to show that the first termin the third line is vanishing for localised current distributions. It follows that we cantake the magnetic energy density to be |B|2/(2µ0).

4 Review of special relativity

You met the revolutionary ideas of special relativity in the IA course Mechanics andRelativity. Here we recap the main ideas and introduce a more streamlined notationthat we shall use in the rest of the course.

Special relativity is based on the following two postulates:

Part-II Electrodynamics 13

• The laws of physics are the same in all inertial frames5.

• Light signals in vacuum propagate rectilinearly and with the same speed c in allinertial frames.

The first postulate means that any equation of physics must take the same form whenwritten in terms of spacetime coordinates in different inertial frames.

4.1 Lorentz transformations

The consequence of these two postulates is that for two inertial frames, S and S ′, withspatial axes aligned and with S ′ moving at speed v = βc along the positive x-axis of Sand with origins coincident at t = t′ = 0, the spacetime coordinates of any event arerelated by the (standard) Lorentz transformation:

ct′ = γ(ct− βx)

x′ = γ(x− βct)y′ = y

z′ = z , (4.1)

where γ ≡ (1 − β2)−1/2. The transformation for a general relative velocity v, withaxes still aligned, can be obtained by first co-rotating the spatial axes of S and S ′ sothat the relative motion is along the x-axis, performing the above transformation, andthen back-rotating. Such Lorentz transformations are called Lorentz boosts and theycontain no additional relative (spatial) rotation between the two frames. Note that theLorentz transformation of Eq. (4.1) reduces to the Galilean transformation for β 1.

By construction, Lorentz transformations preserve the spacetime interval between twoevents, i.e.,

(∆x′)2 − (c∆t′)2 = (∆x)2 − (c∆t)2 . (4.2)

If we write the spacetime coordinates as xµ = (ct, xi), with µ = 0 labelling the timecomponent, and µ = i = 1, 2, 3 labelling the space components, we can write theinvariant interval for infinitesimal separations in the form

ds2 = ηµνdxµdxν . (4.3)

5Inertial frames are frames of reference in which a freely-moving particle (i.e., one subject to noexternal forces) proceeds at constant velocity in accordance with Newton’s first law. A frame ofreference can be thought of as a set of test particles, all at rest with respect to each other, and eachequipped with a synchronised clock. Laying the test particles out on a Cartesian grid and assigningspatial coordinates, spacetime coordinates can be assigned to any event in spacetime by the spatialcoordinates of the test particle on whose wordline the event occurs, and by the time on the testparticle’s clock at the event.

Part-II Electrodynamics 14

Here, ηµν = diag(−1, 1, 1, 1) is the Minkowski metric and we have introduced thesummation convention that repeated spacetime (Greek) indices, with one upper andone lower, are summed over.

More generally, we define homogeneous Lorentz transformations to be transformationsfrom one system of spacetime coordinates xµ to another x′µ, with

x′µ = Λµνx

ν , (4.4)

such thatηµνΛ

µαΛν

β = ηαβ . (4.5)

Equivalently, we can write this in matrix form as ΛTηΛ = η. Such transformationsleave ηµνdx

µdxν invariant since

ηµνdx′µdx′ν = ηµνΛ

µαΛν

β︸ ︷︷ ︸ηαβ

dxαdxβ

= ηαβdxαdxβ . (4.6)

Note that when we sum over repeated indices, the exact symbol we assign to the“dummy” index is irrelevant. For the standard transformation of Eq. (4.1), we have

Λµν =

γ −γβ 0 0−γβ γ 0 0

0 0 1 00 0 0 1

. (4.7)

Homogeneous Lorentz transformations form a group. Closure in ensured since, for acomposition of two transformations represented by the matrix product Λ1Λ2, we have

(Λ1Λ2)Tη(Λ1Λ2) = ΛT2 ΛT

1 ηΛ1Λ2

= ΛT2 ηΛ2

= η . (4.8)

Clearly, the set of homogeneous Lorentz transformations contains the identity, Λµν =

δµν , and every element is invertible6 since ΛTηΛ = η implies that det(Λ)2 = 1. Weshall be most interested in the subgroup of restricted homogeneous Lorentz transfor-mations. These are transformations that are continuously connected to the identity,i.e., any element can be taken to the identity by smooth variation of its parameters. In

6The inverse of a Lorentz transformation is also a Lorentz transformation since multiplyingΛTηΛ = η on the left with (ΛT )−1 and on the right with Λ−1 implies that (Λ−1)TηΛ−1 = η.Physically, this makes sense since the inverse transforms back from x′µ to xµ.

Part-II Electrodynamics 15

particular, restricted Lorentz transformations have det(Λ) = +1 and7 Λ00 ≥ 1. Phys-

ically, they correspond to transformations between inertial frames that exclude spaceinversion or time reversal.

Since ηµν is symmetric, Eq. (4.5) puts 10 independent constraints on the 16 independentcomponents of Λµ

ν . It follows that a general homogeneous Lorentz transformation isdescribed by 16 − 10 = 6 parameters. Three of these encode the relative velocity ofthe two frames (i.e., the boost part) and three any additional spatial rotation.

The invariant interval ∆s2 = ηµν∆xµxν between two events allows us to classify all

spacetime events in relation to some given event as being either spacelike separated(∆s2 > 0), timelike (∆s2 < 0) or null (lightlike, with ∆s2 = 0). At any event, the setof null-separated events defines the light cone. Light rays emitted from some event andpropagating in vacuum have worldlines that lie on the lightcone of the emission event.Massive particles follow worldlines for which the tangent vector is timelike (i.e., hasnegative norm) and at any event a massive particle will be moving within the lightconein spacetime. For timelike- and null-separated events, the temporal ordering of theevents (i.e., the sign of ∆t) is Lorentz invariant so we can speak of the future and pastlight cone. This is not the case for spacelike-separated events.

4.2 4-vectors

In Newtonian physics, it is very convenient to write our equations in terms of 3D vectors(or tensors) to make manifest their transformation properties under spatial rotations.We now generalise this idea to spacetime 4-vectors and tensors. By expressing ourequations in terms of such objects, we can be sure that we are writing down equationsthat are form invariant under Lorentz transformations and hence satisfy the postulatesof special relativity.

We define a 4-vector to be an object Aµ that transforms like the displacement dxµ

under Lorentz transformations, i.e.,

Aµ → ΛµνA

ν for xµ → x′µ = Λµνx

ν . (4.9)

It follows that ηµνAµAν is invariant, defining a Lorentz-invariant norm. More generally,

we can define a Lorentz-invariant scalar product between two 4-vectors, Aµ and Bµ asηµνA

µBν .

7Note that the 00 element of Eq. (4.5) gives

η00 = η00(Λ00)2 +

∑i

ηii(Λi0)2

⇒ (Λ00)2 = 1 +

∑i

(Λi0)2 ≥ 1 .

Part-II Electrodynamics 16

We should properly call Aµ a contravariant 4-vector to distinguish it from a covariant4-vector. To motivate the introduction of the latter, consider the transformation of thedifferential operator ∂/∂xµ: the chain rule gives

∂x′µ=∂xν

∂x′µ∂

∂xν. (4.10)

Since∂xν

∂x′µ∂x′µ

∂xρ︸︷︷︸Λµ

ρ

= δνρ , (4.11)

it follows that ∂xν/∂x′µ = (Λ−1)νµ, where the right-hand side is the matrix representingthe inverse transformation. The Lorentz transformation law for partial derivatives istherefore

∂x′µ= (Λ−1)νµ

∂xν. (4.12)

Note the distinction to the transformation law for a contravariant 4-vector. It is con-venient to define the quantities

Λµν ≡ (Λ−1)νµ ; (4.13)

we then define covariant 4-vectors, Aµ, to be objects that transform under Lorentztransformations as

A′µ = ΛµνAν . (4.14)

Note the consistent index placement on both sides of all equations.

We can express Λµν in terms of Λµ

ν and the Minkowski metric directly by noting that

ΛTηΛ = η ⇒ Λ−1 = η−1ΛTη , (4.15)

or, in terms of components,

Λµν = (Λ−1)νµ = ηνρΛτ

ρητµ

= ηµτηνρΛτ

ρ , (4.16)

where we have written the components of the inverse of the metric as ηµν . Numerically,the components ηµν and ηµν are the same, and

ηµρηρν = δµν . (4.17)

The contraction of a contravariant vector Aµ and a covariant vector Bµ is Lorentzinvariant since

A′µB′µ = Λµ

νAνΛµτB

τ

= (Λ−1)νµΛµτAνB

τ

= δντAνBτ

= AνBν . (4.18)

Part-II Electrodynamics 17

Every contravariant vector can be mapped with the Minkowski metric into an associ-ated covariant vector and vice versa. For example, we can write

Aµ = ηµνAν , Aµ = ηµνAν . (4.19)

Explicitly, this means that A0 = −A0 and Ai = Ai. We can easily check that ηµνAν is

a covariant vector by considering its transformation law:

A′µ = ηµνA′ν

= ηµνΛντA

τ

= ηµνΛντη

τρ︸ ︷︷ ︸Λµ

ρ

= ΛµρAρ . (4.20)

The scalar product between two contravariant vectors, Aµ and Bµ, can then be writteneither as ηµνA

µBν , or as a contraction AµBµ between the contravariant Aµ and thecovariant Bµ (or the other way around).

For the covariant differential operator ∂µ ≡ ∂/∂xµ, we can form the Lorentz-invariant(scalar) operator

2 ≡ ηµν∂µ∂ν = − ∂2

∂(ct)2+∇2 . (4.21)

This is the operator that appears in the wave equation for waves propagating at speedc.

Alternative take on covariant and contravariant components of 4-vectors: In 3D Euclideanspace we are used to thinking of vectors as geometric objects, the archetypal example beingdisplacement vectors that connect two points in space. Let us generalise this idea to spacetimeso that the 4-vector dx connects two neighbouring events8. By taking unit displacementsalong the coordinate axes of some inertial frame, we can introduce a basis of four vectors eµ.Here, the subscript µ labels the element of the basis. In terms of this basis, we can write

dx = dxµeµ . (4.22)

Under a Lorentz transformation (i.e., a passive reparameterisation of spacetime with a dif-ferent set of inertial coordinates) the coordinate differentials dxµ transform but the 4-vectoritself is unchanged. This means that the basis vectors must also change to compensate thechange in the dxµ. Indeed, since eµ = ∂x/∂xµ, we have

e′µ =∂x∂x′µ

=∂xν

∂x′µ∂x∂xν

=∂xν

∂x′νeν = Λµνeν . (4.23)

8We shall temporarily denote 4-vectors by boldface in this brief interlude.

Part-II Electrodynamics 18

We introduce an inner product on spacetime; this is a bilinear symmetric function of two4-vectors, A = Aµeµ and B = Bµeµ, defined by

A ·B ≡ ηµνAµBν . (4.24)

It follows that we must haveeµ · eν = ηµν , (4.25)

and it is easy to verify that this is preserved under Lorentz transformations.

The dual basis of 4-vectors, denoted eµ, is defined by

eµ · eν = δµν , (4.26)

so that eµ = ηµνeν and eµ = ηµνeν . The dual basis transforms in the same way as the dxµ

under Lorentz transformations, e′µ = Λµνeν , to preserve the duality relation in Eq. (4.26).We can express a 4-vector A in terms of its components on either basis, i.e.,

A = Aµeµ = Aµeµ . (4.27)

As usual, we can extract the components by taking the inner product with the appropriatebasis 4-vectors: Aµ = A · eµ and Aµ = A · eµ. It follows that

Aµ = (Aνeν) · eµ = ηµνAν , (4.28)

and, similarly, Aµ = ηµνAν . Under Lorentz transformations,

A′µ = A · e′µ = A · (Λµνeν) = ΛµνAν , (4.29)

so the Aµ (the covariant components) transform in the same way as the coordinate basisvectors. This is the origin of the covariant and contravariant terminology for the components.

We see that the covariant components Aµ and the contravariant components Aµ are anequivalent description of the same geometric object, the 4-vector A. Some 4-vectors aremore naturally represented by their contravariant components, an obvious example being thedxµ of the spacetime displacement, while for others the contravariant components are morenatural. An example of the latter are the ∂/∂xµ of the spacetime gradient operator.

4.2.1 Examples of 4-vectors in relativistic kinematics

Velocity 4-vector. Consider a point particle following a worldline xµ(τ), where τ is theparticle’s proper time (i.e., time as measured on a clock carried with the particle). Thevelocity 4-vector is defined by

uµ = dxµ/dτ , (4.30)

Part-II Electrodynamics 19

where we are dividing a 4-vector dxµ by the scalar dτ . We can express the proper timein terms of the time in any inertial frame S by noting that

c2dτ 2 = −ηµνdxµdxν = c2dt2 − dx2

= c2dt2(1− v2/c2)

= c2dt2/γ2 , (4.31)

where γ is the Lorentz factor associated with the particle’s velocity v relative to S. Itfollows that dτ = dt/γ, and so the components of the velocity 4-vector in S are

uµ = γd

dt(ct,x) = (γc, γv) . (4.32)

Note the norm of uµ:

ηµνuµuν = (ηµνdx

µdxν)/(dτ)2 = −c2 . (4.33)

Momentum 4-vector. For a massive particle of rest-mass m, the momentum 4-vector isobtained from the velocity 4-vector by multiplication by m:

pµ ≡ muµ = (γmc, γmv) . (4.34)

The norm is ηµνpµpν = −m2c2. In some inertial frame, the components of pµ are E/c,

the (total) energy measured in that frame, and p, the relativistic 3-momentum, i.e.,

pµ = (E/c,p) where E = γmc2 and p = γmv . (4.35)

For an isolated system, the 4-momentum is conserved, i.e., the relativistic 3-momentumand energy are conserved.

More generally, the 3-momentum of a particle is changed by a force F according toNewton’s second law dp/dt = F. The interpretation of the time component of pµ asthe energy is then consistent since differentiating ηµνp

µpν = −m2c2 gives

EdE/dt = c2p · dp/dt= c2p · F

⇒ dE/dt = p · F/(γm) = v · F , (4.36)

where the quantity on the right of the final equality is the usual rate of working of theforce. We can introduce a 4-vector force F µ as F µ = dpµ/dτ ; then

F µ = γd

dt

(E

c,p

)=

c

dE

dt, γF

)=(γcv · F, γF

). (4.37)

Part-II Electrodynamics 20

The 4-force is necessarily orthogonal to pµ (and so to uµ) to preserve the norm of pµ.

Acceleration 4-vector. The 4-acceleration aµ is

aµ = duµ/dτ . (4.38)

Constancy of the norm ηµνuµuν = −c2 implies that aµ is orthogonal to uµ. The

components of aµ are

aµ = γd

dt(γc, γv)

= γ

(cdγ

dt,dγ

dtv + γa

)= γ2

(γ2

cv · a, γ

2

c2a · vv + a

), (4.39)

where a = dv/dt is the usual 3-acceleration and we used dγ/dt = γ3a · v/c2, whichfollows from differentiating γ−2 = 1− v · v/c2. In the instantaneous rest frame of theparticle (i.e., the frame in which it is at rest at any instant), the components of the4-acceleration are simply aµ = (0, arest). It follows that the norm of aµ is the square ofthe rest-frame acceleration: ηµνa

µaν = a2rest.

Example: A particle moves along the x-axis of some inertial frame S with constant rest-frameacceleration α. At proper time τ , suppose that the particle is at x(τ) and t(τ) in S and ismoving with speed βc (along the x-axis). In its instantaneous rest frame, the components ofthe 4-acceleration are a′µ = α(0, 1, 0, 0). Performing the inverse Lorentz transform back to Sgives

aµ = duµ/dτ = (γβα, γα, 0, 0) . (4.40)

Since the velocity 4-vector is uµ = (γc, γcβ, 0, 0), we have

d(γc)dτ

= γβα and cd(γβ)dτ

= γα . (4.41)

To solve these, we note that dγ/dτ = γ3βdβ/dτ ; combining with Eq. (4.41), we find

dτ=

α

γ2c=α

c(1− β2) . (4.42)

The solution with β = 0 at τ = 0 is

β(τ) = tanh(ατ/c) and γ(τ) = cosh(ατ/c) . (4.43)

As expected, the speed asymptotes to c (β = 1). We can solve for the wordline of the particleusing

dx

dτ= γ

dx

dt= γcβ = c sinh(ατ/c) . (4.44)

Part-II Electrodynamics 21

The solution with x = c2α at τ = 0 (for convenience) is

x(τ) =c2

αcosh(ατ/c) . (4.45)

Similarly, dt/dτ = γ = cosh(ατ/c) gives

t(τ) =c

αsinh(ατ/c) , (4.46)

taking t = 0 at τ = 0. We see that the motion is hyperbolic in the x-t plane, and asymptotesto x = c|t| as |τ | → ∞. The wordline is entirely within the “Rindler wedge”, x > c|t|. Notethat if we consider an observer at rest in S at the origin, sending out light signals to theaccelerated particle, only signals emitted before t = 0 are ever received by the acceleratedparticle.

4.3 4-tensors

A tensor of type(pq

)has p upper (contravariant) indices and q lower (covariant) indices.

Such a tensor T µν...ρσ... transforms as

T ′µν...ρσ... = ΛµαΛν

β . . .ΛργΛσ

δ . . . Tαβ...γδ... (4.47)

under x′µ = Λµνx

ν . As with 4-vectors, one can raise and lower indices with ηµν andηµν , for example

T µν = ηναT µα (4.48)

which takes a type-(

11

)tensor to a type

(20

).

The following rules apply to tensor manipulations.

• Contractions may only be made between upper and lower indices. Contracting a(pq

)tensor returns a

(p−1q−1

)tensor. For example Tαα is a Lorentz scalar since

T ′αα = ΛαρΛα

δ︸ ︷︷ ︸δδρ

T ρδ

= T ρρ . (4.49)

Note that the specific symbol assigned to a contracted (dummy) index is irrele-vant.

• Symmetrisation/antisymmetrisation may be performed only over upper or lowerindices (but not mixed). Symmetrisation is denoted with round brackets andantisymmetrisation with square brackets:

T (αβ) ≡ 1

2

(Tαβ + T βα

)and T [αβ] ≡ 1

2

(Tαβ − T βα

). (4.50)

Part-II Electrodynamics 22

• Contraction, symmetrisation and index raising/lowering commute with Lorentztransformations. For example,

2T ′(αβ) = T ′αβ + T ′βα

= ΛασΛβ

ρTσρ + Λβ

σΛαρT

σρ

= ΛαρΛ

βσ (T ρσ + T σρ)

= 2ΛαρΛ

βσT

(ρσ) . (4.51)

• Partial differentiation takes a(pq

)tensor to a

(pq+1

)tensor. For example, under a

Lorentz transformation

∂Tαβ∂xµ

→ ∂T ′αβ∂x′µ

= ΛµρΛα

σΛβτ ∂T

στ

∂xρ, (4.52)

and ∂µTαβ is a

(12

)tensor.

Finally, we return to the Minkowski metric. We now see that this is a(

02

)tensor that

takes the same form in all Lorentz frames. This follows since (Λ−1)TηΛ−1 = η implies

ηµν = ΛµρΛν

σηρσ . (4.53)

Similarly, ηµν is a(

20

)tensor and

ηµν = ηµρηρν = δµν (4.54)

is of type(

11

).

Having introduced all the machinery we need, we shall now see how to rewrite Maxwell’sequations as 4-tensor equations, thus making manifest their Lorentz invariance.

5 Maxwell’s equations and special relativity

5.1 Current 4-vector

We begin by considering the source terms ρ and J in Maxwell’s equations from arelativistic perspective. We shall establish that

Jµ = (ρc,J) (5.1)

is a 4-vector (the current 4-vector) by considering the transformation properties ofcharge density and current density under Lorentz transformations.

Part-II Electrodynamics 23

Consider charges at rest in some frame S ′ giving a charge density ρ0. In this frameJ ′µ = (ρ0c, 0, 0, 0). How do these charges appear in a Lorentz boosted frame S in whichthe charges have velocity v along the x-axis? By length contraction, the number densityis increased by the Lorentz factor γ, and hence the charge density is ρ = γρ0. Moreover,since this density of charge has velocity (v, 0, 0), the current density is J = (γρ0v, 0, 0).Hence in the frame S, (ρc,J) = (γρ0c, γρ0v, 0, 0). But this is exactly what we get fromapplying an inverse Lorentz transformation to J ′µ since

J0 = γ(J ′0 + βJ ′1) = γρ0c

J1 = γ(J ′1 + βJ ′0) = γρ0βc = γρ0v . (5.2)

This argument generalises to charges with a range of velocities by summing over setsof charges that are individually comoving.

The continuity equation∂ρ

∂t+ ∇ · J = 0 , (5.3)

can now be written in terms of the 4-current as

∂µJµ = 0 , (5.4)

which is manifestly form-invariant under Lorentz transformations.

5.2 Maxwell’s equations and the 4-potential

In Sec. 3 we introduced the vector potential A for the magnetic field with B = ∇×A.Since ∇ ·B = 0 holds generally even for time-dependent fields, we can always write Bin terms of A. We cannot generally write E = −∇φ though, since

∇× E = −∂B

∂t. (5.5)

However, substituting B = ∇×A, we see that E + ∂A/∂t is curl-free, and so we canalways write

E = −∂A

∂t−∇φ , (5.6)

where φ is the electric potential.

We met the idea of gauge transformations in Sec. 3. There, we argued that the physicalmagnetic field B is unchanged by a transformation of the vector potential A of theform A→ A + ∇χ. We now see that if the electric field is also to remain unchanged,we must simultaneously change the electric potential as φ→ φ− ∂χ/∂t since then

− ∂

∂t(A + ∇χ)−∇

(φ− ∂χ

∂t

)= E− ∂

∂t∇χ+ ∇∂χ

∂t

= E . (5.7)

Part-II Electrodynamics 24

We now write the remaining two of Maxwell’s equations,

∇ · E =ρ

ε0(5.8)

∇×B = µ0J +1

c2

∂E

∂t, (5.9)

in terms of φ and A. (Note that in Eq. (5.9) we have used ε0µ0 = 1/c2.) Equation (5.9)gives

∇× (∇×A) = µ0J−1

c2

∂t

(∂A

∂t+ ∇φ

)⇒ ∇∇ ·A−∇2A = µ0J−

1

c2

∂2A

∂t2− 1

c2∇∂φ

∂t. (5.10)

Combining terms, and recognising the wave operator 2 = ∇2−∂2/∂(ct)2 acting on A,we have

2A−∇(

∇ ·A +1

c2

∂φ

∂t

)= −µ0J . (5.11)

We can write Eq. (5.8) in similar looking form:

2

c

)+

∂(ct)

(∇ ·A +

1

c2

∂φ

∂t

)= −µ0ρc , (5.12)

where we added and subtracted ∂2φ/∂(ct)2 and replaced ε0 in favour of µ0 and c. Letus introduce a candidate 4-vector potential with components

Aµ ≡ (φ/c,A) . (5.13)

We can then combine Eqs. (5.11) and (5.12) into a single equation

2Aµ − ∂µ(∂νAν) = −µ0J

µ , (5.14)

where ∂µ = ηµν∂ν = (−∂/∂(ct),∇), and Jµ is the 4-current. Clearly, this equation ismanifestly covariant if Aµ is indeed a 4-vector, i.e., we have established that Maxwell’stheory is Lorentz invariant provided that the potentials φ and A transform underLorentz transformations like the time and space components of a 4-vector.

As well as being covariant under Lorentz transformations, Eq. (5.14) is gauge invariantas it must be. To see this, note that the gauge transformation φ → φ − ∂χ/∂t andA→ A + ∇χ implies

Aµ → Aµ + ∂µχ (5.15)

for the 4-potential. The left-hand side of Eq. (5.14) is invariant under this transforma-tion since

2Aµ − ∂µ(∂νAν)→ 2(Aµ + ∂µχ)− ∂µ(∂νA

ν + 2χ)

= 2Aµ − ∂µ(∂νAν) . (5.16)

Part-II Electrodynamics 25

Note also how the charge continuity equation (5.4) is built into Eq. (5.14); since

∂µ [2Aµ − ∂µ(∂νAν)] = 2(∂µA

µ)−2(∂νAν) = 0 , (5.17)

it follows that ∂µJµ = 0 as required.

We shall return to Eq. (5.14) in Sec. 7 when we consider the emission of electromagneticradiation. Being a sourced wave equation, it can be solved straightforwardly in termsof a suitable Green’s function once the gauge is fixed. Before doing so, we shall consideran alternative relativistic formulation of Maxwell’s equations that is more obviouslyrelated to their 3D first-order (in space and time derivatives) form, and for whichgauge-invariance is manifest.

5.3 Maxwell field-strength tensor

We define the Maxwell field-strength tensor

F µν = ∂µAν − ∂νAµ = 2∂[µAν] , (5.18)

which is a type-(

20

)antisymmetric tensor. To motivate introducing F µν consider its

components. For example,

F 01 = ∂0A1 − ∂1A0

= − ∂Ax∂(ct)

− ∂(φ/c)

∂x

= Ex/c , (5.19)

and

F 12 = ∂1A2 − ∂2A1

=∂Ay∂x− ∂Ax

∂y

= Bz . (5.20)

Evaluating the other components similarly gives

F µν =

0 Ex/c Ey/c Ez/c

−Ex/c 0 Bz −By

−Ey/c −Bz 0 Bx

−Ez/c By −Bx 0

. (5.21)

We see that the six linearly-independent components of F µν encode the 3 + 3 com-ponents of the electric and magnetic fields. Note that under gauge transformations,Aµ → Aµ + ∂µχ and F µν is invariant since ∂[µ∂ν]χ = 0.

Part-II Electrodynamics 26

We now look to write Maxwell’s equations directly in terms of the field-strength tensor.Since Maxwell’s equations are first-order in space and time derivatives acting on theelectric and magnetic fields, consider

∂µFµν = ∂µ (∂µAν − ∂νAµ)

= 2Aν − ∂ν(∂µAµ) . (5.22)

This is exactly the 4-vector on the left-hand side of Eq. (5.14) so we can write

∂µFµν = −µ0J

ν . (5.23)

This 4-vector equation is equivalent to the sourced Maxwell’s equations (5.8) and (5.9).We can verify this explicitly; for example with ν = 0 we have

∂µFµ0 = −µ0J

0

⇒ ∂F i0

∂xi= −µ0ρc

⇒ −1

c∇ · E = −µ0ρc , (5.24)

which gives Eq. (5.8). For ν = 1, we have

∂µFµ1 = −µ0J

1

⇒ ∂F 01

∂(ct)+∂F 21

∂y+∂F 31

∂z= −µ0Jx

⇒ ∂(Ex/c)

∂(ct)− ∂Bz

∂y+∂By

∂z= −µ0Jx , (5.25)

which is the x-component of Eq. (5.9).

Charge continuity is implicit in Eq. (5.23) due to the antisymmetry of F µν ; applying∂ν we have

∂ν∂µFµν = −µ0∂νJ

ν

⇒ ∂[ν∂µ]Fµν = −µ0∂νJ

ν (antisymmetry of F µν)

⇒ 0 = −µ0∂νJν . (5.26)

The other of Maxwell’s equations, ∇ ·B = 0 and ∇×E = −∂B/∂t, are implicit in thedefinition of the field-strength tensor as the antisymmetric derivative of the 4-potential.If we lower indices to obtain9

Fµν = ∂µAν − ∂νAµ , (5.28)

9The components of Fµν are

Fµν =

0 −Ex/c −Ey/c −Ez/c

Ex/c 0 Bz −ByEy/c −Bz 0 BxEz/c By −Bx 0

. (5.27)

Part-II Electrodynamics 27

taking a further derivative and antisymmetrising gives

∂[ρFµν] =1

3(∂ρFµν + ∂νFρµ + ∂µFνρ)

=1

3[∂ρ (∂µAν − ∂νAµ) + ∂ν (∂ρAµ − ∂µAρ) + ∂µ (∂νAρ − ∂ρAν)]

= 0 , (5.29)

where we used the antisymmetry of Fµν in the first line.

The tensor equation

∂[ρFµν] = 0 or equivalently ∂ρFµν + ∂νFρµ + ∂µFνρ = 0 , (5.30)

is 4×3×2/3! = 4 independent equations corresponding to the scalar and vector-valuedsource-free Maxwell equations. To see this explicitly, consider taking ρ = 0, µ = 1 andν = 2; then

∂F12

∂(ct)+∂F01

∂y+∂F20

∂x= 0

⇒ ∂Bz

∂(ct)− ∂(Ex/c)

∂y+∂(Ey/c)

∂x= 0

⇒ −∂Bz

∂t= (∇× E)z . (5.31)

Taking 0, 1, 3 and 0, 2, 3 give the y- and x-components of the Faraday equation (1.2).Finally, taking 1, 2, 3 gives

∂F23

∂x+∂F12

∂z+∂F31

∂y= 0

⇒ ∂Bx

∂x+∂Bz

∂z+∂By

∂y= 0

⇒ ∇ ·B = 0 . (5.32)

We see that Maxwell’s equations can be written in manifestly covariant form in termsof the field-strength tensor as

∂µFµν = −µ0J

ν and ∂[ρFµν] = 0 . (5.33)

The second of these equations is sometimes rewritten in terms of the dual field-strengthtensor 10 ∗F µν defined by

∗F µν ≡ 1

2εµνρσFρσ , (5.34)

10This is analogous to what we do in 3D when we dualise the antisymmetric second-rank tensor∂[iAj] by contracting with εijk to get the dual (pseudo-)vector ∇×A. The dual is properly called apseudo-vector since εijk is only invariant under those orthogonal transformations of coordinates thathave unit determinant, i.e., avoid space inversion.

Part-II Electrodynamics 28

where the alternating (pseudo-)tensor is the fully-antisymmetric object defined by

εµνρσ =

1 if µνρσ is an even permutation of 0, 1, 2, 3

−1 if µνρσ is an odd permutation of 0, 1, 2, 3

0 otherwise .

(5.35)

Under a Lorentz transformation, applying the usual tensor transformation law, we have

εµνρσ → ΛµαΛν

βΛργΛ

σδεαβγδ . (5.36)

But εµνρσ = ε[µνρσ] hence

ΛµαΛν

βΛργΛ

σδεαβγδ = Λµ

αΛνβΛρ

γΛσδε

[αβγδ]

= Λµ[αΛν

βΛργΛ

σδ]ε

αβγδ

= Λ[µαΛν

βΛργΛ

σ]δεαβγδ , (5.37)

and so the transformed object is fully antisymmetric and so must be proportional toεµνρσ. Considering the one linearly-independent component 0, 1, 2, 3, we have

Λ0αΛ1

βΛ2γΛ

3δεαβγδ = det(Λ) , (5.38)

where we have used the definition of the determinant. Since ε0123 = 1, it follows that

εµνρσ → det(Λ)εµνρσ . (5.39)

If we consider only restricted Lorentz transformations (recall these are continuouslyconnected to the identity; see Sec. 4.1) we have det(Λ) = 1 and εµνρσ is an invarianttensor under such transformations. Finally, note that on lowering indices, ε0123 = −1.Returning to ∗F µν , we now see that it is a tensor under restricted Lorentz transforma-tions.

Straightforward calculation, for example,

∗F 01 =1

2

(ε0123F23 + ε0132F32

)= F23 = Bx , (5.40)

establishes the components of ∗F µν as

∗F µν =

0 Bx By Bz

−Bx 0 −Ez/c Ey/c−By Ez/c 0 −Ex/c−Bz −Ey/c Ex/c 0

. (5.41)

These components are related to those of F µν by the duality transformation E → cBand B→ −E/c.

Part-II Electrodynamics 29

The covariant Maxwell equation ∂[ρFµν] = 0 can be written in terms of ∗Fµν as follows:

∂[ρFµν] = 0 ⇒ 1

2εαρµν∂ρFµν = 0 , (5.42)

so that∂µ∗F µν = 0 . (5.43)

Finally, we can write Maxwell’s covariant equations as the pair

∂µFµν = −µ0J

ν and ∂µ∗F µν = 0 . (5.44)

The lack of duality symmetry (i.e., there is no source term for the dual field strength)reflects the fact that there are no known magnetic charges (monopoles) in nature.

5.4 Lorentz transformations of E and B

Knowing that E and B form the components of the field-strength tensor, we can nowcalculate their transformation laws under Lorentz transformations. Generally, we have

F µν → F ′µν = ΛµαΛν

βFαβ . (5.45)

Specialising to the standard Lorentz boost of Eq. (4.7), we find, for example,

F ′01 = Λ0αΛ1

βFαβ

= Λ01Λ1

0F10 + Λ0

0Λ11F

01

⇒ E ′x/c = (−γβ)(−γβ)(−Ex/c) + γγEx/c

⇒ E ′x = γ2(1− β2)Ex = Ex . (5.46)

Similarly,

F ′02 = Λ0αΛ2

βFαβ

= Λ00Λ2

2F02 + Λ0

1Λ22F

12

⇒ E ′y/c = γEy/c− γβBz

⇒ E ′y = γ(Ey − vBz) , (5.47)

where v = βc. Repeating for the other components gives

E′ =

Exγ(Ey − vBz)γ(Ez + vBy)

and B′ =

Bx

γ(By + vEz/c2)

γ(Bz − vEy/c2)

. (5.48)

We see that the components of the fields along the direction of the relative velocity donot change, but the perpendicular components mix E and B. In the non-relativisticlimit v c, we recover the Galilean transformation E→ E + v ×B and B→ B.

Part-II Electrodynamics 30

Example: In the inertial frame S, charges lie along the x-axis with charge per length λ. Thesegenerate an electric field with points out radially from the axis with magnitude E = λ/(2πε0r)at distance r from the axis (from Gauss’s law). In Cartesian coordinates, we have

E =λ

2πε0(y2 + z2)

0yz

. (5.49)

The magnetic field vanishes since there is no current in S. We now transform to the frameS′ moving at speed v along the x-axis to find

E′ =λγ

2πε0(y2 + z2)

0yz

=λγ

2πε0(y′2 + z′2)

0y′

z′

, (5.50)

B′ =γv

c2

λ

2πε0(y2 + z2)

0z−y

=λγvµ0

2π(y′2 + z′2)

0z′

−y′

. (5.51)

These fields should be consistent with the charges and currents in S′. By length contraction,the charge per length is γλ in S′ and this does indeed generate the radial field in Eq. (5.50)by Gauss’s law. The charges are all moving with velocity (−v, 0, 0) in S′, so the current inthe wire is −γλv along the x′-axis. This current generates an azimuthal field that circles thenegative x′-axis in a right-handed sense. The magnitude follows from integrating ∇×B = µ0Jaround a circular contour in the y′-z′ plane centred on the x′-axis: |B′| = µ0I

′/(2πr′), whereI ′ = γλv is the current in S′. Reassuringly, we see that direct calculation of the magneticfield from Maxwell’s equations in S′ gives results consistent with the Lorentz transformationof the fields calculated directly in S.

Example: The fields of a uniformly moving charge can be easily derived by transforming theelectric field from the rest frame. Let the charge q be at rest in the inertial frame S at theorigin. The electric field is simply the Coulomb field,

E =q

4πε0(x2 + y2 + z2)3/2

xyz

. (5.52)

The magnetic field vanishes in S. Transforming to S′, which moves at speed v along the

Part-II Electrodynamics 31

x-axis, the electric field becomes

E′ =q

4πε0(x2 + y2 + z2)3/2

xγyγz

=

4πε0[γ2(x′ + vt′)2 + y′2 + z′2]3/2

x′ + vt′

y′

z′

, (5.53)

where we further expressed the coordinates in S′ in terms of those in S in the second equality.In S′, the charge is at x′ = (−vt′, 0, 0) at time t′ so we see that the electric field is radiallyout from the current position of the charge. However, the electric field is anisotropic. If wedenote by r′ the displacement vector connecting the current position of the charge to theobservation point, so that r′ = (x′ + vt′, y′, z′), and let ψ be the angle between that r′ makeswith the x′-axis (i.e.,, the direction of motion), we can write

γ2(x′ + vt′)2 + y′2 + z′2 = (γ2 − 1)(x′ + vt′)2 + r′2

= r′2(γ2β2 cos2 ψ + 1

)= r′2γ2

(1− β2 sin2 ψ

), (5.54)

where β = v/c as usual. The electric field is therefore

E′ =qr′

4πε0r′2γ2(1− β2 sin2 ψ)3/2. (5.55)

Along the x′-axis (ψ = 0), the field is smaller by a factor of 1/γ2 than for a charge at rest atthe current position, while perpendicular to this (ψ = π/2) the field is enhanced by a factorof γ over that for a charge at rest. The field lines are thus concentrated close to the planeperpendicular to the motion for a rapidly moving particle.

There is also an azimuthal magnetic field with cB′y = βE′z and cB′z = −βE′y.

Although the electric and magnetic fields transform in a non-trivial way under Lorentztransformations, there are two (scalar) quadratic invariants11. The first of these isFµνF

µν/2, which is invariant by construction. In terms of the components,

1

2FµνF

µν =1

2Tr

0 Ex/c Ey/c Ez/c−Ex/c 0 −Bz By

−Ey/c Bz 0 −Bx

−Ez/c −By Bx 0

0 Ex/c Ey/c Ez/c−Ex/c 0 Bz −By

−Ey/c −Bz 0 Bx

−Ez/c By −Bx 0

= B2 − E2/c2 . (5.56)

11There is no invariant that is linear in the fields since ηµνFµν = 0 (recall that the field-strengthtensor is antisymmetric).

Part-II Electrodynamics 32

It is straightforward to verify the invariance of B2 − E2/c2 directly from Eq. (5.48).The other invariant follows from contracting F µν with its dual:

−1

4Fµν

∗F µν = −1

4Tr

0 Ex/c Ey/c Ez/c−Ex/c 0 −Bz By

−Ey/c Bz 0 −Bx

−Ez/c −By Bx 0

×

0 Bx By Bz

−Bx 0 −Ez/c Ey/c−By Ez/c 0 −Ex/c−Bz −Ey/c Ex/c 0

= E ·B/c . (5.57)

The third contraction we might consider forming is ∗Fµν∗F µν . However, this does not

yield an independent invariant since ∗Fµν∗F µν = FµνF

µν . Higher-order invariants, suchas det(F µ

ν) are also sometimes useful.

5.5 Lorentz force law

Recall that the relativistic force law

dpµ/dτ = F µ (5.58)

involves the 4-force F µ with components

F µ =(γcv · F, γF

), (5.59)

where F is the 3-force on the particle and v is its velocity. Necessarily, F µuµ = 0. Forthe Lorentz force law for a particle of charge q,

F = q(E + v ×B) , (5.60)

the rate of working v · F = qv · E is from the electric field alone. Substituting intoEq. (5.59), we find the Lorentz 4-force

F µ = γq

(1

cv · E,E + v ×B

). (5.61)

We now seek to express this 4-force in terms of the covariant field-strength tensor and4-velocity.

Given the form of the components of F µ, a sensible thing to try is qF µνuν . This isorthogonal to uµ since the field-strength is antisymmetric. Using uµ = γ(−c,v), we

Part-II Electrodynamics 33

have

qF 0νuν = qF 0iui

= qγE · v/c , (5.62)

which is indeed the required time component. For the space components, we have, forexample,

qF 1νuν = qF 10u0 + qF 12u2 + qF 13u3

= q

(−Exc

)(−γc) + qBz(γvy)− qBy(γvz)

= γq(E + v ×B)x , (5.63)

with the analogous result for the other two components. It follows that we can writethe 4-force as F µ = qF µνuν , and the equation of motion of a point charge as

dpµ/dτ = qF µνuν . (5.64)

5.6 Motion in constant, uniform fields

We now consider the motion of a point particle of rest mass m and charge q in uniformelectric and magnetic fields. The field-strength tensor is independent of spacetimeposition and the equation of motion (5.64) becomes

duµ

dτ=

q

mF µ

νuν , (5.65)

with uµ = dxµ/dτ . The mixed components of the field-strength tensor are

F µν =

0 Ex/c Ey/c Ez/c

Ex/c 0 Bz −By

Ey/c −Bz 0 Bx

Ez/c By −Bx 0

. (5.66)

The solution of Eq. (5.65) with initial condition uµ = uµ(0) at τ = 0 is

uµ(τ) = exp(qτmF)µ

νuν(0) , (5.67)

where, as usual, the exponential is defined by its power series:

exp( qmFτ)µ

ν = δµν +qτ

mF µ

ν +1

2

(qτm

)2

F µρF

ρν + · · · . (5.68)

Integrating again gives the worldline

xµ(τ) = xµ(0) +

[∫ τ

0

exp

(qτ ′

mF

)µν dτ

′]uν(0) , (5.69)

Part-II Electrodynamics 34

where the spacetime position at τ = 0 is xµ(0). Now, for an invertible matrix M, wehave ∫ τ

0

exp(Mτ ′) dτ ′ =

∫ τ

0

(I + Mτ ′ +

1

2M2τ ′2 +

1

3!M3τ ′3 + · · ·

)dτ ′

= τI +1

2Mτ 2 +

1

3!M2τ 3 +

1

4!M3τ 4 + · · ·

= M−1

(I + Mτ +

1

2M2τ 2 +

1

3!M3τ 3 + · · · − I

)= M−1 [exp(Mτ)− I] . (5.70)

With a suitable choice of spacetime origin, we can therefore write

xµ(τ) =m

q(F−1)µν exp

(qτmF)ν

ρuρ(0) for det(F µ

ν) 6= 0 . (5.71)

Exercise: Show that the quartic invariant

det(Fµν) = −(E ·B)2/c2 . (5.72)

We first consider the case where E · B 6= 0, i.e., both E and B are non-zero and arenot perpendicular. (Note that this holds in all inertial frames, if true in one, due tothe Lorentz invariance of E ·B.) We shall now show that a frame can always be foundin which the E and B fields are parallel. The motion turns out to be rather simple inthis frame, and the motion in a general frame can then by found by an appropriateLorentz transformation.

For the case that E and B are not parallel in S, let us choose spatial axes such thatboth fields lie in the y-z plane. It follows that under a boost along the x-axis, thetransformed fields are still in the y′-z′ plane with components

E′ = γ

0Ey − vBz

Ez + vBy

and B′ = γ

0By + vEz/c

2

Bz − vEy/c2

. (5.73)

Since we want to demonstrate that for some β = v/c with |β| < 1, we can set E′ andB′ parallel, consider E′ ×B′ (which lies along the x′-axis):

(E′ ×B′)x = γ2[(Ey − vBz)(Bz − vEy/c2)− (Ez + vBy)(By + vEz/c

2)]

= γ2[(E×B)x(1 + β2)− v|B|2 − v|E/c|2

]. (5.74)

Part-II Electrodynamics 35

It follows that if we choose the velocity such that

β

1 + β2=

(E×B)x/c

|B|2 + |E/c|2, (5.75)

then the electric and magnetic fields will be parallel in S ′. However, this is onlypossible if we can solve Eq. (5.75) for |β| < 1. For given |E| and |B|, the magnitude|(E × B)x| < |E||B| since E and B are not perpendicular. Writing |B| = µ|E|/c forsome dimensionless µ > 0, it follows that the magnitude of the right-hand side ofEq. (5.75) satisfies ∣∣∣∣ (E×B)x/c

|B|2 + |E/c|2

∣∣∣∣ < µ

1 + µ2. (5.76)

The maximum value of the function µ/(1 +µ2) is 1/2 (at µ = 1) and so the right-handside of Eq. (5.75) is necessarily less than 1/2. It follows that we can always find a βwith |β| < 1, such that Eq. (5.75) is satisfied and the transformed electric and magneticfields are parallel.

Adopting a frame in which E and B are parallel, and now taking these to lie along thex-axis, we have E = (E, 0, 0) and B = (B, 0, 0) so that the field-strength tensor takesthe block-diagonal form

F µν =

0 E/c 0 0E/c 0 0 0

0 0 0 B0 0 −B 0

. (5.77)

Noting that (0 11 0

)2

= I and

(0 1−1 0

)2

= −I , (5.78)

we have

exp

(0 ww 0

)= I + w

(0 11 0

)+w2

2I +

w3

3!

(0 11 0

)+w4

4!I + · · ·

= coshwI + sinhw

(0 11 0

), (5.79)

and

exp

(0 w′

−w′ 0

)= I + w′

(0 1−1 0

)− w′2

2I− w′3

3!

(0 1−1 0

)+w′4

4!I + · · ·

= cosw′I + sinw′(

0 1−1 0

). (5.80)

Part-II Electrodynamics 36

It follows that

exp(qτmF)µ

ν =

cosh

(qτmcE)

sinh(qτmcE)

0 0sinh

(qτmcE)

cosh(qτmcE)

0 00 0 cos

(qτmB)

sin(qτmB)

0 0 − sin(qτmB)

cos(qτmB) , (5.81)

which, from Eq. (5.71), gives the motion

ct(τ) =mc

qE

[sinh

( qτmc

E)ct(0) + cosh

( qτmc

E)x(0)

]x(τ) =

mc

qE

[cosh

( qτmc

E)ct(0) + sinh

( qτmc

E)x(0)

]y(τ) =

m

qB

[sin(qτmB)y(0)− cos

(qτmB)z(0)

]z(τ) =

m

qB

[cos(qτmB)y(0) + sin

(qτmB)z(0)

]. (5.82)

Notice how, with E and B parallel, the motion in the ct-x plane decouples from thatin the y-z plane. The ct-x motion is a hyperbolic trajectory (like for the example ofconstant rest-frame acceleration given earlier) with

x2 − (ct)2 =

(mc

qE

)2 ([ct(0)]2 − x2(0)

)> 0 , (5.83)

where the inequality follows from [ct(0)]2 − x2(0) = c2. Note that

[ct(τ)]2 − x2(τ) = [ct(0)]2 − x2(0) . (5.84)

The motion in the y-z plane is circular with

y2 + z2 =

(m

qB

)2 (y2(0) + z2(0)

). (5.85)

The radius of the circle is controlled by the transverse proper-time speed (y2 + z2)1/2

which is constant sincey2(τ) + z2(τ) = y2(0) + z2(0) . (5.86)

Defining the cyclotron frequency ωc ≡ qB/m, we see that this is the angular frequencyof the circular motion with respect to proper time. Combining the two motions gives,generally, a helical motion along the field direction. The radius of the circular projec-tion is constant in time, but the spacing along the x-axis grows in time and asymptot-ically the spacing is constant in lnx. As the speed along the x-axis approaches c, theperiod of the circular motion in t gets significantly dilated and the transverse speed

[(dy/dt)2 + (dz/dt)2]1/2

falls towards zero.

Part-II Electrodynamics 37

To complete our treatment of motion in constant, uniform fields we now consider thespecial cases for which E ·B = 0.

B = 0. The motion can be obtained as the limit of Eq. (5.82) as B → 0 (with anappropriate shift in the choice of origin which is infinite in the limit). The worldline ishyperbolic in the ct-x plane, but now linear in the y-z plane corresponding to circularmotion in that plane with infinite radius. The velocity in the y-z plane (perpendicularto the electric field) asymptotes to zero.

E = 0. The motion can be obtained as the limit of Eq. (5.82) as E → 0. The motion inthe ct-x plane is linear with ct and x constant, corresponding to a constant 3-velocitycomponent along the x-axis (the direction of the magnetic field). Constancy of ctensures that the speed of the particle is constant, as required since the magnetic fielddoes no work. In the y-z plane, the motion is circular with proper angular frequencyωc. If the 3-velocity along the field direction is vx, and the perpendicular velocity isv⊥, the constant Lorentz factor is γ−2 = 1−v2

x/c2−v2

⊥/c2 (and t = γ) and the angular

frequency of the circular motion with respect to coordinate time t is ωc/γ. The spacingof the helical motion along the x-axis is constant. The radius r of the circular motionfollows from

r2 =1

ω2c

(y2(0) + z2(0)

)=γ2v2

⊥ω2c

. (5.87)

This simple helical motion can be easily obtained directly from the equations of motionin 3D form. Since the speed is constant, dp/dt = γmdv/dt and the Lorentz force lawgives

dv

dt=

q

γmv ×B , (5.88)

with γ−2 = 1 − v2/c2. It follows that the component of v along the field is constant,while the perpendicular component precesses on a circle with angular frequency ω =qB/(γm). The radius of the circle then follows from |v⊥| = ωr.

E ⊥ B with non-zero E and B. We can always choose axes such that E = (0, E, 0)and B = (0, 0, B) (where E and B may be negative). If we now boost along the x-axis,we have

E′ =

0γ(E − vB)

0

and B′ =

00

γ(B − vE/c2)

. (5.89)

If |E| > c|B|, we can set B′ = 0 by taking β = cB/E. We are then left with motion ina uniform electric field, which was discussed above. For |E| < c|B|, we can set E′ = 0by taking β = E/(cB) and we have helical motion in a constant magnetic field. Sincea particle at rest in a magnetic field remains stationary, there exists a solution for|E| < c|B| where in the original frame the particle moves at constant velocity in thedirection perpendicular to the crossed fields. This is simply the velocity v such thatelectric and magnetic forces cancel, i.e., qE = −qv×B requiring v = E×B/B2. The

Part-II Electrodynamics 38

fact that charged particles with a specific velocity can move uniformly through crossedelectric and magnetic fields can be used to select charged particles on their velocity.The special case |E| = c|B| is left as an exercise.

Exercise: For perpendicular magnetic fields with |E| = c|B| and E = (0, E, 0) and B =(0, 0, E/c), show that ct(τ) and x(τ) are, generally, cubic polynomials in τ , y(τ) is a quadraticpolynomial, and z(τ) is linear.

6 Energy and momentum of the electromagnetic

field

In this section we shall discuss the energy and momentum densities carried by theelectromagnetic field, and show how these are combined with the fluxes of energy andmomentum into the spacetime stress-energy tensor.

6.1 Energy and momentum conservation

We begin by recalling Poynting’s theorem, which expresses energy conservation inelectromagnetism. In any inertial frame, the rate of work done per unit volume by theelectromagnetic field on charged particles is J · E. Note that magnetic field does nowork since the magnetic force on any charge is perpendicular to its velocity. We noweliminate J in terms of the fields using Eq. (1.4) to find

J · E =1

µ0

E · (∇×B)− ε0E ·∂E

∂t. (6.1)

We can manipulate this further using

∇ · (E×B) = B · (∇× E)− E · (∇×B) . (6.2)

Replacing ∇× E with −∂B/∂t from Eq. (1.2), and substituting in Eq. (6.1), we find

J · E = − ∂

∂t

(ε0E

2

2+

B2

2µ0

)− 1

µ0

∇ · (E×B) . (6.3)

Introducing the Poynting vector,

N ≡ E×B/µ0 , (6.4)

Part-II Electrodynamics 39

and recalling the electromagnetic energy density

ε =ε0E

2

2+

B2

2µ0

, (6.5)

we have the (non-)conservation law

∂ε

∂t+ ∇ ·N = −J · E . (6.6)

The Poynting vector therefore gives the energy flux of the electromagnetic field, i.e.,the rate at which electromagnetic energy flows through a surface element dS is N · dS.Integrating Eq. (6.6) over a volume V gives

d

dt

∫V

ε d3x +

∫V

E · J d3x = −∫∂V

N · dS . (6.7)

In words, this states that the rate of change of electromagnetic energy in the volumeplus the rate at which the field does work on charges in the volume (i.e., the rate ofchange of mechanical energy in the volume) is equal to the rate at which electromagneticenergy flows into the volume.

We now perform a similar calculation for the force per unit volume on charged particles,ρE + J×B. Eliminating ρ with ∇ ·E = ρ/ε0 and J with Eq. (1.2) as above, we havethe force density

ρE + J×B = ε0E∇ · E +1

µ0

(∇×B)×B− ε0∂E

∂t×B

= ε0E∇ · E +1

µ0

(∇×B)×B− ε0∂

∂t(E×B) + ε0E×

∂B

∂t

= ε0E∇ · E +1

µ0

(∇×B)×B− ε0∂

∂t(E×B) + ε0(∇× E)× E . (6.8)

We can make the right-hand side look more symmetric between E and B by adding inB∇ ·B/µ0 = 0. Moreover, we can manipulate the curl terms with

(∇×B)×B = B ·∇B−∇(B2)/2 , (6.9)

and similarly for B→ E, to give

ρE + J×B = ε0

(E∇ · E + E ·∇E− 1

2∇E2

)+

1

µ0

(B∇ ·B + B ·∇B− 1

2∇B2

)− ε0

∂(E×B)

∂t. (6.10)

The first two terms on the right-hand side are total derivatives; they are (minus) thedivergence of the symmetric Maxwell stress tensor

σij ≡ −ε0(EiEj −

1

2E2δij

)− 1

µ0

(BiBj −

1

2B2δij

). (6.11)

Part-II Electrodynamics 40

Further introducing the electromagnetic momentum density

g ≡ ε0E×B = N/c2 , (6.12)

we have a (non-)conservation law for momentum:

∂gi∂t

+∂σij∂xj

= − (ρE + J×B)i . (6.13)

The Maxwell stress tensor σij encodes the flux of 3-momentum. In particular, the rateat which the ith component of 3-momentum flows through a surface element dS isgiven by σijdSj. Equation (6.13) can be written in integral form as

d

dt

∫V

gi d3x +

∫V

(ρE + J×B)i d3x = −

∫∂V

σij dSj , (6.14)

which states that the rate of change of electromagnetic momentum in some volumeV , plus the total force on charged particles (i.e., the rate of change of mechanicalmomentum) is equal to the rate at which electromagnetic momentum flows into Vthrough the boundary.

Note that the Maxwell stress tensor is symmetric. This is true more generally (e.g.it holds for the stress tensor in continuum and fluid mechanics) as a consequence ofangular momentum conservation; we shall briefly review this argument in Sec. 6.4.Note also that the momentum density and energy flux (Poynting vector) are related,g = N/c2. We might have anticipated this result by thinking quantum mechanicallyof the electromagnetic field being composed of massless photons (all moving at thespeed of light), with energy and momentum related by E = pc. A net energy flux willthen necessarily be accompanied by a momentum density with g = N/c2. In fact, therelation between momentum density and energy flux is universal and is necessary toensure symmetry of σij in every inertial frame (see also Sec. 6.4).

Given the integral form of the conservation law, Eq. (6.14), we note finally that forstatic situations, the total electromagnetic force on a charge and current distributioncan be calculated by integrating the Maxwell stress tensor over any surface that fullyencloses the sources.

6.2 Stress-energy tensor

We now develop a covariant formulation of the energy and momentum conservationlaws derived above, seeking to unite them in a single spacetime conservation law. Indoing so, we shall show that ε, N (and g) and σij are the components of a type-

(20

)Lorentz tensor, the stress-energy tensor.

Part-II Electrodynamics 41

To motivate the introduction of the stress-energy tensor, first consider a simpler me-chanical example. Massive particles, of rest mass m, are all at rest in some inertialframe S with (proper) number density n0. In that frame, we clearly have

ε = mc2n0 , N = 0 = g , σij = 0 . (6.15)

Now consider performing a standard Lorentz boost to the frame S ′ in which the particleshave 3-velocity v′ = −(v, 0, 0). In S ′, the number density is γn0 (by length contraction),the energy of each particle is γmc2 and the momentum of each is −γmv. It followsthat the energy density in S ′ is

ε′ = γmc2γn0 = γ2ε . (6.16)

There is now a non-zero energy flux in S ′, with

N ′x = −ε′v = −γ2vε , (6.17)

and a momentum density with non-zero component

g′x = (−γmv)γn0 = −γ2vε/c2 = N ′x/c2 . (6.18)

Finally, there is a flux of the x-component of 3-momentum along the x′-direction, sothe stress tensor has a single non-zero component

σ′xx = −vg′x = γ2(v/c)2ε . (6.19)

Note how ε, N and σij mix amongst themselves under Lorentz transformations. Letus tentatively write

T µν =

(ε cgj

Ni/c σij

). (6.20)

We shall now show that this object is indeed a tensor since under Lorentz transforma-tions the components transform correctly. In the rest frame S, we have

T µν =

ε 0 0 00 0 0 00 0 0 00 0 0 0

. (6.21)

If T µν were a tensor, its components in S ′ would necessarily be given by T ′µν =Λµ

αΛνβT

αβ, where Λµα for the standard boost is given in Eq. (4.7). Evaluating the

transformation gives

ΛµαΛν

βTαβ =

γ2ε −γ2βε 0 0−γ2βε γ2β2ε 0 0

0 0 0 00 0 0 0

=

ε′ cg′x 0 0

N ′x/c σ′xx 0 00 0 0 00 0 0 0

, (6.22)

Part-II Electrodynamics 42

where in the last equality we used the transformations derived above from physicalarguments. Since the components on the right are exactly what we should have in S ′

from the identification in Eq. (6.20), we conclude that T µν is indeed a tensor.

Returning to electromagnetism, we have

T µν =1

µ0

(12(E2/c2 + B2) (E×B)j/c(E×B)i/c

12(E/c)2δij − EiEj/c2 + 1

2B2δij −BiBj

). (6.23)

Since this is quadratic in the fields, we must be able to write it in terms of suitable con-tractions of the field-strength tensor F µν with itself. Some straightforward calculationwill convince you that

T µν =1

µ0

(F µαF ν

α −1

4ηµνFαβFαβ

). (6.24)

This is symmetric as it must be. Noting the invariant FαβFαβ = 2(B2 − E2/c2) [seeEq. (5.56)], we have, for example,

µ0T00 = F 0αF 0

α − 14η002(B2 − E2/c2)

= F 0iF 0i + 1

2(B2 − E2/c2)

= E2/c2 + 12B2 − 1

2E2/c2

= 12

(B2 + E2/c2

)= µ0ε . (6.25)

Exercise: Verify that Eq. (6.24) correctly reproduces the expressions for the energy flux andMaxwell stress tensor given in Eqs (6.4) and (6.11) respectively.

Note that T µν is trace-free since

T µµ =1

µ0

(F µαFµα −

1

4ηµνη

µνFαβFαβ

)= 0 , (6.26)

where we used ηµνηµν = 4.

6.3 Covariant conservation of the stress-energy tensor

Having introduced the stress-energy tensor T µν , we expect to be able to write theconservation laws, Eqs (6.6) and (6.13), in the covariant form

∂µTµν = some 4-vector linear in the field strength tensor and current 4-vector .

(6.27)

Part-II Electrodynamics 43

Taking the 0-component of ∂µTµν , we have

∂µTµ0 =

∂T 00

∂(ct)+∂T 0i

∂xi

=1

c

∂ε

∂t+

1

c∇ ·N

= −1

cJ · E . (6.28)

Repeating for the i-component,

∂µTµi =

∂T 0i

∂(ct)+∂T ji

∂xj

=∂gi∂t

+∂σji∂xj

= −(ρE + J×B)i , (6.29)

so that

∂µTµν = −

(J · E/c

ρE + J×B

). (6.30)

The right-hand side (which must be a 4-vector) is −F νµJ

µ, since

F νµJ

µ =

0 Ex/c Ey/c Ez/c

Ex/c 0 Bz −By

Ey/c −Bz 0 Bx

Ez/c By −Bx 0

ρcJxJyJz

=

J · E/c

ρEx + JyBz − JzBy

ρEy + JzBx − JxBz

ρEz + JxBy − JyBx

.

(6.31)It follows that we can combine energy and momentum conservation into the covariantconservation law for the stress-energy tensor:

∂µTµν = −F ν

µJµ . (6.32)

The 4-vector on the right of this equation has a simple interpretation in the case of asimple convective flow of charges, each with charge q and number density n0 in theirrest frame. Then, Jµ = qn0u

µ, where uµ is the 4-velocity, and F νµJ

µ = qn0Fνµu

µ isthe product of n0 and the 4-force on each charge. It follows that, in this case, F ν

µJµ

is the 4-force per unit rest-frame volume.

We can establish the conservation law, Eq. (6.32), directly from the covariant form ofMaxwell’s equations. Lowering the ν index, we have

µ0∂µTµν = ∂µ

(F µαFνα − 1

4δµνFαβF

αβ)

= (∂µFµα)Fνα + F µα∂µFνα − 1

2Fαβ∂νFαβ

= −µ0FναJα + 1

2

(F µα∂µFνα + F µα∂µFνα − Fαβ∂νFαβ

), (6.33)

Part-II Electrodynamics 44

where we used the first of Eq. (5.33) in the last equality. With some relabelling of thedummy indices, the sum of the final three terms can be shown to vanish by virtue ofthe second of Eq. (5.33),

F µα∂µFνα + F µα∂µFνα − Fαβ∂νFαβ = Fαβ∂αFνβ + F βα∂βFνα − Fαβ∂νFαβ

= −Fαβ (∂αFβν + ∂βFνα + ∂νFαβ)

= 0 , (6.34)

leaving ∂µTµν = −FνµJµ.

6.4 Symmetry of the stress-energy tensor

The stress-energy tensor is symmetric since g = N/c2 and σij = σji. We demonstratedabove that this was true for electromagnetism but it is true generally. In this sectionwe shall briefly discuss why.

Consider the energy and momentum conservation laws for an isolated system:

∂ε

∂t+ ∇ ·N = 0 (6.35)

∂gi∂t

+∂σij∂xj

= 0 . (6.36)

The usual 3D angular momentum of the system is

L ≡∫

x× g d3x , (6.37)

where the integral extends over all space. Taking the time derivative gives

dLidt

= εijk

∫xj∂gk∂t

d3x

= −εijk∫xj∂σkl∂xl

d3x

= −εijk∫

∂xl(xjσkl) d

3x + εijk

∫σkj d

3x

= −εijk∫xjσkl dSl + εijk

∫σkj d

3x . (6.38)

The first term on the right-hand side in the last line is the rate at which angularmomentum is entering the (isolated) system from infinity and and must vanish fora localised field configuration. We then see that angular momentum of the isolatedsystem is conserved provided that σij is symmetric so that the last term on the rightof Eq. (6.38) vanishes.

Part-II Electrodynamics 45

Now consider a Lorentz transformation. The stress-energy tensor transforms as

T ′µν = ΛµαΛν

βTαβ . (6.39)

Considering the 1-2 component, we have

σ′xy = Λ1αΛ2

βTαβ

= Λ10Λ2

2T02 + Λ1

1Λ22T

12 (standard boost)

= −γβT 02 + γT 12 . (6.40)

Repeating for the 2-1 component, we have

σ′yx = −γβT 20 + γT 21 . (6.41)

Symmetry of σij in all inertial frames therefore requires T i0 = T 0i and so g = N/c2.

A further consequence of the symmetry of the stress-energy tensor follows from con-sidering the dynamics of the “centre of energy” of the system,

∫εx d3x/

∫ε d3x. Since

the total energy is conserved, the dynamics of the centre of energy follow from thoseof∫εx d3x; we have

d

dt

∫εxi d

3x =

∫∂ε

∂txi d

3x

= −∫∂Nj

∂xjxi d

3x

= −∫

∂xj(xiNj) d

3x +

∫Ni d

3x

= −∫xiNj dSj +

∫Ni d

3x . (6.42)

The first term on the right-hand side vanishes for a localised field distribution. IfN = gc2, then

∫N d3x is constant by the conservation law for g [Eq. (6.36)], and we

see from Eq. (6.42) that the centre of energy moves uniformly. In particular, in thezero momentum frame where

∫g d3x = 0, the centre of energy remains at rest.

The above argument shows that two theorems from mechanics for isolated systems –that the centre of energy moves uniformly and that angular momentum is conserved –are related in relativity. Indeed, one can introduce the object

Sµαβ ≡ xαT µβ − xβT µα = −Sµβα , (6.43)

which transforms as a tensor under homogeneous Lorentz transformations. Then, foran isolated system, conservation of the stress-energy tensor implies that

∂µSµαβ = δαµT

µβ + xα ∂µTµβ︸ ︷︷ ︸

=0

−δβµT µα − xβ ∂µT µα︸ ︷︷ ︸=0

= Tαβ − T βα = 0 . (6.44)

Integrating this conservation law over space, the 0i component returns the centre-of-energy theorem and the ij component returns 3D angular momentum conservation.

Part-II Electrodynamics 46

6.5 Stress-energy tensor of a plane electromagnetic wave

As an example of computing the stress-energy tensor, we consider electromagneticplane waves. Maxwell’s equations admit wave-like solutions [see the wave equation inEq. (1.8)] in free space:

E = E0f(k · x− ωt)B = B0f(k · x− ωt) with ω = c|k| . (6.45)

If the function f is sinusoidal, with period 2π, these are harmonic waves propagating atspeed c with wavelength 2π/k (with k = |k|) and frequency ω/(2π). At any time, thephase of the wave, k ·x−ωt, is constant on 3D planes perpendicular to the wavevectork. Since ∇ · E = 0 in free space, and ∇ ·B = 0, we must have

k · E0 = 0 and k ·B0 = 0 . (6.46)

Furthermore, from ∇× E = −∂B/∂t we have

k× E0 = ωB0 ⇒ k× E0 = cB0 , (6.47)

where k is a unit vector in the direction of k, and so c|B0| = |E0|. It follows that E0,B0 and k form a right-handed triad of vectors.

We can write the fields in covariant form as follows. Phase differences must be Lorentzinvariant so we can introduce the 4-wavector

kµ = (ω/c,k) . (6.48)

This is a null 4-vector, ηµνkµkν = 0, since ω = ck. The phase can then be written

covariantly as k ·x−ωt = kµxµ. Specialising to k along the x-direction of some inertial

frame, and E0 along y (so that B0 is along z), the components of the field-strengthtensor in that frame are

F µν = B0f(kµxµ)

0 0 1 00 0 1 0−1 −1 0 0

0 0 0 0

, (6.49)

where B0 = |B0|. Both invariants F µνFµν and ∗F µνFµν vanish since E · B = 0 and|E| = c|B|.

The stress-energy tensor, Eq. (6.23), reduces to T µν = F µαF να/µ0 since F µνFµν = 0.

Part-II Electrodynamics 47

It follows that

T µν =B2

0f2(kµx

µ)

µ0

0 0 1 00 0 1 0−1 −1 0 0

0 0 0 0

0 0 1 00 0 −1 01 1 0 00 0 0 0

=B2

0f2(kµx

µ)

µ0

1 1 0 01 1 0 00 0 0 00 0 0 0

. (6.50)

It follows that the energy density

ε = B20f

2(kµxµ)/µ0 , (6.51)

which receives equal contributions from the electric and magnetic fields. The Poyntingvector is along the x-direction, with

Nx = cB20f

2(kµxµ)/µ0 . (6.52)

Finally, the momentum flux has only one non-zero component:

σxx = B20f

2(kµxµ)/µ0 . (6.53)

Note thatσxx = Nx/c = cgx = ε . (6.54)

These results are consistent with the photon picture in which photons with energyE = pc (where p is the magnitude of their momentum) propagate at speed c alongthe x-direction. The energy density is then c times the momentum density, and themomentum flux is c times the momentum density12.

Exercise: Consider reflection of a plane electromagnetic wave, propagating along the x-direction, from a perfect conductor whose plane surface is at x = 0. The conductivity is sohigh that the fields decay very rapidly inside the conductor and we can therefore approximatethe fields as being zero inside. The E field is tangent to the surface and, since the tangentialcomponent is continuous, the E field must be zero at the surface of the conductor. The Bfield is also tangential but is now discontinuous at the surface with the discontinuity drivenby a surface current flowing in a thin layer on the surface.

The total electric and magnetic fields (sum of incident and reflected waves) outside theconductor (x < 0) are therefore standing waves of the form

Etot = B0c [f(kx− ωt)− f(−kx− ωt)] yBtot = B0 [f(kx− ωt) + f(−kx− ωt)] z . (6.55)

12The momentum carried through an area dS in the y-z plane in time dt is the total momentum ofthose photons within a volume cdtdS, i.e., σxxdSdt = gxcdtdS.

Part-II Electrodynamics 48

Show that the energy density is

ε =B2

0

µ0

[f2(kx− ωt) + f2(−kx− ωt)

], (6.56)

which is the sum of the energy density in the incident and reflected waves. For harmonicwaves, the time-averaged energy density is the same at all x. Show also that the Poyntingvector has a single non-zero component

Nx =B2

0c

µ0

[f2(kx− ωt)− f2(−kx− ωt)

], (6.57)

which, again, is the sum of the incident and reflected energy fluxes. The Poynting vectorvanishes at the surface of the conductor at all times, so no energy enters the conductor. Thisis still consistent with there being a surface current since the rate of energy dissipation bythis current scales as the inverse of the conductivity and so vanishes in the limit of infiniteconductivity assumed here. At other locations, the time-averaged Poynting vector vanishesfor harmonic waves. Finally, show that the Maxwell stress-tensor is diagonal with

σxx =B2

0

µ0

[f2(kx− ωt) + f2(−kx− ωt)

]σyy =

2B20

µ0f(kx− ωt)f(−kx− ωt) , (6.58)

and σzz = −σyy. The σxx = ε is the sum of the incident and reflected momentum fluxes.However, the σyy and σzz components arise purely from interference of the incident andreflected waves. Since σyy and σzz only vary in the x-direction, the non-zero values are stillconsistent with the momentum density of the field and the conservation law (6.36).

There is a force exerted on the conductor (“radiation pressure”) in the x-direction given perarea by

σxx|x=0 =2B2

0

µ0f2(−ωt) . (6.59)

This follows from the integral form of the momentum conservation law, Eq. (6.14), notingthe momentum density is zero inside the conductor. The incident momentum density atx = 0 is B2

0f2(−ωt)/(µ0c), and so the radiation pressure is 2c times the incident momentum

density. What is the physical origin of this force? It is the Lorentz force of the B field atthe surface on the surface current distribution. To see that this is consistent, note that thesurface current Js (defined such that the current through a given line element dl lying inthe surface and perpendicular to Js is |Js|dl) needed to maintain the discontinuity in B isalong the x-direction with µ0Js = By|x=0. The force per area on this current is JsBy|x=0/2along the x-direction. The factor of 1/2 here accounts for the average value of the magneticfield across the depth of the current distribution. Using By|x=0 = 2B0(−ωt), we see that theLorentz force per area is exactly σxx at the surface of the conductor, as required.

Part-II Electrodynamics 49

6.5.1 Radiation pressure of a photon “gas”

We can obtain an isotropic distribution of radiation by forming a random superpositionof incoherent plane waves propagating in all directions with equal intensity. (Incoherentwaves have no lasting phase relation amongst themselves, e.g. the radiation emitted bytwo separated thermal sources.) Such radiation is the classical analogue of a photongas. Isotropy requires the (time-averaged) Poynting vector to vanish and that theMaxwell stress tensor σij = pδij, where p is the pressure. The averaged stress-energytensor is then diagonal with

〈T µν〉 = diag(ε, p, p, p) . (6.60)

The electromagnetic stress energy tensor is necessarily trace-free;

T µµ = 0 ⇒ p = ε/3 . (6.61)

We see that for a photon gas the pressure is one-third of the energy density irrespectiveof the way that the energy is distributed over frequency (i.e., the radiation does nothave to have a blackbody spectrum).

7 Radiation of electromagnetic waves

In this section we consider the production of electromagnetic radiation from time-dependent charge distributions. We first consider radiation from macroscopic chargeand current densities involving non-relativistic motions, as, for example, in radiationfrom a radio antenna. We then move on to radiation from isolated charges in arbi-trary motion. As we shall see, it is the acceleration of the charges that determinesthe radiation field. The theory we develop underlies extreme phenomena such as thesynchrotron radiation emitted from charges circling relativistically in magnetic fields.

7.1 Retarded potentials

We begin by deriving the 4-potential generated causally by an arbitrary 4-current Jµ.Our starting point is Eq. (5.14) for the 4-vector potential Aµ:

2Aµ − ∂µ(∂νAν) = −µ0J

µ . (7.1)

It is very convenient to make a gauge choice (the Lorenz gauge) such that ∂µAµ =

0. This is clearly a Lorentz-invariant statement, which has the benefit of retainingcovariance of the subsequent gauge-fixed equations under Lorentz transformations. Itis always possible to make this gauge choice: starting in some general gauge we make

Part-II Electrodynamics 50

a transformation Aµ → Aµ + ∂µχ with χ chosen to satisfy 2χ = −∂µAµ. As we showbelow, we can always solve this equation to determine a suitable χ. In the Lorenzgauge, Eq. (7.1) simplifies to

2Aµ = −µ0Jµ or

(∇2 − 1

c2

∂2

∂t2

)Aµ = −µ0J

µ , (7.2)

which is a sourced wave equation.

We solve Eq. (7.2) by taking a Fourier transform in time, writing

Aµ(t,x) =1√2π

∫ ∞−∞

Aµ(ω,x)eiωt dω , (7.3)

and similarly for Jµ. The sourced wave equation then becomes(∇2 +

ω2

c2

)Aµ = −µ0J

µ . (7.4)

This is a sourced Helmholtz equation with k2 = ω2/c2. As for Poisson’s equation inelectrostatics, we can solve this with a Green’s function G(x; x′) such that(

∇2 + k2)G(x; x′) = δ(3)(x− x′) , (7.5)

so that

Aµ(ω,x) = −µ0

∫G(x; x′)Jµ(ω,x′) d3x′ . (7.6)

To describe the fields produced by localised sources, we require the Green’s function totend to zero as |x| → 0. Homogeneity and isotropy require that G(x; x′) is a functionof |x− x′| alone. Taking x′ = 0 and r = |x|, away from r = 0 we have

1

r2

∂r

(r2∂G

∂r

)+ k2G = 0 . (7.7)

The solution that tends to zero as r →∞ is G = −Ae−ikr/r. (The reason for choosinge−ikr rather than eikr is to do with causality and will be explained shortly.) We fix thenormalisation A by integrating Eq. (7.5) over a sphere of radius ε centred on x′, andtaking the limit as ε→ 0. The integral of the k2G term goes like ε3/ε and so vanishesin the limit, leaving

limr→0

4πr2∂G/∂r = 1

⇒ limr→0

4πr2A

(e−ikr

r2+ ik

e−ikr

r

)= 1 , (7.8)

so that A = 1/(4π) and

G(x; x′) = − 1

e−iω|x−x′|/c

|x− x′|. (7.9)

Part-II Electrodynamics 51

Inserting this Green’s function into Eq. (7.6) and taking the inverse Fourier transform,gives

Aµ(t,x) =µ0

∫ ∞−∞

dω√2π

∫d3x′

eiω(t−|x−x′|/c)

|x− x′|Jµ(ω,x′) . (7.10)

The integral over ω is the inverse Fourier transform of Jµ(ω,x′) and returns Jµ(tret,x′)

where the retarded time tret ≡ t−|x−x′|/c. This gives our final result for the 4-potentialfrom an isolated time-dependent charge distribution:

Aµ(t,x) =µ0

∫Jµ(tret,x

′)

|x− x′|d3x′ . (7.11)

This has a simple physical interpretation. Since the event at tret and x′ is on thepast lightcone of the event at t and x, the field at any event xµ is generated by thesuperposition of the fields due to currents at earlier times on the past lightcone of xµ.This reflects causality in that electromagnetic disturbances propagate at the speed oflight and that only earlier events can influence the field at any spacetime point13. Thepotential Aµ is called the retarded potential. We now see why we had to choose thee−ikr/r form for the Green’s function – had we chosen eikr/r we would have obtainedthe advanced potential for which the currents at t+ |x−x′|/c and x′ influence the fieldat t and x, i.e.,, acausal propagation.

It is not obvious that Eq. (7.11) is covariant under Lorentz transformations, since weare integrating over space alone. To see that it is covariant, we rewrite it as a spacetimeintegral as follows:

Aµ(x) =µ0

∫Jµ(y)

δ(y0 − x0 + |x− y|)|x− y|

d4y . (7.12)

This looks more covariant since it involves the Lorentz-invariant measure14 d4y. How-ever, we still have the issue of the delta-function. To see that this is Lorentz invariant,consider

δ (ηµν(xµ − yµ)(xν − yν)) Θ(x0 − y0) , (7.13)

where Θ is the Heaviside (unit-step) function, which equals unity if its argument isgreater than zero and vanishes otherwise. The product of the two terms in Eq. (7.13)is therefore Lorentz invariant since the delta-function is non-zero only for null-separatedxµ and yµ, and the temporal ordering of such events is Lorentz invariant. We now showthat Eq. (7.13) is twice the term δ(y0− x0 + |x− y|)/|x− y| in question in Eq. (7.12).This follows from a general result for the delta-function with a function f(x) as itsargument:

δ (f(x)) =∑i

δ(x− x∗,i)|f ′(x∗,i)|

, (7.14)

13Recall that the temporal ordering of timelike and null-separated events is Lorentz invariant14Under a Lorentz transformation, x′µ = Λµνxν , the Jacobian |∂x′µ/∂xν | is unity since det(Λµν)2 =

1.

Part-II Electrodynamics 52

where the sum is over the roots x∗,i of f(x), i.e.,, f(x∗,i) = 0, and f ′(x) is the derivativeof f(x). Writing out the components of ηµν(x

µ − yµ)(xν − yν), we have

|x− y|2 − (x0 − y0)2 = (|x− y|+ x0 − y0)(|x− y| − x0 + y0) , (7.15)

so that

δ (ηµν(xµ − yµ)(xν − yν)) =

δ(y0 − x0 + |x− y|)2|y0 − x0|

+δ(y0 − x0 − |x− y|)

2|y0 − x0|

=δ(y0 − x0 + |x− y|)

2|x− y|+δ(y0 − x0 − |x− y|)

2|x− y|. (7.16)

The Heaviside function in Eq. (7.13) selects only the first term on the right, showingthat

δ (ηµν(xµ − yµ)(xν − yν)) Θ(x0 − y0) =

δ(y0 − x0 + |x− y|)2|x− y|

, (7.17)

as required. Equation (7.12) therefore involves a Lorentz-invariant integration of the4-vector current over the past lightcone of the event xµ, and so is properly covariant.

Exercise: Noting that the gauge condition ∂µAµ = 0 takes the form iωA0/c+∇ · A = 0 afterFourier transforming in time, verify that the right-hand side of Eq. (7.6) satisfies the gaugecondition by virtue of charge conservation.

7.2 Dipole radiation

As a first application of the retarded potential derived above, we consider radiationfrom some macroscopic charge distribution in which all motions are non-relativistic (insome inertial frame). We shall be interested in the fields at large distances comparedto the spatial extent of the charge distribution.

Specifically, consider the case where all charges are localised within some region of sizea, and place the origin of spatial coordinates within this volume. To calculate the4-potential at x, where r = |x| a, we Taylor expand |x− x′| in Eq. (7.11) as

|x− x′| = r − x · x′

r+ · · ·

= r − x · x′ + · · · , (7.18)

where x is a unit vector in the direction of x. Note that the second term on the rightis smaller than the first by O(a/r). Substituting into Eq. (7.11), we have

Aµ(t,x) =µ0

4πr

∫Jµ(tret,x

′)

(1 +

x · x′

r+ · · ·

)d3x′ , (7.19)

Part-II Electrodynamics 53

where tret = t−r/c+x·x′/c+· · · . At r a, we can safely neglect the O(a/r) correctionsin the round brackets in Eq. (7.19). However, the treatment of the dependence of tret onx′ depends on the rate at which the source varies. For an oscillating charge distribution,with angular frequency ω, if a/c 1/ω then the variation in tret over the source isshort compared to the period and can be handled perturbatively via a Taylor expansionof Jµ in time. Note that the typical velocity of the charges is aω and so we requirethis to be much less than c for an expansion to be valid. Equivalently, since the fieldsradiated by a source oscillating at ω oscillate at the same frequency, the associatedwavelength λ must be large compared to a.

For sufficiently slowly-varying sources, we can neglect the variation of tret across thesource. The leading-order contribution to the magnetic vector potential is then

A(t,x) =µ0

4πr

∫J(t− r/c,x′) d3x′ . (7.20)

We can express the right-hand side in terms of the time derivative of the electric dipolemoment of the charge distribution as follows. Charge conservation, ∂ρ/∂t+ ∇ · J = 0,implies that

P =d

dt

∫ρ(x)x d3x

=

∫∂ρ

∂tx d3x

= −∫

(∇ · J)x d3x , (7.21)

so that

Pi = −∫

∂Jj∂xj

xi d3x

=

∫Jj∂xi∂xj

d3x

=

∫Ji d

3x , (7.22)

where the second equality follows from the generalised divergence theorem and notingthat the surface term vanishes for a localised current distribution. We therefore have

A(t,x) ≈ µ0

4πrP(t− r/c) . (7.23)

This is called the electric dipole approximation. For a source oscillating at frequencyω, we have A ∼ e−iω(t−r/c)/r, i.e.,, outgoing spherical waves.

Part-II Electrodynamics 54

We can calculate the magnetic field from B = ∇ × A. We have to be careful toremember that t− r/c depends on x, so that for some function f(t− r/c),

∂f(t− r/c)∂xi

= −1

cf ′(t− r/c) ∂r

∂xi

= −xircf ′(t− r/c) . (7.24)

It follows that

B(t,x) = − µ0

4πr2(∇r)× P(t− r/c)− µ0

4πrc(∇r)× P(t− r/c)

= − µ0

4πr2x× P(t− r/c)− µ0

4πrcx× P(t− r/c) . (7.25)

Note that the first contribution to the field varies as 1/r2 while the second goes like1/r. For a source oscillating at ω, the relative size of the 1/r part compared to the 1/r2

part is O(r/λ). We define the far-field (sometimes known as the radiation zone or thewave zone) as the asymptotic region r λ. In the far-field, the 1/r term dominatesthe magnetic field, and describes outgoing spherical waves of radiation:

B(t,x) = − µ0

4πrcx× P(t− r/c) (r λ) . (7.26)

Note that the B field is perpendicular to x and to P, and that it is P that determinesthe radiation field.

For a set of charges qi located at xi, we have P =∑

i qixi and so

P =∑i

qixi . (7.27)

This shows that it is acceleration of the charges that generates the dipole radiationfield.

We have to work a little harder for the electric potential since, with the same orderof approximation as we used above for A, the electric potential φ involves

∫ρ(t −

r/c,x′) d3x′, i.e., the total charge Q at time t − r/c. Since Q is independent of timefor an isolated charge distribution, the potential is also time-independent to this levelof approximation (it is just the usual asymptotic Coulomb field). We therefore haveto work at higher order in a/λ to find the part of φ that describes radiation15. It isstraightforward to do so, but here we shall follow the simpler approach of using theLorenz gauge condition to find φ.

If we use Eq. (7.23) in ∇ ·A + c−2∂φ/∂t = 0, we find

∂φ

∂t=µ0c

2

4πr2x · P(t− r/c) +

µ0

4πrcx · P(t− r/c) , (7.28)

15The 3-current J ∼ ρv is higher-order in v/c ∼ a/λ than ρ, which is why we can apparently workto lower order in a/λ when dealing with the vector potential.

Part-II Electrodynamics 55

which integrates to give

φ(t,x) =Q

4πε0r+

1

4πε0r2x ·P(t− r/c) +

1

4πε0rcx · P(t− r/c) . (7.29)

Here, we have included the Coulomb term from the constant charge Q, which appearsas an integration constant. The second term is the usual dipole potential that wouldbe generated by a static charge distribution, but here it depends on the electric dipolemoment at the earlier time t − r/c. Typically, this term is smaller than the Coulombterm by O(a/r). It is the third term that describes the radiation component of thepotential. It varies as 1/r and is typically O(r/λ) times the second term, and sodominates in the far-field.

We can form the electric field from E = −∂A/∂t −∇φ. Equations (7.23) and (7.29)give terms varying as 1/r, 1/r2 and 1/r3. The part going as 1/r dominates in thefar-field:

E(t,x) = − µ0

4πr

[P(t− r/c)− xx · P(t− r/c)

](r λ)

=µ0

4πrx× [x× P(t− r/c)] . (7.30)

It originates from the time derivative of P in A (Eq. 7.23) and the spatial derivative ofP(t− r/c) in φ (Eq. 7.29). This expression for the electric field in the far-field can alsobe obtained directly from the magnetic field with ∇ × B = c−2∂E/∂t. The electricfield is perpendicular to x and lies in the plane formed from x and P.

We see that in the far-field,

E = −cx×B (r λ) , (7.31)

so that |E| = c|B|, and E · B = 0. This behaviour is just like for a plane wavewith wavevector along x. This makes sense since in the far-field, the radius r is solarge compared to the wavelength λ that on the scale of λ the spherical wavefronts areeffectively planar.

7.2.1 Power radiated

Energy is carried off to spatial infinity from the source in the form of electromagneticradiation. We can determine the rate at which energy is radiated by integrating thePoynting vector over a spherical surface at very large radius. Working in the far-field,

Part-II Electrodynamics 56

we have

N =1

µ0

E×B

=c

µ0

B× (x×B)

=c

µ0

|B|2x

=µ0

(4πr)2c|x× P|2x . (7.32)

This is radial, as expected. The magnitude of the Poynting vector varies as 1/r2, sothe integral over a spherical surface of radius r is independent of r. If P is along thez-axis, |x × P| = sin θ|P|, where θ is the angle that x makes with the z-axis. In thiscase,

N =µ0

(4πr)2c|P|2 sin2 θ x . (7.33)

The total radiated power then follows from∫N · dS =

µ0

(4π)2c|P|2

∫ 1

−1

sin2 θ dcos θ

∫ 2π

0

=µ0

6πc|P|2 . (7.34)

The power radiated per solid angle is |N|r2 and varies as sin2 θ. The radiation ismaximal in the plane perpendicular to P and there is no radiation along the directionof P. The radiation is linearly polarized, with the electric field in the plane containingx and P. The magnetic field is perpendicular to E.

7.3 Radiation from an arbitrarily moving point charge

We now consider radiation from a single point charge moving arbitrarily, i.e.,, we relaxthe assumption of non-relativistic motion. Specifically, we consider a point charge qfollowing a wordline yµ(τ) in spacetime, where τ is proper time.

The 4-current is non-zero except on the worldline, and can be written covariantly as

Jµ(x) = qc

∫δ(4) (x− y(τ)) yµ(τ) dτ . (7.35)

The 4D delta-function here is Lorentz-invariant (since the 4D measure d4x is), and itsappearance ensures that the integration over τ only selects that proper time (if any)for which xµ = yµ(τ). The 4D delta-function can be decomposed in any inertial frameas

δ(4) (x− y(τ)) = δ(ct− y0(τ)

)δ(3) (x− y(τ)) . (7.36)

Part-II Electrodynamics 57

We can perform the integration over proper time in Eq. (7.35) using the result inEq. (7.14) for the delta-function with a function as its argument, to give

Jµ(t,x) = qcδ(3) (x− y(t)) yµ/y0 , (7.37)

where y(t) is the 3D location of the charge at coordinate time t. The µ = 0 componentgives ρ(t,x) = qδ(3) (x− y(τ)), and the µ = i component gives the current densityJ(t,x) = qv(t)δ(3) (x− y(τ)), where v(t) is the 3D velocity of the charge at time t.These are clearly the correct (non-covariant) expressions for the charge and currentdensity due to a point charge, from which we establish the validity of the covariantresult in Eq. (7.35).

7.3.1 Lienard–Weichert potentials and fields

Inserting Eq. (7.37) for the 4-current into Eq. (7.12) for the retarded potential, we find

Aµ(x) =µ0qc

∫dτ

∫d4z δ(4) (z − y(τ)) yµ(τ)

δ(x0 − z0 − |x− z|)|x− z|

=µ0qc

∫yµ(τ)

δ (x0 − y0(τ)− |x− y(τ)|)|x− y(τ)|

dτ , (7.38)

where we performed the integration d4z using the defining property of the delta-function. To make further progress, it is convenient to express the remaining 1Ddelta-function in covariant form using Eq. (7.17):

δ (x0 − y0(τ)− |x− y(τ)|)|x− y(τ)|

= 2Θ(R0(τ)

)δ (ηµνR

µ(τ)Rν(τ)) , (7.39)

where we defined Rµ(τ) = xµ − yµ(τ). The combination of the delta-function andHeaviside function selects the unique value of proper time when the charge’s worldlineintersects the past lightcone of xµ. Denoting the proper time there by τ∗ (which is afunction of xµ), and using

d

dτ[ηµνR

µ(τ)Rν(τ)] = −2ηµνRµ(τ)yν(τ) , (7.40)

we haveδ (x0 − y0(τ)− |x− y(τ)|)

|x− y(τ)|=

δ(τ − τ∗)|Rµ(τ∗)yµ(τ∗)|

. (7.41)

Finally, using this in Eq. (7.38), and integrating over τ , we obtain the Leinard–Wiechertpotential in covariant form:

Aµ(x) =µ0qc

yµ(τ∗)

|Rν(τ∗)yν(τ∗)|. (7.42)

Part-II Electrodynamics 58

Equation (7.42) is a beautifully compact expression for the 4-potential due to an arbi-trarily moving charge. However, to gain intuition for this result, it is useful to unpack itinto time and space components in some inertial frame. Then, yµ(τ∗) = γ(τ∗) (c,v(τ∗)),where v(τ∗) is the 3-velocity at proper time τ∗ and γ(τ∗) is the associated Lorentz factor.Furthermore, Rµ(τ∗) = (c∆t(τ∗),R(τ∗)), where ∆t(τ∗) is the time difference betweenthe current event and the time that the charge intersected the past lightcone, andR(τ∗) ≡ x− y(τ∗). Note that c∆t(τ∗) = |R(τ∗)| = R(τ∗). We now have

|Rν(τ∗)yν(τ∗)| = cγ(τ∗)R(τ∗)(

1− R(τ∗) · v(τ∗)/c), (7.43)

and the time and space components of Eq. (7.42) give

φ(t,x) =q

4πε0R(τ∗)(

1− R(τ∗) · v(τ∗)/c) (7.44)

A(t,x) =µ0qv(τ∗)

4πR(τ∗)(

1− R(τ∗) · v(τ∗)/c) . (7.45)

Due to the presence of the 1−R(τ∗) ·v(τ∗)/c factors in the denominators of Eqs. (7.44)and (7.45), the potentials (and hence fields) are concentrated in the direction of motion,R(τ∗) ∝ v(τ∗), for relativistic motion. This is a consequence of extreme aberration fora rapidly moving particle.

If we adopt an inertial frame such that the particle is at rest in that frame at τ∗, theLienard–Wiechert potentials simplify to A(t,x) = 0 and

φ(t,x) =q

4πε0Rrest(τ∗)(rest-frame) , (7.46)

where Rrest(τ∗) is the distance between the observation event and yµ(τ∗) in this rest-frame. These potentials are exactly what one would expect for a stationary pointcharge. If instead we adopt a frame in which the charge is not at rest at τ∗, but ismoving non-relativistically, the vector potential no longer vanishes but is approximately

A(t,x) ≈ µ0qv(τ∗)

4πR(τ∗)(non-relativistic) . (7.47)

This is consistent with the non-relativistic (dipole) approximation in Eq. (7.23) notingthat dP/dt = qv for a point charge.

To calculate the electric and magnetic fields of an arbitrarily moving charge, we returnto the covariant form of the Lienard–Weichert potentials (Eq. 7.42) and form Fµν =∂µAν − ∂νAµ. We have to remember that τ∗ is a function of xµ, so that, for example,

Part-II Electrodynamics 59

∂µyν(τ∗) = yν(τ∗)∂µτ∗. We can evaluate ∂µτ∗ by differentiating the null condition

ηαβRα(τ∗)R

β(τ∗) = 0 to find

0 = ηαβ∂µRα(τ∗)R

β(τ∗)

= ηαβ∂µ (xα − yα(τ∗))Rβ(τ∗)

= ηαβ(δαµ − yα(τ∗)∂µτ∗

)Rβ(τ∗) . (7.48)

Solving for ∂µτ∗ gives∂τ∗∂xµ

=Rµ(τ∗)

Rν(τ∗)yν(τ∗). (7.49)

The other term we require in differentiating Eq. (7.42) is

∂µ [Rν(τ∗)yν(τ∗)] =(δνµ − yν(τ∗)∂µτ∗

)yν(τ∗) +Rν(τ∗)yν(τ∗)∂µτ∗

= ∂µτ∗ [Rν(τ∗)yν(τ∗)− yν(τ∗)yν(τ∗)] + yµ(τ∗)

= ∂µτ∗[c2 +Rν(τ∗)yν(τ∗)

]+ yµ(τ∗) . (7.50)

Putting these pieces together, we find (exercise!), on antisymmetrising,

Fµν(x) = − µ0qc

4π(cRrest)

(Rµyν −Rν yµ

(cRrest)+

(Rµyν −Rν yµ)

(cRrest)2(c2 +Rαyα)

), (7.51)

where all quantities on the right are evaluated at τ∗. Here, we have used

−Rµ(τ∗)yµ(τ∗) = cRrest , (7.52)

to express Fµν in terms of Rrest, the radial distance between xµ and yµ(τ∗) as measuredin the frame in which the charge is at rest at τ∗.

Inspection of Eq. (7.52) reveals terms going like 1/R, that depend on the 4-acceleration,and a term going like 1/R2 that depends only on the 4-velocity of the charge. The 1/Rterms describe radiation, while the 1/R2 term turns out to be the field that would begenerated from a charge moving uniformly with the same velocity at τ∗ (see below).

Exercise: Extract the electric and magnetic fields from the field-strength tensor in Eq. (7.52)in a general inertial frame, recalling that the 4-acceleration has components

yµ = γ2

(γ2

cv · a, γ

2

c2a · vv + a

). (7.53)

You should find (after a lengthy, but straightforward, calculation) that

E(t,x) =q

4πε0(1− v · R/c)3

(R− v/cγ2R2

+R× [(R− v/c)× a]

Rc2

), (7.54)

Part-II Electrodynamics 60

where all quantities on the right are evaluated at τ∗. For the magnetic field, you should find

B(t,x) =1cR(τ∗)×E(t,x) . (7.55)

The exercise above derives the electric and magnetic fields from Fµν . Note the followingpoints.

• The magnetic field is always perpendicular to the electric field and to R(τ∗).

• The radiation (1/R) part of the fields is sourced by the 3-acceleration. Theelectric field has an angular dependence given by R × [(R − v/c) × a], and istherefore perpendicular to R(τ∗), like the magnetic field. The angular dependenceis more complicated than for dipole radiation [Eq. (7.30)], due to the presenceof the v/c correction to R. However, in the non-relativistic limit (|v| c), theradiation fields reduce to those derived in the dipole approximation in Sec. 7.2(noting that d2P/dt2 = qa for a point charge). The magnitudes of the fieldssatisfy the usual relation for radiation, |E| = c|B|, even for non-relativistic motionof the charge.

• The 1/R2 contribution to the fields are the same as for a charge moving uniformlywith the same velocity at τ∗. To see this, recall the following expression for theelectric field from a uniformly moving charge (see the example in Sec. 5.4),

E =qγr

4πε0 [γ2(r · v)2/c2 + r2]3/2, (7.56)

where r points from the current position of the charge to the observation point.Let us express this in terms of R(τ∗) in the case that the charge is movinguniformly. In time R(τ∗)/c, a uniformly moving charge displaces by vR(τ∗)/c,and so

r = R(τ∗)− vR(τ∗)/c = R(τ∗)[R(τ∗)− v/c] . (7.57)

Using this relation, it is straightforward to show that

γ2(r · v)2/c2 + r2 = γ2R2(τ∗)(

1− R(τ∗) · v/c)2

, (7.58)

and so Eq. (7.57) reduces to the first term on the right of Eq. (7.54).

Part-II Electrodynamics 61

7.3.2 Power radiated

We now calculate the rate at which energy is radiated by an arbitrarily moving pointcharge. We shall first calculate this in the instantaneous rest frame of the charge andthen use Lorentz transformations to deduce the rate in a general inertial frame.

In the frame in which the particle is at rest at proper time τ∗, the radiation part ofelectric field on the forward lightcone through yµ(τ∗) is, from Eq. (7.54),

E(t,x) =q

4πε0Rrestc2R× (R× arest) (Instantaneous rest frame) , (7.59)

where arest is the 3-acceleration in the instantaneous rest frame. The magnetic field isB = R× E/c, so the Poynting vector is

N =1

µ0

E×B

=1

cµ0

E× (R× E)

=1

cµ0

|E|2R , (7.60)

which is radial. We can get the radiated power by integrating over a spherical surfaceof radius Rrest at time t (i.e., a section through the forward lightcone corresponding toa constant time surface in the instantaneous rest frame). Since the charge is at rest atτ∗, the radiation emitted in proper time dτ around τ∗ crosses this spherical surface in(coordinate) time16 dt = dτ . It follows that the energy radiated in the instantaneousrest frame in proper time dτ is

dErest = dτ

∫N · dS

=R2

restdτ

cµ0

∫|E|2 dΩ , (7.61)

16 This would not be true if the charge were moving at τ∗. Generally, the time taken for the radiationemitted in dτ∗ to cross a spherical surface of radius R(τ∗) can be got from Eq. (7.49). Rearranginggives

dτ∗ =Rµ(τ∗)dxµ

Rν(τ∗)yν(τ∗).

Taking dxµ = (cdt,0) gives

dτ∗ =dt

γ(τ∗)[1− R(τ∗) · v(τ∗)/c

] .This is just the usual Doppler effect. In the instantaneous rest frame, it reduces to dτ∗ = dt, asadvertised.

Part-II Electrodynamics 62

where dΩ is the element of solid angle (i.e., dΩ = d2R). Taking the rest-frame acceler-ation to be along the z-axis, we have

|R× (R× arest)| = sin θ|arest| , (7.62)

so that on integrating over solid angle

dErest

dτ=µ0q

2|arest|2

6πc. (7.63)

This simple result is exactly what we would have got by applying the dipole approx-imation, Eq. (7.34), in the instantaneous rest frame, with d2P/dt2 → qarest. This isnot unexpected – the dipole approximation holds in situations where all velocities arenon-relativistic and this is exactly true in the instantaneous rest frame. Indeed, thedipole approximation provides a useful short-cut to deriving Eq. (7.63).

Now consider the 4-momentum radiated in time dτ . Since equal power is emitted alongR and −R in the instantaneous rest frame, the total 3-momentum of the radiation iszero. We therefore have

dP µ = dτ

(1

c

dErest

dτ,0

)(7.64)

in the rest frame. Lorentz transforming to an inertial frame in which the charge hasvelocity v(τ∗) at τ∗, we have

dE ′

c= dP ′0 = γ(τ∗)dP

0 =γ(τ∗)dτ

c

dErest

dτ. (7.65)

The time elapsed between the events yµ(τ∗) and yµ(τ∗ + dτ) in the general frame isdt′ = γ(τ∗)dτ by time dilation, hence

dE ′

dt′=dErest

dτ. (7.66)

We see that the rate at which energy is radiated by the accelerated charge is Lorentzinvariant and so, generally, the instantaneous power radiated is

dE

dt=µ0q

2aµaµ

6πc=µ0q

2γ4

6πc

[a2 + γ2(a · v/c)2

], (7.67)

where aµ is the acceleration 4-vector (so that aµaµ = |arest|2). This result is knownas the relativistic Larmor formula. (Its non-relativistic limit was first worked out byLarmor in 1897.)

Non-examinable aside: If the above argument is not to your taste, the same expression forthe power radiated in a general frame can be obtained by direct calculation. The radiationpart of the electric field is now given by

E(t,x) =q

4πε0R× [(R− v/c)× a]R(1− v · R/c)3c2

, (7.68)

Part-II Electrodynamics 63

where all quantities on the right are evaluated at τ∗. The Poynting vector is still given byEq. (7.60), but when integrating this over the surface of a sphere of radius R(τ∗) at time t,to determine the radiated power, we have to be careful to account for Doppler shifts thatrelate dt at the source to the increment in time for radiation to pass through the radius R(τ∗).Following the arguments in the footnote on Page 61, and noting that the coordinate timebetween the events at τ∗ and τ∗+ dτ on the charge’s worldline is γ(τ∗)dτ , the power emittedper solid angle (i.e.,, energy per solid angle per coordinate time at source) is

dE

dtdΩ=

1cµ0

(1− v · R

c

)|E|2R2

=µ0q

2

(4π)2c

∣∣∣R× [(R− v/c)× a]∣∣∣2(

1− v · R/c)5 . (7.69)

Note that the angular dependence of the radiated power is more complicated than the sin2 θdependence in the instantaneous rest frame. In particular, for a particle moving relativisti-cally, the power is significantly concentrated in the direction of motion, R ∝ v. After somevector algebra, Eq. (7.69) can be reduced to

dE

dtdΩ=

µ0q2

(4π)2c

2(R · a)(a · v/c)(

1− v · R/c)4 +

|a|2(1− v · R/c

)3 −(R · a)2

γ2(

1− v · R/c)5

. (7.70)

Integrating over solid angle17 directly gives the final result in Eq. (7.67).

7.4 Scattering

As a final application of the material developed in this section on radiation of elec-tromagnetic waves, we consider the issue of scattering. An incident electromagneticwave will cause a free charge, such as an electron, to oscillate and the acceleration thataccompanies the oscillation will cause the charge to radiate. In this manner, the inci-dent radiation is scattered into other directions. A similar thing happens for systemsof bound positive and negative charges (e.g. an atom or molecule) for which the totalcharge vanishes. The incident field induces an oscillating electric dipole moment thatradiates, scattering the incident radiation.

For the case of electron scattering, consider incident fields that are weak enough not toinduce relativistic motions. In the frame in which the electron is at rest (on average),

17A straightforward, but lengthy, way to do this is to align v with the z-axis and a in the x-z plane,and express the components of R in spherical polar coordinates.

Part-II Electrodynamics 64

the equation of motion of the electron in the incident electromagnetic wave is simply

mex = −eE , (7.71)

where me is the mass of the electron, −e is the charge, and E is the incident fieldevaluated at the location of the electron. For an incident wave at angular frequency ω,we can evaluate the right-hand side of Eq. (7.71) at the average position of the electronprovided that the wavelength of the incident wave is large compared to the amplitudeof the oscillation induced upon the electron. In this case, the electron oscillates atfrequency ω and with amplitude eE0/(meω

2), where E0 is the amplitude of the electricfield in the incident wave. This oscillation amplitude will be small compared to thewavelength provided that

eE0

mecω 1 . (7.72)

However, since the maximum speed of the electron is the product of the frequencyand amplitude, i.e., eE0/(meω), Eq. (7.72) is equivalent to the requirement that theelectron motion is non-relativistic and so will hold by assumption.

The acceleration of the electron is therefore −eE/me and, using the Larmor for-mula (7.67) (and taking v = 0), the radiated power is, after averaging over time,⟨

dE

dt

⟩=

µ0

6πc

e4E20

2m2e

. (7.73)

The time-averaged magnitude of the Poynting vector of the incident radiation is

〈|N|〉 =E2

0

2µ0c, (7.74)

and the ratio of the radiated (scattered) power to this incident flux defines the Thomsoncross-section,

σT =µ2

0e4

6πm2e

=e4

6πε20m2ec

4. (7.75)

This is the effective geometric cross-section of the electron to scattering. Note that itis independent of frequency. The Thomson cross-section can be neatly written in theform

σT =8π

3r2e , (7.76)

where the classical electron radius re is defined by

e2

4πε0re= mec

2 . (7.77)

(One can think of re as being proportional to the radius that a uniform, extendedcharge e would have if the rest-mass energy were all electrostatic.)

Part-II Electrodynamics 65

The other type of scattering we shall consider here is scattering of long wavelength ra-diation (λ physical size of scatterer) from bound charges. This is known as Rayleighscattering. For bound charges, provided that the frequency of the incident wave is smallcompared to any resonant frequency (or transition frequency in a quantum description),the bound charges will collectively displace following the applied force from the inci-dent wave. This induces an oscillating electric dipole moment of the form P = αE,where E is the incident electric field at the position of the (small) scatterer, and α isthe polarizability.

The wavelength of the scattered radiation is the same as that of the incident radiationand is large compared to the size of the scatterer. We are therefore in the dipole limit,and the radiated power is given by Eq. (7.34). Since P = −αω2E, the time-averagedradiated power is ⟨

dE

dt

⟩=

µ0

6πc

α2ω4E20

2. (7.78)

Dividing by the time-averaged incident energy flux gives the cross-section for Rayleighscattering:

σ =µ2

0α2ω4

6π∝ 1

λ4. (7.79)

Note the 1/λ4 wavelength dependence of the cross-section, so that short wavelengths(high frequencies) are scattered more than longer wavelengths. Rayleigh scatteringunderlies many natural phenomena. For example, the blue hue of the overhead skyis due to the (mostly) nitrogen and oxygen molecules in the atmosphere preferentiallyscattering the blue (short wavelength) component of sunlight. Furthermore, red sunsetsfollow from preferentially scattering blue light out of the line of sight towards the settingsun.

8 Variational principles

Recall that in non-relativistic classical mechanics, the equation of motion mx = −∇Vfor a particle of mass m moving in a potential V (x, t) may be derived by extremisingthe action

S =

∫ t2

t1

(1

2mx2 − V (x)

)dt . (8.1)

In other words, we look for the path x(t), with fixed end-points at times t1 and t2,for which the action is stationary under changes of path. Note that the action hasdimensions of energy× time.

The action is the integral of the Lagrangian

L =1

2mx2 − V (x) . (8.2)

Part-II Electrodynamics 66

The Euler–Lagrange equations,

∂L

∂x=

d

dt

(∂L

∂x

), (8.3)

determine the stationary path as that satisfying

−∇V =d

dt(mx) (8.4)

(subject to the boundary conditions at t1 and t2). This is just the expected Newtonianequation of motion.

In this short section, we shall generalise this action principle to relativistic motion ofpoint particles moving in prescribed electromagnetic fields, and extend the principle tothe field itself. The ultimate motivation for this is that the starting point for quantizingany classical theory is the action. We will not have much to say on quantization in thiscourse (although we shall discuss the Schrodinger equation for particles in magneticfields in Sec. 9), but the quantization of fields is fully developed in Part-III coursessuch as Quantum Field Theory and Advanced Quantum Field Theory.

8.1 Relativistic point particles in external fields

We start by seeking an action principle for a free relativistic point particle of rest massm. We shall then include the effect of an external electromagnetic field.

The non-relativistic action for a free particle is just the integral of the kinetic energymx2/2 over time. A relativistic version of the action principle must involve a Lorentz-invariant action, and reduce to the non-relativistic version in the limit of speeds muchless than c. A good candidate for the relativistic Lagrangian is −mc2/γ, where γ isthe Lorentz factor of the particle. To see this, note that dt/γ = dτ (the proper timeincrement) and so the action is Lorentz invariant – it is just the proper time along thepath between the two fixed spacetime endpoints. Moreover,

−mc2

(1− v2

c2

)1/2

= −mc2

(1− 1

2

v2

c2+ · · ·

)= −mc2 +

1

2mv2 + · · · , (8.5)

and so the action reduces to the usual Newtonian one in the non-relativistic limit upto an irrelevant constant (−mc2). Writing the Lagrangian as

L = −mc2(1− v2/c2

)1/2, (8.6)

Part-II Electrodynamics 67

the Euler–Lagrange equations give

d

dt

(v

(1− v2/c2)1/2

)= 0 , (8.7)

since ∂L/∂x = 0. This is just the statement that the relativistic 3-momentum isconserved, as required for a free particle.

We now want to add in the effect of an external electromagnetic field. The non-relativistic form of the Lagrangian for a particle moving in a potential V (x, t) suggestsadding −qφ to the relativistic Lagrangian, where q is the charge of the particle. How-ever, this alone is insufficient since the magnetic field would not appear in the equationof motion. A further issue is related to gauge-invariance: the action principle must beinvariant under gauge transformations of the form

φ→ φ− ∂χ

∂t, A→ A + ∇χ . (8.8)

This means that the change in action between two paths with fixed endpoints x1 andx2 at t1 and t2 must be gauge-invariant. We can fix both of these issues by includinga further term qA · v in the Lagrangian:

L = −mc2(1− v2/c2

)1/2 − qφ+ qA · v . (8.9)

To see how this gives a gauge-invariant action principle, note that under a gaugetransformation

S → S + q

∫ t2

t1

(∂χ

∂t+ v ·∇χ

)dt

= S + q

∫ t2

t1

dtdt

= S + q [χ(t2,x2)− χ(t1,x1)] . (8.10)

As the endpoints are fixed, the gauge transformation does not affect the change in theaction (although it does change the absolute value of the action).

Finally, we need to check that we recover the correct equation of motion. The ithcomponent of the Euler–Lagrange equations gives

d

dt

(mvi

(1− v2/c2)1/2

)+ q

dAidt

= −q ∂φ∂xi

+ qvj∂Aj∂xi

. (8.11)

Expanding the total (convective) derivative dAi/dt gives the required equation of mo-tion:

dpidt

= −q ∂φ∂xi

+ qvj∂Aj∂xi− q∂Ai

∂t− qvj

∂Ai∂xj

= qEi + q [v × (∇×A)]i= q (E + v ×B)i , (8.12)

Part-II Electrodynamics 68

where p = γmv.

We can formulate the action principle more covariantly by recalling that the kineticpart of the action is just proportional to the proper time along the path, and can bewritten as

−mc2

∫dτ = −mc

∫(−ηµνdxµdxν)1/2

= −mc∫ (−ηµν

dxµ

dxν

)dλ , (8.13)

for some arbitrary parameter λ along the path. This action is invariant to a reparame-terisation of the path with a different λ18. We must now look for spacetime paths thatextremise the action between fixed endpoint events. For the electromagnetic part ofthe action, note that

Aµdxµ = (−φ+ v ·A) dt , (8.14)

so the full action can be written covariantly as

S = −mc∫ (−ηµν

dxµ

dxν

)1/2

dλ+ q

∫Aµ

dxµ

dλdλ . (8.15)

It is straightforward to verify the gauge-invariance of changes in the action now since,under Aµ → Aµ + ∂µχ, we have

Aµdxµ

dλ→ Aµ

dxµ

dλ+dχ

dλ. (8.16)

The last term on the right is a total derivative and so does not contribute to changesin the action for fixed endpoints.

For the covariant action, the paths are parameterised by λ so the relevant Euler–Lagrange equations are

∂L

∂xµ=

d

(∂L

∂(dxµ/dλ)

), (8.17)

where the Lagrangian

L = −mc(−ηµν

dxµ

dxν

)1/2

+ qAµdxµ

dλ. (8.18)

The required derivatives of the Lagrangian are

∂L

∂xµ= q(∂µAν)

dxν

dλ, (8.19)

18It is more convenient to use a general parameter than τ itself since the latter would imply theconstraint ηµν(dxµ/dτ)(dxν/dτ) = −c2, which would have to be accounted for when varying the path.

Part-II Electrodynamics 69

and∂L

∂(dxµ/dλ)=

mc

[−ηµν(dxµ/dλ)(dxν/dλ)]1/2ηµν

dxν

dλ+ qAµ . (8.20)

Substituting these derivatives into the Euler–Lagrange equations, and using dAµ/dλ =(dxν/dλ)∂νAµ, we find

mcd

(ηµνdx

ν/dλ

[−ηµν(dxµ/dλ)(dxν/dλ)]1/2

)= q

dxν

dλ(∂µAν − ∂νAµ)

= qFµνdxν

dλ. (8.21)

This equation is reparameterisation invariant, as expected since the action is. It sim-plifies if we take λ equal to the proper time,

dpµdτ

= qFµνuν , (8.22)

where pµ is the 4-momentum and uµ = dxµ/dτ is the 4-velocity. This is the expectedcovariant equation of motion (Eq. 5.64).

8.2 Action principle for the electromagnetic field

We now extend the action principle to the electromagnetic field itself. Rather thandealing with functions of a single variable, like xµ(λ), we now have fields defined overspacetime. The action will therefore involve the integral over spacetime (with theLorentz-invariant measure d4x introduced in Sec. 7) of a Lorentz-invariant Lagrangiandensity L. This Lagrangian density must also be gauge-invariant. The variational prin-ciple is then to find field configurations such that the action is stationary for arbitrarychanges in the field that vanish on some closed surface in spacetime.

The covariant Maxwell equations involve first derivatives of Fµν or, equivalently, secondderivatives of Aµ. This suggests considering a Lagrangian density L ∝ FµνF

µν , with Aµthe underlying field to vary, for the part of the action describing the free electromagneticfield. Adding this to the action in Eq. (8.15) for a particle in an external field, we aretherefore led to consider a total action of the form

S = − 1

4µ0c

∫FµνF

µν d4x−∑

mc

∫ (−ηµν

dxµ

dxν

)1/2

dλ+∑

q

∫Aµ

dxµ

dλdλ ,

(8.23)for a system of N point particles interacting the electromagnetic field. Here, thesummations are over the particles, each of which has its own worldline xµ(λ). Notethat the first term in the action, describing the free electromagnetic field, has the

Part-II Electrodynamics 70

correct dimensions of energy× time. The numerical factor is chosen to get the correctfield equation (see below).

Varying the xµ(λ) for each particle gives an equation of motion like Eq. (8.22) for eachparticle. For the Aµ variation, we note that the interaction (i.e., third) term in theaction can be rewritten in terms of the 4-current of the particles,

Jµ(x) =∑

qc

∫δ(4) (x− y(λ))

dyµ

dλdλ (8.24)

using ∫AµJ

µ d4x =∑

qc

∫dλ

∫d4x δ(4) (x− y(λ))

dyν

dλAµ(x)

=∑

qc

∫Aµ (y(λ))

dyµ

dλdλ . (8.25)

Varying Aµ by δAµ in the action therefore gives

δS = − 2

4µ0c

∫(∂µδAν − ∂νδAµ)F µν d4x+

1

c

∫δAµJ

µ d4x

= − 1

µ0c

∫(F µν∂µδAν − µ0J

νδAν) d4x

=1

µ0c

∫(∂µF

µν + µ0Jν) δAν d

4x . (8.26)

Here, we have integrated by parts going to the last equality and dropped the surfaceterm since the variation δAν vanishes on the boundary. If the action is at an extremum,the change δS must vanish for all δAµ. This can only be the case if

∂µFµν = −µ0J

ν , (8.27)

which is the desired (sourced) Maxwell equation. The other covariant Maxwell equa-tion, ∂[ρFµν] = 0, is implied by our assumption that Fµν is derived from the 4-potentialAν .

9 Scalar electrodynamics and superconductivity

In this section we discuss how to include electromagnetism (and in particular magneticfields) in the non-relativistic quantum mechanics of a massive, charged spin-0 particle.Our treatment of the electromagnetic field will remain classical; quantization of theelectromagnetic field requires the more involved machinery of Quantum Field Theory(see Part-III courses). The non-relativistic scalar electrodynamics that we developprovides a phenomenological (but powerful) description of superconductivity that webriefly introduce in Sec. 9.2.

Part-II Electrodynamics 71

9.1 Electromagnetism and non-relativistic quantum mechan-ics

We begin by recalling the general procedure for quantizing the non-relativistic dynamicsof a massive particle (see e.g. the Part-II course Principles of Quantum Mechanics)moving in a potential V (x, t). The non-relativistic Lagrangian is just the kinetic energyminus the potential energy:

L =1

2mv2 − V (x) , (9.1)

where m is the mass and v = x is the 3-velocity (overdots denoting derivatives withrespect to coordinate time in this section). From the Lagrangian, we construct thecanonical momentum

π ≡ ∂L

∂x= mv , (9.2)

which here is just the usual mechanical 3-momentum. The Hamiltonian is H ≡ π·x−L,and here is just the total energy

H =1

2mπ2 + V (x, t) , (9.3)

and is conserved if the potential V is time independent. To quantize, we promote πand x to operators satisfying the canonical commutation relations

[xi, πj] = ihδij . (9.4)

In the coordinate representation, where the state of the system is (fully) described by atime-dependent wavefunction Ψ(x, t), we have π → −ih∇ and x→ x (i.e.,, the actionof x on the state vector |Ψ〉 reduces to xΨ). The wavefunction evolves according tothe time-dependent Schrodinger equation

H|Ψ〉 = ih∂|Ψ〉∂t

. (9.5)

In the coordinate representation, this reduces to

1

2m

(h

i∇)2

Ψ + V (x, t)Ψ = ih∂Ψ

∂t. (9.6)

We now consider the case of a particle coupled to electromagnetism through its chargeq. The non-relativistic Lagrangian is, from Eq. (8.9),

L =1

2mv2 − qφ+ qA · v , (9.7)

The canonical momentum evaluates to

π = mv + qA . (9.8)

Part-II Electrodynamics 72

Note this is not just the mechanical momentum, but includes the magnetic vectorpotential too. Expressing the mechanical momentum as mv = π−qA, the HamiltonianH = π · x− L simplifies to

H =1

2m(π − qA)2 + qφ , (9.9)

giving the time-dependent Schrodinger equation

1

2m

(h

i∇− qA

)2

Ψ + qφΨ = ih∂Ψ

∂t. (9.10)

The observable properties of the system described by Eq. (9.10) must be gauge-invariant.In other words, if we work in a different gauge, with A→ A+∇χ and φ→ φ−∂χ/∂t,we will generally get a different solution for Ψ, but the observables must be the same.Since global phase changes of Ψ are not observable, we consider19

Ψ→ eiqχ/hΨ . (9.11)

The particular phase change is chosen so that the Schrodinger equation is still solvedby the transformed Ψ, A and φ if it was solved in the original gauge. To check this isthe case, note that(

h

i∇− q(A + ∇χ)

)(eiqχ/hΨ

)= eiqχ/h

(h

i∇− qA

⇒(h

i∇− qA

)Ψ→ eiqχ/h

(h

i∇− qA

)Ψ , (9.12)

and also [ih∂

∂t− q

(φ− ∂χ

∂t

)] (eiqχ/hΨ

)= eiqχ/h

[ih∂

∂t− qφ

⇒(ih∂

∂t− qφ

)Ψ→ eiqχ/h

(ih∂

∂t− qφ

)Ψ . (9.13)

Hence, we see that under the gauge transformation A→ A + ∇χ and φ→ φ− ∂χ/∂t,the wavefunction also must transform according to Eq. (9.11).20 The phase changedoes not alter any observable properties, for example the probability density |Ψ|2.

19Gradients in the phase can have observable consequences, for example they alter the probabilitycurrent. We shall see below how to construct a gauge-invariant current.

20Note how the coupling to electromagnetism promotes the global U(1) symmetry of the usualSchrodinger theory (i.e., the freedom to make global phase changes that are constant in spacetime)to a local U(1) symmetry (spacetime-dependent phase changes). Indeed, the idea that introducinggauge fields to promote a global symmetry of a theory to a local one also introduces interactions tothe theory, which are mediated by the gauge field, is a key one in particle physics and underlies thestandard model.

Part-II Electrodynamics 73

The (probability) current density will play an important role in our discussion of su-perconductivity. This satisfies

∂|Ψ|2

∂t+ ∇ · JΨ = 0 , (9.14)

where, recall, we interpret |Ψ|2 as the probability density for locating the particle. Inthe presence of the magnetic vector potential, the probability current density mustdiffer from the usual expression,

JΨ = − ih

2m(Ψ∗∇Ψ−Ψ∇Ψ∗) , (9.15)

since the latter is neither conserved nor gauge-invariant. The appropriate generalisationis to replace the derivative ∇ with the covariant derivative

D ≡∇− i qh

A , (9.16)

that appears (squared) on the left side of the Schrodinger equation (9.10). We see, fromEq. (9.12), that it is this derivative whose action on a wavefunction is covariant undergauge transformations (i.e.,, the result transforms the same way as the wavefunctionitself). To see that we properly obtain a conserved current, we write the Schrodingerequation as

− h2

2mD2Ψ =

(ih∂

∂t− qφ

)Ψ , (9.17)

so that

ih∂|Ψ|2

∂t=

(ih∂Ψ

∂t

)Ψ∗ −Ψ

(ih∂Ψ

∂t

)∗=

(ih∂Ψ

∂t− qφΨ

)Ψ∗ −Ψ

(ih∂Ψ

∂t− qφΨ

)∗= − h2

2m

[Ψ∗D2Ψ−Ψ

(D2Ψ

)∗]. (9.18)

Now, for any Ψ and Φ

∇ · (Φ∗DΨ) = (DΦ)∗ ·DΨ + Φ∗D · (DΨ) , (9.19)

so that Eq. (9.18) becomes

ih∂|Ψ|2

∂t= − h2

2m[∇ · (Ψ∗DΨ)−∇ · (Ψ(DΨ)∗)] . (9.20)

This establishes that an appropriate probability current is

JΨ = − ih

2m[Ψ∗DΨ−Ψ(DΨ)∗]

= − ih

2m(Ψ∗∇Ψ−Ψ∇Ψ∗)− q

m|Ψ|2A . (9.21)

Part-II Electrodynamics 74

Finally, in terms of the phase θ of the wavefunction, where Ψ = |Ψ|eiθ, the current canbe written as

JΨ =h|Ψ|2

m

(∇θ − q

hA), (9.22)

so it is the gauge-invariant gradient of the phase that determines the current density.

The Schrodinger equation (9.10) can be derived from an action S =∫Ldt, where the

Lagrangian

L =

∫ [1

2Ψ∗(ih∂

∂t− qφ

)Ψ +

1

((ih∂

∂t− qφ

)∗− h2

2m|DΨ|2

]d3x . (9.23)

This is manifestly gauge-invariant. The asymmetry between time and spatial deriva-tives reflects the lack of Lorentz invariance of the non-relativistic Schrodinger equation.Varying Ψ, we have

δS =

∫ [1

2δΨ∗

(ih∂

∂t− qφ

)Ψ +

1

2Ψ∗(ih∂

∂t− qφ

)δΨ

− h2

2m(DδΨ)∗ ·DΨ + c.c.

]dt d3x

=

∫ [δΨ∗

(ih∂

∂t− qφ

)Ψ− h2

2m

(∇ · (δΨ∗DΨ)− δΨ∗D2Ψ

)+ c.c.

]dt d3x ,

(9.24)

where we integrated by parts (in time) and dropped the boundary term, and usedEq. (9.19). If we now use the divergence theorem, we find

δS =

∫ [δΨ∗

(ih∂

∂t− qφ

)Ψ +

h2

2mδΨ∗D2Ψ + c.c.

]dt d3x . (9.25)

Since this must vanish for all δΨ, we recover the Schrodinger equation in the form ofEq. (9.17).

9.2 Application to superconductivity

In this section we apply the ideas developed above to give a phenomenological de-scription of the equilibrium (hence static) properties of superconductors in magneticfields.

Superconductivity is observed in many metals and alloys below a (material-dependent)transition temperature Tc. The superconducting state is defined by the following elec-trodynamic properties.

Part-II Electrodynamics 75

• The electrical resistivity disappears completely (at zero frequency) and so cur-rents can flow persistently with no dissipation. This phenomenon was first ob-served in mercury (Tc = 4.1 K) by Onnes in 1911.

• Perfect diamagnetism, i.e.,, the magnetic field inside a superconductor vanishes.If a magnetic field is established in a sample above Tc by external currents, asthe sample is cooled below Tc (super-)currents are induced on the surface of thesuperconductor that exactly cancel the magnetic field in the bulk of the sample.This effect is called the Meissner effect and was first observed by Meissner &Ochensfeld in 1933. Note the distinction from a perfect conductor, which resistsany change in magnetic flux and so would not expel a pre-existing field from itsinterior on cooling.

The microscopic theory of (conventional) superconductivity is due to Bardeen, Cooper& Schrieffer (BCS; 1957). It is beyond the scope of this course to give a detaileddescription of this theory and we shall make do with a brief summary of the key ideas.The most important idea is that of a net attractive interaction between electrons nearthe Fermi surface mediated by phonons (quantized vibrations of the ionic lattice). Inessence, the mutual Coulomb repulsion of the delocalised electrons is over-screenedby the motion of the lattice as the ions are attracted to electron overdensities. Aphonon-mediated interaction neatly explains the observed isotope effect, whereby thesuperconducting properties of a metallic element depend on the specific isotope (hencenuclear mass). The consequence of the net attraction between electrons is that theycan form bound pairs, known as Cooper pairs, which therefore have integer spin andso behave as bosons. The Pauli exclusion principle does not apply to the bosonicCooper pairs, so they can call occupy the same two-electron state. The extent ofthe pair wavefunction is very large (around 10−7 m) compared to the typical ionicseparation (around 10−10 m), so that the collective behaviour of the pairs is critical forthe superconducting state.

The transport properties of a superconductor, for example the electrical conductivity,arise from non-equilibrium motion of the Cooper pairs (plus any contribution fromelectrons in the normal phase). At any finite temperature, some pairs are dissociatedby thermal excitation and their electrons occupy the usual quasi-single-electron levelsof the normal (non-superconducting) phase. Above the critical temperature all pairsare dissociated and the electrons occupy the usual eigenstates of the independent elec-tron approximation. The transition from normal to superconducting behaviour is anexample of a second-order phase transition21. Roughly speaking, there is no latent heatassociated with a second-order phase transition, but the heat capacity is discontinuous.A perhaps more familiar example is the ferromagnetic phase transition, whereby the

21Phase transitions are covered in more detail at the end of the Lent-term Statistical Physics Part-IIcourse.

Part-II Electrodynamics 76

magnetization of a sample in zero applied magnetic field changes continuously fromzero as the sample is cooled through the Curie temperature.

9.2.1 Ginzburg–Landau theory

Ginzburg–Landau theory provides a general phenomenological description of second-order phase transitions. Its application to superconductivity pre-dates the microscopicBCS theory. The basic idea is to characterise the superconducting state by a complexorder parameter ψ(x), which can crudely be thought of as a one-particle wavefunctiondescribing the centre-of-mass motion of a Cooper pair of electrons22. The theory isconstructed so that the equilibrium value of the order parameter vanishes above thecritical temperature Tc, but it acquires a non-zero (temperature-dependent) value belowTc. The value of |ψ|2 can be interpreted as the number density of electron pairs formingthe superconducting state.

The key quantity in Ginzburg–Landau theory is the free energy functional, which is afunctional of the order parameter ψ and the magnetic field in the sample. The idea is tominimise the free energy functional to determine the form of these fields in equilibrium,subject to the external constraints imposed on the system such as its temperature,volume and the magnetic field generated by external currents. The (equilibrium) orderparameter is supposed to be continuous through the critical temperature, so close toTc we can write the free energy functional as a Taylor expansion in |ψ| about ψ = 0.Moreover, we expect gradients in the order parameter to cost energy, which we canpenalise with terms proportional to |∇ψ|2 in the free energy functional. Finally, toinclude coupling to the magnetic field, we make the gradient terms gauge-invariant byreplacing ∇ → ∇ − iqA/h and include the magnetic field energy. The microscopictheory tells us that we should take q = −2e since we are dealing with pairs of electrons.Putting these ingredients together, we have the free energy functional

F [ψ,A] =

∫ (1

2µ0

|B|2 +h2

2m

∣∣∣∣(∇− iqAh

∣∣∣∣2 + α(T )|ψ|2 +1

2β(T )|ψ|4

)d3x ,

(9.26)where B = ∇×A. The normalisation of the gradient-squared term is conventional andis chosen by analogy with the action for the Schrodinger equation. Being a phenomeno-logical description, the mass m has to be chosen by comparison with experiment orfrom the appropriate limit of a microscopic theory. On the basis of BCS theory, weexpect m ∼ 2m∗e, where m∗e is an appropriate effective mass of the electrons moving inthe periodic potential of the unperturbed lattice of ions. Finally, at any temperature,we require β > 0 so that F is bounded from below. We shall see below how to chooseα(T ) such that we get the desired phase transition to the superconducting state at Tc.

22In ferromagnetism, the appropriate order parameter is the magnetic moment per volume.

Part-II Electrodynamics 77

The free energy functional in Eq. (9.26) is gauge-invariant under the usual transforma-tions A→ A + ∇χ and ψ → eiqχ/hψ. Varying the action with respect to ψ (followingthe steps used for the Schrodinger equation in the previous section), we find

− h2

2m

(∇− iqA

h

)2

ψ + α(T )ψ + β(T )|ψ|2ψ = 0 . (9.27)

This is like the Schrodinger equation but with a non-linear term β|ψ|2ψ. Varying withrespect to A, we have

δF =

∫ h2

2m

[(−iqδA

)· (Dψ)∗ + (Dψ) ·

(iqδA

hψ∗)]

+1

µ0

(∇× δA) ·Bd3x ,

(9.28)

where, recall, D = ∇− iqA/h. Noting that

(∇× δA) ·B = ∇ · (δA×B) + δA · (∇×B) , (9.29)

on dropping the resulting surface term in δF , we find the (static) Ampere equation,

∇×B = µ0Js , (9.30)

where the supercurrent

Js = −ihq2m

[ψ∗Dψ − ψ(Dψ)∗]

= −ihq2m

[ψ∗(

∇− iqAh

)ψ − c.c.

]. (9.31)

Note that this is just the product of the charge q and the usual gauge-invariantprobability current for a wavefunction ψ. Generally, we must solve the non-linearSchrodinger equation (9.27) and the Ampere equation (9.30) simultaneously, with ap-propriate boundary conditions, to find the magnetic field and supercurrent in the su-perconductor.

We begin by considering the case of a uniform (homogeneous) superconductor in zeroapplied magnetic field. Assuming A = 0, there exist constant solutions to Eq. (9.27)with

ψ[α(T ) + β(T )|ψ|2

]= 0 . (9.32)

Now, we want the superconducting order parameter to vanish for temperatures higherthan the critical temperature Tc, so that no electrons are in the superconducting phase.We can ensure this is the case by taking α(T ) > 0 for T > Tc, since, recall, β(T ) > 0for all T . Below Tc we want a non-zero (equilibrium) order parameter, which requiresα(T ) < 0 for T < Tc; then

|ψ|2 = −α(T )/β(T ) (T < Tc) . (9.33)

Part-II Electrodynamics 78

In the vicinity of the critical temperature, generically we expect α(T ) = a(T − Tc) forsome constant a, and β(T ) ≈ b for some other constant b.

For zero field and with ψ uniform, the free energy functional becomes

F ∝ α(T )|ψ|2 + β(T )|ψ|4/2 . (9.34)

For T > Tc, this has a single minimum at |ψ| = 0, but for T < Tc, the point |ψ| = 0becomes a local maximum and the minimum free energy is realised at |ψ| = −α/β.We see that the equilibrium value of the order parameter smoothly transitions from|ψ| = 0 above the critical temperature to |ψ| =

√−α/β below Tc, as appropriate for a

second-order phase transition.

Meissner effect

Ginzburg–Landau theory naturally accounts for the expulsion of magnetic flux from asuperconductor. To see this, we write the supercurrent in terms of the gauge-invariantphase gradient, following Eq. (9.22):

Js =qh|ψ|2

m

(∇θ − q

hA), (9.35)

where ψ = |ψ|eiθ. We expect the magnetic field to decay sharply inside the supercon-ductor. If we assume that we can ignore spatial variations in |ψ| over this transitionregion, we can approximate |ψ|2 = −α(T )/β(T )23. Taking the curl, we have

∇× Js ≈ −q2|ψ|2

mB , (9.36)

which combines with the Ampere equation (9.30) to give

∇× (∇×B) = −µ0q2|ψ|2

mB

⇒ ∇2B =µ0q

2|ψ|2

mB . (9.37)

23This approximation is not really justified. Equation (9.27) defines a characteristic coherence lengthξ over which the magnitude of ψ will vary. By writing ψ in dimensionless form by dividing by

√−α/β

(for T < Tc), the remaining length scale in the zero-field limit is

ξ =h√

2m|α|.

It is over this scale that ψ recovers to the homogeneous value√−α/β. Since α → 0 as T → Tc, the

coherence length is large near the critical temperature. However, as we shall see shortly, the magneticfield only penetrates a distance λ ∝ 1/

√|α|, which may be comparable to ξ. In general, one must

consider variations in the order parameter and the magnetic field, in which case both lengths ξ and λenter the problem. Note that their ratio κ ≡ λ/ξ is (almost) independent of temperature.

Part-II Electrodynamics 79

Consider the case of a plane surface at x = 0, with the superconductor filling the regionx > 0. Take the magnetic field along the z-direction. Then the solution of Eq. (9.37)for x > 0 that is finite at x→∞ is

Bz(x) = B0e−x/λ , (9.38)

where B0 is the magnetic field outside the superconductor and the penetration depth

λ2 =m

µ0q2|ψ|2. (9.39)

(We have assumed that |ψ|2 is constant.) We see that a static magnetic field onlypenetrates a distance of order λ into the superconductor. Supercurrents flow withinthis surface region, screening the interior of the superconductor from the applied mag-netic field. Taking q = −2e, m ∼ 2me (twice the free electron mass) and |ψ|2 = nsroughly equal to half the number density of free electrons for the normal state (whichis reasonable well below Tc), we find λ ∼ 10−8–10−7 m. This is large compared to theionic separation but small compared to typical macroscopic scales.

Flux quantization

The form of the supercurrent in Eq. (9.35) has a very striking implication: the magneticflux linked by a superconductor is quantized. To see this, consider a superconductingring with a magnetic field threading the hole in the centre. Since the magnetic fieldinside the superconductor is appreciable only near the surface, the same is true of thesupercurrent (by Ampere’s equation). In the bulk of the superconductor, where Js = 0,we have

A =h

q∇θ , (9.40)

where θ is the phase of ψ. If we integrate this equation around a closed path deepinside the superconductor that encircles the hole, we have∮

A · dx =h

q

∮∇θ · dx

⇒∫

B · dS =h

q∆θ , (9.41)

where we used Stokes theorem to relate∮

A · dx to the magnetic flux Φ =∫

B · dSlinked by the circuit (hence ring). The change in phase ∆θ on traversing the circuitmust be an integer multiple n of 2π since ψ must be single-valued. It follows that

Φ =2πhn

2e, (9.42)

where we have used the BCS result q = −2e, i.e., the flux linked by the ring is quantizedin units of Φ0 where

Φ0 =πh

e= 2.07× 10−15 T m2 . (9.43)

Part-II Electrodynamics 80

Flux quantization was first observed in 1961 (by Deaver & Fairbank and Doll &Nabauer), providing compelling evidence for the validity of a description of super-conductivity by a complex order parameter ψ. Moreover, the observed value of thefluxoid Φ0 validates the idea that the current is carried by pairs of electrons withq = −2e.