
Typeset in LATEX 2ε May 23, 2019

A Mathematical Introduction to Magnetohydrodynamics

Omar Maj

Max Planck Institute for Plasma Physics, D-85748 Garching, Germany.

e-mail: [email protected]


Contents

Preamble

1 Basic elements of fluid dynamics
  1.1 Kinematics of fluids
  1.2 Lagrangian trajectories and flow of a vector field
  1.3 Deformation tensor and vorticity
  1.4 Advective derivative and Reynolds transport theorem
  1.5 Dynamics of fluids
  1.6 Relation to kinetic theory and closure
  1.7 Incompressible flows
  1.8 Equations of state, isentropic flows and vorticity
  1.9 Effects of Euler-type nonlinearities

2 Basic elements of classical electrodynamics
  2.1 Maxwell’s equations
  2.2 Lorentz force and motion of an electrically charged particle
  2.3 Basic mathematical results for electrodynamics

3 From multi-fluid models to magnetohydrodynamics
  3.1 A model for multiple electrically charged fluids
  3.2 Quasi-neutral limit
  3.3 From multi-fluid to a single-fluid model
  3.4 The Ohm’s law for an electron-ion plasma
  3.5 The equations of magnetohydrodynamics

4 Conservation laws in magnetohydrodynamics
  4.1 Global conservation laws in resistive MHD
  4.2 Global conservation laws in ideal MHD
  4.3 Frozen-in law
  4.4 Flux conservation
  4.5 Topology of the magnetic field
  4.6 Analogy with the vorticity of isentropic flows

5 Basic processes in magnetohydrodynamics
  5.1 Linear MHD waves
  5.2 Nonlinear shear Alfvén waves
  5.3 Magnetic field diffusion
  5.4 Magnetic reconnection: basic ideas and examples

6 Variational formulation
  6.1 Basic elements of calculus of variations
  6.2 Existence of a variational formulation
  6.3 Variational principle for Maxwell’s equations
  6.4 Motion of a charged particle in an electromagnetic field
  6.5 First-order Lagrangian theories
  6.6 Jet bundles and Noether’s theorem
  6.7 Geodesics and Euler’s equations of fluid dynamics
  6.8 Lagrangian formulation of ideal MHD

7 Hamiltonian formulation
  7.1 Introduction to Hamiltonian systems
  7.2 Hamiltonian structure of ideal MHD
  7.3 Metriplectic systems and dissipation

A Proofs of the results on kinetic theory and closure
  A.1 Proof of proposition 1.13
  A.2 Proof of proposition 1.14
  A.3 Proof of proposition 1.15

B Energy conservation in extended MHD models

C Magnetic vector potential in MHD

D Lie derivatives and passively advected quantities

References

Preamble

Magnetohydrodynamics is the theory of electrically conducting, neutral fluids. The basic equations of magnetohydrodynamics (MHD) were proposed by Hannes Alfvén [1, 2], who realized the importance of both the electric currents carried by a plasma and the magnetic field they generate. Alfvén combined the equations of fluid dynamics with Faraday’s and Ampère’s laws of electrodynamics, thus obtaining a novel mathematical theory, which helped us to understand space plasmas in Earth and planetary magnetospheres, as well as the physics of the Sun, solar wind, and stellar atmospheres. In fusion research, MHD is crucial to the understanding of plasma equilibria and their stability. Liquid metals and electrolytes, like salt water, can also be modeled by MHD equations.

Besides the important physical applications, MHD equations exhibit a remarkably beautiful mathematical structure, with connections to geometry and topology that allow us to understand some of the dynamics of magnetic fields in plasmas in terms of topological ideas [3, 4, 5, 6, 7, 8]. As a dynamical system, MHD is an example of an infinite-dimensional Hamiltonian system [9, 10].

The scope of this lecture is, in this regard, extremely limited. The goal is to introduce MHD equations in a reasonably self-contained way and to discuss some of their most important features. The style of these lectures is quite similar to the mathematical introduction to fluid dynamics by Chorin and Marsden [11], as the title of this note suggests. In particular, we shall attempt to introduce the physics modeling in a mathematically precise, albeit not always rigorous, way.

The physics literature on the subject is vast. As a reference for further reading, the book by Biskamp [12] provides a clear and comprehensive exposition, while the lectures by Schnack [13] offer a more gradual learning curve. For MHD of the solar atmosphere one can refer to Priest [14] as well as to Aschwanden’s book on the solar corona [15]. For an introduction to magnetohydrodynamics with emphasis on equilibria and stability of fusion plasmas one can refer to the books by Freidberg [16] and Zohm [17], while Goedbloed, Poedts and Keppens address applications to both astrophysical and fusion plasmas [18, 19]. A nice introduction to MHD with a broader perspective, which includes applications to metals, can be found in Davidson’s book [20].

On the mathematical side, MHD has received considerable attention from applied mathematicians. Its rich mathematical structure has become a paradigm for the application of geometry and topology [8, 21] as well as for structure-preserving discretization [22, 23, 24, 25]. As a system of partial differential equations, well-posedness of the Cauchy problem for MHD equations subject to appropriate boundary conditions was first studied by Duvaut and Lions [26, in French], Sermange and Temam [27], Secchi [28] and, more recently, by Chen and coworkers [29] and by Fefferman and coworkers [30, and references therein].


1 Basic elements of fluid dynamics

The basic understanding of fluid dynamics is an essential prerequisite to the study of MHD. We shall start by recalling the basic elements thereof, following Chorin and Marsden [11], cf. also Marsden and Hughes [31]. First, we define the physical quantities that describe the dynamical state of a fluid (kinematics) and continue with the equations of motion (dynamics).

1.1 Kinematics of fluids. Fluid dynamics is built on the basis of the continuum hypothesis: A fluid is a distribution of matter occupying a certain region of the continuous three-dimensional space. The considered region of space is a domain (i.e., an open, non-empty, and connected subset) $\Omega \subseteq \mathbb{R}^3$. We neglect the fact that any fluid is ultimately made of atoms and molecules as we are interested in studying its collective motion on a much larger spatial scale.

With the continuum hypothesis, one needs to quantify how matter is distributed in $\Omega$ at any time $t$ in a certain time interval $I \subseteq \mathbb{R}$. Thus, the first physical quantity of interest is the mass density, which is a positive time-dependent scalar field, $\rho : I \times \Omega \to \mathbb{R}_+$ with $\mathbb{R}_+$ being the set of strictly positive real numbers, such that

\[
\rho(t, x) > 0 \tag{1.1}
\]

gives the mass per unit of volume at time $t \in I$ at the spatial location $x \in \Omega$. By definition, the amount of mass contained in an arbitrary volume $W \subseteq \Omega$ (to be referred to as a control volume) is given by

\[
(\text{mass in } W \text{ at time } t) = \int_W \rho(t, x)\, dx,
\]

which implies that the mass density must be at least locally integrable.

Physically we think of an infinitesimal volume of fluid centered around a point $x \in \Omega$. The volume of this infinitesimal region is mathematically represented by the Lebesgue measure $dx$ in $\Omega$ and the mass is represented by the measure $\rho(t, x)\,dx$. Such infinitesimal portions of fluid are referred to as fluid elements.

In plasma physics the mass density is often replaced by an equivalent positive scalar field referred to as the particle number density, or simply number density, which is defined in terms of the mass density by

\[
n(t, x) = \rho(t, x)/m.
\]

Here the fluid is regarded as a collection of particles that all have the same mass $m$; thus, the number of particles contained in a control volume $W$ is given by

\[
(\text{number of particles in } W \text{ at time } t) = \frac{1}{m} \int_W \rho(t, x)\, dx = \int_W n(t, x)\, dx.
\]

According to this definition, the number of particles does not need to be an integer, due to the continuum hypothesis.

Next we need to describe the motion of the fluid. We introduce a velocity field defined as a time-dependent vector field, $u : I \times \Omega \to \mathbb{R}^3$, such that

\[
u(t, x) \in \mathbb{R}^3 \tag{1.2}
\]

gives the velocity of the fluid element at the point $x \in \Omega$ and time $t \in I$. The vector $u(t, x)$ is referred to as the fluid velocity. The mass times the velocity of the fluid element, namely, $\rho(t, x) u(t, x)\, dx$, gives the linear momentum of the fluid element, hence

\[
(\text{momentum in } W \text{ at time } t) = \int_W \rho(t, x) u(t, x)\, dx.
\]

In addition to the quantities $\rho(t, x)$ and $u(t, x)$, which can be regarded as the counterpart in fluid dynamics of the mass and velocity of a particle in mechanics, we need to specify another scalar field for the internal energy of the fluid element. Unlike a point-mass particle, a fluid element is a thermodynamical system that can undergo expansions and compressions, thus absorbing and releasing energy. The thermodynamic state of a fluid element is specified by the internal energy density $U : I \times \Omega \to \mathbb{R}_+$. If each fluid element is regarded as an ideal gas composed of $n(t, x)\,dx$ particles, we can equivalently express the internal energy density in terms of a new variable. Specifically, the laws of thermodynamics for a perfect gas allow us to write the internal energy density in the form

\[
U(t, x) = \frac{3}{2}\, n(t, x)\, k_B T(t, x), \tag{1.3}
\]

where $k_B$ is the Boltzmann constant and $T : I \times \Omega \to \mathbb{R}_+$ is a strictly positive scalar field, such that

\[
T(t, x) > 0
\]

represents the local temperature of the fluid element at time $t \in I$ and position $x \in \Omega$. Therefore, the total energy carried by a fluid element is the sum of the kinetic energy associated with its motion plus the internal energy associated with its thermodynamics, namely,

\[
(\text{energy in } W \text{ at time } t) = \int_W \Big( \frac{1}{2} \rho(t, x) u(t, x)^2 + \frac{3}{2} n(t, x) k_B T(t, x) \Big)\, dx.
\]

When the law of a perfect gas does not apply, we can still define $T$ by equation (1.3), which is now viewed as a purely mathematical change of variable $U \mapsto T$. In general, we shall regard $T$ as a measure of the internal energy, without necessarily implying thermodynamical equilibrium.

Summarizing, we shall describe the dynamical state of a fluid by the triple of functions $(\rho, u, T)$, where

• the mass density $\rho$ is a positive scalar field,

• the fluid velocity $u$ is a vector field, and

• the temperature $T$ is a positive scalar field.

The equations of fluid dynamics are a system of partial differential equations governing the time evolution of $(\rho, u, T)$.

1.2 Lagrangian trajectories and flow of a vector field. Under appropriate hypotheses, we can associate to any velocity field $u : I \times \Omega \to \mathbb{R}^3$ a family of maps $F_t : \Omega \to \Omega$, parametrized by time $t$ in a possibly smaller interval $I_\varepsilon \subseteq I$. Such a one-parameter family of maps $\{F_t\}_{t \in I_\varepsilon}$ is referred to as the flow of the vector field. It gives an equivalent description of the motion of the fluid, i.e., the vector field $u$ and its flow $F_t$ contain the same information on the fluid motion. In this section we shall define the flow and prove some of its basic properties. Although it is often overlooked in the physics literature, the flow is a key concept in the mathematical theory of fluid dynamics and thus of MHD.

Let us start from a given velocity field $u : I \times \Omega \to \mathbb{R}^3$. The associated flow $F_t$ is constructed from the solution of the Cauchy problem

\[
\frac{dx(t)}{dt} = u\big(t, x(t)\big), \qquad x(t_0) = x_0. \tag{1.4}
\]

Physically, the solution $t \mapsto x(t)$ represents the trajectory of a fluid element as it moves with the fluid velocity from the initial position $x_0 \in \Omega$ at time $t = t_0 \in I$. Such curves are referred to as Lagrangian trajectories.

Basic results from the theory of ordinary differential equations (ODE) guarantee the existence and uniqueness of the solution of the Cauchy problem (1.4) at least for a short time. A compact account of results on ODEs can be found, for instance, in Marsden et al. [32] as part of the theory of vector fields. The first chapter of both Hörmander’s [33] and Tao’s [34] lectures on nonlinear partial differential equations gives a very nice and compact overview of the theory. Specifically we have the following standard result that we recall without proof.

Let us fix constants $\tau, r > 0$ such that the interval $I_\tau = [t_0 - \tau, t_0 + \tau]$ and the ball $B_r(x_0) = \{x \in \mathbb{R}^3 \mid |x - x_0| \le r\}$ are contained in $I$ and $\Omega$, respectively, and let $u$ be continuous with $V = \sup |u(t, x)|$ for $(t, x)$ in $I_\tau \times B_r(x_0)$.

Theorem 1.1 (Local existence and uniqueness for ODEs). If $u$ is continuous and satisfies the Lipschitz condition

\[
|u(t, x) - u(t, y)| \le L |x - y|,
\]

with constant $L \ge 0$ on $I_\tau \times B_r(x_0)$, then for any positive $\varepsilon \le \min\{\tau, r/V\}$ with $V = \sup\{|u(t, x)| \mid (t, x) \in I_\tau \times B_r(x_0)\}$, there exists a solution $x \in C^1\big([t_0 - \varepsilon, t_0 + \varepsilon], B_r(x_0)\big)$ of the Cauchy problem (1.4), and any other solution $\tilde{x} \in C^1([t_0 - \varepsilon, t_0 + \varepsilon])$ must satisfy $\tilde{x}(t) = x(t)$ on the intersection of the domains.

We can see that the upper limit of the domain of definition is determined by the minimum time $r/V$ needed to traverse the ball $B_r(x_0)$. The interval $[t_0 - \varepsilon, t_0 + \varepsilon]$ is referred to as the lifespan of the solution. This has a physical significance: the maximum lifespan of the solution is determined by how fast the trajectory can travel up to the boundary of the considered ball. In general the maximum lifespan depends on the initial condition $x_0$. For instance, if the initial condition is very close to the boundary of $\Omega$, $r$ and thus $\varepsilon$ can be rather small. We can slightly refine this result and make the lifespan of the solution uniform for all initial conditions in a small neighborhood of $x_0$. This can be established by applying the basic existence result to a smaller ball centered on $x_0$: For all initial conditions $y_0 \in B_{r/2}(x_0)$, theorem 1.1 with $x_0$ and $r$ replaced by $y_0$ and $r/2$, respectively, gives a solution of the Cauchy problem with initial condition $x(t_0) = y_0$; then such a solution is contained in $B_r(x_0)$ and the lifespan satisfies $\varepsilon \le \min\{\tau, r/(2V)\}$ for all $y_0 \in B_{r/2}(x_0)$. Hence,

Corollary 1.2. Let $u$, $t_0$, $x_0$, $r$, $\tau$, and $V$ be as in theorem 1.1. Then there exists a neighborhood $U \subset B_r(x_0)$ and $0 < \varepsilon \le \min\{\tau, r/(2V)\}$ such that for every $y_0 \in U$ the Cauchy problem (1.4) has a solution $x \in C^1([t_0 - \varepsilon, t_0 + \varepsilon], B_r)$.
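The lifespan bound can be made concrete with a small numerical sketch. As an assumption made purely for illustration, we take the one-dimensional analogue $dx/dt = x^2$ (the theorem above is stated in $\mathbb{R}^3$), whose exact solution $x(t) = x_0/(1 - x_0 t)$ blows up at $t = 1/x_0$. The helper names `lifespan_bound` and `exact_solution` are hypothetical. The sketch evaluates $\varepsilon = \min\{\tau, r/V\}$ and checks that the exact trajectory stays in the ball for $|t| \le \varepsilon$.

```python
def lifespan_bound(x0, r, tau):
    """Return eps = min(tau, r / V) for u(x) = x**2 on the ball |x - x0| <= r."""
    V = (abs(x0) + r) ** 2  # sup of |u| = x**2 on [x0 - r, x0 + r]
    return min(tau, r / V)


def exact_solution(x0, t):
    """Exact solution of dx/dt = x**2, x(0) = x0, valid while 1 - x0 * t > 0."""
    return x0 / (1.0 - x0 * t)


x0, r, tau = 1.0, 1.0, 10.0
eps = lifespan_bound(x0, r, tau)   # here r / V = 1 / 4
blow_up_time = 1.0 / x0            # the exact solution is singular at t = 1

# Sample the exact trajectory on [-eps, eps] and check it never leaves the ball.
samples = [exact_solution(x0, k * eps / 100) for k in range(-100, 101)]
stays_in_ball = all(abs(x - x0) <= r for x in samples)
print(eps, blow_up_time, stays_in_ball)
```

In this example the guaranteed lifespan is well below the actual blow-up time, which reflects the fact that the bound $r/V$ is a sufficient condition and not a sharp estimate.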


We shall however work under the assumption that the lifespan of Lagrangian trajectories is uniform on the whole domain $\Omega$, i.e., we assume that there is an $\varepsilon > 0$ depending only on $t_0$, such that for every initial condition $x_0 \in \Omega$ there is a Lagrangian trajectory $x : I_\varepsilon \to \Omega$ with $I_\varepsilon = [t_0 - \varepsilon, t_0 + \varepsilon]$. For a generic ordinary differential equation, this is a very strong assumption. For our problem, however, this is not so strong because, in practice, it just means that the domain $\Omega$ and the boundary conditions for the vector field $u$ have been chosen properly, in the sense that “$\Omega$ contains the fluid”.

We shall fix the initial time to be $t_0 = 0$ and let $I_\varepsilon = [-\varepsilon, \varepsilon]$ be the interval of existence of the Lagrangian trajectories. At this point we are ready to define the flow of the velocity field.

Definition 1.1 (Flow). For every $t \in I_\varepsilon$ the map $F_t : \Omega \to \Omega$ is defined by

\[
x_0 \mapsto F_t(x_0) = x(t),
\]

where $x(t)$ is the Lagrangian trajectory passing through $x_0$ at the time $t = 0$. In addition this defines a map $F : I_\varepsilon \times \Omega \to \Omega$ given by $F(t, x) = F_t(x)$.

The uniqueness of the solution of the Cauchy problem for Lagrangian trajectories is essential in the definition of $F_t$. In fact, for $F_t$ to be unambiguously defined we need that $F_t(x) \ne F_t(y)$ implies $x \ne y$; it is not admissible that the same point is mapped into two different points. The reader can check that definition 1.1 is well posed in this sense because of the uniqueness of Lagrangian trajectories.

From a physical point of view the flow describes the displacement of the fluid as time advances, i.e., given a control volume $W \subseteq \Omega$, then $F_t(W) \subseteq \Omega$ is the volume occupied by the fluid initially in $W$ after it has evolved for a time $t$.

In summary, we can consider a one-parameter family of maps $F_t$ which can be used in two different ways, namely,

• $t \mapsto F_t(x)$ is the Lagrangian trajectory passing through $x$ at time $t = 0$;

• $x \mapsto F_t(x)$ is the displacement of the point $x$ after a time $t$.

As a consequence of its definition, the flow satisfies, cf. equation (1.4),

\[
\begin{cases}
\dfrac{d}{dt} F_t(x_0) = u\big(t, F_t(x_0)\big), \\[1ex]
F_0(x_0) = x_0,
\end{cases} \tag{1.5}
\]

where the initial point $x_0 \in \Omega$ is regarded as a parameter, so that we write a total derivative instead of a partial derivative and consider this an ordinary differential equation rather than a partial differential equation.

For the flow $F_t$, we always imply initial conditions at $t_0 = 0$. For general initial time $t_0 = s$, we define the map $F_{t,s}$ for $t, s \in I_\varepsilon$ by the Cauchy problem

\[
\begin{cases}
\dfrac{d}{dt} F_{t,s}(x_0) = u\big(t, F_{t,s}(x_0)\big), \\[1ex]
F_{s,s}(x_0) = x_0,
\end{cases}
\]

with $t_0 = s$; hence $F_t = F_{t,0}$. For an autonomous vector field $u = u(x)$ the maps $F_{t,s}$ can be written in terms of $F_t$. Specifically, we observe that the curve $c_s(t) = F_{t-s}(x_0)$ solves the Cauchy problem

\[
\frac{dc_s(t)}{dt} = u\big(c_s(t)\big), \qquad c_s(s) = x_0,
\]

and by uniqueness of the solution of Cauchy problems, we deduce that for autonomous fields $F_{t,s}(x_0) = c_s(t) = F_{t-s}(x_0)$.

We shall now establish a few key properties of the flow $F_t$ that essentially descend from equation (1.5).

Proposition 1.3. For every $t, s \in I_\varepsilon$ such that $t + s \in I_\varepsilon$, we have in general $F_{t+s} = F_{t+s,s} \circ F_s = F_{t+s,t} \circ F_t$, while for autonomous vector fields we have the semi-group property $F_{t+s} = F_t \circ F_s = F_s \circ F_t$.

Proof. For every $x_0 \in \Omega$, let us consider the Lagrangian trajectory $x(t')$ corresponding to the initial condition $x(0) = x_0$ at $t' = 0$. By definition, $F_{t+s}(x_0) = x(t + s)$. The curve $t' \mapsto x(t')$ also solves the problem

\[
\frac{dx(t')}{dt'} = u\big(t', x(t')\big), \qquad x(s) = F_s(x_0),
\]

hence, $x(t') = F_{t',s}\big(F_s(x_0)\big)$ and, replacing $s$ by $t$, $x(t') = F_{t',t}\big(F_t(x_0)\big)$. Then $F_{t+s}(x_0) = x(t + s) = F_{t+s,s} \circ F_s(x_0) = F_{t+s,t} \circ F_t(x_0)$. For an autonomous vector field we have $F_{t',s} = F_{t'-s}$, hence $F_{t+s} = F_t \circ F_s$ and $F_{t+s} = F_s \circ F_t$.

Proposition 1.3 has an immediate consequence.

Corollary 1.4. For every $t \in I_\varepsilon$, $F_t : \Omega \to \Omega$ is invertible and the inverse map is given by $F_t^{-1} = F_{0,t}$. For autonomous fields $F_t^{-1} = F_{0,t} = F_{-t}$.

Proof. We have $F_0(x) = x$ for all $x \in \Omega$ and the identity $F_{t+s} = F_{t+s,t} \circ F_t$ in proposition 1.3 with $s = -t$ gives $x = F_0(x) = F_{0,t} \circ F_t(x)$, which shows that $F_{0,t}$ is the inverse of $F_t$. For autonomous fields we have $F_{t,s} = F_{t-s}$, hence $F_t^{-1} = F_{-t}$.
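The semi-group and inverse properties for autonomous fields can be checked numerically. The following is a minimal sketch, assuming a hypothetical pendulum-like field $u(x) = (x_2, -\sin x_1)$ that does not appear in these notes, and approximating the flow with a classical fourth-order Runge-Kutta scheme; the function names are illustrative only.

```python
import math


def u(x):
    """Hypothetical autonomous velocity field (pendulum-like), for illustration only."""
    return [x[1], -math.sin(x[0])]


def flow(x0, t, steps=2000):
    """Approximate F_t(x0) with the classical fourth-order Runge-Kutta scheme."""
    h = t / steps  # a negative t integrates backward in time
    x = list(x0)
    for _ in range(steps):
        k1 = u(x)
        k2 = u([xi + 0.5 * h * k for xi, k in zip(x, k1)])
        k3 = u([xi + 0.5 * h * k for xi, k in zip(x, k2)])
        k4 = u([xi + h * k for xi, k in zip(x, k3)])
        x = [xi + h * (a + 2 * b + 2 * c + d) / 6
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x


def distance(a, b):
    """Maximum componentwise distance between two points."""
    return max(abs(p - q) for p, q in zip(a, b))


x0 = [0.3, -0.2]
t, s = 0.7, 0.4
# F_{t+s} versus F_t o F_s
semigroup_error = distance(flow(x0, t + s), flow(flow(x0, s), t))
# F_t o F_s versus F_s o F_t
commute_error = distance(flow(flow(x0, s), t), flow(flow(x0, t), s))
# F_{-t} o F_t versus the identity
inverse_error = distance(flow(flow(x0, t), -t), x0)
print(semigroup_error, commute_error, inverse_error)
```

All three discrepancies are expected to be at the level of the integration error only. For a non-autonomous field the same test would have to use the two-parameter maps $F_{t,s}$ instead.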

Another basic result from ODE theory implies that $F_t : \Omega \to \Omega$ is Lipschitz continuous in $\Omega$ for all $t \in I_\varepsilon$.

Proposition 1.5. If $u$ is continuous and satisfies the Lipschitz condition of theorem 1.1 on $I_\varepsilon \times \Omega$ and $F_t : \Omega \to \Omega$ is defined on $\Omega$ for all $t \in I_\varepsilon$, then

\[
\big|F_t(x) - F_t(y)\big| \le |x - y|\, e^{L|t|},
\]

for all $x, y \in \Omega$ and $t \in I_\varepsilon$. Here $L$ is the Lipschitz constant of $u$.

Proof. Let us first consider the half-interval $t \ge 0$. For every $x, y \in \Omega$ fixed, let $h(t) = F_t(x) - F_t(y)$; then, by the Lipschitz condition for $u$,

\[
\Big|\frac{dh(t)}{dt}\Big| = \big|u\big(t, F_t(x)\big) - u\big(t, F_t(y)\big)\big| \le L |h(t)|.
\]

We actually need to control the derivative of the norm, rather than the norm of the derivative. With this aim we can estimate

\[
\frac{1}{2} \frac{d}{dt}\big(h(t)^2\big) = h(t) \cdot \frac{dh(t)}{dt} \le \big|h(t)\big| \cdot \Big|\frac{dh(t)}{dt}\Big| \le L\, h(t)^2,
\]

and for $t \ge 0$,

\[
\frac{d}{dt}\big(h(t)^2 e^{-2Lt}\big) = \Big[\frac{d}{dt} h(t)^2 - 2L\, h(t)^2\Big] e^{-2Lt} \le 0,
\]

hence $h(t)^2 e^{-2Lt} \le h(0)^2$, which is equivalent to the claim for $t \ge 0$. As for the other half-interval $t \le 0$, let $s = -t \ge 0$ and $h(s) = F_{-s}(x) - F_{-s}(y)$; we notice that

\[
\frac{dh(s)}{ds} = -u\big(-s, F_{-s}(x)\big) + u\big(-s, F_{-s}(y)\big),
\]

and repeat the argument, integrating in the variable $s$.
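The estimate of proposition 1.5 can be illustrated with a short numerical sketch. We assume, for illustration only, the hypothetical field $u(x) = (\sin x_2, \cos x_1)$, which is Lipschitz with constant $L = 1$, and we approximate the flow by a fourth-order Runge-Kutta scheme. The sketch compares the distance between two trajectories with the bound $|x - y| e^{L|t|}$ for both signs of $t$.

```python
import math

L = 1.0  # Lipschitz constant of the hypothetical field below


def u(x):
    """Hypothetical autonomous field with Lipschitz constant 1, for illustration only."""
    return [math.sin(x[1]), math.cos(x[0])]


def flow(x0, t, steps=2000):
    """Approximate F_t(x0) with the classical fourth-order Runge-Kutta scheme."""
    h = t / steps
    x = list(x0)
    for _ in range(steps):
        k1 = u(x)
        k2 = u([xi + 0.5 * h * k for xi, k in zip(x, k1)])
        k3 = u([xi + 0.5 * h * k for xi, k in zip(x, k2)])
        k4 = u([xi + h * k for xi, k in zip(x, k3)])
        x = [xi + h * (a + 2 * b + 2 * c + d) / 6
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x


def norm(v):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(c * c for c in v))


x, y = [0.1, 0.2], [0.5, -0.3]
d0 = norm([a - b for a, b in zip(x, y)])

checks = []
for t in (-2.0, -1.0, 0.0, 0.5, 1.0, 2.0):
    dist = norm([a - b for a, b in zip(flow(x, t), flow(y, t))])
    checks.append((t, dist, d0 * math.exp(L * abs(t))))

# Small tolerance to absorb round-off at t = 0, where equality holds.
estimate_holds = all(dist <= bound + 1e-12 for _, dist, bound in checks)
print(estimate_holds)
```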

We now know that $F_t$ is a continuous transformation of $\Omega$ into itself for all $t \in I_\varepsilon$. We shall also need to understand when $F_t$ is differentiable and, in those cases, have a convenient way to compute its Jacobian matrix and determinant, namely,

\[
DF_t(x_0) = {}^t\big(\nabla_{x_0} F_t(x_0)\big), \qquad J_t(x_0) = \det\big(DF_t(x_0)\big),
\]

where ${}^tA$ denotes the transpose of a tensor $A$. In this note, the gradient $\nabla v(t, x)$ of a generic vector field $v(t, x)$ is defined according to standard dyadic vector calculus, which differs from the definition adopted by Chorin and Marsden [11]. The Jacobian matrix is then denoted by $Dv$ and it is the transpose of the gradient, namely,

\[
\big[\nabla v(t, x)\big]_{ij} = \frac{\partial v_j}{\partial x_i} = \big[Dv(t, x)\big]_{ji}.
\]

We derive an evolution equation for the Jacobian matrix of the flow.

Proposition 1.6. If the velocity field $u$ is of class $C^1$ and $F_t : \Omega \to \Omega$ is defined on $\Omega$ for all $t \in I_\varepsilon$, then $F_t \in C^1(\Omega)$, the map $t \mapsto DF_t(x)$ is in $C^2$ and satisfies the Cauchy problem

\[
\begin{cases}
\dfrac{d}{dt} DF_t(x_0) = \big[Du\big(t, F_t(x_0)\big)\big]\, DF_t(x_0), \\[1ex]
DF_0(x_0) = I,
\end{cases} \tag{1.6}
\]

with $I$ being the identity matrix. By induction, if $u \in C^k$ for $k \ge 1$ then $F_t \in C^k(\Omega)$ and $t \mapsto F_t(x)$ is $C^{k+1}$.

Proof. Cf. lemma 4.1.9 of Marsden et al. [32].

If we can say that the function $F(t, x) = F_t(x)$ is of class $C^2$, equation (1.6) is a direct consequence of the chain rule. In fact, the time derivative and the gradient of the flow commute and

\[
\partial_t \nabla_{x_0} F_t(x_0) = \nabla_{x_0}\big[\partial_t F_t(x_0)\big] = \nabla_{x_0}\big[u\big(t, F_t(x_0)\big)\big] = \nabla_{x_0} F_t(x_0) \cdot \big[\nabla u\big(t, F_t(x_0)\big)\big] = \nabla F_t(x_0) \cdot \big[\nabla u\big(t, F_t(x_0)\big)\big].
\]

By transposing this identity and considering the initial condition $\nabla F_0 = I$, which follows from $F_0(x_0) = x_0$ in equation (1.5), we obtain equation (1.6). Without assuming further regularity, the proof requires some more work. The full argument can be found in Marsden et al. [32], with the only difference that here we are assuming that $F_t$ is defined uniformly on the whole domain $\Omega$ and not just locally.

The evolution equation for the determinant follows from proposition 1.6.


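Proposition 1.7. Under the same hypotheses of proposition 1.6, we have

\[
\begin{cases}
\dfrac{d}{dt} J_t(x_0) = \big[\nabla \cdot u\big(t, X(t, x_0)\big)\big]\, J_t(x_0), \\[1ex]
J_0(x_0) = 1,
\end{cases} \tag{1.7}
\]

where $X(t, x_0) = F_t(x_0)$ is the position along the Lagrangian trajectory.

This result is a special case of Liouville’s formula, which can be proven from the properties of the determinant.

Lemma 1.8 (Liouville’s formula). Let $A, \psi$ be functions from an interval $I \subseteq \mathbb{R}$ with values in the space $\mathbb{R}^{n \times n}$ of $n \times n$ matrices, such that $\psi \in C^1$ and $d\psi/dt = A(t)\psi(t)$. Then

\[
\frac{d \det \psi(t)}{dt} = \operatorname{tr} A(t)\, \det \psi(t),
\]

where $\operatorname{tr}(A)$ is the trace of the matrix $A$.

Proof. See, for instance, proposition 1.2.4 in Hörmander’s lectures [33].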

Here, however, we give a proof which relies on the following basic identity from vector calculus. This establishes a relationship between the volume of a parallelepiped spanned by three vectors, expressed by the scalar triple product of the vectors, and the determinant of the matrix defined by the vectors.

Lemma 1.9. Let $A$ be a 3-by-3 matrix, written as a column of row vectors, i.e., $A = {}^t(A_1, A_2, A_3)$ with $A_i = (a_{ij})_j$. Then

\[
\det(A) = A_1 \cdot (A_2 \times A_3) = A_2 \cdot (A_3 \times A_1) = A_3 \cdot (A_1 \times A_2).
\]

Proof. By definition, the determinant is

\[
\det(A) = \sum_{\sigma \in S_3} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} a_{3\sigma(3)},
\]

where the sum runs over the set $S_3$ of permutations of three elements $(1, 2, 3)$. In terms of the completely anti-symmetric (Levi-Civita) symbol

\[
\varepsilon_{ijk} =
\begin{cases}
1 & (i, j, k) \text{ is an even permutation of } (1, 2, 3), \\
0 & (i, j, k) \text{ is not a permutation of } (1, 2, 3), \\
-1 & (i, j, k) \text{ is an odd permutation of } (1, 2, 3),
\end{cases}
\]

we write

\[
\det(A) = \sum_{\sigma \in S_3} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} a_{3\sigma(3)} = \sum_{ijk} \varepsilon_{ijk}\, a_{1i} a_{2j} a_{3k},
\]

and the right-hand side is just the scalar triple product $A_1 \cdot (A_2 \times A_3)$. The other two identities follow on noting that the triple product is invariant under cyclic permutations.
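The identity of lemma 1.9 is easy to verify on a concrete matrix. The sketch below evaluates the determinant through the permutation sum used in the proof and compares it with the three cyclic scalar triple products; the sample matrix and the helper names are arbitrary choices made for illustration.

```python
from itertools import permutations


def sign(p):
    """Sign of a permutation (tuple of indices), computed by counting inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def det_leibniz(m):
    """Determinant of a 3x3 matrix as a sum over the permutations in S_3."""
    return sum(sign(p) * m[0][p[0]] * m[1][p[1]] * m[2][p[2]]
               for p in permutations(range(3)))


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))


A = [[2.0, -1.0, 0.5],
     [1.0, 3.0, -2.0],
     [0.0, 4.0, 1.0]]
A1, A2, A3 = A  # rows of the matrix

determinant = det_leibniz(A)
triple_products = [dot(A1, cross(A2, A3)),
                   dot(A2, cross(A3, A1)),
                   dot(A3, cross(A1, A2))]
print(determinant, triple_products)
```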

It is worth noting that the identity of lemma 1.9 holds in any dimension, but then the cross product has to be replaced by an appropriate anti-symmetric multi-linear operation.

The proof of proposition 1.7 now follows by direct calculation.


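Proof of proposition 1.7. By proposition 1.6, $F_t$ is $C^k$ in space and $C^{k+1}$ in time. By lemma 1.9,

\[
\det\big(\nabla_{x_0} F_t\big) = \nabla_{x_0} X_1 \cdot \big[\nabla_{x_0} X_2 \times \nabla_{x_0} X_3\big],
\]

where $F_t(x_0) = \big(X_1(t, x_0), X_2(t, x_0), X_3(t, x_0)\big)$ and $X_i(t, x_0)$ are the Cartesian coordinates of the position vector of the Lagrangian trajectory. We note that $\nabla_{x_0} X_i$ are the rows of the Jacobian matrix $DF_t$, hence, from equation (1.6),

\[
\partial_t \nabla_{x_0} X_i = \nabla_{x_0} X \cdot \nabla u_i,
\]

where $u = (u_1, u_2, u_3)$ is the fluid velocity in Cartesian components, and $\nabla u_i$ is evaluated at $\big(t, F_t(x_0)\big)$. Then, we compute

\[
\begin{aligned}
\partial_t \det\big(\nabla_{x_0} F_t\big)
&= (\partial_t \nabla_{x_0} X_1) \cdot \big[\nabla_{x_0} X_2 \times \nabla_{x_0} X_3\big]
 + \nabla_{x_0} X_1 \cdot \big[(\partial_t \nabla_{x_0} X_2) \times \nabla_{x_0} X_3\big]
 + \nabla_{x_0} X_1 \cdot \big[\nabla_{x_0} X_2 \times (\partial_t \nabla_{x_0} X_3)\big] \\
&= (\nabla_{x_0} X \cdot \nabla u_1) \cdot \big[\nabla_{x_0} X_2 \times \nabla_{x_0} X_3\big]
 + \nabla_{x_0} X_1 \cdot \big[(\nabla_{x_0} X \cdot \nabla u_2) \times \nabla_{x_0} X_3\big]
 + \nabla_{x_0} X_1 \cdot \big[\nabla_{x_0} X_2 \times (\nabla_{x_0} X \cdot \nabla u_3)\big] \\
&= \sum_{i=1}^{3} \Big[\nabla_{x_0} X_i \cdot \big[\nabla_{x_0} X_2 \times \nabla_{x_0} X_3\big]\Big] \frac{\partial u_1}{\partial x_i}
 + \sum_{i=1}^{3} \Big[\nabla_{x_0} X_1 \cdot \big[\nabla_{x_0} X_i \times \nabla_{x_0} X_3\big]\Big] \frac{\partial u_2}{\partial x_i}
 + \sum_{i=1}^{3} \Big[\nabla_{x_0} X_1 \cdot \big[\nabla_{x_0} X_2 \times \nabla_{x_0} X_i\big]\Big] \frac{\partial u_3}{\partial x_i}.
\end{aligned}
\]

Since the scalar triple product $A \cdot [B \times C]$ vanishes if any two of its factors are equal, in the first sum the only contribution comes from $i = 1$, in the second sum from $i = 2$, and in the third sum from $i = 3$. We therefore have

\[
\partial_t \det\big(\nabla_{x_0} F_t\big) = \det\big(\nabla_{x_0} F_t\big) \Big[\frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \frac{\partial u_3}{\partial x_3}\Big].
\]

In addition we have $\det\big(\nabla_{x_0} F_0(x_0)\big) = 1$ since $\nabla_{x_0} F_0(x_0) = I$, so that we can write the initial value problem for the Jacobian determinant.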
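Equation (1.7) integrates to $J_t(x_0) = \exp\big(\int_0^t \nabla \cdot u\,(s, F_s(x_0))\, ds\big)$, which can be compared with a direct finite-difference evaluation of the Jacobian determinant. The sketch below does this in two dimensions for a hypothetical field with non-constant divergence; the field, the step sizes, and all function names are assumptions made for illustration.

```python
import math


def u(x):
    """Hypothetical 2D velocity field with non-constant divergence (illustration only)."""
    return [-x[0] + 0.5 * math.sin(x[1]), 0.3 * x[0] * x[1]]


def div_u(x):
    """Analytic divergence of the field above."""
    return -1.0 + 0.3 * x[0]


def augmented(y):
    """Right-hand side for y = (x1, x2, q), where dq/dt = div u along the trajectory."""
    x = y[:2]
    return u(x) + [div_u(x)]


def rk4(f, y0, t, steps=2000):
    """Integrate dy/dt = f(y) from 0 to t with the classical Runge-Kutta scheme."""
    h = t / steps
    y = list(y0)
    for _ in range(steps):
        k1 = f(y)
        k2 = f([yi + 0.5 * h * k for yi, k in zip(y, k1)])
        k3 = f([yi + 0.5 * h * k for yi, k in zip(y, k2)])
        k4 = f([yi + h * k for yi, k in zip(y, k3)])
        y = [yi + h * (a + 2 * b + 2 * c + d) / 6
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
    return y


def jacobian_det(x0, t, d=1e-5):
    """Central finite-difference approximation of J_t(x0) = det DF_t(x0)."""
    cols = []  # cols[j][i] approximates dF_i / dx0_j
    for j in range(2):
        xp, xm = list(x0), list(x0)
        xp[j] += d
        xm[j] -= d
        fp, fm = rk4(u, xp, t), rk4(u, xm, t)
        cols.append([(a - b) / (2 * d) for a, b in zip(fp, fm)])
    return cols[0][0] * cols[1][1] - cols[1][0] * cols[0][1]


x0, t = [0.5, 0.4], 1.0
q = rk4(augmented, x0 + [0.0], t)[2]   # integral of div u along the trajectory
J_liouville = math.exp(q)              # solution of equation (1.7)
J_finite_diff = jacobian_det(x0, t)
print(J_liouville, J_finite_diff)
```

The two values are expected to agree up to the finite-difference and integration errors.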

This concludes our overview of basic properties of the flow $F_t$. Let us now address a few examples. First we consider the stationary velocity field

\[
u(t, x) = \nu \begin{pmatrix} x_1 \\ -x_2 \\ 0 \end{pmatrix}, \tag{1.8}
\]

in coordinates $x = (x_1, x_2, x_3)$, where $\nu > 0$ is a constant with the dimensions of a frequency. This field is essentially two-dimensional because it is uniform in the third coordinate $x_3$ and the corresponding component $u_3$ is zero. The corresponding flow can be computed by solving the ordinary differential equation (1.5), which in this case amounts to (only the two non-trivial components)

\[
\begin{aligned}
\partial_t X_1 &= \nu X_1, & X_1(0, x_0) &= x_{0,1}, \\
\partial_t X_2 &= -\nu X_2, & X_2(0, x_0) &= x_{0,2},
\end{aligned}
\]

where the initial point is $x_0 = (x_{0,1}, x_{0,2}, x_{0,3})$. Since the system is uniform in $x_3$, the planes $x_3 = \text{constant}$ are invariant, that is, $X_3 = x_{0,3}$, and the solution for the flow is readily found in the form

\[
F_t(x_0) = \begin{pmatrix} e^{\nu t} x_{0,1} \\ e^{-\nu t} x_{0,2} \\ x_{0,3} \end{pmatrix}.
\]

Figure 1.1: Field lines of the velocity field (1.8) and evolution with the flow $F_t$ of a sample of points arranged in the shape of a square. Blue points are the positions $x_0$ at the time $t = 0$, while red points are their evolution $x = F_t(x_0)$ at time $t = 1$, with $\nu = 0.5$. One can observe that the square shape is deformed but not rotated by the flow.

We see that $x_1(t) x_2(t) = x_{0,1} x_{0,2}$, i.e., the trajectories of the flow (Lagrangian trajectories) are hyperbolas on the plane $(x_1, x_2)$. Figure 1.1 shows the field lines of the velocity field and the evolution of a sample of points (arranged in the shape of a square) according to the flow map $F_t$. The Jacobian matrix of the vector field (1.8) is

\[
Du = {}^t(\nabla u) = \begin{pmatrix} \nu & 0 & 0 \\ 0 & -\nu & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{1.9}
\]

so that the Jacobian matrix of the flow satisfies

\[
\begin{aligned}
\frac{d}{dt} (DF_t)_{1j} &= \nu (DF_t)_{1j}, \\
\frac{d}{dt} (DF_t)_{2j} &= -\nu (DF_t)_{2j}, \qquad j = 1, 2, 3, \\
\frac{d}{dt} (DF_t)_{3j} &= 0,
\end{aligned}
\]

and, upon accounting for the initial condition $DF_0 = I$,

\[
DF_t = \begin{pmatrix} e^{\nu t} & 0 & 0 \\ 0 & e^{-\nu t} & 0 \\ 0 & 0 & 1 \end{pmatrix},
\]

as can be seen directly from the flow. Finally, the vector field is divergence free, i.e., $\nabla \cdot u(t, x) = 0$, so that $J_t = J_0 = 1$, cf. equation (1.7); this can also be deduced by inspection of the matrix $DF_t$.

Figure 1.2: The same as in figure 1.1, but for the velocity field (1.10). In this case, $\nu = 1.2\pi$ and the final time is $t = 1$. The initial control volume is rotated, but not stretched.
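For the velocity field (1.8), the closed-form flow can be cross-checked against a numerical integration of equation (1.5). The following sketch uses $\nu = 0.5$ and $t = 1$ as in figure 1.1; the initial point, the Runge-Kutta integrator, and the function names are assumptions introduced for illustration. It also checks the invariant $x_1 x_2$ and the unit Jacobian determinant.

```python
import math

NU = 0.5  # same value as in figure 1.1
T = 1.0


def u(x):
    """Velocity field (1.8)."""
    return [NU * x[0], -NU * x[1], 0.0]


def flow_exact(x0, t):
    """Closed-form flow of the field (1.8)."""
    return [math.exp(NU * t) * x0[0], math.exp(-NU * t) * x0[1], x0[2]]


def flow_rk4(x0, t, steps=1000):
    """Numerical flow by the classical fourth-order Runge-Kutta scheme."""
    h = t / steps
    x = list(x0)
    for _ in range(steps):
        k1 = u(x)
        k2 = u([xi + 0.5 * h * k for xi, k in zip(x, k1)])
        k3 = u([xi + 0.5 * h * k for xi, k in zip(x, k2)])
        k4 = u([xi + h * k for xi, k in zip(x, k3)])
        x = [xi + h * (a + 2 * b + 2 * c + d) / 6
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x


x0 = [1.0, 2.0, 0.3]  # arbitrary initial point
xt = flow_exact(x0, T)

rk4_error = max(abs(a - b) for a, b in zip(flow_rk4(x0, T), xt))
invariant_drift = abs(xt[0] * xt[1] - x0[0] * x0[1])  # trajectories are hyperbolas
jacobian_det = math.exp(NU * T) * math.exp(-NU * T) * 1.0  # det of the diagonal DF_t
print(rk4_error, invariant_drift, jacobian_det)
```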

Another example, with quite different properties, is given by the velocity field

\[
u(t, x) = \nu \begin{pmatrix} x_2 \\ -x_1 \\ 0 \end{pmatrix}. \tag{1.10}
\]

The ordinary differential equation for the flow in this case reads

\[
\begin{aligned}
\partial_t X_1 &= \nu X_2, & X_1(0, x_0) &= x_{0,1}, \\
\partial_t X_2 &= -\nu X_1, & X_2(0, x_0) &= x_{0,2},
\end{aligned}
\]

and the $x_3 = \text{constant}$ planes are again invariant. The solution is now oscillatory, with frequency given by the constant $\nu > 0$,

\[
F_t(x_0) = \begin{pmatrix} x_{0,1} \cos(\nu t) + x_{0,2} \sin(\nu t) \\ -x_{0,1} \sin(\nu t) + x_{0,2} \cos(\nu t) \\ x_{0,3} \end{pmatrix}.
\]

Thus Lagrangian trajectories are circles, cf. figure 1.2. The Jacobian matrix of the velocity field is

\[
Du = {}^t(\nabla u) = \begin{pmatrix} 0 & \nu & 0 \\ -\nu & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{1.11}
\]

and it is anti-symmetric. Correspondingly, the equation for the Jacobian matrix of the flow amounts to

\[
\begin{aligned}
\frac{d}{dt} (DF_t)_{1j} &= \nu (DF_t)_{2j}, \\
\frac{d}{dt} (DF_t)_{2j} &= -\nu (DF_t)_{1j}, \qquad j = 1, 2, 3, \\
\frac{d}{dt} (DF_t)_{3j} &= 0,
\end{aligned}
\]

and the solution is the matrix for a clockwise rotation by an angle $\nu t$, namely,

\[
DF_t = \begin{pmatrix} \cos(\nu t) & \sin(\nu t) & 0 \\ -\sin(\nu t) & \cos(\nu t) & 0 \\ 0 & 0 & 1 \end{pmatrix},
\]

which could have been computed directly from the flow. As for the previous example, the field is divergence free, $\nabla \cdot u = 0$, therefore the corresponding flow has unit Jacobian $J_t = J_0 = 1$.
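The rotation matrix obtained for the field (1.10) can be recovered numerically by differentiating the flow map with respect to the initial point. The sketch below restricts to the $(x_1, x_2)$ plane and uses hypothetical parameter values $\nu = 1$ and $t = 0.8$; the finite-difference step and the function names are likewise illustrative assumptions.

```python
import math

NU, T = 1.0, 0.8  # hypothetical values chosen for illustration


def u(x):
    """In-plane components of the velocity field (1.10)."""
    return [NU * x[1], -NU * x[0]]


def flow(x0, t, steps=1000):
    """Numerical flow by the classical fourth-order Runge-Kutta scheme."""
    h = t / steps
    x = list(x0)
    for _ in range(steps):
        k1 = u(x)
        k2 = u([xi + 0.5 * h * k for xi, k in zip(x, k1)])
        k3 = u([xi + 0.5 * h * k for xi, k in zip(x, k2)])
        k4 = u([xi + h * k for xi, k in zip(x, k3)])
        x = [xi + h * (a + 2 * b + 2 * c + d) / 6
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x


def jacobian(x0, t, d=1e-4):
    """Central finite-difference Jacobian with entries [i][j] = dF_i / dx0_j."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        xp, xm = list(x0), list(x0)
        xp[j] += d
        xm[j] -= d
        fp, fm = flow(xp, t), flow(xm, t)
        for i in range(2):
            J[i][j] = (fp[i] - fm[i]) / (2 * d)
    return J


x0 = [0.7, -0.4]
DF = jacobian(x0, T)
# Clockwise rotation by the angle NU * T, as in the text.
R = [[math.cos(NU * T), math.sin(NU * T)],
     [-math.sin(NU * T), math.cos(NU * T)]]

max_error = max(abs(DF[i][j] - R[i][j]) for i in range(2) for j in range(2))
det_DF = DF[0][0] * DF[1][1] - DF[0][1] * DF[1][0]
radius_drift = abs(math.hypot(*flow(x0, T)) - math.hypot(*x0))  # circles are preserved
print(max_error, det_DF, radius_drift)
```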

Finally, we consider an example of a flow with non-zero divergence, namely,

\[
u(t, x) = -\nu \begin{pmatrix} x_1 \\ x_2 \\ 0 \end{pmatrix}, \tag{1.12}
\]

where again $\nu > 0$ is a constant. The ordinary differential equations for the flow are

\[
\begin{aligned}
\partial_t X_1 &= -\nu X_1, & X_1(0, x_0) &= x_{0,1}, \\
\partial_t X_2 &= -\nu X_2, & X_2(0, x_0) &= x_{0,2},
\end{aligned}
\]

and $\partial_t X_3 = 0$, so that the flow is

\[
F_t(x_0) = \begin{pmatrix} x_{0,1} e^{-\nu t} \\ x_{0,2} e^{-\nu t} \\ x_{0,3} \end{pmatrix}.
\]

The Jacobian matrix of the velocity field reads

\[
Du = {}^t(\nabla u) = \begin{pmatrix} -\nu & 0 & 0 \\ 0 & -\nu & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{1.13}
\]

and one can check that

\[
DF_t = \begin{pmatrix} e^{-\nu t} & 0 & 0 \\ 0 & e^{-\nu t} & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

is indeed the solution of equation (1.6). Unlike the other two examples, the Jacobian determinant of the flow is not constant, since

\[
J_t(x) = e^{-2\nu t},
\]

consistently with equation (1.7) and with the fact that $\nabla \cdot u = -2\nu$. The effect of the non-zero divergence can be appreciated in figure 1.3.

Figure 1.3: The same as in figure 1.1, but for the velocity field (1.12). The parameter is $\nu = 0.5$ and the final time is $t = 2$. The field lines are radial and Lagrangian trajectories move radially toward the origin, slowing down exponentially. A control volume is then compressed equally along both axes so that its shape is preserved and no rotation occurs.

It is worth noting that, in all the considered examples, the Lagrangian trajectories coincide with the field lines of the velocity field, since $u$ is independent of time (stationary flow).
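For the compressive field (1.12), the Jacobian determinant $J_t = e^{-2\nu t}$ is the factor by which areas in the $(x_1, x_2)$ plane shrink. The sketch below transports the corners of a square with a numerically integrated flow and compares the area ratio with $e^{-2\nu t}$, using $\nu = 0.5$ and $t = 2$ as in figure 1.3. Tracking only the corners suffices here because this particular flow is linear; the square and the helper names are assumptions made for illustration.

```python
import math

NU, T = 0.5, 2.0  # values used in figure 1.3


def u(x):
    """In-plane components of the velocity field (1.12)."""
    return [-NU * x[0], -NU * x[1]]


def flow(x0, t, steps=1000):
    """Numerical flow by the classical fourth-order Runge-Kutta scheme."""
    h = t / steps
    x = list(x0)
    for _ in range(steps):
        k1 = u(x)
        k2 = u([xi + 0.5 * h * k for xi, k in zip(x, k1)])
        k3 = u([xi + 0.5 * h * k for xi, k in zip(x, k2)])
        k4 = u([xi + h * k for xi, k in zip(x, k3)])
        x = [xi + h * (a + 2 * b + 2 * c + d) / 6
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x


def polygon_area(pts):
    """Shoelace formula for the area of a simple polygon."""
    n = len(pts)
    twice_area = sum(pts[i][0] * pts[(i + 1) % n][1]
                     - pts[(i + 1) % n][0] * pts[i][1] for i in range(n))
    return 0.5 * abs(twice_area)


square = [[1.0, 1.0], [2.0, 1.0], [2.0, 2.0], [1.0, 2.0]]  # control volume W
image = [flow(p, T) for p in square]                        # corners of F_t(W)

area_ratio = polygon_area(image) / polygon_area(square)
expected = math.exp(-2 * NU * T)  # J_t for the field (1.12)
print(area_ratio, expected)
```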

The first two cases, namely fields (1.8) and (1.10), are examples of two-dimensional divergence-free flows. Such vector fields can be expressed in terms of a scalar function ψ = ψ(x1, x2), to be referred to as streaming function, by

\[
u_1 = \partial\psi/\partial x_2, \qquad u_2 = -\partial\psi/\partial x_1.
\]

In the case of equation (1.8), one has $\psi(x_1, x_2) = \nu x_1 x_2$, while in the case of equation (1.10) the streaming function is $\psi(x_1, x_2) = \nu(x_1^2 + x_2^2)/2$. One should notice the analogy with Hamilton’s equations [32] on a two-dimensional phase space, ψ playing the role of the Hamiltonian function. Equivalently,

\[
u = \nabla\psi \times e_3, \tag{1.14}
\]

where e3 is the unit vector in the x3-direction. A vector field written in the form (1.14) is automatically divergence-free, since

\[
\nabla\cdot u = \nabla\cdot\big[\nabla\psi \times e_3\big] = e_3\cdot(\nabla\times\nabla\psi) = 0,
\]

where we have used the identity ∇ · (v1 × v2) = v2 · (∇ × v1) − v1 · (∇ × v2), which holds for every pair of smooth vector fields v1 and v2. On the other hand, a vector field of the form (1.14) can have a non-trivial curl,

\[
\nabla\times u = \nabla\times(\nabla\psi \times e_3) = -e_3\,\Delta\psi.
\]
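The two properties of fields of the form (1.14), zero divergence and curl equal to −e3∆ψ, can be checked numerically. The sketch below is only an illustration under stated assumptions: the stream function `psi`, the evaluation point, and the finite-difference step `H` are hypothetical choices and do not come from the notes.

```python
import math

H = 1e-3  # finite-difference step (an arbitrary choice for this sketch)


def psi(x1, x2):
    """A hypothetical smooth stream function, chosen only for illustration."""
    return math.sin(x1) * math.cos(2.0 * x2)


def velocity(x1, x2):
    """u = grad(psi) x e3, i.e. u1 = d(psi)/dx2 and u2 = -d(psi)/dx1."""
    u1 = (psi(x1, x2 + H) - psi(x1, x2 - H)) / (2 * H)
    u2 = -(psi(x1 + H, x2) - psi(x1 - H, x2)) / (2 * H)
    return u1, u2


def divergence(x1, x2):
    """Central-difference approximation of d(u1)/dx1 + d(u2)/dx2."""
    du1 = (velocity(x1 + H, x2)[0] - velocity(x1 - H, x2)[0]) / (2 * H)
    du2 = (velocity(x1, x2 + H)[1] - velocity(x1, x2 - H)[1]) / (2 * H)
    return du1 + du2


def curl3(x1, x2):
    """Third component of curl(u): d(u2)/dx1 - d(u1)/dx2."""
    du2 = (velocity(x1 + H, x2)[1] - velocity(x1 - H, x2)[1]) / (2 * H)
    du1 = (velocity(x1, x2 + H)[0] - velocity(x1, x2 - H)[0]) / (2 * H)
    return du2 - du1


def minus_laplacian_psi(x1, x2):
    """Exact -Laplacian(psi) for the psi above: 5 sin(x1) cos(2 x2)."""
    return 5.0 * math.sin(x1) * math.cos(2.0 * x2)


if __name__ == "__main__":
    print(divergence(0.3, 0.7))                       # should be close to zero
    print(curl3(0.3, 0.7), minus_laplacian_psi(0.3, 0.7))  # should nearly agree
```

The discrete divergence vanishes up to round-off because the mixed central differences commute, which mirrors the continuous identity ∇ · (∇ψ × e3) = 0.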

The advection of a scalar field χ ∈ C1 by the flow of (1.14) leads to the bi-linear anti-symmetric operator

\[
[\chi, \psi] = u\cdot\nabla\chi = e_3\cdot(\nabla\chi\times\nabla\psi), \tag{1.15}
\]

which is the canonical Poisson bracket on R2. Explicitly we have

\[
[\chi, \psi] = \partial_{x_1}\chi\,\partial_{x_2}\psi - \partial_{x_2}\chi\,\partial_{x_1}\psi,
\]

which shows the anti-symmetry of the brackets, [χ, ψ] = −[ψ, χ]. As a consequence, the contours of the potential ψ are invariant for the flow, i.e.,

\[
u\cdot\nabla\psi = [\psi, \psi] = 0.
\]

Geometrically, one can observe that the field u is just the gradient of ψ rotated by π/2 clockwise; this, in particular, implies that u is everywhere tangent to the contours of ψ, and thus that the contours of ψ coincide with the field lines, which in turn coincide with the Lagrangian trajectories, in the same way as, for two-dimensional Hamiltonian systems, the contours of the Hamiltonian function coincide with the trajectories.

The last case, equation (1.12), is an example of potential flow, namely, there exists a scalar function φ = φ(x1, x2) such that

\[
u = -\nabla\varphi. \tag{1.16}
\]

For (1.12), the potential function is $\varphi(x_1, x_2) = \nu(x_1^2 + x_2^2)/2$, which is the same as the streaming function for the flow in equation (1.10); the difference is that here the gradient is not rotated. A velocity field of the form (1.16) can have a non-trivial divergence,

\[
\nabla\cdot u = -\Delta\varphi,
\]

but it is automatically irrotational, namely,

\[
\nabla\times u = -\nabla\times\nabla\varphi = 0.
\]

For a potential flow, the velocity field is orthogonal to the surfaces φ = constant and the Lagrangian trajectories are attracted toward minima of the potential. For the specific case of example (1.12), cf. figure 1.3, there is a unique minimum in the origin.

Although the definition of the flow Ft might appear a mere mathematical abstraction, its importance in fluid dynamics cannot be stressed enough. In modern (geometrical) approaches to fluid dynamics, Ft is the main variable specifying the state of a fluid [8, 35].

1.3 Deformation tensor and vorticity. At time t = 0, let us consider two points x0, x′0 ∈ Ω and follow them as they evolve with the fluid motion. From the result of the last section, we can write the trajectories of those two points as

\[
x(t) = F_t(x_0), \qquad x'(t) = F_t(x'_0).
\]

We are interested in studying the evolution of the difference vector, cf. figure 1.4,

\[
h(t) = x'(t) - x(t).
\]

Figure 1.4: Evolution of two nearby points x0, x′0 with the fluid motion (Lagrangian trajectories), and definition of the vector h(t).

We shall assume that the initial points x0, x′0 are very close to each other so that |h(0)| is small, and consider a time interval so short that |h(t)| can still be considered small; we make no attempt to be mathematically more precise here.

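In view of equation (1.5) and Taylor expansion we have

\[
\begin{aligned}
\frac{d}{dt}h(t) &= \frac{d}{dt}F_t(x'_0) - \frac{d}{dt}F_t(x_0) \\
&= u\big(t, F_t(x'_0)\big) - u\big(t, F_t(x_0)\big) \\
&= u\big(t, x'(t)\big) - u\big(t, x(t)\big) \\
&= u\big(t, h(t) + x(t)\big) - u\big(t, x(t)\big) \\
&= h(t)\cdot\nabla u\big(t, x(t)\big) + O(|h(t)|^2),
\end{aligned}
\]

or equivalently,

\[
\frac{d}{dt}h(t) = Du\big(t, x(t)\big)\,h(t) + O(|h(t)|^2),
\]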

where $Du = {}^t(\nabla u)$ is the Jacobian matrix of the velocity field. We see that, at least for a short time, the evolution of h(t) is determined by Du. Since it is a square matrix, Du can be split into its symmetric and anti-symmetric parts,

\[
Du = \frac{1}{2}\big[Du + {}^t(Du)\big] + \frac{1}{2}\big[Du - {}^t(Du)\big].
\]

The symmetric part is referred to as deformation tensor,

\[
D = \frac{1}{2}\big[Du + {}^t(Du)\big] = \frac{1}{2}\big[\nabla u + {}^t(\nabla u)\big], \tag{1.17}
\]

while the anti-symmetric part,

\[
S = \frac{1}{2}\big[Du - {}^t(Du)\big] = -\frac{1}{2}\big[\nabla u - {}^t(\nabla u)\big],
\]

is such that, for every vector v ∈ R3,

\[
Sv = \frac{1}{2}\big[v\cdot\nabla u - \nabla u\cdot v\big] = \frac{1}{2}(\nabla\times u)\times v = \frac{1}{2}\,\omega\times v,
\]

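and the vector field

\[
\omega = \nabla\times u, \tag{1.18}
\]

is referred to as vorticity of the fluid. In matrix form we have

\[
S = \frac{1}{2}
\begin{pmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{pmatrix}.
\]

At last the evolution of the vector h(t) is determined by

\[
\frac{d}{dt}h(t) = D\big(t, x(t)\big)\,h(t) + \frac{1}{2}\,\omega\big(t, x(t)\big)\times h(t) + O(|h(t)|^2).
\]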

We can now study separately the effects of the symmetric and anti-symmetric terms. Of course the full dynamics is the result of the combination of the two. Let us start with the deformation tensor D(t, x), for which we have to consider the linear symmetric equation

\[
\frac{d}{dt}h(t) = D\big(t, x(t)\big)\,h(t).
\]

For sake of simplicity (we are only interested in qualitative ideas) let us assume that the deformation tensor does not vary too much along the Lagrangian trajectory, i.e., we set it to a constant, D(t, x) = D(0, x0) = D0. Since D0 is by definition symmetric, we can find a set of three orthonormal eigenvectors ei ∈ R3, i.e.,

\[
D_0 e_i = \lambda_i e_i,
\]

and the eigenvalues λi are real. The set of eigenvectors ei constitutes a basis for vectors in R3, hence we can write

\[
h(t) = \sum_{i=1}^{3} c_i(t)\,e_i,
\]

and the coefficients ci(t) of the expansion satisfy the scalar ordinary differential equation

\[
\frac{dc_i}{dt} = \lambda_i c_i, \qquad c_i(t) = c_i(0)\,e^{\lambda_i t},
\]

where ci(0) are the coefficients of the expansion of the initial vector h(0). From the full solution,

\[
h(t) = \sum_{i=1}^{3} e^{\lambda_i t}\,c_i(0)\,e_i,
\]

we can deduce the effect of D0 on the fluid motion: the fluid element is stretched exponentially along the directions of the eigenvectors of D0, but such directions are invariant (no rotation happens). Here stretching can be either expansion (λi > 0) or compression (λi < 0).

The contribution of the vorticity, on the other hand, can be understood by considering the equation

\[
\frac{d}{dt}h(t) = \frac{1}{2}\,\omega\big(t, x(t)\big)\times h(t).
\]

Again we set ω(t, x(t)) = ω(0, x0) = ω0 and we recognize that this generates a rigid rotation of h about the direction of ω0 with angular frequency |ω0|/2. In order to see that, we can assume (without loss of generality) that the vorticity is directed along the x1-axis of a Cartesian coordinate system, i.e., ω0 = (|ω0|, 0, 0). Then, the equation of motion for h(t) = (h1(t), h2(t), h3(t)) becomes

\[
\begin{aligned}
dh_1(t)/dt &= 0, \\
dh_2(t)/dt &= -\tfrac{1}{2}|\omega_0|\,h_3(t), \\
dh_3(t)/dt &= \tfrac{1}{2}|\omega_0|\,h_2(t),
\end{aligned}
\]

from which we see that the component of h(t) parallel to ω0 does not change, while the perpendicular projection (h2(t), h3(t)) rotates with angular frequency |ω0|/2, as claimed.

Let us consider the examples of section 1.2. For the velocity field (1.8), by inspection of equation (1.9) we see that $\nabla u = {}^t(Du)$ is symmetric and thus,

\[
D = \nabla u, \qquad \omega = 0.
\]

The corresponding flow therefore should just deform the fluid element without rotating it, cf. figure 1.1. The same holds for any potential field, as defined in equation (1.16), which is irrotational by definition. For the case of example (1.12), in particular, the fluid element is compressed but not rotated as shown in figure 1.3.

On the other hand, the velocity field (1.10) is such that ∇u is anti-symmetric, cf. equation (1.11). Hence,

\[
D = 0, \qquad \omega = (0, 0, -2\nu),
\]

and we see that the vorticity is pointing along the axis of rotation of the vortex (the third axis in this case) and its magnitude is equal to twice the rotation angular frequency, cf. figure 1.2.
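The splitting of the Jacobian into deformation tensor and vorticity can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the two Jacobians below are inferred from the streaming functions quoted in section 1.2 (ψ = νx1x2 and ψ = ν(x1² + x2²)/2), and the helper names are hypothetical.

```python
def transpose(a):
    """Transpose of a 3x3 matrix stored as nested lists."""
    return [[a[j][i] for j in range(3)] for i in range(3)]


def decompose(Du):
    """Split the Jacobian Du (entry [i][j] = d u_i / d x_j) into D, S and omega."""
    Dut = transpose(Du)
    D = [[0.5 * (Du[i][j] + Dut[i][j]) for j in range(3)] for i in range(3)]
    S = [[0.5 * (Du[i][j] - Dut[i][j]) for j in range(3)] for i in range(3)]
    omega = [Du[2][1] - Du[1][2],   # d u3/d x2 - d u2/d x3
             Du[0][2] - Du[2][0],   # d u1/d x3 - d u3/d x1
             Du[1][0] - Du[0][1]]   # d u2/d x1 - d u1/d x2
    return D, S, omega


def matvec(a, v):
    """Matrix-vector product for 3x3 nested lists."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


NU = 0.5
# Jacobians inferred from the streaming functions of section 1.2 (assumption).
DU_STRAIN = [[NU, 0.0, 0.0], [0.0, -NU, 0.0], [0.0, 0.0, 0.0]]  # u = (nu x1, -nu x2, 0)
DU_VORTEX = [[0.0, NU, 0.0], [-NU, 0.0, 0.0], [0.0, 0.0, 0.0]]  # u = (nu x2, -nu x1, 0)

if __name__ == "__main__":
    for name, Du in (("strain", DU_STRAIN), ("vortex", DU_VORTEX)):
        D, S, omega = decompose(Du)
        print(name, "D =", D, "omega =", omega)
```

For any Jacobian, `matvec(S, v)` should coincide with half of `cross(omega, v)`, which is the identity Sv = ω × v/2 used in the text.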

Summarizing the results of this section, we have shown that the motion of two nearby points in the fluid amounts to the combination of two effects: exponential stretching along prescribed directions (controlled by the deformation tensor) and rigid rotation (controlled by the vorticity).

1.4 Advective derivative and Reynolds transport theorem. The concept of flow of the fluid velocity is central in the proof of the Reynolds transport theorem, by means of which the equations of motion of fluid dynamics are usually formulated.

Let us start from a generic scalar function f ∈ C1(I × Ω,R) defined for t ∈ I and x ∈ Ω; it can represent any scalar physical quantity, such as the mass density, the temperature, or any function thereof. We consider a velocity field u : I × Ω → R3 satisfying the hypotheses of proposition 1.5 and evaluate the function f along a Lagrangian trajectory,

\[
x(t) = F_t(x_0).
\]

The time derivative of f restricted to the Lagrangian trajectory then reads

\[
\frac{d}{dt}\Big[f\big(t, x(t)\big)\Big] = \partial_t f\big(t, x(t)\big) + \frac{dx(t)}{dt}\cdot\nabla f\big(t, x(t)\big).
\]

Since by definition dx(t)/dt = u(t, x(t)), cf. equation (1.5), we can write

\[
\frac{d}{dt}\Big[f\big(t, x(t)\big)\Big] = \frac{Df}{Dt}\big(t, x(t)\big),
\]

where we have defined the advective derivative as the operator

\[
\frac{Df}{Dt}(t, x) = \partial_t f(t, x) + u(t, x)\cdot\nabla f(t, x). \tag{1.19}
\]

The advective derivative (also referred to as convective derivative, or material derivative) gives the rate of variation of a function along the Lagrangian trajectory passing through the space-time point (t, x).

Incidentally, it is worth noting that, if the advective derivative of a function f is identically zero on Ω for all times t ∈ I, then the value of f along any Lagrangian trajectory is constant. A function f(t, x) that has this property satisfies the partial differential equation

\[
\partial_t f(t, x) + u(t, x)\cdot\nabla f(t, x) = 0,
\]

which is referred to as linear advection equation. Vice versa, we can use the flow of the velocity field u(t, x) to construct a solution of the initial-value problem for the advection equation. In fact, if f(t, x) is constant on any Lagrangian trajectory, we have f(t, Ft(x0)) = f0(x0), where f0 is the initial condition at the time t = 0 and x0 ∈ Ω; inverting the flow we have the solution $f(t, x) = f_0\big(F_t^{-1}(x)\big)$. This is a particular case of a much more general method to solve initial-value problems for first-order linear and non-linear partial differential equations known as the method of characteristic curves [36]. In this case the Lagrangian trajectories coincide with the characteristic curves of the linear advection equation.
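The construction of the solution by inverting the flow can be illustrated in one space dimension. The sketch below assumes a hypothetical velocity field u(x) = −νx, whose flow is known in closed form, and a hypothetical Gaussian initial condition; neither is taken from the notes, and the residual is only estimated by central differences.

```python
import math

NU = 0.5


def u(x):
    """Hypothetical one-dimensional velocity field u(x) = -nu*x."""
    return -NU * x


def flow(t, x0):
    """Exact flow of dx/dt = -nu*x: F_t(x0) = x0 * exp(-nu*t)."""
    return x0 * math.exp(-NU * t)


def inverse_flow(t, x):
    """Inverse of the flow map: F_t^{-1}(x) = x * exp(nu*t)."""
    return x * math.exp(NU * t)


def f0(x):
    """Hypothetical initial condition."""
    return math.exp(-x * x)


def f(t, x):
    """Solution by characteristics: f(t, x) = f0(F_t^{-1}(x))."""
    return f0(inverse_flow(t, x))


def advection_residual(t, x, h=1e-5):
    """Central-difference estimate of d_t f + u d_x f, which should vanish."""
    dfdt = (f(t + h, x) - f(t - h, x)) / (2 * h)
    dfdx = (f(t, x + h) - f(t, x - h)) / (2 * h)
    return dfdt + u(x) * dfdx


if __name__ == "__main__":
    print(advection_residual(0.8, 0.4))        # close to zero
    print(f(1.3, flow(1.3, 0.7)), f0(0.7))     # equal along the trajectory
```

The second printed line shows the defining property of the characteristic curves: the value of f carried along a Lagrangian trajectory stays equal to its initial value.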

Having defined the advective derivative, we can now give a proof of a central result in fluid dynamics due to Reynolds.

Let us consider an arbitrary volume Wt which moves along with the fluid. The fact that Wt moves with the fluid can be expressed mathematically in terms of the flow map Ft, namely,

\[
W_t = F_t(W_0),
\]

where W0 is the configuration of the volume at time t = 0, and the application of the map Ft to the set W0 is defined pointwise, i.e., Wt is the set obtained by applying Ft to each point of W0, cf. figure 1.5. For instance, figures 1.1, 1.2, and 1.3 show a control volume W0 (sampled by blue points) which evolves into Wt (sampled by red points) under the flow (1.8), (1.10), and (1.12), respectively.

We shall choose W0 compact in Ω and, since Ft is continuous, Wt will be compact at any time t for which the flow is defined, i.e., for t ∈ Iε defined in section 1.2. For an arbitrary function f ∈ C1(I × Ω), the restriction to Wt at any given time is bounded, hence integrable on Wt, and we consider the integral

\[
Q(t) = \int_{W_t} f(t, x)\,dx.
\]

We are interested in computing the time derivative of Q. The difficulty here is that both the integrand and the domain of integration depend on time.

Figure 1.5: The control volume W0 is dragged along by the fluid motion, thus evolving in time into Wt = Ft(W0).

Theorem 1.10 (Reynolds transport theorem, [31]). Let u ∈ C1(I × Ω) be a velocity field with flow Ft defined on the whole domain Ω for t ∈ Iε ⊂ I. Then for all f ∈ C1(I × Ω,R), and W0 ⊂ Ω compact, we have Q ∈ C1(Iε) and

\[
\frac{d}{dt}\int_{W_t} f(t, x)\,dx
= \int_{W_t}\Big[\frac{Df}{Dt} + f\,\nabla\cdot u\Big]dx
= \int_{W_t}\big[\partial_t f + \nabla\cdot(f u)\big]dx, \tag{1.20}
\]

where Wt = Ft(W0).

Proof. The idea behind this calculation is that we can use the map x = Ft(x′) as a coordinate transformation mapping W0 into Wt; changing coordinates in the integral we have

\[
\int_{W_t} f(t, x)\,dx = \int_{W_0} f\big(t, F_t(x')\big)\,J_t(t, x')\,dx',
\]

where Jt is the Jacobian determinant addressed in section 1.2 and dx′ is the volume element in the primed coordinates. The advantage is that now the domain of integration does not depend on time. In addition all the integrals are finite as both the integrand and its time derivative are the restriction to a compact domain of continuous functions. We see that we can differentiate under the integral sign and Q ∈ C1. Since the derivative of f restricted to a Lagrangian trajectory gives the advective derivative, we compute

\[
\begin{aligned}
\frac{d}{dt}\int_{W_t} f(t, x)\,dx
&= \int_{W_0}\Big[\frac{Df}{Dt}\,J_t + f\,\partial_t J_t\Big]dx' \\
&= \int_{W_0}\Big[\frac{Df}{Dt} + f\,\nabla\cdot u\Big]J_t(t, x')\,dx',
\end{aligned}
\]

where the terms in square brackets are evaluated at (t, Ft(x′)) and we have used equation (1.7) in the second identity. By transforming back to Wt we have

\[
\frac{d}{dt}\int_{W_t} f(t, x)\,dx = \int_{W_t}\Big[\frac{Df}{Dt}(t, x) + f(t, x)\,\nabla\cdot u(t, x)\Big]dx.
\]

At last, we note that

\[
\frac{Df}{Dt}(t, x) + f(t, x)\,\nabla\cdot u(t, x) = \partial_t f(t, x) + \nabla\cdot\big(f(t, x)\,u(t, x)\big).
\]

This concludes the proof of the Reynolds transport theorem.

In this argument we have considered a scalar function f for sake of simplicity; however, the advective derivative and the transport theorem can be applied component-wise to multi-component fields, such as vectors or tensors.

A special case of the Reynolds transport theorem (1.20) is obtained for f ≡ 1, i.e., the function identically equal to one. Then, the integral of f amounts to the volume of Wt,

\[
|W_t| = \int_{W_t} dx,
\]

and from the advective form of (1.20) we have

\[
\frac{d}{dt}|W_t| = \int_{W_t} \nabla\cdot u(t, x)\,dx.
\]

This equation controls the expansion/compression of a volume of fluid; it can be considered the “macroscopic” form of equation (1.7) for the Jacobian determinant of the flow. We can also notice that, when

\[
\nabla\cdot u(t, x) = 0, \tag{1.21}
\]

the volume |Wt| is preserved, i.e., the fluid is incompressible, and the divergence-free condition (1.21) is referred to as the incompressibility condition. Both examples of figures 1.1 and 1.2 are incompressible flows. For comparison, the example of figure 1.3 is a compressible flow which shrinks a volume element exponentially.
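The volume law can be checked on the three example flows in the plane, where the control volume is a polygon and its area follows from the shoelace formula. The sketch below rests on assumptions: the closed-form flows for the first two fields are derived here from the streaming functions quoted in section 1.2, and the polygon `W0` is an arbitrary illustrative choice.

```python
import math

NU = 0.5


def flow_strain(t, p):
    """Flow of u = (nu x1, -nu x2), the field with psi = nu x1 x2."""
    return (p[0] * math.exp(NU * t), p[1] * math.exp(-NU * t))


def flow_vortex(t, p):
    """Flow of u = (nu x2, -nu x1): a clockwise rigid rotation by the angle nu*t."""
    c, s = math.cos(NU * t), math.sin(NU * t)
    return (c * p[0] + s * p[1], -s * p[0] + c * p[1])


def flow_sink(t, p):
    """Flow of the potential field (1.12), u = (-nu x1, -nu x2)."""
    return (p[0] * math.exp(-NU * t), p[1] * math.exp(-NU * t))


def polygon_area(vertices):
    """Shoelace formula (absolute value of the signed area)."""
    total = 0.0
    n = len(vertices)
    for k in range(n):
        x1, y1 = vertices[k]
        x2, y2 = vertices[(k + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical control polygon W0 (a rectangle away from the origin).
W0 = [(1.0, 1.0), (2.0, 1.0), (2.0, 1.5), (1.0, 1.5)]


def area_ratio(flow, t):
    """|W_t| / |W_0| for the polygon W0 transported by the given flow."""
    wt = [flow(t, p) for p in W0]
    return polygon_area(wt) / polygon_area(W0)


if __name__ == "__main__":
    t = 2.0
    print(area_ratio(flow_strain, t), area_ratio(flow_vortex, t))
    print(area_ratio(flow_sink, t), math.exp(-2 * NU * t))
```

All three flows are linear maps, so the image of a polygon is again a polygon and transporting the vertices is enough; for a nonlinear flow the boundary would have to be sampled more finely.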

By considering the transport of mass, linear momentum, and total energy we shall use the transport theorem to justify the equations of motion of fluid dynamics.

1.5 Dynamics of fluids. In this section we shall consider three basic physical principles, translate them into mathematical statements, and use the Reynolds transport theorem to obtain partial differential equations for the three state variables introduced in section 1.1.

The considered physical principles are:

• mass conservation, which implies an equation for ρ,

• momentum balance (Newton's second law of dynamics), which implies an equation for u,

• energy balance, which defines the dynamics of the internal energy U.

Mass conservation. Let us consider the mass of fluid in a volume Wt that moves with the fluid. Since the volume Wt is “following” the fluid in its motion, the mass contained therein should be constant, that is,

\[
\frac{d}{dt}\int_{W_t} \rho(t, x)\,dx = 0,
\]

where ρ(t, x) is the mass density. Then the Reynolds transport theorem of section 1.4 gives

\[
\int_{W_t}\big[\partial_t\rho + \nabla\cdot(\rho u)\big]dx = 0.
\]

This identity must be true for an arbitrary control volume Wt ⊂ Ω, hence the integrand must vanish with the result that

\[
\partial_t\rho + \nabla\cdot(\rho u) = 0.
\]

This is the first equation of fluid dynamics expressing the conservation of the fluid mass, and it is referred to as mass continuity equation. Upon integrating over the whole domain Ω and using the Gauss theorem

\[
\int_{\Omega} \nabla\cdot(\rho u)\,dx = \int_{\partial\Omega} \rho u\cdot n\,dS,
\]

where n is the outgoing unit normal on the boundary ∂Ω of the domain, we obtain that the total variation of the fluid mass in Ω is

\[
\frac{d}{dt}\int_{\Omega} \rho\,dx = -\int_{\partial\Omega} \rho u\cdot n\,dS,
\]

that is, ρu is the mass flux through the boundary. If the boundary conditions for the velocity u are appropriately chosen, e.g., either if u = 0 on the boundary (no-slip boundary conditions), or n · u = 0, then the mass of the fluid is conserved.

Momentum balance. The main equation of fluid dynamics follows from the transport of linear momentum, which parallels Newton’s second law, namely,

\[
\frac{d}{dt}\int_{W_t} \rho u\,dx = (\text{forces acting on } W_t).
\]

The left-hand side of the equation can be treated by means of the transport theorem, but we need to identify the forces acting on the volume of fluid Wt. We have to distinguish between forces acting on the whole body of the fluid volume Wt, and those acting on its boundary ∂Wt. Forces acting on the whole volume Wt can be written as

\[
(\text{body forces on } W_t) = \int_{W_t} \rho(t, x)\,f(t, x)\,dx,
\]

where f(t, x) is the force per unit of mass acting on the fluid element ρ(t, x)dx; one might think of such forces as the result of an external force field acting on the region occupied by the fluid, such as gravity or electromagnetic forces if the fluid is electrically charged or an electric conductor (this will be the case for plasmas). On the other hand, forces acting on the boundary ∂Wt are due to internal interaction among the fluid elements. The force per unit of area acting on the boundary of Wt can be shown to be a linear function of the outgoing unit normal n on ∂Wt, namely,

\[
(\text{surface forces on } \partial W_t) = -\int_{\partial W_t} P\cdot n\,dS,
\]

where P is a symmetric tensor referred to as Cauchy stress tensor (the linear relationship between surface forces and n is known as Cauchy stress theorem [31]; the proof is not reported here, but an alternative view will be given in the next section). It is worth noting that P has the dimensions of a pressure (force per unit of area). In fact if P is isotropic, i.e., P = pI where p is a scalar, then the expression above reduces to the familiar pressure force

\[
(\text{pressure force}) = -\int_{\partial W_t} p\,n\,dS,
\]

with pressure p; in general, the surface force may not be exactly normal and this is described by the symmetric tensor P. We can still isolate the isotropic contribution by means of the identity,

\[
P = pI + \pi, \tag{1.22}
\]

where the scalar pressure $p = \tfrac{1}{3}\operatorname{tr}P$ is defined as one third of the trace of P so that the symmetric tensor π is trace-free, tr π = 0. The trace-free part of the stress tensor is referred to as viscosity tensor. By the transport theorem (applied component-wise), the momentum balance becomes

\[
\int_{W_t}\big[\partial_t(\rho u) + \nabla\cdot(\rho u u)\big]dx
= -\int_{\partial W_t}\big[p\,n + \pi\cdot n\big]dS + \int_{W_t} \rho f\,dx,
\]

where $uu = u\otimes u = (u_i u_j)_{ij}$ is the tensor product in dyadic notation. The boundary terms can be dealt with by Gauss theorem

\[
\int_{\partial W_t}\big[p\,n + \pi\cdot n\big]dS = \int_{W_t}\big[\nabla p + \nabla\cdot\pi\big]dx,
\]

and since the control volume Wt is arbitrary, we can deduce that

\[
\partial_t(\rho u) + \nabla\cdot(\rho u u + \pi) = -\nabla p + \rho f,
\]

which expresses the balance law for the linear momentum density of the fluid, in terms of viscosity, pressure, and external forces. This is referred to as the momentum balance equation or equation of motion in analogy with Newton’s second law.

Energy balance. At last we consider the energy transport,

\[
\frac{d}{dt}\int_{W_t}\Big[\frac{1}{2}\rho u^2 + \frac{3}{2}n k_B T\Big]dx = (\text{rate of energy input on } W_t).
\]

The rate at which energy is injected into the fluid contained in Wt is the sum of the work done by the forces on the fluid, plus the energy flux through the boundary as the control volume Wt is embedded in the fluid and can exchange heat with the surroundings, plus the energy produced by heat sources in the volume Wt. If q is the heat flux vector and Q the rate of energy production per unit of volume, we have

\[
(\text{rate of energy input on } W_t)
= -\int_{\partial W_t} u\cdot P\cdot n\,dS + \int_{W_t} \rho u\cdot f\,dx
- \int_{\partial W_t} q\cdot n\,dS + \int_{W_t} Q\,dx,
\]

where the first two terms represent the work done by the internal and external forces respectively and the last two terms are the total heat flux through the boundary and the rate of energy produced by heat sources. Again by the Gauss theorem, we can rewrite the boundary terms so that

\[
(\text{rate of energy input on } W_t) = \int_{W_t}\big[\rho u\cdot f + Q - \nabla\cdot(P\cdot u + q)\big]dx.
\]

The transport theorem applied to the total energy then yields,

\[
\partial_t\Big(\frac{1}{2}\rho u^2 + \frac{3}{2}n k_B T\Big)
+ \nabla\cdot\Big[\Big(\frac{1}{2}\rho u^2 + \frac{3}{2}n k_B T + p\Big)u + \pi\cdot u + q\Big]
= \rho u\cdot f + Q, \tag{1.23}
\]

which is the total energy balance law. It is worth noting that the energy flux (i.e., the vector within the divergence operator on the left-hand side) amounts to the advection of kinetic energy, internal energy, and pressure, plus the contribution of viscosity and heat flux. We can use the continuity equation and the momentum balance equation to eliminate the kinetic energy terms in the energy transport, thus obtaining an equation for the internal energy only. This requires some calculations. First,

\[
\partial_t\Big(\frac{1}{2}\rho u^2\Big) + \nabla\cdot\Big(\frac{1}{2}\rho u^2 u\Big) = \rho u\cdot\partial_t u + \rho u\cdot\nabla u\cdot u,
\]

where the continuity equation has been accounted for. The momentum balance equation on the other hand can be written as

\[
\rho(\partial_t u + u\cdot\nabla u) + \nabla\cdot\pi + \nabla p = \rho f,
\]

where again the continuity equation has been accounted for. Upon scalar multiplying by u, the latter gives

\[
\rho u\cdot\partial_t u + \rho u\cdot\nabla u\cdot u = \rho u\cdot f - (\nabla\cdot\pi)\cdot u - u\cdot\nabla p
\]

so that

\[
\begin{aligned}
\partial_t\Big(\frac{1}{2}\rho u^2\Big) + \nabla\cdot\Big(\frac{1}{2}\rho u^2 u\Big)
&= \rho u\cdot f - (\nabla\cdot\pi)\cdot u - u\cdot\nabla p \\
&= \rho u\cdot f - \nabla\cdot(\pi\cdot u + p u) + \pi : \nabla u + p\,\nabla\cdot u,
\end{aligned}
\]

where π : ∇u = tr(π · ∇u). Equivalently,

\[
\partial_t\Big(\frac{1}{2}\rho u^2\Big) + \nabla\cdot\Big(\frac{1}{2}\rho u^2 u + p u + \pi\cdot u\Big)
= \rho u\cdot f + \pi : \nabla u + p\,\nabla\cdot u. \tag{1.24}
\]

In view of this identity, the total energy balance equation (1.23) implies

\[
\partial_t\Big(\frac{3}{2}n k_B T\Big) + \nabla\cdot\Big(\frac{3}{2}n k_B T\,u + q\Big) + p\,\nabla\cdot u + \pi : \nabla u = Q,
\]

which is the internal energy balance equation. Remarkably, the work done by the force f does not contribute to the production of internal energy, but it goes into the kinetic energy only.

Summary of fluid equations. The basic equations of fluid dynamics amount to:

• the mass continuity equation for the mass density ρ(t, x),

\[
\partial_t\rho + \nabla\cdot(\rho u) = 0, \tag{1.25a}
\]

• the momentum balance for the fluid velocity u(t, x),

\[
\partial_t(\rho u) + \nabla\cdot(\rho u u + \pi) = -\nabla p + \rho f, \tag{1.25b}
\]

• and the internal energy balance for the temperature T(t, x),

\[
\partial_t\Big(\frac{3}{2}n k_B T\Big) + \nabla\cdot\Big(\frac{3}{2}n k_B T\,u + q\Big) + p\,\nabla\cdot u + \pi : \nabla u = Q, \tag{1.25c}
\]

where n(t, x) = ρ(t, x)/m.

One should note however that equations (1.25) are not closed, as the pressure p, the viscosity tensor π, the heat flux q, and the heat source Q, as well as the forces f, have not been specified yet. We need to find expressions of those quantities in terms of the basic state variables ρ, u and T. This is known as the closure problem. In order to obtain a physically accurate closure, one needs to account for the microscopic properties of the fluid.

1.6 Relation to kinetic theory and closure. In order to find a closure of fluid equations for both gases and plasmas, one usually relies on kinetic theory [37]. Specifically for plasmas, this leads to the transport equations derived by Braginskii [38] that constitute the standard basis for fluid and transport models in plasma physics [39, 40].

In kinetic theory a fluid is viewed as a collection of particles (atoms or molecules for gases, ions and electrons for plasmas) all of the same type, i.e., same mass m > 0 and same electric charge (if any). When particles of different species are present, e.g., ions and electrons, each particle species is treated separately.

In the position-velocity phase-space (x, v) ∈ Ω×R3, the fluid is described by the particle distribution function f : I × Ω×R3 → R+, where I ⊆ R is a time interval and R+ is the set of (strictly) positive real numbers. Its value f(t, x, v) gives the number of particles per unit of phase-space volume that have position x ∈ Ω and velocity v ∈ R3 at time t ∈ I. The basic equation of the theory is the kinetic equation, which has the general form

\[
\partial_t f + v\cdot\nabla_x f + a\cdot\nabla_v f = C(f),
\]

where a : I × Ω×R3 → R3 is a vector-valued function such that a(t, x, v) represents the acceleration of the particle at the time t, position x, and velocity v, while C(f) is the collision operator, which describes deviations from the motion of the individual particles due to interaction (collisions) with other particles. The specific expressions for a and C are problem-dependent, but we can work under the following hypotheses:

- The acceleration field satisfies

\[
\nabla_v\cdot a(t, x, v) = 0. \tag{1.26}
\]

- The collision operator satisfies

\[
\int_{\mathbb{R}^3} C(f)(t, x, v)\,dv = 0. \tag{1.27}
\]

Condition (1.26) on the acceleration field is verified by fundamental forces acting in all physical systems we are interested in, i.e., fluids and plasmas. Such a condition, however, excludes acceleration fields that strictly dissipate the kinetic energy of a particle, i.e., such that

\[
\frac{1}{2}\frac{dv^2}{dt} = v\cdot a(t, x, v) < 0.
\]

In order to see this, let Br(0) denote the ball of radius r in velocity space centered in the origin. Since the outgoing unit normal on ∂Br(0) is v/r, from condition (1.26) and Gauss identity we have

\[
0 = \int_{B_r(0)} \nabla_v\cdot a(t, x, v)\,dv = \frac{1}{r}\int_{\partial B_r(0)} v\cdot a(t, x, v)\,dS,
\]

in contradiction with the strict dissipation condition v · a(t, x, v) < 0. On the other hand, the fact that the acceleration field has zero velocity divergence is not sufficient to guarantee energy conservation. As an example, the acceleration field

\[
a(t, x, v) = \begin{pmatrix} \nu_1 v_2 \\ -\nu_2 v_1 \\ 0 \end{pmatrix}
\]

has a zero velocity divergence and $\tfrac{1}{2}\,d(v^2)/dt = v\cdot a = (\nu_1 - \nu_2)v_1 v_2$, which does not have a definite sign if ν1 ≠ ν2.
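The two claims about this example, zero velocity divergence and a kinetic-energy rate without a definite sign, are easy to verify numerically. The sketch below uses hypothetical values for the rates ν1 and ν2 and hypothetical sample velocities; the divergence is estimated by central differences.

```python
NU1, NU2 = 1.0, 0.25  # hypothetical rates with nu1 != nu2


def accel(v):
    """Acceleration field from the example: a = (nu1*v2, -nu2*v1, 0)."""
    return [NU1 * v[1], -NU2 * v[0], 0.0]


def velocity_divergence(v, h=1e-5):
    """Central-difference approximation of the velocity divergence of a."""
    total = 0.0
    for i in range(3):
        vp, vm = list(v), list(v)
        vp[i] += h
        vm[i] -= h
        total += (accel(vp)[i] - accel(vm)[i]) / (2 * h)
    return total


def power(v):
    """v . a, the rate of change of the kinetic energy per unit mass, v^2/2."""
    a = accel(v)
    return sum(v[i] * a[i] for i in range(3))


if __name__ == "__main__":
    print(velocity_divergence([0.3, -0.7, 1.1]))  # zero
    print(power([1.0, 1.0, 0.0]))    # positive: the particle gains energy
    print(power([1.0, -1.0, 0.0]))   # negative: the particle loses energy
```

The sign of `power` flips with the sign of the product v1v2, so the field neither conserves nor strictly dissipates the kinetic energy.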

Condition (1.27) on the collision operator is also automatically satisfied for the standard collision operators relevant to gases and plasmas. In fact this condition just means that collisions can change the velocity of a particle but cannot change its position and cannot destroy or create particles. One exception is the class of effective collision operators which, in some approaches, are used to describe chemical reactions and ionization phenomena. We prefer however to distinguish collisions from “non-ideal” effects that are usually introduced phenomenologically in the model (and not derived from first principles) and that are anyway not addressed here.

In addition the collision operator can have other properties such as momentum and energy conservation (elastic collisions), but for the moment we shall only rely on the two conditions stated above.

Assumption (1.26) on the acceleration, in particular, allows us to re-write the kinetic equation as

\[
\partial_t f + \nabla_x\cdot(v f) + \nabla_v\cdot(a f) = C(f), \tag{1.28}
\]

which has the same form as the continuity equation (1.25a), with the difference that it is formulated in phase-space and it has a non-zero right-hand side. In fact the kinetic equation can also be understood on the basis of the Reynolds transport theorem 1.10 in the same way the continuity equation (1.25a) has been constructed. Here the domain Ω is replaced by the phase space Ω × R3, the mass density ρ is replaced by the distribution function f, and the fluid flow is replaced by the laws of particle mechanics, i.e.,

\[
\frac{dx}{dt} = v, \qquad \frac{dv}{dt} = a(t, x, v).
\]

This also means that the characteristic curves of the kinetic equation are the particle orbits in phase space. Differently from the continuity equation, however, the right-hand side of the kinetic equation is not zero as the particles can escape a control volume due to collisions.

The total number of particles composing the fluid at the time t is given by the integral of the particle distribution function on the whole phase-space, namely,

\[
N(t) = \int_{\Omega\times\mathbb{R}^3} f(t, x, v)\,dx\,dv,
\]

hence we need at least f(t, ·) ∈ L1(Ω×R3). From the form (1.28) of the kinetic equation, one obtains that assumption (1.27) on the collision operator implies the conservation of the number of particles, i.e., N(t) = N(0).

Fluid quantities, such as the mass density, the flow velocity, and the internal energy, are related to the partial statistical moments of the phase-space distribution function with respect to the particle velocity, namely,

\[
\int_{\mathbb{R}^3} v^{\alpha} f(t, x, v)\,dv,
\]

where α = (α1, α2, α3) is a multi-index (a vector of non-negative integers) and $v^{\alpha} = v_1^{\alpha_1} v_2^{\alpha_2} v_3^{\alpha_3}$.

Specifically, the velocity integral of f alone (α = 0) gives the number of particles per unit of volume independently of their velocity, hence the mass density is

\[
\rho(t, x) = m\,n(t, x) = m\int_{\mathbb{R}^3} f(t, x, v)\,dv.
\]

Then, if we integrate the kinetic equation (1.28) in velocity and multiply by the mass m, we obtain

\[
\partial_t\rho + \nabla\cdot\int_{\mathbb{R}^3} m v f\,dv = 0,
\]

which allows us to identify the mass flux by comparison with the mass continuity equation (1.25a), namely,

\[
\rho(t, x)\,u(t, x) = m\int_{\mathbb{R}^3} v f(t, x, v)\,dv.
\]

The Cauchy stress tensor and the macroscopic forces acting on the fluid can then be identified by multiplying the kinetic equation (1.28) by mv and integrating, thus obtaining the evolution of the momentum density,

\[
\partial_t(\rho u) + \nabla\cdot\int_{\mathbb{R}^3} m v v f\,dv = \int_{\mathbb{R}^3}\big[m a f + m v\,C(f)\big]dv,
\]

where we have integrated by parts the term involving the acceleration. This result should be compared to the momentum balance equation (1.25b). The second-order moment of f, which appears on the left-hand side, can be rewritten as

\[
\begin{aligned}
\int_{\mathbb{R}^3} m v v f\,dv
&= \int_{\mathbb{R}^3} m (v - u + u)(v - u + u) f\,dv \\
&= \rho u u + m\int_{\mathbb{R}^3} (v - u)(v - u) f\,dv,
\end{aligned}
\]

from which we see that the Cauchy stress tensor is related to the distribution function by, cf. equation (1.22),

\[
P(t, x) = p(t, x)\,I + \pi(t, x) = m\int_{\mathbb{R}^3} (v - u)(v - u) f(t, x, v)\,dv.
\]

This result can be considered a proof in the framework of kinetic theory of the Cauchy stress theorem mentioned in section 1.5. The pressure p is then obtained as the isotropic part of P, namely,

\[
p(t, x) = \frac{1}{3}\,m\int_{\mathbb{R}^3} (v - u)^2 f(t, x, v)\,dv,
\]

which is 2/3 of the kinetic energy of the particles measured in the reference frame moving at the velocity u(t, x). The viscosity is then the trace-free part of P, namely,

\[
\pi(t, x) = m\int_{\mathbb{R}^3}\Big[(v - u)(v - u) - \frac{1}{3}(v - u)^2 I\Big] f(t, x, v)\,dv.
\]

At last, the macroscopic forces acting on the fluid split into a component due to the actual particle acceleration a plus a component that accounts for the collision momentum transport, namely,

\[
\begin{aligned}
\rho(t, x)\,f(t, x)
&= \int_{\mathbb{R}^3} m\,a(t, x, v)\,f(t, x, v)\,dv + \int_{\mathbb{R}^3} m v\,C(f)(t, x, v)\,dv \\
&= \int_{\mathbb{R}^3} m\,a(t, x, v)\,f(t, x, v)\,dv + \int_{\mathbb{R}^3} m (v - u)\,C(f)(t, x, v)\,dv,
\end{aligned}
\]

where we have used condition (1.27) in the second equality.

We examine now the total energy density, which is given by the kinetic energy $\tfrac{1}{2}mv^2$ carried by each fluid particle times the number of particles per unit of volume in the phase-space, integrated in velocity,

\[
\frac{1}{2}\rho u^2 + U = \frac{1}{2}\,m\int_{\mathbb{R}^3} v^2 f\,dv,
\]

where U is the internal energy density. The integral on the right-hand side can be dealt with by writing v = v − u + u and expanding the square with the result

\[
\frac{1}{2}\,m\int_{\mathbb{R}^3} v^2 f\,dv = \frac{1}{2}\rho u^2 + \frac{1}{2}\,m\int_{\mathbb{R}^3} (v - u)^2 f\,dv,
\]

where we have accounted for the identity

\[
\int_{\mathbb{R}^3} m (v - u) f\,dv = 0.
\]

Therefore,

\[
U = \frac{1}{2}\,m\int_{\mathbb{R}^3} (v - u)^2 f\,dv,
\]

which means that the internal energy of the fluid, as defined in section 1.1, is due to the motion of the fluid particles relative to the overall fluid velocity u. On the other hand, the right-hand side equals $\tfrac{3}{2}p$, hence we have established a first closure relation, which expresses the internal energy in terms of the pressure, namely,

\[
U = \frac{3}{2}\,p = \frac{p}{\gamma - 1}, \qquad \gamma = \frac{5}{3},
\]

where the constant γ is referred to as adiabatic index and for the specific case of a gas of identical particles under consideration we find the value γ = 5/3. This relationship is somewhat special because it does not depend on the particular form of the distribution function: it follows from the statistical definition of internal energy.

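Upon multiplying the kinetic equation by $\tfrac{1}{2}mv^2$ and integrating, we obtain the transport equation for the total energy in the form

\[
\partial_t\Big(\frac{1}{2}\rho u^2 + U\Big) + \nabla\cdot\Big(\int_{\mathbb{R}^3} \frac{1}{2} m v^2 v f\,dv\Big)
= \int_{\mathbb{R}^3} m v\cdot a f\,dv + \int_{\mathbb{R}^3} \frac{1}{2} m v^2\,C(f)\,dv,
\]

where we have already integrated by parts the acceleration term. By using again the identity v = v − u + u and expanding the products, the total energy flux amounts to

\[
\int_{\mathbb{R}^3} \frac{1}{2} m v^2 v f\,dv
= \Big(\frac{1}{2}\rho u^2 + U\Big)u + p u + \pi\cdot u + \int_{\mathbb{R}^3} \frac{1}{2} m (v - u)^2 (v - u) f\,dv,
\]

and by comparison with the total energy flux in equation (1.23) we can deduce the heat flux

\[
q(t, x) = \int_{\mathbb{R}^3} \frac{1}{2} m (v - u)^2 (v - u) f(t, x, v)\,dv.
\]

On the other hand we have

\[
\begin{aligned}
\int_{\mathbb{R}^3} m v\cdot a f\,dv &+ \int_{\mathbb{R}^3} \frac{1}{2} m v^2\,C(f)\,dv \\
&= \rho f\cdot u + \int_{\mathbb{R}^3} \frac{1}{2} m (v - u)^2\,C(f)\,dv + \int_{\mathbb{R}^3} m (v - u)\cdot a f\,dv,
\end{aligned}
\]

from which we can deduce that the heat sources are

\[
Q(t, x) = \int_{\mathbb{R}^3} \frac{1}{2} m (v - u)^2\,C(f)\,dv + \int_{\mathbb{R}^3} m (v - u)\cdot a f\,dv.
\]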
This expression is actually valid when the acceleration a(t, x, v) is a generic function of v satisfying only the divergence-free constraint ∇v · a(t, x, v) = 0. Usually however the forces are such that they do not contribute to heating the fluid, i.e., the second integral vanishes exactly for the physical accelerations we shall consider.

As an example of acceleration field relevant to plasma physics, let

\[
a(t, x, v) = a_0(t, x) + v\times b_0(t, x), \tag{1.29}
\]

where a0(t, x), b0(t, x) ∈ R3 are two vector fields independent of velocity. We need to check the velocity divergence, and indeed, we have

\[
\nabla_v\cdot a(t, x, v) = \nabla_v\cdot\big(v\times b_0(t, x)\big) = b_0(t, x)\cdot\nabla_v\times v = 0,
\]

hence assumption (1.26) is satisfied. For such acceleration fields, one finds

\[
(v - u)\cdot a = (v - u)\cdot\big[a_0(t, x) + u(t, x)\times b_0(t, x)\big],
\]

and the right-hand side integrated in velocity against f is zero, i.e., no contribution to the heat source.
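Both properties of the acceleration field (1.29) can be checked pointwise with plain vector algebra. In the sketch below the numerical values of a0, b0, u and v are hypothetical and the function names are introduced only for this illustration.

```python
def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def accel(v, a0, b0):
    """Acceleration (1.29): a = a0 + v x b0, with a0 and b0 independent of v."""
    c = cross(v, b0)
    return [a0[i] + c[i] for i in range(3)]


def velocity_divergence(v, a0, b0, h=1e-5):
    """Central-difference approximation of the velocity divergence of a."""
    total = 0.0
    for i in range(3):
        vp, vm = list(v), list(v)
        vp[i] += h
        vm[i] -= h
        total += (accel(vp, a0, b0)[i] - accel(vm, a0, b0)[i]) / (2 * h)
    return total


def heating_integrand(v, u, a0, b0):
    """Left-hand side (v - u) . a(v)."""
    w = [v[i] - u[i] for i in range(3)]
    return dot(w, accel(v, a0, b0))


def heating_integrand_reduced(v, u, a0, b0):
    """Right-hand side (v - u) . [a0 + u x b0]."""
    w = [v[i] - u[i] for i in range(3)]
    return dot(w, accel(u, a0, b0))


if __name__ == "__main__":
    a0, b0 = [0.2, -1.0, 0.5], [0.0, 0.3, 2.0]   # hypothetical fields at one point
    u, v = [1.0, 0.5, -0.2], [-0.7, 2.0, 0.9]    # hypothetical velocities
    print(velocity_divergence(v, a0, b0))
    print(heating_integrand(v, u, a0, b0), heating_integrand_reduced(v, u, a0, b0))
```

The reduced form is linear in v − u with a coefficient independent of v, which is why its velocity integral against f vanishes by the identity ∫ m(v − u)f dv = 0.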

We can summarize the expressions of fluid quantities in terms of the distribution function $f$ describing the microscopic state of the fluid:

• mass density and particle density,

\[
\rho(t, x) = m n(t, x) = m \int_{\mathbb{R}^3} f(t, x, v) \, dv; \tag{1.30a}
\]

• linear momentum,

\[
\rho(t, x) u(t, x) = m \int_{\mathbb{R}^3} v f(t, x, v) \, dv; \tag{1.30b}
\]

• internal energy and pressure,

\[
U(t, x) = \tfrac{3}{2} p(t, x) = \tfrac{1}{2} m \int_{\mathbb{R}^3} (v - u)^2 f(t, x, v) \, dv; \tag{1.30c}
\]

• forces,

\[
\rho(t, x) f(t, x) = \int_{\mathbb{R}^3} m a(t, x, v) f(t, x, v) \, dv + \int_{\mathbb{R}^3} m (v - u) C(f)(t, x, v) \, dv; \tag{1.30d}
\]

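• viscosity tensor,

\[
\pi(t, x) = m \int_{\mathbb{R}^3} \big[ (v - u)(v - u) - \tfrac{1}{3} (v - u)^2 I \big] f(t, x, v) \, dv; \tag{1.30e}
\]

• heat flux,

\[
q(t, x) = \tfrac{1}{2} m \int_{\mathbb{R}^3} (v - u)^2 (v - u) f(t, x, v) \, dv; \tag{1.30f}
\]

• heat sources,

\[
Q(t, x) = \int_{\mathbb{R}^3} \tfrac{1}{2} m (v - u)^2 C(f) \, dv + \int_{\mathbb{R}^3} m (v - u) \cdot a f \, dv. \tag{1.30g}
\]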

The formal argument leading to (1.30) also shows that

Proposition 1.11 (formal). Let $|a(t, x, v)| \le C_{t,x} (1 + v^2)$. If $f = f(t, x, v)$ is a smooth solution of the kinetic equation (1.28) such that both

\[
v \mapsto (1 + v^2)^{3/2} f(t, x, v), \qquad v \mapsto (1 + v^2) C(f)(t, x, v)
\]

are in $L^1(\mathbb{R}^3)$, then the quantities defined by (1.30) are finite and satisfy identically the fluid equations (1.25).


This establishes a link between the kinetic theory of a gas of particles and fluid dynamics. We shall now apply this result in order to obtain closure relations that will lead to Euler’s and Navier-Stokes equations of fluid dynamics.

For the sake of definiteness we choose a specific collision operator. As a model of collisions with all the important properties of physical collisions, we choose the BGK (Bhatnagar, Gross and Krook [41]) operator, which, in particular, has the property of relaxing the distribution function to a local thermodynamic equilibrium. We say that a gas or a plasma is in local thermodynamical equilibrium if the distribution is described by a local Maxwell distribution,

\[
f_M(t, x, v) = n(t, x) \Big( \frac{m}{2 \pi k_B T(t, x)} \Big)^{3/2} \exp\Big[ - \frac{m \big( v - u(t, x) \big)^2}{2 k_B T(t, x)} \Big], \tag{1.31}
\]

where $n(t, x)$ is the number density (we shall use $n$ and $\rho = mn$ equivalently), $u(t, x)$ the local average velocity, and $T(t, x)$ is the local temperature, measuring the spread of $v$ with respect to its average. The physical meaning of the distribution (1.31) is that a fluid element is considered as an infinitesimal thermodynamical system of $n(t, x) dx$ particles, moving with average velocity $u(t, x)$, and in thermodynamical equilibrium with a thermal bath at the local temperature $T(t, x)$. Hence the particle distribution is given by the Boltzmann distribution $f_M \propto \exp[-E/(k_B T)]$, where $E$ is the kinetic energy of the particle in the moving frame. In this sense, we say that, for a Maxwellian distribution, the temperature $T$ has a thermodynamical meaning. If the system is in local thermodynamical equilibrium, then equation (1.30c) yields the ideal gas law (1.3). More generally we have

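Proposition 1.12. The first three velocity moments of the Maxwell distribution $f_M$ amount to

\[
\int_{\mathbb{R}^3} f_M \, dv = n, \qquad \int_{\mathbb{R}^3} v f_M \, dv = n u, \qquad \int_{\mathbb{R}^3} (v - u)^2 f_M \, dv = 3 n k_B T / m,
\]

which implies the ideal gas law $U = (3/2) n k_B T$. In addition, we have

\[
\int_{\mathbb{R}^3} \big[ (v - u) \otimes (v - u) - \tfrac{1}{3} (v - u)^2 I \big] f_M \, dv = 0, \qquad \int_{\mathbb{R}^3} (v - u)^2 (v - u) f_M \, dv = 0,
\]

hence $\pi = 0$ and $q = 0$.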

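Proof. We observe that the Maxwell distribution defines a measure in velocity space

\[
f_M \, dv = n \pi^{-3/2} e^{-\xi^2} d\xi = n \, d\mu_M(\xi), \qquad d\mu_M(\xi) = \pi^{-3/2} e^{-\xi^2} d\xi,
\]

where we have introduced the coordinate $\xi = (v - u)/v_{\mathrm{th}}$, which is the velocity relative to $u$ and normalized to the thermal speed, defined by $v_{\mathrm{th}} = (2 k_B T / m)^{1/2}$. One can check that the measure $d\mu_M(\xi)$ is normalized on $\mathbb{R}^3$, i.e., $\int_{\mathbb{R}^3} d\mu_M(\xi) = 1$, and is symmetric under reflection with respect to the origin, $\xi \mapsto -\xi$, hence,

\[
\int_{\mathbb{R}^3} f_M \, dv = n \int_{\mathbb{R}^3} d\mu_M(\xi) = n, \qquad \int_{\mathbb{R}^3} v f_M \, dv = n u + n v_{\mathrm{th}} \int_{\mathbb{R}^3} \xi \, d\mu_M(\xi) = n u,
\]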

the last integral being zero by symmetry. Analogously,

\[
\int_{\mathbb{R}^3} (v - u)^2 f_M \, dv = n v_{\mathrm{th}}^2 \int_{\mathbb{R}^3} \xi^2 \, d\mu_M(\xi),
\]

and the integral is equal to $3/2$, cf. lemma A.1 in appendix A. Then the internal energy defined in equation (1.30c) amounts to $U(t, x) = \tfrac{3}{2} p(t, x) = \tfrac{3}{2} n k_B T$. As for the last point, we observe that

\[
\int_{\mathbb{R}^3} \xi_i \xi_j \, d\mu_M(\xi) = 0 \quad (i \neq j), \qquad \int_{\mathbb{R}^3} \xi^2 \xi_i \, d\mu_M(\xi) = 0,
\]

due to the symmetry of the Maxwellian measure $\mu_M$. Then, the off-diagonal entries of $\pi$ are zero and $q = 0$. As for the diagonal entries, for $i = j$ we have

\[
\int_{\mathbb{R}^3} \xi_i^2 \, d\mu_M(\xi) = \frac{1}{3} \sum_{j=1}^{3} \int_{\mathbb{R}^3} \xi_j^2 \, d\mu_M(\xi) = \frac{1}{3} \int_{\mathbb{R}^3} \xi^2 \, d\mu_M(\xi),
\]

since the measure is isotropic (i.e., invariant under any rigid rotation). This implies that the diagonal terms of $\pi$ are also zero.
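The moments used in the proof can be reproduced numerically. The sketch below is an illustration only: it approximates the one-dimensional Gaussian moments with a trapezoidal rule (an assumption made here for simplicity, in place of lemma A.1) and assembles the moments of the three-dimensional measure $d\mu_M$ from them. The physical parameters are hypothetical values in SI units.

```python
import math

def gaussian_moment(k, half_width=8.0, n_points=801):
    """Trapezoidal approximation of (1/sqrt(pi)) * int x^k exp(-x^2) dx."""
    h = 2.0 * half_width / (n_points - 1)
    total = 0.0
    for j in range(n_points):
        x = -half_width + j * h
        weight = 0.5 if j in (0, n_points - 1) else 1.0
        total += weight * x**k * math.exp(-x * x)
    return total * h / math.sqrt(math.pi)

m0, m1, m2, m3 = (gaussian_moment(k) for k in range(4))

# The 3D measure dmu_M is a product measure, so its moments factorize.
norm = m0**3                              # int dmu_M
first = m1 * m0**2                        # int xi_i dmu_M
second = 3.0 * m2 * m0**2                 # int xi^2 dmu_M
off_diag = m1 * m1 * m0                   # int xi_i xi_j dmu_M, i != j
heat = m3 * m0**2 + 2.0 * m1 * m2 * m0    # int xi^2 xi_i dmu_M

# Hypothetical gas parameters (SI units), used only to evaluate U.
k_B, m, n, T = 1.380649e-23, 1.6726e-27, 1.0e19, 1.0e6
v_th_sq = 2.0 * k_B * T / m
U = 0.5 * m * n * v_th_sq * second        # internal energy from (1.30c)
print(norm, first, second, off_diag, heat, U / (n * k_B * T))
```

Under these assumptions the output should be close to `1, 0, 1.5, 0, 0, 1.5`, consistent with the normalization of $d\mu_M$, the vanishing of $\pi$ and $q$, and the ideal gas law $U = (3/2) n k_B T$.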

For a generic non-Maxwellian distribution $f$, we can still define $T$ by means of the ideal gas law (1.3), which is a good change of variables $U \mapsto T$. Such an effective temperature, however, has only a statistical meaning measuring the internal energy and no thermodynamical interpretation, as the system is not in a local equilibrium. With this generalized definition of temperature, equation (1.30c) amounts to

\[
p = n k_B T, \tag{1.32}
\]

independently of the distribution function.

In terms of local Maxwell distributions, we can define the map

\[
M : f \mapsto M(f) = f_M,
\]

which associates a local Maxwellian $f_M$ to an element $f$ in the class of measurable functions $f : I \times \Omega \times \mathbb{R}^3 \to \mathbb{R}_+$ such that

\[
\int_{\mathbb{R}^3} (1 + v^2) f(t, x, v) \, dv < +\infty. \tag{1.33}
\]

For any such function, the velocity moments with weight $\varphi(v) \in \{1, v, |v|^2\}$, namely,

\[
\begin{pmatrix} n \\ n u \\ n u^2 + 3 n k_B T / m \end{pmatrix} = \int_{\mathbb{R}^3} \begin{pmatrix} 1 \\ v \\ |v|^2 \end{pmatrix} f(t, x, v) \, dv,
\]

are finite, and since $n > 0$, we can solve for $u$ and $T$. Then $M(f) = f_M$ is defined as the Maxwell distribution (1.31) with density $n$, average velocity $u$, and temperature $T$ computed from the velocity moments of $f$. One should notice that the operator $M$ is strongly non-linear and the non-linearity is hidden in the expression for $f_M$.

The BGK operator is defined in terms of $M$ by [37, 42, 43]

\[
C(f) = \nu_c \big( M(f) - f \big), \tag{1.34}
\]

where $\nu_c > 0$ is the collision frequency. Since the velocity-space integrals of $\varphi f$ and $\varphi M(f)$ with weight $\varphi(v) \in \{1, v, |v|^2\}$ are the same by construction, the BGK operator (1.34) satisfies

\[
\int_{\mathbb{R}^3} \varphi(v) C(f)(t, x, v) \, dv = 0, \quad \text{for all } \varphi(v) \in \{1, v, |v|^2\}. \tag{1.35}
\]


In particular, condition (1.27) corresponds to the case $\varphi(v) = 1$ and is therefore satisfied. In addition, we have $M(f_M) = f_M$, hence

\[
C(f_M) = 0, \tag{1.36}
\]

for any local Maxwellian distribution $f_M$, that is, the kernel of the BGK operator is equal to the family of local Maxwellians. Properties (1.35) and (1.36) are true also for other collision operators such as the Boltzmann operator [37] for hard collisions in gases and the Landau operator [38, 44] for Coulomb collisions in plasmas (with appropriate modification for multi-species plasmas, cf. section 3.1). The velocity moments with weight functions $\varphi(v) \in \{1, v, |v|^2\}$ are referred to as collision invariants.

As a consequence of (1.35) and (1.36), the BGK operator relaxes the distribution function toward a local Maxwellian with $n$, $u$, and $T$ determined by the initial condition. More precisely, a solution of the simplified initial-value problem

\[
\partial_t f = C(f), \qquad f(0) = f_i, \tag{1.37}
\]

(neglecting the advection operators on the left-hand side of equation (1.28) for the moment) approaches the Maxwellian distribution $M(f_i)$ determined by the initial condition $f_i$, as $t \to +\infty$. In fact, if $f = f(t, x, v)$ is a sufficiently regular solution of (1.37) such that we can differentiate in time under the integral sign, then for every $\varphi(v) \in \{1, v, |v|^2\}$,

\[
\frac{d}{dt} \int_{\mathbb{R}^3} \varphi(v) f \, dv = \int_{\mathbb{R}^3} \varphi(v) \partial_t f \, dv = \int_{\mathbb{R}^3} \varphi(v) C(f) \, dv = 0,
\]

which means that collision invariants are exactly preserved. For such solutions we have $M(f) = M(f_i)$ since the moments of $f$ are necessarily the same as the moments of the initial condition. We can then write

\[
f = M(f_i) + g,
\]

and upon substituting into equation (1.37) we have

\[
\partial_t g = \nu_c \big( M(f) - M(f_i) - g \big) = -\nu_c g, \qquad g(0) = f_i - M(f_i),
\]

which is a linear problem and is readily solved by

\[
g(t) = \big( f_i - M(f_i) \big) e^{-\nu_c t}.
\]

One can check that all collision invariants of $g$ are zero, as they should be. At last we obtain that all sufficiently regular solutions of (1.37) must necessarily be of the form

\[
f(t) = f_i e^{-\nu_c t} + M(f_i) \big( 1 - e^{-\nu_c t} \big),
\]

that is, a time-dependent convex combination joining the points $f_i$ and $M(f_i)$ as time advances. In particular we see that

\[
\lim_{t \to +\infty} f(t) = M(f_i).
\]

We say that $f$ relaxes to the specific Maxwell distribution which has the same particle number, momentum, and energy densities as the initial condition. The invariants of the collision operator determine the relaxed state.
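The relaxation argument can be illustrated with a small numerical experiment. The sketch below makes several simplifying assumptions that are not in the text: a single velocity dimension, a uniform velocity grid with trapezoidal quadrature, the variable `theta` standing for $k_B T/m$, a two-bump initial condition, and forward Euler time stepping of (1.37). It recomputes $M(f)$ at every step, so the comparison with the closed-form solution also checks that the moments stay fixed.

```python
import math

V_MAX, N_V = 10.0, 201
H = 2.0 * V_MAX / (N_V - 1)
GRID = [-V_MAX + j * H for j in range(N_V)]

def integrate(values):
    """Trapezoidal rule on the uniform velocity grid."""
    return H * (sum(values) - 0.5 * (values[0] + values[-1]))

def moments(f):
    """Density n, mean velocity u and variance theta = k_B T / m of f."""
    n = integrate(f)
    u = integrate([v * fv for v, fv in zip(GRID, f)]) / n
    theta = integrate([(v - u) ** 2 * fv for v, fv in zip(GRID, f)]) / n
    return n, u, theta

def maxwellian(n, u, theta):
    """One-dimensional local Maxwellian with the given moments."""
    c = n / math.sqrt(2.0 * math.pi * theta)
    return [c * math.exp(-(v - u) ** 2 / (2.0 * theta)) for v in GRID]

def bgk(f, nu_c):
    """BGK operator (1.34): C(f) = nu_c * (M(f) - f)."""
    return [nu_c * (mv - fv) for mv, fv in zip(maxwellian(*moments(f)), f)]

# Hypothetical parameters and a non-Maxwellian (two-bump) initial condition.
nu_c, dt, n_steps = 1.0, 0.002, 1000
f_init = [p + q for p, q in zip(maxwellian(0.5, -1.0, 0.09),
                                maxwellian(0.5, 1.0, 0.25))]
m_init = maxwellian(*moments(f_init))

# Forward Euler integration of (1.37).
f = list(f_init)
for _ in range(n_steps):
    f = [fv + dt * cv for fv, cv in zip(f, bgk(f, nu_c))]

# Closed-form solution f(t) = f_i e^{-nu_c t} + M(f_i) (1 - e^{-nu_c t}).
decay = math.exp(-nu_c * dt * n_steps)
f_exact = [fi * decay + mi * (1.0 - decay) for fi, mi in zip(f_init, m_init)]
max_error = max(abs(p - q) for p, q in zip(f, f_exact))
print(moments(f_init), moments(f), max_error)
```

The moments of the numerical solution should agree with those of the initial condition up to rounding, while `max_error` reflects only the first-order accuracy of the Euler scheme.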


If the full kinetic equation (1.28) is accounted for, the relaxation process is much more complicated. Well-posedness of the kinetic equation with the BGK operator has been proven by Perthame [42] for the case of zero acceleration ($a = 0$) on $\Omega = \mathbb{R}^d$ for any dimension $d$.

In general, if collisions are strong enough, we might still expect that the distribution will become nearly Maxwellian at long times, even in the presence of the advection terms.

This suggests the possibility of an asymptotic solution of the kinetic equation. In order to represent mathematically “strong collisions”, let us scale the collision frequency according to $\nu_c = \nu_0 / \varepsilon$ with $\nu_0 > 0$ fixed and let $\varepsilon \in (0, 1]$ tend to zero.

We consider the Hilbert expansion of the distribution function [37, 45, and references therein],

\[
f = f^\varepsilon \sim \sum_{n \ge 0} \varepsilon^n f_n,
\]

where the symbol “$\sim$” means asymptotic convergence, i.e., for all integers $N > 0$ there are constants $C_{\alpha,N}$ such that

\[
\Big| \partial^\alpha \Big( f^\varepsilon - \sum_{n=0}^{N} \varepsilon^n f_n \Big) \Big| \le C_{\alpha,N} \, \varepsilon^N,
\]

for all multi-indices $\alpha$ with $|\alpha| \le k$, where $k$ is the required regularity, e.g., $k = 1$. In plain words this means that for every $N$ the partial sum of $\varepsilon^n f_n$ is a good approximation of $f^\varepsilon$ for $\varepsilon$ sufficiently small, even if the full series might not converge for any fixed value of $\varepsilon$.

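We notice that

\[
M(f^\varepsilon) \sim \sum_{n \ge 0} \varepsilon^n M_n,
\]

where $M_n$ depends on $f_0, \ldots, f_n$ and

\[
M_0 = M(f_0).
\]

Then the kinetic equation (1.28) becomes

\[
\frac{\nu_0}{\varepsilon} \big( f_0 - M(f_0) \big) + \sum_{n \ge 1} \varepsilon^{n-1} \Big[ \partial_t f_{n-1} + v \cdot \nabla_x f_{n-1} + a \cdot \nabla_v f_{n-1} + \nu_0 \big( f_n - M_n(f_0, \ldots, f_n) \big) \Big] \sim 0,
\]

and separating different powers of the small parameter $\varepsilon$, we have

\[
\begin{aligned}
\nu_0 \big( M(f_0) - f_0 \big) &= 0, && \text{for } n = 0, \\
\nu_0 \big( M_n(f_0, \ldots, f_n) - f_n \big) &= \partial_t f_{n-1} + v \cdot \nabla_x f_{n-1} + a \cdot \nabla_v f_{n-1}, && \text{for } n \ge 1.
\end{aligned}
\]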

The lowest-order equation ($n = 0$) tells us that $f_0$ is a Maxwell distribution, with arbitrary moments $\rho_0$, $u_0$, and $T_0$. Then, we can try to solve iteratively for the correctors $f_n$, $n \ge 1$, and from the solutions $f_n$, we have corrections $\rho_n$, $u_n$, and $T_n$.


The simplest possible closure is obtained by retaining only the lowest-order term in the Hilbert series so that

\[
f(t, x, v) = f_0(t, x, v) = n_0(t, x) \Big( \frac{m}{2 \pi k_B T_0(t, x)} \Big)^{3/2} \exp\Big[ - \frac{m \big( v - u_0(t, x) \big)^2}{2 k_B T_0(t, x)} \Big]. \tag{1.38}
\]

The distribution defined by (1.38) is not an exact solution of the kinetic equation, but we have

\[
\partial_t f + v \cdot \nabla_x f + a \cdot \nabla_v f - \frac{\nu_0}{\varepsilon} \big( M(f) - f \big) = O(1),
\]

pointwise in $(t, x, v)$ for $\varepsilon \to 0^+$, that is, the approximation is not uniform! Equations (1.30) and proposition 1.12 give

\[
\rho = \rho_0, \qquad u = u_0, \qquad T = T_0,
\]

together with

\[
\pi = 0, \quad \text{and} \quad q = 0, \tag{1.39a}
\]

while, since $C(f_M) = 0$ and considering the case of an acceleration field which does not produce heat, e.g., accelerations of the form (1.29), we have

\[
Q = 0. \tag{1.39b}
\]

There is no contribution of collisions to the body force (1.30d),

\[
\rho(t, x) f(t, x) = \int_{\mathbb{R}^3} m a(t, x, v) f(t, x, v) \, dv. \tag{1.39c}
\]

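This leaves us with the equations, cf. equations (1.25),

\[
\begin{aligned}
\partial_t \rho + \nabla \cdot (\rho u) &= 0, \\
\partial_t (\rho u) + \nabla \cdot (\rho u u) &= -\nabla p + \rho f, \\
\partial_t \big( \tfrac{3}{2} p \big) + \nabla \cdot \big( \tfrac{3}{2} p u \big) + p \nabla \cdot u &= 0,
\end{aligned} \tag{1.40}
\]

or the equivalent advective form

\[
\begin{aligned}
\frac{D\rho}{Dt} &= -\rho \nabla \cdot u, \\
\rho \frac{Du}{Dt} &= -\nabla p + \rho f, \\
\frac{Dp}{Dt} &= -\gamma p \nabla \cdot u,
\end{aligned} \tag{1.41}
\]

where we have defined the adiabatic index

\[
\gamma = \frac{2}{3} + 1 = \frac{5}{3}. \tag{1.42}
\]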

Equations (1.40), or equivalently their advective form (1.41), are referred to as compressible Euler equations for inviscid fluids (viscosity is zero).
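For the reader's convenience, the passage from the conservative form (1.40) to the advective form (1.41) can be spelled out as follows. This is only a sketch of the standard computation, assuming the advective derivative $D/Dt = \partial_t + u \cdot \nabla$ and using the continuity equation in the first line.

```latex
\begin{align*}
\partial_t(\rho u) + \nabla\cdot(\rho u u)
  &= \rho\,(\partial_t u + u\cdot\nabla u)
     + u\,\big[\partial_t\rho + \nabla\cdot(\rho u)\big]
   = \rho\,\frac{Du}{Dt}, \\
\partial_t\big(\tfrac32 p\big) + \nabla\cdot\big(\tfrac32 p u\big) + p\,\nabla\cdot u
  &= \tfrac32\,(\partial_t p + u\cdot\nabla p)
     + \big(\tfrac32 + 1\big)\,p\,\nabla\cdot u
   = \tfrac32\Big[\frac{Dp}{Dt} + \tfrac53\,p\,\nabla\cdot u\Big].
\end{align*}
```

Setting the last expression to zero gives $Dp/Dt = -\gamma p \nabla \cdot u$ with $\gamma = 5/3$, which is the origin of the value (1.42).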

However, in many applications the lowest-order closure will not suffice and corrections should be added, thus obtaining non-trivial transport coefficients, including non-trivial viscosity and heat fluxes.


It is instructive to consider at least the first-order correction in the Hilbert expansion. With this aim we need to compute the linearization of the operator $M$, that is, the term $M_1(f_0, f_1)$. Let $V$ be the space of measurable functions $g : I \times \Omega \times \mathbb{R}^3 \to \mathbb{R}_+$ such that for every $(t, x)$, $g(t, x, \cdot) \in L^2(\mathbb{R}^3, d\mu_M)$, where $L^2(\mathbb{R}^3, d\mu_M)$ is the space of square-integrable functions with respect to the Maxwellian measure $d\mu_M$ introduced in the proof of proposition 1.12. Of particular importance is the subspace

\[
V_0 = \mathrm{span}\Big( 1, \ \frac{m}{k_B T_0} (v - u_0), \ \frac{m (v - u_0)^2}{2 k_B T_0} - \frac{3}{2} \Big) \subseteq V, \tag{1.43}
\]

that is equivalently defined as the subspace of linear combinations of the three collision invariants $\varphi(v) \in \{1, v, |v|^2\}$ with coefficients depending on $(t, x)$. We also define the average [43]

\[
\langle g \rangle = \frac{1}{n_0} \int_{\mathbb{R}^3} g(t, x, v) f_0(t, x, v) \, dv = \int_{\mathbb{R}^3} g(t, x, u_0 + v_{\mathrm{th}} \xi) \, d\mu_M(\xi), \tag{1.44}
\]

for all $g \in V$. We can now compute the linearization of $M$.

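Proposition 1.13. If $f_0$ is given in equation (1.38), the first-order term in the Hilbert expansion of $M(f)$ can be written as

\[
M_1(f_0, f_1) = f_0 \Pi(g_1),
\]

for $f_1 = f_0 g_1$, $g_1 \in V$, and $\Pi : V \to V_0$ is the linear operator given by

\[
\Pi(g) = \langle g \rangle + \frac{m}{k_B T_0} (v - u_0) \cdot \langle (v - u_0) g \rangle + \Big( \frac{m (v - u_0)^2}{2 k_B T_0} - \frac{3}{2} \Big) \Big\langle \Big( \frac{m (v - u_0)^2}{3 k_B T_0} - 1 \Big) g \Big\rangle,
\]

for any $g \in V$. We have that $\Pi$ is a projector, i.e., $\Pi^2 = \Pi$.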

Proof. Given in appendix A.
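The projector property can also be checked numerically before consulting the appendix. In the normalized variable $\xi = (v - u_0)/v_{\mathrm{th}}$ one has $m (v - u_0)^2 / (2 k_B T_0) = \xi^2$, so that the operator of proposition 1.13 reads $\Pi(g) = \langle g \rangle + 2 \xi \cdot \langle \xi g \rangle + (\xi^2 - 3/2) \langle (2 \xi^2 / 3 - 1) g \rangle$. The sketch below evaluates this expression on a tensor-product grid; the grid, the quadrature rule and the polynomial test function are illustrative assumptions.

```python
import itertools
import math

# Tensor-product quadrature for dmu_M = pi^(-3/2) exp(-xi^2) dxi.
# End-point corrections of the trapezoidal rule are omitted, since the
# Gaussian weight is negligible (about exp(-36)) at the boundary.
HALF_WIDTH, N_1D = 6.0, 25
STEP = 2.0 * HALF_WIDTH / (N_1D - 1)
NODES_1D = [-HALF_WIDTH + j * STEP for j in range(N_1D)]
NODES = list(itertools.product(NODES_1D, repeat=3))
WEIGHTS = [STEP**3 * math.exp(-(x * x + y * y + z * z)) / math.pi**1.5
           for x, y, z in NODES]

def sq(xi):
    """Squared norm of a node xi = (xi_1, xi_2, xi_3)."""
    return xi[0] ** 2 + xi[1] ** 2 + xi[2] ** 2

def average(values):
    """Average (1.44) of a function sampled at the quadrature nodes."""
    return sum(w * g for w, g in zip(WEIGHTS, values))

def project(values):
    """Apply Pi, written in the normalized variable xi, to nodal values."""
    c0 = average(values)
    c1 = [average([xi[i] * g for xi, g in zip(NODES, values)]) for i in range(3)]
    c2 = average([(2.0 * sq(xi) / 3.0 - 1.0) * g for xi, g in zip(NODES, values)])
    return [c0 + 2.0 * sum(xi[i] * c1[i] for i in range(3)) + (sq(xi) - 1.5) * c2
            for xi in NODES]

# Hypothetical polynomial test function g(xi).
g = [1.0 + x + x * y + z * z + x**3 for x, y, z in NODES]
pg = project(g)
ppg = project(pg)
idempotency_defect = max(abs(p - q) for p, q in zip(pg, ppg))
print(idempotency_defect)
```

For this test function a direct computation of the Gaussian moments gives $\Pi(g) = 1 + \tfrac{5}{2} \xi_1 + \tfrac{1}{3} \xi^2$, and the printed defect of $\Pi^2 - \Pi$ should vanish up to rounding.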

The first-order equation in the Hilbert expansion then takes the form

\[
\nu_0 \big( \Pi(g_1) - g_1 \big) = h_1, \tag{1.45}
\]

where the right-hand side is defined by

\[
f_0 h_1 = \partial_t f_0 + v \cdot \nabla_x f_0 + a \cdot \nabla_v f_0,
\]

with $f_0$ given in (1.38). Since $\Pi$ is a projector, we deduce a necessary condition for the existence of a solution of equation (1.45): if $g_1$ is a solution, then applying $\Pi$ to both sides of the equation gives

\[
\nu_0 \big( \Pi^2(g_1) - \Pi(g_1) \big) = \Pi(h_1),
\]

and since $\Pi^2 = \Pi$, we have the necessary condition

\[
\Pi(h_1) = 0, \tag{1.46}
\]


which is referred to as the solvability condition for equation (1.45). We shall see that condition (1.46) is equivalent to the compressible Euler’s equations (1.41). If this condition is satisfied, then we have

\[
\nu_0 \big( \Pi(-\nu_0^{-1} h_1) + \nu_0^{-1} h_1 \big) = -\Pi(h_1) + h_1 = h_1,
\]

which shows that $g_1 = -\nu_0^{-1} h_1$ is a particular solution of (1.45). Then all possible solutions are obtained by adding to it an element of the kernel of the operator $\nu_0 (\Pi - I)$, where $I$ is the identity operator in the space $V$, that is, an element of the range of $\Pi$. In summary, we have

Proposition 1.14. Let the acceleration field $a = a(t, x, v)$ be of the form (1.29). If $n_0$, $u_0$, and $T_0$ solve the compressible Euler’s equations (1.41), then $-\nu_0^{-1} h_1$ is a particular solution of equation (1.45) and all solutions are

\[
g_1 = g_{1,0} - \nu_0^{-1} h_1, \tag{1.47}
\]

for any $g_{1,0} \in V_0$.


Let us remark that the solution for $g_1$ is not fully determined by the first-order equation, since $g_{1,0}$ can be any element of $V_0$. This freedom can be used to enforce the solvability condition for the next-order corrector, in the same way as the solvability condition (1.46) determines the moments $n_0$, $u_0$, and $T_0$ of the zeroth-order solution $f_0$.

From the explicit solution (1.47), we can compute the first-order corrections to the closure relations. We obtain a non-trivial viscosity tensor and heat flux. On the other hand, the heat sources are exactly zero due to the fact that the BGK collision operator preserves energy and the acceleration $a$ does not contribute to the heat source by hypothesis.

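Proposition 1.15. For any $g_1$ in the space of solutions in proposition 1.14,

\[
n_1 = n_0 \langle g_{1,0} \rangle, \qquad u_1 = \langle (v - u_0) g_{1,0} \rangle, \qquad T_1 = T_0 \Big\langle \Big( \frac{m (v - u_0)^2}{3 k_B T_0} - 1 \Big) g_{1,0} \Big\rangle,
\]

and the corrections to viscosity and heat flux are

\[
\pi_1 = - \frac{2 n_0 k_B T_0}{\nu_0} \Big[ D_0 - \frac{1}{3} (\mathrm{tr}\, D_0) I \Big], \qquad q_1 = - \frac{5}{2} \Big( \frac{k_B}{m} \Big) \frac{n_0 k_B T_0}{\nu_0} \nabla T_0,
\]

where $D_0 = (\nabla u_0 + {}^t\nabla u_0)/2$ is the deformation tensor of $u_0$.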


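From the results of proposition 1.15, we have

\[
\begin{aligned}
\pi &= \varepsilon \pi_1 + O(\varepsilon^2) = - \frac{2 n k_B T}{\nu_c} \Big[ D - \frac{1}{3} (\nabla \cdot u) I \Big] + O(\varepsilon^2), \\
q &= \varepsilon q_1 + O(\varepsilon^2) = - \frac{5}{2} \Big( \frac{k_B}{m} \Big) \frac{n k_B T}{\nu_c} \nabla T + O(\varepsilon^2),
\end{aligned}
\]

where we have used $1/\nu_c = \varepsilon/\nu_0$, and we have substituted $n$, $u$ and $T$ for $n_0$, $u_0$, and $T_0$, respectively, as the difference introduces terms of order $O(\varepsilon^2)$.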


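Neglecting the remainder, we obtain the standard Navier-Stokes viscosity and Fourier’s law for the heat flux,

\[
\pi = -2 \mu \Big[ D - \frac{1}{3} (\nabla \cdot u) I \Big], \qquad q = -k \nabla T, \tag{1.48}
\]

where the coefficients

\[
\mu = \frac{n k_B T}{\nu_c}, \qquad k = \frac{5}{2} \Big( \frac{k_B}{m} \Big) \frac{n k_B T}{\nu_c}, \tag{1.49}
\]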

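are the dynamic viscosity coefficient and the thermal conductivity, respectively.

The equations of fluid dynamics with the closure of proposition 1.15 amount to (in advective form)

\[
\begin{aligned}
\frac{D\rho}{Dt} &= -\rho \nabla \cdot u, \\
\rho \frac{Du}{Dt} &= -\nabla p - \nabla \cdot \pi + \rho f, \qquad \pi = -\mu \Big[ \nabla u + {}^t\nabla u - \frac{2}{3} (\nabla \cdot u) I \Big], \\
\rho c_V \frac{DT}{Dt} &= -p \nabla \cdot u - \pi : \nabla u + \nabla \cdot (k \nabla T),
\end{aligned} \tag{1.50}
\]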

where, in writing the internal energy equation, we have introduced the specific heat capacity at constant volume, defined by [46, 47]

\[
U = \frac{3}{2} n k_B T = \rho c_V T, \qquad c_V = \frac{3}{2} \frac{k_B}{m} = \frac{k_B/m}{\gamma - 1}.
\]

Equations (1.50) are the compressible Navier-Stokes equations. The derivation of Navier-Stokes equations based on the Hilbert expansion is purely formal. In addition, as already noted, the Hilbert expansion method does not produce a uniform approximation of the solution and it is expected to fail when steep gradients build up [37]. A slightly more sophisticated approach is the Chapman-Enskog method [37, and references therein], in which $f_0$ is replaced by the Maxwell distribution with full moments $n$, $u$, and $T$, instead of their lowest-order approximations [45]. A rigorous approach based on convergence results for BGK collisions has been developed by Saint-Raymond [43].

On going back to Navier-Stokes equations, one should notice that, in general, $\mu$ and $k$ are functions of the density and temperature of the fluid. On the other hand, for perfect gases $c_V$ is a constant.

In order to estimate the relative importance of inertia, viscosity and heat flux, it is customary to make use of dimensionless numbers. We introduce constants that represent the typical order of magnitude of the fluid quantities for a given class of solutions corresponding to the specific case under consideration. For instance, let $\tau$ and $L$ be positive real numbers representing the typical scales of time- and space-variations of the solution, while $V$ is the typical scale of the flow $u$. The ratio between inertia and viscosity forces is then estimated by

\[
\frac{\rho |u \cdot \nabla u|}{|\nabla \cdot \pi|} \approx \frac{\rho V^2}{\mu V / L} = \frac{V L}{\nu},
\]

where

\[
\nu = \mu / \rho, \tag{1.51}
\]


is referred to as kinematic viscosity. The symbol “$\approx$” is used here to indicate an order-of-magnitude estimate. The dimensionless parameter

\[
\mathrm{Re} = \frac{V L}{\nu}, \tag{1.52}
\]

is the Reynolds number of the considered flow. A large Reynolds number signifies that inertia is dominating over viscosity, and naively one might expect that the solution follows a dynamics close to Euler’s equations. The presence of even a small viscosity can, however, change the qualitative behavior, as exemplified in section 1.9 below.

Upon dividing the momentum balance equation in (1.50) by $\rho$, we see that the kinematic viscosity $\nu$ gives the velocity diffusion coefficient and has in fact the dimensions of a squared length per time.

Analogously, in the heat transport equation, the quantity

\[
\kappa = \frac{k}{\rho c_V} \tag{1.53}
\]

is a squared length per time and gives the diffusion coefficient for $T$. It is therefore referred to as thermal diffusivity [46]. Incidentally, we mention that for general fluids (for which $c_V$ is not constant) in the low subsonic regime [47], $\kappa$ is defined with the specific heat capacity at constant pressure $c_p$, which for a perfect gas amounts to

\[
c_p = \gamma c_V, \qquad c_p - c_V = k_B / m.
\]

The ratio between the velocity diffusion coefficient and the thermal diffusion coefficient, defined using $c_p$, is the Prandtl number,

\[
\mathrm{Pr} = \frac{\mu c_p}{k} = \frac{\gamma}{\gamma - 1} \frac{(k_B/m) \mu}{k}. \tag{1.54}
\]

With the viscosity coefficient $\mu$ and the heat conductivity $k$ obtained with BGK collisions, we have exactly

\[
\mathrm{Pr} = 1,
\]

and this is a well-known problem of the BGK model, since for ordinary fluids and gases the values of the Prandtl number are $\approx 2/3$. Generalizations of the BGK collision operator can be considered in order to obtain more realistic values. For instance, one can consider collisions that relax the distribution function toward an anisotropic Maxwellian [48].
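As a quick arithmetic check of the value $\mathrm{Pr} = 1$, the transport coefficients (1.49) can be evaluated directly. The sketch below is an illustration under stated assumptions: the function name `bgk_transport`, the use of SI units, and the sample values of $n$, $T$, $\nu_c$ and $m$ are hypothetical and chosen only for the purpose of the example.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (SI units, assumed here)

def bgk_transport(n, T, nu_c, m, gamma=5.0 / 3.0):
    """Transport coefficients of the BGK closure, equations (1.49)-(1.54)."""
    p = n * K_B * T                       # pressure (1.32)
    mu = p / nu_c                         # dynamic viscosity (1.49)
    k = 2.5 * (K_B / m) * p / nu_c        # thermal conductivity (1.49)
    c_v = (K_B / m) / (gamma - 1.0)       # specific heat at constant volume
    c_p = gamma * c_v                     # specific heat at constant pressure
    rho = m * n
    return {
        "mu": mu,
        "k": k,
        "nu": mu / rho,                   # kinematic viscosity (1.51)
        "kappa": k / (rho * c_v),         # thermal diffusivity (1.53)
        "Pr": mu * c_p / k,               # Prandtl number (1.54)
    }

# Hypothetical sample values: n in m^-3, T in K, nu_c in s^-1, m in kg.
coeffs = bgk_transport(n=1.0e20, T=1.0e4, nu_c=1.0e9, m=1.6726e-27)
print(coeffs["Pr"], coeffs["kappa"] / coeffs["nu"])
```

The Prandtl number does not depend on the sample values, since $n$, $T$, $\nu_c$ and $m$ cancel in the ratio $\mu c_p / k$. The second printed number is the ratio $\kappa/\nu$ defined with $c_V$, which equals $\gamma = 5/3$ for this closure.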

For plasmas in the presence of a magnetic field, such transport coefficients are much more complicated, even with strong collisions. The seminal paper by Braginskii [38] provides the transport coefficients for a collision-dominated magnetized plasma. In its basic form, however, magnetohydrodynamics is built upon Euler’s equations and thus we shall not need the intricacies of Braginskii’s closure.

For the specific case of plasmas, apart from some notable exceptions, collisions are weak since the collision frequency satisfies [44]

\[
\nu_c \propto n T^{-3/2},
\]

and we have relatively low density and high temperature. Then transport is dominated by turbulence rather than by collisions. In such situations, the current trend is to give up the search for the appropriate closure and focus on the numerical modeling of turbulent transport, leading to multi-scale codes in which the equations of fluid dynamics are coupled to kinetic codes.

1.7 Incompressible flows. The closure relations obtained on the basis of the kinetic equation (1.28), as well as the corresponding thermodynamic relations such as equation (1.30c) between internal energy and pressure, are valid as long as the kinetic model describes correctly the microscopic dynamics of the considered fluid. This is the case for important physical systems that are made of atoms or molecules, such as gases, or of ions and electrons, such as plasmas. Even stellar systems such as globular clusters and galaxies can be described in terms of a kinetic equation and thus of the corresponding fluid quantities [49].

There are situations, however, in which this does not apply and closure relations have to be inferred in other ways.

One such case is that of incompressible flows. For an incompressible flow, cf. equation (1.21), the closure obtained above is not appropriate. In fact, the addition of the incompressibility condition to Euler’s equations (1.41) gives an over-determined system. In order to see this, let us consider an initial condition at the time $t = 0$ with uniform density and pressure. According to Euler’s equations (1.41), for an incompressible flow both density and pressure are constant along Lagrangian trajectories, hence they remain uniform for all $t \ge 0$. On the other hand, the divergence of the momentum equation together with the incompressibility condition gives

\[
\nabla u : \nabla u = \nabla \cdot f, \quad \text{(pressure is constant)},
\]

which cannot be satisfied in general. The problem has its origin in the identification of the internal energy $U$ with pressure (apart from a multiplicative constant), which is a consequence of kinetic theory.

If we still maintain that viscosity vanishes, but accept that pressure and internal energy are not just proportional to each other (as they are in kinetic theory), we see that equation (1.25c) is decoupled from the system. The remaining equations,

\[
\begin{aligned}
\frac{D\rho}{Dt} &= 0, \\
\rho \frac{Du}{Dt} &= -\nabla p + \rho f, \\
\nabla \cdot u &= 0,
\end{aligned} \tag{1.55}
\]

describe an incompressible flow and are referred to as incompressible Euler’s equations. One should notice that the divergence-free condition implicitly determines the pressure $p$, as follows by taking the divergence of the momentum equation. A particular solution of the density transport equation is $\rho = \text{constant}$, but in general, incompressibility does not imply constant density: a non-uniform initial mass density will, in general, evolve in a complicated way even if the flow is incompressible.


Analogously, incompressible Navier-Stokes equations are written as

\[
\begin{aligned}
\frac{D\rho}{Dt} &= 0, \\
\rho \frac{Du}{Dt} &= -\nabla p + \rho f + \rho \nu \Delta u, \\
\nabla \cdot u &= 0,
\end{aligned} \tag{1.56}
\]

where we have used the incompressibility condition in order to simplify the viscosity term,

\[
\nabla \cdot \pi = -\rho \nu \Delta u,
\]

where $\Delta = \sum_i \partial^2 / \partial x_i^2$ is the Laplace operator, and $\nu$ is the kinematic viscosity (1.51).

For incompressible Navier-Stokes equations, the hyperbolic character of Euler’s equations is complicated by the Laplace operator. The effect of viscosity consists in dissipating kinetic energy, which is converted into internal energy and thus temperature according to equation (1.25c). In this case, however, the internal energy is not proportional to the pressure, as the closure relation (1.30c) does not apply. Hence the energy balance equation (1.25c) decouples from the system and one needs to solve it only if one is interested in heat transport in the fluid.

1.8 Equations of state, isentropic flows and vorticity. Concerning compressible Euler’s equations, in many situations the equation for the fluid pressure does not need to be explicitly solved.

In fact, if the mass density is uniformly bounded away from zero, i.e., there is a constant $\rho_0 > 0$ such that $\rho \ge \rho_0$, we can express $\nabla \cdot u$ from the continuity equation,

\[
\nabla \cdot u = - \frac{1}{\rho} \frac{D\rho}{Dt},
\]

and we can substitute it into the pressure equation with the result that

\[
\frac{Dp}{Dt} - \frac{\gamma p}{\rho} \frac{D\rho}{Dt} = 0.
\]

This implies that $p \rho^{-\gamma}$ is constant along the Lagrangian trajectories of the solution of Euler’s equations. In order to see this, we compute

\[
\rho^\gamma \frac{D}{Dt} \big( p \rho^{-\gamma} \big) = \frac{Dp}{Dt} - \frac{\gamma p}{\rho} \frac{D\rho}{Dt} = 0,
\]

and since $\rho \ge \rho_0 > 0$, we have $D(p \rho^{-\gamma})/Dt = 0$. If the initial conditions for equations (1.41) are chosen so that $p \rho^{-\gamma} = C = \text{constant}$ at the initial time, then $p \rho^{-\gamma} = C$ for all time. Under those conditions we can replace the pressure equation with the algebraic relation $p = C \rho^\gamma$.

Other closure relations may be considered in which the pressure is assigned as a function of the other fluid variables, and particularly as a function of the density, $\mathcal{P} : \mathbb{R}_+ \to \mathbb{R}_+$, $p = \mathcal{P}(\rho)$. The relation defining the pressure in terms of the density is referred to as equation of state. For the specific case of Euler’s equations discussed above, we have $\mathcal{P}(\rho) = C \rho^\gamma$, but one can also consider other equations of state. The following is a particularly interesting class thereof.


Definition 1.2 (Isentropic flows). A flow is called isentropic if there exists a function $h : \mathbb{R}_+ \to \mathbb{R}$, to be referred to as enthalpy, such that $\nabla h(\rho) = \nabla p / \rho$.

With reference to the momentum balance in Euler’s equations (1.41), one can notice that $\nabla p / \rho$ plays an important role, as it provides a drive for the fluid velocity. For isentropic flows, this driving force is a gradient and the momentum balance equation reduces to

\[
\partial_t u + u \cdot \nabla u = -\nabla h + f. \tag{1.57}
\]

As a consequence, the vorticity $\omega = \nabla \times u$ of an isentropic flow satisfies

\[
\partial_t \omega - \nabla \times (u \times \omega) = \nabla \times f. \tag{1.58}
\]

This is readily derived by making use of the vector calculus identity

\[
u \cdot \nabla u - \nabla u \cdot u = (\nabla \times u) \times u.
\]

The second term on the left-hand side is an exact gradient, $\nabla u \cdot u = \nabla (u^2/2)$, hence the momentum equation (1.57) can be rewritten as

\[
\partial_t u - u \times (\nabla \times u) = -\nabla (h + u^2/2) + f.
\]

Then one can compute (formally)

\[
\partial_t \omega = \nabla \times (\partial_t u) = \nabla \times (u \times \omega) + \nabla \times f,
\]

as the curl of a gradient vanishes identically, and this is equation (1.58). One should notice that $\nabla \times f$ is the only drive of vorticity, i.e., of rotation of the fluid. A potential force cannot create vortices.

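The equation of state $\mathcal{P}(\rho) = C \rho^\gamma$ obtained for Euler’s flows is isentropic. In fact we compute

\[
\nabla p / \rho = \gamma C \rho^{\gamma - 2} \nabla \rho.
\]

On the other hand,

\[
\gamma C \rho^{\gamma - 2} = \frac{d}{d\rho} \Big( \frac{\gamma}{\gamma - 1} C \rho^{\gamma - 1} \Big),
\]

hence we have $\nabla p / \rho = \nabla h(\rho)$, where

\[
h(\rho) = \frac{\gamma}{\gamma - 1} C \rho^{\gamma - 1},
\]

is the enthalpy associated to Euler’s flows. Therefore, for any solution of Euler’s equations satisfying the equation of state at the initial time, one has that the vorticity solves equation (1.58).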
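The relation between the enthalpy and the equation of state can be verified numerically. Since $\nabla h(\rho) = h'(\rho) \nabla \rho$ and $\nabla p / \rho = \mathcal{P}'(\rho) \nabla \rho / \rho$, it suffices to check that $h'(\rho) = \mathcal{P}'(\rho)/\rho$. The following sketch does so with central finite differences; the constant $C$ and the sample densities are arbitrary illustrative values.

```python
def pressure(rho, C, gamma):
    """Equation of state P(rho) = C rho^gamma."""
    return C * rho**gamma

def enthalpy(rho, C, gamma):
    """Enthalpy h(rho) = gamma / (gamma - 1) * C * rho^(gamma - 1)."""
    return gamma / (gamma - 1.0) * C * rho ** (gamma - 1.0)

def derivative(func, x, h=1e-6):
    """Central finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

# Hypothetical constants: C is arbitrary, gamma is the value (1.42).
C, gamma = 2.0, 5.0 / 3.0
residuals = []
for rho in (0.5, 1.0, 3.0):
    dh = derivative(lambda r: enthalpy(r, C, gamma), rho)
    dp = derivative(lambda r: pressure(r, C, gamma), rho)
    residuals.append(abs(dh - dp / rho))
print(residuals)
```

The residuals should be at the level of the finite-difference error, which supports the identity $\nabla h(\rho) = \nabla p/\rho$ for this equation of state.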

1.9 Effects of Euler-type nonlinearities. By virtue of the Reynolds transport theorem, the advective derivative $D/Dt$ along the flow plays a central role in the equations of fluid dynamics. In the momentum balance equation, the velocity $u$ is advected by its own flow, producing the typical Euler nonlinearity $Du/Dt = \partial_t u + u \cdot \nabla u$.


In order to understand the behavior of this operator, let us briefly study a prototypical case of Euler nonlinearity in one spatial dimension. Specifically, we study the Cauchy problem

\[
\partial_t u + u \partial_x u = 0, \qquad u(0, x) = u_0(x), \tag{1.59}
\]

for $u : \mathbb{R}_{\ge 0} \times \mathbb{R} \to \mathbb{R}$, with smooth initial data $u_0 \in C^1(\mathbb{R})$. Here $\mathbb{R}_{\ge 0} = \{x \in \mathbb{R} \mid x \ge 0\}$ denotes the set of non-negative real numbers. This equation is referred to as Hopf equation or inviscid Burgers equation and it is discussed in detail in Hörmander’s lectures [33], which we follow closely here. It is a nonlinear first-order partial differential equation which can be dealt with by means of the characteristics method sketched in section 1.4.

Proposition 1.16. For any initial condition $u_0 \in C^1(\mathbb{R})$ with $-u_0'(x) \le b$, $b \ge 0$, there exists a unique classical solution $u \in C^1\big( [0, T) \times \mathbb{R} \big)$, $1/T = b$, of the initial value problem (1.59).

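Proof. If $u \in C^1$ is a solution, then along Lagrangian trajectories, defined by

\[
\frac{dx(t, y)}{dt} = u\big( t, x(t, y) \big), \qquad x(0, y) = y,
\]

we have

\[
\frac{d}{dt} \Big[ u\big( t, x(t, y) \big) \Big] = (\partial_t u + u \partial_x u)\big( t, x(t, y) \big) = 0,
\]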

and $u(0, x(0, y)) = u_0(y)$. We can solve the two coupled ordinary differential equations analytically, with the result that

\[
u\big( t, x(t, y) \big) = u_0(y), \qquad x(t, y) = y + u_0(y) t. \tag{1.60}
\]

The flow map $y \mapsto x(t, y)$ is invertible as long as $\partial x / \partial y \neq 0$, which is the case if

\[
1 + u_0'(y) t \ge 1 - b t > 0,
\]

that is, for $t < T$ and $1/T = b$. We construct a field $u : [0, T) \times \mathbb{R} \to \mathbb{R}$ by

\[
u(t, x) = u_0\big( y(t, x) \big),
\]

where $y(t, x)$ is the solution of $x = y + u_0(y) t$ for fixed $(t, x)$. By the inverse function theorem [32] the function $y$ is in $C^1$ and so is $u$. Since $u$ is constant along the Lagrangian trajectories, it is a solution of the Cauchy problem. Uniqueness follows from the uniqueness of the solution of the associated characteristics equations.

If the initial condition is bounded, then $\sup |u(t, \cdot)| = \sup |u_0|$, which follows from the form of the solution (1.60). If $T$ is finite, however, for $t \to T$ the spatial derivative blows up since $\partial x / \partial y$ approaches zero and

\[
\partial_x u\big( t, x(t, y) \big) = \Big[ \frac{\partial x(t, y)}{\partial y} \Big]^{-1} u_0'(y),
\]

by the inverse function theorem. The classical solution breaks with a singularity in the gradient.
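The construction in the proof translates directly into a small computation. The sketch below is an illustration, not the method used in the notes: it takes the Gaussian profile quoted in the caption of Figure 1.6 as initial condition, estimates the breakdown time $T = 1/b$ by sampling $-u_0'$, and inverts the flow map $x = y + u_0(y) t$ by bisection (a choice made here for simplicity, valid for $0 \le t < T$ where the map is increasing). The function names and the sampling interval are assumptions.

```python
import math

A = 0.2  # width of the Gaussian profile, as in the caption of Figure 1.6

def u0(y):
    """Initial condition u0(y) = exp(-(y - pi)^2 / a^2)."""
    return math.exp(-((y - math.pi) ** 2) / A**2)

def du0(y):
    """Derivative of the initial condition."""
    return -2.0 * (y - math.pi) / A**2 * u0(y)

def breakdown_time(n_samples=20001, y_min=0.0, y_max=2.0 * math.pi):
    """T = 1/b with b = max(-u0'), estimated by sampling on [y_min, y_max]."""
    step = (y_max - y_min) / (n_samples - 1)
    b = max(-du0(y_min + j * step) for j in range(n_samples))
    return 1.0 / b

def solve_characteristic(t, x, tol=1e-13):
    """Bisection for the root y of y + u0(y) t = x, for 0 <= t < T."""
    lo, hi = x - t, x  # since 0 <= u0 <= 1, the root lies in [x - t, x]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid + u0(mid) * t < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def u(t, x):
    """Classical solution u(t, x) = u0(y(t, x)) of the Hopf equation (1.59)."""
    return u0(solve_characteristic(t, x))

T = breakdown_time()
print(T, u(0.5 * T, 3.4))
```

For this profile the maximum of $-u_0'$ can be computed by hand, which gives $T = a \sqrt{e/2} \approx 0.233$ for $a = 0.2$; the sampled estimate should agree with this value to several digits.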


By inspection of the solution for the characteristics, one can see the effect of the Euler nonlinearity: each point $y$ on the real line moves with a constant speed given by the value of the initial condition $u_0(y)$ at the initial point. Hence, if the initial condition is decreasing (negative derivative), the trajectory starting from $y_1 < y_2$ has velocity $u_0(y_1) > u_0(y_2)$ and it must overtake the trajectory issued from $y_2$ at some time in the future. At this time both points $y_1$, $y_2$ are mapped into the same point $x$ by the flow, which means that the flow is no longer invertible and the classical solution breaks. It is still possible to define weak solutions that capture the shock due to the crossing of trajectories [33].

A regularized version of the Hopf equation (1.59) is the Burgers equation

\[
\partial_t u + u \partial_x u = \nu \partial_x^2 u, \qquad u(0, x) = u_0(x), \tag{1.61}
\]

in which a finite kinematic viscosity $\nu > 0$ is added.

For $\nu$ sufficiently small (large Reynolds numbers), the behavior of the Burgers equation is dominated by the nonlinear advection up to a time near the breakdown time $T$. There, characteristics are getting closer to each other, thus amplifying derivatives of $u$ so that the term $\nu \partial_x^2 u$ becomes large (even for a small viscosity). The effect of this term is to balance the formation of a singularity in the gradient.

Figure 1.6 shows the numerical solution of the Cauchy problem for the Burgers equation (1.61) with a Gaussian initial condition. One can see that the initial condition is distorted so that the part of the profile where $u_0'(x) < 0$ is steepened by the Euler nonlinearity, increasing the slope up to the point where the viscosity balances the nonlinear advection and a dissipative shock is formed.


Figure 1.6: Solution of the Cauchy problem (1.61) for the Burgers equation corresponding to the initial condition $u_0(x) = \exp(-(x - \pi)^2 / a^2)$ with $a = 0.2$ and $\nu = 0.002$. The profiles of the solution $u(t, \cdot)$ at various points in time are shown on the top panel. One can see the effect of the nonlinear advection steepening the profiles where the spatial derivative is negative. The full solution is shown on the bottom-left panel as a function of $(t, x)$. A detail of the contours of the solution is shown on the bottom-right panel. The contours are almost straight lines, thus following the characteristics of the inviscid equation, up to the dissipative shock where viscosity starts to play a role.
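A qualitative version of this experiment can be reproduced with a few lines of code. The sketch below is not the scheme used to produce Figure 1.6, which is not specified in the text. It assumes a periodic domain $[0, 2\pi)$, a conservative finite-volume discretization with the Godunov flux for the advection term, central differences for the viscous term, and explicit Euler time stepping. On this coarse grid the numerical diffusion of the upwind flux is comparable to the physical viscosity, so the result should be read as an illustration of the steepening only.

```python
import math

def godunov_flux(ul, ur):
    """Godunov numerical flux for the flux function u^2 / 2."""
    return max(max(ul, 0.0) ** 2, min(ur, 0.0) ** 2) / 2.0

def burgers_step(u, dx, dt, nu):
    """One explicit, conservative step with periodic boundary conditions."""
    n = len(u)
    # Total flux (advective minus viscous) through the interface between
    # cells i - 1 and i; the index -1 wraps around periodically.
    flux = [godunov_flux(u[i - 1], u[i]) - nu * (u[i] - u[i - 1]) / dx
            for i in range(n)]
    return [u[i] - dt / dx * (flux[(i + 1) % n] - flux[i]) for i in range(n)]

def min_slope(u, dx):
    """Most negative forward-difference slope of the periodic profile."""
    n = len(u)
    return min((u[(i + 1) % n] - u[i]) / dx for i in range(n))

# Parameters a and nu from the caption of Figure 1.6; the grid, the time
# step and the final time are assumptions made for this sketch.
n_x, length = 400, 2.0 * math.pi
dx = length / n_x
a, nu, dt, t_end = 0.2, 0.002, 0.005, 1.0
x = [i * dx for i in range(n_x)]
u_init = [math.exp(-((xi - math.pi) ** 2) / a**2) for xi in x]

u = list(u_init)
for _ in range(round(t_end / dt)):
    u = burgers_step(u, dx, dt, nu)

print(sum(u_init) * dx, sum(u) * dx, min_slope(u_init, dx), min_slope(u, dx))
```

The chosen time step satisfies the stability restriction of the explicit scheme for $0 \le u \le 1$. The first two printed numbers show that the integral of $u$ is conserved by the conservative discretization, while the last two show the steepening of the profile where $u_0'(x) < 0$.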


2 Basic elements of classical electrodynamics

This section gives a short summary of classical electrodynamics. We shall split this into two parts. The first part is a quick introduction to Maxwell’s equations, which allow us to compute electromagnetic fields given their sources, namely electric charges and currents. In the second part, we consider the Lorentz force and the motion of an electrically charged particle under the influence of a given electromagnetic field. A standard reference on the foundations of classical electrodynamics is the book by Jackson [50]. In a final addendum, two basic mathematical results are briefly presented, namely, the Poisson equation (repeatedly used in the main discussion) and the Cauchy problem for Maxwell’s equations on the whole space $\mathbb{R}^3$.

2.1 Maxwell’s equations. Maxwell’s equations are differential equations describing the electromagnetic field $(E,B)$ where

\[
E : I \times \Omega \to \mathbb{R}^3, \quad \text{and} \quad B : I \times \Omega \to \mathbb{R}^3,
\]

are the electric field and the magnetic field, respectively, defined for time $t \in I$ and position $x$ in a domain $\Omega \subseteq \mathbb{R}^3$.

As we shall see, it is appropriate (both physically and mathematically) to treat $E$ and $B$ as elements of the same object, the electromagnetic field $(E,B)$. For this reason, throughout this note, Gauss (c.g.s.) units are used, so that both $E$ and $B$ are measured with the same dimensions, a property which is broken in the international system (SI). Conversion formulas between Gauss and SI units and related definitions can be found, e.g., in Jackson’s book [50] or in the NRL plasma formulary [44]. The unit system in electrodynamics has to be precisely specified as the constants in Maxwell’s equations (unfortunately) depend on it.

Historically, however, electrodynamics developed from the observation of static (i.e., time-independent) phenomena, for which electric and magnetic fields are decoupled and were therefore studied separately as two different physical objects. Electrostatics and magnetostatics, respectively, developed as two separate physical theories.

Later Faraday discovered experimentally that a time-dependent magnetic field induces an electric field, thus establishing a connection between those two separate worlds.

It was the mathematical physicist James Clerk Maxwell who formulated a consistent system of equations, namely, Maxwell’s equations, that contain electrostatics, magnetostatics and Faraday’s induction law as special cases. In addition, Maxwell’s theory predicted electromagnetic waves that were later confirmed experimentally.

For our purposes, it is important to develop a certain familiarity with all four main aspects of Maxwell’s equations: electrostatics, magnetostatics, Faraday’s induction (which is a key element of MHD), and the full system of Maxwell’s equations.

Let us begin with the definition of the sources of electromagnetic fields, namely, electric charge densities and electric current densities.

Sources. In addition to mass, elementary particles can carry an electric charge which can be either positive or negative (differently from mass which has one sign only). Analogously to the continuum hypothesis of fluid dynamics, cf. section 1.1, at a macroscopic level we can model charges as a continuum with an associated electric charge density.

By definition, the electric charge density is a function $\rho_c : I \times \Omega \to \mathbb{R}$ such that, for any control volume $W \subseteq \Omega$,

\[
(\text{electric charge in } W \text{ at time } t) = \int_W \rho_c(t,x)\,dx.
\]

This is the analogue of the mass density introduced in section 1.1, but with the important difference that $\rho_c$ takes values in the whole real line $\mathbb{R}$ as electric charges can be either positive or negative.

It is a basic principle of physics that electric charge can be neither created nor destroyed. Analogously to the mass continuity equation (1.25a), we have

\[
\partial_t \rho_c + \nabla\cdot J = 0, \tag{2.1}
\]

which is referred to as the charge continuity equation. The flux of electric charge $J : I \times \Omega \to \mathbb{R}^3$ is by definition the electric current density.

Physically an electric current $I(t)$ through a certain surface $\Sigma$, for instance the cross-section of a conductor, is defined as the rate at which charge is flowing through it. If $W$ is a control volume and $\Sigma = \partial W$ is its boundary, we have

\[
I(t) = \int_{\partial W} J\cdot n\, dS = -\frac{dQ(t)}{dt},
\]

where $I(t)$ is the electric current through $\partial W$ and $Q(t)$ is the charge in $W$ at time $t$. For common conductors, like an electric wire in a circuit, $dQ/dt \approx 0$ and, by the Gauss theorem, the current density $J$ is to a good approximation divergence-free.

Electrostatics. Let us consider a time-independent charge density $\rho_c : \Omega \to \mathbb{R}$ defined on a bounded domain $\Omega \subset \mathbb{R}^3$.

The static charge density $\rho_c$ generates a static electric field $E : \Omega \to \mathbb{R}^3$, according to the Gauss law,

\[
\nabla\cdot E = 4\pi\rho_c, \tag{2.2a}
\]

subject to the constraint that $E$ must be irrotational, namely,

\[
\nabla\times E = 0. \tag{2.2b}
\]

Equipped with proper boundary conditions, equations (2.2) define electrostatics.

If the domain $\Omega$ is simply connected, then condition (2.2b) is equivalent to $E$ being a potential field, i.e., there exists a scalar potential $\varphi : \Omega \to \mathbb{R}$ such that

\[
E = -\nabla\varphi,
\]

and Gauss law takes the form of a Poisson equation for $\varphi$, namely,

\[
-\Delta\varphi = 4\pi\rho_c, \tag{2.3}
\]

where $\Delta$ denotes the Laplace operator, defined by $\Delta\varphi = \nabla\cdot\nabla\varphi$. The potential $\varphi$ is referred to as the electrostatic (or electric) potential. Mathematical results on the Poisson equation with simple boundary conditions are collected in section 2.3.

It is worth noting that the electrostatic potential is defined only up to an additive constant. This means that the value of $\varphi$ at a point does not have a physical meaning; only differences of potential between two points are physical. More generally, physical quantities should never depend on the arbitrary offset of the potential. We therefore have some degree of freedom in choosing the boundary conditions for equation (2.3).

Magnetostatics. Let us now consider a static current density $J : \Omega \to \mathbb{R}^3$. Such a current density generates a static magnetic field $B : \Omega \to \mathbb{R}^3$ according to Ampere law

\[
\nabla\times B = \frac{4\pi}{c} J, \tag{2.4a}
\]

where $c$ is a constant determined by the unit system and in Gauss (c.g.s.) units has the dimensions of a velocity, while $B$ is subject to the constraint

\[
\nabla\cdot B = 0. \tag{2.4b}
\]

Ampere law implies, as a necessary condition for the existence of a solution, $\nabla\cdot J = 0$, since $\nabla\cdot\nabla\times B = 0$, and by charge conservation (2.1) one must have no change in time of the electric charge distribution, $\partial_t\rho_c = 0$.

Under suitable hypotheses on the domain $\Omega$ and with boundary conditions, equations (2.4) determine uniquely the magnetic field $B$ and define magnetostatics.

Again the task of computing the solution $B$ can be reduced to solving a Poisson equation. With this aim let us recall that if the domain $\Omega$ is simply connected, equation (2.4b) is equivalent to the existence of a vector potential $A : \Omega \to \mathbb{R}^3$ such that

\[
B = \nabla\times A. \tag{2.5}
\]

This is called the magnetic vector potential. It should not be confused with the magnetic scalar potential which is used for the homogeneous magnetostatic problem, i.e., when $J = 0$, so that $\nabla\cdot B = \nabla\times B = 0$ and we can set $B = -\nabla\Phi$ with the magnetic scalar potential $\Phi$ satisfying the Laplace equation $\Delta\Phi = 0$. This technique is used for potential field extrapolation in the solar corona [15].

As for both electrostatic and magnetic scalar potentials, the vector potential is not uniquely determined. In fact, one can add to $A$ the gradient of a function $f$ without changing the value of $B$, since $\nabla\times\nabla f = 0$. It is important therefore to make sure that any physical quantity defined in terms of $A$ does not depend on the choice of the arbitrary function $f$.

A specific choice or constraint that removes this arbitrariness in the definition of the potential is referred to as a gauge.

A particular choice of the gauge, to be referred to as the Coulomb gauge, corresponds to the requirement that the vector potential is divergence-free,

\[
\nabla\cdot A = 0. \tag{2.6}
\]

If the domain is bounded and regular enough, it is always possible to guarantee the existence of this gauge. In fact, if we find a vector potential $\tilde{A}$ that satisfies (2.5) but not (2.6), then we can redefine $A = \tilde{A} + \nabla f$ with a gauge function $f$ chosen so that

\[
\nabla\cdot A = \nabla\cdot\tilde{A} + \Delta f = 0,
\]

which is again a Poisson equation for $f$ with $-\nabla\cdot\tilde{A}$ as a source term. Equipped with suitable boundary conditions, this equation has a unique solution and the resulting potential $A$ satisfies the Coulomb gauge condition.

Ampere’s law (2.4a) in terms of the vector potential reads

\[
\nabla\times(\nabla\times A) = \frac{4\pi}{c} J,
\]

and the vector-calculus identity

\[
\nabla\times(\nabla\times A) = \nabla(\nabla\cdot A) - \Delta A,
\]

together with the Coulomb gauge (2.6) implies that Ampere’s law is equivalent to a “vector Poisson equation”, namely,

\[
-\Delta A = \frac{4\pi}{c} J,
\]

where the Laplace operator $\Delta$ is applied to $A$ component-wise. This is a system of three decoupled Poisson equations for the three components of the vector potential. The boundary conditions for such decoupled Poisson equations must be chosen so that the Coulomb gauge (2.6) is satisfied. We notice that, if $A$ is a sufficiently regular solution,

\[
-\Delta\,\nabla\cdot A = \frac{4\pi}{c}\nabla\cdot J = 0,
\]

i.e., $\nabla\cdot A$ satisfies the Laplace equation in $\Omega$. If we choose the boundary conditions such that, e.g., $\nabla\cdot A = 0$ on the boundary, then the Laplace equation has a unique solution and that is $\nabla\cdot A = 0$, i.e., the solution will automatically satisfy the Coulomb gauge condition. A mathematically precise argument is given in section 2.3.

A particularly simple case is that of periodic boundary conditions, that is, when $\Omega$ is a cube identified with the 3-torus $\mathbb{T}^3$. Then both the source $J$ and the vector potential $A$ can be expanded in Fourier series [75],

\[
A(x) = \sum_{n\in\mathbb{Z}^3} A_n e^{i n\cdot x}, \qquad J(x) = \sum_{n\in\mathbb{Z}^3} J_n e^{i n\cdot x},
\]

where we have considered $\Omega = (0,2\pi)^3$ for simplicity and $i = \sqrt{-1}$. The Poisson equation for $A$ reduces to an algebraic condition on the Fourier coefficients $A_n$, namely,

\[
n^2 A_n = \frac{4\pi}{c} J_n,
\]

while the condition $\nabla\cdot J = 0$ becomes $n\cdot J_n = 0$. We see that all coefficients $A_n$ with $|n| \neq 0$ are uniquely determined, and we have $n\cdot A_n = 0$ which is the Coulomb gauge $\nabla\cdot A = 0$. For $n = 0$ we find the necessary condition $J_0 = 0$ which must be satisfied for the existence of a solution. If that is the case, then the coefficient of the zero frequency, $A_0$, is undetermined, but that corresponds to a constant offset of $A$ and can be set to zero (zero-average condition).
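The algebraic condition on the Fourier coefficients can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the function name `vector_potential_coeff`, the normalization $c = 1$ and the sample mode are hypothetical choices made for illustration, and a single real-valued coefficient is used for simplicity.

```python
import math

C_LIGHT = 1.0  # assumption: units normalized so that c = 1

def vector_potential_coeff(n, J_n, c=C_LIGHT):
    """Return A_n solving |n|^2 A_n = (4 pi / c) J_n for one Fourier mode n."""
    n2 = sum(k * k for k in n)
    if n2 == 0:
        # Zero mode: a solution exists only if J_0 = 0; A_0 is then set
        # to zero (zero-average condition).
        if any(abs(j) > 0.0 for j in J_n):
            raise ValueError("J_0 must vanish for a solution to exist")
        return (0.0, 0.0, 0.0)
    if abs(sum(k * j for k, j in zip(n, J_n))) > 1e-12:
        raise ValueError("n . J_n must vanish (divergence-free current)")
    return tuple(4.0 * math.pi * j / (c * n2) for j in J_n)

n = (1, 2, 0)              # sample mode (hypothetical)
J_n = (2.0, -1.0, 3.0)     # satisfies n . J_n = 2 - 2 + 0 = 0
A_n = vector_potential_coeff(n, J_n)

# The Coulomb gauge n . A_n = 0 is inherited from n . J_n = 0.
coulomb_residual = sum(k * a for k, a in zip(n, A_n))
```

The check on `n . J_n` mirrors the necessary condition $\nabla\cdot J = 0$; dropping it would still produce coefficients, but they would no longer satisfy the Coulomb gauge.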


Faraday’s induction. Let us now turn our attention to time-dependent electric and magnetic fields, $E, B : I \times \Omega \to \mathbb{R}^3$. Essentially Faraday’s induction law is expressed mathematically as a relation between the time-derivative of the magnetic field and the curl of the electric field, namely,

\[
\nabla\times E = -\frac{1}{c}\frac{\partial B}{\partial t}. \tag{2.7}
\]

This can be seen as a generalization of constraint (2.2b) and replaces it when time-dependent fields are considered.

One remarkable consequence is that for time-dependent fields the electric field $E$ is no longer a potential field, since Faraday’s law has replaced (2.2b). However, the magnetic field still satisfies the divergence-free condition (2.4b) and thus, in simply connected domains, we can find $A : I \times \Omega \to \mathbb{R}^3$ such that (2.5) holds point-wise in time.

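From Faraday’s law, it follows that

\[
\nabla\times\Big(E + \frac{1}{c}\frac{\partial A}{\partial t}\Big) = 0,
\]

which means that the vector field $E + c^{-1}\partial_t A$ is irrotational and thus, if the domain $\Omega$ is simply connected, there exists a time-dependent scalar potential $\varphi : I \times \Omega \to \mathbb{R}$ such that

\[
\begin{aligned}
E &= -\nabla\varphi - \frac{1}{c}\frac{\partial A}{\partial t}, \\
B &= \nabla\times A.
\end{aligned}
\tag{2.8}
\]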

In view of the relation between $A$ and the electric field, the gauge transformation $A \mapsto A' = A + \nabla f$ introduced above has to be dealt with more carefully. If we change the gauge for $A$ we need to change the potential $\varphi$ as well in order for the electric field $E$ to be invariant.

This leads us to the full gauge transformation

\[
\begin{aligned}
\varphi &\mapsto \varphi' = \varphi - \frac{1}{c}\frac{\partial f}{\partial t}, \\
A &\mapsto A' = A + \nabla f.
\end{aligned}
\tag{2.9}
\]

Then both $E$ and $B$ are invariant with respect to this transformation. As before we fix the gauge to be the Coulomb gauge (2.6), for which we have

\[
\nabla\cdot E = -\Delta\varphi - \frac{1}{c}\frac{\partial}{\partial t}\nabla\cdot A = -\Delta\varphi.
\]

Hence, in the Coulomb gauge, Poisson equation (2.3) holds unchanged for the full time-dependent potential $\varphi(t,x)$.
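The invariance of $E$ and $B$ under the gauge transformation (2.9) can be checked directly. The short computation below is a sketch that assumes the gauge function $f$ is regular enough for $\nabla$ and $\partial_t$ to commute:

```latex
\begin{aligned}
E' &= -\nabla\varphi' - \frac{1}{c}\frac{\partial A'}{\partial t}
    = -\nabla\varphi + \frac{1}{c}\nabla\frac{\partial f}{\partial t}
      - \frac{1}{c}\frac{\partial A}{\partial t}
      - \frac{1}{c}\frac{\partial}{\partial t}\nabla f
    = -\nabla\varphi - \frac{1}{c}\frac{\partial A}{\partial t} = E, \\
B' &= \nabla\times A' = \nabla\times A + \nabla\times\nabla f = \nabla\times A = B.
\end{aligned}
```

The two terms containing $f$ in $E'$ cancel exactly, which is why the potential $\varphi$ must be shifted by $-c^{-1}\partial_t f$ whenever $A$ is shifted by $\nabla f$.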

Maxwell’s equations. The equations from electrostatics, magnetostatics, and the generalization of condition (2.2b) due to Faraday’s induction amount to the system

\[
\begin{aligned}
\nabla\cdot E &= 4\pi\rho_c, \\
\nabla\times E &= -\frac{1}{c}\frac{\partial B}{\partial t}, \\
\nabla\times B &= \frac{4\pi}{c} J, \\
\nabla\cdot B &= 0,
\end{aligned}
\qquad \text{(not consistent!)}
\]

but this is not consistent with the principle of charge conservation. The problem is due to Ampere’s law which implies

\[
\nabla\cdot J = 0,
\]

thus violating the charge continuity equation (2.1) when time-dependent fields and sources are considered. Ampere’s law was in fact justified for the case of static fields.

Maxwell saw that a simple way to fix the problem is adding a term proportional to the time derivative of the electric field to Ampere’s equation in a symmetric way with respect to the other “curl equation”, namely Faraday’s law. This leads us to the Ampere-Maxwell law,

\[
\nabla\times B = \frac{4\pi}{c} J + \frac{1}{c}\frac{\partial E}{\partial t},
\]

and the added term was interpreted as an effective current density

\[
J_{\mathrm{disp}} = \frac{1}{4\pi}\frac{\partial E}{\partial t},
\]

which is referred to as the displacement current. Now the divergence of the Ampere-Maxwell law gives

\[
0 = \frac{4\pi}{c}\nabla\cdot J + \frac{1}{c}\frac{\partial}{\partial t}\nabla\cdot E,
\]

which, upon accounting for Gauss law, becomes

\[
0 = \frac{4\pi}{c}\nabla\cdot J + \frac{4\pi}{c}\frac{\partial\rho_c}{\partial t},
\]

consistently with the general charge continuity equation (2.1).

With the modified Ampere-Maxwell law, the full system reads

\[
\begin{aligned}
\nabla\cdot E &= 4\pi\rho_c, \\
\nabla\times E &= -\frac{1}{c}\frac{\partial B}{\partial t}, \\
\nabla\times B &= \frac{4\pi}{c} J + \frac{1}{c}\frac{\partial E}{\partial t}, \\
\nabla\cdot B &= 0,
\end{aligned}
\tag{2.10}
\]

and those are Maxwell’s equations of classical electrodynamics. It is important to observe that:

1. The charge continuity equation is a necessary condition for the existence of solutions, since it follows from Gauss and Ampere-Maxwell laws.

2. With time-independent fields and sources, Maxwell’s equations for $E$ and for $B$ decouple, recovering electrostatics and magnetostatics, respectively.

3. Faraday’s induction is included as one of Maxwell’s equations.

4. The two equations for the divergence of the fields are constraints on the initial conditions, while Faraday and Ampere-Maxwell equations define the dynamics.


The last point explicitly means that, if $E_0$ and $B_0$ are initial data at the time $t = t_0$ (here $t_0 = 0$) satisfying the two divergence equations, i.e.,

\[
\nabla\cdot E_0(x) = 4\pi\rho_c(0,x), \qquad \nabla\cdot B_0(x) = 0,
\]

then the solution of the Cauchy problem

\[
\begin{cases}
\partial_t E - c\nabla\times B = -4\pi J, \\
\partial_t B + c\nabla\times E = 0, \\
E(0,x) = E_0(x), \quad B(0,x) = B_0(x),
\end{cases}
\]

automatically satisfies the divergence equations at later times $t \geq t_0$. Indeed, if $(E,B)$ is the solution of the above problem, we have from Faraday’s equation

\[
\partial_t\nabla\cdot B = 0, \qquad \nabla\cdot B(0,x) = \nabla\cdot B_0(x) = 0,
\]

which implies $\nabla\cdot B(t,x) = 0$ for all $t \geq t_0$. Analogously, the divergence of the Ampere-Maxwell equation gives

\[
\begin{aligned}
&\partial_t\nabla\cdot E = -4\pi\nabla\cdot J = \partial_t(4\pi\rho_c), \\
&\nabla\cdot E(0,x) - 4\pi\rho_c(0,x) = \nabla\cdot E_0(x) - 4\pi\rho_c(0,x) = 0,
\end{aligned}
\]

where the charge continuity equation has been accounted for, and thus Gauss law $\nabla\cdot E(t,x) - 4\pi\rho_c(t,x) = 0$ is satisfied at all later times $t \geq t_0$.

A physical proof of the correctness of Maxwell’s equations came with the discovery of electromagnetic waves. In fact a direct implication of Maxwell’s equations is the existence of solutions for the electromagnetic field in which a perturbation of $E$ and $B$ propagates like a wave. The standard way to see that is via another gauge, known as the Lorenz gauge¹, which is defined by the condition

\[
\nabla\cdot A + \frac{1}{c}\frac{\partial\varphi}{\partial t} = 0. \tag{2.11}
\]

Under non-restrictive hypotheses, it is possible to satisfy this condition by choosing an appropriate gauge transformation. Let us assume that we have a pair of potentials $(\tilde\varphi, \tilde A)$ which do not satisfy the Lorenz condition. Then we apply a gauge transformation (2.9) and require that the new potentials $\varphi = \tilde\varphi - c^{-1}\partial_t f$ and $A = \tilde A + \nabla f$ satisfy (2.11), that is,

\[
0 = \nabla\cdot A + \frac{1}{c}\frac{\partial\varphi}{\partial t} = \nabla\cdot\tilde A + \frac{1}{c}\frac{\partial\tilde\varphi}{\partial t} + \Delta f - \frac{1}{c^2}\frac{\partial^2 f}{\partial t^2},
\]

which is equivalent to

\[
\Box f = \nabla\cdot\tilde A + \frac{1}{c}\frac{\partial\tilde\varphi}{\partial t},
\]

where

\[
\Box = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \Delta,
\]

denotes the D’Alembert operator (which is the Laplacian with a Lorentz metric). We recognize the D’Alembert wave equation with sources which we can solve for the function $f$, thus forcing the Lorenz gauge.

¹ Notice that this is not a spelling error: the gauge is named after Ludvig Lorenz, not Hendrik Lorentz [51].

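In terms of potentials, Gauss law becomes

\[
\nabla\cdot\Big(-\nabla\varphi - \frac{1}{c}\frac{\partial A}{\partial t}\Big) = 4\pi\rho_c
\]

and substituting $\nabla\cdot A$ from the Lorenz gauge condition (2.11) we have

\[
\frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} - \Delta\varphi = 4\pi\rho_c.
\]

On the other hand, the Ampere-Maxwell law in terms of potentials reads

\[
\nabla\times(\nabla\times A) = \frac{4\pi}{c} J + \frac{1}{c}\frac{\partial}{\partial t}\Big(-\nabla\varphi - \frac{1}{c}\frac{\partial A}{\partial t}\Big),
\]

and using again the identity $\nabla\times(\nabla\times A) = \nabla(\nabla\cdot A) - \Delta A$, we have

\[
\frac{1}{c^2}\frac{\partial^2 A}{\partial t^2} - \Delta A + \nabla\Big[\nabla\cdot A + \frac{1}{c}\frac{\partial\varphi}{\partial t}\Big] = \frac{4\pi}{c} J,
\]

and the term in square brackets is zero in the Lorenz gauge. At last we have obtained two decoupled wave equations for the potentials,

\[
\begin{aligned}
\Box\varphi &= 4\pi\rho_c, \\
\Box A &= \frac{4\pi}{c} J,
\end{aligned}
\qquad \text{(in the Lorenz gauge).} \tag{2.12}
\]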

This proves (formally at least) the existence of propagating wave solutions to Maxwell’s equations.

In addition we can give a physical meaning to the constant $c$ introduced after equation (2.4a). We see from the definition of the D’Alembert operator $\Box$ that $c$ is the propagation speed of the electromagnetic waves, i.e., $c$ is the speed of light.

It is worth noting that using potentials is not essential (although that is the standard way). The same conclusion could be deduced directly from Maxwell’s equations for the electric and magnetic field. In plasma physics it is common to work with the electric field. We differentiate the Ampere-Maxwell law in time and divide by $c$,

\[
\nabla\times\Big(\frac{1}{c}\frac{\partial B}{\partial t}\Big) = \frac{4\pi}{c^2}\frac{\partial J}{\partial t} + \frac{1}{c^2}\frac{\partial^2 E}{\partial t^2}.
\]

On substituting the derivative of the magnetic field from Faraday’s law, one finds a decoupled equation for $E$, namely,

\[
\frac{1}{c^2}\frac{\partial^2 E}{\partial t^2} + \nabla\times(\nabla\times E) = -\frac{4\pi}{c^2}\frac{\partial J}{\partial t}. \tag{2.13}
\]

It is less obvious that equation (2.13) supports propagating wave solutions, and yet the left-hand side can be written as

\[
\frac{1}{c^2}\frac{\partial^2 E}{\partial t^2} - \Delta E + \nabla(\nabla\cdot E) = \Box E + 4\pi\nabla\rho_c,
\]

where we have accounted for Gauss law in the last equality. Hence the equation for $E$ is equivalent to

\[
\Box E = -4\pi\nabla\rho_c - \frac{4\pi}{c^2}\frac{\partial J}{\partial t},
\]

which is a wave equation with sources. The same equation for $E$ could also be obtained by applying the D’Alembert operator to equation (2.8) and by using the wave equations for the potentials. From this argument however, we can understand that the term

\[
\frac{1}{c^2}\frac{\partial^2 E}{\partial t^2},
\]

in the D’Alembert operator $\Box E$ stems from the displacement current! Maxwell’s intuition of adding the displacement current is critical to the existence of wave-like solutions.

Plane electromagnetic waves. It is instructive to look for plane-wave solutions of Maxwell’s equations without sources. This relatively straightforward technique applies to any constant-coefficient wave equation.

We search for solutions of

\[
\begin{cases}
\partial_t E - c\nabla\times B = 0, \\
\partial_t B + c\nabla\times E = 0, \\
\nabla\cdot E = \nabla\cdot B = 0,
\end{cases}
\]

with oscillatory complex-valued initial conditions of the form

\[
E(0,x) = E_0 e^{ik\cdot x}, \qquad B(0,x) = B_0 e^{ik\cdot x},
\]

where $k \in \mathbb{R}^3$ is a given real vector, $k \neq 0$, and $E_0, B_0 \in \mathbb{C}^3$ are complex vectors. The argument of the exponential, namely,

\[
\psi_0(x) = k\cdot x,
\]

is the phase of the initial oscillation. The initial condition for both $E$ and $B$ is constant on the surfaces $\psi_0(x) = C = \text{constant}$, and since $\nabla\psi_0 = k$ such surfaces are planes orthogonal to $k$.

The divergence equations give us constraints on initial data, that is,

\[
\begin{aligned}
\nabla\cdot E = 0 \quad &\Rightarrow \quad k\cdot E_0 = 0, \\
\nabla\cdot B = 0 \quad &\Rightarrow \quad k\cdot B_0 = 0.
\end{aligned}
\]

We look for a solution of such a Cauchy problem in the form of a plane wave

\[
E(t,x) = \mathcal{E} e^{-i(zt - k\cdot x)}, \qquad B(t,x) = \mathcal{B} e^{-i(zt - k\cdot x)}, \tag{2.14}
\]

where $z \in \mathbb{C}$ is a possibly complex number and $\mathcal{E}, \mathcal{B} \in \mathbb{C}^3$ are complex vectors all depending on the fixed vector $k \in \mathbb{R}^3$. We must have

\[
k\cdot\mathcal{E} = k\cdot\mathcal{B} = 0.
\]

A physical electromagnetic field $(E,B)$ should however be real-valued. We can obtain a real-valued solution by adding to the complex solution its complex conjugate. This is possible since Maxwell’s equations are linear and have real coefficients, so that if a plane wave is a solution, then its complex conjugate is again a solution, and so is the sum of the two.

The exponential in equation (2.14) is oscillatory with time-dependent phase

\[
\psi_t(x) = k\cdot x - \omega t, \qquad \omega = \mathrm{Re}(z),
\]

where $\omega$ is the frequency. We see that, if a point $x_0$ belongs to a phase front at the time $t = 0$, i.e., $\psi_0(x_0) = C$ for some constant $C$, then $x_t = x_0 + v_{ph}\hat{k}t$, with $\hat{k} = k/|k|$ and $v_{ph} = \omega/|k|$, belongs to the same phase front at the later time $t \geq 0$, i.e., $\psi_t(x_t) = C$ with the same constant $C$. In fact,

\[
\psi_t(x_t) = k\cdot x_0 + k\cdot(v_{ph}\hat{k}t) - \omega t = k\cdot x_0 = C.
\]

Figure 2.1: Definition of the phase velocity of a wave in three dimensions. The conventional definition takes the direction of the wave vector $k$ which is orthogonal to the phase fronts. However, for an infinite plane, one can add an orthogonal velocity $w$, with $k\cdot w = 0$, obtaining the same translation.

We can say that phase fronts move in the direction of the vector $k$, which is referred to as the wave vector. The speed $v_{ph}$ of the phase fronts is called phase velocity. It should be observed that in more than one spatial dimension the definition of phase velocity is merely a convention: if an infinite plane is moving parallel to itself, then the definition of the velocity of its points is ambiguous. In fact, for any $w \in \mathbb{R}^3$ such that $k\cdot w = 0$ but otherwise arbitrary, the velocity $v = v_{ph}\hat{k} + w$ will move points from a phase front to the next, cf. figure 2.1. The choice $w = 0$ is a convention.
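The ambiguity of the phase velocity can be checked numerically. The following sketch uses hypothetical values for the wave vector, the frequency, the starting point and the transverse velocity $w$; it only verifies that both $v_{ph}\hat{k}$ and $v_{ph}\hat{k} + w$ keep a point on the same phase front.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def phase(k, omega, x, t):
    """Phase psi_t(x) = k . x - omega t of the plane wave."""
    return dot(k, x) - omega * t

def advect(x, velocity, t):
    """Move the point x with constant velocity for a time t."""
    return tuple(xi + vi * t for xi, vi in zip(x, velocity))

k = (3.0, 4.0, 0.0)        # wave vector, |k| = 5 (hypothetical)
omega = 2.5                # frequency (hypothetical value)
k_norm = math.sqrt(dot(k, k))
k_hat = tuple(ki / k_norm for ki in k)
v_ph = omega / k_norm      # phase velocity

x0 = (0.2, -1.0, 0.7)      # point on the initial phase front
w = (-4.0, 3.0, 2.0)       # transverse velocity, k . w = 0
t = 1.7

v_conventional = tuple(v_ph * kh for kh in k_hat)
v_shifted = tuple(vc + wi for vc, wi in zip(v_conventional, w))

# Change of phase seen by a point moving with each velocity.
drift_conventional = (phase(k, omega, advect(x0, v_conventional, t), t)
                      - phase(k, omega, x0, 0.0))
drift_shifted = (phase(k, omega, advect(x0, v_shifted, t), t)
                 - phase(k, omega, x0, 0.0))
```

Both drifts vanish up to rounding, since only the component of the velocity along $k$ enters the phase.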

In general, if $z \in \mathbb{C}$ with $\gamma = \mathrm{Im}(z) \neq 0$, we may also have an exponential growth ($\gamma > 0$) or damping ($\gamma < 0$) of the wave. We shall see however that, for the problem at hand, $z$ must be real.

Substitution of the plane wave into equation (2.13) without sources yields

\[
(z^2/c^2 - k^2)\,\mathcal{E} = 0.
\]

Non-trivial solutions require $\mathcal{E} \neq 0$, therefore, the complex number $z$ must be a solution of the algebraic equation

\[
z^2 - c^2 k^2 = 0,
\]

which is referred to as the dispersion equation. For electromagnetic waves the dispersion equation is particularly simple and it has a pair of real-valued solutions $z = \omega$ (no exponential growth or damping) with

\[
\omega = \pm c|k|.
\]

The relation between the real frequency $\omega$ and the wave vector $k$ is referred to as the dispersion relation. Geometrically this dispersion relation defines a double-sided cone in the space $(\omega, k) \in \mathbb{R}^4$, which is referred to as the light cone, cf. figure 2.2. One should also observe that the dispersion relation is invariant with respect to any rotation of the wave vector, i.e., electromagnetic waves are isotropic waves.

Figure 2.2: Light cone represented for two spatial dimensions with $c = 1$.

Then Faraday’s equation gives the magnetic field in the form

\[
\mathcal{B} = \frac{ck}{\omega}\times\mathcal{E} = \mathrm{sign}(\omega)\, N\times\mathcal{E},
\]

where $N = ck/|\omega|$ is the refractive-index vector. We have $|N| = c|k|/|\omega| = 1$ because of the dispersion relation. In virtue of this relation $\mathcal{B}$ is orthogonal to both $k$ and $\mathcal{E}$ and, if $\mathcal{E}$ is real, then $\mathcal{B}$ is also a real vector. A representation of the orthogonal oscillations of $E$ and $B$ is given in figure 2.3.

In conclusion, given $k \in \mathbb{R}^3$, $k \neq 0$, and initial vectors $E_0, B_0 \in \mathbb{C}^3$ both orthogonal to $k$, we have a unique solution of the Cauchy problem for homogeneous Maxwell’s equations in the form of a plane wave. In fact the general solution is the linear combination of the two branches $\omega = \omega_\pm = \pm c|k|$, namely,

\[
\begin{aligned}
E &= \mathcal{E}_+ e^{ik\cdot(x - c\hat{k}t)} + \mathcal{E}_- e^{ik\cdot(x + c\hat{k}t)}, \\
B &= N\times\mathcal{E}_+ e^{ik\cdot(x - c\hat{k}t)} - N\times\mathcal{E}_- e^{ik\cdot(x + c\hat{k}t)},
\end{aligned}
\]

where $\mathcal{E}_\pm$ are free and have to be determined by the initial condition. One wave is characterized by phase fronts moving toward the positive $k$ direction and it is referred to as progressive wave, while the other wave has phase fronts moving toward the negative $k$ direction and it is referred to as regressive wave, although this nomenclature is not universally used.

Figure 2.3: Electric (red curve) and magnetic (blue curve) oscillations in space at fixed time for a plane wave with $k = (1,0,0)$ and with $E_0 = (0,0,1)$; all quantities are normalized. Then, for the positive root of the dispersion relation, $\omega > 0$, $B$ is oscillating in the plane orthogonal to both the propagation direction (black straight line) and the electric field, and with opposite phase, i.e., $B < 0$ where $E > 0$. For the negative root, $\omega < 0$, the fields are in phase. The figure refers to the positive root.

In order to match the general solution to the initial fields we must have

\[
\begin{aligned}
E_0 &= \mathcal{E}_+ + \mathcal{E}_-, \\
N\times B_0 &= -\mathcal{E}_+ + \mathcal{E}_-,
\end{aligned}
\]

where we have used the fact that $\mathcal{E}_\pm$ are orthogonal to $N$ and $N^2 = 1$. This is a system of two algebraic equations that has a unique solution, namely,

\[
\mathcal{E}_+ = \frac{1}{2}\big(E_0 - N\times B_0\big), \qquad \mathcal{E}_- = \frac{1}{2}\big(E_0 + N\times B_0\big),
\]

and that completely defines the plane-wave solution corresponding to the given initial conditions. The initial conditions $E_0$, $B_0$ determine the fraction of amplitude carried by the two branches of the dispersion relation.
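As a quick illustration of the splitting into progressive and regressive amplitudes, the sketch below evaluates the two formulas for one hypothetical pair of real initial vectors orthogonal to $k$, and then rebuilds the initial fields from the two branches. The helper names and sample values are assumptions made only for this example.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(ai + bi for ai, bi in zip(a, b))

def sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))

def scale(s, a):
    return tuple(s * ai for ai in a)

N = (1.0, 0.0, 0.0)    # unit refractive-index vector, parallel to k
E0 = (0.0, 0.0, 1.0)   # initial electric amplitude, orthogonal to N
B0 = (0.0, 0.5, 0.0)   # initial magnetic amplitude, orthogonal to N

NxB0 = cross(N, B0)
E_plus = scale(0.5, sub(E0, NxB0))    # progressive branch
E_minus = scale(0.5, add(E0, NxB0))   # regressive branch

# Rebuild the initial data from the two branches at t = 0.
E_rebuilt = add(E_plus, E_minus)
B_rebuilt = sub(cross(N, E_plus), cross(N, E_minus))
```

For these sample data one quarter of the electric amplitude is carried by the progressive wave and three quarters by the regressive wave; choosing $B_0 = -N \times E_0$ instead would put the whole amplitude on the progressive branch.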

Poynting theorem and energy conservation (informal version). An important aspect of Maxwell’s equations is the energy balance which is usually referred to as Poynting’s theorem [50]. We shall first derive it formally, assuming that we have a sufficiently regular solution of Maxwell’s equations. Mathematically this is a conservation law for the $L^2$-norm of the electromagnetic field $(E,B)$, again treating both fields as components of the same object.

The result follows from the two Maxwell’s equations for the curl of the fields, namely Ampere-Maxwell and Faraday laws. When we scalar-multiply it on the left by the electric field, the Ampere-Maxwell equation becomes

\[
E\cdot\frac{\partial E}{\partial t} - cE\cdot\nabla\times B = -4\pi J\cdot E.
\]

Analogously from Faraday law we have

\[
B\cdot\frac{\partial B}{\partial t} + cB\cdot\nabla\times E = 0,
\]

and thus,

\[
\tfrac{1}{2}\partial_t\big(|E|^2 + |B|^2\big) + c\big(B\cdot\nabla\times E - E\cdot\nabla\times B\big) = -4\pi J\cdot E.
\]

The second term on the left-hand side is an exact divergence, namely,

\[
B\cdot\nabla\times E - E\cdot\nabla\times B = \nabla\cdot(E\times B).
\]

This identity is valid for any two vector fields $E$ and $B$ and it can be proven by means of the Levi-Civita symbol $\varepsilon_{ijk}$ introduced in section 1.2; with the short-hand notation $\partial_i = \partial_{x_i}$ for derivatives, one computes

\[
\begin{aligned}
\nabla\cdot(E\times B) &= \varepsilon_{ijk}\partial_i(E_j B_k) \\
&= \varepsilon_{ijk}B_k\partial_i E_j + \varepsilon_{ijk}E_j\partial_i B_k \\
&= \varepsilon_{kij}B_k\partial_i E_j - \varepsilon_{jik}E_j\partial_i B_k \\
&= B\cdot\nabla\times E - E\cdot\nabla\times B.
\end{aligned}
\]

We have obtained a balance law for the norm of the vector $(E,B)$, namely,

\[
\tfrac{1}{2}\partial_t\big(|E|^2 + |B|^2\big) + \nabla\cdot(cE\times B) = -4\pi J\cdot E.
\]

Dividing by the constant $4\pi$ we obtain Poynting’s theorem in the form of a continuity equation, namely,

\[
\partial_t w_{em} + \nabla\cdot P = -J\cdot E, \tag{2.15a}
\]

where

\[
w_{em} = \frac{1}{8\pi}\big(|E|^2 + |B|^2\big), \tag{2.15b}
\]

is the energy density associated with the electromagnetic field, with $|E|^2/(8\pi)$ and $|B|^2/(8\pi)$ being the electric and magnetic energy densities respectively, while

\[
P = \frac{c}{4\pi} E\times B, \tag{2.15c}
\]

is the electromagnetic energy flux, also referred to as the Poynting vector. The factor $4\pi$ in the definition of the energy can only be understood if we consider electromagnetic fields together with particle dynamics, which is addressed in section 2.2.

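For complex-valued solutions, we have to modify slightly the derivation, as well as the definition of the Poynting vector, but the result holds nonetheless. Instead of multiplying the evolution equations by $E$ and $B$, we can scalar-multiply by the complex conjugates $\bar{E}$, $\bar{B}$ and we find

\[
\bar{E}\cdot\partial_t E + \bar{B}\cdot\partial_t B + c\big[\bar{B}\cdot(\nabla\times E) - \bar{E}\cdot(\nabla\times B)\big] = -4\pi\bar{E}\cdot J.
\]

In this case, we find neither the total derivative of the norm of the field nor a divergence term. However, we can consider the complex conjugate of the evolution equations and multiply those on the right by $E$ and $B$ respectively, thus obtaining

\[
\partial_t\bar{E}\cdot E + \partial_t\bar{B}\cdot B - c(\nabla\times\bar{B})\cdot E + c(\nabla\times\bar{E})\cdot B = -4\pi\bar{J}\cdot E.
\]

The sum of the two foregoing equations amounts to

\[
\partial_t\big(|E|^2 + |B|^2\big) + c\big[\bar{B}\cdot(\nabla\times E) - (\nabla\times\bar{B})\cdot E\big] + c\big[(\nabla\times\bar{E})\cdot B - \bar{E}\cdot(\nabla\times B)\big] = -4\pi(E^*J + J^*E),
\]

where $J^* = {}^t\bar{J}$ and analogously $E^* = {}^t\bar{E}$ are the Hermitian conjugates (i.e., the transpose of the complex conjugate) of $J$ and $E$, respectively. Now the terms in square brackets are divergences, namely $\nabla\cdot(E\times\bar{B})$ and $\nabla\cdot(\bar{E}\times B)$, and upon dividing by $8\pi$ we find

\[
\partial_t w_{em} + \nabla\cdot P = -\mathrm{Re}(J^*E),
\]

where the Poynting vector for complex-valued fields amounts to

\[
P = \frac{c}{4\pi}\mathrm{Re}(\bar{E}\times B).
\]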

As an example, let us compute the Poynting vector for a plane wave. By using Faraday’s equation in the form $B = \pm N\times E$ we find

\[
P = \pm\frac{c}{4\pi}\mathrm{Re}\big[\bar{E}\times(N\times E)\big],
\]

but the refractive-index vector $N$ is orthogonal to $E$ and thus to $\bar{E}$, so that

\[
\bar{E}\times(N\times E) = |E|^2 N - (\bar{E}\cdot N)E = |E|^2 N.
\]

For a plane wave the energy flux amounts to

\[
P = \pm c\,\frac{|E|^2}{4\pi}\hat{k} = \pm c\hat{k}\,w_{em},
\]

where we have used the fact that, for an electromagnetic wave, the electric and the magnetic energy are equal, hence $w_{em} = |E|^2/(4\pi)$. The energy flux of a plane wave accounts for the advection of the wave energy density at the speed of light $c$, toward the positive $k$-direction for the root $\omega > 0$ (progressive wave) and opposite to it for the negative root (regressive wave).
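The plane-wave energy flux can be reproduced numerically for a complex amplitude. The sketch below assumes the positive root, normalized units with $c = 1$, and a hypothetical amplitude orthogonal to $N$; it evaluates the complex-valued Poynting vector and compares it with $c\,w_{em}\hat{k}$.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors (real or complex components)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def conj(a):
    return tuple(complex(ai).conjugate() for ai in a)

def norm2(a):
    return sum(abs(ai) ** 2 for ai in a)

c = 1.0                         # assumption: normalized speed of light
N = (1.0, 0.0, 0.0)             # unit vector along k
E = (0.0, 1.0 + 1.0j, 2.0)      # complex amplitude with N . E = 0
B = cross(N, E)                 # Faraday's equation, positive root

# Poynting vector for complex fields: P = c/(4 pi) Re(conj(E) x B).
P = tuple(c / (4.0 * math.pi) * complex(z).real for z in cross(conj(E), B))

# Energy density (2.15b) with |E|^2 and |B|^2 taken as squared moduli.
w_em = (norm2(E) + norm2(B)) / (8.0 * math.pi)
P_expected = tuple(c * w_em * Ni for Ni in N)
```

Here $|E|^2 = |B|^2 = 6$, so the electric and magnetic energies are equal and the flux points along $N$ with magnitude $c\,w_{em}$.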

2.2 Lorentz force and motion of an electrically charged particle. In this section we consider the motion of an electrically charged test particle in a given electromagnetic field. The term “test particle” indicates that we neglect the effect that the charged particle has on the electromagnetic field.

Let $e_p$ and $m_p$ be the electric charge and mass of the considered test particle. The charge is usually a multiple of the elementary charge $e > 0$, which is defined so that $-e$ is the charge of an electron; hence, $e_p = Z_p e$ where $Z_p$ is an integer.

The Lorentz-force law states that the force acting on a charged particle in the presence of an electromagnetic field $(E,B)$ is given by

\[
F_L(t,x,v) = e_p\big(E(t,x) + v\times B(t,x)/c\big), \tag{2.16}
\]

where $x \in \Omega$ is the position of the particle and $v \in \mathbb{R}^3$ is its velocity.

If no other force acts on the particle but the Lorentz force, the equation of motion for a non-relativistic test particle (Newton’s second law) takes the form

\[
\begin{aligned}
\frac{dx}{dt} &= v, \\
\frac{dv}{dt} &= \frac{e_p}{m_p}\big(E(t,x) + v\times B(t,x)/c\big).
\end{aligned}
\tag{2.17}
\]

This is a system of first-order equations for which we pose a Cauchy problem with generic initial conditions

\[
x(0) = x_0, \qquad v(0) = v_0.
\]

In virtue of theorem 1.1, if the electromagnetic field is locally Lipschitz uniformly in time, we have a solution $(x,v)$ defined in a possibly small interval $(-\varepsilon, +\varepsilon)$ and such that $v \in C^1$, but $x \in C^2$ due to the first equation.

We can check how the Lorentz force affects the kinetic energy of a particle, which is defined by

\[
K(v) = \frac{1}{2} m_p v^2.
\]

Along a solution we have

\[
\frac{d}{dt}K(v) = m_p v\cdot\frac{dv}{dt} = e_p v\cdot E(t,x).
\]

We can immediately see that, due to the cross-product structure, the magnetic part of the Lorentz force does not do work on the particle and thus it does not change its energy. The electric field on the other hand can either accelerate the particle, when it is directed along its velocity, i.e., $v\cdot E > 0$, or decelerate it when it is directed opposite to its velocity.

Instead of a single particle, let us consider a gas of many identical particles with phase-space distribution function $f(t,x,v)$, cf. section 1.6 for the definition. In view of the results of kinetic theory, cf. section 1.6, we already know that the contribution to heat sources of the Lorentz force, as computed by means of equation (1.30g), is zero. Therefore, the only effect of the Lorentz force on the energy balance is the work done by the average force, namely,

\[
\int_{\mathbb{R}^3} e_p v\cdot E(t,x) f(t,x,v)\,dv = J\cdot E,
\]

where the last identity follows on noting that

\[
J(t,x) = e_p\int_{\mathbb{R}^3} v f(t,x,v)\,dv = e_p n(t,x) u(t,x) \tag{2.18}
\]

defines the flux of electric charge, and thus the current density, in terms of the distribution function. This result justifies the choice of the constants in the Poynting theorem (2.15), in which the term $J\cdot E$ appears with a minus sign: when $J\cdot E$ is positive, the energy is taken by the particles and therefore removed from the electromagnetic field. We also observe that the net force acting on an element of the gas is, according to equation (1.39c),

\[
\rho f = e_p\int_{\mathbb{R}^3}\big(E(t,x) + v\times B(t,x)/c\big) f(t,x,v)\,dv = \rho_c E + J\times B/c,
\]

which has the same structure as (2.16), but with the charge density $\rho_c$ instead of $e_p$ and the current density $J$ instead of $e_p v$.

For the simple case of uniform, i.e., constant in both time and space, electromagnetic fields, there is a rather simple analytical solution for the motion of a charged particle. First we note that, if $E$ and $B$ in equations (2.17) are independent of $x$, the equation for $v$ decouples from that for $x$; once a solution for $v$ is known, we can integrate the equation for $x$ by quadrature. With uniform fields $E$, $B$ and with $B \neq 0$, let us consider

\[
\frac{dv}{dt} = \frac{e_p}{m_p}E + \frac{e_p}{m_p c}\,v\times B, \qquad v(0) = v_0 \in \mathbb{R}^3.
\]

In the case $E = 0$, the equation reduces to

\[
\frac{dv}{dt} = \pm\omega_c\, v\times b, \tag{2.19}
\]

where the sign is determined by the sign of the electric charge of the particle,

\[
\omega_c = \frac{|e_p B|}{m_p c} > 0, \tag{2.20}
\]

is referred to as the cyclotron frequency, and $b = B/|B|$ is the unit vector along the direction of the magnetic field.

We recognize the equation of rigid rotation around the direction of the magnetic field. Without loss of generality, we can choose $b = {}^t(0,0,1)$ and thus the equation for $v$ becomes

\[
\frac{d}{dt}\begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \pm\omega_c\begin{pmatrix} v_2 \\ -v_1 \\ 0 \end{pmatrix}
\]

from which we have $v_3(t) = v_{0,3} = \text{constant}$, and

\[
\frac{d^2 v_1}{dt^2} = -\omega_c^2 v_1,
\]

which has general integral,

\[
v_1(t) = C_1\cos(\omega_c t) + C_2\sin(\omega_c t).
\]

Then,

\[
v_2(t) = \pm\frac{1}{\omega_c}\frac{dv_1}{dt} = \mp C_1\sin(\omega_c t) \pm C_2\cos(\omega_c t).
\]

From the initial conditions we have the constants $C_1 = v_{0,1}$ and $C_2 = \pm v_{0,2}$. We note that the projection of $v$ onto the plane orthogonal to the magnetic field has constant modulus. Indeed, we have

\[
v_1(t)^2 + v_2(t)^2 = C_1^2 + C_2^2 = v_{0,1}^2 + v_{0,2}^2 = v_\perp^2, \tag{2.21}
\]

where we have defined the constant value v⊥ =√v2

1 + v22 . The projection onto

the direction of the magnetic field is also constant, namely, v3(t) = v0,3 = v‖.Hence the velocity vector is moving on a cone with circular section and constant

66

Figure 2.4: Trajectories of two particles with positive (blue trajectory) andnegative (green trajectory) electric charge in a uniform magnetic field directedvertically. Both particle are in x = 0 at the initial time. The projection ontothe x1-x2 (panel on the left-hand side) shows the circular motion, clockwise andcounter-clockwise for the positive and negative charge respectively. The parallelvelocity is such that v‖/v⊥ = 0.1.

angle with the magnetic field. Upon re-orienting the x1-axis of the Cartesianreference system by a rotation around its x3 axis (the magnetic field direction),we can assume v0,1 = v⊥ and v0,2 = 0, so that

\[ \begin{aligned} v_1(t) &= v_\perp \cos(\omega_c t), \\ v_2(t) &= \mp v_\perp \sin(\omega_c t), \\ v_3(t) &= v_\parallel, \end{aligned} \tag{2.22a} \]

and integrating in time, we obtain that the particle position describes a spiral around the magnetic field direction,

\[ \begin{aligned} x_1(t) &= x_{0,1} + \rho_L(v_\perp) \sin(\omega_c t), \\ x_2(t) &= x_{0,2} \pm \rho_L(v_\perp) \big( \cos(\omega_c t) - 1 \big), \\ x_3(t) &= x_{0,3} + t v_\parallel, \end{aligned} \tag{2.22b} \]

where

\[ \rho_L(v_\perp) = v_\perp / \omega_c > 0, \tag{2.23} \]

is referred to as the Larmor radius of the particle. Figure 2.4 shows an example of the helical orbit of a particle.
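The closed-form orbit (2.22a)-(2.22b) can be checked against a direct numerical integration of equation (2.19). The following is a minimal sketch, not part of the original notes: it assumes normalized units with $\omega_c = v_\perp = 1$, the ratio $v_\parallel/v_\perp = 0.1$ of Figure 2.4, and $x_0 = 0$; the function names (`rhs`, `rk4_step`, `helix`, `max_error`) and the choice of a classical Runge-Kutta scheme are illustrative assumptions.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rhs(state, sign, omega_c, b):
    """dx/dt = v and dv/dt = sign * omega_c * (v x b), cf. equation (2.19)."""
    v = state[3:]
    acc = tuple(sign * omega_c * a for a in cross(v, b))
    return v + acc

def rk4_step(state, dt, sign, omega_c, b):
    """One classical Runge-Kutta step for the 6-component state (x, v)."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = rhs(state, sign, omega_c, b)
    k2 = rhs(shifted(k1, dt / 2), sign, omega_c, b)
    k3 = rhs(shifted(k2, dt / 2), sign, omega_c, b)
    k4 = rhs(shifted(k3, dt), sign, omega_c, b)
    return tuple(s + dt / 6 * (p + 2 * q + 2 * r + w)
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4))

def helix(t, sign, omega_c, v_perp, v_par):
    """Analytic orbit (2.22a)-(2.22b) with x0 = 0; sign = +1 selects the upper signs."""
    rho_l = v_perp / omega_c  # Larmor radius (2.23)
    cs, sn = math.cos(omega_c * t), math.sin(omega_c * t)
    x = (rho_l * sn, sign * rho_l * (cs - 1.0), v_par * t)
    v = (v_perp * cs, -sign * v_perp * sn, v_par)
    return x + v

def max_error(sign, omega_c=1.0, v_perp=1.0, v_par=0.1, t_end=20.0, steps=20000):
    """Largest componentwise deviation between the RK4 orbit and the analytic helix."""
    b = (0.0, 0.0, 1.0)  # magnetic field direction along the third axis
    state = helix(0.0, sign, omega_c, v_perp, v_par)
    dt = t_end / steps
    worst = 0.0
    for n in range(1, steps + 1):
        state = rk4_step(state, dt, sign, omega_c, b)
        exact = helix(n * dt, sign, omega_c, v_perp, v_par)
        worst = max(worst, max(abs(p - q) for p, q in zip(state, exact)))
    return worst
```

Calling `max_error(+1)` and `max_error(-1)` compares both charge signs; with the step size assumed here the deviation should stay at the level of the integrator's truncation error.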

In the presence of a uniform electric field this solution remains valid up to two modifications: the electric field induces a drift of the axis of the helix and, if it has a component along the magnetic field, a parallel acceleration.

In order to see that, let us introduce the change of variable

\[ v = v_E + w, \tag{2.24} \]

where the constant velocity

\[ v_E = c\, \frac{E \times B}{B^2}, \tag{2.25} \]

is called the $E \times B$-drift velocity. We say that the $E \times B$ drift is ambipolar, i.e., it does not depend on the sign of the electric charge of the particle. Substituting the change of variables into the Lorentz force gives

\[ E + \frac{v \times B}{c} = E + \frac{(E \times B) \times B}{B^2} + \frac{w \times B}{c}, \]

and the second term on the right-hand side can be evaluated by means of the vector calculus identity

\[ (E \times B) \times B = B \times (B \times E) = -B^2 \big[ E - b (b \cdot E) \big]. \]

Hence,

\[ E + \frac{v \times B}{c} = E_\parallel b + \frac{w \times B}{c}, \]

where $E_\parallel = b \cdot E$ is the component of the electric field parallel to the magnetic field. The contribution from the $E \times B$-drift velocity cancels exactly the component of the electric field normal to the direction $b$ of the magnetic field. Since $v_E$ is constant, the equation for the new variable $w$ reads

\[ \frac{dw}{dt} = \frac{e_p}{m_p} E_\parallel b \pm \omega_c\, w \times b, \]

which is the same as equation (2.19), apart from a parallel acceleration due to the parallel electric field. Again we can choose the third axis of the reference frame directed like $b$, but this time we orient the first axis toward the projection of $w_0 = v_0 - v_E$ onto the plane orthogonal to $b$, so that $w_{0,1} = w_\perp$ and $w_{0,2} = 0$. The conservation of perpendicular energy, equation (2.21), holds for the $w$-variables and $w_\perp = \sqrt{w_1^2 + w_2^2} = \text{constant}$. Then the solution for $w$ in this coordinate system reads

\[ \begin{aligned} w_1(t) &= w_\perp \cos(\omega_c t), \\ w_2(t) &= \mp w_\perp \sin(\omega_c t), \\ w_3(t) &= v_\parallel + \frac{e_p}{m_p} E_\parallel t, \end{aligned} \tag{2.26a} \]

which differs from (2.22a) only by the parallel acceleration. If $E_\parallel \neq 0$ the angle between $w$ and the magnetic field direction changes in time and the velocity no longer moves on a cone. The actual particle velocity $v$ is obtained by adding the $E \times B$ drift, which is perpendicular to both the electric and the magnetic field. At last, we obtain that the particle position describes a spiral motion around a drifting axis, namely,

\[ x(t) = x_0 + v_E t + y(t), \tag{2.26b} \]

and

\[ \begin{aligned} y_1(t) &= \rho_L(w_\perp) \sin(\omega_c t), \\ y_2(t) &= \pm\rho_L(w_\perp) \big( \cos(\omega_c t) - 1 \big), \\ y_3(t) &= t v_\parallel + \frac{e_p}{m_p} E_\parallel \frac{t^2}{2}. \end{aligned} \tag{2.26c} \]

The $E \times B$-drift plays a crucial role in plasma physics, particularly for strongly magnetized plasmas such as those used in magnetic fusion experiments. In MHD the fluid velocity is closely related to the $E \times B$ drift. Figure 2.5 shows an example of drifting orbits.

Figure 2.5: The same as in figure 2.4 but with a uniform electric field. The parallel electric field accelerates the two particles in opposite directions, while the $E \times B$-drift makes the center of the gyration move in the same direction for both charges (ambipolarity). The initial condition is such that $v_\parallel = 0$ at the initial time. The normalized electric field is $cE/(B_0 w_\perp) = (-0.05, 0.1, -2.0)$.
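The cancellation that defines the drift velocity (2.25) is easy to verify numerically. The sketch below is an illustration only: it assumes units with $c = 1$, uses the normalized field of Figure 2.5 as sample data together with a unit magnetic field along the third axis, and the helper names (`exb_drift`, `effective_field`, `parallel_part`) are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Euclidean scalar product."""
    return sum(p * q for p, q in zip(a, b))

def exb_drift(E, B, c=1.0):
    """Drift velocity (2.25): v_E = c (E x B) / B^2, independent of the charge."""
    b2 = dot(B, B)
    return tuple(c * comp / b2 for comp in cross(E, B))

def effective_field(E, B, v, c=1.0):
    """Field E + v x B / c that enters the Lorentz force."""
    return tuple(e + f / c for e, f in zip(E, cross(v, B)))

def parallel_part(E, B):
    """E_par * b, with b = B / |B| and E_par = b . E."""
    b2 = dot(B, B)
    e_dot_b = dot(E, B)
    return tuple(e_dot_b * comp / b2 for comp in B)

# Sample data (assumed): normalized field of Figure 2.5, B along the third axis.
E = (-0.05, 0.1, -2.0)
B = (0.0, 0.0, 1.0)
v_e = exb_drift(E, B)
residual = effective_field(E, B, v_e)  # should coincide with parallel_part(E, B)
```

For a particle moving exactly with `v_e`, only the parallel component of the electric field survives, which is the statement $E + v_E \times B / c = E_\parallel b$ used above.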

2.3 Basic mathematical results for electrodynamics. We conclude this section with basic mathematical considerations that are central in electrodynamics: the boundary value problem for the Poisson equation, which has been used in both electrostatics and magnetostatics, and the Cauchy problem for Maxwell’s equations on the whole space $\mathbb{R}^3$.

Dirichlet problem for the Poisson equation. In the physics discussion of both electrostatics and magnetostatics we have relied on the solution of the Poisson equation,

\[ -\Delta u = f, \tag{2.27} \]

for a scalar field $u$ where $f$ is a known source function. The unknown can be either the electrostatic potential $\phi$, or a component of the magnetic vector potential $A$.

We start with the following technical observation about the Laplace operator.

Proposition 2.1. The function $v(x) = 1/|x|^a$, for $x \in \mathbb{R}^d$, $a = d - 2$, and $d \geq 3$, is in $L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and

\[ -\Delta v = (d - 2) A_d\, \delta, \]

in the sense of distributions, $\delta$ being the Dirac mass at $x = 0$ and $A_d$ the area of the unit sphere $S^{d-1}$ in the $d$-dimensional space.

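Proof. The function $v$ is smooth everywhere except at the origin $x = 0$, where there is an integrable singularity. Hence, $v \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and thus $v$ is a distribution, i.e., an element of the space $\mathcal{D}'(\mathbb{R}^d)$, which is defined as the space of continuous linear functionals on $C^\infty_0(\mathbb{R}^d)$.

In the sense of distributions, the Laplace operator of $v$ is the linear functional acting on $\varphi \in C^\infty_0(\mathbb{R}^d)$ according to

\[ \langle -\Delta v, \varphi \rangle = -\langle v, \Delta\varphi \rangle = -\int_{\mathbb{R}^d} |x|^{-a} \Delta\varphi(x)\, dx, \]

where $\langle u, \varphi \rangle = u(\varphi)$ denotes the action of a distribution $u \in \mathcal{D}'(\mathbb{R}^d)$ on a test function $\varphi \in C^\infty_0(\mathbb{R}^d)$. The integral is well-defined since $v \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $\varphi$ has compact support.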

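We compute the integral in polar coordinates $(r, \vartheta)$ where $r = |x|$ is the radial coordinate and $\vartheta \in S^{d-1}$ are coordinates on the unit sphere. The Laplacian of $\varphi$ in spherical coordinates amounts to

\[ \Delta\varphi = \frac{1}{r^{d-1}} \frac{\partial}{\partial r} \Big[ r^{d-1} \frac{\partial\varphi}{\partial r} \Big] + \frac{1}{r^2} \Delta_\vartheta \varphi, \]

where $\Delta_\vartheta$ is the part of the Laplace operator on the unit sphere. The volume element is

\[ dx = r^{d-1}\, dr\, d\omega, \]

where $d\omega$ is the surface element on the unit sphere $S^{d-1}$ so that

\[ A_d = \int_{S^{d-1}} d\omega, \]

is the area of the unit sphere in the $d$-dimensional space. We have,

\[ \int_{S^{d-1}} \Delta_\vartheta \varphi(r, \vartheta)\, d\omega = 0, \]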

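hence

\[ \langle v, \Delta\varphi \rangle = \int_0^{+\infty} r^{-a} \frac{\partial}{\partial r} \Big[ r^{d-1} \frac{\partial\psi}{\partial r} \Big] dr, \]

where we have defined

\[ \psi(r) = \int_{S^{d-1}} \varphi(r, \vartheta)\, d\omega. \]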

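The integral can be evaluated by integration by parts; the boundary terms vanish because $r^{-a} r^{d-1} \psi'(r) = r \psi'(r)$ is zero at $r = 0$ and $\varphi$ has compact support. We obtain

\[ \langle v, \Delta\varphi \rangle = a \int_0^{+\infty} r^{-a-1} \big[ r^{d-1} \psi'(r) \big] dr = (d - 2) \int_0^{+\infty} \psi'(r)\, dr = -(d - 2) \psi(0), \]

where $a = d - 2$ and $\psi' = d\psi/dr$. At last, we use the Taylor formula near $x = 0$, $\varphi(x) = \varphi(0) + r \varphi_1(x)$, to show that $\psi(r) = A_d \varphi(0) + O(r)$, hence,

\[ \langle -\Delta v, \varphi \rangle = (d - 2) A_d\, \varphi(0), \]

and $\varphi(0) = \langle \delta, \varphi \rangle$.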
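The identity $\langle -\Delta v, \varphi \rangle = (d-2) A_d \varphi(0)$ can be illustrated by quadrature for a radial test function. The sketch below makes two assumptions that are not in the text: it uses the Gaussian $\varphi(x) = e^{-|x|^2}$, which is a Schwartz function rather than a compactly supported one (its tail is negligible on the truncated interval), and it uses the standard formula $A_d = 2\pi^{d/2}/\Gamma(d/2)$ for the area of the unit sphere. The function names are hypothetical.

```python
import math

def unit_sphere_area(d):
    """Area A_d of the unit sphere S^(d-1), via A_d = 2 pi^(d/2) / Gamma(d/2)."""
    return 2.0 * math.pi ** (d / 2) / math.gamma(d / 2)

def radial_laplacian_gauss(r, d):
    """Laplacian of phi(x) = exp(-|x|^2) in d dimensions: phi'' + (d-1) phi'/r."""
    return (4.0 * r * r - 2.0 * d) * math.exp(-r * r)

def simpson(g, a, b, n):
    """Composite Simpson rule with n subintervals (n is rounded up to an even number)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def pairing(d, r_max=12.0, n=4000):
    """Approximate <-Delta v, phi> for v = |x|^(2-d) and the Gaussian phi.

    In polar coordinates the weight r^(2-d) * r^(d-1) reduces to r, so the
    integrand is regular at the origin.
    """
    radial = simpson(lambda r: r * radial_laplacian_gauss(r, d), 0.0, r_max, n)
    return -unit_sphere_area(d) * radial
```

Since $\varphi(0) = 1$ for this test function, `pairing(d)` should approximate $(d-2) A_d$; in particular `pairing(3)` should be close to $4\pi$.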

Proposition 2.1 is usually summarized in the statement that

\[ \mathcal{E}(x) = \frac{1}{(d - 2) A_d}\, |x|^{2-d}, \]

is the fundamental solution of the Laplace operator. In general, one has the following

Definition 2.1 (Fundamental solutions). A distribution $\mathcal{E} \in \mathcal{D}'(\mathbb{R}^d)$ is the fundamental solution of the constant-coefficient partial differential operator $P$ if the identity $P\mathcal{E} = \delta$ holds in $\mathcal{D}'(\mathbb{R}^d)$.

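In three dimensions we have $d\omega = \sin\theta\, d\theta\, d\phi$, where $(\theta, \phi)$ are the standard colatitude and longitude angles of spherical coordinates, and the Laplace operator amounts to

\[ \Delta\varphi = \frac{1}{r^2} \frac{\partial}{\partial r} \Big[ r^2 \frac{\partial\varphi}{\partial r} \Big] + \frac{1}{r^2} \Big[ \frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \Big( \sin\theta \frac{\partial\varphi}{\partial\theta} \Big) + \frac{1}{\sin^2\theta} \frac{\partial^2\varphi}{\partial\phi^2} \Big], \]

from which we can deduce $\Delta_\vartheta \varphi$. The area is

\[ A_3 = \int_0^{\pi} \int_0^{2\pi} \sin\theta\, d\theta\, d\phi = 4\pi, \]

and

\[ -\Delta v = 4\pi\delta. \]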

Physically, by comparison with (2.3), one could regard the function $v$ as the potential generated by a unit point charge located at $x = 0$. If we have a particle of charge $e_p$ at $x = 0$, the charge density is given by $\rho_c = e_p \delta$ and the electrostatic potential is

\[ V(x) = \frac{e_p}{|x|}, \tag{2.28} \]

which is referred to as the Coulomb potential. This is the electrostatic equivalent of Newton’s gravitational potential for a point mass.

The associated electric field is

\[ E(x) = -\nabla V(x) = \frac{e_p}{|x|^2}\, \hat{x}, \tag{2.29} \]

where $\hat{x} = x/|x|$ is the unit vector in the radial direction. If a test particle of charge $e_p'$ is placed in this electric field, it experiences a force given by equation (2.16) with $B = 0$, namely,

\[ F_C = \frac{e_p e_p'}{|x|^2}\, \hat{x}, \tag{2.30} \]

which is referred to as the Coulomb force: the force between two point charges is inversely proportional to the square of their distance, and it is attractive if the charges have opposite sign ($e_p e_p' < 0$) and repulsive if they have the same sign ($e_p e_p' > 0$).

We can also make use of the fundamental solution of the Laplacian to construct solutions of equation (2.27) on $\mathbb{R}^d$ when the right-hand side is a distribution with compact support, that is, $f \in \mathcal{E}'(\mathbb{R}^d)$. Then the convolution $v * f$ is well defined in $\mathcal{D}'(\mathbb{R}^d)$ and we have

\[ -\Delta(v * f) = (-\Delta v) * f = (d - 2) A_d\, \delta * f = (d - 2) A_d\, f, \]

where we have used the properties of the convolution and the fact that $\delta * f = f$. In addition, since a convolution with a smooth compactly supported function is regularizing, when $f$ is smooth the convolution $v * f$ is in $C^\infty$. We can summarize this result as follows.

Theorem 2.2. For every $f \in \mathcal{E}'(\mathbb{R}^d)$, $d \geq 3$, the convolution

\[ u = \frac{1}{(d - 2) A_d}\, v * f, \]

satisfies equation (2.27) in the sense of distributions. If $f \in C^\infty_0(\mathbb{R}^d)$, then $u \in C^\infty(\mathbb{R}^d)$.

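We now turn to the Dirichlet problem in a bounded domain $\Omega \subset \mathbb{R}^d$ with a smooth boundary $\partial\Omega$,

\[ \begin{aligned} -\Delta u &= f, && \text{in } \Omega, \\ u &= 0, && \text{on } \partial\Omega. \end{aligned} \]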

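Lemma 2.3 (Green’s identities). For any two functions $u, v \in C^2(\overline{\Omega})$ we have

\[ \int_\Omega \nabla u \cdot \nabla v\, dx = -\int_\Omega u \Delta v\, dx + \int_{\partial\Omega} u\, \partial_n v\, dS, \tag{2.31a} \]

\[ \int_\Omega (v \Delta u - u \Delta v)\, dx = \int_{\partial\Omega} (v\, \partial_n u - u\, \partial_n v)\, dS, \tag{2.31b} \]

where $\partial_n = n \cdot \nabla$ and $n$ is the unit outward normal on the boundary $\partial\Omega$.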

Proof. Equation (2.31a) follows from the identity

\[ \nabla \cdot \big[ u \nabla v \big] = u \Delta v + \nabla u \cdot \nabla v, \]

and the Gauss theorem. Reversing the roles of $u$ and $v$ and subtracting, one gets equation (2.31b).

The first Green identity, equation (2.31a), implies uniqueness of classical solutions [52, 53].

Proposition 2.4. A solution $u \in C^2(\overline{\Omega})$ of the Dirichlet problem is unique.

Proof. Let $u_i \in C^2(\overline{\Omega})$, $i = 1, 2$, be two solutions of the Dirichlet problem with the same right-hand side $f \in C(\overline{\Omega})$. The difference $u_1 - u_2$ is harmonic in $\Omega$ and vanishes on $\partial\Omega$, so that identity (2.31a) with $u = v = u_1 - u_2$ amounts to

\[ \int_\Omega |\nabla(u_1 - u_2)|^2\, dx = 0, \]

and thus $\nabla(u_1 - u_2) = 0$ pointwise, since the $u_i$ are in $C^2$. In addition, both functions must vanish on the boundary, hence $u_1 - u_2 = 0$.

The second Green identity, equation (2.31b), allows us to write an integral representation of the solution in terms of Green’s functions.

For the specific case of the Dirichlet problem on a bounded domain $\Omega$ with smooth boundary, the Green’s function can be constructed by correcting the fundamental solution according to

\[ G(x, y) = \mathcal{E}(x - y) + K(x, y), \qquad x \in \Omega,\ y \in \overline{\Omega}, \]

where for every $x \in \Omega$, $K_x = K(x, \cdot) \in C^2(\overline{\Omega})$ is the solution of

\[ \Delta K_x = 0 \ \text{in } \Omega, \qquad \text{with } K_x(y) = -\mathcal{E}(x - y) \ \text{for } y \in \partial\Omega. \]

If the correction $K$ exists, we have the possibility to write an explicit solution in the form of a convolution with the source function plus a boundary integral.


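Proposition 2.5 (Green’s functions). If $G$ is a Green function constructed as described above, then

(i) $G$ is $C^2$ away from the diagonal $\{(x, y) \mid x = y\}$,

(ii) $G_x = G(x, \cdot) \in L^1(\Omega)$,

(iii) $-\Delta G_x = \delta_x$ in $\mathcal{D}'(\Omega)$, and $G_x|_{\partial\Omega} = 0$ for all interior points $x \in \Omega$.

(iv) If $u \in C^2(\overline{\Omega})$ is a solution of $-\Delta u = f$, with $u|_{\partial\Omega} = g$, then

\[ u(x) = \int_\Omega G(x, y) f(y)\, dy - \int_{\partial\Omega} g\, \partial_n G_x\, dS. \]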

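Proof. Claims (i) and (ii) follow from the definition, since $K_x \in C^2(\overline{\Omega})$ and $\mathcal{E}$ is $C^\infty$ on $\mathbb{R}^d \setminus \{0\}$ and $L^1_{\mathrm{loc}}$ on $\mathbb{R}^d$. As for (iii), for every interior point $x \in \Omega$, $G_x$ is continuous near the boundary and its restriction to $\partial\Omega$ is zero by construction; in addition, for every $\varphi \in C^\infty_0(\Omega) \hookrightarrow C^\infty_0(\mathbb{R}^d)$ we have

\[ \langle -\Delta G_x, \varphi \rangle = \langle -\Delta\mathcal{E}(x - \cdot), \varphi \rangle - \int_\Omega \Delta K_x\, \varphi\, dy. \]

The last integral is zero since $\Delta K_x = 0$, and a change of variable in the distribution gives

\[ \langle -\Delta\mathcal{E}(x - \cdot), \varphi \rangle = \langle -\Delta\mathcal{E}, \varphi(x - \cdot) \rangle = \varphi(x) = \langle \delta_x, \varphi \rangle, \]

since $-\Delta\mathcal{E} = \delta$. The last point comes from the second Green identity applied to the domain $\Omega \setminus B_\varepsilon(x)$, where $B_\varepsilon(x)$ is a ball centered on an interior point $x \in \Omega$ with radius $\varepsilon$ so small that $\overline{B_\varepsilon(x)} \subset \Omega$. If $u$ is a $C^2$ solution and $G$ is the Green function constructed above, we have $u, G_x \in C^2\big(\overline{\Omega} \setminus B_\varepsilon(x)\big)$ and equation (2.31b) with $v = G_x$ gives

\[ -\int_{\Omega \setminus B_\varepsilon(x)} G_x f\, dy = -\int_{\partial B_\varepsilon(x)} (G_x\, \partial_n u - u\, \partial_n G_x)\, dS - \int_{\partial\Omega} u\, \partial_n G_x\, dS, \]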

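where we have used the identity $\Delta G_x = 0$, which holds away from $y = x$, together with $-\Delta u = f$, and the boundary condition $G_x = 0$ on $\partial\Omega$; on $\partial B_\varepsilon(x)$ the normal $n$ is the outward normal of the ball. Since $f$ is uniformly continuous on $\overline{\Omega}$ and $G_x$ is integrable,

\[ \int_{\Omega \setminus B_\varepsilon(x)} G_x f\, dy \to \int_\Omega G(x, y) f(y)\, dy, \]

while

\[ \int_{\partial B_\varepsilon(x)} (G_x\, \partial_n u - u\, \partial_n G_x)\, dS = \int_{\partial B_\varepsilon(x)} \big( \mathcal{E}(x - y)\, \partial_n u(y) - u(y)\, \partial_n \mathcal{E}(x - y) \big)\, dS(y) + \int_{\partial B_\varepsilon(x)} (K_x\, \partial_n u - u\, \partial_n K_x)\, dS. \]

The last integral involves only continuous functions and thus it is bounded by $C A_d\, \varepsilon^{d-1}$, where $C$ is the maximum of the integrand. In the remaining term we can change variable so that it reduces to

\[ \frac{1}{(d - 2) A_d} \int_{\partial B_\varepsilon(0)} \Big[ \frac{1}{r^{d-2}}\, \partial_r u - u\, \frac{-(d - 2)}{r^{d-1}} \Big] dS, \]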


where we have used polar coordinates around the point $x$, and $dS = r^{d-1} d\omega$, $d\omega$ being the area element on the unit sphere. The normal derivative is the same as the derivative with respect to the radial coordinate $r$. The first term in the integral goes like $r\, d\omega$ and thus is $O(\varepsilon)$ for $\varepsilon \to 0^+$. The second term, on the other hand, gives a finite contribution, namely,

\[ \frac{1}{A_d} \int_{\partial B_\varepsilon(0)} u\, d\omega = u(x) + \frac{1}{A_d} \int_{\partial B_\varepsilon(0)} \big( u - u(x) \big)\, d\omega \to u(x), \]

since $u$ is continuous. We have obtained that the limit for $\varepsilon \to 0^+$ of the second Green identity yields the claimed result.

Although the representation of the solution in terms of Green’s functions can be quite useful, Green’s functions for complicated domains are not easily available. Moreover, proposition 2.5 is not an existence result: existence of a regular solution has been assumed.

Therefore, both for practical and theoretical reasons, one usually considers the weak formulation of the Dirichlet problem in the Sobolev space $H^1_0(\Omega)$, that is, one looks for $u \in H^1_0(\Omega)$ such that

\[ (\nabla u, \nabla v)_{L^2(\Omega)} = F(v), \qquad \text{for all } v \in H^1_0(\Omega), \]

where the right-hand side can be as general as any continuous linear functional $F : H^1_0(\Omega) \to \mathbb{R}$, but usually has the form

\[ F(v) = \int_\Omega f v\, dx, \]

for $f \in L^2(\Omega)$. If $u \in C^2(\overline{\Omega})$ is a solution of the Dirichlet problem with $f \in C(\overline{\Omega})$ and $u|_{\partial\Omega} = 0$, then it solves the weak formulation as well. However, weak solutions might not in general be smooth.

Existence and uniqueness of weak solutions in $H^1_0(\Omega)$ follow from the Riesz representation theorem together with the Poincaré inequality, which is stated here without proof, cf. chapter 10 of Tartar [54], as well as Hunter’s lecture notes [52] and Evans’ textbook [53].

Lemma 2.6 (Poincaré inequality). If $\Omega \subset \mathbb{R}^d$ is a bounded connected open subset then there exists a constant $C(\Omega)$ depending on $\Omega$, such that

\[ \|u\|_{L^2(\Omega)} \leq C(\Omega)\, \|\nabla u\|_{L^2(\Omega)}, \]

for $u \in H^1_0(\Omega)$.

This fact allows us to show that

Theorem 2.7. Let $\Omega \subset \mathbb{R}^d$ be a bounded connected open subset. For every linear continuous functional $F : H^1_0(\Omega) \to \mathbb{R}$ there exists a unique $u \in H^1_0(\Omega)$ satisfying

\[ (\nabla u, \nabla v)_{L^2(\Omega)} = F(v), \qquad \text{for all } v \in H^1_0(\Omega). \]

Proof. Let us consider the bilinear form $a : H^1_0(\Omega) \times H^1_0(\Omega) \to \mathbb{R}$ defined by

\[ a(u, v) = \int_\Omega \nabla u \cdot \nabla v\, dx. \]

This is symmetric, non-negative and if $a(u, u) = 0$, then $\nabla u = 0$ in $L^2$ and by the Poincaré inequality $u = 0$ in $L^2$. Hence $a(\cdot, \cdot)$ satisfies the conditions for a scalar product. The space $H^1_0(\Omega)$ is a Hilbert space with the product

\[ (u, v)_{H^1(\Omega)} = (u, v)_{L^2(\Omega)} + (\nabla u, \nabla v)_{L^2(\Omega)}, \]

which means that any Cauchy sequence in the topology induced by $(\cdot, \cdot)_{H^1(\Omega)}$ has a limit in $H^1_0(\Omega)$. In view of the Poincaré inequality we have

\[ \|\nabla u\|^2_{L^2(\Omega)} \leq \|u\|^2_{H^1(\Omega)} = \|u\|^2_{L^2(\Omega)} + \|\nabla u\|^2_{L^2(\Omega)} \leq \big( 1 + C(\Omega)^2 \big) \|\nabla u\|^2_{L^2(\Omega)}, \]

that is, the topology induced by the standard product is equivalent to the topology induced by the product $a(\cdot, \cdot)$. In particular, any Cauchy sequence for the product $a(\cdot, \cdot)$ is a Cauchy sequence for the standard product, and thus it has a limit in $H^1_0(\Omega)$. This means that $H^1_0(\Omega)$ with the product $a(\cdot, \cdot)$ is a Hilbert space. In addition, a functional $F$, which by hypothesis is continuous with respect to the topology of the standard product, is continuous also in the topology of $a(\cdot, \cdot)$. We can invoke the Riesz representation theorem, according to which any continuous linear functional $F$ can be represented as a scalar product with a unique element $u \in H^1_0(\Omega)$, that is,

\[ F(v) = a(u, v), \]

for all $v \in H^1_0(\Omega)$. Then $u$ is the solution of the Poisson problem.

A consequence of theorem 2.7 is that $-\Delta$ is a bijective map $H^1_0(\Omega) \to H^{-1}(\Omega)$, where $H^{-1}(\Omega)$ is the space of continuous linear functionals on $H^1_0(\Omega)$.
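In practice the weak formulation is discretized by restricting it to a finite-dimensional subspace of $H^1_0(\Omega)$ (Galerkin method). The following one-dimensional sketch is an illustration under stated assumptions, not a construction from the text: it takes $\Omega = (0, 1)$, piecewise-linear hat functions on a uniform grid, approximates the load $F(v) = \int f v\, dx$ by the nodal value $h f(x_i)$, and solves the resulting tridiagonal system with the Thomas algorithm. The function name and the test problem $f = \pi^2 \sin(\pi x)$ are hypothetical choices.

```python
import math

def solve_dirichlet_1d(f, n):
    """Galerkin approximation of -u'' = f on (0, 1) with u(0) = u(1) = 0.

    Uses n interior nodes and hat functions; the stiffness matrix a(phi_j, phi_i)
    is tridiagonal with 2/h on the diagonal and -1/h next to it.
    """
    h = 1.0 / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    diag = 2.0 / h
    off = -1.0 / h
    load = [h * f(x) for x in xs]  # nodal quadrature of F(phi_i), an assumption

    # Thomas algorithm: forward elimination.
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag
    d[0] = load[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (load[i] - off * d[i - 1]) / denom

    # Back substitution.
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return xs, u

def max_nodal_error(n):
    """Compare with the exact solution u(x) = sin(pi x) for f = pi^2 sin(pi x)."""
    xs, u = solve_dirichlet_1d(lambda x: math.pi ** 2 * math.sin(math.pi * x), n)
    return max(abs(ui - math.sin(math.pi * x)) for x, ui in zip(xs, u))
```

With this quadrature the scheme coincides with the standard second-order finite-difference method, so halving the mesh size should reduce `max_nodal_error` by roughly a factor of four.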

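Cauchy problem for symmetric hyperbolic systems in $H^s(\mathbb{R}^d)$. Let us first consider the two Maxwell’s equations for the curl of the fields in the form

\[ \begin{aligned} \partial_t E - c \nabla \times B &= -4\pi J, \\ \partial_t B + c \nabla \times E &= 0. \end{aligned} \tag{2.32} \]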

This system belongs to the following class of first-order equations:

Definition 2.2 (Linear symmetric hyperbolic systems [55]). Let $d, n$ be positive integers, $A_i : \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}^{n \times n}$ matrix-valued functions, bounded with bounded derivatives, and $f : \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}^n$. The first-order partial differential equation

\[ \partial_t u + \sum_{i=1}^{d} A_i(t, x)\, \partial_{x_i} u = f, \]

for $u : \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}^n$ is a linear symmetric hyperbolic system if the symbol of the spatial operator defined, for $\xi \in \mathbb{R}^d$, by

\[ A(t, x, \xi) = \sum_{i=1}^{d} A_i(t, x)\, \xi_i, \]

is a symmetric matrix.


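For the case of Maxwell’s equations (2.32), let $d = 3$, $n = 6$ and $u = (E, B)$. By direct calculation we find from (2.32),

\[ \begin{pmatrix} -c \nabla \times B \\ +c \nabla \times E \end{pmatrix} = \sum_{i=1}^{3} \begin{pmatrix} 0 & A_i \\ {}^t\!A_i & 0 \end{pmatrix} \frac{\partial}{\partial x_i} \begin{pmatrix} E \\ B \end{pmatrix}, \]

where $A_i$ are $3 \times 3$ blocks given by

\[ A_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & c \\ 0 & -c & 0 \end{pmatrix}, \qquad A_2 = \begin{pmatrix} 0 & 0 & -c \\ 0 & 0 & 0 \\ c & 0 & 0 \end{pmatrix}, \qquad A_3 = \begin{pmatrix} 0 & c & 0 \\ -c & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \]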

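The block structure of the $6 \times 6$ matrices immediately shows symmetry, hence equations (2.32) constitute a symmetric hyperbolic system with constant coefficients. In this case the symbol depends only on $\xi$ and, denoting again by $A_i$ the full $6 \times 6$ matrices, we write

\[ A(\xi) = \sum_{i=1}^{3} A_i \xi_i \in \mathbb{R}^{6 \times 6}, \qquad A(\partial_x) = \sum_{i=1}^{3} A_i\, \partial_{x_i}, \]

the latter being a formal short-hand notation only.

Another important example of a constant-coefficient symmetric hyperbolic system comes from the D’Alembert wave equation,

\[ \partial_t^2 v - c^2 \Delta v = g, \]

for a scalar field $v : \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}$. We can in fact write it as the first-order system

\[ \begin{aligned} \partial_t v + c \nabla \cdot w &= h, \\ \partial_t w + c \nabla v &= 0, \end{aligned} \]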

where we have introduced a vector field $w : \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}^d$ and the source $h$ such that $\partial_t h = g$. By applying the time derivative to the first equation and substituting from the second, one recovers the D’Alembert equation for $v$. If $e_i \in \mathbb{R}^d$ are the unit vectors in the direction of the axes, the evolution equation for the combined variable $u = {}^t(v, w)$ can be written as

\[ \partial_t u + \sum_{i=1}^{d} c \begin{pmatrix} 0 & {}^t e_i \\ e_i & 0 \end{pmatrix} \frac{\partial u}{\partial x_i} = \begin{pmatrix} h \\ 0 \end{pmatrix}, \]

which is a symmetric hyperbolic system, as claimed.

Therefore, by addressing the Cauchy problem for constant-coefficient symmetric hyperbolic systems, we address at once both Maxwell’s equations and the D’Alembert wave equation.
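Both symbols can be assembled explicitly and checked for symmetry. The sketch below is illustrative: it builds the $6 \times 6$ Maxwell symbol from the three blocks given above and the $(d+1) \times (d+1)$ symbol of the first-order wave system, and it verifies that the Maxwell symbol acts on $(E, B)$ as $(-c\, \xi \times B, +c\, \xi \times E)$. Function names and the sample vectors are assumptions made for the example.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def maxwell_symbol(xi, c=1.0):
    """Symbol A(xi) of system (2.32) for u = (E, B), assembled from the blocks A_1, A_2, A_3."""
    a1 = [[0, 0, 0], [0, 0, c], [0, -c, 0]]
    a2 = [[0, 0, -c], [0, 0, 0], [c, 0, 0]]
    a3 = [[0, c, 0], [-c, 0, 0], [0, 0, 0]]
    blocks = (a1, a2, a3)
    blk = [[sum(a[i][j] * x for a, x in zip(blocks, xi)) for j in range(3)]
           for i in range(3)]
    blk_t = [list(row) for row in zip(*blk)]  # transpose of the block
    top = [[0.0] * 3 + blk[i] for i in range(3)]
    bottom = [blk_t[i] + [0.0] * 3 for i in range(3)]
    return top + bottom

def wave_symbol(xi, c=1.0):
    """Symbol of the first-order wave system for u = (v, w) in d = len(xi) dimensions."""
    d = len(xi)
    m = [[0.0] * (d + 1) for _ in range(d + 1)]
    for i, x in enumerate(xi):
        m[0][i + 1] = c * x   # row acting as c xi . w
        m[i + 1][0] = c * x   # column acting as c xi v
    return m

def is_symmetric(m):
    """True if the square matrix m equals its transpose."""
    n = len(m)
    return all(m[i][j] == m[j][i] for i in range(n) for j in range(n))

def matvec(m, u):
    """Matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, u)) for row in m]
```

For example, `matvec(maxwell_symbol(xi), E + B)` with `E` and `B` given as lists returns the six components of $(-c\, \xi \times B, c\, \xi \times E)$, which is the Fourier-space counterpart of the curl terms in (2.32).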

For the general theory of linear symmetric hyperbolic systems including varying coefficients, we refer to the book by Rauch [55]. Here, we want to study the Cauchy problem for the constant-coefficient case,

\[ \begin{aligned} \partial_t u + A(\partial_x) u &= f, && t \geq 0,\ x \in \mathbb{R}^d, \\ u(0) &= u_0, && t = 0,\ x \in \mathbb{R}^d, \end{aligned} \tag{2.33} \]

on the whole space $\mathbb{R}^d$. With this aim we shall employ the Fourier transform [56, Chapter VII]. We recall that the Fourier transform of a function $u$ in the Schwartz space $\mathcal{S}(\mathbb{R}^d)$ is defined by

\[ \hat{u}(\xi) = \mathcal{F}u(\xi) = \int_{\mathbb{R}^d} e^{-i\xi \cdot x} u(x)\, dx, \]

the integral being absolutely convergent, and $\hat{u}(\xi)$ belongs to $\mathcal{S}(\mathbb{R}^d)$, that is, the Fourier transform is an endomorphism of the Schwartz space. One can show that it is actually an isomorphism, i.e., it is bijective. In the theory of distributions this allows us to extend the Fourier transform to an isomorphism on the space of tempered distributions $\mathcal{S}'(\mathbb{R}^d)$, which is the topological dual of the Schwartz space, i.e., the space of continuous linear functionals on $\mathcal{S}(\mathbb{R}^d)$.

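For $s \in \mathbb{R}$, let us look for solutions in the Sobolev spaces

\[ H^s(\mathbb{R}^d) = \big\{ u \in \mathcal{S}'(\mathbb{R}^d) \mid (1 + |\xi|^2)^{s/2}\, \hat{u} \in L^2(\mathbb{R}^d) \big\}. \]

The natural norm on those spaces is given by

\[ \|u\|^2_{H^s(\mathbb{R}^d)} = (2\pi)^{-d} \int_{\mathbb{R}^d} (1 + |\xi|^2)^s\, |\hat{u}(\xi)|^2\, d\xi, \]

and with this norm, $H^s(\mathbb{R}^d)$ is complete and thus a Banach space. For $s \geq 0$, we have $(1 + |\xi|^2)^s \geq 1$, hence $H^s(\mathbb{R}^d) \subseteq L^2(\mathbb{R}^d)$ with equality for $s = 0$. For $s < 0$, an element of $H^s$ is in general in $\mathcal{S}'(\mathbb{R}^d)$ only, and thus a distribution.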

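For constant-coefficient equations, the Fourier transform is a powerful tool since it turns differential operators into algebraic operators. In fact, we have

\[ \mathcal{F}(\partial_{x_i} u) = i \xi_i\, \hat{u}, \]

and thus

\[ \mathcal{F}\big( A(\partial_x) u \big) = \sum_{i=1}^{d} A_i\, \mathcal{F}(\partial_{x_i} u) = i \sum_{i=1}^{d} A_i \xi_i\, \hat{u} = i A(\xi)\, \hat{u}, \tag{2.34} \]

which justifies the definition of the symbol $A(\xi)$ of the operator.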

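Theorem 2.8. If $A_j$ are symmetric $n \times n$ matrices, $s \in \mathbb{R}$, $u_0 \in H^s(\mathbb{R}^d)$, and $f \in L^1_{\mathrm{loc}}\big(\mathbb{R}, H^s(\mathbb{R}^d)\big)$, the Cauchy problem (2.33) has a unique solution $u \in C^1\big(\mathbb{R}, H^{s-1}(\mathbb{R}^d)\big) \cap C\big(\mathbb{R}, H^s(\mathbb{R}^d)\big)$ which depends continuously on the data, namely,

\[ \|u(t)\|_{H^s(\mathbb{R}^d)} \leq \|u_0\|_{H^s(\mathbb{R}^d)} + \int_0^t \|f(t')\|_{H^s(\mathbb{R}^d)}\, dt', \qquad t \geq 0, \tag{2.35} \]

with equality if $f = 0$.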

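Proof. 1. First we address uniqueness. It is sufficient to show that zero is the only solution in $C^1\big(\mathbb{R}, H^{s-1}(\mathbb{R}^d)\big) \cap C\big(\mathbb{R}, H^s(\mathbb{R}^d)\big)$ corresponding to zero initial condition and zero sources.

If $u \in C^1\big(\mathbb{R}, H^{s-1}(\mathbb{R}^d)\big) \cap C\big(\mathbb{R}, H^s(\mathbb{R}^d)\big)$ then, by Fourier transform, one finds that $v = (1 - \Delta)^{(s-1)/2} u$ belongs to $C^1\big(\mathbb{R}, L^2(\mathbb{R}^d)\big) \cap C\big(\mathbb{R}, H^1(\mathbb{R}^d)\big)$, that $Lu = \partial_t u + A(\partial_x) u$ is in $C\big(\mathbb{R}, H^{s-1}(\mathbb{R}^d)\big)$, and that $Lv = \partial_t v + A(\partial_x) v = (1 - \Delta)^{(s-1)/2} Lu$ belongs to $C\big(\mathbb{R}, L^2(\mathbb{R}^d)\big)$. Therefore $\|v(t)\|^2_{L^2(\mathbb{R}^d)}$ is continuously differentiable in time and we can compute

\[ \frac{1}{2} \frac{d}{dt} \|v(t)\|^2_{L^2(\mathbb{R}^d)} = (v, \partial_t v)_{L^2(\mathbb{R}^d)} = \big( v, Lv - A(\partial_x) v \big)_{L^2(\mathbb{R}^d)}. \]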

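The integration-by-parts formula for functions in $H^1$ shows that $A(\partial_x)$ is anti-symmetric, since the $A_j$ are real-symmetric by hypothesis; hence,

\[ \frac{1}{2} \Big| \frac{d}{dt} \|v(t)\|^2_{L^2(\mathbb{R}^d)} \Big| = \big| \big( v(t), Lv(t) \big)_{L^2(\mathbb{R}^d)} \big| \leq \|v(t)\|_{L^2(\mathbb{R}^d)} \|Lv(t)\|_{L^2(\mathbb{R}^d)}, \]

and using

\[ \frac{1}{2} \frac{d}{dt} \|v(t)\|^2_{L^2(\mathbb{R}^d)} = \|v(t)\|_{L^2(\mathbb{R}^d)} \frac{d}{dt} \|v(t)\|_{L^2(\mathbb{R}^d)}, \]

we obtain

\[ \Big| \frac{d}{dt} \|v(t)\|_{L^2(\mathbb{R}^d)} \Big| \leq \|Lv(t)\|_{L^2(\mathbb{R}^d)}. \tag{2.36} \]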

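For $t \geq 0$, inequality (2.36) gives

\[ \frac{d}{dt} \Big( \|v(t)\|_{L^2(\mathbb{R}^d)} - \int_0^t \|Lv(t')\|_{L^2(\mathbb{R}^d)}\, dt' \Big) = \frac{d}{dt} \|v(t)\|_{L^2(\mathbb{R}^d)} - \|Lv(t)\|_{L^2(\mathbb{R}^d)} \leq 0, \]

and thus

\[ \|v(t)\|_{L^2(\mathbb{R}^d)} - \int_0^t \|Lv(t')\|_{L^2(\mathbb{R}^d)}\, dt' \leq \|v(0)\|_{L^2(\mathbb{R}^d)}, \]

which is equivalent to

\[ \|u(t)\|_{H^{s-1}(\mathbb{R}^d)} \leq \|u(0)\|_{H^{s-1}(\mathbb{R}^d)} + \int_0^t \|Lu(t')\|_{H^{s-1}(\mathbb{R}^d)}\, dt'. \]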

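This result, however, is weaker than (2.35), since it controls the solution in $H^{s-1}$ rather than in $H^s$. For $t < 0$ inequality (2.36) gives

\[ \frac{d}{dt} \Big( \|v(t)\|_{L^2(\mathbb{R}^d)} - \int_t^0 \|Lv(t')\|_{L^2(\mathbb{R}^d)}\, dt' \Big) = \frac{d}{dt} \|v(t)\|_{L^2(\mathbb{R}^d)} + \|Lv(t)\|_{L^2(\mathbb{R}^d)} \geq 0, \]

and thus

\[ \|v(0)\|_{L^2(\mathbb{R}^d)} \geq \|v(t)\|_{L^2(\mathbb{R}^d)} - \int_t^0 \|Lv(t')\|_{L^2(\mathbb{R}^d)}\, dt', \]

for $t < 0$, and with $\|v(t)\|_{L^2(\mathbb{R}^d)} = \|u(t)\|_{H^{s-1}(\mathbb{R}^d)}$,

\[ \|u(t)\|_{H^{s-1}(\mathbb{R}^d)} \leq \|u(0)\|_{H^{s-1}(\mathbb{R}^d)} + \int_t^0 \|Lu(t')\|_{H^{s-1}(\mathbb{R}^d)}\, dt', \qquad (t \leq 0). \]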

At last, for every $T > 0$,

\[ \sup_{t \in [-T, T]} \|u(t)\|_{H^{s-1}(\mathbb{R}^d)} \leq \|u(0)\|_{H^{s-1}(\mathbb{R}^d)} + \int_{-T}^{+T} \|Lu(t')\|_{H^{s-1}(\mathbb{R}^d)}\, dt'. \]

If now $u$ is a solution of $Lu = 0$ with $u(0) = 0$, the obtained estimate yields $u(t) = 0$ for $t \in [-T, +T]$. Since $T$ is arbitrary, $u = 0$ as claimed.


2. We now prove existence for highly regular data. Specifically, we show that, with data

\[ u_0 \in C^\infty_0(\mathbb{R}^d), \qquad \text{and} \qquad f \in C^\infty_0(\mathbb{R}^{1+d}), \]

there exists a unique solution $u \in C^\infty(\mathbb{R}^{1+d})$ such that $u(t, \cdot) \in \mathcal{S}(\mathbb{R}^d)$.

Since solutions with the required regularity can be Fourier-transformed in space, $\hat{u}$ must satisfy, cf. equation (2.34),

\[ \partial_t \hat{u}(t, \xi) + i A(\xi)\, \hat{u}(t, \xi) = \hat{f}(t, \xi), \]

with $\hat{u}(0, \xi) = \hat{u}_0(\xi)$ at the initial time $t = 0$. For every fixed $\xi \in \mathbb{R}^d$, this is a Cauchy problem for an ordinary differential equation and its unique solution is given by

\[ \hat{u}(t, \xi) = e^{-itA(\xi)}\, \hat{u}_0(\xi) + \int_0^t e^{-i(t - t')A(\xi)}\, \hat{f}(t', \xi)\, dt', \]

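with $\exp\big( -itA(\xi) \big)$ being a unitary matrix since $A(\xi)$ is symmetric. We see that $\hat{u}(t, \xi)$ is smooth and for every $t$ it is uniformly bounded in $\xi$. More generally, we can introduce the quantities $v_{\alpha\beta}(t, \xi) = \xi^\beta \partial^\alpha_\xi \hat{u}(t, \xi)$ and from the equation we find

\[ \partial_t v_{\alpha\beta}(t, \xi) = -i \xi^\beta \partial^\alpha_\xi \big( A(\xi)\, \hat{u}(t, \xi) \big) + \xi^\beta \partial^\alpha_\xi \hat{f}(t, \xi). \]

Since $A(\xi)$ is linear in $\xi$, the Leibniz rule gives

\[ \partial_t v_{\alpha\beta}(t, \xi) + i A(\xi)\, v_{\alpha\beta}(t, \xi) = g_{\alpha\beta}(t, \xi), \]

where the right-hand side depends on $v_{\alpha'\beta}$ for all $\alpha'$ of length $|\alpha| - 1$. As above we have

\[ v_{\alpha\beta}(t, \xi) = e^{-itA(\xi)}\, v_{\alpha\beta}(0, \xi) + \int_0^t e^{-i(t - t')A(\xi)}\, g_{\alpha\beta}(t', \xi)\, dt', \]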

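but $g_{\alpha\beta}$ depends on $v_{\alpha'\beta}$ with $|\alpha'| = |\alpha| - 1$. A recurrence argument on $|\alpha|$ shows that all $v_{\alpha\beta}(t, \cdot)$ are uniformly bounded in $\xi$. We deduce that $\hat{u}(t, \cdot) \in \mathcal{S}(\mathbb{R}^d)$. The inverse Fourier transform then gives

\[ u(t, x) = U(t) u_0(x) + \int_0^t U(t - t') f(t', x)\, dt', \]

where the propagator $U(t) : \mathcal{S}(\mathbb{R}^d) \to \mathcal{S}(\mathbb{R}^d)$ is defined by

\[ U(t) v(x) = \frac{1}{(2\pi)^d} \int e^{i\xi \cdot x}\, e^{-itA(\xi)}\, \hat{v}(\xi)\, d\xi, \]

and we have exchanged the time integral with the inverse Fourier transform since the integrand is $L^1$.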
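The unitarity of $e^{-itA(\xi)}$, which underlies the energy equality for $f = 0$, can be made concrete in the simplest case. The sketch below assumes the first-order wave system in one space dimension, for which $A(\xi) = c\xi \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ and the exponential has the closed form $\cos(c\xi t)\, I - i \sin(c\xi t) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$; the function names and sample values are illustrative choices.

```python
import math

def propagator(xi, t, c=1.0):
    """exp(-i t A(xi)) for the 1D wave system, A(xi) = c xi [[0, 1], [1, 0]]."""
    theta = c * xi * t
    cs, sn = math.cos(theta), math.sin(theta)
    return ((complex(cs, 0.0), complex(0.0, -sn)),
            (complex(0.0, -sn), complex(cs, 0.0)))

def apply(u_mat, u):
    """Apply a 2x2 complex matrix to a 2-vector."""
    return (u_mat[0][0] * u[0] + u_mat[0][1] * u[1],
            u_mat[1][0] * u[0] + u_mat[1][1] * u[1])

def norm(u):
    """Euclidean norm of a complex 2-vector."""
    return math.sqrt(abs(u[0]) ** 2 + abs(u[1]) ** 2)

def residual_of_ode(xi, t, u0, c=1.0, dt=1e-6):
    """Centered-difference check of d/dt u + i A(xi) u = 0 along the propagated state."""
    up = apply(propagator(xi, t + dt, c), u0)
    um = apply(propagator(xi, t - dt, c), u0)
    u = apply(propagator(xi, t, c), u0)
    dudt = ((up[0] - um[0]) / (2 * dt), (up[1] - um[1]) / (2 * dt))
    iau = (1j * c * xi * u[1], 1j * c * xi * u[0])  # i A(xi) u
    return norm((dudt[0] + iau[0], dudt[1] + iau[1]))
```

For any Fourier amplitude `u0`, `norm(apply(propagator(xi, t), u0))` equals `norm(u0)` up to rounding, which is the pointwise-in-$\xi$ version of the norm conservation in (2.35) with $f = 0$.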

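3. The regular solution constructed so far is such that

\[ v(t, x) = (1 - \Delta)^{s/2} u(t, x) = \frac{1}{(2\pi)^d} \int e^{i\xi \cdot x} (1 + |\xi|^2)^{s/2}\, \hat{u}(t, \xi)\, d\xi, \]

belongs to $C^\infty(\mathbb{R}^{1+d})$ and, for every fixed $t$, $v(t, \cdot) \in \mathcal{S}(\mathbb{R}^d)$. Therefore the function $t \mapsto v(t, \cdot)$ is in $C^1\big(\mathbb{R}, H^{s-1}(\mathbb{R}^d)\big) \cap C\big(\mathbb{R}, H^s(\mathbb{R}^d)\big)$ for all $s$. We can therefore apply the energy estimates obtained in step 1, with the index $s$ replaced by $s + 1$; hence,

\[ \|u(t)\|_{H^s(\mathbb{R}^d)} \leq \|u(0)\|_{H^s(\mathbb{R}^d)} + \int_0^t \|f(t')\|_{H^s(\mathbb{R}^d)}\, dt', \tag{2.37} \]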

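and for any $T > 0$,

\[ \sup_{t \in [-T, +T]} \|u(t)\|_{H^s(\mathbb{R}^d)} \leq \|u_0\|_{H^s(\mathbb{R}^d)} + \int_{-T}^{+T} \|f(t')\|_{H^s(\mathbb{R}^d)}\, dt', \tag{2.38} \]

uniformly for $t \in [-T, +T]$.

4. Let us now consider generic data $u_0 \in H^s(\mathbb{R}^d)$ and $f \in L^1_{\mathrm{loc}}\big(\mathbb{R}, H^s(\mathbb{R}^d)\big)$. Following the standard approximation argument [55], let us choose sequences

\[ \begin{aligned} u_{0,j} &\in C^\infty_0(\mathbb{R}^d), & u_{0,j} &\to u_0 \ \text{in } H^s(\mathbb{R}^d), \\ f_j &\in C^\infty_0(\mathbb{R}^{1+d}), & f_j &\to f \ \text{in } L^1_{\mathrm{loc}}\big(\mathbb{R}, H^s(\mathbb{R}^d)\big). \end{aligned} \]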

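For every $j$ we can construct the solution $u_j$ corresponding to the regular data $u_{0,j}$ and $f_j$ as before, and for every $j, k$ we have

\[ \partial_t (u_j - u_k) + A(\partial_x)(u_j - u_k) = f_j - f_k, \qquad (u_j - u_k)|_{t=0} = u_{0,j} - u_{0,k}. \]

Estimate (2.38) gives

\[ \sup_{t \in [-T, +T]} \|u_j(t) - u_k(t)\|_{H^s(\mathbb{R}^d)} \leq \|u_{0,j} - u_{0,k}\|_{H^s(\mathbb{R}^d)} + \int_{-T}^{+T} \|f_j(t') - f_k(t')\|_{H^s(\mathbb{R}^d)}\, dt', \]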

and thus $u_j$ is a Cauchy sequence in $C\big([-T, +T], H^s(\mathbb{R}^d)\big)$. Completeness implies that there exists a limit $u \in C\big([-T, +T], H^s(\mathbb{R}^d)\big)$. As for the spatial operator, one has

\[ A(\partial_x) u_j \to A(\partial_x) u \quad \text{in } C\big([-T, +T], H^{s-1}(\mathbb{R}^d)\big). \]

Passing to the limit in the equation, one has that the weak time derivative $\partial_t u$ of the limit $u$ is in $C\big([-T, +T], H^{s-1}(\mathbb{R}^d)\big)$ and the equation is satisfied in $C\big([-T, +T], H^{s-1}(\mathbb{R}^d)\big)$. Since $T > 0$ is arbitrary we can extend the solution to $C\big(\mathbb{R}, H^s(\mathbb{R}^d)\big) \cap C^1\big(\mathbb{R}, H^{s-1}(\mathbb{R}^d)\big)$.

5. If $u$ is the solution constructed above, then by making use of

\[ \|u(t)\|_{H^s(\mathbb{R}^d)} \leq \|u_j(t)\|_{H^s(\mathbb{R}^d)} + \|u(t) - u_j(t)\|_{H^s(\mathbb{R}^d)}, \]

and analogous inequalities for $u_{0,j}$ and $f_j$, together with (2.37), we obtain estimate (2.35) in general.

Having established the well-posedness of equation (2.32), we can now consider the full system of Maxwell’s equations. We have already discussed formally that the two equations for the divergence of the fields are actually identically satisfied provided that they are satisfied for the initial data. This argument applies to solutions in $C\big(\mathbb{R}, H^s(\mathbb{R}^d)\big)$ as well.

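Theorem 2.9. With initial data $E_0, B_0 \in H^s(\mathbb{R}^d)$ and sources $\rho_c \in C^1\big(\mathbb{R}, H^{s-1}(\mathbb{R}^d)\big)$, $J \in C\big(\mathbb{R}, H^s(\mathbb{R}^d)\big)$ satisfying

\[ \begin{aligned} \nabla \cdot E_0 &= 4\pi\rho_c(0), && \text{in } H^{s-1}(\mathbb{R}^d), \\ \nabla \cdot B_0 &= 0, && \text{in } H^{s-1}(\mathbb{R}^d), \\ \partial_t \rho_c + \nabla \cdot J &= 0, && \text{in } C\big(\mathbb{R}, H^{s-1}(\mathbb{R}^d)\big), \end{aligned} \]

Maxwell’s equations (2.10) have a unique solution $E, B \in C^1\big(\mathbb{R}, H^{s-1}(\mathbb{R}^d)\big) \cap C\big(\mathbb{R}, H^s(\mathbb{R}^d)\big)$ and we have

\[ \mathcal{E}_s(t) \leq \mathcal{E}_s(0) + \int_0^t \|4\pi J(t')\|_{H^s(\mathbb{R}^d)}\, dt', \qquad t \geq 0, \]

with equality if $J = 0$. Here, $\mathcal{E}_s^2(t) = \|E(t)\|^2_{H^s(\mathbb{R}^d)} + \|B(t)\|^2_{H^s(\mathbb{R}^d)}$.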

The inequality reduces to an exact equality if $J = 0$, and for $s = 0$, $\mathcal{E}_0^2(t)$ is proportional to the energy carried by the electromagnetic field. Hence we have a mathematically precise version of the Poynting theorem (2.15) obtained formally in section 2.1.

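Proof. Theorem 2.8 applied to system (2.32) yields the unique solution $E, B$ satisfying the two Maxwell’s equations for the curl of the fields, and the estimate for $\mathcal{E}_s(t)$ follows directly from the estimate (2.35) since, with $u = (E, B)$,

\[ \|u(t)\|^2_{H^s(\mathbb{R}^d)} = \|E(t)\|^2_{H^s(\mathbb{R}^d)} + \|B(t)\|^2_{H^s(\mathbb{R}^d)}. \]

We need to show that the equations for the divergence are identically satisfied under the hypotheses.

In Fourier space equations (2.32) take the form

\[ \begin{aligned} \partial_t \hat{E}(t, \xi) - ic\, \xi \times \hat{B}(t, \xi) &= -4\pi \hat{J}(t, \xi), \\ \partial_t \hat{B}(t, \xi) + ic\, \xi \times \hat{E}(t, \xi) &= 0. \end{aligned} \]

Taking the scalar product with $i\xi$, and using $\xi \cdot (\xi \times \hat{B}) = \xi \cdot (\xi \times \hat{E}) = 0$, we have

\[ \begin{aligned} \partial_t \big( i\xi \cdot \hat{E}(t, \xi) - 4\pi \hat{\rho}_c(t, \xi) \big) &= 0, \\ \partial_t \big( i\xi \cdot \hat{B}(t, \xi) \big) &= 0, \end{aligned} \]

and in the first identity we have used the charge continuity equation in the form

\[ \partial_t \hat{\rho}_c(t, \xi) + i\xi \cdot \hat{J}(t, \xi) = 0, \]

which holds in $C\big(\mathbb{R}, H^{s-1}(\mathbb{R}^d)\big)$ by hypothesis. We deduce that $\nabla \cdot E - 4\pi\rho_c$ and $\nabla \cdot B$, both in $C\big(\mathbb{R}, H^{s-1}(\mathbb{R}^d)\big)$, are constant in time and thus equal to zero in view of the initial conditions.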


3 From multi-fluid models to magnetohydrodynamics

In this section we introduce MHD equations as a single fluid theory for plasmas obtained from a more fundamental multi-fluid model under appropriate conditions that will be clarified.

One could as well write MHD equations directly on the basis of a bit of physics modeling without the detailed derivation. A simple physical argument to obtain MHD equations reads as follows. Let us consider an inviscid fluid which is electrically neutral but which carries an electric current $J$. We can describe it by Euler’s equations (1.41) with an appropriately chosen force term. In the presence of electromagnetic fields $(E, B)$ the current density carried by the fluid produces a net Lorentz force $\rho f = J \times B / c$, where $c$ is the speed of light in free space and Gaussian units are used throughout this note for electromagnetic quantities. Euler’s equations then read

\[ \begin{aligned} \frac{D\rho}{Dt} &= -\rho \nabla \cdot u, \\ \rho \frac{Du}{Dt} &= -\nabla p + \frac{J \times B}{c}, \\ \frac{Dp}{Dt} &= -\gamma p \nabla \cdot u. \end{aligned} \]

We still need equations for the current density $J$ and for the electromagnetic fields $(E, B)$, which can be affected by the current carried by the conducting fluid. For low-frequency phenomena, the electromagnetic fields can be described by Maxwell’s equations without the displacement current term, namely,

\[ \begin{aligned} \nabla \times B &= \frac{4\pi}{c} J, \\ \partial_t B + c \nabla \times E &= 0, \\ \nabla \cdot B &= 0. \end{aligned} \]

In view of the motion of the fluid, the standard Ohm’s law for an electrically conducting body, i.e., $E' = \eta J$ with resistivity $\eta$, takes the form

\[ E + \frac{u \times B}{c} = \eta J, \]

where $E' = E + u \times B / c$ is the electric field in the reference frame of the fluid motion. The foregoing equations form a closed system which constitutes in fact the resistive MHD model.
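As a worked consequence of the closed system above, $E$ and $J$ can be eliminated to obtain a single evolution equation for $B$. The following derivation is a sketch that uses only the three relations just stated (Ampère's law without displacement current, Faraday's law and Ohm's law); the last line additionally assumes, for illustration, that the resistivity $\eta$ is constant in space.

```latex
\begin{align*}
  J &= \frac{c}{4\pi} \nabla \times B,
  \qquad
  E = \eta J - \frac{u \times B}{c}, \\
  \partial_t B &= -c \nabla \times E
     = \nabla \times (u \times B)
       - \nabla \times \Big( \frac{\eta c^2}{4\pi} \nabla \times B \Big), \\
  \partial_t B &= \nabla \times (u \times B) + \frac{\eta c^2}{4\pi} \Delta B
  \qquad (\eta \text{ constant},\ \nabla \cdot B = 0).
\end{align*}
```

The last step uses the identity $\nabla \times (\nabla \times B) = \nabla (\nabla \cdot B) - \Delta B$. The first term on the right-hand side describes the transport of the magnetic field by the fluid, while the second one describes its resistive diffusion.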

This derivation is appealing due to its clear physical argument and its simplicity. It has, however, the usual disadvantage of qualitative physics modeling, that is, we cannot establish precisely (that is, quantitatively) the limits of validity of the proposed model and its relationship to other models in use to describe the same physical system. Therefore, one should prefer, when possible, a derivation which is obtained from other more fundamental models. In the case of MHD equations, let us start from a multi-fluid model of plasmas.

3.1 A model for multiple electrically charged fluids. Plasmas are ionized gases composed of charged particles of various species: electrons and ions of various elements. Each species of charged particles, labeled by the index $\alpha$, is characterized by the mass $m_\alpha$ and the electric charge $e_\alpha = Z_\alpha e$, where $-e$ is the electron charge and $Z_\alpha$ the charge state of the particle. For ions, the latter does not always coincide with the atomic number of the corresponding element, as the ion does not need to be completely ionized (this is the case for impurities in the edge of fusion plasmas, low temperature plasmas in general, and the chromosphere of the Sun). Due to different mass and electric charge, electromagnetic forces as well as gravity act on each different particle species in a different way. In general, it is therefore necessary to treat each species individually. In the context of fluid models, each species is regarded as a fluid with number density $n_\alpha(t, x)$, fluid velocity $u_\alpha(t, x)$ and temperature $T_\alpha(t, x)$, all occupying the same spatial domain $\Omega$.

In order to write the fluid equations governing the motion of each particle species, we need to compute the forces per unit of mass acting on the plasma fluid. Generally, we have electromagnetic forces due to an electric field $E(t, x)$ and a magnetic field $B(t, x)$, and in astrophysical plasmas, possibly, a gravitational acceleration field $g(t, x)$. Each particle of the $\alpha$-th species experiences an acceleration,

\[ a_\alpha(t, x, v) = \frac{e_\alpha}{m_\alpha} \Big[ E(t, x) + \frac{v \times B(t, x)}{c} \Big] + g(t, x). \]

According to the arguments of section 1.6, the force per unit of mass $f_\alpha$ on the $\alpha$-th species is then given by, cf. equation (1.30d),

\[ m_\alpha n_\alpha f_\alpha = e_\alpha n_\alpha \Big[ E + \frac{u_\alpha \times B}{c} \Big] + m_\alpha n_\alpha g + R_\alpha, \]

where $R_\alpha$ is the total contribution of collisions. The gravitational field is a potential field, namely,

\[ g = -\nabla \Phi_g, \]

where $\Phi_g$ is the gravitational potential. In most physics problems, the gravity field is externally imposed by the presence of a massive object such as the Sun or more generally a star, i.e., the gravitational field generated by the mass of plasma particles can be neglected. On the other hand, electromagnetic fields must be computed self-consistently from Maxwell’s equations including as sources the charge and current density generated by plasma particles, namely,

\[ \rho_c = \sum_\alpha e_\alpha n_\alpha, \qquad J = \sum_\alpha e_\alpha n_\alpha u_\alpha. \tag{3.1} \]

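Therefore we write a system of fluid equations for each particle species, coupled to Maxwell's equations, namely,

\[
\begin{aligned}
&\partial_t n_\alpha + \nabla \cdot (n_\alpha u_\alpha) = 0, \\
&\partial_t (m_\alpha n_\alpha u_\alpha) + \nabla \cdot (m_\alpha n_\alpha u_\alpha u_\alpha + \pi_\alpha) = -\nabla p_\alpha + e_\alpha n_\alpha \Big[ E + \frac{u_\alpha \times B}{c} \Big] + m_\alpha n_\alpha g + R_\alpha, \\
&\partial_t \big( \tfrac{3}{2} p_\alpha \big) + \nabla \cdot \big( \tfrac{3}{2} p_\alpha u_\alpha + q_\alpha \big) + p_\alpha \nabla \cdot u_\alpha + \pi_\alpha : \nabla u_\alpha = Q_\alpha, \\
&\partial_t E - c \nabla \times B = -4\pi J, \\
&\partial_t B + c \nabla \times E = 0, \\
&\nabla \cdot E = 4\pi \rho_c, \\
&\nabla \cdot B = 0,
\end{aligned}
\tag{3.2}
\]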


where the index α runs over the set of all particle species and relation (1.30c) between internal energy and pressure has been accounted for. Viscosity tensors πα, collision forces Rα, heat fluxes qα, and heat sources Qα are assumed to be given by appropriate closure relations. At this level, we only assume that collisions do not change the total momentum or the total energy, i.e., they are such that

\[
\sum_\alpha R_\alpha = 0, \qquad \sum_\alpha \big[ u_\alpha \cdot R_\alpha + Q_\alpha \big] = 0. \tag{3.3}
\]

If all three components of vector fields are counted individually, model (3.2) comprises 5 × (number of species) + 8 equations: five fluid equations per species plus eight scalar Maxwell equations. In low-temperature plasmas such as those in the scrape-off layer of fusion devices or in the solar chromosphere, the number of species can be as high as one hundred or more. In most cases, however, one has a dominant ion species plus electrons, and possibly impurities, which have a very low concentration and can be treated as perturbations.

Model (3.2) has extremely rich dynamics, including phenomena with various time scales, ranging from very fast plasma oscillations and electromagnetic wave modes down to relatively slow plasma waves. Moreover, model (3.2) allows for the possibility of electrically charged plasmas (the spatial integral of ρc can be non-zero here), which is a situation not usually encountered in either fusion or astrophysical plasmas (though one should mention that pure electron plasmas and particle beam dynamics constitute interesting and active fields of research).

From a computational point of view as much as for physics understanding, it is convenient to tailor the considered model to the physical phenomena of interest, rather than solve an unnecessarily general model. E.g., electromagnetic waves in model (3.2) introduce stiffness in the problem, but do not play any role in the low-frequency dynamics of the plasma.

Analytical theory and, in particular, asymptotic methods allow us to obtain approximations of a given model (in this case model (3.2)) that are tailored to the problem at hand.

Magnetohydrodynamics is one such model that can approximate the behavior of the solution of the multi-fluid model (3.2) under special conditions which we shall examine in the next sections.

Before continuing, however, it seems important to add a methodological remark. In building model (3.2) we have taken different physical systems, namely, multiple fluids together with electromagnetic fields, and coupled them “by hand” in the most physically reasonable way, namely, the total charge and current density carried by the fluids generate the electromagnetic fields which in turn act on the fluids via the Lorentz force. This coupling mechanism is physically sound, but that is in general not enough: Individually, fluid equations as well as Maxwell's equations, without external forces or sources, enjoy various conservation laws, energy in particular; for fluids, that is the very basis from which equation (1.25c) has been derived; for Maxwell's equations energy conservation is guaranteed by Poynting's theorem [50], cf. also section 2.1. It is natural to ask whether the coupled system also has similar conservation laws. If we are interested in the description of an isolated physical system, at least energy and momentum conservation should be satisfied. For open systems that interact with an environment, one has to check the energy and momentum balance. It is, in general, not guaranteed a priori that the coupling mechanism of choice has all the physically relevant conservation laws. For the specific case of the multi-fluid model (3.2) we can prove that this is the case.

Mass and electric charge conservation. The multi-fluid model (3.2) implies the conservation of the total mass: If we multiply each of the continuity equations by the mass mα of the corresponding particles and sum over all species we obtain

\[
\partial_t \rho + \nabla \cdot \Big( \sum_\alpha m_\alpha n_\alpha u_\alpha \Big) = 0,
\]

which is a conservation law for the mass density

\[
\rho = \sum_\alpha m_\alpha n_\alpha.
\]

The electric charge is also conserved; specifically, equations (3.2) imply the charge continuity equation

\[
\partial_t \rho_c + \nabla \cdot J = 0,
\]

which follows by multiplying the particle continuity equations by eα and summing over the species. As discussed in section 2.1, this is a necessary condition for Maxwell's equations.

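Energy conservation. If the friction forces and heat sources satisfy the second condition in equation (3.3), we have conservation of the total energy,

\[
\text{(total energy)} = \int_\Omega w(t, x)\, dx,
\]

the total energy density w being given by

\[
w = \sum_\alpha w_\alpha + w_{\mathrm{em}}, \tag{3.4}
\]

where

\[
w_\alpha = \frac{1}{2} m_\alpha n_\alpha u_\alpha^2 + \frac{p_\alpha}{\gamma - 1} + m_\alpha n_\alpha \Phi_g, \qquad \gamma = \frac{5}{3},
\]

is the energy density (kinetic plus internal plus gravitational) carried by the fluid of species α, and

\[
w_{\mathrm{em}} = \frac{|E|^2}{8\pi} + \frac{|B|^2}{8\pi},
\]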

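is the energy density carried by the electromagnetic field, cf. equation (2.15b). Here, the internal energy is written in terms of the pressure pα = nαkBTα, in virtue of (1.30c), and the adiabatic index γ. In order to prove energy conservation, we derive a continuity equation for w. For every species, we have, cf. equation (1.23),

\[
\begin{aligned}
\partial_t \Big( \tfrac{1}{2} m_\alpha n_\alpha u_\alpha^2 + \tfrac{p_\alpha}{\gamma - 1} \Big)
&+ \nabla \cdot \Big[ \Big( \tfrac{1}{2} m_\alpha n_\alpha u_\alpha^2 + \tfrac{p_\alpha}{\gamma - 1} \Big) u_\alpha + p_\alpha u_\alpha + \pi_\alpha \cdot u_\alpha + q_\alpha \Big] \\
&= e_\alpha n_\alpha u_\alpha \cdot E - m_\alpha n_\alpha u_\alpha \cdot \nabla \Phi_g + u_\alpha \cdot R_\alpha + Q_\alpha.
\end{aligned}
\]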


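Then,

\[
\begin{aligned}
\partial_t w_\alpha &= \partial_t \Big( \tfrac{1}{2} m_\alpha n_\alpha u_\alpha^2 + \tfrac{p_\alpha}{\gamma - 1} \Big) + m_\alpha \Phi_g\, \partial_t n_\alpha \\
&= \partial_t \Big( \tfrac{1}{2} m_\alpha n_\alpha u_\alpha^2 + \tfrac{p_\alpha}{\gamma - 1} \Big) - m_\alpha \Phi_g \nabla \cdot (n_\alpha u_\alpha),
\end{aligned}
\]

where the continuity equation has been accounted for and the potential Φg has been treated as independent of time. Combining the two foregoing equations, one can notice that the terms involving the gravitational potential combine into an exact divergence, namely,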

−mαnαuα · ∇Φg −mαΦg∇ · (nαuα) = −∇ · (mαnαΦguα),

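with the result that

\[
\partial_t w_\alpha + \nabla \cdot \big[ u_\alpha w_\alpha + p_\alpha u_\alpha + \pi_\alpha \cdot u_\alpha + q_\alpha \big] = e_\alpha n_\alpha u_\alpha \cdot E + u_\alpha \cdot R_\alpha + Q_\alpha,
\]

where only the electric field contributes to the right-hand side as the magnetic field cannot do work on charged particles. By summing over plasma species and using the second of conditions (3.3), one finds an energy balance equation for the fluid part of the model, namely,

\[
\partial_t \Big( \sum_\alpha w_\alpha \Big) + \nabla \cdot \Big( \sum_\alpha \Gamma_{w\alpha} \Big) = J \cdot E, \tag{3.5}
\]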

where the fluxes are Γwα = uαwα + pαuα + πα · uα + qα. The only source of energy for the fluids comes from the work done by the electric field on the plasma current. On the other hand, we have Poynting's theorem for Maxwell's equations, cf. equation (2.15),

\[
\partial_t w_{\mathrm{em}} + \nabla \cdot P = -J \cdot E,
\]

where

\[
P = \frac{c}{4\pi} E \times B,
\]

is the Poynting flux. The sum of the fluid energy balance and Poynting's theorem gives

\[
\partial_t w + \nabla \cdot \Gamma_w = 0, \tag{3.6}
\]

where the total energy flux is Γw = ∑α Γwα + P. This is a continuity equation for the total energy density of the system and, upon integrating it over the domain Ω with boundary conditions such that Γw · n = 0, one obtains the conservation of the total energy as claimed. The crucial point is that the energy density per unit of time transferred to the fluid by the electric field is equal to the energy density per unit of time lost by the electromagnetic fields, so that those energy-exchange terms cancel each other in the total energy balance.
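For reference, the Poynting theorem invoked above can be recovered directly from the Ampere-Maxwell and Faraday laws of system (3.2). The following is only a brief sketch in the c.g.s. units of this section; it relies on the standard vector identity ∇ · (E × B) = B · (∇ × E) − E · (∇ × B), and the intermediate steps are written here for illustration rather than taken from section 2.1.

```latex
\begin{aligned}
E \cdot \partial_t E - c\, E \cdot (\nabla \times B) &= -4\pi\, J \cdot E
  && \text{(Ampere-Maxwell law dotted with } E\text{)}, \\
B \cdot \partial_t B + c\, B \cdot (\nabla \times E) &= 0
  && \text{(Faraday law dotted with } B\text{)}, \\
\tfrac{1}{2}\, \partial_t \big( |E|^2 + |B|^2 \big) + c\, \nabla \cdot (E \times B) &= -4\pi\, J \cdot E
  && \text{(sum of the two lines)}, \\
\partial_t \Big( \frac{|E|^2 + |B|^2}{8\pi} \Big) + \nabla \cdot \Big( \frac{c}{4\pi}\, E \times B \Big) &= -J \cdot E
  && \text{(division by } 4\pi\text{)}.
\end{aligned}
```

The last line is the balance ∂t wem + ∇ · P = −J · E, with the same sign of the exchange term J · E that cancels against the right-hand side of (3.5).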

Momentum conservation. The total momentum density is defined by (not to be confused with the current density J)

\[
j = \sum_\alpha j_\alpha + \frac{E \times B}{4\pi c}, \tag{3.7}
\]


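where jα = mαnαuα and the momentum density carried by the electromagnetic field is P/c². The sum over the species of the fluid momentum balance equations gives

\[
\partial_t \Big( \sum_\alpha j_\alpha \Big) + \nabla \cdot \Big( \sum_\alpha \Gamma_{j\alpha} \Big) = \rho_c E + \frac{J \times B}{c} + \rho g, \tag{3.8}
\]

where the momentum fluxes are Γjα = mαnαuαuα + πα + pαI, I being the identity tensor, and the first of conditions (3.3) has been accounted for. From Maxwell's equations, on the other hand, we have

\[
\partial_t E \times B = c (\nabla \times B) \times B - 4\pi J \times B, \qquad
E \times \partial_t B = -c\, E \times (\nabla \times E),
\]

so that

\[
\frac{\partial}{\partial t} \Big( \frac{E \times B}{4\pi c} \Big) = \frac{1}{4\pi} \big[ (\nabla \times B) \times B - E \times (\nabla \times E) \big] - \frac{J \times B}{c}.
\]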

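We can make use of the vector calculus identity

\[
\begin{aligned}
(\nabla \times B) \times B &= -\nabla (B^2/2) + B \cdot \nabla B \\
&= -\nabla (B^2/2) + \nabla \cdot (BB),
\end{aligned}
\]

where we have used ∇ · B = 0, and analogously

\[
\begin{aligned}
E \times (\nabla \times E) &= \nabla (E^2/2) - E \cdot \nabla E \\
&= \nabla (E^2/2) - \nabla \cdot (EE) + (\nabla \cdot E) E \\
&= \nabla (E^2/2) - \nabla \cdot (EE) + 4\pi \rho_c E,
\end{aligned}
\]

where we have used the Gauss law ∇ · E = 4πρc. Hence the electromagnetic momentum density satisfies the balance equation

\[
\frac{\partial}{\partial t} \Big( \frac{E \times B}{4\pi c} \Big) + \nabla \cdot \Big( \frac{|E|^2 + |B|^2}{8\pi} I - \frac{EE + BB}{4\pi} \Big) = -\rho_c E - \frac{J \times B}{c}. \tag{3.9}
\]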

The sum of the fluid momentum balance (3.8) and the electromagnetic momentum balance (3.9) gives

\[
\partial_t j + \nabla \cdot \Gamma_j = \rho g, \tag{3.10}
\]

where the momentum density flux,

\[
\Gamma_j = \sum_\alpha \Gamma_{j\alpha} + \Big( \frac{|E|^2 + |B|^2}{8\pi} I - \frac{EE + BB}{4\pi} \Big),
\]

is the sum of the fluid stress tensors plus Maxwell's stress tensor. The total momentum of the system is conserved provided that g = 0. Of course, gravity breaks momentum conservation since it is an external force.
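The vector identity used for the magnetic term, (∇ × B) × B = −∇(B²/2) + ∇ · (BB) for divergence-free B, can be spot-checked numerically. The sketch below is only an illustration: the test field B = (sin y, sin z, sin x), the evaluation point, and all function names are assumptions chosen here, and derivatives are approximated by central finite differences.

```python
import math

def field(p):
    """Hypothetical divergence-free test field B = (sin y, sin z, sin x)."""
    x, y, z = p
    return (math.sin(y), math.sin(z), math.sin(x))

def partial(f, p, k, h=1e-5):
    """Central finite difference of the scalar function f along coordinate k."""
    up, dn = list(p), list(p)
    up[k] += h
    dn[k] -= h
    return (f(up) - f(dn)) / (2.0 * h)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def curl(F, p):
    # d(i, k) approximates the derivative of component i along coordinate k.
    d = lambda i, k: partial(lambda q: F(q)[i], p, k)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def curl_cross_b(p):
    """Left-hand side: (curl B) x B."""
    return cross(curl(field, p), field(p))

def stress_divergence(p):
    """Right-hand side: -grad(B^2/2) + div(BB), differentiating the products directly."""
    half_b2 = lambda q: 0.5 * sum(c * c for c in field(q))
    out = []
    for i in range(3):
        grad_term = partial(half_b2, p, i)
        div_term = sum(
            partial(lambda q, i=i, k=k: field(q)[k] * field(q)[i], p, k)
            for k in range(3)
        )
        out.append(-grad_term + div_term)
    return tuple(out)

point = (0.3, -1.1, 2.0)
lhs = curl_cross_b(point)
rhs = stress_divergence(point)
print(max(abs(a - b) for a, b in zip(lhs, rhs)))
```

The two sides agree up to finite-difference error only because the chosen field has zero divergence; for a field with ∇ · B ≠ 0 the extra term (∇ · B)B would appear, exactly as it does for the electric field through the Gauss law.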

3.2 Quasi-neutral limit. Model (3.2) hides a very small parameter that can be exposed by scaling the equations. That means writing the equations in terms of normalized variables, with normalization constants chosen to represent the typical magnitude of the corresponding variable for the specific physics process under consideration.


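The following scaling argument has been applied by Degond, Deluzet and Savelief [57] in the context of asymptotic-preserving schemes for the Euler-Maxwell system.

With normalized variables denoted by over-bars, we write

\[
\begin{aligned}
& t = \tau \bar t, \quad x = L \bar x, \quad n_\alpha = N_e \bar n_\alpha, \quad u_\alpha = V \bar u_\alpha, \quad T_\alpha = T \bar T_\alpha, \\
& p_\alpha = N_e k_B T\, \bar p_\alpha, \quad \pi_\alpha = N_e k_B T\, \bar \pi_\alpha, \quad Q_\alpha = \tau^{-1} N_e k_B T\, \bar Q_\alpha, \quad q_\alpha = N_e k_B T V\, \bar q_\alpha, \\
& E = \frac{k_B T}{e L} \bar E, \quad B = \frac{c}{V} \frac{k_B T}{e L} \bar B, \quad J = e N_e V \bar J, \quad \rho_c = e N_e \bar \rho_c, \\
& g = \frac{V}{\tau} \bar g, \quad R_\alpha = \frac{N_e M V}{\tau} \bar R_\alpha, \quad m_\alpha = M \bar m_\alpha, \quad e_\alpha = e \bar e_\alpha = e Z_\alpha,
\end{aligned}
\]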

where τ, L, and V are the time, space, and velocity scales, respectively, while we have used a reference value Ne for the electron density as a normalization scale for all nα. The typical energy scale in a plasma is given by the thermal energy, namely, kBT, T being the temperature scale; therefore, pressure, viscosity, heat sources, and heat fluxes have all been normalized using this energy scale. Both electromagnetic fields (c.g.s. units are used here) have the dimensions of energy per unit of electric charge per unit of length, but we have scaled the magnetic field by c/V, i.e., we normalize B/c rather than B. Finally, the gravitational acceleration g is normalized to the natural acceleration scale, and the friction Rα to the natural scale for momentum variation, with the mass scale M being, e.g., the mass of the ion species with the highest concentration.

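The substitution of normalized variables into system (3.2) gives

\[
\begin{aligned}
& \frac{N_e}{\tau} \partial_{\bar t} \bar n_\alpha + \frac{N_e V}{L} \nabla_{\bar x} \cdot (\bar n_\alpha \bar u_\alpha) = 0, \\
& \frac{M N_e V}{\tau} \partial_{\bar t} (\bar m_\alpha \bar n_\alpha \bar u_\alpha) + \frac{M N_e V^2}{L} \nabla_{\bar x} \cdot (\bar m_\alpha \bar n_\alpha \bar u_\alpha \bar u_\alpha) + \frac{N_e k_B T}{L} \big[ \nabla_{\bar x} \cdot \bar \pi_\alpha + \nabla_{\bar x} \bar p_\alpha \big] \\
& \qquad = \frac{N_e k_B T}{L} \bar e_\alpha \bar n_\alpha \big( \bar E + \bar u_\alpha \times \bar B \big) + \frac{M N_e V}{\tau} \big[ \bar m_\alpha \bar n_\alpha \bar g + \bar R_\alpha \big], \\
& \frac{N_e k_B T}{\tau} \partial_{\bar t} \big( \tfrac{3}{2} \bar p_\alpha \big) + \frac{N_e k_B T V}{L} \Big[ \nabla_{\bar x} \cdot \big( \tfrac{3}{2} \bar p_\alpha \bar u_\alpha + \bar q_\alpha \big) + \bar p_\alpha \nabla_{\bar x} \cdot \bar u_\alpha + \bar \pi_\alpha : \nabla_{\bar x} \bar u_\alpha \Big] = \frac{N_e k_B T}{\tau} \bar Q_\alpha, \\
& \frac{k_B T}{e L \tau} \partial_{\bar t} \bar E - c^2 \frac{k_B T}{e L^2 V} \nabla_{\bar x} \times \bar B = -4\pi e N_e V \bar J, \\
& c\, \frac{k_B T}{e L V \tau} \partial_{\bar t} \bar B + c\, \frac{k_B T}{e L^2} \nabla_{\bar x} \times \bar E = 0, \\
& \frac{k_B T}{e L^2} \nabla_{\bar x} \cdot \bar E = 4\pi e N_e \bar \rho_c, \\
& \nabla_{\bar x} \cdot \bar B = 0.
\end{aligned}
\]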

At this point, we introduce crucial physical assumptions relating the various scales. First, we recall the definition of a basic plasma parameter, the Debye length (c.g.s. units)

\[
\lambda_D = \sqrt{\frac{k_B T}{4\pi e^2 N_e}}, \tag{3.11}
\]

which controls how electrons shield an ion charge. Then, we can state the basic assumptions:

\[
\tau = \frac{L}{V}, \qquad M V^2 = k_B T, \qquad \frac{V}{c} = \frac{\lambda_D}{L}. \tag{3.12}
\]

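The first assumption relates the time scale τ to the advection time scale L/V. Although natural for fluid equations, this scaling has an important consequence on Maxwell's equations. The second assumption sets the value of the velocity scale to the typical thermal speed,

\[
V = v_{\mathrm{th}} = \sqrt{k_B T / M},
\]

and thus simplifies the scaling of the momentum and heat transport equations. The third assumption is less obvious since the two dimensionless parameters V/c and λD/L are physically independent. Setting them equal to each other is justified a posteriori: One obtains that in the Ampere-Maxwell law, the curl of the normalized magnetic field and the normalized current have comparable magnitude. In fact, with assumptions (3.12), the scaled system becomes

\[
\begin{aligned}
& \partial_{\bar t} \bar n_\alpha + \nabla_{\bar x} \cdot (\bar n_\alpha \bar u_\alpha) = 0, \\
& \partial_{\bar t} (\bar m_\alpha \bar n_\alpha \bar u_\alpha) + \nabla_{\bar x} \cdot (\bar m_\alpha \bar n_\alpha \bar u_\alpha \bar u_\alpha + \bar \pi_\alpha) = -\nabla_{\bar x} \bar p_\alpha + \bar e_\alpha \bar n_\alpha \big( \bar E + \bar u_\alpha \times \bar B \big) + \bar m_\alpha \bar n_\alpha \bar g + \bar R_\alpha, \\
& \partial_{\bar t} \big( \tfrac{3}{2} \bar p_\alpha \big) + \nabla_{\bar x} \cdot \big( \tfrac{3}{2} \bar p_\alpha \bar u_\alpha + \bar q_\alpha \big) + \bar p_\alpha \nabla_{\bar x} \cdot \bar u_\alpha + \bar \pi_\alpha : \nabla_{\bar x} \bar u_\alpha = \bar Q_\alpha, \\
& \varepsilon\, \partial_{\bar t} \bar E - \nabla_{\bar x} \times \bar B = -\bar J, \\
& \partial_{\bar t} \bar B + \nabla_{\bar x} \times \bar E = 0, \\
& \varepsilon\, \nabla_{\bar x} \cdot \bar E = \bar \rho_c, \\
& \nabla_{\bar x} \cdot \bar B = 0,
\end{aligned}
\tag{3.13}
\]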

which depends on a single dimensionless parameter

\[
\varepsilon = \frac{\lambda_D^2}{L^2} = \frac{V^2}{c^2} \ll 1. \tag{3.14}
\]

For most plasmas, ε is extremely small since the Debye length λD is usually very short compared to the typical spatial scale L, and the relevant velocity scale is negligible compared to the speed of light. The parameter ε multiplies the displacement current term in the Ampere-Maxwell law and the divergence of the electric field in the Gauss law. The fact that the displacement current is scaled by ε follows from the first of assumptions (3.12), which rules out the possibility of high-frequency oscillations of the electric field. The scaling of the Gauss law, on the other hand, implies that the charge density ρc is negligible. The remaining equations in the system are unchanged by the scaling.
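To get a feeling for the orders of magnitude, the scales fixed by assumptions (3.12) can be evaluated for a given temperature and density. The sketch below is only illustrative: the function names are hypothetical, the input values (T ≈ 1.16 × 10⁸ K, Ne = 10¹⁴ cm⁻³, proton mass as M) are assumed fusion-like numbers not taken from the text, and the constants are standard c.g.s. values.

```python
import math

# Physical constants in c.g.s. (Gaussian) units.
K_B = 1.380649e-16          # Boltzmann constant [erg/K]
E_CHARGE = 4.80320471e-10   # elementary charge [statC]
C_LIGHT = 2.99792458e10     # speed of light [cm/s]
M_PROTON = 1.67262192e-24   # proton mass [g]

def debye_length(T, Ne):
    """Debye length (3.11) in cm, for temperature T [K] and density Ne [cm^-3]."""
    return math.sqrt(K_B * T / (4.0 * math.pi * E_CHARGE**2 * Ne))

def quasi_neutral_scales(T, Ne, M=M_PROTON):
    """Scales implied by assumptions (3.12).

    V follows from M V^2 = k_B T, L from V/c = lambda_D/L, tau from tau = L/V,
    and epsilon from definition (3.14).
    """
    lam = debye_length(T, Ne)
    V = math.sqrt(K_B * T / M)
    L = lam * C_LIGHT / V
    tau = L / V
    eps = (lam / L) ** 2
    return {"lambda_D": lam, "V": V, "L": L, "tau": tau, "epsilon": eps}

# Assumed, roughly fusion-like parameters (about 10 keV and 1e14 cm^-3).
scales = quasi_neutral_scales(T=1.16e8, Ne=1.0e14)
for name, value in scales.items():
    print(f"{name:>9s} = {value:.3e}")
```

Note that the third assumption ties L to the other scales; in an actual application L is set by the problem, and the sketch simply reports the length scale at which V/c and λD/L coincide.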

We are interested in the limit ε → 0+, which is referred to as the quasi-neutral limit, since formally ρc = O(ε), i.e., the plasma is neutral at the leading order in ε → 0+.

The idea is that since the physical value of ε, although finite, is very small, the limit solution for ε → 0 will be a good approximation of the full solution. By such an approximation one hopes to obtain a system of equations which is, on the one hand, computationally simpler and, on the other hand, closer to the physics of interest (exposing only essential terms) than the original system.

In this case, the formal limit of the scaled system could be obtained by setting ε = 0 directly in the equations. However, it is instructive to follow step by step the general procedure of formal asymptotic expansions.

As a general rule, one should not approximate the equations directly. Rather, one should try to construct an approximation of the solution. In this case, we try an asymptotic expansion in powers of the parameter ε > 0, namely,

\[
\chi^\varepsilon(t, x) \sim \sum_{j=0}^{+\infty} \varepsilon^j \chi^j(t, x),
\]

where χε(t, x) stands for any of the normalized unknowns of the scaled system, viewed as a function of ε > 0. The series is not required to converge absolutely, thus we write “∼” instead of “=”. The precise definition of “∼” is given in section 1.6, where the expansion of the phase-space distribution function is considered. (The interested reader can refer to Boyd's paper [58] for an instructive discussion of asymptotic series.)

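The asymptotic series for each variable is substituted into the system and terms with the same power of ε are collected. The leading order terms, corresponding to εʲ with j = 0, give a closed system for the leading term χ⁰ in the asymptotic series of each normalized quantity, namely,

\[
\begin{aligned}
& \partial_{\bar t} \bar n^0_\alpha + \nabla_{\bar x} \cdot (\bar n^0_\alpha \bar u^0_\alpha) = 0, \\
& \partial_{\bar t} (\bar m_\alpha \bar n^0_\alpha \bar u^0_\alpha) + \nabla_{\bar x} \cdot (\bar m_\alpha \bar n^0_\alpha \bar u^0_\alpha \bar u^0_\alpha + \bar \pi^0_\alpha) = -\nabla_{\bar x} \bar p^0_\alpha + \bar e_\alpha \bar n^0_\alpha \big( \bar E^0 + \bar u^0_\alpha \times \bar B^0 \big) + \bar m_\alpha \bar n^0_\alpha \bar g + \bar R^0_\alpha, \\
& \partial_{\bar t} \big( \tfrac{3}{2} \bar p^0_\alpha \big) + \nabla_{\bar x} \cdot \big( \tfrac{3}{2} \bar p^0_\alpha \bar u^0_\alpha + \bar q^0_\alpha \big) + \bar p^0_\alpha \nabla_{\bar x} \cdot \bar u^0_\alpha + \bar \pi^0_\alpha : \nabla_{\bar x} \bar u^0_\alpha = \bar Q^0_\alpha, \\
& \nabla_{\bar x} \times \bar B^0 = \bar J^0 = \sum_\alpha \bar e_\alpha \bar n^0_\alpha \bar u^0_\alpha, \\
& \partial_{\bar t} \bar B^0 + \nabla_{\bar x} \times \bar E^0 = 0, \\
& 0 = \bar \rho^0_c = \sum_\alpha \bar e_\alpha \bar n^0_\alpha, \\
& \nabla_{\bar x} \cdot \bar B^0 = 0,
\end{aligned}
\tag{3.15}
\]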

which is formally identical to the full scaled system except for two equations: the Ampere-Maxwell law (which is now without displacement current) and the Gauss law (which is reduced to the quasi-neutrality condition). As anticipated, this could have been obtained by setting ε = 0 in the scaled system. We see that, in this regime, the plasma must be quasi-neutral, i.e., the total charge density is zero to the lowest order in ε → 0. This, however, does not imply that the lowest order electric field E⁰ must be divergence-free, as one might expect from the Gauss law. This apparent paradox is easily understood by computing the equations for the first order correctors, i.e., the terms χ¹ in the asymptotic series. We can consider only the correctors for the electromagnetic fields, namely,

\[
\begin{aligned}
& \partial_{\bar t} \bar E^0 - \nabla_{\bar x} \times \bar B^1 = -\bar J^1, \\
& \partial_{\bar t} \bar B^0 + \nabla_{\bar x} \times \bar E^0 = 0, \\
& \nabla_{\bar x} \cdot \bar E^0 = \bar \rho^1_c.
\end{aligned}
\tag{3.16}
\]

We see that the time-variation of the zero-order electric field produces a displacement current that adds to the first-order current as a source for the corrector B¹ of the magnetic field. In the same way, the divergence of the zero-order electric field defines a charge density ρc¹ which corresponds to a small charge separation (of electrons and ions) in the plasma. Thus, the Ampere-Maxwell and Gauss laws are satisfied as they should be, even though the lowest order field E⁰ has a non-zero time derivative and divergence.
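As a worked illustration of how the orders are collected, consider the scaled Gauss law from (3.13). The following sketch simply inserts the asymptotic series for the normalized electric field and charge density and matches powers of ε; the over-bar notation for normalized quantities is assumed as in the scaling above.

```latex
\begin{aligned}
\varepsilon\, \nabla_{\bar x} \cdot \big( \bar E^0 + \varepsilon \bar E^1 + \dots \big)
  &= \bar\rho_c^0 + \varepsilon\, \bar\rho_c^1 + \varepsilon^2 \bar\rho_c^2 + \dots, \\
O(\varepsilon^0): \qquad 0 &= \bar\rho_c^0, \\
O(\varepsilon^1): \qquad \nabla_{\bar x} \cdot \bar E^0 &= \bar\rho_c^1, \\
O(\varepsilon^2): \qquad \nabla_{\bar x} \cdot \bar E^1 &= \bar\rho_c^2 .
\end{aligned}
```

The divergence of the field at a given order is thus balanced by the charge density of the next order, which is why quasi-neutrality at leading order places no constraint on the divergence of E⁰.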

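The quasi-neutral limit has a very important consequence on the conserved energy, which is critical to understanding energetics in MHD. Let us consider the electromagnetic energy in terms of normalized quantities. The natural scale for energy density is MNeV² = NekBT, hence,

\[
\begin{aligned}
\frac{|E|^2}{8\pi} + \frac{|B|^2}{8\pi}
&= N_e k_B T \Big[ \frac{k_B T}{4\pi e^2 N_e L^2} \frac{1}{2} |\bar E|^2 + \frac{c^2}{V^2} \frac{k_B T}{4\pi e^2 N_e L^2} \frac{1}{2} |\bar B|^2 \Big] \\
&= N_e k_B T \Big[ \frac{\varepsilon}{2} |\bar E|^2 + \frac{1}{2} |\bar B|^2 \Big],
\end{aligned}
\]

and formally passing to the limit ε → 0 yields

\[
\frac{|E|^2}{8\pi} + \frac{|B|^2}{8\pi} = N_e k_B T \Big[ \frac{\varepsilon}{2} |\bar E|^2 + \frac{1}{2} |\bar B|^2 \Big] \to \frac{1}{2} N_e k_B T\, |\bar B^0|^2. \tag{3.17}
\]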

Surprising as it might seem, the electric field does not contribute to the lowest-order electromagnetic energy in the quasi-neutral limit. Similarly, the electromagnetic momentum density is a first order quantity in ε:

\[
P/c^2 = \frac{E \times B}{4\pi c} = \frac{N_e k_B T}{V} \frac{\lambda_D^2}{L^2}\, \bar E \times \bar B = N_e M V \varepsilon\, \bar E \times \bar B \to 0. \tag{3.18}
\]

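Upon restoring physical units in system (3.15), we obtain

\[
\begin{aligned}
& \partial_t n_\alpha + \nabla \cdot (n_\alpha u_\alpha) = 0, \\
& \partial_t (m_\alpha n_\alpha u_\alpha) + \nabla \cdot (m_\alpha n_\alpha u_\alpha u_\alpha + \pi_\alpha) = -\nabla p_\alpha + e_\alpha n_\alpha \Big[ E + \frac{u_\alpha \times B}{c} \Big] + m_\alpha n_\alpha g + R_\alpha, \\
& \partial_t \big( \tfrac{3}{2} p_\alpha \big) + \nabla \cdot \big( \tfrac{3}{2} p_\alpha u_\alpha + q_\alpha \big) + p_\alpha \nabla \cdot u_\alpha + \pi_\alpha : \nabla u_\alpha = Q_\alpha, \\
& \nabla \times B = \frac{4\pi}{c} J, \qquad J = \sum_\alpha e_\alpha n_\alpha u_\alpha, \\
& \partial_t B + c \nabla \times E = 0, \\
& 0 = \rho_c = \sum_\alpha e_\alpha n_\alpha, \\
& \nabla \cdot B = 0,
\end{aligned}
\tag{3.19}
\]

where for simplicity of notation we have dropped the superscript “0” and it is implied from now on that all quantities are identified with their quasi-neutral limit.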

According to equation (3.17), the electric field energy density does not contribute to the lowest order in the quasi-neutral limit, hence, the energy w defined in equation (3.4) becomes (in physical quantities)

\[
w = \sum_\alpha w_\alpha + \frac{|B|^2}{8\pi}, \qquad w_\alpha = \frac{1}{2} m_\alpha n_\alpha u_\alpha^2 + \frac{p_\alpha}{\gamma - 1} + m_\alpha n_\alpha \Phi_g. \tag{3.20}
\]

System (3.19) preserves this energy, that is, the asymptotic expansion carried out above is energetically consistent. In order to prove this claim, let us notice that equation (3.5) still holds true, as the fluid equations in system (3.19) are unchanged, but Poynting's theorem must be replaced by an equation for the magnetic energy density. Ampere's law scalar-multiplied by the electric field E gives

\[
\frac{c}{4\pi} E \cdot \nabla \times B = E \cdot J,
\]

while from Faraday's law one gets

\[
\frac{\partial}{\partial t} \Big( \frac{B^2}{8\pi} \Big) + \frac{c}{4\pi} B \cdot \nabla \times E = 0.
\]

The difference of the two equations yields

\[
\frac{\partial}{\partial t} \Big( \frac{B^2}{8\pi} \Big) + \nabla \cdot P = -E \cdot J,
\]

which, together with the energy balance (3.5) for the fluid energy, yields a continuity equation for energy (3.20).

As for the momentum balance, in the quasi-neutral limit the electromagnetic momentum density is a first order quantity, cf. equation (3.18), so that the lowest order total momentum density is (in physical units),

\[
j = \sum_\alpha j_\alpha = \sum_\alpha m_\alpha n_\alpha u_\alpha. \tag{3.21}
\]

The limit system (3.19) preserves the limit (3.21) of the total momentum. In fact, equation (3.8) still holds with the difference that quasi-neutrality cancels the electric force, namely

\[
\partial_t j + \nabla \cdot \Big( \sum_\alpha \Gamma_{j\alpha} \Big) = \frac{J \times B}{c} + \rho g.
\]

For the J × B force, we have

\[
\frac{J \times B}{c} = \frac{1}{4\pi} (\nabla \times B) \times B = -\nabla \Big( \frac{B^2}{8\pi} \Big) + \nabla \cdot \Big( \frac{BB}{4\pi} \Big),
\]

which is in divergence form, so that the total momentum balance (3.10) still holds in the quasi-neutral limit with j given by (3.21) and flux

\[
\Gamma_j = \sum_\alpha \Gamma_{j\alpha} + \Big( \frac{|B|^2}{8\pi} I - \frac{BB}{4\pi} \Big),
\]

where only the magnetic part of Maxwell's stress tensor enters.

The charge continuity equation (2.1) now becomes

\[
\nabla \cdot J = 0, \tag{3.22}
\]

which is implied by Ampere's law as well as by the particle continuity equations together with charge quasi-neutrality ρc = 0.

System (3.19) requires some more care. First, it does not feature an explicit equation for the electric field E. Second, the equations for the particle number densities nα might not be consistent with the charge quasi-neutrality condition. Finally, the constraint (3.22) might not be consistent with the momentum equations for the fluid velocities uα that define the current density. We therefore have to ask whether the limit problem (3.19) is well posed, or at least we have to check that there are sufficient equations to determine all the variables.

With this aim, let us consider an equivalent reformulation of system (3.19). One can notice that Ampere's law can be used to express the current J in terms of the magnetic field, and J thus defined automatically satisfies condition (3.22). Then the particle continuity equations yield ∂tρc = 0, which means that the quasi-neutrality constraint ρc = 0 is satisfied at all times, provided that it is satisfied by the initial conditions. The critical point is therefore obtaining an equation for the electric field such that Ampere's law is consistent with the sum over α of the momentum balance equations, so that

\[
J = \frac{c}{4\pi} \nabla \times B = \sum_\alpha e_\alpha n_\alpha u_\alpha,
\]

and all constraints are satisfied.

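If one multiplies the momentum balance equation of the species α by the charge-to-mass ratio eα/mα and sums over the species index α, the result is

\[
\partial_t J + \sum_\alpha \frac{e_\alpha}{m_\alpha} \nabla \cdot \Gamma_{j\alpha} = \sum_\alpha \frac{e_\alpha^2 n_\alpha}{m_\alpha} \Big[ E + \frac{u_\alpha \times B}{c} \Big] + \sum_\alpha \frac{e_\alpha}{m_\alpha} R_\alpha,
\]

where we have introduced the total stress tensor Γjα = mαnαuαuα + πα + pαI for simplicity, and the gravitational force term vanishes on account of quasi-neutrality. On the other hand, from the Ampere and Faraday laws we have

\[
\partial_t J = \frac{c}{4\pi} \nabla \times (\partial_t B) = -\frac{c^2}{4\pi} \nabla \times (\nabla \times E),
\]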

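and thus

\[
\nabla \times (\nabla \times E) + d_p^{-2} E = \sum_\alpha \frac{1}{d_\alpha^2} \Big[ \frac{\nabla \cdot \Gamma_{j\alpha}}{e_\alpha n_\alpha} - \frac{u_\alpha \times B}{c} - \frac{R_\alpha}{e_\alpha n_\alpha} \Big], \tag{3.23}
\]

where we have defined the characteristic lengths,

\[
d_\alpha = c / \omega_{p,\alpha}, \qquad d_p^{-2} = \sum_\alpha d_\alpha^{-2}, \tag{3.24}
\]

that are referred to as the skin depth of the species α and the plasma skin depth, respectively. Here, the quantity

\[
\omega_{p,\alpha} = \sqrt{\frac{4\pi e_\alpha^2 n_\alpha}{m_\alpha}},
\]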

is the plasma frequency of the species α (c.g.s. units). Equation (3.23) is an elliptic equation for the electric field E and, together with appropriate boundary conditions, fully determines the electric field given the skin depths and the right-hand side.
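The characteristic lengths in (3.24) are straightforward to evaluate. The following sketch, with hypothetical function names and an assumed electron-proton plasma of density 10¹⁴ cm⁻³, computes the species skin depths and the plasma skin depth in c.g.s. units; the constants are standard values and the density is chosen only for illustration.

```python
import math

# Physical constants in c.g.s. (Gaussian) units.
E_CHARGE = 4.80320471e-10    # elementary charge [statC]
C_LIGHT = 2.99792458e10      # speed of light [cm/s]
M_ELECTRON = 9.1093837e-28   # electron mass [g]
M_PROTON = 1.67262192e-24    # proton mass [g]

def plasma_frequency(charge, n, m):
    """Plasma frequency [rad/s] of a species with charge [statC], density [cm^-3], mass [g]."""
    return math.sqrt(4.0 * math.pi * charge**2 * n / m)

def skin_depth(charge, n, m):
    """Skin depth d_alpha = c / omega_{p,alpha} [cm], cf. (3.24)."""
    return C_LIGHT / plasma_frequency(charge, n, m)

def plasma_skin_depth(species):
    """Plasma skin depth d_p with d_p^-2 = sum of d_alpha^-2, cf. (3.24).

    `species` is an iterable of (charge, density, mass) tuples.
    """
    inverse_square = sum(skin_depth(q, n, m) ** -2 for q, n, m in species)
    return inverse_square ** -0.5

# Assumed example: quasi-neutral electron-proton plasma.
n = 1.0e14
species = [(-E_CHARGE, n, M_ELECTRON), (E_CHARGE, n, M_PROTON)]
d_e = skin_depth(-E_CHARGE, n, M_ELECTRON)
d_i = skin_depth(E_CHARGE, n, M_PROTON)
d_p = plasma_skin_depth(species)
print(f"d_e = {d_e:.4e} cm, d_i = {d_i:.4e} cm, d_p = {d_p:.4e} cm")
```

Because the inverse squares add, the plasma skin depth is dominated by the lightest species: in this example d_p is only marginally shorter than the electron skin depth.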


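This suggests the following reformulation of system (3.19), where equation (3.23) is added to the system:

\[
\begin{aligned}
& \partial_t n_\alpha + \nabla \cdot (n_\alpha u_\alpha) = 0, \\
& \partial_t (m_\alpha n_\alpha u_\alpha) + \nabla \cdot (m_\alpha n_\alpha u_\alpha u_\alpha + \pi_\alpha) = -\nabla p_\alpha + e_\alpha n_\alpha \Big[ E + \frac{u_\alpha \times B}{c} \Big] + m_\alpha n_\alpha g + R_\alpha, \\
& \partial_t \big( \tfrac{3}{2} p_\alpha \big) + \nabla \cdot \big( \tfrac{3}{2} p_\alpha u_\alpha + q_\alpha \big) + p_\alpha \nabla \cdot u_\alpha + \pi_\alpha : \nabla u_\alpha = Q_\alpha, \\
& \nabla \times B = \frac{4\pi}{c} J, \\
& \partial_t B + c \nabla \times E = 0, \\
& \nabla \cdot B = 0, \\
& \nabla \times (\nabla \times E) + d_p^{-2} E = \sum_\alpha \frac{1}{d_\alpha^2} \Big[ \frac{\nabla \cdot \Gamma_{j\alpha}}{e_\alpha n_\alpha} - \frac{u_\alpha \times B}{c} - \frac{R_\alpha}{e_\alpha n_\alpha} \Big], \\
& 0 = \rho_c = \sum_\alpha e_\alpha n_\alpha, \\
& J = \sum_\alpha e_\alpha n_\alpha u_\alpha.
\end{aligned}
\tag{3.25}
\]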

Since we have shown that the new equation for E is implied by the quasi-neutral multi-fluid model (3.19), any solution of the original quasi-neutral multi-fluid model is also a solution of the reformulated model (3.25).

The advantage of the reformulation is that now we can formally prove that the last two equations in system (3.25) are just constraints on the initial conditions.

Specifically we claim that, if at the initial time t = 0 we pose initial conditions such that

\[
\sum_\alpha e_\alpha n_\alpha = \rho_c = 0, \qquad \text{and} \qquad \sum_\alpha e_\alpha n_\alpha u_\alpha = J = \frac{c}{4\pi} \nabla \times B,
\]

then the same conditions hold for later times t ≥ 0. Hence, three equations in the system are actually automatically satisfied and we do not need to solve them explicitly.

We have already observed that, if the claim holds for the current, then the particle continuity equations imply that ρc is constant and thus zero for t ≥ 0. It is therefore sufficient to prove the claim for the current. With this aim, we consider again the sum over particle species of the momentum-balance equations weighted by the charge-to-mass ratio eα/mα. With the same steps as before, we can write the result in the form

\[
\partial_t \Big( \sum_\alpha e_\alpha n_\alpha u_\alpha \Big) = \frac{c^2}{4\pi} \Big[ \frac{1}{d_p^2} E - \sum_\alpha \frac{1}{d_\alpha^2} \Big( \frac{\nabla \cdot \Gamma_{j\alpha} - R_\alpha}{e_\alpha n_\alpha} - \frac{u_\alpha \times B}{c} \Big) \Big].
\]

It is worth noting that we cannot identify ∑α eαnαuα with J yet, as that is the equation we want to prove. However, now we know the equation for the electric field, which implies that the right-hand side is equal to

\[
-\frac{c^2}{4\pi} \nabla \times (\nabla \times E) = \frac{c}{4\pi} \nabla \times (\partial_t B),
\]


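where Faraday's law has been accounted for. Therefore we have

\[
\partial_t \Big( \sum_\alpha e_\alpha n_\alpha u_\alpha - \frac{c}{4\pi} \nabla \times B \Big) = 0.
\]

Using the Ampere law as a definition of J, we have

\[
\partial_t \Big( \sum_\alpha e_\alpha n_\alpha u_\alpha - J \Big) = 0,
\]

which means that J = (c/4π)∇ × B is consistent with the sum of eαnαuα, provided that the initial conditions satisfy the Ampere law, namely,

\[
\nabla \times B = \frac{4\pi}{c} \sum_\alpha e_\alpha n_\alpha u_\alpha,
\]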

which is a physically reasonable request in the quasi-neutral limit. Therefore the last two equations in system (3.25) are identically satisfied provided that the initial conditions are properly chosen. In addition, this shows that a solution of the reformulated system (3.25) is also a solution of the original quasi-neutral model (3.19), which is then equivalent to the reformulated one.

Although this argument is purely formal, it shows that system (3.25) has a chance of being well posed, or at least that we have enough equations to determine all the variables compatibly with the constraints.

From now on we always implicitly assume that the initial conditions are well prepared in the sense explained above. Therefore, system (3.19) and system (3.25) are regarded as two equivalent formulations of the quasi-neutral multi-fluid model for plasmas.

3.3 From multi-fluid to a single-fluid model. Let us consider the sub-system of equations for the fluid variables nα, uα, and Tα in the fluid models of sections 3.1 and 3.2. It is remarkable that the multi-fluid equations imply a single-fluid model obtained by averaging over the species. Let us begin from the definition of the variables that describe such a single fluid.

The mass density ρ of the single fluid is given by the total mass density of the plasma, namely,

\[
\rho = \sum_\alpha m_\alpha n_\alpha, \tag{3.26}
\]

and this is intuitively natural. On the other hand, the single-fluid velocity u is identified with the velocity of the center of mass of fluid elements, that is

\[
\rho u = \sum_\alpha m_\alpha n_\alpha u_\alpha. \tag{3.27}
\]

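The single-fluid total energy density is given by the sum of the energies of the fluid element of each species, namely

\[
\sum_\alpha w_\alpha = \sum_\alpha \Big( \frac{1}{2} m_\alpha n_\alpha u_\alpha^2 + \frac{p_\alpha}{\gamma - 1} + m_\alpha n_\alpha \Phi_g \Big).
\]

The kinetic energy term is

\[
\sum_\alpha \frac{1}{2} m_\alpha n_\alpha u_\alpha^2 = \sum_\alpha \frac{1}{2} m_\alpha n_\alpha (u_\alpha - u + u)^2 = \sum_\alpha \frac{1}{2} m_\alpha n_\alpha (u_\alpha - u)^2 + \frac{1}{2} \rho u^2,
\]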


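from which we have

\[
\begin{aligned}
\sum_\alpha w_\alpha &= \frac{1}{2} \rho u^2 + \sum_\alpha \Big( \frac{p_\alpha}{\gamma - 1} + \frac{1}{2} m_\alpha n_\alpha (u_\alpha - u)^2 \Big) + \rho \Phi_g \\
&= \frac{1}{2} \rho u^2 + \frac{p}{\gamma - 1} + \rho \Phi_g,
\end{aligned}
\]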

with γ − 1 = 2/3 and the single-fluid pressure is given by

\[
p = \sum_\alpha \Big( p_\alpha + \frac{1}{3} m_\alpha n_\alpha (u_\alpha - u)^2 \Big). \tag{3.28}
\]
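Definitions (3.26), (3.27) and (3.28) can be checked on concrete numbers. The sketch below is only an illustration under stated assumptions: the two-species data are arbitrary dimensionless values, the dictionary layout and function names are hypothetical, and the gravitational potential is set to zero for simplicity. It verifies that the sum of the species energy densities equals the single-fluid expression ρu²/2 + p/(γ − 1) with γ = 5/3.

```python
def single_fluid(species):
    """Return (rho, u, p) from definitions (3.26), (3.27) and (3.28).

    `species` is a list of dicts with mass "m", density "n",
    velocity "u" (3-tuple) and partial pressure "p".
    """
    rho = sum(s["m"] * s["n"] for s in species)
    u = tuple(
        sum(s["m"] * s["n"] * s["u"][i] for s in species) / rho for i in range(3)
    )
    p = 0.0
    for s in species:
        # Squared relative velocity |u_alpha - u|^2 of this species.
        du2 = sum((s["u"][i] - u[i]) ** 2 for i in range(3))
        p += s["p"] + s["m"] * s["n"] * du2 / 3.0
    return rho, u, p

def multi_fluid_energy(species):
    """Sum over species of kinetic plus internal energy density (Phi_g = 0, gamma = 5/3)."""
    return sum(
        0.5 * s["m"] * s["n"] * sum(c * c for c in s["u"]) + 1.5 * s["p"]
        for s in species
    )

# Arbitrary illustrative data: a heavy species and a light, fast one.
species = [
    {"m": 1.0, "n": 2.0, "u": (0.3, -0.1, 0.5), "p": 1.2},
    {"m": 0.01, "n": 2.0, "u": (-4.0, 2.0, 1.0), "p": 0.9},
]
rho, u, p = single_fluid(species)
single_fluid_energy = 0.5 * rho * sum(c * c for c in u) + 1.5 * p
print(single_fluid_energy, multi_fluid_energy(species))
```

The single-fluid pressure exceeds the sum of the partial pressures by the kinetic energy of the relative motion, which is why the two energies coincide without any approximation.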

We can now proceed with the derivation of equations for the single-fluid variables ρ, u, and p from the multi-fluid system (3.25).

The mass continuity equation for the single fluid is obtained from the particle continuity equations in (3.25). Upon multiplying by mα and summing over species, one gets

\[
\partial_t \Big( \sum_\alpha m_\alpha n_\alpha \Big) + \nabla \cdot \Big( \sum_\alpha m_\alpha n_\alpha u_\alpha \Big) = 0,
\]

which reads

\[
\partial_t \rho + \nabla \cdot (\rho u) = 0, \tag{3.29}
\]

where we have used definition (3.26) of the total mass density and definition (3.27) of the center-of-mass velocity. In fact, this result justifies the choice of the center-of-mass velocity as the fluid velocity u.

The sum over α of the momentum balance equations in (3.25) gives

\[
\partial_t \Big( \sum_\alpha m_\alpha n_\alpha u_\alpha \Big) + \nabla \cdot \Big( \sum_\alpha (m_\alpha n_\alpha u_\alpha u_\alpha + \pi_\alpha) \Big) + \nabla \Big( \sum_\alpha p_\alpha \Big) = \frac{J \times B}{c} + \rho g,
\]

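where the quasi-neutrality condition ρc = ∑α eαnα = 0 and the identity J = ∑α eαnαuα (that are both implied by system (3.25)) have been accounted for, as well as the first of conditions (3.3). In addition, we compute

\[
\begin{aligned}
\sum_\alpha m_\alpha n_\alpha u_\alpha u_\alpha &= \sum_\alpha m_\alpha n_\alpha (u_\alpha - u + u)(u_\alpha - u + u) \\
&= \rho u u + \sum_\alpha m_\alpha n_\alpha (u_\alpha - u)(u_\alpha - u) \\
&= \rho u u + \sum_\alpha \Big( \Big[ m_\alpha n_\alpha (u_\alpha - u)(u_\alpha - u) - \frac{1}{3} m_\alpha n_\alpha (u_\alpha - u)^2 I \Big] + \frac{1}{3} m_\alpha n_\alpha (u_\alpha - u)^2 I \Big).
\end{aligned}
\]

The tensor in square brackets is symmetric and trace-free. We can therefore define the single-fluid viscosity

\[
\pi = \sum_\alpha \Big( \pi_\alpha + m_\alpha n_\alpha (u_\alpha - u)(u_\alpha - u) - \frac{1}{3} m_\alpha n_\alpha (u_\alpha - u)^2 I \Big), \tag{3.30}
\]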


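so that

\[
\nabla \cdot \Big( \sum_\alpha (m_\alpha n_\alpha u_\alpha u_\alpha + \pi_\alpha) \Big) + \nabla \Big( \sum_\alpha p_\alpha \Big) = \nabla \cdot (\rho u u + \pi) + \nabla p,
\]

where the single-fluid pressure p has been defined in (3.28). Finally, the sum of the momentum balance equations can be written in the form

\[
\partial_t (\rho u) + \nabla \cdot (\rho u u + \pi) = -\nabla p + \frac{J \times B}{c} + \rho g. \tag{3.31}
\]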

As a consequence of quasi-neutrality and of the conservation properties of the collision operator, cf. equation (3.3), the only forces acting on the plasma in the single-fluid picture are the classical J × B force due to a current flowing in the presence of a magnetic field and (if present) gravity.

Finally, we need to address the transport of internal energy of the single fluid. With this aim it is convenient to start from the balance equation for the total fluid energy, namely, equation (3.5), in which the energy flux is

\[
\sum_\alpha \Gamma_{w\alpha} = \sum_\alpha \Big[ \tfrac{1}{2} m_\alpha n_\alpha u_\alpha^2 u_\alpha + m_\alpha n_\alpha \Phi_g u_\alpha + \tfrac{5}{2} p_\alpha u_\alpha + \pi_\alpha \cdot u_\alpha + q_\alpha \Big].
\]

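We need to recast the energy flux in a better form. With this aim, we can make use of the identity

\[
\begin{aligned}
\sum_\alpha \tfrac{1}{2} m_\alpha n_\alpha u_\alpha^2 u_\alpha
&= \sum_\alpha \tfrac{1}{2} m_\alpha n_\alpha (u_\alpha - u + u)^2 (u_\alpha - u + u) \\
&= \sum_\alpha \tfrac{1}{2} m_\alpha n_\alpha \big( (u_\alpha - u)^2 + 2 u \cdot (u_\alpha - u) + u^2 \big) (u_\alpha - u + u) \\
&= \tfrac{1}{2} \rho u^2 u + u \cdot \Big( \sum_\alpha m_\alpha n_\alpha (u_\alpha - u)(u_\alpha - u) \Big) \\
&\quad + \tfrac{1}{2} \sum_\alpha m_\alpha n_\alpha (u_\alpha - u)^2 u + \tfrac{1}{2} \sum_\alpha m_\alpha n_\alpha (u_\alpha - u)^2 (u_\alpha - u),
\end{aligned}
\]

together with

\[
\begin{aligned}
\sum_\alpha p_\alpha u_\alpha &= \Big( \sum_\alpha p_\alpha \Big) u + \sum_\alpha p_\alpha (u_\alpha - u), \\
\sum_\alpha \pi_\alpha \cdot u_\alpha &= \Big( \sum_\alpha \pi_\alpha \Big) \cdot u + \sum_\alpha \pi_\alpha \cdot (u_\alpha - u).
\end{aligned}
\]

The combination of the foregoing identities and the definitions of single-fluid pressure and viscosity, cf. equations (3.28) and (3.30), gives

\[
\sum_\alpha \Gamma_{w\alpha} = \Big( \tfrac{1}{2} \rho u^2 + \tfrac{p}{\gamma - 1} + \rho \Phi_g \Big) u + \pi \cdot u + u p + q,
\]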

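where the single-fluid heat flux is defined by

\[
q = \sum_\alpha \Big[ q_\alpha + \tfrac{5}{2} p_\alpha (u_\alpha - u) + \pi_\alpha \cdot (u_\alpha - u) + \tfrac{1}{2} m_\alpha n_\alpha (u_\alpha - u)^2 (u_\alpha - u) \Big]. \tag{3.32}
\]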
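The regrouping of the energy flux into single-fluid quantities can also be verified numerically. The sketch below is an illustration under simplifying assumptions made here and not in the text: the species viscosities πα and heat fluxes qα are set to zero, Φg = 0, γ = 5/3, and the species data and names are arbitrary. It evaluates the sum of the species energy fluxes directly and through definitions (3.28), (3.30) and (3.32).

```python
def energy_flux_both_ways(species):
    """Return (direct, regrouped, pi) for species with pi_alpha = q_alpha = 0, Phi_g = 0.

    direct:    sum over species of (1/2 m n |u_a|^2 + 5/2 p_a) u_a
    regrouped: (1/2 rho u^2 + 3/2 p) u + pi . u + p u + q, using (3.28), (3.30), (3.32)
    """
    dims = range(3)
    rho = sum(s["m"] * s["n"] for s in species)
    u = [sum(s["m"] * s["n"] * s["u"][i] for s in species) / rho for i in dims]

    p = 0.0
    pi = [[0.0] * 3 for _ in dims]   # single-fluid viscosity (3.30)
    q = [0.0] * 3                    # single-fluid heat flux (3.32)
    direct = [0.0] * 3
    for s in species:
        mn = s["m"] * s["n"]
        du = [s["u"][i] - u[i] for i in dims]
        du2 = sum(c * c for c in du)
        ua2 = sum(c * c for c in s["u"])
        p += s["p"] + mn * du2 / 3.0
        for i in dims:
            for j in dims:
                pi[i][j] += mn * du[i] * du[j] - (mn * du2 / 3.0 if i == j else 0.0)
            q[i] += 2.5 * s["p"] * du[i] + 0.5 * mn * du2 * du[i]
            direct[i] += (0.5 * mn * ua2 + 2.5 * s["p"]) * s["u"][i]

    u2 = sum(c * c for c in u)
    regrouped = [
        (0.5 * rho * u2 + 1.5 * p) * u[i]
        + sum(pi[i][j] * u[j] for j in dims)
        + p * u[i]
        + q[i]
        for i in dims
    ]
    return direct, regrouped, pi

# Arbitrary illustrative data.
species = [
    {"m": 1.0, "n": 2.0, "u": (0.3, -0.1, 0.5), "p": 1.2},
    {"m": 0.01, "n": 2.0, "u": (-4.0, 2.0, 1.0), "p": 0.9},
]
direct, regrouped, pi = energy_flux_both_ways(species)
print(direct)
print(regrouped)
```

The two vectors agree to rounding error, and the relative-motion part of the viscosity has zero trace, consistent with the remark that the tensor in (3.30) is trace-free.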


Correspondingly, the energy balance equation (3.5) takes the form of the usual total energy balance of fluid dynamics, cf. equation (1.23),

\[
\partial_t \Big( \tfrac{1}{2} \rho u^2 + \tfrac{p}{\gamma - 1} + \rho \Phi_g \Big) + \nabla \cdot \Big[ \Big( \tfrac{1}{2} \rho u^2 + \tfrac{p}{\gamma - 1} + \rho \Phi_g \Big) u + \pi \cdot u + u p + q \Big] = J \cdot E, \tag{3.33}
\]

the only energy source being the J · E Ohmic heating term.

We have therefore derived single-fluid equations, namely equations (3.29), (3.31), and (3.33), from the multi-fluid model. One should note, however, that there is a difference with respect to ordinary fluids: In equation (1.23) the work done by the forces is just ρu · f where ρf is the acting force on the right-hand side of the momentum balance. In the case of plasmas, cf. equation (3.31), that would give u · (J × B)/c, which differs from the Ohmic heating term on the right-hand side of the energy balance (3.33). Therefore, a literal transcription of fluid equations to the case of plasmas (just by substitution of ρf with the J × B force plus gravity) would break energy conservation, cf. Poynting's theorem in section 3.1, leading to physically incorrect results.

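As a consequence, the transport equation for the internal energy differs from that of ordinary fluids. We proceed as for equation (1.25c),

\[
\begin{aligned}
& \partial_t \Big( \frac{1}{2} \rho u^2 + \frac{p}{\gamma - 1} + \rho \Phi_g \Big) + \nabla \cdot \Big[ \Big( \frac{1}{2} \rho u^2 + \frac{p}{\gamma - 1} + \rho \Phi_g \Big) u + \pi \cdot u + u p + q \Big] \\
& \qquad = \partial_t \Big( \frac{p}{\gamma - 1} \Big) + \nabla \cdot \Big[ \frac{p}{\gamma - 1} u + q \Big] + p \nabla \cdot u + \pi : \nabla u + \rho u \cdot \nabla \Phi_g \\
& \qquad \quad + u \cdot \big( \rho\, \partial_t u + \rho u \cdot \nabla u + \nabla \cdot \pi + \nabla p \big),
\end{aligned}
\]

while the momentum balance equation (3.31) yields

\[
\rho\, \partial_t u + \rho u \cdot \nabla u + \nabla \cdot \pi + \nabla p = \frac{J \times B}{c} - \rho \nabla \Phi_g.
\]

By means of the two foregoing identities, equation (3.33) can be rewritten in the form of a transport equation for the single-fluid internal energy, namely,

\[
\partial_t \Big( \frac{p}{\gamma - 1} \Big) + \nabla \cdot \Big[ \frac{p}{\gamma - 1} u + q \Big] + p \nabla \cdot u + \pi : \nabla u = J \cdot E - u \cdot (J \times B)/c.
\]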

The rather complicated heat-exchange term on the right-hand side can be phys-ically understood by means of the vector calculus identity

u · (J ×B) = J · (B × u) = −J · (u×B),

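for which the heat transport equation can be written as

\[
\partial_t \Big( \frac{p}{\gamma - 1} \Big) + \nabla \cdot \Big[ \frac{p}{\gamma - 1} u + q \Big] + p \nabla \cdot u + \pi : \nabla u = J \cdot \Big( E + \frac{u \times B}{c} \Big). \tag{3.34}
\]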

The right-hand side can now be understood as the Ohmic heating term J · E′, with E′ = E + u × B/c being the electric field transformed into the reference frame of the local fluid velocity u in the non-relativistic limit (cf. Jackson's book [50] for Lorentz transformations of electromagnetic fields).
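The equivalence of the two forms of the heating term, J · E − u · (J × B)/c and J · (E + u × B/c), is a pointwise algebraic identity and can be spot-checked on arbitrary vectors. The sketch below uses hypothetical function names and arbitrary test values, with c set to 1 purely so that the magnetic term is of comparable size in the check.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def heating_fluid_frame(J, E, B, u, c):
    """Right-hand side of (3.34): J . E' with E' = E + u x B / c."""
    uxB = cross(u, B)
    E_prime = tuple(E[i] + uxB[i] / c for i in range(3))
    return dot(J, E_prime)

def heating_from_energy_balance(J, E, B, u, c):
    """Ohmic term minus the work of the J x B force: J . E - u . (J x B) / c."""
    return dot(J, E) - dot(u, cross(J, B)) / c

# Arbitrary test vectors in normalized units (c = 1 is an assumption of this check).
J = (1.0, 2.0, 3.0)
E = (0.5, -1.0, 0.2)
B = (0.1, 0.4, -0.3)
u = (2.0, 0.0, -1.0)
a = heating_fluid_frame(J, E, B, u, c=1.0)
b = heating_from_energy_balance(J, E, B, u, c=1.0)
print(a, b)
```

The check relies only on the cyclic property of the triple product, u · (J × B) = −J · (u × B), so it holds for any choice of the four vectors.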

In summary, we have shown for the multi-fluid model that the center-of-mass fluid, with variables defined by equations (3.26), (3.27) and (3.28), satisfies the equations of ordinary fluid dynamics given by equations (3.29), (3.31), and (3.34). Let us note that this is an exact result: No approximations have been introduced.

Such single-fluid equations, however, are not closed since (i) viscosity (3.30) and heat flux (3.32) depend on all the multi-fluid variables, and (ii) the electrodynamic quantities, particularly the electric field E, need to be computed from the electrodynamics part of model (3.25), which again depends on the multi-fluid variables.

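The simplest answer to the first point is to consider a situation where Euler's closure π = 0, q = 0 applies, cf. section 1.6, and that yields Euler's equations for a compressible inviscid plasma, namely,

\[
\begin{aligned}
& \frac{D\rho}{Dt} + \rho \nabla \cdot u = 0, \\
& \rho \frac{Du}{Dt} = -\nabla p + \frac{J \times B}{c} + \rho g, \\
& \frac{D}{Dt} \frac{p}{\gamma - 1} + \frac{\gamma}{\gamma - 1} p \nabla \cdot u = J \cdot \Big( E + \frac{u \times B}{c} \Big),
\end{aligned}
\tag{3.35}
\]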

where D/Dt = ∂t + u · ∇ is the advective derivative. The validity of this system, which is just Euler’s system (1.41) with the appropriate force and heating terms, is of course limited by the closure conditions. It is physically meaningful to consider a case where each species is close to a local Maxwellian, hence, πα ≈ 0 and qα ≈ 0 for all α, but this assumption alone does not suffice. By inspection of equations (3.30) and (3.32) we see that we need

uα ≈ u, (3.36)

which means that all plasma species move with approximately the same velocity. Here, the symbol “≈” must be interpreted as “approximately equal as functions”, that is, not just the values of the functions but also the values of their derivatives are approximately the same. This, in particular, implies that the pressure is approximated by Dalton’s law of partial pressures, cf. equation (3.28),

$$
p \approx \sum_\alpha p_\alpha,
$$

thus neglecting the kinetic energy associated with the relative velocity uα − u. Condition (3.36) is of course very strong. We shall discuss this point in section 3.4 together with the electrodynamics part of the model.

As a last remark, let us note that even if we can obtain a closed system for single-fluid quantities, in general we cannot infer from a solution thereof a solution for the individual particle species. This is however possible for the specific case of an electron-ion plasma addressed in section 3.4.

3.4 The Ohm’s law for an electron-ion plasma. In a multi-fluid model, different species can have significantly different dynamics. The calculation of section 3.3 shows in full generality that we can obtain a single-fluid model for the center-of-mass fluid provided that we take into account the stress and heat flux due to the relative motion uα − u of the different species.

In most cases, however, a single ion species (majority ions) accounts for most of the mass and positive electric charge in the plasma; other ion species are impurities with small concentrations. Electrons provide the neutralizing negative charge.

It is therefore natural to consider a plasma with a single ion species plus electrons, with the idea that additional impurities have to be treated as perturbations. In that case, the single-fluid quantities are, cf. equations (3.26), (3.27),

$$
\begin{aligned}
0 &= Z_i n_i - n_e, \\
\rho &= m_i n_i + m_e n_e, \\
\rho u &= m_i n_i u_i + m_e n_e u_e, \\
J &= e n_e (u_i - u_e),
\end{aligned}
$$

where Zi = ei/e is the charge state of the majority ion species; here, α = i refers to ions and α = e refers to electrons. It follows that, for a quasi-neutral electron-ion plasma, there is a one-to-one relation between single-fluid quantities (on the left-hand side) and multi-fluid quantities (on the right-hand side). Therefore, the single-fluid model and the multi-fluid model (which is referred to as two-fluid for an electron-ion plasma) are totally equivalent provided that the stress and heat flux due to the relative velocity are accounted for. Nonetheless, such an equivalent single-fluid formulation would be rather complicated, so that the two-fluid model with electrons and ions treated independently should be preferred.

At the price of some approximations, we can however derive a convenient single-fluid model which will eventually lead to magnetohydrodynamics. In fact, with one single ion species, it is easier to satisfy condition (3.36), due to the small electron mass, me/mi ≈ 1/1836.

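We have the exact relations,

$$
\begin{aligned}
u_i &= u + \frac{\delta}{1+\delta}\frac{J}{e n_e}, \\
u_e &= u - \frac{1}{1+\delta}\frac{J}{e n_e} = u - \frac{J}{e n_e} + \frac{\delta}{1+\delta}\frac{J}{e n_e},
\end{aligned}
\qquad \delta = \frac{m_e}{m_i}Z_i. \tag{3.37}
$$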

For the ion velocity, we see that ui ≈ u apart from a correction of the order of the mass ratio δ = Zi me/mi ≪ 1, so that ions satisfy condition (3.36) well as long as J/(ene) stays bounded. For the electron velocity, in addition to the same term of order δ, we find the J/(ene) velocity which accounts for the drift of electrons with respect to ions, necessary to sustain a current density. Hence, condition (3.36) for electrons requires

$$
\frac{|J|}{e n_e} \ll |u|, \tag{3.38}
$$

which is much more difficult to verify.

Under assumption (3.38), we can make use of Euler’s equations (3.35), the solution of which allows us to reconstruct a good approximation of the two-fluid variables by relations (3.37).
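As an illustration of this reconstruction step, the following minimal sketch evaluates relations (3.37) for given single-fluid data. The function name, the tuple representation of vectors, and the numerical values are assumptions made here for illustration only; Gaussian units are used as in the text.

```python
E_CHARGE = 4.80320471e-10  # elementary charge [statC], Gaussian units


def two_fluid_velocities(u, J, n_e, delta):
    """Return (u_i, u_e) from the single-fluid velocity u and current density J.

    Implements relations (3.37): u and J are 3-tuples, n_e is the electron
    density, and delta = Z_i * m_e / m_i is the mass ratio.
    """
    drift = tuple(Jk / (E_CHARGE * n_e) for Jk in J)  # J / (e n_e)
    w_i = delta / (1.0 + delta)                       # ion weight, of order delta
    w_e = 1.0 / (1.0 + delta)                         # electron weight, close to 1
    u_i = tuple(uk + w_i * dk for uk, dk in zip(u, drift))
    u_e = tuple(uk - w_e * dk for uk, dk in zip(u, drift))
    return u_i, u_e


# Hypothetical hydrogen-plasma values, chosen only for illustration.
delta = 1.0 / 1836.0
u = (1.0e6, 0.0, 0.0)   # fluid velocity [cm/s]
J = (0.0, 3.0e8, 0.0)   # current density [statA/cm^2]
n_e = 1.0e13            # electron density [cm^-3]
u_i, u_e = two_fluid_velocities(u, J, n_e, delta)
```

By construction the output satisfies ui − ue = J/(ene) and (ui + δue)/(1 + δ) = u, which are the defining relations of the current density and of the center-of-mass velocity.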

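As anticipated at the end of section 3.3, in order to close the single-fluid system, we need to write the electrodynamics part of model (3.25) in terms of single-fluid variables only. The issue has to do with the equation for the electric field, which depends on multi-fluid variables. That is equivalent to, cf. section 3.2,

$$
\partial_t J + \sum_\alpha \nabla\cdot(e_\alpha n_\alpha u_\alpha u_\alpha) + \sum_\alpha \frac{e_\alpha}{m_\alpha}\big(\nabla\cdot\pi_\alpha + \nabla p_\alpha\big)
= \sum_\alpha \frac{e_\alpha^2 n_\alpha}{m_\alpha}\Big(E + \frac{u_\alpha\times B}{c}\Big) + \sum_\alpha \frac{e_\alpha}{m_\alpha}R_\alpha.
$$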

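For a quasi-neutral electron-ion plasma, we have

$$
\begin{aligned}
\sum_{\alpha\in\{i,e\}} e_\alpha n_\alpha u_\alpha u_\alpha &= e n_e (u_i u_i - u_e u_e) \\
&= e n_e\Big(u_i u_i - \Big(u_i - \frac{J}{e n_e}\Big)\Big(u_i - \frac{J}{e n_e}\Big)\Big) \\
&= u_i J + J u_i - \frac{JJ}{e n_e} \\
&= uJ + Ju - \frac{JJ}{e n_e} + \frac{2\delta}{1+\delta}\frac{JJ}{e n_e},
\end{aligned}
$$

where the first of equations (3.37) has been accounted for in the last equality and δ is the mass ratio defined in (3.37). Thereby, one gets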

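$$
\begin{aligned}
&E + \frac{u_e\times B}{c} - \frac{R_e}{e n_e} + \frac{1}{e n_e}\nabla\cdot(\pi_e + p_e I) - \frac{m_e}{e^2 n_e}\Big[\partial_t J + \nabla\cdot\Big(uJ + Ju - \frac{JJ}{e n_e}\Big)\Big] \\
&\quad = -\delta\Big[E + \frac{u_i\times B}{c} + \frac{R_i}{e n_e} - \frac{1}{e n_e}\nabla\cdot(\pi_i + p_i I) - \frac{2}{1+\delta}\frac{m_e}{e^2 n_e}\nabla\cdot\Big(\frac{JJ}{e n_e}\Big)\Big],
\end{aligned}
$$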

where all terms of order δ have been isolated on the right-hand side. The collision forces Rα for an electron-ion plasma have been computed by Braginskii [38, 44]. Here, we take into account only the friction force which tends to equalize electron and ion velocities, namely,

$$
-R_i = R_e = \frac{m_e n_e}{\tau_e}(u_i - u_e) = \frac{m_e}{e\tau_e}J,
$$

where [44]

$$
\tau_e = \frac{3\sqrt{m_e}\,(k_B T_e)^{3/2}}{4\sqrt{2\pi}\,e^4 n_e \Lambda}, \tag{3.39}
$$

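is the electron collision time, and Λ is the Coulomb logarithm. With respect to the full Braginskii expression, thermal forces and plasma anisotropy have been neglected. With this expression for the collision force and with ui, ue written in terms of u, J, we obtain

$$
\begin{aligned}
&E + \frac{u\times B}{c} - \eta J + \frac{1}{1+\delta}\Big[\frac{1}{e n_e}\nabla\cdot(\pi_e + p_e I) - \frac{J\times B}{e n_e c} - \frac{m_e}{e^2 n_e}\Big(\partial_t J + \nabla\cdot\Big(uJ + Ju - \frac{JJ}{e n_e}\Big)\Big)\Big] \\
&\quad = \frac{\delta}{1+\delta}\Big[\frac{1}{e n_i}\nabla\cdot(\pi_i + p_i I) - \frac{J\times B}{e n_e c} - \frac{2}{1+\delta}\frac{m_e}{e^2 n_e}\nabla\cdot\Big(\frac{JJ}{e n_e}\Big)\Big],
\end{aligned}
\tag{3.40}
$$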

where the plasma resistivity is given by

$$
\eta = \frac{m_e}{e^2 n_e \tau_e}. \tag{3.41}
$$
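To make the scaling of the resistivity concrete, the following sketch evaluates equations (3.39) and (3.41) in Gaussian units. The function names are hypothetical, and the Coulomb logarithm is treated as a user-supplied constant, which is an assumption made here for simplicity (in practice Λ depends weakly on density and temperature).

```python
import math

# Physical constants in Gaussian (CGS) units
M_E = 9.1093837e-28        # electron mass [g]
E_CHARGE = 4.80320471e-10  # elementary charge [statC]
K_B = 1.380649e-16         # Boltzmann constant [erg/K]


def electron_collision_time(T_e, n_e, coulomb_log):
    """Electron collision time tau_e of equation (3.39), in seconds.

    T_e is the electron temperature [K], n_e the electron density [cm^-3],
    coulomb_log the Coulomb logarithm (assumed given).
    """
    num = 3.0 * math.sqrt(M_E) * (K_B * T_e) ** 1.5
    den = 4.0 * math.sqrt(2.0 * math.pi) * E_CHARGE ** 4 * n_e * coulomb_log
    return num / den


def resistivity(T_e, n_e, coulomb_log):
    """Plasma resistivity eta of equation (3.41), in seconds (Gaussian units)."""
    tau_e = electron_collision_time(T_e, n_e, coulomb_log)
    return M_E / (E_CHARGE ** 2 * n_e * tau_e)
```

With Λ held fixed, the density cancels between (3.39) and (3.41), so that η depends on the temperature only and decreases like the −3/2 power of the electron temperature.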

Equation (3.40) is equivalent to a vectorial Helmholtz equation for the electric field, as discussed in section 3.2, and it is known as the generalized Ohm’s law for an electron-ion plasma. In this case, it gives an explicit expression for the electric field E in terms of the other variables.

This form of the generalized Ohm’s law is entirely general: Apart from a specific choice of the collision forces, the only condition that has been used is quasi-neutrality.

In magnetohydrodynamics, the generalized Ohm’s law is further simplified in order to capture the essential physics. The most straightforward approximation is to neglect the terms of order δ: we set $(1+\delta)^{-1} \approx 1$ on the left-hand side of Ohm’s law and neglect all terms on the right-hand side. With the assumption that electrons are (nearly) in local thermodynamical equilibrium, the electron viscosity πe is also neglected. The resulting approximate form of the generalized Ohm’s law stands at the basis of extended MHD, extensively studied by Morrison and co-workers [59, 60, 61], cf. also appendix B.

The magnitude of the remaining terms is less obvious to establish, and different choices lead to different flavors of MHD, each being a particular approximation of extended MHD, cf. appendix B. In standard MHD theory the Ohm’s law is approximated by

$$
E + \frac{u\times B}{c} = \eta J, \tag{3.42}
$$

which is just the classical Ohm’s law E′ = ηJ written in the reference frame of the local fluid velocity, E′ = E + u × B/c being the Lorentz-transformed electric field in the weakly relativistic limit. Here, the electron inertia term, proportional to $m_e/(e^2 n_e)$ in the generalized Ohm’s law (3.40), and the Hall effect, given by the J×B term, have been neglected in view of condition (3.38). Last, the electron pressure gradient ∇pe is neglected compared to the electric force eneE.

In correspondence to this choice of the Ohm’s law, the heating term in the pressure equation of Euler’s system (3.35) becomes

$$
J\cdot E' = \eta J^2, \tag{3.43}
$$

which is the classic form of resistive heating.

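It is worth noting that the perpendicular (with respect to B) part of the single-fluid velocity is entirely determined by the Ohm’s law (3.42) in terms of the electromagnetic fields and the current. Indeed, taking the cross product of B with u × B = c(ηJ − E) gives

$$
u_\perp = -\frac{B\times(B\times u)}{B^2} = c\,\frac{E\times B}{B^2} + c\,\eta\,\frac{B\times J}{B^2}.
$$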

We see that in the absence of resistivity (η = 0), the component of the fluid velocity perpendicular to the magnetic field is given by the celebrated E × B-drift velocity, namely,

$$
v_E = c\,\frac{E\times B}{B^2}. \tag{3.44}
$$

This velocity field plays an important role in plasma physics and can be understood by considering the motion of a particle of electric charge eα and mass mα under the influence of the Lorentz force, cf. section 2.2.
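The E × B-drift (3.44) is easy to evaluate numerically. The sketch below, with hypothetical helper names and illustrative field values, checks that when E is given by the ideal Ohm’s law E = −u × B/c, the drift velocity reproduces the part of u perpendicular to B.

```python
C_LIGHT = 2.99792458e10  # speed of light [cm/s]


def cross(a, b):
    """Cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-tuples."""
    return sum(x * y for x, y in zip(a, b))


def exb_drift(E, B):
    """E x B drift velocity v_E = c E x B / B^2, equation (3.44), Gaussian units."""
    B2 = dot(B, B)
    return tuple(C_LIGHT * x / B2 for x in cross(E, B))


# Illustrative data: a velocity with a component along B = B_z e_z.
u = (3.0e5, -2.0e5, 1.0e5)                         # [cm/s]
B = (0.0, 0.0, 1.0e4)                              # [G]
E = tuple(-x / C_LIGHT for x in cross(u, B))       # ideal Ohm's law, eta = 0
v_E = exb_drift(E, B)                              # equals (u_x, u_y, 0) up to rounding
```

The parallel component of u is not recovered, consistently with the fact that the Ohm’s law constrains only the perpendicular velocity.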

3.5 The equations of magnetohydrodynamics. We have now built the necessary basis to state the equations of magnetohydrodynamics. Let us summarize the results obtained in the previous sections, in order to have a complete overview of the derivation.

We have shown in section 3.3 that the quasi-neutral model (3.19), or equivalently its reformulated version (3.25), implies single-fluid equations (3.29), (3.31), and (3.34) exactly, provided that stresses and heat fluxes due to the relative velocity of different species are accounted for.

For the specific case of an electron-ion plasma, such stresses and heat fluxes can be neglected provided that condition (3.38) is fulfilled. In addition, if all species are close to local thermodynamical equilibrium, Euler’s equations (3.35) apply in the sense that given a solution thereof one can construct electron and ion fluid variables by means of identities (3.37), and those are expected to be good approximations of a solution of the quasi-neutral two-fluid model. Finally, we have the MHD form (3.42) of Ohm’s law, which holds when electron inertia, electron pressure, and Hall effect can be neglected.

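Under the conditions summarized above and for an electron-ion plasma, we can replace the multi-fluid equations in (3.25) by single-fluid Euler’s equations (3.35) coupled to Ohm’s law (3.42), with the result that

$$
\begin{aligned}
\frac{D\rho}{Dt} + \rho\nabla\cdot u &= 0, \\
\rho\frac{Du}{Dt} &= -\nabla p + \frac{J\times B}{c} + \rho g, \\
\frac{D}{Dt}\frac{p}{\gamma-1} + \frac{\gamma}{\gamma-1}p\nabla\cdot u &= \eta J^2, \\
E + \frac{u\times B}{c} &= \eta J, \\
\partial_t B + c\nabla\times E &= 0, \\
\nabla\times B &= \frac{4\pi}{c}J, \\
\nabla\cdot B &= 0.
\end{aligned}
\tag{3.45}
$$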

This is a closed system of equations describing the dynamics of an ideal compressible electrically conducting quasi-neutral fluid with finite resistivity, which is referred to as the magnetohydrodynamic equations or, more precisely, the resistive magnetohydrodynamic equations.

The derivation of MHD equations in the framework of multi-fluid theory allows us to state validity conditions, which are repeated and commented on here for the sake of completeness:

• Quasi-neutrality and low-frequency dynamics, cf. equations (3.12),

$$
\tau = \frac{L}{V}, \qquad M V^2 = k_B T, \qquad \frac{V}{c} = \frac{\lambda_D}{L} \ll 1. \tag{3.46a}
$$

The condition on the time scale τ limits the range of frequencies where MHD applies. E.g., high-frequency phenomena such as ion cyclotron, whistler, and electron cyclotron waves are well outside the applicability of MHD equations. The last two conditions, on the other hand, are rather weak, i.e., easy to satisfy in most fusion and astrophysical plasmas.

• Electron-ion plasma with small electron drift velocity, cf. equation (3.38),

$$
\frac{|J|}{e n_e} \ll |u|. \tag{3.46b}
$$

This is the most difficult condition. Within standard MHD, large localized current densities (current sheets) can occur. Near a current sheet, J can be very large, signaling that electron motion is very different from ion motion. Then, a two-fluid model is to be preferred to plain MHD.

• In the MHD form (3.42) of Ohm’s law,

$$
|\nabla p_e| \ll e n_e |E|, \tag{3.46c}
$$

while inertia and Hall effects have been dropped by virtue of (3.46b).

By inspection of equations (3.45), one can see that the condition ∇ · B = 0, by virtue of Faraday’s induction law, amounts to a constraint on the initial conditions only; we shall always assume that initial data are compatible with this constraint. In addition, the current density J and the electric field E are algebraically given in terms of the magnetic field and velocity. They can therefore be easily removed from the system.

With reference to the current density, let us consider the J × B force in the momentum balance equation. By vector calculus,

$$
\frac{J\times B}{c} = \frac{1}{4\pi}(\nabla\times B)\times B = -\nabla\Big(\frac{B^2}{8\pi}\Big) + \frac{1}{4\pi}B\cdot\nabla B, \tag{3.47}
$$

that is, the force generated by the magnetic field on the fluid amounts to two contributions. The first is the gradient of the magnetic-field energy density and has a similar effect as the fluid pressure: It produces a force directed from regions with high magnetic field intensity to regions with low magnetic field intensity; for this reason the magnetic-field energy density is also referred to as magnetic pressure. The dimensionless number comparing the fluid and the magnetic pressure is the plasma β parameter, defined by

$$
\beta = \frac{8\pi p}{B^2}.
$$

A low value of the plasma β, e.g., β ≈ $10^{-3}$, $10^{-2}$, usually implies a moderate plasma temperature and density with a high magnetic field. Those are rather dilute plasmas that can be well confined by the magnetic field. Examples comprise tokamak plasmas and the low-density regions of the solar corona (coronal holes [62]). On the other hand, high β, e.g., β ≈ 1, suggests that the fluid pressure can be comparable to or even exceed the magnetic pressure. The solar wind, coronal loops and helmet streamers [15] are examples of high-β plasmas. The second component of the J × B force, namely, B · ∇B, is referred to as the field-line bending force: One can notice that B · ∇B vanishes when B is constant along magnetic field lines, i.e., when the field lines are straight. Any curvature of the field lines activates this force. We can compare the order of magnitude of the field-line bending force with the advection term u · ∇u in the momentum equation, by means of the scaling argument which is typical in fluid dynamics, cf. also section 3.2; if ρ, B, and V are the typical magnitudes of density, magnetic field, and velocity respectively, and L the scale of the gradient, we estimate

$$
\frac{4\pi\rho\,|u\cdot\nabla u|}{|B\cdot\nabla B|} \approx \frac{V^2}{V_A^2} = M_m^2 = Al^{-2}, \qquad V_A^2 = \frac{B^2}{4\pi\rho},
$$

where VA is the Alfvén speed and the dimensionless number Mm = V/VA is referred to as the magnetic Mach number; equivalently, its inverse Al, the Alfvén number, is also used. For large magnetic Mach numbers (or small Alfvén numbers), the flow is dominated by its inertia and the effect of the magnetic field is less important, while the converse holds for small magnetic Mach numbers (or large Alfvén numbers).
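The dimensionless numbers introduced so far can be evaluated directly from their definitions. The following is a minimal sketch in Gaussian units; the function names are hypothetical and the inputs are meant to be typical magnitudes, as in the scaling argument above.

```python
import math


def plasma_beta(p, B):
    """Plasma beta = 8 pi p / B^2 (ratio of fluid to magnetic pressure)."""
    return 8.0 * math.pi * p / B ** 2


def alfven_speed(B, rho):
    """Alfven speed V_A = B / sqrt(4 pi rho)."""
    return B / math.sqrt(4.0 * math.pi * rho)


def magnetic_mach(V, B, rho):
    """Magnetic Mach number M_m = V / V_A; its inverse is the Alfven number."""
    return V / alfven_speed(B, rho)
```

For instance, `plasma_beta(p, B)` equals one exactly when the fluid pressure p coincides with the magnetic pressure B²/8π, and `magnetic_mach(V, B, rho)` exceeds one when the flow is faster than the Alfvén speed.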

As for the magnetic field, Faraday’s induction equation together with Ohm’s law and Ampère’s law amounts to

$$
\partial_t B - \nabla\times(u\times B) = -\nabla\times\Big(\frac{\eta c^2}{4\pi}\nabla\times B\Big), \tag{3.48}
$$

which is referred to as the (resistive) MHD induction equation. The second term on the left-hand side can be written as

$$
\nabla\times(u\times B) = -u\cdot\nabla B + B\cdot\nabla u - (\nabla\cdot u)B,
$$

where we have used the fact that ∇ · B = 0. This is a hyperbolic operator describing the transport of the magnetic field B with the fluid velocity u. The right-hand side, on the other hand, is a parabolic operator,

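$$
-\nabla\times(\kappa_\eta\nabla\times B) = \nabla\cdot\big[\kappa_\eta\big(\nabla B - (\nabla B)^T\big)\big] = \nabla\cdot\big[\kappa_\eta\nabla B\big] - \nabla B\cdot\nabla\kappa_\eta, \qquad \kappa_\eta = \frac{c^2\eta}{4\pi},
$$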

which describes diffusion with diffusion coefficient κη. This identity can be proven by making use of the Levi-Civita tensor $\varepsilon_{ijk}$, for which we have

$$
\sum_k \varepsilon_{ijk}\varepsilon_{kmn} = \delta_{im}\delta_{jn} - \delta_{in}\delta_{jm},
$$

together with the condition ∇ · B = 0 in the last equality. An equivalent form of the induction equation then reads

$$
\frac{DB}{Dt} - B\cdot\nabla u + (\nabla\cdot u)B = \nabla\cdot\big[\kappa_\eta\big(\nabla B - (\nabla B)^T\big)\big]. \tag{3.49}
$$
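The contraction identity for the Levi-Civita tensor used in the derivation above involves only finitely many index combinations, so it can be checked by brute force. The short sketch below does so with zero-based indices; the closed-form expression for the Levi-Civita symbol is a convenience chosen here, not something taken from the text.

```python
from itertools import product


def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}.

    The product (i-j)(j-k)(k-i) is +2 for even permutations, -2 for odd
    permutations and 0 when an index is repeated.
    """
    return (i - j) * (j - k) * (k - i) // 2


def kronecker(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0


def contraction_holds():
    """Check sum_k eps_ijk eps_kmn = d_im d_jn - d_in d_jm for all i, j, m, n."""
    for i, j, m, n in product(range(3), repeat=4):
        lhs = sum(levi_civita(i, j, k) * levi_civita(k, m, n) for k in range(3))
        rhs = kronecker(i, m) * kronecker(j, n) - kronecker(i, n) * kronecker(j, m)
        if lhs != rhs:
            return False
    return True
```

Calling `contraction_holds()` runs through all 81 combinations of the free indices.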

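The strength of the diffusion operator relative to the hyperbolic terms can be estimated by a dimensionless parameter. Proceeding as above, we estimate

$$
\frac{|u\cdot\nabla B|}{\big|\nabla\cdot\big[\kappa_\eta\big(\nabla B - (\nabla B)^T\big)\big]\big|} \approx \frac{VB/L}{\kappa_\eta B/L^2} = \frac{VL}{\kappa_\eta},
$$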

and define the dimensionless number

$$
R_m = \frac{VL}{\kappa_\eta} = \frac{4\pi V L}{c^2\eta},
$$

which is referred to as the magnetic Reynolds number in analogy with the classical Reynolds number of fluid dynamics, namely,

$$
Re = \frac{VL}{\nu},
$$

which measures inertia as compared to viscosity in the Navier-Stokes equation (1.50).

The regime of large magnetic Reynolds numbers (Rm ≫ 1) is particularly interesting since the resistivity (3.41) is inversely proportional to the collision time (3.39), which in turn scales like $T^{3/2}$: the hotter the plasma, the larger the magnetic Reynolds number. In this regime the resistivity can be neglected and the dynamics of the magnetic field is purely hyperbolic. This is called ideal magnetohydrodynamics: The plasma is ideal as a fluid (satisfies Euler’s equations) and as a conductor (zero electrical resistance).
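The temperature scaling of the magnetic Reynolds number can be made explicit with a few lines of code. In this sketch the function names and the reference values are hypothetical, and the resistivity is rescaled from a reference state assuming a fixed Coulomb logarithm.

```python
import math

C_LIGHT = 2.99792458e10  # speed of light [cm/s]


def magnetic_diffusivity(eta):
    """Diffusion coefficient kappa_eta = c^2 eta / (4 pi), Gaussian units."""
    return C_LIGHT ** 2 * eta / (4.0 * math.pi)


def magnetic_reynolds(V, L, eta):
    """Magnetic Reynolds number R_m = V L / kappa_eta."""
    return V * L / magnetic_diffusivity(eta)


def scaled_resistivity(eta_ref, T_ref, T):
    """Resistivity at temperature T, assuming eta scales like T^(-3/2)."""
    return eta_ref * (T_ref / T) ** 1.5
```

Under these assumptions, raising the temperature by a factor of four at fixed V and L lowers the resistivity by a factor of eight and raises Rm by the same factor.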

In the opposite regime (Rm ≪ 1), the magnetic field diffusion is dominant: Neglecting the hyperbolic terms in (3.48) removes the fluid velocity u from the induction equation, and the only coupling mechanism between the magnetic field and the fluid part of the system is provided by resistivity. In particular, in the case of constant resistivity, the induction equation decouples from the fluid-dynamic part of the system.

At last, let us summarize the equations for both resistive and ideal MHD.

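• Resistive MHD equations:

$$
\begin{aligned}
\frac{D\rho}{Dt} + \rho\nabla\cdot u &= 0, \\
\rho\frac{Du}{Dt} &= -\nabla p + \frac{1}{4\pi}(\nabla\times B)\times B + \rho g, \\
\frac{D}{Dt}\frac{p}{\gamma-1} + \frac{\gamma}{\gamma-1}p\nabla\cdot u &= \frac{\kappa_\eta}{4\pi}|\nabla\times B|^2, \\
\partial_t B - \nabla\times(u\times B) &= -\nabla\times(\kappa_\eta\nabla\times B), \qquad \kappa_\eta = \frac{c^2\eta}{4\pi}, \\
\nabla\cdot B &= 0.
\end{aligned}
\tag{3.50a}
$$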

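• Ideal MHD equations:

$$
\begin{aligned}
\frac{D\rho}{Dt} + \rho\nabla\cdot u &= 0, \\
\rho\frac{Du}{Dt} &= -\nabla p + \frac{1}{4\pi}(\nabla\times B)\times B + \rho g, \\
\frac{Dp}{Dt} + \gamma p\nabla\cdot u &= 0, \\
\partial_t B - \nabla\times(u\times B) &= 0, \\
\nabla\cdot B &= 0.
\end{aligned}
\tag{3.50b}
$$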

In both cases the fluid equations are coupled to the magnetic field B, which is the only electrodynamic variable left in the problem. This should explain the name magnetohydrodynamics.

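In the case of ideal MHD, the pressure equation can actually be removed from the system by accounting for the results of section 1.8. In fact, we can take $p = C\rho^\gamma$, where C is a constant, as a solution of the pressure equation. Then,

• Ideal MHD, second form:

$$
\begin{aligned}
\frac{D\rho}{Dt} + \rho\nabla\cdot u &= 0, \\
\rho\frac{Du}{Dt} &= -\nabla p + \frac{1}{4\pi}(\nabla\times B)\times B + \rho g, \qquad p = C\rho^\gamma, \\
\partial_t B - \nabla\times(u\times B) &= 0, \\
\nabla\cdot B &= 0.
\end{aligned}
\tag{3.50c}
$$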
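The statement that a polytropic pressure law solves the pressure equation of (3.50b) can be verified in one line. The following short computation is a sketch of that check; it assumes, as in the text, that C is constant in space and time, and it uses only the chain rule and the mass continuity equation.

```latex
\frac{Dp}{Dt} + \gamma p\,\nabla\cdot u
  = C\gamma\rho^{\gamma-1}\frac{D\rho}{Dt} + \gamma C\rho^{\gamma}\,\nabla\cdot u
  = C\gamma\rho^{\gamma-1}\Big(\frac{D\rho}{Dt} + \rho\,\nabla\cdot u\Big) = 0 .
```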

For an incompressible flow, ∇ · u = 0, the density ρ is constant along the Lagrangian trajectories, hence ρ = constant is a solution; the equation for the pressure can be dropped as discussed for ordinary fluids, cf. section 1.7. It is also convenient to make use of the form (3.47) of the J × B force together with the advection form (3.49) of the induction equation. Especially in the mathematical literature, it is common to consider a viscosity term (∝ µ∆u) in the momentum equation which dissipates energy, thus replacing Euler’s momentum equation with the Navier-Stokes equation, cf. section 1.7. One should keep in mind, however, that the viscosity tensor in a magnetized plasma (as obtained e.g. by Braginskii [38]) is much more complex than that of an ordinary fluid; therefore such a simplified viscosity model should be regarded as an effective dissipation mechanism rather than a physical mechanism.

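• Resistive incompressible MHD equations:

$$
\begin{aligned}
\partial_t u + u\cdot\nabla u &= -\nabla P + \frac{1}{4\pi\rho}B\cdot\nabla B + g + \nu\Delta u, \\
\partial_t B + u\cdot\nabla B - B\cdot\nabla u &= -\nabla\times(\kappa_\eta\nabla\times B), \\
\nabla\cdot u &= 0, \\
\nabla\cdot B &= 0,
\end{aligned}
\tag{3.50d}
$$

where $P = \rho^{-1}(p + B^2/8\pi)$ is an effective pressure per unit of mass, which combines thermal and magnetic energies, and ν = µ/ρ.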

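• Ideal incompressible MHD:

$$
\begin{aligned}
\partial_t u + u\cdot\nabla u &= -\nabla P + \frac{1}{4\pi\rho}B\cdot\nabla B + g, \\
\partial_t B + u\cdot\nabla B - B\cdot\nabla u &= 0, \\
\nabla\cdot u &= 0, \\
\nabla\cdot B &= 0,
\end{aligned}
\tag{3.50e}
$$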

the effective pressure P being defined above.

In the following we shall make use of the various equivalent ways to write MHD equations as appropriate for the considered topic. The condition ∇ · B = 0 is implied if not explicitly stated.

4 Conservation laws in magnetohydrodynamics

The qualitative behavior of solutions of the equations of magnetohydrodynamics is constrained by a set of conservation laws that are implied by the equations. Such conservation laws are crucial for the understanding of the plasma dynamics from a physical point of view, but they are also important for the mathematical analysis of the equations as well as for the design of numerical schemes that respect the basic qualitative properties of the solution.

4.1 Global conservation laws in resistive MHD. We start by examining the full system of resistive MHD equations (3.50a) and show that it preserves mass, momentum, and energy even in the presence of arbitrary resistivity (provided that the appropriate Ohmic heating term is accounted for on the right-hand side of the pressure transport equation).

We consider a solution (ρ, u, p, B) in a spatial domain Ω satisfying the boundary conditions

n · u|∂Ω = 0, n ·B|∂Ω = 0, (4.1)

where n is the outgoing unit normal to the boundary ∂Ω of the domain. Such boundary conditions impose that the velocity field and the magnetic field are tangential to the boundary and are natural conditions for both resistive and ideal MHD, as will become apparent in the following.

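Proposition 4.1. Resistive MHD equations (3.50a) equipped with boundary conditions (4.1) imply the following conservation laws.

• Mass conservation:

$$
\frac{d}{dt}\int_\Omega \rho\,dx = 0. \tag{4.2a}
$$

• Momentum conservation: If g = 0, then

$$
\frac{d}{dt}\int_\Omega \rho u\,dx = -\int_{\partial\Omega}\Big(p + \frac{B^2}{8\pi}\Big)n\,dS. \tag{4.2b}
$$

• Energy conservation: If g = −∇Φg with Φg = Φg(x) being given, then

$$
\frac{d}{dt}\int_\Omega w\,dx = -\int_{\partial\Omega}\kappa_\eta\frac{J\times B}{c}\cdot n\,dS, \tag{4.2c}
$$

where

$$
w = \frac{1}{2}\rho u^2 + \frac{p}{\gamma-1} + \rho\Phi_g + \frac{|B|^2}{8\pi},
$$

is the MHD energy density.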

The energy density, in particular, comprises the kinetic energy per unit of volume of the fluid element, the internal energy density expressed in terms of pressure, the gravitational energy, and the magnetic energy density, while the electric energy density does not appear as it vanishes to first order in the quasi-neutral limit. Analogously, the momentum density corresponds to the momentum associated with the fluid motion, as the electromagnetic momentum density vanishes to first order in the quasi-neutral limit. In the presence of an externally imposed gravitational field the momentum conservation is broken.

The non-vanishing right-hand sides in equations (4.2b) and (4.2c) mean that without additional boundary conditions there can be a non-zero momentum and/or energy flux through the boundary of the considered domain. Nonetheless, those equations still express a conservation law: any variation in the total momentum or energy can only come from fluxes through the boundary.

The rest of this section is dedicated to the derivation of the above-claimed conservation laws. To this aim we could use partial results from section 3, but it is more instructive to develop the argument in a self-contained way from equations (3.50a). One should note that mass and momentum conservation do not depend on the specific form of the MHD Ohm’s law, as they follow directly from mass continuity and Euler’s equation, respectively. As for energy conservation, a more general argument, which is valid even for non-standard choices of the Ohm’s law, is briefly reported in appendix B.

In the last part of the section, incompressible flows are briefly discussed.

Mass conservation. Mass conservation follows directly from the equation for the mass density in conservation form by integrating over the whole spatial domain Ω and using the Gauss theorem for the divergence term,

$$
\frac{d}{dt}\int_\Omega \rho\,dx = -\int_\Omega \nabla\cdot(\rho u)\,dx = -\int_{\partial\Omega}\rho\,(n\cdot u)\,dS = 0,
$$

where the boundary conditions (4.1) have been taken into account.

Momentum conservation. In order to obtain (4.2b), we make use of the mass continuity equation to re-write the momentum balance equation in conservative form, namely,

$$
\partial_t(\rho u) + \nabla\cdot(\rho uu + pI) = -\frac{1}{4\pi}B\times(\nabla\times B) + \rho g.
$$

The J × B-term on the right-hand side can be dealt with by means of the vector identity

$$
B\times(\nabla\times B) = \nabla B\cdot B - B\cdot\nabla B = \nabla\big(\tfrac{1}{2}B^2\big) - \nabla\cdot(BB),
$$

where ∇ · B = 0 has been accounted for. Therefore,

$$
\partial_t(\rho u) + \nabla\cdot\Big[\rho uu - \frac{BB}{4\pi} + \Big(p + \frac{B^2}{8\pi}\Big)I\Big] = \rho g,
$$

which expresses the momentum balance in MHD. One can notice that gravity breaks momentum conservation. Under the assumption that g = 0, the integration over the whole domain Ω gives

$$
\frac{d}{dt}\int_\Omega \rho u\,dx = -\int_{\partial\Omega} n\cdot\Big[\rho uu - \frac{BB}{4\pi}\Big]dS - \int_{\partial\Omega}\Big(p + \frac{B^2}{8\pi}\Big)n\,dS.
$$

The result follows from boundary conditions (4.1).

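Energy conservation. First we multiply the mass continuity equation by $\tfrac{1}{2}u^2$, dot-multiply by u the momentum balance equation with g = −∇Φg, and sum the two resulting equations, thus obtaining

$$
\tfrac{1}{2}u^2\partial_t\rho + \rho u\cdot\partial_t u + \tfrac{1}{2}u^2 u\cdot\nabla\rho + \tfrac{1}{2}\rho u^2\nabla\cdot u + \rho u\cdot\nabla u\cdot u = -u\cdot\nabla p + u\cdot J\times B/c - \rho u\cdot\nabla\Phi_g.
$$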

The first two terms give the time derivative of the kinetic energy density. Together with the vector identity

$$
\nabla\cdot\big(\tfrac{1}{2}\rho u^2 u\big) = \tfrac{1}{2}\rho u^2\nabla\cdot u + \rho u\cdot\nabla u\cdot u + \tfrac{1}{2}u^2 u\cdot\nabla\rho,
$$

that gives

$$
\partial_t\big(\tfrac{1}{2}\rho u^2\big) + \nabla\cdot\big(\tfrac{1}{2}\rho u^2 u\big) = -u\cdot\nabla p + u\cdot J\times B/c - \rho u\cdot\nabla\Phi_g.
$$

As for the internal energy, the equation for the pressure in (3.50a) amounts to

$$
\partial_t\Big(\frac{p}{\gamma-1}\Big) + \nabla\cdot\Big(\frac{p}{\gamma-1}u\Big) + p\nabla\cdot u = \eta J^2.
$$

Upon using again the mass continuity equation, one gets

∂t(ρΦg) = Φg∂tρ = −Φgu · ∇ρ− ρΦg∇ · u.

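The sum of the foregoing three equations at last yields

$$
\partial_t\Big(\tfrac{1}{2}\rho u^2 + \frac{p}{\gamma-1} + \rho\Phi_g\Big) + \nabla\cdot\Big[\Big(\tfrac{1}{2}\rho u^2 + \frac{p}{\gamma-1} + \rho\Phi_g\Big)u + pu\Big] = \eta J^2 + u\cdot J\times B/c. \tag{4.3}
$$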

When we dot-multiply the induction equation by B we have

$$
\partial_t\big(\tfrac{1}{2}B^2\big) - B\cdot\nabla\times(u\times B) = -cB\cdot\nabla\times(\eta J).
$$

This can be expressed in the form of a transport equation for the magnetic energy density by virtue of the vector identity

v1 · ∇ × v2 = v2 · ∇ × v1 +∇ · (v2 × v1),

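which is valid for any two vector fields v1, v2. In particular, one has

$$
\begin{aligned}
B\cdot\nabla\times(u\times B) &= (u\times B)\cdot(\nabla\times B) + \nabla\cdot\big[(u\times B)\times B\big], \\
B\cdot\nabla\times(\eta J) &= \eta J\cdot(\nabla\times B) + \nabla\cdot(\eta J\times B),
\end{aligned}
$$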

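and thus,

$$
\partial_t\Big(\frac{B^2}{8\pi}\Big) + \frac{c}{4\pi}\nabla\cdot\Big[\Big(\eta J - \frac{u\times B}{c}\Big)\times B\Big] = -\eta J^2 + J\cdot u\times B/c, \tag{4.4}
$$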

where we have divided by 4π and used $J = \frac{c}{4\pi}\nabla\times B$. The sum of equations (4.3) and (4.4) together with the identity J · (u × B) = u · (B × J) = −u · (J × B) yields

$$
\partial_t w + \nabla\cdot\big(wu + pu + P\big) = 0, \tag{4.5}
$$

where

$$
P = \frac{c}{4\pi}\Big(\eta J - \frac{u\times B}{c}\Big)\times B = \frac{c}{4\pi}E\times B,
$$

is the Poynting flux with the electric field being expressed from the MHD Ohm’s law, cf. equations (2.15) and (3.42).

Equation (4.5) is the differential form of the energy conservation law in MHD. Upon integrating equation (4.5) over the whole domain Ω and making use of the Gauss theorem, one gets

$$
\frac{d}{dt}\int_\Omega w\,dx = -\int_{\partial\Omega}(w + p)(u\cdot n)\,dS - \int_{\partial\Omega}P\cdot n\,dS.
$$

Explicitly, the Poynting vector on the boundary is proportional to

$$
E\times B = \eta J\times B - \frac{B\times(B\times u)}{c} = \eta J\times B - \frac{(B\cdot u)B - B^2 u}{c}.
$$

One can see that all fluxes through the boundary vanish if the solution satisfies the boundary conditions (4.1), except the term ηJ × B · n, and this proves energy conservation. This result is true independently of the value of the resistivity η: the internal energy produced by the resistive term (right-hand side of equation (4.3)) is taken at the expense of the magnetic energy (right-hand side of equation (4.4)).

Incompressible flows. As discussed in section 1.7, the case of incompressible flows deserves some special consideration, as the standard thermodynamic laws of compressible gases do not apply. For the case of the resistive incompressible MHD equations (3.50d), mass is constant by construction, and we have the following result.

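Proposition 4.2. Incompressible resistive MHD equations (3.50d) equipped with boundary conditions (4.1) imply the following conservation laws.

• Momentum conservation: if g = 0, then

$$
\frac{d}{dt}\int_\Omega \rho u\,dx = -\int_{\partial\Omega}\Big(p + \frac{B^2}{8\pi}\Big)n\,dS + \mu\int_{\partial\Omega}n\cdot\nabla u\,dS. \tag{4.6a}
$$

• Energy conservation: if g = −∇Φg,

$$
\frac{d}{dt}\int_\Omega w\,dx = -\int_\Omega \frac{\kappa_\eta}{4\pi}\big|\nabla\times B\big|^2dx - \mu\int_\Omega\big|\nabla u\big|^2dx - \int_{\partial\Omega}\kappa_\eta\frac{J\times B}{c}\cdot n\,dS, \tag{4.6b}
$$

where the energy density for incompressible MHD flows is

$$
w = \frac{1}{2}\rho u^2 + \frac{|B|^2}{8\pi},
$$

with ρ > 0 the constant mass density. The sign of the viscous boundary term in (4.6a) and the factor 1/4π in the resistive term of (4.6b) are fixed by the viscosity integral computed below and by the Ohmic term of (4.4), respectively.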

One should notice that energy is dissipated by viscosity and resistivity as well as by boundary terms. Physically, the dissipated energy is transferred to the internal energy, which is not accounted for in the system (3.50d). In addition, viscosity produces a boundary term in the balance of total momentum.

The momentum balance equation (4.6a) follows as in the compressible case on noting that

$$
\rho u\cdot\nabla u = \nabla\cdot(\rho uu),
$$

and upon integrating the viscosity terms one is left with

$$
\rho\nu\int_\Omega \Delta u\,dx = \mu\int_{\partial\Omega}n\cdot\nabla u\,dS.
$$

Energy balance equation (4.6b) also follows as in the compressible case. For instance, one can observe that

$$
\rho u\cdot\nabla u\cdot u = \nabla\cdot\Big(\frac{1}{2}\rho u^2 u\Big),
$$

and

$$
\rho\nabla P = \nabla\Big(p + \frac{|B|^2}{8\pi}\Big).
$$

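In the incompressible case, however, the difference is that the resistive term is no longer canceled by the Ohmic heating term, and we have the contribution of viscosity, which reads

$$
\begin{aligned}
\int_\Omega \rho u\cdot(\nu\Delta u)\,dx &= \mu\sum_{i,j}\int_\Omega u_i\frac{\partial^2 u_i}{\partial x_j^2}\,dx \\
&= \mu\sum_{i,j}\int_\Omega\Big[\frac{\partial}{\partial x_j}\Big(u_i\frac{\partial u_i}{\partial x_j}\Big) - \Big(\frac{\partial u_i}{\partial x_j}\Big)^2\Big]dx = -\mu\int_\Omega|\nabla u|^2dx,
\end{aligned}
$$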

where boundary conditions (4.1) have been accounted for.

4.2 Global conservation laws in ideal MHD. The conservation laws derived in section 4.1 hold for the given boundary conditions (4.1) independently of the resistivity. Hence, the mass and momentum conservation laws (4.2a) and (4.2b) hold unchanged in the ideal case, while the boundary term in the energy balance (4.2c) is identically zero when η = 0. In addition, there are two more invariants due to the very special form of the ideal induction equation, namely, the magnetic helicity and the cross-helicity, defined by

$$
H_m = \int_\Omega A\cdot B\,dx, \qquad H_c = \int_\Omega u\cdot B\,dx, \tag{4.7}
$$

where A is the vector potential associated with the magnetic field B, i.e., ∇ × A = B, cf. appendix C. One might notice that with boundary conditions (4.1) the magnetic helicity is a gauge-independent quantity, that is, a gauge transformation A′ = A + ∇ϕ changes the magnetic helicity according to

$$
H_m' = H_m + \int_\Omega \nabla\varphi\cdot B\,dx = H_m + \int_{\partial\Omega}\varphi\,B\cdot n\,dS,
$$

and the boundary integral vanishes if either ϕ|∂Ω = 0 or B · n = 0.

In summary, we can state

Proposition 4.3. Ideal MHD equations (3.50b) with boundary conditions (4.1) imply the following conservation laws.

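• Mass conservation:

$$
\frac{d}{dt}\int_\Omega \rho\,dx = 0. \tag{4.8a}
$$

• Momentum conservation: if g = 0, then

$$
\frac{d}{dt}\int_\Omega \rho u\,dx = -\int_{\partial\Omega}\Big(p + \frac{B^2}{8\pi}\Big)n\,dS. \tag{4.8b}
$$

• Energy conservation: if g = −∇Φg, then

$$
\frac{d}{dt}\int_\Omega w\,dx = 0, \tag{4.8c}
$$

where the energy density w is the same as in proposition 4.1.

• Magnetic helicity:

$$
\frac{d}{dt}\int_\Omega A\cdot B\,dx = 0. \tag{4.8d}
$$

• Cross-helicity: if g = −∇Φg, then

$$
\frac{d}{dt}\int_\Omega u\cdot B\,dx = 0. \tag{4.8e}
$$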

Mass, momentum, and energy conservation have already been proven in section 4.1. It remains to show the conservation of magnetic and cross-helicity.

Conservation law for magnetic helicity. Magnetic helicity involves only the magnetic field and the corresponding vector potential, and its conservation is a direct consequence of the induction equation

∂tB −∇× (u×B) = 0.

The equation for the vector potential A is derived from the induction equation in appendix C and for η = 0 it reads

∂tA− u×∇×A = ∇χ,

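where χ is an arbitrary scalar field which accounts for the gauge freedom. Then,

$$
\begin{aligned}
\frac{d}{dt}\int_\Omega A\cdot B\,dx &= \int_\Omega\big(A\cdot\partial_t B + B\cdot\partial_t A\big)dx \\
&= \int_\Omega\big(A\cdot\nabla\times(u\times B) + B\cdot(u\times\nabla\times A) + B\cdot\nabla\chi\big)dx \\
&= \int_\Omega\big((\nabla\times A)\cdot(u\times B) - \nabla\cdot(A\times(u\times B)) + B\cdot(u\times\nabla\times A) + B\cdot\nabla\chi\big)dx \\
&= -\int_\Omega \nabla\cdot\big(A\times(u\times B) - B\chi\big)dx \\
&= -\int_{\partial\Omega}\big(A\times(u\times B) - B\chi\big)\cdot n\,dS,
\end{aligned}
$$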

where we have used ∇ × A = B, B · (u × B) = 0, B · ∇χ = ∇ · (Bχ) (which holds since ∇ · B = 0), and, in the last equality, the Gauss theorem. The conservation of magnetic helicity follows from the identity A × (u × B) = (A · B)u − (A · u)B and boundary conditions (4.1).

Conservation law for the cross-helicity. The calculation of the time derivative of the cross-helicity requires the momentum balance equation, which we write in the form

$$
\partial_t u - u\times\nabla\times u = -\rho^{-1}\nabla p - \nabla(u^2/2 + \Phi_g) + J\times B/(\rho c),
$$

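in view of the identity $u\cdot\nabla u = \nabla(u^2/2) - u\times\nabla\times u$. As discussed after equation (3.50b), $p = C\rho^\gamma$, so that the plasma is isentropic and

$$
\rho^{-1}\nabla p = C\gamma\rho^{\gamma-2}\nabla\rho = \nabla h(\rho),
$$

where the enthalpy function h is given by [11],

$$
h(\rho) = \int^{\rho} C\gamma r^{\gamma-2}\,dr = \frac{C\gamma}{\gamma-1}\rho^{\gamma-1}.
$$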

The equation for the velocity field takes the form

$$
\partial_t u - u\times\nabla\times u = -\nabla\chi + J\times B/(\rho c),
$$

where we set $\chi = h(\rho) + u^2/2 + \Phi_g$ for the sake of brevity. Then we compute

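$$
\begin{aligned}
\frac{d}{dt}\int_\Omega u\cdot B\,dx &= \int_\Omega\big(B\cdot\partial_t u + u\cdot\partial_t B\big)dx \\
&= \int_\Omega\big(B\cdot(u\times\nabla\times u) - B\cdot\nabla\chi + u\cdot\nabla\times(u\times B)\big)dx.
\end{aligned}
$$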

As before, the integral of B · ∇χ = ∇ · (Bχ) amounts to zero in view of the boundary conditions (4.1). As for the last term,

$$
u\cdot\nabla\times(u\times B) = (\nabla\times u)\cdot(u\times B) - \nabla\cdot\big(u\times(u\times B)\big),
$$

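and the first term on the right-hand side cancels B · (u × ∇ × u) = −(∇ × u) · (u × B); hence,

$$
\frac{d}{dt}\int_\Omega u\cdot B\,dx = -\int_{\partial\Omega}\big(u\times(u\times B)\big)\cdot n\,dS,
$$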

and the boundary integral vanishes in view of the identity $u\times(u\times B) = (u\cdot B)u - u^2 B$ and boundary conditions (4.1). This proves the conservation law for the cross-helicity.

Ideal incompressible flows. Equations (3.50e) follow from (3.50d) when κη = 0 and ν = 0. Therefore, energy and momentum conservation laws for ideal incompressible MHD flows are special cases of proposition 4.2.

Magnetic and cross-helicity conservation are obtained as in the compressiblecase with the only difference that ρ is constant and the equation of state p = Cργ

does not hold for incompressible flows. Instead we have

ρ−1∇p = ∇P,

where P is defined after equation (3.50d).The physical consequences of the two additional invariants in ideal MHD

will be outlined below, cf. section 4.5, after discussing the most celebratedproperties of ideal MHD, namely the “freezing” of magnetic field lines in theplasma flow and the conservation of the magnetic flux.

115

4.3 Frozen-in law. The conservation laws analyzed so far can be considered global, since they deal with integral quantities over the whole domain of interest. In the ideal regime, however, MHD has additional conservation laws that constrain the dynamics locally as well. The most celebrated one is probably the frozen-in law, which concerns magnetic field lines.

Let us recall that, at each fixed time $t$, the magnetic field line passing through a point $y \in \Omega$ is the curve determined by

\[ \frac{dx(\sigma)}{d\sigma} = B\big(t, x(\sigma)\big), \qquad x(0) = y, \qquad (t \text{ fixed}), \tag{4.9} \]

where $\sigma$ is a parameter. There is a similarity between equation (4.9) defining field lines and equation (1.5) defining the flow of the vector field, but one important difference is that here time is fixed. Another difference is that the “velocity” at which the field line is traced has no physical meaning. More precisely, we have the freedom to re-parametrize the curve, i.e., re-scale the parameter $\sigma$. If $f(t, x) > 0$ is a smooth strictly positive function, we can introduce the new parameter

\[ \lambda(\sigma) = \int_0^\sigma f\big(t, x(\sigma')\big)\, d\sigma', \]

so that $d\lambda/d\sigma = f\big(t, x(\sigma)\big)$ and we can re-write the equation for the field line in the form

\[ \frac{dx(\lambda)}{d\lambda} = \frac{d\sigma}{d\lambda}\,\frac{dx(\sigma)}{d\sigma} = \frac{B\big(t, x(\lambda)\big)}{f\big(t, x(\lambda)\big)}, \]

which is the same as re-scaling the vector field by $1/f(t, x)$. With some abuse of notation we have denoted $x(\lambda) = x\big(\sigma(\lambda)\big)$ for simplicity.

As an example, let us consider the arc-length

\[ s(\sigma) = \int_0^\sigma \Big| \frac{dx(\sigma')}{d\sigma'} \Big|\, d\sigma' = \int_0^\sigma \big| B\big(t, x(\sigma')\big) \big|\, d\sigma', \]

in a region where $|B(t, x)| > 0$. This corresponds to $f(t, x) = |B(t, x)|$, for which we have

\[ \frac{dx(s)}{ds} = b\big(t, x(s)\big), \]

where $b(t, x) = B(t, x)/|B(t, x)|$ is the unit vector in the direction of the magnetic field, and the parameter $s$ has the meaning of length of the curve. In general, the parameter $\sigma$ does not even have the dimensions of a length; it is just a label of the points on the line.
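The arc-length form of the field-line equation is convenient for numerical tracing. The sketch below integrates $dx/ds = b(x)$ with a classical fourth-order Runge-Kutta scheme at fixed time; the field `B_field` (circular field lines with radially varying strength) is a hypothetical example chosen so that the result can be checked, and all names are illustrative.

```python
import math

def B_field(x):
    """Hypothetical field with circular field lines and |B| varying with radius."""
    amp = 1.0 + x[0] ** 2 + x[1] ** 2
    return (-amp * x[1], amp * x[0], 0.0)

def unit_b(x):
    """Unit vector b = B/|B| (requires |B| > 0 at x)."""
    B = B_field(x)
    norm = math.sqrt(sum(c * c for c in B))
    return tuple(c / norm for c in B)

def trace_field_line(y, length, steps=1000):
    """RK4 integration of dx/ds = b(x), x(0) = y, over arc-length `length`."""
    h = length / steps
    x = tuple(y)
    for _ in range(steps):
        k1 = unit_b(x)
        k2 = unit_b(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k1)))
        k3 = unit_b(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k2)))
        k4 = unit_b(tuple(xi + h * ki for xi, ki in zip(x, k3)))
        x = tuple(xi + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                  for xi, a, b, c, d in zip(x, k1, k2, k3, k4))
    return x

start = (1.0, 0.0, 0.0)
half_turn = trace_field_line(start, math.pi)        # half of the unit circle
full_turn = trace_field_line(start, 2.0 * math.pi)  # back to the start
print(half_turn, full_turn)
```

Since $s$ is the length of the curve, following the unit circle for $s = 2\pi$ returns to the initial point, independently of how $|B|$ varies along the way.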

The configuration of a set of magnetic field lines with carefully chosen initial points is often sufficient to give a good representation of the magnetic field at any given time $t$, cf. figure 4.1.

Figure 4.1: A single magnetic field line of the reconstructed equilibrium magnetic field in ASDEX Upgrade (shot #27764). The field line alone already provides an idea of the topology of the field near the magnetic axis (red circle). The color code represents the strength of the magnetic field, from high field inside the torus (light blue) to low field outside the torus (dark blue).

The frozen-in law constrains the way magnetic field lines evolve: we say that they are frozen into the plasma flow since they move with the plasma fluid velocity $u(t, x)$.

Precisely, let us consider a generic time-dependent vector field $w(t, x)$ and a given velocity field $u(t, x)$ with flow $F_t$.

Definition 4.1. We say that the vector field $w \in C^1(I \times \Omega)$ is frozen in the flow $F_t : \Omega \to \Omega$ if, for any field line $x_0(\sigma)$ of $w$ at the time $t = 0$, the curve

\[ x_t(\sigma) = F_t\big(x_0(\sigma)\big), \]

is a field line of $w$ at time $t$, for all $t \geq 0$.

The condition that $x_t$ is a field line of $w$ at time $t$ is written as

\[ \frac{dx_t(\sigma)}{d\sigma} = f\big(t, x_t(\sigma)\big)\, w\big(t, x_t(\sigma)\big), \tag{4.10} \]

allowing for a possible re-scaling factor $f = f(t, x) > 0$.

Let us remark that the curve $x_t = x_t(\sigma)$ is the evolution of $x_0 = x_0(\sigma)$ along the Lagrangian trajectories of $u$ and, in general, it does not coincide with any field line of $w$ at time $t$. The frozen-in law therefore poses a strong constraint on the evolution of the magnetic field in ideal MHD.

Proposition 4.4. For a solution of ideal MHD equations (3.50b) such that $\rho, u, B \in C^1$, $\rho > 0$, and with flow $F(t, x) = F_t(x)$ in $C^2$, the magnetic field $B$ is frozen into the plasma flow. In addition, for incompressible flows the condition $\rho > 0$ can be dropped.

The remaining part of this section is dedicated to the proof of this claim.

Definition 4.1 is not easy to check directly. We shall first derive a sufficient condition which can be expressed in terms of a partial differential equation for the field $w$ and then check that, in ideal MHD, the magnetic field (properly re-scaled) satisfies such a condition.

Lemma 4.5. Let $u, w \in C^1(I \times \Omega)$ be two vector fields and let $F_t : \Omega \to \Omega$ be the flow of $u$ with the hypothesis that $F(t, x) = F_t(x)$ is $C^2$ on $I \times \Omega$. Then the following statements are equivalent:

(i) For any field line $x_0 = x_0(\sigma)$ of $w$ at $t = 0$,

\[ \frac{dF_t\big(x_0(\sigma)\big)}{d\sigma} = w\big(t, F_t(x_0(\sigma))\big). \tag{4.11a} \]

(ii) For all $x \in \Omega$,

\[ w\big(t, F_t(x)\big) = w(0, x) \cdot \nabla F_t(x). \tag{4.11b} \]

(iii) The field $w$ satisfies the partial differential equation

\[ \partial_t w + u \cdot \nabla w - w \cdot \nabla u = 0. \tag{4.11c} \]

Let us remark that lemma 4.5 establishes a stronger condition than that required in definition 4.1, as it does not allow for a re-scaling factor, cf. equation (4.10) as compared to (4.11a).

Proof. 1. (i) ⇒ (ii). If (i) is true, then for any field line $x_0$ of $w$ at time $t = 0$ we have

\[ w\big(t, F_t(x_0(\sigma))\big) = \frac{dF_t\big(x_0(\sigma)\big)}{d\sigma} = \frac{dx_0(\sigma)}{d\sigma} \cdot \nabla F_t\big(x_0(\sigma)\big), \]

where we have used the chain rule for differentiation in the second identity. Since $x_0$ is a field line of $w(0, \cdot)$, we obtain

\[ w\big(t, F_t(x_0(\sigma))\big) = w\big(0, x_0(\sigma)\big) \cdot \nabla F_t\big(x_0(\sigma)\big), \]

identically in $\sigma$ and for all field lines $x_0$. Evaluating at $\sigma = 0$, we obtain equation (4.11b) with $x = x_0(0)$.

2. (ii) ⇒ (iii). If (ii) holds, we can differentiate equation (4.11b) in time. The derivative of the left-hand side reads

\[ \partial_t w\big(t, F_t(x)\big) + \frac{dF_t(x)}{dt} \cdot \nabla w\big(t, F_t(x)\big) = \partial_t w\big(t, F_t(x)\big) + u\big(t, F_t(x)\big) \cdot \nabla w\big(t, F_t(x)\big), \]

where equation (1.5) has been accounted for. As for the time derivative of the right-hand side, we use the hypothesis that $F(t, x)$ is in $C^2(I \times \Omega)$, which allows us to exchange the gradient and the time derivative, with the result that

\[
\begin{aligned}
w(0, x) \cdot \nabla \Big( \frac{dF_t(x)}{dt} \Big)
&= w(0, x) \cdot \nabla \big[ u\big(t, F_t(x)\big) \big] \\
&= w(0, x) \cdot \nabla F_t(x) \cdot \nabla u\big(t, F_t(x)\big) \\
&= w\big(t, F_t(x)\big) \cdot \nabla u\big(t, F_t(x)\big),
\end{aligned}
\]

and we have used again (ii) in the last equality. At last, we have

\[ \partial_t w\big(t, F_t(x)\big) + u\big(t, F_t(x)\big) \cdot \nabla w\big(t, F_t(x)\big) = w\big(t, F_t(x)\big) \cdot \nabla u\big(t, F_t(x)\big), \]

which evaluated at $x = F_t^{-1}(y)$ gives equation (4.11c) at the arbitrary point $y$.

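3. (iii) ⇒ (i). Let us consider the quantity

\[ \varphi_\sigma(t) = \frac{dF_t\big(x_0(\sigma)\big)}{d\sigma} - w\big(t, F_t(x_0(\sigma))\big), \]

defined for a given field line $x_0$ of the field $w$ at $t = 0$. We assume that (iii) is true and want to prove (i), which is equivalent to $\varphi_\sigma = 0$ identically for all $\sigma$ and all field lines $x_0$. At $t = 0$ we have $\varphi_\sigma(0) = 0$ for all $\sigma$, since $x_0$ is a field line of $w(0, \cdot)$ by hypothesis. We compute

\[ \frac{d\varphi_\sigma(t)}{dt} = \frac{d}{dt}\Big[ \frac{dF_t(x_0(\sigma))}{d\sigma} \Big] - \partial_t w\big(t, F_t(x_0(\sigma))\big) - u\big(t, F_t(x_0(\sigma))\big) \cdot \nabla w\big(t, F_t(x_0(\sigma))\big). \]

In the first term on the right-hand side we can exchange the order of the derivatives as, by hypothesis, the flow is $C^2$ in both time and space, while $x_0 \in C^2$ for a $C^1$ vector field $w$. Hence,

\[ \frac{d}{dt}\Big[ \frac{dF_t(x_0(\sigma))}{d\sigma} \Big] = \frac{d}{d\sigma}\Big[ \frac{dF_t(x_0(\sigma))}{dt} \Big] = \frac{d}{d\sigma}\Big[ u\big(t, F_t(x_0(\sigma))\big) \Big] = \frac{dF_t\big(x_0(\sigma)\big)}{d\sigma} \cdot \nabla u\big(t, F_t(x_0(\sigma))\big). \]

By using (iii) we have

\[ \frac{d\varphi_\sigma(t)}{dt} = \varphi_\sigma(t) \cdot \nabla u\big(t, F_t(x_0(\sigma))\big), \]

which is a linear ordinary differential equation depending on the parameter $\sigma$. Since $\varphi_\sigma(0) = 0$, the unique solution is $\varphi_\sigma(t) = 0$ for all $\sigma$ and all initial field lines $x_0$.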
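The equivalence of the integral and differential forms can be illustrated on an explicit example. The sketch below is an assumption-laden toy case, not part of the proof: it takes the planar shear flow $u(x) = (x_2, 0)$, whose flow is $F_t(x) = (x_1 + t x_2, x_2)$, builds $w(t, \cdot)$ from a hypothetical initial field through condition (4.11b), and checks by finite differences that the residual of equation (4.11c) vanishes.

```python
def w0(x):
    """Hypothetical initial field w(0, x) in the plane."""
    return (x[1], 1.0 + x[0] ** 2)

def w(t, y):
    """Field transported by the shear flow via (4.11b): w(t, F_t(x)) = w(0, x) . grad F_t(x)."""
    x = (y[0] - t * y[1], y[1])   # x = F_t^{-1}(y)
    a = w0(x)
    # Jacobian of F_t is [[1, t], [0, 1]], so the transported vector is (a1 + t a2, a2).
    return (a[0] + t * a[1], a[1])

def pde_residual(t, y, eps=1e-5):
    """Residual of dw/dt + u.grad(w) - w.grad(u) for u = (y2, 0), by central differences."""
    dt = [(p - m) / (2.0 * eps) for p, m in zip(w(t + eps, y), w(t - eps, y))]
    dx = [(p - m) / (2.0 * eps)
          for p, m in zip(w(t, (y[0] + eps, y[1])), w(t, (y[0] - eps, y[1])))]
    u1 = y[1]              # only the first velocity component is non-zero
    w2 = w(t, y)[1]        # (w . grad u) = (w2, 0) since du1/dy2 = 1
    return (dt[0] + u1 * dx[0] - w2, dt[1] + u1 * dx[1])

residual = pde_residual(0.7, (0.3, -1.2))
print(residual)
```

Both components of the residual should vanish up to finite-difference error, in agreement with the implication (ii) ⇒ (iii).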

We can now go ahead and check whether the differential equation (4.11c) holds true for the magnetic field in ideal MHD.

Proof of proposition 4.4. Let us first consider the case of incompressible ideal MHD, cf. system (3.50e). The induction equation for the magnetic field $B$ takes the form

\[ \partial_t B + u \cdot \nabla B - B \cdot \nabla u = 0, \]

which is just condition (4.11c) with $w = B$. We can conclude directly that in ideal incompressible MHD the magnetic field is frozen in the plasma flow and the field lines evolve with the flow according to equation (4.10).

The general case of compressible ideal MHD, cf. system (3.50c), is a bit less obvious and we need to use the freedom of re-scaling the magnetic field. In the compressible case, $\nabla \cdot u \neq 0$ and the induction equation for the magnetic field reads

\[ \partial_t B + u \cdot \nabla B - B \cdot \nabla u + (\nabla \cdot u) B = 0, \]

which is not in the form (4.11c). However, from the continuity equation for the mass density under the (physically reasonable) assumption $\rho > 0$, one has

\[ \nabla \cdot u = -\frac{1}{\rho}\big[ \partial_t + u \cdot \nabla \big] \rho, \]

and thus

\[ \partial_t B + u \cdot \nabla B - B \cdot \nabla u + (\nabla \cdot u) B = \partial_t B + u \cdot \nabla B - B \cdot \nabla u - \frac{B}{\rho}\big[ \partial_t + u \cdot \nabla \big] \rho = 0, \]

and dividing by $\rho$, we obtain

\[ \partial_t (B/\rho) + u \cdot \nabla (B/\rho) - (B/\rho) \cdot \nabla u = 0, \]

which is in the form of condition (4.11c) with $w = B/\rho$. We conclude that, in the compressible case, the magnetic field re-scaled by the mass density, $B/\rho$, is frozen in the plasma flow and magnetic field lines evolve with the plasma flow according to equation (4.10) with $f = 1/\rho$.

Figure 4.2: Sketch of magnetic field lines intersecting a compact surface $S_t$ bounded by a simple curve $C_t = \partial S_t$.
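As a simple illustration of the compressible case, consider a uniform expansion along the first coordinate axis. This explicit solution is a sketch chosen here for convenience (the constants $\rho_0$, $B_0$ and the unit vector $e_2$ are assumptions of the example, not taken from the text); each line can be checked by direct substitution.

```latex
\begin{aligned}
u(t,x) &= (x_1, 0, 0), & F_t(x) &= (e^{t} x_1, x_2, x_3), \\
\rho(t,x) &= \rho_0 e^{-t}, & B(t,x) &= B_0 e^{-t} e_2, \\
\partial_t \rho + \nabla\cdot(\rho u) &= -\rho + \rho = 0, \\
\partial_t B + u\cdot\nabla B - B\cdot\nabla u + (\nabla\cdot u) B &= -B + 0 - 0 + B = 0, \\
\frac{B}{\rho}\big(t, F_t(x)\big) &= \frac{B_0}{\rho_0}\, e_2 = \frac{B}{\rho}(0,x)\cdot\nabla F_t(x).
\end{aligned}
```

Here $B$ alone decays like $e^{-t}$ and therefore does not satisfy (4.11b), whereas $B/\rho$ does; the field lines (straight lines along $e_2$) are nevertheless mapped into field lines by the flow.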

The key point of this remarkable result is the frozen-in condition, either in the integral form (4.11b) or in the differential form (4.11c). This has a deep geometrical meaning which can be fully appreciated in the framework of modern tensor calculus [32]. In appendix D, an attempt is made to explain such important ideas in the simpler and more accessible language of standard vector calculus.

4.4 Flux conservation. Let us consider a time-dependent compact surface $S_t \subset \Omega$ which moves with the plasma flow, that is, $S_t = F_t(S_0)$ where $S_0$ is the configuration of the surface at time $t = 0$ and $F_t$ is the flow of the plasma fluid. In virtue of the frozen-in law (section 4.3), the magnetic field lines that intersect $S_t$ are the same at all times, as both surface and field lines are frozen in the plasma. It is therefore natural to ask whether the magnetic flux through $S_t$ is constant, and we shall obtain that, under suitable hypotheses, this is the case. This situation is schematically represented in figure 4.2.

First let us recall briefly a few basic concepts about surfaces and curves. A curve $\gamma$ given parametrically by $X : [a, b] \ni \sigma \mapsto X(\sigma) \in \mathbb{R}^d$ is of class $C^k$ if the parametrization $X$ is of class $C^k$ on the interval $[a, b] \subset \mathbb{R}$. We say that the curve is regular if $X \in C^1([a, b])$ and $X'(\sigma) \neq 0$, while the curve is called simple if $X$ is injective. Geometrically, the latter condition means that the curve does not cross itself. A closed curve is by definition a curve such that $X(a) = X(b)$, i.e., it closes on itself. One should notice that the curve $\sigma \mapsto (\sigma, 0, 0) \in \mathbb{R}^3$ for $\sigma \in [0, 1]$ is not closed in this sense, but it is closed as a set in $\mathbb{R}^3$: the two concepts should not be confused. A simple closed curve (or Jordan curve) is a closed curve with $X$ injective in the open interval $(a, b)$.

Figure 4.3: Positive orientation of the boundary of an oriented surface. (Sketch labels: surface $S$, boundary $C = \partial S$, unit normal $n$, unit tangent $t$.)

In a similar manner, surfaces without boundary are locally the image of a map $X : U \ni y \mapsto X(y) \in \mathbb{R}^d$ where $U$ is an open set of $\mathbb{R}^2$ with coordinates $y = (y_1, y_2)$. Surfaces with boundary are locally modeled on open sets in $\mathbb{R}^2_+ = \{ (y_1, y_2) \in \mathbb{R}^2 \mid y_2 \geq 0 \}$, that is, $U \subset \mathbb{R}^2_+$ can include a boundary. A surface is $C^k$-regular if the parametrization $X$ is of class $C^k$. If $X \in C^1$ and $d = 3$, the unit normal is defined by the local parametrization,

\[ n \propto \frac{\partial X(y)}{\partial y_1} \times \frac{\partial X(y)}{\partial y_2}, \]

and we always take $n$ normalized, i.e., $|n| = 1$. We say that a surface is orientable if we can choose a normal vector globally and smoothly on the surface. The Möbius strip and the Klein bottle are examples of non-orientable surfaces. A surface is connected if it is path-wise connected as a set, i.e., any two points of the surface can be joined by a continuous path on the surface. We are interested in compact surfaces, that is, surfaces that are compact as topological spaces. As a special case, closed surfaces are compact surfaces without a boundary. The classification theorem for compact surfaces significantly restricts the possibilities. Closed orientable surfaces are topologically equivalent to one of the following model surfaces: (1) a sphere, (2) a torus, or (3) a connected sum of tori. (Precisely, two surfaces are “topologically equivalent” if they are mapped into each other by a continuous map with continuous inverse, i.e., a homeomorphism.) In particular, closed orientable surfaces enclose a bounded region of space.

We also need to recall Stokes theorem and related definitions. Let $S$ be any compact connected orientable surface with unit normal $n$ and boundary $C = \partial S$, which is a simple closed regular curve with tangent $t$. We always choose the positive orientation of the boundary $C$ with respect to the orientation of $S$. Here “positive orientation” means that the unit tangent vector of the curve $C = \partial S$ is oriented counter-clockwise with respect to the unit normal $n$ of the surface $S$, cf. figure 4.3.

Under those conditions, Stokes theorem for a generic $C^1$ vector field $w$ states that the flux of the curl of $w$ equals its circulation (line integral) on the boundary, namely,

\[ \int_S (\nabla \times w) \cdot n \, dS = \int_{\partial S} w \cdot t \, ds. \tag{4.12} \]

From Stokes theorem (4.12) it follows that the circulation of a gradient is zero,

\[ \int_{\partial S} \nabla f \cdot t \, ds = \int_S (\nabla \times \nabla f) \cdot n \, dS = 0, \tag{4.13} \]

in view of the identity $\nabla \times \nabla f = 0$.

If $\sigma \mapsto X(\sigma)$ is a given parametrization of the curve $C$, the unit tangent amounts to

\[ t = \Big| \frac{dX(\sigma)}{d\sigma} \Big|^{-1} \frac{dX(\sigma)}{d\sigma}, \]

while the arc-length is given by, cf. section 4.3,

\[ ds = \Big| \frac{dX(\sigma)}{d\sigma} \Big|\, d\sigma, \]

hence the oriented line element takes the form

\[ t \, ds = \frac{dX(\sigma)}{d\sigma}\, d\sigma. \tag{4.14} \]

We can now state the flux-conservation theorem of ideal MHD.

At the time $t = 0$, we consider a compact connected orientable surface $S_0$, possibly with boundary. In case $S_0$ has a non-empty boundary, we assume further that the boundary $C_0 = \partial S_0$ is a regular closed curve with a parametrization

\[ [0, 1] \ni \sigma \mapsto X_0(\sigma) \in \Omega, \tag{4.15} \]

such that $X_0 \in C^2([0, 1])$.

At a generic point $t$ in time, the surface is $S_t = F_t(S_0)$, where $F_t : \Omega \to \Omega$ is the flow of the velocity field $u$. As for the frozen-in law discussed in section 4.3, we shall consider the case of sufficiently regular flows, namely, the map $F(t, x) = F_t(x)$ is assumed to be in $C^2(I \times \Omega)$. Then $S_t = F_t(S_0)$ is a compact connected orientable surface of class $C^2$. In particular, since $F_t$ is continuous and invertible with continuous inverse (homeomorphism), we have $C_t = \partial S_t = F_t(C_0)$, i.e., points on the boundary stay on the boundary at all times, and $C_t$ is a regular curve of class $C^2$.

Proposition 4.6. Let $S_t$ be a compact connected orientable surface constructed as described above with the flow $F_t$ such that $F(t, x) = F_t(x)$ is in $C^2(I \times \Omega)$. Then,

\[ \frac{d}{dt} \int_{S_t} B \cdot n \, dS = 0, \tag{4.16} \]

for any solution $B = \nabla \times A$, $A \in C^2$, of the ideal MHD induction equation

\[ \partial_t B - \nabla \times (u \times B) = 0, \qquad \nabla \cdot B = 0, \]

where $u$ is the velocity field associated to the flow $F_t$.

Proposition 4.6 does not refer to a full solution of ideal MHD equations, but considers only the induction equation with a given velocity field $u$. It applies to regular solutions of ideal MHD equations as particular cases.

When $\partial S_t$ is non-empty, Stokes theorem allows us to rewrite the magnetic flux through $S_t$ in terms of the circulation of the vector potential $A$, namely,

\[ \int_{S_t} B \cdot n \, dS = \int_{S_t} (\nabla \times A) \cdot n \, dS = \int_{C_t} A \cdot t \, ds, \tag{4.17} \]

and we check that the time derivative of the circulation of $A$ on $C_t$ is zero.

Proof of proposition 4.6. If $S_t$ is closed, i.e., without boundary, the flux is identically zero. This follows from the fact that closed orientable surfaces enclose a finite volume $W_t$. Since $B$ is divergence-free,

\[ 0 = \int_{W_t} \nabla \cdot B \, dx = \int_{S_t} B \cdot n \, dS, \]

by Gauss theorem, with normal $n$ oriented in the outward direction. Since this holds at any time $t$, the statement is trivially true.

For the case of a surface with non-empty boundary, the composition of $F_t$ with the parametrization (4.15) of the initial boundary $C_0$ gives a parametrization of the boundary $C_t$, namely,

\[ [0, 1] \ni \sigma \mapsto X_t(\sigma) = F_t\big(X_0(\sigma)\big) \in \Omega, \]

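and, in virtue of the assumptions on $F_t$ and $X_0$, we have that $(t, \sigma) \mapsto X_t(\sigma)$ belongs to $C^2$. The circulation of the vector potential $A$ on the curve $C_t$ reads

\[ \int_{C_t} A \cdot t \, ds = \int_0^1 A\big(t, X_t(\sigma)\big) \cdot \frac{\partial X_t(\sigma)}{\partial\sigma}\, d\sigma, \]

where equation (4.14) has been accounted for. The integral on the right-hand side is absolutely convergent and the integrand is differentiable in time. Computing the derivative, we have

\[
\begin{aligned}
\frac{d}{dt} \int_{C_t} A \cdot t \, ds
&= \int_0^1 \Big[ \partial_t A\big(t, X_t(\sigma)\big) + \frac{dX_t(\sigma)}{dt} \cdot \nabla A\big(t, X_t(\sigma)\big) \Big] \cdot \frac{\partial X_t(\sigma)}{\partial\sigma}\, d\sigma \\
&\quad + \int_0^1 A\big(t, X_t(\sigma)\big) \cdot \frac{d}{dt}\Big[ \frac{dX_t(\sigma)}{d\sigma} \Big]\, d\sigma.
\end{aligned}
\]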

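In the last term we can exchange the order of derivatives since $X_t$ is the composition of two $C^2$ maps and thus it is $C^2$ in the combined variables $(t, \sigma)$. Since

\[ \frac{d}{dt}\Big[ \frac{dX_t(\sigma)}{d\sigma} \Big] = \frac{d}{d\sigma}\Big[ \frac{dX_t(\sigma)}{dt} \Big] = \frac{d}{d\sigma}\Big[ u\big(t, X_t(\sigma)\big) \Big] = \frac{dX_t(\sigma)}{d\sigma} \cdot \nabla u\big(t, X_t(\sigma)\big), \]

we obtain

\[ \frac{d}{dt} \int_{C_t} A \cdot t \, ds = \int_0^1 \Big[ \partial_t A\big(t, X_t(\sigma)\big) + u\big(t, X_t(\sigma)\big) \cdot \nabla A\big(t, X_t(\sigma)\big) + \nabla u\big(t, X_t(\sigma)\big) \cdot A\big(t, X_t(\sigma)\big) \Big] \cdot \frac{dX_t(\sigma)}{d\sigma}\, d\sigma. \]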

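If $B$ is a solution of the MHD induction equation with zero resistivity (ideal MHD), the magnetic vector potential solves, cf. appendix C,

\[ \partial_t A - u \times (\nabla \times A) = \nabla\chi, \]

where the scalar function $\chi$ accounts for the gauge freedom. In view of the identity $u \times (\nabla \times A) = \nabla A \cdot u - u \cdot \nabla A$, with $(\nabla A \cdot u)_i = \sum_j u_j \partial_i A_j$, we can equivalently write

\[ \partial_t A + u \cdot \nabla A = \nabla\chi + \nabla A \cdot u, \]

so that

\[
\begin{aligned}
\frac{d}{dt} \int_{C_t} A \cdot t \, ds
&= \int_0^1 \Big[ \nabla \big( \chi(t, X_t) + A(t, X_t) \cdot u(t, X_t) \big) \Big] \cdot \frac{dX_t}{d\sigma}\, d\sigma \\
&= \int_{C_t} \nabla (\chi + A \cdot u) \cdot t \, ds = 0,
\end{aligned}
\]

since the circulation of a gradient is zero, cf. equation (4.13). At last one can deduce the conservation of the magnetic flux from Stokes theorem (4.17).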
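The conservation of the circulation can be observed numerically in a toy configuration. The sketch below is an illustration under stated assumptions, not part of the proof: the uniform field $B = B_0 e_3$ with potential $A = (B_0/2)(-x_2, x_1, 0)$ is a stationary solution of the ideal induction equation for the planar shear flow $u = (x_2, 0, 0)$, whose flow is $F_t(x) = (x_1 + t x_2, x_2, x_3)$; the initial curve is a circle, and all names and values are hypothetical.

```python
import math

B0 = 2.0  # illustrative field strength

def vector_potential(x):
    """In-plane components of A = (B0/2)(-x2, x1, 0), so that curl A = B0 e3."""
    return (-0.5 * B0 * x[1], 0.5 * B0 * x[0])

def flow(t, x):
    """Flow F_t of the shear velocity u = (x2, 0) restricted to the plane."""
    return (x[0] + t * x[1], x[1])

def circulation(t, radius=1.0, n=2000):
    """Midpoint-rule approximation of the circulation of A on C_t = F_t(C_0)."""
    pts = [flow(t, (radius * math.cos(2.0 * math.pi * k / n),
                    radius * math.sin(2.0 * math.pi * k / n))) for k in range(n)]
    total = 0.0
    for k in range(n):
        p, q = pts[k], pts[(k + 1) % n]
        mid = (0.5 * (p[0] + q[0]), 0.5 * (p[1] + q[1]))
        A = vector_potential(mid)
        total += A[0] * (q[0] - p[0]) + A[1] * (q[1] - p[1])
    return total

circ_initial = circulation(0.0)
circ_later = circulation(1.5)
print(circ_initial, circ_later, B0 * math.pi)
```

The two circulations should agree up to round-off and approximate the flux $B_0 \pi r^2$ through the initial disc, even though the curve $C_t$ is strongly deformed by the shear.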

A few consequences of flux-conservation are outlined in section 4.5.

4.5 Topology of the magnetic field. In this section we review a few key concepts that are used to characterize the magnetic field topology in plasma physics. The definition of such topological objects relies on the flux-conservation result obtained in section 4.4 for ideal MHD.

Flux surfaces. In plasma physics, a surface spanned by magnetic field lines is referred to as a flux surface. Let us formulate a precise definition.

Definition 4.2. Given a magnetic field $B = B(t, x)$, at least continuous, an orientable surface $S$ with unit normal $n$ is a flux surface at time $t$ if $B(t, x) \cdot n = 0$ for all points $x \in S$.

In ideal MHD, flux conservation, proposition 4.6, guarantees that flux surfaces evolve into flux surfaces. Precisely,

Proposition 4.7. If the magnetic field $B$ and the velocity field $u$ satisfy the hypotheses of proposition 4.6 and $S_0$ is a smooth flux surface at time $t = 0$, then $S_t = F_t(S_0)$ is a flux surface at all times.

Proof. We have to show that $B \cdot n = 0$ on $S_t = F_t(S_0)$, given the fact that $S_0$ is a flux surface at time $t = 0$. For every area patch $A_t \subset S_t$ such that $A_t$ is a compact surface with boundary $C_t$ satisfying the hypotheses of proposition 4.6 we have

\[ \int_{A_t} B(t, x) \cdot n \, dS = \int_{A_0} B(0, x) \cdot n \, dS = 0, \]

since $A_0 = F_t^{-1}(A_t)$ is a patch on a flux surface. If there is a point $x_0$ on $S_t$ where $B(t, x_0) \cdot n(t, x_0) \neq 0$, then by continuity we can find a patch $A_t$ sufficiently small that the flux is non-zero, contradicting flux conservation. It follows that $B \cdot n = 0$ on $S_t$ at all times.

Flux surfaces are particularly important in describing static MHD equilibria, i.e., time-independent solutions of ideal MHD with $u = 0$. In this limit, MHD equations (3.50c) reduce to a set of conditions, namely,

\[ \nabla p = J \times B/c, \qquad \nabla \times B = \frac{4\pi}{c} J, \qquad \nabla \cdot B = 0. \]

The first condition expresses the exact balance of the forces on the right-hand side of the momentum equation in ideal MHD, cf. system (3.50b). This system of equations is not closed even when boundary conditions are added: there are many possible MHD equilibria compatible with physically reasonable boundary conditions.

Let us consider the case of one such equilibrium for which at the point $x_0$ in the given domain $\Omega$ the pressure attains a regular value $p_0 = p(x_0)$ in the sense of the implicit function theorem, i.e., $\nabla p(x_0) \neq 0$. Then locally near $x_0$ the set $p^{-1}(p_0)$ is a regular surface $\Sigma$ with normal $n \propto \nabla p$. Such surfaces are called pressure surfaces.

For MHD equilibria, pressure surfaces are flux surfaces. From the force balance we have in fact that

\[ B \cdot n \propto B \cdot \nabla p = B \cdot (J \times B)/c = 0. \]

Geometrically, pressure surfaces contain magnetic field lines. The same holds for the current density,

\[ J \cdot n \propto J \cdot \nabla p = J \cdot (J \times B)/c = 0. \]

Physically, in static MHD equilibria the Lorentz force due to current and magnetic field tangent to pressure surfaces balances the pressure gradient of the plasma.
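A simple explicit equilibrium makes these statements concrete. The following worked example is a sketch with an assumed profile, not taken from the text: a purely azimuthal field growing linearly with the radius, in cylindrical coordinates $(r, \theta, z)$, with hypothetical constants $B_0$, $a$ and $p_0$.

```latex
\begin{aligned}
B &= B_0 \frac{r}{a}\, e_\theta, &
J &= \frac{c}{4\pi}\nabla\times B = \frac{c B_0}{2\pi a}\, e_z, \\
\frac{J\times B}{c} &= -\frac{B_0^2\, r}{2\pi a^2}\, e_r = \nabla p, &
p(r) &= p_0 - \frac{B_0^2 r^2}{4\pi a^2}.
\end{aligned}
```

In the region where $p \geq 0$, the pressure surfaces are the cylinders $r = \text{const}$ with normal $n = e_r$, and both $B \cdot n = 0$ and $J \cdot n = 0$ hold. These cylinders are not compact, so the example illustrates only the local statement that pressure surfaces are flux surfaces.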

In fusion plasma physics the knowledge of equilibrium flux surfaces is critical and defines much of the geometry of the fusion experiment. The shape of such surfaces is constrained by two basic results of topology.

Theorem 4.8. Let $B$, $J$, and $p$ in $C^1(\Omega)$ satisfy the force balance, and let $p_0 \in \mathbb{R}$ be a regular value of $p$ such that $\Sigma = p^{-1}(p_0)$ is a closed connected orientable surface of constant pressure. If $B \neq 0$, then $\Sigma$ is a flux surface homeomorphic to a connected sum of tori of genus $g \geq 1$.

This result is often reported informally in basic plasma physics textbooks [39, Chapter 3]. The proof relies on the classification theorem recalled at the beginning of section 4.4 and on the so-called “hairy ball theorem”.

Theorem 4.9 (Hairy ball theorem [32]). The sphere $S^k$ has a continuous field of non-vanishing tangent vectors if and only if the dimension $k$ is odd.

Proof of theorem 4.8. We have already shown that $\Sigma$ is a flux surface which contains both magnetic field lines and current lines. If it is closed and orientable, by the classification theorem it must be homeomorphic to either a sphere $S^2$, a torus, or a connected sum of tori. We know, however, that $B$ is nowhere vanishing and tangent to the surface; hence, by the hairy ball theorem, we can rule out the possibility of a sphere.

Figure 4.4: Reconstructed magnetic equilibrium for a plasma in the ASDEX Upgrade tokamak (shot number 27764). Toroidal flux surfaces are cut by a plane in order to show interior nested surfaces. A magnetic field line is also shown, cf. figure 4.1, and one can observe that it stays tangent to a magnetic surface.

An example of a realistic flux surface (from the ASDEX Upgrade tokamak) is shown in figure 4.4.

Existence of flux surfaces in an axially symmetric equilibrium such as that of a tokamak is guaranteed and they form a family of nested tori, cf. figure 4.4. In a full three-dimensional configuration, on the other hand, existence of flux surfaces is in general not guaranteed: three-dimensional perturbations resonate with special surfaces and destroy them [63]. For this reason, the computation of three-dimensional MHD equilibria is still a challenging problem [64].

Flux tubes. A flux tube is a connected region of space $W \subset \Omega$ defined by the congruence of a bundle of magnetic field lines and thus bounded by a flux surface, as shown in figure 4.5.

For definiteness we can choose a central field line which represents the overall trajectory of the flux tube in space.

To any flux tube we can associate a constant, which is defined as the magnetic flux through any cross-section of the tube. This is possible in view of the following result.

Proposition 4.10. Under the conditions of proposition 4.6, let $T$ be a flux tube, and $S$ a cross-section of $T$ oriented along the direction of the field lines. Then the magnetic flux

\[ \kappa = \int_S B \cdot n \, dS \tag{4.18} \]

is independent of the choice of $S$ and constant in time.

Proof. A cross-section of the tube is a compact connected orientable surface and thus the flux through it is constant in time in virtue of proposition 4.6. The fact that the flux is independent of the chosen cross-section follows from $\nabla \cdot B = 0$. Consider two non-intersecting surfaces $S_1$ and $S_2$ delimiting a finite volume $W$ of the flux tube as in figure 4.5. At any given time,

\[ 0 = \int_W \nabla \cdot B \, dx = \int_{S_1} B \cdot n_W \, dS + \int_{S_2} B \cdot n_W \, dS + \int_{S_3} B \cdot n_W \, dS, \]

where $S_3$ is part of the outer boundary of the flux tube and by definition is a flux surface, hence the last integral vanishes. Here $n_W$ is the outer normal of $W$. In view of the chosen orientation, the normal on $S_1$ is $n_1 = -n_W$ while the normal on $S_2$ is $n_2 = n_W$, and thus

\[ -\int_{S_1} B \cdot n_1 \, dS + \int_{S_2} B \cdot n_2 \, dS = 0, \]

that is, $S_1$ and $S_2$ intersect the same flux. In the pathological case in which $S_1$ and $S_2$ intersect each other, let us choose a third section $S$ that does not intersect either of them. We can use the same argument to show that both $S_1$ and $S_2$ intersect the same flux as $S$.

Figure 4.5: Sketch of a flux tube. (Sketch labels: cross-sections $S_1$ and $S_2$, lateral flux surface $S_3$, enclosed volume $W$, magnetic field $B$.)

Magnetic helicity and topology of flux tubes. Magnetic helicity is related to the topology of flux tubes in a given configuration. This was first understood by Moffatt [5] and Arnold [65]. We report here the qualitative (discrete) argument proposed by Moffatt and just state the precise mathematical equivalent by Arnold.

Let us consider a set of regular simple closed magnetic-field lines $C_i$, $i = 1, 2, \ldots$, in the domain $\Omega$ of interest at a given fixed time $t$. Let us also consider the extremely idealized situation in which the magnetic field is non-zero only in narrow flux tubes around the lines $C_i$ and such flux tubes do not intersect each other. Figure 4.6 gives a schematic representation of this idealized field configuration.

To each flux tube we can associate a constant $\kappa_i$ which is the flux through a cross section. In virtue of proposition 4.10, $\kappa_i$ is well-defined (i.e., does not depend on the choice of the cross section) and constant in time.

In this configuration we can compute the flux $\Phi_i$ of the magnetic field through any surface $S_i$ bounded by $C_i = \partial S_i$ and positively oriented with respect to the boundary.

Figure 4.6: Linked flux tubes projected on a plane. Each oriented simple curve $C_j$ (here $j = 1, \ldots, 4$) should be thought of as a narrow flux tube carrying a constant flux $\kappa_j$. The intersection of the curve with a surface $S_i$ with boundary $C_i = \partial S_i$ contributes to the circulation of the vector potential $A$ along $C_i$.

Since the magnetic field vanishes everywhere away from the considered field lines, the contribution of $C_j$ to the flux $\Phi_i$ is $+\kappa_j$ if the curve $C_j$ intersects $S_i$ in the direction of the normal, $-\kappa_j$ if it intersects $S_i$ in the opposite direction, and $0$ if it does not intersect. Hence, we have

\[ \Phi_i = \int_{S_i} B \cdot n \, dS = \sum_j \alpha_{ij} \kappa_j, \]

where the constants $\alpha_{ij}$ depend only on the topology of the field lines. Since the flow $F_t$ of the plasma motion is continuous and the magnetic field is frozen-in due to proposition 4.4, the coefficients $\alpha_{ij}$ are independent of time. Essentially, $\alpha_{ij}$ counts the number of links between $C_i$ and $C_j$ and it is therefore referred to as the linking number. Figure 4.7 gives a few examples of linking numbers.
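For two explicitly given closed curves, the linking number can also be evaluated numerically. The sketch below uses the classical Gauss linking integral, $\frac{1}{4\pi}\oint\oint (r_1 - r_2) \cdot (dr_1 \times dr_2)/|r_1 - r_2|^3$, which is not the counting procedure described above but yields the same integer up to the orientation convention; the curves (two unit circles forming a Hopf link, and a third circle far away) and all names are hypothetical.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def sample_closed_curve(param, n):
    """Sample n points of a closed curve param: [0, 2*pi) -> R^3."""
    return [param(2.0 * math.pi * k / n) for k in range(n)]

def gauss_linking_number(c1, c2):
    """Midpoint-rule approximation of the Gauss linking integral of two closed polygons."""
    total = 0.0
    n1, n2 = len(c1), len(c2)
    for i in range(n1):
        p, p_next = c1[i], c1[(i + 1) % n1]
        mid1 = tuple(0.5 * (a + b) for a, b in zip(p, p_next))
        d1 = tuple(b - a for a, b in zip(p, p_next))
        for j in range(n2):
            q, q_next = c2[j], c2[(j + 1) % n2]
            mid2 = tuple(0.5 * (a + b) for a, b in zip(q, q_next))
            d2 = tuple(b - a for a, b in zip(q, q_next))
            r = tuple(a - b for a, b in zip(mid1, mid2))
            total += dot(r, cross(d1, d2)) / dot(r, r) ** 1.5
    return total / (4.0 * math.pi)

n = 200
c1 = sample_closed_curve(lambda a: (math.cos(a), math.sin(a), 0.0), n)
c2 = sample_closed_curve(lambda a: (1.0 + math.cos(a), 0.0, math.sin(a)), n)  # linked with c1
c3 = sample_closed_curve(lambda a: (4.0 + math.cos(a), 0.0, math.sin(a)), n)  # far from c1

linked = gauss_linking_number(c1, c2)
unlinked = gauss_linking_number(c1, c3)
reversed_link = gauss_linking_number(c1, list(reversed(c2)))
print(linked, unlinked, reversed_link)
```

The linked pair should give a value close to $\pm 1$, the separated pair a value close to $0$, and reversing the orientation of one curve should flip the sign, as expected for the coefficients $\alpha_{ij}$.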

We now go ahead with a formal calculation of the magnetic helicity,

\[ H_m(t) = \int_\Omega A(t, x) \cdot B(t, x)\, dx = \sum_i \int_{\Omega_i} A(t, x) \cdot B(t, x)\, dx, \]

where $\Omega_i$ is the volume of the narrow flux tube around the curve $C_i$. Within each flux tube we switch to adapted coordinates $(s, y)$ where $s$ is a coordinate that reduces to the arc-length on $C_i$ and $y = (y_1, y_2)$ are two coordinates spanning the cross-section normal to $B$ in the flux tube. The volume element in $\Omega_i$ is therefore $dx = ds\, dS(y)$ where $dS(y)$ is the area element on the cross-section spanned by $y$. In addition, in $\Omega_i$ we have

\[ B = |B|\, n, \qquad |B| = B \cdot n, \]

where $n$ is the normal to the surface spanned by $y$. Then we have

\[
\begin{aligned}
H_m(t) &= \sum_i \int_{\Omega_i} A(t, x) \cdot B(t, x)\, dx \\
&= \sum_i \int_{C_i} \Big( \int A \cdot B \, dS(y) \Big)\, ds \\
&\approx \sum_i \int_{C_i} A|_{C_i} \cdot t \, \Big( \int |B| \, dS(y) \Big)\, ds \\
&= \sum_i \int_{C_i} A|_{C_i} \cdot t \, \Big( \int B \cdot n \, dS(y) \Big)\, ds \\
&= \sum_i \kappa_i \int_{C_i} A \cdot t \, ds \\
&= \sum_i \kappa_i \Phi_i,
\end{aligned}
\]

where, in the last step, we have used Stokes theorem to convert the circulation of $A$ into the flux $\Phi_i$ across a surface bounded by $C_i$. The approximations used in the second step are: (1) $n \approx t$, where $t$ is the unit tangent on $C_i$, and (2) $A \approx A|_{C_i}$, both due to the narrow cross-section of the flux tube.

Figure 4.7: A few examples of linked curves following Moffatt’s paper [5]. The panels show pairs of curves $C_1$, $C_2$ with linking numbers $\alpha_{12} = \alpha_{21} = -2, -1, 0, +1, +2$.

In the limit of zero-width flux tubes we obtain the formal (but elegant) result

\[ H_m(t) = \sum_{i,j} \alpha_{ij} \kappa_i \kappa_j. \tag{4.19} \]

The magnetic helicity is a quadratic form of the flux strengths $\kappa_i$ of each flux tube with coefficients $\alpha_{ij}$ depending only on the topology of the field lines.
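As a minimal sketch of equation (4.19), the quadratic form can be evaluated for a pair of flux tubes. The numbers below are illustrative, and setting the diagonal coefficients $\alpha_{ii}$ to zero is an assumption appropriate for unknotted tubes without internal twist.

```python
def helicity_from_linking(alpha, kappa):
    """Evaluate H_m = sum_{i,j} alpha[i][j] * kappa[i] * kappa[j]."""
    n = len(kappa)
    return sum(alpha[i][j] * kappa[i] * kappa[j]
               for i in range(n) for j in range(n))

kappa = [2.0, 3.0]                  # fluxes of the two tubes (illustrative)
alpha_linked = [[0, 1], [1, 0]]     # tubes linked once, alpha_12 = alpha_21 = +1
alpha_unlinked = [[0, 0], [0, 0]]   # tubes not linked

h_linked = helicity_from_linking(alpha_linked, kappa)      # 2 * 2.0 * 3.0 = 12.0
h_unlinked = helicity_from_linking(alpha_unlinked, kappa)  # 0.0
print(h_linked, h_unlinked)
```

For two tubes linked once the result is $2\kappa_1\kappa_2$, since both $\alpha_{12}$ and $\alpha_{21}$ contribute to the sum.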

Figure 4.8: Unknotting the trefoil knot [5]. The curve $C$ is equivalent to the trefoil knot and can be unknotted by adding two properly oriented segments (dashed lines). The result is two linked unknotted curves, $C_1$ and $C_2$, with linking number $\alpha_{12} = +1$.

We can slightly generalize the result by allowing the curves $C_i$ to be knotted. It is possible to decompose a knotted curve into several links by adding two segments carrying opposite fluxes, thereby not changing the total value of the magnetic helicity. A knotted field line (trefoil knot) is shown in figure 4.8 together with its equivalent decomposition in two linked curves following Moffatt’s paper [5]. In this case, equation (4.19) holds with the constants $\alpha_{ij}$ accounting for the combination of the number of links and knots in the system.

Equation (4.19) shows at once that the conservation of $H_m(t)$ is a direct consequence of the conservation of the topology of the magnetic field lines, i.e., the constants $\alpha_{ij}$, as well as of the flux through each flux tube. We also see that the magnetic helicity measures how linked the magnetic field lines are, since $H_m = 0$ if flux tubes are not linked with each other ($\alpha_{ij} = 0$). A variation in $H_m$, due to non-ideal effects such as a finite resistivity or finite electron inertia, signals a change in the topology of the field lines.

In 1973 Arnold proposed a theorem which generalizes equation (4.19) to the case of a realistic field [65]. Such a generalization reads [66],

\[ H_m(t) = \int_\Omega \int_\Omega \lambda(x_1, x_2)\, dx_1 dx_2, \tag{4.20} \]

where $\lambda \in L^1(\Omega \times \Omega)$ is the asymptotic linking number of two field lines issued from the points $x_1$ and $x_2$, respectively. This is operatively defined by the following procedure. Given two points $x_1$ and $x_2$, one computes the two field lines issued from such points, following them for $\sigma \in [0, L_1]$ and $\sigma \in [0, L_2]$, respectively. In general, the two field lines will not close on themselves, but one can close them by adding a straight segment connecting each end-point to the corresponding starting point. Such segments will not intersect each other apart from a set of measure zero of points $x_1$ and $x_2$. One then has two closed simple curves and can compute their linking number, which depends on $L_1$ and $L_2$. Dividing the linking number by $L_1 \cdot L_2$ and letting $L_1, L_2 \to +\infty$, the limit is the asymptotic linking number $\lambda(x_1, x_2)$. The rigorous proof of equation (4.20) was completed by Vogel [67] in 2003.

Topological bounds on magnetic energy. Conservation of magnetic helicity implies a lower bound on the magnetic energy. This observation has important consequences for the dynamics of plasmas, as observed by Woltjer [4] and later by Taylor [6] in his approach to the MHD equilibrium problem, which is now called Taylor relaxation. An overview of the role of magnetic helicity in plasma confinement is given by Yoshida [68].

Let us consider a bounded simply connected domain $\Omega$ with smooth boundary $\partial\Omega$ and a magnetic field configuration $B \in L^2(\Omega)$ at a certain time, satisfying $\nabla \cdot B = 0$ in $\Omega$ and $B \cdot n = 0$ on $\partial\Omega$. The corresponding potential $A$ such that $\nabla \times A = B$, with the gauge fixed by the condition $\nabla \cdot A = 0$, is given by the following result.

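Theorem 4.11 (Theorem 3.6 in Girault and Raviart [69]). Let $\Omega \subset \mathbb{R}^3$ be a bounded simply connected domain with smooth connected boundary $\partial\Omega$ and $v \in L^2(\Omega)$ satisfying

\[ \nabla \cdot v = 0 \ \text{in } \Omega \quad \text{and} \quad v \cdot n = 0 \ \text{weakly on } \partial\Omega. \]

Then there is one potential $w \in H^1(\Omega)$ such that $v = \nabla \times w$ with tangential boundary condition $w \times n = 0$ on $\partial\Omega$. The potential is characterized as the unique solution of

\[
\begin{aligned}
-\Delta w &= \nabla \times v && \text{in } H^{-1}(\Omega), \\
\nabla \cdot w &= 0 && \text{in } \Omega, \\
w \times n &= 0 && \text{on } \partial\Omega.
\end{aligned}
\]

We shall also need the following Poincaré-type inequality.

Theorem 4.12 (Lemma 3.4 in Girault and Raviart [69]). With $\Omega$ as in theorem 4.11, there is a constant $C_P$ such that

\[ \|w\|_{L^2(\Omega)} \leq C_P \big\{ \|\nabla \times w\|_{L^2(\Omega)} + \|\nabla \cdot w\|_{L^2(\Omega)} \big\}, \]

for all $w : \Omega \to \mathbb{R}^3$ such that $w \in L^2(\Omega)$ with weak curl and divergence $\nabla \times w, \nabla \cdot w \in L^2(\Omega)$ and satisfying the boundary condition $w \times n = 0$ on $\partial\Omega$.

As a direct consequence of theorems 4.11 and 4.12 one has the claimed lower bound on the energy.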

Theorem 4.13 (Arnold [65]). With $\Omega$ as in theorem 4.11, let $B \in L^2(\Omega)$ be a given magnetic field and $A \in H^1(\Omega)$ the corresponding vector potential constructed as in theorem 4.11. Then

\[ |H_m| \le C_P \|B\|^2_{L^2(\Omega)}, \]

where $H_m = (A, B)_{L^2(\Omega)}$ is the magnetic helicity.

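Proof. The Cauchy-Schwarz inequality gives

\[ |H_m| = \big|(A, B)_{L^2(\Omega)}\big| \le \|A\|_{L^2(\Omega)} \cdot \|B\|_{L^2(\Omega)}, \]

and $A \in H^1(\Omega)$ obtained from theorem 4.11 satisfies in particular $\nabla\cdot A = 0$ in $\Omega$ and $A\times n = 0$ on $\partial\Omega$. Hence theorem 4.12 implies

\[ \|A\|_{L^2(\Omega)} \le C_P \|\nabla\times A\|_{L^2(\Omega)}, \]

and thus

\[ \|A\|_{L^2(\Omega)} \le C_P \|B\|_{L^2(\Omega)}. \]

When this is inserted in the estimate for the helicity, we have

\[ |H_m| \le \|A\|_{L^2(\Omega)} \cdot \|B\|_{L^2(\Omega)} \le C_P \|B\|^2_{L^2(\Omega)}, \]

which is the claimed estimate.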

Conservation of magnetic helicity implies that the magnetic energy, which is proportional to $\|B\|^2_{L^2(\Omega)}$, cannot decrease arbitrarily. Since magnetic helicity is essentially related to the topological properties of the magnetic field, the topology imposes a constraint on the relaxation of the magnetic field: not all magnetic energy can be transferred to kinetic or internal energy of the plasma (total energy is conserved). It was proposed by Woltjer and Taylor that, even in the presence of non-ideal effects (viscosity and resistivity), magnetic helicity decays slowly on the time scale of interest and thus is approximately preserved and poses a strong constraint on the dissipation and relaxation to equilibrium.

4.6 Analogy with the vorticity of isentropic flows. Many of the results stated for the magnetic field were first discovered for isentropic flows. There is in fact a strong formal analogy between the equations of ideal MHD and those for isentropic flows.

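If one recalls the results of section 1.8, the continuity equation, the momentum balance equation and the vorticity equation for an inviscid isentropic fluid without external forces take the form

\[ \begin{aligned} \partial_t \rho + \nabla\cdot(\rho u) &= 0, \\ \partial_t u - u\times(\nabla\times u) &= \nabla\chi(\rho, u), \\ \partial_t \omega - \nabla\times(u\times\omega) &= 0, \end{aligned} \tag{4.21} \]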

with $\omega = \nabla\times u$, and we recognize the same structure as in ideal MHD, where $u$ plays the role of the magnetic vector potential $A$, and the vorticity $\omega$ that of the magnetic field. The only difference is that in the case of equations (4.21) the velocity field is both the potential for $\omega$ and the advecting field, but this difference is inessential here.

We can therefore conclude at once the following results for the vorticity. First, we have an exact invariant, the helicity of the flow, which is the analogue of the magnetic helicity in MHD.

Proposition 4.14. For solutions of (4.21) the helicity is conserved,

\[ \frac{d}{dt}\int_\Omega u\cdot\omega \, dx = 0. \tag{4.22} \]

The proof follows in the same way as for magnetic helicity (proposition 4.3). In this case it is possible to give a physical interpretation of this quantity. The vorticity $\omega$ describes the rotation of a fluid element around a reference Lagrangian trajectory, cf. section 1.3, and the quantity $u\cdot\omega$ is positive if the rotation is "right-handed" with respect to $u$, i.e., anti-clockwise as in figure 4.3. For a particle with spin this is called helicity, and the name has been adopted in fluid mechanics [5].

We also have an equivalent of the frozen-in law, which one can prove in exactly the same way.

Proposition 4.15. For a solution of equations (4.21) such that $\rho, u, \omega \in C^1$, $\rho > 0$ and the flow $F(t, x) = F_t(x)$ is in $C^2$, the vorticity $\omega$ is frozen into the fluid flow. In addition, for incompressible flows the condition $\rho > 0$ can be dropped.

Analogously, we have flux conservation. For the case of isentropic flows, however, this result is usually formulated as a conservation of the circulation of the velocity and is referred to as Kelvin's theorem.

Proposition 4.16 (Kelvin's circulation theorem). Let $\omega = \nabla\times u$ and $u \in C^2$ be a solution of equations (4.21), and let $S_t$ be a compact connected orientable surface with boundary evolving with the flow $F_t$ of the velocity field $u$. If $F(t, x) = F_t(x)$ is in $C^2(I\times\Omega)$, then

\[ \frac{d}{dt}\int_{\partial S_t} u\cdot t \, ds = 0. \tag{4.23} \]

Surfaces satisfying the condition $\omega\cdot n = 0$ are referred to as vortex sheets [11], and they are preserved by the dynamics in the same way as flux surfaces are preserved in MHD (the precise statement can be deduced from proposition 4.7).

A tube of vorticity field lines is referred to as a vortex tube. The analogue of proposition 4.10 is the Helmholtz theorem, which is formulated in terms of circulations rather than fluxes.

Proposition 4.17 (Helmholtz theorem). Let $u$, $\omega$ and $F$ be as in proposition 4.16 and let $T$ be a vortex tube. Then the circulation

\[ \kappa = \int_C u\cdot t \, ds \tag{4.24} \]

of the velocity on a simple closed curve $C$, which bounds a cross-section of the tube, is independent of the chosen cross-section and constant in time.

The constant $\kappa$ can be associated with the vortex tube in the same way the magnetic flux can be associated with a flux tube.

The same considerations on linked and knotted flux tubes of section 4.5 apply here to vortex tubes. In particular, the conservation of fluid helicity implies a lower bound for the $L^2$-norm of the vorticity.


5 Basic processes in magnetohydrodynamics

The solution of MHD equations (except for very few simple cases) necessarily requires computer codes, even for very idealized problems. A good understanding of qualitative features of the solutions is, however, an essential prerequisite to the development and application of such MHD codes as well as for the physics interpretation of the results. In this section, some basic physics processes occurring in magnetohydrodynamics are illustrated in simple situations.

5.1 Linear MHD waves. Given a uniform plasma (where "uniform" means constant in time and homogeneous in space), we shall study the evolution of a small disturbance in terms of normal modes (plane waves). This allows us to identify the natural wave modes of the system and their propagation speeds, which often set the characteristic time scales for physical processes and have to be accounted for in the design of either stable explicit numerical schemes or preconditioners for implicit schemes. The method is relatively simple and can be carried out for most plasma physics models. The technique employed here is at least as important as the results themselves.

Let us consider the full resistive MHD system (3.50a). We are interested in solutions describing a uniform plasma without flow, i.e., with constant density $\rho_0$, vanishing fluid velocity $u_0 = 0$, constant pressure $p_0$, and constant magnetic field $B_0$. One can check that this is a solution only in the absence of gravity, which typically produces stratification in a fluid; hence we assume $g = 0$ in this section.

We study the evolution of a small amplitude perturbation

\[ \begin{aligned} \rho &= \rho_0 + \rho_1, \\ u &= u_1, \quad (u_0 = 0), \\ p &= p_0 + p_1, \\ B &= B_0 + B_1, \end{aligned} \tag{5.1} \]

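where "small amplitude" means that, on substituting the perturbation (5.1) into the MHD equations (3.50a), nonlinear terms are neglected. This procedure is known as linearization. For the specific case under consideration, the linearized system reads

\[ \begin{aligned} \partial_t \rho_1 + \rho_0 \nabla\cdot u_1 &= 0, \\ \rho_0 \partial_t u_1 &= -\nabla p_1 + \frac{1}{4\pi}\big(B_0\cdot\nabla B_1 - \nabla B_1\cdot B_0\big), \\ \partial_t p_1 + \gamma p_0 \nabla\cdot u_1 &= 0, \\ \partial_t B_1 - B_0\cdot\nabla u_1 + (\nabla\cdot u_1) B_0 &= \kappa_\eta \Delta B_1, \end{aligned} \tag{5.2} \]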

where we have assumed constant resistivity and all nonlinear terms have been dropped. We see in particular that the Ohmic heating term on the right-hand side of the pressure equation is zero, since $J_0 = \frac{c}{4\pi}\nabla\times B_0 = 0$ for a uniform equilibrium. Therefore we expect that, in general, the linearized system does not conserve energy: the fixed background provides an inexhaustible energy sink/source.


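We look for solutions of the linearized system in the form of plane waves,

\[ \begin{pmatrix} \rho_1 \\ u_1 \\ p_1 \\ B_1 \end{pmatrix} = \begin{pmatrix} \rho \\ u \\ p \\ B \end{pmatrix} e^{-izt + ik\cdot x}, \tag{5.3} \]

where, from here on, $\rho$, $u$, $p$ and $B$ on the right-hand side denote constant complex amplitudes, and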

where $z = \omega + i\nu \in \mathbb{C}$ is a complex frequency and $k \in \mathbb{R}^d$ the wave vector. Let us note that this is not a Fourier transform, but rather a particular family of smooth functions depending parametrically on $(z, k)$. The substitution of the plane wave (5.3) into the linearized system (5.2) gives

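\[ \begin{aligned} -iz\rho + i\rho_0\, k\cdot u &= 0, \\ -iz\rho_0 u &= -ikp + \frac{i}{4\pi}\big[(B_0\cdot k)\,B - k\,(B_0\cdot B)\big], \\ -izp + i\gamma p_0\, k\cdot u &= 0, \\ -izB - i(B_0\cdot k)\,u + iB_0\,(k\cdot u) &= -\kappa_\eta k^2 B, \end{aligned} \tag{5.4} \]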

which is a linear algebraic system for the complex amplitudes $\rho$, $u$, $p$, and $B$. It is worth noting that for Euler's equations (1.41), the linearization procedure would give the system (5.4) with $B_0 = 0$ and $B = 0$. Let us examine this case first.

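The linearized momentum balance equation without the magnetic field, multiplied by $iz$ and combined with the pressure equation $-izp = -i\gamma p_0\, k\cdot u$, reads

\[ z^2\rho_0 u = ik(-izp) = ik(-i\gamma p_0\, k\cdot u) = \gamma p_0\, k\,(k\cdot u), \]

which can be rewritten in the form

\[ \big[z^2 I - (\gamma p_0/\rho_0)\, kk\big]\cdot u = 0, \tag{5.5} \]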

where $I$ is the identity tensor. The quick way to solve this equation is a scalar multiplication by $\hat k$, which gives

\[ \big(z^2 - (\gamma p_0/\rho_0)k^2\big)(\hat k\cdot u) = 0, \qquad \hat k = k/|k|, \]

on one hand, and a vector multiplication by $\hat k\times$, which gives

\[ z^2\, \hat k\times u = 0, \]

on the other hand. The latter implies that for $z \neq 0$, $u$ is parallel to $k$, and the former reduces to the algebraic condition

\[ z^2 = (\gamma p_0/\rho_0)k^2, \]

which has two real-valued solutions, $z = \omega$ with

\[ \omega = \pm c_S |k|, \qquad c_S = \sqrt{\gamma p_0/\rho_0}. \tag{5.6} \]

This describes a wave which moves in the direction of $k$ with constant speed $c_S$ and is characterized by a velocity perturbation in the same direction $k$ of the propagation, which means that the wave is purely compressional and therefore it is accompanied by density and pressure oscillations. This is commonly referred to as the sound wave, and its propagation speed $c_S$ is the sound speed in the fluid. The fact that $c_S$ is constant means, in particular, that the propagation speed does not depend on the direction of $k$, i.e., the wave is isotropic. Equation (5.6) specifies the frequency of the wave, given the wave vector, and it is referred to as the dispersion relation.
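The polarization argument for the sound wave can be checked numerically. The following is a minimal sketch in Python, assuming illustrative background values ($\gamma = 5/3$, $p_0 = \rho_0 = 1$) and a hypothetical helper `acoustic_residual` that evaluates the left-hand side of (5.5); none of these names come from the text.

```python
import math

def acoustic_residual(z, k, u, c_s):
    """Components of [z^2 I - c_S^2 k k] . u, the left-hand side of equation (5.5)."""
    k_dot_u = sum(ki * ui for ki, ui in zip(k, u))
    return [z * z * ui - c_s * c_s * ki * k_dot_u for ki, ui in zip(k, u)]

# Illustrative (assumed) background values in arbitrary units.
gamma, p0, rho0 = 5.0 / 3.0, 1.0, 1.0
c_s = math.sqrt(gamma * p0 / rho0)

k = (1.0, 2.0, 2.0)                              # wave vector with |k| = 3
k_norm = math.sqrt(sum(ki * ki for ki in k))
omega = c_s * k_norm                             # dispersion relation (5.6)

u_parallel = tuple(ki / k_norm for ki in k)      # longitudinal polarization
u_transverse = (2.0, -1.0, 0.0)                  # orthogonal to k

res_parallel = acoustic_residual(omega, k, u_parallel, c_s)
res_transverse = acoustic_residual(omega, k, u_transverse, c_s)

print("longitudinal residual:", res_parallel)
print("transverse residual:  ", res_transverse)
```

The longitudinal amplitude satisfies (5.5) up to rounding, while the transverse one leaves the residual $\omega^2 u \neq 0$, in agreement with the statement that $u$ must be parallel to $k$ for $z \neq 0$.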

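On going back to (5.4), we note that the equations for $\rho$ and $p$ can be explicitly solved in terms of the velocity perturbation $u$; the induction equation requires some more care because of the resistivity term,

\[ -izB = \frac{iz}{z + i\kappa_\eta k^2}\big[(B_0\cdot k)\,u - B_0\,(k\cdot u)\big], \]

for $\omega = \mathrm{Re}(z) \neq 0$, so that the denominator does not vanish for real $k$. The equation for $u$ multiplied by $z$ amounts to

\[ \begin{aligned} -iz^2\rho_0 u &= k(-izp) - \frac{1}{4\pi}\Big[(B_0\cdot k)(-izB) - k\,B_0\cdot(-izB)\Big] \\ &= -i\gamma p_0\, k(k\cdot u) - \frac{1}{4\pi}\frac{iz}{z + i\kappa_\eta k^2}\Big[(B_0\cdot k)\big[(B_0\cdot k)\,u - B_0\,(k\cdot u)\big] - k\,B_0\cdot\big[(B_0\cdot k)\,u - B_0\,(k\cdot u)\big]\Big] \\ &= -i\gamma p_0\, k(k\cdot u) - \frac{|B_0|^2}{4\pi}\frac{iz}{z + i\kappa_\eta k^2}\Big[k_\parallel^2\, u - k_\parallel (bk + kb)\cdot u + k(k\cdot u)\Big], \end{aligned} \]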

where $b = B_0/|B_0|$ is the unit vector along the direction of the background magnetic field $B_0$ and $k_\parallel = b\cdot k$. Upon dividing by $-i\rho_0$, we can recognize the sound speed in the first term on the right-hand side, cf. equation (5.6), and the Alfvén speed

\[ c_A = \sqrt{\frac{|B_0|^2}{4\pi\rho_0}}, \tag{5.7} \]

cf. section 3.5. Therefore, we have

\[ \Big[z^2 I - c_S^2\, kk - c_A^2 \zeta_\eta \big(k_\parallel^2 I - k_\parallel (bk + kb) + kk\big)\Big]\cdot u = 0, \tag{5.8} \]

where $\zeta_\eta = z/(z + i\kappa_\eta k^2)$ accounts for the effect of resistivity; for ideal MHD we have $\zeta_\eta = 1$. Equation (5.8) is the MHD analogue of equation (5.5), and in fact it reduces to (5.5) when $c_A = 0$. The presence of the magnetic field induces a much richer wave motion.

Equation (5.8) has the form

\[ D(z, k)\, u = 0, \tag{5.9} \]

where the complex-matrix-valued function $D(z, k)$ is referred to as the dispersion tensor. A non-trivial solution $u \neq 0$ of (5.9) exists if and only if the matrix $D$ is not invertible, that is,

\[ \det D(z, k) = 0, \tag{5.10} \]

which should be solved for the complex frequency $z$, given a real wave vector $k$ [70]. To every solution $z(k) \in \mathbb{C}$ there corresponds a plane wave solution of the linearized MHD system (5.2) oscillating at the frequency $\omega(k) = \mathrm{Re}\, z(k)$ and with amplitude depending exponentially on time, $\sim e^{\nu(k)t}$, where $\nu(k) = \mathrm{Im}\, z(k)$, cf. equation (5.3). The wave is stable, i.e., uniformly bounded for $t \ge 0$, if $\nu(k) \le 0$, or unstable, if $\nu(k) > 0$. In the latter case, the wave amplitude grows exponentially in time, up to the point where the linearization (i.e., neglecting nonlinear terms in the perturbation) is no longer valid and a fully nonlinear treatment has to be considered.

Figure 5.1: Reference frame used to study plasma waves [71] (axes $x_1$, $x_2$, $x_3$; wave vector $k$ with components $k_\perp$ and $k_\parallel$; propagation angle $\vartheta$; background field $B_0$). The magnetic field direction defines the direction of the $x_3$ axis, while the $x_1$ and $x_2$ axes are rotated so that $k_2 = 0$. This, in particular, yields the matrix form (5.12) of the dispersion tensor in equation (5.8).

For the specific case of equation (5.8) the problem can be simplified by a suitable choice of the coordinate system. Without loss of generality we can choose the coordinate system so that the third axis is directed along the background magnetic field, i.e., $b = e_3 = (0, 0, 1)$, and so that the wave vector $k$ belongs to the plane spanned by $e_1, e_3$, i.e., $k = (k_\perp, 0, k_\parallel) = k_\perp e_1 + k_\parallel e_3$, where $k_\parallel = b\cdot k$ and $k_\perp^2 = k^2 - k_\parallel^2$ are the components of $k$ parallel and perpendicular to the background magnetic field. The angle $\vartheta$ between the direction of the wave vector $k$ and the magnetic field is the propagation angle,

\[ k_\parallel = |k|\cos\vartheta, \qquad k_\perp = |k|\sin\vartheta. \tag{5.11} \]

This coordinate system is referred to as the Stix frame in the theory of plasma waves [71] and it is sketched in figure 5.1.

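In the Stix frame, the dyadic products in (5.8) amount to the matrices

\[ kk = \begin{pmatrix} k_\perp^2 & 0 & k_\parallel k_\perp \\ 0 & 0 & 0 \\ k_\perp k_\parallel & 0 & k_\parallel^2 \end{pmatrix}, \]

and

\[ bk = \begin{pmatrix} 0 & 0 & k_\perp \\ 0 & 0 & 0 \\ 0 & 0 & k_\parallel \end{pmatrix}, \qquad kb = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ k_\perp & 0 & k_\parallel \end{pmatrix}. \]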


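Correspondingly, the dispersion tensor in (5.8) for linear MHD waves reads

\[ D(z, k) = \begin{pmatrix} z^2 - c_S^2 k_\perp^2 - c_A^2\zeta_\eta k^2 & 0 & -c_S^2 k_\parallel k_\perp \\ 0 & z^2 - c_A^2\zeta_\eta k_\parallel^2 & 0 \\ -c_S^2 k_\parallel k_\perp & 0 & z^2 - c_S^2 k_\parallel^2 \end{pmatrix}. \tag{5.12} \]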

Let us first study the ideal case, i.e., $\kappa_\eta = 0$ and $\zeta_\eta = 1$. By inspection of the matrix (5.12) with $\zeta_\eta = 1$, one can notice the solution

\[ z^2 - c_A^2 k_\parallel^2 = 0, \qquad u_1 = u_3 = 0, \quad u_2 \neq 0, \]

which means that there exists a normal mode with real-valued frequency

\[ \omega^2 = c_A^2 k_\parallel^2, \tag{5.13a} \]

and polarized so that

\[ k\cdot u = 0, \qquad B_0\cdot u = 0. \tag{5.13b} \]

This normal mode is referred to as the Alfvén wave or shear Alfvén wave, and it is the most characteristic wave mode in MHD [1]. The fact that the frequency of shear Alfvén waves is real means that the wave is stable and undamped for ideal MHD on a uniform plasma background. The first polarization condition in (5.13b) is equivalent to incompressibility, $\nabla\cdot u_1 = 0$, and implies that density and pressure perturbations are zero, cf. equation (5.4). As for the magnetic perturbation, from the induction equation in (5.4) we have

\[ B = \mp\big(|B_0|/c_A\big)\, u, \tag{5.13c} \]

that is, the magnetic field perturbation $B$ is either anti-parallel or parallel to the velocity perturbation $u$ depending on the root of the dispersion relation (5.13a), namely, anti-parallel for $\omega = +c_A k_\parallel$ or parallel for $\omega = -c_A k_\parallel$. The divergence-free condition $k\cdot B = 0$ is automatically satisfied, while the second of the polarization conditions (5.13b) implies $B_0\cdot B = 0$, which means that the perturbation is orthogonal to the background magnetic field.

Figure 5.2 shows a simple example of an Alfvén wave propagating parallel to the background magnetic field. One can see that the perturbation makes the field lines oscillate like strings without compressing the plasma. The fluid flow bends the magnetic field lines, thus creating an electric current $J \propto \nabla\times B$, which in the presence of a magnetic field reacts back on the fluid through the field-line bending force, cf. section 3.5, opposing the bending of field lines and thus generating the wave oscillation.

Finally, let us note that the shear Alfvén wave is the only wave mode in the incompressible limit. This can be checked by imposing the incompressibility condition $k\cdot u = 0$ in equation (5.4) and noticing that in this case the momentum balance equation implies the condition $B_0\cdot u = 0$, which in turn gives $B_0\cdot B = 0$ from the induction equation.

On going back to the full problem, we can proceed systematically and apply the general condition (5.10) to the dispersion tensor (5.12), again in the ideal case, i.e., with $\zeta_\eta = 1$. That gives

\[ \big(z^2 - c_A^2 k_\parallel^2\big)\Big[\big(z^2 - c_S^2 k_\perp^2 - c_A^2 k^2\big)\big(z^2 - c_S^2 k_\parallel^2\big) - c_S^4 k_\parallel^2 k_\perp^2\Big] = 0. \tag{5.14} \]


Figure 5.2: Velocity and magnetic field for the simplest case of an Alfvén wave in a uniform plasma, shown at a given point in time. The background magnetic field is $B_0 = |B_0| e_3$ and the wave vector is $k = |k| e_3$, i.e., the propagation is parallel to the background magnetic field. The amplitude of the perturbations is $|B_1|/|B_0| = |u_1|/c_A = 0.05$. Parallel black curves correspond to the field lines of the background magnetic field, while the thick blue lines are those of the total field, including the perturbation. The little arrows represent the velocity field (in red) and the magnetic field perturbation (in light blue).

The Alfvén mode appears as a factor, leading to the shear Alfvén wave addressed above. The factor in square brackets amounts to a quadratic equation for the square of the complex frequency $z^2$, namely,

\[ z^4 - (c_S^2 + c_A^2) k^2 z^2 + c_S^2 c_A^2 k_\parallel^2 k^2 = 0, \]

which is readily solved by the real-valued frequencies

\[ \omega^2 = \frac{1}{2}\Big[(c_S^2 + c_A^2) \pm \sqrt{(c_S^2 + c_A^2)^2 - 4 c_S^2 c_A^2 \cos^2\vartheta}\,\Big] k^2. \tag{5.15} \]

Those two wave modes are referred to as magnetosonic waves, as they combine the sound speed with the Alfvén speed. It is customary to characterize magnetosonic waves in terms of their phase velocity $v_{ph}$, defined in general by

\[ v_{ph} = \frac{\omega(k)}{|k|}. \tag{5.16} \]

For instance, the phase velocity of the shear Alfvén wave is

\[ \frac{v^A_{ph}}{c_A} = \cos\vartheta, \tag{5.17} \]

and it is natural to normalize the phase velocity to the Alfvén speed. For the case of the two magnetosonic waves, one has

\[ \frac{v^\pm_{ph}}{c_A} = \frac{1}{2}\Big[(2 + \gamma\beta) \pm \big[(2 + \gamma\beta)^2 - 8\gamma\beta\cos^2\vartheta\big]^{\frac12}\Big]^{\frac12}, \tag{5.18} \]

where we have used the identity

\[ \frac{c_S^2}{c_A^2} = \frac{1}{2}\gamma\beta, \]

relating the ratio of sound and Alfvén speeds to the plasma $\beta$ defined in section 3.5. One can check that $v^+_{ph} \ge v^-_{ph}$; equality requires the inner square root in (5.18) to vanish, which happens only for parallel propagation, $\cos^2\vartheta = 1$, and $\gamma\beta = 2$ (i.e., $c_S = c_A$). Therefore, the wave corresponding to $v^+_{ph}$ is referred to as the fast magnetosonic wave, while the other one is the slow magnetosonic wave.

Figure 5.3 shows a polar plot of phase velocities of ideal MHD wave modes as functions of the propagation angle: the curves are defined as $v_{ph}(\vartheta)(\cos\vartheta, \sin\vartheta)$ for $\vartheta \in [0, 2\pi)$, i.e., a straight line drawn from the center of the plot in the direction $\vartheta$ intersects one of the phase-velocity curves exactly at the distance $v_{ph}(\vartheta)$; when no intersection occurs, the wave mode is not propagating in that direction. From such a representation it is possible to appreciate the highly anisotropic behavior of the slow and the shear Alfvén waves (in this plot an isotropic wave would be represented by an exact circle of radius given by its phase velocity). The most isotropic mode is the fast wave, which is just the sound wave modified by the presence of the magnetic field.

Figure 5.3: Polar plot of phase velocities of ideal MHD waves normalized to $c_A$ according to equations (5.17) and (5.18), as functions of the propagation angle $\vartheta$, cf. equation (5.11), for the case $\gamma\beta = 1$.
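The phase velocities (5.17) and (5.18) can be cross-checked against the dispersion tensor (5.12). The sketch below, in Python, evaluates the three normalized phase speeds for a few propagation angles and verifies that $\det D$ vanishes at the corresponding frequencies in the ideal case. The function names, the choice $c_A = 1$, $|k| = 1$ and the sampled angles are assumptions made for illustration only.

```python
import math

def phase_velocities(theta, gamma_beta):
    """Return (slow, alfven, fast) phase speeds in units of c_A, from (5.17) and (5.18)."""
    s = 2.0 + gamma_beta
    root = math.sqrt(max(s * s - 8.0 * gamma_beta * math.cos(theta) ** 2, 0.0))
    fast = 0.5 * math.sqrt(s + root)
    slow = 0.5 * math.sqrt(max(s - root, 0.0))
    alfven = abs(math.cos(theta))   # magnitude of (5.17)
    return slow, alfven, fast

def ideal_dispersion_det(z, k_perp, k_par, c_s, c_a):
    """Determinant of the dispersion tensor (5.12) with zeta_eta = 1."""
    k2 = k_perp ** 2 + k_par ** 2
    d11 = z ** 2 - c_s ** 2 * k_perp ** 2 - c_a ** 2 * k2
    d22 = z ** 2 - c_a ** 2 * k_par ** 2
    d33 = z ** 2 - c_s ** 2 * k_par ** 2
    d13 = -c_s ** 2 * k_par * k_perp
    return d22 * (d11 * d33 - d13 * d13)

gamma_beta = 1.0                          # value used in figure 5.3
c_a = 1.0                                 # speeds measured in units of c_A
c_s = math.sqrt(0.5 * gamma_beta) * c_a   # from c_S^2 / c_A^2 = gamma * beta / 2
k_abs = 1.0

checks = []
for theta in (0.0, math.pi / 6, math.pi / 3, math.pi / 2):
    k_par, k_perp = k_abs * math.cos(theta), k_abs * math.sin(theta)
    speeds = phase_velocities(theta, gamma_beta)
    dets = [ideal_dispersion_det(v * c_a * k_abs, k_perp, k_par, c_s, c_a) for v in speeds]
    checks.append((theta, speeds, dets))
    print(f"theta = {theta:.3f}: slow, Alfven, fast = {speeds}")
```

For every sampled angle the three determinants vanish up to rounding, and the speeds are ordered as slow, shear Alfvén, fast, which is the ordering visible in the polar plot.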

We conclude this section by examining the effect of a small but finite resistivity on ideal MHD waves. Equation (5.10), with the dispersion tensor given in equation (5.12), gives

\[ z^2 - c_A^2 \zeta_\eta k_\parallel^2 = 0, \tag{5.19} \]

for the shear Alfvén wave and

\[ z^4 - (c_S^2 + c_A^2\zeta_\eta) k^2 z^2 + c_S^2 c_A^2 \zeta_\eta k_\parallel^2 k^2 = 0 \tag{5.20} \]

for the two magnetosonic waves, where $\zeta_\eta$ depends, in particular, on the complex frequency $z$, with the result that the solution of the dispersion equations is less simple than in the ideal case.


The case of the Alfvén wave can be dealt with by dividing the corresponding complex dispersion relation by $\zeta_\eta$, with the result that

\[ (z + i\kappa_\eta k^2)\, z - c_A^2 k_\parallel^2 = 0, \]

and this is a quadratic equation for the complex frequency $z \in \mathbb{C}$. The solutions are

\[ z = \pm\sqrt{c_A^2 k_\parallel^2 - \tfrac{1}{4}\kappa_\eta^2 k^4} - \frac{i}{2}\kappa_\eta k^2. \tag{5.21} \]

For $\kappa_\eta = 0$ we recover the frequency of the ideal Alfvén wave, while for a finite resistivity $\kappa_\eta$ one can notice two effects. On one hand, the frequency has a real part that depends non-linearly on $|k|$ (wave dispersion). On the other hand, it has acquired an imaginary part, which means that the wave can either decrease (damping) or increase (instability) exponentially in time depending on whether $\mathrm{Im}(z) < 0$ or $\mathrm{Im}(z) > 0$, respectively. By inspection of the solutions, one notices that the largest imaginary part is obtained when $k_\parallel = 0$, which gives

\[ \mathrm{Im}(z) = (\pm 1 - 1)\,\frac{1}{2}\kappa_\eta k^2 \le 0, \qquad (k_\parallel = 0), \]

hence, resistivity introduces damping of the Alfvén wave. In the limit of small resistivity (and assuming $\cos\vartheta \neq 0$), expanding the square root in (5.21) gives

\[ z \approx \pm c_A |k_\parallel| \mp \frac{\kappa_\eta^2 |k|^3}{8 c_A |\cos\vartheta|} - \frac{i}{2}\kappa_\eta k^2, \qquad \big(\text{for } \kappa_\eta^2 k^4/(4 c_A^2 k_\parallel^2) \ll 1\big), \tag{5.22} \]

which shows that wave dispersion is $O(\kappa_\eta^2)$ while resistive damping is $O(\kappa_\eta)$.
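As a numerical illustration, the exact roots (5.21) can be compared with the small-resistivity expansion of the square root. The Python sketch below uses assumed parameter values ($|k| = 1$, $\vartheta = \pi/3$, $c_A = 1$, $\kappa_\eta = 0.01$) and hypothetical function names; the frequency shift is written in terms of $k_\parallel$ as $\kappa_\eta^2 k^4/(8 c_A |k_\parallel|)$.

```python
import cmath
import math

def resistive_alfven_roots(k, theta, c_a, kappa):
    """Both roots (5.21) of (z + i kappa k^2) z - c_A^2 k_par^2 = 0."""
    k_par = k * math.cos(theta)
    root = cmath.sqrt(c_a ** 2 * k_par ** 2 - 0.25 * kappa ** 2 * k ** 4)
    damping = -0.5j * kappa * k ** 2
    return root + damping, -root + damping

def small_resistivity_estimate(k, theta, c_a, kappa):
    """Expansion of the '+' root, valid for kappa^2 k^4 << 4 c_A^2 k_par^2."""
    k_par = abs(k * math.cos(theta))
    shift = kappa ** 2 * k ** 4 / (8.0 * c_a * k_par)
    return c_a * k_par - shift - 0.5j * kappa * k ** 2

def quadratic_residual(z, k, theta, c_a, kappa):
    """Left-hand side of the resistive Alfven dispersion relation."""
    k_par = k * math.cos(theta)
    return (z + 1j * kappa * k ** 2) * z - c_a ** 2 * k_par ** 2

k, theta, c_a, kappa = 1.0, math.pi / 3, 1.0, 0.01   # illustrative values
z_plus, z_minus = resistive_alfven_roots(k, theta, c_a, kappa)
z_estimate = small_resistivity_estimate(k, theta, c_a, kappa)

print("exact roots:", z_plus, z_minus)
print("estimate   :", z_estimate)
```

Both exact roots have the same negative imaginary part $-\kappa_\eta k^2/2$, and the estimate differs from the exact `z_plus` only at the next order of the expansion.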

As for magnetosonic waves, in the presence of resistivity the dispersion relation (5.20) amounts to an algebraic equation of fourth degree: again dividing by $\zeta_\eta$ we have

\[ z^4 + i\kappa_\eta k^2 z^3 - (c_S^2 + c_A^2) k^2 z^2 - i\kappa_\eta k^2 c_S^2 k^2 z + c_S^2 c_A^2 k_\parallel^2 k^2 = 0. \]

However, if one is interested in the effects of a small resistivity on the ideal magnetosonic waves, the exact solution of this equation is not needed. It is possible to obtain a result analogous to equation (5.22) by a perturbation argument. For simplicity, let us denote by $Q(z, k) = Q_0(z, k) + i\kappa_\eta k^2 Q_1(z, k)$ the polynomial above, where $Q_0(z, k) = 0$ is the dispersion relation of the ideal magnetosonic mode and $Q_1(z, k) = z^3 - c_S^2 k^2 z$ accounts for the effects of resistivity. We look for deformations of an ideal root $z = \omega(k)$, with $\omega(k)$ given in (5.15), for small $\kappa_\eta$. With this aim, let us introduce the new complex variable $w$ defined by $z = \omega(k) + i\kappa_\eta k^2 w$, so that, by Taylor's formula,

\[ \begin{aligned} 0 &= Q_0\big(\omega(k) + i\kappa_\eta k^2 w, k\big) + i\kappa_\eta k^2 Q_1\big(\omega(k) + i\kappa_\eta k^2 w, k\big) \\ &= i\kappa_\eta k^2 \Big[Q_0'\big(\omega(k), k\big)\, w + Q_1\big(\omega(k), k\big)\Big] + \cdots, \end{aligned} \]

where $Q_0'$ denotes the complex derivative of the polynomial $Q_0$ with respect to $z$ and the dots stand for terms of higher order in the resistivity; the ideal dispersion relation $Q_0\big(\omega(k), k\big) = 0$ has been accounted for. We can now solve for $w$, obtaining the lowest order correction

\[ w = -\frac{Q_1\big(\omega(k), k\big)}{Q_0'\big(\omega(k), k\big)} = -\frac{\omega(k)^2 - c_S^2 k^2}{4\omega(k)^2 - 2(c_S^2 + c_A^2) k^2}, \]

and correspondingly the first-order correction to the frequency of resistive magnetosonic waves amounts to

\[ z \approx \omega(k) - \frac{i}{2}\kappa_\eta k^2\, \frac{\omega(k)^2 - c_S^2 k^2}{2\omega(k)^2 - (c_S^2 + c_A^2) k^2}. \]

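The first-order correction could describe either a damping or an instability depending on its sign; physically we expect damping, as resistivity dissipates the currents associated with the wave into heat. Upon accounting for the ideal dispersion relation, one has

\[ \omega(k)^2 - c_S^2 k^2 = \frac{1}{2}(c_A^2 - c_S^2) k^2 \pm \frac{1}{2}\sqrt{\Delta}\, k^2, \]

where $\Delta = (c_S^2 + c_A^2)^2 - 4 c_S^2 c_A^2 \cos^2\vartheta$, and

\[ \frac{\omega(k)^2 - c_S^2 k^2}{2\omega(k)^2 - (c_S^2 + c_A^2) k^2} = \frac{1}{2}\Big[1 \pm \frac{c_A^2 - c_S^2}{\sqrt{\Delta}}\Big], \]

and thus

\[ z \approx \omega(k) - \frac{i}{4}\kappa_\eta k^2 \Big[1 \pm \frac{c_A^2 - c_S^2}{\sqrt{\Delta}}\Big]. \tag{5.23} \]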

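On noting that

\[ \begin{aligned} \sqrt{\Delta} &= \sqrt{(c_S^2 + c_A^2)^2 - 4 c_S^2 c_A^2 \cos^2\vartheta} \\ &= \sqrt{c_S^4 + c_A^4 + 2 c_S^2 c_A^2 (1 - 2\cos^2\vartheta)} \ge \sqrt{c_S^4 + c_A^4 - 2 c_S^2 c_A^2} = \big|c_A^2 - c_S^2\big|, \end{aligned} \]

we can conclude that

\[ \Big|\frac{c_A^2 - c_S^2}{\sqrt{\Delta}}\Big| \le \frac{|c_A^2 - c_S^2|}{|c_A^2 - c_S^2|} = 1, \]

which implies that the first-order effect of resistivity on magnetosonic waves is damping, as expected.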
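The perturbative estimate (5.23) can be tested against the roots of the full quartic. The following Python sketch refines the estimate by a few Newton steps on $Q(z, k)$ and compares the two values; the parameters ($|k| = 1$, $\vartheta = \pi/4$, $c_S = 0.8$, $c_A = 1$, $\kappa_\eta = 10^{-3}$) and all function names are assumptions chosen for illustration, and Newton's method is simply one convenient way to obtain the nearby exact root.

```python
import math

def ideal_omega(k, theta, c_s, c_a, sign):
    """Positive root of (5.15) and the discriminant Delta; sign = +1 (fast) or -1 (slow)."""
    delta = (c_s ** 2 + c_a ** 2) ** 2 - 4.0 * c_s ** 2 * c_a ** 2 * math.cos(theta) ** 2
    omega2 = 0.5 * (c_s ** 2 + c_a ** 2 + sign * math.sqrt(delta)) * k ** 2
    return math.sqrt(omega2), delta

def damped_estimate(k, theta, c_s, c_a, kappa, sign):
    """First-order estimate (5.23) of the resistive root."""
    omega, delta = ideal_omega(k, theta, c_s, c_a, sign)
    factor = 1.0 + sign * (c_a ** 2 - c_s ** 2) / math.sqrt(delta)
    return omega - 0.25j * kappa * k ** 2 * factor

def quartic(z, k, theta, c_s, c_a, kappa):
    """Resistive magnetosonic polynomial Q = Q0 + i kappa k^2 Q1."""
    k_par2 = (k * math.cos(theta)) ** 2
    q0 = z ** 4 - (c_s ** 2 + c_a ** 2) * k ** 2 * z ** 2 + c_s ** 2 * c_a ** 2 * k_par2 * k ** 2
    q1 = z ** 3 - c_s ** 2 * k ** 2 * z
    return q0 + 1j * kappa * k ** 2 * q1

def quartic_prime(z, k, theta, c_s, c_a, kappa):
    """Derivative of Q with respect to z (theta is unused, kept for a uniform signature)."""
    q0p = 4.0 * z ** 3 - 2.0 * (c_s ** 2 + c_a ** 2) * k ** 2 * z
    q1p = 3.0 * z ** 2 - c_s ** 2 * k ** 2
    return q0p + 1j * kappa * k ** 2 * q1p

def refine(z, *params, steps=20):
    """Newton iteration started from the perturbative estimate."""
    for _ in range(steps):
        z = z - quartic(z, *params) / quartic_prime(z, *params)
    return z

k, theta, c_s, c_a, kappa = 1.0, math.pi / 4, 0.8, 1.0, 1e-3   # illustrative values
params = (k, theta, c_s, c_a, kappa)
results = {}
for name, sign in (("fast", +1), ("slow", -1)):
    estimate = damped_estimate(k, theta, c_s, c_a, kappa, sign)
    results[name] = (estimate, refine(estimate, *params))
    print(name, results[name])
```

For these values both the fast and the slow root have a negative imaginary part, and the refined root differs from the estimate by a quantity of second order in $\kappa_\eta$.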

5.2 Nonlinear shear Alfvén waves. The linear MHD waves discussed in section 5.1 are approximations of a solution of the equations of magnetohydrodynamics, valid when the amplitude of the wave is sufficiently small to justify the linearization. However, a remarkable property of the shear Alfvén wave is that it actually corresponds to an exact solution of the fully nonlinear MHD equations, even with arbitrarily large amplitude. This is a direct consequence of the specific polarization of the Alfvén wave, for which the magnetic field perturbation is either parallel or anti-parallel to the velocity field, cf. equation (5.13c).

The Alfvén wave corresponds to an incompressible disturbance of an ideal plasma ($\eta = 0$); therefore let us consider the incompressible ideal MHD equations (3.50e). We consider incompressible perturbations of arbitrary amplitude of a uniform plasma with constant mass density $\rho_0$ and magnetic field $B_0$ in a steady state, i.e., $u_0 = 0$. In view of the incompressibility of the perturbation, the density is constant and remains equal to $\rho_0$. Upon denoting by $u$ the perturbation in the velocity field and by $B$ the perturbation in the magnetic field, so that the total magnetic field is $B_0 + B$, equations (3.50e) amount to

\[ \begin{aligned} \partial_t u + u\cdot\nabla u - \frac{1}{4\pi\rho_0}(B_0 + B)\cdot\nabla B &= -\nabla P, \\ \partial_t B + u\cdot\nabla B - (B_0 + B)\cdot\nabla u &= 0, \end{aligned} \]

where we have accounted for the fact that $B_0$ is a constant background field. It is convenient to introduce new variables

\[ z^\pm = u \pm \frac{B}{\sqrt{4\pi\rho_0}}, \tag{5.24} \]

which are referred to as Elsässer variables. For incompressible perturbations, $\nabla\cdot u = 0$, and thus $\nabla\cdot z^\pm = 0$. In terms of Elsässer variables, the incompressible ideal MHD equations with a uniform guide field $B_0$ read

\[ \partial_t z^\pm \mp v_A\cdot\nabla z^\pm + z^\mp\cdot\nabla z^\pm = -\nabla P, \tag{5.25} \]

where $v_A = B_0/\sqrt{4\pi\rho_0}$ is the vectorial Alfvén velocity ($|v_A| = c_A$, where $c_A$ is the Alfvén speed defined in section 5.1). It is worth noting that in the above system the pressure is determined implicitly by the constraints $\nabla\cdot z^\pm = 0$; it is sufficient to have a single scalar function $P$ to enforce both constraints, since the divergence of the equation for both $z^+$ and $z^-$ yields the same equation for the pressure, namely, $-\Delta P = \nabla z^+ : \nabla z^-$, with the contraction $\nabla z^+ : \nabla z^- = \sum_{i,j} \partial_i z^+_j\, \partial_j z^-_i$.

Equations (5.25) are fully nonlinear and describe the exact dynamics of a perturbation $(u, B)$ of the considered uniform plasma. One can note, however, that the only nonlinear terms describe advection of one of the Elsässer variables by the other one. It follows that if $z^- = 0$, then the equation for $z^+$ is linear and the constraint $\nabla\cdot z^+ = 0$ is automatically satisfied, so that $P$ is constant. Vice versa, if $z^+ = 0$, then the equation for $z^-$ is linear and again $\nabla\cdot z^- = 0$ with $P$ = constant. Summarizing, we have found two classes of exact solutions, namely,

\[ z^- = 0, \qquad \partial_t z^+ - v_A\cdot\nabla z^+ = 0, \qquad \nabla\cdot z^+ = 0, \tag{5.26a} \]

and

\[ z^+ = 0, \qquad \partial_t z^- + v_A\cdot\nabla z^- = 0, \qquad \nabla\cdot z^- = 0. \tag{5.26b} \]

In both systems (5.26) the divergence-free constraint is identically satisfied for all times $t \ge t_0$ if it is satisfied at the initial time $t_0$. According to these systems, an initial disturbance is transported rigidly in a direction parallel to the background field $B_0$, either backward for the case of (5.26a), regressive wave, or forward for the case of (5.26b), progressive wave.

A Fourier transform shows that each harmonic composing the solution of system (5.26a) has the same dispersion and the same polarization as a regressive Alfvén wave, cf. equations (5.13a) and (5.13c), and analogously for the progressive wave. Therefore, a solution of either one of systems (5.26) represents a wave packet of shear Alfvén waves.
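The exactness of these solutions at large amplitude can be illustrated numerically. The Python sketch below builds a solution of (5.26a) from an assumed divergence-free profile $f = A\,(\partial_2\psi, -\partial_1\psi, 0)$ with $\psi = \sin\xi_1\sin\xi_2\cos\xi_3$, sets $z^+(t, x) = f(x + v_A t)$ and $z^- = 0$, and evaluates the residuals of the incompressible momentum and induction equations by central differences, with the magnetic field expressed in velocity units, $b = B/\sqrt{4\pi\rho_0}$. The profile, the amplitude, the value of $c_A$ and all names are assumptions for illustration.

```python
import math

C_A = 1.5          # Alfven speed, v_A = C_A * e_3 (illustrative)
AMPLITUDE = 5.0    # deliberately not small compared with C_A

def profile(xi):
    """Divergence-free profile f = A (d_2 psi, -d_1 psi, 0), psi = sin(xi1) sin(xi2) cos(xi3)."""
    s1, c1 = math.sin(xi[0]), math.cos(xi[0])
    s2, c2 = math.sin(xi[1]), math.cos(xi[1])
    c3 = math.cos(xi[2])
    return (AMPLITUDE * s1 * c2 * c3, -AMPLITUDE * c1 * s2 * c3, 0.0)

def fields(t, x):
    """u and b = B / sqrt(4 pi rho0) for the solution z- = 0, z+ = f(x + v_A t)."""
    z_plus = profile((x[0], x[1], x[2] + C_A * t))
    u = tuple(0.5 * c for c in z_plus)
    return u, u   # z- = u - b = 0 means b = u

def partial(fun, t, x, axis, h=1e-4):
    """Central difference of a vector-valued fun(t, x); axis 0 is time, 1..3 are space."""
    def shifted(sign):
        if axis == 0:
            return fun(t + sign * h, x)
        y = list(x)
        y[axis - 1] += sign * h
        return fun(t, tuple(y))
    plus, minus = shifted(+1.0), shifted(-1.0)
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))

def residuals(t, x):
    """Residuals of the momentum (with constant P) and induction equations at (t, x)."""
    u, b = fields(t, x)
    get_u = lambda tt, xx: fields(tt, xx)[0]
    get_b = lambda tt, xx: fields(tt, xx)[1]
    du = [partial(get_u, t, x, a) for a in range(4)]   # du[a][i] = d_a u_i
    db = [partial(get_b, t, x, a) for a in range(4)]
    total = (b[0], b[1], b[2] + C_A)                   # v_A + b
    momentum, induction = [], []
    for i in range(3):
        adv_u = sum(u[j] * du[j + 1][i] for j in range(3))       # (u . grad) u_i
        adv_b = sum(u[j] * db[j + 1][i] for j in range(3))       # (u . grad) b_i
        str_b = sum(total[j] * db[j + 1][i] for j in range(3))   # ((v_A + b) . grad) b_i
        str_u = sum(total[j] * du[j + 1][i] for j in range(3))   # ((v_A + b) . grad) u_i
        momentum.append(du[0][i] + adv_u - str_b)
        induction.append(db[0][i] + adv_b - str_u)
    return momentum, induction

momentum_res, induction_res = residuals(0.3, (0.4, -1.1, 0.7))
print("momentum residual :", momentum_res)
print("induction residual:", induction_res)
```

The residuals vanish up to the finite-difference truncation error even though the perturbation amplitude exceeds the Alfvén speed, because the nonlinear terms $u\cdot\nabla u$ and $b\cdot\nabla b$ cancel identically when $u = b$.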

Large-amplitude Alfvén waves are frequently encountered in space plasmas. Such waves, in view of their large amplitude, can trigger nonlinear processes such as parametric decays, which have been used to explain various energy transfer mechanisms in the solar corona and solar wind [72].


5.3 Magnetic field diffusion. Let us consider the regime of low magnetic Reynolds numbers, $R_m \ll 1$, with a constant plasma resistivity. This leads to pure magnetic field diffusion, which is a highly idealized situation hardly ever found in nature; nonetheless, this study allows us to understand how a finite plasma resistivity affects the magnetic field. For $R_m \ll 1$, the hyperbolic terms in the induction equation (3.49) can be neglected and the latter decouples from the rest of the MHD system; with constant resistivity, we have

∂tB(t, x) = κη∆B(t, x), t > 0, x ∈ Rd. (5.27)

Here, $\Delta$ is the Laplace operator in Cartesian coordinates and $d = 3$, but the following results are valid in a generic number $d$ of dimensions. We set up an initial value problem with the initial condition

B(0, x) = B0(x), ∇ ·B0(x) = 0,

given at time $t = 0$.

Equation (5.27) is known as the heat equation, as it is the same equation describing the diffusion of temperature in a thermally conducting body. This is also one of the simplest examples of linear constant-coefficient partial differential equations, and the prototype of parabolic partial differential equations.

Let us start by noting two important properties of this equation, which alone give a good intuition of the behavior of the solutions. The first property is that the average field is conserved. If $B(t, x)$ is a regular solution of (5.27) such that, for every $t$, $B(t, x)$ is integrable in space, we can define the average field

\[ \bar B(t) = \int_{\mathbb{R}^d} B(t, x)\, dx, \]

and, since $\kappa_\eta \Delta B = \nabla\cdot(\kappa_\eta\nabla B)$, Gauss theorem yields

\[ \frac{d}{dt}\bar B(t) = \lim_{r\to+\infty}\int_{|x|\le r} \nabla\cdot\big(\kappa_\eta\nabla B(t, x)\big)\, dx = \lim_{r\to+\infty}\int_{|x|=r} \kappa_\eta\, n\cdot\nabla B\, d\sigma = 0, \]

since $\nabla B$ restricted to the sphere $|x| = r$ approaches zero sufficiently fast as $r \to +\infty$. Therefore the average field $\bar B(t)$ is a constant of motion,

\[ \bar B(t) = \text{constant}. \tag{5.28} \]

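The second property is the dissipation of the $L^2$-norm. If $B(t, x)$ is a classical solution with a finite $L^2$-norm,

\[ \|B(t)\|^2_{L^2(\mathbb{R}^d)} = \int_{\mathbb{R}^d} |B(t, x)|^2\, dx, \]

then Gauss theorem again gives

\[ \begin{aligned} \frac{d}{dt}\|B(t)\|^2_{L^2(\mathbb{R}^d)} &= 2\int_{\mathbb{R}^d} B(t, x)\cdot\partial_t B(t, x)\, dx = 2\kappa_\eta\int_{\mathbb{R}^d} B(t, x)\cdot\Delta B(t, x)\, dx \\ &= \kappa_\eta\int_{\mathbb{R}^d} \nabla\cdot\big[\nabla |B(t, x)|^2\big]\, dx - 2\kappa_\eta\sum_{i=1}^d \int_{\mathbb{R}^d} |\nabla B_i(t, x)|^2\, dx \\ &= -2\kappa_\eta\sum_{i=1}^d \int_{\mathbb{R}^d} |\nabla B_i(t, x)|^2\, dx \le 0, \end{aligned} \]

where the integral of the divergence vanishes as before. It follows that

\[ \|B(t)\|^2_{L^2(\mathbb{R}^d)} \le \|B(0)\|^2_{L^2(\mathbb{R}^d)}. \tag{5.29} \]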

Those two properties alone allow us to have a fairly accurate qualitative picture of the dynamics: if the $L^2$-norm of the initial magnetic field $B_0$ must be dissipated while conserving the average field, the maximum value of the field should decrease and the solution should spread over a larger volume.

In addition, the dissipation of the $L^2$-norm implies that $B = 0$ is the unique regular square-integrable solution of the heat equation corresponding to the zero initial condition, $B_0 = 0$. In fact, if $\|B_0\|_{L^2(\mathbb{R}^d)} = 0$, inequality (5.29) implies $\|B(t)\|_{L^2(\mathbb{R}^d)} = 0$ for all $t \ge 0$ and thus $B = 0$. For linear equations this is equivalent to uniqueness of the solution in the class of square-integrable functions. Indeed, if $B_1(t, x)$ and $B_2(t, x)$ are two regular square-integrable solutions corresponding to the same initial condition $B_0$, then, by linearity of the equation, their difference $B_1 - B_2$ is again a solution corresponding to the initial condition $B_1(0) - B_2(0) = B_0 - B_0 = 0$; thus, $B_1(t) - B_2(t) = 0$, which means that $B_1 = B_2$.

We can construct solutions of the heat equation by means of a partial Fourier transform in space. This technique is important in itself, as it can be applied to any linear constant-coefficient partial differential equation describing a time evolution on the full space $\mathbb{R}^d$.

Let us look for solutions that can be represented as

\[ B(t, x) = \frac{1}{(2\pi)^d}\int_{\mathbb{R}^d} e^{ik\cdot x}\, \hat B(t, k)\, dk, \]

with Fourier amplitude $\hat B(t, k)$. This is possible only if the initial condition $B_0$ admits a similar representation with Fourier amplitude $\hat B_0(k)$. We take initial conditions in the Schwartz space $\mathcal{S}(\mathbb{R}^d)$ of rapidly decreasing functions, which, in particular, is invariant under Fourier transform. Then,

\[ \hat B_0(k) = \int_{\mathbb{R}^d} e^{-ik\cdot x}\, B_0(x)\, dx \]

belongs to $\mathcal{S}(\mathbb{R}^d)$. After Fourier transform, equation (5.27) becomes

\[ \partial_t \hat B(t, k) + \kappa_\eta k^2 \hat B(t, k) = 0. \]

If the Fourier amplitude $\hat B(t, k)$ satisfies the ordinary differential equation

\[ \frac{d}{dt}\hat B(t, k) = -\kappa_\eta k^2 \hat B(t, k), \qquad \hat B(0, k) = \hat B_0(k), \tag{5.30} \]

where the Fourier dual variable $k$ is treated as a parameter, then $B(t, x)$ is a solution of the heat equation (5.27) with initial condition $B_0(x)$. The obtained ordinary differential equation has the unique solution

\[ \hat B(t, k) = e^{-\kappa_\eta t k^2}\, \hat B_0(k), \]

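and we have

\[ \begin{aligned} B(t, x) &= \frac{1}{(2\pi)^d}\int_{\mathbb{R}^d} e^{ik\cdot x - \kappa_\eta t k^2}\, \hat B_0(k)\, dk \\ &= \frac{1}{(2\pi)^d}\int_{\mathbb{R}^d}\int_{\mathbb{R}^d} e^{ik\cdot(x - x') - \kappa_\eta t k^2}\, B_0(x')\, dx'\, dk. \end{aligned} \]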


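For $t > 0$, the integral is absolutely convergent and we can exchange the integration order. With this aim we compute the Gaussian integral

\[ \begin{aligned} K(t, x) &= \frac{1}{(2\pi)^d}\int_{\mathbb{R}^d} e^{ik\cdot x - \kappa_\eta t k^2}\, dk \\ &= \frac{e^{-x^2/(4\kappa_\eta t)}}{(4\pi\kappa_\eta t)^{d/2}}\, \frac{1}{\pi^{d/2}}\int_{\mathbb{R}^d} e^{-(\xi - ix/\sqrt{4\kappa_\eta t})^2}\, d\xi, \qquad (t > 0), \end{aligned} \]

and the remaining integration gives a factor $\pi^{d/2}$. Then,

\[ \begin{aligned} B(t, x) &= e^{\kappa_\eta t\Delta} B_0(x) \\ &= \int_{\mathbb{R}^d} K(t, x - x')\, B_0(x')\, dx', \qquad K(t, x) = \frac{e^{-x^2/(4\kappa_\eta t)}}{(4\pi\kappa_\eta t)^{d/2}}. \end{aligned} \tag{5.31} \]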

The operator $e^{\kappa_\eta t\Delta} : \mathcal{S}(\mathbb{R}^d) \to \mathcal{S}(\mathbb{R}^d)$ is the heat flow operator, and its kernel $K(t, x)$ is referred to as the heat kernel. As $t \to 0^+$, the kernel approaches Dirac's $\delta$-function, $K(t, x - x') \to \delta(x - x')$, in the space $\mathcal{S}'(\mathbb{R}^d)$ of tempered distributions, and thus the heat flow approaches the identity operator, $e^{\kappa_\eta t\Delta} \to I$, thus recovering the initial condition.

Since the heat kernel acts on the initial condition as a convolution with a rapidly decreasing function (for $t > 0$), we can extend this solution to initial conditions in the space of tempered distributions $\mathcal{S}'(\mathbb{R}^d)$, i.e., to a rather large class of (generalized) functions. For instance, it is sufficient that $B_0$ is bounded by a polynomial in $x$ for the integral in (5.31) to be absolutely convergent for all $t > 0$ and $x \in \mathbb{R}^d$; the result solves the heat equation (5.27) for $t > 0$ and tends to the initial condition for $t \to 0^+$.

We note that $B(t, x)$ given by (5.31) is smooth in $x$ for every $t > 0$ (strict inequality!) even when $B_0$ is not, as a consequence of the regularity of the heat kernel: the heat flow is the prototype of a regularizing (or smoothing) operator.
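The smoothing property can be seen on a discontinuous initial condition. As a simple scalar, one-dimensional illustration (an assumption made here for brevity, with $d = 1$ in the heat kernel), take the step $B_0(x') = 1$ for $x' > 0$ and $B_0(x') = 0$ otherwise; the convolution (5.31) then evaluates to $\frac12\big[1 + \mathrm{erf}\big(x/\sqrt{4\kappa_\eta t}\big)\big]$, which is smooth for every $t > 0$. The Python sketch below compares a midpoint-rule evaluation of the convolution with this closed form; function names and parameter values are hypothetical.

```python
import math

def heat_kernel_1d(t, x, kappa):
    """Heat kernel K(t, x) of (5.31) for d = 1."""
    return math.exp(-x * x / (4.0 * kappa * t)) / math.sqrt(4.0 * math.pi * kappa * t)

def evolve_step(t, x, kappa, x_max=40.0, n=40000):
    """Midpoint-rule convolution of the kernel with the step B0(x') = 1 for x' > 0, else 0."""
    dx = x_max / n
    return sum(heat_kernel_1d(t, x - (i + 0.5) * dx, kappa) * dx for i in range(n))

def step_solution_exact(t, x, kappa):
    """Closed form of the same convolution in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(4.0 * kappa * t)))

kappa, t, x = 1.0, 0.5, 0.3          # illustrative values
numeric = evolve_step(t, x, kappa)
exact = step_solution_exact(t, x, kappa)
print("numerical convolution:", numeric)
print("closed form          :", exact)
```

The two values agree to within the quadrature error, and the closed form makes explicit that the jump of the initial condition is replaced by a smooth transition of width of order $\sqrt{\kappa_\eta t}$.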

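The solution obtained in terms of the heat flow operator satisfies the constraint $\nabla\cdot B = 0$ if the initial condition $B_0$ does,

\[
\begin{aligned}
\nabla\cdot B(t,x) &= \sum_{i=1}^{d} \int_{\mathbb{R}^d} \frac{\partial K}{\partial x_i}(t, x-x')\, B_{0,i}(x')\,dx' \\
&= -\sum_{i=1}^{d} \int_{\mathbb{R}^d} \frac{\partial K}{\partial x'_i}(t, x-x')\, B_{0,i}(x')\,dx' \\
&= -\sum_{i=1}^{d} \int_{\mathbb{R}^d} \frac{\partial}{\partial x'_i}\big[K(t, x-x')\, B_{0,i}(x')\big]\,dx' + \int_{\mathbb{R}^d} K(t, x-x')\, \nabla\cdot B_0(x')\,dx' = 0,
\end{aligned}
\]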

where the first term vanishes because it is the integral of a divergence and the second because the initial condition satisfies the constraint $\nabla\cdot B_0 = 0$. Since the heat kernel is normalized, i.e.,

\[
\int_{\mathbb{R}^d} K(t,x)\,dx = 1,
\]

one can check that (5.28) is automatically satisfied too. The reader is encouraged to check the property (5.29) directly from (5.31).


We can now consider two specific solutions in order to illustrate the diffusion of magnetic field. First, we consider a regular initial condition, namely a Gaussian tube of magnetic field lines. The initial condition is

\[
B_0(x) = B_c e_3\, e^{-(x_1^2 + x_2^2)/(2a^2)}, \qquad a > 0,
\]

where $e_3$ is the unit vector in the direction of $x_3$ and $B_c$ is a constant magnetic field amplitude (which is added for dimensional reasons). The solution amounts to the Gaussian integral

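\[
B(t,x) = \frac{B_c e_3}{4\pi\kappa_\eta t} \int_{\mathbb{R}^2} e^{-\frac{(y-y')^2}{4\kappa_\eta t} - \frac{(y')^2}{2a^2}}\,dy',
\]

where the integration in $x'_3$ has been performed and $y = (x_1, x_2)$. It remains to compute a two-dimensional Gaussian integral; we note that

\[
\frac{(y-y')^2}{4\kappa_\eta t} + \frac{(y')^2}{2a^2}
= \frac{y^2}{2a^2 + 4\kappa_\eta t} + \Big(\frac{1}{4\kappa_\eta t} + \frac{1}{2a^2}\Big)\big(y' - \eta(y)\big)^2
= \frac{y^2}{2a^2 + 4\kappa_\eta t} + z^2,
\]

where $\eta(y) = y/(1 + 2\kappa_\eta t/a^2)$ and $z = \big(1/(4\kappa_\eta t) + 1/(2a^2)\big)^{1/2}\big(y' - \eta(y)\big)$, and by the change of variable $y' \mapsto z$ one gets

\[
B(t,x) = B_c e_3\, \frac{2a^2}{2a^2 + 4\kappa_\eta t}\, e^{-\frac{x_1^2 + x_2^2}{2a^2 + 4\kappa_\eta t}}. \tag{5.32}
\]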

Therefore, the magnetic field profile preserves its Gaussian shape, but the width increases by a factor $\sqrt{1 + 2\kappa_\eta t/a^2} \sim \sqrt{\kappa_\eta t}$ as the solution spreads over the whole space; correspondingly the amplitude decreases $\sim (2\kappa_\eta t/a^2)^{-1}$, so that the average field is constant. From the form of the heat kernel, we could have expected a scaling of the amplitude of the form $\sim (\kappa_\eta t)^{-d/2}$ where $d$ is the dimension of the space; here, the physical dimension is $d = 3$, but the solution is essentially two-dimensional, hence we get the scaling with $d = 2$.
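As a sanity check of (5.32), one can verify numerically that the closed form satisfies the two-dimensional heat equation and that the flux through the $(x_1, x_2)$-plane does not change in time. The sketch below assumes central finite differences and a midpoint rule in polar coordinates; the names `tube_b3`, `heat_residual` and `flux` and all parameter values are illustrative only.

```python
import math

def tube_b3(t, x1, x2, a=1.0, kappa=1.0, bc=1.0):
    """Component B_3 of the Gaussian flux tube solution (5.32)."""
    s = 2.0 * a * a + 4.0 * kappa * t
    return bc * (2.0 * a * a / s) * math.exp(-(x1 * x1 + x2 * x2) / s)

def heat_residual(t, x1, x2, a=1.0, kappa=1.0, h=1e-3):
    """Central-difference estimate of dB_3/dt - kappa * Laplacian(B_3)."""
    b = lambda tt, y1, y2: tube_b3(tt, y1, y2, a, kappa)
    dt = (b(t + h, x1, x2) - b(t - h, x1, x2)) / (2.0 * h)
    lap = (b(t, x1 + h, x2) + b(t, x1 - h, x2)
           + b(t, x1, x2 + h) + b(t, x1, x2 - h)
           - 4.0 * b(t, x1, x2)) / (h * h)
    return dt - kappa * lap

def flux(t, a=1.0, kappa=1.0, n=20000, cutoff=12.0):
    """Midpoint-rule approximation of int B_3 dx1 dx2, using polar coordinates."""
    r_max = cutoff * math.sqrt(2.0 * a * a + 4.0 * kappa * t)
    h = r_max / n
    total = 0.0
    for j in range(n):
        r = (j + 0.5) * h
        total += 2.0 * math.pi * r * tube_b3(t, r, 0.0, a, kappa)
    return total * h

print(heat_residual(0.5, 0.3, -0.2))       # close to zero, up to discretization error
print(flux(0.0), flux(3.0), 2.0 * math.pi)  # flux stays at 2 pi a^2 B_c (here a = B_c = 1)
```

The flux check makes the statement on the amplitude precise: the amplitude $2a^2/(2a^2 + 4\kappa_\eta t)$ times the effective area $\pi(2a^2 + 4\kappa_\eta t)$ equals $2\pi a^2$ at all times.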

As mentioned above we can consider much more general initial conditions. One particularly interesting case is given by the initial condition

\[
B_0(x) = B_c e_3 \big[H(x_1) - H(-x_1)\big] = B_c e_3
\begin{cases}
-1, & x_1 < 0, \\
\phantom{-}0, & x_1 = 0, \\
+1, & x_1 > 0,
\end{cases}
\]

which is a one-dimensional configuration with a jump at the plane $x_1 = 0$, with the Heaviside step function being $H(z) = 0$ for $z < 0$, $H(0) = 0$, and $H(z) = 1$ for $z > 0$. The corresponding current density is given by

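\[
\begin{aligned}
J_0(x) = \frac{c}{4\pi} \nabla\times B_0(x) &= \frac{cB_c}{4\pi} \nabla\big[H(x_1) - H(-x_1)\big] \times e_3 \\
&= \frac{2cB_c}{4\pi} H'(x_1)\, e_1 \times e_3 = \frac{cB_c}{2\pi}\, \delta(x_1)\, e_1 \times e_3 \\
&= -\frac{cB_c}{2\pi}\, \delta(x_1)\, e_2,
\end{aligned}
\]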

where $\delta(x_1)$ is the Dirac distribution in $x_1$ and we have used the identity $H' = \delta$. This represents a singular current density directed along the direction orthogonal


to both the magnetic field and its gradient and localized exactly at the jump point. Such a field/current configuration is referred to as a current sheet or current layer. The formation and persistence (stability) of current layers are discussed in particular in connection with MHD applications such as magnetic reconnection and equilibria.

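In the low magnetic Reynolds number regime, the current sheet diffuses. The solution can be computed analytically. Since the field is essentially one dimensional, the integration over $x_2$ and $x_3$ in (5.31) can be carried out first, with the result that

\[
\begin{aligned}
B(t,x) &= B_c e_3 \int_{\mathbb{R}} \frac{e^{-(x_1 - x'_1)^2/(4\kappa_\eta t)}}{(4\pi\kappa_\eta t)^{1/2}} \big[H(x'_1) - H(-x'_1)\big]\,dx'_1 \\
&= \frac{B_c e_3}{(4\pi\kappa_\eta t)^{1/2}} \Big[\int_0^{+\infty} - \int_{-\infty}^{0}\Big] e^{-(x_1 - x'_1)^2/(4\kappa_\eta t)}\,dx'_1.
\end{aligned}
\]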

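The remaining Gaussian integrals can be expressed in terms of the error function which is defined by

\[
\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-u^2}\,du, \qquad z \in \mathbb{R}.
\]

In fact, the change of variable $x'_1 \mapsto u = (x_1 - x'_1)/\sqrt{4\kappa_\eta t}$ yields

\[
\begin{aligned}
B(t,x) &= \frac{B_c e_3}{\sqrt{\pi}} \Big[\int_{-\infty}^{x_1/\sqrt{4\kappa_\eta t}} - \int_{x_1/\sqrt{4\kappa_\eta t}}^{+\infty}\Big] e^{-u^2}\,du \\
&= \frac{B_c e_3}{\sqrt{\pi}} \Big[\int_{-\infty}^{0} + \int_0^{x_1/\sqrt{4\kappa_\eta t}} - \int_0^{\infty} + \int_0^{x_1/\sqrt{4\kappa_\eta t}}\Big] e^{-u^2}\,du \\
&= B_c e_3\, \frac{2}{\sqrt{\pi}} \int_0^{x_1/\sqrt{4\kappa_\eta t}} e^{-u^2}\,du,
\end{aligned}
\]

from which one recognizes the error function of argument $x_1/\sqrt{4\kappa_\eta t}$, and

\[
B(t,x) = B_c e_3 \operatorname{erf}\big(x_1/\sqrt{4\kappa_\eta t}\big). \tag{5.33}
\]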

The error function is essentially a smoothed jump at $x_1 = 0$, and $\sqrt{4\kappa_\eta t}$ is the spatial scale of the field variation near the jump: Again we find that such a scale increases in time like $\sqrt{4\kappa_\eta t}$.
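The closed form (5.33) can be compared with a direct quadrature of the convolution of the heat kernel with the initial jump. The sketch below uses the standard-library function `math.erf` and a simple midpoint rule in the variable $u$ introduced above; the names `sheet_b3` and `sheet_b3_quadrature` and the numerical parameters are assumptions made for illustration. Since the integrand is discontinuous at the jump, the quadrature is only accurate to the order of the step size.

```python
import math

def sheet_b3(t, x1, kappa=1.0, bc=1.0):
    """Component B_3 of the diffusing current sheet, equation (5.33), for t > 0."""
    return bc * math.erf(x1 / math.sqrt(4.0 * kappa * t))

def sheet_b3_quadrature(t, x1, kappa=1.0, bc=1.0, n=20000, cutoff=10.0):
    """Midpoint-rule convolution of the heat kernel with the initial jump.

    With u = (x1 - x1') / sqrt(4 kappa t), the integrand is
    exp(-u^2) / sqrt(pi) times the sign of x1'.
    """
    scale = math.sqrt(4.0 * kappa * t)
    h = 2.0 * cutoff / n
    total = 0.0
    for j in range(n):
        u = -cutoff + (j + 0.5) * h
        xp = x1 - scale * u
        sign = (xp > 0) - (xp < 0)  # H(x1') - H(-x1')
        total += sign * math.exp(-u * u)
    return bc * total * h / math.sqrt(math.pi)

for x1 in (-1.0, -0.2, 0.0, 0.2, 1.0):
    print(x1, sheet_b3(0.25, x1), sheet_b3_quadrature(0.25, x1))
```

The profile is odd in $x_1$ and depends on $x_1$ and $t$ only through the ratio $x_1/\sqrt{4\kappa_\eta t}$, which is the scale invariance noted in the text.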

Both solutions (5.32) and (5.33) are essentially one-dimensional diffusion processes as the magnetic field has only one non-vanishing component and that depends on one variable. Specifically, in the case of solution (5.32) one has diffusion in the radial direction (due to symmetry), while in the case of solution (5.33) one has diffusion in the $x_1$ coordinate. Figure 5.4 shows the time-evolution of $B_3$ for the two considered examples. Diffusion yields a broadening of the Gaussian profile in the case of (5.32), and a smoothing of the current sheet in the case of (5.33). Figure 5.5 gives a three-dimensional representation of the magnetic field according to solutions (5.32) and (5.33). For the case of a Gaussian flux tube, the field has the same direction everywhere and just spreads over a larger area, thus reducing its intensity. In the case of the current sheet, the magnetic field has different directions depending on the value of $x_1$; diffusion tends to average out the field in space so that in regions where both opposite directions of the magnetic field are present, the convolution with the heat kernel averages to a zero (or small) value of the field, with the result that the current layer broadens as time goes by.

Figure 5.4: Normalized magnetic field component $B_3/B_c$ at different times as a function of $x_1$ at $x_2 = x_3 = 0$, for the cases of solutions (5.32), left-hand-side panel, and (5.33), right-hand-side panel. The spatial coordinate $x_1$ as well as all lengths including $\sqrt{\kappa_\eta t}$ are normalized. We choose the normalization length so that $a = 1$ in the case of solution (5.32), while the current sheet solution (5.33) depends only on the dimensionless quantity $x_1/\sqrt{4\kappa_\eta t}$ so that the result is scale invariant.

A plasma with a magnetic Reynolds number so low is however not really common; in fact resistivity is usually small as it scales with temperature like $1/\tau_e \sim T^{-3/2}$, cf. equations (3.39) and (3.41). Nonetheless, magnetic field diffusion plays a role near current sheets even if resistivity is small because there the magnetic field can develop gradients so large that locally the magnetic Reynolds number is reduced and magnetic field diffusion quantitatively and qualitatively modifies the ideal dynamics.

5.4 Magnetic reconnection: basic ideas and examples. Flux tubes with magnetic diffusion. Reconnection briefly. Numerical examples.


Figure 5.5: Magnetic field diffusion in a “Gaussian flux tube” (left-hand-side panels) and in a current sheet (right-hand-side panels). The uppermost panels show the respective initial condition and time is increasing from top to bottom. The same cases as in figure 5.4 are considered, with the same normalizations. The magnetic field is represented by arrows of different color and length, while the contour plot at the bottom of each panel shows the profile of $B_3/B_c$ in the $(x_1, x_2)$-plane. The color scale is constant in time, while the size of the arrows is normalized to the maximum norm of $B$ at the given time.


6 Variational formulation

Variational formulations allow us to understand solutions of a given system of equations as (local) extrema of a certain functional of the unknowns. This has both theoretical and practical advantages, but it is not always easy to find a variational formulation for a given model.

Variational principles stand at the very foundation of classical mechanics and in general they are regarded as “more fundamental” than model equations themselves [73, 74]. From a mathematical point of view a problem that admits a variational formulation can be studied in the framework of calculus of variations [75, 76]. More recently, variational formulations have attracted the interest of the scientific-computing community because of the rapidly developing work on variational integrators. Those are numerical schemes obtained by discretization of the variational principle underlying the model [77]. Such methods have been developed for both ordinary differential equations [78, and references therein] and partial differential equations [79, and references therein]. For the specific case of incompressible ideal MHD, variational integrators based upon the Euler-Poincaré variational principle have been developed by Gawlik et al. [22], while a formal Lagrangian approach has been proposed by Kraus [80, 24, 25].

However the first and most direct variational form for ideal MHD (both compressible and incompressible) was obtained by Newcomb in 1962 [81]. A discretization of Newcomb’s variational principle has been attempted recently by You et al. [23].

Newcomb’s variational formulation is essentially a generalization of Arnold’s result on ideal incompressible flows [82]. Arnold discovered that (at least formally) ideal incompressible flows, i.e., solutions of Euler’s equations (1.55), can be realized as geodesics in a suitable space of transformations. Geodesics are curves of (locally) minimum length and therefore Arnold’s formulation is a variational formulation, the functional being the length of a curve. A simple and clear derivation can also be found in the third volume of Taylor’s treatise on partial differential equations together with a generalization to compressible fluids [83, p.448 and p.532].

The aim of this section is to introduce basic ideas on variational principles in general and Newcomb’s variational formulation for ideal MHD in a modern perspective. We shall follow Arnold’s ideas in the form reported by Taylor, viewing the plasma motion as a curve in a suitable space, but for MHD this curve is not a geodesic as the functional in the variational formulation is complicated by the presence of passively advected quantities, cf. appendix D.

6.1 Basic elements of calculus of variations. In order to keep the presentation reasonably self-contained let us start by reviewing a few definitions concerning functions on Banach spaces. We refer to the book by Hunter and Nachtergaele [75] for a concise introduction. For Banach spaces, one can refer to the comprehensive monograph by Fabian et al. [84]. Definitions and results given in this section are rigorous, but will be used in a formal way only in the applications to MHD. If one has already some familiarity with functional derivatives and calculus of variations, this section can be skipped.

If $X$ is a linear space (over the field $\mathbb{R}$ of real numbers for definiteness), a function $\|\cdot\| : X \to \mathbb{R}_{\ge 0}$ is a norm if it satisfies three conditions:


N1. ‖x‖ ≥ 0 and ‖x‖ = 0 only if x = 0.

N2. ‖λx‖ = |λ| · ‖x‖ for all x ∈ X and λ ∈ R.

N3. ‖x1 + x2‖ ≤ ‖x1‖+ ‖x2‖ for all x1, x2 ∈ X.

A space $X$ equipped with a norm is called a normed space.

A norm defines a metric so that we can say when a sequence $\{x_n \mid n \in \mathbb{N}\}$ converges in $X$ to a limit $x \in X$, namely when

\[
\forall \varepsilon > 0,\ \exists N > 0 \ \mid\ k > N \implies \|x_k - x\|_X < \varepsilon. \tag{6.1}
\]

A convergent sequence automatically satisfies the Cauchy condition

\[
\forall \varepsilon > 0,\ \exists N > 0 \ \mid\ m, n > N \implies \|x_m - x_n\|_X < \varepsilon. \tag{6.2}
\]

This follows from the triangular inequality (property N3.) since

\[
\|x_m - x_n\|_X \le \|x_m - x\|_X + \|x_n - x\|_X,
\]

and we can choose an $N$ so big that $\|x_k - x\|_X < \varepsilon/2$ for all $k \ge N$. The converse however is not true, i.e., not every Cauchy sequence in a generic normed space $X$ has a limit in $X$.
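A standard illustration of the last statement, not taken from the text, is the space of real sequences with only finitely many non-zero entries, equipped with the sup norm. The truncations $x_n = (1, 1/2, \ldots, 1/n, 0, 0, \ldots)$ form a Cauchy sequence, but the only candidate limit $(1/k)_{k \ge 1}$ has infinitely many non-zero entries and therefore does not belong to the space. A minimal sketch, with hypothetical helper names:

```python
def truncated_harmonic(n):
    """The sequence x_n = (1, 1/2, ..., 1/n, 0, 0, ...), stored without trailing zeros."""
    return [1.0 / k for k in range(1, n + 1)]

def sup_distance(x, y):
    """Sup-norm distance between two finitely supported sequences."""
    length = max(len(x), len(y))
    xs = x + [0.0] * (length - len(x))
    ys = y + [0.0] * (length - len(y))
    return max(abs(p - q) for p, q in zip(xs, ys))

# For m < n the distance is 1/(m + 1), so the Cauchy condition (6.2) holds ...
for m in (10, 100, 1000):
    print(m, sup_distance(truncated_harmonic(m), truncated_harmonic(10 * m)))
# ... while the support of x_n keeps growing: no finitely supported limit exists.
print(len(truncated_harmonic(1000)))
```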

A normed space in which all Cauchy sequences have a limit is referred to as a complete normed space or Banach space.

Banach spaces constitute the standard setting for the generalization of calculus to functions defined over spaces of functions. Specifically we can define differentiation and the purpose of this section is to review three different concepts of derivative of a function on a Banach space.

We shall consider functions $f : X \to Y$ between two Banach spaces $X$, $Y$ over the real numbers with norms $\|\cdot\|_X$ and $\|\cdot\|_Y$, respectively.

If $X$ is a space of functions (for instance $L^2(\mathbb{R}^d)$), then $f : X \to Y$ is sometimes referred to as a functional. However, in this note we reserve the term “functional” to real-valued functions $f : X \to \mathbb{R}$ only, that is, the target space is $\mathbb{R}$ equipped with the absolute-value norm.

We can extend the usual definition of continuity to a function $f : X \to Y$. Specifically $f$ is continuous in a point $x_0 \in X$ when

\[
\forall \varepsilon > 0,\ \exists \delta > 0 \ \mid\ \|x - x_0\|_X < \delta \implies \|f(x) - f(x_0)\|_Y < \varepsilon. \tag{6.3}
\]

If a function $f$ is continuous in all points of an open non-empty set $U \subseteq X$ we write $f \in C(U, Y)$. This is equivalent to the statement that the inverse image of an open set is open.

Linear functions $A : D(A) \subseteq X \to Y$, defined on a linear subspace $D(A)$ called the domain of $A$, play a special role and are referred to as operators; linearity means that

\[
A(c_1 x_1 + c_2 x_2) = c_1 A x_1 + c_2 A x_2,
\]

for all $x_1, x_2 \in D(A)$ and $c_1, c_2 \in \mathbb{R}$, where $Ax \in Y$ (without brackets) denotes the action of $A$ on $x \in D(A) \subseteq X$. If the closure $\overline{D(A)} = \operatorname{cl} D(A)$ in $X$ of the domain $D(A)$ amounts to the whole space $X$, we say that the domain $D(A)$ is dense and the operator $A$ is densely defined.


We say that a linear operator $A$ is bounded if there exists a constant $C > 0$ such that

\[
\|Ax\|_Y \le C \|x\|_X, \qquad \text{for all } x \in D(A). \tag{6.4}
\]

For linear operators continuity and boundedness are equivalent.

Theorem 6.1. An operator $A : D(A) \subseteq X \to Y$ is continuous if and only if it is bounded.

Proof. A linear operator is continuous on the whole linear space $D(A)$ if and only if it is continuous in $x = 0$. In fact, if $A$ is continuous in zero, we have that for every $\varepsilon > 0$ we can choose a $\delta > 0$ such that

\[
\|A\tilde{x}\|_Y < \varepsilon,
\]

for all $\tilde{x} \in D(A)$ such that $\|\tilde{x}\|_X < \delta$. Hence, for all $x, x_0 \in D(A)$ we have

\[
\|Ax - Ax_0\|_Y = \|A(x - x_0)\|_Y < \varepsilon,
\]

for all $\tilde{x} = x - x_0$ such that $\|x - x_0\|_X < \delta$. From (6.4) it follows that boundedness implies continuity in zero. The converse is also true as we can write the continuity condition in zero with $\varepsilon = 1/2$ fixed, namely, we find $\delta > 0$ such that $\|A\tilde{x}\|_Y < 1/2$ when $\|\tilde{x}\|_X < \delta$. For a generic $x \in D(A)$, $x \neq 0$, let $\tilde{x} = (\delta/2)\, x/\|x\|_X$ so that $\|\tilde{x}\|_X = \delta/2 < \delta$ and thus

\[
\|Ax\|_Y = \frac{2\|x\|_X}{\delta} \|A\tilde{x}\|_Y < \frac{1}{\delta} \|x\|_X,
\]

which implies (6.4) with constant $C = 1/\delta$.

A bounded operator $A : D(A) \subseteq X \to Y$ can be extended by continuity to the closure $\overline{D(A)}$ of its domain, and therefore a densely defined bounded operator can be extended by continuity to the whole space $X$. We denote by $B(X,Y)$ the set of bounded operators defined on the whole space $X$. One can check that $B(X,Y)$ is a linear space and we can define the norm

\[
\|A\|_{B(X,Y)} = \sup_{x \in X,\ \|x\|_X = 1} \|Ax\|_Y, \tag{6.5}
\]

that is, the norm of $A$ is the “best constant” in the bound (6.4). The three defining properties of a norm are satisfied: Positivity (N1.) and homogeneity (N2.) are inherited from $\|\cdot\|_Y$; for the triangular inequality (N3.) one also uses the properties of $\sup$. Hence $B(X,Y)$ is a normed space.
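In finite dimensions the operator norm (6.5) can be estimated directly from its definition. The sketch below, which assumes $X = Y = \mathbb{R}^2$ with the Euclidean norm, samples the unit circle and compares the result with the largest singular value of the matrix, computed from the trace and determinant of $A^T A$. All function names are hypothetical.

```python
import math

def apply(a, x):
    """Action of a 2x2 matrix a (given as a list of rows) on a vector x."""
    return [a[0][0] * x[0] + a[0][1] * x[1], a[1][0] * x[0] + a[1][1] * x[1]]

def euclid(x):
    """Euclidean norm on R^2."""
    return math.hypot(x[0], x[1])

def operator_norm_sampled(a, n=20000):
    """Approximate the sup in (6.5) by sampling n points of the unit circle."""
    best = 0.0
    for j in range(n):
        theta = 2.0 * math.pi * j / n
        best = max(best, euclid(apply(a, [math.cos(theta), math.sin(theta)])))
    return best

def operator_norm_exact(a):
    """Largest singular value, from the eigenvalues of the 2x2 matrix a^T a."""
    trace = sum(a[i][j] ** 2 for i in range(2) for j in range(2))
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = max(trace * trace - 4.0 * det * det, 0.0)
    return math.sqrt(0.5 * (trace + math.sqrt(disc)))

a = [[1.0, 2.0], [0.0, 3.0]]
print(operator_norm_sampled(a), operator_norm_exact(a))
# The norm is the best constant in the bound (6.4):
x = [0.3, -4.0]
print(euclid(apply(a, x)) <= operator_norm_exact(a) * euclid(x))
```

The sampled value approaches the exact one from below, since sampling can only miss the maximizing direction.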

Theorem 6.2. If $X$ is a normed space and $Y$ a Banach space, then $B(X,Y)$ equipped with the operator norm (6.5) is a Banach space.

Proof. Let $\{A_n\}$ be a Cauchy sequence in $B(X,Y)$ with respect to the operator norm (6.5). Then for every $\varepsilon > 0$ there is an integer $N > 0$ such that for every $m, n > N$ we have

\[
\|(A_m - A_n)x\|_Y \le \|A_m - A_n\|_{B(X,Y)} < \varepsilon, \tag{6.6}
\]

for any $x \in X$ of unit norm, $\|x\|_X = 1$, that is uniformly on the unit sphere in $X$. By linearity, $\|A_m x - A_n x\|_Y \le \|x\|_X \cdot \|A_m - A_n\|_{B(X,Y)}$, and thus $\{A_n x\}$ is a Cauchy sequence in $Y$. Since $Y$ is complete, there is a pointwise limit $y = \lim_{n \to +\infty} A_n x$ for every point $x \in X$. We define $Ax = \lim_n A_n x$. Then $A$ is a linear map from $X \to Y$ since $A_n(c_1 x_1 + c_2 x_2) = c_1 A_n x_1 + c_2 A_n x_2$ and the limits of $A_n x_k$ exist separately for $k = 1, 2$. On the unit sphere, convergence is uniform as we can let $m \to +\infty$ in (6.6) and obtain $\|A_n x - Ax\|_Y \le \varepsilon$ for all $n > N$ and all $x \in X$ with $\|x\|_X = 1$. Hence, $A$ is bounded since for every point $0 \neq x \in X$, $\tilde{x} = x/\|x\|_X$, and $n > N$ we have

\[
\|Ax\|_Y = \|x\|_X \cdot \|A\tilde{x}\|_Y \le \|x\|_X \cdot \big(\|A_n \tilde{x} - A\tilde{x}\|_Y + \|A_n\|_{B(X,Y)}\big) \le \big(\|A_n\|_{B(X,Y)} + \varepsilon\big) \|x\|_X.
\]

Therefore $A \in B(X,Y)$ and $\|A_n - A\|_{B(X,Y)} \to 0$ since $A_n x \to Ax$ uniformly on the unit sphere.

The space $X' = B(X, \mathbb{R})$ of bounded linear functionals over $X$ is referred to as the topological dual of $X$. Incidentally we note that the algebraic dual $X^*$ is the space of all linear functions from $X \to \mathbb{R}$ and $X^* \supseteq X'$.

Fréchet derivative. For a generic function $f \in C^1(\mathbb{R}^d, \mathbb{R}^N)$ the derivative at a point $x \in \mathbb{R}^d$ is a matrix $Df(x)$ with $N$ rows and $d$ columns such that

\[
f(x + h) - f(x) - Df(x)h = o(|h|),
\]

uniformly for $|h| \to 0$. (Notation: $Df(x)h = h \cdot \nabla f(x)$.) Then the derivative can be viewed as a linear map $Df(x) : \mathbb{R}^d \to \mathbb{R}^N$. This definition can be generalized to functions $f : U \subseteq X \to Y$ defined on a non-empty subset $U$ of a Banach space $X$.

Definition 6.1 (Fréchet derivative [75, 84]). A function $f : U \subseteq X \to Y$ is Fréchet differentiable at $x \in U \subseteq X$ if there exists a bounded linear operator $A \in B(X,Y)$ such that

\[
\|f(x + h) - f(x) - Ah\|_Y = o(\|h\|_X),
\]

uniformly for $\|h\|_X \to 0$. Then $A$ is called the Fréchet derivative of $f$ at $x$ and denoted equivalently by $Df(x) = f'(x)$.

As usual a function which is differentiable at $x$ is also continuous at $x$. If a function $f$ is differentiable in all points of an open non-empty set $U \subseteq X$ and $Df : U \to B(X,Y)$ is continuous we say that $f \in C^1(U, Y)$. Iteratively one can define spaces $C^k(U, Y)$.

For the specific case of functionals (i.e., $Y = \mathbb{R}$), the Fréchet derivative $Df(x)$ is an element of the topological dual $B(X, \mathbb{R}) = X'$ and if $f$ is differentiable in an open set $U$, then

\[
Df : U \to X'.
\]

For functionals we can also use the special notation

\[
Df(x)h = \langle Df(x), h \rangle, \tag{6.7}
\]

where the angle-brackets should not be confused with a scalar product; in fact, $Df(x)$ and $h$ in general do not even belong to the same space.

A property of practical importance is that the chain rule holds for Fréchet derivatives as well. Precisely


Theorem 6.3. If $f : X \to Y$ is differentiable at $x \in X$ and $g : Y \to Z$ is differentiable at $y = f(x)$, then $g \circ f : X \to Z$ is differentiable at $x$ and

\[
D(g \circ f)(x) = Dg(y) \circ Df(x), \qquad y = f(x). \tag{6.8}
\]

Proof. Theorem 13.8 in Hunter and Nachtergaele [75].

Since the Fréchet derivative is the natural definition of derivative, in the following, “differentiable” means “Fréchet differentiable”.

Gâteaux derivative. Given a function $f : U \subseteq X \to Y$ and a certain direction $h \in X$ we can define the line $x + th \in U$ through the point $x$ with parameter $t \in \mathbb{R}$, and for $|t|$ sufficiently small, we can compose it with $f$ thus obtaining the function $t \mapsto f(x + th)$. The derivative of this $Y$-valued function at $t = 0$, if it exists, defines the directional derivative of $f$ along $h$, namely,

\[
\delta f(x, h) = \lim_{t \to 0} \frac{1}{t} \big[f(x + th) - f(x)\big], \tag{6.9}
\]

for points $x \in U$ and directions $h \in X$ for which the limit exists. In general the directional derivative might be a nonlinear function of $(x, h)$. The function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by

\[
f(x, y) = \frac{x y^2}{x^2 + y^4},
\]

for $(x, y) \neq 0$ and $f(0, 0) = 0$ is a textbook example of a function that has all directional derivatives in the origin $(x, y) = (0, 0)$, but they are non-linear. The function itself is not even continuous in $(0, 0)$.
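Both properties of this example can be checked by hand: along the direction $h = (a, b)$ one finds $\delta f(0, h) = b^2/a$ for $a \neq 0$ and $\delta f(0, h) = 0$ for $a = 0$, which is not additive in $h$, while along the parabola $x = y^2$ the function is constantly equal to $1/2$. The sketch below reproduces these facts with difference quotients; the function name `directional_derivative_at_origin` and the step size are illustrative choices.

```python
def f(x, y):
    """Textbook example f(x, y) = x y^2 / (x^2 + y^4), with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * y / (x * x + y ** 4)

def directional_derivative_at_origin(a, b, t=1e-6):
    """Difference quotient [f(t h) - f(0)] / t for the direction h = (a, b)."""
    return (f(t * a, t * b) - f(0.0, 0.0)) / t

d10 = directional_derivative_at_origin(1.0, 0.0)
d01 = directional_derivative_at_origin(0.0, 1.0)
d11 = directional_derivative_at_origin(1.0, 1.0)
# The derivative along (1, 1) is not the sum of those along (1, 0) and (0, 1).
print(d10, d01, d11)
# Along the parabola x = y^2 the function does not tend to f(0, 0) = 0.
for s in (1e-1, 1e-2, 1e-3):
    print(s, f(s * s, s))
```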

Definition 6.2 (Gâteaux derivative [75]). If the directional derivative $\delta f(x, h)$ exists in a given $x \in U$ for all $h \in X$ and there exists a linear operator $A : X \to Y$ such that $\delta f(x, h) = Ah$, then $f$ is Gâteaux-differentiable at $x$ and $A = D_G f(x)$ is the Gâteaux derivative at $x$.

The Gâteaux derivative has the advantage of being easier to understand and compute, but it is weaker than the Fréchet derivative. In the definition of the Gâteaux derivative we only require that the directional derivative exists and it is linear for all directions. We do not say if the Gâteaux derivative is bounded or not (although some authors require boundedness [84]). More importantly, the limit in (6.9) does not need to be uniform.

The chain rule, theorem 6.3, ensures that a Fréchet-differentiable function is also Gâteaux-differentiable. In fact, the directional derivative is the derivative of the composition of $f$ with a linear function from $\mathbb{R}$ into $X$. The converse however is not true: Existence of the Gâteaux derivative at a point is not sufficient for the existence of the Fréchet derivative.

However, if $f$ is Gâteaux-differentiable at $x \in U$, $D_G f(x) \in B(X,Y)$, and the limit (6.9) is uniform in $h$, then $f$ is Fréchet-differentiable at $x$ and $Df = D_G f$. In fact uniform convergence means that

\[
\|f(x + t\tilde{h}) - f(x) - t D_G f(x)\tilde{h}\|_Y \le r(t),
\]

for all $\tilde{h} \in X$ such that $\|\tilde{h}\|_X \le 1$, where $r(t) = o(t)$ is independent of $\tilde{h}$. But then for every $h \in X$

\[
\|f(x + h) - f(x) - D_G f(x)h\|_Y = \|f(x + t\tilde{h}) - f(x) - t D_G f(x)\tilde{h}\|_Y \le r(t),
\]

where $t = \|h\|_X$ and $\tilde{h} = h/\|h\|_X$. From definition 6.1, we see that $f$ is Fréchet differentiable and $D_G f(x) = Df(x)$.

A less evident sufficient condition is established by the following

Theorem 6.4. If $f$ is Gâteaux-differentiable in an open neighborhood $U$ of a point $x \in X$ and the Gâteaux derivative $D_G f(x') \in B(X,Y)$ is continuous at $x' = x$, then $f$ is Fréchet-differentiable at $x$ and $Df(x) = D_G f(x)$.

Proof. Theorem 13.8 in Hunter and Nachtergaele [75].

Let us remark that both definitions of derivative rely on the linear structure of the spaces $X$ and $Y$.

Pairings and functional derivatives. We follow Marsden et al. [32]. Let us start with the case of functionals $f : X \to \mathbb{R}$ defined over a Hilbert space $X$ with scalar product $(\cdot, \cdot)_X : X \times X \to \mathbb{R}$. We recall that a scalar product induces a norm through $\|x\|_X = (x, x)_X^{1/2}$, so that the Hilbert space $X$ is also a Banach space. If $f$ is differentiable at $x$, then $Df(x) \in X'$, i.e., $Df(x)$ is a continuous linear functional over the Hilbert space $X$. In virtue of the Riesz representation theorem, there exists a unique element in $X$, denoted by $\delta f(x)/\delta x$, such that

\[
\langle Df(x), h \rangle = \Big(\frac{\delta f(x)}{\delta x}, h\Big)_X, \tag{6.10}
\]

for all $h \in X$. The element $\delta f(x)/\delta x$ is referred to as the functional derivative of $f$ at $x$ with respect to the standard product in $X$.

For a generic Banach space, we do not have a standard product. Even for a Hilbert space the natural product might not be a good choice to represent the derivative (example given below).

Nonetheless, it is still convenient to find a representation of the derivative of a functional on a generic Banach space $X$ in terms of an element of another Banach space $Y$. With this aim we introduce

Definition 6.3 (Marsden et al. [32]). A pairing of two Banach spaces $X, Y$ is a bi-linear continuous map $\langle \cdot, \cdot \rangle_{X \times Y} : X \times Y \to \mathbb{R}$; the subscript $X \times Y$ is usually understood when it can be inferred from the arguments. Here continuity means that there exists a constant $C > 0$ for which

\[
\big|\langle x, y \rangle\big| \le C \|x\|_X \cdot \|y\|_Y.
\]

The pairing is (weakly) $Y$-non-degenerate if

\[
\langle x, y \rangle = 0 \ \text{for all } x \in X \implies y = 0.
\]

Conversely, the pairing is (weakly) $X$-non-degenerate if

\[
\langle x, y \rangle = 0 \ \text{for all } y \in Y \implies x = 0.
\]

A pairing that is both $X$- and $Y$-non-degenerate is called non-degenerate. The spaces $X, Y$ are in duality if there exists a non-degenerate pairing.


As an example, equation (6.7) defines a pairing between $X'$ and $X$ (and this justifies the use of the angle-bracket notation). Since the norm in $X'$ is the operator norm (6.5), we have

\[
\big|\langle u, h \rangle\big| \le \|u\|_{X'} \cdot \|h\|_X,
\]

for all $u \in X'$ and $h \in X$, which shows the continuity on $X' \times X$. One can check that this pairing is non-degenerate, and thus, $X', X$ are in duality. In fact, if $\langle u, h \rangle = 0$ for all $h \in X$, then $u$ is the operator identically equal to zero. On the other hand, if $\langle u, h \rangle = 0$ for all $u \in X'$ and $h \neq 0$, we can find $\omega_h \in X'$ such that $\langle \omega_h, h \rangle = \|h\|_X$, which gives the contradiction $\|h\|_X = 0$, hence $h = 0$. The existence of $\omega_h$ for all $h \neq 0$ is a consequence of the Hahn-Banach theorem [85].

We shall use the pairing $\langle \cdot, \cdot \rangle_{X \times Y}$ as an alternative to the scalar product in order to find a representative of the functional derivative in $Y$. This in general is not possible for all $x$ and thus the functional derivative might have a smaller domain than the Fréchet derivative.

First, we observe that a pairing induces a map $T : Y \to X'$ given by

\[
T(y)x = \langle x, y \rangle_{X \times Y},
\]

and for a non-degenerate pairing, $T$ is injective. More precisely, we have

Proposition 6.5. If the pairing is $Y$-non-degenerate, $T : Y \to X'$ is injective.

Proof. If for two points $y_1, y_2 \in Y$ we have $T(y_1) = T(y_2)$ then,

\[
\langle x, y_1 - y_2 \rangle_{X \times Y} = 0,
\]

for all $x \in X$. Since the pairing is $Y$-non-degenerate we must have $y_1 - y_2 = 0$, hence $T$ is injective.

We can now give the general definition of functional derivative.

Definition 6.4. Let $X, Y$ be two Banach spaces with a $Y$-non-degenerate pairing $\langle \cdot, \cdot \rangle : X \times Y \to \mathbb{R}$. If $f \in C(U, \mathbb{R})$ is a functional defined on an open non-empty set $U \subseteq X$ and differentiable at $x \in U$, then $\delta f(x)/\delta x$ is the unique element of $Y$, if it exists, such that

\[
Df(x)h = \Big\langle h, \frac{\delta f(x)}{\delta x} \Big\rangle_{X \times Y}.
\]

When it exists, $\delta f(x)/\delta x$ is referred to as the functional derivative of $f$ at $x$, relative to the pairing $\langle \cdot, \cdot \rangle_{X \times Y}$.

Let us remark again that in general, existence of the functional derivative is not implied by the existence of the Fréchet derivative. It depends also on the pairing as discussed in the example below. Nonetheless, when it exists it is unique since the map $T$ is injective.

In practice, $X$ will be a space of functions and (when not otherwise specified) functional derivatives are computed with respect to the $L^2$-pairing.


Example. Let us briefly discuss a typical example based on the $L^2$-space, which is a particularly important space in physics applications since energy often amounts to an $L^2$-norm.

With $\Omega$ being a bounded domain in $\mathbb{R}^d$, $L^2(\Omega)$ is a Hilbert space and thus a Banach space with the norm induced by the scalar product. Let us consider the subspace $X = H^1_0(\Omega)$, which, we recall, is defined as the closure of the space $C^\infty_0(\Omega)$ of smooth compactly supported functions in $\Omega$ with respect to the norm

\[
\|u\|_{H^1(\Omega)} = \big[\|u\|^2_{L^2(\Omega)} + \|\nabla u\|^2_{L^2(\Omega)}\big]^{1/2}.
\]

By definition $H^1_0(\Omega)$ is complete with respect to this norm and thus a Banach space. Actually, on $H^1_0(\Omega)$ we can also define a scalar product by

\[
(u, v)_{H^1(\Omega)} = (u, v)_{L^2(\Omega)} + (\nabla u, \nabla v)_{L^2(\Omega)},
\]

which makes $H^1_0(\Omega)$ a Hilbert space and $\|u\|_{H^1(\Omega)} = (u, u)^{1/2}_{H^1(\Omega)}$.

A frequently encountered example of a functional on $H^1_0(\Omega)$ is $f : H^1_0(\Omega) \to \mathbb{R}$ defined by

\[
f(u) = \frac{1}{2} \int_\Omega |\nabla u|^2\,dx.
\]

In view of the Poincaré inequality this functional is coercive, that is, it controls the standard norm; specifically we have

\[
\|u\|^2_{H^1} = \|u\|^2_{L^2} + \|\nabla u\|^2_{L^2} \le C f(u),
\]

for all $u \in H^1_0(\Omega)$ where we have used lemma 2.6.

In order to compute the Fréchet derivative at $u$ we pick a direction $v \in H^1_0(\Omega)$ and $t \in \mathbb{R}$ and compute

\[
f(u + tv) - f(u) = \int_\Omega \big[t \nabla u \cdot \nabla v + (t^2/2) |\nabla v|^2\big]\,dx,
\]

and we see that, for $t \to 0$,

\[
\frac{1}{t} \big[f(u + tv) - f(u)\big] \to \int_\Omega \nabla u \cdot \nabla v\,dx,
\]

which defines the directional derivative

\[
\delta f(u, v) = \int_\Omega \nabla u \cdot \nabla v\,dx.
\]

For every $u$, this is a linear operator $v \mapsto \delta f(u, v)$, defined for all $v \in H^1_0(\Omega)$. Hence we conclude that $f$ is Gâteaux-differentiable at $u$, and this holds for every $u \in H^1_0(\Omega)$.

By the Cauchy-Schwarz inequality, the Gâteaux derivative satisfies

\[
\big|D_G f(u)v\big| \le \|\nabla u\|_{L^2(\Omega)} \cdot \|\nabla v\|_{L^2(\Omega)} \le \|u\|_{H^1(\Omega)} \cdot \|v\|_{H^1(\Omega)},
\]

which shows that $D_G f(u) \in B(H^1_0(\Omega), \mathbb{R}) = H^{-1}(\Omega)$, where $H^{-1}(\Omega)$ is by definition the topological dual of $H^1_0(\Omega)$. A similar inequality shows the continuity with respect to the point $u$. In summary, $f$ is Gâteaux-differentiable everywhere in $H^1_0(\Omega)$ with continuous Gâteaux derivative $D_G f : H^1_0(\Omega) \to H^{-1}(\Omega)$. Hence, in virtue of theorem 6.4, we can conclude that $f$ is Fréchet-differentiable and the derivative is

\[
Df(u)v = D_G f(u)v = \int_\Omega \nabla u \cdot \nabla v\,dx.
\]

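We now look for a representation of this derivative in the space $Y = L^2(\Omega)$, with respect to the pairing

\[
\langle u, \varphi \rangle = \int_\Omega u \varphi\,dx,
\]

with $u \in H^1_0(\Omega)$ and $\varphi \in L^2(\Omega)$. The considered pairing is bi-linear and continuous since

\[
\big|\langle u, \varphi \rangle\big| \le \|u\|_{L^2(\Omega)} \|\varphi\|_{L^2(\Omega)} \le \|u\|_{H^1_0(\Omega)} \|\varphi\|_{L^2(\Omega)},
\]

for all $u \in H^1_0(\Omega)$ and $\varphi \in L^2(\Omega)$. In addition, the pairing is $L^2$-non-degenerate since $C^\infty_0(\Omega)$ is dense in $L^2(\Omega)$, so that we can find a sequence $u_n \in C^\infty_0(\Omega)$ which converges to $\varphi \in L^2(\Omega)$ and

\[
\big|(u_n, \varphi)_{L^2(\Omega)} - \|\varphi\|^2_{L^2(\Omega)}\big| = \big|(u_n - \varphi, \varphi)_{L^2(\Omega)}\big| \le \|u_n - \varphi\|_{L^2(\Omega)} \|\varphi\|_{L^2(\Omega)}.
\]

Since $u_n \in H^1_0(\Omega)$, if $\langle u, \varphi \rangle = 0$ for all $u \in H^1_0(\Omega)$, then $(u_n, \varphi)_{L^2(\Omega)} = 0$ for all $n$; the right-hand side of the inequality tends to zero and it must be $\varphi = 0 \in L^2(\Omega)$. (In short we have that $\varphi$ is orthogonal to a dense subspace and thus must vanish.) Conversely, the pairing is also $H^1_0$-non-degenerate, since, if $\langle u, \varphi \rangle = 0$ for all $\varphi \in L^2$, we can choose in particular $\varphi = u$ and obtain $\|u\|^2_{L^2} = 0$, that is, $u = 0$. Hence, the pairing is non-degenerate.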

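The functional derivative $\delta f(u)/\delta u$ with respect to the considered $L^2$ pairing is the unique element of $L^2(\Omega)$ such that

\[
Df(u)v = \int_\Omega \nabla u \cdot \nabla v\,dx = \int_\Omega \frac{\delta f(u)}{\delta u}\, v\,dx = \Big\langle v, \frac{\delta f(u)}{\delta u} \Big\rangle,
\]

but that does not exist everywhere in $H^1_0(\Omega)$. If we consider points $u \in H^1_0(\Omega)$ that have a Laplacian in $L^2(\Omega)$, we can integrate by parts and write

\[
Df(u)v = \int_\Omega \nabla u \cdot \nabla v\,dx = \int_\Omega (-\Delta u)\, v\,dx.
\]

Hence

\[
\frac{\delta f(u)}{\delta u} = -\Delta u,
\]

for those points $u \in H^1_0(\Omega)$ such that $\Delta u \in L^2(\Omega)$, which gives the domain of definition of the functional derivative with respect to the $L^2$ pairing.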
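A discrete analogue of this computation may help fix ideas. The sketch below assumes the one-dimensional domain $\Omega = (0, 1)$, a uniform grid, and finite-difference approximations of the energy, of the Laplacian and of the $L^2$ pairing; all names and the choices $u = \sin(\pi x)$, $v = x(1 - x)$ are illustrative. It checks that the variation of the discrete energy in a direction $v$ vanishing at the boundary equals the discrete pairing of $v$ with $-\Delta u$.

```python
import math

def dirichlet_energy(u, h):
    """Discrete version of f(u) = (1/2) int |u'|^2 dx on a uniform grid."""
    return 0.5 * h * sum(((u[i + 1] - u[i]) / h) ** 2 for i in range(len(u) - 1))

def minus_laplacian(u, h):
    """Second-difference approximation of -u'' at interior nodes (zero at the end nodes)."""
    out = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        out[i] = (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return out

def l2_pairing(v, w, h):
    """Discrete L^2 pairing <v, w> = int v w dx."""
    return h * sum(p * q for p, q in zip(v, w))

n = 100
h = 1.0 / n
x = [i / n for i in range(n + 1)]
u = [math.sin(math.pi * xi) for xi in x]   # point u, vanishing at the boundary
v = [xi * (1.0 - xi) for xi in x]          # direction v, vanishing at the boundary

t = 1e-3
u_plus = [a + t * b for a, b in zip(u, v)]
u_minus = [a - t * b for a, b in zip(u, v)]
variation = (dirichlet_energy(u_plus, h) - dirichlet_energy(u_minus, h)) / (2.0 * t)
paired = l2_pairing(v, minus_laplacian(u, h), h)
print(variation, paired)
```

The agreement of the two printed numbers is the discrete counterpart of the integration by parts used above; the boundary terms drop out because $v$ vanishes at the end points.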

Banach manifolds. The definitions of derivative and functional derivative given so far rely on the linear space structure of the considered Banach space. In practice however we can encounter constraints that define hyper-surfaces in the Banach space. This leads to the theory of manifolds modeled on Banach spaces [32]. Here we attempt to provide a minimal simplified approach.

Let us start with Banach spaces $X, Y, Z$ and a proper closed subset $M \subset X$ given in the form of a level set of a $C^1$ function $\psi : X \to Z$, namely,

\[
M = \{x \in X \mid \psi(x) = 0 \in Z\},
\]

with $D\psi(m)$ surjective at any $m \in M$. Here $\psi$ represents a set of constraints imposed on $x$.

Figure 6.1: Sketch of the set $M$ and a tangent vector as a surface in $X$. The tangent plane $T_m M$ is the space of all tangent vectors $\gamma'(0)$ at $m = \gamma(0)$.

The chain rule, theorem 6.3, implies that a curve $\gamma = \gamma(t) \in M$ with $\gamma \in C^1(I, X)$, defined on an interval $I = (-\varepsilon, +\varepsilon) \subset \mathbb{R}$, satisfies

\[
D\psi\big(\gamma(t)\big)\gamma'(t) = 0,
\]

since $\gamma$ takes values on the zero-level set of $\psi$ and $\gamma'(t)$ is by definition a linear map from $\mathbb{R} \to X$ and thus can be identified with an element of $X$. Therefore, the tangent space to $M$ at the point $m$ is identified with the null space of the linear operator $D\psi(m) \in B(X, Z)$, that is

\[
T_m M = \{x \in X \mid D\psi(m)x = 0\} = \ker D\psi(m). \tag{6.11}
\]

Since $D\psi(m) : X \to Z$ is a bounded operator, $T_m M = \ker D\psi(m)$ is a closed subspace of $X$ and hence a Banach space [84]. According to this definition, $T_m M$ can be understood as the subspace of $X$ spanned by the tangent vectors at $m$ of all the curves in $M$ passing through the point $m$, cf. figure 6.1.

We can now introduce the directional derivative of a function $f : M \to Y$. For any curve $\gamma : I \to M$ passing through $m = \gamma(0)$ with tangent $h = \gamma'(0) \in T_m M$, the directional derivative of $f$ at $m$ in the direction $h$ is defined by

\[
\delta f(m, h) = \lim_{t \to 0} \frac{1}{t} \big[f\big(\gamma(t)\big) - f(m)\big] = \frac{d}{dt} f\big(\gamma(t)\big)\Big|_{t=0}. \tag{6.12}
\]

In physics applications, $\delta f(m, h)$ is often referred to as the variation of $f$ in the direction $h$, and one usually writes $\delta f$ implying the arguments.

If $f : M \to Y$ is defined as the restriction of a function $F \in C^1(X, Y)$, i.e., $f = F|_M$, then the chain rule gives

\[
\delta f(m, h) = DF(m)h, \qquad \text{for all } h \in T_m M,
\]


that is, we can compute the full Fréchet derivative and then restrict it to the tangent space at $m \in M$.
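A finite-dimensional sketch of this construction, assuming $X = \mathbb{R}^3$, $Z = \mathbb{R}$ and the constraint $\psi(x) = |x|^2 - 1$ (so that $M$ is the unit sphere), is given below. The sample function $F(x) = x_1 x_2 + x_3^2$ and all names are hypothetical and chosen only for illustration. The code checks that the tangent vector lies in $\ker D\psi(m)$, cf. (6.11), and that the variation (6.12) along a curve in $M$ agrees with $DF(m)h$.

```python
import math

def psi(x):
    """Constraint psi(x) = |x|^2 - 1; its zero-level set M is the unit sphere in R^3."""
    return sum(c * c for c in x) - 1.0

def dpsi(m, h):
    """Frechet derivative D psi(m) h = 2 m . h."""
    return 2.0 * sum(a * b for a, b in zip(m, h))

def big_f(x):
    """Sample function F(x) = x1 x2 + x3^2 on R^3 (hypothetical example)."""
    return x[0] * x[1] + x[2] ** 2

def dbig_f(m, h):
    """D F(m) h = grad F(m) . h for the sample F above."""
    grad = [m[1], m[0], 2.0 * m[2]]
    return sum(g * c for g, c in zip(grad, h))

def curve(m, h, t):
    """Great circle gamma(t) = cos(t) m + sin(t) h, for unit vectors m, h with m . h = 0."""
    return [math.cos(t) * a + math.sin(t) * b for a, b in zip(m, h)]

def variation(m, h, t=1e-5):
    """Central-difference estimate of d/dt F(gamma(t)) at t = 0, cf. (6.12)."""
    return (big_f(curve(m, h, t)) - big_f(curve(m, h, -t))) / (2.0 * t)

m = [0.6, 0.8, 0.0]    # a point of M
h = [-0.8, 0.6, 0.0]   # a unit vector in the tangent space at m
print(psi(m), dpsi(m, h))              # both vanish up to rounding
print(variation(m, h), dbig_f(m, h))   # the two values agree
```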

6.2 Existence of a variational formulation. Let $N : U \subseteq X \to X'$ be a $C^1$ map with domain $U$ in a Banach space $X$, with values in the dual $X'$. We are interested in the solution of the problem:

\[
\text{find } x \in U \subseteq X \text{ such that } N(x) = 0. \tag{6.13}
\]

Physically the function (also referred to as nonlinear operator) $N$ represents the system of equations of the considered physics model. As a simple example, let $N : H^1_0(\Omega) \to H^{-1}(\Omega) = X'$ be defined on $u \in H^1_0(\Omega)$ by

\[
N(u) = -\Delta u - f,
\]

where $\Delta$ denotes the Laplace operator and $f \in H^{-1}(\Omega)$. Then the equation $N(u) = 0$ corresponds to the Poisson equation discussed in section 2.3. More interestingly $N$ can represent the whole set of MHD equations (3.50a), but in this case it is much harder to identify the spaces $X$ and $X'$.

We shall consider a special class of problems. Let us recall that for any functional $f \in C^1(X, \mathbb{R})$, the Fréchet derivative $Df$ is a map from $X \to X'$ exactly as $N$.

Definition 6.5 (Potential fields). If there exists a functional $F \in C^2(U)$ such that

\[
N(x) = DF(x), \tag{6.14}
\]

$N$ is a potential field on $X$ and $F$ is a potential for $N$.

Solutions of equation (6.13) with a potential field N are related to the vari-ational principle, which asks for the solution of the variational problem:

find a local extremum x ∈ U ⊆ X of the functional F , (6.15)

where F is the potential of N.

In fact, if x is a local extremum of a functional F ∈ C^2(U), with DF = N, then for h ∈ X and t ∈ R so small that x + th ∈ U, the function

t ↦ F(x + th)

has a local extremum at t = 0, and thus its derivative must vanish at t = 0, namely, DF(x)h = 0. Since h ∈ X is arbitrary, we obtain

DF(x) = N(x) = 0,

which is equation (6.13). We see that for potential fields, equation (6.13) is a necessary condition for solutions of the variational problem (6.15). It is not a sufficient condition since F might, in principle, have saddle points as well. Therefore the original problem (6.13) with a potential field N and the corresponding variational formulation (6.15) are in general not strictly equivalent, unless the functional F satisfies additional conditions that exclude saddle points. We shall not need to discuss such conditions in detail here.
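The gap between N(x) = 0 and a genuine extremum is visible already in finite dimension. As a minimal sketch, assuming the hypothetical potential F(x, y) = x^2 − y^2 on X = R^2 (chosen only for illustration), the origin solves N = DF = 0 without being a local extremum:

```python
# Hypothetical potential on X = R^2 with a saddle point at the origin.
def F(x, y):
    return x * x - y * y

def N(x, y):
    # N = DF, the potential field associated with F.
    return (2.0 * x, -2.0 * y)

critical_point = (0.0, 0.0)
is_critical = all(component == 0.0 for component in N(*critical_point))

# Probe F at small displacements along the coordinate axes.
eps = 1e-3
base = F(*critical_point)
probes = [F(eps, 0.0), F(-eps, 0.0), F(0.0, eps), F(0.0, -eps)]
is_local_min = all(value >= base for value in probes)
is_local_max = all(value <= base for value in probes)

print(is_critical, is_local_min, is_local_max)
```

Here equation (6.13) holds at the origin, yet F increases along one axis and decreases along the other, so the variational problem (6.15) has no solution there.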

If the non-linear operator N is not a potential field, then there can be no associated variational principle (at least of the form considered here). Understanding when a system of equations admits a variational principle is known as the inverse problem of the calculus of variations [86], which we review briefly.

As the name potential field suggests, we look at N as a “vector field” on X, and F, when it exists, is a potential for N. Indeed this generalizes the idea of potentials for vector fields such as the electrostatic potential discussed in section 2.1. Even in the (relatively) simple case of a vector field over X = R^3 = X′, existence of a potential is not granted in general: a vector field v = v(x) ∈ R^3 is a potential field, that is, there exists a scalar f = f(x) such that v = ∇f, if and only if it is irrotational, namely,

∇× v(x) = 0. (6.16)

This condition is necessary because if v = ∇f then ∇ × v = ∇ × ∇f = 0. Upon writing condition (6.16) in components,

\[
(\nabla\times v)_i = \sum_{j,k} \varepsilon_{ijk}\,\frac{\partial v_k}{\partial x_j},
\]

where ε_{ijk} is the Levi-Civita symbol introduced in section 1.2, we see that condition (6.16) is equivalent to the symmetry of the matrix ∇v. This formulation of condition (6.16) can be generalized to the case of functionals.
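The equivalence between condition (6.16) and the symmetry of ∇v can be illustrated numerically. The sketch below uses two hypothetical fields on R^3 (a gradient field and a rigid rotation, neither taken from the text) and a finite-difference Jacobian; the helper names are illustrative.

```python
import math

def jacobian(v, x, eps=1e-5):
    # Central-difference Jacobian: J[k][j] approximates d v_k / d x_j.
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        vp, vm = v(xp), v(xm)
        for k in range(n):
            J[k][j] = (vp[k] - vm[k]) / (2.0 * eps)
    return J

def asymmetry(J):
    # Largest entry of J - J^T in absolute value; zero iff J is symmetric.
    n = len(J)
    return max(abs(J[k][j] - J[j][k]) for k in range(n) for j in range(n))

def gradient_field(x):
    # v = grad f with the hypothetical potential f = x0^2 x1 + sin(x2).
    return [2.0 * x[0] * x[1], x[0] ** 2, math.cos(x[2])]

def rotation_field(x):
    # Rigid rotation about the third axis; its curl is (0, 0, 2).
    return [-x[1], x[0], 0.0]

point = [0.3, -1.2, 0.5]
asym_gradient = asymmetry(jacobian(gradient_field, point))
asym_rotation = asymmetry(jacobian(rotation_field, point))
print(asym_gradient, asym_rotation)
```

The gradient field has a symmetric Jacobian (up to finite-difference error), while for the rotation the antisymmetric part reproduces the non-zero component of the curl.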

Theorem 6.6 (Vainberg theorem [86]). A map N ∈ C^1(X,X′) over a Banach space X is a potential field if and only if DN(x) : X → X′ is symmetric, that is,

〈DN(x)h1, h2〉 = 〈DN(x)h2, h1〉, for all h1, h2 ∈ X,

where 〈·, ·〉 is the standard duality pairing between X′ and X.

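As an example let us consider the map u ↦ N(u) = −∆u − f defined weakly on u ∈ H^1_0(Ω) = X, that is,

\[
\langle -\Delta u - f, v\rangle = \int_\Omega \nabla u\cdot\nabla v\,dx - \langle f, v\rangle.
\]

The Frechet derivative is

\[
\langle DN(u)v_1, v_2\rangle = \int_\Omega \nabla v_1\cdot\nabla v_2\,dx = \langle DN(u)v_2, v_1\rangle.
\]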

We see that the hypotheses of Vainberg theorem 6.6 are satisfied, hence N(u) = −∆u − f is a potential field. In fact, as shown in the example of section 6, the operator −∆ on H^1_0(Ω) is the derivative of the functional (1/2)‖∇u‖^2_{L^2(Ω)}, and thus the Poisson equation for u ∈ H^1_0(Ω),

\[
-\Delta u = f,
\]

with f ∈ H^{−1}(Ω), admits a variational formulation with potential

\[
F(u) = \frac{1}{2}\int_\Omega |\nabla u|^2\,dx - \langle f, u\rangle. \tag{6.17}
\]

If, on the other hand, we consider the advection operator N(u) = a(x) · ∇u(x) with coefficient a ∈ C^1_b(Ω) (that is, a ∈ C^1 and both a and ∇a are bounded), we find

\[
\langle DN(u)v_1, v_2\rangle = \int_\Omega v_2\, a\cdot\nabla v_1\,dx = -\int_\Omega v_1\,\nabla\cdot(a v_2)\,dx,
\]

hence DN(u) in this case is not symmetric. Therefore the advection operator cannot be a potential field on H^1_0(Ω). This is a remarkable result, as it implies that systems involving an advection operator do not have a variational form, and MHD involves mainly advection operators. Luckily, the symmetry condition of Vainberg theorem is not invariant under a change of coordinates. We shall see that it is possible to represent the solution of the MHD equations in such a way that it has a variational formulation. In the next section we consider the case of Maxwell’s equations, for which the symmetry condition of Vainberg theorem is not true, and yet a change of variables allows us to find a variational formulation.
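The contrast between the two operators can be seen in a finite-difference caricature. The following sketch assumes a one-dimensional domain Ω = (0, 1), homogeneous Dirichlet conditions, a constant coefficient a = 1 and centered differences; the grid size and the test functions `v1`, `v2` are hypothetical choices, and the discrete pairing only approximates the duality pairing of the text.

```python
import math

n = 200                      # number of interior grid points (assumption)
h = 1.0 / (n + 1)
xs = [(i + 1) * h for i in range(n)]

def apply_laplacian(v):
    # Discrete -d^2/dx^2 with zero boundary values.
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * v[i] - left - right) / (h * h))
    return out

def apply_advection(v):
    # Discrete a d/dx with a = 1, centered differences, zero boundary values.
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append((right - left) / (2.0 * h))
    return out

def pairing(w, v):
    # Discrete analogue of the pairing <w, v>.
    return h * sum(a * b for a, b in zip(w, v))

v1 = [math.sin(math.pi * x) for x in xs]
v2 = [x * x * (1.0 - x) for x in xs]

lap_12 = pairing(apply_laplacian(v1), v2)
lap_21 = pairing(apply_laplacian(v2), v1)
adv_12 = pairing(apply_advection(v1), v2)
adv_21 = pairing(apply_advection(v2), v1)
print(lap_12, lap_21, adv_12, adv_21)
```

For the Laplacian the two pairings coincide, while for the advection operator with constant coefficient they differ by a sign, so the symmetry condition of Vainberg theorem fails in this discrete model as well.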

6.3 Variational principle for Maxwell’s equations. Let us consider Maxwell’s equations introduced in section 2.1. This case is interesting and physically meaningful in itself, but it also serves as an example of practical application of the rather abstract concepts we have introduced in the previous section. For convenience we rewrite the full system in the form

\[
\frac{1}{c}\frac{\partial E}{\partial t} - \nabla\times B = -\frac{4\pi}{c}J, \qquad \nabla\cdot E = 4\pi\rho_c, \tag{6.18a}
\]
\[
\frac{1}{c}\frac{\partial B}{\partial t} + \nabla\times E = 0, \qquad \nabla\cdot B = 0. \tag{6.18b}
\]

Even formally, the symmetry condition of Vainberg theorem 6.6 is not true, thus it is not possible to write a variational principle using E and B as primary fields. However, we can represent the electromagnetic field in terms of potentials, cf. equations (2.8),

\[
E = -\nabla\phi - \frac{1}{c}\frac{\partial A}{\partial t}, \qquad B = \nabla\times A, \tag{6.19}
\]

then the two homogeneous Maxwell’s equations (6.18b) are identically satisfied.

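We consider the functional, the action of the electromagnetic field,

\[
\begin{aligned}
S(\phi,A) &= \int_I\int_\Omega \Big[ \frac{1}{8\pi}\big(|E|^2 - |B|^2\big) + J\cdot A/c - \rho_c\phi \Big]\,dx\,dt \\
&= \int_I\int_\Omega \Big[ \frac{1}{8\pi}\Big|\nabla\phi + \frac{1}{c}\frac{\partial A}{\partial t}\Big|^2 - \frac{1}{8\pi}\big|\nabla\times A\big|^2 - \rho_c\phi + \frac{J\cdot A}{c} \Big]\,dx\,dt,
\end{aligned}
\tag{6.20}
\]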

where I = [0, T] and Ω ⊂ R^d is a bounded domain in dimension d = 3. The fields φ and A are defined on the space-time domain Ω_T = [0, T] × Ω and we assume that the restriction of the fields to the boundary ∂Ω_T is fixed, i.e., we consider Dirichlet boundary conditions. This, in particular, implies that we fix the fields at time t = 0 and t = T as well as their value at the spatial boundary [0, T] × ∂Ω, and we look for local extrema of S on the Banach manifold defined by the boundary conditions. (We shall make no attempt to be precise on the spaces here.)

As discussed in section 6.5, we first compute the variation of the action functional in general, and then restrict to variations that satisfy boundary conditions.

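For λ ∈ R, we have

\[
\begin{aligned}
S(\phi+\lambda\phi_1, A+\lambda A_1) - S(\phi,A)
= \lambda\int_{\Omega_T} \Big[ &\frac{1}{4\pi}\Big(\nabla\phi + \frac{1}{c}\frac{\partial A}{\partial t}\Big)\cdot\Big(\nabla\phi_1 + \frac{1}{c}\frac{\partial A_1}{\partial t}\Big) \\
&- \frac{1}{4\pi}(\nabla\times A)\cdot(\nabla\times A_1) + \frac{J\cdot A_1}{c} - \rho_c\phi_1 \Big]\,dt\,dx + O(\lambda^2).
\end{aligned}
\]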

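We see that the directional derivative of S is

\[
\begin{aligned}
\delta S\big((\phi,A),(\phi_1,A_1)\big)
= \int_{\Omega_T} \Big[ &\frac{1}{4\pi}\Big(\nabla\phi + \frac{1}{c}\frac{\partial A}{\partial t}\Big)\cdot\Big(\nabla\phi_1 + \frac{1}{c}\frac{\partial A_1}{\partial t}\Big) \\
&- \frac{1}{4\pi}(\nabla\times A)\cdot(\nabla\times A_1) + \frac{J\cdot A_1}{c} - \rho_c\phi_1 \Big]\,dt\,dx.
\end{aligned}
\]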

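Then δS((φ,A), (φ1,A1)) exists for every (φ1,A1), under minimal regularity requirements on (φ,A). Formally, the Frechet derivative of S is written as

DS(φ,A)(φ1,A1) = D_φS(φ,A)φ1 + D_AS(φ,A)A1,

with partial derivatives

\[
\begin{aligned}
D_\phi S(\phi,A)\phi_1 &= \frac{1}{4\pi}\int_{\Omega_T} \Big[ -E\cdot\nabla\phi_1 - 4\pi\rho_c\phi_1 \Big]\,dt\,dx, \\
D_A S(\phi,A)A_1 &= \frac{1}{4\pi}\int_{\Omega_T} \Big[ -E\cdot\frac{1}{c}\frac{\partial A_1}{\partial t} - B\cdot(\nabla\times A_1) + \frac{4\pi}{c}J\cdot A_1 \Big]\,dt\,dx,
\end{aligned}
\]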

where equations (6.19) have been accounted for. As a necessary condition for (φ,A) to be a local extremum of S, both derivatives have to vanish for all (φ1,A1) that preserve the boundary conditions, that is,

D_φS(φ,A)φ1 = 0, D_AS(φ,A)A1 = 0,

for all (φ1,A1) vanishing on the boundary. We recognize the weak form of the two inhomogeneous Maxwell’s equations (6.18a), while the homogeneous equations (6.18b) are satisfied automatically by using potentials, cf. equation (6.19). Hence we have found a variational formulation of Maxwell’s equations.

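If E and B have sufficient derivatives in L^2, we can integrate by parts and obtain equations (6.18a) in strong form. In fact we can compute the functional derivatives of S with respect to the L^2-pairing by

\[
\begin{aligned}
D_\phi S(\phi,A)\phi_1 &= \int_{\Omega_T} \frac{\delta S(\phi,A)}{\delta\phi}\,\phi_1\,dt\,dx
= \frac{1}{4\pi}\int_{\Omega_T} \big[\nabla\cdot E - 4\pi\rho_c\big]\phi_1\,dt\,dx = 0, \\
D_A S(\phi,A)A_1 &= \int_{\Omega_T} \frac{\delta S(\phi,A)}{\delta A}\cdot A_1\,dt\,dx
= \frac{1}{4\pi}\int_{\Omega_T} \Big[\frac{1}{c}\frac{\partial E}{\partial t} - \nabla\times B + \frac{4\pi}{c}J\Big]\cdot A_1\,dt\,dx = 0,
\end{aligned}
\]

for all (φ1,A1) that vanish on the boundary. In summary, the weak form of Maxwell’s equations (6.18) is a necessary condition for extrema of the action functional defined in (6.20). In strong form we have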

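\[
\frac{\delta S(\phi,A)}{\delta\phi} = \nabla\cdot E - 4\pi\rho_c = 0, \qquad
\frac{\delta S(\phi,A)}{\delta A} = \frac{1}{c}\frac{\partial E}{\partial t} - \nabla\times B + \frac{4\pi}{c}J = 0,
\]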

whenever E and B have sufficient derivatives in L2.

6.4 Motion of a charged particle in an electromagnetic field. Let us consider an electrically charged particle in a given electromagnetic field, as discussed in section 2.2. The equations of motion are, cf. equation (2.17),

\[
m_p\frac{d^2x(t)}{dt^2} = e_p\Big[ E\big(t,x(t)\big) + \frac{1}{c}\frac{dx(t)}{dt}\times B\big(t,x(t)\big) \Big], \tag{6.21}
\]

where γ : t ↦ x(t) ∈ R^3 is the trajectory of the particle in physical space and (E,B) is the imposed electromagnetic field (possibly changing with both time and position). We represent E and B in terms of given potentials φ and A according to equations (6.19).

We claim that the ordinary differential equation (6.21) admits a variational formulation with functional

\[
S(\gamma) = \int_0^T L\big(t, x(t), x'(t)\big)\,dt \tag{6.22}
\]

defined over the space of trajectories γ : [0, T] → R^3 of class C^2([0, T]) with prescribed boundary values x(0) = x0 and x(T) = xT. Here, the function

\[
L(t,x,v) = \frac{1}{2}m_p v^2 - e_p\phi + \frac{e_p}{c}A(t,x)\cdot v, \tag{6.23}
\]

is referred to as the Lagrangian function of a charged particle and S is the action functional.

As before, we first compute the derivative of S viewed as a function of curves γ of class C^2 and then restrict the result to variations that preserve the given boundary conditions.

We evaluate the action S on the line γ + λγ1, where γ1 is the trajectory t ↦ x1(t),

\[
S(\gamma+\lambda\gamma_1) - S(\gamma) = \lambda\int_0^T \Big[ m_p\frac{dx}{dt}\cdot\frac{dx_1}{dt} - e_p x_1\cdot\nabla\phi
+ \frac{e_p}{c}x_1\cdot\nabla A\cdot\frac{dx}{dt} + \frac{e_p}{c}A\cdot\frac{dx_1}{dt} \Big]\,dt + O(\lambda^2),
\]

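where we have expanded the potentials by Taylor’s formula. Extrema of S satisfy the condition

\[
DS(\gamma)\gamma_1 = \int_0^T \Big[ m_p\frac{dx}{dt}\cdot\frac{dx_1}{dt} - e_p x_1\cdot\nabla\phi
+ \frac{e_p}{c}x_1\cdot\nabla A\cdot\frac{dx}{dt} + \frac{e_p}{c}A\cdot\frac{dx_1}{dt} \Big]\,dt = 0.
\]

Now we restrict to variations γ1 that preserve the boundary conditions, i.e., x1(0) = x1(T) = 0, and integrate by parts with the result that

\[
DS(\gamma)\gamma_1 = \int_0^T \Big[ -m_p\frac{d^2x}{dt^2} - e_p\nabla\phi
+ \frac{e_p}{c}\nabla A\cdot\frac{dx}{dt} - \frac{e_p}{c}\frac{\partial A}{\partial t} - \frac{e_p}{c}\frac{dx}{dt}\cdot\nabla A \Big]\cdot x_1\,dt = 0,
\]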

for all γ1 with end-points x1(0) = x1(T) = 0. Since the integrand is continuous, the only possible solutions are those curves γ for which the term in square brackets vanishes, and thus

\[
m_p\frac{d^2x}{dt^2} = e_p\Big[ -\nabla\phi - \frac{1}{c}\frac{\partial A}{\partial t} \Big] + \frac{e_p}{c}\nabla A\cdot\frac{dx}{dt} - \frac{e_p}{c}\frac{dx}{dt}\cdot\nabla A. \tag{6.24}
\]

For every v ∈ R3 we have the vector-calculus identity

v × (∇×A) = −v · ∇A+∇A · v,

and accounting for equation (6.19), we recognize that equation (6.24) is equivalent to the equation of motion (6.21) as claimed.
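The vector-calculus identity used in the last step can be checked numerically. The sketch below assumes the index convention (v · ∇A)_i = Σ_j v_j ∂A_i/∂x_j and (∇A · v)_i = Σ_j (∂A_j/∂x_i) v_j, which is the reading consistent with (6.24); the vector potential `A`, the point and the vector `v` are hypothetical.

```python
import math

def A(x):
    # Hypothetical smooth vector potential on R^3.
    return [x[1] * x[2], math.sin(x[0]), x[0] * x[2] ** 2]

def grad_A(x, eps=1e-5):
    # dA[i][j] approximates d A_j / d x_i by central differences.
    dA = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        Ap, Am = A(xp), A(xm)
        for j in range(3):
            dA[i][j] = (Ap[j] - Am[j]) / (2.0 * eps)
    return dA

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

x = [0.4, -0.7, 1.3]
v = [1.0, 2.0, -0.5]
dA = grad_A(x)

curl_A = [dA[1][2] - dA[2][1], dA[2][0] - dA[0][2], dA[0][1] - dA[1][0]]
lhs = cross(v, curl_A)

# (grad A . v)_i = sum_j dA[i][j] v_j and (v . grad A)_i = sum_j v_j dA[j][i].
rhs = [sum(dA[i][j] * v[j] for j in range(3)) - sum(v[j] * dA[j][i] for j in range(3))
       for i in range(3)]

max_diff = max(abs(a - b) for a, b in zip(lhs, rhs))
print(lhs, rhs, max_diff)
```

With B = ∇ × A, the left-hand side is the magnetic part of the Lorentz force (up to the factor e_p/c), which is how (6.24) reduces to (6.21).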

6.5 First-order Lagrangian theories. Maxwell’s equations, section 6.3, and charged particle dynamics, section 6.4, are two examples of a large class of variational problems, namely, first-order Lagrangian theories [76, 74].

Maxwell’s equations constitute an example of Lagrangian field theory since at every point in time, the configuration of the system is specified by fields, which, in particular, means that the space of all possible configurations at a given time is infinite-dimensional.

Particle dynamics, on the other hand, is an example of Lagrangian mechanics for which the space of possible configurations of the system at a given time is finite-dimensional.

However, one can find a definition of Lagrangian theory in which both finite- and infinite-dimensional systems can be treated in a unified way.

For n, d > 0 integers, we consider a domain Ω ⊆ R^d, a function L : Ω × R^n × R^{n×d} → R of class C^2, called the Lagrangian density, and construct the action functional

\[
S(u) = \int_\Omega L\big(x, u(x), Du(x)\big)\,dx, \tag{6.25}
\]

for u : Ω → R^n.

Definition 6.6 (First-order Lagrangian theory). A first-order Lagrangian theory consists of the variational problem for a field u : Ω → R^n with functional of the form (6.25). The necessary condition DS(u) = 0 for the local extrema of the action is referred to as the Euler-Lagrange equations.

We can derive an explicit weak form of Euler-Lagrange equations by computing the directional derivative along the line u + λv, λ ∈ R, namely,

\[
\begin{aligned}
S(u+\lambda v) &= \int_\Omega L(x, u+\lambda v, Du+\lambda Dv)\,dx \\
&= S(u) + \lambda\int_\Omega \big[ \partial_y L(x,u,Du)\cdot v + \partial_\eta L(x,u,Du) : Dv \big]\,dx + O(\lambda^2),
\end{aligned}
\]

where (x, y, η) ∈ Ω × R^n × R^{n×d} denotes the point in the domain of L. Therefrom we deduce, formally at least, the derivative DS(u) and the weak Euler-Lagrange equations,

\[
DS(u)v = \int_\Omega \big[ \partial_y L(x,u,Du)\cdot v + \partial_\eta L(x,u,Du) : Dv \big]\,dx = 0, \tag{6.26}
\]

for all v.

For sufficiently regular fields u, we can integrate by parts and obtain the functional derivatives of S and the strong form of Euler-Lagrange equations. Imposing Dirichlet boundary conditions on u, we must have v|_{∂Ω} = 0 and

\[
\frac{\delta S(u)}{\delta u} = \partial_y L(x,u,Du) - \mathrm{div}\big[\partial_\eta L(x,u,Du)\big] = 0.
\]

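If x^i, y^a, and η^a_i are the components of x, y, and η, respectively (with indices i = 1, . . . , d and a = 1, . . . , n), Euler-Lagrange equations (6.26) explicitly read

\[
\frac{\partial L}{\partial y^a}(x,u,Du) - \sum_i \frac{\partial}{\partial x^i}\Big[ \frac{\partial L}{\partial \eta^a_i}(x,u,Du) \Big] = 0, \tag{6.27}
\]

which is a system of n partial differential equations for u = (u^a). The x^i-derivative has to be computed as a total derivative, namely,

\[
\frac{\partial L}{\partial y^a} - \sum_i \frac{\partial^2 L}{\partial x^i\partial\eta^a_i}
- \sum_{b,i} \frac{\partial^2 L}{\partial y^b\partial\eta^a_i}\frac{\partial u^b}{\partial x^i}
- \sum_{b,i,j} \frac{\partial^2 L}{\partial\eta^a_i\partial\eta^b_j}\frac{\partial^2 u^b}{\partial x^i\partial x^j} = 0.
\]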

One should observe that, in general, this is a system of second-order partial differential equations for the field u, except when ∂^2_η L = 0. When det ∂^2_η L = 0, we say that the Lagrangian is degenerate. A particularly degenerate case is when L does not depend on η, and Euler-Lagrange equations become a system of nonlinear algebraic equations

∂_y L(x, u) = 0, (with L independent of η).

One can also observe that Euler-Lagrange equations for first-order field theories are rather general and can comprise elliptic as well as hyperbolic problems.
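The meaning of the condition DS(u) = 0 can be illustrated on a discretized action. The sketch below assumes the simplest finite-dimensional case d = n = 1 with the hypothetical Lagrangian L(t, x, v) = v^2/2 − x^2/2 (a harmonic oscillator, not taken from the text), a midpoint-type quadrature, and fixed end-points; the function names are illustrative.

```python
import math

def action(path, h):
    # Discrete action: kinetic term from difference quotients,
    # potential term evaluated at interval midpoints.
    total = 0.0
    for k in range(len(path) - 1):
        v = (path[k + 1] - path[k]) / h
        x_mid = 0.5 * (path[k] + path[k + 1])
        total += h * (0.5 * v * v - 0.5 * x_mid * x_mid)
    return total

def max_interior_gradient(path, h, eps=1e-4):
    # Largest |dS/dx_k| over interior nodes (end-points stay fixed),
    # computed by central differences of the discrete action.
    path = list(path)
    worst = 0.0
    for k in range(1, len(path) - 1):
        saved = path[k]
        path[k] = saved + eps
        s_plus = action(path, h)
        path[k] = saved - eps
        s_minus = action(path, h)
        path[k] = saved
        worst = max(worst, abs(s_plus - s_minus) / (2.0 * eps))
    return worst

T, n = 1.0, 100
h = T / n
times = [k * h for k in range(n + 1)]

# x(t) = sin(t) solves the Euler-Lagrange equation x'' = -x; the straight
# line has the same end-points but does not solve it.
solution = [math.sin(t) for t in times]
straight = [t * math.sin(T) / T for t in times]

grad_solution = max_interior_gradient(solution, h)
grad_straight = max_interior_gradient(straight, h)
print(grad_solution, grad_straight)
```

The gradient of the discrete action is small (of the order of the discretization error) on the sampled solution and markedly larger on the straight line with the same boundary values.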

We can however make a basic classification in geometric terms. Specifically we find three main special cases:

1. The domain Ω can be a spatial domain only, so that u is a time-independent field. This is the case of the Poisson equation discussed in section 6.2.

2. The domain Ω can be of the form Ω = (0, T) × O, so that u describes the evolution in time of a field defined on the spatial domain O. This is the case of Maxwell’s equations discussed in section 6.3. In this formulation time t is treated as any other spatial coordinate and thus this is referred to as the co-variant approach [87].

3. The domain Ω can be an open time interval Ω = (0, T) and thus u is a curve describing the evolution of a finite-dimensional system [76]. This is the case of particle mechanics discussed in section 6.4.

For the case of time-dependent fields (case 2.), along with the co-variant approach one can consider a time-space splitting in which the field u = u(t, x), with now t ∈ (0, T) and x ∈ O, is regarded as a function of time u(t) = u(t, ·) : O → R^n, taking values in a space of functions over the spatial domain O. The Lagrangian density (written \mathcal{L} in the next formula only, in order to distinguish it from the functional) can be integrated in space in order to define the Lagrangian functional

\[
L(t, u, u') = \int_O \mathcal{L}(t, x, u, Du)\,dx, \tag{6.28}
\]

where u′(t) = du(t)/dt is the time-derivative of u(t), which equals ∂_t u(t, ·) as a function, and the Jacobian matrix is Du(t, ·) = (∂_t u(t, ·), ∂_x u(t, ·)). In terms of the Lagrangian functional, the action takes the form

\[
S(u) = \int_0^T L(t, u, u')\,dt, \tag{6.29}
\]

which is formally identical to the action of a finite-dimensional system (case 3.). In the case of a finite-dimensional theory, the distinction between Lagrangian density and Lagrangian function is not necessary.

6.6 Jet bundles and Noether’s theorem. By inspection of (6.27), we can immediately deduce that if the Lagrangian density L is independent of one of the y-coordinates, e.g., y^a for some fixed a, then Euler-Lagrange equations imply the conservation law,

\[
\sum_i \frac{\partial}{\partial x^i}\Big[ \frac{\partial L}{\partial\eta^a_i}(x,u,Du) \Big] = 0, \qquad (\text{with } \partial_{y^a}L = 0). \tag{6.30}
\]

More precisely, if u is a solution of Euler-Lagrange equations, the vector field with components given by

\[
J^i(x) = \frac{\partial L}{\partial\eta^a_i}(x,u,Du), \tag{6.31}
\]

is divergence-free.

In a space-plus-time splitting we replace x by (t, x), and equation (6.30) takes the form of a conservation law

\[
\partial_t J_t + \nabla\cdot J_x = 0, \tag{6.32}
\]

where J = (J_t, J_x). Therefore the time-like component J_t of the vector J is conserved with flux given by the space-like components J_x.

Another example of a conserved quantity is found when the Lagrangian density does not explicitly depend on the variable x^k for a given k, i.e., ∂_{x^k}L = 0. In that case we have, with δ^i_k being the Kronecker delta,

\[
\sum_i \frac{\partial}{\partial x^i}\Big[ L\,\delta^i_k - \sum_a \frac{\partial L}{\partial\eta^a_i}\frac{\partial u^a}{\partial x^k} \Big]
= \sum_a \Big[ \frac{\partial L}{\partial y^a}\frac{\partial u^a}{\partial x^k} - \sum_i \frac{\partial}{\partial x^i}\Big[\frac{\partial L}{\partial\eta^a_i}\Big]\frac{\partial u^a}{\partial x^k} \Big],
\]

and the right-hand side vanishes for solutions of the Euler-Lagrange equations (6.27). Therefore from the assumption ∂_{x^k}L = 0, it follows that, for a solution of Euler-Lagrange equations, the vector field given in components by

\[
J^i(x) = L(x,u,Du)\,\delta^i_k - \sum_a \frac{\partial L}{\partial\eta^a_i}(x,u,Du)\,\frac{\partial u^a}{\partial x^k}, \qquad (\text{for } k \text{ fixed}), \tag{6.33}
\]

is divergence-free.

In both the foregoing examples, conservation laws follow from a symmetry of the Lagrangian density. In the first case, L is assumed to be invariant with respect to translations of y^a, that is, one of the field components, while in the second case the invariance is with respect to translations of x^k, that is, one of the coordinates in the domain Ω.
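For a finite-dimensional system (d = 1, with x playing the role of time) the current (6.33) has a single component. As a minimal sketch, assuming the hypothetical time-independent Lagrangian L(x, v) = v^2/2 − x^2/2 and its solution x(t) = sin t (neither taken from the text), the quantity J = L − (∂L/∂v) v is constant along the solution and equals minus the energy:

```python
import math

def lagrangian(x, v):
    # Hypothetical Lagrangian with no explicit time dependence.
    return 0.5 * v * v - 0.5 * x * x

def dL_dv(x, v):
    # Derivative of L with respect to the velocity (the eta variable).
    return v

def current(t):
    # Evaluate (6.33) with d = 1 on the solution x(t) = sin(t).
    x, v = math.sin(t), math.cos(t)
    return lagrangian(x, v) - dL_dv(x, v) * v

values = [current(0.1 * k) for k in range(50)]
spread = max(values) - min(values)
print(values[0], spread)
```

In one dimension, being divergence-free means dJ/dt = 0, so the sampled values coincide up to rounding.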

The deep connection between symmetries and conservation laws was established by Noether in her seminal work [88]. Noether’s result is fully appreciated only if the underlying geometry is taken into account. The following is an attempt to give a concise introduction.

First-order Lagrangian theories as described in section 6.5 can be geometrically formulated on the Cartesian product

E = Ω × R^n.

In fact, a given function u : Ω → R^n can be identified with the surface

graph(u) = {(x, y) ∈ E | y = u(x)} ⊂ E,

which is referred to as the graph of the function. We also observe that E can be naturally equipped with the projection π : E → Ω given by (x, y) ↦ x. The map π is onto and, upon identifying Ω with the “zero section” Ω × {0} ⊂ E, π satisfies π^2 = π, i.e., it is a projector. The set E is such that the inverse image π^{−1}(x) of any point x ∈ Ω is a copy of a linear space, namely, R^n. This particular structure is called a vector bundle, and the open domain Ω is the base of the bundle, while R^n is the fiber. A vector bundle is a special case of a more general structure, namely, a fiber bundle, the difference being that for fiber bundles the fibers do not need to be modeled on a linear space, but can be generic manifolds [87]. The specific case of the Cartesian product E defined here is called trivial, since there is one single coordinate map covering the whole base Ω (namely, the identity map). A fiber bundle is usually denoted by π : E → Ω, like the associated projection.

The graph of a function can be characterized intrinsically without making reference to the fibers. This leads to the concept of section of a fiber bundle.

Definition 6.7. A section of a fiber bundle π : E → Ω is a map ϕ : Ω → E such that π ∘ ϕ = Id, where Id is the identity map in Ω. For a vector bundle, sections form a linear space, since the fibers are linear spaces. The space of sections of a vector bundle E is denoted ΓE.

According to this definition with E = Ω × R^n, a point x ∈ Ω is mapped into the point (x′, y′) = ϕ(x) ∈ E, but we must have x′ = π ∘ ϕ(x) = x, so that ϕ(x) = (x, u(x)) for some function u : Ω → R^n. This shows that sections of E are just functions over Ω with values in the fibers of E and that ΓE is linear.

In order to deal with first-order Lagrangian theories we also have to account for first-order derivatives in this geometric structure. With this aim we introduce an equivalence relation: we say that two differentiable sections ϕ1 and ϕ2 are equivalent at x ∈ Ω if they agree to first order at x, that is, if their Taylor polynomial is the same up to the first-order term. Hence, two equivalent sections at a point x pass through the same point in E, ϕ1(x) = ϕ2(x), and have the same Jacobian matrix, Dϕ1(x) = Dϕ2(x). If we write ϕi(x) = (x, ui(x)) for i = 1, 2, then the two sections agree to first order if and only if u1(x) = u2(x) and Du1(x) = Du2(x), since, in particular, Dϕi(x) = (I, Dui(x)), where I is the d × d identity matrix. The set of equivalence classes of sections that agree to first order at a given point has a natural fiber-bundle structure over the same base Ω as the considered bundle E. In fact, at every point x, an equivalence class is uniquely characterized by a point y = u1(x) = u2(x) in the fiber through x, together with the matrix η = Du1(x) = Du2(x) ∈ R^{n×d}.

Definition 6.8 (First jet-bundle). The set of equivalence classes of sections of a bundle π : E → Ω that agree to first order is the first jet bundle of E, denoted by π1 : J^1E → Ω.

For the simple case of a trivial vector bundle considered here, we have that

J^1E ≅ Ω × R^n × R^{n×d} ≅ E × R^{n×d}.

We see that indeed J^1E is a vector bundle over Ω, as well as a vector bundle over E. This construction is summarized in the commutative diagram

\[
\begin{array}{ccc}
E & \longleftarrow & J^1E \\
{\scriptstyle \pi}\big\downarrow & \swarrow{\scriptstyle \pi_1} & \\
\Omega & &
\end{array}
\]

Since J^1E is a bundle over Ω we can actually iterate the construction and define higher-order jet-bundles of E, denoted by J^kE; this, however, will not be needed here.

Given a differentiable section ϕ ∈ ΓE with ϕ(x) = (x, u(x)), we can build the function j^1ϕ : Ω → J^1E defined by

\[
j^1\varphi : \Omega \ni x \mapsto \big(x, u(x), Du(x)\big) \in J^1E. \tag{6.34}
\]

We observe that π1 ∘ j^1ϕ(x) = x, hence j^1ϕ is a section of J^1E and we write j^1ϕ ∈ ΓJ^1E. However, not all the elements of ΓJ^1E can be related to a section of E.

Definition 6.9 (First jet prolongation and holonomic sections). The section j^1ϕ ∈ ΓJ^1E is referred to as the first jet prolongation of the section ϕ ∈ ΓE. A generic section ψ ∈ ΓJ^1E is called holonomic if and only if there exists a differentiable ϕ ∈ ΓE such that ψ = j^1ϕ.

We are now ready to formulate a first-order Lagrangian theory in geometric terms. Let π : E → Ω be a vector bundle over Ω ⊆ R^d with finite-dimensional fibers, and L : J^1E → R a function of class C^2. We construct the action functional S : ΓE → R by

\[
S(\varphi) = \int_\Omega L\big(j^1\varphi(x)\big)\,dx, \tag{6.35}
\]

and we seek sections ϕ ∈ ΓE that extremize the action functional S(ϕ).

So far we have just introduced a geometric language which does not really add any new information. The usefulness of the foregoing concepts becomes clearer when we consider the action of a group of transformations of the bundle E. In particular, we want to write a condition on the Lagrangian density which states the invariance of S under the action of a group of transformations of the whole bundle E.

We restrict our attention to vector-bundle diffeomorphisms f : E → E that cover a diffeomorphism of the base. Specifically, a diffeomorphism f : E → E is a vector-bundle diffeomorphism if it preserves the vector-bundle structure, i.e., f(π^{−1}(x)) is again a fiber of E. In addition, we say that f covers a diffeomorphism g : Ω → Ω of the base if it is of the form

f(x, y) = (g(x), h(x, y)),

with h : E → R^n. This construction is represented by the commutative diagram

\[
\begin{array}{ccc}
E & \xrightarrow{\ f\ } & E \\
{\scriptstyle \pi}\big\downarrow & & \big\downarrow{\scriptstyle \pi} \\
\Omega & \xrightarrow{\ g\ } & \Omega
\end{array}
\]

The action of the transformation f on a section ϕ ∈ ΓE, which we denote by fϕ, is defined by the commutative diagram

\[
\begin{array}{ccc}
E & \xrightarrow{\ f\ } & E \\
{\scriptstyle \varphi}\big\uparrow & & \big\uparrow{\scriptstyle f\varphi} \\
\Omega & \xrightarrow{\ g\ } & \Omega
\end{array}
\]

and explicitly by

fϕ(x) = f ∘ ϕ(x′), where x = g(x′).

The jet prolongation of the transformed section fϕ is readily computed on noting that

fϕ(x) = (x, h(x′, u(x′))), where x = g(x′),

and thus by the chain rule,

D(fϕ)(x) = (I, D_x[h(x′, u(x′))]),

where in components we have

\[
\big(D_x[h(x',u(x'))]\big)^a_i = \frac{\partial}{\partial x^i}\Big[ h^a\big(g^{-1}(x), u(g^{-1}(x))\big) \Big]
= \sum_j \Big[ \frac{\partial h^a(x',u(x'))}{\partial x^j} + \sum_b \frac{\partial h^a(x',u(x'))}{\partial y^b}\frac{\partial u^b(x')}{\partial x^j} \Big]\frac{\partial (g^{-1})^j}{\partial x^i},
\]

again with x = g(x′).

Definition 6.10 (First jet prolongation of a transformation). The prolongation j^1f : J^1E → J^1E of the map f defined above is

j^1f(x, y, η) = (g(x), h(x, y), [D_x h + (D_y h)η] Dg^{−1}),

where the derivatives of h are to be evaluated at (x, y) and Dg^{−1} at g(x).

With this definition we have the relation

\[
j^1(f\varphi) = j^1f \circ j^1\varphi, \tag{6.36}
\]

that is, the prolongation of the transformed section is given by the application of the prolonged transformation to the prolonged section.
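Relation (6.36) can be tested numerically in the simplest case d = n = 1. The sketch below uses hypothetical maps `g`, `h` and a hypothetical section `u`, none of which come from the text; the inverse of g is computed by bisection, and the derivative of the transformed section is compared with the third component of j^1f.

```python
import math

def g(x):
    # Hypothetical base diffeomorphism (g' = 1 + 0.5 cos x > 0).
    return x + 0.5 * math.sin(x)

def dg(x):
    return 1.0 + 0.5 * math.cos(x)

def g_inverse(x):
    # Bisection; valid because g is increasing and |g(s) - s| <= 0.5.
    lo, hi = x - 1.0, x + 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def h(x, y):
    # Hypothetical fiber map, linear in y.
    return (1.0 + x * x) * y

def u(x):
    # Hypothetical section phi(x) = (x, u(x)).
    return math.cos(x)

def transformed(x):
    # Fiber component of (f phi)(x) = h(x', u(x')) with x = g(x').
    xp = g_inverse(x)
    return h(xp, u(xp))

x_prime = 0.4
x = g(x_prime)

# Derivative of the transformed section by central differences.
eps = 1e-5
numeric = (transformed(x + eps) - transformed(x - eps)) / (2.0 * eps)

# Third component of j^1 f applied to j^1 phi(x') = (x', y, eta).
y, eta = u(x_prime), -math.sin(x_prime)
dxh = 2.0 * x_prime * y
dyh = 1.0 + x_prime * x_prime
formula = (dxh + dyh * eta) / dg(x_prime)  # Dg^{-1} at g(x') is 1 / g'(x')

print(numeric, formula)
```

The agreement of the two numbers is exactly the statement that the prolonged transformation, applied to the prolonged section, yields the derivative of the transformed section.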

Having established the basic definitions, let us now consider a smooth one-parameter family of vector-bundle diffeomorphisms f_ε(x, y) = (g_ε(x), h_ε(x, y)) covering the diffeomorphisms g_ε of the base; here, ε is the parameter.

We also assume that f_ε is smooth in ε and reduces to the identity for ε = 0; thus it is sometimes called a differentiable near-identity transformation.

Near-identity transformations are the flow of a vector field on E defined by, cf. equation (1.5),

\[
Z(x,y) = \frac{d}{d\varepsilon} f_\varepsilon(x,y)\Big|_{\varepsilon=0}, \tag{6.37}
\]

and thus we have

Z(x, y) = (X(x), Y(x, y)),

where

\[
X(x) = \frac{d}{d\varepsilon} g_\varepsilon(x)\Big|_{\varepsilon=0}, \tag{6.38}
\]

is the vector field associated to the base diffeomorphisms g_ε and

\[
Y(x,y) = \frac{d}{d\varepsilon} h_\varepsilon(x,y)\Big|_{\varepsilon=0}, \tag{6.39}
\]

is a vector field on E.

Definition 6.11 (Infinitesimal generator). The vector field Z on E defined by (6.37) is referred to as the infinitesimal generator of the near-identity transformation f_ε. The map f_ε, on the other hand, is the flow of Z, with the parameter ε playing the role of time.

Since Z does not depend on the parameter ε, we can conclude from proposition 1.3 that f_ε has the group property. As discussed in section 1.2, the generator field Z and its flow f_ε are equivalent: given one we can compute the other.

Correspondingly we can consider the jet prolongation of the family of transformations, which amounts to a near-identity group of transformations on J^1E and thus has a generator.

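Definition 6.12 (Prolongation of a generator field). The infinitesimal generator of j^1f_ε, namely,

\[
j^1Z(x,y,\eta) = \frac{d}{d\varepsilon}\, j^1f_\varepsilon(x,y,\eta)\Big|_{\varepsilon=0},
\]

is referred to as the jet prolongation of the field Z.

We need an explicit expression for the prolongation of a vector field. From definition 6.10 we have

\[
\begin{aligned}
j^1Z(x,y,\eta) &= \frac{d}{d\varepsilon}\Big( g_\varepsilon(x),\ h_\varepsilon(x,y),\ [D_xh_\varepsilon + (D_yh_\varepsilon)\eta]\,Dg_\varepsilon^{-1} \Big)\Big|_{\varepsilon=0} \\
&= \Big( X(x),\ Y(x,y),\ \frac{d}{d\varepsilon}[D_xh_\varepsilon + (D_yh_\varepsilon)\eta]\,Dg_\varepsilon^{-1}\Big|_{\varepsilon=0} \Big).
\end{aligned}
\]

The last factor is computed in the following

Lemma 6.7. With f_ε = (g_ε, h_ε) defined above we have

\[
\frac{d}{d\varepsilon}\Big\{ [D_xh_\varepsilon + (D_yh_\varepsilon)\eta]\,Dg_\varepsilon^{-1} \Big\}\Big|_{\varepsilon=0} = D_xY + D_yY\,\eta - \eta\,DX,
\]

where DX and DY = (D_xY, D_yY) are the Jacobian matrices of the vector fields defined in equations (6.38) and (6.39), respectively.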

Explicitly in coordinates we write

\[
\big[ D_xY + D_yY\,\eta - \eta\,DX \big]^a_i = \frac{\partial Y^a}{\partial x^i} + \sum_b \frac{\partial Y^a}{\partial y^b}\eta^b_i - \sum_j \eta^a_j\frac{\partial X^j}{\partial x^i}.
\]

Then we have

\[
j^1Z(x,y,\eta) = \big( X(x),\ Y(x,y),\ D_xY(x,y) + D_yY(x,y)\eta - \eta DX(x) \big). \tag{6.40}
\]

If we think of the base of the bundle E as lying “horizontally” and of the fibers as “vertical lines” passing through each point of the base (which is the standard representation of a fiber bundle), then we say that a transformation f_ε is vertical if it acts on the fibers only, leaving the base point unchanged, that is, g_ε(x) = x and X = 0. For vertical transformations we have

\[
j^1Z(x,y,\eta) = \big( 0,\ Y(x,y),\ D_xY(x,y) + D_yY(x,y)\eta \big). \tag{6.41}
\]

One should note that Y might still depend on x even for vertical transformations. On the other hand, a transformation is horizontal if it acts on the base only, that is, h_ε(x, y) = y and Y = 0. In that case

\[
j^1Z(x,y,\eta) = \big( X(x),\ 0,\ -\eta DX(x) \big). \tag{6.42}
\]

Those two special cases play an important role, as often we look for groups of transformations that are either purely vertical or purely horizontal.

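Proof of lemma 6.7. Since f_ε reduces to the identity map when ε = 0, we have g_ε(x)|_{ε=0} = x and h_ε(x, y)|_{ε=0} = y; hence,

Dg_ε^{−1}|_{ε=0} = I_d, D_xh_ε|_{ε=0} = 0, and D_yh_ε|_{ε=0} = I_n,

where I_d, I_n are the identity matrices in dimension d and n, respectively. Then,

\[
\begin{aligned}
\frac{d}{d\varepsilon}\Big\{ [D_xh_\varepsilon + (D_yh_\varepsilon)\eta]\,Dg_\varepsilon^{-1} \Big\}\Big|_{\varepsilon=0}
&= \frac{d}{d\varepsilon}\big( D_xh_\varepsilon + (D_yh_\varepsilon)\eta \big)\Big|_{\varepsilon=0} + \eta\,\frac{d}{d\varepsilon}Dg_\varepsilon^{-1}\Big|_{\varepsilon=0} \\
&= D_xY + D_yY\,\eta + \eta\,\frac{d}{d\varepsilon}Dg_\varepsilon^{-1}\Big|_{\varepsilon=0}.
\end{aligned}
\]

The last term can be computed from the identity Dg_ε Dg_ε^{−1} = I_d. Upon differentiating with respect to ε one gets

\[
\Big( \frac{d}{d\varepsilon}Dg_\varepsilon \Big) Dg_\varepsilon^{-1} + Dg_\varepsilon \Big( \frac{d}{d\varepsilon}Dg_\varepsilon^{-1} \Big) = 0,
\]

from which

\[
\frac{d}{d\varepsilon}Dg_\varepsilon^{-1} = -Dg_\varepsilon^{-1}\Big( \frac{d}{d\varepsilon}Dg_\varepsilon \Big) Dg_\varepsilon^{-1},
\]

and evaluating at ε = 0, that reduces to

\[
\frac{d}{d\varepsilon}Dg_\varepsilon^{-1}\Big|_{\varepsilon=0} = -DX.
\]

Using this identity, the claim follows.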
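The key identity of the proof, d/dε Dg_ε^{−1}|_{ε=0} = −DX, can be checked on an explicit flow. The sketch below assumes the hypothetical one-dimensional vector field X(x) = x^2, whose flow is g_ε(x) = x/(1 − εx) for small ε; by the group property the inverse map is g_{−ε}. All derivatives are approximated by central differences.

```python
def X(x):
    # Hypothetical generator on a one-dimensional base.
    return x * x

def g(eps, x):
    # Flow of X: solves d g / d eps = X(g) with g(0, x) = x (for small eps).
    return x / (1.0 - eps * x)

def g_inverse(eps, x):
    # Group property of the flow: the inverse of g_eps is g_{-eps}.
    return g(-eps, x)

def d_dx(func, x, step=1e-5):
    return (func(x + step) - func(x - step)) / (2.0 * step)

def Dg_inverse(eps, x):
    # Jacobian (here a scalar) of the inverse map at the point x.
    return d_dx(lambda s: g_inverse(eps, s), x)

x0 = 0.3
de = 1e-4

# Left-hand side: derivative in eps of Dg_eps^{-1} at eps = 0.
lhs = (Dg_inverse(de, x0) - Dg_inverse(-de, x0)) / (2.0 * de)

# Right-hand side: minus the Jacobian of the generator.
DX = d_dx(X, x0)

round_trip = g_inverse(0.05, g(0.05, x0))
print(lhs, -DX, round_trip)
```

At ε = 0 the base point is unchanged, so it is immaterial whether Dg_ε^{−1} is evaluated at x or at g_ε(x) in this check.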


We are now ready to state the condition of invariance of the action functional (6.35) under the action of a group of transformations.

Theorem 6.8. A sufficient condition for the action (6.35) to be invariant under the group of transformations generated by the vector field Z = (X, Y) is that the Lagrangian density L satisfies

j^1Z(L) + L div X = 0,

which is referred to as the equivariance condition.

Remark 6.1. In the equivariance condition, the action of the vector field j^1Z on the scalar L is defined as an advection operator, namely,

\[
j^1Z(L) = \Big[ \sum_i X^i\frac{\partial}{\partial x^i} + \sum_a Y^a\frac{\partial}{\partial y^a}
+ \sum_{a,i}\Big( \frac{\partial Y^a}{\partial x^i} + \sum_b \frac{\partial Y^a}{\partial y^b}\eta^b_i - \sum_j \eta^a_j\frac{\partial X^j}{\partial x^i} \Big)\frac{\partial}{\partial\eta^a_i} \Big] L.
\]

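Proof. In applying the transformation f_ε to the functional S we observe that L transforms like a scalar, while the volume element dx transforms like a volume form, hence

\[
S_\varepsilon(\varphi_\varepsilon) = \int_\Omega L(j^1\varphi_\varepsilon)\,\det Dg_\varepsilon(x)\,dx,
\]

and we have j^1ϕ_ε = j^1f_ε ∘ j^1ϕ, in virtue of identity (6.36), while det Dg_ε > 0 since it reduces to one at ε = 0 and cannot cross zero. Then, upon differentiating in ε under the integral sign,

\[
\begin{aligned}
\frac{d}{d\varepsilon}S_\varepsilon(\varphi_\varepsilon) &= \int_\Omega \frac{d}{d\varepsilon}\Big[ L(j^1\varphi_\varepsilon)\,\det Dg_\varepsilon(x) \Big]\,dx \\
&= \int_\Omega \Big[ \frac{d}{d\varepsilon}L(j^1\varphi_\varepsilon) + L(j^1\varphi_\varepsilon)\,\mathrm{div}X\big(g_\varepsilon(x)\big) \Big]\det Dg_\varepsilon(x)\,dx,
\end{aligned}
\]

where we have used lemma 1.7 in order to compute the derivative of the determinant. Since

L(j^1ϕ_ε) = L(j^1f_ε ∘ j^1ϕ) = (L ∘ j^1f_ε)(j^1ϕ),

and, by the chain rule,

\[
\frac{d}{d\varepsilon}\big[ L\circ j^1f_\varepsilon \big] = \big[ DL\circ j^1f_\varepsilon \big]\frac{d}{d\varepsilon}j^1f_\varepsilon = j^1Z(L)(j^1f_\varepsilon),
\]

we can write

\[
\frac{d}{d\varepsilon}L(j^1\varphi_\varepsilon) = j^1Z(L)(j^1\varphi_\varepsilon),
\]

and therefore

\[
\frac{d}{d\varepsilon}S_\varepsilon(\varphi_\varepsilon) = \int_\Omega \Big[ j^1Z(L)(j^1\varphi_\varepsilon) + L(j^1\varphi_\varepsilon)\,\mathrm{div}X(x) \Big]\det Dg_\varepsilon(x)\,dx.
\]

If the Lagrangian density is equivariant, the integrand vanishes and the action functional is invariant.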

Let us consider an example of a vertical transformation, and specifically, the group of transformations generated by the vector field Z = (X, Y) with

\[
X^j = 0, \qquad Y^b = \delta^b_a, \tag{6.43}
\]

for a fixed direction a. The flow f_ε(x, y) = (g_ε(x), h_ε(x, y)) of this vector field is such that

g^j_ε(x) = x^j, h^b_ε(x, y) = y^b + εδ^b_a,

which corresponds to a translation in the direction of y^a. The prolongation of the generator field is

j^1Z(x, y, η) = (0, (δ^b_a), 0) = Z,

so that the equivariance condition for the Lagrangian density amounts to

j^1Z(L) + L div X = ∂_{y^a}L = 0,

which is the condition leading to the conservation law for the flux (6.31).

As an example of horizontal transformations, for k fixed, let us consider the vector field Z = (X, Y) with

\[
X^j = \delta^j_k, \qquad Y^b = 0, \tag{6.44}
\]

which corresponds to the transformation f_ε(x, y) = (g_ε(x), h_ε(x, y)) with

g^j_ε(x) = x^j + εδ^j_k, h^b_ε(x, y) = y^b.

This is a translation in the horizontal direction of x^k. The prolonged vector field in this case is

j^1Z(x, y, η) = ((δ^j_k), 0, 0) = Z,

and we also have div X = 0, so that the equivariance condition amounts to

j^1Z(L) + L div X = ∂_{x^k}L = 0.

This is the condition underlying the conservation of the vector field (6.33).

At last we give the definition of the vector field associated to a Lagrangian density L and a generator Z. The definition we use here is not fully general, since we have taken advantage of the particularly simple geometric setting (trivial vector bundle). The vector field J will initially be defined on the whole first jet bundle J^1E and eventually be reduced to a vector field over Ω by composition with the prolongation of a section.

Definition 6.13 (Noether current). Let Z be a vector field over a vector bundle E, and L ∈ C^2(J^1E). The vector field over J^1E defined in components by

\[
J^i(x,y,\eta) = \sum_a \frac{\partial L}{\partial\eta^a_i}(x,y,\eta)\Big[ Y^a(x,y) - \sum_j X^j(x)\,\eta^a_j \Big] + L(x,y,\eta)\,X^i(x), \tag{6.45}
\]

and zero for the remaining components, is referred to as the Noether current associated to Z.

The vector fields (6.31) and (6.33) in the two examples considered at the beginning of the section are the Noether currents associated to a vertical translation in the direction y^a and a horizontal translation in the direction x^k, respectively, evaluated on a section j^1ϕ.

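A direct computation of the divergence of the Noether current associated to a vector field Z evaluated at j^1ϕ gives

\[
\begin{aligned}
\mathrm{div}\big[J(j^1\varphi)\big] &= \sum_i \frac{\partial}{\partial x^i}\Big[ \sum_a \frac{\partial L}{\partial\eta^a_i}\Big( Y^a(\varphi) - \sum_j X^j\frac{\partial u^a}{\partial x^j} \Big) + L(j^1\varphi)X^i \Big] \\
&= \sum_{a,i} \frac{\partial}{\partial x^i}\Big[\frac{\partial L}{\partial\eta^a_i}\Big]\Big( Y^a(\varphi) - \sum_j X^j\frac{\partial u^a}{\partial x^j} \Big)
+ \sum_{a,i} \frac{\partial L}{\partial\eta^a_i}\frac{\partial}{\partial x^i}\Big[ Y^a(\varphi) - \sum_j X^j\frac{\partial u^a}{\partial x^j} \Big] \\
&\quad + \sum_i X^i\frac{\partial}{\partial x^i}\big[L(j^1\varphi)\big] + L(j^1\varphi)\,\mathrm{div}X,
\end{aligned}
\]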

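where, in computing the derivatives ∂_{x^i}[· · ·], we have to account for the total dependence of the argument [· · ·] on x, namely, the explicit dependence as well as the implicit dependence through u(x) and Du(x). First we compute

\[
\frac{\partial}{\partial x^i}\Big[ Y^a(\varphi) - \sum_j X^j\frac{\partial u^a}{\partial x^j} \Big]
= \frac{\partial Y^a}{\partial x^i} + \sum_b \frac{\partial u^b}{\partial x^i}\frac{\partial Y^a}{\partial y^b}
- \sum_j \Big( \frac{\partial X^j}{\partial x^i}\frac{\partial u^a}{\partial x^j} + X^j\frac{\partial^2 u^a}{\partial x^i\partial x^j} \Big),
\]

and

\[
\frac{\partial}{\partial x^i}\big[L(j^1\varphi)\big] = \frac{\partial L}{\partial x^i} + \sum_b \frac{\partial u^b}{\partial x^i}\frac{\partial L}{\partial y^b} + \sum_{b,j} \frac{\partial^2 u^b}{\partial x^i\partial x^j}\frac{\partial L}{\partial\eta^b_j}.
\]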

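Upon substituting back into the expression of div J we find

\[
\begin{aligned}
\mathrm{div}\big[J(j^1\varphi)\big]
&= \sum_{a,i} \frac{\partial}{\partial x^i}\Big[\frac{\partial L}{\partial \eta^a_i}\Big]\Big(Y^a(\varphi) - \sum_j X^j \frac{\partial u^a}{\partial x^j}\Big) \\
&\quad + \sum_{a,i} \frac{\partial L}{\partial \eta^a_i}\Big[\frac{\partial Y^a}{\partial x^i} + \sum_b \frac{\partial u^b}{\partial x^i}\frac{\partial Y^a}{\partial y^b}
 - \sum_j \Big(\frac{\partial X^j}{\partial x^i}\frac{\partial u^a}{\partial x^j} + X^j \frac{\partial^2 u^a}{\partial x^i \partial x^j}\Big)\Big] \\
&\quad + \sum_i X^i \Big[\frac{\partial L}{\partial x^i} + \sum_b \frac{\partial u^b}{\partial x^i}\frac{\partial L}{\partial y^b}
 + \sum_{b,j} \frac{\partial^2 u^b}{\partial x^i \partial x^j}\frac{\partial L}{\partial \eta^b_j}\Big] + L(j^1\varphi)\,\mathrm{div}X \\
&= \sum_{a,i} \frac{\partial}{\partial x^i}\Big[\frac{\partial L}{\partial \eta^a_i}\Big]\Big(Y^a(\varphi) - \sum_j X^j \frac{\partial u^a}{\partial x^j}\Big)
 + \sum_i X^i \frac{\partial L}{\partial x^i} \\
&\quad + \sum_{a,i} \Big(\frac{\partial Y^a}{\partial x^i} + \sum_b \frac{\partial Y^a}{\partial y^b}\frac{\partial u^b}{\partial x^i}
 - \sum_j \frac{\partial u^a}{\partial x^j}\frac{\partial X^j}{\partial x^i}\Big)\frac{\partial L}{\partial \eta^a_i}
 + \sum_{a,i} X^i \frac{\partial u^a}{\partial x^i}\frac{\partial L}{\partial y^a} + L(j^1\varphi)\,\mathrm{div}X,
\end{aligned}
\]

where the terms containing second derivatives of u cancel each other.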

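We can now add and subtract from the last expression the term

\[
\sum_a Y^a \frac{\partial L}{\partial y^a},
\]

and get

\[
\mathrm{div}\big[J(j^1\varphi)\big]
= -\sum_a \Big[\frac{\partial L}{\partial y^a} - \sum_i \frac{\partial}{\partial x^i}\Big[\frac{\partial L}{\partial \eta^a_i}\Big]\Big]\Big(Y^a(\varphi) - \sum_j X^j \frac{\partial u^a}{\partial x^j}\Big)
 + j^1Z(L)(j^1\varphi) + L(j^1\varphi)\,\mathrm{div}X. \tag{6.46}
\]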

Equation (6.46) implies

Theorem 6.9 (Noether theorem [88]). Let π : E → Ω and Z be defined as above. If the Lagrangian density is equivariant under the action of the group of transformations generated by Z, then the associated Noether current satisfies

\[
\mathrm{div}\big[J(j^1\varphi)\big] = 0,
\]

for every solution ϕ ∈ ΓE of the Euler-Lagrange equations.

Proof. In virtue of theorem 6.8, the action is invariant if and only if the Lagrangian density satisfies the equivariance condition. The right-hand side of equation (6.46) vanishes identically if both the equivariance condition and the Euler-Lagrange equations are satisfied.

A typical application of Noether’s theorem follows three steps: First we identify the vector fields Z that satisfy the equivariance condition of theorem 6.8. For each such field, we construct the associated Noether current. At last we solve the Euler-Lagrange equations and evaluate the Noether current on the obtained solution, which yields the conserved quantity.
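As a hedged illustration of these three steps (it is not taken from the text), consider a single independent variable x = t and the Lagrangian $L(y,\eta) = \tfrac12|\eta|^2 - V(y)$, where the potential V is a hypothetical smooth function. We take the horizontal translation X = 1, Y = 0 and assume, as equation (6.46) suggests, that the equivariance condition of theorem 6.8 reads $j^1Z(L) + L\,\mathrm{div}X = 0$. Under these assumptions the Noether current (6.45) reduces to minus the mechanical energy:

```latex
\[
j^1Z(L) + L\,\mathrm{div}X = \frac{\partial L}{\partial x} = 0,
\qquad
J(y,\eta) = \sum_a \frac{\partial L}{\partial \eta^a}\,(0 - \eta^a) + L
          = -\Big(\tfrac12|\eta|^2 + V(y)\Big),
\]
\[
\mathrm{div}\big[J(j^1\varphi)\big]
 = -\frac{d}{dt}\Big(\tfrac12|\dot u(t)|^2 + V\big(u(t)\big)\Big) = 0
 \quad\text{whenever}\quad \ddot u = -\nabla V(u).
\]
```

In this sketch the conserved quantity associated with translations in time is the total energy, in line with the horizontal-translation example mentioned after definition 6.13.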

6.7 Geodesics and Euler’s equations of fluid dynamics. Let M be a surface in $\mathbb{R}^n$ defined as the zero-level set of a smooth function $\psi : \mathbb{R}^n \to \mathbb{R}$ satisfying the condition $D\psi \neq 0$ on M. By the implicit function theorem M has dimension d = n − 1, and the space of tangent vectors at a point m ∈ M is, cf. equation (6.11),

\[
T_mM = \{ v \in \mathbb{R}^n \mid D\psi(m)v = 0 \} = \ker D\psi(m).
\]

Since $T_mM \subset \mathbb{R}^n$ we can use the standard norm on $\mathbb{R}^n$ in order to measure the length of vectors in $T_mM$. However, this is not the only possibility. A function that allows us to measure the length of tangent vectors is called a metric and it is precisely given by

Definition 6.14. Let M be a surface as above. A metric on M is a function $g : m \mapsto g_m$ where $g_m$ is a positive-definite bi-linear form on $T_mM$. We shall always assume that $g_m$ depends smoothly on m.

We can construct a basis for the tangent space $T_mM$ in terms of local coordinates. By using the implicit function theorem we can construct a map $q : U \subseteq \mathbb{R}^d \to V \subset \mathbb{R}^n$ from a neighborhood U of $0 \in \mathbb{R}^d$ to a neighborhood of m ∈ M such that

\[
M \cap V = \{ z \in \mathbb{R}^n \mid z = q(x),\ x \in U \subseteq \mathbb{R}^d \},
\]

that is, q defines the surface near the point m. The vectors

\[
e_i(x) = \frac{\partial q(x)}{\partial x^i} \in \mathbb{R}^n
\]

are linearly independent, since the Jacobian of q must have maximum rank, and are tangent, i.e., $e_i(x) \in T_{q(x)}M$, since $\psi\big(q(x)\big) = 0$ and, upon differentiating,

\[
D\psi\big(q(x)\big)\,e_i(x) = 0.
\]

Hence $\{e_i(x)\}$ forms a basis for $T_{q(x)}M$ induced by the coordinates. Given a metric g on M, the matrix-valued function

\[
g_{ij}(x) = g_{q(x)}\big(e_i(x), e_j(x)\big)
\]

is the local-coordinate representation of the metric, that is, for any two tangent vector fields v, w defined on M ∩ V we have

\[
v(x) = \sum_i v^i(x)\,e_i(x), \qquad w(x) = \sum_i w^i(x)\,e_i(x),
\]

and

\[
g_q(v,w) = \sum_{i,j} g_{ij}\,v^i w^j.
\]

The matrix $g_{ij}(x)$ is the coordinate representation of the metric tensor. As an example, if the metric g is given by the norm in the ambient space $\mathbb{R}^n$, the metric tensor is

\[
g_{ij}(x) = e_i(x)\cdot e_j(x) = \sum_k \frac{\partial q^k(x)}{\partial x^i}\frac{\partial q^k(x)}{\partial x^j},
\]

in agreement with the classical theory of surfaces.

Let us now consider a curve $\gamma : [t_1,t_2] \subseteq \mathbb{R} \to M$ in the surface (M, g) endowed with a metric g. The situation is depicted in figure 6.1 with the difference that here we consider finite-dimensional spaces.

The length of a curve is defined by the integral of the norm of its tangent vector, namely,

\[
\ell(\gamma) = \int_{t_1}^{t_2} \sqrt{g_{\gamma(t)}\big(\dot\gamma(t),\dot\gamma(t)\big)}\,dt, \tag{6.47}
\]

where $\dot\gamma(t) = d\gamma(t)/dt \in T_{\gamma(t)}M$ is the tangent vector of the curve in M.

Definition 6.15 (Geodesics). A geodesic $\gamma : [t_1,t_2] \to M$ is the curve which has minimal length among all $C^2$-regular curves joining two fixed points $\gamma(t_1) = m_1$ and $\gamma(t_2) = m_2$ in M.

Although this is not strictly needed in the following, we give, for the sake of completeness, the governing equations for geodesics on a surface.

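Proposition 6.10. A curve $\gamma \in C^2([t_1,t_2],M)$ is a geodesic only if

\[
\frac{d^2x^i(t)}{dt^2} + \sum_{j,k} \Gamma^i_{jk}\big(x(t)\big)\frac{dx^j(t)}{dt}\frac{dx^k(t)}{dt} = 0,
\]

where x(t) is the representation of γ in coordinates,

\[
\Gamma^i_{jk} = \frac12 \sum_l g^{il}\big(\partial_{x^k} g_{lj} + \partial_{x^j} g_{lk} - \partial_{x^l} g_{kj}\big)
\]

is the Christoffel symbol of the second kind, and $(g^{ij})$ is the inverse matrix of $(g_{ij})$.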

Proof. We prove this claim only for curves contained in the neighborhood M ∩ V, where we can use the coordinate map z = q(x). We observe that, since the square root is a monotonic function, any $C^2$-regular curve with fixed end-points that minimizes the functional $\ell(\gamma)$ defined in equation (6.47) must equivalently minimize the functional

\[
\tilde\ell(\gamma) = \sum_{i,j} \int_{t_1}^{t_2} g_{ij}\big(x(t)\big)\frac{dx^i}{dt}\frac{dx^j}{dt}\,dt,
\]

where x(t) is the coordinate representation of the curve γ, namely, q(x(t)) = γ(t). Extremizing $\tilde\ell$ defines a first-order Lagrangian theory on the space of functions $t \mapsto x(t) \in U$ of class $C^2$ with fixed end-points. The Lagrangian is

\[
L(y,\eta) = \sum_{i,j} g_{ij}(y)\,\eta^i\eta^j,
\]

and the corresponding Euler-Lagrange equations amount to

\[
\sum_{k,j} \frac{\partial g_{kj}\big(x(t)\big)}{\partial x^i}\frac{dx^k(t)}{dt}\frac{dx^j(t)}{dt}
 - \frac{d}{dt}\Big[\sum_j 2\,g_{ij}\big(x(t)\big)\frac{dx^j(t)}{dt}\Big] = 0.
\]

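In addition,

\[
\begin{aligned}
\frac{d}{dt}\Big[\sum_j g_{ij}\big(x(t)\big)\frac{dx^j(t)}{dt}\Big]
&= \sum_{j,k} \frac{\partial g_{ij}\big(x(t)\big)}{\partial x^k}\frac{dx^j(t)}{dt}\frac{dx^k(t)}{dt}
 + \sum_j g_{ij}\big(x(t)\big)\frac{d^2x^j(t)}{dt^2} \\
&= \sum_{j,k} \frac12\Big(\frac{\partial g_{ij}\big(x(t)\big)}{\partial x^k} + \frac{\partial g_{ik}\big(x(t)\big)}{\partial x^j}\Big)\frac{dx^j(t)}{dt}\frac{dx^k(t)}{dt}
 + \sum_j g_{ij}\big(x(t)\big)\frac{d^2x^j(t)}{dt^2}.
\end{aligned}
\]

One can conclude by substituting the foregoing identity into the Euler-Lagrange equations and multiplying by the matrix $-(1/2)g^{ij}$.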
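The formula for the Christoffel symbols in proposition 6.10 can be checked numerically on a simple case. The following sketch assumes the unit sphere with coordinates x = (θ, φ) and metric $g = \mathrm{diag}(1, \sin^2\theta)$, an example that is not part of the text, and approximates the derivatives of the metric by central finite differences; the function names and the step size h are illustrative choices.

```python
import math

def metric(x):
    """Metric g_ij of the unit sphere in coordinates x = (theta, phi)."""
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix, i.e. the matrix g^{ij}."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(x, k, h=1e-5):
    """Central finite difference of g_ij with respect to x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

def christoffel(x):
    """gamma[i][j][k] = 1/2 sum_l g^{il} (d_k g_lj + d_j g_lk - d_l g_kj)."""
    d = 2
    ginv = inverse_2x2(metric(x))
    dg = [metric_derivative(x, k) for k in range(d)]  # dg[k][i][j] = d_k g_ij
    gamma = [[[0.0] * d for _ in range(d)] for _ in range(d)]
    for i in range(d):
        for j in range(d):
            for k in range(d):
                gamma[i][j][k] = 0.5 * sum(
                    ginv[i][l] * (dg[k][l][j] + dg[j][l][k] - dg[l][k][j])
                    for l in range(d))
    return gamma

theta = 1.0
gamma = christoffel((theta, 0.3))
```

For the sphere the only non-vanishing symbols are $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta$, which the entries `gamma[0][1][1]` and `gamma[1][0][1]` should reproduce up to the finite-difference error.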

Arnold’s formulation of incompressible fluid dynamics [82] gives an interpretation of Euler’s equations (1.55) as the equations for geodesic curves in an infinite-dimensional manifold. The corresponding variational formulation is based on ideas that generalize to the cases of compressible fluids as well as MHD.

We consider a simply connected domain $\Omega \subset \mathbb{R}^3$ and the group of special diffeomorphisms SDiff(Ω), that is, the group of all diffeomorphisms F : Ω → Ω that preserve the volume element as well as the orientation, namely,

\[
dF(x) = |\det DF(x)|\,dx = dx, \quad\text{and}\quad \det DF(x) = 1.
\]

Let us stress that from the fact that F preserves the volume element, it follows $|\det DF(x)| = 1$, which in general implies two possibilities, $\det DF(x) = \pm 1$, but we restrict to cases in which $\det DF = 1$ so that the orientation is preserved and the identity belongs to SDiff(Ω).

A curve in SDiff(Ω) is a map $\gamma : t \mapsto F_t \in \mathrm{SDiff}(\Omega)$ and it defines a vector field over the domain Ω according to

\[
u\big(t, F_t(x)\big) = \frac{d}{dt}F_t(x).
\]

From proposition 1.7 we have $\nabla\cdot u(t,x) = 0$ as a consequence of $F_t$ being volume preserving. Thus, the tangent space $T_F\mathrm{SDiff}(\Omega)$ at F ∈ SDiff(Ω) is the space of divergence-free vector fields.

On SDiff(Ω) we introduce the metric

\[
g(u,v) = \frac12 \int_\Omega u(x)\cdot v(x)\,dx = (u,v)_{L^2(\Omega)}, \tag{6.48}
\]

which is a bi-linear form of $u, v \in T_F\mathrm{SDiff}(\Omega)$.

We define the length of a curve $t \mapsto F_t$ in the same way as in the finite-dimensional case (6.47). It is actually more convenient to consider the integral of the square of the metric as in the proof of proposition 6.10, and thus consider the functional

\[
\tilde\ell(\gamma) = \frac12 \int_{t_1}^{t_2} \big(u(t),u(t)\big)_{L^2(\Omega)}\,dt, \tag{6.49}
\]

where $\gamma : [t_1,t_2] \ni t \mapsto F_t \in \mathrm{SDiff}(\Omega)$ is a curve and $u(t) \in T_{F_t}\mathrm{SDiff}(\Omega)$ is its tangent at $F_t$.

Theorem 6.11 (Arnold [82]). With Ω bounded and simply connected, the Euler-Lagrange equations for (6.49) amount to Euler’s equations (1.55) for incompressible flows with constant mass density ρ = 1 and no external forces, f = 0.

With ρ = 1, one should notice that the functional $\tilde\ell$ is the time integral of the kinetic energy of the fluid.

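Proof. The group SDiff(Ω) does not form a linear space and thus the results of section 6.5 cannot be applied straightforwardly. We consider a certain curve $\gamma : t \mapsto F_t$ together with a family of curves $\gamma_\varepsilon : t \mapsto F_{t,\varepsilon}$ such that $F_{t,\varepsilon}|_{\varepsilon=0} = F_t$. We can now define two vector fields over Ω, namely,

\[
u_\varepsilon(t,x') = \frac{d}{dt}F_{t,\varepsilon}(x), \qquad v_\varepsilon(t,x') = \frac{d}{d\varepsilon}F_{t,\varepsilon}(x),
\]

where in both cases $x' = F_{t,\varepsilon}(x)$. Then $u_\varepsilon(t)$ is the tangent to the curve $\gamma_\varepsilon$ while $v_\varepsilon(t)$ is the variation of the curve along ε.

The functional $\tilde\ell(\gamma)$ has a critical point at γ only if

\[
\frac{d}{d\varepsilon}\tilde\ell(\gamma_\varepsilon)\Big|_{\varepsilon=0} = 0,
\]

for every family of curves $\gamma_\varepsilon$ such that $\gamma_\varepsilon|_{\varepsilon=0} = \gamma$.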

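We compute the derivative of the functional:

\[
\begin{aligned}
\frac{d}{d\varepsilon}\tilde\ell(\gamma_\varepsilon)
&= \frac12 \frac{d}{d\varepsilon} \int_{t_1}^{t_2}\!\int_\Omega \frac{dF_{t,\varepsilon}(x)}{dt}\cdot\frac{dF_{t,\varepsilon}(x)}{dt}\,dx\,dt \\
&= \int_{t_1}^{t_2}\!\int_\Omega \frac{dF_{t,\varepsilon}(x)}{dt}\cdot\frac{d}{d\varepsilon}\frac{dF_{t,\varepsilon}(x)}{dt}\,dx\,dt \\
&= \int_{t_1}^{t_2}\!\int_\Omega \frac{dF_{t,\varepsilon}(x)}{dt}\cdot\frac{d}{dt}\Big[v_\varepsilon\big(t,F_{t,\varepsilon}(x)\big)\Big]\,dx\,dt \\
&= -\int_{t_1}^{t_2}\!\int_\Omega \frac{d}{dt}\Big[u_\varepsilon\big(t,F_{t,\varepsilon}(x)\big)\Big]\cdot v_\varepsilon\big(t,F_{t,\varepsilon}(x)\big)\,dx\,dt \\
&= -\int_{t_1}^{t_2}\!\int_\Omega v_\varepsilon\big(t,F_{t,\varepsilon}(x)\big)\cdot\Big[\partial_t u_\varepsilon\big(t,F_{t,\varepsilon}(x)\big) + u_\varepsilon\big(t,F_{t,\varepsilon}(x)\big)\cdot\nabla u_\varepsilon\big(t,F_{t,\varepsilon}(x)\big)\Big]\,dx\,dt.
\end{aligned}
\]

In the fourth identity we have integrated by parts in time and used the fact that the end-points of the curves $\gamma_\varepsilon$ are fixed, so that $v_\varepsilon$ vanishes at $t_1$ and $t_2$. Evaluating now at ε = 0,

\[
\frac{d}{d\varepsilon}\tilde\ell(\gamma_\varepsilon)\Big|_{\varepsilon=0}
= -\int_{t_1}^{t_2}\!\int_\Omega v(t,x')\cdot\big[\partial_t u(t,x') + u(t,x')\cdot\nabla u(t,x')\big]\,dx'\,dt,
\]

where we have changed variable according to $x' = F_t(x)$ and the transformation has unit Jacobian determinant. The right-hand side has to vanish for all divergence-free vector fields v.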

Since $F_t$ has to preserve the boundary ∂Ω, we must have v · n = 0 on ∂Ω. In dimension d = 3 we can write a divergence-free vector as v = ∇ × w where w is a generic vector with n × w = 0 on ∂Ω, cf. theorem 4.11. Integrating by parts the curl operator,

\[
\frac{d}{d\varepsilon}\tilde\ell(\gamma_\varepsilon)\Big|_{\varepsilon=0}
= -\int_{t_1}^{t_2}\!\int_\Omega w(t,x')\cdot\nabla\times\big[\partial_t u(t,x') + u(t,x')\cdot\nabla u(t,x')\big]\,dx'\,dt,
\]

where the boundary term in the integration by parts vanishes since

\[
\int_{\partial\Omega} \big[n\times w\big]\cdot\big(\partial_t u + u\cdot\nabla u\big)\,dS = 0.
\]

Now w is an arbitrary vector field, so that we can deduce

\[
\nabla\times\big[\partial_t u + u\cdot\nabla u\big] = 0,
\]

and thus there exists a scalar field p such that

\[
\partial_t u + u\cdot\nabla u = -\nabla p,
\]

which is Euler’s equation for incompressible flows.

It is remarkable that in this formulation one finds the pressure p as a direct consequence of the incompressibility. Furthermore we find that the flow $F_t$ corresponding to the fluid velocity u is interpreted as a curve in the group SDiff(Ω). Since the action functional is defined as a function of the flow, Arnold’s variational principle is said to be a Lagrangian variational formulation, with reference to Lagrangian trajectories introduced in section 1.2.

Arnold’s argument can be generalized to compressible flows by accounting for internal energy in the functional [83].

For compressible fluids, the volume element is not preserved by the flow, therefore we need to consider the full group of diffeomorphisms Diff(Ω), but we still restrict to the component of Diff(Ω) which is connected to the identity, i.e., $\det DF > 0$ and orientation is preserved. The tangent space at a point F ∈ Diff(Ω) is determined in the usual way: We take a curve $\gamma : t \mapsto F_t$ and compute

\[
\frac{dF_t(x)}{dt} = u\big(t,F_t(x)\big),
\]

hence $T_F\mathrm{Diff}(\Omega)$ is identified with the space of generic vector fields on Ω. Differently from the incompressible case, now we have $\nabla\cdot u \neq 0$, in general.

We also need to account for a mass density which in general is transported by the flow according to the mass continuity equation. We say that the mass density is passively advected by the flow, cf. appendix D. In a Lagrangian variational principle passively advected quantities are explicitly written in terms of the flow. For the mass density in particular we have

Proposition 6.12. Given a smooth mass density $\rho_0 : \Omega \to \mathbb{R}_{\geq 0}$, and a smooth flow $F_t : \Omega \to \Omega$, we define

\[
\rho(t,x) = \rho_0(y)\big[\det DF_t(y)\big]^{-1}, \qquad x = F_t(y). \tag{6.50}
\]

Then, ρ is the unique smooth solution of the initial-value problem

\[
\partial_t\rho + \nabla\cdot(\rho u) = 0, \qquad \rho(0) = \rho_0.
\]

One should notice that equation (6.50) corresponds to

\[
\rho(t,x)\,dx = \rho_0(y)\,dy,
\]

which is the transformation rule for densities, cf. appendix D.

Proof. If ρ is defined by (6.50), then

\[
\int_{F_t(W)} \rho(t,x)\,dx = \int_W \rho_0(y)\,dy,
\]

for every open set W in Ω, and, upon differentiating with respect to time, the Reynolds transport theorem yields the mass continuity equation. The initial condition $\rho(0) = \rho_0$ follows from $F_0 = \mathrm{Id}$. Vice versa, if ρ is a solution of the initial-value problem, the Reynolds transport theorem implies $\rho(t,x)\,dx = \rho_0(y)\,dy$, which is equivalent to (6.50).
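The mass-conservation statement behind (6.50) can be illustrated in one space dimension, where det DF_t reduces to the derivative of the flow map. The sketch below assumes a hypothetical flow $F_t(y) = y + a\,t\,\sin(2\pi y)/(2\pi)$ on Ω = [0, 1] and a hypothetical initial density $\rho_0(y) = 1 + \tfrac12\cos(2\pi y)$; neither is taken from the text. It evaluates ρ on the transported grid $x_k = F_t(y_k)$ and integrates in x with the trapezoidal rule.

```python
import math

def flow(t, y, a=0.5):
    """Hypothetical 1D flow on [0, 1]; monotone in y as long as |a * t| < 1."""
    return y + a * t * math.sin(2 * math.pi * y) / (2 * math.pi)

def flow_jacobian(t, y, a=0.5):
    """dF_t/dy, the one-dimensional analogue of det DF_t(y)."""
    return 1.0 + a * t * math.cos(2 * math.pi * y)

def rho0(y):
    """Hypothetical initial mass density with unit total mass on [0, 1]."""
    return 1.0 + 0.5 * math.cos(2 * math.pi * y)

def transported_mass(t, n=2000):
    """Trapezoidal approximation of the integral of rho(t, x) dx over [0, 1]."""
    ys = [k / n for k in range(n + 1)]
    xs = [flow(t, y) for y in ys]                       # x_k = F_t(y_k)
    rho = [rho0(y) / flow_jacobian(t, y) for y in ys]   # equation (6.50)
    return sum(0.5 * (rho[k] + rho[k + 1]) * (xs[k + 1] - xs[k])
               for k in range(n))

mass_initial = transported_mass(0.0)
mass_later = transported_mass(1.0)
```

Up to the quadrature error, both values agree with the total initial mass, which is the integrated form of $\rho(t,x)\,dx = \rho_0(y)\,dy$.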

Analogously the pressure is defined in terms of the flow according to

\[
p(t,x) = p_0(y)\big/\big[\det DF_t(y)\big]^{\gamma}, \qquad x = F_t(y), \tag{6.51}
\]

which follows from the equation of state $p = C\rho^\gamma$ together with equation (6.50) for the density.

For compressible flows we extend (6.49) to the functional

\[
S(\gamma) = \int_{t_1}^{t_2}\!\int_\Omega \Big[\frac12\rho(t,x)\,|u(t,x)|^2 - \frac{p(t,x)}{\gamma-1}\Big]\,dx\,dt, \tag{6.52}
\]

where $\gamma : t \mapsto F_t$ is a curve in Diff(Ω) and u(t) is its tangent vector, while ρ and p are given by equations (6.50) and (6.51), respectively. One should notice that internal energy enters in the functional with a minus sign, i.e., internal energy acts as an effective potential in the same way as the electrostatic potential φ acts in the variational principle for charged-particle motion, cf. section 6.4.

Theorem 6.13. The Euler-Lagrange equations for (6.52) amount to Euler’s equations (1.41) with no external force, f = 0.

Proof. As a first step we re-write the action functional in the form

\[
S(\gamma) = \int_{t_1}^{t_2}\!\int_\Omega \Big[\frac12\rho_0(y)\Big|\frac{dF_t(y)}{dt}\Big|^2 - \frac{p_0(y)}{\gamma-1}\big[\det DF_t(y)\big]^{-\gamma+1}\Big]\,dy\,dt,
\]

where $y = F_t^{-1}(x)$ is the initial point of the Lagrangian trajectory passing through x at time t and $dy = dx/\det DF_t(y)$.

We now consider a family of trajectories $\gamma_\varepsilon : t \mapsto F_{t,\varepsilon} \in \mathrm{Diff}(\Omega)$ such that $\gamma_\varepsilon = \gamma$ for ε = 0, and with fixed end-points at $t_1$ and $t_2$. The curve γ is an extremum of the functional S only if

\[
\frac{d}{d\varepsilon}S(\gamma_\varepsilon)\Big|_{\varepsilon=0} = 0,
\]

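for any such family $\gamma_\varepsilon$ of trajectories. We compute the derivative

\[
\frac{d}{d\varepsilon}S(\gamma_\varepsilon) = \int_{t_1}^{t_2}\!\int_\Omega \Big[\rho_0(y)\,u_\varepsilon\big(t,F_{t,\varepsilon}(y)\big)\cdot\frac{d}{dt}v_\varepsilon\big(t,F_{t,\varepsilon}(y)\big)
 + p_0(y)\big[\det DF_{t,\varepsilon}(y)\big]^{-\gamma}\frac{d}{d\varepsilon}\det DF_{t,\varepsilon}(y)\Big]\,dy\,dt,
\]

where the vector fields $u_\varepsilon$ and $v_\varepsilon$ are defined as in the proof of theorem 6.11 for the incompressible case.

By lemma 1.7, we have

\[
\begin{aligned}
\frac{d}{d\varepsilon}S(\gamma_\varepsilon)
&= \int_{t_1}^{t_2}\!\int_\Omega \Big[\rho_0(y)\,u_\varepsilon\big(t,F_{t,\varepsilon}(y)\big)\cdot\frac{d}{dt}v_\varepsilon\big(t,F_{t,\varepsilon}(y)\big)
 + p_0(y)\big[\det DF_{t,\varepsilon}(y)\big]^{-\gamma+1}\,\nabla\cdot v_\varepsilon\big(t,F_{t,\varepsilon}(y)\big)\Big]\,dy\,dt \\
&= \int_{t_1}^{t_2}\!\int_\Omega \Big[\rho_0(y)\,u_\varepsilon\big(t,F_{t,\varepsilon}(y)\big)\cdot\frac{d}{dt}v_\varepsilon\big(t,F_{t,\varepsilon}(y)\big)
 + p(t,x)\,\det DF_{t,\varepsilon}(y)\,\nabla\cdot v_\varepsilon\big(t,F_{t,\varepsilon}(y)\big)\Big]\,dy\,dt,
\end{aligned}
\]

with $x = F_{t,\varepsilon}(y)$ and p given by (6.51).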

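In the first term we integrate by parts in time with the result that

\[
\begin{aligned}
\frac{d}{d\varepsilon}S(\gamma_\varepsilon)
&= \int_{t_1}^{t_2}\!\int_\Omega \Big[-\rho_0(y)\Big[\partial_t u_\varepsilon\big(t,F_{t,\varepsilon}(y)\big) + u_\varepsilon\big(t,F_{t,\varepsilon}(y)\big)\cdot\nabla u_\varepsilon\big(t,F_{t,\varepsilon}(y)\big)\Big]\cdot v_\varepsilon\big(t,F_{t,\varepsilon}(y)\big) \\
&\qquad\qquad + p(t,x)\,\det DF_{t,\varepsilon}(y)\,\nabla\cdot v_\varepsilon\big(t,F_{t,\varepsilon}(y)\big)\Big]\,dy\,dt \\
&= -\int_{t_1}^{t_2}\!\int_\Omega \Big[\rho(t,x)\big(\partial_t u_\varepsilon(t,x) + u_\varepsilon(t,x)\cdot\nabla u_\varepsilon(t,x)\big) + \nabla p(t,x)\Big]\cdot v_\varepsilon(t,x)\,dx\,dt,
\end{aligned}
\]

where we have changed back to x-coordinates in the last identity and integrated by parts the divergence in the second term. The boundary term in the integration by parts vanishes since $v_\varepsilon$ has to be tangent to the boundary.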

At last we have obtained

\[
\frac{d}{d\varepsilon}S(\gamma_\varepsilon)\Big|_{\varepsilon=0}
= -\int_{t_1}^{t_2}\!\int_\Omega \Big[\rho\big(\partial_t u + u\cdot\nabla u\big) + \nabla p\Big]\cdot v\,dx\,dt = 0,
\]

for every vector field v generating the family $\gamma_\varepsilon$ of trajectories. Therefore we have Euler’s equation

\[
\rho\big(\partial_t u + u\cdot\nabla u\big) + \nabla p = 0,
\]

while the continuity equation for ρ and the transport equation for p are identically satisfied by (6.50) and (6.51), respectively.

In the foregoing variational formulations of the equations of fluid dynamics, it is crucial to obtain an explicit expression for the passively advected quantities such as density and pressure. Appendix D provides a brief overview of the geometric meaning of such expressions, cf. proposition D.3 in particular.

6.8 Lagrangian formulation of ideal MHD. We are now ready to state the variational formulation for ideal compressible MHD which is due to Newcomb [81]. Here, we follow Arnold’s approach to fluid dynamics of section 6.7. Therefore we shall obtain again a variational formulation in Lagrangian coordinates. We treat the mass density and the pressure as in section 6.7, precisely equations (6.50) and (6.51), but we have to deal with the magnetic field which is also a passively advected quantity.

Proposition 6.14. Given a smooth magnetic field $B_0 : \Omega \to \mathbb{R}^3$ with $\nabla\cdot B_0 = 0$, and a smooth flow $F_t : \Omega \to \Omega$, we define

\[
B(t,x) = B_0(y)\cdot\nabla F_t(y)\,\big[\det DF_t(y)\big]^{-1}, \qquad x = F_t(y). \tag{6.53}
\]

Then, B is smooth, satisfies $\nabla\cdot B(t,x) = 0$, and is the unique smooth solution of the initial-value problem

\[
\partial_t B - \nabla\times(u\times B) = 0, \qquad B(0) = B_0,
\]

which is the induction equation of ideal MHD (3.50b).

This result should be compared to lemma 4.5 and proposition D.3. With respect to the transport of a vector field in proposition D.3, the presence of the factor $1/\det DF_t(y)$ accounts for flux conservation in a compressible flow.

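Proof. If B is defined by (6.53), then

\[
B\big(t,F_t(y)\big) = B_0(y)\cdot\nabla F_t(y)\big/\det DF_t(y),
\]

and differentiating with respect to t yields

\[
\partial_t B\big(t,F_t(y)\big) + u\big(t,F_t(y)\big)\cdot\nabla B\big(t,F_t(y)\big)
= B_0(y)\cdot\nabla_y\big[u\big(t,F_t(y)\big)\big]\big/\det DF_t(y) - B\big(t,F_t(y)\big)\,\nabla\cdot u\big(t,F_t(y)\big),
\]

where we have used lemma 1.7 for the derivative of the determinant. The first term on the right-hand side can be written as

\[
B_0(y)\cdot\nabla F_t(y)\cdot\nabla u\big(t,F_t(y)\big)\big/\det DF_t(y) = B\big(t,F_t(y)\big)\cdot\nabla u\big(t,F_t(y)\big),
\]

and thus

\[
\partial_t B\big(t,F_t(y)\big) + u\big(t,F_t(y)\big)\cdot\nabla B\big(t,F_t(y)\big)
= B\big(t,F_t(y)\big)\cdot\nabla u\big(t,F_t(y)\big) - B\big(t,F_t(y)\big)\,\nabla\cdot u\big(t,F_t(y)\big).
\]

Upon evaluating at $y = F_t^{-1}(x)$ one has

\[
\partial_t B + u\cdot\nabla B - B\cdot\nabla u + B\,\nabla\cdot u = 0, \tag{6.54}
\]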

which on the other hand implies

\[
\partial_t\nabla\cdot B + \nabla\cdot\big(u\cdot\nabla B\big) - \nabla\cdot\big(B\cdot\nabla u\big) + (\nabla\cdot B)(\nabla\cdot u) + B\cdot\nabla(\nabla\cdot u) = 0,
\]

or equivalently,

\[
\frac{D}{Dt}\nabla\cdot B + (\nabla\cdot u)(\nabla\cdot B) = 0,
\]

where we have used the advective derivative (1.19). Evaluating the latter equation on a Lagrangian trajectory, one finds

\[
\frac{d}{dt}\nabla\cdot B\big(t,F_t(y)\big) + \nabla\cdot u\big(t,F_t(y)\big)\,\nabla\cdot B\big(t,F_t(y)\big) = 0,
\]

which has the unique solution

\[
\nabla\cdot B\big(t,F_t(y)\big) = \nabla\cdot B(0,y)\big/\det DF_t(y).
\]

This can be verified by direct substitution and using lemma 1.7. Since by hypothesis $\nabla\cdot B(0,y) = \nabla\cdot B_0(y) = 0$, it follows that $\nabla\cdot B = 0$ on all Lagrangian trajectories and thus on the whole domain Ω. Equation (6.54) together with $\nabla\cdot B = 0$ is equivalent to

\[
\partial_t B - \nabla\times\big(u\times B\big) = 0,
\]

which is the ideal MHD induction equation. Vice versa, let B be a smooth solution of the ideal MHD induction equation with a divergence-free initial condition; then B is a divergence-free field and satisfies equation (6.54), cf. section 3.5. We define, for each Lagrangian trajectory $x = F_t(y)$,

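\[
w(t) = B\big(t,F_t(y)\big) - \frac{B_0(y)\cdot\nabla F_t(y)}{\det DF_t(y)},
\]

and we compute

\[
\begin{aligned}
\frac{dw(t)}{dt}
&= \partial_t B\big(t,F_t(y)\big) + u\big(t,F_t(y)\big)\cdot\nabla B\big(t,F_t(y)\big)
 - \frac{B_0(y)\cdot\nabla F_t(y)}{\det DF_t(y)}\cdot\nabla u\big(t,F_t(y)\big)
 + \frac{B_0(y)\cdot\nabla F_t(y)}{\det DF_t(y)}\,\nabla\cdot u\big(t,F_t(y)\big) \\
&= B\big(t,F_t(y)\big)\cdot\nabla u\big(t,F_t(y)\big) - B\big(t,F_t(y)\big)\,\nabla\cdot u\big(t,F_t(y)\big)
 - \frac{B_0(y)\cdot\nabla F_t(y)}{\det DF_t(y)}\cdot\nabla u\big(t,F_t(y)\big)
 + \frac{B_0(y)\cdot\nabla F_t(y)}{\det DF_t(y)}\,\nabla\cdot u\big(t,F_t(y)\big) \\
&= w(t)\cdot\nabla u\big(t,F_t(y)\big) - w(t)\,\nabla\cdot u\big(t,F_t(y)\big),
\end{aligned}
\]

where we have used equation (6.54) in the second identity. Since in addition w(0) = 0, on all Lagrangian trajectories we have that w(t) satisfies a linear ordinary differential equation with zero initial condition, hence,

\[
w(t) = B\big(t,F_t(y)\big) - \frac{B_0(y)\cdot\nabla F_t(y)}{\det DF_t(y)} = 0,
\]

which is (6.53).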
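Formula (6.53) can be tested on a flow for which everything is explicit. The sketch below assumes a hypothetical linear straining flow $F_t(y) = (e^{a_1 t}y_1, e^{a_2 t}y_2, e^{a_3 t}y_3)$, so that $u(x) = (a_1x_1, a_2x_2, a_3x_3)$, together with a uniform initial field $B_0$; the rates and field values are arbitrary illustrative numbers. Since B stays uniform in space, the term $u\cdot\nabla B$ drops out of (6.54), and the time derivative is approximated by a central finite difference.

```python
import math

RATES = (0.3, -0.1, 0.4)   # hypothetical strain rates a_i, u_i(x) = a_i * x_i
B0 = (1.0, 2.0, -0.5)      # hypothetical uniform (hence divergence-free) field

def magnetic_field(t):
    """B(t) from (6.53): (B0 . grad F_t)_i = B0_i exp(a_i t), det DF_t = exp(sum(a) t)."""
    det = math.exp(sum(RATES) * t)
    return tuple(b * math.exp(a * t) / det for a, b in zip(RATES, B0))

def residual(t, h=1e-5):
    """Residual of (6.54); u . grad B vanishes because B is uniform in space."""
    b = magnetic_field(t)
    b_plus, b_minus = magnetic_field(t + h), magnetic_field(t - h)
    dbdt = tuple((p - m) / (2 * h) for p, m in zip(b_plus, b_minus))
    div_u = sum(RATES)
    # (B . grad u)_i = a_i * B_i for the diagonal velocity gradient
    return tuple(db - a * bi + bi * div_u
                 for db, a, bi in zip(dbdt, RATES, b))

res = residual(0.7)
```

Each component of `res` should vanish up to the finite-difference error, which is consistent with (6.53) solving equation (6.54) for this particular flow.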

We are now in the position of formulating Newcomb’s variational principle of ideal MHD. The action functional is obtained by addition of the magnetic energy as a potential energy term in the functional (6.52) for compressible fluids. As a result one gets

\[
S(\gamma) = \int_{t_1}^{t_2}\!\int_\Omega \Big[\frac12\rho(t,x)\,|u(t,x)|^2 - \frac{p(t,x)}{\gamma-1} - \frac{|B(t,x)|^2}{8\pi}\Big]\,dx\,dt, \tag{6.55}
\]

where $\gamma : [t_1,t_2] \ni t \mapsto F_t \in \mathrm{Diff}(\Omega)$ is a curve in the component of Diff(Ω) connected to the identity, the mass density ρ and the pressure p are defined by equations (6.50) and (6.51) as for compressible flows, while the magnetic field B is defined by equation (6.53).

Theorem 6.15. The Euler-Lagrange equations for (6.55) amount to the ideal MHD equations (3.50b) without gravitational acceleration, g = 0.

This is just the statement, in the framework proposed by Arnold, of the result obtained by Newcomb [81].

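Proof. Let $\gamma_\varepsilon$ be a one-parameter family of trajectories in Diff(Ω) with $\gamma_\varepsilon|_{\varepsilon=0} = \gamma$ as in the proof of theorem 6.13. Then γ is an extremum of the functional (6.55) only if

\[
\frac{d}{d\varepsilon}S(\gamma_\varepsilon)\Big|_{\varepsilon=0}
= \frac{d}{d\varepsilon}S_1(\gamma_\varepsilon)\Big|_{\varepsilon=0} + \frac{d}{d\varepsilon}S_2(\gamma_\varepsilon)\Big|_{\varepsilon=0} = 0,
\]

where

\[
S_1(\gamma) = \int_{t_1}^{t_2}\!\int_\Omega \Big[\frac12\rho(t,x)\,|u(t,x)|^2 - \frac{p(t,x)}{\gamma-1}\Big]\,dx\,dt,
\qquad
S_2(\gamma) = -\int_{t_1}^{t_2}\!\int_\Omega \frac{|B(t,x)|^2}{8\pi}\,dx\,dt.
\]

Particularly, $S_1$ is the same as (6.52), hence

\[
\frac{d}{d\varepsilon}S_1(\gamma_\varepsilon)\Big|_{\varepsilon=0}
= -\int_{t_1}^{t_2}\!\int_\Omega \Big[\rho\big(\partial_t u + u\cdot\nabla u\big) + \nabla p\Big]\cdot v\,dx\,dt.
\]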

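For the magnetic field energy, we write

\[
S_2(\gamma_\varepsilon) = -\frac{1}{8\pi}\int_{t_1}^{t_2}\!\int_\Omega \frac{\big|B_0(y)\cdot\nabla F_{t,\varepsilon}(y)\big|^2}{\det DF_{t,\varepsilon}(y)}\,dy\,dt,
\]

where we have changed variable to $y = F_{t,\varepsilon}^{-1}(x)$ with $dy = dx/\det DF_{t,\varepsilon}(y)$. The derivative with respect to the parameter reads

\[
\begin{aligned}
\frac{d}{d\varepsilon}S_2(\gamma_\varepsilon)
&= -\int_{t_1}^{t_2}\!\int_\Omega \Big[\frac{B\big(t,F_{t,\varepsilon}(y)\big)}{4\pi}\cdot B_0(y)\cdot\nabla_y\big[v_\varepsilon\big(t,F_{t,\varepsilon}(y)\big)\big]
 - \frac{\big|B\big(t,F_{t,\varepsilon}(y)\big)\big|^2}{8\pi}\,\det DF_{t,\varepsilon}(y)\,\nabla\cdot v_\varepsilon\big(t,F_{t,\varepsilon}(y)\big)\Big]\,dy\,dt \\
&= -\int_{t_1}^{t_2}\!\int_\Omega \Big[\frac{B(t,x)\otimes B(t,x)}{4\pi} : \nabla v_\varepsilon(t,x) - \frac{|B(t,x)|^2}{8\pi}\,\nabla\cdot v_\varepsilon(t,x)\Big]\,dx\,dt,
\end{aligned}
\]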

where we changed back to x-coordinates. At last we evaluate at ε = 0 and integrate by parts in space, with the result that

\[
\frac{d}{d\varepsilon}S_2(\gamma_\varepsilon)\Big|_{\varepsilon=0}
= \int_{t_1}^{t_2}\!\int_\Omega \Big[\nabla\cdot\Big(\frac{B\otimes B}{4\pi}\Big) - \nabla\Big(\frac{|B|^2}{8\pi}\Big)\Big]\cdot v\,dx\,dt,
\]

and we recognize the magnetic stress tensor. The combination of the fluid and the magnetic parts of the functional yields

\[
\frac{d}{d\varepsilon}S(\gamma_\varepsilon)\Big|_{\varepsilon=0}
= -\int_{t_1}^{t_2}\!\int_\Omega \Big[\rho\big(\partial_t u + u\cdot\nabla u\big) + \nabla\Big(p + \frac{|B|^2}{8\pi}\Big) - \nabla\cdot\Big(\frac{B\otimes B}{4\pi}\Big)\Big]\cdot v\,dx\,dt = 0,
\]

which is equivalent to the momentum equation in (3.50b). In view of propositions 6.12 and 6.14, mass continuity, pressure, and induction equations are automatically satisfied by the passively advected quantities.

For both cases of compressible fluids and MHD, we can think of the flow $F_t$ as the motion of a point in Diff(Ω), with kinetic energy given by the kinetic energy of the fluid and subject to potential energies given by the internal energy only for the case of ordinary fluids and by the sum of internal and magnetic energy for MHD flows.

7 Hamiltonian formulation

7.1 Introduction to Hamiltonian systems.

7.2 Hamiltonian structure of ideal MHD.

7.3 Metriplectic systems and dissipation.


A Proofs of the results on kinetic theory and closure

This appendix completes section 1.6 with technical results and gives the proofs of propositions 1.13-1.15 concerning the solution for the first-order corrector in the Hilbert series and the corresponding contributions to the viscosity tensor and the heat-flux vector.

Since the lowest-order distribution function $f_0$ is a Maxwellian, cf. equation (1.38), we start with the computation of some Gaussian integrals in dimension d = 3. The generalization to arbitrary dimensions is straightforward. Let us recall the definition of the Gaussian measure in three dimensions,

\[
d\mu_M(\xi) = \pi^{-3/2}e^{-\xi^2}\,d\xi,
\]

which has been introduced in the proof of proposition 1.12.

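Lemma A.1. Calculation of Gaussian integrals:

1. Basic Gaussian integral:
\[ \int_{\mathbb{R}} e^{-x^2}\,dx = \pi^{1/2}. \]

2. Odd moments: for any integer n ≥ 0,
\[ \int_{\mathbb{R}} x^{2n+1}e^{-x^2}\,dx = 0. \]

3. Even moments: for any integer n ≥ 1,
\[ \int_{\mathbb{R}} x^{2n}e^{-x^2}\,dx = \frac{\pi^{1/2}}{2^n}(2n-1)!!. \]

4. Normalization:
\[ \int_{\mathbb{R}^3} d\mu_M(\xi) = 1. \]

5. Moments of $\xi^2$: for any integer n ≥ 1,
\[ \int_{\mathbb{R}^3} |\xi|^{2n}\,d\mu_M(\xi) = \frac{(2n+1)!!}{2^n}. \]

6. Integral related to viscosity: with I the identity tensor,
\[ \int_{\mathbb{R}^3} \xi\otimes\xi\,d\mu_M(\xi) = (1/2)I. \]

7. Integral related to a fourth-order moment:
\[ \int_{\mathbb{R}^3} \big(\xi^2 - 3/2\big)^2\,d\mu_M(\xi) = 3/2. \]

8. Integral related to a tensor fourth-order moment:
\[ \int_{\mathbb{R}^3} \xi\otimes\xi\,(\xi^2 - 3/2)\,d\mu_M(\xi) = (1/2)I. \]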
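Before going through the proof, items 3 and 5 can be checked numerically. The following sketch uses a composite Simpson rule on a truncated interval; the truncation at |x| = 12, the number of subintervals, and the function names are illustrative assumptions rather than part of the text.

```python
import math

def double_factorial(k):
    """k!! for odd k >= -1, with the convention (-1)!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def even_moment(n):
    """Integral of x^(2n) exp(-x^2) over R, truncated to [-12, 12] (item 3)."""
    return simpson(lambda x: x ** (2 * n) * math.exp(-x * x), -12.0, 12.0)

def radial_moment(n):
    """Integral of |xi|^(2n) d mu_M, computed in spherical coordinates (item 5)."""
    radial = simpson(lambda r: r ** (2 * n + 2) * math.exp(-r * r), 0.0, 12.0)
    return 4 * math.pi * radial / math.pi ** 1.5

def even_moment_exact(n):
    return math.sqrt(math.pi) * double_factorial(2 * n - 1) / 2 ** n

def radial_moment_exact(n):
    return double_factorial(2 * n + 1) / 2 ** n
```

Comparing `even_moment(n)` with `even_moment_exact(n)` and `radial_moment(n)` with `radial_moment_exact(n)` for small n gives agreement up to the quadrature error.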


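Proof. 1. The square of the basic Gaussian integral is

\[
\Big(\int_{\mathbb{R}} e^{-x^2}\,dx\Big)^2 = \int_{\mathbb{R}^2} e^{-x^2-y^2}\,dx\,dy = 2\pi\int_0^{+\infty} e^{-r^2}\,r\,dr,
\]

where we have used polar coordinates so that $r^2 = x^2 + y^2$. The remaining integral is

\[
\int_0^{+\infty} 2r\,e^{-r^2}\,dr = -\int_0^{+\infty} \frac{d}{dr}e^{-r^2}\,dr = 1,
\]

and one obtains the claimed identity by taking the square root.

2. Odd moments vanish because of the symmetry under inversion $x \mapsto -x$ of the Gauss function.

3. Even moments can be computed from the identity

\[
x^{2n}e^{-\lambda x^2} = (-1)^n\frac{d^n}{d\lambda^n}e^{-\lambda x^2},
\]

for $\lambda > \lambda_0$ with some strictly positive lower bound $\lambda_0 \in (0,1)$. Since we have $e^{-\lambda x^2} \leq e^{-\lambda_0 x^2}$ uniformly for $\lambda \geq \lambda_0$, we can differentiate under the integral sign, thus computing

\[
\int_{\mathbb{R}} x^{2n}e^{-\lambda x^2}\,dx = (-1)^n\frac{d^n}{d\lambda^n}\Big(\int_{\mathbb{R}} e^{-\lambda x^2}\,dx\Big)
= (-1)^n\frac{d^n}{d\lambda^n}\Big(\lambda^{-1/2}\int_{\mathbb{R}} e^{-y^2}\,dy\Big),
\]

with new variables $y = \lambda^{1/2}x$. The remaining integral is $\pi^{1/2}$ and we have to evaluate the derivatives of $\lambda^{-1/2}$ for which we claim

\[
\frac{d^n}{d\lambda^n}\big(\lambda^{-1/2}\big) = (-1)^n\,\frac12\,\frac32\cdots\frac{2n-1}{2}\,\lambda^{-\frac{2n+1}{2}} = (-1/2)^n(2n-1)!!\,\lambda^{-\frac{2n+1}{2}}.
\]

We can check this directly for n = 1 and prove the general case by induction. The inductive step is

\[
\frac{d^{n+1}}{d\lambda^{n+1}}\big(\lambda^{-1/2}\big) = \frac{d}{d\lambda}\frac{d^n}{d\lambda^n}\big(\lambda^{-1/2}\big)
= (-1/2)^n(2n-1)!!\,\frac{d}{d\lambda}\big(\lambda^{-\frac{2n+1}{2}}\big)
= (-1/2)^n(2n-1)!!\Big(-\frac{2n+1}{2}\Big)\lambda^{-\frac{2n+3}{2}},
\]

which is the claimed identity with n replaced by n + 1. Therefore we have

\[
\int_{\mathbb{R}} x^{2n}e^{-\lambda x^2}\,dx = \frac{\pi^{1/2}}{2^n}(2n-1)!!\,\lambda^{-\frac{2n+1}{2}},
\]

which gives the thesis for λ = 1.

4. By Fubini’s theorem, the integral splits into the product of three integrals of the form 1., each giving a factor $\pi^{1/2}$.

5. A short way to prove this identity in three dimensions makes use of spherical coordinates for $\xi \in \mathbb{R}^3$. If $r = |\xi|$,

\[
\int_{\mathbb{R}^3} |\xi|^{2n}\,d\mu_M(\xi) = \frac{4\pi}{\pi^{3/2}}\int_0^{+\infty} r^{2n}e^{-r^2}\,r^2\,dr = \frac{2}{\pi^{1/2}}\int_{-\infty}^{+\infty} r^{2(n+1)}e^{-r^2}\,dr,
\]

and identity 3. with n replaced by n + 1 concludes the proof.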

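6. We observe that the off-diagonal components $\xi_i\xi_j$ for $i \neq j$ give zero since the integral splits into the product of two factors of the form 2. with n = 0. As for the diagonal terms,

\[
\delta_{ij}\,\pi^{-3/2}\int_{\mathbb{R}^3} \xi_i^2\,e^{-\xi^2}\,d\xi
= \delta_{ij}\Big(\pi^{-\frac12}\int_{\mathbb{R}} x^2e^{-x^2}\,dx\Big)\Big(\pi^{-\frac12}\int_{\mathbb{R}} e^{-x^2}\,dx\Big)^2 = \frac12\,\delta_{ij},
\]

where we have used 1. and 3. with n = 1.

7. It is convenient to use the identity $(\xi^2 - \tfrac32)^2 = (\xi^2 - \tfrac32)\xi^2 - \tfrac32(\xi^2 - \tfrac32)$ and then use 5.; we notice that the integral of the second term is zero.

8. In view of the symmetry of the Gaussian measure under reflection $\xi \mapsto -\xi$, the only non-zero components of the tensor are the diagonal entries

\[
\begin{aligned}
\int_{\mathbb{R}^3} \xi_i^2\big(\xi^2 - \tfrac32\big)\,d\mu_M(\xi)
&= \int_{\mathbb{R}^3} \xi_i^2\,\xi^2\,d\mu_M(\xi) - \frac32\int_{\mathbb{R}^3} \xi_i^2\,d\mu_M(\xi) \\
&= \int_{\mathbb{R}^3} \xi_i^2\,\xi^2\,d\mu_M(\xi) - \frac32\,\pi^{-1/2}\int_{\mathbb{R}} \xi_i^2\,e^{-\xi_i^2}\,d\xi_i \\
&= \int_{\mathbb{R}^3} \xi_i^2\,\xi^2\,d\mu_M(\xi) - \frac34,
\end{aligned}
\]

where we have used the result of item 3. with n = 1. The remaining integral amounts to

\[
\frac13\int_{\mathbb{R}^3} |\xi|^4\,d\mu_M(\xi) = \frac54,
\]

where we have used the result of item 5. of lemma A.1 with n = 2. Hence, each diagonal entry of the tensor is equal to 1/2 as claimed.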
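The arithmetic behind items 6 to 8 can be replayed with exact rational numbers. The sketch below assumes only the normalized one-dimensional moments $\pi^{-1/2}\int x^{2k}e^{-x^2}dx = (2k-1)!!/2^k$ from item 3 and the product structure of the Gaussian measure; the variable names are illustrative.

```python
from fractions import Fraction

def m(k):
    """Normalized 1D moment pi^(-1/2) * int x^(2k) exp(-x^2) dx = (2k-1)!!/2^k."""
    value = Fraction(1)
    for odd in range(1, 2 * k, 2):
        value *= Fraction(odd, 2)
    return value

second = m(1)                          # <xi_i^2>
fourth_mixed = m(2) + 2 * m(1) ** 2    # <xi_i^2 |xi|^2> = <xi_i^4> + 2 <xi_i^2><xi_j^2>
fourth_total = 3 * fourth_mixed        # <|xi|^4>, sum over the three components

item6_diagonal = second
item7 = fourth_total - 3 * (3 * second) + Fraction(9, 4)   # <(|xi|^2 - 3/2)^2>
item8_diagonal = fourth_mixed - Fraction(3, 2) * second    # <xi_i^2 (|xi|^2 - 3/2)>
```

The three results reproduce the values 1/2, 3/2, and 1/2 stated in items 6, 7, and 8, and `fourth_total` matches item 5 with n = 2.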

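A.1 Proof of proposition 1.13 We start by considering a distribution function f given as

\[
f = f_0 + \varepsilon f_1,
\]

and compute the corresponding moments n, u, and T as defined by

\[
\begin{pmatrix} n \\ nu \\ nu^2 + 3nk_BT/m \end{pmatrix}
= \int_{\mathbb{R}^3} \begin{pmatrix} 1 \\ v \\ |v|^2 \end{pmatrix} f_0(t,x,v)\,dv
+ \varepsilon\int_{\mathbb{R}^3} \begin{pmatrix} 1 \\ v \\ |v|^2 \end{pmatrix} f_1(t,x,v)\,dv.
\]

We look for solutions in the form of a polynomial in ε and we need the first two terms only. For the number density this gives

\[
n = n_0 + \varepsilon n_1, \qquad n_j = \int_{\mathbb{R}^3} f_j\,dv, \tag{A.1}
\]

and this is an exact identity. The average velocity u is determined by

\[
n_0u_0 + \varepsilon(n_0u_1 + n_1u_0) + O(\varepsilon^2) = \int_{\mathbb{R}^3} v f_0\,dv + \varepsilon\int_{\mathbb{R}^3} v f_1\,dv,
\]

so that $u_0$ is the average velocity corresponding to the distribution $f_0$, and from the O(ε)-term we have

\[
u_1 = \frac{1}{n_0}\int_{\mathbb{R}^3} (v - u_0)\,f_1\,dv, \tag{A.2}
\]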

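where we have used the expression for $n_1$ in terms of $f_1$. Analogously the total energy density amounts to

\[
nu^2 + 3nk_BT/m = n_0u_0^2 + 3n_0k_BT_0/m
+ \varepsilon\big[n_1u_0^2 + 2n_0u_0\cdot u_1 + 3n_0k_BT_1/m + 3n_1k_BT_0/m\big] + O(\varepsilon^2),
\]

which should be equal to

\[
\int_{\mathbb{R}^3} v^2 f_0\,dv + \varepsilon\int_{\mathbb{R}^3} (v - u_0 + u_0)^2 f_1\,dv
= \int_{\mathbb{R}^3} v^2 f_0\,dv + \varepsilon\Big[n_1u_0^2 + 2n_0u_0\cdot u_1 + \int_{\mathbb{R}^3} (v - u_0)^2 f_1\,dv\Big].
\]

We deduce

\[
T_1 = \frac{m}{3n_0k_B}\int_{\mathbb{R}^3} (v - u_0)^2 f_1\,dv - \frac{T_0n_1}{n_0}
= \frac{T_0}{n_0}\int_{\mathbb{R}^3} \Big[\frac{m(v - u_0)^2}{3k_BT_0} - 1\Big] f_1\,dv. \tag{A.3}
\]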

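We can now compute the first-order term in the expansion of M(f), namely,

\[
\begin{aligned}
M(f_0 + \varepsilon f_1)
&= n\Big(\frac{m}{2\pi k_BT}\Big)^{\frac32} e^{-\frac{m(v-u)^2}{2k_BT}} \\
&= (n_0 + \varepsilon n_1)\Big(\frac{m}{2\pi k_BT_0}\Big)^{\frac32}\Big(1 - \varepsilon\,\frac32\frac{T_1}{T_0} + O(\varepsilon^2)\Big) \\
&\quad\times\Big(1 + \varepsilon\Big[\frac{m}{k_BT_0}(v - u_0)\cdot u_1 + \frac{m(v - u_0)^2}{2k_BT_0}\frac{T_1}{T_0}\Big] + O(\varepsilon^2)\Big)\,e^{-\frac{m(v-u_0)^2}{2k_BT_0}} \\
&= M(f_0) + \varepsilon M(f_0)\Big[\frac{n_1}{n_0} + \frac{m(v - u_0)}{k_BT_0}\cdot u_1 + \Big(\frac{m(v - u_0)^2}{2k_BT_0} - \frac32\Big)\frac{T_1}{T_0}\Big] + O(\varepsilon^2),
\end{aligned}
\]

and by definition $M(f_0)$ is the Maxwellian distribution with moments given by $n_0$, $u_0$, and $T_0$. By hypothesis $f_0$ is a Maxwellian, hence $M(f_0) = f_0$. Furthermore, the relation

\[
f_1 = f_0\,g_1
\]

defines an element $g_1$ of the space V, defined in section 1.6, for any integrable distribution $f_1$. The first-order corrections $n_1$, $u_1$, and $T_1$ to the moments have been obtained above in terms of $f_1$ so that

\[
\begin{aligned}
M(f_0 + \varepsilon f_1) = M(f_0) + \varepsilon M(f_0)\Big[&\langle g_1\rangle + \frac{m(v - u_0)}{k_BT_0}\cdot\langle (v - u_0)g_1\rangle \\
&+ \Big(\frac{m(v - u_0)^2}{2k_BT_0} - \frac32\Big)\Big\langle\Big(\frac{m(v - u_0)^2}{3k_BT_0} - 1\Big)g_1\Big\rangle\Big] + O(\varepsilon^2),
\end{aligned}
\]

where we have introduced the average operator [43]

\[
\langle g\rangle = \frac{1}{n_0}\int_{\mathbb{R}^3} g\,f_0\,dv,
\]

for all g ∈ V. We deduce the first-order term in the Hilbert expansion of M(f), that is,

\[
M_1(f_0,f_1) = M(f_0)\,\Pi(g_1) = f_0\,\Pi(g_1),
\]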

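with $\Pi : V \to V_0$ given in proposition 1.13. At last we need to check the identity $\Pi^2 = \Pi$. For every g ∈ V we have $\Pi(g) \in V_0$ by definition of Π, and thus the following lemma completes the proof of the proposition.

Lemma A.2. For every $g \in V_0$, Π(g) = g.

Proof. A generic element $g \in V_0$ is a linear combination of the form

\[
g = a + \frac{m(v - u_0)}{k_BT_0}\cdot b + \Big(\frac{m(v - u_0)^2}{2k_BT_0} - \frac32\Big)c,
\]

where $a = a(t,x) \in \mathbb{R}$, $b = b(t,x) \in \mathbb{R}^3$ and $c = c(t,x) \in \mathbb{R}$. The Gaussian integrals in lemma A.1 give

\[
\langle g\rangle = a, \qquad
\langle (v - u_0)g\rangle = \frac{m}{k_BT_0}\,b\cdot\langle (v - u_0)\otimes(v - u_0)\rangle = 2b\cdot\int_{\mathbb{R}^3} \xi\otimes\xi\,d\mu_M(\xi) = b,
\]

while

\[
\Big\langle\Big(\frac{m(v - u_0)^2}{3k_BT_0} - 1\Big)g\Big\rangle
= \frac{2c}{3}\Big\langle\Big(\frac{m(v - u_0)^2}{2k_BT_0} - \frac32\Big)^2\Big\rangle
= \frac{2c}{3}\int_{\mathbb{R}^3} \big(\xi^2 - 3/2\big)^2\,d\mu_M(\xi) = c,
\]

from which we have Π(g) = g.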

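A.2 Proof of proposition 1.14 First we need to check when the right-hand side of equation (1.45) satisfies the solvability condition (1.46). With $f_0$ given by the Maxwellian distribution function (1.38) and with an acceleration field of the form (1.29), we compute

\[
\begin{aligned}
f_0h_1 &= \partial_t f_0 + v\cdot\nabla_x f_0 + a\cdot\nabla_v f_0 \\
&= f_0\Big[\Big(\frac{\partial_t n_0 + v\cdot\nabla n_0}{n_0}\Big) + \frac{m}{k_BT_0}(v - u_0)\cdot\big(\partial_t u_0 + v\cdot\nabla u_0 - a\big)
 + \Big(\frac{\partial_t T_0 + v\cdot\nabla T_0}{T_0}\Big)\Big(\frac{m(v - u_0)^2}{2k_BT_0} - \frac32\Big)\Big] \\
&= f_0\Big[\Big(\frac{\partial_t n_0 + u_0\cdot\nabla n_0}{n_0}\Big) + \frac{m}{n_0k_BT_0}(v - u_0)\cdot\Big(n_0(\partial_t u_0 + u_0\cdot\nabla u_0 - f_0) + \frac{k_BT_0}{m}\nabla n_0\Big) \\
&\qquad + \Big(\frac{\partial_t T_0 + u_0\cdot\nabla T_0}{T_0}\Big)\Big(\frac{m(v - u_0)^2}{2k_BT_0} - \frac32\Big) + \frac{m}{k_BT_0}\nabla u_0 : (v - u_0)\otimes(v - u_0) \\
&\qquad + \frac{\nabla T_0}{T_0}\cdot(v - u_0)\Big(\frac{m(v - u_0)^2}{2k_BT_0} - \frac32\Big)\Big],
\end{aligned}
\]

so that $h_1$ amounts to a polynomial in $(v - u_0)$ of third degree. In the last equality we have accounted for the identity, cf. section 1.6,

\[
a = f_0 + (v - u_0)\times b_0,
\]

where $f_0 = a_0 + u_0\times b_0$ inside the round brackets denotes the force per unit mass, computed from equation (1.39c) with the distribution $f_0$.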

Lemma A.3. Let us assume that the acceleration field a = a(t, x, v) is of the form (1.29). Then, the function $h_1$ defined above satisfies the condition $\pi(h_1) = 0$ if and only if $n_0$, $u_0$, and $T_0$ satisfy the compressible Euler’s equations (1.41).

Proof. Since the functions spanning the subspace $V_0$ are linearly independent, the solvability condition $\pi(h_1) = 0$ splits into three independent conditions,

\[
\langle h_1\rangle = 0, \qquad
\langle (v - u_0)h_1\rangle = 0, \qquad
\Big\langle\Big(\frac{m(v - u_0)^2}{3k_BT_0} - 1\Big)h_1\Big\rangle = 0.
\]

In computing the first average we notice that the only non-zero contributions come from the coefficients of 1 and $(v - u_0)\otimes(v - u_0)$ so that

\[
\langle h_1\rangle = \frac{\partial_t n_0 + u_0\cdot\nabla n_0}{n_0} + \frac{m}{k_BT_0}\nabla u_0 : \big\langle (v - u_0)\otimes(v - u_0)\big\rangle,
\]

and

\[
\frac{m}{k_BT_0}\big\langle (v - u_0)\otimes(v - u_0)\big\rangle = 2\int_{\mathbb{R}^3} \xi\otimes\xi\,d\mu_M(\xi) = I,
\]

in virtue of item 6. in lemma A.1. Then,

\[
\langle h_1\rangle = \frac{\partial_t n_0 + u_0\cdot\nabla n_0}{n_0} + \nabla\cdot u_0 = 0,
\]

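which is equivalent to the mass continuity equation in (1.40). We continue with the computation of the next average, that is,

\[
\begin{aligned}
\langle (v - u_0)h_1\rangle
&= \frac{m}{n_0k_BT_0}\langle (v - u_0)\otimes(v - u_0)\rangle\cdot\Big(n_0(\partial_t u_0 + u_0\cdot\nabla u_0 - f_0) + \frac{k_BT_0}{m}\nabla n_0\Big) \\
&\quad + \frac{\nabla T_0}{T_0}\cdot\Big\langle (v - u_0)\otimes(v - u_0)\Big(\frac{m(v - u_0)^2}{2k_BT_0} - \frac32\Big)\Big\rangle \\
&= \frac{1}{n_0}\Big(n_0(\partial_t u_0 + u_0\cdot\nabla u_0 - f_0) + \frac{k_BT_0}{m}\nabla n_0\Big)
 + \frac{2k_B\nabla T_0}{m}\cdot\Big(\int_{\mathbb{R}^3} \xi\otimes\xi\,(\xi^2 - 3/2)\,d\mu_M(\xi)\Big) \\
&= \frac{1}{n_0}\Big(n_0(\partial_t u_0 + u_0\cdot\nabla u_0 - f_0) + \frac{k_BT_0}{m}\nabla n_0 + \frac{n_0k_B}{m}\nabla T_0\Big),
\end{aligned}
\]

where we have used the integrals computed in items 6. and 8. of lemma A.1. We recognize that $\langle (v - u_0)h_1\rangle = 0$ is the momentum balance equation in (1.41).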

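The last condition reads

\[
\begin{aligned}
\Big\langle\Big(\frac{m(v - u_0)^2}{3k_BT_0} - 1\Big)h_1\Big\rangle
&= \frac23\Big(\frac{\partial_t T_0 + u_0\cdot\nabla T_0}{T_0}\Big)\Big\langle\Big(\frac{m(v - u_0)^2}{2k_BT_0} - \frac32\Big)^2\Big\rangle \\
&\quad + \frac{m}{k_BT_0}\nabla u_0 : \Big\langle (v - u_0)\otimes(v - u_0)\Big(\frac{m(v - u_0)^2}{3k_BT_0} - 1\Big)\Big\rangle \\
&= \frac23\Big(\frac{\partial_t T_0 + u_0\cdot\nabla T_0}{T_0}\Big)\int_{\mathbb{R}^3} \big(\xi^2 - \tfrac32\big)^2\,d\mu_M(\xi)
 + \frac43\nabla u_0 : \int_{\mathbb{R}^3} \xi\otimes\xi\,\big(\xi^2 - \tfrac32\big)\,d\mu_M(\xi) \\
&= \Big(\frac{\partial_t T_0 + u_0\cdot\nabla T_0}{T_0}\Big) + \frac23\nabla\cdot u_0 = 0,
\end{aligned}
\]

where we have used items 7. and 8. of lemma A.1. Together with the continuity equation for $n_0$, this is equivalent to the pressure equation in system (1.41).

With the solvability condition satisfied, the solution of equation (1.45) can be checked as discussed in section 1.6 before the statement of the proposition.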

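A.3 Proof of proposition 1.15. With the zeroth-order distribution $f_0$ defined in equation (1.38), where $n_0$, $u_0$, and $T_0$ are taken to be solutions to the compressible Euler's equations, the first-order correction given in proposition 1.14 is

\[
g_1 = g_{1,0} - \nu_0^{-1} h_1,
\]

where $g_{1,0}$ is an arbitrary element of the subspace $V_0$. In view of the fact that $n_0$, $u_0$, and $T_0$ solve the compressible Euler's equations, we can re-write $h_1$ as

\[
\begin{aligned}
h_1 &= -\nabla \cdot u_0 - \frac{\nabla T_0}{T_0} \cdot (v - u_0) - \frac{2}{3} \nabla \cdot u_0 \Big( \frac{m (v - u_0)^2}{2 k_B T_0} - \frac{3}{2} \Big) \\
&\quad + \frac{m}{k_B T_0} \nabla u_0 : (v - u_0) \otimes (v - u_0) + \frac{\nabla T_0}{T_0} \cdot (v - u_0) \Big( \frac{m (v - u_0)^2}{2 k_B T_0} - \frac{3}{2} \Big) \\
&= \frac{m}{k_B T_0} \Big[ (v - u_0) \otimes (v - u_0) - \frac{1}{3} (v - u_0)^2 I \Big] : \nabla u_0
 + \frac{\nabla T_0}{T_0} \cdot (v - u_0) \Big( \frac{m (v - u_0)^2}{2 k_B T_0} - \frac{5}{2} \Big).
\end{aligned}
\]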

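The first-order corrections to density, velocity, and temperature are then computed from equations (A.1), (A.2), and (A.3), which can be written as

\[
\begin{aligned}
n_1 &= n_0 \langle g_{1,0} \rangle - (n_0 / \nu_0) \langle h_1 \rangle, \\
u_1 &= \langle (v - u_0) g_{1,0} \rangle - (1 / \nu_0) \langle (v - u_0) h_1 \rangle, \\
T_1 &= T_0 \Big\langle \Big( \frac{m (v - u_0)^2}{3 k_B T_0} - 1 \Big) g_{1,0} \Big\rangle - \frac{T_0}{\nu_0} \Big\langle \Big( \frac{m (v - u_0)^2}{3 k_B T_0} - 1 \Big) h_1 \Big\rangle.
\end{aligned}
\]

Since $h_1$ satisfies the solvability condition $\pi(h_1) = 0$ proven in lemma A.3, all the averages involving $h_1$ in the above relations are zero, thus proving the first part of the proposition.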

Explicitly, $g_{1,0}$ is the projection of $g_1$ onto $V_0$, and thus there are functions $a = a(t, x) \in \mathbb{R}$, $b = b(t, x) \in \mathbb{R}^3$, and $c = c(t, x) \in \mathbb{R}$ such that

\[
g_{1,0} = a + \frac{m (v - u_0)}{k_B T_0} \cdot b + \Big( \frac{m (v - u_0)^2}{2 k_B T_0} - \frac{3}{2} \Big) c.
\]

The coefficients $a$, $b$, and $c$ can be written in terms of $n_1$, $u_1$, and $T_1$ by computing the averages as in the proof of lemma A.2. The result is

\[
n_1 = n_0 a, \qquad u_1 = b, \qquad T_1 = c T_0.
\]

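We turn now to the viscosity tensor and heat flux. With the distribution function $f = f^\varepsilon = f_0 \big( 1 + \varepsilon g_1 + O(\varepsilon^2) \big)$ and $u = u^\varepsilon = u_0 + \varepsilon u_1 + O(\varepsilon^2)$, equation (1.30e) gives

\[
\begin{aligned}
\pi(t, x) &= m \int_{\mathbb{R}^3} \Big[ (v - u^\varepsilon) \otimes (v - u^\varepsilon) - \frac{1}{3} (v - u^\varepsilon)^2 I \Big] f_0 \big( 1 + \varepsilon g_1 + O(\varepsilon^2) \big) \, dv \\
&= \pi_0 + \varepsilon m n_0 \Big\langle \Big[ (v - u_0) \otimes (v - u_0) - \frac{1}{3} (v - u_0)^2 I \Big] g_1 \Big\rangle + O(\varepsilon^2),
\end{aligned}
\]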

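where we have used the fact that $\langle v - u_0 \rangle = 0$. Analogously, the heat flux from equation (1.30f) amounts to

\[
\begin{aligned}
q(t, x) &= \frac{1}{2} m \int_{\mathbb{R}^3} (v - u^\varepsilon)^2 (v - u^\varepsilon) f_0 \big( 1 + \varepsilon g_1 + O(\varepsilon^2) \big) \, dv \\
&= q_0 + \varepsilon \Big[ \frac{1}{2} m n_0 \big\langle (v - u_0)^2 (v - u_0) g_1 \big\rangle - m n_0 u_1 \cdot \big\langle (v - u_0) \otimes (v - u_0) \big\rangle - \frac{1}{2} m n_0 u_1 \big\langle (v - u_0)^2 \big\rangle \Big] + O(\varepsilon^2) \\
&= q_0 + \varepsilon \Big[ \frac{1}{2} m n_0 \big\langle (v - u_0)^2 (v - u_0) g_1 \big\rangle - 2 n_0 k_B T_0 u_1 \cdot \int_{\mathbb{R}^3} \xi \otimes \xi \, d\mu_M(\xi) - n_0 k_B T_0 u_1 \int_{\mathbb{R}^3} \xi^2 \, d\mu_M(\xi) \Big] + O(\varepsilon^2) \\
&= q_0 + \varepsilon \Big[ \frac{1}{2} m n_0 \big\langle (v - u_0)^2 (v - u_0) g_1 \big\rangle - \frac{5}{2} n_0 k_B T_0 u_1 \Big] + O(\varepsilon^2),
\end{aligned}
\]

where we have used items 5 and 6 of lemma A.1.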

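The remaining averages split into the sum of the contributions of $g_{1,0}$ and of $-\nu_0^{-1} h_1$. We start from the former and compute

\[
\Big\langle \Big[ (v - u_0) \otimes (v - u_0) - \frac{1}{3} (v - u_0)^2 I \Big] g_{1,0} \Big\rangle
= \Big\langle \frac{m (v - u_0)^2}{2 k_B T_0} \Big[ (v - u_0) \otimes (v - u_0) - \frac{1}{3} (v - u_0)^2 I \Big] \Big\rangle c,
\]

as the contributions of the other terms vanish either due to symmetry considerations in the integrals or because they multiply the average of the tensor $(v - u_0) \otimes (v - u_0) - \frac{1}{3} (v - u_0)^2 I$. The remaining term is

\[
c \, \frac{2 k_B T_0}{m} \int_{\mathbb{R}^3} \xi^2 \Big( \xi \otimes \xi - \frac{1}{3} \xi^2 I \Big) d\mu_M(\xi),
\]

which can be written in terms of integrals computed in lemma A.1; in fact,

\[
\xi^2 \Big( \xi \otimes \xi - \frac{1}{3} \xi^2 I \Big)
= \xi \otimes \xi \, (\xi^2 - \tfrac{3}{2}) - \frac{1}{3} (\xi^2 - \tfrac{3}{2})^2 I + \frac{3}{2} \Big( \xi \otimes \xi - \frac{1}{3} \xi^2 I \Big) - \frac{1}{2} \xi^2 I + \frac{3}{4} I,
\]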

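which integrated with the measure $d\mu_M(\xi)$ gives zero. We have obtained that $g_{1,0}$ does not contribute to the first-order viscosity tensor. For the heat flux, on the other hand, we find a non-trivial contribution,

\[
\begin{aligned}
\big\langle (v - u_0)^2 (v - u_0) g_{1,0} \big\rangle
&= \frac{m}{k_B T_0} b \cdot \big\langle (v - u_0) \otimes (v - u_0) (v - u_0)^2 \big\rangle
 = \frac{4 k_B T_0}{m} b \cdot \int_{\mathbb{R}^3} (\xi \otimes \xi) \xi^2 \, d\mu_M(\xi) \\
&= \frac{4 k_B T_0}{m} b \cdot \Big[ \int_{\mathbb{R}^3} (\xi \otimes \xi) (\xi^2 - \tfrac{3}{2}) \, d\mu_M(\xi) + \frac{3}{2} \int_{\mathbb{R}^3} \xi \otimes \xi \, d\mu_M(\xi) \Big] \\
&= \frac{4 k_B T_0}{m} b \cdot \Big[ \frac{1}{2} I + \frac{3}{4} I \Big]
 = \frac{5 k_B T_0}{m} b = \frac{5 k_B T_0}{m} u_1,
\end{aligned}
\]

where again we have used the integrals from lemma A.1; the terms involving the coefficients $a$ and $c$ vanish due to the symmetry of the Maxwellian distribution.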

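Upon accounting for the foregoing results we can write

\[
\begin{aligned}
\pi_1 &= -\frac{m n_0}{\nu_0} \Big\langle \Big[ (v - u_0) \otimes (v - u_0) - \frac{1}{3} (v - u_0)^2 I \Big] h_1 \Big\rangle, \\
q_1 &= -\frac{m n_0}{2 \nu_0} \big\langle (v - u_0)^2 (v - u_0) h_1 \big\rangle,
\end{aligned}
\]

as the contribution from $g_{1,0}$ to the heat flux exactly cancels the term involving $u_1$. The last step is the computation of the two remaining averages.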

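Beginning with the viscosity tensor, we find

\[
\begin{aligned}
\pi_1 &= -\frac{m n_0}{\nu_0} \frac{m}{k_B T_0} \Big\langle \Big[ (v - u_0) \otimes (v - u_0) - \frac{1}{3} (v - u_0)^2 I \Big] \otimes \Big[ (v - u_0) \otimes (v - u_0) - \frac{1}{3} (v - u_0)^2 I \Big] \Big\rangle : \nabla u_0 \\
&= -\frac{4 n_0 k_B T_0}{\nu_0} \Big[ \int_{\mathbb{R}^3} \Big[ \xi \otimes \xi - \frac{1}{3} \xi^2 I \Big] \otimes \Big[ \xi \otimes \xi - \frac{1}{3} \xi^2 I \Big] d\mu_M(\xi) \Big] : \nabla u_0,
\end{aligned}
\]

where the contraction with $\nabla u_0$ is on the last two indices of the rank-four tensor defined by the integral. The term proportional to $\nabla T_0$ in the expression for $h_1$ does not contribute, as the integrand is anti-symmetric under the reflection $\xi \to -\xi$. For the calculation of the remaining integral it is convenient to notice that

\[
\begin{aligned}
\Big[ \xi \otimes \xi - \frac{1}{3} \xi^2 I \Big] \otimes \Big[ \xi \otimes \xi - \frac{1}{3} \xi^2 I \Big]
&= \xi^{\otimes 4} + \frac{1}{9} \xi^4 \, I \otimes I - \frac{1}{2} \big[ I \otimes (\xi \otimes \xi) + (\xi \otimes \xi) \otimes I \big] \\
&\quad - \frac{1}{3} \Big[ I \otimes \big( (\xi^2 - \tfrac{3}{2}) (\xi \otimes \xi) \big) + \big( (\xi^2 - \tfrac{3}{2}) (\xi \otimes \xi) \big) \otimes I \Big],
\end{aligned}
\]

where $\xi^{\otimes 4} = \xi \otimes \xi \otimes \xi \otimes \xi$ and the integrals of all terms except the first have been computed in lemma A.1. We find

\[
\pi_1 = -\frac{4 n_0 k_B T_0}{\nu_0} \Big[ \tau : \nabla u_0 - \frac{5}{12} (\nabla \cdot u_0) I \Big],
\]

and we have to compute the rank-four tensor

\[
\tau = \int_{\mathbb{R}^3} \xi^{\otimes 4} \, d\mu_M(\xi).
\]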

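In components, we start with $\tau_{iikl} \neq 0$ if and only if $k = l$, but we have to treat the case $i = k$ separately,

\[
\tau_{iikl} = \int_{\mathbb{R}^3} \xi_i^2 \xi_k \xi_l \, d\mu_M(\xi) = \delta_{kl} \big[ c_2^2 + (c_4 - c_2^2) \delta_{ik} \big],
\]

where

\[
c_2 = \pi^{-1/2} \int_{\mathbb{R}} x^2 e^{-x^2} dx, \qquad c_4 = \pi^{-1/2} \int_{\mathbb{R}} x^4 e^{-x^2} dx.
\]

If, on the other hand, $i \neq j$, the only non-zero entries of the tensor are obtained for either $i = k$ and $j = l$ or $i = l$ and $j = k$; in both cases the tensor amounts to $c_2^2$. Therefore,

\[
(\tau : \nabla u_0)_{ij} = \sum_{k,l} \tau_{ijkl} \frac{\partial u_{0,k}}{\partial x_l} =
\begin{cases}
c_2^2 \, \nabla \cdot u_0 + (c_4 - c_2^2) \dfrac{\partial u_{0,i}}{\partial x_i}, & \text{for } i = j, \\[2ex]
c_2^2 \Big( \dfrac{\partial u_{0,i}}{\partial x_j} + \dfrac{\partial u_{0,j}}{\partial x_i} \Big), & \text{for } i \neq j.
\end{cases}
\]

From lemma A.1 we have $c_2 = 1/2$ and $c_4 = 3/4$, so that

\[
(\tau : \nabla u_0)_{ij} = \frac{1}{4} \Big( \frac{\partial u_{0,i}}{\partial x_j} + \frac{\partial u_{0,j}}{\partial x_i} \Big) + \frac{1}{4} (\nabla \cdot u_0) \delta_{ij},
\]

with the result that

\[
\pi_{1,ij} = -\frac{n_0 k_B T_0}{\nu_0} \Big[ \frac{\partial u_{0,i}}{\partial x_j} + \frac{\partial u_{0,j}}{\partial x_i} - \frac{2}{3} (\nabla \cdot u_0) \delta_{ij} \Big],
\]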

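which is the claimed identity.

We now turn our attention to the heat flux, for which we obtain

\[
\begin{aligned}
q_1 &= -\frac{m n_0}{2 \nu_0 T_0} \Big\langle \Big( \frac{m (v - u_0)^2}{2 k_B T_0} - \frac{5}{2} \Big) (v - u_0)^2 (v - u_0) \otimes (v - u_0) \Big\rangle \nabla T_0 \\
&= -\frac{2 n_0 k_B T_0}{\nu_0} \Big( \frac{k_B}{m} \Big) \Big[ \int_{\mathbb{R}^3} \xi^2 \Big( \xi^2 - \frac{5}{2} \Big) \xi \otimes \xi \, d\mu_M(\xi) \Big] \nabla T_0.
\end{aligned}
\]

The symmetries of the Gaussian function imply that the off-diagonal entries of the tensor in square brackets are zero and the entries on the diagonal are all equal, hence

\[
\int_{\mathbb{R}^3} \xi^2 \Big( \xi^2 - \frac{5}{2} \Big) \xi \otimes \xi \, d\mu_M(\xi) = \frac{1}{3} I \int_{\mathbb{R}^3} \xi^4 \Big( \xi^2 - \frac{5}{2} \Big) d\mu_M(\xi).
\]

From item 5 of lemma A.1 we obtain

\[
\int_{\mathbb{R}^3} \xi^4 \Big( \xi^2 - \frac{5}{2} \Big) d\mu_M(\xi) = \frac{105}{8} - \frac{5}{2} \cdot \frac{15}{4} = \frac{15}{4},
\]

and thus the claimed expression.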
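Both transport coefficients can be cross-checked numerically. The sketch below is illustrative only: it assumes the normalized Gaussian measure $d\mu_M(\xi) = \pi^{-3/2} e^{-\xi^2} d\xi$, builds the tensor $\tau$ from the one-dimensional moments $c_2 = 1/2$ and $c_4 = 3/4$ quoted in the text, and compares $4 [\tau : \nabla u_0 - \tfrac{5}{12} (\nabla \cdot u_0) I]$ with the symmetric traceless bracket of the claimed identity for a random matrix standing in for $\nabla u_0$. All names are hypothetical.

```python
import math
import random

C2, C4 = 0.5, 0.75  # one-dimensional Gaussian moments c2 and c4 from the text


def delta(i: int, j: int) -> float:
    return 1.0 if i == j else 0.0


def tau(i: int, j: int, k: int, l: int) -> float:
    """Fourth moment of the product Gaussian: integral of xi_i xi_j xi_k xi_l."""
    if i == j == k == l:
        return C4
    # Two distinct pairs of equal indices factorize into c2 * c2.
    if (i == j and k == l) or (i == k and j == l) or (i == l and j == k):
        return C2 * C2
    return 0.0


def computed_bracket(grad):
    """4 * [tau : grad - (5/12) div I], with grad[k][l] = d u_k / d x_l."""
    div = sum(grad[i][i] for i in range(3))
    return [[4.0 * (sum(tau(i, j, k, l) * grad[k][l]
                        for k in range(3) for l in range(3))
                    - 5.0 / 12.0 * div * delta(i, j))
             for j in range(3)] for i in range(3)]


def claimed_bracket(grad):
    """Symmetric traceless bracket appearing in the formula for pi_1."""
    div = sum(grad[i][i] for i in range(3))
    return [[grad[i][j] + grad[j][i] - 2.0 / 3.0 * div * delta(i, j)
             for j in range(3)] for i in range(3)]


random.seed(0)
grad_u = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
lhs, rhs = computed_bracket(grad_u), claimed_bracket(grad_u)
max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))


def radial_moment(k: int) -> float:
    """Integral of |xi|^(2k) against the assumed Gaussian measure."""
    return math.gamma(k + 1.5) / math.gamma(1.5)


# Heat flux: q1 = -heat_coeff * (n0 kB^2 T0 / (m nu0)) * grad T0.
heat_integral = radial_moment(3) - 2.5 * radial_moment(2)
heat_coeff = 2.0 * heat_integral / 3.0

print(max_error, heat_integral, heat_coeff)
```

With these assumptions the discrepancy `max_error` is at round-off level, and the heat-flux prefactor follows from $2 \cdot \tfrac{1}{3} \cdot \tfrac{15}{4} = \tfrac{5}{2}$.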


B Energy conservation in extended MHD models

In section 4.1, the conservation of mass, momentum, and energy has been proven for the case of the resistive MHD equations (3.50a), making explicit use of the specific Ohm's law (3.42) of standard MHD.

Such conservation laws, however, are valid even for more general forms of the Ohm's law. In section 4.1 we have already noted that mass and momentum conservation are direct consequences of the mass continuity equation and the Euler's equation, respectively, and therefore hold independently of the chosen form of the Ohm's law. Energy conservation, on the other hand, requires some more comments.

Let us assume that we want to consider a generalized form of the Ohm's law, which we write as

\[
E + \frac{u \times B}{c} - \eta J = F(\rho, u, p, B), \tag{B.1}
\]

where $F$ represents a possibly nonlinear operator acting on the MHD state variables $(\rho, u, p, B)$. For instance, $F$ can represent all the remaining terms in equation (3.40). Alternatively, only the most important terms can be retained as appropriate for the considered physics problem, thus leading to a family of models usually referred to as extended MHD [59, 60]. Such models are useful when condition (3.38) breaks down, e.g., due to the build-up of strong current layers. By retaining the effects of a large current density and electron pressure gradient, extended MHD models stand between standard MHD and a complete two-fluid description.

However, when such large currents are present, assumption (3.36) becomes questionable, and stresses and heat flux due to differences in the ion and electron fluid velocities must be taken into account. Stresses are particularly important, as the single-fluid momentum balance equation (3.31) and the generalized Ohm's law (3.40) must be derived consistently (a fact that is sometimes overlooked in the literature, where only the Ohm's law is extended without consistently extending the momentum equation). The importance of stresses due to the relative motion of the ion and electron fluids becomes apparent if one recalls that equations (3.31) and (3.40) are equivalent to the system of the momentum balance laws for the individual ion and electron fluids, the relation being established by means of (3.37). On the other hand, the single-fluid heat flux (3.32) enters the equation for the pressure (3.28) only, which, in addition to the sum of partial pressures of the plasma species, accounts for their kinetic energy relative to the center-of-mass fluid.

For a plasma with one ion species and electrons, equations (3.37) allow us to write the differences $u_i - u$ and $u_e - u$ in terms of $J / (e n_e)$; in addition, both the electron density $n_e$ and the ion density $n_i$ are related to the mass density $\rho$ by the quasi-neutrality condition $n_e = Z_i n_i$. It follows that we can obtain a closure for the single-fluid equations of section 3.3 in the form

\[
\pi = \pi(\rho, u, J), \qquad q = q(\rho, u, J). \tag{B.2}
\]

We have the choice of computing the functions $F$, $\pi$, and $q$ exactly from the two-fluid model by means of (3.37) or of introducing various levels of approximation, each leading to a different flavour of MHD. Some popular extended models are briefly recalled at the end of this section.

Upon accounting for both the closure relations (B.2) and the extended Ohm's law (B.1) in their general form, the single-fluid system (3.35) must be replaced by the full system of equations (3.29), (3.31), and (3.34), coupled to the quasi-neutral limit of Maxwell's equations. We obtain an extended version of the MHD equations (3.45), which reads

\[
\begin{aligned}
& \frac{D \rho}{D t} + \rho \nabla \cdot u = 0, \\
& \rho \frac{D u}{D t} = -\nabla p - \nabla \cdot \pi + \frac{J \times B}{c} + \rho g, \\
& \frac{D}{D t} \frac{p}{\gamma - 1} + \frac{\gamma}{\gamma - 1} p \nabla \cdot u = \big[ J \cdot (\eta J + F) - \nabla \cdot q - \pi : \nabla u \big], \\
& E + \frac{u \times B}{c} = \eta J + F, \\
& \nabla \times B = \frac{4 \pi}{c} J, \\
& \partial_t B + c \nabla \times E = 0, \\
& \nabla \cdot B = 0.
\end{aligned}
\tag{B.3}
\]

In spite of the additional terms, namely, the functions $F$, $\pi$, and $q$, this system preserves much of the structure of the standard MHD equations (3.45). In particular, the system is formulated in terms of single-fluid quantities, and it interpolates from standard MHD, which corresponds to $F = 0$, $\pi = 0$, $q = 0$, to the full two-fluid model, which corresponds to the case of the full generalized Ohm's law (3.40) together with the exact closures for $\pi$ and $q$.

Independently of the form of $F$, $\pi$, and $q$, one has mass, momentum, and energy conservation exactly. Therefore, all the physics models obtained by the various approximations of $F$, $\pi$, and $q$ enjoy those three basic conservation laws. Specifically, we have:

• Mass conservation,

\[
\frac{d}{dt} \int_\Omega \rho \, dx = 0. \tag{B.4a}
\]

• Momentum conservation,

\[
\frac{d}{dt} \int_\Omega \rho u \, dx = -\int_{\partial \Omega} \Big[ \pi \cdot n + \Big( p + \frac{B^2}{8 \pi} \Big) n \Big] dS, \qquad (g = 0). \tag{B.4b}
\]

• Energy conservation,

\[
\frac{d}{dt} \int_\Omega w \, dx = -\frac{c}{4 \pi} \int_{\partial \Omega} B \cdot n \times (\eta J + F) \, dS, \qquad (g = -\nabla \Phi_g), \tag{B.4c}
\]

where the energy density is

\[
\begin{aligned}
w &= \frac{1}{2} \rho u^2 + \frac{p}{\gamma - 1} + \rho \Phi_g + \frac{|B|^2}{8 \pi} \\
&= \frac{1}{2} \big( m_i n_i u_i^2 + m_e n_e u_e^2 \big) + \frac{p_i + p_e}{\gamma - 1} + \rho \Phi_g + \frac{|B|^2}{8 \pi}.
\end{aligned}
\]

In the second expression of the energy $w$, we have explicitly accounted for definition (3.28), thus separating the sum of partial pressures $p_i + p_e$, according to Dalton's law, from the kinetic energy of electrons and ions relative to $u$.

Mass and momentum conservation laws, equations (B.4a) and (B.4b), respectively, follow directly from the mass continuity and momentum balance equations. In particular, the stress tensor $\pi$ enters the momentum equation as a divergence and thus only contributes to the boundary term.

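Energy conservation, on the other hand, needs to be checked carefully. If gravitational forces are potential, i.e., $g = -\nabla \Phi_g$, then the continuity equation, the momentum balance equation, and the pressure equation imply, cf. equation (3.33),

\[
\partial_t \Big( \tfrac{1}{2} \rho u^2 + \tfrac{p}{\gamma - 1} + \rho \Phi_g \Big) + \nabla \cdot \Big[ \Big( \tfrac{1}{2} \rho u^2 + \tfrac{p}{\gamma - 1} + \rho \Phi_g \Big) u + \pi \cdot u + u p + q \Big] = J \cdot E.
\]

With the Ohm's law (B.1), the right-hand side is

\[
J \cdot E = J \cdot (\eta J + F) - J \cdot \frac{u \times B}{c},
\]

while the induction equation for the magnetic field reads

\[
\partial_t B - \nabla \times (u \times B) + c \nabla \times (\eta J + F) = 0,
\]

which gives

\[
\partial_t \Big( \frac{|B|^2}{8 \pi} \Big) - \frac{1}{4 \pi} B \cdot \nabla \times (u \times B) + \frac{c}{4 \pi} B \cdot \nabla \times (\eta J + F) = 0.
\]

The combination of the foregoing balance laws gives

\[
\begin{aligned}
& \partial_t \Big( \tfrac{1}{2} \rho u^2 + \tfrac{p}{\gamma - 1} + \rho \Phi_g + \tfrac{|B|^2}{8 \pi} \Big) + \nabla \cdot \Big[ \Big( \tfrac{1}{2} \rho u^2 + \tfrac{p}{\gamma - 1} + \rho \Phi_g \Big) u + p u \Big] \\
& \qquad - \frac{1}{4 \pi} B \cdot \nabla \times (u \times B) + \frac{c}{4 \pi} B \cdot \nabla \times (\eta J + F)
 = J \cdot (\eta J + F) + u \cdot (J \times B) / c.
\end{aligned}
\]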

By means of the same vector calculus identities used in section 4.1 we obtain

\[
\frac{d}{dt} \int_\Omega w \, dx = \frac{c}{4 \pi} \int_{\partial \Omega} \big( B \times (\eta J + F) \big) \cdot n \, dS.
\]

This expresses the conservation of the total energy apart from the boundary term, which is usually zero in practice in view of the choice of the domain $\Omega$. Hence the conservation laws of section 4.1 are so fundamental that they apply to a fairly general family of extended MHD models.

Let us conclude this section with a bird's-eye view of two popular extended MHD models, both discussed in detail by Kimura and Morrison [59, and references therein].

Hall MHD. As in standard electrodynamics, the (classical) Hall effect consists in an electric field established when a current density $J$ is subject to an external magnetic field $B$. Specifically, the Lorentz force acting on the current (carried by electrons) amounts to

\[
-e n_e E + \frac{J \times B}{c},
\]

and this must vanish at equilibrium. Hence the electric field balancing the magnetic force is

\[
E_H = \frac{J \times B}{e n_e c},
\]

and this is referred to as the Hall field (usually expressed in terms of a potential for ordinary conductors). We can recognize the term $E_H$ in equation (3.40). Hall MHD is obtained by retaining the Hall term in the Ohm's law (some authors include the electron pressure gradient as well [59]) and setting stresses to zero. This corresponds to the choice

\[
\pi = 0, \qquad F = E_H = \frac{J \times B}{e n_e c}.
\]

The latter, in particular, gives the Ohm's law, cf. equation (B.1),

\[
E + \frac{u_e \times B}{c} = \eta J,
\]

where the flow of the electron fluid is defined by $u_e = u - J / (e n_e)$. We see that, when $\eta = 0$, the magnetic field is frozen into the electron fluid, cf. section 4.3.
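The equivalence between the generalized Ohm's law (B.1) with $F = E_H$ and the electron-frame form above is a one-line algebraic identity, which can be illustrated numerically. The following sketch uses random vectors and arbitrary positive constants; all numerical values and helper names are placeholders chosen for illustration and carry no physical meaning.

```python
import random


def cross(a, b):
    """Cross product of two 3-vectors given as lists."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def random_vector():
    return [random.uniform(-1.0, 1.0) for _ in range(3)]


random.seed(1)
u, B, J = random_vector(), random_vector(), random_vector()
c, e, n_e, eta = 3.0, 1.6, 2.5, 0.1  # arbitrary positive illustrative values

# Electric field from (B.1) with F = E_H = J x B / (e n_e c):
#   E = eta J + E_H - u x B / c
u_cross_B = cross(u, B)
J_cross_B = cross(J, B)
E = [eta * J[i] + J_cross_B[i] / (e * n_e * c) - u_cross_B[i] / c
     for i in range(3)]

# Electron flow velocity u_e = u - J / (e n_e) and the residual of
#   E + u_e x B / c = eta J
u_e = [u[i] - J[i] / (e * n_e) for i in range(3)]
ue_cross_B = cross(u_e, B)
residual = max(abs(E[i] + ue_cross_B[i] / c - eta * J[i]) for i in range(3))

print(residual)
```

The residual vanishes up to round-off because $(u - J/(e n_e)) \times B = u \times B - J \times B / (e n_e)$, which is exactly the step used to pass from (B.1) to the electron-frame Ohm's law.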

Inertial MHD. The electron inertia effects are described by the first term on the left-hand side of the generalized Ohm's law (3.40). Retaining that term corresponds to the choice

\[
F = \frac{m_e}{e^2 n_e} \Big[ \partial_t J + \nabla \cdot \Big( u J + J u - \frac{J J}{e n_e} \Big) \Big].
\]

Consistently, one should retain the effect of a finite current $J$ while neglecting terms of $O(m_e / m_i)$. Then, equations (3.37) amount to $u_i = u$ and $u_e = u - J / (e n_e)$. Therefore, the pressure (3.28) and the viscosity tensor (3.30) with $\pi_i = \pi_e = 0$ become

\[
\begin{aligned}
p &= p_i + p_e + \frac{1}{3} m_e n_e \Big( \frac{J}{e n_e} \Big)^2, \\
\pi &= \frac{m_e}{e^2 n_e} J J - \frac{1}{3} m_e n_e \Big( \frac{J}{e n_e} \Big)^2 I.
\end{aligned}
\]

The foregoing closure relations yield an extended MHD model known as inertial MHD [59, 60, 61].

It is worth noting that the pressure and viscosity forces in the momentum equation amount to

\[
\nabla \cdot \big[ \pi + p I \big] = \nabla (p_e + p_i) + \frac{m_e}{e} J \cdot \nabla \Big( \frac{J}{e n_e} \Big),
\]

which includes an "extra force" (the last term on the right-hand side), as discussed by Kimura and Morrison [59].

C Magnetic vector potential in MHD

The magnetic vector potential is a vector field A such that [50]

B = ∇×A.

This is not uniquely determined by the field $B$, as the gauge transformation $A \mapsto A' = A + \nabla \varphi$ leaves the magnetic field $B$ invariant.

In terms of the magnetic vector potential $A$ and including resistivity, the MHD induction equation, cf. system (3.50a), takes the form

\[
\nabla \times \big[ \partial_t A - u \times \nabla \times A + \kappa \eta \nabla \times \nabla \times A \big] = 0.
\]

This implies that, at least locally, there exists a scalar function $\chi$ such that

\[
\partial_t A - u \times \nabla \times A + \kappa \eta \nabla \times \nabla \times A = \nabla \chi, \tag{C.1}
\]

and this is the relevant equation for the magnetic vector potential. The scalar function $\chi$ accounts for the gauge transformation in the sense that a gauge-transformed field $A' = A + \nabla \varphi$ satisfies the same equation with $\chi$ replaced by $\chi' = \chi + \partial_t \varphi$.

D Lie derivatives and passively advected quantities

In this appendix we shall consider three families of physically relevant quantities in continuum dynamics, namely, scalars, vector fields, and densities, and study their evolution under the assumption that they are passively advected by a flow. In all three cases, the time evolution of the field is related to its Lie derivative [32] along the velocity field of the flow.

Pull-back, push-forward, and Lie derivatives. Let us consider a $C^{r+1}$ diffeomorphism $\varphi : U \to V$ between two open neighborhoods $U, V \subset \Omega$. This diffeomorphism induces a transformation acting on quantities defined over $U$ and mapping them into quantities defined over $V$, and vice versa. We shall consider only the three classes of quantities in which we are interested, i.e., scalar functions, vector fields, and densities. For a more general presentation which includes differential forms the reader is referred to the book by Marsden, Ratiu and Abraham [32].

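Let us begin with scalars. Given a scalar function $f \in C^r(\Omega, \mathbb{R})$, with values in $\mathbb{R}$, we can always construct over the neighborhood $U$ a second function in $C^r(U, \mathbb{R})$ by composition with $\varphi$. This new function $f \circ \varphi : U \to \mathbb{R}$ is different from the restriction of $f$ to $U$ and it is referred to as the pull-back of $f$ with $\varphi$. Standard notation for the pull-back is

\[
\varphi^* f(x) = f \circ \varphi(x) = f\big( \varphi(x) \big). \tag{D.1}
\]

This construction is represented in the following diagram,

\[
\begin{array}{ccc}
\mathbb{R} & & \mathbb{R} \\
{\scriptstyle f} \big\uparrow & \nearrow {\scriptstyle \varphi^* f} & \big\uparrow {\scriptstyle f} \\
U & \xrightarrow{\ \varphi\ } & V
\end{array}
\]

Since $\varphi$ is invertible we can also compose $f$ with $\varphi^{-1}$, thus obtaining a new function on $V$, namely,

\[
\varphi_* f(y) = f \circ \varphi^{-1}(y) = f\big( \varphi^{-1}(y) \big), \tag{D.2}
\]

where $y \in V \subset \Omega$. This is referred to as the push-forward of $f$ with $\varphi$, since $f$, viewed as a function on $U$, is pushed to $V$ by $\varphi$. The diagram is

\[
\begin{array}{ccc}
\mathbb{R} & & \mathbb{R} \\
{\scriptstyle f} \big\uparrow & \nwarrow {\scriptstyle \varphi_* f} & \big\uparrow {\scriptstyle f} \\
U & \xleftarrow{\ \varphi^{-1}\ } & V
\end{array}
\]

Both $\varphi^* f$ and $\varphi_* f$ are functions of class $C^r$ over $U$ and $V$, respectively. We also have $\varphi^* \varphi_* f = f$ in $U$ and $\varphi_* \varphi^* f = f$ in $V$, as a consequence of the definition. In addition, if $\psi : V \to W$ is another diffeomorphism mapping $V$ into another open subset $W \subseteq \Omega$, we have $(\psi \circ \varphi)^* f = \varphi^* \psi^* f$ and $(\psi \circ \varphi)_* f = \psi_* \varphi_* f$.

For scalar functions, pull-back and push-forward maps coincide with composition of functions. For more complicated objects, however, the natural pull-back and push-forward operations are less trivial, and this justifies the introduction of such abstract notions.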

For a $C^r$ vector field $v : \Omega \to \mathbb{R}^d$, the definition of the pull-back and push-forward has to account for the multi-component nature of the field. One might think of pulling back or pushing forward each component of the vector field, as each component is indeed a scalar function. This, however, is not satisfactory because the result would depend on the reference frame. E.g., a vector field pulled back to $U$ from $V$ component-wise would not be invariant under a change of coordinates in $V$. A way to construct the push-forward map for vector fields is to consider the directional derivative of an arbitrary smooth function $f \in C^\infty(\Omega, \mathbb{R})$, namely,

\[
v(f) = v \cdot \nabla f,
\]

which is independent of coordinates. By the chain rule we have

\[
v(\varphi^* f) = v(x) \cdot \nabla \varphi(x) \cdot \nabla f\big( \varphi(x) \big).
\]

By inspection of the right-hand side we see that a natural definition of the push-forward of $v$ onto $V$ is

\[
\varphi_* v = (v \cdot \nabla \varphi) \circ \varphi^{-1}, \tag{D.3}
\]

which defines a vector field over $V$ in $C^r(V, \mathbb{R}^d)$, since $\varphi \in C^{r+1}(U, V)$. With this definition we have

\[
v(\varphi^* f) = \varphi^* \big( \varphi_* v(f) \big).
\]

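By applying the same definition with the inverse map $\varphi^{-1}$ we can define the pull-back of the vector field,

\[
\varphi^* v = \big( v \cdot \nabla \varphi^{-1} \big) \circ \varphi, \tag{D.4}
\]

which defines a vector field over $U$ in $C^r(U, \mathbb{R}^d)$. Again one can check from the definition that $\varphi^* \varphi_* v = v$ over $U$ and $\varphi_* \varphi^* v = v$ over $V$. (In order to see this one can use the identities $\varphi^{-1}\big( \varphi(x) \big) = x$ and $\varphi\big( \varphi^{-1}(y) \big) = y$ together with the chain rule to obtain $\nabla \varphi(x) \cdot \nabla \varphi^{-1}\big( \varphi(x) \big) = I$ and $\nabla \varphi^{-1}(y) \cdot \nabla \varphi\big( \varphi^{-1}(y) \big) = I$, respectively, with $I$ being the identity matrix.) In addition, given a second diffeomorphism $\psi : V \to W$ one has

\[
\begin{aligned}
(\psi \circ \varphi)^* v &= \big( v \cdot \nabla (\psi \circ \varphi)^{-1} \big) \circ \psi \circ \varphi \\
&= \big( v \cdot \nabla (\varphi^{-1} \circ \psi^{-1}) \big) \circ \psi \circ \varphi \\
&= \big( v \cdot \nabla \psi^{-1} \cdot (\nabla \varphi^{-1} \circ \psi^{-1}) \big) \circ \psi \circ \varphi \\
&= \big[ \big( (v \cdot \nabla \psi^{-1}) \circ \psi \big) \cdot \nabla \varphi^{-1} \big] \circ \varphi \\
&= \varphi^* \psi^* v,
\end{aligned}
\]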

while the push-forward satisfies $(\psi \circ \varphi)_* v = \psi_* \varphi_* v$.

At last we consider densities. Although the geometric concept of density (or more generally $\alpha$-density, where $\alpha \in \mathbb{R}$) is somewhat different [32], we restrict our attention to the more familiar idea of a measure absolutely continuous with respect to the Lebesgue measure in $\Omega$. This is a map $\mu_\rho$ which assigns a real number to any open subset $A \subseteq \Omega$ by integration, namely,

\[
\mu_\rho(A) = \int_A \rho(x) \, dx,
\]

where $\rho \in L^1(\Omega, \mathbb{R})$ is referred to as the density of the measure $\mu_\rho$.

On the open neighborhood $U$, we can define the pull-back measure of $\mu_\rho$,

\[
\varphi^* \mu_\rho(A) = \int_{\varphi(A)} \rho(y) \, dy,
\]

that is, we associate to an open set $A \subseteq U$ the measure of its image $\varphi(A) \subset V$ under the action of $\varphi$. By change of variable we have

\[
\int_{\varphi(A)} \rho(y) \, dy = \int_A \rho\big( \varphi(x) \big) \big| \det D\varphi(x) \big| \, dx,
\]

from which we can read the density of the pull-back measure, which is taken as the definition of the pull-back of the density (with some abuse of notation and language, since this is rather the density of the pulled-back measure),

\[
\varphi^* \rho = \big| \det D\varphi \big| (\rho \circ \varphi). \tag{D.5}
\]

One should notice that this is just the pull-back of $\rho$ as a scalar function, corrected with the Jacobian determinant of the map $\varphi$, which accounts for the transformation of the volume element. Replacing $\varphi$ with its inverse $\varphi^{-1}$ we obtain the push-forward of the density, which is the density of the push-forward measure on $V$,

\[
\varphi_* \mu_\rho(A) = \int_{\varphi^{-1}(A)} \rho(x) \, dx,
\]

for all open subsets $A \subset V$, that is,

\[
\varphi_* \rho = \big( \big| \det D\varphi \big|^{-1} \rho \big) \circ \varphi^{-1}. \tag{D.6}
\]

Again we have $\varphi^* \varphi_* \rho = \rho$ on $U$ and $\varphi_* \varphi^* \rho = \rho$ on $V$. In virtue of the properties of the determinant and the chain rule of differentiation, if $\psi : V \to W$ is a second diffeomorphism, we have

\[
\begin{aligned}
(\psi \circ \varphi)^* \rho &= (\rho \circ \psi \circ \varphi) \big| \det \big( \nabla (\psi \circ \varphi) \big) \big| \\
&= (\rho \circ \psi \circ \varphi) \big| \det \nabla \psi(\varphi) \big| \cdot \big| \det \nabla \varphi \big| \\
&= \varphi^* \psi^* \rho,
\end{aligned}
\]

and analogously, $(\psi \circ \varphi)_* \rho = \psi_* \varphi_* \rho$.

For all the cases examined above, i.e., scalar, vector, and density fields, the pull-back and the push-forward are inverse to each other and behave naturally under composition. It is also worth noticing that the pull-back of scalars and measures does not actually require the invertibility of the map $\varphi$.
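The definition (D.5) of the pull-back of a density can be illustrated in one dimension, where the Jacobian determinant reduces to $\varphi'(x)$. The sketch below compares the measure of the image $\varphi(A)$ with the integral of the pulled-back density over $A$ by a simple midpoint rule. The map $\varphi(x) = x + 0.3 \sin x$, the density $\rho(y) = e^{-y^2}$, and the interval $A = (0, 1)$ are arbitrary choices made only for this example.

```python
import math


def phi(x: float) -> float:
    """An illustrative diffeomorphism of the real line (phi' > 0 everywhere)."""
    return x + 0.3 * math.sin(x)


def dphi(x: float) -> float:
    """Derivative of phi, i.e. the 1D Jacobian determinant."""
    return 1.0 + 0.3 * math.cos(x)


def rho(y: float) -> float:
    """An illustrative integrable density."""
    return math.exp(-y * y)


def pullback_rho(x: float) -> float:
    """Pull-back of the density as in (D.5): |det D phi| (rho o phi)."""
    return abs(dphi(x)) * rho(phi(x))


def midpoint(f, a: float, b: float, n: int = 20000) -> float:
    """Composite midpoint rule for the integral of f over (a, b)."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


a, b = 0.0, 1.0
# phi is increasing, so the image of A = (a, b) is (phi(a), phi(b)).
measure_of_image = midpoint(rho, phi(a), phi(b))
pullback_measure = midpoint(pullback_rho, a, b)

print(measure_of_image, pullback_measure)
```

The two numbers agree up to the quadrature error, which is the change-of-variable formula underlying (D.5).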

We are now in the position to address a generalization of the directional derivative of a certain quantity $q$ along the integral curves of an autonomous (i.e., time-independent) $C^1$ vector field $u = u(x)$ with (locally defined) flow $F_t : U \to \Omega$. We consider the case of flows that are in $C^2$ as functions of the combined variable $(t, x)$, so that we can exchange time and space derivatives.

For every $t$ in the time interval $I$ and position $x \in U$, we compute the variation of the value of $q$ along the Lagrangian trajectory issued from $x$ with respect to the value $q(x)$ at $t = 0$. For a scalar function $q = f$ the value along the Lagrangian trajectory is just $f\big( F_t(x) \big)$, but in general we want to take into account the transformation properties of the quantity $q$ under the action of the map $F_t$, hence we pull back $q$ to the point $x$. This leads to the definition

\[
\mathcal{L}_u q = \lim_{t \to 0} \frac{F_t^* q - q}{t} = \frac{d}{dt} F_t^* q \Big|_{t=0}, \tag{D.7}
\]

which is referred to as the Lie derivative of $q$ along $u$. From this definition one readily obtains the so-called Lie derivative formula [32].

Lemma D.1 (Marsden et al. [32]). Let $q$ be a field over $\Omega$ for which we have defined a pull-back map. Then,

\[
\frac{d}{dt} F_t^* q = F_t^* \mathcal{L}_u q, \tag{D.8}
\]

where the Lie derivative $\mathcal{L}_u$ is defined in (D.7).

Proof. We can write the derivative at $t$ in the form

\[
\frac{d}{dt} F_t^* q = \frac{d}{ds} F_{t+s}^* q \Big|_{s=0}.
\]

Because of the semi-group property of the flow of an autonomous field, cf. proposition 1.3, the pull-back satisfies $F_{t+s}^* q = (F_s \circ F_t)^* q = F_t^* (F_s^* q)$, as we have already proven for the considered quantities. We also observe from the definition that

\[
\frac{d}{ds} F_t^* F_s^* q = F_t^* \frac{d}{ds} F_s^* q,
\]

hence

\[
\frac{d}{dt} F_t^* q = \frac{d}{ds} F_{t+s}^* q \Big|_{s=0} = F_t^* \mathcal{L}_u q,
\]

as claimed.

Lemma D.1 holds for all fields for which one can define a pull-back with the flow $F_t$, even though here we considered only the three instances we need, namely, scalars, vector fields, and densities. We can now compute explicitly the Lie derivative of such quantities.

For a scalar field (q = f), the Lie derivative amounts to

\[
\mathcal{L}_u f = u \cdot \nabla f = u(f), \tag{D.9}
\]

as follows by direct computation,

\[
\frac{d}{dt} F_t^* f(x) = \frac{d}{dt} f\big( F_t(x) \big) = u\big( F_t(x) \big) \cdot \nabla f\big( F_t(x) \big) = F_t^* (u \cdot \nabla f).
\]

Evaluating at $t = 0$ yields equation (D.9), and we also have a direct proof of the Lie derivative formula of lemma D.1 for scalars. This calculation also shows that indeed $\mathcal{L}_u$ generalizes the directional derivative along the integral lines of the field $u$.

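A vector field, however, has a less trivial Lie derivative, since we have, cf. equation (D.4),

\[
F_t^* v(x) = v\big( F_t(x) \big) \cdot \nabla F_{-t}\big( F_t(x) \big),
\]

where we have used the fact that $F_t^{-1} = F_{-t}$, cf. section 1.2. In order to compute the Lie derivative of a vector field, it is convenient to write the time derivative of $F_t^* v$ in the form

\[
\begin{aligned}
\frac{d}{dt} F_t^* v(x) &= \frac{d F_t(x)}{dt} \cdot \nabla v\big( F_t(x) \big) \cdot \nabla F_{-t}\big( F_t(x) \big) + v\big( F_t(x) \big) \cdot \frac{d}{dt} \big[ \nabla F_{-t}\big( F_t(x) \big) \big] \\
&= u\big( F_t(x) \big) \cdot \nabla v\big( F_t(x) \big) \cdot \nabla F_{-t}\big( F_t(x) \big) + v\big( F_t(x) \big) \cdot \frac{d}{dt} \big[ \nabla F_{-t}\big( F_t(x) \big) \big].
\end{aligned}
\]

The last term can be conveniently computed from the identity $F_{-t}\big( F_t(x) \big) = x$, which implies $\nabla F_t(x) \cdot \nabla F_{-t}\big( F_t(x) \big) = I$, where $I$ is the identity matrix. Differentiating in time yields

\[
\nabla F_t(x) \cdot \Big[ \nabla u\big( F_t(x) \big) \cdot \nabla F_{-t}\big( F_t(x) \big) + \frac{d}{dt} \big[ \nabla F_{-t}\big( F_t(x) \big) \big] \Big] = 0,
\]

and since $\nabla F_t(x)$ is the Jacobian matrix of an invertible transformation, it is non-singular and therefore

\[
\frac{d}{dt} \big[ \nabla F_{-t}\big( F_t(x) \big) \big] = -\nabla u\big( F_t(x) \big) \cdot \nabla F_{-t}\big( F_t(x) \big).
\]

On making use of this result we obtain

\[
\begin{aligned}
\frac{d}{dt} F_t^* v(x) &= u\big( F_t(x) \big) \cdot \nabla v\big( F_t(x) \big) \cdot \nabla F_{-t}\big( F_t(x) \big) - v\big( F_t(x) \big) \cdot \nabla u\big( F_t(x) \big) \cdot \nabla F_{-t}\big( F_t(x) \big) \\
&= F_t^* \big( u \cdot \nabla v - v \cdot \nabla u \big)(x),
\end{aligned}
\]

and evaluating at $t = 0$ we obtain

\[
\mathcal{L}_u v = u \cdot \nabla v - v \cdot \nabla u = [u, v], \tag{D.10}
\]

where the commutator of directional derivatives defines the Lie bracket

\[
[u, v] = u \cdot \nabla v - v \cdot \nabla u. \tag{D.11}
\]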

One should notice that the Lie bracket (D.11) has the same structure encountered in proposition 4.5.

At last, for densities, let us recall that the flow $F_t$ is continuous and connects continuously to the identity map for $t \to 0$. Hence the Jacobian determinant is positive and we have

\[
F_t^* \rho(x) = \big[ \det DF_t \big] \rho\big( F_t(x) \big),
\]

where we have dropped the absolute value on the determinant. For a $C^1$ density we can compute the derivative directly,

\[
\begin{aligned}
\frac{d}{dt} F_t^* \rho(x) &= \rho\big( F_t(x) \big) \frac{d}{dt} \big[ \det DF_t \big] + \big[ \det DF_t \big] \frac{d F_t(x)}{dt} \cdot \nabla \rho\big( F_t(x) \big) \\
&= \big[ \det DF_t \big] \rho\big( F_t(x) \big) \nabla \cdot u\big( F_t(x) \big) + \big[ \det DF_t \big] u\big( F_t(x) \big) \cdot \nabla \rho\big( F_t(x) \big),
\end{aligned}
\]

where we have used proposition 1.7 in the last identity. Upon evaluating at $t = 0$ we have the Lie derivative of a $C^1$ density in the form

\[
\mathcal{L}_u \rho = u \cdot \nabla \rho + \rho \nabla \cdot u = \nabla \cdot [\rho u]. \tag{D.12}
\]

In the remaining part of this appendix, we relate these geometrical ideas to the physical concept of passively advected quantities.
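Before moving on, the Lie derivative of a vector field, equation (D.10), can be checked against the definition (D.7) on a simple example. The sketch below assumes the planar rotation field $u(x, y) = (-y, x)$, whose flow $F_t$ is the rotation by the angle $t$, and the test field $v(x, y) = (x^2, x y)$; both are arbitrary illustrative choices. The pull-back (D.4) is differentiated at $t = 0$ by a central finite difference and compared with $u \cdot \nabla v - v \cdot \nabla u$.

```python
import math


def flow(t: float, p):
    """Flow F_t of u(x, y) = (-y, x): rotation by the angle t (a linear map)."""
    x, y = p
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))


def v(p):
    """Illustrative vector field v(x, y) = (x^2, x y)."""
    x, y = p
    return (x * x, x * y)


def pullback_v(t: float, p):
    """F_t^* v(x) = v(F_t(x)) . grad F_{-t}(F_t(x)).

    F_{-t} is linear, so contracting with its Jacobian amounts to rotating
    the vector v(F_t(x)) by the angle -t.
    """
    return flow(-t, v(flow(t, p)))


def lie_bracket(p):
    """[u, v] = u . grad v - v . grad u for u = (-y, x), v = (x^2, x y)."""
    x, y = p
    u1, u2 = -y, x
    v1, v2 = v(p)
    # grad v: d_x v = (2x, y), d_y v = (0, x)
    u_grad_v = (u1 * 2.0 * x + u2 * 0.0, u1 * y + u2 * x)
    # grad u: d_x u = (0, 1), d_y u = (-1, 0)
    v_grad_u = (-v2, v1)
    return (u_grad_v[0] - v_grad_u[0], u_grad_v[1] - v_grad_u[1])


point = (0.7, -0.4)
h = 1e-5
plus, minus = pullback_v(h, point), pullback_v(-h, point)
finite_difference = tuple((plus[i] - minus[i]) / (2.0 * h) for i in range(2))
exact = lie_bracket(point)
error = max(abs(finite_difference[i] - exact[i]) for i in range(2))

print(finite_difference, exact, error)
```

For this particular pair of fields the bracket evaluates to $(-x y, -y^2)$, and the finite-difference derivative of the pull-back reproduces it up to the discretization error.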

Passively advected quantities. We shall now consider scalar, vector, and density fields passively advected by a time-dependent velocity field $u \in C^1(I \times \Omega, \mathbb{R}^d)$ on a time interval $I = [-T, +T]$ and over the spatial domain $\Omega$. The flow of the velocity field $u$ is a map $F_{t,s} : U \to \Omega$ defined locally in a neighborhood $U$ of any point $x_0 \in \Omega$ for a (possibly small) time interval $t \in (s - \varepsilon, s + \varepsilon) \subset I$, as introduced in section 1.2. The neighborhood $U$ and the constant $\varepsilon$ depend, in general, on the point $(s, x_0) \in I \times \Omega$. We shall not need globally defined flows in this section, but we do assume that the field $u$ is sufficiently regular for the flow, viewed as a map $(t, x) \mapsto F_{t,s}(x)$, to be in $C^2$, so that we can exchange the order of derivatives.

First we lift the velocity field $u(t, x)$ to a vector field over the whole space-time domain $I \times \Omega$. Specifically, we define the lifted field

\[
\hat{u}(t, x) = \big( 1, u(t, x) \big).
\]

The Lagrangian trajectories of $\hat{u}$ parametrized by $\tau$ are given by

\[
\frac{dt(\tau)}{d\tau} = 1, \qquad \frac{dx(\tau)}{d\tau} = u(t, x), \qquad t(0) = t, \quad x(0) = x.
\]

The solution of this Cauchy problem is $t(\tau) = t + \tau$ for time and $x(\tau) = F_{t+\tau,t}(x)$ for position. This in fact satisfies the initial condition, since $F_{t,t}(x) = x$, and we have, in particular,

\[
\frac{d}{d\tau} F_{t+\tau,t}(x) = \frac{d}{ds} F_{s,t}(x) \Big|_{s=t+\tau} = u\big( s, F_{s,t}(x) \big) \Big|_{s=t+\tau} = u\big( t + \tau, F_{t+\tau,t}(x) \big).
\]

We conclude that the flow of the lifted field is

\[
\hat{F}_\tau(t, x) = \big( t + \tau, F_{t+\tau,t}(x) \big),
\]

and this forms a semi-group since $\hat{u}$ is autonomous even when $u$ is not.

We can now say when a quantity $q$ is passively transported, that is, when its variation under the action of the lifted flow $\hat{F}_\tau$ is zero.

Definition D.1 (Passively advected quantities). We say that a field $q = q(t, x)$ defined on $I \times \Omega$ is passively advected by the velocity field $u = u(t, x)$ if

\[
\hat{F}_\tau^* q = q, \quad \text{or equivalently} \quad \mathcal{L}_{\hat{u}} q = 0, \tag{D.13}
\]

where $\hat{u}$ is the velocity field $u$ lifted to $I \times \Omega$ and $\hat{F}_\tau$ is the lifted flow.

The equivalence of the two definitions follows from the definition of the Lie derivative: a quantity is invariant under pull-back if and only if its Lie derivative is zero (the "if" part follows from lemma D.1).

Since the lifted field $\hat{u}$ is autonomous, the Lie derivative $\mathcal{L}_{\hat{u}}$ is defined according to (D.7). It is more convenient to give a characterization of condition (D.13) in terms of the Lie derivative along the time-dependent vector field $u$, which we need to define. For a time-dependent field we extend definition (D.7) using the flow $F_{s,t}$ evolving the Lagrangian trajectories from $t$ to $s$. Specifically, we define

\[
\mathcal{L}_u q = \frac{d}{ds} F_{t+s,t}^* q_t \Big|_{s=0}, \tag{D.14}
\]

where $q_t = q(t, \cdot)$. One can directly verify that expressions (D.9), (D.10), and (D.12) hold for non-autonomous fields as well.

We have that passively advected quantities satisfy a very specific partial differential equation, independently of whether they are scalar, vector, or density fields.

Proposition D.2. A passively advected quantity $q$ satisfies

\[
\partial_t q + \mathcal{L}_u q = 0, \tag{D.15}
\]

where $\mathcal{L}_u$ is defined in equation (D.14).

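Proof. Let us introduce the maps

\[
\begin{aligned}
\varphi_\tau &: (t, x) \mapsto (t + \tau, x), \\
\psi_\tau &: (t, x) \mapsto \big( t, F_{t+\tau,t}(x) \big),
\end{aligned}
\]

and notice that the lifted flow $\hat{F}_\tau$ satisfies

\[
\hat{F}_\tau(t, x) = \big( t + \tau, F_{t+\tau,t}(x) \big) = \varphi_\tau \circ \psi_\tau(t, x) = \psi_\tau \circ \varphi_\tau(t, x).
\]

Specifically, $\varphi_\tau$ is a time translation at fixed $x$, while $\psi_\tau$ is related to the Lie derivative along $u$,

\[
\mathcal{L}_u q = \frac{d}{ds} F_{t+s,t}^* q_t \Big|_{s=0} = \frac{d}{d\tau} \psi_\tau^* q \Big|_{\tau=0}.
\]

Then we have, for the Lie derivative along the lifted field $\hat{u}$,

\[
\mathcal{L}_{\hat{u}} q = \lim_{\tau \to 0} \frac{\hat{F}_\tau^* q - q}{\tau} = \lim_{\tau \to 0} \frac{\varphi_\tau^* \psi_\tau^* q - q}{\tau},
\]

and

\[
\frac{\varphi_\tau^* \psi_\tau^* q - q}{\tau} = \varphi_\tau^* \frac{\psi_\tau^* q - q}{\tau} + \frac{\varphi_\tau^* q - q}{\tau}.
\]

Since $\varphi_\tau$ is just a time translation, its Jacobian matrix is the identity and the pull-back $\varphi_\tau^* q$ reduces to $q(t + \tau, x)$, so that the second term tends to $\partial_t q$. As for the first term,

\[
\varphi_\tau^* \frac{\psi_\tau^* q(t, x) - q(t, x)}{\tau} = \mathcal{L}_u q(t + \tau, x) + h(\tau, t + \tau, x),
\]

where $h(\tau, t, x) = o(\tau)$, thus its limit is $\mathcal{L}_u q(t, x)$.

Let us now specialize proposition D.2 to the case of scalar, vector, and density fields. This is the main result of this appendix.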

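Proposition D.3. Let $u \in C^1(I \times \Omega)$ be a time-dependent velocity field with a (locally defined) flow $F_t$ of class $C^2$ as a function of $(t, x)$.

(i) A scalar field $f \in C^1(I \times \Omega)$ is passively advected by the flow of $u$ if either

\[
f\big( t, F_t(x) \big) = f(0, x), \quad \text{or equivalently} \quad \partial_t f + u \cdot \nabla f = 0.
\]

(ii) A vector field $w \in C^1(I \times \Omega)$ is passively advected by the flow of $u$ if either

\[
w\big( t, F_t(x) \big) = w(0, x) \cdot \nabla F_t(x), \quad \text{or equivalently} \quad \partial_t w + u \cdot \nabla w - w \cdot \nabla u = 0.
\]

(iii) A density field $\rho \in C^1(I \times \Omega)$ is passively advected by the flow of $u$ if either

\[
\rho\big( t, F_t(x) \big) = \rho(0, x) \big[ \det DF_t(x) \big]^{-1}, \quad \text{or equivalently} \quad \partial_t \rho + \nabla \cdot (u \rho) = 0.
\]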

Statement (i) means that a passively advected scalar $f$ is constant along Lagrangian trajectories and satisfies the advection equation, as we have already observed in section 1.4. The analogous result for vector fields corresponds to statements (ii) and (iii) in lemma 4.5, i.e., conditions (4.11) state the fact that the vector field is passively advected. As discussed in section 4.3, the magnetic field $B$ in the case of incompressible ideal MHD is passively advected by the flow. For compressible ideal MHD the passively advected field is $B / \rho$. At last, for densities we see that the continuity equation of fluid dynamics, obtained from the Reynolds transport theorem in section 1.5, is equivalent to the fact that $\rho$ is passively advected by $u$. The identities in proposition D.3 play a central role in the construction of variational formulations for MHD, as discussed in section 6.

Proof. If we specialize the condition $\hat{F}_\tau^* q = q$ of definition D.1, where $\hat{F}_\tau$ is the lifted flow, for $\tau = t$ and evaluate at $(0, x)$ we have, for a scalar $f$,

\[
f\big( t, F_{t,0}(x) \big) = f(0, x),
\]

and $F_{t,0}(x) = F_t(x)$. Analogously, for a density $\rho$ we have

\[
\rho\big( t, F_{t,0}(x) \big) \big[ \det DF_{t,0}(x) \big] = \rho(0, x),
\]

from which one has the claimed relation in (iii). For a vector field we consider the equivalent condition in terms of the push-forward, $(\hat{F}_\tau)_* q = q$, and evaluate this at $\big( t, F_t(x) \big)$ for $\tau = t$. The push-forward evaluated at a generic point $(t, x)$ reads

\[
(\hat{F}_\tau)_* w(t, x) = w\big( t - \tau, F_{t,t-\tau}^{-1}(x) \big) \cdot \nabla F_{t,t-\tau}\big( F_{t,t-\tau}^{-1}(x) \big) = w(t, x),
\]

and evaluating at $\big( t, F_t(x) \big)$ with $\tau = t$ we have

\[
w\big( 0, F_{t,0}^{-1}(F_t(x)) \big) \cdot \nabla F_{t,0}\big( F_{t,0}^{-1}(F_t(x)) \big) = w\big( t, F_t(x) \big),
\]

which gives the claimed relation. The equivalent differential equation follows from proposition D.2 together with the explicit expression for the Lie derivative of scalar, vector, and density fields.
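Statement (iii) can be illustrated on a one-dimensional example with an explicitly known flow. The sketch below assumes the (autonomous) velocity field $u(x) = x$, for which $F_t(x) = x e^t$ and $\det DF_t = e^t$, and the initial density $\rho(0, x) = e^{-x^2}$; these choices are made only for the example. The Lagrangian relation defines $\rho(t, y)$, and the residual of the continuity equation is then estimated with central finite differences.

```python
import math


def rho0(x: float) -> float:
    """Illustrative initial density rho(0, x)."""
    return math.exp(-x * x)


def flow(t: float, x: float) -> float:
    """Flow F_t(x) = x exp(t) of the velocity field u(x) = x."""
    return x * math.exp(t)


def jacobian(t: float) -> float:
    """det DF_t, independent of x for this linear flow."""
    return math.exp(t)


def rho(t: float, y: float) -> float:
    """Density defined by rho(t, F_t(x)) = rho(0, x) / det DF_t(x)."""
    x = y * math.exp(-t)  # x = F_t^{-1}(y)
    return rho0(x) / jacobian(t)


def continuity_residual(t: float, y: float, h: float = 1e-5) -> float:
    """Central-difference estimate of d_t rho + d_y (u rho) with u(y) = y."""
    def flux(z: float) -> float:
        return z * rho(t, z)

    d_t = (rho(t + h, y) - rho(t - h, y)) / (2.0 * h)
    d_y_flux = (flux(y + h) - flux(y - h)) / (2.0 * h)
    return d_t + d_y_flux


samples = [(0.0, 0.3), (0.5, -1.2), (1.0, 2.0)]
max_residual = max(abs(continuity_residual(t, y)) for t, y in samples)

# Lagrangian form: rho(t, F_t(x)) det DF_t should reproduce rho(0, x).
lagrangian_error = max(
    abs(rho(t, flow(t, x)) * jacobian(t) - rho0(x))
    for t, x in [(0.5, 0.3), (1.0, -0.8)]
)

print(max_residual, lagrangian_error)
```

Both quantities are zero up to discretization and round-off errors, which is the equivalence between the Lagrangian relation and the continuity equation stated in (iii) for this particular flow.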


References

[1] H. Alfven. Existence of electromagnetic-hydrodynamic waves. Nature,150:405–406, 1942.

[2] H. Alfven. Cosmical Electrodynamics: Fundamental Principles. The Inter-national series of monographs on physics. Oxford University Press, 1953.

[3] L. Woltjer. On hydromagnetic equilibrium. Proceedings of the NationalAcademy of Sciences, 44(9):833–841, 1958.

[4] L. Woltjer. A theorem on force-free magnetic fields. Proceedings of theNational Academy of Sciences, 44(6):489–491, 1958.

[5] H. K. Moffatt. The degree of knottedness of tangled vortex lines. Journalof Fluid Mechanics, null:117–129, 1 1969.

[6] J. B. Taylor. Relaxation of toroidal plasma and generation of reverse mag-netic fields. Phys. Rev. Lett., 33:1139–1141, Nov 1974.

[7] R. L. Ricca. An Introduction to the Geometry and Topology of Fluid Flows.Nato Science Series II:. Springer Netherlands, 2001.

[8] V. I. Arnold and B. A. Khesin. Topological Methods in Hydrodynamics.Springer-Verlag, 1998.

[9] P. J. Morrison and J. M. Greene. Noncanonical Hamiltonian density for-mulation of hydrodynamics and ideal magnetohydrodynamics. Phys. Rev.Lett., 45:790–794, Sep 1980.

[10] P. J. Morrison. Hamiltonian description of the ideal fluid. Rev. Mod. Phys.,70:467–521, Apr 1998.

[11] A. J. Chorin and J. E. Marsden. A Mathematical Introduction to FluidMechanics. Springer New York, 1993.

[12] D. Biskamp. Nonlinear Magnetohydrodynamics. Cambridge UniversityPress, 1993.

[13] D. D. Schnack. Lectures on Magnetohydrodynamics. Springer, 2009.

[14] E. Priest. Magnetohydrodynamics of the Sun. Cambridge University Press,2014.

[15] M. J. Aschwanden. Physics of the Solar Corona: An Introduction withProblems and Solutions. Springer-Praxis, 2005.

[16] J. P. Freidberg. Ideal magnetohydrodynamics. Modern Perspectives inEnergy Series. Plenum Publishing Company Limited, 1987.

[17] H. Zohm. Magnetohydrodynamic Stability of Tokamaks. Wiley, 2014.

[18] J. P. Goedbloed and S. Poedts. Principles of Magnetohydrodynamics: WithApplications to Laboratory and Astrophysical Plasmas. Cambridge Univer-sity Press, 2004.

217

[19] J. P. Goedbloed, R. Keppens, and S. Poedts. Advanced Magnetohydrody-namics: With Applications to Laboratory and Astrophysical Plasmas. Cam-bridge University Press, 2010.

[20] P. A. Davidson. An Introduction to Magnetohydrodynamics. CambridgeTexts in Applied Mathematics. Cambridge University Press, 2001.

[21] F. Gay-Balmaz and D. D. Holm. A geometric theory of selective decaywith applications in mhd. Nonlinearity, 27(8):1747, 2014.

[22] E. S. Gawlik, P. Mullen, D. Pavlov, J.E. Marsden, and M. Desbrun. Geo-metric, variational discretization of continuum theories. Physica D: Non-linear Phenomena, 240(21):1724 – 1760, 2011.

[23] Y. Zhou, H. Qin, J. W. Burby, and A. Bhattacharjee. Variational inte-gration for ideal magnetohydrodynamics with built-in advection equations.Physics of Plasmas, 21(10), 2014.

[24] M. Kraus, E. Tassi, and D. Grasso. Variational integrators for reducedmagnetohydrodynamics. Journal of Computational Physics, 321:435 – 458,2016.

[25] M. Kraus and O. Maj. Variational integrators for ideal magnetohydrody-namics. in preparation, 2017.

[26] G. Duvaut and J. L. Lions. Inequations en thermoelasticite et magneto-hydrodynamique. Archive for Rational Mechanics and Analysis, 46(4):241–279, 1972.

[27] M. Sermange and R. Temam. Some mathematical questions related tothe mhd equations. Communications on Pure and Applied Mathematics,36(5):635–664, 1983.

[28] P. Secchi. On the equations of ideal incompressible magnetohydrodynamics.Rendiconti del Seminario Matematico della Universita di Padova, 90:103–119, 1993.

[29] Q. Chen, C. Miao, and Z. Zhang. On the well-posedness of the ideal MHDequations in the Triebel–Lizorkin spaces. Archive for Rational Mechanicsand Analysis, 195(2):561, 2009.

[30] C. L. Fefferman, D. S. McCormick, J. C. Robinson, and J. L. Rodrigo.Higher order commutator estimates and local existence for the non-resistiveMHD equations and related models. Journal of Functional Analysis,267(4):1035 – 1056, 2014.

[31] J. E. Marsden and T. J. R. Hughes. Mathematical Foundations of Elasticity.Dover Civil and Mechanical Engineering Series. Dover, 1994.

[32] J. E. Marsden, T. Ratiu, and R. Abraham. Manifolds, Tensor Analysis,and Applications. Springer, third edition edition, 2001.

[33] L. Hormander. Lectures on Nonlinear Hyperbolic Differential Equations.Mathematiques et Applications. Springer Berlin Heidelberg, 1997.

218

[34] T. Tao. Nonlinear Dispersive Equations: Local and Global Analysis. Num-ber no. 106 in Conference Board of the Mathematical Sciences. Regionalconference series in mathematics. American Mathematical Soc., 2006.

[35] D. D. Holm, T. Schmah, C. Stoica, and D. C. P. Ellis. Geometric mechanicsand symmetry: from finite to infinite dimensions. Oxford texts in appliedand engineering mathematics. Oxford University Press, 2013.

[36] F. John. Partial Differential Equations. Springer Berlin Heidelberg, 1980.

[37] C. Cercignani. The Boltzmann Equation and Its Applications. Springer,1975.

[38] S. I. Braginskii. Transport processes in a plasma. In M. A. Leontovich,editor, Reviews of Plasma Physics, volume 1. Consultant Bureau, NewYork, 1965.

[39] R. D. Hazeltine and J. D. Meiss. Plasma Confinement. Frontiers in physics.Addison-Wesley, Advanced Book Program, 1992.

[40] P. Helander and D. J. Sigmar. Collisional Transport in Magnetized Plasmas.Cambridge Monographs on Plasma Physics. Cambridge University Press,2005.

[41] P. L. Bhatnagar, E. P. Gross, and M. Krook. A model for collision pro-cesses in gases. I. Small amplitude processes in charged and neutral one-component systems. Phys. Rev., 94:511–525, May 1954.

[42] B. Perthame. Global existence to the BGK model of Boltzmann equation.Journal of Differential Equations, 82(1):191 – 205, 1989.

[43] L. Saint-Raymond. From the BGK model to the Navier-Stokes equations.Annales Scientifiques de l’Ecole Normale Superieure, 36, 03 2003.

[44] J. D. Huba and NRL. Plasma formulary. Naval Research Lab., 2013.

[45] L. Saint-Raymond. A mathematical PDE perspective on the Chapman-Enskog expansion. Bull. Amer. Math. Soc., 51:247–275, 2014.

[46] S. Chandrasekhar. Hydrodynamic and Hydromagnetic Stability. Dover,1981.

[47] P. Wesseling. Principles of Computational Fluid Dynamics. Springer-Verlag, Berlin, 2001.

[48] P. Andries, P. Le Tallec, J. Perlat, and B. Perthame. The Gaussian-BGKmodel of Boltzmann equation with small Prandtl number. European Jour-nal of Mechanics - B/Fluids, 19(6):813 – 830, 2000.

[49] G. Bertin. Dynamics of Galaxies. Cambridge University Press, 2000.

[50] J. D. Jackson. Classical Electrodynamics. Wiley, 2007.

[51] J. D. Jackson and L. B. Okun. Historical roots of gauge invariance. Reviewsof Modern Physics, 73(3):663–680, 2001.

219

[52] J. K. Hunter. Lecture notes on PDEs, 2016. Available at https://www.

math.ucdavis.edu/~hunter/pdes/pdes.html.

[53] L. C. Evans. Partial differential equations. Graduate studies in mathemat-ics. American Mathematical Society, Providence (R.I.), 1998.

[54] L. Tartar. An Introduction to Sobolev Spaces and Interpolation Spaces.Springer, 2007.

[55] J. Rauch. Hyperbolic Partial Differential Equations and Geometric Optics.Graduate studies in mathematics. American Mathematical Society, 2012.

[56] L. Hormander. The Analysis of Linear Partial Differential Operators I:Distribution Theory and Fourier Analysis. Springer Berlin Heidelberg, sec-ond edition edition, 1990.

[57] P. Degond, F. Deluzet, and D. Savelief. Numerical approximation of theeuler-maxwell model in the quasineutral limit. Journal of ComputationalPhysics, 231(4):1917–1946, 2012.

[58] J. P. Boyd. The devil’s invention: Asymptotic, superasymptotic and hy-perasymptotic series. Acta Applicandae Mathematica, 56(1):1–98, 1999.

[59] K. Kimura and P. J. Morrison. On energy conservation in extended mag-netohydrodynamics. Physics of Plasmas, 21(8):082101, 2014.

[60] M. Lingam, P. J. Morrison, and G. Miloshevich. Remarkable connectionsbetween extended magnetohydrodynamics models. Physics of Plasmas,22(7):072111, 2015.

[61] M. Lingam, P. J. Morrison, and E. Tassi. Inertial magnetohydrodynamics.Physics Letters A, 379(6):570 – 576, 2015.

[62] S. R. Cranmer. Coronal holes. Living Rev. Solar Phys., 6:3–65, 2009.

[63] R.B. White. The Theory of Toroidally Confined Plasmas. World ScientificPublishing Company, 2013.

[64] P. R. Garabedian. Magnetohydrodynamic stability of fusion plasmas. Com-munications on Pure and Applied Mathematics, 51(9-10):1019–1033, 1998.

[65] V. I. Arnold. Vladimir I. Arnold - Collected Works, chapter The asymptoticHopf invariant and its applications, pages 357–375. Springer, Berlin, 2014.

[66] B. A. Khesin. Topological fluid dynamics. Notices of the American Math-ematical Society, 52(1):9–19, 2005.

[67] T. Vogel. On the asymptotic linking number. Proc. of the American Math-ematical Society, 131(7):2289–2297, 2003.

[68] Z. Yoshida. Roles of magnetic helicity in plasma confinement. Journal ofNuclear Science and Technology, 27(3):193–204, 1990.

[69] V. Girault and P.-A. Raviart. Finite Element Methods for Navier-StokesEquations. Springer-Verlag, Berlin, 1986.

220

https://www.math.ucdavis.edu/~hunter/pdes/pdes.html

https://www.math.ucdavis.edu/~hunter/pdes/pdes.html

[70] D. B. Melrose. Instabilities in Space and Laboratory Plasmas. CambridgeUniversity Press, 1986.

[71] T. H. Stix. Waves in Plasmas. American Institute of Physics, New York,1992.

[72] E. Marsch and D. Verscharen. On nonlinear alfvn-cyclotron waves in multi-species plasma. Journal of Plasma Physics, 77(3):385–403, 2011.

[73] R. Abraham and J. E. Marsden. Foundations of Mechanics. AmericanMathematical Soc., 1978.

[74] F. Scheck. Mechanics: From Newton’s Laws to Deterministic Chaos.Springer, 2004.

[75] J. K. Hunter and B. Nachtergaele. Applied Analysis. World ScientificPublishing Company, 2001.

[76] I. M. Gel’fand and S. V. Fomin. Calculus of Variations. Dover, 1963.

[77] E. Hairer, Lubich C., and Wanner G. Geometric Numerical Integra-tion: Structure-Preserving AAlgorithm for Ordinary Differential Equations.Springer, 2013.

[78] J. E. Marsden and M. West. Discrete mechanics and variational integrators.Acta Numerica, 10:357–514, 2001.

[79] J. E. Marsden, G. W. Patrick, and S. Shkoller. Multisymplectic geome-try, variational integrators, and nonlinear pdes. Commun. Math. Phys.,199:351–395, 1998.

[80] M. Kraus. Variational Integrators in Plasma Physics. PhD thesis, ZentrumMathematik, TU Munchen, 2013.

[81] W. A. Newcomb. Lagrangian and hamiltonian methods in magnetohydro-dynamics. Nuclear Fusion, Supplement, part 2:451, 1962.

[82] V. I. Arnold. Sur la geometrie differentielle des groupes de lie de dimensioninfinie et ses applications a l’hydrodynamique des fluides parfaits. Ann.Inst. Fourier, 16:316–361, 1966.

[83] M. E. Taylor. Partial Differential Equations III: Nonlinear Equations, vol-ume 117 of Applied Mathematical Sciences. Springer New York, 2011.

[84] M. Fabian, P. Habala, P. Hajek, V. Montesinos, and V. Zizler. BanachSpace Theory. Springer, 2011.

[85] A. N. Kolmogorov and S. V. Fomin. Element of the Theory of Functionsand Functional Analysis, volume 1 and 2. Dover, 1999.

[86] F. Bampi and A. Morro. The inverse problem of the calculus of vari-ations applied to continuum physics. Journal of Mathematical Physics,23(12):2312–2321, 1982.

221

[87] M. J. Gotay, J. Isenberg, J. E. Marsden, and R. Montgomery. Momen-tum maps and classical fields. Part I: Covariant field theory. Available onarXiv:physics/9801019, 2004.

[88] E. Noether. Invariante variationsroblem. Nachrichten von der Gesellschaftder Wissenschaften zu Gttingen, Mathematisch-Physikalische Klasse, pages235–257, 1918. English translation available on arXiv:physics/0503066.
