Download - AN INTRODUCTION TO MULTISCALE METHODS - WMI - …€¦ · AN INTRODUCTION TO MULTISCALE METHODS G.A. Pavliotis Department of Mathematics Imperial College London and A.M. Stuart …


AN INTRODUCTION TO MULTISCALE METHODS

G.A. Pavliotis
Department of Mathematics
Imperial College London

and

A.M. Stuart
Mathematics Institute
University of Warwick

June 25, 2006


To my parents Aργυρη and Σoυλτανα and to my brother Γιωργo. Carry Home. Γρηγoρης.

For my children Natalie, Sebastian and Isobel. AMS.


Contents

Preface

1 Introduction
  1.1 Motivating Examples
    1.1.1 Example I: Steady Heat Conduction in a Composite Material
    1.1.2 Example II: Homogenization for Advection–Diffusion Equations
    1.1.3 Example III: Averaging, Homogenization and Dynamics
    1.1.4 Example IV: Dimension Reduction in Dynamical Systems
  1.2 Averaging Versus Homogenization
  1.3 Discussion and Bibliography

I Background

2 Analysis
  2.1 Set–Up
  2.2 Notation
  2.3 Banach and Hilbert Spaces
    2.3.1 Banach Spaces
    2.3.2 Hilbert Spaces
  2.4 Function Spaces
    2.4.1 L^p Spaces
    2.4.2 Sobolev Spaces
    2.4.3 Banach Space–Valued Spaces
    2.4.4 Sobolev Spaces of Periodic Functions
  2.5 Two–Scale Convergence
    2.5.1 Two–Scale Convergence for Steady Problems
    2.5.2 Two–Scale Convergence for Time Dependent Problems
  2.6 Equations in Hilbert Spaces
    2.6.1 Lax-Milgram Theory
    2.6.2 Fredholm Alternative
  2.7 Discussion and Bibliography
  2.8 Exercises

3 Probability Theory and Stochastic Processes
  3.1 Set–Up
  3.2 Probability and Expectation
  3.3 Stochastic Processes
  3.4 Martingales and Stochastic Integrals
    3.4.1 Martingales
    3.4.2 The Ito Stochastic Integral
    3.4.3 The Stratonovich Stochastic Integral
  3.5 Weak Convergence of Probability Measures
  3.6 Discussion and Bibliography
  3.7 Exercises

4 Ordinary Differential Equations
  4.1 Set-Up
  4.2 Existence and Uniqueness
  4.3 The Generator
  4.4 Ergodicity
  4.5 Discussion and Bibliography
  4.6 Exercises

5 Markov Chains
  5.1 Set-Up
  5.2 The Generator
  5.3 Existence and Uniqueness
  5.4 Ergodicity
  5.5 Discussion and Bibliography
  5.6 Exercises

6 Stochastic Differential Equations
  6.1 Set-Up
  6.2 Existence and Uniqueness
  6.3 The Generator
  6.4 Ergodicity
  6.5 Discussion and Bibliography
  6.6 Exercises

7 Partial Differential Equations
  7.1 Set-Up
  7.2 Elliptic PDE
  7.3 The Dirichlet Problem for Elliptic PDE
  7.4 The Periodic Problem for Elliptic PDE
  7.5 Fredholm Alternative
  7.6 Parabolic PDE
  7.7 Hyperbolic PDE
  7.8 Discussion and Bibliography
  7.9 Exercises

II Perturbation Expansions

8 Invariant Manifolds for ODE
  8.1 Introduction
  8.2 Full Equations
  8.3 Simplified Equations
  8.4 Derivation
  8.5 Applications
    8.5.1 Linear Fast Dynamics
    8.5.2 Large Time Dynamics
    8.5.3 Centre Manifold
  8.6 Discussion and Bibliography
  8.7 Exercises

9 Averaging for Markov Chains
  9.1 Introduction
  9.2 Full Equations
  9.3 Simplified Equations
  9.4 Derivation
  9.5 Application
  9.6 Discussion and Bibliography

10 Averaging for ODE and SDE
  10.1 Introduction
  10.2 Full Equations
  10.3 Simplified Equations
  10.4 Derivation
  10.5 Applications
    10.5.1 A Skew-Product SDE
    10.5.2 Hamiltonian Mechanics
  10.6 Discussion and Bibliography
  10.7 Exercises

11 Homogenization for ODE and SDE
  11.1 Introduction
  11.2 Full Equations
  11.3 Simplified Equations
  11.4 Derivation
  11.5 Properties of the Effective Equation
  11.6 Applications
    11.6.1 Fast Ornstein-Uhlenbeck Noise
    11.6.2 Fast Chaotic Noise
    11.6.3 Stratonovich Corrections
    11.6.4 Stokes' Law
    11.6.5 Kubo Formula
    11.6.6 Neither Ito nor Stratonovich
    11.6.7 The Levy Area Correction
  11.7 Discussion and Bibliography
  11.8 Exercises

12 Homogenization for Elliptic Equations
  12.1 Introduction
  12.2 Full Equations
  12.3 Simplified Equations
  12.4 Derivation
  12.5 Properties of the Homogenized Coefficients
  12.6 Applications
    12.6.1 The One–Dimensional Case
    12.6.2 Layered Materials
  12.7 Discussion and Bibliography
  12.8 Exercises

13 Homogenization for Parabolic Equations
  13.1 Introduction
  13.2 Full Equations
  13.3 Simplified Equations
  13.4 Derivation of the Homogenized Equation
  13.5 Effective Diffusivity
  13.6 Applications
    13.6.1 Gradient Vector Fields
    13.6.2 The One Dimensional Case
    13.6.3 Divergence–Free Fields
    13.6.4 Shear Flow in 2D
  13.7 The Connection to SDE
  13.8 Discussion and Bibliography
  13.9 Exercises

14 Homogenization for Linear Transport PDE
  14.1 Introduction
  14.2 Full Equations
  14.3 Simplified Equations
  14.4 Derivation
  14.5 The Connection to ODE
  14.6 The One–Dimensional Case
  14.7 The Two–Dimensional Case. Kolmogorov's Theorem
  14.8 Discussion and Bibliography
  14.9 Exercises

III Theory

15 Invariant Manifolds for ODE: The Convergence Theorem
  15.1 The Theorem
  15.2 The Proof
  15.3 Discussion and Bibliography
  15.4 Exercises

16 Averaging for Markov Chains: The Convergence Theorem
  16.1 The Theorem
  16.2 The Proof
  16.3 Discussion and Bibliography
  16.4 Exercises

17 Averaging for SDE: The Convergence Theorem
  17.1 The Theorem
  17.2 The Proof
  17.3 Discussion and Bibliography
  17.4 Exercises

18 Homogenization for SDE: The Convergence Theorem
  18.1 The Theorem
  18.2 The Proof
  18.3 Discussion and Bibliography
  18.4 Exercises

19 Homogenization for Elliptic PDE: The Convergence Theorem
  19.1 The Theorems
  19.2 Proof of the Homogenization Theorem
  19.3 Proof of the Corrector Result
  19.4 Discussion and Bibliography
  19.5 Exercises

20 Homogenization for Parabolic PDE: The Convergence Theorem
  20.1 Introduction
  20.2 The Theorem
  20.3 The Proof
  20.4 Discussion and Bibliography
  20.5 Exercises

21 Homogenization for Transport PDE: The Convergence Theorem
  21.1 Introduction
  21.2 The Theorem
  21.3 The Proof
  21.4 Discussion and Bibliography
  21.5 Exercises

Preface

The aim of these notes is to describe, in a unified fashion, a set of methods for the simplification of a wide variety of problems which all share the common feature of possessing multiple scales. The mathematical methods which we study are often referred to as the methods of averaging and of homogenization. The methods apply to partial differential equations (PDE), stochastic differential equations (SDE), ordinary differential equations (ODE) and Markov chains. The unifying principle underlying the collection of techniques described here is the approximation of singularly perturbed linear equations. The unity of the subject is most clearly visible in the application of perturbation expansions to the approximation of these singular perturbation problems. A significant portion of the notes is devoted to such perturbation expansions. In this context we use the term Result to describe the conclusions of a formal perturbation argument. This enables us to derive important approximation results without the burden of rigorous proof, which can sometimes obfuscate the main ideas. However, we will also study a variety of tools from analysis and probability, used to place the approximations derived on a rigorous footing. The resulting theorems are proved using a range of methods, tailored to different settings. There is less unity to this part of the subject. As a consequence considerable background is required to absorb the entire theoretical side of the subject, and we devote a significant fraction of the book to this background material.

The first section of the notes is devoted to the Background, the second to the Perturbation Expansions which provide the unity of the subject matter, and the third to the Theory justifying these perturbative techniques. We do not necessarily recommend that the reader cover this material in order. A natural way to get an overview of the subject is to read through the section on Perturbation Expansions, referring back to the Background material as needed.¹ The Theory can then be studied, after the form of the approximations is understood, on a case by case basis.

¹ This material is necessarily terse, and the Discussion and Bibliography sections at the end of each chapter provide pointers to more detailed literature.

The subject matter in this set of notes has, for the most part, been known for several decades. However, the particular presentation of the material here is, we believe, particularly suited to the pedagogical goal of communicating the subject area to the wide range of mathematicians, scientists and engineers who are currently engaged in the use of these tools to tackle the enormous range of applications that require them. In particular we have chosen a setting which demonstrates quite clearly the wide applicability of the techniques to PDE, SDE, ODE and Markov chains, as well as highlighting the unity of the approach. Such a wide-ranging setting is not undertaken, we believe, in existing books, or is done so less explicitly than in this text. We have chosen to use the phrasing Multiscale Methods in the title of the book because the material presented here forms the backbone of a significant portion of the amorphous field which now goes by that name. However we do recognize that there are vast parts of the field which we do not cover in this book.

These notes are meant to be an introduction, aimed primarily towards graduate students in mathematics, applied mathematics and physics. This is particularly true for the third part of the book, where we analyze rigorously only simplified versions of the models that were studied in the second part of the book. Extensions and generalizations of the results presented in these notes, as well as references to the literature, are presented in the Discussion and Bibliography section at the end of each chapter. All chapters are supplemented with exercises. We hope that the exercises will make these notes more appropriate for use as a textbook or for self study.

Warning

This is a draft set of notes. It has not been checked and refined to the standard required of a published book. The authors would welcome comments, remarks and suggestions by all readers, from typos through to major errors and structural deficiencies. All comments can be sent to the authors via e-mail ([email protected] and [email protected]). Updated versions of the notes will be available from the authors' web pages: http://www.ma.ic.ac.uk/~pavl and http://www.maths.warwick.ac.uk/~stuart.

Acknowledgements

We are grateful to Dror Givon, Martin Hairer, Raz Kupferman, David White and Konstantinos Zygalakis for many helpful comments which substantially improved these lecture notes. The book grew out of courses on multiscale methods and homogenization that were taught by the authors at Warwick University and Imperial College London. We thank the students who attended these courses.


Chapter 1

Introduction

1.1 Motivating Examples

In this section we describe four examples which illustrate the range of problems that we will study in the notes. The first illustrates homogenization in the context of linear elliptic PDE. The second illustrates related ideas in the context of time-dependent PDE of advection-diffusion type. Through the connection between hyperbolic (transport) PDE and ODE via the method of characteristics, and parabolic PDE and SDE via Ito's formula, we show that the methods of homogenization developed for the study of linear PDE can also be applied to study dimension reduction in ODE and SDE. We then finish with an example concerning variable elimination in dynamical systems.

1.1.1 Example I: Steady Heat Conduction in a Composite Material

To introduce ideas related to homogenization we consider the problem of steady heat conduction in a composite material whose material properties vary rapidly on the macroscopic scale. For simplicity we assume that the heterogeneities are periodic in space. If $\Omega \subset \mathbb{R}^d$ denotes the domain occupied by the material, then the size of the domain defines the macroscopic length scale $L$. On the other hand, the period of the heterogeneities defines the microscopic length scale $\varepsilon$ of the problem. Assuming that $L = O(1)$ and that the heterogeneities are small, $\varepsilon \ll 1$, the phenomenon of steady heat conduction can be described by the following elliptic boundary value problem:¹

$$
-\nabla \cdot \left( A\left(\frac{x}{\varepsilon}\right) \nabla u^\varepsilon \right) = f, \quad \text{for } x \in \Omega, \tag{1.1.1a}
$$

$$
u^\varepsilon(x) = 0, \quad \text{for } x \in \partial\Omega. \tag{1.1.1b}
$$

The matrix $A(y)$ is the thermal conductivity tensor and $u^\varepsilon(x)$ denotes the temperature field. The purpose of homogenization theory is to study the limit of this equation as $\varepsilon \to 0$. From a physical point of view, this limit corresponds to the case where the heterogeneities become vanishingly small. In other words, we aim to replace the original, highly heterogeneous material characterized by the coefficients $A(x/\varepsilon)$ with an effective, homogeneous material which is characterized by constant coefficients $\overline{A}$. Hence the name homogenization.

In later chapters we will show that, under appropriate assumptions on the matrix $A(y)$, the function $f(x)$ and the domain $\Omega$, the homogenized equation is

$$
-\overline{A} : \nabla\nabla u = f, \quad \text{for } x \in \Omega, \tag{1.1.2a}
$$

$$
u(x) = 0, \quad \text{for } x \in \partial\Omega. \tag{1.1.2b}
$$

The constant homogenized coefficients $\overline{A}$ are given by the formula

$$
\overline{A} = \int_Y A(y) \left( I + \nabla_y \chi^T(y) \right) dy. \tag{1.1.3}
$$

The (first order) corrector $\chi(y)$ solves the cell problem

$$
-\nabla_y \cdot \left( A(y) \nabla_y \chi^T(y) \right) = \nabla_y \cdot A^T(y), \quad \chi(y) \text{ is 1-periodic}. \tag{1.1.4}
$$

Thus, the calculation of the effective coefficients involves the solution of a partial differential equation posed on the unit cell (i.e. with periodic boundary conditions), together with the computation of the integrals in (1.1.3). Notice that finding the homogenized solution $u$ requires the solution of two elliptic PDE: the periodic cell problem (1.1.4), which allows construction of $\overline{A}$ given by (1.1.3), and the Dirichlet problem (1.1.2). The important point is that these elliptic equations do not depend upon the small scale $\varepsilon$ and hence their numerical solution is far less time-consuming than is direct numerical solution of (1.1.1).
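The two-step recipe just described can be made concrete in one space dimension, where the cell problem can be solved by hand: $-(a\chi')' = a'$ gives $a(1 + \chi') = \text{const}$, and periodicity of $\chi$ then forces the effective coefficient to be the harmonic mean $\overline{a} = \left(\int_0^1 a(y)^{-1}\,dy\right)^{-1}$. The sketch below is only an illustration and is not taken from the notes: the conductivity $a(y) = 1/(2 + \cos 2\pi y)$, the source $f \equiv 1$, the domain $(0,1)$ and all function names are assumptions chosen so that the answers are known in closed form.

```python
import math

def inv_a(y):
    # 1/a(y) for the assumed 1-periodic conductivity a(y) = 1/(2 + cos(2*pi*y))
    return 2.0 + math.cos(2.0 * math.pi * y)

def midpoint(g, lo, hi, n):
    # Composite midpoint rule for the integral of g over [lo, hi]
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def homogenized_coefficient(n=2000):
    # 1D effective coefficient: harmonic mean of a over the unit cell
    return 1.0 / midpoint(inv_a, 0.0, 1.0, n)

def arithmetic_mean(n=2000):
    # Naive average of a, computed only for comparison
    return midpoint(lambda y: 1.0 / inv_a(y), 0.0, 1.0, n)

def u_eps_at_half(eps, n=20000):
    # Solution of -(a(x/eps) u')' = 1, u(0) = u(1) = 0, evaluated at x = 1/2.
    # Integrating once gives a u' = c - x, so
    # u(x) = int_0^x (c - s) / a(s/eps) ds, with c fixed by u(1) = 0.
    w = lambda s: inv_a(s / eps)
    c = midpoint(lambda s: s * w(s), 0.0, 1.0, n) / midpoint(w, 0.0, 1.0, n)
    return midpoint(lambda s: (c - s) * w(s), 0.0, 0.5, n // 2)

a_bar = homogenized_coefficient()
# Homogenized problem -a_bar u'' = 1 has solution u(x) = x (1 - x) / (2 a_bar)
u_hom_at_half = 0.25 / (2.0 * a_bar)

if __name__ == "__main__":
    print("harmonic mean a_bar  :", a_bar)
    print("arithmetic mean of a :", arithmetic_mean())
    print("homogenized u(1/2)   :", u_hom_at_half)
    for eps in (0.1, 0.05, 0.01):
        print("eps =", eps, " u_eps(1/2) =", u_eps_at_half(eps))
```

For this particular choice the arithmetic mean of $a$ is $1/\sqrt{3}$ while the harmonic mean is $1/2$, so the example also shows that the effective coefficient is not simply the average of the microscopic one.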

As well as deriving the homogenized equations by use of perturbation expansions, we will also prove that the solution $u^\varepsilon(x)$ of (1.1.1) converges to the solution $u(x)$ of the homogenized equation (1.1.2) as $\varepsilon \to 0$, in some appropriate sense. The homogenized equation will be derived using asymptotic expansions in Chapter 12. The rigorous homogenization theorem will be proved in Chapter 19.

¹ The notation that we will be using in these notes is explained in Section 2.2.

1.1.2 Example II: Homogenization for Advection–Diffusion Equations

We now show how the ideas of homogenization can be used to study time-dependent parabolic PDE. Consider some physical quantity which is immersed in a fluid, for example a chemical pollutant in the atmosphere. Under the assumption that this quantity does not affect the fluid velocity field $v(x,t)$, its concentration field $T(x,t)$ satisfies the advection-diffusion equation

$$
\frac{\partial T}{\partial t} + v(x,t) \cdot \nabla T = D \Delta T, \tag{1.1.5a}
$$

$$
T(x, t = 0) = T_0(x). \tag{1.1.5b}
$$

Let us assume that the fluid velocity is smooth, steady, $v(x,t) = -b(x)$, incompressible, $\nabla \cdot b(x) = 0$, and periodic with period 1 in all directions. Imagine that $T_0(x) = f(\varepsilon x)$ so that initially the concentration is slowly varying in space. It is then reasonable to expect that the concentration will only vary significantly on large length and time scales. Homogenization techniques enable us to show that the re-scaled concentration field $T(x/\varepsilon, t/\varepsilon^2)$, scaled so as to bring out large length and time scales, converges, as $\varepsilon$ tends to 0, to the solution of the heat equation

$$
\frac{\partial T}{\partial t} = K : \nabla\nabla T.
$$

Here $K$ denotes the effective diffusion tensor

$$
K = D I + \int_{\mathbb{T}^d} b(y) \otimes \chi(y) \, dy.
$$

The corrector field $\chi(y)$ solves the cell problem

$$
-D \Delta_y \chi(y) - b(y) \cdot \nabla_y \chi(y) = b(y),
$$

with periodic boundary conditions.

The comments made in Example I apply equally well here: finding the homogenized field requires solution of two PDE (one periodic elliptic, the other parabolic) which are independent of $\varepsilon$, and hence amenable to numerical solution. Furthermore these ideas can be made rigorous and error estimates found. The homogenized equation will be derived in Chapter 13. The rigorous homogenization theorem will be proved in Chapter 20.
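As a small worked instance of the formulas for $K$ and $\chi$, consider a two-dimensional shear flow $b(y) = (0, \sin 2\pi y_1)$, which is smooth, periodic, divergence free and has mean zero. This specific flow, the value of $D$ and the function names below are illustrative assumptions, not taken from the text. Because $b_1 = 0$ and the corrector depends only on $y_1$, the advection term in the cell problem vanishes and one finds $\chi_2(y_1) = \sin(2\pi y_1)/(4\pi^2 D)$, hence $K_{22} = D + 1/(8\pi^2 D)$. The sketch checks both statements numerically.

```python
import math

D = 0.1  # molecular diffusivity (illustrative value)

def b2(y1):
    # Second component of the assumed shear flow b(y) = (0, sin(2*pi*y1))
    return math.sin(2.0 * math.pi * y1)

def chi2(y1):
    # Candidate corrector component, depending on y1 only
    return math.sin(2.0 * math.pi * y1) / (4.0 * math.pi ** 2 * D)

def cell_residual(y1, h=1e-4):
    # Residual of -D * chi2'' - b . grad(chi2) - b2 at the point y1.
    # The advection term is zero here: b1 = 0 and chi2 does not depend on y2.
    d2 = (chi2(y1 + h) - 2.0 * chi2(y1) + chi2(y1 - h)) / h ** 2
    return -D * d2 - b2(y1)

def effective_diffusivity_22(n=1000):
    # K_22 = D + integral over the unit cell of b2 * chi2 (midpoint rule in y1)
    h = 1.0 / n
    integral = h * sum(b2((i + 0.5) * h) * chi2((i + 0.5) * h) for i in range(n))
    return D + integral

K22 = effective_diffusivity_22()
K22_closed_form = D + 1.0 / (8.0 * math.pi ** 2 * D)

if __name__ == "__main__":
    worst = max(abs(cell_residual(k / 50.0)) for k in range(50))
    print("largest cell-problem residual:", worst)
    print("K22 (quadrature) :", K22)
    print("K22 (closed form):", K22_closed_form)
```

In this example the enhancement $1/(8\pi^2 D)$ grows as $D$ decreases, so the effective diffusivity in the direction of the shear can be much larger than the molecular one.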


1.1.3 Example III: Averaging, Homogenization and Dynamics

Under the rescaling described above, namely $x \to x/\varepsilon$ and $t \to t/\varepsilon^2$, the equation (1.1.5) becomes, for steady velocity fields,

$$
\frac{\partial T}{\partial t} - \frac{1}{\varepsilon} b\left(\frac{x}{\varepsilon}\right) \cdot \nabla T = D \Delta T, \tag{1.1.6a}
$$

$$
T(x, t = 0) = f(x). \tag{1.1.6b}
$$

In the case $D = 0$ this equation can be solved by the method of characteristics; the characteristics are obtained by solving the ODE

$$
\frac{dx^\varepsilon}{dt} = b\left(\frac{x^\varepsilon}{\varepsilon}\right).
$$

Since $b(x)$ is periodic, and $x/\varepsilon$ varies rapidly on the scale of the period, it is natural to try to average the equation to eliminate these rapid oscillations. Thus we see that eliminating fast scales in a time-dependent transport PDE is intimately related to averaging in ODE. See Chapters 14 and 21.
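To see what averaging means for the characteristic ODE, it helps to look at a scalar example. If $b$ is 1-periodic and strictly positive, the time taken to cross one period of length $\varepsilon$ is $\varepsilon \int_0^1 dy / b(y)$, so over long distances the solution moves with the harmonic-mean velocity $\overline{b} = \left(\int_0^1 b(y)^{-1}\,dy\right)^{-1}$. The sketch below is a hedged illustration under assumptions of our own: the velocity $b(y) = 1/(2 + \cos 2\pi y)$ (in one dimension we necessarily drop the divergence-free assumption, which would force $b$ to be constant), the Runge-Kutta integrator, the step size and the function names are not from the notes.

```python
import math

def b(y):
    # Assumed 1-periodic, strictly positive velocity (illustrative choice)
    return 1.0 / (2.0 + math.cos(2.0 * math.pi * y))

def harmonic_mean_velocity(n=2000):
    # Effective velocity: reciprocal of the period average of 1/b (midpoint rule)
    h = 1.0 / n
    return 1.0 / (h * sum(1.0 / b((i + 0.5) * h) for i in range(n)))

def solve_characteristic(eps, t_end=1.0, dt=1e-4, x0=0.0):
    # Classical fourth-order Runge-Kutta for dx/dt = b(x / eps)
    rhs = lambda x: b(x / eps)
    x = x0
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * dt * k1)
        k3 = rhs(x + 0.5 * dt * k2)
        k4 = rhs(x + dt * k3)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

b_bar = harmonic_mean_velocity()

if __name__ == "__main__":
    print("averaged velocity b_bar:", b_bar)
    for eps in (0.1, 0.01):
        print("eps =", eps, " x_eps(1) =", solve_characteristic(eps),
              " averaged X(1) =", b_bar * 1.0)
```

The averaged equation here is simply $dX/dt = \overline{b}$, and the gap between $x^\varepsilon(1)$ and $X(1)$ shrinks in proportion to $\varepsilon$ for this example.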

In the case where $D \neq 0$ the equation (1.1.6) is the backward Kolmogorov equation for $x$ solving the SDE

$$
\frac{dx^\varepsilon}{dt} = v\left(\frac{x^\varepsilon}{\varepsilon}\right) + \sqrt{2D} \, \frac{dW}{dt}, \tag{1.1.7}
$$

where $W$ is Brownian motion. This means that

$$
T(x(0), t) = \mathbb{E}\big( f(x(t)) \mid x(0) \big),
$$

where $\mathbb{E}$ denotes averaging with respect to the measure on Brownian motion (Wiener measure). Again we can try to eliminate the rapidly varying quantity $x/\varepsilon$. If $v$ is periodic, divergence free and mean zero then the result described in the previous example shows that $x^\varepsilon(t)$, the solution of (1.1.7), converges, in the limit as $\varepsilon \to 0$, to $X(t)$, where $X(t)$ is a Brownian motion with diffusion coefficient $\sqrt{2K}$:

$$
\frac{dX}{dt} = \sqrt{2K} \, \frac{dW}{dt}.
$$

The connection between homogenization in parabolic PDE and in SDE is discussed in Chapter 11. Rigorous homogenization theorems for SDE are proved in Chapter 18.


1.1.4 Example IV: Dimension Reduction in Dynamical Systems

The methods applied to derive homogenized elliptic and parabolic PDE can also be used to average out, or homogenize, the fast scales in systems of ODE and SDE. Doing so leads to effective equations which do not contain the small parameter $\varepsilon$ and are hence more amenable to numerical solution. The prototypical example is a dynamical system of the form

$$
\begin{aligned}
\frac{dx}{dt} &= f(x, y), \\
\frac{dy}{dt} &= \frac{1}{\varepsilon} g(x, y).
\end{aligned}
$$

In situations of this type, where there is a scale separation, it is often the case that $y$ can be eliminated and an approximate equation for the evolution of $x$ can be found. We write the approximate equation in the form

$$
\frac{dX}{dt} = F(X).
$$

In the simplest situation $y$ is eliminated through an invariant manifold; in more complex systems it is eliminated through averaging. Such results may be viewed as functional versions of the law of large numbers.
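The invariant-manifold case can be illustrated with a toy linear system. Everything specific in the sketch below is an assumption made for illustration rather than an example from the notes: we take $f(x, y) = -y$ and $g(x, y) = x - y$, so that the fast variable relaxes towards $y = \eta(x) = x$, and the leading-order reduced equation is $dX/dt = F(X) = f(X, \eta(X)) = -X$. The code integrates the full system with a standard Runge-Kutta scheme and compares $x(1)$ with the reduced solution $X(1) = x_0 e^{-1}$.

```python
import math

def f(x, y):
    # Slow vector field (illustrative choice)
    return -y

def g(x, y):
    # Fast vector field: y relaxes to eta(x) = x on a time scale of order eps
    return x - y

def solve_full(eps, x0=1.0, y0=1.0, t_end=1.0, dt=1e-3):
    # Classical fourth-order Runge-Kutta for dx/dt = f(x, y), dy/dt = g(x, y) / eps.
    # dt must stay small relative to eps for this explicit scheme to be stable.
    def rhs(x, y):
        return f(x, y), g(x, y) / eps
    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        k1x, k1y = rhs(x, y)
        k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = rhs(x + dt * k3x, y + dt * k3y)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        y += dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x, y

def solve_reduced(x0=1.0, t_end=1.0):
    # Reduced equation dX/dt = -X, solved exactly
    return x0 * math.exp(-t_end)

if __name__ == "__main__":
    for eps in (0.1, 0.01):
        x_full, y_full = solve_full(eps)
        print("eps =", eps, " x(1) =", x_full, " y(1) =", y_full,
              " reduced X(1) =", solve_reduced())
```

Note that the explicit integrator needs a time step that shrinks with $\varepsilon$, whereas the reduced equation has no such restriction; this is one concrete sense in which the effective equation is more amenable to numerical solution.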

In some situations $F \equiv 0$ and it is then necessary to scale the equations to a longer time scale, $s = \varepsilon t$, to see interesting effects. The starting point is then

$$
\begin{aligned}
\frac{dx}{ds} &= \frac{1}{\varepsilon} f(x, y), \\
\frac{dy}{ds} &= \frac{1}{\varepsilon^2} g(x, y).
\end{aligned}
$$

In this situation $f$ essentially averages to zero when $y$ is eliminated and the effective equation sees the fluctuations in $f$; the approximate equation takes the form

$$
\frac{dX}{dt} = F(X) + A(X) \frac{dW}{dt},
$$

where $W$ is a standard unit Brownian motion; this may be viewed as a central limit theorem. We refer to the derivation of this equation as homogenization.

As for linear PDE, techniques from analysis enable us to estimate errors between the original and simplified equations. The formal asymptotics underlying these ideas are fleshed out in Chapters 8, 10 and 11; related rigorous results are given in Chapters 15, 17 and 18.


1.2 Averaging Versus Homogenization

The unifying principle underlying the derivation of most of the effective equations in these notes concerns formal perturbation expansions for linear operator equations of the form

\[ \lambda \frac{\partial u}{\partial t} = L^{\varepsilon} u. \]

In particular we will be interested in cases where the operator $L^{\varepsilon}$ has the form

\[ L^{\varepsilon} = \frac{1}{\varepsilon} L_0 + L_1, \tag{1.2.1} \]

or

\[ L^{\varepsilon} = \frac{1}{\varepsilon^2} L_0 + \frac{1}{\varepsilon} L_1 + L_2. \tag{1.2.2} \]

In both cases $L_0$ will have a nontrivial null space and interest focusses on capturing the dynamics within this subspace. The first case (1.2.1) we will refer to as averaging, or first order perturbation theory. It can be thought of as a form (or consequence) of the law of large numbers. The second, (1.2.2), will be referred to as homogenization or second order perturbation theory. It can be thought of as a form (or consequence) of the central limit theorem; indeed (1.2.2) enables us to study fluctuations around the average.
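As a sketch of how such an expansion proceeds in the averaging case (1.2.1), assume the ansatz $u = u_0 + \varepsilon u_1 + \cdots$ (the ansatz and the grouping by powers of $\varepsilon$ are the usual formal procedure, written here as an illustration rather than as a statement of the precise arguments of later chapters):

```latex
\begin{align*}
\lambda \frac{\partial}{\partial t}\bigl(u_0 + \varepsilon u_1 + \cdots\bigr)
  &= \Bigl(\frac{1}{\varepsilon} L_0 + L_1\Bigr)\bigl(u_0 + \varepsilon u_1 + \cdots\bigr), \\
O(1/\varepsilon): \qquad L_0 u_0 &= 0, \\
O(1): \qquad L_0 u_1 &= \lambda \frac{\partial u_0}{\partial t} - L_1 u_0 .
\end{align*}
```

The first equation places $u_0$ in the null space of $L_0$. The second can be solved for $u_1$ only if its right hand side satisfies a solvability condition, and that condition is what yields the effective equation for $u_0$.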

Our main interest will be in the case $\lambda \neq 0$. This will apply to the study of homogenization for parabolic PDE and transport PDE, and also to averaging and homogenization for ODE, Markov chains and SDE, via the Kolmogorov equations and variants (the forward equation, the method of characteristics). We will also study the case $\lambda = 0$, which arises when studying homogenization for elliptic PDE. The case where a second order time-derivative appears (wave equations) will not be covered explicitly in the notes, but the techniques developed do apply; references to the literature will be given.

1.3 Discussion and Bibliography

The use of formal multiscale expansions in the study of homogenization was developed systematically and applied to many different problems in [17]. See also [116, 115, 120]. Mode reduction for either stochastic or deterministic systems has become a very important area in applied mathematics. Applications include atmosphere/ocean science [96], molecular dynamics [134] and mathematical finance [48].


Various aspects of stochastic and deterministic mode reduction, including numerical algorithms for these problems, are discussed in the review articles [58, 71, 72]. Homogenization for advection–diffusion equations is a part of the theory of turbulent diffusion. See [94] for an excellent review of the theory of turbulent diffusion.

One of the main technical tools that will be used in these notes is the connection between parabolic and elliptic PDE and SDE through the Feynman–Kac formula [111]. The relation between homogenization and averaging for PDE and limit theorems for (Markov) stochastic processes has been recognized, and studied extensively, since the early 1960s [49]. A rather general theory of homogenization and averaging for stochastic processes was developed in the 1970s [85, 115, 116, 120]. The mathematical theory is presented in the monograph [41].

There are many important topics related to homogenization and averaging for ODE, SDE and PDE that will not be discussed in these notes. Let us comment on some of these topics and provide some references to the literature. Non–periodic, deterministic, homogenization will not be covered here. The interested reader may consult [30, Ch. 13], [34] and the references therein. Bounds on the homogenized coefficients and their dependence on the non–dimensional parameters of the problem will be discussed very briefly. Information about these topics can be found in [106]. We will not study homogenization problems for non–linear PDE; see [42] and the references therein. Variational methods (which apply to elliptic PDE) and the theory of Γ–convergence will not be discussed in these notes; see [102, 33, 23]. Related concepts such as H and G convergence will not be developed in these notes. See, e.g. [138, 139, 144]. We will not discuss homogenization for random PDE (i.e. PDE whose coefficients are random fields). See [114]. More detailed pointers to the literature will be provided in the Discussion and Bibliography section which concludes each chapter.


Part I

Background


Chapter 2

Analysis

2.1 Set–Up

In this chapter we collect a variety of definitions and theorems (mainly without proofs) from analysis. The topics are selected because they will be needed in the sequel. Proofs and additional material can be found in the books cited in Section 2.7. The presentation is necessarily terse, and it is not recommended that this chapter is read in its entirety on a first read through the book; rather that it be referred to as needed when reading later chapters. The rigorous setting developed in this chapter is required primarily in Chapters 15–21 where we prove a number of results concerning dimension reduction for dynamical problems (averaging and homogenization) and homogenization for PDE. The results themselves are derived in Chapters 8–14 by means of perturbation expansions and in that context the rigorous setting is not required.

When proving rigorous results about our reduced systems in the context of PDE, the natural setting is the theory of weak convergence in Hilbert and Banach spaces; in this context we will also develop an appropriate kind of weak convergence, that of two–scale convergence, which is very useful for problems related to periodic homogenization. A complementary chapter on probability, Chapter 3, provides the appropriate convergence tools for the study of SDE and Markov chains; in particular we develop background material on weak convergence of probability measures.


2.2 Notation

In this book we will encounter scalar, vector and matrix fields, all examples of tensor fields. We will adopt an index free notation throughout the book and the purpose of this section is to explain this notation, using indices to do so. We adopt the Einstein summation convention, whereby repeated indices imply a summation, single indices range freely over $1, 2, \dots, d$, $d$ being the dimension, and no index may appear more than twice in any product term. We start with tensor algebra and then move to tensor calculus. We use $\mathbb{R}^d$ to denote $d$–dimensional Euclidean space, and $\mathbb{T}^d$ the $d$–dimensional torus $\mathbb{R}^d/\mathbb{Z}^d$.

We will denote by $\{e_i\}_{i=1}^d$ the standard basis in $\mathbb{R}^d$. Thus arbitrary $\xi \in \mathbb{R}^d$ can be written as $\xi = \xi_i e_i$, where $\xi_i = \langle \xi, e_i \rangle$. We use $\cdot$ to denote the inner-product between two vectors, so that

\[ a \cdot b = a_i b_i. \]

We will also use $\langle \cdot, \cdot \rangle$ to denote this standard inner-product on $\mathbb{R}^d$. The norm induced by this inner-product is the Euclidean norm

\[ |a| = \sqrt{a \cdot a} \]

and it follows that

\[ |a|^2 = \sum_{i=1}^{d} a_i^2, \qquad a \in \mathbb{R}^d. \]

We will also use the usual $L^p$ norms on $\mathbb{R}^d$, $1 \le p \le \infty$, and denote them by $|\cdot|_p$. Note that $|\cdot| = |\cdot|_2$. In addition we use $|\cdot|_p$ to denote the associated operator norm defined by

\[ |A|_p = \sup_{x \neq 0} \frac{|Ax|_p}{|x|_p}. \]

In particular we use the notation $|A| = |A|_2$ for operator norms.

The inner-product between matrices is denoted by

\[ A : B = \mathrm{tr}(A^T B) = a_{ij} b_{ij}. \]

The norm induced by this inner product is the Frobenius norm

\[ |A|_F = \sqrt{\mathrm{tr}(A^T A)}. \]

The outer-product between two vectors $a$ and $b$ is the matrix $a \otimes b$ defined by

\[ (a \otimes b)\, c = (b \cdot c)\, a \qquad \forall c \in \mathbb{R}^d. \]
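The index-free notation above can be checked against its index form on small random arrays. This is a minimal sketch, assuming NumPy is available; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
a, b, c = rng.standard_normal((3, d))      # three vectors in R^d
A, B = rng.standard_normal((2, d, d))      # two d x d matrices

# a . b = a_i b_i (summation over the repeated index i)
inner = np.einsum("i,i->", a, b)

# A : B = tr(A^T B) = a_ij b_ij
double_dot = np.einsum("ij,ij->", A, B)

# (a (x) b) c = (b . c) a
outer_applied = np.outer(a, b) @ c

# Frobenius norm |A|_F = sqrt(tr(A^T A)) and operator norm |A| = |A|_2
frobenius = np.sqrt(np.trace(A.T @ A))
operator_2 = np.linalg.norm(A, 2)          # largest singular value of A

print(np.isclose(inner, a @ b))
print(np.isclose(double_dot, np.trace(A.T @ B)))
print(np.allclose(outer_applied, (b @ c) * a))
print(operator_2 <= frobenius)
```

Each printed value should be `True`; the last line reflects the general fact that the operator 2-norm never exceeds the Frobenius norm.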


Let $\nabla$ and $\nabla\cdot$ denote gradient and divergence in $\mathbb{R}^d$. The gradient lifts a scalar (resp. vector) to a vector (resp. matrix) whilst the divergence contracts a vector (resp. matrix) to a scalar (resp. vector). The gradient acts on scalar valued functions $\phi$, or vector valued functions $v$, via

\[ (\nabla \phi)_i = \frac{\partial \phi}{\partial z_i}, \qquad (\nabla v)_{ij} = \frac{\partial v_i}{\partial z_j}. \]

The divergence acts on vector valued functions $v$, or matrix valued functions $A$, via

\[ \nabla \cdot v = \frac{\partial v_i}{\partial z_i}, \qquad (\nabla \cdot A)_i = \frac{\partial A_{ij}}{\partial z_j}. \]

Given vector fields $a$, $v$ we use the notation

\[ a \cdot \nabla v := (\nabla v)\, a. \]

Since the gradient is defined for scalars and vectors we readily make sense of the expression

\[ \nabla \nabla \phi \]

for any scalar $\phi$; it is the Hessian matrix with entries $\frac{\partial^2 \phi}{\partial z_i \partial z_j}$. Similarly, we can also make sense of the expression

\[ \nabla \nabla v \]

by applying $\nabla\nabla$ to each scalar component of the vector $v$.
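For reference, a few index forms follow directly from the definitions above together with the summation convention; they are written out here only as a worked illustration, with $\Delta$ denoting the Laplacian (a symbol not introduced in the text above):

```latex
\begin{align*}
(a \cdot \nabla v)_i &= (\nabla v)_{ij}\, a_j = a_j \frac{\partial v_i}{\partial z_j}, \\
(\nabla \nabla \phi)_{ij} &= \frac{\partial^2 \phi}{\partial z_i \partial z_j}, \\
\nabla \cdot (\nabla \phi) &= \frac{\partial^2 \phi}{\partial z_i \partial z_i} = \Delta \phi, \\
A : \nabla \nabla \phi &= a_{ij} \frac{\partial^2 \phi}{\partial z_i \partial z_j}.
\end{align*}
```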

In many instances in what follows we will take the gradient or divergence with respect to a variety of different independent variables. We use $\nabla_x$ to denote gradient or divergence with respect to $x$ coordinates alone, and similarly for other independent variables.

2.3 Banach and Hilbert Spaces

We assume that the reader is already familiar with the definitions of a norm, an inner product and of vector, metric, normed and inner product spaces. In the sequel $X$ will denote a normed space and $\|\cdot\|$ will denote its norm. We say that a sequence $\{x_j\}_{j=1}^{\infty} \subset X$ converges strongly to $x \in X$, written as

\[ x_j \to x, \]


provided that

\[ \lim_{j \to \infty} \|x_j - x\| = 0. \]

Furthermore, we say that $\{x_j\}_{j=1}^{\infty} \subset X$ is a Cauchy sequence provided that for each $\varepsilon > 0$ there exists an $N \in \mathbb{N}$ such that

\[ \|x_j - x_k\| < \varepsilon \qquad \forall j, k > N. \]

Every convergent sequence in a normed space $X$ is a Cauchy sequence. The converse is not always true. If, however, every Cauchy sequence in $X$ is convergent, then the space $X$ is called complete.

2.3.1 Banach Spaces

Definition 2.1. A Banach space X is a complete normed vector space.

Definition 2.2. Let $X$ be a Banach space. We say that a map $\ell : X \to \mathbb{R}$ is a bounded linear functional on $X$ provided that

i) $\ell(\alpha x + \beta y) = \alpha \ell(x) + \beta \ell(y) \quad \forall x, y \in X, \ \alpha, \beta \in \mathbb{R}$.

ii) $\exists C > 0 : |\ell(x)| \le C \|x\| \quad \forall x \in X$.

Definition 2.3. The collection of all bounded linear functionals on $X$ is called the dual space and denoted by $X^*$.

Theorem 2.4. $X^*$ equipped with the norm

\[ \|\ell\| = \sup_{x \neq 0} \frac{|\ell(x)|}{\|x\|} \]

is a Banach space.

Definition 2.5. A Banach space $X$ is called reflexive if the dual of its dual is $X$:

\[ (X^*)^* = X. \]

The concept of the dual space enables us to introduce another topology on $X$, the so called weak topology.

Definition 2.6. A sequence $\{x_n\}_{n=1}^{\infty}$ is said to converge weakly to $x \in X$, written

\[ x_n \rightharpoonup x, \]

if

\[ \ell(x_n) \to \ell(x) \qquad \forall \ell \in X^*. \]


Every strongly convergent sequence is also weakly convergent. However, the converse is not true. The importance of weak convergence stems from the following theorem, in particular part (ii).

Theorem 2.7. Let $X$ be a Banach space.

(i) Every weakly convergent sequence in $X$ is bounded.

(ii) (Eberlein–Smuljan) Assume that $X$ is reflexive. Then from every bounded sequence in $X$ we can extract a weakly convergent subsequence.

In addition to the weak topology on $X$, the duality between $X$ and $X^*$ enables us to define a topology on $X^*$.

Definition 2.8. Let $X$ be a Banach space. A sequence $\{\ell_n\}_{n=1}^{\infty} \subset X^*$ is said to converge weakly–∗ to $\ell \in X^*$, written

\[ \ell_n \overset{*}{\rightharpoonup} \ell, \]

if

\[ \lim_{n \to \infty} \ell_n(x) = \ell(x) \qquad \forall x \in X. \]

We remark that if $X$ is reflexive then weak–∗ convergence coincides with weak convergence on $X$.

A compactness result similar to Theorem 2.7 holds for bounded sequences in $X^*$, provided that $X$ is separable. We remind the reader that a subset $X_0$ of $X$ is called dense if for every $x \in X$ there exists a sequence $\{x_j\}_{j=1}^{\infty} \subset X_0$ which converges to $x$. In other words, $X_0$ is dense in $X$ if its closure is $X$: $\overline{X_0} = X$.

Definition 2.9. A Banach space $X$ is called separable if it contains a countable dense subset.

The compactness theorem for sequences in $X^*$ can be stated as follows.

Theorem 2.10. Let $X$ be a separable Banach space. Then from any bounded sequence in $X^*$ we can extract a weakly–∗ convergent subsequence.

2.3.2 Hilbert Spaces

Definition 2.11. A Hilbert space is a complete inner product space.


We denote the inner-product by $(\cdot, \cdot)$. Clearly, every Hilbert space is a Banach space. Naturally, the inner product defines a norm on $H$:

\[ \|x\| = (x, x)^{\frac{1}{2}}. \]

Furthermore, all elements of $H$ satisfy the Cauchy–Schwarz inequality:

\[ |(u, v)| \le \|u\| \|v\|. \]

A very important property of a Hilbert space $H$ is that we can identify the dual of $H$ with itself through the Riesz representation theorem.

Theorem 2.12. (Riesz Representation.) For every $\ell \in H^*$ there exists a unique $y \in H$ such that

\[ \ell(x) = (x, y) \qquad \forall x \in H. \]

We will usually denote the action of $\ell$ on $x \in H$ by $\langle \cdot, \cdot \rangle$ (see footnote 1):

\[ \langle \ell, x \rangle := \ell(x). \]

The Riesz representation theorem implies that every Hilbert space is reflexive and consequently Theorem 2.7 applies. Furthermore, the definition of weak convergence simplifies to

\[ x_n \rightharpoonup x \quad \Leftrightarrow \quad (x_n, y) \to (x, y) \quad \forall y \in H. \]

2.4 Function Spaces

Let $\Omega$ be an open subset of $\mathbb{R}^d$. We will denote by $C(\Omega)$ the space of continuous functions $f : \Omega \to \mathbb{R}$. This space, when equipped with the supremum norm

\[ \|f\|_{C(\Omega)} = \sup_{x \in \Omega} |f(x)|, \]

is a Banach space. Similarly, we can define the space $C^p(\Omega)$ of $p$–times continuously differentiable functions. We will denote by $C^{\infty}(\Omega)$ the space of smooth functions. The notation $C_0^{\infty}(\Omega)$ will be used to denote the space of smooth functions over $\Omega$ with compact support. We will use the notation $C_b^k(\mathbb{R}^d)$ to denote the set of $k$–times continuously differentiable functions from $\mathbb{R}^d$ to $\mathbb{R}$ which are bounded, together with all of their derivatives. $C_b^{\infty}(\mathbb{R}^d)$ will denote the space of bounded continuous functions having bounded continuous derivatives of all orders. More generally, let $(E, \rho)$ be a metric space. Then $C_b(E)$ will denote the space of bounded continuous functions on $(E, \rho)$.

Footnote 1: In $\mathbb{R}^d$ we use $\langle \cdot, \cdot \rangle$ to denote the inner-product and it is useful to retain this convention; in the infinite dimensional setting we will always use $(\cdot, \cdot)$ for the inner-product and $\langle \cdot, \cdot \rangle$ for the dual pairing between $H$ and $H^*$.

2.4.1 Lp Spaces

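Let $1 \le p \le \infty$ and let $f : \Omega \to \mathbb{R}$ be a measurable function. Define

\[ \|f\|_{L^p(\Omega)} := \begin{cases} \left( \int_{\Omega} |f|^p \, dx \right)^{\frac{1}{p}}, & \text{for } 1 \le p < \infty, \\ \operatorname{ess\,sup}_{\Omega} |f|, & \text{for } p = \infty. \end{cases} \]

In the above definition we have used the notation

\[ \operatorname{ess\,sup}_{\Omega} |f| = \inf \{ C : |f| \le C \ \text{a.e. on } \Omega \}. \]

Definition 2.13. $L^p(\Omega)$ is the vector space of all measurable functions $f : \Omega \to \mathbb{R}$ for which $\|f\|_{L^p(\Omega)} < \infty$.

Theorem 2.14. (Basic properties of $L^p$ spaces.)

i) The vector space $L^p(\Omega)$, equipped with the $L^p$–norm defined above, is a Banach space for every $p \in [1, +\infty]$.

ii) $L^2(\Omega)$ is a Hilbert space equipped with the inner product

\[ (u, v)_{L^2(\Omega)} = \int_{\Omega} u(x) v(x) \, dx \qquad \forall u, v \in L^2(\Omega). \]

iii) $L^p(\Omega)$ is separable for $p \in [1, +\infty)$ and reflexive for $p \in (1, +\infty)$. In particular, $L^1(\Omega)$ is not reflexive and $L^{\infty}(\Omega)$ is neither separable nor reflexive.

Let $p \in [1, +\infty]$ and define $q \in [1, +\infty]$ through

\[ \frac{1}{p} + \frac{1}{q} = 1. \tag{2.4.1} \]

Then the Hölder inequality states that

\[ \left| \int_{\Omega} u(x) v(x) \, dx \right| \le \|u\|_{L^p(\Omega)} \|v\|_{L^q(\Omega)} \qquad \forall u \in L^p(\Omega), \ v \in L^q(\Omega). \]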


Let $p \in [1, +\infty)$ and let $q$ be defined through (2.4.1). Then

\[ (L^p(\Omega))^* = L^q(\Omega). \]

The last part of the above theorem, together with the fact that $L^p(\Omega)$ is a Banach space and Definition 2.6, implies the following definition of weak convergence in $L^p(\Omega)$, $p \in [1, +\infty)$.

Definition 2.15. A sequence $\{u_n(x)\}_{n=1}^{\infty} \subset L^p(\Omega)$, $p \in [1, +\infty)$, is said to converge weakly to $u(x) \in L^p(\Omega)$, written

\[ u_n(x) \rightharpoonup u(x) \quad \text{w–}L^p(\Omega), \]

provided that

\[ \int_{\Omega} u_n(x) v(x) \, dx \to \int_{\Omega} u(x) v(x) \, dx \qquad \forall v \in L^q(\Omega), \]

where $q$ is defined in (2.4.1).
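A standard illustration of weak but not strong convergence is $u_n(x) = \sin(nx)$ in $L^2(0, 2\pi)$, which converges weakly to $u = 0$ while its norm stays constant. The sketch below checks this numerically for one fixed test function; the choice of test function, the grid and the helper name `integrate` are assumptions made for the illustration, and a single test function of course only illustrates, rather than proves, weak convergence.

```python
import numpy as np

def integrate(values, x):
    """Composite trapezoidal rule on the grid x."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(x)))

x = np.linspace(0.0, 2.0 * np.pi, 400_001)
v = np.exp(-x) * (1.0 + x)               # an arbitrary fixed test function in L^2(0, 2*pi)

for n in (1, 10, 100, 1000):
    u_n = np.sin(n * x)
    pairing = integrate(u_n * v, x)      # tends to 0, consistent with the weak limit u = 0
    norm_sq = integrate(u_n ** 2, x)     # stays equal to pi, so u_n does not tend to 0 strongly
    print(f"n = {n:5d}   pairing = {pairing: .6f}   ||u_n||^2 = {norm_sq:.6f}")
```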

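Whereas weak convergence in $L^{\infty}$ is very rarely used, the notion of weak–* convergence in that space is very useful.

Definition 2.16. A sequence $\{u_n\}_{n=1}^{\infty} \subset L^{\infty}(\Omega)$ converges weak–* in $L^{\infty}$, written

\[ u_n(x) \overset{*}{\rightharpoonup} u(x) \quad \text{w*–}L^{\infty}(\Omega), \]

provided that

\[ \int_{\Omega} u_n(x) \phi(x) \, dx \to \int_{\Omega} u(x) \phi(x) \, dx \qquad \forall \phi \in L^1(\Omega). \]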

2.4.2 Sobolev Spaces

We start with the definition of the weak derivative.

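Definition 2.17. Let $u, v \in L^2(\Omega)$. We say that $v$ is the first weak derivative of $u$ with respect to $x_i$ if

\[ \int_{\Omega} u \frac{\partial \phi}{\partial x_i} \, dx = - \int_{\Omega} v \phi \, dx \qquad \forall \phi \in C_0^{\infty}(\Omega). \]

In the context of a function $u \in L^2(\Omega)$ we will use the notation $\frac{\partial u}{\partial x_i}$ to denote the weak derivative with respect to $x_i$. We will also use the notation $\nabla u = \frac{\partial u}{\partial x_i} e_i$.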
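As a concrete one-dimensional check of Definition 2.17, take $\Omega = (-1, 1)$, $u(x) = |x|$ and the candidate weak derivative $v(x) = \mathrm{sign}(x)$. The sketch below compares both sides of the defining identity for one test function; the particular bump function, its support radius `r` and the midpoint quadrature are assumptions chosen for the illustration.

```python
import numpy as np

N = 200_000                                   # even, so x = 0 is a cell boundary
h = 2.0 / N
x = -1.0 + h * (np.arange(N) + 0.5)           # midpoints of a uniform grid on (-1, 1)

u = np.abs(x)                                 # u(x) = |x|
v = np.sign(x)                                # candidate weak derivative of u

# Smooth test function supported in [-r, r], a compact subset of (-1, 1).
r = 0.5
inside = np.abs(x) < r
s = np.where(inside, r**2 - x**2, 1.0)        # dummy value 1 outside the support
bump = np.where(inside, np.exp(-1.0 / s), 0.0)
phi = (1.0 + x) * bump
dphi = bump * (1.0 - (1.0 + x) * 2.0 * x / s**2)   # phi' by the product and chain rules

lhs = h * np.sum(u * dphi)                    # integral of u * phi'
rhs = -h * np.sum(v * phi)                    # minus the integral of v * phi
print(lhs, rhs)
```

The two printed numbers should agree up to quadrature error, even though $u$ is not differentiable at $x = 0$ in the classical sense.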


Definition 2.18. The Sobolev space $H^1(\Omega)$ consists of all square integrable functions from $\Omega$ to $\mathbb{R}$ whose first order weak derivatives exist and are square integrable:

\[ H^1(\Omega) = \left\{ u \,\middle|\, u \in L^2(\Omega), \ \nabla u \in (L^2(\Omega))^d \right\}. \]

The space $H^1(\Omega)$ is a separable Hilbert space with norm

\[ \|u\|_{H^1(\Omega)} = \left( \|u\|_{L^2(\Omega)}^2 + \|\nabla u\|_{L^2(\Omega)}^2 \right)^{\frac{1}{2}} \]

and inner product

\[ (u, v)_{H^1(\Omega)} = (u, v)_{L^2(\Omega)} + (\nabla u, \nabla v)_{L^2(\Omega)}. \]

A very useful property of $H^1(\Omega)$ is that its embedding into $L^2(\Omega)$ is compact. We will state this result in the following way.

Theorem 2.19. (Rellich Compactness Theorem). From every bounded sequence in $H^1(\Omega)$ we can extract a subsequence which is strongly convergent in $L^2(\Omega)$.

In many applications one is interested in elements of $H^1(\Omega)$ which vanish on the boundary $\partial\Omega$ of the domain. Functions with this property belong to the following subset of $H^1(\Omega)$.

Definition 2.20. The Sobolev space $H_0^1(\Omega)$ is defined as the completion of $C_0^{\infty}(\Omega)$ with respect to the $H^1$–norm.

A very important property of $H_0^1(\Omega)$ is the fact that we can control the $L^2$–norm of its elements in terms of the $L^2$–norm of their gradient. Below we present this result, together with its analogue for elements of $H^1(\Omega)$.

Theorem 2.21. (Poincaré Inequality) Let $\Omega$ be a bounded open set in $\mathbb{R}^d$. Then there is a constant $C_{\Omega}$, which depends only on $\Omega$, such that

\[ \|u\|_{L^2(\Omega)} \le C_{\Omega} \|\nabla u\|_{L^2(\Omega)} \tag{2.4.2} \]

for every $u \in H_0^1(\Omega)$.

An immediate corollary of the first part of the above theorem is that $\|\nabla \cdot\|_{L^2(\Omega)}$ can be used as the norm in $H_0^1(\Omega)$:

\[ \|u\|_{H_0^1(\Omega)} = \|\nabla u\|_{L^2(\Omega)}. \tag{2.4.3} \]
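For a feel of inequality (2.4.2), consider the one-dimensional domain $\Omega = (0, 1)$, where the best constant is $C_{\Omega} = 1/\pi$ and is attained by $u(x) = \sin(\pi x)$. The sketch below evaluates the ratio $\|u\|_{L^2}/\|u'\|_{L^2}$ for a few functions vanishing at the endpoints; the sample functions and the helper name `l2_norm` are illustrative assumptions.

```python
import numpy as np

def l2_norm(values, x):
    """L^2(0, 1) norm via the composite trapezoidal rule."""
    sq = values ** 2
    return float(np.sqrt(np.sum(0.5 * (sq[1:] + sq[:-1]) * np.diff(x))))

x = np.linspace(0.0, 1.0, 100_001)
C = 1.0 / np.pi          # sharp Poincare constant for Omega = (0, 1)

# Pairs (u, u') with u(0) = u(1) = 0; derivatives are entered analytically.
examples = {
    "sin(pi x)":   (np.sin(np.pi * x), np.pi * np.cos(np.pi * x)),
    "x (1 - x)":   (x * (1.0 - x), 1.0 - 2.0 * x),
    "sin(3 pi x)": (np.sin(3 * np.pi * x), 3 * np.pi * np.cos(3 * np.pi * x)),
}
ratios = {name: l2_norm(u, x) / l2_norm(du, x) for name, (u, du) in examples.items()}
for name, ratio in ratios.items():
    print(f"{name:12s} ||u|| / ||u'|| = {ratio:.6f}   C = {C:.6f}")
```

Every ratio stays at or below $1/\pi$, with equality (up to rounding) for the first example.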


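We will denote by $H^{-1}(\Omega)$ the dual space of $H_0^1(\Omega)$. Further, we will denote by $\langle \cdot, \cdot \rangle_{H_0^1, H^{-1}}$ the pairing between $H^{-1}(\Omega)$ and $H_0^1(\Omega)$. In other words, the action of $f \in H^{-1}(\Omega)$ on $v \in H_0^1(\Omega)$ will be denoted by $\langle f, v \rangle_{H_0^1, H^{-1}}$. Then $H^{-1}(\Omega)$ is a Banach space equipped with the norm

\[ \|f\|_{H^{-1}(\Omega)} = \sup_{v \in H_0^1(\Omega), \ \|v\|_{H_0^1(\Omega)} \le 1} |\langle f, v \rangle_{H_0^1, H^{-1}}|. \]

The following version of the Cauchy–Schwarz inequality holds:

\[ |\langle f, v \rangle_{H_0^1, H^{-1}}| \le \|f\|_{H^{-1}} \|v\|_{H_0^1} \qquad \forall f \in H^{-1}(\Omega), \ \forall v \in H_0^1(\Omega). \tag{2.4.4} \]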

2.4.3 Banach Space–Valued Spaces

It is possible to define $L^p$ spaces over sets and spaces more general than subsets of $\mathbb{R}^d$; in particular, one can define spaces $L^p(\Omega)$ when $\Omega$ is a Banach space itself. The spaces defined below appear quite often in applications. They are particularly relevant for the rigorous analysis of periodic homogenization since we have to deal with functions of two arguments ($x$ and $y$, say).

Definition 2.22. Let $X$ be a Banach space with norm $\|\cdot\|_X$ and let $\Omega$ denote a subset of $\mathbb{R}^d$, not necessarily bounded. The space $Y := L^p(\Omega; X)$ with $p \in [1, +\infty]$ consists of all measurable functions $u : x \in \Omega \to u(x) \in X$ such that $\|u(x)\|_X \in L^p(\Omega)$.

The space $Y$ defined above has various nice properties which we list below.

Theorem 2.23. Let $Y = L^p(\Omega; X)$ with $X$ and $\Omega$ as in Definition 2.22. Then

(i) $Y$ equipped with the norm

\[ \|u\|_Y = \left( \int_{\Omega} \|u(x)\|_X^p \, dx \right)^{\frac{1}{p}} \]

is a Banach space.

(ii) If $X$ is reflexive and $p \in (1, +\infty)$ then $Y$ is also reflexive.

(iii) If $X$ is separable and $p \in [1, +\infty)$ then $Y$ is also separable.


One can define spaces of the form $H^1(\Omega; X)$, where $X$ is a Banach space, in a similar fashion.

We will use $C^{k,p}([0, T] \times \mathbb{R}^d)$ to denote the space of functions which are $k$ and $p$ times continuously differentiable in $t \in [0, T]$ and $x \in \mathbb{R}^d$, respectively.

Example 2.24. Let $T > 0$; then function spaces of the form $L^p((0, T); X)$ appear in the analysis of evolution PDE of parabolic and hyperbolic type; see Chapter 7.

Example 2.25. If $(\Omega, \mathcal{F}, \mathbb{P})$ is a probability space then $L^p(\Omega; X)$ will denote a Banach space found by defining the $L^p$ integration with respect to the measure $\mathbb{P}$ on $\Omega$, rather than with respect to Lebesgue measure as above; see Chapter 3.

2.4.4 Sobolev spaces of Periodic Functions

We will use the notation $Y = [0, 1]^d$. We will refer to $Y$ as the unit cell. In this section we will study some properties of periodic functions $f : Y \to \mathbb{R}$. We thus identify $Y$ with the torus $\mathbb{T}^d$. Functions which are periodic with period 1 will be called 1–periodic functions.

We will denote by $C_{per}^{\infty}(Y)$ the space of smooth functions in $C^{\infty}(\mathbb{R}^d)$ which are 1–periodic. $L_{per}^p(Y)$ is defined to be the completion of $C_{per}^{\infty}(Y)$ with respect to the $L^p$–norm. A similar definition holds for $H_{per}^1(Y)$.

The Poincaré inequality does not hold in the space $H_{per}^1$. It does hold, however, if we remove the constants from this space. We define the space

\[ H = \left\{ u \in H_{per}^1(Y) \,\middle|\, \int_Y u \, dy = 0 \right\}. \tag{2.4.5} \]

There exists a constant $C_Y > 0$, depending only on $Y$, such that

\[ \|u\|_{L^2(Y)} \le C_Y \|\nabla u\|_{L^2(Y)} \qquad \forall u \in H. \]

Hence, we can use

\[ \|u\|_H = \|\nabla u\|_{L^2(Y)}, \qquad \forall u \in H, \tag{2.4.6} \]

as the norm in $H$. The dual of $H$ may be shown to comprise all elements of $(H_{per}^1(Y))^*$ which are orthogonal to constants:

\[ H^* = \left\{ f \in (H_{per}^1(Y))^* ; \ \langle f, 1 \rangle = 0 \right\}, \]

where $\langle \cdot, \cdot \rangle$ denotes the pairing between $H_{per}^1(Y)$ and its dual.


The space

\[ L^2(\Omega; L^2(Y)) := L^2(\Omega \times Y) \]

will be used in our study of periodic homogenization. This is a Hilbert space with inner-product

\[ (u, v)_{L^2(\Omega \times Y)} = \int_{\Omega} \int_Y u(x, y) v(x, y) \, dy \, dx, \]

together with the corresponding norm. We will on occasion use the space $L^2(\Omega; C_{per}(Y))$, which is the set of all measurable functions $u : x \in \Omega \to u(x) \in C_{per}(Y)$ such that $\|u(x)\| \in L^2(\Omega)$. By Theorem 2.23 the norm of this space is

\[ \|u\|_{L^2(\Omega; C_{per}(Y))}^2 = \int_{\Omega} \left( \sup_{y \in Y} |u(x, y)| \right)^2 dx. \]

This is a separable Banach space which is dense in $L^2(\Omega \times Y)$. It enjoys various properties which we will need.

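Theorem 2.26. Let $u \in L^2(\Omega; C_{per}(Y))$ and $\varepsilon > 0$. Then

(i) $u\left(x, \frac{x}{\varepsilon}\right) \in L^2(\Omega)$ and

\[ \left\| u\left(x, \frac{x}{\varepsilon}\right) \right\|_{L^2(\Omega)} \le \|u(x, y)\|_{L^2(\Omega; C_{per}(Y))}. \]

(ii) $u\left(x, \frac{x}{\varepsilon}\right)$ converges to $\int_Y u(x, y) \, dy$, weakly in $L^2(\Omega)$ as $\varepsilon \to 0$.

(iii) We have

\[ \left\| u\left(x, \frac{x}{\varepsilon}\right) \right\|_{L^2(\Omega)} \to \|u(x, y)\|_{L^2(\Omega \times Y)} \]

as $\varepsilon \to 0$.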

The following property of periodic functions will be used in the sequel.

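Theorem 2.27. Let $p \in [1, +\infty]$ and $f$ be a $Y$–periodic function in $L^p(Y)$. Set

\[ f^{\varepsilon}(x) = f\left(\frac{x}{\varepsilon}\right) \quad \text{a.e. on } \mathbb{R}^d. \]

Then, if $p < +\infty$, as $\varepsilon \to 0$

\[ f^{\varepsilon} \rightharpoonup \int_Y f(y) \, dy \quad \text{weakly in } L^p(\Omega), \]

for any bounded open subset $\Omega$ of $\mathbb{R}^d$. We also have

\[ f^{\varepsilon} \overset{*}{\rightharpoonup} \int_Y f(y) \, dy \quad \text{weakly–* in } L^{\infty}(\mathbb{R}^d). \]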
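The statement can be illustrated numerically in one dimension. In the sketch below, the periodic function $f(y) = \sin^2(2\pi y)$ (with cell average $1/2$), the test function $v(x) = x^2$ on $\Omega = (0, 1)$ and the helper names are all assumptions chosen for the illustration. The pairing with $v$ approaches $\frac{1}{2}\int_0^1 x^2\,dx = 1/6$, while the $L^2$ distance between $f(x/\varepsilon)$ and its weak limit does not shrink.

```python
import numpy as np

def integrate(values, x):
    """Composite trapezoidal rule on the grid x."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(x)))

f = lambda y: np.sin(2.0 * np.pi * y) ** 2    # 1-periodic, average over Y = [0, 1] is 1/2
v = lambda x: x ** 2                          # fixed test function on Omega = (0, 1)
x = np.linspace(0.0, 1.0, 1_000_001)

def pairing(eps):
    """Integral over Omega of f(x / eps) * v(x)."""
    return integrate(f(x / eps) * v(x), x)

def distance_sq(eps):
    """Squared L^2(Omega) distance between f(x / eps) and its cell average."""
    return integrate((f(x / eps) - 0.5) ** 2, x)

for eps in (0.1, 0.01, 0.001):
    print(f"eps = {eps:6.3f}   pairing = {pairing(eps):.6f}   distance^2 = {distance_sq(eps):.6f}")
```

The pairing column approaches $1/6$, whereas the squared distance remains near $1/8$, which shows that the convergence is weak and not strong.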


2.5 Two–Scale Convergence

A form of weak convergence that is particularly well suited for problems in periodic homogenization is that of two–scale convergence. In this section we define this concept and we study some of its basic properties. Once again we use $Y$ to denote the unit cell and, since we consider periodic functions on $Y$, identify it with the torus $\mathbb{T}^d$. As before $\Omega$ denotes a subset of $\mathbb{R}^d$. We start by discussing two–scale convergence for steady (time-independent) problems. We then discuss related issues for time-dependent problems.

2.5.1 Two–Scale convergence for steady problems

In order to define the concept of two–scale convergence we first need to consider appropriate test functions, the admissible test functions.

Definition 2.28. A function $\phi(x, y) \in L^2(\Omega \times Y)$ is called an admissible test function if it satisfies

\[ \lim_{\varepsilon \to 0} \int_{\Omega} \left| \phi\left(x, \frac{x}{\varepsilon}\right) \right|^2 dx = \int_{\Omega} \int_Y |\phi(x, y)|^2 \, dy \, dx. \tag{2.5.1} \]

Note that an admissible test function is one for which, asymptotically, the dependence on $x$ and $x/\varepsilon$ decouples in the calculation of $L^2$–norms. By Theorem 2.26 any $\phi \in L^2(\Omega; C_{per}(Y))$ is an admissible test function. Now we are ready to give the definition of two–scale convergence.

Definition 2.29. Let $u^{\varepsilon}$ be a sequence in $L^2(\Omega)$. We will say that $u^{\varepsilon}$ two–scale converges to $u_0(x, y) \in L^2(\Omega \times Y)$ and write $u^{\varepsilon} \overset{2}{\rightharpoonup} u_0$ if for every admissible test function $\phi$ we have

\[ \lim_{\varepsilon \to 0} \int_{\Omega} u^{\varepsilon}(x) \phi\left(x, \frac{x}{\varepsilon}\right) dx = \int_{\Omega} \int_Y u_0(x, y) \phi(x, y) \, dy \, dx. \tag{2.5.2} \]

Two–scale convergence implies weak convergence in $L^2(\Omega)$. In particular, we have the following lemma.

Lemma 2.30. Let $u^{\varepsilon}$ be a sequence in $L^2(\Omega)$ which two–scale converges to $u_0(x, y) \in L^2(\Omega \times Y)$. Then

\[ u^{\varepsilon} \rightharpoonup \bar{u}_0(x) := \int_Y u_0(x, y) \, dy, \quad \text{weakly in } L^2(\Omega). \]


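Proof. Choose a test function $\phi(x) \in L^2(\Omega)$, independent of $y$. This is an admissible test function since $\int_Y dy = 1$ and we can use it in (2.5.2) to deduce:

\[ \lim_{\varepsilon \to 0} \int_{\Omega} u^{\varepsilon}(x) \phi(x) \, dx = \int_{\Omega} \int_Y u_0(x, y) \phi(x) \, dy \, dx = \int_{\Omega} \left( \int_Y u_0(x, y) \, dy \right) \phi(x) \, dx = (\bar{u}_0, \phi)_{L^2(\Omega)}, \]

where $\bar{u}_0(x) = \int_Y u_0(x, y) \, dy$ denotes the average of $u_0$ over $Y$. The above holds for every $\phi(x) \in L^2(\Omega)$ and, hence, $u^{\varepsilon}$ converges to $\bar{u}_0$ weakly in $L^2(\Omega)$.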

An immediate consequence of the above lemma is the following.

Corollary 2.31. Let $u^{\varepsilon}$ be a sequence in $L^2(\Omega)$ which two–scale converges to $u_0(x) \in L^2(\Omega)$, i.e. the two–scale limit is independent of $y$. Then the weak $L^2$–limit and the two–scale limit coincide.
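When the two–scale limit does depend on $y$, it carries information that the weak limit loses. The sketch below illustrates this on $\Omega = (0, 1)$ with the hypothetical choice $u^{\varepsilon}(x) = x \sin(2\pi x/\varepsilon)$, whose candidate two–scale limit is $u_0(x, y) = x \sin(2\pi y)$; the test functions, grid and helper names are assumptions made for the illustration. Testing against $\psi(x) = 1$ gives a pairing tending to $0$, while testing against $\phi(x, y) = \sin(2\pi y)$ gives $\int_0^1 \int_0^1 x \sin^2(2\pi y) \, dy \, dx = 1/4$.

```python
import numpy as np

def integrate(values, x):
    """Composite trapezoidal rule on the grid x."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(x)))

x = np.linspace(0.0, 1.0, 1_000_001)

def u_eps(eps):
    """u^eps(x) = u0(x, x / eps) with u0(x, y) = x * sin(2 pi y)."""
    return x * np.sin(2.0 * np.pi * x / eps)

def weak_pairing(eps):
    """Pairing with the y-independent test function psi(x) = 1."""
    return integrate(u_eps(eps), x)

def two_scale_pairing(eps):
    """Pairing with the oscillating test function phi(x, x / eps), phi(x, y) = sin(2 pi y)."""
    return integrate(u_eps(eps) * np.sin(2.0 * np.pi * x / eps), x)

for eps in (0.1, 0.01, 0.001):
    print(f"eps = {eps:6.3f}   weak = {weak_pairing(eps): .6f}   two-scale = {two_scale_pairing(eps):.6f}")
```

The first column tends to $0$, consistent with a weak limit equal to the $y$–average of $u_0$ (here zero); the second stays near $1/4$, matching the right hand side of (2.5.2) for this pair.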

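Two–scale convergence is a useful tool for studying multiscale expansions, with a periodic dependence on the "fast variable", as the next result indicates.

Lemma 2.32. Consider a function $u^{\varepsilon}(x) \in L^2(\Omega)$ of the form

\[ u^{\varepsilon}(x) = u_0\left(x, \frac{x}{\varepsilon}\right) + \varepsilon u_1\left(x, \frac{x}{\varepsilon}\right), \]

where $u_j(x, y) \in L^2(\Omega; C_{per}(Y))$, $j = 0, 1$, $\Omega$ being a bounded domain in $\mathbb{R}^d$. Then $u^{\varepsilon} \overset{2}{\rightharpoonup} u_0$.

Proof. Let $\phi(x, y) \in L^2(\Omega; C_{per}(Y))$ and define $f_j(x, y) = u_j(x, y) \phi(x, y)$, $j = 0, 1$. We will use the notation $f^{\varepsilon}(x) = f\left(x, \frac{x}{\varepsilon}\right)$. We clearly have

\[ \int_{\Omega} u^{\varepsilon}(x) \phi\left(x, \frac{x}{\varepsilon}\right) dx = \int_{\Omega} f_0^{\varepsilon}(x) \, dx + \varepsilon \int_{\Omega} f_1^{\varepsilon}(x) \, dx. \tag{2.5.3} \]

Now, $f_j(x, y) \in L^2(\Omega; C_{per}(Y))$ for $j = 0, 1$. This implies, by Theorem 2.26, that $f_0^{\varepsilon}$ converges to its average over $Y$, $\int_Y f_0(x, y) \, dy$, weakly in $L^2(\Omega)$. Now choose the test function $\psi = 1$ (notice that $1 \in L^2(\Omega)$, since $\Omega$ is a bounded subset of $\mathbb{R}^d$) to obtain

\[ \int_{\Omega} f_0^{\varepsilon}(x) \, dx \to \int_{\Omega} \int_Y f_0(x, y) \, dy \, dx = \int_{\Omega} \int_Y u_0(x, y) \phi(x, y) \, dy \, dx. \]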


Now consider the second integral on the right hand side of (2.5.3). Since the sequence $f_1^{\varepsilon}$ is weakly convergent in $L^2(\Omega)$, it is bounded by Theorem 2.7. Thus, using again the boundedness of $\Omega$, together with the Cauchy–Schwarz inequality, we obtain:

\[ \varepsilon \int_{\Omega} f_1^{\varepsilon}(x) \, dx \le \varepsilon C \|f_1^{\varepsilon}\|_{L^2(\Omega)} \le \varepsilon C \to 0. \]

We use the above two calculations in (2.5.3) to obtain:

\[ \int_{\Omega} u^{\varepsilon}(x) \phi\left(x, \frac{x}{\varepsilon}\right) dx \to \int_{\Omega} \int_Y u_0(x, y) \phi(x, y) \, dy \, dx. \]

Hence, $u^{\varepsilon}$ two–scale converges to $u_0$.

Now, we would like to find criteria which enable us to conclude that a given sequence in $L^2(\Omega)$ is two–scale convergent. The following compactness result provides us with such a criterion.

Theorem 2.33. Let $u^{\varepsilon}$ be a bounded sequence in $L^2(\Omega)$. Then there exists a subsequence, still denoted by $u^{\varepsilon}$, and a function $u_0(x, y) \in L^2(\Omega \times Y)$ such that $u^{\varepsilon}$ two–scale converges to $u_0(x, y)$.

The two–scale convergence defined in Definition 2.29 is still a weak type of convergence, since it is defined in terms of the product of a sequence $u^{\varepsilon}$ with an appropriate test function. We can also define a notion of strong two–scale convergence.

Definition 2.34. Let $u^{\varepsilon}$ be a sequence in $L^2(\Omega)$. We will say that $u^{\varepsilon}$ two–scale converges strongly to $u_0(x, y) \in L^2(\Omega \times Y)$ and write $u^{\varepsilon} \overset{2}{\rightarrow} u_0$ if

\[ \lim_{\varepsilon \to 0} \int_{\Omega} |u^{\varepsilon}(x)|^2 \, dx = \int_{\Omega} \int_Y |u_0(x, y)|^2 \, dy \, dx. \tag{2.5.4} \]

Although every strongly two–scale convergent sequence is also two–scale convergent, the converse is not true. Notice that, in view of the above definition, we can define the class of admissible test functions to be the subset of elements $\phi(x, y)$ of $L^2(\Omega \times Y)$ which are periodic in $y$ and for which $\phi^{\varepsilon}(x) := \phi(x, x/\varepsilon)$ is strongly two–scale convergent.

As is always the case with weak convergence, the limit of the product of two two–scale convergent sequences is not in general the product of the limits. However, we can pass to the limit when one of the two sequences is strongly two–scale convergent.


Theorem 2.35. (i) Let $u^{\varepsilon}$, $v^{\varepsilon}$ be sequences in $L^2(\Omega)$ such that $u^{\varepsilon} \overset{2}{\rightarrow} u_0$ and $v^{\varepsilon} \overset{2}{\rightharpoonup} v_0$. Then $u^{\varepsilon} v^{\varepsilon} \overset{2}{\rightharpoonup} u_0 v_0$.

(ii) Assume further that $u_0(x, y) \in L^2(\Omega; C_{per}(Y))$. Then

\[ \left\| u^{\varepsilon}(x) - u_0\left(x, \frac{x}{\varepsilon}\right) \right\|_{L^2(\Omega)} \to 0. \]

So far we have only considered bounded sequences in $L^2(\Omega)$ whose two–scale limit is an element of $L^2(\Omega \times Y)$ and depends explicitly on $y$. It is now natural to ask whether more information on the two–scale limit can be obtained when our sequence is bounded in a stronger norm.

Theorem 2.36. (i) Let $u^{\varepsilon}$ be a bounded sequence in $H^1(\Omega)$. Then $u^{\varepsilon}$ two–scale converges to its weak–$H^1$ limit $u(x)$. Further, there exists a function $u_1(x, y) \in L^2(\Omega; H_{per}^1(Y)/\mathbb{R})$ such that, up to a subsequence, $\nabla u^{\varepsilon}$ two–scale converges to $\nabla_x u(x) + \nabla_y u_1(x, y)$.

(ii) Let $u^{\varepsilon}$ and $\varepsilon \nabla u^{\varepsilon}$ be uniformly bounded sequences in $L^2(\Omega)$. Then there exists a function $u_0(x, y) \in L^2(\Omega; H_{per}^1(Y))$ such that, up to a subsequence, $u^{\varepsilon}$ and $\varepsilon \nabla u^{\varepsilon}$ two–scale converge to $u_0(x, y)$ and to $\nabla_y u_0(x, y)$, respectively.

(iii) Let $v^{\varepsilon}$ be a divergence–free field which is bounded in $[L^2(\Omega)]^d$. Then the two–scale limit satisfies $\nabla_y \cdot v_0(x, y) = 0$ and $\int_Y \nabla_x \cdot v_0(x, y) \, dy = 0$.

Since we only know that $v_0(x, y) \in [L^2(\Omega \times Y)]^d$, we have to interpret the divergence of $v_0(x, y)$ with respect to $x$ and $y$ in the sense of weak derivatives.

2.5.2 Two–Scale convergence for time dependent problems

When studying homogenization problems for evolution PDE it is necessary to modify the concept of two–scale convergence to take into account the time dependence of the sequences of functions we consider. Below we present the relevant definitions and theorems. As in the previous subsection, we let $\Omega$ be a subset of $\mathbb{R}^d$, not necessarily bounded, and $Y = [0, 1]^d$. We use $(x, y, t)$ to denote a point in $\Omega \times Y \times [0, T]$.

Definition 2.37. A function $\phi(x, y, t) \in L^2((0, T) \times \Omega; L_{per}^2(Y))$ is called an admissible test function if it satisfies

\[ \lim_{\varepsilon \to 0} \int_0^T \int_{\Omega} \left| \phi\left(x, \frac{x}{\varepsilon}, t\right) \right|^2 dx \, dt = \int_0^T \int_{\Omega} \int_Y |\phi(x, y, t)|^2 \, dy \, dx \, dt. \tag{2.5.5} \]


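Definition 2.38. A sequence $u^\varepsilon \in L^2((0,T) \times \Omega)$ two–scale converges to $u_0(x,y,t) \in L^2((0,T) \times \Omega \times Y)$, and we write $u^\varepsilon \overset{2}{\rightharpoonup} u_0$, if for every admissible test function $\phi(x,y,t)$ we have

$$\lim_{\varepsilon \to 0} \int_0^T \int_\Omega u^\varepsilon(x,t)\, \phi\left(x, \frac{x}{\varepsilon}, t\right) dx\, dt = \int_0^T \int_\Omega \int_Y u_0(x,y,t)\, \phi(x,y,t)\, dy\, dx\, dt. \tag{2.5.6}$$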

The basic compactness theorem of two–scale convergence, Theorem 2.33, is still valid:

Theorem 2.39. Let $u^\varepsilon$ be a bounded sequence in $L^2((0,T) \times \Omega)$. Then there exists a subsequence, still denoted by $u^\varepsilon$, and a function $u_0(x,y,t) \in L^2((0,T) \times \Omega \times Y)$ such that $u^\varepsilon(x,t)$ two–scale converges to $u_0(x,y,t)$. Moreover, $u^\varepsilon$ converges weakly in $L^2((0,T) \times \Omega)$ to the average of the two–scale limit over the unit cell:

$$u^\varepsilon \rightharpoonup \int_Y u_0(x,y,t)\, dy, \quad \text{weakly in } L^2((0,T) \times \Omega).$$
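The weak convergence to the cell average can be observed numerically. The following is a minimal sketch, not taken from the text: for simplicity it drops the time variable, takes Ω = (0, 1), and uses a hypothetical profile u0(x, y) = x + cos(2πy) together with the test function φ(x) = x². The names `u0`, `phi` and `pairing` are illustrative only.

```python
import math

def u0(x, y):
    # Hypothetical two-scale profile: 1-periodic in y, with cell average equal to x.
    return x + math.cos(2.0 * math.pi * y)

def phi(x):
    # Smooth test function depending on x only.
    return x * x

def pairing(eps, n=20000):
    """Midpoint-rule approximation of the integral of u0(x, x/eps) * phi(x) over (0, 1)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += u0(x, x / eps) * phi(x)
    return total * h

# Pairing of the cell average (here equal to x) with phi: the integral of x^3 over (0, 1).
limit = 0.25

errors = {eps: abs(pairing(eps) - limit) for eps in (0.5, 0.1, 0.02)}
for eps, err in errors.items():
    print(f"eps = {eps:5.2f}   |pairing - limit| = {err:.2e}")
```

The oscillating part of the profile contributes less and less to the pairing as ε decreases, although the sequence itself does not converge strongly.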

We already know that bounds on better spaces give us more information on the two–scale limit. The theorem below is the analogue of the first two parts of Theorem 2.36.

Theorem 2.40. (i) Let $u^\varepsilon$ be a bounded sequence in $L^2((0,T); H^1(\Omega))$. Then $u^\varepsilon$ two–scale converges to its weak–$L^2((0,T); H^1(\Omega))$ limit $u(x,t)$. Furthermore, there exists a function $u_1(x,y,t) \in L^2((0,T) \times \Omega; H^1_{\mathrm{per}}(Y)/\mathbb{R})$ such that, up to a subsequence, $\nabla u^\varepsilon$ two–scale converges to $\nabla_x u(x,t) + \nabla_y u_1(x,y,t)$.

(ii) Let $u^\varepsilon$ and $\varepsilon \nabla u^\varepsilon$ be uniformly bounded sequences in $L^2((0,T) \times \Omega)$ and $(L^2((0,T) \times \Omega))^d$, respectively. Then there exists a function $u_0(x,y,t) \in L^2((0,T) \times \Omega; H^1_{\mathrm{per}}(Y)/\mathbb{R})$ such that, up to a subsequence, $u^\varepsilon$ and $\varepsilon \nabla u^\varepsilon$ two–scale converge to $u_0(x,y,t)$ and to $\nabla_y u_0(x,y,t)$, respectively.

The proofs of these two theorems are almost identical to the proofs of the corresponding results from Section 2.5.

2.6 Equations in Hilbert Spaces

Throughout this book we will frequently encounter linear PDE in a Hilbert space setting. It is consequently useful to develop an abstract formulation for such problems. We summarize this theory here. There are two main components: the Lax–Milgram existence theory and the Fredholm alternative.

2.6.1 Lax-Milgram Theory

Let H be a Hilbert space with inner product (·, ·) and let A : H → H∗ be a linear operator. Let f ∈ H∗ and let 〈·, ·〉 denote the pairing between H and H∗. We are interested in studying the equation

Au = f. (2.6.1)

An equivalent formulation of this equation is

(Au, v) = 〈f, v〉 ∀ v ∈ H. (2.6.2)

The linearity of A implies that the left hand side of the above equation defines a bilinear form a : H × H → R given by

a[u, v] = (Au, v).

Existence and uniqueness of solutions for equations of the form (2.6.2) can be proved by means of the following theorem.

Theorem 2.41. (Lax–Milgram). Let H be a Hilbert space with norm ‖ · ‖ and inner product (·, ·). Let 〈·, ·〉 denote the pairing between H∗ and H. Let a : H × H → R be a bilinear mapping which satisfies the following properties:

(i) (Coercivity) There exists a constant α > 0 such that

$$a[u,u] \ge \alpha \|u\|^2 \quad \forall\, u \in H.$$

(ii) (Continuity) There exists a constant β > 0 such that

$$|a[u,v]| \le \beta \|u\| \|v\| \quad \forall\, u, v \in H.$$

Let now f : H → R be a bounded linear functional on H. Then there exists a unique element u ∈ H such that

a[u, v] = 〈f, v〉

for all v ∈ H .
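A finite dimensional illustration may help to fix ideas. The sketch below is our own example, not part of the text: it takes H = R² and a hypothetical non-symmetric matrix A, for which a[u, u] = 2‖u‖² (so α = 2) and β = √5. It shows that symmetry of the bilinear form is not required for unique solvability.

```python
import math

# Hypothetical example on H = R^2: a[u, v] = (A u, v) with a non-symmetric matrix A.
A = [[2.0, 1.0],
     [-1.0, 2.0]]
f = [1.0, 3.0]
alpha = 2.0              # coercivity constant: a[u, u] = 2 |u|^2 for this A
beta = math.sqrt(5.0)    # continuity constant: the Euclidean operator norm of A

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def a(u, v):
    """Bilinear form a[u, v] = (A u, v)."""
    Au = [A[0][0] * u[0] + A[0][1] * u[1],
          A[1][0] * u[0] + A[1][1] * u[1]]
    return dot(Au, v)

def solve_2x2(M, b):
    """Solve M u = b by Cramer's rule (M is assumed invertible)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - b[0] * M[1][0]) / det]

# The unique u with a[u, v] = <f, v> for all v; it suffices to test basis vectors.
u = solve_2x2(A, f)
residuals = [a(u, e) - dot(f, e) for e in ([1.0, 0.0], [0.0, 1.0])]
print("u =", u, " residuals =", residuals)
```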


2.6.2 Fredholm Alternative

Assume now that A : H → H is a linear operator and consider equation (2.6.1) with f ∈ H. Let A∗ : H → H denote the adjoint of A which is defined through

(Au, v) = (u,A∗v) ∀u, v ∈ H.

Let now v ∈ H belong to the null space N (A∗) of A∗ where

$$\mathcal{N}(A^*) = \{ v \in H : A^* v = 0 \}.$$

Equation (2.6.2) implies that

(f, v) = 0 ∀ v ∈ N (A∗).

Consequently, a necessary condition for the existence of a solution for (2.6.1) is that the right hand side of this equation is orthogonal to the null space of the adjoint operator of A.

The above formal argument can be made rigorous in the case where A is a compact perturbation of the identity: A = I − K, with K compact. A bounded operator K : H → H is compact if it maps bounded sets into precompact ones. Equivalently, K is compact if and only if for every bounded sequence $\{u_n\}_{n=1}^\infty \subset H$, the sequence $\{K u_n\}_{n=1}^\infty$ has a strongly convergent subsequence in H. We now study (2.6.1) in the case where A = I − K. The following result will be used repeatedly in these notes.

Theorem 2.42. (Fredholm Alternative). Let H be a Hilbert space and let K : H → H be a compact operator. Then the following alternative holds.

(i) Either the equations

$$(I - K) u = f, \tag{2.6.3a}$$

$$(I - K^*) U = F \tag{2.6.3b}$$

have unique solutions for every f, F ∈ H, or

(ii) the homogeneous equations

$$(I - K) u_0 = 0, \qquad (I - K^*) U_0 = 0$$

have the same finite number of non–trivial solutions:

$$\dim(\mathcal{N}(I - K)) = \dim(\mathcal{N}(I - K^*)) < \infty.$$

In this case equations (2.6.3a) and (2.6.3b) have a solution if and only if

$$(f, U_0) = 0 \quad \forall\, U_0 \in \mathcal{N}(I - K^*)$$

and

$$(F, u_0) = 0 \quad \forall\, u_0 \in \mathcal{N}(I - K),$$

respectively.
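Case (ii) of the alternative can be seen already in R², where every linear operator is compact. The following sketch uses a hypothetical singular matrix I − K of our own choosing and checks the solvability condition (f, U0) = 0 directly; the vectors `u_null` and `U_null` were computed by hand for this particular matrix.

```python
# Hypothetical example on H = R^2 (every linear operator is compact here).
# I - K is singular, so case (ii) of the alternative applies.
I_minus_K = [[1.0, 1.0],
             [0.0, 0.0]]

def apply(M, u):
    return [M[0][0] * u[0] + M[0][1] * u[1], M[1][0] * u[0] + M[1][1] * u[1]]

def transpose(M):
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

u_null = [1.0, -1.0]   # spans N(I - K)
U_null = [0.0, 1.0]    # spans N(I - K*); both null spaces have dimension one

def is_solvable(f, tol=1e-12):
    """Solvability condition (f, U0) = 0 for all U0 in N(I - K*)."""
    return abs(dot(f, U_null)) < tol

f_good = [3.0, 0.0]          # orthogonal to U_null: (I - K) u = f_good has solutions
f_bad = [3.0, 1.0]           # not orthogonal to U_null: no solution exists
u_particular = [3.0, 0.0]    # one solution; adding multiples of u_null gives the others

print(is_solvable(f_good), is_solvable(f_bad))
```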

2.7 Discussion and Bibliography

Most of the material presented in this chapter is very standard and can be found in many books on functional analysis and PDE. For more information on Banach, Hilbert and Lp spaces see e.g. [25, 84, 89, 129, 157].

Since L1(Ω) is not reflexive, a bounded sequence in L1(Ω) does not necessarily have any weakly convergent subsequences. A natural question that arises is under what conditions we can extract a weakly convergent subsequence from a sequence in L1(Ω). The answer to this question is given through the Dunford–Pettis theorem: apart from boundedness we also need equi–integrability. We refer to e.g. [40] for details. The non-reflexivity of L1(Ω) implies that this space cannot be characterized as the dual of a Banach space, and hence weak-* convergence is not a useful concept for this space. Weak-* convergence becomes, however, an extremely important concept for (Cb(Ω))∗ =: M(Ω), the space of Radon measures on Ω.² In fact, a bounded sequence in L1(Ω) is weakly-* compact in M(Ω): we can extract a weakly-* convergent subsequence that converges to an element of M(Ω). Probabilists prefer to call weak-* convergence in (Cb(Ω))∗ weak convergence of probability measures. This is the most useful (and natural) concept for limit theorems in probability theory, and will be the topic of Section 3.5. The interested reader may also consult [75, 19].

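Sobolev spaces of periodic functions can be defined using Fourier series. For example, the space $H^1_{\mathrm{per}}(Y)$ can be defined as

$$H^1_{\mathrm{per}}(Y) = \Big\{ u : u = \sum_{k \in \mathbb{Z}^d} u_k e^{2\pi i k \cdot x}, \ \overline{u_k} = u_{-k}, \ \sum_{k \in \mathbb{Z}^d} |k|^2 |u_k|^2 < \infty \Big\}.$$

Similarly we have

$$H = \big\{ u \in H^1_{\mathrm{per}}(Y) : u_0 = 0 \big\}.$$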

² This space is (much) larger than L1(Ω), which is a proper subset of M(Ω).


Sobolev spaces of periodic functions are discussed in [129, 147]. An exhaustive treatment of Sobolev spaces can be found in [2]. For many applications to the theory of PDEs, the material presented in [45, Ch. 5] is sufficient.

The concept of two–scale convergence was introduced by Nguetseng [108, 109] and later popularized and developed further by Allaire [3, 4]. Most of the results presented in Section 2.5 are taken from [4]. We remark that the set of admissible test functions in Definition 2.28 is a proper subset of L2(Ω × Y); there are elements of L2(Ω × Y) which do not satisfy (2.5.1); see [4] for an example. Moreover, a strongly two–scale convergent sequence converges to its two–scale limit strongly in L2(Ω), provided that the limit is regular enough. Notice however that the two–scale limit will not in general possess any further regularity. In fact, every function u0(x, y) in L2(Ω × Y) is attained as a two–scale limit of some sequence in L2(Ω) [4, Lem. 1.13].

In Section 2.5.2 we considered sequences of functions which do not oscillate in time. The concept of two–scale convergence has been extended to cover the case of sequences of the form

$$u^\varepsilon = u\left(x, \frac{x}{\varepsilon}, t, \frac{t}{\varepsilon^p}\right),$$

where p > 0 and u(x, y, t, τ) is periodic in both y and τ. See [67]. The concept of two–scale convergence has also been extended to cover the case of functions which depend on more than two scales, i.e.

$$u^\varepsilon = u\left(x, \frac{x}{\varepsilon}, \frac{x}{\varepsilon^2}, \ldots\right).$$

See [5]. A concept similar to that of two–scale convergence has also been developed for non–periodic oscillations. See [101].

The use of appropriately chosen test functions in order to study asymptotic problems for PDE is a standard technique. See e.g. [42]. A very similar technique to that of two–scale convergence was introduced many years ago by Kurtz [85]. The approach is taken further in the perturbed test function method of Evans [44, 43].

The Fredholm alternative holds in normed spaces. That is, neither completeness nor the inner product structure is necessary; see for instance [84, sec. 8.7].


2.8 Exercises

1. Let (X, ‖ · ‖) be a Banach space. Show that every strongly convergent sequence is weakly convergent.

2. Let (X, ‖ · ‖) be a finite dimensional Banach space. Show that every weakly convergent sequence is strongly convergent.

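3. Let Ω = (0, 1) ⊂ R. Define

$$u(x) = \begin{cases} x & \text{for } 0 \le x \le \tfrac{1}{2}, \\ 1 - x & \text{for } \tfrac{1}{2} \le x \le 1. \end{cases}$$

Show that the weak derivative of u(x) is

$$\frac{du}{dx} = \begin{cases} 1 & \text{for } 0 \le x \le \tfrac{1}{2}, \\ -1 & \text{for } \tfrac{1}{2} < x \le 1. \end{cases}$$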

Is this function differentiable in the classical sense?

4. Consider the function u(x) from the previous exercise. Show that $u(x) \in H^1_0(\Omega)$.

5. Recall H defined in (2.4.5). Use Fourier series to prove that the Poincaré inequality holds in H:

$$\|u\|_{L^2} \le \frac{1}{2\pi} \|\nabla u\|_{L^2} \quad \forall\, u \in H.$$

6. Let Ω = (0, 1) ⊂ R and define uα(x) = |x|α. For what values of α does uα(x) ∈ H1(Ω)?

7. Prove the Poincaré inequality for a function f ∈ C∞(0, L), L > 0. Estimate the optimal value of the Poincaré constant $C_L$. Show that

$$\lim_{L \to \infty} C_L = \infty.$$

Interpret this result.

8. Consider a function $u^\varepsilon(x) \in L^2(\Omega)$ which admits the following two–scale expansion

$$u^\varepsilon(x) = \sum_{j=0}^{N} \varepsilon^j u_j\left(x, \frac{x}{\varepsilon}\right),$$

where $u_j(x,y) \in L^2(\Omega; C_{\mathrm{per}}(Y))$, j = 0, 1, . . . , N, and Ω is a bounded domain in $\mathbb{R}^d$. Show that $u^\varepsilon \overset{2}{\rightharpoonup} u_0$ (this is a generalization of Lemma 2.32).


Chapter 3

Probability Theory and Stochastic Processes

3.1 Set–Up

In this chapter we present some basic definitions and results from probability theory and from the theory of stochastic processes. We define the Wiener process (Brownian motion) and develop the Ito theory of stochastic integration. We summarize the basic properties of martingales, and apply these to Ito integrals. When studying dimension reduction for Markovian problems we will mainly work in the context of weak convergence of probability measures. In particular, when applied on path space, this will require us to study tightness (relative compactness) of probability measures. Hence in this chapter we will also summarize some basic limit theorems for stochastic processes, in particular weak convergence results in path space. We will see in later chapters that averaging and homogenization for SDE are essentially limit theorems for stochastic processes, and that they are intimately related to the theory of weak convergence of probability measures in metric spaces.

3.2 Probability and Expectation

A measurable space is a pair (Ω, F) where Ω is a set and F is a σ–algebra of subsets of Ω.¹ Let (Ω, F) and (E, G) be two measurable spaces. A function X : Ω ↦ E such that the event

$$\{ \omega \in \Omega : X(\omega) \in A \} =: \{ X \in A \}$$

belongs to F for arbitrary A ∈ G is called a measurable function or random variable.

¹ A collection of subsets of a set Ω is called a σ–algebra if it contains Ω and is closed under the operations of taking complements and countable unions of its elements.

Let (Ω, F) be a measurable space. A function µ : F ↦ [0, 1] is called a probability measure if µ(∅) = 0, µ(Ω) = 1 and $\mu(\cup_{k=1}^\infty A_k) = \sum_{k=1}^\infty \mu(A_k)$ for all sequences of pairwise disjoint sets $\{A_k\}_{k=1}^\infty \subset \mathcal{F}$. The triplet (Ω, F, µ) is called a probability space. Let X be a measurable function (random variable) from (Ω, F, µ) to (E, G). The expectation of X is defined as

$$\mathbb{E}[X] = \int_\Omega X(\omega)\, d\mu(\omega).$$

More generally, let f : E ↦ R be G–measurable. Then,

$$\mathbb{E}[f(X)] = \int_\Omega f(X(\omega))\, d\mu(\omega),$$

provided that the integral exists. Every random variable from a probability space (Ω, F, µ) to a measurable space (S, B(S))² induces a probability measure on S:

$$\mu_X(B) = \mathbb{P} X^{-1}(B) = \mu(\{\omega \in \Omega ; X(\omega) \in B\}), \quad B \in \mathcal{B}(S).$$

The measure µX is called the distribution of X. We can use the distribution of a random variable to compute expectations and probabilities:

$$\mathbb{E}[f(X)] = \int_S f(x)\, d\mu_X(x)$$

and

$$\mathbb{P}[X \in G] = \int_G d\mu_X(x), \quad G \in \mathcal{B}(S).$$

When S = Rd and we can write

$$d\mu_X(x) = f(x)\, dx,$$

then we refer to f(x) as the probability density function (pdf) for X.

When E = Rd then by Lp(Ω; Rd), or simply Lp(µ), we mean the Banach space of functions on Ω with norm

$$\|X\| = \big( \mathbb{E}|X|^p \big)^{1/p}.$$

² Let U be a topological space. We will use the notation B(U) to denote the Borel σ–algebra of U, i.e. the smallest σ–algebra containing all open sets of U.


Example 3.1. i) Consider the random variable X : Ω ↦ R with pdf

$$\gamma_{\sigma,m}(x) := (2\pi\sigma)^{-\frac{1}{2}} \exp\left( -\frac{(x-m)^2}{2\sigma} \right).$$

Such a random variable is a Gaussian or normal random variable. The mean is

$$\mathbb{E}(X) = \int_{\mathbb{R}} x\, \gamma_{\sigma,m}(x)\, dx = m$$

and the variance is

$$\mathbb{E}[(X-m)^2] = \int_{\mathbb{R}} (x-m)^2 \gamma_{\sigma,m}(x)\, dx = \sigma.$$

Since the mean and variance completely specify a Gaussian random variable on R, the Gaussian is commonly denoted by N(m, σ).

ii) Let $m \in \mathbb{R}^d$ and let $\Sigma \in \mathbb{R}^{d \times d}$ be symmetric and positive definite. The random variable X : Ω ↦ Rd with pdf

$$\gamma_{\Sigma,m}(x) = \big( (2\pi)^d |\Sigma| \big)^{-\frac{1}{2}} \exp\left( -\frac{1}{2} \langle \Sigma^{-1}(x-m), (x-m) \rangle \right),$$

with $|\Sigma| = \det \Sigma$, is called the multivariate normal (Gaussian) random variable. The mean is

$$\mathbb{E}(X) = m \tag{3.2.1}$$

and the covariance matrix is

$$\mathbb{E}\big( (X-m) \otimes (X-m) \big) = \Sigma. \tag{3.2.2}$$

Since the mean and covariance matrix completely specify a Gaussian random variable on Rd, the Gaussian is commonly denoted by N(m, Σ).

Example 3.2. An exponential random variable T : Ω → R+ with rate λ > 0 satisfies

$$\mathbb{P}(T > t) = e^{-\lambda t}, \quad t \ge 0.$$

We write T ∼ exp(λ). The related pdf is

$$f_T(t) = \begin{cases} \lambda e^{-\lambda t}, & t \ge 0, \\ 0, & t < 0. \end{cases} \tag{3.2.3}$$


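Notice that

$$\mathbb{E}(T) = \int_{-\infty}^{\infty} t f_T(t)\, dt = \frac{1}{\lambda} \int_0^\infty (\lambda t) e^{-\lambda t}\, d(\lambda t) = \frac{1}{\lambda}.$$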

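If the times $\tau_n = t_{n+1} - t_n$ are i.i.d. random variables with $\tau_0 \sim \exp(\lambda)$ then, if $t_0 = 0$,

$$t_n = \sum_{k=0}^{n-1} \tau_k$$

and it is straightforward to show that

$$\mathbb{P}(0 \le t_k \le t < t_{k+1}) = \frac{e^{-\lambda t} (\lambda t)^k}{k!}. \tag{3.2.4}$$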
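Formula (3.2.4) can be checked by simulation. The sketch below is illustrative only: it draws exponential waiting times with the standard library generator, counts how many arrival times fall in [0, t], and compares the empirical frequencies with the right hand side of (3.2.4). The values of λ, t and the sample size are arbitrary choices.

```python
import math
import random

def count_arrivals(lam, t, rng):
    """Number k of arrival times t_1 < t_2 < ... that fall in [0, t], with t_0 = 0."""
    k = 0
    time = rng.expovariate(lam)       # t_1 = tau_0
    while time <= t:
        k += 1
        time += rng.expovariate(lam)  # t_{k+1} = t_k + tau_k
    return k

def poisson_pmf(k, mean):
    """Right hand side of (3.2.4) with mean = lambda * t."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

rng = random.Random(0)
lam, t, n_samples = 2.0, 1.5, 20000
counts = [count_arrivals(lam, t, rng) for _ in range(n_samples)]
empirical = {k: counts.count(k) / n_samples for k in range(7)}

for k in range(7):
    print(f"k = {k}: empirical {empirical[k]:.4f}   formula {poisson_pmf(k, lam * t):.4f}")
```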

Assume that E[|X|] < ∞ and let G be a sub–σ–algebra of F. The conditional expectation of X with respect to G is defined to be the function E[X|G] : Ω ↦ E which is G–measurable and satisfies

$$\int_G \mathbb{E}[X|\mathcal{G}]\, d\mu = \int_G X\, d\mu \quad \forall\, G \in \mathcal{G}.$$

We can define the conditional probability P[X ∈ F |G] in a similar manner.

3.3 Stochastic Processes

Let T be an ordered set. A stochastic process is a collection of random variables X = {Xt; t ∈ T} from (Ω, F) to (E, G). The measurable space (Ω, F) is called the sample space. The space (E, G) is called the state space. In this book we will take the set T to be either [0, +∞) or N. Thus we will assume that all elements of T are nonnegative. The state space E will usually be Rd equipped with the σ–algebra of Borel sets or, on some occasions, Td. When the ordered set T is clear from the context we will sometimes write Xt rather than {Xt; t ∈ T}. Notice that X may be viewed as a function of both t ∈ T and ω ∈ Ω. It is sometimes convenient to write X(t), X(t, ω) or Xt(ω) instead of Xt.

The finite-dimensional distributions of a stochastic process are the Rk–valued random variables (X(t1), X(t2), . . . , X(tk)) for arbitrary positive integer k and arbitrary times ti ∈ T. A process is called stationary if all such collections of random variables are equal in distribution when translated in time: the distribution of (X(t1), X(t2), . . . , X(tk)) is equal to that of (X(s + t1), X(s + t2), . . . , X(s + tk)) for any s such that s + ti ∈ T.


The most important continuous–time stochastic process is the Wiener process– Brownian motion.

Definition 3.3. i) A one dimensional Brownian motion W(t) : R+ → R is a real valued stochastic process with the following properties:

(a) W(0) = 0.

(b) W(t) is continuous.

(c) W(t) has independent increments. Furthermore, for every t > s ≥ 0, W(t) − W(s) has a Gaussian distribution with mean 0 and variance t − s. That is, the pdf of the random variable W(t) − W(s) is

$$g(t,s;x) = \big( 2\pi(t-s) \big)^{-\frac{1}{2}} \exp\left( -\frac{x^2}{2(t-s)} \right). \tag{3.3.1}$$

ii) A d–dimensional Brownian motion W(t) : R+ → Rd is a collection of d independent one dimensional Brownian motions:

$$W(t) = (W_1(t), \ldots, W_d(t)),$$

where $W_i(t)$, i = 1, . . . , d are independent 1–dimensional Brownian motions. The density of the Gaussian random vector W(t) − W(s) is

$$g(t,s;x) = \big( 2\pi(t-s) \big)^{-d/2} \exp\left( -\frac{\langle x, x \rangle}{2(t-s)} \right).$$

Notice that, for the d–dimensional Brownian motion, and for I the d × d identity matrix, we have (see (3.2.1) and (3.2.2))

$$\mathbb{E}(W(t)) = 0$$

and

$$\mathbb{E}\big[ (W(t) - W(s)) \otimes (W(t) - W(s)) \big] = (t-s) I.$$

Moreover,

$$\mathbb{E}\big[ W(t) \otimes W(s) \big] = \min(t,s)\, I.$$
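These moment identities are easy to test by Monte Carlo in one dimension. The following sketch (our own illustration, with arbitrary times s = 0.5, t = 1 and an arbitrary sample size) builds the pair (W(s), W(t)) from two independent Gaussian increments, as in Definition 3.3, and estimates the covariance and the variance of the increment.

```python
import math
import random

def brownian_pair(s, t, rng):
    """Sample (W(s), W(t)) for 0 < s < t using independent Gaussian increments."""
    w_s = rng.gauss(0.0, math.sqrt(s))              # W(s) - W(0) has variance s
    w_t = w_s + rng.gauss(0.0, math.sqrt(t - s))    # W(t) - W(s) has variance t - s
    return w_s, w_t

rng = random.Random(1)
s, t, n_samples = 0.5, 1.0, 20000
samples = [brownian_pair(s, t, rng) for _ in range(n_samples)]

# Monte Carlo estimates of E[W(t) W(s)] and E[(W(t) - W(s))^2].
cov_st = sum(ws * wt for ws, wt in samples) / n_samples
var_increment = sum((wt - ws) ** 2 for ws, wt in samples) / n_samples

print(f"E[W(t)W(s)]      ~ {cov_st:.3f}   (min(t, s) = {min(t, s)})")
print(f"E[(W(t)-W(s))^2] ~ {var_increment:.3f}   (t - s = {t - s})")
```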

Another important continuous time process is the Poisson process:

Definition 3.4. The Poisson process with intensity λ, denoted by N(t), is a continuous–time stochastic process with independent increments satisfying

$$\mathbb{P}[(N(t) - N(s)) = k] = \frac{e^{-\lambda(t-s)} \big( \lambda(t-s) \big)^k}{k!}, \quad t > s \ge 0, \ k \in \mathbb{N}.$$


Notice the connection to exponential random variables via (3.2.4).

Both Brownian motion and the Poisson process are homogeneous (or time-homogeneous): the increments between successive times s and t depend only on t − s. In these notes we will consider only homogeneous stochastic processes.

Let (Ω, F, µ) be a probability space, (E, ρ) a metric space and let T = [0, ∞). Let {Xt} be a stochastic process from (Ω, F, µ) to (E, ρ) with continuous sample paths. In other words, for all ω ∈ Ω we have that Xt ∈ C(T, E). In this case {Xt} can be thought of as a random variable on (Ω, F, µ) with state space (C(T, E), B(C(T, E))). The probability measure $\mathbb{P} X_t^{-1}$ on (C(T, E), B(C(T, E))) is called the law of {Xt}.

Although Brownian motion is continuous, many stochastic processes do not have continuous sample paths. We denote by D(T, E) the space of right continuous processes Xt : T ↦ E with left limits³ (càdlàg processes). The space D(T, E) will play an important role in the sequel.

Let (Ω, F) be a measurable space and T an ordered set. Let X = Xt(ω) be a stochastic process from the sample space (Ω, F) to the state space (E, G). It is a function of two variables, t ∈ T and ω ∈ Ω. For a fixed ω ∈ Ω the function Xt(ω), t ∈ T is the sample path of the process X associated with ω. Let K be a collection of subsets of Ω. The smallest σ–algebra on Ω which contains K is denoted by σ(K) and is called the σ–algebra generated by K. Let Xt : Ω ↦ E, t ∈ T. The smallest σ–algebra σ(Xt, t ∈ T) such that the family of mappings {Xt, t ∈ T} is a stochastic process with sample space (Ω, σ(Xt, t ∈ T)) and state space (E, G) (that is, each Xt, t ∈ T is measurable) is called the σ–algebra generated by {Xt, t ∈ T}.

A filtration on (Ω, F) is a nondecreasing family {Ft, t ∈ T} of sub–σ–algebras of F: Fs ⊆ Ft ⊆ F for s ≤ t. We set F∞ = σ(∪t∈T Ft). The filtration generated by a stochastic process Xt is

$$\mathcal{F}^X_t := \sigma( X_s ; s \le t ).$$

We say that a stochastic process {Xt; t ∈ T} is adapted to the filtration {Ft} := {Ft, t ∈ T} if for all t ∈ T, Xt is an Ft–measurable random variable.

Definition 3.5. Let {Xt} be a stochastic process defined on a probability space (Ω, F, µ) with values in E and let $\mathcal{F}^X_t$ be the filtration generated by {Xt}. Then {Xt} is a Markov process if

$$\mathbb{P}(X_{t+s} \in \Gamma \,|\, \mathcal{F}^X_t) = \mathbb{P}(X_{t+s} \in \Gamma \,|\, X_t) \tag{3.3.2}$$

for all s, t ∈ T and Γ ∈ B(E). If {Ft} is a filtration with $\mathcal{F}^X_t \subset \mathcal{F}_t$, t ∈ T, then {Xt} is a Markov process with respect to {Ft} if (3.3.2) holds with $\mathcal{F}^X_t$ replaced by $\mathcal{F}_t$.

³ That is, $\lim_{s \to t+} X_s = X_t$ and $\lim_{s \to t-} X_s =: X_{t-}$ exists for all t ∈ [0, ∞).

Roughly speaking, the statistics of Xt+s for s > 0 are completely determined once Xt is known; information about Xt−s for s > 0 is superfluous. With a Markov process {Xt} we can associate a function P : T × T × E × B(E) → R+ defined through the relation

$$\mathbb{P}\big[ X_{t+s} \in \Gamma \,|\, \mathcal{F}^X_t \big] = P(t, t+s, X_t, \Gamma),$$

for all t, s ∈ T, Γ ∈ B(E). The transition function P(s, t, x, Γ) satisfies the Chapman–Kolmogorov equation

$$\int_E P(s,t,x,dy)\, P(t,v,y,\Gamma) = P(s,v,x,\Gamma), \tag{3.3.3}$$

for all x ∈ E, Γ ∈ B(E), s < t < v. A Markov process is homogeneous if

$$P(s,t,x,\Gamma) = P(0, t-s, x, \Gamma).$$

In this case we simplify the notation by setting P(0, t, ·, ·) = P(t, ·, ·) and the Chapman–Kolmogorov equation becomes

$$\int_E P(s,x,dy)\, P(t,y,\Gamma) = P(t+s, x, \Gamma).$$

Let (E, ρ) be a metric space and let {Xt} be an E–valued homogeneous Markov process. Then we can define a semigroup of operators through

$$T(t) f(x) = \int f(y)\, P(t,x,dy)$$

for all f(x) ∈ Cb(E). The generator L of the semigroup T(t) is called the generator of the Markov process {Xt}. Conversely, let T(t) be a contraction semigroup with D(T(t)) ⊂ Cb(E), closed. Then, under mild technical hypotheses, there is an E–valued Markov process {Xt} associated with T(t) defined through

$$\mathbb{E}\big[ f(X(t+s)) \,|\, \mathcal{F}^X_t \big] = T(s) f(X(t))$$

for all s, t ≥ 0 and f ∈ D(T(t)). A Markov process is ergodic if the semigroup T(t) is such that T(t) − I has only constants in its null space. Under some additional compactness assumptions, the Markov process then has an invariant measure µ with the property that, if X0 is distributed according to µ, then so is Xt for all t > 0. The resulting stochastic process, with X0 distributed in this way, is stationary.

Example 3.6. Brownian motion is a time-homogeneous Markov process. The transition function is the Gaussian defined in Example 3.1:

$$P(t,x,dy) = \gamma_{t,x}(y)\, dy.$$

The semigroup associated to the standard Brownian motion is the heat semigroup $T(t) = e^{\frac{1}{2}\Delta t}$. The generator of this Markov process is $\frac{1}{2}\Delta$.
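For the Gaussian transition density of Example 3.6 the homogeneous Chapman–Kolmogorov equation can be verified by numerical quadrature. The sketch below is an illustration under our own choices: it works in one dimension, evaluates both sides at a single point z instead of on a set Γ, and truncates the integral over y to a hypothetical interval [−12, 12], outside of which the integrand is negligible.

```python
import math

def gauss_density(var, mean, y):
    """Density of N(mean, var) at y, with var the variance as in Example 3.1."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def chapman_kolmogorov_lhs(s, t, x, z, half_width=12.0, n=6000):
    """Trapezoidal approximation of the integral over y of p(s, x, y) * p(t, y, z)."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        y = -half_width + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * gauss_density(s, x, y) * gauss_density(t, y, z)
    return total * h

s, t, x, z = 0.3, 0.7, 0.2, -0.5
lhs = chapman_kolmogorov_lhs(s, t, x, z)   # two steps: time s, then time t
rhs = gauss_density(s + t, x, z)           # one step of length s + t

print(f"two-step density: {lhs:.10f}")
print(f"one-step density: {rhs:.10f}")
```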

Example 3.7. The Poisson process is a homogeneous Markov process.

As mentioned above, it is sometimes useful to view a stochastic process Xt as a function of two variables t ∈ T and ω ∈ Ω : X(t, ω). In this context it is then of interest to look at Banach space-valued spaces, as in the previous chapter (see Definition 2.22 and the discussion following it).

Example 3.8. Consider a stochastic process Xt taking values in the space of real-valued continuous functions C([0, T], R). We define Lp(Ω, C([0, T], R)) to be the Banach space equipped with norm ‖X‖ given by

$$\|X\|^p = \mathbb{E}\Big( \sup_{t \in [0,T]} \|X(t)\| \Big)^p.$$

Notice that this is equal to

$$\|X\|^p = \mathbb{E}\Big( \sup_{t \in [0,T]} \|X(t)\|^p \Big)$$

and the norm is usually written this way.

3.4 Martingales and Stochastic Integrals

3.4.1 Martingales

Definition 3.9. Let {Ft} be a filtration of the probability space (Ω, F, µ) and let {Xt} be a stochastic process adapted to {Ft}. We will say that {Xt} is an Ft–martingale if

$$\mathbb{E}[X_{t+s} \,|\, \mathcal{F}_t] = X_t, \quad t, s \in T.$$

A martingale {Mt} is square–integrable if $\mathbb{E}(M_t^2) < \infty$ for all t ∈ T.

Example 3.10. By Definition 3.3 Brownian motion is a martingale with respect to the filtration generated by itself.

Definition 3.11. Let {Mt} be a square integrable Ft–martingale with sample paths in D([0, ∞), Rd). Then there exists an increasing process with sample paths in D([0, ∞), Rd), called the quadratic variation and denoted by 〈M, M〉t, such that for each t > 0 and each sequence of partitions $\{u^{(n)}_k\}$ of [0, t] with $\max_k (u^{(n)}_{k+1} - u^{(n)}_k) \to 0$,

$$\sum_k \big( M(u^{(n)}_{k+1}) - M(u^{(n)}_k) \big) \otimes \big( M(u^{(n)}_{k+1}) - M(u^{(n)}_k) \big) \to \langle M, M \rangle_t, \quad \text{in } L^1(\mu).$$

Example 3.12. i) The quadratic variation of the one dimensional standard Brownian motion is t. Let W(t) be the standard Brownian motion in Rd. Then

$$\langle W, W \rangle_t = I t.$$

ii) Let Xt be a Markov process with generator L and let f ∈ D(L), the domain of definition of L. Then

$$M_t = f(X_t) - f(X_0) - \int_0^t (\mathcal{L} f)(X_s)\, ds \tag{3.4.1}$$

is a martingale. Assume that $f^2 \in D(\mathcal{L})$. Then the quadratic variation of Mt is given by the formula

$$\langle M, M \rangle_t = \int_0^t \big( (\mathcal{L} f^2)(X_s) - 2 f(X_s) (\mathcal{L} f)(X_s) \big)\, ds.$$

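Every continuous martingale satisfies Doob's inequality

$$\mathbb{P}\Big[ \sup_{0 \le t \le T} |M_t| \ge \lambda \Big] \le \frac{1}{\lambda^p} \mathbb{E}|M_T|^p, \tag{3.4.2}$$

for all λ > 0, p ≥ 1, T > 0. Furthermore

$$\mathbb{E} \sup_{0 \le t \le T} |M_t|^p \le \left( \frac{p}{p-1} \right)^p \mathbb{E}|M_T|^p, \tag{3.4.3}$$

for all p > 1, T > 0.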


3.4.2 The Ito Stochastic Integral

For the rigorous analysis of stochastic differential equations it is necessary to define stochastic integrals of the form

$$I(t) = \int_0^t f(s)\, dW(s), \tag{3.4.4}$$

where W(t) is a standard one dimensional Brownian motion and f(t) is a random process, adapted to the filtration Ft generated by W(t), and such that

$$\mathbb{E}\left( \int_0^T f(s)^2\, ds \right) < \infty.$$

Throughout we use the Ito interpretation of the stochastic integral. The Ito stochastic integral is defined as the L2–limit of the Riemann sum approximation of (3.4.4)

$$I(t) := \lim_{N \to \infty} \sum_{k=1}^{N-1} f(t_{k-1}) \big( W(t_k) - W(t_{k-1}) \big), \tag{3.4.5}$$

where $t_n = n\Delta t$ and $N \Delta t = t$. Notice that the function f(t) is evaluated at the left end of each interval $[t_{n-1}, t_n]$ in (3.4.5). The multidimensional Ito integral is defined in a similar way. The resulting Ito stochastic integral I(t) is a.s. continuous in t. These ideas are readily generalized to the case where W(s) is a standard d–dimensional Brownian motion and $f(s) \in \mathbb{R}^{m \times d}$ for each s. The integral then satisfies the Ito isometry

$$\mathbb{E}|I(t)|^2 = \int_0^t \mathbb{E}|f(s)|_F^2\, ds,$$

where $|\cdot|_F$ denotes the Frobenius norm.

Furthermore, it is a martingale:

$$\mathbb{E} I(t) = 0$$

and

$$\mathbb{E}[I(t) \,|\, \mathcal{F}_s] = I(s) \quad \forall\, t \ge s,$$

where Fs denotes the filtration generated by W(s).
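The zero mean property and the Ito isometry can be checked by Monte Carlo for the scalar integrand f(s) = W(s) on [0, 1], for which the right hand side of the isometry equals ∫ s ds = 1/2. The sketch below is illustrative only; the numbers of paths and time steps are arbitrary, and the left-endpoint sum replaces the limit in (3.4.5).

```python
import math
import random

def ito_integral_of_w(n_steps, t_final, rng):
    """Left-endpoint Riemann sum for the Ito integral of W against dW on [0, t_final]."""
    dt = t_final / n_steps
    w = 0.0
    total = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        total += w * dw   # integrand evaluated at the left endpoint of the interval
        w += dw
    return total

rng = random.Random(2)
n_paths, n_steps, t_final = 4000, 200, 1.0
values = [ito_integral_of_w(n_steps, t_final, rng) for _ in range(n_paths)]

mean_I = sum(values) / n_paths
second_moment = sum(v * v for v in values) / n_paths
isometry_rhs = 0.5 * t_final ** 2   # integral of E[W(s)^2] = s over [0, t_final]

print(f"E[I(t)]   ~ {mean_I:.3f}   (expected 0)")
print(f"E[I(t)^2] ~ {second_moment:.3f}   (Ito isometry gives {isometry_rhs})")
```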

Example 3.13. Consider the Ito stochastic integral

$$I(t) = \int_0^t f(s)\, dW(s).$$

This is a martingale with quadratic variation

$$\langle I, I \rangle_t = \int_0^t (f(s))^2\, ds,$$

where f, W are scalar-valued.

We now state a converse result: every square integrable martingale can be expressed in terms of an Ito integral.

Theorem 3.14. Let (Ω, F, µ) be a probability space and let B(t) be a standard, d–dimensional Brownian motion on it. Let Ft be the filtration generated by the Brownian motion and let Mt be a square integrable Ft–martingale with respect to µ. Then there exists a B([0, ∞)) × Ω–measurable function f ∈ L2([0, ∞) × Ω) such that

$$M_t = \mathbb{E}[M_0] + \int_0^t f(s, \omega)\, dB_s.$$

One of the most useful features of martingales is that they satisfy various inequalities. These inequalities are particularly important when proving limit theorems. Perhaps the most important inequality satisfied by martingales is the following.

Theorem 3.15. (Burkholder–Davis–Gundy Inequality). Consider the Ito stochastic integral (3.4.4) with quadratic variation 〈I, I〉t. For every p > 0 there are constants C± such that

$$C_- \, \mathbb{E}\big( \langle I, I \rangle_t^{p/2} \big) \le \mathbb{E}\Big( \sup_{0 \le s \le t} |I(s)|^p \Big) \le C_+ \, \mathbb{E}\big( \langle I, I \rangle_t^{p/2} \big).$$

The proof of this theorem is based on Ito's formula and the Doob martingale inequality (3.4.3).

3.4.3 The Stratonovich Stochastic Integral

In addition to the Ito stochastic integral, the following Stratonovich integral is also useful. It is defined as the L2–limit of a different Riemann sum approximation of (3.4.4), namely

$$I_{\mathrm{strat}}(t) := \lim_{N \to \infty} \sum_{k=1}^{N-1} \frac{1}{2} \big( f(t_{k-1}) + f(t_k) \big) \big( W(t_k) - W(t_{k-1}) \big), \tag{3.4.6}$$

where $t_n = n\Delta t$ and $N \Delta t = t$. Notice that the function f(t) is evaluated at both endpoints of each interval $[t_{n-1}, t_n]$ in (3.4.6). The multidimensional Stratonovich integral is defined in a similar way. The resulting integral is written as

$$I_{\mathrm{strat}}(t) = \int_0^t f(s) \circ dW(s).$$

The resulting limit differs from the Ito integral. The situation is different from that arising in the theory of Riemann integration for functions of bounded variation; in that case the points in $[t_{k-1}, t_k]$ where the integrand is evaluated do not affect the definition of the integral. In the case of integration against Brownian motion, which does not have bounded variation, the definitions differ. However, when f and W are correlated through an SDE, then a formula exists to convert between them; see Chapter 6.
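The difference between the two Riemann sums is visible already for the integrand f = W on a single simulated path. In the sketch below (an illustration with an arbitrary step count and seed) the averaged-endpoint sum telescopes to W(t)²/2, while the left-endpoint sum differs from it by exactly half the sum of squared increments, which is close to t/2.

```python
import math
import random

rng = random.Random(3)
n_steps, t_final = 100000, 1.0
dt = t_final / n_steps

# Brownian path on the grid t_k = k * dt, with W(0) = 0.
w = [0.0]
for _ in range(n_steps):
    w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))

# Left-endpoint (Ito) and averaged-endpoint (Stratonovich) sums for f = W.
ito_sum = sum(w[k - 1] * (w[k] - w[k - 1]) for k in range(1, n_steps + 1))
strat_sum = sum(0.5 * (w[k - 1] + w[k]) * (w[k] - w[k - 1]) for k in range(1, n_steps + 1))

# Sum of squared increments, which approximates the quadratic variation t.
sum_sq_increments = sum((w[k] - w[k - 1]) ** 2 for k in range(1, n_steps + 1))

print(f"Ito sum            : {ito_sum:.4f}")
print(f"Stratonovich sum   : {strat_sum:.4f}   (W(t)^2 / 2 = {0.5 * w[-1] ** 2:.4f})")
print(f"difference         : {strat_sum - ito_sum:.4f}   (t / 2 = {0.5 * t_final})")
```

Summing over all N subintervals, as done here, or over N − 1 of them makes no difference in the limit.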

3.5 Weak Convergence of Probability Measures

A type of convergence that is very often used in probability theory is that of weak convergence of probability measures.

Definition 3.16. Let (S, ρ) be a metric space with Borel σ–field B(S). Let $\{\mu_n\}_{n=1}^\infty$ be a sequence of probability measures on (S, B(S)), and let µ be another measure on this space. We say that $\{\mu_n\}_{n=1}^\infty$ converges weakly to µ, and write µn ⇒ µ, if

$$\lim_{n \to \infty} \int_S f(s)\, d\mu_n(s) = \int_S f(s)\, d\mu(s),$$

for every f ∈ Cb(S).

Definition 3.17. Let $\{(\Omega_n, \mathcal{F}_n, \mu_n)\}_{n=1}^\infty$ be a sequence of probability spaces and let (S, ρ) be a metric space. Let Xn : Ωn ↦ S, n = 1, 2, . . . be a sequence of random variables. Assume that (Ω, F, µ) is another probability space and let X : Ω ↦ S be another random variable. We will say that $\{X_n\}_{n=1}^\infty$ converges to X in distribution, and write Xn ⇒ X, if the sequence of measures $\{\mathbb{P} X_n^{-1}\}_{n=1}^\infty$ converges weakly to the measure $\mathbb{P} X^{-1}$.

In other words, Xn ⇒ X if and only if

$$\lim_{n \to \infty} \mathbb{E}_n f(X_n) = \mathbb{E} f(X)$$

for all f ∈ Cb(S).


Example 3.18. (The central limit theorem). Let $\{\xi_n\}_{n=1}^{\infty}$ be a sequence of i.i.d. random variables with mean zero and variance 1, and define

$$S_n := \sum_{k=1}^{n} \xi_k. \tag{3.5.1}$$

Then the sequence

$$X_n := \frac{1}{\sqrt{n}} S_n$$

converges in distribution to a standard normal variable.
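The statement of Example 3.18 can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes steps $\xi_k = \pm 1$ with equal probability (one convenient choice with mean zero and variance 1), and the function names, sample sizes and seed are illustrative only. It estimates the mean and variance of $X_n$, which should be close to 0 and 1.

```python
import math
import random

def normalized_sum(n, rng):
    """Return X_n = S_n / sqrt(n) for n i.i.d. steps of +1 or -1 (mean 0, variance 1)."""
    s = sum(rng.choice((-1.0, 1.0)) for _ in range(n))
    return s / math.sqrt(n)

def sample_moments(n=200, samples=4000, seed=0):
    """Estimate the mean and variance of X_n from independent samples."""
    rng = random.Random(seed)
    xs = [normalized_sum(n, rng) for _ in range(samples)]
    mean = sum(xs) / samples
    var = sum((x - mean) ** 2 for x in xs) / samples
    return mean, var

mean, var = sample_moments()
print(f"sample mean {mean:.3f}, sample variance {var:.3f}")
```

Matching the first two moments is only a necessary check; a histogram of the samples against the standard normal density would be a fuller test of convergence in distribution.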

There are various types of convergence that are useful in the study of limit theorems for random variables. Let $\{X_n\}_{n=1}^{\infty}$ be a sequence of random variables. We will say that the sequence converges in probability to a random variable $X$ if for every $\varepsilon > 0$

$$\lim_{n \to \infty} \mathbb{P}(|X_n - X| > \varepsilon) = 0.$$

We will say that $\{X_n\}_{n=1}^{\infty}$ converges to $X$ almost surely (or with probability 1) if

$$\mathbb{P}\Big( \lim_{n \to \infty} X_n = X \Big) = 1.$$

Finally, we say that $\{X_n\}_{n=1}^{\infty}$ converges to $X$ in $p$-th mean (or in $L^p$) provided that

$$\lim_{n \to \infty} \mathbb{E}|X_n - X|^p = 0.$$

Example 3.19. (The strong law of large numbers). Let $\{\xi_n\}_{n=1}^{\infty}$ be a sequence of i.i.d. random variables with mean 1. Define

$$X_n = \frac{1}{n} \sum_{k=1}^{n} \xi_k.$$

Then the sequence $X_n$ converges to 1 almost surely. Assume furthermore that $\xi_n \in L^2$. Then $X_n$ converges to 1 also in $L^2$.
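A single sample path illustrates the almost sure convergence in Example 3.19. This sketch is an illustration rather than part of the original text; it assumes exponentially distributed $\xi_k$ with rate 1 (so that the mean is 1), and the function name and seed are arbitrary.

```python
import random

def running_means(n, seed=1):
    """Return the list X_1, ..., X_n of sample means of i.i.d. Exp(1) variables (mean 1)."""
    rng = random.Random(seed)
    total = 0.0
    means = []
    for k in range(1, n + 1):
        total += rng.expovariate(1.0)
        means.append(total / k)
    return means

means = running_means(100_000)
for k in (10, 1_000, 100_000):
    print(k, round(means[k - 1], 4))
```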

Remark 3.20. There are relations between the different notions of convergence for random variables: for example, almost sure convergence implies weak convergence, and convergence in probability implies weak convergence.

Weak convergence is preserved under continuous mappings.


Theorem 3.21. Let $\{(\Omega_n, \mathcal{F}_n, \mu_n)\}_{n=1}^{\infty}$ be a sequence of probability spaces and let $(S_i, \rho_i)$, $i = 1, 2$, be metric spaces. Let $X_n : \Omega_n \mapsto S_1$, $n = 1, 2, \dots$, be a sequence of random variables. Assume that $(\Omega, \mathcal{F}, \mu)$ is another probability space and let $X : \Omega \mapsto S_1$ be another random variable. If $f : S_1 \to S_2$ is continuous and $\{X_n\}_{n=1}^{\infty}$ converges to $X$ in distribution, then $\{f(X_n)\}_{n=1}^{\infty}$ converges to $f(X)$ in distribution.

Example 3.22. Let $S_1 = C[0, 1]$ and $S_2 = \mathbb{R}$. The function $f : S_1 \to S_2$ defined by

$$f(x) := \sup_{t \in [0,1]} x(t)$$

is continuous. Hence Theorem 3.21 applies, and $X_n \Rightarrow X$ in $S_1$ implies that $\sup_{t \in [0,1]} X_n(t) \Rightarrow \sup_{t \in [0,1]} X(t)$ in $S_2$.

Let $(\Omega, \mathcal{F}, \mu)$ be a probability space, $T = [0, \infty)$ and $(E, \rho)$ a metric space. Let $\{X^n_t\}_{n=1}^{\infty}$ be a family of stochastic processes, and $X_t$ another stochastic process, all with sample paths in $C(T, E)$. We will say that the sequence $\{X^n_t\}_{n=1}^{\infty}$ converges weakly to $X_t$, and write $X^n_t \Rightarrow X_t$, if the sequence of probability measures $\mathbb{P}(X^n_t)^{-1}$ converges weakly to the probability measure $\mathbb{P} X_t^{-1}$ on $(C(T, E), \mathcal{B}(C(T, E)))$. Sometimes we will also say that $\{X^n_t\}_{n=1}^{\infty}$ converges weakly in $C(T, E)$.

Example 3.23. Consider the situation of Example 3.18 and let $S_n$ be given by (3.5.1). Let $[t]$ denote the integer part of a real number $t$ and define the continuous-time process

$$Y_t = S_{[t]} + (t - [t])\, \xi_{[t]+1}. \tag{3.5.2}$$

The process $Y_t$ has continuous paths. Define the sequence of stochastic processes

$$X^n_t = \frac{1}{\sqrt{n}} Y_{nt}, \quad t \ge 0.$$

Then $\{X^n_t\}_{n=1}^{\infty}$ converges weakly in $C([0, \infty), \mathbb{R})$ to a standard one-dimensional Brownian motion.
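The construction (3.5.2) and its rescaling can be written out directly. The sketch below is illustrative only: the function names are hypothetical, the steps are passed in as a finite list (so $t$ is restricted to $0 \le t \le N$ for $N$ steps), and a short deterministic step sequence is used so that the values can be checked by hand.

```python
import math

def partial_sums(xi):
    """Return [S_0, S_1, ..., S_N] with S_0 = 0 and S_k = xi_1 + ... + xi_k."""
    sums = [0.0]
    for step in xi:
        sums.append(sums[-1] + step)
    return sums

def interpolated_walk(xi, t):
    """Y_t = S_[t] + (t - [t]) * xi_{[t]+1}, valid for 0 <= t <= len(xi)."""
    sums = partial_sums(xi)
    k = int(math.floor(t))
    if k >= len(xi):          # t equals the final integer time
        return sums[len(xi)]
    return sums[k] + (t - k) * xi[k]   # xi[k] is xi_{k+1} in 1-based notation

def rescaled_walk(xi, n, t):
    """X^n_t = Y_{nt} / sqrt(n)."""
    return interpolated_walk(xi, n * t) / math.sqrt(n)

xi = [1.0, -1.0, 1.0, 1.0]
print(rescaled_walk(xi, 4, 0.625))   # Y_{2.5} / 2
```

With random steps and large $n$, plotting `rescaled_walk` on a fine grid of $t$ values gives paths that look increasingly like Brownian motion.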

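Let $(\Omega, \mathcal{F})$ be a measurable space. We will denote by $\mathcal{M}_1(\Omega, \mathcal{F})$ the space of all probability measures on $(\Omega, \mathcal{F})$. Assume that $\Omega$ is a topological space and let $\mathcal{B}(\Omega)$ denote the Borel $\sigma$-algebra. We will write $\mathcal{M}_1(\Omega) := \mathcal{M}_1(\Omega, \mathcal{B}(\Omega))$.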

Definition 3.24. Let $(E, \rho)$ be a metric space and let $M \subseteq \mathcal{M}_1(E)$. We will say that $M$ is relatively compact if every sequence of elements of $M$ contains a weakly convergent subsequence. We will say that $M$ is tight if for every $\varepsilon > 0$ there exists a compact subset $K \subseteq E$ such that $\mu(K) \ge 1 - \varepsilon$ for all $\mu \in M$.

Theorem 3.25. Let $(E, \rho)$ be a complete, separable metric space. Then a set $M \subseteq \mathcal{M}_1(E, \mathcal{B}(E))$ is relatively compact if and only if it is tight.

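Theorem 3.26. Let $\{X^n_t\}$ be a sequence of continuous stochastic processes on $(\Omega, \mathcal{F}, \mu)$ satisfying the following conditions:

i. $\exists\, \nu > 0$ : $\sup_{n \ge 1} \mathbb{E}|X^n_0|^{\nu} < \infty$,

ii. $\exists\, \alpha, \beta > 0$ and $C = C(T) > 0$ :

$$\sup_{n \ge 1} \mathbb{E}|X^n_t - X^n_s|^{\alpha} \le C(T)\, |t - s|^{1+\beta}, \quad \forall\, T > 0,\ t, s \in [0, T].$$

Let $\mu_n$ denote the law of the process $X^n_t$ for $n = 1, 2, \dots$. Then the sequence $\{\mu_n\}_{n=1}^{\infty}$ (which forms a sequence of probability measures on $(C[0, \infty), \mathcal{B}(C[0, \infty)))$) is tight.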

Tightness and convergence of finite dimensional distributions imply convergence in law.

Theorem 3.27. Let $\{X^n_t\}_{n=1}^{\infty}$, $X_t$ be continuous stochastic processes on a probability space $(\Omega, \mathcal{F}, \mu)$ with state space $(E, \rho)$ and assume that the following conditions are satisfied.

i. $\{X^n_t\}_{n=1}^{\infty}$ is tight.

ii. The finite dimensional distributions of $\{X^n_t\}_{n=1}^{\infty}$ converge to those of $X_t$.

Then the sequence $\{X^n_t\}_{n=1}^{\infty}$ converges weakly in $C([0, \infty), E)$ to $X_t$.

Let $(E, \rho)$ be a metric space. In Section 3.3 we introduced the space $D([0, \infty), E)$, the space of right continuous processes $X_t : [0, \infty) \mapsto E$. The space of continuous functions $C([0, \infty), E)$ is a closed subset of $D([0, \infty), E)$. If $E$ is separable then so is $D([0, \infty), E)$. If $(E, \rho)$ is complete then there exists a metric $d$ on $D([0, \infty), E)$ such that $(D([0, \infty), E), d)$ is a complete metric space.


Example 3.28. Consider the situation of Examples 3.18 and 3.23 and let $S_n$ be given by (3.5.1). Define the continuous-time process

$$Z_t = S_{[t]}.$$

Notice that $Z_t \in D([0, \infty), \mathbb{R})$. Define the sequence of stochastic processes

$$X^n_t = \frac{1}{\sqrt{n}} Z_{nt}, \quad t \ge 0.$$

Then $\{X^n_t\}_{n=1}^{\infty}$ converges weakly in $D([0, \infty), \mathbb{R})$ to a standard one-dimensional Brownian motion.

Remark 3.29. Let $C_E := C([0, \infty), E)$ and $D_E := D([0, \infty), E)$. Assume that $\{\mu_n\}_{n=1}^{\infty}, \mu \in \mathcal{M}_1(D_E)$, and assume that $\mu_n(C_E) = \mu(C_E) = 1$ for all $n = 1, 2, \dots$. Define $\{Q_n\}_{n=1}^{\infty}, Q \in \mathcal{M}_1(C_E)$ by $Q_n = \mu_n|_{\mathcal{B}(C_E)}$, $n = 1, 2, \dots$, and $Q = \mu|_{\mathcal{B}(C_E)}$. Then $Q_n$ converges to $Q$ weakly in $C_E$ if and only if $\mu_n$ converges to $\mu$ weakly in $D_E$. In particular, for processes with continuous sample paths weak convergence in $D_E$ is a necessary and sufficient condition for weak convergence in $C_E$.

In Examples 3.18 and 3.23 we studied the classical central limit theorem and invariance principle (functional central limit theorem), respectively. These results were stated for sums of independent random variables. Similar results hold even for dependent random variables, provided that the dependence between the random variables is not too strong. In many instances the martingale property is sufficient.

Theorem 3.30. Let $(\Omega, \mathcal{F}, \mu)$ be a probability space and let $\{\mathcal{F}_j, j \ge 1\}$ be a filtration. Let $\{Z_j, j \ge 1\}$ be a sequence of stationary, ergodic random variables such that

$$\mathbb{E}[Z_1^2] = \sigma^2$$

and

$$\mathbb{E}[Z_{k+1} \,|\, \mathcal{F}_k] = 0. \tag{3.5.3}$$

Define

$$S_n = \sum_{k=1}^{n} Z_k. \tag{3.5.4}$$

Then

$$X_n = \frac{1}{\sqrt{n}} S_n$$

converges in distribution to a Gaussian variable with mean 0 and variance $\sigma^2$. Furthermore, the process

$$X^n_t = \frac{1}{\sqrt{n}} S_{[nt]} + \frac{1}{\sqrt{n}} (nt - [nt])\, Z_{[nt]+1}$$

converges weakly in $C([0, \infty), \mathbb{R})$ to a Brownian motion with variance $\sigma^2$. Finally, the process

$$X^n_t = \frac{1}{\sqrt{n}} S_{[nt]}$$

converges weakly in $D([0, \infty), \mathbb{R})$ to a Brownian motion with variance $\sigma^2$.

Notice that the condition (3.5.3) implies that $S_n$ defined in (3.5.4) is a martingale with respect to the filtration $\{\mathcal{F}_j\}$.

The above theorem is also valid in arbitrary dimensions, that is, in the case where $\{Z_j, j \ge 1\}$ is a vector-valued sequence of stationary, ergodic random variables. In this case the covariance matrix of the limiting Brownian motion is the covariance matrix of $Z_1$.

A result similar to that of Theorem 3.30 can be proved for continuous-time martingales.

Theorem 3.31. (The Martingale Central Limit Theorem). Let $M(t) : \mathbb{R}^+ \mapsto \mathbb{R}$ be a right-continuous square integrable martingale on a probability space $(\Omega, \mathcal{F}, \mu)$ with respect to a given increasing filtration $\{\mathcal{F}_t : t \ge 0\}$ and let $\langle M \rangle_t$ denote its quadratic variation. Assume that

i) $M(0) = 0$;

ii) The increments of $M(t)$ are stationary;

iii) The quadratic variation of $M(t)$, normalized by $t$, converges in $L^1(\mu)$ to some positive constant $\sigma$:

$$\lim_{t \to \infty} \mathbb{E}\left[ \left| \frac{\langle M \rangle_t}{t} - \sigma \right| \right] = 0. \tag{3.5.5}$$

Then the process $\frac{1}{\sqrt{t}} M(t)$ converges in distribution to a $\mathcal{N}(0, \sigma)$ random variable. Furthermore, for every fixed $T < \infty$, the rescaled martingale

$$M^{\varepsilon}(t) := \varepsilon M\left( \frac{t}{\varepsilon^2} \right)$$

converges weakly in $D([0, T), \mathbb{R})$ to a Brownian motion with variance $\sigma$.

A similar result is valid in arbitrary dimensions. Notice that, on account of Remark 3.29, if the martingale has continuous sample paths then the above theorem provides us with weak convergence in $C([0, T), \mathbb{R})$. The assumptions of the above theorem can be weakened considerably as we now show.

Theorem 3.32. Let $(\Omega, \mathcal{F}, \mu)$ be a probability space and let $\{\mathcal{F}^n_t, n \ge 1\}$ be a filtration. Let $\{M_n\}$ be an $\mathcal{F}^n_t$-local martingale with sample paths in $C([0, \infty), \mathbb{R}^d)$ and $M_n(0) = 0$. Let $A_n := (A^{ij}_n)$ be the quadratic variation of the martingale:

$$A_n(t) = \langle M_n, M_n \rangle_t.$$

Assume that there exists a nonnegative definite symmetric matrix $C = (C^{ij})$ such that $A_n(t)$ converges to $Ct$ in probability. Then $M_n$ converges weakly in $C([0, \infty), \mathbb{R}^d)$ to a $d$-dimensional Brownian motion with covariance matrix $C$.

Upon combining the fact that stochastic integrals are martingales with an ergodicity assumption, we obtain the following corollary.

Corollary 3.33. Let $x(t) : \mathbb{R}^+ \mapsto \mathcal{Z}$ be a stationary, ergodic Markov process with invariant measure $\mu(dx)$, and let $f(\cdot) : \mathbb{R}^d \mapsto \mathbb{R}^m$ be a smooth bounded function. Define

$$I(t) = \int_0^t f(x(s))\, dW(s).$$

Then, for every finite $T > 0$, the rescaled stochastic integral

$$I^{\varepsilon}(t) := \varepsilon I\left( \frac{t}{\varepsilon^2} \right)$$

converges in distribution to an $m$-dimensional Brownian motion with covariance matrix

$$\Sigma = \int_{\mathcal{Z}} f(x) \otimes f(x)\, \mu(dx).$$

Proof. We have to check that the martingale $I(t)$ satisfies the assumptions of Theorem 3.31. We clearly have that $I(0) = 0$. Furthermore, the stationarity of the process $x(t)$ implies that $I(t)$ has stationary increments. By ergodicity

$$\lim_{t \to \infty} \frac{1}{t} \langle I \rangle_t = \lim_{t \to \infty} \frac{1}{t} \int_0^t f(x(s)) f(x(s))^T\, ds = \int_{\mathcal{Z}} f(x) f(x)^T\, \mu(dx) \quad \text{in } L^1(\mu).$$

Hence, Theorem 3.31 applies.


3.6 Discussion and Bibliography

An excellent reference on convergence of probability measures is [19]. Various convergence results for sequences of Markov processes can be found in [41]. The text [55] has good background on stochastic processes and weak convergence for probability measures on path space. For discussion about relationships between different modes of convergence see [60].

Continuous local martingales can be expressed as time-changed Brownian motions. This is the Dambis-Dubins-Schwarz theorem [75, Thm 3.4.6]. Let $M = \{M_t, \mathcal{F}_t;\ 0 \le t < \infty\}$ be a continuous local martingale satisfying $\lim_{t \to \infty} \langle M, M \rangle_t = \infty$, $\mathbb{P}$ a.s. Define, for each $0 \le s < \infty$, the stopping time

$$T(s) = \inf\{ t \ge 0;\ \langle M, M \rangle_t > s \}.$$

Then the time-changed process

$$B_s = M_{T(s)}, \quad \mathcal{G}_s = \mathcal{F}_{T(s)}, \quad 0 \le s < \infty,$$

is a standard one-dimensional Brownian motion. In particular, $\mathbb{P}$ a.s.,

$$M_t = B_{\langle M, M \rangle_t}.$$

Notice that the quantities $B_t$ and $M_t$ are in general highly correlated and that they are not adapted to the same filtration (because of the time change).

Weak convergence results of the form of Theorem 3.32 are valid even when the limiting process is not a Brownian motion but a more general diffusion process. See [41, Sec. 7.4]. The proof of the martingale central limit theorem can be found in [41, ch. 7]. See also [87, 86].

The martingale central limit theorem leads to a general central limit theorem for additive functionals of Markov processes: let $y(t)$ be an ergodic Markov process with generator $\mathcal{L}$ and invariant measure $\mu(dy)$. Consider the integral (i.e. additive) functional of $y(t)$

$$x(t) = x_0 + \int_0^t f(y(s))\, ds.$$

Then the rescaled process

$$x^{\varepsilon}(t) = \varepsilon x(t/\varepsilon^2)$$

converges weakly to a Brownian motion with variance $\Sigma$, provided that the Poisson equation

$$-\mathcal{L}\phi = f \tag{3.6.1}$$

is well posed, in some appropriate (weak) sense. The variance of the limiting Brownian motion is given by the Dirichlet form of the generator $\mathcal{L}$, evaluated at the solution of (3.6.1):

$$\Sigma = \int (-\mathcal{L}\phi) \otimes \phi\, \mu(dy).$$

To see this, notice that (3.4.1) implies that the rescaled process $x^{\varepsilon}(t)$ satisfies

$$x^{\varepsilon}(t) = \varepsilon \Big( x_0 - \phi\big(y(t/\varepsilon^2)\big) + \phi\big(y(0)\big) \Big) + \varepsilon M(t/\varepsilon^2),$$

where $M(t)$ is a martingale. The first term on the right hand side of the above equation tends to 0, provided that $\phi$ satisfies appropriate estimates. The second term converges to a Brownian motion with variance $\Sigma$, by the martingale central limit theorem. This theorem was proved for reversible Markov processes in [82]. See also [118]. Various extensions of this result have been proved. See [87] and the references therein.

3.7 Exercises

1. Let $W(t) : \mathbb{R}^+ \to \mathbb{R}$ be a standard Brownian motion. Calculate all moments of $W(t) - W(s)$, $t > s \ge 0$.

2. Let $W(t)$ be a standard Brownian motion in one dimension. Show that

$$\mathbb{E}\left( \left| \frac{\Delta W(t)}{\Delta t} \right| \right) = \sqrt{\frac{2}{\pi}}\, \frac{1}{\sqrt{\Delta t}}.$$

Deduce that the Brownian motion is nowhere differentiable a.s.

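3. Let $W_t$ be a standard Brownian motion in $\mathbb{R}^d$ and let $\mathcal{F}_t$ denote the filtration generated by $W_t$. Prove that $W_t$ is an $\mathcal{F}_t$-martingale. Prove that $|W_t|^2 - t\,d$ (that is, $W_t^2 - t$ when $d = 1$) is also an $\mathcal{F}_t$-martingale.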

4. State the analogue of Theorem 3.30 in arbitrary dimensions.


Chapter 4

Ordinary Differential Equations

4.1 Set-Up

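We study the ordinary differential equation

$$\frac{dz}{dt} = h(z), \quad z(0) = z_0, \tag{4.1.1}$$

where $h : \mathcal{Z} \to \mathbb{R}^d$. Typically $\mathcal{Z} = \mathbb{T}^d$, $\mathbb{R}^d$ or $\mathbb{R}^l \oplus \mathbb{T}^{d-l}$. In later chapters we will often consider $z = (x^T, y^T)^T$, with $x \in \mathcal{X}$, $y \in \mathcal{Y}$. If $\mathcal{Z} = \mathbb{T}^d$ (resp. $\mathbb{R}^d$) then $\mathcal{X} = \mathbb{T}^l$ (resp. $\mathbb{R}^l$) and $\mathcal{Y} = \mathbb{T}^{d-l}$ (resp. $\mathbb{R}^{d-l}$). If $\mathcal{Z} = \mathbb{R}^l \oplus \mathbb{T}^{d-l}$ then $\mathcal{X} = \mathbb{R}^l$ and $\mathcal{Y} = \mathbb{T}^{d-l}$.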

4.2 Existence and Uniqueness

When there is a unique solution to the initial value problem (4.1.1) for all initial data in $\mathcal{Z}$ and $t \in \mathbb{R}$ we write

$$z(t) = \varphi^t(z_0).$$

Thus $\varphi^t : \mathcal{Z} \to \mathcal{Z}$ is the solution operator and forms a one parameter group, that is

$$\varphi^{t+s} = \varphi^t \circ \varphi^s \quad \forall\, t, s \in \mathbb{R} \quad \text{and} \quad \varphi^0 = I,$$

where $I : \mathcal{Z} \to \mathcal{Z}$ denotes the identity operator.

In practice existence and uniqueness of solutions of (4.1.1) can be verified in a wide range of different scenarios. For ease of exposition we work within the simplest scenario, namely when $h$ is Lipschitz on $\mathcal{Z}$. This condition can be weakened when a priori bounds on the solution prevent blow-up.


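Definition 4.1. A function $f : \mathcal{Z} \to \mathbb{R}^d$ is Lipschitz on $\mathcal{Z}$ with Lipschitz constant $L$ if

$$|f(x) - f(y)| \le L|x - y| \quad \forall\, x, y \in \mathcal{Z}.$$

We observe that a Lipschitz function is also linearly bounded:

$$|f(x)| \le |f(0)| + L|x|.$$

Theorem 4.2. If $h$ is Lipschitz on $\mathcal{Z}$ then the ODE (4.1.1) has a unique solution $z(t) \in C^1(\mathbb{R}, \mathcal{Z})$ for any $z_0 \in \mathcal{Z}$. Moreover, if $x(t)$ and $y(t)$ denote the solutions with initial data $x_0$ and $y_0$ respectively, then for $t \ge 0$

$$|x(t) - y(t)| \le e^{Lt} |x_0 - y_0|. \tag{4.2.1}$$

Proof. The existence and uniqueness follow from a standard fixed point argument (Picard iteration). The bound (4.2.1) follows from

$$\frac{1}{2} \frac{d}{dt} |x - y|^2 = \langle h(x) - h(y), x - y \rangle$$

and

$$\langle h(x) - h(y), x - y \rangle \le L|x - y|^2,$$

using the Gronwall Lemma 4.3 below.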
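The stability bound (4.2.1) can be observed numerically. The following sketch is an illustration under stated assumptions, not part of the original text: it takes the scalar vector field $h(z) = \sin z$ (Lipschitz with constant $L = 1$) and approximates the two solutions with the explicit Euler method. Each Euler step expands the gap between the two paths by at most the factor $1 + L\,\Delta t \le e^{L \Delta t}$, so the discrete paths obey the same bound.

```python
import math

def euler_path(h, z0, dt, steps):
    """Explicit Euler approximation of dz/dt = h(z), z(0) = z0."""
    path = [z0]
    for _ in range(steps):
        path.append(path[-1] + dt * h(path[-1]))
    return path

L = 1.0                       # Lipschitz constant of sin on R
dt, steps = 0.01, 500
xs = euler_path(math.sin, 0.3, dt, steps)
ys = euler_path(math.sin, 0.5, dt, steps)

gap0 = abs(xs[0] - ys[0])
# Compare |x(t_k) - y(t_k)| with exp(L t_k) |x0 - y0| at every grid time t_k = k dt.
bound_holds = all(
    abs(x - y) <= math.exp(L * k * dt) * gap0 + 1e-12
    for k, (x, y) in enumerate(zip(xs, ys))
)
print(bound_holds)
```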

A useful tool for studying evolution equations is Gronwall's lemma; it is also required to prove the preceding theorem.

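Lemma 4.3. 1. (Differential form). Let $\eta(t) \in C^1([0, T]; \mathbb{R}^+)$ satisfy the differential inequality

$$\frac{d\eta(t)}{dt} \le \phi(t)\eta(t) + \psi(t), \quad \eta(0) = \eta_0, \tag{4.2.2}$$

where $\phi(t), \psi(t) \in L^1([0, T]; \mathbb{R}^+)$. Then

$$\eta(t) \le \exp\left( \int_0^t \phi(s)\, ds \right) \left[ \eta_0 + \int_0^t \psi(s)\, ds \right] \tag{4.2.3}$$

for all $t \in [0, T]$.

2. (Integral form). Assume that $\xi(t) \in L^1([0, T]; \mathbb{R}^+)$ satisfies the integral inequality

$$\xi(t) \le \phi \int_0^t \xi(s)\, ds + \psi$$

for some positive constants $\phi, \psi$. Then

$$\xi(t) \le \psi \left( 1 + \phi t e^{\phi t} \right).$$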


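Proof. 1. We multiply equation (4.2.2) by $\exp\left( -\int_0^t \phi(s)\, ds \right)$ to obtain

$$\frac{d\eta(t)}{dt} \exp\left( -\int_0^t \phi(s)\, ds \right) \le \big( \psi(t) + \phi(t)\eta(t) \big) \exp\left( -\int_0^t \phi(s)\, ds \right).$$

Consequently

$$\frac{d}{dt} \left( \eta(t) \exp\left( -\int_0^t \phi(s)\, ds \right) \right) \le \psi(t) \exp\left( -\int_0^t \phi(s)\, ds \right) \le \psi(t),$$

since $\phi(t)$ is nonnegative. Integrating this inequality from 0 to $t$ and multiplying through by $\exp\left( \int_0^t \phi(s)\, ds \right)$ gives (4.2.3).

2. Define $\eta(t) = \int_0^t \xi(s)\, ds$. Then $\eta(t)$ satisfies the inequality

$$\frac{d\eta}{dt} \le \phi\eta + \psi.$$

We apply the first part of the lemma with $\eta(0) = 0$ to deduce that

$$\eta(t) \le \psi t e^{\phi t}.$$

Consequently

$$\xi(t) \le \phi\eta(t) + \psi \le \psi \left( 1 + \phi t e^{\phi t} \right).$$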
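As a sanity check on the integral form, consider the function $\xi(t) = \psi e^{\phi t}$, which satisfies the integral inequality with equality. The sketch below (an illustration only, with hypothetical function names and arbitrarily chosen constants) confirms on a grid that it stays below the bound $\psi(1 + \phi t e^{\phi t})$ obtained at the end of the proof.

```python
import math

def gronwall_bound(phi, psi, t):
    """Right-hand side psi * (1 + phi * t * exp(phi * t)) of the integral-form estimate."""
    return psi * (1.0 + phi * t * math.exp(phi * t))

def extremal_xi(phi, psi, t):
    """xi(t) = psi * exp(phi * t), which satisfies xi = phi * int_0^t xi ds + psi with equality."""
    return psi * math.exp(phi * t)

phi, psi = 0.7, 2.0
times = [0.1 * k for k in range(51)]
ok = all(extremal_xi(phi, psi, t) <= gronwall_bound(phi, psi, t) + 1e-12 for t in times)
print(ok)
```

The check reduces to the elementary inequality $e^x \le 1 + x e^x$ for $x \ge 0$, which also shows that the bound is not sharp for large $t$.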

4.3 The Generator

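It is often of importance to understand how functions of $z(t)$ change with time. We may achieve this by using the generator $\mathcal{L}$:

$$(\mathcal{L}V)(z) = \langle h(z), \nabla V(z) \rangle. \tag{4.3.1}$$

If $z(t)$ solves (4.1.1) and $V \in C^1(\mathcal{Z}, \mathbb{R})$ then

$$\frac{d}{dt} V(z(t)) = \Big\langle \nabla V(z(t)), \frac{dz}{dt} \Big\rangle = \langle \nabla V(z(t)), h(z(t)) \rangle = \mathcal{L}V(z(t)). \tag{4.3.2}$$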


If $\mathcal{L}V$ can be bounded above in terms of $V$ then it is possible to use differential inequalities, such as the Gronwall Lemma 4.3, to obtain bounds on $V(z(t))$, and hence on $z(t)$. Then $V(z(t))$ is a Lyapunov function.
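A scalar example makes the chain rule identity (4.3.2) and the Lyapunov argument concrete. This is a sketch under illustrative assumptions not taken from the text: $h(z) = -z$ and $V(z) = z^2$, so that $\mathcal{L}V = -2V$ and the Gronwall bound $V(z(t)) \le e^{-2t} V(z_0)$ holds with equality. The time derivative is estimated by a centered finite difference.

```python
import math

def h(z):
    """Vector field of the scalar test equation dz/dt = -z."""
    return -z

def V(z):
    return z * z

def LV(z):
    """(LV)(z) = h(z) * V'(z) = -2 z^2 for V(z) = z^2."""
    return h(z) * 2.0 * z

def flow(t, z0):
    """Exact solution z(t) = z0 * exp(-t)."""
    return z0 * math.exp(-t)

z0, t, eps = 1.5, 0.8, 1e-6
# Centered finite difference for d/dt V(z(t)).
dVdt = (V(flow(t + eps, z0)) - V(flow(t - eps, z0))) / (2.0 * eps)
print(dVdt, LV(flow(t, z0)))
```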

Also important is the formal $L^2$-adjoint operator $\mathcal{L}^*$, given by

$$\mathcal{L}^* V = -\nabla \cdot (hV). \tag{4.3.3}$$

We now show the crucial role played by $\mathcal{L}$ and $\mathcal{L}^*$ in understanding how families of solutions of (4.1.1), parameterized by the initial data, and possibly carrying a probability measure, behave.

Let $v$ satisfy the linear PDE

$$\frac{\partial v}{\partial t} = \mathcal{L}v, \quad (z, t) \in \mathcal{Z} \times (0, \infty),$$
$$v(z, 0) = \phi(z), \quad z \in \mathcal{Z}. \tag{4.3.4}$$

We will denote the solution of (4.3.4) by $v(z, t) = (e^{\mathcal{L}t}\phi)(z)$. This is often referred to as the semigroup notation for the solution of a time-dependent linear operator equation.

Theorem 4.4. Assume that the solution of (4.1.1) generates a one-parameter group on $\mathcal{Z}$ so that $\varphi^{-t}(\mathcal{Z}) = \mathcal{Z}$ for all $t \in \mathbb{R}$. Assume also that $\phi$ is sufficiently smooth so that (4.3.4) has a classical solution. Then the classical solution satisfies

$$v(z, t) = \phi(\varphi^t(z)), \quad \forall\, t \in \mathbb{R}^+,\ z \in \mathcal{Z}. \tag{4.3.5}$$

Proof. Note that (4.3.5) satisfies the initial condition $v(z, 0) = \phi(\varphi^0(z)) = \phi(z)$. Using the group property of $\varphi^t$ we deduce that $v(z, t)$ given by (4.3.5) satisfies $v(\varphi^{-t}(z), t) = \phi(z)$ for all $t \in \mathbb{R}^+$, $z \in \mathcal{Z}$. By differentiating with respect to $t$ we obtain from (4.3.5) that

$$\frac{d}{dt} v(\varphi^{-t}(z), t) = 0$$

and so

$$\frac{\partial v(\varphi^{-t}(z), t)}{\partial t} + \Big\langle \nabla v(\varphi^{-t}(z), t), \frac{d}{dt} \varphi^{-t}(z) \Big\rangle = 0.$$

But $\varphi^{-t}(z)$ is the backwards time solution for (4.1.1) and hence satisfies

$$\frac{d}{dt} \big( \varphi^{-t}(z) \big) = -h(\varphi^{-t}(z)).$$

Thus

$$\frac{\partial v(\varphi^{-t}(z), t)}{\partial t} + \langle \nabla v(\varphi^{-t}(z), t), -h(\varphi^{-t}(z)) \rangle = 0, \quad \forall\, t \in \mathbb{R}^+,\ z \in \mathcal{Z}.$$

This is equivalent to

$$\frac{\partial v(z, t)}{\partial t} + \langle \nabla v(z, t), -h(z) \rangle = 0, \quad \forall\, t \in \mathbb{R}^+,\ z \in \mathcal{Z},$$

showing that (4.3.5) solves the linear PDE (4.3.4).
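The representation (4.3.5) can be tested on a case where the flow is known in closed form. The sketch below is illustrative only and rests on assumptions not in the text: the scalar field $h(z) = -z$, whose solution operator is $\varphi^t(z) = z e^{-t}$, and the initial data $\phi = \sin$. It estimates the residual $\partial_t v - h\, \partial_z v$ of the PDE (4.3.4) by centered finite differences.

```python
import math

def flow(t, z):
    """Solution operator of dz/dt = -z: phi^t(z) = z * exp(-t)."""
    return z * math.exp(-t)

def v(z, t, phi=math.sin):
    """Candidate solution v(z, t) = phi(flow(t, z)) from (4.3.5)."""
    return phi(flow(t, z))

def pde_residual(z, t, eps=1e-5):
    """Finite-difference estimate of dv/dt - h(z) * dv/dz with h(z) = -z."""
    dv_dt = (v(z, t + eps) - v(z, t - eps)) / (2.0 * eps)
    dv_dz = (v(z + eps, t) - v(z - eps, t)) / (2.0 * eps)
    return dv_dt - (-z) * dv_dz

residuals = [abs(pde_residual(z, t)) for z in (-1.0, 0.5, 2.0) for t in (0.0, 0.3, 1.0)]
print(max(residuals))
```

The residual is zero up to finite-difference and rounding error, which is what solving the PDE along characteristics predicts.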

Formula (4.3.5) represents the solution of the PDE (4.3.4) by the method of characteristics. Conversely it shows that the family of all solutions of (4.1.1) for $z_0 \in \mathcal{Z}$ can be found by solving a linear PDE. We now study what happens when we place a probability measure on $z_0$, so that $z(t)$ solving (4.1.1) is a random variable.

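To this end consider the adjoint of (4.3.4), namely the Liouville equation

$$\frac{\partial \rho}{\partial t} = \mathcal{L}^* \rho, \quad (z, t) \in \mathcal{Z} \times (0, \infty), \tag{4.3.6}$$
$$\rho(z, 0) = \rho_0(z), \quad z \in \mathcal{Z}. \tag{4.3.7}$$

Using the semigroup notation the solution can be denoted by $\rho(z, t) = (e^{\mathcal{L}^* t}\rho_0)(z)$. Now let $\mathbb{E}$ denote expectation with respect to initial data distributed according to a random variable on $\mathcal{Z}$ with density $\rho_0(z)$, i.e.

$$\mathbb{E}f := \int_{\mathcal{Z}} f(z)\rho_0(z)\, dz.$$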

Theorem 4.5. Assume that the solution of (4.1.1) generates a one-parameter group on $\mathcal{Z}$ so that $\varphi^{-t}(\mathcal{Z}) = \mathcal{Z}$. Assume also that $\phi$ is sufficiently smooth so that (4.3.4) has a classical solution. Finally, assume that the initial data for (4.1.1) is distributed according to a random variable on $\mathcal{Z}$ with density $\rho_0(z)$, smooth enough so that (4.3.7) has a classical solution. Then $z(t)$ is a random variable on $\mathcal{Z}$ with density $\rho(z, t)$ satisfying (4.3.7).


Proof. Note that, by the previous result,

$$\mathbb{E}(\phi(z(t))) = \int_{\mathcal{Z}} \phi(\varphi^t(z))\rho_0(z)\, dz = \int_{\mathcal{Z}} v(z, t)\rho_0(z)\, dz = \int_{\mathcal{Z}} (e^{\mathcal{L}t}\phi)(z)\rho_0(z)\, dz = \int_{\mathcal{Z}} (e^{\mathcal{L}^* t}\rho_0)(z)\phi(z)\, dz.$$

Also, if $\rho(z, t)$ is the density of $z(t)$, then

$$\mathbb{E}(\phi(z(t))) = \int_{\mathcal{Z}} \rho(z, t)\phi(z)\, dz.$$

Equating these two expressions for the expectation at time $t$ and using the arbitrariness of $\phi$, together with an approximation argument to extend the equality to all $\phi$ in $L^2$, shows that

$$\rho(z, t) = (e^{\mathcal{L}^* t}\rho_0)(z)$$

in $L^2$. Hence, by the assumed smoothness, the density $\rho(z, t)$ satisfies the adjoint equation (4.3.7).
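A worked scalar example may help fix ideas. It is a sketch under an illustrative assumption not made in the text, namely $\mathcal{Z} = \mathbb{R}$ and $h(z) = -z$, for which $\varphi^t(z) = z e^{-t}$ and $\mathcal{L}^*\rho = \partial_z(z\rho)$. The change of variables $w = z e^{-t}$ in the expectation gives the density explicitly, and differentiation confirms that it solves the Liouville equation.

```latex
\begin{align*}
\mathbb{E}\,\phi(z(t))
  &= \int_{\mathbb{R}} \phi(z e^{-t})\,\rho_0(z)\,dz
   = \int_{\mathbb{R}} \phi(w)\, e^{t}\rho_0(w e^{t})\,dw,
  \qquad\text{so}\qquad \rho(z,t) = e^{t}\rho_0(z e^{t}), \\
\frac{\partial \rho}{\partial t}
  &= e^{t}\rho_0(z e^{t}) + z e^{2t}\rho_0'(z e^{t})
   = \rho + z\,\frac{\partial \rho}{\partial z}
   = \frac{\partial}{\partial z}\bigl(z\rho\bigr)
   = -\frac{\partial}{\partial z}\bigl(h\rho\bigr)
   = \mathcal{L}^{*}\rho .
\end{align*}
```

The factor $e^{t}$ reflects the contraction of the flow: mass concentrates near the origin, so the density there grows while its integral stays equal to 1.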

4.4 Ergodicity

In this section we will assume for simplicity that the phase space $\mathcal{Z}$ is compact with no boundary. The natural example is $\mathcal{Z} = \mathbb{T}^d$, in which case $\mathcal{L}$, and its formal adjoint, are equipped with periodic boundary conditions. We will consider the measure space $(\mathcal{Z}, \mathcal{A}, \mu)$, where $\mathcal{A}$ denotes a $\sigma$-algebra of measurable subsets of $\mathcal{Z}$ and $\mu$ denotes a measure on $\mathcal{Z}$. Let $\varphi^t$ denote the solution operator for (4.1.1). We will say that a set $A \in \mathcal{A}$ is invariant under $\varphi^t$ provided that for all $t \in \mathbb{R}$

$$\varphi^t(A) = A.$$

Definition 4.6. The ODE (4.1.1) is called ergodic if every invariant set $A$ of $\varphi^t$ is such that either $\mu(A) = 0$ or $\mu(A) = 1$.

Note that the definition of ergodicity is relative to the measure space in question.


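Theorem 4.7. The following properties are equivalent.

(i) The ODE (4.1.1) is ergodic.

(ii) 1 is a simple eigenvalue of the operator $e^{\mathcal{L}t}$.

(iii) The only solutions of

$$\mathcal{L}f = 0$$

are constants.

(iv) The stationary Liouville equation

$$\mathcal{L}^* \rho = 0$$

has a unique solution $\rho^{\infty}$ satisfying $\int_{\mathcal{Z}} \rho^{\infty}(z)\, dz = 1$.

(v) For every bounded measurable $\phi : \mathcal{Z} \mapsto \mathbb{R}$ we have

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T \phi(z(t))\, dt = \int_{\mathcal{Z}} \phi(z)\rho^{\infty}(z)\, dz, \quad \mu \text{ a.s.}$$

By choosing $\phi$ to be $I_A$, the indicator function of a Borel set $A \subseteq \mathcal{Z}$, we deduce from the last result that the measure $\mu$ defined by

$$\mu(A) = \lim_{T \to \infty} \frac{1}{T} \int_0^T I_A(z(t))\, dt \tag{4.4.1}$$

has density $\rho^{\infty}$. Thus $\mu(dz) = \rho^{\infty}(z)\, dz$.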

Notice that an important aspect of ergodicity, encapsulated in the fifth item of the theorem, is that the system forgets its initial condition. It is this property that will be crucial in much of our exploitation of ergodicity.
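The time-average property can be observed on a standard example that is not taken from the text: the constant vector field $h(z) = \omega$ on the unit torus $\mathbb{T}^2$ with rationally independent frequencies $\omega = (1, \sqrt{2})$, which is ergodic with respect to Lebesgue measure. The sketch below (function names and parameters are illustrative) approximates the time average of a zero-mean test function by the midpoint rule from two different starting points.

```python
import math

def time_average(phi, z0, omega, T, dt=0.01):
    """Midpoint-rule estimate of (1/T) * int_0^T phi(z(t)) dt for dz/dt = omega on the torus."""
    steps = int(round(T / dt))
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        z = [(z0[i] + omega[i] * t) % 1.0 for i in range(len(z0))]
        total += phi(z)
    return total * dt / T

def phi(z):
    """Test function with zero mean under Lebesgue measure on the unit torus."""
    return math.cos(2 * math.pi * z[0]) * math.cos(2 * math.pi * z[1])

omega = (1.0, math.sqrt(2.0))     # rationally independent frequencies
averages = [time_average(phi, start, omega, T=200.0) for start in ((0.0, 0.0), (0.3, 0.7))]
print(averages)
```

Both averages are close to the space average 0, independently of the starting point. Replacing $\sqrt{2}$ by a rational number produces periodic orbits, and the time average then generally depends on the initial condition.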

4.5 Discussion and Bibliography

The complete proof of Theorem 4.2 can be found in [8, Sec. 31] or [31]. The generator of a system of ODEs and its properties are discussed in [88, Sec. 7.6]; in particular the results of the ergodicity Theorem 4.7 can be found in this book. The use of the generator to study Lyapunov functions, and obtain a priori estimates on solutions of ODEs, may be found in [141]. Sometimes the generator $\mathcal{L}$ which was defined in equation (4.3.1) and its adjoint $\mathcal{L}^*$ are called the generators of the Koopman and the Frobenius-Perron operators, respectively. The Liouville equation is the fundamental equation of non-equilibrium statistical mechanics; see e.g. [13]. The method of characteristics is discussed in numerous PDE books including [45].


4.6 Exercises

1. Let $\mathcal{Z} = \mathbb{T}^d$. Show that for all $f \in C^1(\mathcal{Z}, \mathbb{R})$ the formal $L^2$-adjoint of $\mathcal{L}$ defined in eqn. (4.3.1) is $\mathcal{L}^*$ defined in eqn. (4.3.3).

2. Let $H(p, q) : \mathbb{R}^{2d} \to \mathbb{R}$ be a smooth function and consider the (Hamiltonian) ODE

$$\dot{q} = \nabla_p H(p, q), \quad \dot{p} = -\nabla_q H(p, q). \tag{4.6.1}$$

a. Write down the generators of the Koopman and Frobenius-Perron operators for (4.6.1).

b. Write down the Liouville equation for (4.6.1).

c. Show that every smooth function of the Hamiltonian $H(p(t), q(t))$ solves the Liouville equation.

d. Is the Hamiltonian system (4.6.1) ergodic?

3. Carry out the same program as in the previous exercise for the (gradient) ODE

$$\dot{q} = -\nabla_q V(q), \tag{4.6.2}$$

where $V(q) : \mathbb{R}^d \to \mathbb{R}$ is a smooth function.


Chapter 5

Markov Chains

5.1 Set-Up

In this section we set up Markov chains on a countable state space which, without loss of generality, we take as $\mathcal{I}$, a subset of the positive integers [1]. We start with discrete time Markov chains and then construct continuous time Markov chains from them.

A matrix $P$ with entries $p_{ij}$ is a stochastic matrix if

$$\sum_j p_{ij} = 1 \quad \forall\, i \in \mathcal{I}$$

and $p_{ij} \in [0, \infty)$ for all $(i, j) \in \mathcal{I} \times \mathcal{I}$.

Definition 5.1. The random sequence $\{z_n\}_{n \ge 0}$ is a homogeneous Markov chain with initial distribution $\rho_0$ and transition matrix $P$ if

• $z_0$ has distribution $\rho_0$;

• for $n \ge 0$, $\mathbb{P}(z_{n+1} = j \,|\, z_n = i) = p_{ij}$, independently of $z_0, \dots, z_{n-1}$.

Note that $(P^k)_{ij}$ gives the probability that $z_k = j$ given $z_0 = i$.
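The $k$-step transition probabilities can be computed by repeated matrix multiplication. The following sketch is an illustration, not part of the original text: the two-state matrix `P` is a hypothetical example, states are indexed from 0 for convenience, and the helper names are arbitrary.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_pow(P, k):
    """P^k for a nonnegative integer k, by repeated multiplication."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, P)
    return result

P = [[0.9, 0.1],
     [0.5, 0.5]]            # hypothetical two-state transition matrix
P2 = mat_pow(P, 2)
print(P2)                   # entry (i, j) is P(z_2 = j | z_0 = i)
row_sums = [sum(row) for row in mat_pow(P, 10)]
print(row_sums)             # each power of P is again a stochastic matrix
```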

Notice that a stochastic matrix satisfies

$$P\mathbf{1} = \mathbf{1}, \tag{5.1.1}$$

where $\mathbf{1}$ is the vector of ones. Combining this with the fact that $P$ has nonnegative entries implies the following fundamental estimate:

$$\|P\|_{\infty} = 1. \tag{5.1.2}$$

[1] Unless stated otherwise, all sums are over $\mathcal{I}$ in this chapter. In later chapters we will sometimes find it convenient to work with a doubly indexed state space found as the product of two subsets of the integers.

Definition 5.2. A continuous time Markov chain is a Markov stochastic processwith state space E = I.

We will only consider time homogeneous Markov chains in what follows andhence refrain from explicitly stating this fact in the remainder of these notes.

A discrete-time Markov chain can be used to construct a continuous time Markov chain as follows. Let the i.i.d. sequence $\{\tau_n\}_{n \ge 0}$ be distributed as $\exp(\lambda)$ and define $\{t_n\}_{n \ge 0}$ by $t_{n+1} = t_n + \tau_n$. Let $\{z_n\}_{n \ge 0}$ be a discrete time Markov chain on $\mathcal{I}$, independent of the $\{\tau_n\}_{n \ge 0}$, and set

$$z(t) = z_n, \quad t \in [t_n, t_{n+1}). \tag{5.1.3}$$

We call this a jump chain. Notice that $z(t)$ takes values in $\mathcal{I}$ and is a càdlàg process. The fact that $z(t)$ is Markov follows from the Markovian structure of $\{z_n\}_{n \ge 0}$, together with the properties of the exponential random variable.
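The jump chain construction translates directly into a simulation. The sketch below is illustrative and makes several assumptions not in the text: $t_0 = 0$, states are indexed from 0, the two-state matrix `P` and rate `lam` are hypothetical, and the function names are arbitrary. Note that a "jump" of the embedded chain may return to the same state, since $p_{ii}$ can be positive.

```python
import random

def simulate_jump_chain(P, lam, z0, t_end, seed=0):
    """Simulate z(t) on [0, t_end]: return jump times t_n and states z_n as in (5.1.3)."""
    rng = random.Random(seed)
    times, states = [0.0], [z0]
    while True:
        tau = rng.expovariate(lam)             # holding time tau_n ~ exp(lam)
        if times[-1] + tau > t_end:
            return times, states
        row = P[states[-1]]
        nxt = rng.choices(range(len(row)), weights=row)[0]   # z_{n+1} ~ p(z_n, .)
        times.append(times[-1] + tau)
        states.append(nxt)

def state_at(times, states, t):
    """Evaluate the cadlag path: z(t) = z_n for t in [t_n, t_{n+1})."""
    n = max(k for k, tk in enumerate(times) if tk <= t)
    return states[n]

P = [[0.9, 0.1],
     [0.5, 0.5]]            # hypothetical two-state transition matrix (states 0 and 1)
times, states = simulate_jump_chain(P, lam=2.0, z0=0, t_end=10.0)
print(len(times) - 1, "jumps; z(5.0) =", state_at(times, states, 5.0))
```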

Informally we may write $z(t)$ as the solution of the differential equation

$$\frac{dz}{dt} = \delta(t - t_j) \big( k(z(t^-)) - z(t^-) \big) \tag{5.1.4}$$

where $k(z)$ is distributed as $p(z, \cdot)$ and the $\{k(z(t_j^-))\}_{j \ge 0}$ are drawn independently of one another, and independently of the $\{\tau_j\}$. This representation follows because integrating over the jump times $t_j$ yields

$$z(t_j^+) - z(t_j^-) = \lim_{\varepsilon \to 0} \int_{t_j - \varepsilon}^{t_j + \varepsilon} \delta(t - t_j) \big( k(z(t^-)) - z(t^-) \big)\, dt = k(z(t_j^-)) - z(t_j^-)$$

and so $z(t_j^+) = k(z(t_j^-))$ as desired. Making sense of this random differential equation, and in particular showing that it has a solution for all time $t > 0$, is intimately related to the question of showing that the jump times $t_j$ do not accumulate at a finite time. In the next section we assume that this is indeed the case. In the section following it, we return to the discussion of existence of solutions to the Markov chain.


5.2 The Generator

We start by finding a representation of the matrix $P(t)$ with entries

$$p_{ij}(t) = \mathbb{P}(z(t) = j \,|\, z(0) = i). \tag{5.2.1}$$

We will express $P(t)$ in terms of $P$ and $\lambda$, the parameters input into the jump chain. Note that, by properties of exponential random variables encapsulated in (3.2.4),

$$\mathbb{P}(k \text{ jumps in } [0, t]) = \frac{e^{-\lambda t}(\lambda t)^k}{k!}.$$

Thus

$$p_{ij}(t) = \sum_{k=0}^{\infty} \frac{e^{-\lambda t}(\lambda t)^k}{k!} \big( P^k \big)_{ij},$$

since $\big( P^k \big)_{ij} = \mathbb{P}(z_k = j \,|\, z_0 = i)$. Hence

$$P(t) = e^{\lambda t (P - I)} = e^{Lt}$$

with $L = \lambda(P - I)$. The matrix $L$ is called the generator of the continuous time Markov chain and can be related to the abstract definition of generator for Markov processes as given in Chapter 3. Notice that, since $P$ is a stochastic matrix, the matrix $L$ satisfies all of the following criteria, which we take as a definition.
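The identity between the Poisson-weighted series and the matrix exponential can be checked numerically for a two-state chain, where $e^{Lt}$ has a simple closed form. The sketch below is an illustration only: the matrix `P`, the rate `lam`, the truncation level and the function names are hypothetical choices. For $L = \begin{pmatrix} -a & a \\ b & -b \end{pmatrix}$ one has $L^2 = -(a+b)L$, hence $e^{Lt} = I + \frac{1 - e^{-(a+b)t}}{a+b} L$, which is the closed form used for comparison.

```python
import math

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transition_matrix(P, lam, t, terms=80):
    """P(t) = sum_k exp(-lam t) (lam t)^k / k! * P^k, truncated after `terms` terms."""
    n = len(P)
    Pk = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]   # P^0
    weight = math.exp(-lam * t)                                           # Poisson weight, k = 0
    total = [[0.0] * n for _ in range(n)]
    for k in range(terms):
        for i in range(n):
            for j in range(n):
                total[i][j] += weight * Pk[i][j]
        Pk = mat_mul(Pk, P)
        weight *= lam * t / (k + 1)
    return total

P = [[0.9, 0.1],
     [0.5, 0.5]]            # hypothetical two-state transition matrix
lam, t = 2.0, 1.5
Pt = transition_matrix(P, lam, t)

# Closed form of exp(Lt) for L = lam (P - I) = [[-a, a], [b, -b]].
a, b = lam * P[0][1], lam * P[1][0]
decay = math.exp(-(a + b) * t)
exact = [[(b + a * decay) / (a + b), (a - a * decay) / (a + b)],
         [(b - b * decay) / (a + b), (a + b * decay) / (a + b)]]
print(Pt)
print(exact)
```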

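Definition 5.3. A matrix $L : \mathcal{I} \to \mathcal{I}$ with entries $l_{ij}$ is the generator of a continuous time Markov chain if

• $\sum_j l_{ij} = 0$ $\forall\, i \in \mathcal{I}$;

• $l_{ij} \in [0, \infty)$ $\forall\, (i, j) \in \mathcal{I} \times \mathcal{I}$ with $i \ne j$.

The definition implies that

$$-l_{ii} \in [0, \infty) \quad \forall\, i \in \mathcal{I}.$$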

Notice that, given a generator $L$, it is possible to find a discrete time Markov chain, and a sequence of i.i.d. exponential random variables, so that the associated jump chain generates paths of the continuous time Markov chain with generator $L$. Specifically we fix any $\lambda > 0$ large enough that $P = I + \lambda^{-1}L$ has nonnegative entries (that is, $\lambda \ge \sup_i (-l_{ii})$), generate the $\tau_j$ as an i.i.d. sequence with $\tau_0 \sim \exp(\lambda)$ and generate $z_n$ from the Markov chain with transition matrix $P = I + \lambda^{-1}L$. In the construction above we insisted that the jump times were i.i.d. from $\exp(\lambda)$. However it is also possible to allow each $\tau_j$ to be independent of the others, but distributed as $\exp(\lambda(z(t_j^-)))$. We discuss this in the next section.
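The passage from a generator back to a jump chain can be sketched as follows. This is an illustration with hypothetical names and a hypothetical two-state generator; it assumes a finite state space and checks explicitly that $\lambda$ is large enough for $P = I + \lambda^{-1}L$ to be a stochastic matrix, since the diagonal entries $1 + l_{ii}/\lambda$ are nonnegative only when $\lambda \ge -l_{ii}$.

```python
def uniformized_chain(L, lam):
    """Return P = I + L / lam, raising ValueError if P is not a stochastic matrix."""
    n = len(L)
    P = [[(1.0 if i == j else 0.0) + L[i][j] / lam for j in range(n)] for i in range(n)]
    for row in P:
        if min(row) < 0.0 or abs(sum(row) - 1.0) > 1e-12:
            raise ValueError("P = I + L/lam is not stochastic; increase lam")
    return P

L = [[-0.2, 0.2],
     [1.0, -1.0]]           # hypothetical generator: rows sum to zero
P = uniformized_chain(L, lam=2.0)
print(P)
```

Calling `uniformized_chain(L, lam=0.5)` raises `ValueError` for this generator, because $0.5$ is smaller than the largest exit rate $-l_{11} = 1$.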

The fundamental object that characterizes a continuous time Markov chain is its generator. We now give another way to see the relationship between the continuous time Markov chain with transition matrix $P(t)$, and the generator $L$. Consider a continuous time Markov chain $z(t)$, $t \ge 0$, taking values in the state space $\mathcal{I} \subseteq \{1, 2, \dots\}$. Let $p_{ij}(t)$ be the transition probability from state $i$ to $j$ given by (5.2.1). The Markov property implies that for all $t, \Delta t > 0$,

$$p_{ij}(t + \Delta t) = \sum_k p_{ik}(t)\, p_{kj}(\Delta t).$$

This is the Chapman-Kolmogorov equation (3.3.3) in the discrete state space setting, so that integrals become sums. From this equation it follows that

$$\frac{p_{ij}(t + \Delta t) - p_{ij}(t)}{\Delta t} = \sum_k p_{ik}(t)\, \ell_{kj}(\Delta t),$$

where

$$\ell_{kj}(\Delta t) = \frac{1}{\Delta t} \times \begin{cases} p_{kj}(\Delta t), & k \ne j, \\ p_{jj}(\Delta t) - 1, & k = j. \end{cases} \tag{5.2.2}$$

Suppose that the limit $\ell_{kj} = \lim_{\Delta t \to 0} \ell_{kj}(\Delta t)$ exists. We then obtain, formally,

$$\frac{dp_{ij}}{dt} = \sum_k p_{ik}\, \ell_{kj}. \tag{5.2.3}$$

Because $\sum_j p_{ij}(\Delta t) = 1$ it follows that $\sum_j \ell_{ij}(\Delta t) = 0$, and assuming that the limit exists,

$$\sum_j \ell_{ij} = 0. \tag{5.2.4}$$

This implies that

$$\sum_j p_{ij} = 1$$

for the limit equation (5.2.3).

Introducing the matrices $P(t)$, $L$ with entries $p_{ij}$, $\ell_{ij}$, respectively, $i, j \in \mathcal{I}$, equation (5.2.3) reads, in matrix notation,

$$\frac{dP}{dt} = PL, \quad P(0) = I. \tag{5.2.5}$$


As shown in (5.2.2) above, the generator $L$ is calculated from $P(t)$ via the formula

$$L = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \big( P(\Delta t) - I \big).$$

The generator has constants in its null space by (5.2.4):

$$L\mathbf{1} = 0. \tag{5.2.6}$$

Furthermore, the non-negativity of the $p_{ij}$ implies that $L$ has non-negative off-diagonals. The condition (5.2.4) thus implies that the diagonal entries of $-L$ are also non-negative.

Since $P(t) = \exp(Lt)$ solves (5.2.5) we see that $P$ and $L$ commute, so that $P(t)$ also solves

$$\frac{dP}{dt} = LP, \quad P(0) = I. \tag{5.2.7}$$

We refer to both (5.2.5) and (5.2.7) as the master equation of the Markov chain.

Let $\rho(t) = (\rho_0(t), \rho_1(t), \dots)^T$ be the transpose of the $i$th row of $P(t)$, i.e., a column vector whose entries $\rho_j(t) = p_{ij}(t)$ are the probabilities that a system starting in state $i$ will end up, at time $t$, in each of the states $j \in I$. Let $e_i$ denote the $i$th unit vector, zero in all entries except the $i$th, in which it is one. Directly from (5.2.5) we obtain the following theorem.

Theorem 5.4. The probability vector $\rho$ satisfies

$$\frac{d\rho}{dt} = L^T \rho, \quad \rho(0) = e_i. \tag{5.2.8}$$

If the initial state of the Markov chain is random, with probability vector $\rho(0) = \rho_0$ chosen independently of the transition probabilities in the Markov chain, then

$$\frac{d\rho}{dt} = L^T \rho, \quad \rho(0) = \rho_0. \tag{5.2.9}$$

Proof. The first result follows from (5.2.5). Let $\rho^{(i)}$ denote the solution of (5.2.8). If the initial condition is random with $\rho(0) = \rho_0$ then

$$\rho(t) = \sum_i \rho_0(i)\, \rho^{(i)}(t).$$

Differentiating and using (5.2.8) gives (5.2.9).

Equation (5.2.9) is the Markov chain analogue of the Liouville and Fokker-Planck equations described in the previous and following chapters respectively. We refer to it as the forward equation.
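As an illustration of the forward equation, the following minimal sketch computes $P(t) = \exp(Lt)$ for a two-state generator and propagates an initial probability vector. The rates `a`, `b`, the helper names and the Taylor-plus-squaring approximation of the matrix exponential are assumptions made for this example only; they are not taken from the text.

```python
def mat_mul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(L, t, squarings=10, terms=20):
    """Approximate P(t) = exp(L t): Taylor series for exp(L t / 2**squarings),
    followed by repeated squaring."""
    n = len(L)
    scale = t / 2 ** squarings
    A = [[L[i][j] * scale for j in range(n)] for i in range(n)]
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in P]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        P = mat_mul(P, P)
    return P


def forward_solution(L, rho0, t):
    """rho(t) = P(t)^T rho0 solves d(rho)/dt = L^T rho with rho(0) = rho0."""
    P = transition_matrix(L, t)
    n = len(L)
    return [sum(P[i][j] * rho0[i] for i in range(n)) for j in range(n)]


a, b = 2.0, 1.0                       # hypothetical jump rates
L = [[-a, a], [b, -b]]                # rows sum to zero, as in (5.2.4)
rho_inf = [b / (a + b), a / (a + b)]  # solves L^T rho = 0

print(forward_solution(L, [1.0, 0.0], 0.5))
print(forward_solution(L, [1.0, 0.0], 10.0), rho_inf)
```

For this two-state chain the entries of $P(t)$ are known in closed form, for instance $p_{00}(t) = \frac{b}{a+b} + \frac{a}{a+b} e^{-(a+b)t}$, which gives a simple check on the approximation.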


Let $\phi : I \to \mathbb{R}$ be a real valued function defined on the state space; it can be represented as a vector with entries $\phi_j$, $j \in I$. Then let $v(t) = (v_0(t), v_1(t), \dots)^T$ denote the vector with $i$th entry

$$v_i(t) = \mathbb{E}\bigl(\phi(z(t)) \,|\, z(0) = i\bigr),$$

where $\mathbb{E}$ denotes expectation with respect to the Markov transition probabilities.

Theorem 5.5. The vector of expectations $v$ satisfies the equation

$$\frac{dv}{dt} = Lv, \quad v(0) = \phi. \tag{5.2.10}$$

Proof. The function $v_i(t)$ can be written explicitly in terms of the transition probabilities:

$$v_i(t) = \sum_j p_{ij}(t)\, \phi_j. \tag{5.2.11}$$

If we set $\phi = (\phi_0, \phi_1, \dots)^T$ then this can be written in vector form as $v(t) = P(t)\phi$. Differentiating with respect to time and using the master equation (5.2.7) gives the desired result.

Equation (5.2.10) is the Markov chain analogue of the method of characteristics and of the backward Kolmogorov equation described in the preceding and following chapters respectively. We refer to it as the backward equation.

5.3 Existence and Uniqueness

In the construction above we insisted that the jump times were i.i.d. exponential random variables with fixed rate $\lambda$. However, it is also possible to allow each $\tau_j$ to be independent of the others, but with rate depending on the state $z(t)$. Given a generator $L$ there is a canonical way of constructing a jump chain, with variably distributed jump times, to realize the Markov chain. Let $l(i) = -l_{ii}$ (no summation convention). For $j \neq i$ we set

$$p_{ij} = \begin{cases} l_{ij}/l(i), & l(i) \neq 0, \\ 0, & l(i) = 0, \end{cases}$$

and for $j = i$ we set

$$p_{ii} = \begin{cases} 0, & l(i) \neq 0, \\ 1, & l(i) = 0. \end{cases}$$


Notice that, if $l(i) \neq 0$,

$$\sum_j p_{ij} = \frac{1}{l(i)} \sum_{j \neq i} l_{ij} = -\frac{l_{ii}}{l(i)} = 1,$$

and that, if $l(i) = 0$,

$$\sum_j p_{ij} = 1.$$

Hence $P = (p_{ij})$ is a stochastic matrix.

We generate a sequence $\{z_n\}_{n \geq 0}$ from the discrete time Markov chain with $P$ as transition matrix. The jump chain associated with this choice of transition matrix $P$ is then given by

$$z(t) = z_n, \quad t \in [t_n, t_{n+1}), \tag{5.3.1}$$

where $t_{n+1} = t_n + \tau_n$ and the $\tau_n$ are independent random variables distributed as $\exp(l(z_n))$. Again $z(t)$ takes values in $I$, is càdlàg, and may be described by the differential equation (5.1.4). We call this the canonical jump chain associated to a Markov chain with generator $L$.
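The construction can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names, the example generator and the use of the standard `random` module are choices made here, not part of the text.

```python
import random


def jump_matrix(L):
    """Stochastic matrix P of the embedded discrete time chain built from L."""
    n = len(L)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        rate = -L[i][i]                  # l(i) = -l_ii
        if rate == 0.0:
            P[i][i] = 1.0                # absorbing state
        else:
            for j in range(n):
                if j != i:
                    P[i][j] = L[i][j] / rate
    return P


def simulate(L, z0, t_end, rng):
    """Jump times t_n and states z_n of the canonical jump chain on [0, t_end]."""
    P = jump_matrix(L)
    states = list(range(len(L)))
    times, path = [0.0], [z0]
    t, z = 0.0, z0
    while True:
        rate = -L[z][z]
        if rate == 0.0:
            break                        # absorbed: no further jumps
        t += rng.expovariate(rate)       # tau_n distributed as exp(l(z_n))
        if t >= t_end:
            break
        z = rng.choices(states, weights=P[z])[0]
        times.append(t)
        path.append(z)
    return times, path


def occupation_fractions(times, path, t_end, n_states):
    """Fraction of [0, t_end] that the piecewise constant path spends in each state."""
    frac = [0.0] * n_states
    for k, z in enumerate(path):
        t_next = times[k + 1] if k + 1 < len(times) else t_end
        frac[z] += (t_next - times[k]) / t_end
    return frac


L_example = [[-2.0, 2.0], [1.0, -1.0]]   # hypothetical two-state generator
times, path = simulate(L_example, 0, 50.0, random.Random(1))
print(occupation_fractions(times, path, 50.0, 2))
```

Over long time windows the occupation fractions of such a path can be compared with a vector in the null space of $L^T$, which for this example is proportional to $(1, 2)$.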

The discrete time Markov chain with transition matrix $P$ will generate sequences $\{z_n\}_{n \geq 0}$ with probability one. However, the associated jump chain just described may not generate a sequence $\{z(t)\}_{t \geq 0}$. Specifically, the sequence of jump times $t_n$ may accumulate at a finite time. In this situation we say that the Markov chain explodes – we do not have existence and uniqueness on $t \in [0, \infty)$.

Definition 5.6. A continuous time Markov chain is non-explosive if, with probability one, the jump times do not accumulate at a finite time.

It is important to understand conditions which ensure non-explosion. There are two useful conditions which ensure this:

Theorem 5.7. The Markov chain is non-explosive if either:

(i) $I$ is finite; or

(ii) $\sup_{i \in I} l(i) < \infty$.

Proof. Let

$$\zeta = \sum_{n=0}^{\infty} \tau_n.$$


Set $T_n = l(z_n)\tau_n$ and notice that the $T_n$ form an i.i.d. sequence with $T_1 \sim \exp(1)$. By the strong law of large numbers,

$$\frac{1}{N+1} \sum_{n=0}^{N} T_n \to 1 \quad \text{a.s.}$$

In both cases (i) and (ii) we have $l = \sup_{i \in I} l(i) < \infty$. Hence

$$l\zeta \geq \sum_{n=0}^{\infty} T_n = \infty \quad \text{a.s.}$$

so that $\zeta = \infty$ a.s. The result follows.

5.4 Ergodicity

For simplicity we assume that $I$ is a finite set. We start by discussing discrete time Markov chains. By (5.1.1), the matrix $P - I$ has a non-trivial null-space, and hence its transpose does too. Hence there exists a vector $\rho^\infty$ such that

$$P^T \rho^\infty = \rho^\infty. \tag{5.4.1}$$

Theorem 5.8. All eigenvalues of $P$ lie in the closed unit disc. The vector $\rho^\infty$ may be chosen so that all of its entries are non-negative.

The vector $\rho^\infty$ is known as the invariant distribution.

Definition 5.9. The discrete time Markov chain is said to be ergodic if the spectrum of $P$ lies strictly inside the unit circle, with the exception of a simple eigenvalue at one.

Using the construction of $L$ from $P$ in (5.2.2) we deduce that

$$L\mathbf{1} = 0, \qquad L^T \rho^\infty = 0. \tag{5.4.2}$$

Theorem 5.10. All eigenvalues of $L$ lie in the left half-plane. The vector $\rho^\infty$ may be chosen so that all of its entries are non-negative.

The vector $\rho^\infty$ is known as the invariant distribution.


Definition 5.11. The continuous time Markov chain is said to be ergodic if the spectrum of $L$ lies strictly in the left-half plane, with the exception of a simple eigenvalue at zero.

Theorem 5.12. An ergodic continuous time Markov chain on finite state space $I$ satisfies the following five properties:

i) $\mathrm{Null}(L) = \mathrm{span}\{\mathbf{1}\}$;

ii) $\mathrm{Null}(L^T) = \mathrm{span}\{\rho^\infty\}$, $\rho^\infty(i) > 0 \ \forall i \in I$;

iii) $\exists\, C, \lambda > 0$ such that the solution of the forward equation satisfies

$$\|\rho(t) - \rho^\infty\|_1 \leq C e^{-\lambda t} \quad \forall t > 0;$$

iv) $\exists\, C, \lambda > 0$ such that the solution of the backward equation satisfies

$$\|v(t) - \langle \rho^\infty, \phi \rangle \mathbf{1}\|_\infty \leq C e^{-\lambda t} \quad \forall t > 0;$$

v)

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T \phi(z(t))\, dt = \langle \rho^\infty, \phi \rangle \quad \text{a.s.}$$

Since the state space is finite dimensional the convergence results hold in any norm; however the choices as stated are natural from a probabilistic viewpoint.

Notice that, by choosing $\phi = e_i$, we deduce from the final result that the $i$th component of $\rho^\infty$ can be found as the proportion of time that an arbitrary trajectory of the Markov chain on $t \in [0, \infty)$ spends in state $i$.

5.5 Discussion and Bibliography

A good background reference on Markov chains is Norris [110]. The text [41] has a wealth of material on the general setting for Markov processes, including Markov chains. The paper [57] describes a simple algorithm for simulating continuous time Markov chains.


5.6 Exercises

1. Consider the discrete time Markov chain with transition matrix

$$P = \begin{pmatrix} 1 - \alpha & \alpha \\ \beta & 1 - \beta \end{pmatrix}.$$

Find the invariant density $\rho^\infty$. Under what conditions is the Markov chain ergodic?

2. Consider the continuous time Markov chain with generator

$$L = \begin{pmatrix} -a & a \\ b & -b \end{pmatrix}.$$

Find the invariant density $\rho^\infty$. Under what conditions is the Markov chain ergodic?

3. Construct the canonical jump chain associated to the generator given in the previous question.

4. Implement a numerical algorithm to simulate paths of the continuous time Markov chain from the previous question, by using the canonical jump chain.

5. Show that the jump chain (5.1.3) satisfies the Markov property.

6. Show that the canonical jump chain (5.3.1) satisfies the Markov property.


Chapter 6

Stochastic Differential Equations

6.1 Set-Up

In this chapter we give background material concerning ODEs driven by Gaussian white noise – SDEs. Let $W(t)$ denote $m$-dimensional Brownian motion, $h : \mathcal{Z} \to \mathbb{R}^d$ a smooth vector-valued function and $\gamma : \mathcal{Z} \to \mathbb{R}^{d \times m}$ a smooth matrix-valued function. In the following we typically take $\mathcal{Z} = \mathbb{T}^d$, $\mathbb{R}^d$ or $\mathbb{R}^l \oplus \mathbb{T}^{d-l}$. Consider the Itô SDE

$$\frac{dz}{dt} = h(z) + \gamma(z) \frac{dW}{dt}, \quad z(0) = z_0. \tag{6.1.1}$$

We think of the term $\frac{dW}{dt}$ as representing Gaussian white noise: a delta-correlated Gaussian process. But such a process only exists as a distribution, and so the precise interpretation of (6.1.1) is as an integral equation for $z \in C(\mathbb{R}^+, \mathcal{Z})$:

$$z(t) = z(0) + \int_0^t h(z(s))\, ds + \int_0^t \gamma(z(s))\, dW(s). \tag{6.1.2}$$

In order to make sense of this equation we need to make sense of the stochastic integral against $W(s)$. We use the Itô interpretation of the stochastic integral as defined in Chapter 3. Because it is notationally convenient to do so, we will frequently write SDEs in the unintegrated form (6.1.1). Whenever we do this it should be interpreted as shorthand for the precise interpretation (6.1.2).

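In later chapters we will often consider $z = (x^T, y^T)^T$, with $x \in \mathcal{X}$, $y \in \mathcal{Y}$. If $\mathcal{Z} = \mathbb{T}^d$ (resp. $\mathbb{R}^d$) then $\mathcal{X} = \mathbb{T}^l$ (resp. $\mathbb{R}^l$) and $\mathcal{Y} = \mathbb{T}^{d-l}$ (resp. $\mathbb{R}^{d-l}$). If $\mathcal{Z} = \mathbb{R}^l \oplus \mathbb{T}^{d-l}$ then $\mathcal{X} = \mathbb{R}^l$ and $\mathcal{Y} = \mathbb{T}^{d-l}$.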


6.2 Existence and Uniqueness

In Theorem 4.2 we proved existence and uniqueness of solutions for ODEs (i.e. when $\gamma \equiv 0$ in (6.1.1)) for globally Lipschitz vector fields $h(x)$. A very similar theorem holds when $\gamma \neq 0$. As for ODEs the conditions can be weakened when a priori bounds on the solution can be found; but we limit ourselves to the simple set-up of the following theorem, for expository purposes.

Theorem 6.1. Assume that both $h(\cdot)$ and $\gamma(\cdot)$ are globally Lipschitz on $\mathcal{Z}$ and that $z_0$ is a random variable independent of the Brownian motion $W(t)$ with

$$\mathbb{E}|z_0|^2 < \infty.$$

Then the SDE (6.1.1) has a unique solution $z(t) \in C(\mathbb{R}^+; \mathcal{Z})$ with

$$\mathbb{E}\left[\int_0^T |z(t)|^2\, dt\right] < \infty \quad \forall\, T < \infty.$$

We conclude the section with two remarks, both of which will play an important role in future chapters.

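Remark 6.2. The Stratonovich analogue of (6.1.1) is

$$\frac{dz}{dt} = h(z) + \gamma(z) \circ \frac{dW}{dt}, \quad z(0) = z_0. \tag{6.2.1}$$

By using definitions (3.4.5) and (3.4.6) it can be shown that $z$ satisfying the Stratonovich SDE (6.2.1) also satisfies the Itô SDE

$$\frac{dz}{dt} = h(z) + \frac{1}{2} \nabla \cdot \bigl(\gamma(z)\gamma(z)^T\bigr) - \frac{1}{2} \gamma(z) \nabla \cdot \bigl(\gamma(z)^T\bigr) + \gamma(z) \frac{dW}{dt}, \quad z(0) = z_0. \tag{6.2.2}$$

See Exercise 1.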

Remark 6.3. The Definition 3.3 of Brownian motion implies the interesting scaling property

$$W(ct) = \sqrt{c}\, W(t),$$

where the above should be interpreted as holding in law. From this it follows that, if $s = ct$, then

$$\frac{dW}{ds} = \frac{1}{\sqrt{c}} \frac{dW}{dt},$$

again in law.


Hence, if we scale time to $s = ct$ in (6.1.1), then we get the equation

$$c\, \frac{dz}{ds} = h(z) + \sqrt{c}\, \gamma(z) \frac{dW}{ds}, \quad z(0) = z_0.$$

(The precise interpretation is as an integral equation, as always.) Notice that, whilst the SDE transforms unusually under $s = ct$, the Fokker-Planck equation transforms in the standard way, because it sees a quadratic term formed from the diffusion coefficient.

6.3 The Generator

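With the functions $h(z)$, $\gamma(z)$ given in the SDE (6.1.1) we define

$$\Gamma(z) = \gamma(z)\gamma(z)^T.$$

The generator $\mathcal{L}$ is then defined as

$$\mathcal{L}v = h \cdot \nabla v + \frac{1}{2} \Gamma : \nabla\nabla v. \tag{6.3.1}$$

We will also be interested in the formal $L^2$-adjoint operator $\mathcal{L}^*$:

$$\mathcal{L}^* v = -\nabla \cdot (hv) + \frac{1}{2} \nabla \cdot \nabla \cdot (\Gamma v).$$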

The Itô formula which follows is the basic result concerning the rate of change in time of functions $V : \mathcal{Z} \to \mathbb{R}^n$ evaluated at the solution of a $\mathcal{Z}$-valued SDE. Heuristically it delivers the following result:

$$\frac{d}{dt} V(z(t)) = \mathcal{L}V(z(t)) + \nabla V(z(t))\, \gamma(z(t)) \frac{dW}{dt}, \quad V(z(0)) = V(z_0).$$

Note that if $W$ were a smooth time-dependent function this formula would not be correct: there is an additional term in $\mathcal{L}V$, proportional to $\Gamma$, which arises from the lack of smoothness of Brownian motion. As for the SDE (6.1.1) itself, the precise interpretation of this expression is in integrated form.

Lemma 6.4 (Itô Formula). Assume that the conditions of Theorem 6.1 hold. Let $z(t)$ solve (6.1.1) and let $V \in C^2(\mathcal{Z}, \mathbb{R}^n)$. Then the process $Y(t) = V(z(t))$ satisfies

$$V(z(t)) = V(z(0)) + \int_0^t \mathcal{L}V(z(s))\, ds + \int_0^t \nabla V(z(s))\, \gamma(z(s))\, dW(s).$$


Let $\phi : \mathcal{Z} \to \mathbb{R}$ and consider the function

$$v(z, t) = \mathbb{E}\bigl(\phi(z(t)) \,|\, z(0) = z\bigr), \tag{6.3.2}$$

where the expectation is with respect to all Brownian driving paths. By averaging in the Itô formula, which removes the stochastic integral, and using the Markov property, we deduce the following important consequence of Lemma 6.4.

Theorem 6.5. Assume that $\phi$ is chosen sufficiently smooth so that the backward Kolmogorov equation

$$\frac{\partial v}{\partial t} = \mathcal{L}v, \quad (z, t) \in \mathcal{Z} \times (0, \infty),$$
$$v = \phi, \quad (z, t) \in \mathcal{Z} \times \{0\}, \tag{6.3.3}$$

has a unique classical solution $v(z, t) \in C^{2,1}(\mathcal{Z} \times (0, \infty), \mathbb{R}^+)$. Then $v$ is given by (6.3.2).

This is the analogue of (5.2.10) for Markov chains. If $\gamma \equiv 0$ in (6.1.1), so that the dynamics are deterministic, and $\varphi^t$ is the flow on $\mathcal{Z}$ so that $z(t) = \varphi^t(z(0))$, then the Kolmogorov equation (6.3.3) reduces to the hyperbolic equation (4.3.4) whose characteristics are the integral curves of the ODE (4.1.1).

A direct consequence of Theorem 6.5 is the Fokker-Planck equation for propagation of densities.

Theorem 6.6. Consider the SDE (6.1.2) with $z(0)$ a random variable with density $\rho_0(z)$. Assume that the law of $z(t)$ has a density $\rho(z, t) \in C^{2,1}(\mathcal{Z} \times (0, \infty), \mathbb{R}^+)$. Then $\rho$ satisfies the Fokker-Planck equation

$$\frac{\partial \rho}{\partial t} = \mathcal{L}^* \rho, \quad (z, t) \in \mathcal{Z} \times (0, \infty),$$
$$\rho = \rho_0, \quad (z, t) \in \mathcal{Z} \times \{0\}. \tag{6.3.4}$$

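Proof. Let $\mathbb{E}^\mu$ denote averaging with respect to the product measure induced by the measure $\mu$ with density $\rho_0$ on $z(0)$ and the independent driving Wiener measure on the SDE itself. By the previous result, averaging over random $z(0)$ distributed with density $\rho_0(z)$, we find

$$\mathbb{E}^\mu(\phi(z(t))) = \int_{\mathcal{Z}} v(z, t)\, \rho_0(z)\, dz = \int_{\mathcal{Z}} (e^{\mathcal{L}t}\phi)(z)\, \rho_0(z)\, dz = \int_{\mathcal{Z}} (e^{\mathcal{L}^* t}\rho_0)(z)\, \phi(z)\, dz.$$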


But since $\rho(z, t)$ is the density of $z(t)$ we also have

$$\mathbb{E}^\mu(\phi(z(t))) = \int_{\mathcal{Z}} \rho(z, t)\, \phi(z)\, dz.$$

Equating these two expressions for the expectation at time $t$ shows that

$$\rho(z, t) = (e^{\mathcal{L}^* t}\rho_0)(z)$$

in $L^2$. Hence the density $\rho(z, t)$ satisfies the Fokker-Planck equation as required.

The Fokker-Planck equation is the continuous analogue of the forward equation (5.2.9) for Markov chains, and of the Liouville equation (4.3.7) for ODEs. Note that constants are in the null-space of $\mathcal{L}$ given by (6.3.1). Hence we expect $\mathcal{L}^*$ to have a non-trivial null-space. Let $\rho$ be any steady solution of the Fokker-Planck equation, i.e.

$$\mathcal{L}^* \rho = 0. \tag{6.3.5}$$

We have the following important result giving the Dirichlet form associated with the operator $\mathcal{L}$.

Theorem 6.7. Let $\rho$ be any steady solution of the Fokker-Planck equation on $\mathbb{T}^d$ with periodic boundary conditions. Let $f$ be a smooth function on $\mathbb{T}^d$. Then

$$\int_{\mathbb{T}^d} (-\mathcal{L}f(z))\, f(z)\, \rho(z)\, dz = \frac{1}{2} \int_{\mathbb{T}^d} \bigl(\nabla f(z) \cdot \Gamma(z) \nabla f(z)\bigr) \rho(z)\, dz. \tag{6.3.6}$$

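Proof. We have

$$\begin{aligned} \mathcal{L}^*(f\rho) &= -\nabla \cdot (h\rho f) + \frac{1}{2} \nabla \cdot \nabla \cdot (\Gamma \rho f) \\ &= (\mathcal{L}^*\rho) f + (-\mathcal{L}f)\rho + \nabla\rho \cdot \Gamma \nabla f + \rho\, \Gamma : \nabla\nabla f + \nabla f \cdot (\nabla \cdot \Gamma)\rho \\ &= (-\mathcal{L}f)\rho + \nabla\rho \cdot \Gamma \nabla f + \rho\, \Gamma : \nabla\nabla f + \nabla f \cdot (\nabla \cdot \Gamma)\rho. \end{aligned}$$

Now let $g$ be another smooth, periodic function. We use the previous calculation to compute, using the fact that $\mathcal{L}$ and $\mathcal{L}^*$ are adjoint operators in $L^2(\mathbb{T}^d)$ and by using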


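the divergence theorem on the last term in the second line,

$$\begin{aligned} \int_{\mathbb{T}^d} (\mathcal{L}g) f \rho\, dz &= \int_{\mathbb{T}^d} g\, \mathcal{L}^*(f\rho)\, dz \\ &= \int_{\mathbb{T}^d} \bigl( g(-\mathcal{L}f)\rho + g \nabla\rho \cdot \Gamma \nabla f \bigr)\, dz + \int_{\mathbb{T}^d} g \bigl( \nabla f \cdot (\nabla \cdot \Gamma)\rho + \rho\, \Gamma : \nabla\nabla f \bigr)\, dz \\ &= \int_{\mathbb{T}^d} g(-\mathcal{L}f)\rho\, dz - \int_{\mathbb{T}^d} \nabla g \cdot \Gamma \nabla f\, \rho\, dz. \end{aligned} \tag{6.3.7}$$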

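Equation (6.3.7) implies

$$\int_{\mathbb{T}^d} (-\mathcal{L}g) f \rho\, dz + \int_{\mathbb{T}^d} g(-\mathcal{L}f)\rho\, dz = \int_{\mathbb{T}^d} \nabla g \cdot \Gamma \nabla f\, \rho\, dz. \tag{6.3.8}$$

Equation (6.3.6) follows from the above equation upon setting $g = f$.

Roughly speaking, Theorem 6.7 shows that $-\mathcal{L}$ is a positive operator in an appropriate weighted $L^2$ space. Indeed let $\rho$ be strictly positive on $\mathcal{Z}$ and define a measure $\mu(dx) = \rho(x)\, dx$. We can then introduce the weighted Hilbert space $L^2(\mu)$ with inner product and norm as follows:

$$(a, b)_\rho = \int_{\mathbb{T}^d} a(z) \cdot b(z)\, \rho(z)\, dz, \qquad |a|_\rho^2 = (a, a)_\rho.$$

Then the preceding theorem shows that

$$(-\mathcal{L}f, f)_\rho = \frac{1}{2} |\gamma^T \nabla f|_\rho^2. \tag{6.3.9}$$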

Remark 6.8. In suitable functional settings, the previous theorem also applies on other choices of domain $\mathcal{Z}$, not just on the torus. Specifically the function space should ensure that $\mathcal{L}^*$ is the adjoint of $\mathcal{L}$ and allow the divergence theorem calculation used to reach (6.3.7). Typically these conditions are realized on non-compact spaces by means of decay assumptions at infinity.

6.4 Ergodicity

For simplicity of exposition we state the rigorous results in the case where $\mathcal{Z} = \mathbb{T}^d$. Consider the SDE (6.1.1) on $\mathbb{T}^d$. Equip both the generator $\mathcal{L}$ and its adjoint $\mathcal{L}^*$ with periodic boundary conditions.


Definition 6.9. The SDE is said to be ergodic if the spectrum of the generator lies strictly in the left-half plane, with the exception of a simple eigenvalue at the origin.

In the following we use the short-hand notation $\rho(t)$ and $v(t)$ to denote the function-valued time-dependent solutions of the Fokker-Planck and Kolmogorov equations respectively. Thus we may view $\rho(t)$, $v(t)$ as being in a Banach space, for each fixed $t$, and measure their size through the $L^p$ norms. The notation $\mathbf{1}$ is used to denote functions which are constant and equal to one, a.e. in an $L^p$ sense.

Theorem 6.10. Equip $\mathcal{L}$, $\mathcal{L}^*$ on $\mathbb{T}^d$ with periodic boundary conditions and assume that $\Gamma(z)$ is strictly positive-definite, uniformly in $z \in \mathbb{T}^d$. Then the SDE (6.1.1) is ergodic and satisfies the following five properties:

• $\mathrm{Null}(\mathcal{L}) = \mathrm{span}\{\mathbf{1}\}$;

• $\mathrm{Null}(\mathcal{L}^*) = \mathrm{span}\{\rho^\infty\}$, $\inf_{z \in \mathbb{T}^d} \rho^\infty(z) > 0$;

• $\exists\, C, \lambda > 0$ such that the solution of the Fokker-Planck equation with initial data a Dirac mass at arbitrary $z(0) \in \mathbb{T}^d$ satisfies

$$\|\rho(t) - \rho^\infty\|_1 \leq C e^{-\lambda t} \quad \forall t > 0;$$

• $\exists\, C, \lambda > 0$ such that the solution of the backward Kolmogorov equation with initial data a continuous function $\phi$ satisfies

$$\left\| v(t) - \left( \int_{\mathbb{T}^d} \phi(z) \rho^\infty(z)\, dz \right) \mathbf{1} \right\|_\infty \leq C e^{-\lambda t} \quad \forall t > 0;$$

• For all $\phi \in C(\mathbb{T}^d)$

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T \phi(z(t))\, dt = \int_{\mathbb{T}^d} \phi(z) \rho^\infty(z)\, dz \quad \text{a.s.}$$

Let $I_A$ denote the indicator function of the Borel set $A \subseteq \mathcal{Z}$. This function is not continuous but may be approximated by a sequence of continuous functions. By choosing $\phi$ to be $I_A$ we deduce from the last result that the measure $\mu$ defined by

$$\mu(A) = \lim_{T \to \infty} \frac{1}{T} \int_0^T I_A(z(t))\, dt \tag{6.4.1}$$

has density $\rho^\infty$. Thus $\mu(dz) = \rho^\infty(z)\, dz$.


Remark 6.11. Although we do not prove these results, a few remarks are in order. First, the fact that the null-space of $\mathcal{L}$ comprises constants, when the diffusion matrix $\Gamma$ is uniformly positive-definite and when periodic boundary conditions are used, may be seen by means of the maximum principle. Secondly, the same fact also follows directly from (6.3.9) if it is assumed that $\rho^\infty$ is strictly positive.

Example 6.12. Consider one-dimensional Brownian motion on $\mathbb{T}^1 = S^1$:

$$\frac{dz}{dt} = \sqrt{2\sigma}\, \frac{dW}{dt}, \quad z(0) = z_0.$$

The generator $\mathcal{L}$ is the differential operator

$$\mathcal{L} = \sigma \frac{d^2}{dz^2}$$

equipped with periodic boundary conditions on $[0, 1]$. This operator is self-adjoint. The null space of both $\mathcal{L}$ and $\mathcal{L}^*$ comprises constant functions on $[0, 1]$. Both the backward Kolmogorov and the Fokker-Planck equation reduce to the heat equation

$$\frac{\partial \rho}{\partial t} = \sigma \frac{\partial^2 \rho}{\partial x^2}$$

with periodic boundary conditions. Straightforward Fourier analysis shows that the solution converges to a constant at an exponential rate.

Although the theorem as stated holds with state space $\mathbb{T}^d$, it readily extends to a variety of settings with the appropriate function space choice for $\mathcal{L}$ and $\mathcal{L}^*$. To illustrate this we include two examples on $\mathbb{R}^d$.

Example 6.13. Consider the Ornstein-Uhlenbeck (OU) process

$$\frac{dz}{dt} = -z + \sqrt{2\sigma}\, \frac{dW}{dt}, \quad z(0) = z_0, \tag{6.4.2}$$

with $z(t) \in \mathbb{R}$. Here $z_0$ is fixed and non-random. The solution is

$$z(t) = e^{-t} z_0 + \sqrt{2\sigma} \int_0^t e^{-(t-s)}\, dW(s).$$

Hence,

$$\mathbb{E}z(t) = z_0 e^{-t}$$


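and, by the Itô isometry,

$$\mathbb{E}\bigl(z(t) - \mathbb{E}z(t)\bigr)^2 = 2\sigma\, \mathbb{E}\left( \int_0^t e^{-(t-s)}\, dW(s) \right)^2 = 2\sigma \int_0^t e^{-2(t-s)}\, ds = \sigma\bigl(1 - e^{-2t}\bigr).$$

In fact,

$$z(t) \sim \mathcal{N}\bigl(e^{-t} z_0,\ \sigma(1 - e^{-2t})\bigr),$$

indicating convergence to the Gaussian invariant measure $\mathcal{N}(0, \sigma)$ as $t \to \infty$.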
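The mean and variance formulas above can be compared with a direct simulation. The sketch below uses the Euler-Maruyama time-stepping scheme, which is not discussed in the text and is assumed here purely for illustration; the function names, step size and sample size are likewise hypothetical choices.

```python
import math
import random


def ou_mean(z0, t):
    """Exact mean of the OU process (6.4.2) started at z0."""
    return z0 * math.exp(-t)


def ou_variance(sigma, t):
    """Exact variance sigma * (1 - exp(-2 t)) of the OU process (6.4.2)."""
    return sigma * (1.0 - math.exp(-2.0 * t))


def euler_maruyama_ou(z0, sigma, t_end, dt, rng):
    """One approximate sample of z(t_end): z <- z - z*dt + sqrt(2*sigma*dt)*xi."""
    z = z0
    for _ in range(round(t_end / dt)):
        z += -z * dt + math.sqrt(2.0 * sigma * dt) * rng.gauss(0.0, 1.0)
    return z


def sample_moments(z0, sigma, t_end, dt, n_samples, rng):
    """Monte Carlo estimates of the mean and variance of z(t_end)."""
    samples = [euler_maruyama_ou(z0, sigma, t_end, dt, rng)
               for _ in range(n_samples)]
    mean = sum(samples) / n_samples
    var = sum((s - mean) ** 2 for s in samples) / (n_samples - 1)
    return mean, var


rng = random.Random(0)
print(sample_moments(1.0, 0.5, 1.0, 0.01, 500, rng))
print(ou_mean(1.0, 1.0), ou_variance(0.5, 1.0))
```

The estimates agree with the exact values only up to Monte Carlo error and a time-discretization bias, both of which shrink as the sample size grows and the step size decreases.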

The Fokker-Planck equation here reduces to

$$\frac{\partial \rho}{\partial t} + \frac{\partial}{\partial x}(-x\rho) = \sigma \frac{\partial^2 \rho}{\partial x^2} \tag{6.4.3}$$

with initial condition being a Dirac mass centered at $z_0$. It is readily verified that the density associated with the Gaussian measure for $z(t)$ is a solution of this linear PDE.

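Example 6.14. Consider the equations

$$\frac{dx}{dt} = -ax + y, \quad x(0) = x_0,$$
$$\frac{dy}{dt} = -y + \sqrt{2\sigma}\, \frac{dW}{dt}, \quad y(0) = y_0,$$

for $(x(t), y(t)) \in \mathbb{R}^2$. We assume $a > 0$. The Fokker-Planck equation takes the form

$$\frac{\partial \rho}{\partial t} + \frac{\partial}{\partial x}\bigl((-ax + y)\rho\bigr) + \frac{\partial}{\partial y}(-y\rho) = \sigma \frac{\partial^2 \rho}{\partial y^2}.$$

The previous example shows that $y$ has Gaussian distribution and converges to a Gaussian invariant measure. Since

$$x(t) = e^{-at} x(0) + \int_0^t e^{-a(t-s)} y(s)\, ds$$

and $y$ is Gaussian we deduce that $x$ too is Gaussian.

These considerations suggest that we seek a steady solution of the Fokker-Planck equation in the form

$$\rho^\infty(x, y) \propto \exp\bigl\{-\alpha x^2 + \beta xy - \gamma y^2\bigr\}.$$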


(The constant of proportionality should be chosen so that $\rho^\infty$ integrates to 1 on $\mathbb{R}^2$.) Substitution shows that

$$\alpha = \frac{a(a+1)^2}{2\sigma}, \quad \beta = \frac{2a(a+1)}{2\sigma}, \quad \gamma = \frac{a+1}{2\sigma}.$$

Note that we have thus found the density of a Gaussian invariant measure for $(x, y)$.
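One way to cross-check these coefficients, offered here as a sketch rather than the derivation intended in the text, is through the stationary second moments. Applying the Itô formula to $y^2$, $xy$ and $x^2$ and setting time derivatives to zero gives $\mathbb{E}y^2 = \sigma$, $(a+1)\mathbb{E}xy = \mathbb{E}y^2$ and $a\,\mathbb{E}x^2 = \mathbb{E}xy$; inverting the resulting covariance matrix yields the quadratic form in the exponent. The function names below are hypothetical.

```python
def stationary_covariance(a, sigma):
    """Stationary second moments (E x^2, E xy, E y^2) of the zero-mean pair (x, y)."""
    var_y = sigma                 # from 0 = -2 E[y^2] + 2 sigma
    cov_xy = var_y / (a + 1.0)    # from 0 = -(a + 1) E[xy] + E[y^2]
    var_x = cov_xy / a            # from 0 = -2 a E[x^2] + 2 E[xy]
    return var_x, cov_xy, var_y


def exponent_coefficients(a, sigma):
    """Coefficients (alpha, beta, gamma) in exp(-alpha x^2 + beta x y - gamma y^2),
    read off from -(1/2) z^T C^{-1} z with C the stationary covariance matrix."""
    var_x, cov_xy, var_y = stationary_covariance(a, sigma)
    det = var_x * var_y - cov_xy ** 2
    q_xx, q_xy, q_yy = var_y / det, -cov_xy / det, var_x / det  # entries of C^{-1}
    return 0.5 * q_xx, -q_xy, 0.5 * q_yy


a, sigma = 2.0, 0.5               # hypothetical parameter values
print(exponent_coefficients(a, sigma))
print(a * (a + 1) ** 2 / (2 * sigma), 2 * a * (a + 1) / (2 * sigma), (a + 1) / (2 * sigma))
```

The two printed triples should coincide up to rounding, which confirms that the stated $\alpha$, $\beta$, $\gamma$ are consistent with the moment equations.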

Example 6.15. Consider the stochastic integral

$$I(t) = \int_0^t \eta(s)\, dW_1(s),$$

where $\eta(t)$ is the Ornstein–Uhlenbeck process defined in Example 6.13, namely

$$\frac{d\eta}{dt} = -\eta + \sqrt{2\sigma}\, \frac{dW_2}{dt}, \quad \eta(0) = \eta_0.$$

Here $W_1(t)$ and $W_2(t)$ are independent Brownian motions. The invariant measure for $\eta$ is the Gaussian measure $\mathcal{N}(0, \sigma)$. We assume that the initial condition is distributed according to this invariant measure. Hence, $\eta(t)$ is a stationary ergodic Markov process and Corollary 3.33 applies. Hence

$$\lim_{\varepsilon \to 0} \varepsilon I(t/\varepsilon^2) = \sqrt{\sigma}\, W(t),$$

where $W(t)$ is a standard Brownian motion in one dimension.

6.5 Discussion and Bibliography

For a discussion of SDEs from the viewpoint of the Fokker-Planck equation, see Risken [128] or Gardiner [53]. For a discussion of the generator $\mathcal{L}$, and the backward Kolmogorov equation, see Oksendal [111]. For a discussion concerning ellipticity, hypo-ellipticity and smoothness of solutions to these equations see Rogers and Williams [130]. The book by Khasminskii [66] has a good discussion of ergodicity. The book by Mao [100] has a good overview of stability theory and large time properties of SDEs. The Fokker-Planck equation is often referred to as the forward Kolmogorov equation in the mathematics literature.

We have only discussed strong solutions to SDEs. The definition of a weak solution, together with existence and uniqueness theorems for weak solutions, can be found in [130, Ch. 5]. The weak formulation of an SDE is equivalent to the martingale formulation. See [140].


6.6 Exercises

1. Derive the Itô SDE (6.2.2) from the Stratonovich SDE (6.2.1). Using the Itô form of the Stratonovich SDE, find the Fokker-Planck equation for the Stratonovich SDE.

2. Prove Theorem 6.5 (hint: use the Itô formula and the martingale property of the stochastic integral).

3. Consider the OU process defined in equation (6.4.2).

(a) Calculate all moments of the process $z(t)$.

(b) Solve the corresponding Fokker–Planck equation (6.4.3) with $\rho(x, 0) = \delta(x - z)$ to show that the solution is

$$\rho(x, t; z) = \sqrt{\frac{1}{2\pi\sigma(1 - e^{-2t})}}\, \exp\left[ -\frac{(x - e^{-t}z)^2}{2\sigma(1 - e^{-2t})} \right].$$

(c) Deduce the long–time behavior of the OU process from the above formula.

4. Consider the SDE

$$m\ddot{x} = -\nabla V(x) - \gamma\dot{x} + \sqrt{\gamma D}\, \frac{dW}{dt}, \tag{6.6.1}$$

where $m$, $D$, $\gamma$ are positive constants and $V(x) : \mathbb{R}^d \to \mathbb{R}$ is a smooth function.

(a) Write equation (6.6.1) as a first order system of SDEs in the form

$$\frac{dz}{dt} = -\frac{1}{D} K \nabla H(z) + \frac{1}{\sqrt{m}} J \nabla H(z) + \sqrt{2K}\, \frac{dB}{dt},$$

where $z = (x^T, y^T)^T$, $J$ (respectively $K$) is a skew (respectively symmetric) matrix which you should define and $H(z) = \frac{1}{2}|y|^2 + V(x)$.

(b) Write the corresponding generator and the Fokker–Planck equation.

(c) Solve the stationary Fokker–Planck equation (hint: use separation of variables).

(d) Solve equation (6.6.1) in one dimension for the cases $V(x) \equiv 0$ and $V(x) = \frac{1}{2}x^2$.


5. Consider a Markov chain $z$ with generator

$$L = \begin{pmatrix} -a & a \\ b & -b \end{pmatrix}.$$

Now let $x$ solve an SDE with coefficients depending on $z$:

$$\frac{dx}{dt} = f(x, z) + \alpha(x, z) \frac{dW}{dt}.$$

Write down the generator for the Markov process $(x, z)$.


Chapter 7

Partial Differential Equations

7.1 Set-Up

In this chapter we describe the basic theory of elliptic, parabolic and hyperbolic PDEs. In Sections 7.3 and 7.4 we study the Dirichlet and the periodic boundary value problems for elliptic PDEs in divergence form. Section 7.5 is devoted to the Fredholm alternative, in a more general setting. In Section 7.6 we describe the Cauchy problem for parabolic PDEs and in Section 7.7 the Cauchy problem for hyperbolic PDEs. The function space settings are described in Chapter 2.

7.2 Elliptic PDE

The (homogeneous) Dirichlet problem is to find $u(x)$ solving

$$-\nabla \cdot (A(x)\nabla u) = f, \quad \text{for } x \in \Omega, \tag{7.2.1a}$$
$$u(x) = 0, \quad \text{for } x \in \partial\Omega, \tag{7.2.1b}$$

where $A(x)$ is a positive definite matrix and $f \in H^{-1}(\Omega)$.

Let $\mathcal{Y} = \mathbb{T}^d$ denote the $d$-dimensional torus. The periodic problem is to find $u(y)$ solving

$$-\nabla_y \cdot (A(y)\nabla_y u) = f(y), \quad u(y) \text{ is 1-periodic}, \tag{7.2.2}$$

where $A(y)$ is a periodic positive definite matrix and $f \in H^*$, where $H^*$ is the dual of $H$; recall that $H$ is the set of mean zero $H^1_{per}(\mathcal{Y})$ functions (see equation (2.4.5)).

The class of coefficients $A(x)$ that we will consider is provided in the following definition.


Definition 7.1. Let $\alpha, \beta \in \mathbb{R}$ be such that $0 < \alpha \leq \beta < \infty$. We define $M(\alpha, \beta, \Omega)$ to be the set of $d \times d$ matrices $A(x) = \{a_{ij}(x)\}_{i,j=1}^d \in (L^\infty(\Omega))^{d \times d}$ such that, for every vector $\xi \in \mathbb{R}^d$ and every $x \in \Omega$,

(i) $\xi^T A(x)\xi \geq \alpha|\xi|^2$,

(ii) $|A(x)\xi| \leq \beta|\xi|$.

Furthermore, we define $M_{per}(\alpha, \beta, \mathcal{Y})$ to be the set of matrices in $M(\alpha, \beta, \mathcal{Y})$ with $\mathcal{Y}$-periodic coefficients.

In Section 7.3 we will study the Dirichlet problem. The periodic problem will be studied in Section 7.4. The Fredholm alternative for elliptic operators of the form

$$\mathcal{A} = -\nabla_y \cdot (A(y)\nabla_y) + b(y) \cdot \nabla_y$$

with periodic boundary conditions will be studied in Section 7.5.

The elliptic operators we study have the form

$$\mathcal{A} = -\nabla \cdot (A(x)\nabla) + b(x) \cdot \nabla + c(x).$$

Operators of this form, and the corresponding PDEs, are said to be in divergence form. On the other hand, elliptic differential operators of the form

$$\mathcal{A} = -A(x) : \nabla\nabla + b(x) \cdot \nabla + c(x)$$

are in non-divergence form.

7.3 The Dirichlet Problem for Elliptic PDE

First we give the precise definition of a solution. For this we will need to introduce the bilinear form

$$a[\phi, \psi] = \int_\Omega \nabla\psi(x)^T A(x) \nabla\phi(x)\, dx, \tag{7.3.1}$$

for $\phi, \psi \in H_0^1(\Omega)$. We will use the notation $\langle \cdot, \cdot \rangle$ for the pairing between $H_0^1(\Omega)$ and its dual $H^{-1}(\Omega)$ (see Chapter 2).

Definition 7.2. We will say that $u \in H_0^1(\Omega)$ is a weak solution of the boundary value problem (7.2.1) if

$$a[u, v] = \langle f, v \rangle \quad \forall\, v \in H_0^1(\Omega). \tag{7.3.2}$$


The Lax–Milgram Theorem 2.41 enables us to prove existence and uniqueness of weak solutions for the class of matrices $A(x)$ given by Definition 7.1.

Theorem 7.3. The Dirichlet problem (7.2.1) with $A \in M(\alpha, \beta, \Omega)$ and $f \in H^{-1}(\Omega)$ has a unique weak solution $u \in H_0^1(\Omega)$. Moreover, the following estimate holds:

$$\|u\|_{H_0^1(\Omega)} \leq \frac{1}{\alpha} \|f\|_{H^{-1}(\Omega)}. \tag{7.3.3}$$

Proof. We have to verify the conditions of the Lax–Milgram Theorem. We start with coercivity. We use the positive definiteness of the matrix $A$ to obtain:

$$a[u, u] = \int_\Omega A(x)\nabla u \cdot \nabla u\, dx \geq \alpha \int_\Omega |\nabla u|^2\, dx = \alpha \|u\|_{H_0^1(\Omega)}^2.$$

Next we proceed with continuity. We use the $L^\infty$ bound on the coefficients $\{a_{ij}(x)\}_{i,j=1}^d$, together with the Cauchy–Schwarz inequality, to estimate:

$$a[u, v] = \int_\Omega \nabla v^T A \nabla u\, dx \leq \beta \int_\Omega |\nabla u||\nabla v|\, dx \leq \beta \|\nabla u\|_{L^2(\Omega)} \|\nabla v\|_{L^2(\Omega)} = \beta \|u\|_{H_0^1(\Omega)} \|v\|_{H_0^1(\Omega)}.$$

The bilinear form $a[u, v]$ satisfies the conditions of the Lax–Milgram Theorem and hence there exists a unique solution $u \in H_0^1(\Omega)$ of (7.3.2).

Now we prove estimate (7.3.3). We have:

$$\alpha \|u\|_{H_0^1(\Omega)}^2 \leq a[u, u] = \langle f, u \rangle \leq \|f\|_{H^{-1}(\Omega)} \|u\|_{H_0^1(\Omega)},$$

from which the estimate follows.

If $f \in L^2$ then the following bound is also useful.

Remark 7.4. Consider the problem (7.2.1) with $A \in M(\alpha, \beta, \Omega)$ and $f \in L^2(\Omega)$. Then

$$\|u\|_{H_0^1(\Omega)} \leq \frac{C_\Omega}{\alpha} \|f\|_{L^2(\Omega)},$$

where $C_\Omega$ is the Poincaré constant for the domain $\Omega$ defined in Theorem 2.21.
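The bound can be observed numerically in one dimension. The following sketch is an illustration under assumptions not made in the text: $\Omega = (0, 1)$, a conservative finite difference discretization, an oscillatory coefficient $A(x) = 1 + \frac{1}{2}\sin(2\pi x/\varepsilon)$ with $\alpha = \frac{1}{2}$, and right-hand side $f \equiv 1$. All function names are hypothetical.

```python
import math


def solve_dirichlet_1d(A, f, n):
    """Solve -(A u')' = f on (0, 1) with u(0) = u(1) = 0 on n interior grid points,
    using a conservative three-point scheme and the Thomas algorithm."""
    h = 1.0 / (n + 1)
    A_half = [A((k + 0.5) * h) for k in range(n + 1)]  # A at cell midpoints
    diag = [(A_half[i] + A_half[i + 1]) / h ** 2 for i in range(n)]
    off = [-A_half[i + 1] / h ** 2 for i in range(n - 1)]  # symmetric off-diagonal
    rhs = [f((i + 1) * h) for i in range(n)]
    for i in range(1, n):                  # forward elimination
        w = off[i - 1] / diag[i - 1]
        diag[i] -= w * off[i - 1]
        rhs[i] -= w * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):         # back substitution
        u[i] = (rhs[i] - off[i] * u[i + 1]) / diag[i]
    return h, u


def h10_norm(h, u):
    """Discrete analogue of the L^2 norm of u' (the H_0^1 norm used in the text)."""
    padded = [0.0] + u + [0.0]
    return math.sqrt(sum((padded[k + 1] - padded[k]) ** 2 for k in range(len(u) + 1)) / h)


def l2_norm(h, values):
    """Discrete L^2 norm of grid values."""
    return math.sqrt(h * sum(v * v for v in values))


alpha = 0.5
for eps in (0.5, 0.1, 0.02):               # hypothetical values of the small parameter
    h, u = solve_dirichlet_1d(lambda x: 1.0 + 0.5 * math.sin(2 * math.pi * x / eps),
                              lambda x: 1.0, 2000)
    print(eps, h10_norm(h, u), l2_norm(h, [1.0] * 2000) / alpha)
```

In this experiment the computed $H_0^1$ norms stay below $\|f\|_{L^2}/\alpha$ for every $\varepsilon$ tried, consistent with a Poincaré constant of at most one on the unit interval and with the $\varepsilon$-independent bound used for one-parameter families of Dirichlet problems.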


The bound (7.3.3) enables us to obtain information on the solution of one-parameter families of Dirichlet problems.

Theorem 7.5. Assume that there are α, β such that the one-parameter family of matrices $A^\varepsilon$ belongs to $M(\alpha,\beta,\Omega)$ for all ε > 0. Consider the Dirichlet problem

$$-\nabla\cdot(A^\varepsilon(x)\nabla u^\varepsilon) = f, \quad \text{for } x \in \Omega, \tag{7.3.4a}$$

$$u^\varepsilon(x) = 0, \quad \text{for } x \in \partial\Omega, \tag{7.3.4b}$$

with $f(x) \in H^{-1}(\Omega)$. Then there is a constant C independent of ε such that

$$\|u^\varepsilon\|_{H_0^1(\Omega)} \le C; \tag{7.3.5}$$

furthermore, there exists a subsequence $\{\varepsilon_n\}_{n\ge 0}$ and a $u(x) \in H_0^1(\Omega)$ such that

$$u^{\varepsilon_n}(x) \to u(x) \quad \text{strongly in } L^2(\Omega).$$

Proof. Estimate (7.3.3) implies (7.3.5). The Rellich compactness theorem, Theorem 2.19, implies that there exists a function $u \in H_0^1(\Omega)$ and a subsequence $\{\varepsilon'\} \subset \{\varepsilon\}$ such that u is the strong $L^2$-limit of $u^{\varepsilon'}$.
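The uniform bound (7.3.5) can be observed numerically in one dimension. The following is only a sketch under assumptions that are not part of the text: Ω = (0, 1), a hypothetical oscillatory coefficient $A^\varepsilon(x) = 2 + \sin(2\pi x/\varepsilon)$ (so that α = 1 and β = 3), f ≡ 1, a conservative finite-difference discretization, and the value 1/π for the Poincaré constant of the unit interval. The function names are illustrative.

```python
import math

def solve_dirichlet(a, f, n):
    """Finite-difference solution of -(a(x) u')' = f on (0,1), u(0)=u(1)=0.

    Returns the values of u at the n interior nodes x_i = i*h, h = 1/(n+1).
    """
    h = 1.0 / (n + 1)
    am = [a((i + 0.5) * h) for i in range(n + 1)]       # a at cell midpoints
    diag = [(am[i] + am[i + 1]) / h**2 for i in range(n)]
    off = [-am[i + 1] / h**2 for i in range(n - 1)]     # symmetric off-diagonal
    rhs = [f((i + 1) * h) for i in range(n)]
    # Thomas algorithm for the symmetric tridiagonal system
    for i in range(1, n):
        w = off[i - 1] / diag[i - 1]
        diag[i] -= w * off[i - 1]
        rhs[i] -= w * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - off[i] * u[i + 1]) / diag[i]
    return u

def h1_seminorm(u):
    """Discrete version of ||u'||_{L^2(0,1)} for the interior values u."""
    full = [0.0] + list(u) + [0.0]
    h = 1.0 / (len(full) - 1)
    return math.sqrt(sum((full[i + 1] - full[i]) ** 2 / h
                         for i in range(len(full) - 1)))

ALPHA = 1.0                       # lower bound of the assumed coefficient
BOUND = 1.0 / (math.pi * ALPHA)   # (C_Omega / alpha) * ||f||_{L^2} with f = 1

def norm_for(eps, n=799):
    a = lambda x: 2.0 + math.sin(2.0 * math.pi * x / eps)
    return h1_seminorm(solve_dirichlet(a, lambda x: 1.0, n))

norms = {eps: norm_for(eps) for eps in (0.5, 0.1, 0.02)}
```

In this sketch every entry of `norms` stays below `BOUND`, whatever the value of ε, which is the content of (7.3.5) combined with Remark 7.4 for this particular family.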

Remark 7.6. When studying homogenization for elliptic PDE, in Chapter 13, wewill be interested in finding the equation satisfied by u.

7.4 The Periodic Problem for Elliptic PDE

It is intuitively clear that the solution of the periodic problem can be determined only up to a constant. To ensure uniqueness we need to fix this constant: we work in H, the set of mean zero $H^1_{per}(Y)$ functions discussed above. We will use the notation $a_1[\cdot,\cdot]$ to denote the bilinear form

$$a_1[u,v] = \int_Y \nabla_y v \cdot A \nabla_y u \, dy \quad \forall\, u, v \in H. \tag{7.4.1}$$

Recall that we denote the pairing between H and its dual $H^*$ by $\langle\cdot,\cdot\rangle_{H,H^*}$. The structure of $H^*$ means that (7.2.2) has a unique solution only when f has mean zero.

Definition 7.7. We will say that u ∈ H is a weak solution of the boundary value problem (7.2.2) if

$$a_1[u,v] = \langle f, v\rangle_{H^*,H} \quad \forall\, v \in H. \tag{7.4.2}$$


Existence and uniqueness of weak solutions to (7.2.2) holds within the space H. Indeed, we have the following theorem.

Theorem 7.8. The problem (7.2.2) with $A \in M_{per}(\alpha,\beta,Y)$ and $f \in H^*$ has a unique weak solution u ∈ H. Moreover, the following estimate holds:

$$\|u\|_H \le \frac{1}{\alpha}\|f\|_{H^*}. \tag{7.4.3}$$

The proof is almost identical to that of Theorem 7.3 and so we omit it. The fact that (7.2.2) has a unique solution only when f has mean zero can also be shown by means of the Fredholm alternative, a topic that we now turn to.

7.5 The Fredholm Alternative for Second Order Uniformly Elliptic Operators in Divergence Form

In this section we prove that elliptic differential operators of the form

$$\mathcal{A} = -\nabla\cdot(A(x)\nabla) + b(x)\cdot\nabla, \tag{7.5.1}$$

with periodic coefficients and equipped with periodic boundary conditions, satisfy the Fredholm alternative. Notice that Theorem 2.42 does not apply directly to the operator $\mathcal{A}$ since it is an unbounded operator. The main idea will be to study the resolvent operator

$$R_{\mathcal{A}}(\lambda) = (\mathcal{A} + \lambda I)^{-1}, \tag{7.5.2}$$

where I stands for the identity operator on $L^2_{per}(Y)$ and λ > 0. We will prove that it is compact; consequently, Fredholm theory can be used.

Our assumptions on the coefficients of $\mathcal{A}$ are

$$A(x) \in M_{per}(\alpha,\beta,Y), \tag{7.5.3a}$$

$$A(x) = A^T(x), \tag{7.5.3b}$$

$$b(x) \in (C^1_{per}(Y))^d. \tag{7.5.3c}$$

The $L^2$-adjoint of $\mathcal{A}$ is

$$\mathcal{A}^* U = -\nabla\cdot(A(x)\nabla U) - \nabla\cdot(b(x)U), \tag{7.5.4}$$

also equipped with periodic boundary conditions. We want to study the PDE

$$\mathcal{A}u = f, \quad u(x) \text{ is 1-periodic} \tag{7.5.5}$$


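and its adjoint

$$\mathcal{A}^* U = F, \quad U(x) \text{ is 1-periodic}, \tag{7.5.6}$$

for $f, F \in L^2_{per}(Y)$.

Let $a[\cdot,\cdot] : H^1_{per}(Y) \times H^1_{per}(Y) \to \mathbb{R}$ and $a^*[\cdot,\cdot] : H^1_{per}(Y) \times H^1_{per}(Y) \to \mathbb{R}$ denote the bilinear forms associated with the operators $\mathcal{A}$ and $\mathcal{A}^*$, i.e.

$$a[u,v] = \int_Y \nabla v \cdot A\nabla u \, dx + \int_Y (b\cdot\nabla u)\, v \, dx \quad \forall\, u, v \in H^1_{per}(Y)$$

and

$$a^*[u,v] = \int_Y \nabla v \cdot A\nabla u \, dx - \int_Y \nabla\cdot(bu)\, v \, dx \quad \forall\, u, v \in H^1_{per}(Y),$$

respectively. As in Section 7.4, we will say that u and U are weak solutions of the PDE (7.5.5) and (7.5.6) provided that

$$a[u,v] = (f,v) \quad \forall\, v \in H^1_{per}(Y) \tag{7.5.7}$$

and

$$a^*[U,V] = (F,V) \quad \forall\, V \in H^1_{per}(Y), \tag{7.5.8}$$

respectively.

The main result of this section is contained in the next theorem.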

Theorem 7.9. (Fredholm Alternative for Periodic Elliptic PDE)

i) Assume conditions (7.5.3). Then precisely one of the following statements holds: either

(a) for every $f \in L^2_{per}(Y)$ there exists a unique solution of (7.5.5); or else

(b) there exists a nontrivial weak solution of the homogeneous problem

$$\mathcal{A}u = 0, \quad u(x) \text{ is 1-periodic}. \tag{7.5.9}$$

ii) When assertion (b) holds we have that

$$1 \le \dim\big(\mathrm{Null}(\mathcal{A})\big) = \dim\big(\mathrm{Null}(\mathcal{A}^*)\big) < \infty.$$

iii) Finally, the boundary value problem (7.5.5) has a weak solution if and only if

$$(f,U) = 0 \quad \forall\, U \in \mathrm{Null}(\mathcal{A}^*).$$


For the proof of this theorem we will need two lemmas.

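Lemma 7.10. Assume conditions (7.5.3). Then there exist constants ν, µ > 0 such that

$$|a[u,v]| \le \nu\|u\|_{H^1}\|v\|_{H^1},$$

and

$$\frac{\alpha}{2}\|u\|^2_{H^1} \le a[u,u] + \mu\|u\|^2_{L^2},$$

for all $u, v \in H^1_{per}(Y)$.

Proof. 1. We use the $L^\infty$ bounds on the coefficients A, b, together with the Cauchy–Schwarz inequality to deduce:

$$|a[u,v]| \le \left|\int_Y A\nabla u\cdot\nabla v \, dx + \int_Y (b\cdot\nabla u)\, v \, dx\right|$$

$$\le \|A\|_{L^\infty}\int_Y |\nabla u||\nabla v| \, dx + \|b\|_{L^\infty}\int_Y |\nabla u||v| \, dx$$

$$\le C\left(\|\nabla u\|_{L^2}\|\nabla v\|_{L^2} + \|\nabla u\|_{L^2}\|v\|_{L^2}\right)$$

$$\le C\|u\|_{H^1}\|v\|_{H^1}.$$

2. We use now the uniform ellipticity of A to compute:

$$\alpha\|\nabla u\|^2_{L^2} \le \int_Y (\nabla u)^T A\nabla u \, dx = a[u,u] - \int_Y (b\cdot\nabla u)\, u \, dx \le a[u,u] + \|b\|_{L^\infty}\int_Y |\nabla u||u| \, dx. \tag{7.5.10}$$

Now we make use of the algebraic inequality

$$ab \le \delta a^2 + \frac{1}{4\delta} b^2, \quad \forall\, \delta > 0.$$

Using this in the second term on the right hand side of (7.5.10) we obtain

$$\int_Y |\nabla u||u| \, dx \le \delta\|\nabla u\|^2_{L^2} + \frac{1}{4\delta}\|u\|^2_{L^2}. \tag{7.5.11}$$

We choose δ so that

$$\alpha - \|b\|_{L^\infty}\delta = \frac{\alpha}{2}.$$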


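We use inequality (7.5.11) with δ chosen as above in (7.5.10) to obtain

$$\frac{\alpha}{2}\|\nabla u\|^2_{L^2} \le a[u,u] + \frac{1}{4\delta}\|b\|_{L^\infty}\|u\|^2_{L^2}.$$

We add now $\frac{\alpha}{2}\|u\|^2_{L^2}$ on both sides of the above inequality to obtain

$$\frac{\alpha}{2}\|u\|^2_{H^1} \le a[u,u] + \mu\|u\|^2_{L^2},$$

with

$$\mu = \frac{1}{4\delta}\|b\|_{L^\infty} + \frac{\alpha}{2} = \frac{1}{2\alpha}\|b\|^2_{L^\infty} + \frac{\alpha}{2},$$

where the second equality uses $\delta = \alpha/(2\|b\|_{L^\infty})$.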

Lemma 7.11. Assume conditions (7.5.3). Take µ from Lemma 7.10. Then for every λ ≥ µ and each function $f \in L^2_{per}(Y)$ there exists a unique weak solution $u \in H^1_{per}(Y)$ of the problem

$$(\mathcal{A} + \lambda I)u = f, \quad u(x) \text{ is 1-periodic}. \tag{7.5.12}$$

Proof. Let λ ≥ µ. Define the operator

$$\mathcal{A}_\lambda := \mathcal{A} + \lambda I. \tag{7.5.13}$$

The bilinear form associated to $\mathcal{A}_\lambda$ is

$$a_\lambda[u,v] = a[u,v] + \lambda(u,v) \quad \forall\, u, v \in H^1_{per}(Y). \tag{7.5.14}$$

Now, Lemma 7.10 together with our assumption that λ ≥ µ implies that the bilinear form $a_\lambda[u,v]$ is continuous and coercive on $H^1_{per}(Y)$. Hence the Lax–Milgram Theorem applies¹ and we deduce the existence and uniqueness of a solution $u \in H^1_{per}(Y)$ of the equation

$$a_\lambda[u,v] = (f,v), \quad \forall\, v \in H^1_{per}(Y). \tag{7.5.15}$$

This is precisely the weak formulation of the boundary value problem (7.5.12).

¹ We have that ⟨f, v⟩ = (f, v) defines a bounded linear functional on $H^1_{per}(Y)$.

Proof of Theorem 7.9. 1. By Lemma 7.11 there exists, for every $g \in L^2_{per}(Y)$, a unique solution $u \in H^1_{per}(Y)$ of

$$a_\mu[u,v] = (g,v), \quad \forall\, v \in H^1_{per}(Y). \tag{7.5.16}$$


We use the resolvent operator defined in (7.5.2) to write the solution of (7.5.16) in the following form:

$$u = R_{\mathcal{A}}(\mu) g. \tag{7.5.17}$$

Consider now equation (7.5.5). We add the term µu on both sides of this equation to obtain

$$\mathcal{A}_\mu u = \mu u + f,$$

where $\mathcal{A}_\mu$ is defined in (7.5.13). The weak formulation of this equation is

$$a_\mu[u,v] = (\mu u + f, v), \quad \forall\, v \in H^1_{per}(Y).$$

We can rewrite this as an integral equation (see (7.5.17))

$$u = R_{\mathcal{A}}(\mu)(\mu u + f),$$

or, equivalently,

$$(I - K)u = h,$$

where

$$K := \mu R_{\mathcal{A}}(\mu), \quad h = R_{\mathcal{A}}(\mu) f.$$

2. Now we claim that the operator $K : L^2_{per}(Y) \to L^2_{per}(Y)$ is compact. Indeed, let u be the solution of (7.5.16) which is given by (7.5.17). We use the second estimate in Lemma 7.10, the definition of the bilinear form (7.5.14) and the Cauchy–Schwarz inequality in (7.5.16) to obtain

$$\frac{\alpha}{2}\|u\|^2_{H^1} \le a_\mu[u,u] = (g,u) \le \|g\|_{L^2}\|u\|_{L^2} \le \|g\|_{L^2}\|u\|_{H^1}.$$

Consequently,

$$\|u\|_{H^1} \le C\|g\|_{L^2}.$$

We use now (7.5.17), the definition of K and the above estimate to deduce that

$$\mu\|u\|_{H^1} = \|Kg\|_{H^1} \le C\mu\|g\|_{L^2}. \tag{7.5.18}$$

By the Rellich compactness theorem $H^1_{per}(Y)$ is compactly embedded in $L^2_{per}(Y)$ and consequently estimate (7.5.18) implies that K maps bounded sets in $L^2_{per}(Y)$ into compact ones in $L^2_{per}(Y)$. Hence, it is a compact operator.

3. We apply now the Fredholm alternative (Theorem 2.42) to the operator K: either


a. there exists a unique $u \in L^2_{per}(Y)$ such that

$$(I - K)u = h, \tag{7.5.19}$$

or

b. there exists a nontrivial solution $u \in L^2_{per}(Y)$ of the homogeneous equation

$$(I - K)u = 0. \tag{7.5.20}$$

Let us assume that (7.5.19) holds. From the preceding analysis we deduce that there exists a unique weak solution $u \in H^1_{per}(Y)$ of (7.5.5). Assume now that (7.5.20) holds. Let N and N∗ denote the dimensions of the null spaces of I − K and I − K∗, respectively. From Theorem 2.42 we know that N = N∗. Moreover, it is straightforward to prove that

$$u \in \mathcal{N}(I - K) \;\Leftrightarrow\; a[u,\phi] = 0, \quad \forall\, \phi \in H^1_{per}(Y)$$

and

$$v \in \mathcal{N}(I - K^*) \;\Leftrightarrow\; a^*[v,\phi] = 0, \quad \forall\, \phi \in H^1_{per}(Y).$$

Thus, the Fredholm alternative for K implies the Fredholm alternative for $\mathcal{A}$ (within the context of weak solutions).

4. Now we prove the third part of the theorem. Let $v \in \mathcal{N}(I - K^*)$. By Theorem 2.42 we know that (7.5.19) has a solution if and only if

$$(h,v) = 0 \quad \forall\, v \in \mathcal{N}(I - K^*).$$

We compute

$$(h,v) = (R_{\mathcal{A}}(\mu)f, v) = \frac{1}{\mu}(Kf, v) = \frac{1}{\mu}(f, K^*v) = \frac{1}{\mu}(f,v).$$

Hence, problem (7.5.5) has a weak solution if and only if

$$(f,v) = 0 \quad \forall\, v \in \mathcal{N}(\mathcal{A}^*).$$

This completes the proof of the theorem.


Example 7.12. Let $f \in L^2_{per}(Y)$ and assume that A(x) satisfies assumptions (7.5.3a) and (7.5.3b). Then the problem

$$a_1[u,v] = (f,v), \quad \forall\, v \in H^1_{per}(Y),$$

where $a_1[\cdot,\cdot]$ was defined in (7.4.1), has a unique solution u ∈ H if and only if

$$\langle f, 1\rangle = 0. \tag{7.5.21}$$

Indeed, consider the homogeneous adjoint equation

$$\mathcal{A}^* U = 0.$$

Clearly, the constant function (say, U = 1) is a solution of this equation. The uniform ellipticity of the matrix A(x) implies that any solution satisfies

$$\int_Y |\nabla U|^2 \, dx = 0,$$

and hence, the constant solution is unique. Since assumptions (7.5.3a) and (7.5.3b) are satisfied, Theorem 7.9 applies and the result follows.
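The solvability condition (7.5.21) has a simple discrete analogue, sketched below under assumptions that are not in the text: one space dimension, Y = (0, 1), a hypothetical periodic coefficient A(x) = 2 + cos(2πx), and a conservative finite-difference discretization of −(A u′)′. For any grid function u the discretized right-hand side f = 𝒜u sums to zero, so only mean-zero data can lie in the range of the operator, and constants are mapped to zero.

```python
import math
import random

def apply_periodic_operator(a, u):
    """Apply the discretized operator -(a(x) u')' with 1-periodic boundary conditions."""
    n = len(u)
    h = 1.0 / n
    out = []
    for i in range(n):
        a_right = a((i + 0.5) * h)          # coefficient between nodes i and i+1
        a_left = a((i - 0.5) * h)           # coefficient between nodes i-1 and i
        flux_right = a_right * (u[(i + 1) % n] - u[i]) / h
        flux_left = a_left * (u[i] - u[i - 1]) / h   # u[-1] wraps around periodically
        out.append(-(flux_right - flux_left) / h)
    return out

coefficient = lambda x: 2.0 + math.cos(2.0 * math.pi * x)   # assumed, 1-periodic, >= 1

random.seed(0)
u = [random.uniform(-1.0, 1.0) for _ in range(64)]
f = apply_periodic_operator(coefficient, u)
mean_f = sum(f) / len(f)                    # discrete analogue of (f, 1)

constant = [1.0] * 64
image_of_constant = apply_periodic_operator(coefficient, constant)
```

Here `mean_f` vanishes up to rounding error because the fluxes telescope around the periodic cell, which mirrors the integration by parts behind (7.5.21).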

7.6 Parabolic PDE

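In this section we describe the basic theory of parabolic PDE, in non-divergence form. Specifically we study the initial value (Cauchy) problem

$$\frac{\partial u}{\partial t} = \mathcal{L}u + F(x,t), \quad \text{for } (x,t) \in \mathbb{R}^d \times [0,T], \tag{7.6.1a}$$

$$u(x,0) = f(x), \quad \text{for } x \in \mathbb{R}^d, \tag{7.6.1b}$$

where

$$\mathcal{L} := b(x)\cdot\nabla + \frac{1}{2}A(x) : \nabla\nabla. \tag{7.6.2}$$

We will make the following assumptions.

$$b(x) \in C_b^\infty(\mathbb{R}^d, \mathbb{R}^d); \tag{7.6.3a}$$

$$A(x) \in C_b^\infty(\mathbb{R}^d, \mathbb{R}^{d\times d}); \tag{7.6.3b}$$

$$F(x,t) \in C_0^\infty(\mathbb{R}^d \times [0,T]); \tag{7.6.3c}$$

$$A \in M(\alpha,\beta,\mathbb{R}^d). \tag{7.6.3d}$$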


Theorem 7.13. Under assumptions (7.6.3) there exists a unique solution $u(x,t) \in C^{1,2}([0,t]\times\mathbb{R}^d) \cap C^\infty([0,t)\times\mathbb{R}^d)$ to the Cauchy problem (7.6.1). Furthermore the following estimate holds:

$$\|u\|_{L^\infty(\mathbb{R}^d\times[0,t])} \le \|f\|_{L^\infty(\mathbb{R}^d)} + \int_0^t \|F(\cdot,s)\|_{L^\infty(\mathbb{R}^d)} \, ds. \tag{7.6.4}$$

Estimate (7.6.4) is a consequence of the maximum principle for parabolic PDE.

The solution u(x, t) in the above theorem is a classical solution. The weak formulation of the Cauchy problem (7.6.1) is obtained by multiplying the equation by a smooth, compactly supported function and integrating by parts. See Exercise 5.
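Estimate (7.6.4) has a discrete counterpart that is easy to observe. The sketch below is an illustration under assumptions not made in the text: d = 1, b ≡ 0 and A ≡ 1 (so that the operator is one half of the second derivative), a time-independent forcing, a large periodic box standing in for the real line, and an explicit finite-difference scheme whose time step is small enough for each update to be a convex combination of neighbouring values.

```python
import math

def step(u, forcing, dt, dx):
    """One explicit Euler step for u_t = 0.5*u_xx + F on a periodic grid."""
    n = len(u)
    r = 0.5 * dt / dx**2
    return [u[i] + r * (u[(i + 1) % n] - 2.0 * u[i] + u[i - 1]) + dt * forcing[i]
            for i in range(n)]

n, length = 200, 20.0
dx = length / n
dt = 0.5 * dx**2                 # r = 0.25 <= 0.5 keeps the scheme monotone
xs = [-length / 2 + i * dx for i in range(n)]
u = [math.exp(-x * x) for x in xs]                          # initial condition f
forcing = [0.3 * math.exp(-(x - 1.0) ** 2) for x in xs]     # assumed forcing F

sup_f = max(abs(v) for v in u)
sup_F = max(abs(v) for v in forcing)
t, worst_gap = 0.0, float("inf")
for _ in range(400):
    u = step(u, forcing, dt, dx)
    t += dt
    bound = sup_f + t * sup_F                # right-hand side of (7.6.4)
    worst_gap = min(worst_gap, bound - max(abs(v) for v in u))
```

The quantity `worst_gap` records the smallest margin by which the discrete solution respects the bound over the run; it stays non-negative for this scheme, as the maximum principle suggests.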

7.7 Hyperbolic PDE

In this section we investigate some basic properties of solutions to the Cauchy problem for the linear transport PDE

$$\frac{\partial u}{\partial t} + \nabla\cdot(a(x)u) = 0, \quad (x,t) \in \mathbb{R}^d \times \mathbb{R}^+, \tag{7.7.1a}$$

$$u = f(x), \quad (x,t) \in \mathbb{R}^d \times \{0\}, \tag{7.7.1b}$$

where $a(x) : \mathbb{R}^d \to \mathbb{R}^d$ is a smooth vector field and f(x) is a smooth scalar field. Notice that equation (7.7.1a) is the Liouville equation for the differential equation

$$\frac{dx}{dt} = a(x).$$

In this section we will define an appropriate concept of solution for (7.7.1), we will state the basic existence and uniqueness theorem and derive some basic energy estimates.

As is always the case, the weak formulation of (7.7.1) involves multiplication by a smooth function, integration over $\mathbb{R}^+ \times \mathbb{R}^d$ and integration by parts. We have the following definition.

Definition 7.14. Assume that $a(x) \in C_b^\infty(\mathbb{R}^d, \mathbb{R}^d)$ and that $f(x) \in C_b^\infty(\mathbb{R}^d)$. A function $u(x,t) \in L^\infty(\mathbb{R}^+ \times \mathbb{R}^d)$ is a weak solution of (7.7.1) provided that

$$\int_{\mathbb{R}^+}\int_{\mathbb{R}^d} \left(\frac{\partial\phi(x,t)}{\partial t} + a(x)\cdot\nabla_x\phi(x,t)\right) u(x,t) \, dx\,dt + \int_{\mathbb{R}^d} f(x)\phi(x,0) \, dx = 0, \tag{7.7.2}$$

for every $\phi(x,t) \in C_0^\infty(\mathbb{R}^+ \times \mathbb{R}^d)$.


Theorem 7.15. Assume that $a(x) \in C_b^\infty(\mathbb{R}^d, \mathbb{R}^d)$ and that $f(x) \in C_b^\infty(\mathbb{R}^d)$. Then there exists a unique weak solution of (7.7.1).

The proof uses the method of characteristics introduced in Chapter 4.

7.8 Discussion and Bibliography

The material of this section is pretty standard and can be found in many books on partial differential equations and functional analysis, such as [25, 56, 45, 129]. Our treatment of the Dirichlet problem follows closely [45, Ch. 6]. Our discussion about periodic boundary conditions is based on [30, Ch. 4]. The Fredholm theory for the Dirichlet problem is developed in [45, Sec. 6.2.3].

In the case where the data of the problem is regular enough so that the Dirichlet problem (7.2.1) admits a classical solution (i.e. a function $u \in C^2(\Omega) \cap C^0(\bar\Omega)$ satisfying (7.2.1)), then the weak and classical solutions coincide. See e.g. [56].

We saw in this chapter that the analysis of operators in divergence form is based on energy methods within appropriate function spaces. On the other hand, for PDE in non-divergence form techniques based on the maximum principle are more suitable. The maximum principle for elliptic PDE is studied in [56] and for parabolic PDE in [125]. Of course, provided that the coefficients are $C^1$, we can rewrite a divergence form PDE in non-divergence form and vice versa, by introducing terms that involve first order derivatives. Operators in non-divergence form appear more natural in the probabilistic theory of diffusion, as generators of Markov processes. The dimensionality of the null space of the $L^2$-adjoint of a non-divergence form second order uniformly elliptic operator is related to the ergodic theory of Markov processes, see [116, 120].

Turning now to parabolic PDE, the proof of Theorem 7.13 can be found in [51]; see also [52]. Similar theorems hold for parabolic PDE with time-dependent coefficients. Parabolic PDE can also be studied using probabilistic methods. Indeed, under appropriate assumptions on the coefficients, the solution of (7.6.1) admits a probabilistic interpretation, after noting that it is the backward Kolmogorov equation for an SDE, when F ≡ 0. Probabilistic proofs of existence and uniqueness of solutions to parabolic PDE can be found in [28]. In this book the case of unbounded coefficients is also studied.


7.9 Exercises

1. Use the Lax–Milgram lemma to prove Theorem 7.8.

2. Use Theorem 7.9 to derive Theorem 7.8.

3. State and prove a result analogous to that discussed in Remark 7.4 for the periodic problem (7.2.2).

4. Prove the Fredholm alternative for operator $\mathcal{A}$ defined in (7.5.1) under assumptions analogous to (7.5.3), but adapted to the case of Dirichlet boundary conditions.

5. Consider the parabolic PDE

$$\frac{\partial u}{\partial t} = \mathcal{L}u + F(x), \quad \text{for } (x,t) \in \Omega \times [0,T], \tag{7.9.1a}$$

$$u(x,t) = 0, \quad \text{for } x \in \partial\Omega, \tag{7.9.1b}$$

$$u(x,0) = f(x), \quad \text{for } x \in \Omega, \tag{7.9.1c}$$

where

$$\mathcal{L} := \Delta. \tag{7.9.2}$$

Formulate a notion of weak solution analogous to that developed in the elliptic case in the previous chapter.

6. Prove that the equation (7.9.1) has a unique steady solution $\bar u(x)$. Prove that $u(x,t) \to \bar u(x)$ as t → ∞.

7. Use the method of characteristics to solve the equation

$$\frac{\partial u(x,t)}{\partial t} + a(x)\frac{\partial u(x,t)}{\partial x} = b(x), \quad (x,t) \in \mathbb{R} \times \mathbb{R}^+,$$

$$u(x,0) = 0, \quad (x,t) \in \mathbb{R} \times \{0\}.$$

8. Use the method of characteristics to solve Burgers' equation

$$\frac{\partial u(x,t)}{\partial t} + u(x,t)\frac{\partial u(x,t)}{\partial x} = 0, \quad (x,t) \in \mathbb{R} \times \mathbb{R}^+,$$

$$u(x,0) = 0, \quad (x,t) \in \mathbb{R} \times \{0\}.$$


Part II

Perturbation Expansions


Chapter 8

Invariant Manifolds for ODE

8.1 Introduction

Perhaps the simplest situation where variable reduction occurs in dynamical systems is that of attractive invariant manifolds. These manifolds slave one subset of the variables to another. In this chapter we describe a situation where attractive invariant manifolds can be constructed in scale-separated systems, by means of perturbation expansions.

8.2 Full Equations

We consider ODE and write z solving (4.1.1) as $z = (x^T, y^T)^T$, where

$$\frac{dx}{dt} = f(x,y), \qquad \frac{dy}{dt} = \frac{1}{\varepsilon} g(x,y), \tag{8.2.1}$$

and ε ≪ 1. Here $x \in \mathcal{X}$ and $y \in \mathcal{Y}$.

Let $\varphi^t_x(y)$ be the solution operator of the fast dynamics with x viewed as a fixed parameter. To be precise, for any $\xi \in \mathcal{X}$, let

$$\frac{d}{dt}\varphi^t_\xi(y) = g(\xi, \varphi^t_\xi(y)), \qquad \varphi^0_\xi(y) = y. \tag{8.2.2}$$

We assume that

$$\lim_{t\to\infty} \varphi^t_\xi(y) = \eta(\xi) \tag{8.2.3}$$


exists, is independent of y and that the convergence is uniform in ξ. Roughly speaking y(t) solving (8.2.1) is given by $y(t) \approx \varphi^{t/\varepsilon}_{x(0)}(y(0))$ for times t which are small compared with 1, so that x has not evolved very much. If we then look at scales which are large compared with ε, so that y is close to its equilibrium point (for example if $t = O(\varepsilon^{1/2})$), then we deduce that $y(t) \approx \eta(x(0))$. This is the mechanism by which y becomes slaved to x and we now seek to make these heuristics more precise.

Notice that the generator $\mathcal{L}$ for (8.2.1) has the form

$$\mathcal{L} = \frac{1}{\varepsilon}\mathcal{L}_0 + \mathcal{L}_1 \tag{8.2.4}$$

where

$$\mathcal{L}_0 = g(x,y)\cdot\nabla_y, \qquad \mathcal{L}_1 = f(x,y)\cdot\nabla_x.$$

In particular, $\mathcal{L}_0$ is the generator of a process on $\mathcal{Y}$ for each fixed x.

Now consider the following PDE for v(y, t) in which x is viewed as a fixed parameter:

$$\frac{\partial v}{\partial t} = \mathcal{L}_0 v, \qquad v(y,0) = \phi(y). \tag{8.2.5}$$

Result 4.4 shows that

$$v(y,t) = \phi(\varphi^t_x(y)).$$

Thus

$$v(y,t) \to \phi(\eta(x)), \quad \text{as } t \to \infty. \tag{8.2.6}$$

This is related to ergodicity, as equation (8.2.6) shows that the function v(y, t) exhibits no dependence on initial data, asymptotically as t → ∞. Compare with the discussion on ergodicity in Chapter 4 and Theorem 4.7 in particular.

8.3 Simplified Equations

Define the vector field $F_0(x)$ by

$$F_0(x) = f(x, \eta(x)).$$

Result 8.1. For ε ≪ 1 and t = O(1), x solving (4.1.1) is approximated by X solving

$$\frac{dX}{dt} = F_0(X). \tag{8.3.1}$$


This is the leading order approximation in ε. Keeping the next order yields the refined approximation

$$\frac{dX}{dt} = F_0(X) + \varepsilon F_1(X), \tag{8.3.2}$$

where

$$F_1(x) = \nabla_y f(x,\eta(x)) \big(\nabla_y g(x,\eta(x))\big)^{-1} \nabla_x\eta(x) f(x,\eta(x)).$$

This approximation requires that $\nabla_y g(x,\eta(x))$ is invertible.

8.4 Derivation

The method used to find these simplified equations is to seek an approximate invariant manifold for the system. Furthermore, we assume that the manifold can be represented as a graph over x, namely y = Ψ(x). Such a graph is invariant under the dynamics if

$$\frac{dy}{dt} = \nabla\Psi(x(t))\frac{dx}{dt},$$

whenever y = Ψ(x). This implies that Ψ must solve the nonlinear PDE

$$\frac{1}{\varepsilon} g(x,\Psi(x)) = \nabla\Psi(x) f(x,\Psi(x)).$$

We seek solutions to this equation as a power series

$$\Psi(x) = \Psi_0(x) + \varepsilon\Psi_1(x) + O(\varepsilon^2).$$

This is our first example of a multiscale expansion.

Substituting and equating successive powers of ε to zero yields the hierarchy

$$O(1/\varepsilon): \quad g(x,\Psi_0(x)) = 0,$$

$$O(1): \quad \nabla_y g(x,\Psi_0(x))\Psi_1(x) = \nabla\Psi_0(x) f(x,\Psi_0(x)).$$

Notice that equations (8.2.2) and (8.2.3) together imply that g(ξ, η(ξ)) = 0 for all ξ. Hence the O(1/ε) equation above may be satisfied by choosing $\Psi_0(x) = \eta(x)$, giving the approximation (8.3.1). Since the rate of convergence in (8.2.3) is assumed to be uniform it is natural to assume that y = η(ξ) is a hyperbolic equilibrium point of (8.2.2), so that $\nabla_y g(x,\eta(x))$ is invertible. Setting $\Psi_0(x) = \eta(x)$ in the O(1) equation, and inverting, yields

$$\Psi_1(x) = \nabla_y g(x,\eta(x))^{-1}\nabla\eta(x) f(x,\eta(x)).$$


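Thus

$$f(x,\Psi(x)) = f(x,\Psi_0(x) + \varepsilon\Psi_1(x)) + O(\varepsilon^2)$$

$$= f(x,\Psi_0(x)) + \varepsilon\nabla_y f(x,\Psi_0(x))\Psi_1(x) + O(\varepsilon^2)$$

$$= f(x,\eta(x)) + \varepsilon\nabla_y f(x,\eta(x))\Psi_1(x) + O(\varepsilon^2)$$

and the refined approximation (8.3.2) follows.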

8.5 Applications

8.5.1 Linear Fast Dynamics

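A structure arising in many applications is where the frozen x dynamics, given by $\varphi^t_\xi(\cdot)$, is linear. As a simple example consider the equations

$$\frac{dx}{dt} = f(x,y), \qquad \frac{dy}{dt} = -\frac{y}{\varepsilon} + \frac{g(x)}{\varepsilon}. \tag{8.5.1}$$

Here d = 2 and $\mathcal{X} = \mathcal{Y} = \mathbb{R}$, $\mathcal{Z} = \mathbb{R}^2$. It is straightforward to show that

$$\varphi^t_\xi(y) = e^{-t}y + \int_0^t e^{s-t} g(\xi) \, ds = e^{-t}y + (1 - e^{-t})g(\xi).$$

Hence (8.2.3) is satisfied for η(·) = g(·).

The simplified equation given by Result 8.1 is hence

$$\frac{dX}{dt} = f(X, g(X)).$$

Using the fact that the right hand side of the fast equation, −y + g(x), has derivative −1 with respect to y, we see that the more refined approximation (8.3.2) is

$$\frac{dX}{dt} = f(X,g(X))\left(1 - \varepsilon\frac{\partial f}{\partial y}(X,g(X))\, g'(X)\right).$$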
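The two approximations can be compared numerically on a model problem. The sketch below assumes, purely for illustration, the choices f(x, y) = −y and g(x) = x in (8.5.1), the value ε = 0.01, initial data x(0) = y(0) = 1, and a classical Runge–Kutta integrator with a step small enough to resolve the fast scale; none of these choices come from the text. With these choices the leading order equation is dX/dt = −X and the refined equation is dX/dt = −(1 + ε)X.

```python
EPS = 0.01

def rk4(rhs, state, t_end, steps):
    """Classical fourth-order Runge-Kutta integration of state' = rhs(state)."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs([s + 0.5 * h * k for s, k in zip(state, k1)])
        k3 = rhs([s + 0.5 * h * k for s, k in zip(state, k2)])
        k4 = rhs([s + h * k for s, k in zip(state, k3)])
        state = [s + h * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state

# Assumed model problem: f(x, y) = -y and g(x) = x in (8.5.1).
full = lambda z: [-z[1], (-z[1] + z[0]) / EPS]
leading = lambda z: [-z[0]]                    # dX/dt = f(X, g(X))
refined = lambda z: [-z[0] * (1.0 + EPS)]      # dX/dt = f (1 - eps f_y g')

x_full = rk4(full, [1.0, 1.0], 1.0, 2000)[0]
x_leading = rk4(leading, [1.0], 1.0, 2000)[0]
x_refined = rk4(refined, [1.0], 1.0, 2000)[0]
err_leading = abs(x_full - x_leading)
err_refined = abs(x_full - x_refined)
```

For this linear example the slow eigenvalue of the full system is −1 − ε + O(ε²), so `err_leading` is of order ε while `err_refined` is of order ε², consistent with the orders claimed for (8.3.1) and (8.3.2).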

8.5.2 Large Time Dynamics

The statement of the result concerning simplified dynamics concerns approximation of x(t) on O(1) time intervals with respect to ε. However in many cases the


results extend naturally to the infinite time domain. The following example illustrates this idea.

Consider the equations

$$\frac{dx_1}{dt} = -x_2 - x_3,$$

$$\frac{dx_2}{dt} = x_1 + \frac{1}{5}x_2,$$

$$\frac{dx_3}{dt} = \frac{1}{5} + y - 5x_3,$$

$$\frac{dy}{dt} = -\frac{y}{\varepsilon} + \frac{x_1 x_3}{\varepsilon}, \tag{8.5.2}$$

so that $\mathcal{X} = \mathbb{R}^3$ and $\mathcal{Y} = \mathbb{R}$. Result 8.1 indicates that x should be well approximated by X solving the Rössler system

$$\frac{dX_1}{dt} = -X_2 - X_3,$$

$$\frac{dX_2}{dt} = X_1 + \frac{1}{5}X_2,$$

$$\frac{dX_3}{dt} = \frac{1}{5} + X_3(X_1 - 5). \tag{8.5.3}$$

A comparison of the numerically generated attractors for both systems shows that, when projected onto the common space $\mathcal{X} = \mathbb{R}^3$, they are very close.

8.5.3 Centre Manifold

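Consider the equations

$$\frac{dx}{dt} = \lambda x + \sum_{i=0}^{2} a_i x^i y^{2-i},$$

$$\frac{dy}{dt} = x - y + \sum_{i=0}^{2} b_i x^i y^{2-i}.$$

At λ = 0 an eigenvalue of the system linearized at the origin passes through zero and so we expect to find a bifurcation and a resulting centre manifold. To construct this manifold set

$$x \to \varepsilon x, \quad y \to \varepsilon y, \quad \lambda \to \varepsilon\lambda, \quad t \to \varepsilon^{-1}t.$$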


This corresponds to looking for small amplitude solutions, close to the fixed point at the origin, at parameter values close to the bifurcation values. Such solutions evolve slowly and hence time is rescaled. The equations become

$$\frac{dx}{dt} = \lambda x + \sum_{i=0}^{2} a_i x^i y^{2-i},$$

$$\frac{dy}{dt} = \frac{1}{\varepsilon}(x - y) + \sum_{i=0}^{2} b_i x^i y^{2-i}.$$

A perturbation expansion gives the invariant manifold y = x and we obtain the centre manifold

$$\frac{dX}{dt} = \lambda X + AX^2$$

where $A = \sum_{i=0}^{2} a_i$.
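The rescaling step can be spelled out as follows. This is a sketch of the computation, with tildes (a notation introduced here, not in the text) marking the rescaled quantities, so that $x = \varepsilon\tilde x$, $y = \varepsilon\tilde y$, $\lambda = \varepsilon\tilde\lambda$ and $t = \varepsilon^{-1}\tilde t$:

```latex
\begin{align*}
\frac{dx}{dt} &= \varepsilon^2 \frac{d\tilde x}{d\tilde t}
  = \varepsilon^2 \tilde\lambda \tilde x
  + \varepsilon^2 \sum_{i=0}^{2} a_i \tilde x^i \tilde y^{2-i}
  &&\Longrightarrow&
  \frac{d\tilde x}{d\tilde t} &= \tilde\lambda \tilde x
  + \sum_{i=0}^{2} a_i \tilde x^i \tilde y^{2-i}, \\
\frac{dy}{dt} &= \varepsilon^2 \frac{d\tilde y}{d\tilde t}
  = \varepsilon (\tilde x - \tilde y)
  + \varepsilon^2 \sum_{i=0}^{2} b_i \tilde x^i \tilde y^{2-i}
  &&\Longrightarrow&
  \frac{d\tilde y}{d\tilde t} &= \frac{1}{\varepsilon}(\tilde x - \tilde y)
  + \sum_{i=0}^{2} b_i \tilde x^i \tilde y^{2-i}.
\end{align*}
% Leading order: the fast equation forces \tilde y = \tilde x, and then
% \sum_{i=0}^{2} a_i \tilde x^i \tilde x^{2-i} = (a_0 + a_1 + a_2)\,\tilde x^2 = A \tilde x^2 .
```

Dropping the tildes gives the rescaled system, which has the form (8.2.1) with fast vector field x − y at order 1/ε, so that η(x) = x and Result 8.1 yields the reduced equation with $A = a_0 + a_1 + a_2$.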

8.6 Discussion and Bibliography

Early studies of the reduction of ODE with an attracting slow manifold into differential-algebraic equations include the independent work of Levinson and of Tikhonov (see O'Malley [113] and Tikhonov et al. [149]). A spectral gap sufficiently large relative to the size of the nonlinear terms is used in the construction of local stable, unstable and center manifolds (e.g., Carr [27], Wiggins [154]), slow manifolds (Kreiss [83]) and inertial manifolds (Constantin et al. [32]).

8.7 Exercises

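1. Consider the equations

$$\frac{dx}{dt} = \lambda x + a_0 x^3 + a_1 xy,$$

$$\frac{dy}{dt} = -y + \sum_{i=0}^{2} b_i x^i y^{2-i}.$$

Show that the scaling

$$x \to \varepsilon x, \quad y \to \varepsilon^2 y, \quad \lambda \to \varepsilon^2\lambda, \quad t \to \varepsilon^{-2}t$$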


puts this system in a form to which the perturbation techniques of this section apply. Deduce that the centre manifold has the form

$$\frac{dX}{dt} = \lambda X + AX^3$$

where $A = a_0 + a_1 b_2$.

2. Consider the equations

$$\frac{dx}{dt} = Ax + \varepsilon f_0(x,y),$$

$$\frac{dy}{dt} = -\frac{1}{\varepsilon}By + g_0(x,y),$$

for ε ≪ 1 and $x \in \mathbb{R}^l$, $y \in \mathbb{R}^{d-l}$. Assume that B is symmetric positive-definite. Find the first three terms in an expansion for an invariant manifold representing y as a graph over x.


Chapter 9

Averaging for Markov Chains

9.1 Introduction

Perhaps the simplest setting in which to expose variable elimination for stochastic dynamical problems is to work in the setting of Markov chains. In this context it is natural to study situations where a subset of the variables evolves rapidly compared with the remainder, and can be replaced by their averaged effect.

9.2 Full Equations

We work in the set-up of Chapter 5 and consider the backward equation for $v_i(t) = \mathbb{E}\big(\phi(z(t)) \mid z(0) = i\big)$, namely

$$\frac{dv}{dt} = Qv. \tag{9.2.1}$$

We assume that the generator Q¹ takes the form

$$Q = \frac{1}{\varepsilon}Q_0 + Q_1, \tag{9.2.2}$$

with ε ≪ 1. We study situations where the state space is indexed by two variables, x and y, and the leading order contribution in Q, namely $Q_0$, corresponds to fast ergodic dynamics in y, with x frozen. Averaging over y then gives the effective reduced dynamics for x.

¹ In this chapter we denote the generator by Q rather than L because we use index l for the state-space; thus we wish to avoid confusion with the components of the generator.

The precise situation is as follows. Our state space is $I := I_x \times I_y$ with $I_x, I_y \subseteq \{1, 2, \cdots\}$. We let q((i, k), (j, l)) denote the element of the generator


associated with transition from $(i,k) \in I_x \times I_y$ to $(j,l) \in I_x \times I_y$.² Consider now a family of Markov chains on $I_y$, indexed by $i \in I_x$. We write the generator as $A_0(i)$ with entries $a_0(k,l;i)$; the indices denote transition from $k \in I_y$ to $l \in I_y$ for given fixed $i \in I_x$. We assume that, for each $i \in I_x$, $A_0(i)$ generates an ergodic Markov chain on $I_y$. Hence $A_0$ has a one-dimensional null-space and for each $i \in I_x$³

$$\sum_l a_0(k,l;i) = 0, \quad (i,k) \in I_x \times I_y,$$

$$\sum_k \rho^\infty(k;i) a_0(k,l;i) = 0, \quad (i,l) \in I_x \times I_y. \tag{9.2.3}$$

This is the index form of equations (5.4.2) with Q replaced by $A_0(i)$. Without loss of generality we choose the normalization

$$\sum_k \rho^\infty(k;i) = 1, \quad \forall\, i \in I_x.$$

Thus $\rho^\infty(i) = \{\rho^\infty(k;i)\}_{k\in I_y}$ is a probability density for each $i \in I_x$.

² In this chapter, and in Chapter 16, we will not use suffices to denote the dependence on the state space as the double-indexing makes this a cluttered notation. Hence we use q((i, k), (j, l)) rather than $q_{(i,k),(j,l)}$.

³ Summation is always over indices in $I_x$ or $I_y$ in this chapter. It should be clear from the context which of the two sets is being summed over.

Similarly to the above we introduce the generators of a Markov chain on $I_x$, parameterized by $k \in I_y$. We denote the generator by $A_1(k)$ with entries $a_1(i,j;k)$; the indices denote transition from $i \in I_x$ to $j \in I_x$, for each fixed $k \in I_y$. With this notation for the $A_0, A_1$ we introduce generators $Q_0, Q_1$ of Markov chains on $I_x \times I_y$ by

$$q_0((i,k),(j,l)) = a_0(k,l;i)\delta_{ij}, \qquad q_1((i,k),(j,l)) = a_1(i,j;k)\delta_{kl}. \tag{9.2.4}$$

Here $\delta_{ij}$ is the usual Kronecker delta. In $Q_0$ (resp. $Q_1$) the Kronecker delta represents the fact that no transitions are taking place in $I_x$ (resp. $I_y$).

To confirm that these are indeed generators notice that non-diagonal entries (i, k) ≠ (j, l) are non-negative because $A_0$ and $A_1$ are generators. Also

$$\sum_{j,l} q_0((i,k),(j,l)) = \sum_{j,l} a_0(k,l;i)\delta_{ij} = \sum_l a_0(k,l;i) = 0$$


by (9.2.3). A similar calculation shows that

$$\sum_{j,l} q_1((i,k),(j,l)) = 0,$$

using the fact that

$$\sum_j a_1(i,j;k) = 0, \quad \forall\, (i,k) \in I_x \times I_y,$$

since $A_1(k)$ is a generator for each fixed k. Thus $Q_0, Q_1$ are also the generators of Markov chains. Finally note that any linear combination of generators, via positive scalar constants, will also be a generator. Hence (9.2.2) defines a generator for any ε > 0.

9.3 Simplified Equations

We define the generator $\bar{Q}_1$ of a Markov chain on $I_x$ by:

$$\bar{q}_1(i,j) = \sum_k \rho^\infty(k;i) a_1(i,j;k). \tag{9.3.1}$$

Notice that $\bar{q}_1(i,j) \ge 0$ for i ≠ j because $\rho^\infty(k;i) \ge 0$ and $a_1(i,j;k) \ge 0$ for i ≠ j. Furthermore

$$\sum_j \bar{q}_1(i,j) = \sum_k \rho^\infty(k;i) \sum_j a_1(i,j;k) = 0.$$

Hence $\bar{Q}_1$ is the generator of a Markov chain.

Result 9.1. Consider equation (9.2.1) under assumption (9.2.2). Then for ε ≪ 1 and t = O(1) the implied dynamics in x is approximately Markovian with generator $\bar{Q}_1$ and governed by the backward equation

$$\frac{dv_0}{dt} = \bar{Q}_1 v_0. \tag{9.3.2}$$

As discussed above, $\bar{Q}_1$ is the generator of a Markov chain on $I_x$ alone, and the dynamics in $I_y$ has been eliminated through averaging. We now provide justification for this elimination of variables, by means of perturbation expansion.


9.4 Derivation

The method used is to show that the backward equation for the full Markov chain in $(x,y) \in I_x \times I_y$ can be approximated by the backward equation

$$\frac{dv_0}{dt} = \bar{Q}_1 v_0$$

for x alone.

We consider equation (9.2.1) under (9.2.2). We have the backward equation

$$\frac{dv}{dt} = \left(\frac{1}{\varepsilon}Q_0 + Q_1\right)v.$$

Unlike the previous chapter, where we solved a nonlinear PDE containing a small parameter ε, here the problem is linear. In the following five chapters, all our perturbation expansions are for similar linear equations. The derivation here is hence prototypical of what follows.

We seek solutions v = v(i, k, t) in the form of the multiscale expansion

$$v = v_0 + \varepsilon v_1 + O(\varepsilon^2). \tag{9.4.1}$$

Substituting and equating powers of ε we find

$$O(1/\varepsilon): \quad Q_0 v_0 = 0, \tag{9.4.2a}$$

$$O(1): \quad Q_0 v_1 = -Q_1 v_0 + \frac{dv_0}{dt}. \tag{9.4.2b}$$

By (9.2.3) we deduce from (9.4.2a) that $v_0$ is independent of $k \in I_y$. Abusing notation, we write

$$v_0(i,k,t) = v_0(i,t)\mathbf{1}(k) \tag{9.4.3}$$

where $\mathbf{1}(k) = 1$ for all $k \in I_y$. The operator $Q_0$ is singular and hence for (9.4.2b) to have a solution, the Fredholm alternative implies that we must impose the solvability condition

$$-Q_1 v_0 + \frac{dv_0}{dt} \perp \mathrm{Null}\, Q_0^T. \tag{9.4.4}$$

From (9.2.3) we deduce that the null space of $Q_0^T$ is characterized by

$$\sum_{k,i} \rho^\infty(k;i) c(i) q_0((i,k),(j,l)) = 0, \tag{9.4.5}$$


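for any vector $c = c(i)$ on $I_x$. Using (9.4.3) we find that

\begin{align*}
\frac{dv_0}{dt} - Q_1 v_0 &= \frac{dv_0}{dt}(i,t)\mathbf{1}(k) - \sum_{j,l} a_1(i,j;k)\,\delta_{kl}\, v_0(j,t)\mathbf{1}(l) \\
&= \Bigl(\frac{dv_0}{dt}(i,t) - \sum_j a_1(i,j;k)\, v_0(j,t)\Bigr)\mathbf{1}(k).
\end{align*}

Imposing the solvability condition (9.4.4) by means of (9.4.5) we obtain

\[
\sum_{k,i} \rho^\infty(k;i)\, c(i) \Bigl(\frac{dv_0}{dt}(i,t) - \sum_j a_1(i,j;k)\, v_0(j,t)\Bigr) = 0,
\]

which, since $\sum_k \rho^\infty(k;i) = 1$ and $\bar{q}_1(i,j) = \sum_k \rho^\infty(k;i)\, a_1(i,j;k)$, implies that

\[
\sum_i c(i) \Bigl(\frac{dv_0}{dt}(i,t) - \sum_j \bar{q}_1(i,j)\, v_0(j,t)\Bigr) = 0.
\]

Since $c$ is an arbitrary vector on $I_x$ we deduce that each component of the sum over $i$ is zero. This yields (9.3.2).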

9.5 Application

Consider a simple example where $I_x = I_y = \{1, 2\}$. Thus we have a four state Markov chain on $I = I_x \times I_y$. We assume that the generators of the Markov chains on $I_y$ and $I_x$ are given by

\[
A_0(i) = \begin{pmatrix} -\theta_i & \theta_i \\ \phi_i & -\phi_i \end{pmatrix}
\]

and

\[
A_1(k) = \begin{pmatrix} -\alpha_k & \alpha_k \\ \beta_k & -\beta_k \end{pmatrix},
\]

respectively. In the first (resp. second) of these Markov chains $i \in I_x$ (resp. $k \in I_y$) is a fixed parameter. The parameters $\theta_i, \phi_i, \alpha_k, \beta_k$ are all non-negative.

If we order the four states of the Markov chain as $(1,1), (1,2), (2,1), (2,2)$ then the generators $Q_0$ and $Q_1$ are given by

\[
Q_0 = \begin{pmatrix}
-\theta_1 & \theta_1 & 0 & 0 \\
\phi_1 & -\phi_1 & 0 & 0 \\
0 & 0 & -\theta_2 & \theta_2 \\
0 & 0 & \phi_2 & -\phi_2
\end{pmatrix} \tag{9.5.1}
\]

and

\[
Q_1 = \begin{pmatrix}
-\alpha_1 & 0 & \alpha_1 & 0 \\
0 & -\alpha_2 & 0 & \alpha_2 \\
\beta_1 & 0 & -\beta_1 & 0 \\
0 & \beta_2 & 0 & -\beta_2
\end{pmatrix}. \tag{9.5.2}
\]

Note that any linear combination of $Q_0$ and $Q_1$ will have zeros along the anti-diagonal and hence the same is true of $Q$; this reflects the fact that, by construction, transitions in both $I_x$ and $I_y$ do not happen simultaneously.

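The invariant density of the Markov chain with generator $A_0(i)$ is $\rho^\infty(i) = (\lambda_i, 1 - \lambda_i)^T$ with $\lambda_i = \phi_i/(\theta_i + \phi_i)$. Recall that the averaged Markov chain on $I_x$ has generator $\bar{Q}_1$ with entries

\begin{align*}
\bar{q}_1(i,j) &= \sum_k \rho^\infty(k;i)\, a_1(i,j;k) \\
&= \lambda_i\, a_1(i,j;1) + (1-\lambda_i)\, a_1(i,j;2).
\end{align*}

Thus

\[
\bar{Q}_1 = \begin{pmatrix}
-\lambda_1\alpha_1 - (1-\lambda_1)\alpha_2 & \lambda_1\alpha_1 + (1-\lambda_1)\alpha_2 \\
\lambda_2\beta_1 + (1-\lambda_2)\beta_2 & -\lambda_2\beta_1 - (1-\lambda_2)\beta_2
\end{pmatrix}. \tag{9.5.3}
\]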
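The averaging formula for the two-by-two generator can be checked numerically. The following is a minimal sketch, assuming made-up values for the rates $\theta_i, \phi_i, \alpha_k, \beta_k$ (they are not taken from the text); the helper names `a1`, `rho_inf` and `averaged_generator` are hypothetical. It builds the averaged generator entry by entry from the invariant densities of $A_0(i)$.

```python
# Illustrative (made-up) rates for the four-state example of Section 9.5.
theta = {1: 2.0, 2: 0.5}
phi = {1: 1.0, 2: 1.5}
alpha = {1: 0.3, 2: 0.7}
beta = {1: 0.4, 2: 0.9}
STATES = (1, 2)


def a1(i, j, k):
    """Entry (i, j) of A1(k) = [[-alpha_k, alpha_k], [beta_k, -beta_k]]."""
    rows = {1: (-alpha[k], alpha[k]), 2: (beta[k], -beta[k])}
    return rows[i][j - 1]


def rho_inf(k, i):
    """Invariant density of A0(i): (lambda_i, 1 - lambda_i)."""
    lam = phi[i] / (theta[i] + phi[i])
    return lam if k == 1 else 1.0 - lam


def averaged_generator():
    """Entries sum_k rho_inf(k; i) * a1(i, j; k) as a 2x2 nested list."""
    return [[sum(rho_inf(k, i) * a1(i, j, k) for k in STATES) for j in STATES]
            for i in STATES]


Q_bar = averaged_generator()
for row in Q_bar:
    print(row)
```

Each row of `Q_bar` should sum to zero and the off-diagonal entries should be non-negative, which is the generator property noted at the end of Section 9.3.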

9.6 Discussion and Bibliography

Two recent monographs where multiscale problems for Markov chains are studied are [156], [155]. See also [136] for a broad discussion of averaging and dimension reduction in stochastic dynamics. Markov chain approximations for SDEs, especially in the large deviation limit, are studied in [50].

9.7 Exercises

1. Find a multiscale expansion for the invariant measure of the Markov chain with generator $Q = \frac{1}{\varepsilon} Q_0 + Q_1$ when $Q_0$, $Q_1$ are given by (9.5.1), (9.5.2).

2. Find the invariant measure of $\bar{Q}_1$ and interpret your findings in the light of your answer to the previous question.


Chapter 10

Averaging for ODE and SDE

10.1 Introduction

Here we take the averaging principle developed in the previous chapter for Markov chains, and apply it to ODE and SDE. The unifying theme is the approximate solution of the backward equation by perturbation expansion, and consequent elimination of variables.

10.2 Full Equations

We write $z$ solving (6.1.1) as $z = (x^T, y^T)^T$ and consider the case where

\begin{align}
\frac{dx}{dt} &= f(x,y), \notag \\
\frac{dy}{dt} &= \frac{1}{\varepsilon} g(x,y) + \frac{1}{\sqrt{\varepsilon}} \beta(x,y) \frac{dV}{dt}, \tag{10.2.1}
\end{align}

with $\varepsilon \ll 1$ and $V$ a standard Brownian motion. Here $x \in \mathcal{X}$, $y \in \mathcal{Y}$, $z \in \mathcal{Z}$ as discussed in sections 4.1 and 6.1.

Our starting point is to analyze the behavior of the fast dynamics with $x$ being a fixed parameter. In Chapter 8 we considered systems in which the fast dynamics converge to an $x$-dependent fixed point. This gives rise to a situation where the $y$ variables are “slaved” to the $x$ variables. Averaging generalizes this idea to situations where the dynamics in the $y$ variable, with $x$ fixed, is more complex. As in the previous chapter on Markov chains, we average out the fast variable, over an appropriate invariant measure.


Let $\varphi^t_x(y)$ be the solution operator of the fast dynamics with $x$ a fixed parameter. To be precise, for fixed $\xi$,

\[
\frac{d}{dt}\varphi^t_\xi(y) = g(\xi, \varphi^t_\xi(y)) + \beta(\xi, \varphi^t_\xi(y))\frac{dV}{dt}, \quad \varphi^0_\xi(y) = y. \tag{10.2.2}
\]

As in Chapter 8, $y(t)$ solving (10.2.1) is given by $y(t) \approx \varphi^{t/\varepsilon}_{x(0)}(y(0))$ for times $t$ which are small compared with 1, so that $x$ has not evolved very much. If (10.2.2) is ergodic then $y(t)$ will traverse its $\xi$-dependent invariant measure in any time large compared to $\varepsilon$, and it is natural to average $y$ in the $x$ equation against this invariant measure with $\xi = x$. We now make these heuristics precise.

In the case where $\beta \equiv 0$, $\varphi^t_\xi(y)$ coincides with the solution of (8.2.2). When $\beta \ne 0$, note that $\varphi^t_\xi(y)$ depends on the Brownian motion $\{V(s)\}_{s \in [0,t]}$ and hence is a random variable. Rather than assuming convergence to a fixed point, as we did in (8.2.3), we assume here that $\varphi^t_\xi(y)$ is ergodic. To be precise we assume that the measure defined by

\[
\mu_x(A) = \lim_{T \to \infty} \frac{1}{T} \int_0^T I_A(\varphi^t_x(y))\, dt, \quad A \subseteq \mathcal{Y}, \tag{10.2.3}
\]

exists, for $I_A$ the indicator function of arbitrary Borel sets $A \subseteq \mathcal{Y}$. When working with an SDE ($\beta \ne 0$) it is natural to assume that $\mu_x(\cdot)$ has a density, $\mu_x(dy) = \rho^\infty(y;x)\,dy$. However we will illustrate by means of an example arising in Hamiltonian mechanics that this assumption is not necessary. Note that the situation in Chapter 8 corresponds to $\mu_x(dy) = \delta(y - \eta(x))\,dy$; that is, the ergodic invariant measure is a Dirac mass centered on the invariant manifold.

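We define the generators

\begin{align}
\mathcal{L}_0 &= g(x,y) \cdot \nabla_y + \frac{1}{2} B(x,y) : \nabla_y \nabla_y, \tag{10.2.4} \\
\mathcal{L}_1 &= f(x,y) \cdot \nabla_x, \notag
\end{align}

where $B(x,y) = \beta(x,y)\beta(x,y)^T$. In terms of the generator $\mathcal{L}_0$, our ergodicity assumption becomes the statement that $\mathcal{L}_0$ has a one-dimensional null space characterized by

\begin{align}
\mathcal{L}_0 \mathbf{1}(y) &= 0, \notag \\
\mathcal{L}_0^* \rho^\infty(y;x) &= 0. \tag{10.2.5}
\end{align}

Here $\mathbf{1}(y)$ denotes constants in $y$.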


In the case where $\mathcal{Y} = \mathbb{T}^d$ the operators $\mathcal{L}_0$ and $\mathcal{L}_0^*$ are equipped with periodic boundary conditions. The interrelations amongst the characterizations of ergodicity that we use here are made precise in the case where $B(x,y)$ is strictly positive-definite, uniformly in $(x,y) \in \mathcal{X} \times \mathcal{Y}$, in Theorem 6.10. In more general situations, such as when $\mathcal{Y} = \mathbb{R}^d$, similar rigorous justifications are possible, but the functional setting is more complicated, typically employing weighted $L^p$-spaces which characterize the decay of the invariant density at infinity.

10.3 Simplified Equations

Define a vector field $F$ by

\[
F(x) = \int_{\mathcal{Y}} f(x,y)\, \mu_x(dy). \tag{10.3.1}
\]

Result 10.1. For $\varepsilon \ll 1$ and $t = O(1)$, $x$ solving (10.2.1) is approximated by $X$ solving

\[
\frac{dX}{dt} = F(X). \tag{10.3.2}
\]

Note that, in the case we considered in Chapter 8, where $\mu_x(dy) = \delta(y - \eta(x))\,dy$, we obtain

\[
F(x) = f(x, \eta(x)).
\]

This is precisely the vector field in (8.3.1) and so the simplified equations in Chapter 8 are a special case of those derived here. We will derive Result 10.1 in the case where $\beta$ is non-zero and we will use the density given by $\mu_x(dy) = \rho^\infty(y;x)\,dy$.

The expression for $F$ given above is useful for theoretical purposes, and when the measure $\mu_x$ is known explicitly. However, for numerical purposes the following representation is often useful:

Result 10.2. An alternative representation of $F(x)$ is via a time-average:

\[
F(x) = \lim_{T \to \infty} \frac{1}{T} \int_0^T f(x, \varphi^t_x(y(0)))\, dt.
\]

This representation is found by using (10.2.3) to evaluate (10.3.1).
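To make the time-average representation concrete, here is a minimal sketch under assumptions that are not in the text: the frozen fast dynamics is taken to be the rotation $dy/dt = 1$ on the unit circle (so $\beta = 0$ and the invariant measure is Lebesgue measure), and `f` is a hypothetical slow vector field whose space average is $F(x) = -\tfrac{3}{2}x$. The function `time_average` approximates the integral in Result 10.2 by the midpoint rule.

```python
import math


def f(x, y):
    """Hypothetical slow vector field, 1-periodic in y; its y-average is -1.5 * x."""
    return -x * (1.0 + math.cos(2.0 * math.pi * y)) ** 2


def flow(t, y0):
    """Frozen fast dynamics dy/dt = 1 on the unit circle: phi^t(y0) = y0 + t mod 1."""
    return (y0 + t) % 1.0


def time_average(x, y0, T, n):
    """Midpoint-rule approximation of (1/T) * int_0^T f(x, phi^t(y0)) dt."""
    h = T / n
    return sum(f(x, flow((m + 0.5) * h, y0)) for m in range(n)) * h / T


estimate = time_average(2.0, 0.3, 200.7, 100000)
print(estimate, -1.5 * 2.0)
```

The horizon `T` is deliberately not a whole number of periods, so the agreement with the space average reflects the limit $T \to \infty$ rather than exact cancellation over full periods.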


10.4 Derivation

As for Markov chains, we derive the averaged equations by working with the backward equation. Let

\[
v(x,y,t) = \mathbb{E}\bigl(\phi(x(t), y(t)) \mid x(0) = x,\ y(0) = y\bigr).
\]

The backward equation (6.3.3) for the SDE (10.2.1) is

\[
\frac{\partial v}{\partial t} = \frac{1}{\varepsilon} \mathcal{L}_0 v + \mathcal{L}_1 v. \tag{10.4.1}
\]

Here $\mathcal{L}_0, \mathcal{L}_1$ are given by (10.2.4) and $z$ in (6.3.3) is $(x,y)$ here. Note that $\mathcal{L}_0$ is a differential operator in $y$, in which $x$ appears as a parameter. Thus we must equip it with boundary conditions. For simplicity we consider the case where $\mathcal{Y} = \mathbb{T}^d$ so that periodic boundary conditions are used. Note however that other functional settings are also possible; the key in what follows is application of the Fredholm alternative to operator equations defined through $\mathcal{L}_0$.

We seek a solution to (10.4.1) in the form of the multiscale expansion

\[
v = v_0 + \varepsilon v_1 + O(\varepsilon^2)
\]

and obtain

\begin{align}
O(1/\varepsilon): \quad & \mathcal{L}_0 v_0 = 0, && v_0 \text{ is 1-periodic}, \tag{10.4.2a} \\
O(1): \quad & \mathcal{L}_0 v_1 = -\mathcal{L}_1 v_0 + \frac{\partial v_0}{\partial t}, && v_1 \text{ is 1-periodic}. \tag{10.4.2b}
\end{align}

Equation (10.4.2a) implies that $v_0$ is in the null space of $\mathcal{L}_0$ and hence, by (10.2.5) and ergodicity, is a function only of $(x,t)$. The Fredholm alternative for (10.4.2b) shows that

\[
-\mathcal{L}_1 v_0 + \frac{\partial v_0}{\partial t} \perp \mathrm{Null}\, \mathcal{L}_0^*.
\]

By (10.2.5) this implies that

\[
\int_{\mathcal{Y}} \rho^\infty(y;x) \Bigl(\frac{\partial v_0}{\partial t}(x,t) - f(x,y) \cdot \nabla_x v_0(x,t)\Bigr) dy = 0.
\]

Since $\rho^\infty$ is a probability density we have $\int_{\mathcal{Y}} \rho^\infty(y;x)\,dy = 1$. Hence

\[
\frac{\partial v_0}{\partial t} - \Bigl(\int_{\mathcal{Y}} f(x,y)\, \rho^\infty(y;x)\, dy\Bigr) \cdot \nabla_x v_0(x,t) = 0
\]

so that by (10.3.1),

\[
\frac{\partial v_0}{\partial t} - F(x) \cdot \nabla_x v_0 = 0.
\]

This is the backward equation for (10.3.2); indeed the method of characteristics as given in Result 4.4 shows that we have the required result.
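The approximation can be observed on a toy problem. The sketch below is illustrative only and rests on assumptions not made in the text: a deterministic fast rotation $dy/dt = 1/\varepsilon$ (so $\beta = 0$, outside the setting of the derivation above but within the scope of Result 10.1), a hypothetical slow field $f(x,y) = -x(1 + \cos 2\pi y)$ with average $F(x) = -x$, and the value `EPS = 0.01`. It integrates the full system with a classical Runge-Kutta scheme and records the largest gap to the averaged solution $X(t) = x_0 e^{-t}$.

```python
import math

EPS = 0.01  # assumed scale separation


def rhs(x, y):
    """Full system: dx/dt = -x (1 + cos(2 pi y)), dy/dt = 1 / EPS."""
    return -x * (1.0 + math.cos(2.0 * math.pi * y)), 1.0 / EPS


def max_deviation(x0=1.0, t_end=1.0, n=20000):
    """RK4 on the full system; returns max over steps of |x(t) - x0 * exp(-t)|."""
    h = t_end / n
    x, y, worst = x0, 0.0, 0.0
    for step in range(1, n + 1):
        k1x, k1y = rhs(x, y)
        k2x, k2y = rhs(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
        k3x, k3y = rhs(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
        k4x, k4y = rhs(x + h * k3x, y + h * k3y)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        worst = max(worst, abs(x - x0 * math.exp(-step * h)))
    return worst


print(max_deviation())
```

For this particular example the gap is of order $\varepsilon$ on the interval considered; the rate for other problems depends on the structure of the fast dynamics.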

10.5 Applications

We consider two applications of the averaging principle, the first in the context of SDE, and the second in the context of Hamiltonian ODE.

10.5.1 A Skew-Product SDE

Consider the equations

\begin{align}
\frac{dx}{dt} &= (1 - y^2)x, \notag \\
\frac{dy}{dt} &= -\frac{\alpha}{\varepsilon} y + \sqrt{\frac{\lambda}{\varepsilon}}\, \frac{dV}{dt}. \tag{10.5.1}
\end{align}

Here $\mathcal{X} = \mathcal{Y} = \mathbb{R}$. It is of interest to know whether $x$ will explode in time, or remain bounded. We can get insight into this question in the limit $\varepsilon \to 0$ by deriving the averaged equations. The ergodic measure for $y$ is a Gaussian $N(0, \frac{\lambda}{2\alpha})$. Note that this measure does not depend on $x$ and hence has density $\rho^\infty(y)$ only. The averaged vector field $F$ is here defined by

\[
F(x) = \Bigl(1 - \int_{\mathbb{R}} \rho^\infty(y)\, y^2\, dy\Bigr) x
\]

where $\rho^\infty$ is the density associated with the Gaussian $N(0, \frac{\lambda}{2\alpha})$. Thus

\[
\int_{\mathbb{R}} \rho^\infty(y)\, y^2\, dy = \frac{\lambda}{2\alpha}
\]

and

\[
F(x) = \Bigl(1 - \frac{\lambda}{2\alpha}\Bigr) x.
\]

Hence trajectories of $x$ will explode if $\lambda < 2\alpha$ and will contract if $\lambda > 2\alpha$. If $\lambda = 2\alpha$ then the averaged vector field is zero. In this situation we need to rescale time $t \mapsto t/\varepsilon$ to obtain the problem

\begin{align*}
\frac{dx}{dt} &= \frac{1}{\varepsilon}(1 - y^2)x, \\
\frac{dy}{dt} &= -\frac{\alpha}{\varepsilon^2} y + \sqrt{\frac{2\alpha}{\varepsilon^2}}\, \frac{dV}{dt}.
\end{align*}

In this time rescaling nontrivial dynamics occur. SDE of this form are the topic of Chapter 11, and this specific example is considered in section 11.6.
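The averaged growth rate $1 - \lambda/(2\alpha)$ in the example (10.5.1) can be estimated by a time average along one path of the fast process, in the spirit of Result 10.2. The following is a sketch under stated assumptions: the parameter values, step size and seed are arbitrary choices, the function name `averaged_growth_rate` is hypothetical, and the fast Ornstein-Uhlenbeck process is advanced with its exact Gaussian transition over each step rather than with an Euler scheme.

```python
import math
import random


def averaged_growth_rate(alpha, lam, eps=1e-2, t_end=50.0, h=1e-3, seed=0):
    """Time-average of 1 - y^2 along one path of the fast process in (10.5.1)."""
    rng = random.Random(seed)
    # Exact OU transition over a step h: y -> decay * y + noise * N(0, 1)
    decay = math.exp(-alpha * h / eps)
    noise = math.sqrt(lam / (2.0 * alpha) * (1.0 - decay * decay))
    n = int(t_end / h)
    y, total = 0.0, 0.0
    for _ in range(n):
        y = decay * y + noise * rng.gauss(0.0, 1.0)
        total += 1.0 - y * y
    return total / n


# lam < 2 * alpha: the estimate should be near 1 - lam / (2 * alpha) = 0.5
print(averaged_growth_rate(alpha=1.0, lam=1.0))
# lam > 2 * alpha: the estimate should be near 1 - 4 / 2 = -1
print(averaged_growth_rate(alpha=1.0, lam=4.0))
```

The sign of the estimate distinguishes the growing regime from the contracting one; the statistical error shrinks as `t_end / eps` grows.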

10.5.2 Hamiltonian Mechanics

In many applications Hamiltonian systems with strong potential forces, responsible for fast, small amplitude, oscillations around a constraining sub-manifold, are encountered. It is then of interest to describe the evolution of the slowly evolving degrees of freedom by averaging over the rapidly oscillating variables. We give an example of this. The example is interesting because it shows that the formalism of this chapter can be extended to pure ordinary differential equations, with no noise present; it also illustrates that it is possible to deal with situations where the limiting measure $\mu$ retains some memory of initial conditions, in this case the total energy of the system.

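Consider a two-particle system with Hamiltonian

\[
H(x, p, y, v) = \frac{1}{2}(p^2 + v^2) + V(x) + \frac{\omega(x)}{2\varepsilon^2}\, y^2,
\]

where $(x, y)$ are the coordinates and $(p, v)$ are the conjugate momenta of the two particles, $V(x)$ is a non-negative potential and $\omega(x)$ is assumed to satisfy $\omega(x) \ge \bar{\omega} > 0$ for all $x$. The corresponding equations of motion are

\begin{align*}
\frac{dx}{dt} &= p, \\
\frac{dp}{dt} &= -V'(x) - \frac{\omega'(x)}{2\varepsilon^2}\, y^2, \\
\frac{dy}{dt} &= v, \\
\frac{dv}{dt} &= -\frac{\omega(x)}{\varepsilon^2}\, y.
\end{align*}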

We let $E$ denote the value of the energy $H$ at time $t = 0$. We assume that the energy $E$ is bounded independently of $\varepsilon$ and this implies that $y^2 \le 2\varepsilon^2 E/\bar{\omega}$. Hence the solution approaches the submanifold $y = 0$ as $\varepsilon \to 0$. Note, however, that $y$ appears in the combination $y/\varepsilon$ in the $x$ equations and in the expression for the energy $H$. Thus it is natural to make the change of variables $\eta = y/\varepsilon$. The equations then read

\begin{align}
\frac{dx}{dt} &= p, \notag \\
\frac{dp}{dt} &= -V'(x) - \frac{\omega'(x)}{2}\, \eta^2, \notag \\
\frac{d\eta}{dt} &= \frac{1}{\varepsilon}\, v, \notag \\
\frac{dv}{dt} &= -\frac{\omega(x)}{\varepsilon}\, \eta. \tag{10.5.2}
\end{align}

In these variables we recover a system of the form (10.2.1) with “slow” variables, $x \leftarrow (x, p)$, and “fast” variables, $y \leftarrow (\eta, v)$. It is instructive to write the equation in second order form as

\begin{align*}
\frac{d^2 x}{dt^2} + V'(x) + \frac{1}{2}\omega'(x)\eta^2 &= 0, \\
\frac{d^2 \eta}{dt^2} + \frac{1}{\varepsilon^2}\omega(x)\eta &= 0.
\end{align*}

The fast equations represent a harmonic oscillator whose frequency, $\omega^{1/2}(x)$, is modulated by the $x$ variables.

Consider the fast dynamics, with $(x, p)$ frozen. The total energy for this fast dynamics is

\[
\frac{1}{2}v^2 + \frac{\omega(x)}{2}\eta^2.
\]

The initial energy of the fast system, which is conserved, is found by subtracting the energy associated with the frozen variables from the total energy of the original system, and we denote it by

\[
E_{\mathrm{fast}} = E - \frac{1}{2}p^2 - V(x).
\]

For fixed $x, p$ the dynamics in $\eta, v$ is confined to this energy shell, which we denote by $\mathcal{Y}(x, p)$.

Harmonic oscillators satisfy an equi-partition property whereby, on average, the energy is equally distributed between its kinetic and potential contributions (the virial theorem). Thus the average of the potential energy of the fast oscillator against the ergodic measure $\mu_{x,p}$ is

\[
\int_{\mathcal{Y}(x,p)} \frac{\omega(x)}{2}\, \eta^2\, \mu_{x,p}(d\eta, dv) = \frac{1}{2}\Bigl[E - \frac{1}{2}p^2 - V(x)\Bigr].
\]

Thus

\[
\int_{\mathcal{Y}(x,p)} \frac{1}{2}\, \eta^2\, \mu_{x,p}(d\eta, dv) = \frac{1}{2\omega(x)}\Bigl[E - \frac{1}{2}p^2 - V(x)\Bigr].
\]

Here $(x, p)$ are viewed as fixed parameters and the total energy $E$ is specified by the initial data of the whole system. The averaging principle states that the rapidly varying $\eta^2$ in the equation for $p$ can be approximated by its time-average, giving rise to a closed system of equations for $(X, P) \approx (x, p)$. These are

\begin{align}
\frac{dX}{dt} &= P, \notag \\
\frac{dP}{dt} &= -V'(X) - \frac{\omega'(X)}{2\omega(X)}\Bigl[E - \frac{1}{2}P^2 - V(X)\Bigr], \tag{10.5.3}
\end{align}

with initial data $E$, $X(0) = X_0$ and $P(0) = P_0$. It is verified below that $(X, P)$ satisfying (10.5.3) conserve the following adiabatic invariant

\[
J = \frac{1}{\omega^{1/2}(X)}\Bigl[E - \frac{1}{2}P^2 - V(X)\Bigr].
\]

Thus, (10.5.3) reduces to the simpler form

\begin{align}
\frac{dX}{dt} &= P, \notag \\
\frac{dP}{dt} &= -V'(X) - J_0\, [\omega^{1/2}(X)]', \tag{10.5.4}
\end{align}

where $J_0$ is given by

\[
J_0 = \frac{1}{\omega^{1/2}(X_0)}\Bigl[E - \frac{1}{2}P_0^2 - V(X_0)\Bigr].
\]

This means that the influence of the stiff potential on the slow variables is to replace the potential $V(x)$ by an effective potential,

\[
V_{\mathrm{eff}}(x) = V(x) + J_0\, \omega^{1/2}(x).
\]


Note that the limiting equation contains memory of the initial conditions for the fast variables, through the constant $J_0$. Thus the situation differs slightly from that covered by the conjunction of Results 10.1 and 10.2.

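To verify that $J$ is indeed conserved in time, note that, from the definition of $J$,

\begin{align*}
\frac{d}{dt}\Bigl(\omega^{1/2}(X) J\Bigr) &= \frac{d}{dt}\Bigl(E - \frac{1}{2}P^2 - V(X)\Bigr) \\
&= \frac{P\omega'(X)}{2\omega(X)}\Bigl(E - \frac{1}{2}P^2 - V(X)\Bigr) \\
&= \frac{P\omega'(X)}{2\omega^{1/2}(X)}\, J.
\end{align*}

But, from the equations of motion for $(X, P)$,

\begin{align*}
\frac{d}{dt}\Bigl(\omega^{1/2}(X) J\Bigr) &= \frac{1}{2}\frac{\omega'(X)}{\omega^{1/2}(X)}\frac{dX}{dt}\, J + \omega^{1/2}(X)\frac{dJ}{dt} \\
&= \frac{P\omega'(X)}{2\omega^{1/2}(X)}\, J + \omega^{1/2}(X)\frac{dJ}{dt}.
\end{align*}

Equating the two expressions gives

\[
\frac{dJ}{dt} = 0
\]

since $\omega(X)$ is strictly positive.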
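Conservation of $J$ along solutions of (10.5.3) can also be observed numerically. The sketch below is a hedged illustration: the choices $V(x) = x^2/2$, $\omega(x) = 1 + x^2$, the energy value `E_TOTAL` and the initial data are assumptions made only for this example, and the integrator is a standard fourth-order Runge-Kutta scheme rather than anything prescribed by the text.

```python
import math

E_TOTAL = 2.0  # total energy E of the original system (assumed value)


def V(x):
    return 0.5 * x * x          # hypothetical slow potential


def dV(x):
    return x


def omega(x):
    return 1.0 + x * x          # hypothetical stiffness, bounded below by 1


def domega(x):
    return 2.0 * x


def rhs(X, P):
    """Right-hand side of the averaged system (10.5.3)."""
    e_fast = E_TOTAL - 0.5 * P * P - V(X)
    return P, -dV(X) - domega(X) / (2.0 * omega(X)) * e_fast


def adiabatic_invariant(X, P):
    """J = (E - P^2 / 2 - V(X)) / omega(X)^(1/2)."""
    return (E_TOTAL - 0.5 * P * P - V(X)) / math.sqrt(omega(X))


def integrate(X, P, t_end, n):
    """Classical fourth-order Runge-Kutta with n equal steps."""
    h = t_end / n
    for _ in range(n):
        k1x, k1p = rhs(X, P)
        k2x, k2p = rhs(X + 0.5 * h * k1x, P + 0.5 * h * k1p)
        k3x, k3p = rhs(X + 0.5 * h * k2x, P + 0.5 * h * k2p)
        k4x, k4p = rhs(X + h * k3x, P + h * k3p)
        X += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        P += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
    return X, P


J_start = adiabatic_invariant(0.5, 0.0)
X_end, P_end = integrate(0.5, 0.0, 10.0, 10000)
J_end = adiabatic_invariant(X_end, P_end)
print(J_start, J_end)
```

Up to the discretization error of the integrator, the two printed values should agree, consistent with $dJ/dt = 0$.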

10.6 Discussion and Bibliography

A detailed account of the averaging method, as well as numerous examples, can be found in Sanders and Verhulst [132]. See also [7]. An English language review of the Russian literature can be found in Lochak and Meunier [90]. An overview of the topic of slow manifolds, especially in the context of Hamiltonian problems, may be found in [92]. The averaging method applied to equations (10.2.1) is analyzed in an instructive manner in [115], where the Liouville equation is used to construct a rigorous proof of the averaged limit. The paper [150] provides an overview of variable elimination in a wealth of problems with scale separation.

Anosov’s theorem is the name given to the averaging principle in the context of ODE, that is, (10.2.1) with $\beta \equiv 0$. This theorem requires the fast dynamics to be ergodic. Often ergodicity fails due to the presence of “resonant zones”: regions in $\mathcal{X}$ for which the fast dynamics is not ergodic. Arnold and Neistadt [90] extended Anosov’s result to situations in which the ergodicity assumption fails on a sufficiently small set of $x \in \mathcal{X}$. Those results were further generalized and extended to the stochastic framework by Kifer, who also studied the diffusive and large deviation character of the discrepancy between the effective and exact solution [78, 79, 80, 81].

The situations in which the fast dynamics tend to fixed points, periodic solutions, or chaotic solutions can be treated in a unified manner through the introduction of Young measures. Artstein and co-workers considered a class of singularly perturbed systems of type (10.2.1), with attention given to the limiting behavior of both slow and fast variables. In all of the above cases the pair $(x, y)$ can be shown to converge to $(X, \mu_X)$, where $X$ is the solution of

\[
\frac{dX}{dt} = \int_{\mathcal{Y}} f(X, y)\, \mu_X(dy),
\]

and $\mu_X$ is the ergodic measure on $\mathcal{Y}$; the convergence of $y$ to $\mu_X$ is in the sense of Young measures. (In the case of a fixed point the Young measure is concentrated at a point.) A general theorem along these lines is proved in [9].

There are many generalizations of this idea. The case of non-autonomous fast dynamics, as well as a case with infinite dimension, are covered in [10]. Moreover, these results still make sense even if there is no unique invariant measure $\mu_x$, in which case the slow variables can be proved to satisfy a (non-deterministic) differential inclusion [11].

In the context of SDE, an interesting generalization of (10.2.1) is to consider systems of the form

\begin{align}
\frac{dx}{dt} &= f(x,y) + \alpha(x,y)\frac{dU}{dt}, \notag \\
\frac{dy}{dt} &= \frac{1}{\varepsilon} g(x,y) + \frac{1}{\sqrt{\varepsilon}}\beta(x,y)\frac{dV}{dt}. \tag{10.6.1}
\end{align}

The simplified equation is then an SDE, not an ODE. This situation is a sub-case of the set-up we consider in the next chapter which can be obtained by setting $f_0 = 0$ in that chapter, letting $f_1 = f$ and by identifying $\varepsilon$ here with $\varepsilon^2$ in that chapter.

In the application section we studied the averaging principle for a two-scale Hamiltonian system. The systematic study of Hamiltonian problems with two time-scales was initiated by Rubin and Ungar [131]. More recently the ideas of Neistadt, based on normal form theory, have been applied to such problems [16]; this approach is very powerful, yielding very tight, exponential, error estimates between the original and limiting variables. A recent approach to the problem, using the techniques of time-homogenization [21], is the paper [22]. The example we present is taken from that paper. The heuristic derivation we have given here is made rigorous in [22], using time-homogenization techniques, and it is also generalized to higher dimension. Resonances become increasingly important as the co-dimension, $m$, increases, limiting the applicability of the averaging approach to such two-scale Hamiltonian systems (Takens [142]).

10.7 Exercises

1. Find the averaged equation resulting from the SDE (10.6.1).

2. Consider the equations

\begin{align*}
\frac{dx}{dt} &= -\nabla_x \Phi(x,y) + \sqrt{2\sigma}\, \frac{dU}{dt}, \\
\frac{dy}{dt} &= -\frac{1}{\varepsilon}\nabla_y \Phi(x,y) + \sqrt{\frac{2\sigma}{\varepsilon}}\, \frac{dV}{dt}.
\end{align*}

Show that the averaged equation for $X$ has the form

\[
\frac{dX}{dt} = -\nabla \Psi(X) + \sqrt{2\sigma}\, \frac{dU}{dt}
\]

where the Fixman potential $\Psi$ is given by

\[
\exp\Bigl(-\frac{1}{\sigma}\Psi(x)\Bigr) = \int_{\mathcal{Y}} \exp\Bigl(-\frac{1}{\sigma}\Phi(x,y)\Bigr) dy.
\]

3. Let $y$ be a two state continuous time Markov chain with generator

\[
L = \frac{1}{\varepsilon}\begin{pmatrix} -a & a \\ b & -b \end{pmatrix}.
\]

Without loss of generality label the state-space $I = \{-1, +1\}$. Consider the SDE

\[
\frac{dx}{dt} = f(x,y) + \alpha(x,y)\frac{dU}{dt}.
\]

Write the generator of the process and use multiscale analysis to derive the averaged SDE of the form

\[
\frac{dX}{dt} = F(X) + A(X)\frac{dU}{dt}.
\]

4. Let $z$ be a continuous time Markov chain with generator

\[
L = \begin{pmatrix} -a & a \\ b & -b \end{pmatrix}.
\]

Without loss of generality label the state-space $I = \{-1, +1\}$. Define two functions $\omega : I \to (0, \infty)$ and $m : I \to (-\infty, \infty)$ by $\omega(\pm 1) = \omega^{\pm}$ and $m(\pm 1) = m^{\pm}$. Now consider the stochastic differential equations, with coefficients depending upon $z$, given by

\begin{align*}
\frac{dx}{dt} &= f(x,y) + \sqrt{2\sigma}\, \frac{dU}{dt}, \\
\frac{dy}{dt} &= -\frac{1}{\varepsilon}\omega(z)(y - m(z)) + \sqrt{\frac{2\sigma}{\varepsilon}}\, \frac{dV}{dt}.
\end{align*}

Write the generator for this process and use multiscale analysis to derive the averaged coupled Markov chain and SDE of the form

\[
\frac{dX}{dt} = F(X, z) + \sqrt{2\sigma}\, \frac{dU}{dt}.
\]

5. Generalize the previous exercise to the case where the transition rates of the Markov chain, determined by $a$ and $b$, depend upon $x$ and $y$.


Chapter 11

Homogenization for ODE and SDE

11.1 Introduction

In this chapter we continue our study of systems of SDE with two, widely separated characteristic time scales. The setting is similar to the one considered in the previous chapter. The difference is that in this chapter we seek to derive an effective equation on the longer, diffusive time scale $O(1/\varepsilon^2)$. This is the time scale of interest when the effective drift defined in equation (10.3.1) vanishes, due, for example, to the symmetries of the problem. This is the centering condition, equation (11.2.2) below. In contrast to the case considered in the previous chapter, on the diffusive time scale the effective equation is stochastic.

11.2 Full Equations

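Consider the stochastic differential equations

\begin{align}
\frac{dx}{dt} &= \frac{1}{\varepsilon} f_0(x,y) + f_1(x,y) + \alpha(x,y)\frac{dU}{dt}, \notag \\
\frac{dy}{dt} &= \frac{1}{\varepsilon^2} g(x,y) + \frac{1}{\varepsilon}\beta(x,y)\frac{dV}{dt}. \tag{11.2.1}
\end{align}

Both the $x$ and $y$ equations contain fast dynamics, but the dynamics in $y$ is an order of magnitude faster than in $x$. Here $x \in \mathcal{X}$, $y \in \mathcal{Y}$, $z \in \mathcal{Z}$ as discussed in sections 4.1 and 6.1.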


As in the previous chapter, we assume that the dynamics in $y$, with $x$ viewed as frozen, is ergodic. Specifically we assume that the frozen dynamics (10.2.2) generates an ergodic measure $\mu_x(\cdot)$ defined by (4.4.1). In addition we also assume that $f_0(x,y)$ averages to zero under this measure, so that

\[
\int f_0(x,y)\, \mu_x(dy) = 0. \tag{11.2.2}
\]

It can then be shown that the term $f_0$ will, in general, give rise to $O(1)$ effective drift and noise contributions in an approximate equation for $x$.

For equation (11.2.1) the backward Kolmogorov equation (6.3.3) with $\phi = \phi(x)$ is

\begin{align}
\frac{\partial v}{\partial t} &= \frac{1}{\varepsilon^2}\mathcal{L}_0 v + \frac{1}{\varepsilon}\mathcal{L}_1 v + \mathcal{L}_2 v, \notag \\
v(x,y,0) &= \phi(x), \tag{11.2.3}
\end{align}

where

\begin{align}
\mathcal{L}_0 v &= g \cdot \nabla_y v + \frac{1}{2} B : \nabla_y \nabla_y v, \tag{11.2.4a} \\
\mathcal{L}_1 v &= f_0 \cdot \nabla_x v, \tag{11.2.4b} \\
\mathcal{L}_2 v &= f_1 \cdot \nabla_x v + \frac{1}{2} A : \nabla_x \nabla_x v, \tag{11.2.4c}
\end{align}

with

\begin{align*}
A(x,y) &= \alpha(x,y)\alpha(x,y)^T, \\
B(x,y) &= \beta(x,y)\beta(x,y)^T.
\end{align*}

By using perturbation theory we eliminate the $y$ dependence in this equation, giving rise to simplified equations for $x$.

In terms of the generator $\mathcal{L}_0$, our ergodicity assumption becomes the statement that $\mathcal{L}_0$ has a one-dimensional null space characterized by

\begin{align}
\mathcal{L}_0 \mathbf{1}(y) &= 0, \notag \\
\mathcal{L}_0^* \rho^\infty(y;x) &= 0. \tag{11.2.5}
\end{align}

Here $\mathbf{1}(y)$ denotes constants in $y$. As in the previous chapter, in the case where $\mathcal{Y} = \mathbb{T}^d$ the operators $\mathcal{L}_0$ and $\mathcal{L}_0^*$ are equipped with periodic boundary conditions. Then, in the case where $B(x,y)$ is strictly positive-definite, uniformly in $(x,y) \in \mathcal{X} \times \mathcal{Y}$, Theorem 6.10 justifies the statement that the null space of $\mathcal{L}_0^*$ is one-dimensional. In more general situations, such as when $\mathcal{Y} = \mathbb{R}^d$, similar rigorous justifications are possible, but the functional setting is more complicated, typically employing weighted $L^p$-spaces which characterize the decay of the invariant density at infinity.

11.3 Simplified equations

We define the cell problem as follows:

\[
\mathcal{L}_0 \Phi(x,y) = -f_0(x,y), \quad \int \Phi(x,y)\, \rho^\infty(y;x)\, dy = 0. \tag{11.3.1}
\]

Because $f_0$ satisfies (11.2.2), this equation has a unique solution by the Fredholm alternative. We may then define a vector field $F$ by

\begin{align}
F(x) &= \int \Bigl(f_1(x,y) + (\nabla_x \Phi(x,y)) f_0(x,y)\Bigr) \rho^\infty(y;x)\, dy \notag \\
&= F_1(x) + F_0(x) \tag{11.3.2}
\end{align}

and a diffusion matrix $A(x)$ by

\[
A(x)A(x)^T = A_1(x) + \frac{1}{2}\Bigl(A_0(x) + A_0(x)^T\Bigr) \tag{11.3.3}
\]

where

\begin{align}
A_0(x) &:= 2\int f_0(x,y) \otimes \Phi(x,y)\, \rho^\infty(y;x)\, dy, \tag{11.3.4} \\
A_1(x) &:= \int A(x,y)\, \rho^\infty(y;x)\, dy. \tag{11.3.5}
\end{align}

To make sure that $A(x)$ is well-defined it is necessary to prove that the sum of $A_1(x)$ and the symmetric part of $A_0(x)$ is positive semi-definite. This is done in section 11.5.

Result 11.1. For $\varepsilon \ll 1$ and $t = O(1)$, $x$ solving (11.2.1) is approximated by $X$ solving

\[
\frac{dX}{dt} = F(X) + A(X)\frac{dW}{dt}. \tag{11.3.6}
\]

Alternative representations of $F(x)$ and $A(x)A(x)^T$ may be found by using time averaging. The first of such results is as follows. Note that the expressions for $A_0$ and $F_0$ involve two time integrals. The integral over $s$ is an ergodic average, replacing averaging with respect to the stationary measure on path space; the integral over $t$ enables us to express the effective equations without reference to the solution $\Phi$ of the cell problem, and requires a decay of correlations in order to be well-defined.

Result 11.2. Alternative representations of the vector field $F$ and diffusion matrix $A$ can be found through the following integrals over time:

\begin{align*}
F_1(x) &= \lim_{T \to \infty} \frac{1}{T}\int_0^T f_1(x, \varphi^t_x(y))\, dt, \\
A_1(x) &= \lim_{T \to \infty} \frac{1}{T}\int_0^T A(x, \varphi^t_x(y))\, dt;
\end{align*}

and

\[
A_0(x) = 2\int_0^\infty \Bigl(\lim_{T \to \infty} \frac{1}{T}\int_0^T f_0(x, \varphi^s_x(y)) \otimes f_0(x, \varphi^{t+s}_x(y))\, ds\Bigr) dt. \tag{11.3.7}
\]

If the generator $\mathcal{L}_0$ is independent of $x$ then

\[
F_0(x) = \int_0^\infty \Bigl(\lim_{T \to \infty} \frac{1}{T}\int_0^T \nabla_x f_0(x, \varphi^{t+s}_x(y))\, f_0(x, \varphi^s_x(y))\, ds\Bigr) dt.
\]

A second representation of $A_0(x)$ and $F_0(x)$, intermediate between the pure ergodic average and pure time-average, is as follows. We let $\mathbb{E}^{\mu_x}$ denote the stationary measure on path space for $\varphi^t_x(\cdot)$. To be precise this is the product measure formed from use of $\mu_x(\cdot)$ on initial data and standard independent Wiener measure on driving Brownian motions. With this notation we have:

Result 11.3. Alternative representations of the vector field $F_0(x)$ and diffusion matrix $A_0(x)$ can be found through the following integrals over time and $\mathbb{E}^{\mu_x}$:

\[
A_0(x) = 2\int_0^\infty \mathbb{E}^{\mu_x}\Bigl(f_0(x,y) \otimes f_0(x, \varphi^t_x(y))\Bigr) dt \tag{11.3.8}
\]

and, if the generator $\mathcal{L}_0$ is independent of $x$, then

\[
F_0(x) = \int_0^\infty \mathbb{E}^{\mu_x}\Bigl(\nabla_x f_0(x, \varphi^t_x(y))\, f_0(x,y)\Bigr) dt. \tag{11.3.9}
\]

The following result will be useful to us in deriving the previous two representations of $A_0(x)$ and $F_0(x)$. It uses ergodicity to represent the solution of the cell problem, and related Poisson equations, as time-integrals.


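Result 11.4. Assume that $\mathcal{L}_0$ is the generator of an ergodic process in $\mathcal{Y}$ with invariant density $\rho^\infty(y)$. Let $h$ be a function orthogonal to the null space of $\mathcal{L}_0^*$ so that

\[
\int_{\mathcal{Y}} h(y)\rho^\infty(y)\, dy = 0.
\]

Now define

\[
H(y) = -\int_0^\infty (e^{\mathcal{L}_0 t} h)(y)\, dt.
\]

Then

\[
\mathcal{L}_0 H = h, \quad \int_{\mathcal{Y}} H(y)\rho^\infty(y)\, dy = 0.
\]

Proof. We first note that

\begin{align*}
(\mathcal{L}_0 H)(y) &= -\int_0^\infty \mathcal{L}_0 (e^{\mathcal{L}_0 t} h)(y)\, dt \\
&= -\int_0^\infty \frac{\partial}{\partial t}\Bigl((e^{\mathcal{L}_0 t} h)(y)\Bigr) dt \\
&= h(y) - \lim_{t \to \infty} (e^{\mathcal{L}_0 t} h)(y) \\
&= h(y).
\end{align*}

In the last identity we used ergodicity and the fact that $h$ averages to zero under the density $\rho^\infty$. This follows from Theorem 6.10 because $v(y,t) = (e^{\mathcal{L}_0 t} h)(y)$ solves the backward Kolmogorov equation. Note that also $\int H \rho^\infty\, dy = 0$ since

\begin{align*}
\int_{\mathcal{Y}} H(y)\rho^\infty(y)\, dy &= -\int_{\mathcal{Y}} \Bigl(\int_0^\infty (e^{\mathcal{L}_0 t} h)(y)\, dt\Bigr) \rho^\infty(y)\, dy \\
&= -\int_0^\infty \Bigl(\int_{\mathcal{Y}} (e^{\mathcal{L}_0 t} h)(y)\, \rho^\infty(y)\, dy\Bigr) dt \\
&= -\int_0^\infty \Bigl(\int_{\mathcal{Y}} h(y)\, (e^{\mathcal{L}_0^* t} \rho^\infty)(y)\, dy\Bigr) dt \\
&= 0.
\end{align*}

The last line follows because $\rho^\infty(y)$ is stationary for the Fokker-Planck dynamics, so that $e^{\mathcal{L}_0^* t}\rho^\infty = \rho^\infty$, and because $h$ averages to zero under $\rho^\infty$.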
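As an illustrative check of Result 11.4 (a sketch under assumptions not made in the text), take $\mathcal{Y} = \mathbb{R}$ and let $\mathcal{L}_0$ be the generator of an Ornstein-Uhlenbeck process with hypothetical constants $\gamma, \lambda > 0$, whose invariant density $\rho^\infty$ is that of $N(0, \lambda/(2\gamma))$. The choice $h(y) = y$ averages to zero under $\rho^\infty$, and the semigroup acts on it explicitly:

\begin{align*}
\mathcal{L}_0 &= -\gamma y\, \partial_y + \frac{\lambda}{2}\, \partial_y^2, \\
(e^{\mathcal{L}_0 t} h)(y) &= \mathbb{E}\bigl(y(t) \mid y(0) = y\bigr) = y e^{-\gamma t}, \\
H(y) &= -\int_0^\infty y e^{-\gamma t}\, dt = -\frac{y}{\gamma}, \\
\mathcal{L}_0 H &= -\gamma y \cdot \Bigl(-\frac{1}{\gamma}\Bigr) = y = h(y), \\
\int_{\mathbb{R}} H(y)\rho^\infty(y)\, dy &= -\frac{1}{\gamma}\int_{\mathbb{R}} y\, \rho^\infty(y)\, dy = 0.
\end{align*}

Both conclusions of the result hold in this case, and the exponential decay of $e^{\mathcal{L}_0 t} h$ is what makes the time integral defining $H$ converge.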


11.4 Derivation

We seek a multiscale expansion for the solution with the form

\[
v = v_0 + \varepsilon v_1 + \varepsilon^2 v_2 + \cdots \tag{11.4.1}
\]

Note that $\mathcal{L}_0$ is a differential operator in $y$, in which $x$ appears as a parameter. Thus we must equip it with boundary conditions. For simplicity we consider the case where $\mathcal{Y} = \mathbb{T}^d$ so that periodic boundary conditions are used. Note however that other functional settings are also possible; the key in what follows is application of the Fredholm alternative to operator equations defined through $\mathcal{L}_0$.

Substituting this expansion into (11.2.3) and equating powers of $\varepsilon$ gives a hierarchy of equations, the first three of which are

\begin{align}
O(1/\varepsilon^2): \quad & \mathcal{L}_0 v_0 = 0, && v_0 \text{ is 1-periodic}, \tag{11.4.2a} \\
O(1/\varepsilon): \quad & \mathcal{L}_0 v_1 = -\mathcal{L}_1 v_0, && v_1 \text{ is 1-periodic}, \tag{11.4.2b} \\
O(1): \quad & \mathcal{L}_0 v_2 = \frac{\partial v_0}{\partial t} - \mathcal{L}_1 v_1 - \mathcal{L}_2 v_0, && v_2 \text{ is 1-periodic}. \tag{11.4.2c}
\end{align}

The initial conditions are that $v_0 = \phi$ and $v_i = 0$ for $i \ge 1$.

Recall that we assume that L0 generates an ergodic process and hence that L0

and L∗0 have a one-dimensional null space. These are characterized in (11.2.5).Hence equation (11.4.2a) implies that v0(x, t) only. The solution of the next twoequations requires the Fredholm alternative. In both cases the right hand side mustbe orthogonal to NullL∗0. By ergodicity we know that

NullL∗0 = spanρ∞(y; x).

Equation (11.4.2b) is solvable for v1, by virtue of (11.2.2) which ensures thatthe operator L1, given by (11.2.4c) averages to zero. From the structure of L1, wehave that (11.4.2b) may be written

L0v1 = −f0(x, y) · ∇xv(x, t) (11.4.3)

Since L0 is a differential operator in y alone with x appearing as a parameter,we may separate variables in (11.4.3) and write v1(x, y, t) = Φ(x, y) · ∇xv0(x, t).Thus we represent the solution v1 as a linear operator acting on v0. As our aimis to find a closed equation for v0 this form for v1 is a useful representation ofthe solution. Substituting for v1 in (11.4.3) shows that Φ solves the cell problem

Page 149: AN INTRODUCTION TO MULTISCALE METHODS - WMI - …€¦ · AN INTRODUCTION TO MULTISCALE METHODS G.A. Pavliotis Department of Mathematics Imperial College London and A.M. Stuart …

11.4. DERIVATION 133

(11.3.1). Condition (11.2.2) ensures that there is a solution to the cell problem andthe normalization condition makes it unique. Turning now to equation (11.4.2c)we have that the right hand side takes the form

∂v0

∂t− L2v0 − L1Φ · ∇xv0.

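Hence solvability of (11.4.2c) requires

$$\frac{\partial v_0}{\partial t} = \int \rho^\infty(y; x)\, \mathcal{L}_2 v_0(x, t)\, dy + \int \rho^\infty(y; x)\, \mathcal{L}_1\big(\Phi(x, y) \cdot \nabla_x v_0(x, t)\big)\, dy = I_1 + I_2. \tag{11.4.4}$$

We consider the two terms on the right hand side separately. The first is

$$\begin{aligned}
I_1 &= \int \rho^\infty(y; x) \Big(f_1(x, y) \cdot \nabla_x + \tfrac{1}{2} A(x, y) : \nabla_x \nabla_x\Big) v_0(x, t)\, dy \\
&= F_1(x) \cdot \nabla_x v_0(x, t) + \tfrac{1}{2} A_1(x) : \nabla_x \nabla_x v_0(x, t).
\end{aligned}$$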

Now for the second term note that

$$\mathcal{L}_1(\Phi \cdot \nabla_x v_0) = f_0 \otimes \Phi : \nabla_x \nabla_x v_0 + (\nabla_x \Phi\, f_0) \cdot \nabla_x v_0.$$

Hence $I_2 = I_3 + I_4$ where

$$I_3 = \int \rho^\infty(y; x) \big(\nabla_x \Phi(x, y) f_0(x, y)\big) \cdot \nabla_x v_0(x, t)\, dy$$

and

$$I_4 = \int \rho^\infty(y; x) \big(f_0(x, y) \otimes \Phi(x, y) : \nabla_x \nabla_x v_0(x, t)\big)\, dy.$$

Thus

$$I_2 = F_0(x) \cdot \nabla_x v_0(x, t) + \tfrac{1}{2} A_0(x) : \nabla_x \nabla_x v_0(x, t).$$

Combining our simplifications of the right hand side of (11.4.4) we obtain, since only the symmetric part of $A_0$ is required to calculate the Frobenius inner-product with another symmetric matrix, the following expression:

$$\frac{\partial v_0}{\partial t}(x, t) = F(x) \cdot \nabla_x v_0(x, t) + \tfrac{1}{2} A(x) A(x)^T : \nabla_x \nabla_x v_0(x, t).$$

This is the backward equation corresponding to the reduced dynamics given in (11.3.6).


We finish this section by deriving the alternative expressions for $A(x)$, $F(x)$ through time integration, given in Results 11.2 and 11.3. The expressions for $F_1(x)$ and $A_1(x)$ in Result 11.2 are immediate from ergodicity, simply using the fact that the time average equals the average against $\rho^\infty$. By use of Result 11.4, the solution to the cell problem can be written as

$$\Phi(x, y) = \int_0^\infty \big(e^{\mathcal{L}_0 t} f_0\big)(x, y)\, dt = \int_0^\infty \mathbb{E} f_0\big(x, \varphi^t_x(y)\big)\, dt \tag{11.4.5}$$

where $\mathbb{E}$ denotes expectation with respect to Wiener measure. Now

$$F_0(x) = \int_{\mathcal{Y}} \rho^\infty(y; x)\, \nabla_x \Phi(x, y) f_0(x, y)\, dy.$$

In the case where $\mathcal{L}_0$ is $x$-independent, so that $\varphi^t_x(\cdot) = \varphi^t(\cdot)$ is also $x$-independent, we may use (11.4.5) to see that

$$F_0(x) = \int_{\mathcal{Y}} \rho^\infty(y; x) \int_0^\infty \mathbb{E}\, \nabla_x f_0\big(x, \varphi^t_x(y)\big) f_0(x, y)\, dt\, dy.$$

Recalling that $\mathbb{E}^{\mu_x}$ denotes the stationary measure on path space for $\varphi^t_x(\cdot)$ and changing the order of integration we find that

$$F_0(x) = \int_0^\infty \mathbb{E}^{\mu_x}\Big(\nabla_x f_0\big(x, \varphi^t_x(y)\big) f_0(x, y)\Big)\, dt \tag{11.4.6}$$

as required for the expression in Result 11.3. Now we replace averages over $\mathbb{E}^{\mu_x}$ by time averaging to obtain

$$F_0(x) = \int_0^\infty \left(\lim_{T \to \infty} \frac{1}{T} \int_0^T \nabla_x f_0\big(x, \varphi^{t+s}_x(y)\big) f_0\big(x, \varphi^s_x(y)\big)\, ds\right) dt,$$

and so we obtain the desired formula for Result 11.2.

A similar calculation to that yielding (11.4.6) gives (11.3.8) for $A_0(x)$ in Result 11.3. Replacing the average against $\mathbb{E}^{\mu_x}$ by a time average we arrive at the desired formula for $A_0(x)$ in Result 11.2.

11.5 Properties of the Effective Equation

The effective SDE (11.3.6) is only well-defined if $A(x)A(x)^T$ given by (11.3.3), (11.3.5) is non-negative definite. We now prove that this is indeed the case.

Theorem 11.5. Consider the case where $\mathcal{Y} = \mathbb{T}^d$ and $\mathcal{L}_0$ is equipped with periodic boundary conditions. The real-valued matrix $A$ is well-defined by (11.3.3):

$$\langle \xi, A_1(x)\xi + A_0(x)\xi \rangle \geq 0 \quad \forall x \in \mathcal{X},\ \xi \in \mathbb{R}^l.$$

Proof. Let $\phi(x, y) = \xi \cdot \Phi(x, y)$. Then $\phi$ solves

$$-\mathcal{L}_0 \phi = \xi \cdot f_0.$$

By Theorem 6.7 we have

$$\begin{aligned}
\langle \xi, A_1(x)\xi + A_0(x)\xi \rangle &= \int_{\mathcal{Y}} \Big(|\alpha(x, y)^T \xi|^2 - 2\big(\mathcal{L}_0 \phi(x, y)\big)\phi(x, y)\Big) \rho^\infty(y; x)\, dy \\
&= \int_{\mathcal{Y}} \Big(|\alpha(x, y)^T \xi|^2 + |\beta(x, y)^T \nabla_y \phi(x, y)|^2\Big) \rho^\infty(y; x)\, dy \\
&\geq 0.
\end{aligned}$$

Two important remarks are in order.

Remark 11.6. Techniques similar to those used in the proof of the previous theorem, using (6.3.8) instead of the Dirichlet form itself, show that

$$\frac{1}{2}\Big(A_0(x) + A_0(x)^T\Big) = \int_{\mathcal{Y}} \Big(\nabla \Phi(x, y)\beta(x, y) \otimes \nabla \Phi(x, y)\beta(x, y)\Big) \rho^\infty(y; x)\, dy. \tag{11.5.1}$$

Remark 11.7. By virtue of Remark 6.8 we see that the preceding theorem can be extended to settings other than $\mathcal{Y} = \mathbb{T}^d$.

11.6 Applications

11.6.1 Fast Ornstein-Uhlenbeck Noise

Consider the equations

$$\frac{dx}{dt} = \frac{1}{\varepsilon}(1 - y^2)x, \qquad \frac{dy}{dt} = -\frac{\alpha}{\varepsilon^2} y + \sqrt{\frac{2\alpha}{\varepsilon^2}}\, \frac{dV}{dt}.$$

Recall that these equations arise from the first application in Section 10.5, in the case where $\lambda = 2\alpha$, and after time-rescaling to produce non-zero effects. The function $V(t)$ is a standard unit Brownian motion.

We want to find the fluctuations induced by the fast process $y$ on $x$. To this end we need to study the variable $\varphi^t(y)$ solving (10.2.2), noting that here there is no dependence on $x$ in this fast process; it is simply an OU process. A straightforward calculation shows that

$$\begin{aligned}
\varphi^t(y) &= e^{-\alpha t} y + \sqrt{2\alpha} \int_0^t e^{-\alpha(t-s)}\, dV(s), \\
\varphi^t(y)^2 &= e^{-2\alpha t} y^2 + 2\sqrt{2\alpha}\, y e^{-\alpha t} \int_0^t e^{-\alpha(t-s)}\, dV(s) + 2\alpha \left(\int_0^t e^{-\alpha(t-s)}\, dV(s)\right)^2.
\end{aligned} \tag{11.6.1}$$

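In this example

$$\mathcal{L}_0 = -\alpha y \frac{\partial}{\partial y} + \alpha \frac{\partial^2}{\partial y^2}. \tag{11.6.2}$$

Because this operator is $x$-independent, the invariant density in the null-space of $\mathcal{L}_0^*$ is $\rho^\infty(y)$, and is independent of $x$. In fact it is the density corresponding to an $\mathcal{N}(0, 1)$ random variable. By the Itô isometry,

$$\mathbb{E}\left(\int_0^t e^{-\alpha(t-s)}\, dV(s)\right)^2 = \int_0^t e^{-2\alpha(t-s)}\, ds = \frac{1}{2\alpha}\big(1 - e^{-2\alpha t}\big).$$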

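To construct the measure $\mathbb{E}^{\mu_x}$ we take $y$ and $V$ to be independent. Thus

$$\int \rho^\infty(y) y^2\, dy = 1, \qquad \mathbb{E}^{\mu_x} \varphi^t(y)^2 = 1.$$

Furthermore

$$\begin{aligned}
\mathbb{E} \int \rho^\infty(y) y^2 \varphi^t(y)^2\, dy &= e^{-2\alpha t} \int \rho^\infty(y) y^4\, dy + 2\alpha\, \mathbb{E}\left(\int_0^t e^{-\alpha(t-s)}\, dV(s)\right)^2 \\
&= 3e^{-2\alpha t} + 1 - e^{-2\alpha t} \\
&= 1 + 2e^{-2\alpha t}.
\end{aligned}$$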


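Here $f_0(x, y) = (1 - y^2)x$. Combining these calculations in (11.3.9) gives

$$\begin{aligned}
F_0(x) &= x \int_0^\infty \mathbb{E}^{\mu_x}\Big(\big(1 - \varphi^t(y)^2\big)\big(1 - y^2\big)\Big)\, dt \\
&= x \int_0^\infty 2e^{-2\alpha t}\, dt \\
&= \frac{x}{\alpha}.
\end{aligned} \tag{11.6.3}$$

Similarly from (11.3.8) we obtain

$$A_0(x) = \frac{2x^2}{\alpha}.$$

It follows that the effective equation is

$$\frac{dX}{dt} = \frac{X}{\alpha} + \sqrt{\frac{2}{\alpha}}\, X \frac{dW}{dt}.$$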
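The values $F_0(x) = x/\alpha$ and $A_0(x) = 2x^2/\alpha$ can be cross-checked without any time integration, by averaging against the invariant density. The sketch below is an illustration rather than part of the derivation above. It assumes the closed-form cell solution $\Phi(x, y) = (1 - y^2)x/(2\alpha)$ (one can verify that $-\mathcal{L}_0 \Phi = f_0$ with $\mathcal{L}_0$ from (11.6.2)), and it uses the representations $F_0 = \int \rho^\infty \nabla_x \Phi\, f_0\, dy$ and $A_0 = 2\int \rho^\infty f_0 \Phi\, dy$. The helper names `gaussian_expectation` and `effective_coefficients` are hypothetical.

```python
import math


def gaussian_expectation(g, half_width=10.0, n=4000):
    """Approximate E[g(Y)] for Y ~ N(0, 1) with the trapezoidal rule."""
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n + 1):
        y = -half_width + k * h
        weight = 0.5 if k in (0, n) else 1.0
        density = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
        total += weight * g(y) * density
    return h * total


def effective_coefficients(x, alpha):
    """Return (F0, A0) for the fast OU example, from the cell solution."""
    def f0(y):
        return (1.0 - y * y) * x              # O(1/eps) term in the x equation

    def phi(y):
        return (1.0 - y * y) * x / (2.0 * alpha)   # assumed cell solution

    def dphi_dx(y):
        return (1.0 - y * y) / (2.0 * alpha)

    F0 = gaussian_expectation(lambda y: dphi_dx(y) * f0(y))
    A0 = 2.0 * gaussian_expectation(lambda y: f0(y) * phi(y))
    return F0, A0


F0, A0 = effective_coefficients(x=1.5, alpha=2.0)
print(F0, A0)  # compare with x / alpha and 2 * x**2 / alpha
```

Both averages reduce to the Gaussian moment $\mathbb{E}(1 - y^2)^2 = 2$, which is why the drift and the squared diffusion coefficient share the factor $1/\alpha$.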

11.6.2 Fast Chaotic Noise

We now consider an example which is entirely deterministic, but which behaves stochastically when we eliminate a fast chaotic variable. In this context it is essential to use the representation of the effective diffusion coefficient given in Result 11.2 since it uses time-integrals, and makes no reference to averaging over Brownian motion (which is not present in this example). Consider the equations

$$\begin{aligned}
\frac{dx}{dt} &= x - x^3 + \frac{\lambda}{\varepsilon} y_2, \\
\frac{dy_1}{dt} &= \frac{10}{\varepsilon^2}(y_2 - y_1), \\
\frac{dy_2}{dt} &= \frac{1}{\varepsilon^2}(28 y_1 - y_2 - y_1 y_3), \\
\frac{dy_3}{dt} &= \frac{1}{\varepsilon^2}\Big(y_1 y_2 - \frac{8}{3} y_3\Big).
\end{aligned} \tag{11.6.4}$$

The vector $y = (y_1, y_2, y_3)^T$ solves the Lorenz equations, at parameter values where the solution is chaotic [154]. Thus the equation for $x$ is a scalar ODE driven by a chaotic signal with characteristic time $\varepsilon^2$. Because $f_0(x, y) \propto y_2$ and because $f_1$ depends on $x$ only, the candidate equation for the approximate dynamics is

$$\frac{dX}{dt} = X - X^3 + \sigma \frac{dW}{dt}, \tag{11.6.5}$$


where $\sigma$ is a constant. Now let $\psi^t(y) = e_2 \cdot \varphi^t(y)$. Then the constant $\sigma$ can be found by use of (11.3.7) giving

$$\sigma^2 = 2\lambda^2 \int_0^\infty \left(\lim_{T \to \infty} \frac{1}{T} \int_0^T \psi^s(y) \psi^{t+s}(y)\, ds\right) dt.$$

This is the integrated autocorrelation function of $y_2$.

Another way to derive this result is as follows. Gaussian white noise, the time-derivative of Brownian motion, may be thought of as a delta-correlated stationary process. On the assumption that $y_2$ has a correlation function which decays in time, and noting that this has time-scale $\varepsilon^2$, the autocorrelation of $\frac{\lambda}{\varepsilon} y_2(s)$ at time lag $t$ may be calculated and integrated from $0$ to $\infty$; matching this with the known result for Gaussian white noise gives the desired result for $\sigma^2$.
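The matching argument can be written out in a few lines. The following is a sketch under stated assumptions: $C_2(s)$ is notation introduced here for the stationary autocorrelation of $y_2$ at $\varepsilon = 1$, assumed to have zero mean and to be integrable, and the delta function is taken to contribute half its mass on $[0, \infty)$.

```latex
\begin{aligned}
R_\varepsilon(t) &:= \frac{\lambda^2}{\varepsilon^2}\, C_2\!\left(\frac{t}{\varepsilon^2}\right)
  && \text{(autocorrelation of } \tfrac{\lambda}{\varepsilon} y_2 \text{ at lag } t\text{)}, \\
\int_0^\infty R_\varepsilon(t)\, dt &= \lambda^2 \int_0^\infty C_2(s)\, ds
  && \text{(substituting } s = t/\varepsilon^2\text{)}, \\
\int_0^\infty \sigma^2 \delta(t)\, dt &= \frac{\sigma^2}{2}
  && \text{(white noise } \sigma\, dW/dt\text{)}, \\
\sigma^2 &= 2\lambda^2 \int_0^\infty C_2(s)\, ds .
\end{aligned}
```

The integrated autocorrelation of the rescaled signal is independent of $\varepsilon$, which is why a finite, non-zero $\sigma$ survives the limit.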

11.6.3 Stratonovich Corrections

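When white noise is approximated by a smooth process, this typically leads to Stratonovich interpretations of stochastic integrals. We use multiscale analysis to illustrate this idea by means of a simple example.

Consider the equations

$$\frac{dx}{dt} = \frac{1}{\varepsilon} v(x) y + \sigma \frac{dU}{dt}, \qquad \frac{dy}{dt} = -\frac{\alpha y}{\varepsilon^2} + \sqrt{\frac{2\alpha}{\varepsilon^2}}\, \frac{dV}{dt},$$

with $U$, $V$ standard Brownian motions. Heuristically we deduce from the $y$ equation that

$$\frac{y}{\varepsilon} \approx \sqrt{\frac{2}{\alpha}}\, \frac{dV}{dt} + O(\varepsilon). \tag{11.6.6}$$

Thus we might conjecture a limiting equation of the form

$$\frac{dX}{dt} = \sqrt{\frac{2}{\alpha}}\, v(X) \frac{dV}{dt} + \sigma \frac{dU}{dt} \tag{11.6.7}$$

which is equivalent in distribution to the equation

$$\frac{dX}{dt} = \Big(\frac{2}{\alpha} v(X)^2 + \sigma^2\Big)^{\frac{1}{2}} \frac{dW}{dt},$$

where $W$ is a standard unit Brownian motion.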


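We will show that this heuristic is incorrect: this is because, whenever white noise is approximated by a smooth process, the limiting equation should be interpreted in the Stratonovich sense

$$\frac{dX}{dt} = \sqrt{\frac{2}{\alpha}}\, v(X) \circ \frac{dV}{dt} + \sigma \frac{dU}{dt}.$$

Thus, in Itô form the limiting equation is

$$\frac{dX}{dt} = \Big(\frac{2}{\alpha} v(X)^2 + \sigma^2\Big)^{\frac{1}{2}} \frac{dW}{dt} + \frac{1}{\alpha} v'(X) v(X). \tag{11.6.8}$$

We now derive this limit equation by the techniques introduced in this chapter. The cell problem is

$$\mathcal{L}_0 \Phi(x, y) = -v(x) y$$

with $\mathcal{L}_0$ given by (11.6.2). The solution is readily seen to be

$$\Phi(x, y) = \frac{1}{\alpha} v(x) y, \qquad \nabla_x \Phi(x, y) = \frac{1}{\alpha} v'(x) y.$$

The invariant density is

$$\rho^\infty(y) = \frac{1}{\sqrt{2\pi}} \exp\Big(-\frac{y^2}{2}\Big)$$

which is in the null-space of $\mathcal{L}_0^*$ and corresponds to a standard unit Gaussian $\mathcal{N}(0, 1)$ random variable.

From equation (11.3.2) we have

$$F(x) = \int_{\mathbb{R}} \frac{1}{\alpha} v'(x) v(x) y^2 \rho^\infty(y)\, dy = \frac{1}{\alpha} v'(x) v(x).$$

Also (11.3.3) gives

$$A(x)^2 = \int_{\mathbb{R}} \Big(\sigma^2 + \frac{2}{\alpha} v(x)^2 y^2\Big) \rho^\infty(y)\, dy = \sigma^2 + \frac{2}{\alpha} v(x)^2.$$

Hence the desired result is established.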
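As a cross-check on the drift in (11.6.8), the standard Itô–Stratonovich conversion rule for a scalar SDE can be applied directly to the Stratonovich form. The sketch below assumes the usual rule that $dX = g(X) \circ dV$ is equivalent to $dX = \tfrac{1}{2} g(X) g'(X)\, dt + g(X)\, dV$ in the Itô sense; the symbol $g$ is introduced here only for illustration, and the additive term $\sigma\, dU$ contributes no correction.

```latex
\begin{aligned}
g(X) &= \sqrt{\tfrac{2}{\alpha}}\; v(X), \\
\tfrac{1}{2}\, g(X)\, g'(X) &= \tfrac{1}{2} \cdot \tfrac{2}{\alpha}\, v(X)\, v'(X)
  = \tfrac{1}{\alpha}\, v'(X)\, v(X).
\end{aligned}
```

This agrees with the value of $F(x)$ obtained from the cell problem.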


11.6.4 Stokes’ Law

The previous example may be viewed as describing the motion of a massless particle with position $x$ in a velocity field proportional to $f(x)y$, with $y$ an OU process, and subject to additional molecular bombardment. If the particle has mass $m$ then it is natural to study the equation

$$m \frac{d^2 x}{dt^2} = \frac{1}{\varepsilon} f(x) y - \frac{dx}{dt} + \sigma \frac{dU}{dt}, \tag{11.6.9}$$

$$\frac{dy}{dt} = -\frac{\alpha y}{\varepsilon^2} + \sqrt{\frac{2\alpha}{\varepsilon^2}}\, \frac{dV}{dt}. \tag{11.6.10}$$

The first of the two previous equations is Stokes' law, stating that the force on the particle is the sum of a drag force, $\frac{1}{\varepsilon} f(x) y - \frac{dx}{dt}$, and a force due to molecular bombardment, $\sigma \frac{dU}{dt}$. As in the previous example, $y$ is a fluctuating OU process. For simplicity we consider the case of unit mass, $m = 1$.

Using the heuristic (11.6.6) it is natural to conjecture the limiting equation

$$\frac{d^2 X}{dt^2} = \sqrt{\frac{2}{\alpha}}\, f(X) \frac{dV}{dt} - \frac{dX}{dt} + \sigma \frac{dU}{dt}.$$

This is equivalent to

$$\frac{d^2 X}{dt^2} + \frac{dX}{dt} = \Big(\sigma^2 + \frac{2}{\alpha} f(X)^2\Big)^{\frac{1}{2}} \frac{dW}{dt}$$

in distribution. In contrast to the previous application, the conjecture that this is the limiting equation turns out to be correct. The reason is that, here, $x$ is smoother and the Itô and Stratonovich interpretations coincide; there is no Itô correction to the Stratonovich integral. We verify the result by using the multiscale techniques introduced in this chapter.

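We first write (11.6.9) as the first order system

$$\begin{aligned}
\frac{dx}{dt} &= r, \\
\frac{dr}{dt} &= -r + \frac{1}{\varepsilon} f(x) y + \sigma \frac{dU}{dt}, \\
\frac{dy}{dt} &= -\frac{1}{\varepsilon^2} \alpha y + \sqrt{\frac{2\alpha}{\varepsilon^2}}\, \frac{dV}{dt}.
\end{aligned}$$

Here $(x, r)$ are the slow variables ($x$ in (11.2.1)) and $y$ the fast variable ($y$ in (11.2.1)). The cell problem is now given by

$$\mathcal{L}_0 \Phi(x, r, y) = -f_0(x, r, y) = \begin{pmatrix} 0 \\ -f(x) y \end{pmatrix},$$

with $\mathcal{L}_0$ given by (11.6.2). The solution is

$$\Phi(x, r, y) = \begin{pmatrix} 0 \\ \frac{1}{\alpha} f(x) y \end{pmatrix}, \qquad \nabla_{(x,r)} \Phi(x, r, y) = \begin{pmatrix} 0 & 0 \\ \frac{1}{\alpha} f'(x) y & 0 \end{pmatrix}.$$

Notice that $f_0$ is in the null-space of $\nabla_{(x,r)} \Phi$, and hence (11.3.2) gives

$$F(X, R) = \begin{pmatrix} R \\ -R \end{pmatrix}. \tag{11.6.11}$$

From (11.3.3) we have

$$A(X, R) A(X, R)^T = \int_{\mathbb{R}} \left[\begin{pmatrix} 0 & 0 \\ 0 & \sigma^2 \end{pmatrix} + 2 \begin{pmatrix} 0 & 0 \\ 0 & \frac{1}{\alpha} f(X)^2 y^2 \end{pmatrix}\right] \rho^\infty(y)\, dy.$$

Recall that $\rho^\infty(y)$ is the density of an $\mathcal{N}(0, 1)$ Gaussian random variable. Evaluating the integral gives

$$A(X, R) A(X, R)^T = \begin{pmatrix} 0 & 0 \\ 0 & \sigma^2 + \frac{2}{\alpha} f(X)^2 \end{pmatrix}.$$

Hence a natural choice for $A$ is

$$A(X, R) = \begin{pmatrix} 0 \\ \big(\sigma^2 + \frac{2}{\alpha} f(X)^2\big)^{\frac{1}{2}} \end{pmatrix}.$$

Thus from (11.6.11) and this choice of $A$ we obtain the limiting equation

$$\begin{aligned}
\frac{dX}{dt} &= R, \\
\frac{dR}{dt} &= -R + \Big(\sigma^2 + \frac{2}{\alpha} f(X)^2\Big)^{\frac{1}{2}} \frac{dW}{dt},
\end{aligned}$$

which, upon elimination of $R$, is seen to coincide with the conjectured limit.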

11.6.5 Kubo Formula

In the previous application we encountered the equation of motion of a particle with significant mass, subject to Stokes drag and molecular diffusion. Here we study the same equation of motion, but where the velocity field is steady. The equation of motion is thus

$$\frac{d^2 x}{dt^2} = f(x) - \frac{dx}{dt} + \sigma \frac{dU}{dt}. \tag{11.6.12}$$


Here $U$ is a standard unit Brownian motion. We will study the effective diffusive behavior of the particle $x$ on large length and time-scales, in the case where $f$ is periodic with mean zero. We show that the diffusion coefficient is given by the Kubo formula, expressing the diffusion coefficient as the integral of the velocity autocorrelation.

To this end we rescale the equation of motion by setting $x \to x/\varepsilon$ and $t \to t/\varepsilon^2$ to obtain

$$\varepsilon^2 \frac{d^2 x}{dt^2} = \frac{1}{\varepsilon} f\Big(\frac{x}{\varepsilon}\Big) - \frac{dx}{dt} + \sigma \frac{dU}{dt}.$$

Introducing the variables $y = \varepsilon \frac{dx}{dt}$ and $z = x/\varepsilon$ we obtain the system

$$\begin{aligned}
\frac{dx}{dt} &= \frac{1}{\varepsilon} y, \\
\frac{dy}{dt} &= -\frac{1}{\varepsilon^2} y + \frac{1}{\varepsilon^2} f(z) + \frac{\sigma}{\varepsilon} \frac{dW}{dt}, \\
\frac{dz}{dt} &= \frac{1}{\varepsilon^2} y.
\end{aligned}$$

The process $(y, z)$ has timescale $\varepsilon^2$ and plays the role of $y$ in (11.2.1); $x$ plays the role of $x$ in (11.2.1). The operator $\mathcal{L}_0$ is the generator of the process $(y, z)$. Furthermore

$$f_1(x, y, z) = 0, \qquad f_0(x, y, z) = y.$$

Thus, since the evolution of $(y, z)$ is independent of $x$, the solution $\Phi(x, y, z)$ of the cell problem is also $x$-independent. Hence (11.3.2) gives $F(x) = 0$. Turning now to the effective diffusivity we find that, since $\alpha(x, y) = A(x, y) = 0$, (11.3.3) gives $A(x)^2 = A_0(x)$. Now define $\psi^t(y, z)$ to be the component of $\varphi^t(y, z)$ projected onto the $y$ coordinate. By Result 11.2 we have that

$$A_0(x) = 2 \int_0^\infty \left(\lim_{T \to \infty} \frac{1}{T} \int_0^T \psi^s(y, z) \psi^{s+t}(y, z)\, ds\right) dt.$$

The expression

$$C(t) = \lim_{T \to \infty} \frac{1}{T} \int_0^T y(s) y(s + t)\, ds$$

is the autocorrelation of the velocity $y(\cdot)$ of the particle. Thus the effective equation is

$$\frac{dX}{dt} = \sqrt{2D}\, \frac{dW}{dt},$$

a Brownian motion with diffusion coefficient

$$D = \int_0^\infty C(t)\, dt,$$

the integrated velocity autocorrelation.
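As a sanity check on the Kubo formula, consider the degenerate case $f \equiv 0$ (trivially periodic with mean zero), for which the velocity is an Ornstein-Uhlenbeck process and everything is explicit. This worked case is an illustration added here, not part of the derivation above; $\tau = t/\varepsilon^2$ denotes the fast time, in which the velocity equation becomes independent of $\varepsilon$.

```latex
\begin{aligned}
dy &= -y\, d\tau + \sigma\, dW(\tau), \\
C(\tau) &= \frac{\sigma^2}{2}\, e^{-\tau}
  && \text{(stationary autocorrelation of } y\text{)}, \\
D &= \int_0^\infty C(\tau)\, d\tau = \frac{\sigma^2}{2},
  \qquad \sqrt{2D} = \sigma .
\end{aligned}
```

The effective equation is then $dX/dt = \sigma\, dW/dt$. This is consistent with integrating (11.6.12) directly when $f \equiv 0$: one finds $x(t) - x(0) = \sigma U(t) - \big(\dot{x}(t) - \dot{x}(0)\big)$, and the velocity term stays bounded in mean square.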

11.6.6 Neither Ito nor Stratonovich

Consider Stokes' law (11.6.9) for a particle of small mass $m = \tau_0 \varepsilon^2$ where $\tau_0 = O(1)$:

$$\tau_0 \varepsilon^2 \frac{d^2 x}{dt^2} = -\frac{dx}{dt} + \frac{1}{\varepsilon} f(x) \eta, \tag{11.6.13a}$$

$$\frac{d\eta}{dt} = \frac{1}{\varepsilon^2} g(\eta) + \frac{1}{\varepsilon} \sqrt{2\sigma(\eta)}\, \frac{dW}{dt}. \tag{11.6.13b}$$

For simplicity we have set $\sigma = 0$ in (11.6.9).

We interpret (11.6.13b) in the Itô sense. Notice that $\eta(t)$ is a more general stochastic process, not necessarily the Ornstein-Uhlenbeck process. We assume that $g(\eta)$, $\sigma(\eta)$ are such that there exists a unique stationary solution of the Fokker-Planck equation for (11.6.13b), so that $\eta$ is ergodic.

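We write (11.6.13) as a first order system,

$$\begin{aligned}
\frac{dx}{dt} &= \frac{1}{\varepsilon \sqrt{\tau_0}} v, \\
\frac{dv}{dt} &= \frac{f(x) \eta}{\varepsilon^2 \sqrt{\tau_0}} - \frac{v}{\tau_0 \varepsilon^2}, \\
\frac{d\eta}{dt} &= \frac{g(\eta)}{\varepsilon^2} + \frac{\sqrt{2\sigma(\eta)}}{\varepsilon} \frac{dW}{dt}.
\end{aligned} \tag{11.6.14}$$

Equations (11.6.14) are of the form (11.2.1) and, under the assumption that the fast process is ergodic, the theory developed in this chapter applies. In order to calculate the effective coefficients we need to solve the stationary Fokker–Planck equation and the cell problem

$$\mathcal{L}_0^* \rho(x, v, \eta) = 0$$

and

$$-\mathcal{L}_0 h = \frac{v}{\sqrt{\tau_0}}, \tag{11.6.15}$$

where

$$\mathcal{L}_0 = g(\eta) \frac{\partial}{\partial \eta} + \sigma(\eta) \frac{\partial^2}{\partial \eta^2} + \Big(\frac{f(x) \eta}{\sqrt{\tau_0}} - \frac{v}{\tau_0}\Big) \frac{\partial}{\partial v}.$$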


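Equation (11.6.15) can be simplified considerably: we look for a solution of the form

$$h(x, v, \eta) = \sqrt{\tau_0}\, v + f(x) \hat{h}(\eta). \tag{11.6.16}$$

Substituting this expression in the cell problem we obtain, after some algebra:

$$-\mathcal{L}_\eta \hat{h} = \eta.$$

Here $\mathcal{L}_\eta$ denotes the generator of $\eta$. We assume that $\eta(t)$ is an ergodic Markov process with unique invariant measure with density $\rho_\eta(\eta)$ with respect to Lebesgue measure; the centering condition which ensures the well-posedness of this equation is

$$\int \eta\, \rho_\eta(\eta)\, d\eta = 0.$$

The homogenized SDE is

$$\frac{dX}{dt} = B(X) + \sqrt{D(X)}\, \frac{dW}{dt}, \tag{11.6.17}$$

where

$$B(x) := \int\!\!\int \Big(\frac{v}{\sqrt{\tau_0}}\, \hat{h}(\eta) \frac{\partial f(x)}{\partial x}\Big) \rho(x, v, \eta)\, dv\, d\eta$$

and

$$D(x) := 2 \int\!\!\int \Big(v^2 + \frac{v}{\sqrt{\tau_0}}\, \hat{h}(\eta) f(x)\Big) \rho(x, v, \eta)\, dv\, d\eta.$$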

Notice that (11.6.17) is neither of the Itô nor of the Stratonovich form.

In the case where $\eta(t)$ is the Ornstein–Uhlenbeck process

$$\frac{d\eta}{dt} = -\frac{\alpha}{\varepsilon^2} \eta + \sqrt{\frac{2\lambda}{\varepsilon^2}}\, \frac{dW}{dt} \tag{11.6.18}$$

we can compute the homogenized coefficients $D(X)$ and $B(X)$ explicitly. The effective SDE is

$$\frac{dX}{dt} = \frac{\lambda}{\alpha^2 (1 + \tau_0 \alpha)} f(X) f'(X) + \sqrt{\frac{2\lambda}{\alpha^2}}\, f(X) \frac{dW}{dt}. \tag{11.6.19}$$

This equation can be written in the form

$$X(t) = x_0 + \int_0^t \sqrt{\frac{2\lambda}{\alpha^2}}\, f(X)\, dW(t),$$

where the definition of the stochastic integral through Riemann sums depends on the value of $\tau_0$. Note that in the limit $\tau_0 \to \infty$ we recover the Itô stochastic integral, whereas in the limit $\tau_0 \to 0$ we recover the Stratonovich stochastic integral.
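The two limits can be read off from the drift in (11.6.19). The short computation below is a sketch; $B_{\tau_0}$ and $g$ are notation introduced here for illustration, and the identification of $\tfrac{1}{2} g g'$ as the Stratonovich correction assumes the standard scalar Itô–Stratonovich conversion rule.

```latex
\begin{aligned}
B_{\tau_0}(X) &= \frac{\lambda}{\alpha^2 (1 + \tau_0 \alpha)}\, f(X) f'(X),
  \qquad g(X) := \sqrt{\tfrac{2\lambda}{\alpha^2}}\; f(X), \\
\lim_{\tau_0 \to 0} B_{\tau_0}(X) &= \frac{\lambda}{\alpha^2}\, f(X) f'(X)
  = \tfrac{1}{2}\, g(X)\, g'(X)
  && \text{(Stratonovich correction)}, \\
\lim_{\tau_0 \to \infty} B_{\tau_0}(X) &= 0
  && \text{(no correction: It\^o)} .
\end{aligned}
```

For intermediate $\tau_0$ the drift is the fraction $1/(1 + \tau_0 \alpha)$ of the full Stratonovich correction.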


11.6.7 The Levy Area Correction

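In Section 11.6.3 we saw that smooth approximation to white noise in one dimension leads to the Stratonovich stochastic integral. This is not true in general, however, in the multidimensional case: an additional drift might appear in the limit. This spurious drift is related to the non-convergence of the Lévy area (see the discussion in Section 11.7).

Consider the fast–slow system

$$\dot{x}_1 = \frac{1}{\varepsilon} y_1, \tag{11.6.20a}$$

$$\dot{x}_2 = \frac{1}{\varepsilon} y_2, \tag{11.6.20b}$$

$$\dot{x}_3 = \frac{1}{\varepsilon} (x_1 y_2 - x_2 y_1), \tag{11.6.20c}$$

$$\dot{y}_1 = -\frac{1}{\varepsilon^2} y_1 - \alpha \frac{1}{\varepsilon^2} y_2 + \frac{1}{\varepsilon} \dot{W}_1, \tag{11.6.20d}$$

$$\dot{y}_2 = -\frac{1}{\varepsilon^2} y_2 + \alpha \frac{1}{\varepsilon^2} y_1 + \frac{1}{\varepsilon} \dot{W}_2, \tag{11.6.20e}$$

where $\alpha > 0$. Multiscale analysis leads to the following homogenized system:

$$\dot{x}_1 = \frac{1}{1 + \alpha^2} \big(\dot{W}_1 - \alpha \dot{W}_2\big), \tag{11.6.21a}$$

$$\dot{x}_2 = \frac{1}{1 + \alpha^2} \big(\dot{W}_2 + \alpha \dot{W}_1\big), \tag{11.6.21b}$$

$$\dot{x}_3 = \frac{1}{1 + \alpha^2} \big((\alpha x_1 - x_2) \dot{W}_1 + (\alpha x_2 + x_1) \dot{W}_2\big) + \frac{\alpha}{1 + \alpha^2}. \tag{11.6.21c}$$

Notice that the additional constant drift that appears in equation (11.6.21c) is not consistent with the Stratonovich interpretation of the stochastic integral. Let us write equations (11.6.20d) and (11.6.20e) in the form

$$\dot{y} = -\frac{1}{\varepsilon^2} I y + \frac{1}{\varepsilon^2} \alpha J y + \frac{1}{\varepsilon} I \dot{W},$$

where $y = (y_1, y_2)$, $W = (W_1, W_2)$, $I$ is the identity matrix in $\mathbb{R}^2$ and $J$ is the anti-symmetric (symplectic) matrix

$$J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$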


It becomes clear that the anti-symmetric part in the equation for the fast process is responsible for the presence of the spurious drift in the homogenized equation. In particular, when $\alpha = 0$ the homogenized equation becomes

$$\dot{x}_1 = \dot{W}_1, \qquad \dot{x}_2 = \dot{W}_2, \qquad \dot{x}_3 = -x_2 \dot{W}_1 + x_1 \dot{W}_2.$$
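The coefficients in the homogenized system (11.6.21) can be recovered with a few lines of linear algebra. The sketch below is an illustration under stated assumptions, not the derivation used in the text: in fast time the process $y$ is taken to solve $dy = -My\,dt + dW$ with $M = I - \alpha J$, the cell solution for the components $f_0 = (y_1, y_2)$ is assumed to be $\Phi = M^{-1} y$, and the drift in the $x_3$ equation is computed as the stationary average of $\Phi_2 y_1 - \Phi_1 y_2$. The function name `levy_area_homogenized` is hypothetical.

```python
def levy_area_homogenized(alpha):
    """Return (noise matrix, x3 drift) for the fast-slow system (11.6.20)."""
    # Fast dynamics in fast time: dy = -M y dt + dW, with M = I - alpha * J.
    M = [[1.0, alpha], [-alpha, 1.0]]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]      # equals 1 + alpha**2
    M_inv = [[M[1][1] / det, -M[0][1] / det],
             [-M[1][0] / det, M[0][0] / det]]
    # Stationary covariance S of y solves M S + S M^T = I.
    # Since M + M^T = 2 I, the solution is S = I / 2.
    S = [[0.5, 0.0], [0.0, 0.5]]
    # With Phi = M^{-1} y, the x3 drift is E[Phi_2 y_1 - Phi_1 y_2].
    e_phi2_y1 = sum(M_inv[1][k] * S[k][0] for k in range(2))
    e_phi1_y2 = sum(M_inv[0][k] * S[k][1] for k in range(2))
    return M_inv, e_phi2_y1 - e_phi1_y2


noise_matrix, drift = levy_area_homogenized(alpha=2.0)
print(noise_matrix, drift)
```

The rows of `noise_matrix` are the coefficients of $(\dot{W}_1, \dot{W}_2)$ in (11.6.21a) and (11.6.21b), and `drift` is the constant term in (11.6.21c). Setting `alpha=0.0` removes the anti-symmetric part of $M$ and the drift vanishes.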

11.7 Discussion and Bibliography

The basic perturbation expansion outlined in this chapter can be rigorously justified and weak convergence of $x$ to $X$ proved as $\varepsilon \to 0$; see Kurtz [85]. The perturbation expansion which underlies the approach is clearly exposed in [116]. The use of representations in Result 11.1 is discussed in [116]. The representations in Results 11.2 and 11.3 for the effective drift and diffusion can be used in the design of coarse time-stepping algorithms; see [151] and [37].

Studying the derivation of effective stochastic models when the original system is an ODE, as we did in the fast chaotic noise application above, is a subject investigated in some generality in [117]. The idea outlined in that example is carried out in discrete time by Beck [14], who also uses a skew-product structure to enable the analysis; the ideas can then be rigorously justified in some cases. In the paper [93] the idea that fast chaotic motion can introduce noise in slow variables is pursued for an interesting physically motivated problem where the fast chaotic behavior arises from the Burgers bath of [97].

Related work can be found in [59] and similar ideas in continuous time are addressed in [74, 73] for differential equations; however, rather than developing a systematic expansion in powers of $\varepsilon$, they find the exact solution of the Fokker-Planck equation, projected into the space $\mathcal{X}$, by use of the Mori-Zwanzig formalism [29], and then make power series expansions in $\varepsilon$ of the resulting problem.

This situation, and more general related ones, is covered in a series of papers by Papanicolaou and co-workers (see [119, 116, 117, 115]), building on original work of Khasminskii [64, 65]. See also [74, 73, 14, 96, 119, 117, 115] for further material.

A rigorous explanation of the example presented in Section 11.6.7 based on the theory of rough paths can be found in [91].


In this chapter we have considered equations of the form (11.2.1) where $U$ and $V$ are independent Brownian motions. Frequently applications arise where the noise in the two processes is correlated. We will cover such situations in Chapter 14 where we study homogenization for parabolic PDEs. The structure of the linear equations considered will be general enough to subsume the form of the backward Kolmogorov equation which arises from (11.2.1) when $U$ and $V$ are correlated.

Applications to climate models, where the atmosphere evolves quickly relative to the slow oceanic variations, are surveyed in Majda et al. [96]; we have followed the presentation in [116, 96] quite closely here. Further applications to the atmospheric sciences may be found in [98, 99]. See also [136].

11.8 Exercises

1. a. Let $\mathcal{Y}$ denote either $\mathbb{T}^d$ or $\mathbb{R}^d$. What is the generator $\mathcal{L}$ for the process $y \in \mathcal{Y}$ given by

$$\frac{dy}{dt} = g(y) + \frac{dW}{dt}?$$

In the case where $g(y) = -\nabla V(y)$ find a function in the null space of $\mathcal{L}^*$.

b. Find the homogenized SDE arising from the system

$$\frac{dx}{dt} = \frac{1}{\varepsilon} f(x, y), \qquad \frac{dy}{dt} = \frac{1}{\varepsilon^2} g(y) + \frac{1}{\varepsilon} \frac{dW}{dt},$$

in the case where $g = -\nabla V(y)$.

c. Define the cell problem, giving appropriate conditions to make the solution unique in the case $\mathcal{Y} = \mathbb{T}^d$. State clearly any assumptions on $f$ that are required in the preceding derivation.

2. a. Let $\mathcal{Y}$ be either $\mathbb{T}^m$ or $\mathbb{R}^m$. Write down the generator $\mathcal{L}_0$ for the process $y \in \mathcal{Y}$ given by:

$$\frac{dy}{dt} = g(y) + \frac{dW}{dt}.$$

In the case where $g$ is divergence-free, find a function in the null space of $\mathcal{L}_0^*$.

b. Find the averaged SDE arising from the system

$$\frac{dx}{dt} = f(x, y), \qquad \frac{dy}{dt} = \frac{1}{\varepsilon} g(y) + \frac{1}{\sqrt{\varepsilon}} \frac{dW}{dt},$$

in the case where $g$ is divergence-free.

c. Find the homogenized SDE arising from the system

$$\frac{dx}{dt} = \frac{1}{\varepsilon} f(x, y), \qquad \frac{dy}{dt} = \frac{1}{\varepsilon^2} g(y) + \frac{1}{\varepsilon} \frac{dW}{dt},$$

in the case where $g$ is divergence-free.

d. Define the cell problem, giving appropriate conditions to make the solution unique in the case $\mathcal{Y} = \mathbb{T}^d$. Clearly state any assumptions on $f$ that are required in the preceding derivation.

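3. Consider the Stokes law (11.6.9) in the case where the mass is small:

$$\begin{aligned}
\varepsilon^a \frac{d^2 x}{dt^2} &= \frac{1}{\varepsilon} f(x) y - \frac{dx}{dt} + \sigma \frac{dU}{dt}, \\
\frac{dy}{dt} &= -\frac{\alpha y}{\varepsilon^2} + \sqrt{\frac{2\alpha}{\varepsilon^2}}\, \frac{dV}{dt}.
\end{aligned} \tag{11.8.1}$$

Derive the limiting equation satisfied by $x$ in the cases $a = 1, 2$ and $a = 3$. Comment on your findings.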

4. Consider the equation of motion

$$\frac{dx}{dt} = f(x) + \sigma \frac{dW}{dt},$$

where $f$ is periodic with mean zero. It is of interest to understand how $x$ behaves on large length and timescales. To this end we rescale the equation of motion by setting $x \to x/\varepsilon$ and $t \to t/\varepsilon^2$ and introduce $z = x/\varepsilon$. Write down a pair of coupled SDEs for $x$ and $z$. Generalize the methods developed in this chapter to eliminate $z$ and obtain an effective equation for $x$.

5. Carry out the analysis presented in Section 11.6.6 in arbitrary dimensions. Does the limiting equation have the same structure as in the one-dimensional case?

6. Derive equation (11.6.19) from (11.6.17) when η(t) is given by (11.6.18).


Chapter 12

Homogenization for Elliptic Equations

12.1 Introduction

In this chapter we use multiscale expansions in order to study the problem of homogenization for elliptic problems. At a purely formal level the calculations are very similar to those used in the previous chapter to study homogenization for SDEs. The primary difference is that there is no time-dependence in the linear equations.

12.2 Full Equations

We study uniformly elliptic PDEs in divergence form, with Dirichlet boundary conditions:

$$-\nabla \cdot \Big(A\Big(\frac{x}{\varepsilon}\Big) \nabla u^\varepsilon\Big) = f, \quad \text{for } x \in \Omega, \tag{12.2.1a}$$

$$u^\varepsilon(x) = 0, \quad \text{for } x \in \partial\Omega. \tag{12.2.1b}$$

Unlike the problems in the previous four chapters, there are not two different explicit variables $x$ and $y$. We will introduce $y = x/\varepsilon$ to create a setting similar to that in the previous chapters. Our goal is then to derive a homogenized equation in which $y$ is eliminated, in the limit $\varepsilon \to 0$. Furthermore, we study various properties of the homogenized coefficients.

We take $\Omega \subset \mathbb{R}^d$, open, bounded with smooth boundary. We will assume that the coefficient $A(y)$ is smooth, 1-periodic and uniformly elliptic. Furthermore, we will take the function $f(x)$ to be smooth and independent of $\varepsilon$. The assumptions that we make are

$$f \in C^\infty(\mathbb{R}^d, \mathbb{R}); \tag{12.2.2a}$$

$$A \in C^\infty(Y, \mathbb{R}^{d \times d}); \tag{12.2.2b}$$

$$\exists\, \alpha > 0 : \ \langle \xi, A(y)\xi \rangle \geq \alpha |\xi|^2, \quad \forall\, y \in Y\ \ \forall\, \xi \in \mathbb{R}^d. \tag{12.2.2c}$$

In the above $Y = [0, 1]^d$ denotes the (reference) unit cell (which is actually the $d$-dimensional unit torus $\mathbb{T}^d$). For simplicity we assume that the matrix $A$ is symmetric, although this is not strictly necessary. With this assumption, (12.2.2c) states that the matrix-valued function $A(y)$ is uniformly positive-definite. The regularity assumptions are more stringent than is necessary; we make them at this point in order to carry out the formal calculations that follow. Allowing minimal regularity assumptions is an important issue: in many applications one expects that the coefficient $A(y)$ will have jumps when passing from one phase to the other.

12.3 Simplified Equations

Define the homogenized coefficient by the formula

$$\overline{A} = \int_Y \Big(A(y) + A(y) \nabla \chi(y)^T\Big)\, dy \tag{12.3.1}$$

where the field $\chi(y)$ satisfies the cell problem

$$-\nabla_y \cdot \Big(A(y) \nabla_y \chi^T(y)\Big) = \nabla_y \cdot A(y)^T, \qquad \chi(y) \text{ is 1-periodic}. \tag{12.3.2}$$

Result 12.1. For $\varepsilon \ll 1$ the solution $u^\varepsilon$ of equation (12.2.1) is approximately given by the solution $u$ of the homogenized equation

$$-\overline{A} : \nabla \nabla u = f, \quad \text{for } x \in \Omega, \tag{12.3.3a}$$

$$u(x) = 0, \quad \text{for } x \in \partial\Omega. \tag{12.3.3b}$$

Notice that the field $\chi$ is undetermined up to a constant vector. However, since only $\nabla \chi$ enters the homogenized equation, the value of this constant is irrelevant.
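In one dimension the cell problem (12.3.2) can be integrated once by hand, which makes formula (12.3.1) easy to evaluate numerically. The sketch below is an illustration only: it assumes $d = 1$ and a scalar coefficient $a(y) > 0$, for which (12.3.2) reads $-(a\chi')' = a'$, so that $a(1 + \chi')$ is a constant $c$ fixed by the periodicity of $\chi$. The coefficient $a(y) = 2 + \sin(2\pi y)$ and the function name `homogenized_coefficient_1d` are hypothetical choices.

```python
import math


def homogenized_coefficient_1d(a, n=2000):
    """Evaluate (12.3.1) in one dimension for a 1-periodic coefficient a(y) > 0."""
    h = 1.0 / n
    ys = [(k + 0.5) * h for k in range(n)]        # midpoint rule on [0, 1]
    # First integral of the 1D cell problem: a(y) * (1 + chi'(y)) = c.
    # Periodicity of chi forces chi' to have zero mean, which fixes c.
    c = 1.0 / (h * sum(1.0 / a(y) for y in ys))
    chi_prime = [c / a(y) - 1.0 for y in ys]
    # Formula (12.3.1) in 1D: A_bar = integral over Y of (a + a * chi').
    return h * sum(a(y) * (1.0 + cp) for y, cp in zip(ys, chi_prime))


def a(y):
    return 2.0 + math.sin(2.0 * math.pi * y)


a_bar = homogenized_coefficient_1d(a)
arithmetic_mean = 2.0
print(a_bar, arithmetic_mean)
```

In this one-dimensional setting the homogenized coefficient is the harmonic mean of $a$, which is smaller than the arithmetic mean, so naive averaging of the coefficient would overestimate the effective conductivity.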


12.4 Derivation

Since a small parameter $\varepsilon$ appears in equation (12.2.1), it is natural to look for a solution in the form of a power series expansion in $\varepsilon$:

$$u^\varepsilon = u_0 + \varepsilon u_1 + \varepsilon^2 u_2 + \dots$$

The basic idea behind the method of multiple scales is to assume that all terms in the above expansion depend explicitly on both $x$ and $y = \frac{x}{\varepsilon}$. Furthermore, since the coefficients of our PDE are periodic functions of $\frac{x}{\varepsilon}$ it is reasonable to expect that the solution is also a periodic function of its argument $\frac{x}{\varepsilon}$. Hence, we assume the following ansatz for the solution $u^\varepsilon$:

$$u^\varepsilon(x) = u_0\Big(x, \frac{x}{\varepsilon}\Big) + \varepsilon\, u_1\Big(x, \frac{x}{\varepsilon}\Big) + \varepsilon^2 u_2\Big(x, \frac{x}{\varepsilon}\Big) + \dots, \tag{12.4.1}$$

where $u_j(x, y)$, $j = 0, 1, \dots$ are periodic in $y$.

The variables $x$ and $y = \frac{x}{\varepsilon}$ represent the "slow" (macroscopic) and "fast" (microscopic) scales of the problem, respectively. For $\varepsilon \ll 1$ the variable $y$ changes much more rapidly than $x$ and we can think of $x$ as being a constant, when looking at the problem at the microscopic scale. This is where the assumption of scale separation enters: we will treat $x$ and $y$ as independent variables. Justifying the validity of this assumption as $\varepsilon \to 0$ is one of the main issues in the rigorous theory of homogenization.

The fact that $y = \frac{x}{\varepsilon}$ implies that the partial derivatives with respect to $x$ become

$$\nabla_x \to \nabla_x + \frac{1}{\varepsilon} \nabla_y.$$

In other words, the total derivative (abusing notation slightly) of a function $f^\varepsilon(x) := f\big(x, \frac{x}{\varepsilon}\big)$ can be expressed as

$$\nabla_x f^\varepsilon(x) = \nabla_x f(x, y)\Big|_{y = \frac{x}{\varepsilon}} + \frac{1}{\varepsilon} \nabla_y f(x, y)\Big|_{y = \frac{x}{\varepsilon}},$$

where the notation $h(x, y)|_{y=z}$ means that the function $h(x, y)$ is evaluated at $y = z$.
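The chain rule above is easy to confirm numerically in one dimension. The following sketch compares a central finite difference of $f^\varepsilon(x) = f(x, x/\varepsilon)$ with the two-scale expression; the test function $f(x, y) = \sin(x)\cos(2\pi y)$, the value $\varepsilon = 0.1$ and all names in the code are hypothetical choices made for illustration.

```python
import math

TWO_PI = 2.0 * math.pi


def f(x, y):
    return math.sin(x) * math.cos(TWO_PI * y)


def df_dx(x, y):
    return math.cos(x) * math.cos(TWO_PI * y)       # partial derivative in x


def df_dy(x, y):
    return -TWO_PI * math.sin(x) * math.sin(TWO_PI * y)   # partial derivative in y


eps = 0.1
x0 = 0.7
step = 1e-6

# Total derivative of f_eps(x) = f(x, x / eps) by a central difference.
total_derivative = (f(x0 + step, (x0 + step) / eps)
                    - f(x0 - step, (x0 - step) / eps)) / (2.0 * step)

# Two-scale expression: grad_x f + (1 / eps) * grad_y f, evaluated at y = x / eps.
two_scale = df_dx(x0, x0 / eps) + df_dy(x0, x0 / eps) / eps

print(total_derivative, two_scale)
```

The second term carries the factor $1/\varepsilon$, which is the origin of the $1/\varepsilon$ and $1/\varepsilon^2$ contributions when the operator is expanded in powers of $\varepsilon$.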

We use the above to rewrite the differential operator

$$\mathcal{A}^\varepsilon := -\nabla_x \cdot (A(y) \nabla_x)$$

in the form

$$\mathcal{A}^\varepsilon = \frac{1}{\varepsilon^2} \mathcal{A}_0 + \frac{1}{\varepsilon} \mathcal{A}_1 + \mathcal{A}_2, \tag{12.4.2}$$

where

$$\mathcal{A}_0 := -\nabla_y \cdot (A(y) \nabla_y), \tag{12.4.3a}$$

$$\mathcal{A}_1 := -\nabla_y \cdot (A(y) \nabla_x) - \nabla_x \cdot (A(y) \nabla_y), \tag{12.4.3b}$$

$$\mathcal{A}_2 := -\nabla_x \cdot (A(y) \nabla_x). \tag{12.4.3c}$$

Notice that the coefficients in all the operators defined above are periodic functions of $y$.

Equation (12.2.1), on account of (12.4.2), becomes:

$$\Big(\frac{1}{\varepsilon^2} \mathcal{A}_0 + \frac{1}{\varepsilon} \mathcal{A}_1 + \mathcal{A}_2\Big) u^\varepsilon = f, \quad \text{for } x \in \Omega, \tag{12.4.4a}$$

$$u^\varepsilon(x) = 0, \quad \text{for } x \in \partial\Omega. \tag{12.4.4b}$$

We substitute (12.4.1) into (12.4.4) to deduce:

$$\frac{1}{\varepsilon^2} \mathcal{A}_0 u_0 + \frac{1}{\varepsilon} (\mathcal{A}_0 u_1 + \mathcal{A}_1 u_0) + (\mathcal{A}_0 u_2 + \mathcal{A}_1 u_1 + \mathcal{A}_2 u_0) + O(\varepsilon) = f. \tag{12.4.5}$$

We equate equal powers of $\varepsilon$ in the above equation and disregard all terms of order higher than 1 to obtain the following sequence of problems:

$$O(1/\varepsilon^2):\quad \mathcal{A}_0 u_0 = 0, \qquad u_0 \text{ is 1-periodic}, \tag{12.4.6a}$$

$$O(1/\varepsilon):\quad \mathcal{A}_0 u_1 = -\mathcal{A}_1 u_0, \qquad u_1 \text{ is 1-periodic}, \tag{12.4.6b}$$

$$O(1):\quad \mathcal{A}_0 u_2 = -\mathcal{A}_1 u_1 - \mathcal{A}_2 u_0 + f, \qquad u_2 \text{ is 1-periodic}. \tag{12.4.6c}$$

Note that the elliptic operator $\mathcal{A}_0$, when equipped with periodic boundary conditions, is self-adjoint, and has a one-dimensional null-space comprising constants in $y$. From (12.4.6a) we deduce that $u_0(x, y) = u(x)$; that is, the solution is independent of $y$. The remaining two equations are of the form

$$\mathcal{A}_0 v = h, \qquad v \text{ is 1-periodic}, \tag{12.4.7}$$

with $v = v(x, y)$ and similarly $h = h(x, y)$, and $\mathcal{A}_0$ a differential operator in $y$ in which $x$ appears as a parameter. By the Fredholm alternative this equation has a solution if and only if

$$\int_Y h(x, y)\, dy = 0.$$


Among all solutions of (12.4.7) which satisfy this solvability condition, we will choose the unique solution whose integral over $Y$ vanishes:

$$\mathcal{A}_0 v = h, \qquad v \text{ is 1-periodic}, \qquad \int_Y v\, dy = 0.$$

Let us proceed now with (12.4.6b) which becomes

$$\mathcal{A}_0 u_1 = \big(\nabla_y \cdot A^T\big) \cdot \nabla_x u, \qquad u_1 \text{ is 1-periodic}, \qquad \int_Y u_1\, dy = 0. \tag{12.4.8}$$

The solvability condition is satisfied because

$$\int_Y \big(\nabla_y \cdot A^T\big) \cdot \nabla_x u\, dy = \nabla_x u \cdot \int_Y \nabla_y \cdot A^T\, dy = 0,$$

by the divergence theorem and periodicity of $A(\cdot)$. We seek a solution of (12.4.8) using separation of variables:

$$u_1(x, y) = \chi(y) \cdot \nabla_x u(x). \tag{12.4.9}$$

Upon substituting (12.4.9) into (12.4.8) we obtain the cell problem (12.3.2) for $\chi$. The field $\chi(y)$ is called the first order corrector. Notice that the periodicity of the coefficients implies that the right hand side of equation (12.3.2) averages to zero over the unit cell and consequently the cell problem is well posed. We ensure the uniqueness of solutions to (12.3.2) by requiring the corrector field to have zero average.

Now we consider equation (12.4.6c). In order for this equation to be well posed it is necessary and sufficient for the right hand side to average to zero. Since we have assumed that the function $f(x)$ is independent of $y$ the solvability condition implies:

$$\int_Y (\mathcal{A}_2 u_0 + \mathcal{A}_1 u_1)\, dy = f. \tag{12.4.10}$$

The first term on the left hand side of the above equation is

$$\begin{aligned}
\int_Y \mathcal{A}_2 u_0\, dy &= \int_Y -\nabla_x \cdot (A(y) \nabla_x u)\, dy \\
&= -\nabla_x \cdot \Big[\Big(\int_Y A(y)\, dy\Big) \nabla_x u(x)\Big] \\
&= -\Big(\int_Y A(y)\, dy\Big) : \nabla_x \nabla_x u(x).
\end{aligned} \tag{12.4.11}$$


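Moreover

\[
\int_Y \mathcal{A}_1 u_1\,dy = \int_Y \left(-\nabla_y \cdot \left(A(y)\nabla_x u_1\right) - \nabla_x \cdot \left(A(y)\nabla_y u_1\right)\right) dy =: I_1 + I_2.
\]

The first term $I_1 = 0$ by the divergence theorem. Now we consider $I_2$:

\[
\begin{aligned}
I_2 &= \int_Y -\nabla_x \cdot \left(A(y)\nabla_y u_1\right) dy \\
&= -\int_Y A(y) : \nabla_x \nabla_y \left(\chi \cdot \nabla_x u\right) dy \\
&= -\left(\int_Y \left(A(y)\nabla_y \chi(y)^T\right) dy\right) : \nabla_x \nabla_x u.
\end{aligned}
\tag{12.4.12}
\]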

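We substitute (12.4.12) and (12.4.11) in (12.4.10) to obtain the homogenized equation

\[
\begin{aligned}
-\overline{A} : \nabla_x \nabla_x u &= f, && \text{for } x \in \Omega, \\
u(x) &= 0, && \text{for } x \in \partial\Omega,
\end{aligned}
\]

where the homogenized coefficient $\overline{A}$ is given by the formula:

\[
\overline{A} = \int_Y \left(A(y) + A(y)\nabla_y \chi(y)^T\right) dy.
\]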

12.5 Properties of the Homogenized Coefficients

In this section we study some basic properties of the effective coefficients. In particular, we show that the matrix of effective coefficients $\overline{A}$ is positive definite, which implies that the homogenized differential operator is uniformly elliptic. We also show that the homogenization process can create anisotropies: even if the matrix $A(y)$ is diagonal, the matrix of homogenized coefficients $\overline{A}$ need not be.

In order to study the matrix of homogenized coefficients it is useful to find an alternative representation for $\overline{A}$. To this end, we introduce the symmetric bilinear form

\[
a_1(\psi, \phi) = \int_Y \left\langle \nabla_y \phi, A(y)\nabla_y \psi \right\rangle dy, \tag{12.5.1}
\]

defined for all functions $\phi, \psi \in C^1(Y)$. We start by obtaining an alternative, equivalent formulation for the cell problem. The formulation is closely related to the


weak form of the elliptic PDE introduced in Chapter 7. In the rest of this section we will assume that the solution of the cell problem is smooth enough to justify the calculations that follow. It will be enough to assume that each component of the vector $\chi$ is continuously differentiable and periodic: $\chi_\ell(y) \in C^1_{\mathrm{per}}(Y)$.

As usual, $e_\ell$ denotes the unit vector with $i$th entry $\delta_{i\ell}$. Also let $y_\ell$ denote the $\ell$th component of the vector $y$. Note that $e_\ell = \nabla_y y_\ell$. We let $a_\ell(y)$ denote the $\ell$th column of $A(y)$. Then the cell problem can be written as

\[
\nabla_y \cdot \left(A\nabla_y \chi_\ell + a_\ell\right) = 0, \quad \ell = 1, \dots, d.
\]

Using this we obtain the following useful lemma.

Lemma 12.2. The cell problem (12.3.2) can be written in the form

\[
a_1(\chi_\ell + y_\ell, \phi) = 0, \quad \forall \phi \in C^1_{\mathrm{per}}(Y), \quad \ell = 1, \dots, d. \tag{12.5.2}
\]

Proof. We multiply the cell problem as formulated above by an arbitrary function $\phi \in C^1_{\mathrm{per}}(Y)$ and integrate by parts over the unit cell to obtain

\[
\begin{aligned}
& \int_Y \nabla_y \phi \cdot \left(A\nabla_y \chi_\ell + a_\ell\right) dy = 0 \\
\Rightarrow\; & \int_Y \nabla_y \phi \cdot \left(A\nabla_y \chi_\ell + A e_\ell\right) dy = 0 \\
\Rightarrow\; & \int_Y \nabla_y \phi \cdot \left(A\nabla_y \chi_\ell + A\nabla_y y_\ell\right) dy = 0.
\end{aligned}
\]

This gives the desired result by symmetry of the bilinear form.

Using this lemma we give an alternative representation formula for the homogenized coefficients. Note that, since $A$ is symmetric, so is the bilinear form $a_1$. Hence the following lemma shows that $\overline{A}$ is also symmetric.

Lemma 12.3. The effective matrix $\overline{A}$ has components given by

\[
\overline{a}_{ij} = a_1(\chi_j + y_j, \chi_i + y_i), \quad i, j = 1, \dots, d. \tag{12.5.3}
\]

Proof. We use formula (12.3.1), together with the cell problem written in the form


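(12.5.2) and symmetry of $a_1(\cdot, \cdot)$, to give

\[
\begin{aligned}
\overline{a}_{ij} &= e_i \cdot \overline{A} e_j \\
&= \int_Y \left(e_i \cdot A e_j + e_i \cdot A\nabla_y \chi^T e_j\right) dy \\
&= \int_Y \nabla_y y_i \cdot A\left(\nabla_y y_j + \nabla_y \chi_j\right) dy \\
&= a_1(\chi_j + y_j, y_i).
\end{aligned}
\]

But

\[
a_1(\chi_j + y_j, \chi_i) = 0
\]

by taking $\phi = \chi_i$ in the previous lemma. Hence the result follows by the bilinearity of $a_1$.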

In addition to being symmetric, the matrix $\overline{A}$ is also positive definite.

Theorem 12.4. The matrix of homogenized coefficients $\overline{A}$ is symmetric positive definite.

Proof. The previous proof establishes symmetry; thus we need to show that there exists a constant $\alpha > 0$ such that

\[
\langle \xi, \overline{A}\xi \rangle \geq \alpha |\xi|^2, \quad \forall \xi \in \mathbb{R}^d.
\]

Let $\xi \in \mathbb{R}^d$ be an arbitrary vector. We use the representation formula (12.5.3) to deduce that

\[
\langle \xi, \overline{A}\xi \rangle = a_1(w, w),
\]

with $w = \xi \cdot (\chi + y)$. We now use the uniform positive definiteness of $A(y)$ to obtain

\[
a_1(w, w) \geq \alpha \int_Y |\nabla_y w|^2\,dy \geq 0.
\]

Thus $\overline{A}$ is nonnegative.

To show that it is positive definite we argue as follows. Let us assume that

\[
\langle \xi, \overline{A}\xi \rangle = 0
\]

for some $\xi$. Then $\nabla_y w = 0$ and $w = c$, a constant; consequently

\[
\xi \cdot y = c - \xi \cdot \chi.
\]


The right hand side of this equation is 1-periodic in $y$ and consequently the left hand side should also be. The only way this can happen is if $\xi = 0$. This completes the proof of the theorem.
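The identity $\langle \xi, \overline{A}\xi \rangle = a_1(w, w)$ used in the proof is a direct consequence of (12.5.3) and the bilinearity of $a_1$. A short worked version, using only the symbols defined in this section, might read:

```latex
\begin{align*}
\langle \xi, \overline{A}\xi \rangle
  &= \sum_{i,j=1}^{d} \xi_i\, \overline{a}_{ij}\, \xi_j
   = \sum_{i,j=1}^{d} \xi_i \xi_j\, a_1(\chi_j + y_j,\; \chi_i + y_i) \\
  &= a_1\Big( \sum_{j=1}^{d} \xi_j (\chi_j + y_j),\; \sum_{i=1}^{d} \xi_i (\chi_i + y_i) \Big)
   = a_1(w, w), \qquad w = \xi \cdot (\chi + y).
\end{align*}
```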

The above theorem shows that uniform ellipticity is a property that is preserved under the homogenization procedure. In particular, this implies that the homogenized equation is well posed, since it is a uniformly elliptic PDE with constant coefficients.

Remark 12.5. Note that homogenization does not preserve isotropy. In particular, even if the diffusion matrix $A$ has only diagonal non-zero elements, the homogenized diffusion matrix will in general have non-zero off-diagonal elements. To see this, let us assume that $a_{ij} = 0$, $i \neq j$. Then the off-diagonal elements of the homogenized diffusion matrix are given by the formula (no summation convention here)

\[
\overline{a}_{ij} = \int_Y a_{ii} \frac{\partial \chi_j}{\partial y_i}\,dy, \quad i \neq j.
\]

This expression is not necessarily equal to zero. This leads to the surprising result that an isotropic composite material can behave, in the limit as the microstructure becomes finer and finer, like an anisotropic homogeneous material.

12.6 Applications

We present two useful illustrative examples, the first in one dimension. Essentially, the one-dimensional case is the only case where the cell problem can be solved analytically and an explicit formula for the effective diffusivity can be obtained. In higher dimensions, explicit formulae for the effective diffusivities can be obtained only when the specific structure of the problem under investigation enables us to reduce the calculation of the homogenized coefficients to the one-dimensional case. Such a reduction is possible in the case of layered materials, the second example that we consider.


12.6.1 The One–Dimensional Case

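Let $d = 1$ and take $\Omega = [0, L]$. Then the Dirichlet problem (12.2.1a) reduces to a two-point boundary value problem:

\[
-\frac{d}{dx}\left(a\left(\frac{x}{\varepsilon}\right) \frac{du^\varepsilon}{dx}\right) = f, \quad x \in (0, L), \tag{12.6.1a}
\]

\[
u^\varepsilon(0) = u^\varepsilon(L) = 0. \tag{12.6.1b}
\]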

We assume, as before, that $a(y)$ is smooth and periodic with period 1. We also assume that there exist constants $0 < \alpha \leq \beta$ such that

\[
\alpha \leq a(y) \leq \beta, \quad \forall y \in [0, 1]. \tag{12.6.2}
\]

We also assume that $f$ is smooth.

The cell problem becomes a boundary value problem for an ordinary differential equation with periodic boundary conditions:

\[
-\frac{d}{dy}\left(a(y) \frac{d\chi}{dy}\right) = \frac{da(y)}{dy}, \quad y \in (0, 1), \tag{12.6.3a}
\]

\[
\chi \text{ is 1-periodic}, \quad \int_0^1 \chi(y)\,dy = 0. \tag{12.6.3b}
\]

Since $d = 1$ we only have one effective coefficient, which is given by the one-dimensional version of (12.3.1), namely

\[
\overline{a} = \int_0^1 \left(a(y) + a(y)\frac{d\chi(y)}{dy}\right) dy =: \left\langle a(y)\left(1 + \frac{d\chi(y)}{dy}\right) \right\rangle. \tag{12.6.4}
\]

Here, and in the remainder of this chapter, we employ the notation

\[
\langle f(y) \rangle := \int_Y f(y)\,dy
\]

for the average over $Y$.

Equation (12.6.3a) can be solved exactly. Integration from 0 to $y$ gives

\[
a(y)\frac{d\chi}{dy} = -a(y) + c_1. \tag{12.6.5}
\]


The constant $c_1$ is undetermined at this point. The inequality (12.2.2c) allows us to divide (12.6.5) by $a(y)$ since it implies that $a$ is strictly positive. We then integrate once again from 0 to $y$ to deduce:

\[
\chi(y) = -y + c_1 \int_0^y \frac{1}{a(y)}\,dy + c_2.
\]

In order to determine the constant $c_1$ we use the fact that $\chi(y)$ is a periodic function:

\[
\chi(0) = \chi(1) \;\Rightarrow\; 0 = 1 - c_1 \int_0^1 \frac{1}{a(y)}\,dy \;\Rightarrow\; c_1 = \frac{1}{\int_0^1 \frac{1}{a(y)}\,dy} =: \langle a(y)^{-1} \rangle^{-1}.
\]

Thus, from (12.6.5),

\[
1 + \frac{d\chi}{dy} = \frac{1}{\langle a(y)^{-1} \rangle\, a(y)}.
\]

Notice that $c_2$ is not required. We substitute this expression in equation (12.6.4) to obtain:

\[
\overline{a} = \langle a(y)^{-1} \rangle^{-1}. \tag{12.6.6}
\]

This is the formula which gives the homogenized coefficient in one dimension. It shows clearly that, even in this simple one-dimensional setting, the homogenized coefficient is not found by simply averaging the unhomogenized coefficients over a period of the microstructure. Rather, the inverse homogenized coefficient is the average of the inverse of the unhomogenized coefficient.
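The formula (12.6.6) can be checked numerically. The following is a minimal sketch, not part of the original text: the coefficient $a(y) = 2 + \sin(2\pi y)$, the choices $f = 1$, $L = 1$, $\varepsilon = 1/50$, the grid sizes and the function names are all illustrative assumptions. It evaluates the harmonic mean with the midpoint rule, solves (12.6.1) with a simple finite-volume scheme, and compares the result with the solution of the homogenized problem $-\overline{a}u'' = 1$.

```python
import numpy as np

def harmonic_mean_coefficient(a, n=4096):
    """Midpoint-rule approximation of <a^{-1}>^{-1} over one period [0, 1)."""
    y = (np.arange(n) + 0.5) / n
    return 1.0 / np.mean(1.0 / a(y))

def solve_dirichlet_bvp(coef, f, n):
    """Finite-volume solve of -(coef(x) u')' = f on (0, 1), u(0) = u(1) = 0."""
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)        # grid nodes x_0, ..., x_n
    c = coef((np.arange(n) + 0.5) * h)      # coefficient at cell midpoints
    main = (c[:-1] + c[1:]) / h**2          # diagonal entries (interior nodes)
    off = -c[1:-1] / h**2                   # coupling between neighbouring nodes
    M = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    u = np.zeros(n + 1)                     # boundary values stay zero
    u[1:-1] = np.linalg.solve(M, f(x[1:-1]))
    return x, u

# Illustrative (assumed) data: a(y) = 2 + sin(2 pi y), f = 1, L = 1, eps = 1/50.
a = lambda y: 2.0 + np.sin(2.0 * np.pi * y)
eps = 1.0 / 50.0

a_bar = harmonic_mean_coefficient(a)                   # formula (12.6.6)
a_avg = np.mean(a((np.arange(4096) + 0.5) / 4096))     # plain average <a>

x, u_eps = solve_dirichlet_bvp(lambda s: a(s / eps),
                               lambda s: np.ones_like(s), n=1000)
u_hom = x * (1.0 - x) / (2.0 * a_bar)                  # solves -a_bar u'' = 1
max_gap = np.max(np.abs(u_eps - u_hom))

print(f"a_bar = {a_bar:.6f}, <a> = {a_avg:.6f}, max gap = {max_gap:.2e}")
```

For this particular coefficient the integral of $1/a$ over a period equals $1/\sqrt{3}$, so the computed $\overline{a}$ should be close to $\sqrt{3}$, strictly below the arithmetic average $\langle a \rangle = 2$.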

12.6.2 Layered Materials

We consider problem (12.2.1), with assumptions (12.2.2) satisfied, in two dimensions. We assume that the domain $\Omega \subset \mathbb{R}^2$ represents a layered material: the properties of the material change only in one direction. Hence, the coefficients $A(y)$ are functions of one variable: for $y = (y_1, y_2)^T$ we have

\[
a_{ij} = a_{ij}(y_1), \quad i, j = 1, 2. \tag{12.6.7}
\]

The fact that the coefficients are functions of $y_1$ implies that the right hand side of the cell problem (12.3.2) is a function of $y_1$ alone. As a consequence the solution of the cell problem is also a function of $y_1$ alone and takes the form

\[
\chi_\ell = \chi_\ell(y_1), \quad \ell = 1, 2. \tag{12.6.8}
\]


Upon substitution into (12.3.2) we conclude that the cell problem becomes

\[
-\frac{d}{dy_1}\left(a_{11}(y_1)\frac{d\chi_\ell(y_1)}{dy_1}\right) = \frac{da_{1\ell}(y_1)}{dy_1}, \quad \ell = 1, 2, \tag{12.6.9}
\]

with periodic boundary conditions. Similarly, the formula for the homogenized coefficients (12.3.1) becomes:

\[
\overline{a}_{ij} = \int_0^1 \left(a_{ij}(y_1) + a_{i1}(y_1)\frac{d\chi_j(y_1)}{dy_1}\right) dy_1, \quad i, j = 1, 2. \tag{12.6.10}
\]

Let us now solve equations (12.6.9). These are ordinary differential equations and we can solve them in exactly the same way that we solved the one-dimensional problem in the preceding subsection. To this end, we integrate from 0 to $y_1$ and divide through by $a_{11}(y_1)$ to obtain

\[
\frac{d\chi_\ell}{dy_1} = -\frac{a_{1\ell}}{a_{11}} + c_1 \frac{1}{a_{11}}, \quad \ell = 1, 2, \tag{12.6.11}
\]

where the constant $c_1$ is to be determined. We have to consider the cases $\ell = 1$ and $\ell = 2$ separately. We start with $\ell = 1$. In this case the above equation simplifies to

\[
\frac{d\chi_1}{dy_1} = -1 + c_1 \frac{1}{a_{11}},
\]

which is precisely the equation that we considered in Section 12.6.1. Thus, we have:

\[
\frac{d\chi_1}{dy_1} = -1 + \frac{1}{\langle a_{11}(y_1)^{-1} \rangle\, a_{11}(y_1)}. \tag{12.6.12}
\]

Now we consider equation (12.6.11) for the case $\ell = 2$:

\[
\frac{d\chi_2}{dy_1} = -\frac{a_{12}}{a_{11}} + c_1 \frac{1}{a_{11}}.
\]

We integrate the above equation once again and then determine the coefficient $c_1$ by requiring $\chi_2(y_1)$ to be periodic. The final result is

\[
\frac{d\chi_2(y_1)}{dy_1} = -\frac{a_{12}(y_1)}{a_{11}(y_1)} + \frac{\langle a_{12}(y_1)/a_{11}(y_1) \rangle}{\langle a_{11}^{-1}(y_1) \rangle} \frac{1}{a_{11}(y_1)}. \tag{12.6.13}
\]

Now we can compute the homogenized coefficients. We start with $\overline{a}_{11}$. The calculation is the same as in the one-dimensional case:

\[
\overline{a}_{11} = \langle a_{11}(y_1)^{-1} \rangle^{-1}. \tag{12.6.14}
\]


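We proceed with the calculation of $\overline{a}_{12}$. We substitute (12.6.13) into (12.6.10) with $i = 1$, $j = 2$ to deduce:

\[
\begin{aligned}
\overline{a}_{12} &= \int_0^1 \left(a_{12}(y_1) + a_{11}(y_1)\frac{d\chi_2(y_1)}{dy_1}\right) dy_1 \\
&= \int_0^1 \left(a_{12}(y_1) + a_{11}(y_1)\left(-\frac{a_{12}(y_1)}{a_{11}(y_1)} + \frac{\langle a_{12}(y_1)/a_{11}(y_1) \rangle}{\langle a_{11}^{-1}(y_1) \rangle} \frac{1}{a_{11}(y_1)}\right)\right) dy_1 \\
&= \int_0^1 \left(a_{12}(y_1) - a_{12}(y_1) + \frac{\langle a_{12}(y_1)/a_{11}(y_1) \rangle}{\langle a_{11}^{-1}(y_1) \rangle}\right) dy_1 \\
&= \frac{\langle a_{12}(y_1)/a_{11}(y_1) \rangle}{\langle a_{11}^{-1}(y_1) \rangle}.
\end{aligned}
\]

Hence

\[
\overline{a}_{12} = \left\langle \frac{a_{12}(y_1)}{a_{11}(y_1)} \right\rangle \langle a_{11}^{-1}(y_1) \rangle^{-1}. \tag{12.6.15}
\]

By the symmetry of $A$, and hence of $\overline{A}$, we have

\[
\overline{a}_{21} = \left\langle \frac{a_{21}(y_1)}{a_{11}(y_1)} \right\rangle \langle a_{11}^{-1}(y_1) \rangle^{-1}. \tag{12.6.16}
\]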

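Finally we consider $\overline{a}_{22}$:

\[
\begin{aligned}
\overline{a}_{22} &= \int_0^1 \left(a_{22}(y_1) + a_{21}(y_1)\frac{d\chi_2(y_1)}{dy_1}\right) dy_1 \\
&= \int_0^1 \left(a_{22}(y_1) + a_{21}(y_1)\left(-\frac{a_{12}(y_1)}{a_{11}(y_1)} + \frac{\langle a_{12}(y_1)/a_{11}(y_1) \rangle}{\langle a_{11}^{-1}(y_1) \rangle} \frac{1}{a_{11}(y_1)}\right)\right) dy_1 \\
&= \int_0^1 \left(a_{22}(y_1) - \frac{a_{12}(y_1)a_{21}(y_1)}{a_{11}(y_1)} + \frac{a_{21}(y_1)}{a_{11}(y_1)} \frac{\langle a_{12}(y_1)/a_{11}(y_1) \rangle}{\langle a_{11}^{-1}(y_1) \rangle}\right) dy_1 \\
&= \left\langle \frac{a_{21}(y_1)}{a_{11}(y_1)} \right\rangle \left\langle \frac{a_{12}(y_1)}{a_{11}(y_1)} \right\rangle \langle a_{11}^{-1}(y_1) \rangle^{-1} + \left\langle a_{22}(y_1) - \frac{a_{12}(y_1)a_{21}(y_1)}{a_{11}(y_1)} \right\rangle.
\end{aligned}
\]

Consequently:

\[
\overline{a}_{22} = \left\langle \frac{a_{21}(y_1)}{a_{11}(y_1)} \right\rangle \left\langle \frac{a_{12}(y_1)}{a_{11}(y_1)} \right\rangle \langle a_{11}^{-1}(y_1) \rangle^{-1} + \left\langle a_{22}(y_1) - \frac{a_{12}(y_1)a_{21}(y_1)}{a_{11}(y_1)} \right\rangle. \tag{12.6.17}
\]

It is evident from formulae (12.6.14), (12.6.15), (12.6.16) and (12.6.17) that the homogenized coefficients depend on the original ones in a very complicated, highly nonlinear way.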
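To make the dependence in (12.6.14) to (12.6.17) concrete, the averages can be evaluated numerically. The sketch below is illustrative only: the function name `layered_homogenized`, the sample coefficients and the midpoint-rule quadrature are assumptions, and the matrix $A(y_1)$ is taken to be symmetric, so that $a_{21} = a_{12}$.

```python
import numpy as np

def layered_homogenized(a11, a12, a22, n=4096):
    """Evaluate (12.6.14)-(12.6.17) for a symmetric layered A(y1), a21 = a12."""
    y1 = (np.arange(n) + 0.5) / n                  # midpoint rule on one period
    A11, A12, A22 = a11(y1), a12(y1), a22(y1)
    inv_mean = 1.0 / np.mean(1.0 / A11)            # <a11^{-1}>^{-1}
    ratio = np.mean(A12 / A11)                     # <a12 / a11>
    abar11 = inv_mean                              # (12.6.14)
    abar12 = ratio * inv_mean                      # (12.6.15), equal to (12.6.16)
    abar22 = ratio**2 * inv_mean + np.mean(A22 - A12**2 / A11)   # (12.6.17)
    return np.array([[abar11, abar12], [abar12, abar22]])

# Assumed sample coefficients; A(y1) is uniformly positive definite here
# because a11 >= 1, a22 = 3 and a12^2 <= 0.25.
a11 = lambda y: 2.0 + np.sin(2.0 * np.pi * y)
a12 = lambda y: 0.5 * np.sin(2.0 * np.pi * y)
a22 = lambda y: 3.0 + 0.0 * y

A_bar = layered_homogenized(a11, a12, a22)
print(A_bar)
print(np.linalg.eigvalsh(A_bar))   # eigenvalues of the homogenized matrix
```

In this example $a_{12}$ averages to zero over a period, yet $\overline{a}_{12} = \tfrac{1}{2}\sqrt{3} - 1 \neq 0$, which illustrates that the homogenized entries are not the averages of the original ones.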


12.7 Discussion and Bibliography

The method of multiple scales is developed and used systematically in [17], where references to the earlier literature can be found. See also [76] and the references therein. The one-dimensional problem (see Section 12.6.1) was studied in [138], without using the method of multiple scales. In the one-dimensional case it is possible to derive the homogenized equation using the method of multiple scales even in the non-periodic setting; see [68, Ch. 5], [30, Ch. 5]. The homogenized equation for layered materials (see Section 12.6.2) was derived rigorously by Murat and Tartar without any appeal to the method of multiple scales; see [107] and the references to the original papers therein. The two-dimensional case that we treated in Section 12.6.2 can be easily extended to the $d$-dimensional one, $d > 2$, i.e. to the case where $a_{ij}(y) = a_{ij}(y_1)$, $i, j = 1, \dots, d$. See [107].

The results that we presented in this chapter can be extended and generalized in various ways. Below we discuss some generalizations of the basic theory.

Different boundary conditions. The elliptic boundary value problem (12.2.1) that we considered in the previous section was a Dirichlet problem. However, an inspection of the analysis presented in Section 12.4 reveals that the boundary conditions did not play any role in the derivation of the homogenized equation. In particular, the two-scale expansion (12.4.1) that we used in order to derive the homogenized equation did not contain any information concerning the boundary conditions of the problem under investigation. Indeed, the boundary conditions become somewhat irrelevant in the homogenization procedure. Exactly the same calculations would enable us to obtain the homogenized equation for Neumann or mixed boundary conditions. This is not surprising since the derivation of the homogenized equation was based on the analysis of "local" problems of the form (12.4.7). This local problem cannot really "see" the boundary.

The boundary conditions become very important when trying to prove the homogenization theorem. The fact that the two-scale expansion (12.4.1) does not satisfy the boundary conditions of our PDE exactly but, rather, only up to $O(\varepsilon)$ introduces boundary layers [68, Ch. 3].¹ Boundary layers affect the convergence rate, i.e. the rate with which $u^\varepsilon(x)$ converges to $u(x)$ as $\varepsilon \to 0$. We can solve this problem by modifying the two-scale expansion (12.4.1), adding additional terms which take care of the boundary layer and vanish exponentially fast as we move away from the boundary so that they do not affect the solution in the interior. We refer to [12] for details.

¹ The presence of boundary and initial layers is a common feature in all problems of singular perturbations. See e.g. [68] and [77] for further details.

Locally Periodic Coefficients. In the Dirichlet problem that we analyzed in Section 12.4 we assumed that the coefficients $\{a^\varepsilon_{ij}(x)\}_{i,j=1}^d$ depend only on the microscale, i.e.

\[
A^\varepsilon(x) = A\left(\frac{x}{\varepsilon}\right),
\]

with $A(y)$ being a 1-periodic matrix-valued function. However, the method of multiple scales would also be applicable to the case where the coefficients depend explicitly on the macroscale as well as the microscale:

\[
A^\varepsilon(x) = A\left(x, \frac{x}{\varepsilon}\right),
\]

with $A(x, y)$ being 1-periodic in $y$ and smooth in $x$. When the coefficients have this form they are called locally periodic or non-uniformly periodic. Analysis similar to the one presented in Section 12.4 enables us to obtain the homogenized equation for the Dirichlet problem

\[
-\nabla_x \cdot \left(A\left(x, \frac{x}{\varepsilon}\right) \nabla_x u^\varepsilon\right) = f, \quad \text{for } x \in \Omega, \tag{12.7.1a}
\]

\[
u^\varepsilon(x) = 0, \quad \text{for } x \in \partial\Omega. \tag{12.7.1b}
\]

Now the homogenized coefficients are functions of $x$:

\[
-\nabla_x \cdot \left(\overline{A}(x)\nabla_x u\right) = f, \quad \text{for } x \in \Omega, \tag{12.7.2a}
\]

\[
u(x) = 0, \quad \text{for } x \in \partial\Omega, \tag{12.7.2b}
\]

and the cell problem reads:

\[
-\nabla_y \cdot \left(A(x, y)\nabla_y \chi^T(x, y)\right) = \nabla_y \cdot A(x, y)^T. \tag{12.7.3}
\]

The homogenized coefficients are given by the formula:

\[
\overline{A}(x) = \int_Y \left(A(x, y) + A(x, y)\nabla_y \chi(x, y)^T\right) dy. \tag{12.7.4}
\]

We emphasize the fact that the "macroscopic variable" $x$ enters in the above two equations as a parameter. Consequently, in order to compute the effective coefficients we need to solve the cell problem (12.7.3) and evaluate the integrals in (12.7.4) at all points $x \in \Omega$.
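In one dimension the parametric role of $x$ is easy to illustrate, because for each fixed $x$ the cell problem has the same structure as (12.6.3) and the calculation of Section 12.6.1 gives $\overline{a}(x) = \langle a(x, \cdot)^{-1} \rangle^{-1}$. The sketch below evaluates this on a few macroscopic grid points; the coefficient `a(x, y)`, the grid and the quadrature are illustrative assumptions, not taken from the text.

```python
import numpy as np

def a(x, y):
    """Assumed locally periodic coefficient: smooth in x, 1-periodic in y."""
    return (1.0 + x) * (2.0 + np.sin(2.0 * np.pi * y))

def a_bar(x, n=2048):
    """Harmonic mean in y at a fixed macroscopic point x (midpoint rule)."""
    y = (np.arange(n) + 0.5) / n
    return 1.0 / np.mean(1.0 / a(x, y))

# One cell computation per macroscopic point: x enters only as a parameter.
x_grid = np.linspace(0.0, 1.0, 5)
a_bar_on_grid = np.array([a_bar(x) for x in x_grid])

for x, value in zip(x_grid, a_bar_on_grid):
    print(f"x = {x:.2f}: a_bar(x) = {value:.6f}")
```

For this separable choice the exact answer is $\overline{a}(x) = (1 + x)\sqrt{3}$, which gives a simple check on the loop over $x$.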


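Semilinear PDE. It is possible to study semilinear elliptic PDE of the form

\[
-\nabla \cdot \left(A\left(\frac{x}{\varepsilon}\right) \nabla u^\varepsilon\right) = f(u^\varepsilon), \quad \text{for } x \in \Omega, \tag{12.7.5a}
\]

\[
u^\varepsilon(x) = 0, \quad \text{for } x \in \partial\Omega. \tag{12.7.5b}
\]

The homogenized equation takes the form

\[
-\overline{A} : \nabla\nabla u = f(u), \quad \text{for } x \in \Omega, \tag{12.7.6a}
\]

\[
u(x) = 0, \quad \text{for } x \in \partial\Omega. \tag{12.7.6b}
\]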

Higher Order Correctors. In Section 12.2 we studied homogenization for the Dirichlet boundary value problem (12.2.1) using the method of multiple scales. We derived the homogenized equation (12.3.3) and the cell problem (12.3.2). We also computed the first order correction $u_1\left(x, \frac{x}{\varepsilon}\right)$:

\[
u_1\left(x, \frac{x}{\varepsilon}\right) = \chi\left(\frac{x}{\varepsilon}\right) \cdot \nabla_x u(x) + \widehat{u}_1(x). \tag{12.7.7}
\]

We can proceed with solving equation (12.4.6) and computing the second corrector field $u_2(x, y)$. This is given by

\[
u_2(x, y) = \Theta(y) : \nabla_x \nabla_x u + \widehat{u}_2(x), \tag{12.7.8}
\]

where the second order corrector field $\{\Theta_{ij}(y)\}_{i,j=1}^d$ satisfies

\[
\mathcal{A}_0 \Theta = B. \tag{12.7.9}
\]

Here $B(y)$ is given by

\[
B(y) := -\overline{A} + A(y) + A(y)\nabla_y \chi(y)^T + \chi(y) \otimes \nabla_y \left(A(y)^T\right).
\]

Reiterated Homogenization. The method of multiple scales can be extended to situations where there are $k$ length scales in the problem, i.e. when the matrix $A^\varepsilon$ has the form

\[
A^\varepsilon = A\left(\frac{x}{\varepsilon}, \frac{x}{\varepsilon^2}, \dots, \frac{x}{\varepsilon^k}\right),
\]

and $A$ is 1-periodic in all of its arguments. This is known as reiterated homogenization [17, Sec. 1.8]. A rigorous analysis of reiterated homogenization in a quite general setting is presented in [5]. Reiterated homogenization has recently found applications in the problem of advection and diffusion of passive tracers in fluids. See, e.g. [122, 103, 104] for details. When there are infinitely many scales in the problem, without a clear separation, the homogenization result breaks down. See [15].

Bounds on the Effective Coefficients. In general it is not possible to compute the homogenized coefficients analytically; indeed, their calculation requires the solution of the cell problem and the calculation of the integrals in (12.3.1). In most cases this can be done only numerically. It is possible, however, to obtain bounds on the magnitude of the effective coefficients. Various tools, for example variational formulas for the effective coefficients, have been developed. We refer to [106, 144] for various results in this direction.

Numerical methods for Elliptic PDE with rapidly oscillating coefficients. The numerical evaluation of homogenized coefficients, in the periodic setting, can be performed efficiently using a spectral method. On the other hand, the numerical solution of the original boundary value problem (12.2.1) when $\varepsilon$ is small is a very hard problem. Special methods, which in one way or the other are based on homogenization, have been developed over the last few years. We refer to [69, 35, 1, 39] and the references therein on this topic.
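As a rough illustration of how homogenized coefficients are evaluated numerically, the sketch below solves the one-dimensional cell problem (12.6.3) on a periodic grid and then evaluates (12.6.4). It uses a plain second-order finite-difference discretization rather than the spectral method mentioned above, and the function name, the grid size and the test coefficient are all assumptions made for the example.

```python
import numpy as np

def solve_cell_problem_1d(a, n=200):
    """Periodic finite-difference solve of -(a chi')' = a' on [0, 1)."""
    h = 1.0 / n
    idx = np.arange(n)
    a_mid = a((idx + 0.5) * h)          # a at y_{i+1/2}
    a_left = np.roll(a_mid, 1)          # a at y_{i-1/2}, with periodic wrap
    M = np.zeros((n, n))
    M[idx, idx] = (a_mid + a_left) / h**2
    M[idx, (idx + 1) % n] -= a_mid / h**2
    M[idx, (idx - 1) % n] -= a_left / h**2
    rhs = (a_mid - a_left) / h          # centred difference of a at the nodes
    # M is singular (constants lie in its null space); the right-hand side
    # sums to zero, so the system is consistent and lstsq returns a solution.
    chi, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    chi -= chi.mean()                   # enforce the zero-average condition
    dchi = (np.roll(chi, -1) - chi) / h # chi' at the midpoints y_{i+1/2}
    a_bar = np.mean(a_mid * (1.0 + dchi))   # discrete version of (12.6.4)
    return chi, a_bar

# Assumed test coefficient, for which <a^{-1}>^{-1} = sqrt(3).
a = lambda y: 2.0 + np.sin(2.0 * np.pi * y)
chi, a_bar = solve_cell_problem_1d(a)
print(a_bar, 1.0 / np.mean(1.0 / a((np.arange(200) + 0.5) / 200)))
```

The two printed numbers should agree closely, since the discrete flux $a(1 + \chi')$ is constant across the grid, mirroring the continuous calculation that leads to (12.6.6). In higher dimensions the same assemble-and-solve pattern applies, but the linear systems become much larger, which is where spectral or other specialized solvers become attractive.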

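The Method of Multiple Scales for Evolution PDE. The method in this chapter readily extends to initial/boundary value problems such as the following parabolic (diffusion) PDE:

\[
\frac{\partial u^\varepsilon}{\partial t} - \nabla_x \cdot \left(A\left(\frac{x}{\varepsilon}\right) \nabla_x u^\varepsilon\right) = f(x, t) \quad \text{in } \Omega \times (0, T), \tag{12.7.10a}
\]

\[
u^\varepsilon(x, t) = 0 \quad \text{on } \partial\Omega \times (0, T), \tag{12.7.10b}
\]

\[
u^\varepsilon(x, 0) = u_{in}(x) \quad \text{in } \Omega. \tag{12.7.10c}
\]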

Similarly, one can also study the problem of homogenization for hyperbolic (wave) equations:

\[
\frac{\partial^2 u^\varepsilon}{\partial t^2} - \nabla_x \cdot \left(A\left(\frac{x}{\varepsilon}\right) \nabla_x u^\varepsilon\right) = f(x, t) \quad \text{in } \Omega \times (0, T), \tag{12.7.11a}
\]

\[
u^\varepsilon(x, t) = 0 \quad \text{on } \partial\Omega \times (0, T), \tag{12.7.11b}
\]

\[
u^\varepsilon(x, 0) = u_{in}(x) \quad \text{in } \Omega, \tag{12.7.11c}
\]

\[
\frac{\partial u^\varepsilon}{\partial t}(x, 0) = v_{in}(x) \quad \text{in } \Omega. \tag{12.7.11d}
\]

Another time-dependent situation of interest arises when the coefficients of the evolution PDE oscillate in time as well as space. Consider the following parabolic PDE:

\[
\frac{\partial u^\varepsilon}{\partial t} - \nabla_x \cdot \left(A\left(\frac{x}{\varepsilon}, \frac{t}{\varepsilon^2}\right) \nabla_x u^\varepsilon\right) = f(x, t) \quad \text{in } \Omega \times (0, T), \tag{12.7.12a}
\]

\[
u^\varepsilon(x, t) = 0 \quad \text{on } \partial\Omega \times (0, T), \tag{12.7.12b}
\]

\[
u^\varepsilon(x, 0) = u_{in}(x) \quad \text{in } \Omega. \tag{12.7.12c}
\]

Now we take the coefficients $A(y, \tau)$ to be 1-periodic in both $y$ and $\tau$. This means that we have to introduce two fast variables: $y = x/\varepsilon$ and $\tau = t/\varepsilon^2$. More information on homogenization for evolution equations with space-time dependent coefficients can be found in [17, Ch. 3].

We study parabolic problems in the next chapter. However, in contrast to what is covered in this chapter, we consider spatial operators which are not in divergence form.

12.8 Exercises

1. Consider the problem of homogenization for (12.2.1) when the coefficient matrix $A(y)$ has a different period in each direction:

\[
A(y + \lambda_k e_k) = A(y), \quad k = 1, \dots, d,
\]

with $\lambda_k > 0$, $k = 1, \dots, d$. Write down the formulas for the homogenized coefficients.

2. Consider the two-scale expansion (12.4.1) for problem (12.2.1). In this chapter we calculated the first three terms in the two-scale expansion: $u_0$ solves the homogenized equation, $u_1$ is given by (12.7.7) and $u_2$ by (12.7.8). Solve equations (12.7.7) (after checking that the solvability condition is satisfied) to compute all higher order terms in the expansion and to obtain the corresponding cell problems.


3. Consider the Dirichlet problem (12.2.1) for a $d$-dimensional layered material, i.e.

\[
a_{ij}(y) = a_{ij}(y_1), \quad \text{1-periodic in } y_1, \quad i, j = 1, \dots, d.
\]

Solve the corresponding cell problem and obtain formulas for the homogenized coefficients for $d > 3$, arbitrary.

4. Consider the Dirichlet problem (12.2.1) for a $d$-dimensional isotropic material, i.e.

\[
a_{ij}(y) = a(y)\delta_{ij}, \quad \text{1-periodic}, \quad i, j = 1, \dots, d,
\]

where $\delta_{ij}$ stands for Kronecker's delta.

   a. Use the specific structure of $A(y)$ to simplify the cell problem as much as you can.

   b. Let $d = 2$ and assume that $a(y)$ is of the form

   \[
   a(y) = Y_1(y_1)Y_2(y_2).
   \]

   Solve the two components of the cell problem and obtain formulas for the homogenized coefficients (hint: use separation of variables).

5. Consider the boundary value problem (12.7.1). Assume that $A(x, y)$ is smooth, 1-periodic in $y$ and uniformly elliptic and that, furthermore, $f$ is smooth. Use the method of multiple scales to obtain generalizations of the homogenized equation (12.7.2), the cell problem (12.7.3) and the formula for the homogenized coefficients (12.7.4). Verify that the results of Section 12.5 still hold.

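6. Consider the Dirichlet problem

\[
-\nabla \cdot \left(A\left(\frac{x}{\varepsilon}, \frac{x}{\varepsilon^2}\right) \nabla u^\varepsilon\right) = f, \quad \text{for } x \in \Omega, \tag{12.8.1a}
\]

\[
u^\varepsilon(x) = 0, \quad \text{for } x \in \partial\Omega, \tag{12.8.1b}
\]

where the coefficients $A(y, z)$ are periodic in both $y$ and $z$ with period 1. Use the 3-scale expansion

\[
u^\varepsilon(x) = u_0\left(x, \frac{x}{\varepsilon}, \frac{x}{\varepsilon^2}\right) + \varepsilon u_1\left(x, \frac{x}{\varepsilon}, \frac{x}{\varepsilon^2}\right) + \varepsilon^2 u_2\left(x, \frac{x}{\varepsilon}, \frac{x}{\varepsilon^2}\right) + \dots
\]

to derive an effective homogenized equation, together with the formula for the homogenized coefficients and two cell problems.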


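7. Repeat the previous exercise by homogenizing first with respect to $z = y/\varepsilon$ and then with respect to $y$:

   a. Homogenize the equation

   \[
   -\nabla \cdot \left(A\left(y, \frac{y}{\varepsilon}\right) \nabla u^\varepsilon\right) = f, \quad \text{for } x \in \Omega, \tag{12.8.2a}
   \]

   \[
   u^\varepsilon(x) = 0, \quad \text{for } x \in \partial\Omega, \tag{12.8.2b}
   \]

   by treating $y$ as a parameter.

   b. Homogenize the equation

   \[
   -\nabla \cdot \left(\overline{A}\left(\frac{x}{\varepsilon}\right) \nabla u^\varepsilon\right) = f, \quad \text{for } x \in \Omega, \tag{12.8.3a}
   \]

   \[
   u^\varepsilon(x) = 0, \quad \text{for } x \in \partial\Omega, \tag{12.8.3b}
   \]

   where $\overline{A}(y)$ is given by the expression derived in the preceding part of the question.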

8. Derive the homogenized equation, together with the cell problem and the formula for the homogenized coefficients, by applying the method of multiple scales to the equation (12.7.10).

9. Consider the initial boundary value problem (12.7.12). Explain why it is natural for the period of oscillations in time to be $\varepsilon^2$ when the period of oscillations in space is $\varepsilon$. Carry out the homogenization analysis based on the method of multiple scales for the cases where the coefficients are of the form $\{a_{ij}(\frac{x}{\varepsilon}, \frac{t}{\varepsilon})\}_{i,j=1}^d$ and $\{a_{ij}(\frac{x}{\varepsilon}, \frac{t}{\varepsilon^3})\}_{i,j=1}^d$.²

10. Use the method of multiple scales to derive the homogenized equation from (12.7.11).

11. Prove that the homogenized coefficient $\overline{a}$ for equation (12.6.1) under (12.6.2) has the same upper and lower bound as $a(y)$. Moreover, show that it is bounded from above by the average of $a(y)$:

\[
\alpha \leq \overline{a} \leq \beta,
\]

and

\[
\overline{a} \leq \langle a(y) \rangle.
\]

² See [17, Ch. 3] for further details on the derivation of the homogenized equations using the method of multiple scales.


12. Show that the equation (12.7.5) can be homogenized to obtain the effective equation (12.7.6).

13. a. Consider the eigenvalue problem

   \[
   -\Delta u^\varepsilon + \frac{1}{\varepsilon} f\left(\frac{x}{\varepsilon}\right) u^\varepsilon = \lambda^\varepsilon u^\varepsilon, \quad x \in \Omega,
   \]

   \[
   u^\varepsilon = 0, \quad x \in \partial\Omega.
   \]

   Assume that $f : \mathbb{T}^d \to \mathbb{R}$ is smooth and periodic and that

   \[
   \int_{\mathbb{T}^d} f(y)\,dy = 0.
   \]

   Use a multiscale expansion to find an approximation to the eigenvalue problem in which $\varepsilon \to 0$ is eliminated.

   b. Are the resulting eigenvalues smaller or larger than the eigenvalues which arise when $f \equiv 0$?

14. a. Let $A : \mathbb{T}^d \to \mathbb{R}^{d \times d}$ be smooth and periodic and consider the eigenvalue problem

   \[
   -\nabla \cdot \left(A\left(\frac{x}{\varepsilon}\right) \nabla u^\varepsilon\right) = \lambda^\varepsilon u^\varepsilon, \quad x \in \Omega,
   \]

   \[
   u^\varepsilon = 0, \quad x \in \partial\Omega.
   \]

   Use a multiscale expansion to find an approximation to the eigenvalue problem in which $\varepsilon \to 0$ is eliminated.

   b. State conditions on the matrix $A(y)$ under which the homogenized equation is a well-posed elliptic boundary value problem.


Chapter 13

Homogenization for Parabolic Equations

13.1 Introduction

In this chapter we use multiscale techniques to investigate the long time behavior of solutions to parabolic PDEs. The techniques employed are almost identical to those used in the study of homogenization for SDEs in Chapter 11. This connection will be made more explicit at the end of the chapter.

13.2 Full Equations

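We study the following initial value (Cauchy) problem

\[
\frac{\partial u(x, t)}{\partial t} = b(x) \cdot \nabla u(x, t) + D\Delta u(x, t) \quad \text{for } (x, t) \in \mathbb{R}^d \times \mathbb{R}^+, \tag{13.2.1a}
\]

\[
u = u_0 \quad \text{for } (x, t) \in \mathbb{R}^d \times \{0\}, \tag{13.2.1b}
\]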

with $D > 0$. In our analysis we will assume that the vector $b(x)$ is smooth and periodic in space with period 1 in all spatial directions. Furthermore, we assume that the initial conditions are slowly varying, so that

\[
u_0(x) = f(\varepsilon x), \tag{13.2.2}
\]

with $\varepsilon \ll 1$. Because the initial data is slowly varying it is natural to look to large length and time scales to see the effective behavior of the PDE. If $b$ averages to zero in an appropriate sense then, as we will now show, the effective behavior of $u$ is that of a pure diffusion.

To see this effect we introduce the new variables $z$, $\tau$ through

\[
z = \varepsilon x, \quad \tau = \varepsilon^2 t
\]

and define

\[
u^\varepsilon(z, \tau) = u\left(\frac{z}{\varepsilon}, \frac{\tau}{\varepsilon^2}\right).
\]

The rescaled field $u^\varepsilon(z, \tau)$ satisfies the equation

\[
\frac{\partial u^\varepsilon(z, \tau)}{\partial \tau} = \frac{1}{\varepsilon} b\left(\frac{z}{\varepsilon}\right) \cdot \nabla_z u^\varepsilon(z, \tau) + D\Delta_z u^\varepsilon(z, \tau).
\]

Returning, for notational clarity, to the variables $(x, t)$, and using (13.2.2), we arrive at the following initial value problem:

\[
\frac{\partial u^\varepsilon(x, t)}{\partial t} = \frac{1}{\varepsilon} b\left(\frac{x}{\varepsilon}\right) \cdot \nabla u^\varepsilon(x, t) + D\Delta u^\varepsilon(x, t), \tag{13.2.3a}
\]

\[
u^\varepsilon(x, 0) = f(x). \tag{13.2.3b}
\]

This equation will be the object of our study. It can be derived directly from (13.2.1a) by setting $x \to x/\varepsilon$ and $t \to t/\varepsilon^2$.
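The rescaling step can be spelled out with the chain rule. The following worked computation is a sketch using only the definitions $z = \varepsilon x$, $\tau = \varepsilon^2 t$ and $u(x, t) = u^\varepsilon(z, \tau)$ given above; it shows where the factor $1/\varepsilon$ in front of the drift comes from and why the rescaled initial data no longer depends on $\varepsilon$.

```latex
\begin{align*}
\frac{\partial u}{\partial t} &= \varepsilon^2 \frac{\partial u^\varepsilon}{\partial \tau}, \qquad
\nabla_x u = \varepsilon \nabla_z u^\varepsilon, \qquad
\Delta_x u = \varepsilon^2 \Delta_z u^\varepsilon, \\
\varepsilon^2 \frac{\partial u^\varepsilon}{\partial \tau}
  &= \varepsilon\, b\!\left(\frac{z}{\varepsilon}\right) \cdot \nabla_z u^\varepsilon
   + \varepsilon^2 D \Delta_z u^\varepsilon
  \quad \Longrightarrow \quad
  \frac{\partial u^\varepsilon}{\partial \tau}
   = \frac{1}{\varepsilon}\, b\!\left(\frac{z}{\varepsilon}\right) \cdot \nabla_z u^\varepsilon
   + D \Delta_z u^\varepsilon, \\
u^\varepsilon(z, 0) &= u\!\left(\frac{z}{\varepsilon}, 0\right)
   = f\!\left(\varepsilon \cdot \frac{z}{\varepsilon}\right) = f(z).
\end{align*}
```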

Define the operator

\[
\mathcal{L}_0 = b(y) \cdot \nabla_y + D\Delta_y \tag{13.2.4}
\]

with periodic boundary conditions and its adjoint $\mathcal{L}_0^*$, also with periodic boundary conditions. Note that $\mathcal{L}_0$ is the generator of an SDE. Hence it is natural to define the invariant distribution $\rho(y)$ to be the stationary solution of the adjoint equation:

\[
\mathcal{L}_0^* \rho = 0. \tag{13.2.5}
\]

By Theorem 6.10 there is a unique solution to this equation, up to normalization. We assume that the drift $b(y)$ is centered with respect to the invariant distribution of the fast process:

\[
\int_{\mathbb{T}^d} b(y)\rho(y)\,dy = 0. \tag{13.2.6}
\]

We will call this the centering condition. Since by Theorem 6.10 $\mathcal{L}_0$ also has a one-dimensional null space, comprising constants, it follows that the equation

\[
\mathcal{L}_0 v = -b
\]

has a solution, unique up to constants, by the Fredholm alternative.


13.3 Simplified Equations

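The field $\chi(y)$ is defined to be the solution of the cell problem

\[
-\mathcal{L}_0 \chi(y) = b(y), \quad \chi \text{ is 1-periodic}, \quad \int_Y \chi(y)\rho(y)\,dy = 0. \tag{13.3.1}
\]

The effective diffusion tensor (or effective diffusivity) is defined as

\[
K = DI + 2D \int_{\mathbb{T}^d} \nabla_y \chi(y)^T \rho(y)\,dy + \int_{\mathbb{T}^d} \left(b(y) \otimes \chi(y)\right) \rho(y)\,dy. \tag{13.3.2}
\]

Result 13.1. For $\varepsilon \ll 1$ and $t = O(1)$ the solution $u^\varepsilon(x, t)$ of (13.2.3) is approximated by $u(x, t)$, the solution of

\[
\frac{\partial u(x, t)}{\partial t} = K : \nabla_x \nabla_x u(x, t), \tag{13.3.3a}
\]

\[
u(x, 0) = f(x). \tag{13.3.3b}
\]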

13.4 Derivation of the Homogenized equation

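Our goal now is to use the method of multiple scales in order to analyze the behavior of $u^\varepsilon(x, t)$, the solution of (13.2.3), in the limit as $\varepsilon \to 0$. In particular, we want to derive Result 13.1.

We introduce the auxiliary variable $y = x/\varepsilon$.¹ Let $\phi = \phi(x, x/\varepsilon)$ be scalar-valued. The chain rule gives

\[
\nabla\phi = \nabla_x \phi + \frac{1}{\varepsilon}\nabla_y \phi \quad \text{and} \quad \Delta\phi = \Delta_x \phi + \frac{2}{\varepsilon}\nabla_x \cdot \nabla_y \phi + \frac{1}{\varepsilon^2}\Delta_y \phi.
\]

The partial differential operator that appears on the right hand side of equation (13.2.3) now becomes

\[
\mathcal{L} = \frac{1}{\varepsilon^2}\mathcal{L}_0 + \frac{1}{\varepsilon}\mathcal{L}_1 + \mathcal{L}_2,
\]

where

\[
\begin{aligned}
\mathcal{L}_0 &= b(y) \cdot \nabla_y + D\Delta_y, \\
\mathcal{L}_1 &= b(y) \cdot \nabla_x + 2D\nabla_x \cdot \nabla_y, \\
\mathcal{L}_2 &= D\Delta_x.
\end{aligned}
\]

¹ As in the elliptic case, this is where the assumption of scale separation enters: we will treat $x$ and $y$ as independent variables. Justifying this assumption as $\varepsilon \to 0$ is one of the main issues in the rigorous theory of homogenization.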


In terms of x and y equation (13.2.3a) becomes

$$\frac{\partial u^\varepsilon}{\partial t} = L u^\varepsilon.$$

We seek a solution in the form of a multiple scales expansion

$$u^\varepsilon(x,t) = u_0(x,y,t) + \varepsilon u_1(x,y,t) + \varepsilon^2 u_2(x,y,t) + \dots \tag{13.4.1}$$

where uj(x, y, t), j = 0, 1, 2, . . . , is a periodic function of y. We substitute (13.4.1) into the equation and equate terms of equal powers in ε. Note that L0 is a differential operator in y only and recall that it is equipped with periodic boundary conditions. We obtain the following sequence of equations.

$$O(1/\varepsilon^2):\quad L_0 u_0 = 0, \qquad u_0 \text{ is 1-periodic}, \tag{13.4.2a}$$

$$O(1/\varepsilon):\quad L_0 u_1 = -L_1 u_0, \qquad u_1 \text{ is 1-periodic}, \tag{13.4.2b}$$

$$O(1):\quad L_0 u_2 = -L_1 u_1 - L_2 u_0 + \frac{\partial u_0}{\partial t}, \qquad u_2 \text{ is 1-periodic}. \tag{13.4.2c}$$


Equation (13.4.2a) implies that u0 is a function of x and t only: u0(x, y, t) = u(x, t). In order to proceed we need to study equations of the form

$$-L_0 v = f, \tag{13.4.3}$$

with periodic boundary conditions. The Fredholm alternative shows that this equation has a solution if and only if

$$\int_{\mathbb{T}^d} f(y)\rho(y)\,dy = 0. \tag{13.4.4}$$

Notice that

$$L_1 u_0 = b(y)\cdot\nabla_x u(x,t).$$

The centering condition (13.2.6) shows us that (13.4.4) is satisfied so that (13.4.2b) has a solution. We use separation of variables to write it as

$$u_1(x,y,t) = \chi(y)\cdot\nabla_x u(x,t).$$

Then χ(y) solves the cell problem (13.3.1). Our assumptions imply that there exists a unique (up to constants), smooth solution to the cell problem.


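Now we proceed with the analysis of the O(1) equation (13.4.2c). The solvability condition (13.4.4) reads

$$\int_{\mathbb{T}^d}\Big(\frac{\partial u}{\partial t} - L_2 u - L_1 u_1\Big)\rho\,dy = 0.$$

The fact that u0 = u(x, t) is independent of y enables us to rewrite the above equation in the form

$$\frac{\partial u}{\partial t} = D\Delta u + \int_{\mathbb{T}^d}\big(L_1 u_1\big)\rho\,dy. \tag{13.4.5}$$

Now we have

$$\begin{aligned}
L_1 u_1 &= b\cdot\nabla_x(\chi\cdot\nabla_x u) + 2D\,\nabla_x\cdot\nabla_y(\chi\cdot\nabla_x u)\\
&= \big(b\otimes\chi + 2D\,\nabla_y\chi^T\big) : \nabla_x\nabla_x u.
\end{aligned}$$

In view of the above calculation, equation (13.4.5) becomes

$$\frac{\partial u(x,t)}{\partial t} = K : \nabla_x\nabla_x u(x,t),$$

which is the homogenized equation (13.3.3a). The effective diffusivity K is given by formula (13.3.2).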

13.5 Properties of the Effective Diffusivity

In this section we show that the effective diffusivity is non-negative definite. This implies that the homogenized equation is well posed. To prove this we need to calculate the Dirichlet form associated with the operator L0. The following is a direct consequence of Theorem 6.7 in the case of additive noise.

Lemma 13.2. Let f(y) be a smooth periodic function. Then

$$\int_{\mathbb{T}^d} \big(-L_0 f(y)\big) f(y)\rho(y)\,dy = D\int_{\mathbb{T}^d} |\nabla_y f(y)|^2\rho(y)\,dy. \tag{13.5.1}$$

We have the following.

Theorem 13.3. Let ξ ∈ R^d be an arbitrary vector and let χξ(y) := χ(y) · ξ. Then

$$\langle\xi, K\xi\rangle = D\int_{\mathbb{T}^d} |\xi + \nabla_y\chi_\xi(y)|^2\rho(y)\,dy.$$

In particular, the effective diffusivity is non-negative definite.

Proof. Note that −L0χξ = ξ · b. We use Lemma 13.2, together with the normalization of ρ, to calculate

$$\begin{aligned}
\langle\xi, K\xi\rangle &= D|\xi|^2 + 2D\int_{\mathbb{T}^d} \xi\cdot\nabla_y\chi_\xi(y)\,\rho(y)\,dy + \int_{\mathbb{T}^d} (\xi\cdot b)\,\chi_\xi(y)\,\rho(y)\,dy\\
&= D|\xi|^2 + 2D\int_{\mathbb{T}^d} \xi\cdot\nabla_y\chi_\xi(y)\,\rho(y)\,dy + D\int_{\mathbb{T}^d} |\nabla_y\chi_\xi(y)|^2\rho(y)\,dy\\
&= D\int_{\mathbb{T}^d} |\xi + \nabla_y\chi_\xi(y)|^2\rho(y)\,dy.
\end{aligned}$$

The fact that the effective diffusivity is non-negative definite follows immediately from the above equation.

It is of interest to know how the effective diffusion tensor K compares with the original diffusion tensor DI. It turns out that the quadratic form ⟨ξ, Kξ⟩, for an arbitrary unit vector ξ ∈ R^d (the effective diffusivity), can be either greater or smaller than D|ξ|² (the value arising from the original diffusivity). We will see in the next section that the effective diffusivity is smaller than D for gradient vector fields b, and that it is greater than D for divergence–free vector fields b.

13.6 Applications

In this section we will consider two particular choices for the drift term b(y) in (13.2.3a), gradient and divergence–free fields. In both cases it is possible to perform explicit calculations which yield considerable insight. In particular, we will be able to obtain a formula for the (unique) invariant distribution and, consequently, to simplify the centering condition (13.2.6). Furthermore we will be able to compare the effective diffusivity and the original diffusivity D. We will see that the effective diffusivity is smaller than D for gradient vector fields b, and that it is greater than D for divergence–free vector fields b. Finally, we will study two particular cases of gradient and divergence–free fields for which we can derive closed formulas for the effective diffusivity.

13.6.1 Gradient Vector Fields

We consider the case where the vector field b(y) in equation (13.2.3a) is the gradient of a smooth, scalar periodic function,

$$b(y) = -\nabla_y V(y). \tag{13.6.1}$$


The function V(y) is called the potential. In this case it is straightforward to derive a formula for the solution ρ(y) of the stationary adjoint equation (13.2.5) with periodic boundary conditions. The distribution ρ(y) defined in (13.6.3) below is called the Gibbs distribution. The normalization constant Z is called the partition function.

Lemma 13.4. Assume that condition (13.6.1) holds and let L0* denote the adjoint operator for the operator L0 defined in (13.2.4). Then the equation

$$L_0^*\rho = 0, \qquad \int_{\mathbb{T}^d}\rho(y)\,dy = 1, \tag{13.6.2}$$

subject to periodic boundary conditions on T^d has a unique solution given by

$$\rho(y) = \frac{1}{Z}e^{-V(y)/D}, \qquad Z = \int_{\mathbb{T}^d} e^{-V(y)/D}\,dy. \tag{13.6.3}$$

Proof. Equation (13.6.2), in view of equation (13.6.1), becomes

$$\nabla_y\cdot\big(\nabla_y V(y)\rho(y) + D\nabla_y\rho(y)\big) = 0.$$

We immediately check that ρ(y) given by (13.6.3) satisfies

$$\nabla_y V(y)\rho(y) + D\nabla_y\rho(y) = 0,$$

and hence it satisfies (13.6.2). Furthermore, by construction we have that

$$\int_{\mathbb{T}^d}\frac{1}{Z}e^{-V(y)/D}\,dy = 1,$$

and hence ρ(y) is normalized. Thus we have constructed a solution. Uniqueness follows by the maximum principle.

In the case of gradient flows the operator

$$L_0 = b(y)\cdot\nabla_y + D\Delta_y \tag{13.6.4}$$

with periodic boundary conditions becomes symmetric, in the appropriate function space. We have the following.

Lemma 13.5. Assume that condition (13.6.1) is satisfied and let ρ(y) denote the Gibbs distribution (13.6.3). Then the operator L0 given in (13.6.4) satisfies

$$\int_{\mathbb{T}^d} f(y)\big(L_0 h(y)\big)\rho(y)\,dy = \int_{\mathbb{T}^d} h(y)\big(L_0 f(y)\big)\rho(y)\,dy, \tag{13.6.5}$$

for all smooth periodic f(y), h(y).


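Proof. Using the divergence theorem we have

$$\begin{aligned}
\int_{\mathbb{T}^d} f L_0 h\,\rho\,dy &= \frac{1}{Z}\int_{\mathbb{T}^d} f\big(-\nabla_y V\cdot\nabla_y h\big)e^{-V/D}\,dy + \frac{D}{Z}\int_{\mathbb{T}^d} f\,\Delta_y h\,e^{-V/D}\,dy\\
&= \frac{D}{Z}\int_{\mathbb{T}^d} f\,\nabla_y h\cdot\nabla_y\big(e^{-V/D}\big)\,dy - \frac{D}{Z}\int_{\mathbb{T}^d} \nabla_y f\cdot\nabla_y h\,e^{-V/D}\,dy - \frac{D}{Z}\int_{\mathbb{T}^d} f\,\nabla_y h\cdot\nabla_y\big(e^{-V/D}\big)\,dy\\
&= -D\int_{\mathbb{T}^d}\big(\nabla_y f\cdot\nabla_y h\big)\rho\,dy.
\end{aligned}$$

The expression in the last line is symmetric in f, h and hence (13.6.5) follows.

Remark 13.6. Actually this symmetry property arises quite naturally from the identity used in proving Theorem 6.7. Furthermore, the calculation used in the proof of the above lemma gives us the following useful formula

$$\int_{\mathbb{T}^d} f L_0 h\,\rho\,dy = -D\int_{\mathbb{T}^d}\big(\nabla_y f\cdot\nabla_y h\big)\rho(y)\,dy. \tag{13.6.6}$$

The Dirichlet form of Theorem 6.7 follows from this upon setting f = h.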

Now we are ready to prove various properties of the effective diffusivity.

Theorem 13.7. Assume that b is a gradient so that (13.6.1) holds and let ρ(y) denote the Gibbs distribution (13.6.3). Then the effective diffusivity is positive semi-definite. Furthermore, the effective diffusivity is always depleted:

$$\langle\xi, K\xi\rangle \le D|\xi|^2 \qquad \forall\,\xi\in\mathbb{R}^d.$$

Proof. Non-negativity is proved in Theorem 13.3. To establish the comparison on the effective diffusivity, note that we have

$$\begin{aligned}
K &= DI + 2D\int_{\mathbb{T}^d}\nabla_y\chi^T\rho\,dy + \int_{\mathbb{T}^d} -\nabla_y V\otimes\chi\,\rho\,dy\\
&= DI - 2D\int_{\mathbb{T}^d}\nabla_y\rho\otimes\chi\,dy + \int_{\mathbb{T}^d} -\nabla_y V\otimes\chi\,\rho\,dy\\
&= DI - 2\int_{\mathbb{T}^d} -\nabla_y V\otimes\chi\,\rho\,dy + \int_{\mathbb{T}^d} -\nabla_y V\otimes\chi\,\rho\,dy\\
&= DI - \int_{\mathbb{T}^d} -\nabla_y V\otimes\chi\,\rho\,dy\\
&= DI - \int_{\mathbb{T}^d} \big(-L_0\chi\big)\otimes\chi\,\rho\,dy\\
&= DI - D\int_{\mathbb{T}^d} \nabla_y\chi\otimes\nabla_y\chi\,\rho\,dy.
\end{aligned}$$

This proves that the effective diffusivity is symmetric. Furthermore, for χξ = χ · ξ,

$$\langle\xi, K\xi\rangle = D|\xi|^2 - D\int_{\mathbb{T}^d}|\nabla_y\chi_\xi|^2\rho\,dy \le D|\xi|^2.$$

This proves depletion.

13.6.2 The One Dimensional Case

The one-dimensional case is always in gradient form: b(y) = −∂yV(y). Furthermore in one dimension we can solve the cell problem (13.3.1) in closed form and calculate the effective diffusion coefficient explicitly, up to quadratures. We start with the following calculation concerning the structure of the diffusion coefficient.

$$\begin{aligned}
K &= D + 2D\int_0^1 \partial_y\chi\,\rho\,dy + \int_0^1 -\partial_y V\,\chi\,\rho\,dy\\
&= D + 2D\int_0^1 \partial_y\chi\,\rho\,dy + D\int_0^1 \chi\,\partial_y\rho\,dy\\
&= D + 2D\int_0^1 \partial_y\chi\,\rho\,dy - D\int_0^1 \partial_y\chi\,\rho\,dy\\
&= D\int_0^1 \big(1 + \partial_y\chi\big)\rho\,dy.
\end{aligned} \tag{13.6.7}$$

The cell problem (13.3.1) in one dimension is

$$D\,\partial_{yy}\chi - \partial_y V\,\partial_y\chi = \partial_y V. \tag{13.6.8}$$


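We multiply equation (13.6.8) by e^{−V(y)/D}/D to obtain

$$\partial_y\big(\partial_y\chi\,e^{-V(y)/D}\big) = -\partial_y\big(e^{-V(y)/D}\big).$$

We integrate the above equation once and multiply by e^{V(y)/D} to obtain

$$\partial_y\chi(y) = -1 + c_1 e^{V(y)/D}.$$

Another integration yields

$$\chi(y) = -y + c_1\int_0^y e^{V(s)/D}\,ds + c_2.$$

The periodic boundary conditions imply that χ(0) = χ(1), from which we conclude that

$$-1 + c_1\int_0^1 e^{V(y)/D}\,dy = 0.$$

Hence

$$c_1 = \frac{1}{\hat{Z}}, \qquad \hat{Z} = \int_0^1 e^{V(y)/D}\,dy.$$

We conclude that

$$\partial_y\chi = -1 + \frac{1}{\hat{Z}}e^{V(y)/D}.$$

We substitute this expression into (13.6.7) to obtain

$$\begin{aligned}
K &= \frac{D}{Z}\int_0^1 \big(1 + \partial_y\chi(y)\big)e^{-V(y)/D}\,dy\\
&= \frac{D}{Z\hat{Z}}\int_0^1 e^{V(y)/D}e^{-V(y)/D}\,dy\\
&= \frac{D}{Z\hat{Z}},
\end{aligned} \tag{13.6.9}$$

with

$$Z = \int_0^1 e^{-V(y)/D}\,dy, \qquad \hat{Z} = \int_0^1 e^{V(y)/D}\,dy. \tag{13.6.10}$$

The Cauchy-Schwarz inequality shows that ZẐ ≥ 1, so that K ≤ D, in agreement with Theorem 13.7.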
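As an illustration of formula (13.6.9), the following minimal sketch evaluates the one-dimensional effective diffusivity numerically as D divided by the product of the two integrals in (13.6.10), namely the integral of e^{−V/D} and the integral of e^{V/D} over [0, 1]. The potential V(y) = cos(2πy), the midpoint quadrature, and all function names are assumptions chosen for the example; they do not come from the text.

```python
import math

def integrate_unit_interval(g, n=2000):
    """Composite midpoint rule for the integral of g over [0, 1]."""
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))

def effective_diffusivity_1d(V, D, n=2000):
    """Evaluate K = D / (Z * Z_hat) with
    Z = int_0^1 exp(-V/D) dy and Z_hat = int_0^1 exp(V/D) dy."""
    Z = integrate_unit_interval(lambda y: math.exp(-V(y) / D), n)
    Z_hat = integrate_unit_interval(lambda y: math.exp(V(y) / D), n)
    return D / (Z * Z_hat)

def cosine_potential(y):
    """Hypothetical 1-periodic potential used only for illustration."""
    return math.cos(2.0 * math.pi * y)

if __name__ == "__main__":
    for D in (2.0, 1.0, 0.5, 0.25):
        K = effective_diffusivity_1d(cosine_potential, D)
        print(f"D = {D:5.2f}   K = {K:.6f}   K/D = {K / D:.6f}")
```

Under these assumptions the ratio K/D stays below one and shrinks as D decreases, which is the depletion described in Theorem 13.7.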


13.6.3 Divergence–Free Fields

In this section we consider the problem of homogenization for (13.2.3a) in the case where the vector field b(y) is divergence–free (or incompressible):

$$\nabla\cdot b(y) = 0. \tag{13.6.11}$$

The incompressibility of b(y) simplifies the analysis considerably because the advection operator

$$L_0 = b(y)\cdot\nabla_y,$$

with periodic boundary conditions is anti–symmetric:

Lemma 13.8. Let b(y) be a smooth, periodic vector field satisfying (13.6.11). Then for all smooth, periodic functions f(y), h(y) we have

$$\int_{\mathbb{T}^d} f(y)\big(b(y)\cdot\nabla_y h(y)\big)\,dy = -\int_{\mathbb{T}^d} h(y)\big(b(y)\cdot\nabla_y f(y)\big)\,dy.$$

In particular,

$$\int_{\mathbb{T}^d} f(y)\big(b(y)\cdot\nabla_y f(y)\big)\,dy = 0. \tag{13.6.12}$$

Proof. We use the incompressibility of b(y), together with the periodicity of f(y), h(y) and b(y) to calculate

$$\begin{aligned}
\int_{\mathbb{T}^d} f(y)\big(b(y)\cdot\nabla_y h(y)\big)\,dy &= \int_{\mathbb{T}^d} f(y)\,\nabla_y\cdot\big(b(y)h(y)\big)\,dy\\
&= -\int_{\mathbb{T}^d} \nabla_y f(y)\cdot\big(b(y)h(y)\big)\,dy\\
&= -\int_{\mathbb{T}^d} h(y)\big(b(y)\cdot\nabla_y f(y)\big)\,dy.
\end{aligned}$$

Equation (13.6.12) follows from the above calculation upon setting f = h.

Using the previous lemma it is easy to prove that the unique invariant measure of the fast process is the Lebesgue measure.

Lemma 13.9. Let L0 denote the operator

$$L_0 = b(y)\cdot\nabla_y + D\Delta_y$$

with periodic boundary conditions and with b(y) satisfying (13.6.11). Let L0* denote the L²(T^d)–adjoint of L0. Then the adjoint equation

$$L_0^*\rho = 0, \qquad \int_{\mathbb{T}^d}\rho(y)\,dy = 1, \tag{13.6.13}$$

with periodic boundary conditions on T^d has a unique classical solution given by

$$\rho(y) = 1. \tag{13.6.14}$$

Proof. Lemma 13.8 implies that the L²–adjoint of L0 is

$$L_0^* = -b(y)\cdot\nabla_y + D\Delta_y,$$

with periodic boundary conditions. Let ρ(y) be a solution of equation (13.6.13). We multiply the equation by ρ(y), integrate over T^d and use Lemma 13.8 to obtain

$$\int_{\mathbb{T}^d}|\nabla_y\rho(y)|^2\,dy = 0,$$

from which we deduce that ρ(y) is a constant. Hence, the unique normalized solution of (13.6.13) is given by (13.6.14).

Remark 13.10. Given the solution ρ(y) = 1, uniqueness can also be proved by use of the maximum principle. A consequence of the preceding lemma is that, when

$$\nabla_y\cdot b(y) = 0,$$

the solvability condition (13.2.6) becomes

$$\int_{\mathbb{T}^d} b(y)\,dy = 0.$$

Thus, it is straightforward to check whether a given periodic divergence–free field satisfies the solvability condition: the field must average to zero over the unit torus.

Now let χ(y) be the solution of the cell problem (13.3.1) with b(y) satisfying (13.6.11). The periodicity of χ(y), together with (13.6.14), implies that the second term on the right hand side of equation (13.3.2) vanishes and the formula for the effective diffusivity reduces to

$$K := DI + \int_{\mathbb{T}^d} b(y)\otimes\chi(y)\,dy. \tag{13.6.15}$$


The effective diffusivity for gradient flows is symmetric. This is not true for divergence–free flows. However, only the symmetric part of K enters into the homogenized equation. This is because

$$K : D = \frac{1}{2}\big(K + K^T\big) : D$$

whenever the matrix D is symmetric. Clearly ∇x∇xu is symmetric and so

$$K : \nabla_x\nabla_x u = \frac{1}{2}\big(K + K^T\big) : \nabla_x\nabla_x u.$$

For this reason we redefine the effective diffusivity to be the symmetric part of K:

$$K := DI + \frac{1}{2}\int_{\mathbb{T}^d}\big(b(y)\otimes\chi(y) + \chi(y)\otimes b(y)\big)\,dy. \tag{13.6.16}$$

Our goal now is to show that the homogenization procedure enhances diffusion, i.e. that the effective diffusivity is always greater than the molecular diffusivity D. For this we will need an alternative representation formula for K.

Theorem 13.11. The effective diffusivity K given by (13.6.16) can be written in the form

$$K = DI + D\int_{\mathbb{T}^d}\nabla_y\chi(y)\nabla_y\chi(y)^T\,dy. \tag{13.6.17}$$

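Proof. We take the outer-product of the cell-problem (13.3.1) with χ(y) to the left and integrate over the unit cell to obtain

$$-D\int_{\mathbb{T}^d}\chi(y)\otimes\Delta_y\chi(y)\,dy - \int_{\mathbb{T}^d}\chi(y)\otimes\big(\nabla_y\chi(y)b(y)\big)\,dy = \int_{\mathbb{T}^d}\chi(y)\otimes b(y)\,dy.$$

We apply the divergence theorem to the two integrals on the left hand side of the above equation, using periodicity and the fact that b is divergence free, to obtain

$$D\int_{\mathbb{T}^d}\nabla_y\chi(y)\nabla_y\chi(y)^T\,dy + \int_{\mathbb{T}^d}\big(\nabla_y\chi(y)b(y)\big)\otimes\chi(y)\,dy = \int_{\mathbb{T}^d}\chi(y)\otimes b(y)\,dy. \tag{13.6.18}$$

We now take the outer-product with χ to the right and use the divergence theorem only on the first integral, to obtain

$$D\int_{\mathbb{T}^d}\nabla_y\chi(y)\nabla_y\chi(y)^T\,dy - \int_{\mathbb{T}^d}\big(\nabla_y\chi(y)b(y)\big)\otimes\chi(y)\,dy = \int_{\mathbb{T}^d}b(y)\otimes\chi(y)\,dy. \tag{13.6.19}$$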


We add equations (13.6.18) and (13.6.19) to obtain:

$$\frac{1}{2}\int_{\mathbb{T}^d}\big(b(y)\otimes\chi(y) + \chi(y)\otimes b(y)\big)\,dy = D\int_{\mathbb{T}^d}\nabla_y\chi(y)\nabla_y\chi(y)^T\,dy.$$

Equation (13.6.17) now follows upon substituting the above expression into equation (13.6.16).

Formula (13.6.17) readily yields that the effective diffusivity is always greater than the molecular diffusivity in the following sense.

Theorem 13.12. For every vector ξ ∈ R^d we have

$$\langle\xi, K\xi\rangle \ge D|\xi|^2,$$

where | · | denotes the Euclidean norm in R^d. Equality holds for all ξ only when χ(y) ≡ 0.

Proof. Let ξ ∈ R^d and define χξ := χ · ξ. From (13.6.17) we have

$$\langle\xi, K\xi\rangle = D|\xi|^2 + D\int_{\mathbb{T}^d}|\nabla_y\chi_\xi(y)|^2\,dy \ge D|\xi|^2.$$

Clearly equality of the two diffusivities for all ξ implies that χξ is constant for all ξ; since χ integrates to zero over the unit cell, this implies that χ ≡ 0.

We remark that from the above theorem it immediately follows that the effective diffusivity is a positive definite matrix and, hence, well posedness of the homogenized equation (13.3.3), when (13.6.11) is satisfied, is ensured.

13.6.4 Shear Flow in 2D

In this section we study an example of a divergence–free flow for which the cell problem can be solved in closed form, that of a shear flow. The structure of a shear velocity field is such that the cell problem becomes an ordinary differential equation which can be easily solved by means of Fourier series.

We consider the problem of homogenization for (13.2.3a) in two dimensions for the following velocity field:

b(y) = (0, b2(y1)), (13.6.20)


where b2(y1) is a smooth, 1–periodic function. Notice that the velocity field (13.6.20) is incompressible:

$$\nabla\cdot b(y) = \frac{\partial b_1}{\partial y_1} + \frac{\partial b_2}{\partial y_2} = \frac{\partial b_2(y_1)}{\partial y_2} = 0.$$

The two components of the cell problem satisfy

$$-D\Delta_y\chi_1(y) - b_2(y_1)\frac{\partial\chi_1(y)}{\partial y_2} = 0, \tag{13.6.21a}$$

$$-D\Delta_y\chi_2(y) - b_2(y_1)\frac{\partial\chi_2(y)}{\partial y_2} = b_2(y_1), \tag{13.6.21b}$$

as well as periodicity and the normalization condition that χ integrates to zero over the unit cell Y.

If we multiply the first equation (13.6.21a) by χ1(y), integrate by parts over T^d and use Lemma 13.8, then we deduce that

$$\int_{\mathbb{T}^d}|\nabla_y\chi_1(y)|^2\,dy = 0.$$

Hence, χ1(y) = 0, since we impose the condition ⟨χ(y)⟩ = 0. On the other hand, the fact that the right hand side of (13.6.21b) depends only on y1 implies that it is reasonable to assume that the solution χ2(y) is independent of y2; we seek a solution of this form and then uniqueness of solutions to the cell problem implies that it is the only solution. Equation (13.6.21b) becomes:

$$-D\frac{d^2\chi_2(y_1)}{dy_1^2} = b_2(y_1). \tag{13.6.22}$$

By (13.6.16) the effective diffusivity K is the following 2 × 2 matrix:

$$K = \begin{pmatrix} D + \langle b_1\chi_1\rangle & \tfrac{1}{2}\langle b_2\chi_1 + b_1\chi_2\rangle\\ \tfrac{1}{2}\langle b_2\chi_1 + b_1\chi_2\rangle & D + \langle b_2\chi_2\rangle \end{pmatrix} = \begin{pmatrix} D & 0\\ 0 & D + \langle b_2(y_1)\chi_2(y_1)\rangle \end{pmatrix}.$$

From this we deduce that

$$K_{22} = D + \frac{1}{D}\|b_2(y_1)\|^2_{H^{-1}}. \tag{13.6.23}$$


The H^{−1} norm of a real–valued, periodic function with period 1 can be expressed in terms of Fourier series:

$$\|f(y)\|^2_{H^{-1}} = \frac{1}{2\pi^2}\sum_{k=1}^{\infty}\frac{|f_k|^2}{|k|^2}.$$

We will derive this result in due course. First we give a quick justification. We have that

$$\langle b_2(y_1)\chi_2(y_1)\rangle = -\Big\langle D\frac{d^2\chi_2(y_1)}{dy_1^2}\,\chi_2(y_1)\Big\rangle = D\|\chi_2(y_1)\|^2_{H^1}.$$

From this expression and from (13.6.22) it follows that

$$\langle b_2(y_1)\chi_2(y_1)\rangle = \frac{1}{D}\|b_2(y_1)\|^2_{H^{-1}}.$$

Notice the remarkable fact that the effective diffusion coefficient blows up as the original diffusion coefficient tends to zero.

Hence, in order to compute the effective diffusivity we need to calculate the average of b2χ2 over the unit cell. Since this expression depends only on y1 we have:

$$\langle b_2(y_1)\chi_2(y_1)\rangle = \int_0^1 b_2(y_1)\chi_2(y_1)\,dy_1.$$

Now we need to solve equation (13.6.22), subject to periodic boundary conditions. We will accomplish this by using the Fourier series representation of periodic functions. In the remainder of this example we will drop all indices and subscripts: b(y) := b2(y1), χ(y) := χ2(y1).

Since the velocity field b2(y1) is smooth and 1–periodic, it can be expressed through its Fourier series expansion

$$b(y) = \sum_{k=-\infty}^{\infty} b_k e^{2\pi i k y}, \tag{13.6.24}$$

where

$$b_k = \int_0^1 b(y)e^{-2\pi i k y}\,dy. \tag{13.6.25}$$

It is straightforward to check that the facts that b(y) has mean zero and is a real valued function imply, respectively, that $b_0 = 0$ and that $b_{-k} = \overline{b_k}$ (we will refer to this as the reality condition). Now, the solution χ(y) of (13.6.22) is an element of H¹(0, 1) and hence it admits the Fourier representation

$$\chi(y) = \sum_{k\neq 0}\chi_k e^{2\pi i k y}, \tag{13.6.26}$$

where we have already imposed the condition ⟨χ(y)⟩ = 0. Furthermore, since χ(y) is a real valued function, we have that $\chi_{-k} = \overline{\chi_k}$. We use (13.6.26) to calculate the second order derivative of χ(y):

$$\frac{d^2\chi(y)}{dy^2} = -4\pi^2\sum_{k\neq 0}|k|^2\chi_k e^{2\pi i k y}.$$

We use this expression in (13.6.22), together with the Fourier expansion of b(y) and the orthonormality of the Fourier basis, to obtain an infinite set of equations for the Fourier coefficients of χ:

$$4\pi^2 D|k|^2\chi_k = b_k, \qquad k\in\mathbb{Z}\setminus\{0\}.$$

Consequently

$$\chi_k = \frac{b_k}{4\pi^2 D|k|^2}$$

and

$$\chi(y) = \sum_{k\neq 0}\frac{b_k}{4\pi^2 D|k|^2}e^{2\pi i k y}. \tag{13.6.27}$$

Formula (13.6.27) provides us with the solution of the cell problem (13.6.22). We use this formula, together with (13.6.24) to compute:

$$\begin{aligned}
\langle b(y)\chi(y)\rangle &= \int_0^1 b(y)\chi(y)\,dy\\
&= \int_0^1\Big(\sum_{k\neq 0}\frac{b_k}{4\pi^2 D|k|^2}e^{2\pi i k y}\Big)\Big(\sum_{\ell\neq 0}b_\ell e^{2\pi i\ell y}\Big)\,dy\\
&= \sum_{k,\ell\neq 0}\Big(\frac{b_k b_\ell}{4\pi^2 D|k|^2}\int_0^1 e^{2\pi i(k+\ell)y}\,dy\Big)\\
&= \sum_{k,\ell\neq 0}\Big(\frac{b_k b_\ell}{4\pi^2 D|k|^2}\,\delta_{k,-\ell}\Big)\\
&= \sum_{k\neq 0}\frac{|b_k|^2}{4\pi^2 D|k|^2}\\
&= \frac{1}{2\pi^2 D}\sum_{k=1}^{\infty}\frac{|b_k|^2}{|k|^2}.
\end{aligned}$$

In deriving the penultimate equality we used the reality condition for b(y). We combine the above calculation with the formula for the effective diffusivity to obtain equation (13.6.23):

$$K_{22} = D + \frac{1}{D}\|b_2(y_1)\|^2_{H^{-1}}.$$
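The Fourier calculation above can be checked numerically. The sketch below approximates the coefficients b_k of (13.6.25) by a trapezoidal sum on a periodic grid and then evaluates K22 = D + (1/(2π²D)) Σ_{k≥1} |b_k|²/k² from a truncated series. The shear profile b(y) = sin(2πy), the truncation level, and the function names are assumptions made for illustration only. For this profile the only non-zero coefficients are b_{±1}, with |b_1|² = 1/4, so the formula predicts K22 = D + 1/(8π²D).

```python
import cmath
import math

def fourier_coefficient(b, k, n=1024):
    """Approximate b_k = int_0^1 b(y) exp(-2 pi i k y) dy
    by the trapezoidal rule on a uniform periodic grid."""
    h = 1.0 / n
    return h * sum(b(j * h) * cmath.exp(-2j * math.pi * k * j * h)
                   for j in range(n))

def shear_effective_diffusivity(b, D, kmax=32, n=1024):
    """Truncated version of K22 = D + (1/(2 pi^2 D)) sum_{k>=1} |b_k|^2 / k^2."""
    total = sum(abs(fourier_coefficient(b, k, n)) ** 2 / k ** 2
                for k in range(1, kmax + 1))
    return D + total / (2.0 * math.pi ** 2 * D)

def sine_shear(y):
    """Hypothetical mean-zero, 1-periodic shear profile."""
    return math.sin(2.0 * math.pi * y)

if __name__ == "__main__":
    for D in (1.0, 0.1, 0.01):
        K22 = shear_effective_diffusivity(sine_shear, D)
        print(f"D = {D:5.2f}   K22 = {K22:.6f}")
```

The printed values grow as D decreases, illustrating the 1/D scaling of the enhancement along the direction of the shear.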

13.7 The Connection to SDE

Equation (13.2.1) is the backward Kolmogorov equation associated with the SDE

$$\frac{dx}{dt} = b(x) + \sqrt{2D}\,\frac{dW}{dt}, \tag{13.7.1}$$

where W denotes standard Brownian motion on R^d. Unsurprisingly, then, the homogenization results derived in this chapter have implications for the behavior of solutions to this SDE. To see this we first apply the re-scaling, used to derive (13.2.3) from (13.2.1), to the SDE. That is, we re-label according to

$$x \to x/\varepsilon, \qquad t \to t/\varepsilon^2,$$

giving the SDE

$$\frac{dx}{dt} = \frac{1}{\varepsilon}\,b\Big(\frac{x}{\varepsilon}\Big) + \sqrt{2D}\,\frac{dW}{dt}. \tag{13.7.2}$$


(Recall Remark 6.3 regarding the behaviour of white noise under time-rescaling.) If we introduce the variable y = x/ε then we can write this SDE in the form

$$\begin{aligned}
\frac{dx}{dt} &= \frac{1}{\varepsilon}\,b(y) + \sqrt{2D}\,\frac{dW}{dt},\\
\frac{dy}{dt} &= \frac{1}{\varepsilon^2}\,b(y) + \frac{1}{\varepsilon}\sqrt{2D}\,\frac{dW}{dt}.
\end{aligned} \tag{13.7.3}$$

This is precisely in the form (11.2.1) which we analyzed in Chapter 11. The only difference is that the noises appearing in the x and y equations are correlated (U = V). This has the effect of changing the operator L1 in that chapter, so that the results derived there do not apply directly. They can, however, be extended to the study of correlated noise. Notice that the centering condition (13.2.6) is precisely the condition (11.2.2) since ρ is the stationary solution of the Fokker-Planck equation.

The calculations in this chapter show how the backward Kolmogorov equation for the coupled SDE in (x, y) can be approximated by a diffusion equation in the x variable alone. Indeed, the limiting equation is the backward Kolmogorov equation of a pure (scaled) Brownian motion. Interpreted in terms of the SDE we obtain the following result.

Result 13.13. Assume that (13.2.6) holds. For ε ≪ 1 and t = O(1), x solving the SDE (13.7.3) can be approximated by X solving

$$\frac{dX}{dt} = \sqrt{2K}\,\frac{dW}{dt},$$

where the matrix K is given by (13.3.2).
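Result 13.13 suggests a simple Monte Carlo check in one dimension: over long times, solutions of the unscaled SDE (13.7.1) should have mean square displacement close to 2Kt. The following sketch uses the Euler-Maruyama scheme with the gradient drift b = −V', V(y) = A cos(2πy), and estimates K from the mean square displacement at time T. The discretization, the potential, the parameter values and the function name are all assumptions for illustration; the estimate carries both statistical and time-stepping error, so it should only be compared loosely with the closed-form one-dimensional value of Section 13.6.2.

```python
import math
import random

def estimate_effective_diffusivity(D=1.0, amplitude=1.0, T=10.0, dt=0.004,
                                   paths=300, seed=0):
    """Estimate K as E[x(T)^2] / (2 T) for dx = b(x) dt + sqrt(2 D) dW,
    with b(x) = -V'(x) and V(x) = amplitude * cos(2 pi x), x(0) = 0."""
    rng = random.Random(seed)           # fixed seed for reproducibility
    steps = int(round(T / dt))
    noise_scale = math.sqrt(2.0 * D * dt)
    two_pi = 2.0 * math.pi
    total_sq = 0.0
    for _ in range(paths):
        x = 0.0
        for _ in range(steps):
            drift = two_pi * amplitude * math.sin(two_pi * x)  # b = -V'
            x += drift * dt + noise_scale * rng.gauss(0.0, 1.0)
        total_sq += x * x
    return total_sq / (paths * 2.0 * T)

if __name__ == "__main__":
    K_est = estimate_effective_diffusivity()
    print(f"Monte Carlo estimate of K: {K_est:.3f}")
```

With more paths and a smaller time step the estimate should approach the quadrature value; the defaults here are chosen only to keep the run time short.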

13.8 Discussion and Bibliography

The problem of homogenization for second order parabolic PDE and its connection to the study of the long time asymptotics of solutions to SDE is studied in [17, Ch. 3]. References to the earlier literature can be found in there. See also [112]. Periodic homogenization for gradient flows is discussed in e.g. [112, 124, 152, 54]. Stochastic differential equations of the form (13.7.1) whose drift is the gradient of a periodic scalar function describe Brownian motion in periodic potentials. This is a very important problem in many applications, e.g. in solid state physics and biology. See [128, Ch. 11], [126] and the references therein. Multiscale techniques were applied to this problem in [123]. On the other hand, the SDE (13.7.1) with divergence–free drift occurs naturally in the modeling of diffusion processes in fluids. Homogenization for periodic, incompressible flows is a part of the theory of turbulent diffusion, e.g. [94, Ch. 2]. In this context an interesting question concerns the dependence of the effective diffusivity on the molecular diffusion. See [94, Ch. 2] and the references therein.

It is possible to derive a homogenized equation even when the centering condition (13.2.6) is not satisfied. In this case it is necessary to use a frame co–moving with the mean flow

$$F = \int_{\mathbb{T}^d} b(y)\rho(y)\,dy. \tag{13.8.1}$$

Then, it is possible to derive a homogenized equation of the form (13.3.3) for the rescaled field

$$u^\varepsilon(x,t) = u\Big(\frac{x}{\varepsilon} - \frac{Ft}{\varepsilon^2},\ \frac{t}{\varepsilon^2}\Big).$$

The effective diffusivity is given by the formula

$$K = DI + 2D\int_{\mathbb{T}^d}\nabla_y\chi(y)^T\rho(y)\,dy + \int_{\mathbb{T}^d}\big(b(y) - F\big)\otimes\chi(y)\rho(y)\,dy. \tag{13.8.2}$$

The cell problem (13.3.1) is also modified:

$$-L_0\chi = b - F. \tag{13.8.3}$$

It was proved in section 13.6.1 that for gradient flows the diffusion is always depleted. In fact, a much sharper result can be obtained: the effective diffusivity is "exponentially" smaller than D, for D sufficiently small. That is, there exist positive constants c1 and c2 such that

$$K^\xi = c_1 e^{-c_2/D}, \qquad D \ll 1,$$

where

$$K^\xi := \sum_{i,j=1}^{d} K_{ij}\xi_i\xi_j.$$

See [26] and the references therein.

On the other hand, the effective diffusion coefficient can become arbitrarily large when a constant external force is added to the gradient drift [127, 133]. The presence of a mean flow plays an important role in the case of divergence–free flows, in particular in connection to the small D–asymptotics of the effective diffusivity. See [95, 105, 20, 137, 18].


The fact that the effective diffusivity along the direction of the shear is inversely proportional to the molecular diffusivity, formula (13.6.23), was discovered by G.I. Taylor in [146], without any appeal to homogenization theory. This phenomenon is usually called Taylor dispersion. See also [6]. In deriving the result we used formal calculations with Fourier series. Of course, we have to prove that we can differentiate the Fourier series and that the Fourier series that we get for the second derivative of χ(y) makes sense. In this section we will only perform formal calculations. The existence, uniqueness and regularity of solutions to the cell problem will be examined in the next section. For various properties of Fourier series we refer the reader to [61, Ch. 3].

13.9 Exercises

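1. Derive a formula for u2(x, x/ε, t), the third term in the expansion (13.4.1).

2. Consider the problem of homogenization for

$$\frac{\partial u^\varepsilon}{\partial t} = -\frac{1}{\varepsilon}\nabla V\Big(\frac{x}{\varepsilon}\Big)\cdot\nabla u^\varepsilon + D\Delta u^\varepsilon$$

in one dimension with the (1–periodic) potential

$$V(y) = \begin{cases} a_1, & y\in[0,\tfrac{1}{2}],\\ a_2, & y\in(\tfrac{1}{2},1], \end{cases}$$

where a1, a2 are positive constants. Calculate the effective diffusivity. How does it depend on D?

3. Same as in the previous exercise for the potential

$$V(y) = \begin{cases} y, & y\in[0,\tfrac{1}{2}],\\ 1-y, & y\in(\tfrac{1}{2},1]. \end{cases}$$

4. Consider the problem of homogenization for the reaction–advection–diffusion equation

$$\frac{\partial u^\varepsilon}{\partial t} = \frac{1}{\varepsilon}\,b\Big(\frac{x}{\varepsilon}\Big)\cdot\nabla u^\varepsilon + \Delta u^\varepsilon + \frac{1}{\varepsilon}\,c\Big(\frac{x}{\varepsilon}\Big)u^\varepsilon, \tag{13.9.1}$$

where the vector field b(y) and the scalar function c(y) are smooth and periodic. Use the method of multiple scales to homogenize the above PDE. In particular: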


(a) Derive the solvability condition.

(b) Obtain the conditions that b(y) and c(y) should satisfy so that you can derive the homogenized equation.

(c) Derive the homogenized equation, the cell problem(s) and the formula for the homogenized coefficients. Suppose that the reaction term is non–linear, i.e. the zeroth order term in equation (13.9.1) is replaced by

$$c\Big(\frac{x}{\varepsilon},\,u^\varepsilon\Big),$$

where the function c(y, u) is 1–periodic in y for every u. Can you homogenize equation (13.9.1) in this case?

5. Consider the problem of homogenization for the PDE

$$\frac{\partial u^\varepsilon}{\partial t} = \Big(b_1(x) + \frac{1}{\varepsilon}\,b_2\Big(\frac{x}{\varepsilon}\Big)\Big)\cdot\nabla u^\varepsilon + \Delta u^\varepsilon, \tag{13.9.2}$$

where the vector field b2(y) is smooth and periodic and b1(x) is periodic. Use the method of multiple scales to homogenize the above PDE. In particular:

(a) Derive the solvability condition.

(b) Obtain the conditions that b2(y) should satisfy so that you can derive the homogenized equation.

(c) Show that the homogenized equation is

$$\frac{\partial u}{\partial t} = B\cdot\nabla u + \sum_{i,j=1}^{d} K_{ij}\frac{\partial^2 u}{\partial x_i\partial x_j} \tag{13.9.3}$$

and derive the cell problem(s) and the formulas for the homogenized coefficients B and K.

6. Consider the problem of homogenization for the PDE (13.9.2) in the case where

$$b_1(x) = -\nabla V(x), \quad\text{and}\quad b_2(y) = -\nabla_y p(y),$$

where p(y) is periodic.

(a) Show that in this case there exists a symmetric matrix $\widetilde{K}$ such that

$$K_{ij} = D\widetilde{K}_{ij}, \qquad B_i = -\sum_{j=1}^{d}\widetilde{K}_{ij}\frac{\partial V}{\partial x_j}.$$


(b) Let

$$\mathcal{L} := B\cdot\nabla + \sum_{i,j=1}^{d} K_{ij}\frac{\partial^2}{\partial x_i\partial x_j}.$$

(i) Derive a formula for $\mathcal{L}^*$, the L²–adjoint of $\mathcal{L}$.

(ii) Show that the function

$$\rho(y) := \frac{1}{Z}e^{-V(y)/D}, \qquad Z = \int_{\mathbb{T}^d} e^{-V(y)/D}\,dy$$

solves the homogeneous adjoint equation

$$\mathcal{L}^*\rho = 0.$$

7. Consider the problem of homogenization for the following PDE

$$\frac{\partial u^\varepsilon}{\partial t} = \frac{1}{\varepsilon}\,b\Big(\frac{x}{\varepsilon}\Big)\cdot\nabla u^\varepsilon + \sum_{i,j=1}^{d} a_{ij}\Big(\frac{x}{\varepsilon}\Big)\frac{\partial^2 u^\varepsilon}{\partial x_i\partial x_j}, \tag{13.9.4}$$

where the vector field b(y) and the matrix A(y) are smooth and periodic, and A(y) is positive definite. Use the method of multiple scales to derive the homogenized equation. In particular:

(a) Derive the solvability condition.

(b) Obtain conditions on b(y) which ensure the existence of a homogenized equation.

(c) Derive the homogenized equation, the cell problem and the formula for the homogenized coefficients.

(d) Prove that the homogenized matrix is nonnegative definite.

8. Let b(y) be a smooth, real valued, 1–periodic, mean zero function and let $\{b_k\}_{k=-\infty}^{+\infty}$ be its Fourier coefficients. Prove that

$$b_0 = 0$$

and that

$$b_{-k} = \overline{b_k}.$$


Chapter 14

Homogenization for Linear Transport PDE

14.1 Introduction

In this chapter we investigate the long time behavior of solutions to the linear advection equation. The techniques we employ are often referred to as homogenization techniques in the literature, and so we retain this term. However, in terms of our classification in the introductory Chapter 1, the methods are actually more closely aligned to averaging methods for ODE, as discussed in Chapter 10. This connection will be made explicit in the following.

14.2 Full Equations

We study the long time behavior of solutions to the linear advection equation for steady periodic velocity fields a:

$$\frac{\partial u(x,t)}{\partial t} + a(x)\cdot\nabla u(x,t) = 0 \quad\text{for } (x,t)\in\mathbb{R}^d\times\mathbb{R}^+, \tag{14.2.1a}$$

$$u(x,0) = u_0(x). \tag{14.2.1b}$$

This is (13.2.1) in the case D = 0. Equation (14.2.1) is the backward Liouville equation corresponding to the dynamical system

$$\frac{dx}{dt} = a(x). \tag{14.2.2}$$

See (4.3.4).


As in Chapter 13 we study the case where

\[ u_0(x) = f(\varepsilon x), \]

and rescale the equation in both space and time in order to understand the behavior of solutions to equation (14.2.1), and consequently to (14.2.2), at length and time scales which are long when compared to those of the fluid velocity $a(x)$. In this setting, the small parameter in the problem is the ratio between the characteristic length (time) scale of the velocity field (its period) and the largest length (time) scale of the problem (the one at which we are looking for a homogenized description). In contrast to the analysis of the advection-diffusion equation in the previous chapter, we rescale time and space in the same fashion, namely

\[ x \to \varepsilon^{-1} x, \qquad t \to \varepsilon^{-1} t. \tag{14.2.3} \]

Such a transformation is natural since the transport PDE (14.2.1a) is of first order in both space and time.

The initial value problem that we wish to investigate then becomes:

\[ \frac{\partial u^\varepsilon(x,t)}{\partial t} + a\left(\frac{x}{\varepsilon}\right) \cdot \nabla u^\varepsilon(x,t) = 0 \quad \text{for } (x,t) \in \mathbb{R}^d \times \mathbb{R}^+, \tag{14.2.4a} \]

\[ u^\varepsilon(x,0) = f(x) \quad \text{for } x \in \mathbb{R}^d. \tag{14.2.4b} \]

14.3 Simplified Equations

Let

\[ L_0 = a(y) \cdot \nabla_y \tag{14.3.1} \]

with periodic boundary conditions. We assume for the moment that there are no non-trivial functions in the null space $\mathcal{N}$ of $L_0$:

\[ \mathcal{N}(L_0) = \{\text{constants in } y\}. \tag{14.3.2} \]

From Chapter 4 we know that this is an ergodicity assumption on the ODE with vector field $a$. Note that if $a$ is divergence-free (the velocity field is incompressible) then $L_0$ is skew-symmetric and so we deduce that

\[ \mathcal{N}(L_0^*) = \{\text{constants in } y\}. \tag{14.3.3} \]

Under this assumption we have the following:


Result 14.1. Let $a$ be a periodic divergence-free vector field. Then, for $\varepsilon \ll 1$ and $t = O(1)$, the solution $u^\varepsilon(x,t)$ of (14.2.4) is approximated by $u(x,t)$, the solution of the homogenized equation:

\[ \frac{\partial u(x,t)}{\partial t} + \bar{a} \cdot \nabla_x u(x,t) = 0, \qquad \bar{a} := \int_Y a(y)\,dy, \]

together with the same initial condition as for $u^\varepsilon$.

14.4 Derivation

We use the method of multiple scales as introduced in the two preceding chapters. To this end, we look for a solution in the form of a two-scale expansion:

\[ u^\varepsilon(x,t) = u_0\left(x, \frac{x}{\varepsilon}, t\right) + \varepsilon u_1\left(x, \frac{x}{\varepsilon}, t\right) + \dots. \tag{14.4.1} \]

We assume that all terms in the expansion $u_j(x,y,t)$, $j = 0, 1, \dots$ are 1-periodic in $y$ and treat $x$ and $y := x/\varepsilon$ as independent variables, so that $\nabla$ becomes $\nabla_x + \varepsilon^{-1}\nabla_y$. We substitute (14.4.1) into equation (14.2.4a), use the assumed independence of $x$ and $y$ and collect equal powers of $\varepsilon$ to obtain the following set of equations

\[ O(1/\varepsilon): \quad L_0 u_0 = 0, \quad u_0 \text{ is 1-periodic}, \tag{14.4.2a} \]

\[ O(1): \quad L_0 u_1 = -L_1 u_0, \quad u_1 \text{ is 1-periodic}, \tag{14.4.2b} \]

where

\[ L_0 = a(y) \cdot \nabla_y \quad \text{and} \quad L_1 = \frac{\partial}{\partial t} + a(y) \cdot \nabla_x. \]

Under assumption (14.3.2) we can now complete the homogenization procedure. From the first equation in (14.4.2) we get that the first term in the expansion is independent of the oscillations which are expressed through the auxiliary variable $y$:

\[ u_0 = u(x,t). \]

We use this to compute:

\[ L_1 u_0 = \frac{\partial u(x,t)}{\partial t} + a(y) \cdot \nabla_x u(x,t). \]

On the other hand, an integration by parts, together with the fact that $a(y)$ is divergence-free, gives:

\[ \int_Y L_0 u_1 \,dy = \int_Y a(y) \cdot \nabla_y u_1 \,dy = 0. \]


This implies that a necessary condition for the existence of solutions to the equation

\[ L_0 u_1 = f \]

is that the right hand side $f$ averages to $0$. By (14.3.2), this is the analogue of the Fredholm alternative for this advection equation with periodic boundary conditions. Necessity follows from the above calculation; sufficiency follows from the ergodic Theorem 4.7. Applying this condition to the second equation in (14.4.2) gives

\[ 0 = \frac{\partial u(x,t)}{\partial t} + \left(\int_Y a(y)\,dy\right) \cdot \nabla_x u(x,t). \tag{14.4.3} \]

We have thus obtained the desired homogenized equation:

\[ \frac{\partial u(x,t)}{\partial t} + \bar{a} \cdot \nabla_x u(x,t) = 0, \qquad \bar{a} := \int_Y a(y)\,dy, \]

together with the same initial conditions as those for $u^\varepsilon$.
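The identity behind the solvability condition, namely that $a(y) \cdot \nabla_y u_1$ integrates to zero over the unit cell when $a$ is divergence-free, can be checked numerically. The following is a minimal sketch only: the velocity field, the test function `u1` and the grid size are illustrative choices that do not come from the text, and the midpoint rule stands in for the exact integral.

```python
import math

TWO_PI = 2.0 * math.pi

def a(y1, y2):
    """Divergence-free, 1-periodic velocity field (illustrative choice)."""
    return (math.sin(TWO_PI * y2), math.sin(TWO_PI * y1))

def grad_u1(y1, y2):
    """Gradient of the periodic test function
    u1(y) = sin(2 pi y1) cos(2 pi y2) + cos(2 pi y1) (hypothetical choice)."""
    d1 = TWO_PI * math.cos(TWO_PI * y1) * math.cos(TWO_PI * y2) - TWO_PI * math.sin(TWO_PI * y1)
    d2 = -TWO_PI * math.sin(TWO_PI * y1) * math.sin(TWO_PI * y2)
    return (d1, d2)

def cell_integral_of_L0_u1(n=32):
    """Midpoint-rule approximation of the integral of a . grad_y u1 over [0,1]^2."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            y1, y2 = (i + 0.5) / n, (j + 0.5) / n
            a1, a2 = a(y1, y2)
            g1, g2 = grad_u1(y1, y2)
            total += a1 * g1 + a2 * g2
    return total / (n * n)

integral = cell_integral_of_L0_u1()
print(integral)  # zero up to rounding error
```

Replacing `a` by a field that is not divergence-free would in general give a non-zero value, which is why the divergence-free assumption enters Result 14.1.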

14.5 The Connection to ODE

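Recall from Chapter 4 that the solution of (14.2.4) is given by

\[ u^\varepsilon(x,t) = f(\varphi^t(x)) \]

where $\varphi^t(x)$ solves the ODE

\[ \frac{d}{dt}\varphi^t(x) = a\left(\frac{\varphi^t(x)}{\varepsilon}\right), \qquad \varphi^0(x) = x. \]

Result 14.1 shows that, when (14.3.2) holds, the solution of this equation is well approximated by

\[ \varphi^t(x) = \bar{a} t + x, \]

the solution of

\[ \frac{d}{dt}\varphi^t(x) = \bar{a}, \qquad \varphi^0(x) = x. \]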


Another way to see this result is as follows. Let $x = \varphi^t(x_0)$ and $y = x/\varepsilon$. Then

\[ \frac{dx}{dt} = a(y), \qquad \frac{dy}{dt} = \frac{1}{\varepsilon} a(y). \]

Under the ergodic hypothesis the fast process $y$ has uniform invariant measure on the torus $\mathbb{T}^d$. Thus the averaging Result 10.1 gives that $x$ is well approximated by the solution of the equation

\[ \frac{dX}{dt} = \bar{a}. \]

One can describe Result 14.1 in the following way: under assumption (14.3.2) we have that

\[ \lim_{\varepsilon \to 0} \varepsilon \int_0^{t/\varepsilon} a(x(s))\,ds = \bar{a} t, \]

where $x(t)$ is the solution of the ODE

\[ \frac{dx}{dt} = a(x), \qquad x(0) = x_0. \]

This is another way to see that Result 14.1 is a direct consequence of the ergodicity of the flow generated by $a(y)$ on the unit torus.

14.6 The One–Dimensional Case

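Consider the rescaled transport equation (14.2.4a) in one dimension:

\[ \frac{\partial u^\varepsilon}{\partial t} + a\left(\frac{x}{\varepsilon}\right) \frac{\partial u^\varepsilon}{\partial x} = 0 \quad \text{for } (x,t) \in \mathbb{R} \times \mathbb{R}^+, \tag{14.6.1a} \]

\[ u^\varepsilon(x,0) = u_0(x). \tag{14.6.1b} \]

We assume that $a(y)$ is a positive, smooth, 1-periodic function. Clearly, $a(y)$ is not divergence-free (excluding the trivial case where $a(y)$ is a constant function) and it is not expected that Result 14.1 holds. Indeed, the stationary forward Liouville equation

\[ L_0^* \rho = 0, \quad \rho > 0, \quad \rho \text{ 1-periodic}, \tag{14.6.2} \]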


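together with the normalization condition

\[ \int_0^1 \rho(y)\,dy = 1, \]

no longer has constants as solutions. The unique normalized solution of (14.6.2) is the probability density

\[ \rho(y) = \frac{C}{a(y)}, \qquad C = \langle a(y)^{-1} \rangle^{-1}; \]

here we have used the notation $\langle \cdot \rangle$ to denote averaging over $[0,1]$, as in Chapter 12.

We can use the method of multiple scales to homogenize equation (14.6.1). The solvability condition, the analogue of (14.4.3) with the average now taken against the invariant density $\rho$, gives

\[ \int_0^1 (L_1 u)\,\rho(y)\,dy = \frac{\partial u}{\partial t} + \left(\int_0^1 a(y)\rho(y)\,dy\right) \frac{\partial u}{\partial x} = 0, \]

from which we obtain the homogenized equation

\[ \frac{\partial u}{\partial t} + a^* \frac{\partial u}{\partial x} = 0, \tag{14.6.3} \]

with the same initial conditions as in (14.6.1b) and with

\[ a^* = \langle a(y)^{-1} \rangle^{-1}. \]

Notice that, in contrast to the ergodic divergence-free case, it is the harmonic average of the velocity field that appears in the homogenized equation (14.6.3).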
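To see how much the harmonic and arithmetic averages can differ, the following sketch evaluates both for one concrete velocity. The choice $a(y) = 2 + \sin(2\pi y)$ and the helper `mean_over_period` are assumptions made here for illustration; for this particular $a$ the harmonic average equals $\sqrt{3}$, while the arithmetic average is $2$.

```python
import math

def a(y):
    """Positive, smooth, 1-periodic velocity (illustrative choice)."""
    return 2.0 + math.sin(2.0 * math.pi * y)

def mean_over_period(func, n=2000):
    """Midpoint-rule average of a 1-periodic function over [0, 1]."""
    return sum(func((k + 0.5) / n) for k in range(n)) / n

# Arithmetic average <a>, the effective velocity in the ergodic divergence-free case.
arithmetic_mean = mean_over_period(a)

# Harmonic average a* = <1/a>^{-1}, the effective velocity in one dimension.
harmonic_mean = 1.0 / mean_over_period(lambda y: 1.0 / a(y))

print(arithmetic_mean, harmonic_mean)
```

The harmonic average is the smaller of the two: the characteristic spends more time where $a$ is small, so slow regions are weighted more heavily.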

One can also derive the homogenized equation (14.6.3) using the method of characteristics. To see this, consider the equation

\[ \frac{dx}{dt} = a\left(\frac{x}{\varepsilon}\right) \]

in one dimension, and under the same assumptions as before. If we set $y = x/\varepsilon$ then it is straightforward to show that

\[ \frac{dy}{dt} = \frac{1}{\varepsilon} a(y) \]

so that, if we define $T$ by

\[ T = \int_0^1 \frac{1}{a(z)}\,dz \]


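then

\[ y(n\varepsilon T) = \frac{x(0)}{\varepsilon} + n. \]

Hence

\[ x(n\varepsilon T) = x(0) + n\varepsilon. \]

It follows from continuity that $x(t) \to X(t)$ where

\[ X(t) = x(0) + \frac{t}{T}. \]

This limiting variable satisfies the homogenized equation

\[ \frac{dX}{dt} = \frac{1}{T} = a^*. \]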
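The characteristics argument can be illustrated by integrating $dx/dt = a(x/\varepsilon)$ directly and comparing with $X(t) = x(0) + t/T$. This is a sketch under stated assumptions: the velocity $a(y) = 2 + \sin(2\pi y)$, the Runge-Kutta integrator, the step size and the initial condition are all choices made here, not taken from the text.

```python
import math

def a(y):
    """Positive, smooth, 1-periodic velocity (illustrative choice)."""
    return 2.0 + math.sin(2.0 * math.pi * y)

def solve_characteristic(x0, eps, t_end, dt):
    """Integrate dx/dt = a(x/eps) with the classical fourth-order Runge-Kutta scheme."""
    def rhs(x):
        return a(x / eps)
    x = x0
    for _ in range(round(t_end / dt)):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * dt * k1)
        k3 = rhs(x + 0.5 * dt * k2)
        k4 = rhs(x + dt * k3)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

def period_T(n=2000):
    """Midpoint-rule approximation of T = int_0^1 dz / a(z)."""
    return sum(1.0 / a((k + 0.5) / n) for k in range(n)) / n

eps = 0.01
x0 = 0.3
t_end = 1.0

x_eps = solve_characteristic(x0, eps, t_end, dt=1e-4)  # full problem
X = x0 + t_end / period_T()                            # homogenized solution

print(abs(x_eps - X))  # of the order of eps
```

Since $x(n\varepsilon T) = x(0) + n\varepsilon$ exactly and $x$ is increasing, the two solutions can differ by at most about $\varepsilon$ at any time, which is what the computation reflects.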

14.7 The Two–Dimensional Case. Kolmogorov's Theorem

Unfortunately, even for divergence-free fields, the ergodic hypothesis is often not satisfied. Consider the equation (14.4.2a):

\[ L_0 u_0 = a(y) \cdot \nabla_y u_0(x,y,t) = 0. \tag{14.7.1} \]

This equation rarely implies that the first term in the expansion is independent of $y$: the null space of the operator $L_0$ contains, in general, non-trivial functions of $y$. As an example, consider the smooth, 1-periodic, divergence-free field

\[ a(y) = (\sin(2\pi y_2), \sin(2\pi y_1)). \]

It is easy to check that the function

\[ f(y) = \cos(2\pi y_1) - \cos(2\pi y_2) \]

solves equation (14.7.1). Consequently, the null space of $L_0$ depends on the velocity field $a(y)$ and does not consist, in general, merely of constants in $y$. This implies that we cannot carry out the homogenization procedure using the method of multiple scales: no useful information is contained in equations (14.4.2).
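The claim that $f$ lies in the null space of $L_0$ for this velocity field is easy to confirm by evaluating $a(y) \cdot \nabla_y f(y)$ at sample points. The sketch below does this at randomly chosen points of the unit square; the sampling strategy and the seed are arbitrary choices made here.

```python
import math
import random

TWO_PI = 2.0 * math.pi

def a(y1, y2):
    """The divergence-free field a(y) = (sin(2 pi y2), sin(2 pi y1))."""
    return (math.sin(TWO_PI * y2), math.sin(TWO_PI * y1))

def grad_f(y1, y2):
    """Gradient of f(y) = cos(2 pi y1) - cos(2 pi y2)."""
    return (-TWO_PI * math.sin(TWO_PI * y1), TWO_PI * math.sin(TWO_PI * y2))

def L0_f(y1, y2):
    """Evaluate (L0 f)(y) = a(y) . grad_y f(y)."""
    a1, a2 = a(y1, y2)
    g1, g2 = grad_f(y1, y2)
    return a1 * g1 + a2 * g2

rng = random.Random(0)
max_residual = max(abs(L0_f(rng.random(), rng.random())) for _ in range(1000))

print(max_residual)  # zero up to rounding error
```

Because $f$ is non-constant and constant along trajectories, the flow cannot be ergodic: every trajectory is confined to a level set of $f$.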

It is natural to ask whether there is a way of deciding whether a given divergence-free velocity field on $\mathbb{T}^2$ is ergodic or not. This is indeed possible in two dimensions. We have the following theorem, which is due to Kolmogorov.


Theorem 14.2. Let $a(y) : \mathbb{T}^2 \to \mathbb{R}^2$ be a smooth divergence-free velocity field with no stagnation points:

\[ a_1^2(y) + a_2^2(y) > 0, \quad \forall y \in \mathbb{T}^2. \]

Let $\bar{a}_i$, $i = 1, 2$, denote the average of the $i$th component of the velocity field over $\mathbb{T}^2$ and define the rotation number as

\[ \gamma = \frac{\bar{a}_1}{\bar{a}_2}. \]

Assume that $\gamma$ is irrational and normally approximated by rationals. Then $a(y)$ is ergodic.

Remark 14.3. A number $a \in \mathbb{R}$ is said to be normally approximated by rationals if there exist $C, \varepsilon$ such that for every integer $q$

\[ \min_p \left| a - \frac{p}{q} \right| > \frac{C}{q^{2+\varepsilon}}, \]

where the minimum is taken over the set of all integers.
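The condition in Remark 14.3 can be explored numerically for a specific number. The sketch below takes the golden-ratio conjugate $(\sqrt{5}-1)/2$, a standard example of a badly approximable number, and computes $q^2 \min_p |a - p/q|$ for $q = 1, \dots, 1000$. The choice of number, the range of $q$ and the function name `scaled_distance` are assumptions made here; a finite scan of this kind is only suggestive and proves nothing about all $q$.

```python
import math

def scaled_distance(alpha, q, exponent=2.0):
    """Return q**exponent * min_p |alpha - p/q| for a positive integer q."""
    p = round(alpha * q)  # the nearest integer numerator minimizes |alpha - p/q|
    return (q ** exponent) * abs(alpha - p / q)

golden = (math.sqrt(5.0) - 1.0) / 2.0

# Smallest value of q^2 * min_p |golden - p/q| over the scanned range.
worst = min(scaled_distance(golden, q) for q in range(1, 1001))

print(worst)
```

Over this range the scaled distance stays bounded away from zero, consistent with a lower bound of the form $C/q^2$, which is stronger than the bound $C/q^{2+\varepsilon}$ required in the remark.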

14.8 Discussion and Bibliography

The problem of homogenization for linear transport equations has been studied by many authors. See for example [36, 70, 145, 24].

We remark that the calculation of the effective velocity $\bar{a}$ does not require the solution of a Poisson equation. This is a common characteristic of first order perturbation theory (we only needed to go up to terms of $O(\varepsilon)$ in the expansion in order to obtain the homogenized equation). The perturbation expansion used here is analogous to that used in the method of averaging, for Markov chains, ODE and SDE, in Chapters 9 and 10.

The method of multiple scales enabled us to obtain the homogenized equation for the linear transport equation (14.2.4a) only in the case where the velocity field is ergodic. The method of multiple scales breaks down when the velocity field is not ergodic, since in this case we don't really have a solvability condition which would enable us to obtain a homogenized equation. In order to study the problem for general velocity fields, not necessarily ergodic, we need to use the method of two-scale convergence. This will be done in Chapter 21.


A proof of Theorem 14.2 can be found in [135]. A similar theorem holds for velocity fields with an invariant measure other than the Lebesgue measure on $\mathbb{T}^2$. See [145].

The example studied in Section 14.6 was communicated to the authors by T. Hou. It can also be found in [38, 145].

14.9 Exercises

1. How does the dynamics in the ODE studied in Section 14.6 change if $a$ is allowed to change sign?

2. Consider the equation

\[ \frac{dx}{dt} = a\left(\frac{x}{\varepsilon}\right) b\left(\frac{t}{\varepsilon^{a}}\right) \]

in one dimension, and under the assumption that $a$ (resp. $b$) is smooth, 1-periodic and $\inf_x a > 0$ (resp. $\inf_y b > 0$). Find the averaged equations.

3. Study the problem of homogenization for (7.7.1) with a smooth periodic velocity field $a(y) : \mathbb{T}^2 \to \mathbb{R}^2$ of the form

\[ a(y) = (a_1(y_2), 0). \]

4. Study the problem of homogenization for (7.7.1) with a velocity field $a(y) : \mathbb{T}^2 \to \mathbb{R}^2$ of the form

\[ a(y) = b(y)(0, \gamma), \]

where $b(y)$ is a smooth, 1-periodic scalar function and $\gamma \in \mathbb{R}$.


Part III

Theory


Chapter 15

Invariant Manifolds for ODE: The Convergence Theorem

15.1 The Theorem

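In this section we describe a rigorous theory substantiating the perturbation expansions for invariant manifolds in Chapter 8. We study the equations

\[ \frac{dx}{dt} = f(x,y), \qquad \frac{dy}{dt} = \frac{1}{\varepsilon} g(x,y), \tag{15.1.1} \]

for $\varepsilon \ll 1$ and $x \in \mathbb{R}^l$, $y \in \mathbb{R}^{d-l}$. We assume that the dynamics for $y$ with $x$ frozen has a unique exponentially attracting fixed point, for all $x$. Specifically we assume that there exists $\eta : \mathbb{R}^l \to \mathbb{R}^{d-l}$ and $\alpha > 0$ such that

\[ g(\xi, \eta(\xi)) = 0 \quad \forall \xi \in \mathbb{R}^l, \]

\[ \langle g(x,y_1) - g(x,y_2), y_1 - y_2 \rangle \le -\alpha |y_1 - y_2|^2 \quad \forall x \in \mathbb{R}^l, \ \forall y_1, y_2 \in \mathbb{R}^{d-l}. \]

The dynamics with $x$ frozen at $\xi$ satisfies

\[ \frac{d}{dt}\varphi^t_\xi(y) = g(\xi, \varphi^t_\xi(y)), \qquad \varphi^0_\xi(y) = y. \tag{15.1.2} \]

Our assumptions on $\eta$ and $g$ imply the following exponential convergence of $\varphi^t_\xi(y)$ to its globally attracting fixed point $\eta(\xi)$.

Lemma 15.1. For all $y \in \mathbb{R}^{d-l}$

\[ |\varphi^t_\xi(y) - \eta(\xi)| \le e^{-\alpha t} |y - \eta(\xi)|. \]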


Proof. Since $\eta(\xi)$ is time-independent we have

\[ \frac{d}{dt}\varphi^t_\xi(y) = g(\xi, \varphi^t_\xi(y)), \qquad \frac{d}{dt}\eta(\xi) = g(\xi, \eta(\xi)). \]

Hence

\[ \frac{1}{2}\frac{d}{dt}|\varphi^t_\xi(y) - \eta(\xi)|^2 = \langle g(\xi, \varphi^t_\xi(y)) - g(\xi, \eta(\xi)), \varphi^t_\xi(y) - \eta(\xi) \rangle \le -\alpha |\varphi^t_\xi(y) - \eta(\xi)|^2. \]

The result follows from the differential form of the Gronwall Lemma 4.3.

In essence we wish to prove a result like this when $x$ is no longer frozen at $\xi$, but rather evolves on its own time-scale of $O(1)$. We make the following standing assumptions. These simplify the analysis and make the ideas of the proof of the basic result clearer; however they can all be weakened in various different ways. The assumptions are the existence of a constant $C > 0$ such that¹:

\[ |f(x,y)| \le C \quad \forall (x,y) \in \mathbb{R}^d, \]

\[ |\nabla_x f(x,y)| \le C \quad \forall (x,y) \in \mathbb{R}^d, \]

\[ |\nabla_y f(x,y)| \le C \quad \forall (x,y) \in \mathbb{R}^d, \]

\[ |\eta(x)| \le C \quad \forall x \in \mathbb{R}^l, \]

\[ |\nabla \eta(x)| \le C \quad \forall x \in \mathbb{R}^l. \]

With these assumptions we prove that $x$ is close to $X$ solving

\[ \frac{dX}{dt} = f(X, \eta(X)). \tag{15.1.3} \]

Theorem 15.2. Assume that $x(0) = X(0)$. Then there are constants $K, c > 0$ such that $x(t)$ solving (15.1.1) and $X(t)$ solving (15.1.3) satisfy

\[ |x(t) - X(t)|^2 \le c\,e^{Kt}\left(\varepsilon |y(0) - \eta(x(0))|^2 + \varepsilon^2\right). \]

Note that the error is of size $\sqrt{\varepsilon}$ for times of $O(1)$. However this can be reduced to size $\varepsilon$ if the initial deviation $y(0) - \eta(x(0))$ is of size $\sqrt{\varepsilon}$.

¹ In the first and fourth items in this list the norms are standard vector norms; in the second, third and fifth they are operator norms. All are Euclidean.
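The $O(\varepsilon)$ error predicted by Theorem 15.2 for well-prepared initial data can be observed on a small example. The sketch below is an illustration under assumptions made here, not part of the text: the scalar fields `f`, `g` and `eta` are hypothetical choices satisfying the standing assumptions with $\alpha = 1$ and $C = 1$, both systems are integrated by a classical Runge-Kutta scheme, and the initial condition lies on the manifold, $y(0) = \eta(x(0))$.

```python
import math

def eta(x):
    """Graph y = eta(x) of the fixed points of the fast dynamics (illustrative choice)."""
    return math.sin(x)

def f(x, y):
    """Slow vector field, bounded with bounded derivatives (illustrative choice)."""
    return math.cos(y)

def g(x, y):
    """Fast vector field; g(x, eta(x)) = 0 and the contraction assumption holds with alpha = 1."""
    return eta(x) - y

def rk4_step(rhs, state, dt):
    """One classical Runge-Kutta step for d(state)/dt = rhs(state)."""
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = rhs([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (p + 2.0 * q + 2.0 * r + w) / 6.0
            for s, p, q, r, w in zip(state, k1, k2, k3, k4)]

def slow_error(eps, t_end=1.0, x0=0.5):
    """Return |x(t_end) - X(t_end)| for the full and reduced systems with y(0) = eta(x(0))."""
    dt = eps / 20.0  # resolves the fast time-scale of size eps
    full = [x0, eta(x0)]
    reduced = [x0]
    def full_rhs(s):
        return [f(s[0], s[1]), g(s[0], s[1]) / eps]
    def reduced_rhs(s):
        return [f(s[0], eta(s[0]))]
    for _ in range(round(t_end / dt)):
        full = rk4_step(full_rhs, full, dt)
        reduced = rk4_step(reduced_rhs, reduced, dt)
    return abs(full[0] - reduced[0])

errors = {eps: slow_error(eps) for eps in (0.04, 0.02, 0.01)}
for eps, err in errors.items():
    print(eps, err)
```

For these particular choices the error shrinks as $\varepsilon$ is reduced and stays below a small multiple of $\varepsilon$, in line with the theorem when $y(0) - \eta(x(0)) = 0$.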


15.2 The Proof

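Define $z(t)$ by $y(t) = \eta(x(t)) + z(t)$. Then

\[ \begin{aligned} \frac{dz}{dt} &= \frac{dy}{dt} - \nabla\eta(x)\frac{dx}{dt} \\ &= \frac{1}{\varepsilon} g(x, \eta(x) + z) - \nabla\eta(x) f(x, \eta(x) + z) \\ &= \frac{1}{\varepsilon}\bigl(g(x, \eta(x) + z) - g(x, \eta(x))\bigr) - \nabla\eta(x) f(x, \eta(x) + z). \end{aligned} \]

Now

\[ \langle g(x, \eta(x) + z) - g(x, \eta(x)), z \rangle \le -\alpha |z|^2. \]

Hence, choosing $\delta^2 = \varepsilon/\alpha$,

\[ \begin{aligned} \frac{1}{2}\frac{d}{dt}|z|^2 &\le -\frac{\alpha}{\varepsilon}|z|^2 + C^2 |z| \\ &\le -\frac{\alpha}{\varepsilon}|z|^2 + \frac{\delta^2}{2} C^4 + \frac{1}{2\delta^2}|z|^2 \\ &\le -\frac{\alpha}{2\varepsilon}|z|^2 + \frac{\varepsilon}{2\alpha} C^4. \end{aligned} \]

Hence

\[ \frac{d}{dt}\left(e^{\frac{\alpha}{\varepsilon}t}|z|^2\right) \le e^{\frac{\alpha}{\varepsilon}t}\,\frac{\varepsilon}{\alpha} C^4. \]

Integrating gives

\[ |z(t)|^2 \le e^{-\frac{\alpha}{\varepsilon}t}|z(0)|^2 + \left(1 - e^{-\frac{\alpha}{\varepsilon}t}\right)\frac{\varepsilon^2 C^4}{\alpha^2}. \tag{15.2.1} \]

Now

\[ \frac{dX}{dt} = f(X, \eta(X)), \qquad \frac{dx}{dt} = f(x, \eta(x) + z). \]

Hence

\[ \begin{aligned} \frac{d}{dt}(x - X) &= f(x, \eta(x) + z) - f(X, \eta(X)) \\ &= f(x, \eta(x) + z) - f(x, \eta(X)) + f(x, \eta(X)) - f(X, \eta(X)). \end{aligned} \]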


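Thus

\[ \begin{aligned} \frac{1}{2}\frac{d}{dt}|x - X|^2 &\le C|\eta(x) - \eta(X) + z||x - X| + C|x - X|^2 \\ &\le (C^2 + C)|x - X|^2 + C|z||x - X|. \end{aligned} \]

Since $2C|z||x - X| \le C^2|x - X|^2 + |z|^2$, it follows that

\[ \frac{d}{dt}|x - X|^2 \le (3C^2 + 2C)|x - X|^2 + |z|^2. \]

Letting $K = 3C^2 + 2C$, and using the bound (15.2.1) for $|z(t)|^2$ we obtain

\[ \frac{d}{dt}\left(e^{-Kt}|x - X|^2\right) \le e^{-\frac{\alpha}{\varepsilon}t}|z(0)|^2 + e^{-Kt}\frac{\varepsilon^2}{\alpha^2} C^4. \]

Integrating and using $x(0) = X(0)$ gives the desired result.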

15.3 Discussion and Bibliography

Theorem 15.2 shows that $x$ from the full equations (15.1.1) remains close to $X$ solving the reduced equations (15.1.3) over time-scales which are of the order $\ln(\varepsilon^{-1})$. On longer time-scales the individual solutions can diverge, because of the exponential separation of trajectories which may be present in any dynamical system. Notice, however, that (15.2.1) shows that

\[ \limsup_{t \to \infty} |y(t) - \eta(x(t))|^2 \le \frac{C^4}{\alpha^2}\varepsilon^2, \]

suggesting that $y(t)$ is approximately slaved to $x(t)$, via $y = \eta(x)$, for arbitrary time-intervals. There are results concerning the approximation of $x$ over arbitrarily long times. These long-time approximation results are built on making rigorous the construction of an invariant manifold as described in Chapter 8. The idea is as follows. Consider the equations

\[ \frac{dx}{dt} = f(x, \eta(x) + z), \qquad \frac{dz}{dt} = \frac{1}{\varepsilon} g(x, \eta(x) + z) - \nabla\eta(x) f(x, \eta(x) + z). \]

Notice that $y = \eta(x) + z$. Using the fact that $\nabla_y g(x, \eta(x))$ is negative-definite it is possible to prove the existence of an invariant manifold for $z$ with the form

\[ z = \varepsilon \eta_1(x; \varepsilon) \]


with $\eta_1$ bounded uniformly in $\varepsilon$. To be precise, the equations for $x$ and $z$ started with initial conditions $z(0) = \varepsilon\eta_1(x(0); \varepsilon)$ will satisfy $z(t) = \varepsilon\eta_1(x(t); \varepsilon)$ for all positive time. Furthermore the manifold is attracting so that $|z(t) - \varepsilon\eta_1(x(t); \varepsilon)| \to 0$ as $t \to \infty$. Thus we have an attractive invariant manifold for $y$ with the form

\[ y = \eta(x) + \varepsilon\eta_1(x; \varepsilon). \]

The existence and uniqueness of invariant manifolds can be proved by a variety of techniques, predominantly the Lyapunov-Perron approach (Hale [63], Temam [148]) and the Hadamard graph transform (Wells [153]). Important work in this area is due to Fenichel [46, 47] who sets up a rather general construction of normally hyperbolic invariant manifolds. The book of Carr [27] has a clear introduction to the Lyapunov-Perron approach to proving existence of invariant manifolds.

15.4 Exercises

1. Show that, under the assumptions on $g$ stated at the beginning of the chapter, $\varphi^t_\xi : \mathbb{R}^{d-l} \to \mathbb{R}^{d-l}$ is a contraction mapping for any $t > 0$. What is its fixed point?

2. Prove a result similar to Theorem 15.2 but removing the assumption that $\eta$ and $f$ are globally bounded; use instead linear growth assumptions on $\eta$ and $f$.

3. Consider the equations

\[ \frac{dx}{dt} = Ax + \varepsilon f_0(x,y), \qquad \frac{dy}{dt} = -\frac{1}{\varepsilon} By + g_0(x,y), \tag{15.4.1} \]

for $\varepsilon \ll 1$ and $x \in \mathbb{R}^l$, $y \in \mathbb{R}^{d-l}$. Assume that $B$ is symmetric positive-definite.

Let $z(t, x_0; \eta)$ solve the equation

\[ \frac{dz}{dt} = Az + f_0(z, \eta(z)), \qquad z(0, x_0; \eta) = x_0. \]

Define $T\eta$ by

\[ (T\eta)(x_0) = \int_{-\infty}^{0} e^{-Bs/\varepsilon} g_0\bigl(z(s, x_0; \eta), \eta(z(s, x_0; \eta))\bigr)\,ds. \]


Show that if $\eta$ is a fixed point of $T$ then $y = \eta(x)$ is an invariant manifold for the equations (15.4.1). (This is known as the Lyapunov-Perron approach to the construction of invariant manifolds.)

4. Assume that $f_0$, $g_0$ and all derivatives are uniformly bounded. Prove that $T$ from the previous question has a fixed point. To do this apply a contraction mapping argument in a space of Lipschitz graphs $\eta$ which are sufficiently small and have sufficiently small Lipschitz constant.


Chapter 16

Averaging for Markov Chains: The Convergence Theorem

16.1 The Theorem

In this chapter we prove a result concerning averaging for Markov chains. The set-up is as in Chapter 9. To make the proofs simpler we concentrate on the finite state-space case. Let $I_x, I_y \subseteq \{1, 2, \dots\}$ be finite sets. Consider a Markov chain

\[ z^\varepsilon(t) = \begin{pmatrix} x^\varepsilon(t) \\ y^\varepsilon(t) \end{pmatrix} \]

on $I_x \times I_y$. We assume that the backward equation has the form

\[ \frac{dv}{dt} = \frac{1}{\varepsilon} Q_0 v + Q_1 v \tag{16.1.1} \]

where $Q_0$, $Q_1$ are given by (9.2.4). Let $x_0(t)$ be a Markov chain on $I_x$ with backward equation

\[ \frac{dv_0}{dt} = \bar{Q}_1 v_0, \tag{16.1.2} \]

and with $\bar{Q}_1$ given by (9.3.1). We are interested in approximating $x^\varepsilon(t)$ by $x_0(t)$. Note that the formula for the approximate process implied by the Kolmogorov equation is exactly that derived in Chapter 9 by means of formal asymptotics.

Note that $x^\varepsilon(t)$ is not itself Markovian; only the pair $(x^\varepsilon(t), y^\varepsilon(t))$ is. Thus we are approximating a non-Markov stochastic process by a Markovian one. To be precise we prove that, at any fixed time, the statistics of $x^\varepsilon(t)$ are close to those of $x_0(t)$. That is, we prove weak convergence of $x^\varepsilon(t)$ to $x_0(t)$ at any fixed time $t$.


Theorem 16.1. For any $t > 0$, $x^\varepsilon(t) \Rightarrow x_0(t)$, as $\varepsilon \to 0$.

16.2 The Proof

Let $v_0$ be defined as in (16.1.2). We then have

\[ v_0 \in \mathrm{Null}(Q_0), \qquad \frac{dv_0}{dt} - Q_1 v_0 \perp \mathrm{Null}(Q_0^*). \]

This follows from the construction of $v_0$ in Chapter 9. Hence there exists $v_1$ so that

\[ Q_0 v_0 = 0, \]

\[ Q_0 v_1 = \frac{dv_0}{dt} - Q_1 v_0. \]

We can make $v_1$ unique by insisting that it has zero average against the null-space of $Q_0^*$, although this particular choice is not necessary. We simply ask that a solution is chosen which is $\varepsilon$-independent. Since the equation itself is $\varepsilon$-independent, such a solution can always be chosen simply by insisting that the part of the solution $v_1$ in the null-space of $Q_0$ is independent of $\varepsilon$. For any such $v_1$, and for $v_0$ given by (16.1.2), define

\[ r = v - v_0 - \varepsilon v_1. \]

Substituting $v = v_0 + \varepsilon v_1 + r$ into (16.1.1) and using the properties of $v_0$, $v_1$, we obtain

\[ \frac{dv_0}{dt} + \varepsilon\frac{dv_1}{dt} + \frac{dr}{dt} = \frac{1}{\varepsilon} Q_0 v_0 + Q_0 v_1 + \frac{1}{\varepsilon} Q_0 r + Q_1 v_0 + \varepsilon Q_1 v_1 + Q_1 r. \]

Hence

\[ \frac{dr}{dt} = \left(\frac{1}{\varepsilon} Q_0 + Q_1\right) r + \varepsilon q, \]

\[ q = Q_1 v_1 - \frac{dv_1}{dt}. \]

Now $Q = \frac{1}{\varepsilon} Q_0 + Q_1$ is the generator of a Markov chain. Hence using $\|\cdot\|_\infty$ to denote the supremum norm on vectors over the finite set $I_x \times I_y$, as well as the induced operator norm, we have

\[ \|e^{Qt}\|_\infty = 1. \tag{16.2.1} \]

This follows from (5.1.2) because $e^{Qt}$ is a stochastic matrix.
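The norm identity (16.2.1) can be checked on a small example. In the sketch below the two $3 \times 3$ generators `Q0` and `Q1`, the values of $\varepsilon$ and $t$, and the helper `mat_exp` (a scaling-and-squaring Taylor approximation of the matrix exponential) are all hypothetical choices made here; they are not the matrices of (9.2.4).

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(A, squarings=10, terms=20):
    """Approximate exp(A) by a truncated Taylor series for A / 2^squarings, then repeated squaring."""
    n = len(A)
    scale = 2.0 ** squarings
    S = [[A[i][j] / scale for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, S)]  # S^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def sup_norm(M):
    """Operator norm induced by the supremum norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in M)

# Two generators: non-negative off-diagonal entries, rows summing to zero.
Q0 = [[-2.0, 2.0, 0.0], [1.0, -1.0, 0.0], [0.0, 3.0, -3.0]]
Q1 = [[-1.0, 0.0, 1.0], [0.5, -0.5, 0.0], [0.0, 0.0, 0.0]]

eps, t = 0.1, 0.7
Qt = [[t * (Q0[i][j] / eps + Q1[i][j]) for j in range(3)] for i in range(3)]
P = mat_exp(Qt)  # approximates exp(Q t) with Q = Q0 / eps + Q1

print(sup_norm(P))  # equals 1 up to rounding error
```

The rows of `P` are non-negative and sum to one, so the largest absolute row sum is one regardless of how small $\varepsilon$ is; this uniformity in $\varepsilon$ is what the estimate for $r(t)$ relies on.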


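By the variation of constants formula we have

\[ r(t) = e^{Qt} r(0) + \varepsilon \int_0^t e^{Q(t-s)} q(s)\,ds, \tag{16.2.2} \]

viewing $r(t)$, $q(t)$ as vectors on $I_x \times I_y$, for each $t$. We assume that $v(i,j,0) = \phi(i)$, $v_0(i,0) = \phi(i)$ and then

\[ v(i,j,t) = \mathbb{E}\bigl(\phi(x^\varepsilon(t)) \,|\, x^\varepsilon(0) = i, y^\varepsilon(0) = j\bigr), \tag{16.2.3a} \]

\[ v_0(i,t) = \mathbb{E}\bigl(\phi(x_0(t)) \,|\, x_0(0) = i\bigr). \tag{16.2.3b} \]

Weak convergence of $x^\varepsilon(t)$ to $x_0(t)$, for fixed $t$, is proved if

\[ v(i,j,t) \to v_0(i,t), \quad \text{as } \varepsilon \to 0, \]

for any $\phi : I_x \to \mathbb{R}$. Equations (16.2.3) imply that $r(0) = -\varepsilon v_1(0)$. Using (16.2.1) we have, from (16.2.2),

\[ \begin{aligned} \|r(t)\|_\infty &\le \varepsilon \|e^{Qt}\|_\infty \|v_1(0)\|_\infty + \varepsilon \int_0^t \|e^{Q(t-s)}\|_\infty \|q(s)\|_\infty\,ds \\ &\le \varepsilon \|v_1(0)\|_\infty + \varepsilon \int_0^t \|q(s)\|_\infty\,ds \\ &\le \varepsilon \Bigl(\|v_1(0)\|_\infty + t \sup_{0 \le s \le t} \|q(s)\|_\infty\Bigr). \end{aligned} \]

Hence, for any fixed $t > 0$, $r(t) \to 0$ as $\varepsilon \to 0$ and it follows that $v \to v_0$ as $\varepsilon \to 0$.

Remark 16.2. The proof actually gives a convergence rate because it shows that

\[ \|v(t) - v_0(t)\|_\infty = O(\varepsilon), \quad \text{as } \varepsilon \to 0. \]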

16.3 Discussion and Bibliography

The result we prove only shows that the random variable $x^\varepsilon(t)$ converges to $x_0(t)$ for each fixed $t$. It is also possible to prove the more interesting result that the convergence actually occurs pathwise in the Skorokhod topology (defined in [55]). See [136].


16.4 Exercises

1. Consider the two-state continuous time Markov chain $y$ with generator

\[ L = \frac{1}{\varepsilon} \begin{pmatrix} -a & a \\ b & -b \end{pmatrix} \]

and state-space $I = \{-1, +1\}$. Consider the ODE on $\mathbb{T}^d$ given by

\[ \frac{dx}{dt} = f(x,y) \]

where $f : \mathbb{T}^d \times I \to \mathbb{R}^d$.

a. Write down the generator for this process.

b. Using multiscale analysis, show that the averaged ODE is

\[ \frac{dX}{dt} = F(X) \]

where

\[ F(x) = \lambda f(x, +1) + (1 - \lambda) f(x, -1) \]

and $\lambda \in (0,1)$ should be specified.

2. Prove the assertions made in the preceding exercise.

3. Conjecture what the behaviour of $x$ is in the case where $F(x) = 0$ and large times are considered.


Chapter 17

Averaging for SDE: The Convergence Theorem

17.1 The Theorem

The goal of this chapter is to develop a rigorous theory based on the averaging principle for SDE that we developed in Chapter 10. To allow for a simplified proof we study the problem on the torus. Consider the SDE on $\mathbb{T}^d$ given by

\[ \frac{dx}{dt} = f(x,y), \tag{17.1.1a} \]

\[ \frac{dy}{dt} = \frac{1}{\varepsilon} g(x,y) + \frac{1}{\sqrt{\varepsilon}} \beta(x,y) \frac{dV}{dt}, \tag{17.1.1b} \]

where $V$ is a standard Brownian motion on $\mathbb{R}^{d-l}$, and $f : \mathbb{T}^l \times \mathbb{T}^{d-l} \to \mathbb{R}^l$, $g : \mathbb{T}^l \times \mathbb{T}^{d-l} \to \mathbb{R}^{d-l}$, $\beta : \mathbb{T}^l \times \mathbb{T}^{d-l} \to \mathbb{R}^{(d-l) \times (d-l)}$ are smooth periodic functions. Assume also that, writing $z = (x^T, y^T)^T$,

\[ \exists\, \bar{\beta} > 0 : \quad \langle \xi, \beta(x,y)\xi \rangle \ge \bar{\beta} |\xi|^2, \quad \forall \xi \in \mathbb{R}^{d-l}, \ z \in \mathbb{T}^d. \]

Recall that, under this assumption, the process found by freezing $x = \xi$ in (17.1.1b) is ergodic (Theorem 6.10). Thus an effective equation for the evolution of $x$ can be found by averaging $f$ over the invariant measure of this ergodic process. We now make this idea precise.

The process $\varphi^t_\xi$ given by (10.2.2) is ergodic and has a smooth invariant density $\rho^\infty(y; \xi)$. This invariant density spans the null-space of $L_0^*$, found as the adjoint of $L_0$ given by

\[ L_0 = g(x,y) \cdot \nabla_y + \frac{1}{2} B(x,y) : \nabla_y \nabla_y \tag{17.1.2} \]


where $B(x,y) = \beta(x,y)\beta(x,y)^T$; both $L_0$ and $L_0^*$ have periodic boundary conditions. The averaged equations are then

\[ \frac{dX}{dt} = F(X), \tag{17.1.3a} \]

\[ F(\xi) = \int_{\mathbb{T}^{d-l}} f(\xi, y) \rho^\infty(y; \xi)\,dy. \tag{17.1.3b} \]

Note that $F : \mathbb{T}^l \to \mathbb{R}^l$ is periodic by construction. The resulting formulae are exactly those given in Chapter 10, specialized to the particular drift and diffusion coefficients on the torus studied in this chapter.

Theorem 17.1. Let $p > 1$ and let $x(0) = X(0)$. Then the function $x(t)$ solving (17.1.1) converges to $X(t)$ solving (17.1.3) in $L^p(\Omega, C([0,T], \mathbb{T}^l))$: for any $T > 0$, there is $C = C(T)$ such that

\[ \mathbb{E} \sup_{0 \le t \le T} |x(t) - X(t)|^p \le C \varepsilon^{p/2}. \]
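A single sample path already gives a feel for Theorem 17.1. The sketch below is an illustration under assumptions made here: it takes $l = 1$, $d = 2$, $g = 0$ and $\beta = 1$, so that the fast process is a rescaled Brownian motion on the circle with uniform invariant density, and the hypothetical slow drift $f(x,y) = 1 + \cos(2\pi x)\sin(2\pi y)$, for which $F \equiv 1$ and $X(t) = X(0) + t$. The time discretization (forward Euler for $x$, exact Gaussian increments for $y$), the step size and the seed are also choices made here.

```python
import math
import random

def f(x, y):
    """Slow drift (illustrative choice); its average over y in [0, 1] equals 1 for every x."""
    return 1.0 + math.cos(2.0 * math.pi * x) * math.sin(2.0 * math.pi * y)

def simulate_slow_variable(eps, t_end=1.0, dt=1e-4, x0=0.2, y0=0.0, seed=0):
    """One path of dx/dt = f(x, y), dy/dt = eps^{-1/2} dV/dt on the torus; returns x(t_end)."""
    rng = random.Random(seed)
    x, y = x0, y0
    noise_scale = math.sqrt(dt / eps)  # standard deviation of the increment of y over one step
    for _ in range(round(t_end / dt)):
        x += dt * f(x, y)                                   # forward Euler step for x
        y = (y + noise_scale * rng.gauss(0.0, 1.0)) % 1.0   # Brownian increment, wrapped to [0, 1)
    return x

eps = 1e-3
x_end = simulate_slow_variable(eps)
X_end = 0.2 + 1.0  # averaged dynamics dX/dt = 1 with X(0) = 0.2

error = abs(x_end - X_end)
print(error)
```

For this example the deviation is expected to be of order $\sqrt{\varepsilon}$ or smaller, in line with the rate $\varepsilon^{p/2}$ for the $p$th moment; a quantitative check of the rate would average over many paths and several values of $\varepsilon$.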

17.2 The Proof

Recall that $L_0$ is the generator for $\varphi^t_x(y)$, with $x$ viewed as a fixed parameter, given by (17.1.2). Thus $L_0$ is a differential operator in $y$ only; $x$ appears as a parameter. Now let $\phi(x,y)$ solve the elliptic boundary value problem

\[ L_0 \phi(x,y) = f(x,y) - F(x), \qquad \int_{\mathbb{T}^{d-l}} \phi(x,y)\,dy = 0, \]

\[ \phi(x, \cdot) \text{ is periodic on } \mathbb{T}^{d-l}. \]

By construction

\[ \int_{\mathbb{T}^{d-l}} \bigl(f(x,y) - F(x)\bigr) \rho^\infty(y; x)\,dy = 0 \]

and $\rho^\infty$ spans $\mathrm{Null}(L_0^*)$. By the Fredholm alternative, this problem has a unique solution $\phi$.

Lemma 17.2. The functions $f, \phi, \nabla_x\phi, \nabla_y\phi, \beta$ are smooth and bounded.

Proof. The properties of $f, \beta$ follow from the fact that they are defined on the torus and have derivatives of all orders by assumption. Since the invariant density $\rho^\infty$ is the solution of an elliptic eigenvalue problem on the torus, it too is smooth and periodic. Hence $F$ is smooth and periodic. Consequently $f - F$ is smooth and periodic. Hence $\phi$ and all its derivatives are smooth and periodic.

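Proof of Theorem 17.1. Notice that the generator for (17.1.1) is

\[ L = \frac{1}{\varepsilon} L_0 + L_1. \]

Here $L_0$ is given by (17.1.2) and

\[ L_1 = f(x,y) \cdot \nabla_x. \]

Now we apply the Itô formula (Lemma 6.4) to $\phi$ to obtain the following informal expression, with precise interpretation found by integrating in time:

\[ \frac{d\phi}{dt}(x,y) = \frac{1}{\varepsilon}(L_0\phi)(x,y) + f(x,y) \cdot \nabla_x\phi(x,y) + \frac{1}{\sqrt{\varepsilon}} \nabla_y\phi(x,y)\beta(x,y)\frac{dV}{dt}. \]

Since $L_0\phi = f - F$ we obtain

\[ \begin{aligned} \frac{dx}{dt} &= F(x) + (L_0\phi)(x,y) \\ &= F(x) + \varepsilon\frac{d\phi}{dt} - \varepsilon f(x,y) \cdot \nabla_x\phi(x,y) - \sqrt{\varepsilon}\,\nabla_y\phi(x,y)\beta(x,y)\frac{dV}{dt}. \end{aligned} \tag{17.2.1} \]

(Again a formal expression made rigorous by time-integration.) The functions $f, \phi, \nabla_x\phi$ are all smooth and bounded by Lemma 17.2 so that

\[ \theta(t) := \bigl(\phi(x(t), y(t)) - \phi(x(0), y(0))\bigr) - \int_0^t f(x(s), y(s)) \cdot \nabla_x\phi(x(s), y(s))\,ds \]

satisfies

\[ \sup_{0 \le \tau \le t} |\theta(\tau)| \le C_1. \]

Now define

\[ M(t) := -\int_0^t \nabla_y\phi(x(s), y(s))\beta(x(s), y(s))\,dV(s). \]

Since $\nabla_y\phi, \beta$ are smooth and bounded, by Lemma 17.2, the Itô isometry gives

\[ \mathbb{E}|M(t)|^2 = \int_0^t \mathbb{E}|\nabla_y\phi(x(s), y(s))\beta(x(s), y(s))|_F^2\,ds \le C_2 t. \tag{17.2.2} \]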


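Now the precise interpretation of (17.2.1) is

\[ x(t) = x(0) + \int_0^t F(x(s))\,ds + \varepsilon\theta(t) + \sqrt{\varepsilon} M(t). \]

Also, from (17.1.3),

\[ X(t) = X(0) + \int_0^t F(X(s))\,ds. \]

Let $e(t) = x(t) - X(t)$ so that, using $e(0) = 0$,

\[ e(t) = \int_0^t \bigl(F(x(s)) - F(X(s))\bigr)\,ds + \varepsilon\theta(t) + \sqrt{\varepsilon} M(t). \]

Since $F$ is Lipschitz on $\mathbb{T}^l$, with constant $L$, we obtain, for $t \in [0,T]$,

\[ |e(t)| \le \int_0^t L|e(s)|\,ds + \varepsilon C_1 + \sqrt{\varepsilon}|M(t)|. \]

Hence, by (17.2.2) and the Burkholder-Davis-Gundy inequality (Theorem 3.15), we obtain

\[ \begin{aligned} \mathbb{E} \sup_{0 \le t \le T} |e(t)|^p &\le C\Bigl(\varepsilon^p + \varepsilon^{p/2}\,\mathbb{E} \sup_{0 \le t \le T} |M(t)|^p + L^p T^{p-1} \int_0^T \mathbb{E}|e(s)|^p\,ds\Bigr) \\ &\le C\Bigl(\varepsilon^p + \varepsilon^{p/2}\bigl(\mathbb{E}|M(T)|^2\bigr)^{p/2} + L^p T^{p-1} \int_0^T \mathbb{E}|e(s)|^p\,ds\Bigr) \\ &\le C\Bigl(\varepsilon^{p/2} + \int_0^T \mathbb{E} \sup_{0 \le \tau \le s} |e(\tau)|^p\,ds\Bigr). \end{aligned} \]

By the integrated version of the Gronwall inequality in Lemma 4.3 we deduce that

\[ \mathbb{E} \sup_{0 \le t \le T} |e(t)|^p \le C\varepsilon^{p/2}. \]

Note that the constant $C$ grows with $T$.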

17.3 Discussion and Bibliography

Note that our convergence result proves strong convergence: we compare each path of the SDE for $(x,y)$ with the approximating ODE for $X$. This strong convergence result is possible primarily because the limiting approximation is in this case deterministic. When the approximation is itself stochastic, as arises for (10.6.1), then it is more natural to study weak convergence type results (but see Exercise 1 below). For results of the latter type see [41].

Averaging results for ODE are proved by different techniques. Much of the original motivation for this work comes from averaging of perturbed integrable Hamiltonian systems, expressed in action-angle variables. See [62] for references.

17.4 Exercises

1. Consider equation (10.6.1) in the case where $\alpha(x,y) \equiv 1$. Write down the averaged dynamics for $X$ and modify the techniques of this chapter to prove strong convergence of $x$ to $X$.

2. Consider equation (10.6.1) in the case where $d = 2$, $l = 1$ and $\alpha(x,y) \equiv y^2$. If $y$ is a scalar OU process (6.4.2), independent of $x$, write down the averaged dynamics for $X$ and use the properties of the OU process to prove weak convergence.


Chapter 18

Homogenization for SDE: The Convergence Theorem

18.1 The Theorem

In this chapter we develop a rigorous theory based on the homogenization principle derived in Chapter 11. We consider a simple setting where the entire problem is posed on the torus, in order to elucidate the principal ideas. Furthermore, we work in the skew-product setting where the fast process is independent of the slow process, and where the fluctuating term in the slow process depends only on the fast process; this allows for a simplified proof which nonetheless contains the essence of the main ideas.

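Consider the SDE on $\mathbb{T}^d$ given by

\[ \frac{dx}{dt} = \frac{1}{\varepsilon} f_0(y) + f_1(x, y) + \alpha\frac{dU}{dt}, \tag{18.1.1a} \]

\[ \frac{dy}{dt} = \frac{1}{\varepsilon^2} g(y) + \frac{1}{\varepsilon}\beta(y)\frac{dV}{dt}, \tag{18.1.1b} \]

where U (resp. V) is a standard Brownian motion on $\mathbb{R}^l$ (resp. $\mathbb{R}^{d-l}$) and $\alpha \in \mathbb{R}^{l\times l}$. The functions $f_0 : \mathbb{T}^{d-l} \to \mathbb{R}^l$, $f_1 : \mathbb{T}^l \times \mathbb{T}^{d-l} \to \mathbb{R}^l$, $g : \mathbb{T}^{d-l} \to \mathbb{R}^{d-l}$, $\beta : \mathbb{T}^{d-l} \to \mathbb{R}^{(d-l)\times(d-l)}$ are smooth and periodic. We also assume that

\[ \exists\,\bar{\beta} > 0 : \quad \langle \xi, \beta(y)\xi\rangle \ge \bar{\beta}|\xi|^2 \quad \forall\, \xi \in \mathbb{R}^{d-l},\ y \in \mathbb{T}^{d-l}. \]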

Recall that, under this assumption, the process y in (18.1.1b) is ergodic (see Result 6.10). Thus an effective equation for the evolution of x can be found by averaging $f_1$ over the invariant measure of this ergodic process, and by examining the fluctuations induced by $f_0$. The aim of this chapter is to quantify this idea.

Define the generator $\mathcal{L}_0$ as in (17.1.2):

\[ \mathcal{L}_0 v = g(y)\cdot\nabla_y v + \frac{1}{2} B(y) : \nabla_y\nabla_y v \]

with periodic boundary conditions, and where

\[ B(y) = \beta(y)\beta(y)^T. \]

The precise statement about ergodicity of the y process is that $\varphi^t(y)$ given by (10.2.2) is ergodic and has a smooth invariant density $\rho^\infty(y)$; this function spans the null-space of $\mathcal{L}_0^*$. Note that $\varphi^t(\cdot)$ and $\rho^\infty(\cdot)$ are independent of ξ here, because g, β depend only on y. We assume that

\[ \int_{\mathbb{T}^{d-l}} f_0(y)\rho^\infty(y)\,dy = 0. \]

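Under this assumption, by the calculations in Chapter 11, we may homogenize the SDE (18.1.1) to obtain

\[ \frac{dX}{dt} = F(X) + A\frac{dW}{dt}, \tag{18.1.2} \]

where

\[ F(\xi) = \int_{\mathbb{T}^{d-l}} f_1(\xi, y)\rho^\infty(y)\,dy \]

and

\[ AA^T = \alpha\alpha^T + \int_{\mathbb{T}^{d-l}} \bigl(f_0(y)\otimes\Phi(y) + \Phi(y)\otimes f_0(y)\bigr)\rho^\infty(y)\,dy. \]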

Here Φ(y) solves the cell problem

\[ \mathcal{L}_0\Phi(y) = -f_0(y), \qquad \int_{\mathbb{T}^{d-l}} \Phi(y)\rho^\infty(y)\,dy = 0, \qquad \Phi(y) \text{ periodic on } \mathbb{T}^{d-l}. \tag{18.1.3} \]

This has a unique solution, by the Fredholm alternative. The resulting formulae are exactly those given in Chapter 11, specialized to the particular drift and diffusion coefficients studied in this chapter.


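Notice that, for $\xi \in \mathbb{R}^l$ and $\phi = \Phi\cdot\xi$, the proof of Theorem 11.5 shows that $AA^T$ is positive-definite:

\[ \langle \xi, AA^T\xi\rangle = |\alpha^T\xi|^2 + \int_{\mathbb{T}^{d-l}} |\beta^T(y)\nabla_y\phi(y)|^2\rho^\infty(y)\,dy. \]

Remark 11.6 shows that

\[ AA^T = \alpha\alpha^T + \int_{\mathbb{T}^{d-l}} \rho^\infty(y)\bigl(\nabla_y\Phi(y)\beta(y)\otimes\nabla_y\Phi(y)\beta(y)\bigr)\,dy. \]

Hence the SDE for X is well-defined.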
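The two expressions for $AA^T$ can be compared numerically in a toy case. The sketch below is an illustration under assumptions that are not taken from the text: a one-dimensional fast variable on the unit torus, g = 0 and constant β = √2 (so that the generator is the second derivative and the invariant density is 1), and the hypothetical choice f0(y) = sin(2πy). The helper names `solve_cell_problem` and `effective_diffusion` are invented for this example. The cell problem then reads Φ'' = -f0 and is solved with a small discrete Fourier series.

```python
import cmath
import math

def solve_cell_problem(f0, n=64):
    """Solve Phi'' = -f0 on the unit torus (zero-mean Phi) with a discrete Fourier series."""
    ys = [j / n for j in range(n)]
    f = [f0(y) for y in ys]
    mean_f = sum(f) / n
    f = [v - mean_f for v in f]  # enforce the centering condition on f0
    ks = [m - n if m > n // 2 else m for m in range(n)]  # signed wavenumbers
    f_hat = [sum(fv * cmath.exp(-2j * math.pi * k * y) for fv, y in zip(f, ys)) / n
             for k in ks]
    # (2 pi i k)^2 Phi_hat = -f_hat, so Phi_hat = f_hat / (2 pi k)^2; Phi_hat(0) = 0 fixes the mean.
    phi_hat = [fh / (2 * math.pi * k) ** 2 if k != 0 else 0.0
               for fh, k in zip(f_hat, ks)]
    phi = [sum(ph * cmath.exp(2j * math.pi * k * y) for ph, k in zip(phi_hat, ks)).real
           for y in ys]
    dphi = [sum(2j * math.pi * k * ph * cmath.exp(2j * math.pi * k * y)
                for ph, k in zip(phi_hat, ks)).real for y in ys]
    return f, phi, dphi

def effective_diffusion(f0, alpha=0.0, n=64):
    """Return AA^T from the correlation formula and from the gradient formula."""
    f, phi, dphi = solve_cell_problem(f0, n)
    beta = math.sqrt(2.0)  # assumed constant diffusion coefficient: B = 2, L0 = d^2/dy^2
    by_correlation = alpha ** 2 + 2.0 * sum(a * b for a, b in zip(f, phi)) / n
    by_gradient = alpha ** 2 + sum((beta * d) ** 2 for d in dphi) / n
    return by_correlation, by_gradient

corr, grad = effective_diffusion(lambda y: math.sin(2.0 * math.pi * y))
```

For this choice Φ(y) = sin(2πy)/(4π²), so both formulas should return 1/(4π²) when α = 0; the agreement of `corr` and `grad` is the one-dimensional instance of the identity quoted from Remark 11.6.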

Theorem 18.1. Let x(t) solve (18.1.1) and X(t) solve (18.1.2), and set x(0) = X(0). Then, for any T > 0, x ⇒ X in $C([0, T], \mathbb{T}^l)$.

18.2 The Proof

We will use the following lemma.

Lemma 18.2. Let $w \in C([0, T], \mathbb{R}^r)$; let $F \in C^1(\mathbb{T}^l; \mathbb{T}^l)$ and let $A \in \mathbb{R}^{l\times r}$. If u satisfies the integral equation

\[ u(t) = u(0) + \int_0^t F(u(s))\,ds + Aw(t) \]

then $u \in C([0, T], \mathbb{T}^l)$ and the mapping $w \mapsto u$ is a continuous mapping from $C([0, T], \mathbb{R}^r)$ into $C([0, T], \mathbb{T}^l)$.

Proof. Existence and uniqueness follow by a standard contraction mapping argument, based on the iteration

\[ u^{(n+1)}(t) = u(0) + \int_0^t F(u^{(n)}(s))\,ds + Aw(t), \]

and using the fact that F is globally Lipschitz, with constant L. For continuity consider the equations

\[ u^{(i)}(t) = u(0) + \int_0^t F(u^{(i)}(s))\,ds + Aw^{(i)}(t) \]

for i = 1, 2. Subtracting and letting $e = u^{(1)} - u^{(2)}$, $\delta = w^{(1)} - w^{(2)}$, we get

\[ e(t) = \int_0^t \bigl(F(u^{(1)}(s)) - F(u^{(2)}(s))\bigr)\,ds + A\delta(t). \]


Hence

\[ |e(t)| \le \int_0^t L|e(s)|\,ds + |A||\delta(t)|. \]

From the integrated form of the Gronwall inequality in Lemma 4.3 it follows that

\[ \sup_{0\le t\le T}|e(t)| \le C\sup_{0\le t\le T}|\delta(t)|. \]
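The continuity bound can be checked on a discretised version of the integral equation. The sketch below is only an illustration: it assumes the scalar case l = r = 1 with the hypothetical choices F = sin (Lipschitz constant L = 1), A = 1 and two nearby driving paths; the function `solve_integral_equation` is an invented helper using a left-endpoint (Euler) rule. For this discretisation a discrete Gronwall argument gives the explicit constant C = |A| exp(LT).

```python
import math

def solve_integral_equation(w, F=math.sin, A=1.0, u0=0.0, T=1.0, n=1000):
    """Euler discretisation of u(t) = u0 + int_0^t F(u(s)) ds + A w(t), assuming w(0) = 0."""
    dt = T / n
    u = [u0]
    integral = 0.0
    for i in range(n):
        integral += F(u[-1]) * dt          # left-endpoint rule for the integral term
        u.append(u0 + integral + A * w((i + 1) * T / n))
    return u

def sup_norm(values):
    return max(abs(v) for v in values)

T, n, L, A = 1.0, 1000, 1.0, 1.0
w1 = lambda t: math.sin(3.0 * t)                            # first driving path
w2 = lambda t: math.sin(3.0 * t) + 0.01 * math.sin(5.0 * t)  # perturbed driving path
u1 = solve_integral_equation(w1, A=A, T=T, n=n)
u2 = solve_integral_equation(w2, A=A, T=T, n=n)

grid = [i * T / n for i in range(n + 1)]
sup_e = sup_norm([a - b for a, b in zip(u1, u2)])       # sup |e(t)| on the grid
sup_delta = sup_norm([w1(t) - w2(t) for t in grid])     # sup |delta(t)| on the grid
bound = abs(A) * math.exp(L * T) * sup_delta            # Gronwall-type constant times sup |delta|
```

Shrinking the perturbation of the driving path shrinks `sup_e` proportionally, which is the continuity of the map from w to u in the supremum norm.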

This establishes continuity.

Let χ(x, y) solve the cell problem

\[ \mathcal{L}_0\chi(x, y) = f_1(x, y) - F(x), \qquad \int_{\mathbb{T}^{d-l}} \chi(x, y)\rho^\infty(y)\,dy = 0, \qquad \chi(x, y) \text{ periodic on } \mathbb{T}^{d-l}. \tag{18.2.1} \]

This has a unique solution, by the Fredholm alternative, since $f_1 - F$ averages to zero over $\mathbb{T}^{d-l}$. The proof of the following lemma is very similar to the proof of Lemma 17.2; hence we omit it.

Lemma 18.3. The functions $f_0$, $f_1$, Φ, χ and all their derivatives are smooth and bounded.

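Proof of Theorem 18.1. Notice that, by the Itô formula (Lemma 6.4), we have that

\[ \frac{d\chi}{dt} = \frac{1}{\varepsilon^2}\mathcal{L}_0\chi + \frac{1}{\varepsilon}\mathcal{L}_1\chi + \mathcal{L}_2\chi + \nabla_x\chi\,\alpha\frac{dU}{dt} + \frac{1}{\varepsilon}\nabla_y\chi\,\beta\frac{dV}{dt}; \tag{18.2.2} \]

the rigorous interpretation is, as usual, the integrated form. Here the operators $\mathcal{L}_1$, $\mathcal{L}_2$ are given by

\[ \mathcal{L}_1 = f_0(y)\cdot\nabla_x, \qquad \mathcal{L}_2 = f_1(x, y)\cdot\nabla_x + \frac{1}{2}\alpha\alpha^T : \nabla_x\nabla_x. \]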

Now define

\[ \theta(t) = \varepsilon^2\bigl(\chi(x(t), y(t)) - \chi(x(0), y(0))\bigr) - \varepsilon\int_0^t (\mathcal{L}_1\chi)(x(s), y(s))\,ds - \varepsilon^2\int_0^t (\mathcal{L}_2\chi)(x(s), y(s))\,ds \tag{18.2.3} \]


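and

\[ M_1(t) = \varepsilon^2\int_0^t \nabla_x\chi(x(s), y(s))\,\alpha\,dU(s) + \varepsilon\int_0^t \nabla_y\chi(x(s), y(s))\,\beta(y(s))\,dV(s). \tag{18.2.4} \]

Since χ, $f_0$ and $f_1$ and all their derivatives are bounded, we have

\[ \mathbb{E}\sup_{0\le t\le T}|\theta(t)|^p \le C\varepsilon^p. \tag{18.2.5} \]

Furthermore, by the Itô isometry,

\[ \mathbb{E}|M_1(t)|^2 = \varepsilon^4\,\mathbb{E}\int_0^t |\nabla_x\chi(x(s), y(s))\,\alpha|_F^2\,ds + \varepsilon^2\,\mathbb{E}\int_0^t |\nabla_y\chi(x(s), y(s))\,\beta(y(s))|_F^2\,ds. \]

By the Burkholder-Davis-Gundy inequality of Theorem 3.15 we deduce that

\[ \mathbb{E}\sup_{0\le t\le T}|M_1(t)|^p \le C\varepsilon^p. \tag{18.2.6} \]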

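Hence, by (18.2.2) and (18.2.1), we deduce that

\[ \int_0^t \bigl(f_1(x(s), y(s)) - F(x(s))\bigr)\,ds = r(t) := \theta(t) + M_1(t), \tag{18.2.7} \]

where we have shown that

\[ r(t) = \mathcal{O}(\varepsilon^p) \quad\text{in } L^p\bigl(\Omega, C([0, T], \mathbb{R}^l)\bigr). \tag{18.2.8} \]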

Now apply the Itô formula to Φ solving (18.1.3) to obtain, since Φ is independent of x,

\[ \frac{d\Phi}{dt} = \frac{1}{\varepsilon^2}\mathcal{L}_0\Phi + \frac{1}{\varepsilon}\nabla_y\Phi\,\beta\frac{dV}{dt}. \]

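(As usual the rigorous interpretation is in integrated form.) Hence, using $\mathcal{L}_0\Phi = -f_0$,

\[ \frac{1}{\varepsilon}\int_0^t f_0(y(s))\,ds = \varepsilon\bigl(\Phi(y(0)) - \Phi(y(t))\bigr) + \int_0^t \nabla_y\Phi(y(s))\,\beta(y(s))\,dV(s). \tag{18.2.9} \]

Since Φ is bounded on $\mathbb{T}^{d-l}$ we deduce that, as ε → 0,

\[ \varepsilon\bigl(\Phi(y(t)) - \Phi(y(0))\bigr) = \mathcal{O}(\varepsilon^p) \quad\text{in } L^p\bigl(\Omega, C([0, T], \mathbb{R}^l)\bigr). \tag{18.2.10} \]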


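If we set

\[ M_2(t) = \int_0^t (\nabla_y\Phi)(y(s))\,\beta(y(s))\,dV(s), \]

then the Martingale Central Limit Theorem 3.31 implies that, as ε → 0,

\[ M_2(t) \Rightarrow \alpha_2 W(t), \tag{18.2.11} \]

where W(t) is standard Brownian motion on $\mathbb{R}^l$ and

\[ \alpha_2\alpha_2^T = \lim_{t\to\infty}\frac{1}{t}\int_0^t (\nabla_y\Phi)(y(s))\beta(y(s))\otimes(\nabla_y\Phi)(y(s))\beta(y(s))\,ds = \int_{\mathbb{T}^{d-l}} \rho^\infty(y)\bigl((\nabla_y\Phi)(y)\beta(y)\otimes(\nabla_y\Phi)(y)\beta(y)\bigr)\,dy. \]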
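The decomposition of the fast integral into a small boundary term plus a martingale can be observed in a simulation. The sketch below is an illustration under assumptions that depart from the torus setting of this chapter: the fast process is taken to be a scalar OU process (g(y) = -y, β = √2), with the hypothetical choice f0(y) = y, f1 = 0 and α = 0. In that case Φ(y) = y solves the cell problem, the martingale is √2 V(t), and the Euler-Maruyama scheme satisfies the identity x(t) - x(0) = ε(y(0) - y(t)) + √2 V(t) step by step, up to rounding. The function name `simulate_fast_slow` is invented for this example.

```python
import math
import random

def simulate_fast_slow(eps, T=1.0, n=20000, seed=0):
    """Euler-Maruyama for dx = (y/eps) dt, dy = -(y/eps^2) dt + (sqrt(2)/eps) dV."""
    rng = random.Random(seed)
    dt = T / n
    y0 = rng.gauss(0.0, 1.0)  # start the OU process in its invariant law N(0, 1)
    x, y, V = 0.0, y0, 0.0
    for _ in range(n):
        dV = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        x_new = x + (y / eps) * dt
        y_new = y - (y / eps ** 2) * dt + (math.sqrt(2.0) / eps) * dV
        x, y, V = x_new, y_new, V + dV
    return x, y0, y, V

eps = 0.1
x_T, y_0, y_T, V_T = simulate_fast_slow(eps)
boundary_term = eps * (y_0 - y_T)        # eps * (Phi(y(0)) - Phi(y(T))) with Phi(y) = y
martingale_term = math.sqrt(2.0) * V_T   # M_2(T) = sqrt(2) V(T) in this example
residual = x_T - (boundary_term + martingale_term)
```

In this toy case the limit is explicit: the boundary term is of order ε while the martingale term does not depend on ε at all, so x(t) is a small perturbation of √2 V(t), in agreement with $\alpha_2\alpha_2^T = 2$ computed from the invariant Gaussian density.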

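Combining (18.2.7), (18.2.9) in (18.1.1a) we obtain

\[ x(t) = x(0) + \int_0^t F(x(s))\,ds + \alpha U(t) + M_2(t) + \eta(t), \]

where

\[ \eta(t) = r(t) + \varepsilon\bigl(\Phi(y(0)) - \Phi(y(t))\bigr). \]

By (18.2.8), (18.2.10) we have that

\[ \eta(t) \to 0 \quad\text{in } L^p\bigl(\Omega, C([0, T], \mathbb{R}^l)\bigr) \]

and hence, by (18.2.11),

\[ (\eta(t), M_2(t)) \Rightarrow (0, \alpha_2 W(t)) \quad\text{in } C([0, T], \mathbb{R}^{2l}). \]

Because U(t) is independent of $(\eta(t), M_2(t))$ we deduce that

\[ (U(t), \eta(t), M_2(t)) \Rightarrow (U(t), 0, \alpha_2 W(t)) \quad\text{in } C([0, T], \mathbb{R}^{3l}). \]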

The mapping $(U, \eta, M_2) \mapsto x$ is continuous from $C([0, T], \mathbb{R}^{3l})$ into $C([0, T], \mathbb{T}^l)$, by Lemma 18.2. Hence, because weak convergence is preserved under continuous mappings, we deduce that x(t) ⇒ X(t) in $C([0, T], \mathbb{T}^l)$, where X solves

\[ X(t) = x(0) + \int_0^t F(X(s))\,ds + \alpha U(t) + \alpha_2 W(t). \]


18.3 Discussion and Bibliography

The proofs presented here have been simplified by the assumption that the limiting process for X has additive noise. Thus we were able to use the Martingale Central Limit Theorem in a very straightforward way. In the general case, where the limiting process has state-dependent noise, proofs of convergence are more complicated. See [136, 17, 41].

18.4 Exercises

1. Consider the coupled pair of scalar SDEs

\[ \frac{dx}{dt} = f(x, y) - \frac{1}{\varepsilon}y + \bigl(y^2 + a(x)\bigr)\frac{dU}{dt}, \tag{18.4.1a} \]

\[ \frac{dy}{dt} = -\frac{1}{\varepsilon^2}y + \sqrt{2}\,\frac{dV}{dt}. \tag{18.4.1b} \]

Write down the homogenized equations.

2. Assume that f, a and all derivatives are bounded. Using the exact solution of the OU process, prove a convergence theorem related to your conjectured homogenized equation in the previous exercise.


Chapter 19

Homogenization for Elliptic PDE: The Convergence Theorem

19.1 The Theorems

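In this chapter we prove the homogenization theorem for second order uniformly elliptic PDE with periodic coefficients and Dirichlet boundary conditions:

Theorem 19.1. Let $u^\varepsilon$ be the solution of

\[ -\nabla\cdot\bigl(A^\varepsilon(x)\nabla u^\varepsilon\bigr) = f, \quad\text{for } x \in \Omega, \tag{19.1.1a} \]

\[ u^\varepsilon(x) = 0, \quad\text{for } x \in \partial\Omega, \tag{19.1.1b} \]

with $f \in H^{-1}(\Omega)$ and $A^\varepsilon(x) = A\bigl(\tfrac{x}{\varepsilon}\bigr)$, $A(y) \in M_{per}(\alpha, \beta, Y)$. Furthermore, let u be the solution of the homogenized problem

\[ -\overline{A} : \nabla\nabla u = f, \quad\text{for } x \in \Omega, \tag{19.1.2a} \]

\[ u(x) = 0, \quad\text{for } x \in \partial\Omega, \tag{19.1.2b} \]

with $\overline{A}$ given by

\[ \overline{A} = \int_Y \bigl(A(y) + A(y)\nabla_y\chi(y)^T\bigr)\,dy, \tag{19.1.3} \]

and where the vector field χ(y) satisfies

\[ -\nabla_y\cdot\bigl(A(y)\nabla_y\chi(y)\bigr) = \nabla_y\cdot A(y)^T, \qquad \chi(y) \text{ is 1–periodic}. \tag{19.1.4} \]

Then

\[ u^\varepsilon \rightharpoonup u \quad\text{weakly in } H_0^1(\Omega). \]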


In addition to the basic homogenization theorem, we will also prove a result which shows that retaining extra terms in the multiscale expansion does indeed give improved approximations. The following result says that we can get strong convergence in $H^1(\Omega)$ provided that we take the first order corrector field into account.

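Theorem 19.2. Consider $u^\varepsilon(x)$ and u(x) as in Theorem 19.1. Then

\[ \lim_{\varepsilon\to 0}\Bigl\| u^\varepsilon(x) - \Bigl(u(x) + \varepsilon\chi\bigl(\tfrac{x}{\varepsilon}\bigr)\cdot\nabla u(x)\Bigr)\Bigr\|_{H^1(\Omega)} = 0. \tag{19.1.5} \]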

19.2 Proof of the Homogenization Theorem

In this section we prove the homogenization theorem, Theorem 19.1, using the method of two–scale convergence. Before starting with the proof let us make some remarks on our approach. The first step in our analysis is to use the energy estimates from Chapter 7 to deduce that $u^\varepsilon$ as well as $\nabla u^\varepsilon$ have two–scale convergent subsequences. The second step is to use a test function of the form

\[ \phi^\varepsilon(x) = \phi_0(x) + \varepsilon\phi_1\bigl(x, \tfrac{x}{\varepsilon}\bigr), \tag{19.2.1} \]

in order to pass to the two–scale limit. In this way we obtain a coupled system of equations for the first two terms in the expansion, u and $u_1$: the two–scale system, see equation (19.2.2) below. Well–posedness of this system is proved using the Lax–Milgram lemma, Theorem 2.41. The final step is to decouple this system of equations using separation of variables, thereby obtaining the homogenized equation for u.

Let us remark that the basic compactness theorems of two–scale convergence, Theorems 2.33 and 2.36, enable us to extract a two–scale convergent subsequence but they do not allow us to conclude that the whole sequence converges to a limit. In the case of Theorem 19.1 the limit u is unique, since the homogenized equation (19.1.2) has a unique solution. This implies that the whole sequence converges, not just a subsequence. In the calculations that follow we will use this result without any further reference.

We will break the proof of Theorem 19.1 into three parts, as described above. The following lemma provides us with the first two terms of the two–scale expansion, together with the coupled system of equations that they satisfy.


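Lemma 19.3. Let $u^\varepsilon(x)$ be the solution of (19.1.1) with the assumptions of Theorem 19.1. Then there exist functions $(u(x), u_1(x, y)) \in H_0^1(\Omega)\times L^2(\Omega; H^1_{per}(Y)/\mathbb{R})$ such that $u^\varepsilon$ and $\nabla u^\varepsilon$ two–scale converge to u(x) and $\nabla_x u + \nabla_y u_1$, respectively. Furthermore, u, $u_1$ satisfy the two–scale system

\[ -\nabla_y\cdot\bigl(A(y)(\nabla_x u + \nabla_y u_1)\bigr) = 0 \quad\text{in } \Omega\times Y, \tag{19.2.2a} \]

\[ -\nabla_x\cdot\Bigl[\int_Y A(y)\bigl(\nabla_x u + \nabla_y u_1\bigr)\,dy\Bigr] = f \quad\text{in } \Omega, \tag{19.2.2b} \]

\[ u(x) = 0 \ \text{on } \partial\Omega, \qquad u_1(x, y) \text{ is periodic in } y. \tag{19.2.2c} \]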

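Proof. 1. We have that $\|u^\varepsilon\|_{H_0^1(\Omega)} \le C$, which implies the first part of the lemma: there exist functions $u(x) \in H_0^1(\Omega)$, $u_1(x, y) \in L^2(\Omega; H^1_{per}(Y)/\mathbb{R})$ such that

\[ u^\varepsilon(x) \overset{2}{\rightharpoonup} u(x), \tag{19.2.3a} \]

\[ \nabla u^\varepsilon(x) \overset{2}{\rightharpoonup} \nabla_x u(x) + \nabla_y u_1(x, y). \tag{19.2.3b} \]

Furthermore, the two–scale limit of $u^\varepsilon(x)$ is also the weak $H_0^1(\Omega)$–limit of this sequence.

2. The weak formulation of (19.1.1) is

\[ \int_\Omega \langle A^\varepsilon(x)\nabla u^\varepsilon, \nabla\phi^\varepsilon\rangle\,dx = \langle f, \phi^\varepsilon\rangle \quad \forall\,\phi^\varepsilon \in H_0^1(\Omega). \tag{19.2.4} \]

We use a test function of the form (19.2.1),

\[ \phi^\varepsilon(x) = \phi_0(x) + \varepsilon\phi_1\bigl(x, \tfrac{x}{\varepsilon}\bigr), \qquad \phi_0 \in C_0^\infty(\Omega), \quad \phi_1 \in C_0^\infty(\Omega; C^\infty_{per}(Y)). \]

We clearly have that $\phi^\varepsilon \in H_0^1(\Omega)$. Upon using this test function in (19.2.4) and rearranging terms we obtain:

\[ \int_\Omega \langle \nabla u^\varepsilon, (A^\varepsilon(x))^T\nabla\phi^\varepsilon\rangle\,dx = \int_\Omega \Bigl\langle \nabla u^\varepsilon, (A^\varepsilon(x))^T\Bigl(\nabla_x\phi_0(x) + \nabla_y\phi_1\bigl(x, \tfrac{x}{\varepsilon}\bigr)\Bigr)\Bigr\rangle\,dx + \varepsilon\int_\Omega \Bigl\langle \nabla u^\varepsilon, (A^\varepsilon(x))^T\nabla_x\phi_1\bigl(x, \tfrac{x}{\varepsilon}\bigr)\Bigr\rangle\,dx \]

\[ =: I_1 + \varepsilon I_2 = \langle f, \phi_0 + \varepsilon\phi_1\rangle. \]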


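Now, the function $(A^\varepsilon(x))^T\bigl(\nabla_x\phi_0(x) + \nabla_y\phi_1(x, \tfrac{x}{\varepsilon})\bigr)$ is of the form $\psi_1(y)\psi_2(x, y)$, evaluated at $y = x/\varepsilon$, with $\psi_1(y) \in L^\infty(Y)$ and $\psi_2(x, y) \in L^2(\Omega; C_{per}(Y))$. Hence, it is an admissible test function. We can thus pass to the two–scale limit to obtain:

\[ I_1 \to \int_\Omega\int_Y \langle A(y)(\nabla_x u + \nabla_y u_1), \nabla_x\phi_0 + \nabla_y\phi_1\rangle\,dy\,dx. \]

The function $(A^\varepsilon(x))^T\nabla_x\phi_1(x, \tfrac{x}{\varepsilon})$ is also an admissible test function, so $I_2$ remains bounded and the term $\varepsilon I_2$ vanishes in the limit:

\[ \varepsilon I_2 \to 0. \]

Moreover, we have that $\phi_0 + \varepsilon\phi_1 \rightharpoonup \phi_0$ weakly in $H_0^1(\Omega)$, which implies that

\[ \langle f, \phi_0 + \varepsilon\phi_1\rangle \to \langle f, \phi_0\rangle. \]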

Putting the above considerations together we obtain the limiting equation

\[ \int_\Omega\int_Y \langle A(y)(\nabla_x u + \nabla_y u_1), \nabla_x\phi_0 + \nabla_y\phi_1\rangle\,dy\,dx = \langle f, \phi_0\rangle. \tag{19.2.5} \]

In deriving (19.2.5) we assumed that the test functions $\phi_0$, $\phi_1$ are smooth; a density argument, however, enables us to conclude that (19.2.5) holds for every $\phi_0 \in H_0^1(\Omega)$, $\phi_1 \in L^2(\Omega; H^1_{per}(Y)/\mathbb{R})$.

3. Now, (19.2.5) is the weak formulation of the two–scale system (19.2.2). To see this, we set $\phi_0 = 0$ to obtain:

\[ \int_\Omega\int_Y \langle A(y)(\nabla_x u + \nabla_y u_1), \nabla_y\phi_1\rangle\,dy\,dx = 0, \]

which is precisely the weak formulation of (19.2.2a). Setting now $\phi_1 = 0$ in (19.2.5) we get

\[ \int_\Omega\int_Y \langle A(y)(\nabla_x u + \nabla_y u_1), \nabla_x\phi_0\rangle\,dy\,dx = \langle f, \phi_0\rangle, \]

which is the weak formulation of (19.2.2b). The boundary conditions (19.2.2c) follow from the fact that $u(x) \in H_0^1(\Omega)$ and $u_1(x, y) \in L^2(\Omega; H^1_{per}(Y)/\mathbb{R})$.

Now we need to prove that the two–scale system is well posed. This is the content of the next lemma.

Lemma 19.4. Under the assumptions of Theorem 19.1 the two–scale system (19.2.2) has a unique solution $(u(x), u_1(x, y)) \in H_0^1(\Omega)\times L^2(\Omega; H^1_{per}(Y)/\mathbb{R})$.


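Proof. We will use the Lax–Milgram Lemma. The weak formulation of the two–scale system is given by equation (19.2.5) with $(\phi_0, \phi_1) \in H_0^1(\Omega)\times L^2(\Omega; H^1_{per}(Y)/\mathbb{R})$. We will denote this product space by X. This is a Hilbert space with inner product

\[ (U, V)_X = (\nabla u, \nabla v)_{L^2(\Omega)} + (\nabla_y u_1, \nabla_y v_1)_{L^2(\Omega\times Y)} \]

for all $U = (u, u_1)$, $V = (v, v_1)$, and induced norm

\[ \|U\|_X^2 = \|\nabla u\|_{L^2(\Omega)}^2 + \|\nabla_y u_1\|_{L^2(\Omega\times Y)}^2. \]

Let us define now the bilinear form

\[ a[U, \Phi] = \int_\Omega\int_Y \langle A(y)(\nabla_x u + \nabla_y u_1), \nabla_x\phi_0 + \nabla_y\phi_1\rangle\,dy\,dx, \]

with $\Phi := (\phi_0, \phi_1)$. We have to check that this bilinear form is continuous and coercive. Let us start with continuity. We use the $L^\infty$ bound on A(y), together with the Cauchy–Schwarz inequality, to obtain:

\[ a[U, \Phi] = \int_\Omega\int_Y \langle A(y)(\nabla_x u + \nabla_y u_1), \nabla_x\phi_0 + \nabla_y\phi_1\rangle\,dy\,dx \]

\[ \le \beta\int_\Omega\int_Y |\nabla_x u + \nabla_y u_1|\,|\nabla_x\phi_0 + \nabla_y\phi_1|\,dy\,dx \]

\[ \le C\|U\|_X\|\Phi\|_X. \]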

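We proceed with coercivity. We use the fact that the integral of the derivative of a periodic function over the unit cell vanishes, so that the cross term below is zero, to obtain:

\[ a[U, U] = \int_\Omega\int_Y \langle A(y)(\nabla_x u + \nabla_y u_1), \nabla_x u + \nabla_y u_1\rangle\,dy\,dx \]

\[ \ge \alpha\int_\Omega\int_Y |\nabla_x u + \nabla_y u_1|^2\,dy\,dx \]

\[ = \alpha\Bigl(\int_\Omega\int_Y |\nabla_x u|^2\,dy\,dx + 2\int_\Omega\int_Y \nabla_x u\cdot\nabla_y u_1\,dy\,dx + \int_\Omega\int_Y |\nabla_y u_1|^2\,dy\,dx\Bigr) \]

\[ = \alpha\bigl(\|\nabla u\|_{L^2(\Omega)}^2 + \|\nabla_y u_1\|_{L^2(\Omega\times Y)}^2\bigr) = \alpha\|U\|_X^2, \]

and consequently

\[ a[U, U] \ge \alpha\|U\|_X^2. \]

Hence, the bilinear form $a[U, \Phi]$ is continuous and coercive and the Lax–Milgram Lemma applies. This proves existence and uniqueness of solutions of the two–scale system in X.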

Now we can conclude the proof of Theorem 19.1.


Lemma 19.5. Consider the unique solution $(u, u_1) \in H_0^1(\Omega)\times L^2(\Omega; H^1_{per}(Y)/\mathbb{R})$ of the two–scale system (19.2.2). Then u is the unique solution of the homogenized equation (19.1.2) and $u_1(x, y)$ is of the form

\[ u_1(x, y) = \chi(y)\cdot\nabla u(x), \tag{19.2.6} \]

where χ(y) is the solution of the cell problem (19.1.4).

Proof. We substitute (19.2.6) into (19.2.2a) to obtain

\[ -\nabla_y\cdot\bigl(A(y)\nabla_y\chi(y)\bigr)\cdot\nabla_x u = \bigl(\nabla_y\cdot A(y)^T\bigr)\cdot\nabla_x u. \]

This equation is satisfied provided that $\chi(y) \in (H^1_{per}(Y)/\mathbb{R})^d$ is the unique solution of the cell problem. Now equation (19.2.2b) becomes

\[ -\nabla_x\cdot\Bigl(\int_Y A(y)\bigl(\nabla_x u + \nabla_y\chi(y)^T\nabla_x u\bigr)\,dy\Bigr) = f. \]

This is precisely the homogenized equation with the homogenized coefficients given by (19.1.3).

The above three lemmas provide us with the proof of Theorem 19.1.
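In one space dimension the cell problem (19.1.4) and the formula (19.1.3) can be evaluated almost by hand, which gives a quick sanity check. The sketch below is an illustration under assumptions not made in the text: a scalar coefficient a(y), for which the cell problem integrates once to a(y)(1 + χ'(y)) = constant, and the hypothetical choice a(y) = 1/(2 + sin(2πy)). The function name `homogenized_coefficient_1d` is invented for this example, and integrals over the unit cell are approximated by the midpoint rule.

```python
import math

def homogenized_coefficient_1d(a, n=2000):
    """Midpoint-rule evaluation of (19.1.3) for a scalar, 1-periodic coefficient a(y)."""
    ys = [(j + 0.5) / n for j in range(n)]
    inv_mean = sum(1.0 / a(y) for y in ys) / n
    flux = 1.0 / inv_mean  # constant value of a(y) * (1 + chi'(y)); fixed by periodicity of chi
    dchi = [flux / a(y) - 1.0 for y in ys]  # derivative of the corrector chi
    a_bar = sum(a(y) * (1.0 + d) for y, d in zip(ys, dchi)) / n  # integral of a + a * chi'
    a_mean = sum(a(y) for y in ys) / n  # plain arithmetic average, for comparison
    dchi_mean = sum(dchi) / n  # should vanish, since chi is periodic
    return a_bar, a_mean, dchi_mean

a = lambda y: 1.0 / (2.0 + math.sin(2.0 * math.pi * y))
a_bar, a_mean, dchi_mean = homogenized_coefficient_1d(a)
```

For this coefficient the homogenized value is the harmonic mean, 1/2, whereas the arithmetic mean is 1/√3; the gap between the two shows that the corrector term in (19.1.3) cannot be dropped.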

Remark 19.6. The fact that the choice (19.2.6) for $u_1$ enables us to solve the two–scale system, provided that u satisfies the homogenized equation, implies that this is the only possible pair of functions $(u, u_1)$ which solves the two–scale system, since we have already proved uniqueness of solutions.

19.3 Proof of the Corrector Result

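Theorem 19.1 implies that $u^\varepsilon$ converges to u(x) strongly in $L^2(\Omega)$. Thus, in order to prove Theorem 19.2, it is enough to prove that

\[ \lim_{\varepsilon\to 0}\Bigl\|\nabla u^\varepsilon(x) - \nabla\Bigl(u(x) + \varepsilon u_1\bigl(x, \tfrac{x}{\varepsilon}\bigr)\Bigr)\Bigr\|_{L^2(\Omega;\mathbb{R}^d)} = 0, \]

or, equivalently,

\[ \lim_{\varepsilon\to 0}\Bigl\|\nabla u^\varepsilon(x) - \Bigl(\nabla u(x) + \varepsilon\nabla_x u_1\bigl(x, \tfrac{x}{\varepsilon}\bigr) + \nabla_y u_1\bigl(x, \tfrac{x}{\varepsilon}\bigr)\Bigr)\Bigr\|_{L^2(\Omega;\mathbb{R}^d)} = 0. \]

Assuming now enough regularity on $u_1$ so that it can be considered to be an admissible test function, we have that $\bigl\|\varepsilon\nabla_x u_1(x, \tfrac{x}{\varepsilon})\bigr\|_{L^2(\Omega;\mathbb{R}^d)} \to 0$. Hence, it is enough to prove that

\[ \lim_{\varepsilon\to 0}\Bigl\|\nabla u^\varepsilon(x) - \Bigl(\nabla u(x) + \nabla_y u_1\bigl(x, \tfrac{x}{\varepsilon}\bigr)\Bigr)\Bigr\|_{L^2(\Omega;\mathbb{R}^d)} = 0. \]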


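The uniform ellipticity of A now implies (with $u_1$ evaluated at $(x, x/\varepsilon)$ in all integrals over Ω):

\[ \alpha\Bigl\|\nabla u^\varepsilon(x) - \Bigl(\nabla u(x) + \nabla_y u_1\bigl(x, \tfrac{x}{\varepsilon}\bigr)\Bigr)\Bigr\|_{L^2(\Omega;\mathbb{R}^d)}^2 = \alpha\int_\Omega \Bigl|\nabla u^\varepsilon(x) - \Bigl(\nabla u(x) + \nabla_y u_1\bigl(x, \tfrac{x}{\varepsilon}\bigr)\Bigr)\Bigr|^2\,dx \]

\[ \le \int_\Omega \langle A^\varepsilon(x)(\nabla_x u^\varepsilon - \nabla_x u - \nabla_y u_1), \nabla_x u^\varepsilon - \nabla_x u - \nabla_y u_1\rangle\,dx \]

\[ = \int_\Omega \langle A^\varepsilon(x)\nabla_x u^\varepsilon(x), \nabla_x u^\varepsilon(x)\rangle\,dx + \int_\Omega \langle A^\varepsilon(x)(\nabla_x u + \nabla_y u_1), \nabla_x u + \nabla_y u_1\rangle\,dx - \int_\Omega \bigl\langle\bigl(A^\varepsilon(x) + (A^\varepsilon(x))^T\bigr)\nabla_x u^\varepsilon, \nabla_x u + \nabla_y u_1\bigr\rangle\,dx \]

\[ \to \langle f, u\rangle + \int_\Omega\int_Y \langle A(y)(\nabla_x u(x) + \nabla_y u_1(x, y)), \nabla_x u(x) + \nabla_y u_1(x, y)\rangle\,dy\,dx - \int_\Omega\int_Y \bigl\langle\bigl(A(y) + (A(y))^T\bigr)(\nabla_x u(x) + \nabla_y u_1(x, y)), \nabla_x u(x) + \nabla_y u_1(x, y)\bigr\rangle\,dy\,dx \]

\[ = a[U, U] + a[U, U] - 2a[U, U] = 0. \]

Here the first term equals $\langle f, u^\varepsilon\rangle$ by the weak formulation (19.2.4), and $\langle f, u\rangle = a[U, U]$ by (19.2.5) with Φ = U. Consequently

\[ \lim_{\varepsilon\to 0}\Bigl\|\nabla u^\varepsilon(x) - \Bigl(\nabla u(x) + \nabla_y u_1\bigl(x, \tfrac{x}{\varepsilon}\bigr)\Bigr)\Bigr\|_{L^2(\Omega;\mathbb{R}^d)} = 0 \]

and the theorem is proved.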

19.4 Discussion and Bibliography

The proofs of Theorems 19.1 and 19.2 are taken from [4]. The idea of using appropriate test functions for characterizing the limit of a sequence of functions is very common in the theory of PDEs. See [42] and the references therein. In the context of homogenization, test functions of the form (19.2.1) have been used by Kurtz in [85]. See also the perturbed test function approach of Evans, [44, 43].

Tartar's method of oscillating test functions is based on constructing appropriate test functions using the cell problem. See [143, 144], [30, Ch. 8] and the references therein.

It is not always possible to decouple the two–scale system and to obtain a closed equation for the first term in the two–scale expansion, the homogenized equation. In order to do this we need an $H^1$–estimate on the solution $u^\varepsilon(x)$ of the unhomogenized PDE. This enables us to conclude that the two–scale limit is independent of the microscale and consequently that a homogenized equation actually exists. This is not always the case. See for example the case of linear transport PDE studied in Chapter 21. Other examples can be found in [4, 3].

19.5 Exercises

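1. Let $a(y) : \mathbb{T}^d \to \mathbb{R}^d$ be a smooth, mean zero, 1–periodic, divergence–free field and consider the problem of homogenization for the steady–state advection–diffusion equation

\[ -\Delta u^\varepsilon + \frac{1}{\varepsilon}a\bigl(\tfrac{x}{\varepsilon}\bigr)\cdot\nabla u^\varepsilon = f, \quad\text{for } x \in \Omega, \tag{19.5.1a} \]

\[ u^\varepsilon(x) = 0, \quad\text{for } x \in \partial\Omega. \tag{19.5.1b} \]

Use the method of two–scale convergence to prove the homogenization theorem for this PDE.

2. Same as in the previous exercise for the PDE

\[ -\Delta u^\varepsilon + \frac{1}{\varepsilon}a\bigl(\tfrac{x}{\varepsilon}\bigr)\cdot\nabla u^\varepsilon + \frac{1}{\varepsilon}V\bigl(\tfrac{x}{\varepsilon}\bigr)u^\varepsilon = f, \quad\text{for } x \in \Omega, \]

\[ u^\varepsilon(x) = 0, \quad\text{for } x \in \partial\Omega, \]

where V(y) is smooth, 1–periodic, mean zero with $\inf_{y\in\mathbb{T}^d} V(y) > 0$.

3. Same as in the previous exercise for the PDE (19.5.1) with

\[ a(y) = -\nabla_y V(y), \]

where V(y) is smooth and 1–periodic.

4. Use two–scale convergence to prove the homogenization theorem for the Neumann problem.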


Chapter 20

Homogenization for Parabolic PDE: The Convergence Theorem

20.1 Introduction

In this chapter we prove the homogenization theorem for second order parabolic PDE of the form studied in Chapter 13. The method of proof is structurally very similar to that used in Chapter 16, where we prove an averaging theorem for Markov chains. To be precise, we obtain an equation for the error in a multiscale expansion, and directly estimate the error from this equation. The crucial estimate used in Chapter 16 follows from the fact that Q in that chapter is a stochastic matrix; the analogous estimate in this chapter follows from the maximum principle for parabolic PDE.

20.2 The Theorem

Theorem 20.1. Let $u^\varepsilon(x, t)$ be the solution of

\[ \frac{\partial u^\varepsilon}{\partial t} = \frac{1}{\varepsilon}b\bigl(\tfrac{x}{\varepsilon}\bigr)\cdot\nabla u^\varepsilon + D\Delta u^\varepsilon, \quad (x, t) \in \mathbb{R}^d\times(0, T), \tag{20.2.1a} \]

\[ u^\varepsilon = f(x), \quad (x, t) \in \mathbb{R}^d\times\{0\}, \tag{20.2.1b} \]

with $b(y) \in C^\infty_{per}(\mathbb{T}^d)$ and $f(x) \in C_b^\infty(\mathbb{R}^d)$. Let u(x, t) be the solution of the homogenized equation

\[ \frac{\partial u}{\partial t} = K : \nabla\nabla u, \quad (x, t) \in \mathbb{R}^d\times(0, T), \tag{20.2.2a} \]

\[ u = f(x), \quad (x, t) \in \mathbb{R}^d\times\{0\}, \tag{20.2.2b} \]

where K is given by (13.3.1). Then

\[ \|u^\varepsilon(x, t) - u(x, t)\|_{L^\infty(\mathbb{R}^d\times(0, T))} \le C\varepsilon. \tag{20.2.3} \]

Thus $u^\varepsilon \to u$ in $L^\infty(\mathbb{R}^d\times(0, T))$.

20.3 The Proof

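In Chapter 13 we derived the two–scale expansion

\[ u^\varepsilon(x, t) \approx u(x, t) + \varepsilon u_1\bigl(x, \tfrac{x}{\varepsilon}, t\bigr) + \varepsilon^2 u_2\bigl(x, \tfrac{x}{\varepsilon}, t\bigr), \]

where

\[ u_1\bigl(x, \tfrac{x}{\varepsilon}, t\bigr) = \chi\bigl(\tfrac{x}{\varepsilon}\bigr)\cdot\nabla_x u(x, t). \tag{20.3.1} \]

An analysis similar to the one presented in Chapter 12 enables us to obtain the expression for $u_2$ (see Exercise 1, Chapter 13):

\[ u_2\bigl(x, \tfrac{x}{\varepsilon}, t\bigr) = \Theta\bigl(\tfrac{x}{\varepsilon}\bigr) : \nabla_x\nabla_x u(x, t). \tag{20.3.2} \]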

The vector field χ(y) and the matrix Θ(y) solve the Poisson equations

\[ -\mathcal{L}_0\chi(y) = b(y) \]

and

\[ -\mathcal{L}_0\Theta = b(y)\otimes\chi(y) + 2D\nabla_y\chi(y) - \int_Y \bigl(b(y)\otimes\chi(y) + 2D\nabla_y\chi(y)\bigr)\rho(y)\,dy, \tag{20.3.3} \]

with periodic boundary conditions. Here ρ(y) is the invariant distribution given by (13.2.5). The corrector fields χ(y), Θ(y) solve uniformly elliptic PDE with smooth coefficients and periodic boundary conditions. Consequently, both χ(y) and Θ(y) together with all their derivatives are bounded:

\[ \|\chi(y)\|_{C^k(\mathbb{R}^d;\mathbb{R}^d)} \le C, \qquad \|\Theta(y)\|_{C^k(\mathbb{R}^d;\mathbb{R}^{d\times d})} \le C, \tag{20.3.4} \]

for every integer k ≥ 0. Furthermore, our assumptions on the initial conditions f(x) imply that u(x, t), which solves a PDE with constant coefficients, is bounded in $L^\infty((0, T)\times\mathbb{R}^d)$, together with all of its derivatives with respect to space and time. This, together with estimates (20.3.4), provides us with the bounds

\[ \|u_1^\varepsilon(x, t)\|_{L^\infty((0, T)\times\mathbb{R}^d)} \le C, \qquad \|u_2^\varepsilon(x, t)\|_{L^\infty((0, T)\times\mathbb{R}^d)} \le C, \tag{20.3.5} \]

with the constant C being independent of ε. In writing the above we use the notation $u_i^\varepsilon(x, t) := u_i(x, x/\varepsilon, t)$, i = 1, 2.

Let $R^\varepsilon(x, t)$ denote the remainder defined through the equation

\[ u^\varepsilon(x, t) = u(x, t) + \varepsilon u_1\bigl(x, \tfrac{x}{\varepsilon}, t\bigr) + \varepsilon^2 u_2\bigl(x, \tfrac{x}{\varepsilon}, t\bigr) + R^\varepsilon(x, t). \tag{20.3.6} \]

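Let

\[ \mathcal{L}^\varepsilon = \frac{1}{\varepsilon}b\bigl(\tfrac{x}{\varepsilon}\bigr)\cdot\nabla + D\Delta. \]

We want to apply $\mathcal{L}^\varepsilon$ to functions of the form f(x, x/ε, t). We have

\[ \mathcal{L}^\varepsilon = \frac{1}{\varepsilon^2}\bigl(b(y)\cdot\nabla_y + D\Delta_y\bigr) + \frac{1}{\varepsilon}\bigl(b(y)\cdot\nabla_x + 2D\nabla_x\cdot\nabla_y\bigr) + D\Delta_x =: \frac{1}{\varepsilon^2}\mathcal{L}_0 + \frac{1}{\varepsilon}\mathcal{L}_1 + \mathcal{L}_2, \]

with y = x/ε and where

\[ \mathcal{L}_0 = b(y)\cdot\nabla_y + D\Delta_y, \qquad \mathcal{L}_1 = b(y)\cdot\nabla_x + 2D\nabla_x\cdot\nabla_y, \qquad \mathcal{L}_2 = D\Delta_x. \]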
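The splitting of the operator is a direct consequence of the chain rule. As a short worked step (the symbol $\phi^\varepsilon$ is introduced only for this illustration, and the time variable is suppressed), for a smooth function f(x, y) evaluated at y = x/ε:

```latex
\[
\phi^\varepsilon(x) := f\bigl(x, \tfrac{x}{\varepsilon}\bigr), \qquad
\nabla\phi^\varepsilon = \nabla_x f + \frac{1}{\varepsilon}\nabla_y f, \qquad
\Delta\phi^\varepsilon = \Delta_x f + \frac{2}{\varepsilon}\nabla_x\cdot\nabla_y f + \frac{1}{\varepsilon^2}\Delta_y f,
\]
\[
\frac{1}{\varepsilon}\,b\cdot\nabla\phi^\varepsilon + D\Delta\phi^\varepsilon
= \frac{1}{\varepsilon^2}\bigl(b\cdot\nabla_y f + D\Delta_y f\bigr)
+ \frac{1}{\varepsilon}\bigl(b\cdot\nabla_x f + 2D\,\nabla_x\cdot\nabla_y f\bigr)
+ D\Delta_x f .
\]
```

All terms on the right-hand sides are evaluated at (x, x/ε). Collecting powers of 1/ε gives the three operators in turn; the mixed derivative in the middle term comes from the cross term of the Laplacian.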

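Recall that $u_0 = u$, $u_1$ and $u_2$ are constructed so that

\[ \mathcal{O}(1/\varepsilon^2): \quad \mathcal{L}_0 u_0 = 0, \]

\[ \mathcal{O}(1/\varepsilon): \quad \mathcal{L}_0 u_1 = -\mathcal{L}_1 u_0, \]

\[ \mathcal{O}(1): \quad \mathcal{L}_0 u_2 = -\mathcal{L}_1 u_1 - \mathcal{L}_2 u_0 + \frac{\partial u_0}{\partial t}. \]

We apply $\mathcal{L}^\varepsilon$ to the expansion (20.3.6) to obtain

\[ \mathcal{L}^\varepsilon u^\varepsilon = \mathcal{L}^\varepsilon\bigl(u + \varepsilon u_1 + \varepsilon^2 u_2 + R^\varepsilon\bigr) \]

\[ = \frac{1}{\varepsilon^2}\mathcal{L}_0 u + \frac{1}{\varepsilon}\bigl(\mathcal{L}_0 u_1 + \mathcal{L}_1 u\bigr) + \bigl(\mathcal{L}_0 u_2 + \mathcal{L}_1 u_1 + \mathcal{L}_2 u\bigr) + \varepsilon\bigl(\mathcal{L}_1 u_2 + \mathcal{L}_2 u_1\bigr) + \varepsilon^2\mathcal{L}_2 u_2 + \mathcal{L}^\varepsilon R^\varepsilon \]

\[ = \frac{\partial u}{\partial t} + \varepsilon\bigl(\mathcal{L}_1 u_2 + \mathcal{L}_2 u_1\bigr) + \varepsilon^2\mathcal{L}_2 u_2 + \mathcal{L}^\varepsilon R^\varepsilon. \]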


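On the other hand

\[ \frac{\partial u^\varepsilon}{\partial t} = \frac{\partial u}{\partial t} + \varepsilon\frac{\partial u_1}{\partial t} + \varepsilon^2\frac{\partial u_2}{\partial t} + \frac{\partial R^\varepsilon}{\partial t}. \]

We combine the above two equations together with the unhomogenized equation to obtain

\[ \frac{\partial R^\varepsilon}{\partial t} = \mathcal{L}^\varepsilon R^\varepsilon + \varepsilon F^\varepsilon(x, t), \]

with

\[ F^\varepsilon(x, t) = \mathcal{L}_1 u_2 + \mathcal{L}_2 u_1 - \frac{\partial u_1}{\partial t} + \varepsilon\Bigl(\mathcal{L}_2 u_2 - \frac{\partial u_2}{\partial t}\Bigr). \tag{20.3.7} \]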

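Furthermore

\[ u^\varepsilon(x, 0) = f(x) = u(x, 0) + \varepsilon u_1\bigl(x, \tfrac{x}{\varepsilon}, 0\bigr) + \varepsilon^2 u_2\bigl(x, \tfrac{x}{\varepsilon}, 0\bigr) + R^\varepsilon(x, 0), \]

and consequently, on account of (20.2.1b), we have

\[ R^\varepsilon(x, 0) = \varepsilon h^\varepsilon(x) \]

with

\[ h^\varepsilon(x) = u_1\bigl(x, \tfrac{x}{\varepsilon}, 0\bigr) + \varepsilon u_2\bigl(x, \tfrac{x}{\varepsilon}, 0\bigr). \tag{20.3.8} \]

Putting the above calculations together we obtain the following Cauchy problem for the remainder $R^\varepsilon(x, t)$:

\[ \frac{\partial R^\varepsilon}{\partial t} = \mathcal{L}^\varepsilon R^\varepsilon + \varepsilon F^\varepsilon(x, t), \quad (x, t) \in \mathbb{R}^d\times(0, T), \tag{20.3.9a} \]

\[ R^\varepsilon = \varepsilon h^\varepsilon(x), \quad (x, t) \in \mathbb{R}^d\times\{0\}, \tag{20.3.9b} \]

with $F^\varepsilon(x, t)$ and $h^\varepsilon(x)$ given by (20.3.7) and (20.3.8), respectively.

In order to prove Theorem 20.1 we will need estimates on $F^\varepsilon$ and $h^\varepsilon$.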

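Lemma 20.2. Under the above assumptions, $F^\varepsilon(x, t)$ and $h^\varepsilon(x)$ satisfy

\[ \|F^\varepsilon(x, t)\|_{L^\infty(\mathbb{R}^d\times(0, T))} \le C \tag{20.3.10} \]

and

\[ \|h^\varepsilon(x)\|_{L^\infty(\mathbb{R}^d)} \le C, \tag{20.3.11} \]

respectively, where the constant C is independent of ε.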


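Proof. We have

\[ F^\varepsilon(x, t) = \mathcal{L}_1\Bigl(\Theta\bigl(\tfrac{x}{\varepsilon}\bigr) : \nabla_x\nabla_x u(x, t)\Bigr) + \mathcal{L}_2\Bigl(\chi\bigl(\tfrac{x}{\varepsilon}\bigr)\cdot\nabla_x u(x, t)\Bigr) - \frac{\partial}{\partial t}\Bigl(\chi\bigl(\tfrac{x}{\varepsilon}\bigr)\cdot\nabla_x u(x, t)\Bigr) \]

\[ \qquad + \varepsilon\Bigl(D\Delta_x\Bigl(\Theta\bigl(\tfrac{x}{\varepsilon}\bigr) : \nabla_x\nabla_x u(x, t)\Bigr) - \frac{\partial}{\partial t}\Bigl(\Theta\bigl(\tfrac{x}{\varepsilon}\bigr) : \nabla_x\nabla_x u(x, t)\Bigr)\Bigr) \]

\[ = C^\varepsilon(x) : D_x^3 u(x, t) + D^\varepsilon(x) : D_x^4 u(x, t) + E^\varepsilon(x) : D_x^2\frac{\partial u}{\partial t}(x, t) + H^\varepsilon(x)\cdot\nabla_x\frac{\partial u}{\partial t}(x, t), \tag{20.3.12} \]

where the functions $C^\varepsilon(x)$, $D^\varepsilon(x)$, $E^\varepsilon(x)$ and $H^\varepsilon(x)$ involve $\chi\bigl(\tfrac{x}{\varepsilon}\bigr)$, $\Theta\bigl(\tfrac{x}{\varepsilon}\bigr)$ and their derivatives. The uniform estimates on χ, Θ and on the derivatives of u, combined with (20.3.12), imply (20.3.10).

Furthermore

\[ h^\varepsilon(x) = \chi\bigl(\tfrac{x}{\varepsilon}\bigr)\cdot\nabla_x u(x, 0) + \varepsilon\,\Theta\bigl(\tfrac{x}{\varepsilon}\bigr) : \nabla_x\nabla_x u(x, 0). \]

The uniform estimates on χ(y) and Θ(y), together with our assumptions on the initial conditions of (20.2.1), lead to estimate (20.3.11).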

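From equation (20.3.9a), the estimate (7.6.4) from Chapter 7 gives

\[ \|R^\varepsilon\|_{L^\infty(\mathbb{R}^d\times(0, T))} \le \varepsilon\|h^\varepsilon\|_{L^\infty(\mathbb{R}^d)} + \varepsilon\int_0^T \|F^\varepsilon(\cdot, s)\|_{L^\infty(\mathbb{R}^d)}\,ds \le \varepsilon C + \varepsilon CT \le C\varepsilon. \tag{20.3.13} \]

We combine (20.3.6) with (20.3.13) and (20.3.5) and use the triangle inequality to obtain

\[ \|u^\varepsilon - u\|_{L^\infty(\mathbb{R}^d\times(0, T))} = \|\varepsilon u_1 + \varepsilon^2 u_2 + R^\varepsilon\|_{L^\infty(\mathbb{R}^d\times(0, T))} \]

\[ \le \varepsilon\|u_1\|_{L^\infty(\mathbb{R}^d\times(0, T))} + \varepsilon^2\|u_2\|_{L^\infty(\mathbb{R}^d\times(0, T))} + \|R^\varepsilon\|_{L^\infty(\mathbb{R}^d\times(0, T))} \le C\varepsilon, \]

from which (20.2.3) follows.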

20.4 Discussion and Bibliography

Our proof of the homogenization theorem has relied on the maximum principle. This is the simplest approach, provided that one is willing to assume sufficient regularity on the coefficients of the PDE, and it is very well suited for PDE in unbounded domains. However, other techniques may be used to study homogenization for parabolic equations. These include probabilistic methods [121], energy methods [17, Ch. 2] and the method of two-scale convergence [3].

The method employed in this chapter, namely the derivation of a PDE for the error term and the use of a priori estimates to control it, can be applied to various problems in the theory of singular perturbations. Some other examples can be found in [115, 120]. The method is usually called bootstrapping.

20.5 Exercises

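1. Consider the case where ∇ · b(x) = 0. Assume that the solution of the Cauchy problem (20.2.1) is smooth, bounded and decays sufficiently fast at infinity.

i. Consider the inhomogeneous Cauchy problem

\[ \frac{\partial R}{\partial t} = \frac{1}{\varepsilon}b\bigl(\tfrac{x}{\varepsilon}\bigr)\cdot\nabla R + D\Delta R + F(x, t), \quad (x, t) \in \mathbb{R}^d\times(0, T), \]

\[ R(x, 0) = f(x), \]

where $f(x) \in L^2(\mathbb{R}^d)$ and $F(x, t) \in L^2((0, T)\times\mathbb{R}^d)$. Prove the estimate

\[ \|R\|_{L^2((0, T)\times\mathbb{R}^d)}^2 + C_1\|\nabla R\|_{L^2((0, T)\times\mathbb{R}^d)}^2 \le C_2\|f\|_{L^2(\mathbb{R}^d)}^2 + C_3\|F\|_{L^2((0, T)\times\mathbb{R}^d)}^2. \]

ii. Use this to prove convergence in $L^2((0, T)\times\mathbb{R}^d)$.

iii. What is the maximum time interval (0, T) over which the homogenization theorem holds?

2. Similarly for the Cauchy problem

\[ \frac{\partial u^\varepsilon}{\partial t} = \nabla\cdot\Bigl(A\bigl(\tfrac{x}{\varepsilon}\bigr)\nabla u^\varepsilon\Bigr), \quad (x, t) \in \mathbb{R}^d\times(0, T), \]

\[ u^\varepsilon(x, 0) = u_{in}(x), \]

where the matrix A(y) satisfies the standard periodicity, smoothness and uniform ellipticity assumptions.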


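3. Consider the initial boundary value problem

\[ \frac{\partial u^\varepsilon}{\partial t} = \nabla\cdot\Bigl(A\bigl(\tfrac{x}{\varepsilon}\bigr)\nabla u^\varepsilon\Bigr), \quad (x, t) \in \Omega\times(0, T), \]

\[ u^\varepsilon(x, 0) = u_{in}(x), \quad x \in \Omega, \]

\[ u^\varepsilon(x, t) = 0, \quad (x, t) \in \partial\Omega\times[0, T], \]

where Ω is a bounded domain in $\mathbb{R}^d$ with smooth boundary and A(y) satisfies the standard assumptions.

i. Prove the estimate

\[ \|u^\varepsilon\|_{L^\infty((0, T)\times\Omega)}^2 + C\|u^\varepsilon\|_{L^2((0, T); H_0^1(\Omega))}^2 \le \|u_{in}(x)\|_{L^2(\Omega)}^2. \]

ii. Use the above estimate to prove the homogenization theorem using the method of two–scale convergence.

iii. Can you apply the bootstrapping method to this problem?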


Chapter 21

Homogenization for Transport PDE: The Convergence Theorem

21.1 Introduction

In this chapter we prove the homogenization theorem for linear transport equations with periodic, divergence–free velocity fields. We show that, when the velocity field does not generate an ergodic flow on the unit torus, the homogenized limit leads to a coupled system of equations, the two–scale system. We use the method of two–scale convergence to prove the main theorem.

21.2 The Theorem

Theorem 21.1. Let $u^\varepsilon(x,t)$ be the weak solution of

$$\frac{\partial u^\varepsilon(x,t)}{\partial t} + a\!\left(\frac{x}{\varepsilon}\right) \cdot \nabla u^\varepsilon(x,t) = 0 \quad \text{for } (x,t) \in \mathbb{R}^d \times \mathbb{R}^+, \tag{21.2.1a}$$

$$u^\varepsilon(x,0) = u_{in}(x) \quad \text{for } x \in \mathbb{R}^d, \tag{21.2.1b}$$

and assume that $u_{in}(x) \in L^2(\mathbb{R}^d)$ and that $a(y)$ is smooth, divergence–free and 1–periodic. Then $u^\varepsilon(x,t)$ two–scale converges to $u_0(x,y,t) \in L^2(\mathbb{R}^+ \times \mathbb{R}^d; L^2_{per}(Y))$ which satisfies

$$\int_{\mathbb{R}^+} \int_{\mathbb{R}^d} \int_Y a(y) \cdot \nabla_y \phi(x,y,t)\, u_0(x,y,t)\, dy\, dx\, dt = 0, \tag{21.2.2a}$$

$$a(y) \cdot \nabla_y \phi(x,y,t) = 0, \quad \phi(x,y,t) \in C_0^\infty(\mathbb{R}^+ \times \mathbb{R}^d; C^\infty_{per}(Y)), \tag{21.2.2b}$$

$$\int_{\mathbb{R}^+} \int_{\mathbb{R}^d} \int_Y \left( \frac{\partial \phi(x,y,t)}{\partial t} + a(y) \cdot \nabla_x \phi(x,y,t) \right) u_0(x,y,t)\, dy\, dx\, dt + \int_{\mathbb{R}^d} u_{in}(x) \left( \int_Y \phi(x,y,0)\, dy \right) dx = 0. \tag{21.2.2c}$$

We will refer to equations (21.2.2) as the two–scale system. Notice that the two–scale system involves an auxiliary function $\phi(x,y,t) \in C_0^\infty(\mathbb{R}^+ \times \mathbb{R}^d; C^\infty_{per}(Y))$.

21.3 The Proof

Proof. We will need appropriate energy estimates on the solution of (21.2.1). Assuming that the solution is smooth and compactly supported, we multiply equation (21.2.1a) by $u^\varepsilon$, integrate over $\mathbb{R}^d$ and use the incompressibility of $a(y)$ to obtain:

$$\frac{d}{dt} \int_{\mathbb{R}^d} |u^\varepsilon(x,t)|^2\, dx = 0.$$
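The step from (21.2.1a) to this identity can be written out as follows. This is a sketch of the intermediate manipulations, filled in here under the same smoothness and compact support assumptions rather than taken from the text; it uses only that $\nabla_x \cdot [a(x/\varepsilon)] = \varepsilon^{-1} (\nabla_y \cdot a)(x/\varepsilon) = 0$.

```latex
\begin{align*}
0 &= \int_{\mathbb{R}^d} u^\varepsilon \left( \frac{\partial u^\varepsilon}{\partial t}
     + a\!\left(\frac{x}{\varepsilon}\right) \cdot \nabla u^\varepsilon \right) dx \\
  &= \frac{1}{2} \frac{d}{dt} \int_{\mathbb{R}^d} |u^\varepsilon|^2 \, dx
     + \frac{1}{2} \int_{\mathbb{R}^d} a\!\left(\frac{x}{\varepsilon}\right) \cdot \nabla |u^\varepsilon|^2 \, dx \\
  &= \frac{1}{2} \frac{d}{dt} \int_{\mathbb{R}^d} |u^\varepsilon|^2 \, dx
     - \frac{1}{2} \int_{\mathbb{R}^d} \nabla \cdot \left[ a\!\left(\frac{x}{\varepsilon}\right) \right] |u^\varepsilon|^2 \, dx
     && \text{(integration by parts, compact support)} \\
  &= \frac{1}{2} \frac{d}{dt} \int_{\mathbb{R}^d} |u^\varepsilon|^2 \, dx
     && \text{(since } \nabla \cdot a = 0 \text{)}.
\end{align*}
```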

We now integrate with respect to time over the interval $[0,t]$ and then take the supremum over $t$ to obtain:

$$\|u^\varepsilon(x,t)\|^2_{L^\infty(\mathbb{R}^+; L^2(\mathbb{R}^d))} \le \|u_{in}(x)\|^2_{L^2(\mathbb{R}^d)} \le C. \tag{21.3.1}$$

Estimate (21.3.1) implies, by Theorem 2.39, that there exists a subsequence, still denoted by $u^\varepsilon$, which two–scale converges to a function $u_0(x,y,t) \in L^2(\mathbb{R}^+ \times \mathbb{R}^d \times Y)$.
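As a numerical illustration of what two–scale convergence asserts, the following sketch (not from the text) takes a one–dimensional sequence of the hypothetical form $u^\varepsilon(x) = u(x, x/\varepsilon)$ and checks that $\int u^\varepsilon(x)\, \phi(x, x/\varepsilon)\, dx$ is close to $\int\!\int_Y u(x,y)\, \phi(x,y)\, dy\, dx$ for small $\varepsilon$. The profile `u`, the test function `phi`, the truncation of the real line to $[-6, 6]$ and all function names are assumptions chosen for the illustration.

```python
import numpy as np

def u(x, y):
    """Hypothetical two-scale profile, 1-periodic in y."""
    return np.exp(-x**2) * (1.0 + np.sin(2.0 * np.pi * y))

def phi(x, y):
    """Hypothetical smooth test function, 1-periodic in y."""
    return np.exp(-x**2) * np.sin(2.0 * np.pi * y)

def oscillating_integral(eps, half_width=6.0, n=400_001):
    """Riemann sum for the integral of u(x, x/eps) * phi(x, x/eps) over x."""
    x, dx = np.linspace(-half_width, half_width, n, retstep=True)
    return float(np.sum(u(x, x / eps) * phi(x, x / eps)) * dx)

def two_scale_integral(half_width=6.0, nx=4001, ny=400):
    """Riemann sum for the double integral of u(x, y) * phi(x, y) over x and the unit cell."""
    x, dx = np.linspace(-half_width, half_width, nx, retstep=True)
    y = (np.arange(ny) + 0.5) / ny          # midpoints of the unit cell
    X, Y = np.meshgrid(x, y, indexing="ij")
    return float(np.sum(u(X, Y) * phi(X, Y)) * dx / ny)

if __name__ == "__main__":
    limit = two_scale_integral()
    for eps in (0.1, 0.01):
        print(eps, oscillating_integral(eps), limit)
```

For this choice the $y$–average of `phi` vanishes, so pairing the $y$–averages of `u` and `phi` would give zero, whereas the two–scale pairing does not. This is the sense in which the two–scale limit keeps track of the oscillations.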

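The weak formulation of (21.2.1) is

$$\int_{\mathbb{R}^+} \int_{\mathbb{R}^d} \left( \frac{\partial \phi^\varepsilon(x,t)}{\partial t} + a\!\left(\frac{x}{\varepsilon}\right) \cdot \nabla_x \phi^\varepsilon(x,t) \right) u^\varepsilon(x,t)\, dx\, dt + \int_{\mathbb{R}^d} u_{in}(x)\, \phi^\varepsilon(x,0)\, dx = 0, \tag{21.3.2}$$

for every $\phi^\varepsilon(x,t) \in C_0^\infty(\mathbb{R}^+ \times \mathbb{R}^d)$. We choose an admissible test function of the form

$$\phi^\varepsilon = \varepsilon\, \phi_1\!\left(x, \frac{x}{\varepsilon}, t\right), \quad \phi_1(x,y,t) \in C_b^\infty(\mathbb{R}^+ \times \mathbb{R}^d; C^\infty_{per}(Y)).$$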

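Inserting $\phi^\varepsilon$ in (21.3.2), using the chain rule ($\phi^\varepsilon$ is a function of both $x$ and $x/\varepsilon$) and passing to the limit as $\varepsilon$ tends to 0 we obtain, writing $\phi$ for $\phi_1$:

$$\int_{\mathbb{R}^+} \int_{\mathbb{R}^d} \int_Y a(y) \cdot \nabla_y \phi(x,y,t)\, u_0(x,y,t)\, dy\, dx\, dt = 0.$$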
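The passage to the limit can be unpacked as follows. This is a sketch of the intermediate computation, not part of the original argument; it assumes that $\phi_1$ and its derivatives decay fast enough in $(x,t)$ for all the integrals to be finite. All occurrences of $\phi_1$ and its derivatives below are evaluated at $(x, x/\varepsilon, t)$ unless stated otherwise.

```latex
\begin{align*}
\frac{\partial \phi^\varepsilon}{\partial t} &= \varepsilon\, \frac{\partial \phi_1}{\partial t},
\qquad
\nabla_x \phi^\varepsilon = \varepsilon\, \nabla_x \phi_1 + \nabla_y \phi_1, \\
0 &= \int_{\mathbb{R}^+} \int_{\mathbb{R}^d} a\!\left(\frac{x}{\varepsilon}\right) \cdot \nabla_y \phi_1 \, u^\varepsilon \, dx\, dt \\
  &\quad + \varepsilon \left[ \int_{\mathbb{R}^+} \int_{\mathbb{R}^d}
      \left( \frac{\partial \phi_1}{\partial t} + a\!\left(\frac{x}{\varepsilon}\right) \cdot \nabla_x \phi_1 \right) u^\varepsilon \, dx\, dt
      + \int_{\mathbb{R}^d} u_{in}(x)\, \phi_1\!\left(x, \frac{x}{\varepsilon}, 0\right) dx \right].
\end{align*}
```

By (21.3.1) the bracket stays bounded as $\varepsilon \to 0$, so the last term is $O(\varepsilon)$. In the first integral, $a(y) \cdot \nabla_y \phi_1(x,y,t)$ plays the role of a test function for the two–scale convergence of $u^\varepsilon$, so the integral tends to $\int_{\mathbb{R}^+} \int_{\mathbb{R}^d} \int_Y a(y) \cdot \nabla_y \phi_1(x,y,t)\, u_0(x,y,t)\, dy\, dx\, dt$. Setting the limit equal to zero gives the identity displayed above, with $\phi_1$ in the role of $\phi$.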


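This is equation (21.2.2a). We now choose an admissible test function $\phi^\varepsilon = \phi\left(x, \frac{x}{\varepsilon}, t\right)$, $\phi(x,y,t) \in C_0^\infty(\mathbb{R}^+ \times \mathbb{R}^d; C^\infty_{per}(Y))$, such that (21.2.2b) is satisfied. We insert this test function in (21.3.2) and pass to the limit as $\varepsilon$ tends to 0 to obtain

$$\int_{\mathbb{R}^+} \int_{\mathbb{R}^d} \int_Y \left( \frac{\partial \phi(x,y,t)}{\partial t} + a(y) \cdot \nabla_x \phi(x,y,t) \right) u_0(x,y,t)\, dy\, dx\, dt + \int_{\mathbb{R}^d} u_{in}(x) \left( \int_Y \phi(x,y,0)\, dy \right) dx = 0,$$

which is precisely (21.2.2c).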

Remark 21.2. In the case where $a(y)$ is an ergodic, divergence–free vector field, equation (21.2.2a) is satisfied if and only if $u_0$ is independent of $y$. Then we can obtain (the weak formulation of) the homogenized equation

$$\frac{\partial u}{\partial t} + \left( \int_Y a(y)\, dy \right) \cdot \nabla u = 0, \quad u(x,0) = u_{in}(x),$$

by choosing a test function which is independent of $y$.
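To see why ergodicity matters in the remark above, it may help to look at a flow that is not ergodic. The sketch below, which is not taken from the text, uses the hypothetical two–dimensional shear flow $a(y) = (0, \sin 2\pi y_1)$, which is smooth, 1–periodic and divergence–free. Equation (21.2.1) can then be solved along characteristics, $u^\varepsilon(x,t) = u_{in}(x_1, x_2 - \sin(2\pi x_1/\varepsilon)\, t)$. The code compares a local average of $u^\varepsilon$ with two candidates: the $y$–average of $u_{in}(x_1, x_2 - \sin(2\pi y_1)\, t)$, which retains the dependence on $y$, and the solution of the transport equation with the averaged velocity $\int_Y a(y)\, dy = 0$. The Gaussian initial datum, the averaging window and all function names are assumptions made for the illustration.

```python
import numpy as np

def v(y1):
    """Second component of the hypothetical shear flow a(y) = (0, v(y1))."""
    return np.sin(2.0 * np.pi * y1)

def u_in(x1, x2):
    """Hypothetical smooth initial datum."""
    return np.exp(-(x1**2 + x2**2))

def u_eps(x1, x2, t, eps):
    """Exact solution by characteristics: x1 is frozen, x2 moves with speed v(x1/eps)."""
    return u_in(x1, x2 - v(x1 / eps) * t)

def local_average(x1, x2, t, eps, width=0.05, n=20_001):
    """Average of u_eps over a window in x1 that spans many periods of the flow."""
    s = np.linspace(x1 - 0.5 * width, x1 + 0.5 * width, n)
    return float(np.mean(u_eps(s, x2, t, eps)))

def cell_average(x1, x2, t, n=2_000):
    """Average over the unit cell of u_in(x1, x2 - v(y1) t), keeping the y-dependence."""
    y1 = (np.arange(n) + 0.5) / n
    return float(np.mean(u_in(x1, x2 - v(y1) * t)))

def averaged_velocity_solution(x1, x2, t, n=2_000):
    """Solution of the transport equation with the velocity replaced by its cell average."""
    y1 = (np.arange(n) + 0.5) / n
    v_bar = float(np.mean(v(y1)))  # vanishes up to rounding for this shear flow
    return float(u_in(x1, x2 - v_bar * t))

if __name__ == "__main__":
    x1, x2, t, eps = 0.3, 0.2, 1.0, 1e-3
    print("local average of u_eps    :", local_average(x1, x2, t, eps))
    print("cell average (y retained) :", cell_average(x1, x2, t))
    print("averaged-velocity solution:", averaged_velocity_solution(x1, x2, t))
```

In this example the local average follows the cell average and not the averaged–velocity solution. This is consistent with a two–scale limit $u_0(x,y,t) = u_{in}(x_1, x_2 - \sin(2\pi y_1)\, t)$ that genuinely depends on $y$, so the single homogenized equation of the remark is not expected to apply.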

Of course, in order for the homogenized system of equations to be of any interest, we need to prove that it is well posed. This is the content of the following theorem.

Theorem 21.3. There exists a unique solution $u_0(x,y,t) \in L^2(\mathbb{R}^+ \times \mathbb{R}^d \times Y)$ of equations (21.2.2a), (21.2.2b) and (21.2.2c).

Proof. The existence of a solution follows from the existence of a two–scale limit for $u^\varepsilon$. Let us proceed with uniqueness. Let $u_1(x,y,t)$, $u_2(x,y,t)$ be two solutions of the homogenized system with the same initial conditions. We form the difference

$$U(x,y,t) = u_1(x,y,t) - u_2(x,y,t).$$

This function satisfies, by linearity, the same system of equations (21.2.2a), (21.2.2b) and (21.2.2c), with zero initial conditions. We have:

$$\int_0^T \int_{\mathbb{R}^d} \int_Y \left( \frac{\partial \phi(x,y,t)}{\partial t} + a(y) \cdot \nabla_x \phi(x,y,t) \right) U(x,y,t)\, dy\, dx\, dt = 0,$$

where $T > 0$ is arbitrary but fixed. Now, $U(x,y,t)$ satisfies equation (21.2.2a), which is the weak form of the constraint $a(y) \cdot \nabla_y U = 0$ required in (21.2.2b). Thus, assuming that it is smooth enough, we can use it as a test function in the above equation to obtain:

$$\int_0^T \int_{\mathbb{R}^d} \int_Y \left( \frac{\partial U(x,y,t)}{\partial t} + a(y) \cdot \nabla_x U(x,y,t) \right) U(x,y,t)\, dy\, dx\, dt = 0.$$

Assuming that $U(x,y,t)$ has compact support we can integrate by parts the second term in the above equation to obtain:

$$\int_{\mathbb{R}^+} \int_{\mathbb{R}^d} \int_Y a(y) \cdot \nabla_x U(x,y,t)\, U(x,y,t)\, dy\, dx\, dt = - \int_{\mathbb{R}^+} \int_{\mathbb{R}^d} \int_Y \big( a(y)\, U(x,y,t) \big) \cdot \nabla_x U(x,y,t)\, dy\, dx\, dt,$$

from which we deduce that this term vanishes and, thus, we are left with

$$\frac{1}{2} \int_{\mathbb{R}^+} \frac{d}{dt} \int_{\mathbb{R}^d} \int_Y |U(x,y,t)|^2\, dy\, dx\, dt = 0.$$

We now use the fact that, by assumption, $U(x,y,t)$ is compactly supported to deduce that

$$\int_{\mathbb{R}^d} \int_Y |U(x,y,t)|^2\, dy\, dx = 0.$$

Consequently, $U(x,y,t) \equiv 0$ and, thus, the solution $u_0(x,y,t)$ is unique.

21.4 Discussion and Bibliography

In our proof we have followed [36]. A complete (in the $L^2$ sense) set of test functions was used in [70] in order to characterize the homogenized limit in the general two–dimensional case. It was shown there that the homogenized limit was an infinite symmetric set of linear hyperbolic equations. The method of characteristics was used in order to prove the homogenization theorem for two–dimensional flows in [145].

21.5 Exercises

1. Consider the following Cauchy problem

$$\frac{\partial u^\varepsilon(x,t)}{\partial t} + a\!\left(x, \frac{x}{\varepsilon}\right) \cdot \nabla u^\varepsilon(x,t) = 0 \quad \text{for } (x,t) \in \mathbb{R}^d \times \mathbb{R}^+,$$

$$u^\varepsilon(x,0) = u_{in}(x) \quad \text{for } x \in \mathbb{R}^d,$$

where $a(x,y) \in C_b^\infty(\mathbb{R}^d, C^\infty_{per}(Y); \mathbb{R}^d)$ with $\nabla \cdot a(x, x/\varepsilon) = 0$. Use the method of two–scale convergence to prove the homogenization theorem.

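2. Similarly for the forced transport PDE

$$\frac{\partial u^\varepsilon(x,t)}{\partial t} + a(x) \cdot \nabla u^\varepsilon(x,t) = f\!\left(x, \frac{x}{\varepsilon}\right) \quad \text{for } (x,t) \in \mathbb{R}^d \times \mathbb{R}^+,$$

$$u^\varepsilon(x,0) = u_{in}(x) \quad \text{for } x \in \mathbb{R}^d,$$

where $f(x,y)$ is smooth and periodic in its second argument, and $f(x, x/\varepsilon)$ is bounded in $L^2(\mathbb{R}^d)$.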

3. Similarly for the transport equation (21.2.1a) with oscillating initial data

$$u^\varepsilon(x,0) = u_{in}\!\left(x, \frac{x}{\varepsilon}\right),$$

where $u_{in}(x,y)$ is smooth and periodic in its second argument, and $u_{in}(x, x/\varepsilon)$ is bounded in $L^2(\mathbb{R}^d)$.

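4. Combine the above exercises to prove the homogenization theorem for the transport PDE

$$\frac{\partial u^\varepsilon(x,t)}{\partial t} + a\!\left(x, \frac{x}{\varepsilon}\right) \cdot \nabla u^\varepsilon(x,t) = f\!\left(x, \frac{x}{\varepsilon}\right) \quad \text{for } (x,t) \in \mathbb{R}^d \times \mathbb{R}^+,$$

$$u^\varepsilon(x,0) = u_{in}\!\left(x, \frac{x}{\varepsilon}\right) \quad \text{for } x \in \mathbb{R}^d.$$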


Bibliography

[1] A. Abdulle and W. E. Finite difference heterogeneous multi-scale method for homogenization problems. J. Comput. Phys., 191(1):18–39, 2003.

[2] R. A. Adams and J. J.F. Fournier. Sobolev Spaces. Pure and Applied Mathematics. Elsevier, Oxford, 2003.

[3] G. Allaire. Homogénéisation et convergence à deux échelles. Application à un problème de convection diffusion. C. R. Acad. Sci. Paris, 312:581–586, 1991.

[4] G. Allaire. Homogenization and two-scale convergence. SIAM J. Math. Anal., 23(6):1482–1518, 1992.

[5] G. Allaire and M. Briane. Multiscale convergence and reiterated homogenisation. Proc. Roy. Soc. Edinburgh Sect. A, 126(2):297–342, 1996.

[6] R. Aris. On the dispersion of a solute in a fluid flowing through a tube. Proc. R. Soc. Lond. A, 235:67–77, 1956.

[7] V. I. Arnol′d. Mathematical methods of classical mechanics, volume 60 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1989.

[8] Vladimir I. Arnol′d. Ordinary differential equations. Springer Textbook. Springer-Verlag, Berlin, 1992.

[9] Zvi Artstein. On singularly perturbed ordinary differential equations with measure-valued limits. Math. Bohem., 127(2):139–152, 2002.

[10] Zvi Artstein and Marshall Slemrod. On singularly perturbed retarded functional-differential equations. J. Differential Equations, 171(1):88–109, 2001.


[11] Zvi Artstein and Alexander Vigodner. Singularly perturbed ordinary differential equations with dynamic limits. Proc. Roy. Soc. Edinburgh Sect. A, 126(3):541–569, 1996.

[12] N. Bakhvalov and G. Panasenko. Homogenisation: averaging processes in periodic media. Kluwer Academic Publishers Group, Dordrecht, 1989. Mathematical problems in the mechanics of composite materials, Translated from the Russian by D. Leĭtes.

[13] R. Balescu. Statistical dynamics. Matter out of equilibrium. Imperial College Press, London, 1997.

[14] Christian Beck. Brownian motion from deterministic dynamics. Phys. A, 169(2):324–336, 1990.

[15] Gerard Ben Arous and Houman Owhadi. Multiscale homogenization with bounded ratios and anomalous slow diffusion. Comm. Pure Appl. Math., 56(1):80–113, 2003.

[16] Giancarlo Benettin, Luigi Galgani, and Antonio Giorgilli. Realization of holonomic constraints and freezing of high frequency degrees of freedom in the light of classical perturbation theory. I. Comm. Math. Phys., 113(1):87–103, 1987.

[17] A. Bensoussan, J.L. Lions, and G. Papanicolaou. Asymptotic analysis of periodic structures. North-Holland, Amsterdam, 1978.

[18] R. N. Bhattacharya, V.K. Gupta, and H.F. Walker. Asymptotics of solute dispersion in periodic porous media. SIAM J. Appl. Math., 49(1):86–98, 1989.

[19] Patrick Billingsley. Convergence of probability measures. Wiley Series in Probability and Statistics: Probability and Statistics. John Wiley & Sons Inc., New York, 1999.

[20] J. Bonn and R.M. McLaughlin. Sensitive enhanced diffusivities for flows with fluctuating mean winds: a two-parameter study. J. Fluid Mech., 445:345–375, 2001.


[21] Folkmar Bornemann. Homogenization in time of singularly perturbed mechanical systems, volume 1687 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 1998.

[22] Folkmar A. Bornemann and Christof Schütte. Homogenization of Hamiltonian systems with a strong constraining potential. Phys. D, 102(1-2):57–77, 1997.

[23] A. Braides. Γ-convergence for beginners, volume 22 of Oxford Lecture Series in Mathematics and its Applications. Oxford University Press, Oxford, 2002.

[24] Y. Brenier. Remarks on some linear hyperbolic equations with oscillatory coefficients. In Third International Conference on Hyperbolic Problems, Vol. I, II (Uppsala, 1990), pages 119–130. Studentlitteratur, Lund, 1991.

[25] H. Brezis. Analyse fonctionnelle. Collection Mathématiques Appliquées pour la Maîtrise. [Collection of Applied Mathematics for the Master’s Degree]. Masson, Paris, 1983.

[26] F. Campillo and A. Piatnitski. Effective diffusion in vanishing viscosity. In Nonlinear partial differential equations and their applications. Collège de France Seminar, Vol. XIV (Paris, 1997/1998), volume 31 of Stud. Math. Appl., pages 133–145. North-Holland, Amsterdam, 2002.

[27] Jack Carr. Applications of centre manifold theory, volume 35 of Applied Mathematical Sciences. Springer-Verlag, New York, 1981.

[28] S. Cerrai. Second order PDE’s in finite and infinite dimension, volume 1762 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 2001.

[29] Alexandre J. Chorin, Ole H. Hald, and Raz Kupferman. Optimal prediction with memory. Phys. D, 166(3-4):239–257, 2002.

[30] D. Cioranescu and P. Donato. An Introduction to Homogenization. Oxford University Press, New York, 1999.

[31] Earl A. Coddington and Norman Levinson. Theory of ordinary differential equations. McGraw-Hill Book Company, Inc., New York-Toronto-London, 1955.


[32] P. Constantin, C. Foias, B. Nicolaenko, and R. Temam. Integral manifolds and inertial manifolds for dissipative partial differential equations, volume 70 of Applied Mathematical Sciences. Springer-Verlag, New York, 1989.

[33] B. Dacorogna. Direct methods in the calculus of variations. Springer-Verlag, Berlin-Heidelberg, 1989.

[34] A. Defranceschi. An introduction to homogenization and G-convergence. Lecture notes, 1993.

[35] Mihai Dorobantu and Björn Engquist. Wavelet-based numerical homogenization. SIAM J. Numer. Anal., 35(2):540–559 (electronic), 1998.

[36] W. E. Homogenization of linear and nonlinear transport equations. Comm. Pure Appl. Math., 45:301–326, 1992.

[37] W. E, D. Liu, and E. Vanden-Eijnden. Analysis of multiscale methods for stochastic differential equations. Comm. Pure Appl. Math., 58(11):1544–1585, 2005.

[38] Weinan E and Björn Engquist. The heterogeneous multiscale methods. Commun. Math. Sci., 1(1):87–132, 2003.

[39] Weinan E and Björn Engquist. Multiscale modeling and computation. Notices Amer. Math. Soc., 50(9):1062–1070, 2003.

[40] R.E. Edwards. Functional analysis. Dover Publications Inc., New York, 1995.

[41] S.N. Ethier and T.G. Kurtz. Markov processes. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. John Wiley & Sons Inc., New York, 1986.

[42] L. C. Evans. Weak convergence methods for nonlinear partial differential equations, volume 74 of CBMS Regional Conference Series in Mathematics. Published for the Conference Board of the Mathematical Sciences, Washington, DC, 1990.

[43] L.C. Evans. The perturbed test function method for viscosity solutions of nonlinear PDE. Proceedings of the Royal Society of Edinburgh, 111A:359–375, 1989.


[44] L.C. Evans. Periodic homogenisation of certain fully nonlinear partial differential equations. Proceedings of the Royal Society of Edinburgh, 120:245–265, 1992.

[45] L.C. Evans. Partial Differential Equations. AMS, Providence, Rhode Island, 1998.

[46] Neil Fenichel. Persistence and smoothness of invariant manifolds for flows. Indiana Univ. Math. J., 21:193–226, 1971/1972.

[47] Neil Fenichel. Geometric singular perturbation theory for ordinary differential equations. J. Differential Equations, 31(1):53–98, 1979.

[48] J-P. Fouque, G.C. Papanicolaou, and R.K. Sircar. Derivatives in financial markets with stochastic volatility. Cambridge University Press, Cambridge, 2000.

[49] M. I. Freĭdlin. The Dirichlet problem for an equation with periodic coefficients depending on a small parameter. Teor. Verojatnost. i Primenen., 9:133–139, 1964.

[50] M.I. Freidlin and A.D. Wentzell. Random perturbations of dynamical systems. Springer-Verlag, New York, 1984.

[51] A. Friedman. Partial Differential Equations. Holt, Rinehart and Winston, New York, 1969.

[52] Avner Friedman. Partial differential equations of parabolic type. Prentice-Hall Inc., Englewood Cliffs, N.J., 1964.

[53] C. W. Gardiner. Handbook of stochastic methods. Springer-Verlag, Berlin, second edition, 1985. For physics, chemistry and the natural sciences.

[54] J. Garnier. Homogenization in a periodic and time-dependent potential. SIAM J. Appl. Math., 57(1):95–111, 1997.

[55] I. I. Gikhman and A. V. Skorokhod. Introduction to the theory of random processes. Dover Publications Inc., Mineola, NY, 1996.

[56] D. Gilbarg and N.S. Trudinger. Elliptic partial differential equations of second order. Classics in Mathematics. Springer-Verlag, Berlin, 2001.


[57] Daniel T. Gillespie. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J. Computational Phys., 22(4):403–434, 1976.

[58] D. Givon, R. Kupferman, and A.M. Stuart. Extracting macroscopic dynamics: model problems and algorithms. Nonlinearity, 17(6):R55–R127, 2004.

[59] Dror Givon and Raz Kupferman. White noise limits for discrete dynamical systems driven by fast deterministic dynamics. Phys. A, 335(3-4):385–412, 2004.

[60] G.R. Grimmett and D.R. Stirzaker. Probability and random processes. Oxford University Press, New York, 2001.

[61] R.B. Guenther and J.W. Lee. Partial differential equations of mathematical physics and integral equations. Dover Publications Inc., Mineola, NY, 1996.

[62] J. K. Hale. Averaging methods for differential equations with retarded arguments and a small parameter. J. Differential Equations, 2:57–73, 1966.

[63] Jack K. Hale. Asymptotic behavior of dissipative systems, volume 25 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, RI, 1988.

[64] R. Z. Has′minskiĭ. The averaging principle for parabolic and elliptic differential equations and Markov processes with small diffusion. Teor. Verojatnost. i Primenen., 8:3–25, 1963.

[65] R. Z. Has′minskiĭ. A limit theorem for solutions of differential equations with a random right hand part. Teor. Verojatnost. i Primenen., 11:444–462, 1966.

[66] R. Z. Has′minskiĭ. Stochastic stability of differential equations, volume 7 of Monographs and Textbooks on Mechanics of Solids and Fluids: Mechanics and Analysis. Sijthoff & Noordhoff, Alphen aan den Rijn, 1980.

[67] A. Holmbom. Homogenization of parabolic equations: An alternative approach and some corrector-type results. Applications of Mathematics, 42(5):321–343, 1995.


[68] M.H. Holmes. Introduction to perturbation methods. Springer-Verlag, New York, 1995.

[69] T. Y. Hou and X-H. Wu. A multiscale finite element method for elliptic problems in composite materials and porous media. J. Comput. Phys., 134(1):169–189, 1997.

[70] T. Y. Hou and X. Xin. Homogenization of linear transport equations with oscillatory vector fields. SIAM J. Appl. Math., 52(1):34–45, 1992.

[71] W. Huisinga, Ch. Schütte, and A.M. Stuart. Extracting macroscopic stochastic dynamics: model problems. Comm. Pure Appl. Math., 56(2):234–269, 2003.

[72] A. R. Humphries and A.M. Stuart. Deterministic and random dynamical systems: theory and numerics. In Modern methods in scientific computing and applications (Montreal, QC, 2001), volume 75 of NATO Sci. Ser. II Math. Phys. Chem., pages 211–254. Kluwer Acad. Publ., Dordrecht, 2002.

[73] Wolfram Just, Katrin Gelfert, Nilufer Baba, Anja Riegert, and Holger Kantz. Elimination of fast chaotic degrees of freedom: on the accuracy of the Born approximation. J. Statist. Phys., 112(1-2):277–292, 2003.

[74] Wolfram Just, Holger Kantz, Christian Rödenbeck, and Mario Helm. Stochastic modelling: replacing fast degrees of freedom by noise. J. Phys. A, 34(15):3199–3213, 2001.

[75] I. Karatzas and S.E. Shreve. Brownian Motion and Stochastic Calculus, volume 113 of Graduate Texts in Mathematics. Springer-Verlag, New York, second edition, 1991.

[76] J. B. Keller. Darcy’s law for flow in porous media and the two-space method. In Nonlinear partial differential equations in engineering and applied science (Proc. Conf., Univ. Rhode Island, Kingston, R.I., 1979), volume 54 of Lecture Notes in Pure and Appl. Math., pages 429–443. Dekker, New York, 1980.

[77] J. Kevorkian and J. D. Cole. Multiple scale and singular perturbation methods, volume 114 of Applied Mathematical Sciences. Springer-Verlag, New York, 1996.


[78] Yuri Kifer. Averaging in dynamical systems and large deviations. Invent. Math., 110(2):337–370, 1992.

[79] Yuri Kifer. Limit theorems in averaging for dynamical systems. Ergodic Theory Dynam. Systems, 15(6):1143–1172, 1995.

[80] Yuri Kifer. Averaging and climate models. In Stochastic climate models (Chorin, 1999), volume 49 of Progr. Probab., pages 171–188. Birkhäuser, Basel, 2001.

[81] Yuri Kifer. Stochastic versions of Anosov’s and Neistadt’s theorems on averaging. Stoch. Dyn., 1(1):1–21, 2001.

[82] C. Kipnis and S. R. S. Varadhan. Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusions. Comm. Math. Phys., 104(1):1–19, 1986.

[83] Heinz-Otto Kreiss. Problems with different time scales. In Acta numerica, 1992, Acta Numer., pages 101–139. Cambridge Univ. Press, Cambridge, 1992.

[84] E. Kreyszig. Introductory functional analysis with applications. Wiley Classics Library. John Wiley & Sons Inc., New York, 1989.

[85] T.G. Kurtz. A limit theorem for perturbed operator semigroups with applications to random evolutions. J. Functional Analysis, 12:55–67, 1973.

[86] C. Landim. Central limit theorem for Markov processes. In From classical to modern probability, volume 54 of Progr. Probab., pages 145–205. Birkhäuser, Basel, 2003.

[87] C. Landim and S. Olla. Central Limit Theorems for Markov Processes. Lecture Notes, 2005.

[88] A. Lasota and M.C. Mackey. Chaos, Fractals, Noise. Springer-Verlag, New York, 1994.

[89] P.D. Lax. Functional Analysis. Wiley, NY, 2002.

[90] P. Lochak and C. Meunier. Multiphase averaging for classical systems, volume 72 of Applied Mathematical Sciences. Springer-Verlag, New York, 1988.


[91] T.J. Lyons. Differential equations driven by rough signals. Rev. Mat. Iberoamericana, 14(2):215–310, 1998.

[92] R.S. MacKay. Slow manifolds. In Energy Localisation and Transfer, pages 149–192. World Scientific, 2004.

[93] A. Majda, I. Timofeyev, and E. Vanden-Eijnden. A priori tests of a stochastic mode reduction strategy. Phys. D, 170(3-4):206–252, 2002.

[94] A.J. Majda and P.R. Kramer. Simplified models for turbulent diffusion: Theory, numerical modelling and physical phenomena. Physics Reports, 314:237–574, 1999.

[95] A.J. Majda and R.M. McLaughlin. The effect of mean flows on enhanced diffusivity in transport by incompressible periodic velocity fields. Studies in Applied Mathematics, 89:245–279, 1993.

[96] A.J. Majda, I. Timofeyev, and E. Vanden-Eijnden. A mathematical framework for stochastic climate models. Comm. Pure Appl. Math., 54(8):891–974, 2001.

[97] Andrew J. Majda and Ilya Timofeyev. Remarkable statistical behavior for truncated Burgers-Hopf dynamics. Proc. Natl. Acad. Sci. USA, 97(23):12413–12417 (electronic), 2000.

[98] Andrew J. Majda, Ilya Timofeyev, and Eric Vanden-Eijnden. Models for stochastic climate prediction. Proc. Natl. Acad. Sci. USA, 96(26):14687–14691 (electronic), 1999.

[99] Andrew J. Majda, Ilya Timofeyev, and Eric Vanden-Eijnden. Systematic strategies for stochastic mode reduction in climate. J. Atmospheric Sci., 60(14):1705–1722, 2003.

[100] X. Mao. Stochastic differential equations and their applications. Horwood Publishing Series in Mathematics & Applications. Horwood Publishing Limited, Chichester, 1997.

[101] M-L Mascarenhas and A-M Toader. Scale convergence in homogenization. Numer. Func. Anal. Optim., 22:127–158, 2001.

[102] G. Dal Maso. An Introduction to Γ–convergence. Birkhäuser, Boston, 1993.


[103] A. Mazzino. Effective correlation times in turbulent scalar transport. Physical Review E, 56:5500–5510, 1997.

[104] A. Mazzino, S. Musacchio, and A. Vulpiani. Multiple-scale analysis and renormalization for preasymptotic scalar transport. Phys. Rev. E (3), 71(1):011113, 11, 2005.

[105] R.M. McLaughlin. Numerical averaging and fast homogenization. Journal of Statistical Physics, 90(3):597–626, 1998.

[106] G.W. Milton. The theory of composites, volume 6 of Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, 2002.

[107] F. Murat and L. Tartar. H-convergence. In Topics in the mathematical modelling of composite materials, volume 31 of Progr. Nonlinear Differential Equations Appl., pages 21–43. Birkhäuser Boston, Boston, MA, 1997.

[108] G. Nguetseng. A general convergence result for a functional related to the theory of homogenization. SIAM J. Math. Anal., 20(3):608–623, 1989.

[109] G. Nguetseng. Asymptotic analysis for a stiff variational problem arising in mechanics. SIAM J. Math. Anal., 21(6):1394–1414, 1990.

[110] J.R. Norris. Markov chains. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 1998.

[111] B. Øksendal. Stochastic differential equations. Universitext. Springer-Verlag, Berlin, 2003.

[112] S. Olla. Homogenization of Diffusion Processes in Random Fields. Lecture Notes, 1994.

[113] Robert E. O’Malley, Jr. Singular perturbation methods for ordinary differential equations, volume 89 of Applied Mathematical Sciences. Springer-Verlag, New York, 1991.

[114] G. Papanicolaou and S. R. S. Varadhan. Ornstein-Uhlenbeck process in a random potential. Comm. Pure Appl. Math., 38(6):819–834, 1985.

[115] G. C. Papanicolaou. Some probabilistic problems and methods in singular perturbations. Rocky Mountain J. Math., 6(4):653–674, 1976.


[116] G. C. Papanicolaou. Introduction to the asymptotic analysis of stochastic equations. In Modern modeling of continuum phenomena (Ninth Summer Sem. Appl. Math., Rensselaer Polytech. Inst., Troy, N.Y., 1975), pages 109–147. Lectures in Appl. Math., Vol. 16. Amer. Math. Soc., Providence, R.I., 1977.

[117] G. C. Papanicolaou and W. Kohler. Asymptotic theory of mixing stochastic ordinary differential equations. Comm. Pure Appl. Math., 27:641–668, 1974.

[118] G. C. Papanicolaou, D. Stroock, and S. R. S. Varadhan. Martingale approach to some limit theorems. In Papers from the Duke Turbulence Conference (Duke Univ., Durham, N.C., 1976), Paper No. 6, pages ii+120 pp. Duke Univ. Math. Ser., Vol. III. Duke Univ., Durham, N.C., 1977.

[119] G. C. Papanicolaou and S. R. S. Varadhan. A limit theorem with strong mixing in Banach space and two applications to stochastic differential equations. Comm. Pure Appl. Math., 26:497–524, 1973.

[120] G.C. Papanicolaou. Asymptotic analysis of stochastic equations. In Studies in probability theory, volume 18 of MAA Stud. Math., pages 111–179. Math. Assoc. America, Washington, D.C., 1978.

[121] E. Pardoux. Homogenization of linear and semilinear second order parabolic PDEs with periodic coefficients: A probabilistic approach. Journal of Functional Analysis, 167:498–520, 1999.

[122] G. A. Pavliotis. Homogenization Theory for Advection–Diffusion Equations with Mean Flow, Ph.D Thesis. Rensselaer Polytechnic Institute, Troy, NY, 2002.

[123] G. A. Pavliotis. A multiscale approach to Brownian motors. Phys. Lett. A, 344:331–345, 2005.

[124] G. A. Pavliotis and A. M. Stuart. Periodic homogenization for inertial particles. Phys. D, 204(3-4):161–187, 2005.

[125] M.H. Protter and H.F. Weinberger. Maximum principles in differential equations. Springer-Verlag, New York, 1984.


[126] P. Reimann. Brownian motors: noisy transport far from equilibrium. Phys. Rep., 361(2-4):57–265, 2002.

[127] P. Reimann, C. Van den Broeck, H. Linke, J.M. Rubi, and A. Perez-Madrid. Giant acceleration of free diffusion by use of tilted periodic potentials. Phys. Rev. Lett., 87(1):010602, 2001.

[128] H. Risken. The Fokker-Planck equation, volume 18 of Springer Series in Synergetics. Springer-Verlag, Berlin, 1989.

[129] J. C. Robinson. Infinite-dimensional dynamical systems. Cambridge Texts in Applied Mathematics. Cambridge University Press, Cambridge, 2001.

[130] L. C. G. Rogers and David Williams. Diffusions, Markov processes, and martingales. Vol. 2. Cambridge Mathematical Library. Cambridge University Press, Cambridge, 2000.

[131] Hanan Rubin and Peter Ungar. Motion under a strong constraining force. Comm. Pure Appl. Math., 10:65–87, 1957.

[132] J. A. Sanders and F. Verhulst. Averaging methods in nonlinear dynamical systems, volume 59 of Applied Mathematical Sciences. Springer-Verlag, New York, 1985.

[133] M. Schreier, P. Reimann, P. Hänggi, and E. Pollak. Giant enhancement of diffusion and particle selection in rocked periodic potentials. Europhys. Lett., 44(4):416–422, 1998.

[134] Ch. Schütte, J. Walter, C. Hartmann, and W. Huisinga. An averaging principle for fast degrees of freedom exhibiting long-term correlations. Preprint, 2003.

[135] Ya. G. Sinai. Introduction to ergodic theory. Princeton University Press, Princeton, N.J., 1976.

[136] A. V. Skorokhod. Asymptotic methods in the theory of stochastic differential equations, volume 78 of Translations of Mathematical Monographs. American Mathematical Society, Providence, RI, 1989.

[137] A.M. Soward and S. Childress. Large magnetic Reynolds number dynamo action in spatially periodic flow with mean motion. Proc. Roy. Soc. Lond. A, 331:649–733, 1990.


[138] S. Spagnolo. Sul limite delle soluzioni di problemi di Cauchy relativi all’equazione del calore. Ann. Scuola Norm. Sup. Pisa (3), 21:657–699, 1967.

[139] S. Spagnolo. Sulla convergenza di soluzioni di equazioni paraboliche ed ellittiche. Ann. Scuola Norm. Sup. Pisa (3) 22 (1968), 571-597; errata, ibid. (3), 22:673, 1968.

[140] D.W. Stroock and S.R.S. Varadhan. Multidimensional diffusion processes. Classics in Mathematics. Springer-Verlag, Berlin, 2006.

[141] A. M. Stuart and A. R. Humphries. Dynamical systems and numerical analysis, volume 2 of Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, 1996.

[142] Floris Takens. Motion under the influence of a strong constraining force. In Global theory of dynamical systems (Proc. Internat. Conf., Northwestern Univ., Evanston, Ill., 1979), volume 819 of Lecture Notes in Math., pages 425–445. Springer, Berlin, 1980.

[143] L. Tartar. Nonlocal effects induced by homogenization, in PDE and calculus of Variations, F. Colombini et al., eds. Boston, Birkhäuser, 1989.

[144] L. Tartar. An introduction to the homogenization method in optimal design. In Optimal shape design (Troia, 1998), volume 1740 of Lecture Notes in Math., pages 47–156. Springer, Berlin, 2000.

[145] T. Tassa. Homogenization of two dimensional linear flows with integral invariance. SIAM J. Appl. Math., 57(5):1390–1405, 1997.

[146] G. I. Taylor. Dispersion of soluble matter in solvent flowing slowly through a tube. Proc. R. Soc. Lond. A, 219:186–203, 1953.

[147] R. Temam. Navier-Stokes equations. AMS Chelsea Publishing, Providence, RI, 2001.

[148] Roger Temam. Infinite-dimensional dynamical systems in mechanics and physics, volume 68 of Applied Mathematical Sciences. Springer-Verlag, New York, 1997.


[149] A. N. Tikhonov, A. B. Vasil′eva, and A. G. Sveshnikov. Differential equations. Springer Series in Soviet Mathematics. Springer-Verlag, Berlin, 1985.

[150] N. G. van Kampen. Elimination of fast variables. Phys. Rep., 124(2):69–160, 1985.

[151] Eric Vanden-Eijnden. Numerical techniques for multi-scale dynamical systems with stochastic effects. Commun. Math. Sci., 1(2):385–391, 2003.

[152] M. Vergassola and M. Avellaneda. Scalar transport in compressible flow. Phys. D, 106(1-2):148–166, 1997.

[153] John C. Wells. Invariant manifolds on non-linear operators. Pacific J. Math., 62(1):285–293, 1976.

[154] Stephen Wiggins. Introduction to applied nonlinear dynamical systems and chaos, volume 2 of Texts in Applied Mathematics. Springer-Verlag, New York, 2003.

[155] G. G. Yin and Q. Zhang. Discrete-time Markov chains, volume 55 of Applications of Mathematics (New York). Springer-Verlag, New York, 2005.

[156] G.G. Yin and Q. Zhang. Continuous-time Markov chains and applications, volume 37 of Applications of Mathematics (New York). Springer-Verlag, New York, 1998.

[157] K. Yosida. Functional Analysis. Springer-Verlag, Berlin-Heidelberg, 1995.