  • Probability Theory: STAT310/MATH230; August 27, 2013

    Amir Dembo

    Department of Mathematics, Stanford University, Stanford, CA 94305.

  • Contents

    Preface

    Chapter 1. Probability, measure and integration
      1.1. Probability spaces, measures and σ-algebras
      1.2. Random variables and their distribution
      1.3. Integration and the (mathematical) expectation
      1.4. Independence and product measures

    Chapter 2. Asymptotics: the law of large numbers
      2.1. Weak laws of large numbers
      2.2. The Borel-Cantelli lemmas
      2.3. Strong law of large numbers

    Chapter 3. Weak convergence, clt and Poisson approximation
      3.1. The Central Limit Theorem
      3.2. Weak convergence
      3.3. Characteristic functions
      3.4. Poisson approximation and the Poisson process
      3.5. Random vectors and the multivariate clt

    Chapter 4. Conditional expectations and probabilities
      4.1. Conditional expectation: existence and uniqueness
      4.2. Properties of the conditional expectation
      4.3. The conditional expectation as an orthogonal projection
      4.4. Regular conditional probability distributions

    Chapter 5. Discrete time martingales and stopping times
      5.1. Definitions and closure properties
      5.2. Martingale representations and inequalities
      5.3. The convergence of martingales
      5.4. The optional stopping theorem
      5.5. Reversed MGs, likelihood ratios and branching processes

    Chapter 6. Markov chains
      6.1. Canonical construction and the strong Markov property
      6.2. Markov chains with countable state space
      6.3. General state space: Doeblin and Harris chains

    Chapter 7. Continuous, Gaussian and stationary processes
      7.1. Definition, canonical construction and law
      7.2. Continuous and separable modifications
      7.3. Gaussian and stationary processes

    Chapter 8. Continuous time martingales and Markov processes
      8.1. Continuous time filtrations and stopping times
      8.2. Continuous time martingales
      8.3. Markov and Strong Markov processes

    Chapter 9. The Brownian motion
      9.1. Brownian transformations, hitting times and maxima
      9.2. Weak convergence and invariance principles
      9.3. Brownian path: regularity, local maxima and level sets

    Bibliography

    Index

  • Preface

    These are the lecture notes for a year-long, PhD-level course in Probability Theory that I taught at Stanford University in 2004, 2006 and 2009. The goal of this course is to prepare incoming PhD students in Stanford’s mathematics and statistics departments to do research in probability theory. More broadly, the goal of the text is to help the reader master the mathematical foundations of probability theory and the techniques most commonly used in proving theorems in this area. This is then applied to the rigorous study of the most fundamental classes of stochastic processes.

    Towards this goal, we introduce in Chapter 1 the relevant elements from measure and integration theory, namely, the probability space and the σ-algebras of events in it, random variables viewed as measurable functions, their expectation as the corresponding Lebesgue integral, and the important concept of independence.

    Utilizing these elements, we study in Chapter 2 the various notions of convergence of random variables and derive the weak and strong laws of large numbers.

    Chapter 3 is devoted to the theory of weak convergence, the related concepts of distribution and characteristic functions and two important special cases: the Central Limit Theorem (in short clt) and the Poisson approximation.

    Drawing upon the framework of Chapter 1, we devote Chapter 4 to the definition, existence and properties of the conditional expectation and the associated regular conditional probability distribution.

    Chapter 5 deals with filtrations, the mathematical notion of information progression in time, and with the corresponding stopping times. Results about the latter are obtained as a by-product of the study of a collection of stochastic processes called martingales. Martingale representations are explored, as well as maximal inequalities, convergence theorems and various applications thereof. Aiming for a clearer and easier presentation, we focus here on the discrete time setting, deferring the continuous time counterpart to Chapter 8.

    Chapter 6 provides a brief introduction to the theory of Markov chains, a vast subject at the core of probability theory, to which many textbooks are devoted. We illustrate some of the interesting mathematical properties of such processes by examining a few special cases of interest.

    Chapter 7 sets the framework for studying right-continuous stochastic processes indexed by a continuous time parameter, introduces the family of Gaussian processes and rigorously constructs the Brownian motion as a Gaussian process of continuous sample path and zero-mean, stationary independent increments.



    Chapter 8 expands our earlier treatment of martingales and strong Markov processes to the continuous time setting, emphasizing the role of right-continuous filtrations. The mathematical structure of such processes is then illustrated both in the context of Brownian motion and that of Markov jump processes.

    Building on this, in Chapter 9 we reconstruct the Brownian motion via the invariance principle as the limit of certain rescaled random walks. We further delve into the rich properties of its sample path and the many applications of Brownian motion to the clt and the Law of the Iterated Logarithm (in short, lil).

    The intended audience for this course should have prior exposure to stochastic processes, at an informal level. While students are assumed to have taken a real analysis class dealing with Riemann integration and to have mastered this material well, prior knowledge of measure theory is not assumed.

    It is quite clear that these notes are much influenced by the textbooks [Bil95, Dur10, Wil91, KaS97] I have been using.

    I thank my students out of whose work this text materialized and my teaching assistants Su Chen, Kshitij Khare, Guoqiang Hu, Julia Salzman, Kevin Sun and Hua Zhou for their help in the assembly of the notes of more than eighty students into a coherent document. I am also much indebted to Kevin Ross, Andrea Montanari and Oana Mocioalca for their feedback on earlier drafts of these notes, to Kevin Ross for providing all the figures in this text, and to Andrea Montanari, David Siegmund and Tze Lai for contributing some of the exercises in these notes.

    Amir Dembo

    Stanford, California

    April 2010


  • Chapter 1. Probability, measure and integration

    This chapter is devoted to the mathematical foundations of probability theory. Section 1.1 introduces the basic measure theory framework, namely, the probability space and the σ-algebras of events in it. The next building blocks are random variables, introduced in Section 1.2 as measurable functions ω ↦ X(ω) and their distribution. This allows us to define in Section 1.3 the important concept of expectation as the corresponding Lebesgue integral, extending the horizon of our discussion beyond the special functions and variables with density to which elementary probability theory is limited. Section 1.4 concludes the chapter by considering independence, the most fundamental aspect that differentiates probability from (general) measure theory, and the associated product measures.

    1.1. Probability spaces, measures and σ-algebras

    We shall define here the probability space (Ω, F, P) using the terminology of measure theory. The sample space Ω is the set of all possible outcomes ω ∈ Ω of some random experiment. Probabilities are assigned by A ↦ P(A) to A in a subset F of all possible sets of outcomes. The event space F represents both the amount of information available as a result of the experiment conducted and the collection of all subsets of possible interest to us, where we denote elements of F as events. A pleasant mathematical framework results by imposing on F the structural conditions of a σ-algebra, as done in Subsection 1.1.1. The most common and useful choices for this σ-algebra are then explored in Subsection 1.1.2. Subsection 1.1.3 provides fundamental supplements from measure theory, namely Dynkin’s and Carathéodory’s theorems and their application to the construction of Lebesgue measure.
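    A quick illustration of this assignment A ↦ P(A) on a finite sample space: the sketch below (Python; not part of the original notes, with names of our choosing) takes Ω to be the outcomes of a fair die, F = 2^Ω, and P(A) = |A|/|Ω|.

```python
from fractions import Fraction

omega = frozenset({1, 2, 3, 4, 5, 6})   # sample space: outcomes of a fair die

def P(A):
    """Classical probability on a finite sample space: P(A) = |A| / |Omega|."""
    A = frozenset(A)
    assert A <= omega, "events must be subsets of the sample space"
    return Fraction(len(A), len(omega))

even, low = {2, 4, 6}, {1, 2}
assert P(omega) == 1                                      # total mass one
assert P(set()) == 0                                      # empty event
assert P(even | low) == P(even) + P(low) - P(even & low)  # inclusion-exclusion
```

    Here every subset of Ω is an event; Subsection 1.1.1 makes precise which collections F of subsets are admissible in general.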

    1.1.1. The probability space (Ω, F, P). We use 2^Ω to denote the set of all possible subsets of Ω. The event space is thus a subset F of 2^Ω, consisting of all allowed events, that is, those subsets of Ω to which we shall assign probabilities. We next define the structural conditions imposed on F.
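    For a finite Ω the set 2^Ω can be listed explicitly. A small Python sketch (ours, not from the notes) that enumerates it and confirms |2^Ω| = 2^|Ω|:

```python
from itertools import chain, combinations

def power_set(omega):
    """Return 2^Omega: the list of all subsets of a finite sample space Omega."""
    omega = list(omega)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(omega, r) for r in range(len(omega) + 1))]

subsets = power_set({1, 2, 3})
assert len(subsets) == 2 ** 3                               # |2^Omega| = 2^|Omega|
assert frozenset() in subsets and frozenset({1, 2, 3}) in subsets
```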

    Definition 1.1.1. We say that F ⊆ 2^Ω is a σ-algebra (or a σ-field), if
    (a) Ω ∈ F,
    (b) if A ∈ F then A^c ∈ F as well (where A^c = Ω \ A),
    (c) if A_i ∈ F for i = 1, 2, 3, . . . then also ⋃_i A_i ∈ F.

    Remark. Using De Morgan’s law, we know that (⋃_i A_i^c)^c = ⋂_i A_i. Thus the following is equivalent to property (c) of Definition 1.1.1:
    (c') If A_i ∈ F for i = 1, 2, 3, . . . then also ⋂_i A_i ∈ F.
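    When Ω is finite, every union in property (c) reduces to a finite one, so Definition 1.1.1 can be verified mechanically. A minimal Python sketch (ours, not from the notes; the helper name is made up):

```python
from itertools import combinations

def is_sigma_algebra(omega, F):
    """Check Definition 1.1.1 on a finite Omega: (a) Omega in F,
    (b) closure under complements, (c) closure under unions."""
    omega = frozenset(omega)
    F = {frozenset(A) for A in F}
    if omega not in F:                                 # (a)
        return False
    if any(omega - A not in F for A in F):             # (b): A^c = Omega \ A
        return False
    # (c): for a finite F, closure under pairwise unions already gives
    # closure under all countable unions of its members.
    return all(A | B in F for A, B in combinations(F, 2))

omega = {1, 2, 3, 4}
assert is_sigma_algebra(omega, [set(), omega])                  # trivial sigma-algebra
assert is_sigma_algebra(omega, [set(), {1, 2}, {3, 4}, omega])  # generated by a partition
assert not is_sigma_algebra(omega, [set(), {1}, omega])         # {1}^c = {2,3,4} missing
```

    By De Morgan’s law, the same check could equally be phrased with intersections, which is exactly the equivalence of (c) and (c') noted in the remark.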



    Definition 1.1.2. A pair (Ω,F) with F a σ-algebra of subsets of Ω is called a measurable space. Given a measurable space (Ω,F), a measure µ is any countably additive non-negative set function on this space. That is, µ : F → [0,∞], having the properties: (a