Lecture 1. Basics of probability theory

Mathematical Statistics and Discrete Mathematics

November 2nd, 2015

Sample space

A sample space S is the set of all possible outcomes ω of an experiment.

• Toss a coin: S = {heads, tails}.
• Toss a die: S = {1, 2, 3, 4, 5, 6}.
• Pick a card from a deck: S = {all possible cards}.
• Lottery (choose six numbers from 1 to 49): S = {all six-element subsets of {1, 2, . . . , 49}}.
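The sample spaces above can be written down directly as Python sets. This is only a sketch: the card encoding (rank, suit) and all variable names are illustrative assumptions, and the lottery space is counted rather than enumerated because it is large.

```python
from itertools import combinations
from math import comb

# Sample spaces from the examples, encoded as Python sets.
coin = {"heads", "tails"}
die = set(range(1, 7))

# Assumed encoding of a standard 52-card deck as (rank, suit) pairs.
ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = {(rank, suit) for rank in ranks for suit in suits}

# The lottery sample space has one outcome per six-element subset of {1, ..., 49}.
lottery_size = comb(49, 6)

# Sanity check of the counting on a toy lottery: choose 3 numbers from 1 to 6.
toy_lottery = list(combinations(range(1, 7), 3))

print(len(coin), len(die), len(deck), lottery_size)
```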

Events

An event A ⊂ S is any subset of the sample space, that is, any collection of possible outcomes.
• The full set S is called the certain event.
• The empty set ∅ is called the impossible event.
• An event containing exactly one outcome is called an elementary event.

Pick a card. We can have, for example,
• A1 = {hearts},
• A2 = {even numbers},
• A3 = {queen of hearts}, an elementary event.

Usually, an event will take the general form

A = {all outcomes (not) satisfying a specific condition}.

Event operations (Boolean operations)

The intersection of events A and B is

A ∩ B = {all outcomes that are both in A and B} = {ω : ω ∈ A & ω ∈ B}.

If A ∩ B = ∅, then we say that A and B are disjoint.

Figure: left: events A and B with a non-empty intersection A ∩ B; right: disjoint events A and B (both drawn inside S).

• Let A = {1, 2, 3, 4}, B = {2, 4, 5, 6}. Then, A ∩ B = {2, 4}.
• Let A = {1, 2, 3}, B = {4, 5, 6}. Then, A ∩ B = ∅.
• Let A = [0, 7), B = [5, 10). Then, A ∩ B = [5, 7).

Event operations (Boolean operations)

The union of events A and B is

A ∪ B = {all outcomes that are in A or B} = {ω : ω ∈ A or ω ∈ B}.

Figure: the union A ∪ B drawn inside S.

• Let A = {1, 2, 3, 4}, B = {2, 4, 5, 6}. Then, A ∪ B = {1, 2, 3, 4, 5, 6}.
• Let A = {1, 2, 3}, B = {4, 5, 6}. Then, A ∪ B = {1, 2, 3, 4, 5, 6}.
• Let A = [0, 7), B = [5, 10). Then, A ∪ B = [0, 10).

Event operations

The difference between events A and B is

A \ B = {all outcomes that are in A but not in B} = {ω : ω ∈ A & ω ∉ B}.

Figure: the difference A \ B drawn inside S, next to B.

• Let A = {1, 2, 3, 4}, B = {2, 4, 5, 6}. Then, A \ B = {1, 3}.
• Let A = {1, 2, 3}, B = {4, 5, 6}. Then, A \ B = A = {1, 2, 3}.
• Let A = [0, 7), B = [5, 10). Then, A \ B = [0, 5), B \ A = [7, 10).

Event operations (Boolean operations)

The complement of event A is

A^c = S \ A = {all outcomes that are not in A} = {ω : ω ∉ A}.

Figure: the event A and its complement A^c drawn inside S.

Note that to know the complement, we have to know the sample space S.

• Let A = {1, 2, 3, 4}, S = {1, 2, 3, 4, 5, 6}. Then, A^c = {5, 6}.
• Let A = [0, 7), S = [0, 10). Then, A^c = [7, 10).
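For finite sample spaces, the four event operations map onto Python's built-in set operators. The sketch below replays the finite examples from the slides; the variable names are illustrative, and the interval examples are not covered.

```python
# Finite examples from the slides, with S the sample space of a die.
S = {1, 2, 3, 4, 5, 6}
A = {1, 2, 3, 4}
B = {2, 4, 5, 6}

intersection = A & B   # outcomes in both A and B
union = A | B          # outcomes in A or B
difference = A - B     # outcomes in A but not in B
complement = S - A     # outcomes not in A; needs the sample space S

# Disjoint events have an empty intersection.
are_disjoint = {1, 2, 3}.isdisjoint({4, 5, 6})

print(intersection, union, difference, complement, are_disjoint)
```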

Classical definition of probability

We say that an event A occurs if ω ∈ A. By P(A) we denote the probability that A occurs. By P(ω) we denote the probability that the elementary event {ω} occurs, or in other words, that the outcome of the experiment is equal to ω.

Main motivating question: How to compute probabilities of events?

We can agree that
• The certain event occurs with probability 1, and we write P(S) = 1.
• The impossible event occurs with probability 0, and we write P(∅) = 0.

But what about other events?

Classical definition of probability

The classical probability is defined by

\[ P(A) = \frac{|A|}{|S|} = \frac{\text{number of outcomes in } A}{\text{number of all outcomes}}. \]

In particular, for any outcome ω, P(ω) = 1/|S|. This is also called the uniform probability since the probability does not depend on the particular outcome ω.

Pick a random card from a deck. Let
• A1 = {hearts},
• A2 = {even numbers},
• A3 = {queen of hearts}.

We have |A1| = 13, |A2| = 20, |A3| = 1, and |S| = 52. Then,

\[ P(A_1) = \frac{13}{52} = \frac{1}{4}, \quad P(A_2) = \frac{20}{52} = \frac{5}{13}, \quad P(A_3) = \frac{1}{52}. \]
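The card example can be checked by counting outcomes. The following is a minimal sketch under an assumed (rank, suit) encoding of the deck; the helper name classical_probability is hypothetical, and exact fractions are used so the results match the slide.

```python
from fractions import Fraction

# Assumed encoding of a standard 52-card deck as (rank, suit) pairs.
ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
S = {(rank, suit) for rank in ranks for suit in suits}

def classical_probability(event, sample_space):
    """Return |A| / |S| as an exact fraction."""
    return Fraction(len(event), len(sample_space))

A1 = {card for card in S if card[1] == "hearts"}
# Even numbers: the ranks 2, 4, 6, 8, 10 in every suit.
A2 = {card for card in S if card[0].isdigit() and int(card[0]) % 2 == 0}
A3 = {("Q", "hearts")}

for name, event in [("A1", A1), ("A2", A2), ("A3", A3)]:
    print(name, len(event), classical_probability(event, S))
```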

Counting configurations

Suppose we run n experiments and we record all the outcomes.
• If we care about the order of the experiments, we record the combined outcome as an ordered sequence of n outcomes.
• If we do not care about the order of the experiments, we record the combined outcome as an unordered collection of n outcomes.

Depending on the type of the experiments, we may or may not get repeated outcomes.

Combined outcomes are called configurations.

Choose two letters out of a, b, c. The combined sample space S of all configurations is
• S = {aa, ab, ac, ba, bb, bc, ca, cb, cc}, |S| = 9, if we care about the order and allow repetitions,
• S = {ab, ac, ba, bc, ca, cb}, |S| = 6, if we care about the order and don’t allow repetitions,
• S = {{aa}, {ab}, {ac}, {bb}, {bc}, {cc}}, |S| = 6, if we don’t care about the order and allow repetitions,
• S = {{ab}, {ac}, {bc}}, |S| = 3, if we don’t care about the order and don’t allow repetitions.
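The four sample spaces for choosing two letters can be enumerated with the standard itertools module. This is a sketch with illustrative variable names; unordered configurations are represented by one sorted string each.

```python
from itertools import (
    combinations,
    combinations_with_replacement,
    permutations,
    product,
)

letters = "abc"

# Order matters, repetitions allowed.
ordered_with_rep = ["".join(t) for t in product(letters, repeat=2)]
# Order matters, no repetitions.
ordered_no_rep = ["".join(t) for t in permutations(letters, 2)]
# Order ignored, repetitions allowed (one representative per collection).
unordered_with_rep = ["".join(t) for t in combinations_with_replacement(letters, 2)]
# Order ignored, no repetitions.
unordered_no_rep = ["".join(t) for t in combinations(letters, 2)]

for space in (ordered_with_rep, ordered_no_rep, unordered_with_rep, unordered_no_rep):
    print(len(space), space)
```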

Counting configurations

Configurations
• where the order matters and there are no repetitions are called permutations,
• where the order does not matter and there are no repetitions are called combinations.

What is the size of the combined sample space?

The multiplication rule says that if we run k experiments, and the sample space of the ith experiment contains ni outcomes, then the combined sample space contains

\[ \prod_{i=1}^{k} n_i = n_1 \cdot n_2 \cdot \ldots \cdot n_k \]

configurations.
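As a quick check of the multiplication rule, the product of the sample-space sizes can be compared with a direct enumeration of the combined outcomes. The three experiments below (a coin, a die, a letter) are an arbitrary illustrative choice.

```python
from itertools import product
from math import prod

# Three hypothetical experiments with 2, 6 and 3 outcomes respectively.
experiments = [
    ["heads", "tails"],
    [1, 2, 3, 4, 5, 6],
    ["a", "b", "c"],
]

sizes = [len(space) for space in experiments]
predicted = prod(sizes)                         # n1 * n2 * n3
enumerated = len(list(product(*experiments)))   # count every combined outcome

print(sizes, predicted, enumerated)
```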

Counting configurations

Recall that n factorial is

n! = 1 · 2 · 3 · . . . · n = n · (n − 1) · (n − 2) · . . . · 1.

We have an urn with n balls of different colours. We choose one ball in each of the total of k turns.
• If n ≥ k and we do not put the ball back into the urn, then the first time we have n balls to choose from, the second time we have n − 1 balls, and so on, until on the kth turn we have n − k + 1 balls to choose from. Hence, the number of all possible configurations is

\[ n \cdot (n-1) \cdot \ldots \cdot (n-k+1) = \frac{n!}{(n-k)!}. \]

If n = k, we get n! configurations.

• If we put the ball back into the urn, then each time we have n balls to choose from. Hence, the number of all possible configurations is

\[ \underbrace{n \cdot n \cdot \ldots \cdot n}_{k \text{ times}} = n^k. \]
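Both urn formulas for ordered draws can be confirmed by brute force for small values. The choice n = 5, k = 3 below is arbitrary, and the balls are simply labelled 0 to n − 1.

```python
from itertools import permutations, product
from math import factorial, perm

n, k = 5, 3          # 5 balls, 3 turns (illustrative values)
balls = range(n)

# Ordered draws without replacement: n! / (n - k)!
without_replacement = factorial(n) // factorial(n - k)
# Ordered draws with replacement: n^k
with_replacement = n ** k

# Brute-force enumeration of the same configurations.
enumerated_without = len(list(permutations(balls, k)))
enumerated_with = len(list(product(balls, repeat=k)))

print(without_replacement, enumerated_without)
print(with_replacement, enumerated_with)
```

The standard-library function math.perm(n, k) computes the same falling product directly.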

Counting configurations

We have an urn with n balls of different colours. We choose one ball in each of the total of k turns.
• If k ≤ n, we do not put the ball back into the urn, and we do not care about the order, the number of all possible configurations is

\[ \binom{n}{k} := \frac{n!}{k!\,(n-k)!} \quad \text{(read: $n$ choose $k$)}. \]

• The above number is just the number of all k-element subsets of an n-element set. Hence, the probability of winning the lottery is

\[ 1 \Big/ \binom{49}{6} \approx 1/(14 \cdot 10^6). \]

• If we put the ball back into the urn, and we do not care about the order, the number of all possible configurations is

\[ \binom{n+k-1}{k}. \]

Exercise for the willing: prove the last formula.
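The two unordered counts can also be checked numerically. The sketch below uses the illustrative values n = 5 and k = 3; it verifies the last formula for one case only and is not a proof.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement
from math import comb

n, k = 5, 3          # 5 balls, 3 turns (illustrative values)
balls = range(n)

# Unordered, without replacement: n choose k.
subsets = len(list(combinations(balls, k)))
# Unordered, with replacement: (n + k - 1) choose k.
multisets = len(list(combinations_with_replacement(balls, k)))

print(subsets, comb(n, k))
print(multisets, comb(n + k - 1, k))

# Probability of winning the lottery: one winning subset out of (49 choose 6).
lottery_win = Fraction(1, comb(49, 6))
print(lottery_win, float(lottery_win))
```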

Axioms of general probability

Toss a coin. In classical probability P(heads) = 1/2. But what about asymmetric coins?

Axioms of a probability (measure):

A probability (measure) P is an assignment of numbers to events such that
(i) P(A) ≥ 0 for all events A (probabilities are non-negative),
(ii) P(S) = 1 (the probability of the certain event is 1),
(iii) P(A ∪ B) = P(A) + P(B) for disjoint events A and B (probabilities add up on disjoint events).

• Classical probability satisfies all the axioms of a probability measure.
• For a toss of a coin, let P(heads) = 2/3, P(tails) = 1/3, P(∅) = 0, and P({heads, tails}) = 1. Then P satisfies the axioms above.
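For a sample space this small, the three axioms can be checked exhaustively over all events. The sketch below does this for the asymmetric coin; the names weights, P and all_events are illustrative, and P is built by summing the weights of the outcomes in an event.

```python
from fractions import Fraction
from itertools import combinations

S = frozenset({"heads", "tails"})
weights = {"heads": Fraction(2, 3), "tails": Fraction(1, 3)}

def P(event):
    """Probability of an event as the sum of its outcome weights."""
    return sum((weights[outcome] for outcome in event), Fraction(0))

def all_events(sample_space):
    """Every subset of the sample space, from the empty set to S itself."""
    items = list(sample_space)
    return [
        frozenset(subset)
        for size in range(len(items) + 1)
        for subset in combinations(items, size)
    ]

events = all_events(S)

axiom_i = all(P(A) >= 0 for A in events)
axiom_ii = P(S) == 1
axiom_iii = all(
    P(A | B) == P(A) + P(B)
    for A in events
    for B in events
    if A.isdisjoint(B)
)

print(axiom_i, axiom_ii, axiom_iii)
```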

Axioms of general probability

Important consequences of the axioms:
(a) 0 ≤ P(A) ≤ 1 for any event A.
(b) P(∅) = 0.
(c) P(A) = 1 − P(A^c) for any event A.
(d) P(A) ≤ P(B) if A ⊂ B.
(e) If A1, A2, . . . , An are pairwise disjoint, that is Ai ∩ Aj = ∅ for i ≠ j, then

\[ P(A_1 \cup A_2 \cup \ldots \cup A_n) = P(A_1) + P(A_2) + \ldots + P(A_n) = \sum_{i=1}^{n} P(A_i). \]

(f) P(A ∪ B) = P(A) + P(B) − P(A ∩ B) for all events A and B.
(g) P(A ∪ B) ≤ P(A) + P(B) for all events A and B.
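Some of these consequences can be spot-checked on a concrete example. The sketch below uses the classical probability on a die with the events A and B from the earlier slides; it illustrates (c), (d), (f) and (g) for one case and proves nothing in general.

```python
from fractions import Fraction

S = frozenset(range(1, 7))

def P(event):
    """Classical (uniform) probability on the die."""
    return Fraction(len(event), len(S))

A = frozenset({1, 2, 3, 4})
B = frozenset({2, 4, 5, 6})

complement_rule = P(A) == 1 - P(S - A)                    # (c)
monotonicity = P(A & B) <= P(A)                           # (d), since A ∩ B ⊂ A
inclusion_exclusion = P(A | B) == P(A) + P(B) - P(A & B)  # (f)
union_bound = P(A | B) <= P(A) + P(B)                     # (g)

print(complement_rule, monotonicity, inclusion_exclusion, union_bound)
```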

Axioms of general probability

Proof of:
(a) The lower bound P(A) ≥ 0 is axiom (i). Taking B = A^c in axiom (iii) and using axiom (ii) gives P(A) + P(A^c) = P(S) = 1. The upper bound P(A) ≤ 1 follows from this equality since, by axiom (i), P(A^c) ≥ 0.
(b) By axiom (iii), P(∅) = P(∅ ∪ ∅) = P(∅) + P(∅), since the empty event is disjoint from itself. Hence P(∅) = 0.
(c) It follows from axioms (ii) and (iii) by taking B = A^c.
(d) Since A ⊂ B, the events A and B \ A are disjoint and their union is B. By axioms (iii) and (i), P(B) = P(A) + P(B \ A) ≥ P(A).
(e) Apply axiom (iii) repeatedly (n − 1 times).
(f) The events A \ B, B \ A and A ∩ B are pairwise disjoint and their union is A ∪ B, so by (e), P(A ∪ B) = P(A \ B) + P(B \ A) + P(A ∩ B). By axiom (iii), P(A \ B) = P(A) − P(A ∩ B) and P(B \ A) = P(B) − P(A ∩ B). Substituting gives the claim.
(g) It follows from (f) and the fact that P(A ∩ B) ≥ 0.

Conditional probability

Main motivating question: Suppose that we know that an event occurs. What information does that give us about other events?

Suppose B has positive probability of occurrence, that is P(B) > 0. For any event A, the probability of A conditioned on B (or probability of A given B) is

\[ P(A \mid B) := \frac{P(A \cap B)}{P(B)}. \]

Throw a die. Let A = {1, 2, 3}. Then P(A) = 1/2 in classical probability, and

• P(A|{even}) = P(A ∩ {even})/P({even}) = (1/6)/(1/2) = 1/3. Since P(A|{even}) < P(A), the occurrence of event {even} has a negative influence on the probability of occurrence of A.
• P(A|{odd}) = P(A ∩ {odd})/P({odd}) = (2/6)/(1/2) = 2/3. Since P(A|{odd}) > P(A), the occurrence of event {odd} has a positive influence on the probability of occurrence of A.

P(A|B) and P(B|A) have different interpretations and usually take different values.
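The die example can be reproduced with a small helper that implements the definition directly. The function name conditional is an illustrative choice, and the final pair of values (with the event {2}, which is not on the slide) is added only to show P(A|B) and P(B|A) differing.

```python
from fractions import Fraction

S = set(range(1, 7))

def P(event):
    """Classical (uniform) probability on the die."""
    return Fraction(len(event), len(S))

def conditional(A, B):
    """P(A | B) = P(A ∩ B) / P(B); only defined when P(B) > 0."""
    if P(B) == 0:
        raise ValueError("conditioning event must have positive probability")
    return P(A & B) / P(B)

A = {1, 2, 3}
even = {2, 4, 6}
odd = {1, 3, 5}

print(P(A), conditional(A, even), conditional(A, odd))

# P(A | B) and P(B | A) generally differ, for example with B = {2}.
two = {2}
print(conditional(A, two), conditional(two, A))
```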

Independent events

Two events A and B are called independent if and only if

P(A ∩ B) = P(A)P(B).

If A and B are independent, and P(B) > 0, then

\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)P(B)}{P(B)} = P(A). \]

Hence, the occurrence of B does not influence the (probability of) occurrence of A.

• ∅ and S are always independent of all other events.
• Throw a die twice and call the outcomes ω1 and ω2. The events A1 = {ω1 = 1} and A2 = {ω2 = 6} are independent.
• The events A = {it is raining somewhere in New Zealand} and B = {it is raining in Gothenburg} are independent.
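The two-dice example can be verified by enumerating all 36 ordered outcomes under the classical probability. The event A3 below (sum at least 10) is not from the slides; it is added as an assumed contrast case of a dependent pair.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes (ω1, ω2) of two throws of a die.
S = set(product(range(1, 7), repeat=2))

def P(event):
    """Classical (uniform) probability on the 36 outcomes."""
    return Fraction(len(event), len(S))

def independent(A, B):
    """Check the defining identity P(A ∩ B) = P(A) P(B)."""
    return P(A & B) == P(A) * P(B)

A1 = {w for w in S if w[0] == 1}          # first throw is 1
A2 = {w for w in S if w[1] == 6}          # second throw is 6
A3 = {w for w in S if w[0] + w[1] >= 10}  # contrast case: sum is at least 10

print(independent(A1, A2))      # events about different throws
print(independent(A1, A3))      # A1 and A3 cannot occur together
print(independent(A1, set()), independent(A1, S))
```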

Total probability formula

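Let B be any event, and let A1, A2, . . . , An be pairwise disjoint events, that is Ai ∩ Aj = ∅ for i ≠ j, such that

A1 ∪ A2 ∪ . . . ∪ An = S.

Figure: the sample space S partitioned into A1, A2, . . . , An, with the event B overlapping the parts.

Then,

\[ P(B) = \sum_{i=1}^{n} P(B \mid A_i) P(A_i). \]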

Total probability formula

Proof.

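\[
\begin{aligned}
P(B) &= P(B \cap S) \\
     &= P(B \cap (A_1 \cup A_2 \cup \ldots \cup A_n)) \\
     &= P((B \cap A_1) \cup (B \cap A_2) \cup \ldots \cup (B \cap A_n)) \\
     &= \sum_{i=1}^{n} P(B \cap A_i) \\
     &= \sum_{i=1}^{n} P(B \mid A_i) P(A_i).
\end{aligned}
\]

The fourth equality holds because the events B ∩ Ai are pairwise disjoint, and the last one is the definition of conditional probability (assuming P(Ai) > 0 for each i).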
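The formula can be checked numerically on a die. The partition {1}, {2, 3}, {4, 5, 6} and the event B = {even} below are arbitrary illustrative choices, not taken from the slides.

```python
from fractions import Fraction

S = set(range(1, 7))

def P(event):
    """Classical (uniform) probability on the die."""
    return Fraction(len(event), len(S))

def conditional(A, B):
    """P(A | B) for an event B with positive probability."""
    return P(A & B) / P(B)

# A partition of S into pairwise disjoint events with positive probability.
partition = [{1}, {2, 3}, {4, 5, 6}]
B = {2, 4, 6}

# Right-hand side of the total probability formula.
total = sum(conditional(B, A_i) * P(A_i) for A_i in partition)

print(total, P(B))
```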

Bayes’ theorem

With B such that P(B) > 0, and A1, A2, . . . , An as before, we have for any i ∈ {1, 2, . . . , n},

\[ P(A_i \mid B) = \frac{P(B \mid A_i) P(A_i)}{\sum_{j=1}^{n} P(B \mid A_j) P(A_j)}. \]

The probability of having a certain disease is 10%. Patients with the disease test positive with probability 95%. Patients without the disease test positive with probability 2%. Given a positive test result, what is the probability that the patient has the disease? Let

A1 = {patient has the disease},
A2 = A1^c = {patient does not have the disease},
B = {patient tests positive}.

We have

\[ P(A_1 \mid B) = \frac{P(B \mid A_1) P(A_1)}{P(B \mid A_1) P(A_1) + P(B \mid A_2) P(A_2)} = \frac{0.95 \cdot 0.10}{0.95 \cdot 0.10 + 0.02 \cdot (1 - 0.10)} \approx 0.84. \]
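The disease example can be recomputed with exact fractions. The function name bayes_posterior and its list-based interface are illustrative assumptions; the numbers are those of the example above.

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods, i):
    """Return P(A_i | B) given the lists P(A_j) and P(B | A_j)."""
    # Denominator: P(B) by the total probability formula.
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    return likelihoods[i] * priors[i] / evidence

# A1 = patient has the disease, A2 = patient does not have the disease.
priors = [Fraction(10, 100), Fraction(90, 100)]
# P(positive | A1) and P(positive | A2).
likelihoods = [Fraction(95, 100), Fraction(2, 100)]

posterior = bayes_posterior(priors, likelihoods, 0)
print(posterior, round(float(posterior), 4))  # 95/113 0.8407
```

Although the test is quite accurate, the posterior stays well below 95% because most patients do not have the disease, so false positives make up a noticeable share of all positive results.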
