
CS 109 Lecture 11

April 20th, 2016

Four Prototypical Trajectories

Review

• X is a Normal Random Variable: X ~ N(µ, σ²)
§ Probability Density Function (PDF):

f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)}, \quad \text{where } -\infty < x < \infty

§ E[X] = \mu
§ Var(X) = \sigma^2
§ Also called "Gaussian"
§ Note: f(x) is symmetric about µ

[Plot: f(x) vs. x, a bell curve centered at µ]

The Normal Distribution

[Plot: probability density vs. value for a normal with mean µ and variance σ²]

Simplicity is Humble

* A Gaussian maximizes entropy for a given mean and variance

Anatomy of a beautiful equation

f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)} \quad \text{for } X \sim N(\mu, \sigma^2)

Annotations: f(x) is the probability density at x; (x − µ) is the distance to the mean; \sqrt{2\pi} is a constant; e^{\,\cdot\,} is the "exponential"; sigma shows up twice.

And here we are

Table of Φ(z) values in textbook, p. 201 and handout

F(x) = \Phi\!\left(\frac{x - \mu}{\sigma}\right)

The cumulative distribution function (CDF) of any normal N(µ, σ²) can be written using Φ, the CDF of the Standard Normal: a function that has been solved for numerically.
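As a small sketch of this idea (assuming Python with scipy; the parameter values are illustrative, not from the slides): standardize, then call the tabulated standard normal CDF.

```python
from scipy.stats import norm

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), via the standard normal CDF Phi."""
    z = (x - mu) / sigma        # standardize: Z = (X - mu) / sigma
    return norm.cdf(z)          # Phi(z), solved for numerically

# Illustrative check: X ~ N(3, 16), P(X <= 5)
print(normal_cdf(5, mu=3, sigma=4))        # same as norm.cdf(5, loc=3, scale=4)
```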

Symmetry of Phi

Φ(−a) = 1 − Φ(a)

[Plot: standard normal density (µ = 0); the area Φ(−a) to the left of −a equals the area 1 − Φ(a) to the right of a]

Interval of Phi

Φ(d) − Φ(c) = P(c < Z < d)

[Plot: standard normal density (µ = 0) with points c < d; the area between them is Φ(d) − Φ(c)]

Four Prototypical Trajectories

Great in class questions

Four Prototypical Trajectories

68% rule only for Gaussians?

68% Rule?

Only applies to normal

What is the probability that a normal variable has a value within one standard deviation of its mean? X ~ N(µ, σ²)

P(\mu - \sigma < X < \mu + \sigma) = P\!\left(\frac{\mu - \sigma - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{\mu + \sigma - \mu}{\sigma}\right)
= P(-1 < Z < 1)
= \Phi(1) - \Phi(-1)
= \Phi(1) - [1 - \Phi(1)]
= 2\Phi(1) - 1
= 2(0.8413) - 1 \approx 0.683
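A minimal numerical check of this derivation (a sketch assuming Python with scipy; the µ and σ values are arbitrary):

```python
from scipy.stats import norm

# P(mu - sigma < X < mu + sigma) = 2*Phi(1) - 1, for any mu, sigma
print(2 * norm.cdf(1) - 1)   # ~0.6827

# Same answer computed directly, with illustrative mu=10, sigma=3
mu, sigma = 10, 3
print(norm.cdf(mu + sigma, mu, sigma) - norm.cdf(mu - sigma, mu, sigma))
```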

68% Rule? Counter example: Uniform, X ~ Uni(α, β)

Var(X) = \frac{(\beta - \alpha)^2}{12} \qquad \sigma = \sqrt{Var(X)} = \frac{\beta - \alpha}{\sqrt{12}}

P(\mu - \sigma < X < \mu + \sigma) = 2\sigma \cdot \frac{1}{\beta - \alpha} = \frac{1}{\beta - \alpha} \cdot \frac{2(\beta - \alpha)}{\sqrt{12}} = \frac{2}{\sqrt{12}} \approx 0.58

[Plot: uniform density of height 1/(β − α) on (α, β), with the width-2σ interval around the mean shaded]
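And the same check for the uniform counterexample (a sketch; the endpoints α and β are arbitrary):

```python
import math
from scipy.stats import uniform

alpha, beta = 2.0, 8.0                          # illustrative endpoints
mu = (alpha + beta) / 2
sigma = (beta - alpha) / math.sqrt(12)

U = uniform(loc=alpha, scale=beta - alpha)      # X ~ Uni(alpha, beta)
print(U.cdf(mu + sigma) - U.cdf(mu - sigma))    # ~0.577, not 0.68
print(2 / math.sqrt(12))                        # matches the closed form
```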

Four Prototypical Trajectories

How do you sample from a Gaussian?

How Does a Computer Sample Normal?

Further reading: Box–Muller transform

Inverse Transform Sampling

[Plot: Φ(x), the CDF of the Standard Normal, rising from 0 to 1 over x from −5 to 5]

Pick a uniform number y between 0 and 1. Find the x such that Φ(x) = y.
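A sketch of inverse transform sampling in Python: scipy's norm.ppf plays the role of Φ⁻¹ here (Box–Muller, mentioned above, is the common alternative).

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
y = rng.uniform(0, 1, size=100_000)   # pick uniform numbers y between 0 and 1
x = norm.ppf(y)                       # find x with Phi(x) = y (inverse CDF)

print(x.mean(), x.std())              # ~0 and ~1: standard normal samples
# For N(mu, sigma^2), use mu + sigma * x
```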

Four Prototypical Trajectories

Where we left off…

Normal Approximates Binomial

There is a deep reason for the Binomial/Normal approximation…

• Stanford accepts 2480 students
§ Each accepted student has a 68% chance of attending
§ X = # students who will attend. X ~ Bin(2480, 0.68)
§ What is P(X > 1745)?

§ Use Normal approximation: Y ~ N(1686.4, 539.65)

np = 2480(0.68) = 1686.4 \qquad \sqrt{np(1-p)} = \sqrt{539.65} \approx 23.23

P(X > 1745) \approx P(Y > 1745.5) = P\!\left(\frac{Y - 1686.4}{23.23} > \frac{1745.5 - 1686.4}{23.23}\right) = 1 - \Phi(2.54) \approx 0.0055 \quad \text{(using continuity correction)}

§ Using Binomial: P(X > 1745) \approx 0.0053

Stanford Admissions
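A sketch checking both numbers with scipy (the exact binomial tail and the normal approximation with continuity correction):

```python
from math import sqrt
from scipy.stats import binom, norm

n, p = 2480, 0.68
mu, sigma = n * p, sqrt(n * p * (1 - p))   # 1686.4 and ~23.23

print(binom.sf(1745, n, p))                # exact P(X > 1745) ~ 0.0053
print(norm.sf(1745.5, mu, sigma))          # normal approximation ~ 0.0055
```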

• Stanford Daily, March 28, 2014: "Class of 2018 Admit Rates Lowest in University History" by Alex Zivkovic

"Fewer students were admitted to the Class of 2018 than the Class of 2017, due to the increase in Stanford's yield rate which has increased over 5 percent in the past four years, according to Colleen Lim M.A. '80, Director of Undergraduate Admission."

Changes in Stanford Admissions

Yield: 68% 10 years ago, 80% last year

Four Prototypical Trajectories

Next distribution: Exponential

• X is an Exponential RV: X ~ Exp(λ), Rate: λ > 0
§ Probability Density Function (PDF):

f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \geq 0 \\ 0 & \text{if } x < 0 \end{cases}

§ E[X] = \frac{1}{\lambda}
§ Var(X) = \frac{1}{\lambda^2}
§ Cumulative distribution function (CDF), F(x) = P(X ≤ x):

F(x) = 1 - e^{-\lambda x}, \quad \text{where } x \geq 0

§ Represents time until some event
o Earthquake, request to web server, end cell phone contract, etc.

[Plot: f(x) vs. x, decaying from λ at x = 0]

Exponential Random Variable
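A small sketch of these formulas (assuming Python with numpy; the rate λ is illustrative):

```python
import numpy as np

lam = 0.5                                    # illustrative rate lambda

def exp_pdf(x):
    # f(x) = lam * e^(-lam * x) for x >= 0, else 0
    return np.where(x >= 0, lam * np.exp(-lam * x), 0.0)

def exp_cdf(x):
    # F(x) = 1 - e^(-lam * x) for x >= 0
    return np.where(x >= 0, 1 - np.exp(-lam * x), 0.0)

print(exp_pdf(0.0))                          # density at 0 is lambda = 0.5
print(exp_cdf(2.0))                          # P(X <= 2) = 1 - e^{-1} ~ 0.632

rng = np.random.default_rng(0)
samples = rng.exponential(scale=1/lam, size=100_000)  # numpy's scale = 1/lambda
print(samples.mean(), samples.var())         # ~E[X] = 2 and ~Var(X) = 4
```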

• X = time until some event occurs
§ X ~ Exp(λ)
§ What is P(X > s + t | X > s)?

P(X > s + t \mid X > s) = \frac{P(X > s + t \text{ and } X > s)}{P(X > s)} = \frac{P(X > s + t)}{P(X > s)}

\frac{P(X > s + t)}{P(X > s)} = \frac{1 - F(s + t)}{1 - F(s)} = \frac{e^{-\lambda(s + t)}}{e^{-\lambda s}} = e^{-\lambda t} = 1 - F(t) = P(X > t)

So, P(X > s + t | X > s) = P(X > t)

§ After an initial period of time s, the probability of waiting another t units of time until the event is the same as at the start
§ "Memoryless" = no impact from the preceding period s

Exponential is Memoryless
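A Monte Carlo sanity check of memorylessness (a sketch; λ, s, and t are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
lam, s, t = 0.5, 3.0, 2.0
x = rng.exponential(scale=1/lam, size=1_000_000)

lhs = (x > s + t).sum() / (x > s).sum()   # estimates P(X > s+t | X > s)
rhs = (x > t).mean()                      # estimates P(X > t)
print(lhs, rhs, np.exp(-lam * t))         # all ~0.368
```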

Four Prototypical Trajectories

E[X] and Var(X) for exponential

• Product rule for derivatives:

d(u \cdot v) = v \cdot du + u \cdot dv

• Derivative and integral of exponential:

\frac{d(e^u)}{dx} = e^u \frac{du}{dx} \qquad \int e^u \, du = e^u

• Integration by parts:

u \cdot v = \int d(u \cdot v) = \int v \cdot du + \int u \cdot dv, \quad \text{so} \quad \int u \cdot dv = u \cdot v - \int v \cdot du

A Little Calculus Review

• Compute n-th moment of Exponential distribution

E[X^n] = \int_0^\infty x^n \lambda e^{-\lambda x} \, dx

§ Step 1: don't panic, think happy thoughts, recall...
§ Step 2: find u and v (and du and dv):

u = x^n \qquad dv = \lambda e^{-\lambda x} dx \qquad du = n x^{n-1} dx \qquad v = -e^{-\lambda x}

§ Step 3: substitute (a.k.a. "plug and chug")

\int u \cdot dv = u \cdot v - \int v \cdot du: \quad \int x^n \lambda e^{-\lambda x} dx = -x^n e^{-\lambda x} + \int n x^{n-1} e^{-\lambda x} dx

E[X^n] = \left[-x^n e^{-\lambda x}\right]_0^\infty + \int_0^\infty n x^{n-1} e^{-\lambda x} dx = 0 + \frac{n}{\lambda} \int_0^\infty x^{n-1} \lambda e^{-\lambda x} dx = \frac{n}{\lambda} E[X^{n-1}]

Base case: E[X^0] = E[1] = 1, so E[X^1] = \frac{1}{\lambda}, \; E[X^2] = \frac{2}{\lambda} E[X^1] = \frac{2}{\lambda^2}, \; ...

And Now, Calculus Practice
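A quick numerical check of the recursion (a sketch using scipy's numerical integration; the rate is arbitrary):

```python
import numpy as np
from scipy.integrate import quad

lam = 1.5                                     # illustrative rate

def nth_moment(n):
    # E[X^n] = integral of x^n * lam * e^(-lam x) over [0, inf)
    val, _ = quad(lambda x: x**n * lam * np.exp(-lam * x), 0, np.inf)
    return val

print(nth_moment(1), 1 / lam)                 # E[X]   = 1/lambda
print(nth_moment(2), 2 / lam**2)              # E[X^2] = 2/lambda^2
```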

• Say a visitor to your web site leaves after X minutes
§ On average, visitors leave the site after 5 minutes
§ Assume length of stay is Exponentially distributed
§ X ~ Exp(λ = 1/5), since E[X] = 1/λ = 5
§ What is P(X > 10)?

P(X > 10) = 1 - F(10) = 1 - (1 - e^{-10\lambda}) = e^{-2} \approx 0.1353

§ What is P(10 < X < 20)?

P(10 < X < 20) = F(20) - F(10) = (1 - e^{-4}) - (1 - e^{-2}) = e^{-2} - e^{-4} \approx 0.1170

Visits to a Website
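The same two numbers as a sketch in Python:

```python
import math

lam = 1 / 5                                   # E[X] = 1/lambda = 5 minutes
F = lambda x: 1 - math.exp(-lam * x)          # exponential CDF

print(1 - F(10))                              # P(X > 10)      ~ 0.1353
print(F(20) - F(10))                          # P(10 < X < 20) ~ 0.1170
```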

• X = # hours of use until your laptop dies
§ On average, laptops die after 5000 hours of use
§ X ~ Exp(λ = 1/5000), since E[X] = 1/λ = 5000
§ You use your laptop 5 hours/day.
§ What is P(your laptop lasts 4 years)?
§ That is: P(X > (5)(365)(4) = 7300)

P(X > 7300) = 1 - F(7300) = 1 - (1 - e^{-7300/5000}) = e^{-1.46} \approx 0.2322

§ Better plan ahead... especially if you are coterming:

P(X > 9125) = 1 - F(9125) = e^{-1.825} \approx 0.1612 \quad \text{(5 year plan)}
P(X > 10950) = 1 - F(10950) = e^{-2.19} \approx 0.1119 \quad \text{(6 year plan)}

Replacing Your Laptop

Continuous Random Variables

Uniform Random Variable: X ~ Uni(α, β). All values of x between alpha and beta are equally likely.

Normal Random Variable: X ~ N(µ, σ²). Aka Gaussian. Defined by mean and variance. Goldilocks distribution.

Exponential Random Variable: X ~ Exp(λ). Time until an event happens. Parameterized by lambda (same as Poisson).

Alpha Beta Random Variable: How mysterious and curious. You must wait a few classes :)

Four Prototypical Trajectories

Joint Distributions

Four Prototypical Trajectories

Events occur with other events

• For two discrete random variables X and Y, the Joint Probability Mass Function is:

p_{X,Y}(a, b) = P(X = a, Y = b)

• Marginal distributions:

p_X(a) = P(X = a) = \sum_y p_{X,Y}(a, y) \qquad p_Y(b) = P(Y = b) = \sum_x p_{X,Y}(x, b)

• Example: X = value of die D1, Y = value of die D2

P(X = 1) = \sum_y p_{X,Y}(1, y) = 6 \cdot \frac{1}{36} = \frac{1}{6}

Discrete Joint Mass Function
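A sketch of the two-dice example, marginalizing the joint PMF in Python:

```python
from fractions import Fraction

# Joint PMF of two fair dice: p(a, b) = 1/36 for every pair
joint = {(a, b): Fraction(1, 36) for a in range(1, 7) for b in range(1, 7)}

# Marginal: p_X(1) = sum over y of p(1, y)
p_X1 = sum(p for (a, b), p in joint.items() if a == 1)
print(p_X1)   # 1/6
```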

Probability Table
• States all possible outcomes with several discrete variables
• Often is not "parametric"
• If # variables is > 2, you can have a probability table, but you can't draw it on a slide

[Table: rows are all values a of A, columns are all values b of B; cell (a, b) holds P(A = a, B = b)]

Remember “,” means “and”

Every outcome falls into a bucket

It’s Complicated Demo

Go to this URL: https://goo.gl/ZNRsqD

• Consider households in Silicon Valley
§ A household has C computers: C = X Macs + Y PCs
§ Assume each computer is equally likely to be a Mac or PC

P(C = c) = \begin{cases} 0.16 & c = 0 \\ 0.24 & c = 1 \\ 0.28 & c = 2 \\ 0.32 & c = 3 \end{cases}

Joint PMF table (rows: Y = # PCs; columns: X = # Macs):

Y \ X        0      1      2      3    p_Y(y)
0         0.16   0.12   0.07   0.04     0.39
1         0.12   0.14   0.12   0        0.38
2         0.07   0.12   0      0        0.19
3         0.04   0      0      0        0.04
p_X(x)    0.39   0.38   0.19   0.04     1.00

Marginal distributions: the row and column sums p_Y(y) and p_X(x).

A Computer (or Three) In Every House
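A sketch computing those marginals from the joint table with numpy:

```python
import numpy as np

# joint[y][x] = P(X = x Macs, Y = y PCs), from the table above
joint = np.array([
    [0.16, 0.12, 0.07, 0.04],
    [0.12, 0.14, 0.12, 0.00],
    [0.07, 0.12, 0.00, 0.00],
    [0.04, 0.00, 0.00, 0.00],
])

print(joint.sum(axis=0))   # p_X: [0.39 0.38 0.19 0.04] (column sums)
print(joint.sum(axis=1))   # p_Y: [0.39 0.38 0.19 0.04] (row sums)
print(joint.sum())         # 1.0: every outcome falls into a bucket
```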

• This is a joint
• A joint is not a mathematician
§ It did not start doing mathematics at an early age
§ It is not the reason we have "joint distributions"
§ And, no, Charlie Sheen does not look like a joint
o But he does have them…
o He also has joint custody of his children with Denise Richards

Joint

Four Prototypical Trajectories

What about the continuous world?

• Random variables X and Y are Jointly Continuous if there exists a PDF f_{X,Y}(x, y) defined over −∞ < x, y < ∞ such that:

P(a_1 < X \leq a_2, \; b_1 < Y \leq b_2) = \int_{a_1}^{a_2} \int_{b_1}^{b_2} f_{X,Y}(x, y) \, dy \, dx

Jointly Continuous

Let's look at one:

Jointly Continuous

P(a_1 < X \leq a_2, \; b_1 < Y \leq b_2) = \int_{a_1}^{a_2} \int_{b_1}^{b_2} f_{X,Y}(x, y) \, dy \, dx

[Plot: surface f_{X,Y}(x, y) over the (x, y) plane; the probability is the volume above the rectangle (a₁, a₂] × (b₁, b₂]]

• Cumulative Distribution Function (CDF):

F_{X,Y}(a, b) = \int_{-\infty}^{a} \int_{-\infty}^{b} f_{X,Y}(x, y) \, dy \, dx \qquad f_{X,Y}(a, b) = \frac{\partial^2}{\partial a \, \partial b} F_{X,Y}(a, b)

• Marginal density functions:

f_X(a) = \int_{-\infty}^{\infty} f_{X,Y}(a, y) \, dy \qquad f_Y(b) = \int_{-\infty}^{\infty} f_{X,Y}(x, b) \, dx

Jointly Continuous

• For two continuous random variables X and Y, the Joint Cumulative Probability Distribution is:

F(a, b) = F_{X,Y}(a, b) = P(X \leq a, Y \leq b), \quad \text{where } -\infty < a, b < \infty

• Marginal distributions:

F_X(a) = P(X \leq a) = P(X \leq a, Y < \infty) = F_{X,Y}(a, \infty)
F_Y(b) = P(Y \leq b) = P(X < \infty, Y \leq b) = F_{X,Y}(\infty, b)

Continuous Joint Distribution Functions

Joint Dart Distribution

Darts!

[Plot: dart hits on a 900 × 900 pixel grid, with the X-Pixel Marginal and Y-Pixel Marginal histograms alongside]

X \sim N\!\left(\frac{900}{2}, \frac{900}{2}\right) \qquad Y \sim N\!\left(\frac{900}{3}, \frac{900}{5}\right)

• Let X and Y be two continuous random variables
§ where 0 ≤ X ≤ 1 and 0 ≤ Y ≤ 2
• We want to integrate g(x, y) = xy w.r.t. X and Y:
§ First, do the "innermost" integral (treat y as a constant):

\int_{y=0}^{2} \int_{x=0}^{1} xy \, dx \, dy = \int_{y=0}^{2} \left( \int_{x=0}^{1} xy \, dx \right) dy = \int_{y=0}^{2} \left[ \frac{x^2 y}{2} \right]_{x=0}^{1} dy = \int_{y=0}^{2} \frac{y}{2} \, dy

§ Then, evaluate the remaining (single) integral:

\int_{y=0}^{2} \frac{y}{2} \, dy = \left[ \frac{y^2}{4} \right]_{0}^{2} = \frac{4}{4} - 0 = 1

Multiple Integrals Without Tears
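The same computation as a sketch with scipy's dblquad:

```python
from scipy.integrate import dblquad

# dblquad integrates func(y, x), inner variable first:
# outer x from 0 to 1, inner y from 0 to 2.
val, err = dblquad(lambda y, x: x * y, 0, 1, lambda x: 0, lambda x: 2)
print(val)   # 1.0
```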

Let F_{X,Y}(x, y) be the joint CDF for X and Y

P(X > a, Y > b) = 1 - P((X > a, Y > b)^c)
= 1 - P((X > a)^c \cup (Y > b)^c)
= 1 - P((X \leq a) \cup (Y \leq b))
= 1 - [P(X \leq a) + P(Y \leq b) - P(X \leq a, Y \leq b)]
= 1 - F_X(a) - F_Y(b) + F_{X,Y}(a, b)

Computing Joint Probabilities

P(a_1 < X \leq a_2, \; b_1 < Y \leq b_2) = F(a_2, b_2) - F(a_1, b_2) - F(a_2, b_1) + F(a_1, b_1)

[Plot: the rectangle (a₁, a₂] × (b₁, b₂] in the plane, illustrating the inclusion-exclusion over the four CDF corners]

The General Rule Given Joint CDF
Let F_{X,Y}(x, y) be the joint CDF for X and Y

• Y is a non-negative continuous random variable
§ Probability Density Function: f_Y(y)
§ Already knew that:

E[Y] = \int_{-\infty}^{\infty} y \, f_Y(y) \, dy

§ But, did you know that:

E[Y] = \int_{0}^{\infty} P(Y > y) \, dy \quad ?!?

§ Analogously, in the discrete case, where X = 1, 2, …, n:

E[X] = \sum_{i=1}^{n} P(X \geq i)

Lovely Lemma
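A quick sketch verifying the discrete version on a fair die roll:

```python
from fractions import Fraction

# X = fair die roll, taking values 1..6
pmf = {i: Fraction(1, 6) for i in range(1, 7)}

expectation = sum(i * p for i, p in pmf.items())         # E[X] = 7/2
tail_sum = sum(sum(p for j, p in pmf.items() if j >= i)  # sum of P(X >= i)
               for i in range(1, 7))
print(expectation, tail_sum)   # both 7/2
```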

In the discrete case, where X = 1, 2, …, n:

E[X] = \sum_{i=1}^{n} P(X \geq i)

How this lemma was made

Each row below is an expansion of P(X ≥ i):

P(X ≥ 1) = P(X = 1) + P(X = 2) + P(X = 3) + ··· + P(X = n)
P(X ≥ 2) =            P(X = 2) + P(X = 3) + ··· + P(X = n)
P(X ≥ 3) =                       P(X = 3) + ··· + P(X = n)
  ⋮
P(X ≥ n) =                                        P(X = n)

Adding the rows, each P(X = i) appears in exactly i of them, so

\sum_{i=1}^{n} P(X \geq i) = 1 \cdot P(X = 1) + 2 \cdot P(X = 2) + \cdots + n \cdot P(X = n) = E[X]

Four Prototypical Trajectories

Life gives you lemmas, make lemmanade!

• Disk surface is a circle of radius R
§ A single point imperfection uniformly distributed on disk

f_{X,Y}(x, y) = \begin{cases} \frac{1}{\pi R^2} & \text{if } x^2 + y^2 \leq R^2 \\ 0 & \text{if } x^2 + y^2 > R^2 \end{cases} \quad \text{where } -\infty < x, y < \infty

Imperfections on a Disk

f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y) \, dy = \frac{1}{\pi R^2} \int_{-\sqrt{R^2 - x^2}}^{\sqrt{R^2 - x^2}} dy = \frac{2\sqrt{R^2 - x^2}}{\pi R^2}

Only integrate over the support range (the y with x² + y² ≤ R²). The marginal of Y is the same by symmetry.

• Disk surface is a circle of radius R
§ A single point imperfection uniformly distributed on disk
§ Distance to origin: D = \sqrt{X^2 + Y^2}
§ What is E[D]?

P(D \leq a) = \frac{\pi a^2}{\pi R^2} = \frac{a^2}{R^2}

Imperfections on a Disk

[Plot: disk of radius R with an inner circle of radius a; P(D ≤ a) is the ratio of areas, because of equally likely outcomes]

Using the lovely lemma (and noting P(D > a) = 0 for a > R):

E[D] = \int_0^R P(D > a) \, da = \int_0^R 1 - P(D \leq a) \, da = \int_0^R 1 - \frac{a^2}{R^2} \, da = \left[ a - \frac{a^3}{3R^2} \right]_0^R = R - \frac{R}{3} = \frac{2R}{3}
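A Monte Carlo sanity check of E[D] = 2R/3 (a sketch; R = 1 here, sampling uniformly on the disk by rejection from the bounding square):

```python
import numpy as np

rng = np.random.default_rng(0)
R, n = 1.0, 1_000_000

# Sample points uniformly in the square, keep those inside the disk
x = rng.uniform(-R, R, n)
y = rng.uniform(-R, R, n)
inside = x**2 + y**2 <= R**2

d = np.sqrt(x[inside]**2 + y[inside]**2)   # distance to origin
print(d.mean(), 2 * R / 3)                 # both ~0.667
```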