Essential Intuitive Statistics for Experimentation

Intuitive Statistics

Matt Gardner

False Positives

Let's pretend T - C = 0

False Negatives

Let's pretend T - C = d
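The two "pretend" scenarios can be tried out with a small simulation. The sketch below is only an illustration under assumptions that are not in the slides: normally distributed measures with standard deviation 1, 200 units per group, a two-sided z-test at alpha = 0.05, and an arbitrary d = 0.3. The function names are hypothetical.

```python
import random
from statistics import NormalDist


def mean_and_variance(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


def z_test_rejects(control, test, alpha=0.05):
    """Two-sided z-test: True if we declare that T - C differs from 0."""
    mean_c, var_c = mean_and_variance(control)
    mean_t, var_t = mean_and_variance(test)
    se = (var_t / len(test) + var_c / len(control)) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(mean_t - mean_c) > z_crit * se


def rejection_rate(true_diff, n=200, runs=1000, alpha=0.05, seed=0):
    """Fraction of simulated experiments that declare a difference."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(runs):
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        test = [rng.gauss(true_diff, 1.0) for _ in range(n)]
        rejections += z_test_rejects(control, test, alpha)
    return rejections / runs


# Pretend T - C = 0: every declared difference is a false positive.
false_positive_rate = rejection_rate(true_diff=0.0)

# Pretend T - C = d (assumed d = 0.3): every miss is a false negative.
false_negative_rate = 1 - rejection_rate(true_diff=0.3)

print(f"false positive rate: {false_positive_rate:.3f}")
print(f"false negative rate: {false_negative_rate:.3f}")
```

Under these assumptions the false positive rate should land near the chosen alpha, while the false negative rate depends on d, the noise, and the sample size.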

(x, se)   (x ± z × se)
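Reading the slide notation as an estimate x of T - C with standard error se, and an interval of x plus or minus z times se, the interval can be sketched as follows. The function name, the 95% confidence level, and the example numbers are assumptions for illustration only.

```python
from statistics import NormalDist


def confidence_interval(x, se, confidence=0.95):
    """Normal-approximation interval: x - z*se to x + z*se."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return x - z * se, x + z * se


# Hypothetical observed difference T - C and its standard error.
low, high = confidence_interval(x=0.5, se=0.2)

# If the interval excludes 0 we would declare a difference.
significant = not (low <= 0.0 <= high)
print(f"interval: ({low:.3f}, {high:.3f}), significant: {significant}")
```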

Mean
Proportion
Simple Sample Size Calculator here.
x duration

Common mistakes:
Inputs can change through time
Underestimating lift is safer than overestimating
Think carefully about the choice of power: is it high enough?
Multiple metrics: choose the highest traffic requirement
Large sample size > extend duration > bundle features > alternative metric

Inputs for calculation

Sum - of measure over all units
Count - of analysis units
Standard deviation - of measure over all units *
Relative lift - in average measure, test vs. control
Alpha - false positive rate
Beta - false negative rate

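* For proportion metrics the per-unit standard deviation is sd = √(p(1-p)); the standard deviation of the sum over n units is √(n·p·(1-p)).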
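As a rough sketch of how these inputs combine, the code below uses the common two-sample normal-approximation formula, n per group = 2 (z_alpha + z_beta)² sd² / delta², where delta is the absolute difference implied by the relative lift. This is an assumption about the method, not necessarily what the linked calculator does, and the function name and example numbers are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(total, count, sd, relative_lift, alpha=0.05, beta=0.2):
    """Units needed in each of test and control (two-sided test).

    total: sum of the measure over all units
    count: number of analysis units
    sd: per-unit standard deviation of the measure
    relative_lift: expected relative change in the average, test vs. control
    alpha: false positive rate; beta: false negative rate (power = 1 - beta)
    """
    mean = total / count
    delta = mean * relative_lift  # absolute difference we want to detect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


# Hypothetical proportion metric: 500 conversions out of 10,000 units.
p = 500 / 10_000
n = sample_size_per_group(
    total=500,
    count=10_000,
    sd=sqrt(p * (1 - p)),  # per-unit sd for a proportion metric
    relative_lift=0.10,
)
print(n)
```

Halving the assumed lift roughly quadruples the requirement, which is one way to see why underestimating lift is the safer mistake.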

Experiment results are subject to randomness, and conclusions will sometimes be in error.

We choose the false positive and false negative error rates at experiment design time.

We know in advance whether the experiment is likely to be useful, and we should think carefully before running experiments: they are expensive!

Always compute sample size!