
Statistical Learning: Bayesian and ML
COMP155
Sections 20.1-20.2, May 2, 2007

Definitions
• a posteriori: derived from observed facts
• a priori: based on hypothesis or theory rather than experiment

Bayesian Learning
• Make predictions using all hypotheses, weighted by their probabilities
• Bayes' rule: P(a | b) = α P(b | a) P(a)
• For each hypothesis h_i and observed data d:
  P(h_i | d) = α P(d | h_i) P(h_i)
• P(d | h_i) is the likelihood of d under hypothesis h_i
• P(h_i) is the hypothesis prior
• α is a normalization constant: α = 1 / Σ_i P(d | h_i) P(h_i)

Bayesian Learning
• We want to predict some quantity X:
  P(X | d) = Σ_i P(X | d, h_i) P(h_i | d) = Σ_i P(X | h_i) P(h_i | d)
  (the second step assumes each hypothesis determines its own distribution over X, so X is independent of d given h_i)
• The predictions are weighted averages over the predictions of the individual hypotheses
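As a concrete illustration of these two formulas, here is a minimal Python sketch with made-up priors, likelihoods, and per-hypothesis predictions (all numbers below are hypothetical; the candy example that follows instantiates them properly):

```python
# Hypothetical priors P(h_i), likelihoods P(d | h_i), and per-hypothesis
# predictions P(X | h_i) for three hypotheses (numbers are illustrative only).
prior       = {"h1": 0.3, "h2": 0.5, "h3": 0.2}
likelihood  = {"h1": 0.1, "h2": 0.4, "h3": 0.7}
p_x_given_h = {"h1": 0.9, "h2": 0.5, "h3": 0.2}

# Bayes' rule: P(h_i | d) = alpha * P(d | h_i) * P(h_i)
unnormalized = {h: likelihood[h] * prior[h] for h in prior}
alpha = 1.0 / sum(unnormalized.values())
posterior = {h: alpha * u for h, u in unnormalized.items()}

# Weighted-average prediction: P(X | d) = sum_i P(X | h_i) * P(h_i | d)
p_x_given_d = sum(p_x_given_h[h] * posterior[h] for h in posterior)
print(posterior, p_x_given_d)
```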

Example
• Suppose we know that there are 5 kinds of bags of candy:

            cherry   lime   % of all bags
  Type 1     100%      0%        10%
  Type 2      75%     25%        20%
  Type 3      50%     50%        40%
  Type 4      25%     75%        20%
  Type 5       0%    100%        10%

Example: priors
• Given a new bag of candy, predict the type of the bag
• Five hypotheses:
  • h1: bag is type 1, P(h1) = 0.1
  • h2: bag is type 2, P(h2) = 0.2
  • h3: bag is type 3, P(h3) = 0.4
  • h4: bag is type 4, P(h4) = 0.2
  • h5: bag is type 5, P(h5) = 0.1
• With no evidence, we use the hypothesis priors

Example: one lime candy
• Suppose we unwrap one candy and find that it is lime.
• Here α = 1 / Σ_i P(lime | h_i) P(h_i) = 1 / 0.5 = 2
• P(h1 | 1 lime) = α P(lime | h1) P(h1) = 2 × (0 × 0.1) = 0
• P(h2 | 1 lime) = α P(lime | h2) P(h2) = 2 × (0.25 × 0.2) = 0.1
• P(h3 | 1 lime) = α P(lime | h3) P(h3) = 2 × (0.5 × 0.4) = 0.4
• P(h4 | 1 lime) = α P(lime | h4) P(h4) = 2 × (0.75 × 0.2) = 0.3
• P(h5 | 1 lime) = α P(lime | h5) P(h5) = 2 × (1.0 × 0.1) = 0.2
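A minimal Python sketch of this update (the variable names are mine, not from the slides); it reproduces the posteriors above:

```python
# lime_prob[i] = P(lime | h_{i+1}) for the five bag types; prior[i] = P(h_{i+1}).
lime_prob = [0.0, 0.25, 0.5, 0.75, 1.0]
prior     = [0.1, 0.2,  0.4, 0.2,  0.1]

unnormalized = [p * q for p, q in zip(lime_prob, prior)]
alpha = 1.0 / sum(unnormalized)           # the sum is 0.5, so alpha = 2
posterior = [alpha * u for u in unnormalized]
print(posterior)                          # ≈ [0.0, 0.1, 0.4, 0.3, 0.2]
```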

Example: two lime candies
• Suppose we unwrap another candy and it is also lime.
• Now P(2 limes | h_i) = P(lime | h_i)², and α = 1 / Σ_i P(lime | h_i)² P(h_i) = 1 / 0.325 ≈ 3.08
• P(h1 | 2 limes) = α P(2 limes | h1) P(h1) = 3.08 × (0 × 0.1) = 0
• P(h2 | 2 limes) = α P(2 limes | h2) P(h2) = 3.08 × (0.0625 × 0.2) ≈ 0.038
• P(h3 | 2 limes) = α P(2 limes | h3) P(h3) = 3.08 × (0.25 × 0.4) ≈ 0.308
• P(h4 | 2 limes) = α P(2 limes | h4) P(h4) = 3.08 × (0.5625 × 0.2) ≈ 0.346
• P(h5 | 2 limes) = α P(2 limes | h5) P(h5) = 3.08 × (1.0 × 0.1) ≈ 0.308

Example: n lime candies
• Suppose we unwrap n candies and they are all lime.
• P(h1 | n limes) = α_n (0^n × 0.1)
• P(h2 | n limes) = α_n (0.25^n × 0.2)
• P(h3 | n limes) = α_n (0.5^n × 0.4)
• P(h4 | n limes) = α_n (0.75^n × 0.2)
• P(h5 | n limes) = α_n (1^n × 0.1)
  where α_n = 1 / Σ_i P(lime | h_i)^n P(h_i) is the normalization constant for n observations
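Generalizing the previous sketch to n lime candies (again a sketch, with my own function name) reproduces both the one-candy and the two-candy posteriors:

```python
def posterior_after_limes(n, lime_prob=(0.0, 0.25, 0.5, 0.75, 1.0),
                          prior=(0.1, 0.2, 0.4, 0.2, 0.1)):
    """P(h_i | n lime candies) for the five bag types."""
    unnormalized = [p ** n * q for p, q in zip(lime_prob, prior)]
    alpha_n = 1.0 / sum(unnormalized)     # normalization constant for this n
    return [alpha_n * u for u in unnormalized]

print(posterior_after_limes(1))   # ≈ [0.0, 0.100, 0.400, 0.300, 0.200]
print(posterior_after_limes(2))   # ≈ [0.0, 0.038, 0.308, 0.346, 0.308]
```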

Prediction: what candy is next?
• P(next lime | n limes) = Σ_i P(next lime | h_i) P(h_i | n limes)
  = P(next lime | h1) P(h1 | n limes) + P(next lime | h2) P(h2 | n limes)
    + P(next lime | h3) P(h3 | n limes) + P(next lime | h4) P(h4 | n limes)
    + P(next lime | h5) P(h5 | n limes)
  = 0 × α_n (0^n × 0.1) + 0.25 × α_n (0.25^n × 0.2) + 0.5 × α_n (0.5^n × 0.4)
    + 0.75 × α_n (0.75^n × 0.2) + 1 × α_n (1^n × 0.1)
• For example, after n = 10 lime candies this prediction is approximately 0.97
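Continuing the sketch above, the weighted-average prediction can be computed directly; with n = 10 it comes out near the 0.97 figure quoted on the slide:

```python
def p_next_lime(n):
    """P(next candy is lime | the first n candies were all lime)."""
    lime_prob = (0.0, 0.25, 0.5, 0.75, 1.0)
    post = posterior_after_limes(n)              # from the earlier sketch
    return sum(p * w for p, w in zip(lime_prob, post))

print(p_next_lime(2))    # ≈ 0.73
print(p_next_lime(10))   # ≈ 0.97
```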

Analysis: Bayesian Prediction
• The true hypothesis eventually dominates the posterior
• The posterior probability of any false hypothesis eventually vanishes: the probability that it keeps generating such uncharacteristic data becomes vanishingly small
• Bayesian prediction is optimal
• Bayesian prediction is expensive: the hypothesis space may be very large (or infinite)

MAP Approximation
• To avoid the expense of Bayesian learning, one approach is to simply choose the most probable hypothesis and assume it is correct
• MAP = maximum a posteriori
• h_MAP = the h_i with the highest value of P(h_i | d)
• In the candy example, after 3 limes have been drawn a MAP learner will always predict that the next candy is lime with 100% probability (h5 becomes the MAP hypothesis)
• Less accurate than full Bayesian prediction, but much cheaper
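A minimal sketch of MAP prediction for the candy example, reusing posterior_after_limes from the earlier sketch (function names are mine):

```python
def map_prediction(n):
    """Predict P(next lime) using only the single most probable hypothesis."""
    lime_prob = (0.0, 0.25, 0.5, 0.75, 1.0)
    post = posterior_after_limes(n)                     # from the earlier sketch
    h_map = max(range(len(post)), key=lambda i: post[i])
    return lime_prob[h_map]                             # h_MAP's own prediction

print(map_prediction(3))   # 1.0 -- after 3 limes the all-lime bag (h5) is MAP
```

Compare with p_next_lime(3) ≈ 0.80 from the full Bayesian sketch, which still hedges across h2-h5.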

Avoiding Complexity
• As we've seen earlier, allowing overly complex hypotheses can lead to overfitting
• Bayesian and MAP learning use the hypothesis prior to penalize complex hypotheses
• Complex hypotheses typically have lower priors, since there are typically many more complex hypotheses than simple ones
• We get the simplest hypothesis consistent with the data (as per Ockham's razor)

ML Approximation
• For large data sets the prior becomes irrelevant; in that case we may use maximum-likelihood (ML) learning
• Choose h_ML, the hypothesis that maximizes P(d | h_i)
• That is, choose the hypothesis that assigns the observed data the highest probability
• Identical to MAP for uniform priors
• ML is the standard (non-Bayesian) statistical learning method
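A corresponding ML sketch for the candy example (again with hypothetical helper names): it ignores the priors entirely and, after even a single lime candy, commits to the all-lime bag.

```python
def ml_prediction(n):
    """Predict P(next lime) using the maximum-likelihood hypothesis only."""
    lime_prob = (0.0, 0.25, 0.5, 0.75, 1.0)
    likelihood = [p ** n for p in lime_prob]              # P(n limes | h_i)
    h_ml = max(range(len(likelihood)), key=lambda i: likelihood[i])
    return lime_prob[h_ml]

print(ml_prediction(1))   # 1.0 -- h5 (all lime) maximizes the likelihood
```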

Exercise
• Suppose we were pulling candy from a 50/50 bag (type 3) or a 25/75 bag (type 4)
• With full Bayesian learning, what would the posterior probability and prediction plots look like after 100 candies?
• What would the prediction plots look like for MAP and ML learning after 1000 candies?
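One way to explore the exercise is a small simulation: draw candies from a bag with a given lime fraction and track the Bayesian posterior after each draw (a sketch under my own assumptions; plotting is left out):

```python
import random

def simulate(p_lime, n_draws=100, seed=0):
    """Track the posterior over the five bag types while drawing candies
    from a bag whose true lime fraction is p_lime."""
    random.seed(seed)
    lime_prob = (0.0, 0.25, 0.5, 0.75, 1.0)
    post = [0.1, 0.2, 0.4, 0.2, 0.1]           # start from the hypothesis priors
    history = [post[:]]
    for _ in range(n_draws):
        lime = random.random() < p_lime
        like = [p if lime else 1.0 - p for p in lime_prob]
        unnorm = [l * q for l, q in zip(like, post)]
        z = sum(unnorm)
        post = [u / z for u in unnorm]          # incremental Bayesian update
        history.append(post[:])
    return history

print(simulate(0.5)[-1])    # posterior after 100 draws from the 50/50 bag
print(simulate(0.75)[-1])   # posterior after 100 draws from the 25/75 bag
```

Feeding the same draw histories into the MAP and ML sketches above would give the corresponding prediction curves.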

[Answer slides: plots of the Bayesian posterior probabilities and predictions for the 50/50 bag and for the 25/75 bag, followed by MAP and ML prediction plots for the same two bags.]