Naive Bayes Classifiers - Wilkes University


kNN (k-Nearest Neighbor)

kNN (Instance-Based Classifier)

• Uses the k “closest” points (nearest neighbors) to classify an item
• Requires a similarity (distance) metric
• Similarity between items a and b:
  – C, the features common to (shared by) a and b
  – A, the features unique to a
  – B, the features unique to b

  S = θC - αA - βB, where θ, α, β ≥ 0
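As a rough illustration, here is a minimal Python sketch of this feature-overlap similarity. The feature sets and the weight values θ, α, β below are made up for the example, not taken from the slides.

```python
# Sketch: feature-overlap similarity S = theta*C - alpha*A - beta*B,
# where C = shared features, A = features only in a, B = features only in b.
# The weights are illustrative values, not prescribed by the slides.

def similarity(a: set, b: set, theta: float = 1.0, alpha: float = 0.5, beta: float = 0.5) -> float:
    common = len(a & b)   # C: features shared by a and b
    only_a = len(a - b)   # A: features unique to a
    only_b = len(b - a)   # B: features unique to b
    return theta * common - alpha * only_a - beta * only_b

# Example: two items described by categorical features
cat = {"whiskers", "fur", "meows", "retractable_claws"}
dog = {"whiskers", "fur", "barks", "fetches"}
print(similarity(cat, dog))   # 2*1.0 - 2*0.5 - 2*0.5 = 0.0
```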

One of these things…

kNN Classifiers

• Requirements
  – The set of stored records
  – The distance metric
  – The value of k
• Classification algorithm
  – Compute the distance from the item to the stored records
  – Identify its k nearest neighbors
  – Use their class labels to determine the unknown class label, e.g. by majority vote (see the sketch below)
  – The vote may be weighted by distance, e.g. using w = 1/d²
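A minimal Python sketch of this procedure with distance-weighted voting (w = 1/d²). The stored records and query point are toy data invented for the example.

```python
# Minimal sketch of distance-weighted kNN voting on made-up toy data.
from collections import defaultdict
import math

def knn_classify(query, records, k=3):
    """records: list of (feature_vector, label) pairs."""
    # Distance from the query to every stored record (Euclidean)
    dists = [(math.dist(query, x), label) for x, label in records]
    dists.sort(key=lambda pair: pair[0])

    # Weighted vote among the k nearest neighbors, w = 1/d^2
    votes = defaultdict(float)
    for d, label in dists[:k]:
        votes[label] += 1.0 / (d * d) if d > 0 else float("inf")
    return max(votes, key=votes.get)

records = [((1.0, 1.0), "c"), ((1.2, 0.8), "c"), ((5.0, 5.2), "d"), ((4.8, 5.0), "d")]
print(knn_classify((1.1, 0.9), records, k=3))   # -> "c"
```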

[Figure: scatter of points labeled c and d, with an unknown point “?” to be classified from its nearest neighbors]

(next: video clip examining parameters on website)

Reducing Complexity

• Decrease the training set size
• Help the distance metric
  – Apply PCA to reduce the number of features
  – Neighborhood Component Analysis (see the sketch below)
• Precompute distances
  – Nearest Neighbors Transformer
• Change the search strategy
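One possible way to combine these ideas in scikit-learn: project the features with Neighborhood Component Analysis, then classify with a distance-weighted kNN backed by a tree index instead of brute-force search. The dataset and parameter values below are illustrative, not prescribed by the slides.

```python
# Sketch: NCA projection + tree-indexed, distance-weighted kNN in scikit-learn.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.pipeline import make_pipeline

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = make_pipeline(
    # Learn a low-dimensional projection before the kNN vote
    NeighborhoodComponentsAnalysis(n_components=2, random_state=0),
    # KD-tree search strategy and 1/d-style distance weighting
    KNeighborsClassifier(n_neighbors=5, weights="distance", algorithm="kd_tree"),
)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))   # held-out accuracy
```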

Naive Bayes Classifiers

Naive Bayes is

• a supervised learning algorithm
• a classification algorithm
• a probabilistic classifier, based on Bayes’ theorem of probability

Bayes’ Theorem

Let A and B be events. Then

P(A|B) = P(B|A) P(A) / P(B)

where

• P(A) and P(B) are the probabilities of observing events A and B, respectively
• P(A|B) is a conditional probability: the likelihood of event A occurring given that B is true
• P(B|A) is also a conditional probability: the likelihood of event B occurring given that A is true
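A quick numeric sanity check of the formula; the probabilities below are made up for illustration.

```python
# Bayes' theorem with made-up probabilities (illustrative values only).
p_a = 0.6          # P(A): e.g. probability a random image is a cat
p_b_given_a = 0.9  # P(B|A): probability of seeing feature B on a cat
p_b = 0.7          # P(B): overall probability of seeing feature B

p_a_given_b = p_b_given_a * p_a / p_b   # Bayes' theorem
print(round(p_a_given_b, 3))            # 0.771
```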

Naive Bayes

Let a dataset have two classes, for instance {cats, dogs}. Every data point has a set of features (variables). Then

P(class|feature set) = P(feature set|class) P(class) / P(feature set)

where

• P(class|feature set) is called the posterior: the probability of classifying a data point as a cat (an image of a cat), given a set of features observed in cats.
• P(class) is called the prior: the (unscaled) probability that a randomly chosen observation is a cat.
• P(feature set|class) is called the likelihood (the “scaler”): it scales the prior up or down, given this specific set of features.
• P(feature set) is called the normalizer (evidence): the probability of observing what we are observing (the set of features) in our dataset.

Naive Bayes

“Naive” = the assumption that all the features in the data are independent of one another! (This strong assumption rarely holds in the real world, though.)

The method is simple and computationally fast!

Example

Suppose we have 60 cats and 40 dogs in our dataset. Each data point is a vector of n features.

Given particular values for the first two features (feature 1 and feature 2), what is the probability of a data point being a cat or a dog?

Feature values     Cats            Dogs
Total              60              40
feature 1          50    (5/6)     5     (1/8)
feature 2          45    (3/4)     10    (1/4)
both features      37.5  (15/24)   5/4   (1/32)

Under the naive independence assumption, the “both features” fractions are the products of the single-feature fractions: 5/6 × 3/4 = 15/24 for cats and 1/8 × 1/4 = 1/32 for dogs, giving expected counts of 60 × 15/24 = 37.5 cats and 40 × 1/32 = 5/4 dogs. Then

P(Cat|both features) = 37.5 / (37.5 + 5/4) ≈ 97%
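The same arithmetic in Python, using exact fractions so the 97% result can be verified; the class and feature labels are just names for this example.

```python
# Reproduce the worked example: 60 cats, 40 dogs, per-class feature probabilities.
from fractions import Fraction as F

priors = {"cat": F(60, 100), "dog": F(40, 100)}
likelihoods = {
    "cat": [F(5, 6), F(3, 4)],   # P(feature 1|cat), P(feature 2|cat)
    "dog": [F(1, 8), F(1, 4)],   # P(feature 1|dog), P(feature 2|dog)
}

# Naive assumption: multiply the per-feature likelihoods within each class.
scores = {}
for cls in priors:
    score = priors[cls]
    for p in likelihoods[cls]:
        score *= p
    scores[cls] = score

evidence = sum(scores.values())            # P(feature set), the normalizer
posterior_cat = scores["cat"] / evidence   # P(cat | both features)
print(float(posterior_cat))                # ~0.968, i.e. about 97%
```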

Naive Bayes in Scikit-Learn
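As a minimal illustration of the scikit-learn API, the sketch below trains GaussianNB on a built-in toy dataset; the dataset and parameters are chosen for illustration only.

```python
# Minimal scikit-learn naive Bayes sketch (GaussianNB on the built-in wine dataset;
# the dataset choice is illustrative).
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = GaussianNB()                    # assumes Gaussian-distributed features per class
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))      # held-out accuracy
print(clf.predict_proba(X_test[:3]))  # class posteriors for the first 3 test points
```

For count or binary features, scikit-learn also provides MultinomialNB and BernoulliNB; see the naive_bayes documentation linked under References.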

References

• “A Comparison of Event Models for Naive Bayes Text Classification” by Andrew McCallum and Kamal Nigam
• “Spam Filtering with Naive Bayes – Which Naive Bayes?” by Vangelis Metsis et al.
• “Pattern Recognition and Machine Learning” by Christopher Bishop
• “Image Classification Using Naive Bayes Classifier” by Dong-Chul Park
• Naive Bayes in Python (Scikit-Learn): https://scikit-learn.org/stable/modules/naive_bayes.html

ANN (brief discussion)

Live Session IV (pause video here)