1 Problem: Consider a two class task with ω 1, ω 2 LINEAR CLASSIFIERS.

Problem: Consider a two class task with ω1, ω2

wxwxwxw

:hyperplanedecision on the Assume

x,x)xx(w

wxwwxw

LINEAR CLASSIFIERS

Hence:

hyperplane on the w

0)( 0 wxwxg T

The Perceptron Algorithm

Assume linearly separable classes, i.e.

The casefalls under the above formulation, since

x0x*w:*w

*T* wxw 0

*w'w *

00 'x'wwx*w T*T

Our goal: Compute a solution, i.e., a hyperplane w,so that

• The steps– Define a cost function to be minimized– Choose an algorithm to minimize the cost

function– The minimum corresponds to a solution

The Cost Function

• Where Y is the subset of the vectors wrongly classified by w. When Y=(empty set) a solution is achieved and

Tx xwwJ )()(

0)( wJ

and if 1

0)( wJ

• J(w) is piecewise linear (WHY?)

The Algorithm• The philosophy of the gradient descent

is adopted.

• Wherever valid

This is the celebrated Perceptron Algorithm

old)()(

(old)(new)

Yx Yxx

xtwtwYx

An example

The perceptron algorithm converges in a finite number of iteration steps to a solution if

)1(x)t(w

x)t(w)1t(w

limlim0

A useful variant of the perceptron algorithm

It is a reward and punishment type of algorithm

otherwise )()1(

0)( ,)()1(

0)( , )()1(

xtwxtwtw

The perceptron

shold thre

weightssynapticor synapses

It is a learning machine that learns from the training vectors via the perceptron algorithm

The network is called perceptron or neuron

Example: At some stage t the perceptron algorithm results in

The corresponding hyperplane is

5.0w,1w,1w 021

05.021 xx

)1(7.0

ρ=0.7

Least Squares Methods If classes are linearly separable, the perceptron

output results in

If classes are NOT linearly separable, we shall compute the weights

so that the difference between• The actual output of the classifier, , and

• The desired outputs, e.g.

to be SMALL

021 w,...,w,w

SMALL, in the mean square error sense, means to choose so that the cost function

responses desired ingcorrespond the

)(minargˆ

minimum is ])[()( 2

xwyEwJ

Minimizing

where Rx the autocorrelation matrix

and the crosscorrelation vector

0])[()(

:in results to w.r.)(

yxEwxxE

xwyEww

][]...[][

...................................

][]...[][

xxExxExxE

SMALL in the sum error squares sense means

that is, the input xi and its

corresponding class label (±1).

: training pairs

Pseudoinverse Matrix Define

responses desired ingcorrespond y

matrix) (an

matrix) (an lxN]x,...,x,x[X N21T

T xxXX1

T yxyX

Assume N=l X square and invertible. Then

)(ˆ)(

TT XXXX 1)( Pseudoinverse of X

XXXXXXX TTTT

Assume N>l. Then, in general, there is no solution to satisfy all equations simultaneously:

The “solution” corresponds to the minimum sum of squares solution

unknowns l equations N ....

Example:

107.48.4

7.441.224.2

8.424.28.2

1 yXXXw

The goal: Given two linearly separable classes, design the classifier

that leaves the maximum margin from both classes

0)( 0 wxwxg T

SUPPORT VECTOR MACHINES

Margin: Each hyperplane is characterized by• Its direction in space, i.e.,

• Its position in space, i.e.,

• For EACH direction,choose the hyperplane that leaves the SAME distance from the nearest points from each class. The margin is twice this distance.

The distance of a point from a hyperplane is given by

Scale, so that at the nearest points from each class the discriminant function is ±1:

Thus the margin is given by

Also, the following is valid

, , 0ww

)x̂(gz x̂

21 for 1)( and for 1)x( 1)( xggxg

SVM (linear) classifier

Minimize

Subject to

This is because minimizing

the margin is maximised

0)( wxwxg T

1)( wwJ

2for 1

Niwxwy iT

i ,...,2,1 ,1)( 0

The above is a quadratic optimization task subject to a set of linear inequality constraints. The Karush-Kuhh-Tucker conditions, state that the minimizer satisfies:

• (1)

• (2)

• (3)

• (4)

• Where is the Lagrangian

0) ,L( 0 ,www

0) , , 00

N,...,2,1i0i ,

Niwxwy iT

ii ,...,2,1 ,01)( 0

(.,.,.)L

1) , ,( 0

10 wxwywwwwL i

The solution: from the above, it turns out that

ii xyw

Remarks:• The Lagrange multipliers can be either zero or

positive. Thus,

where , corresponding to positive Lagrange multipliers

– From constraint (4) above, i.e.,

the vectors contributing to

ii xyw

Niwxwy iT

ii ,...,2,1 ,0]1)([ 0

10 wxw iT

wsatisfy

– These vectors are known as SUPPORT VECTORS and are the closest vectors, from each class, to the classifier.

– Once is computed, is determined from conditions (4).

– The optimal hyperplane classifier of a support vector machine is UNIQUE.

Dual Problem Formulation• The SVM formulation is a convex

programming problem, with– Convex cost function– Convex region of feasible solutions

• Thus, its solution can be achieved by its dual problem, i.e.,

– maximize

– subject to

),,( 0 wwL

• Combine the above to obtain

– maximize

– subject to

ii xxyy

Remarks:• Support vectors enter via inner products• Although the solution, w, is unique, the

Lagrange multipliers ARE NOT.

Non-Separable classes

In this case there is no hyperplane, such that

• Recall that the margin is defined as twice the distance between the following two hyperplanes

xwxwT ,1)(0

The training vectors belong to one of three possible categories

• A. Vectors outside the band which are correctly classified, i.e.,

• B. Vectors inside the band, and correctly classified, i.e.,

• C. Vectors misclassified, i.e.

1)( 0 wxwy Ti

1)(0 0 wxwy Ti

0)( 0 wxwy Ti

All three cases above can be represented as

A.B.C.

are known as slack variables

i wxwy 1)( 0

The goal of the optimization is now two-fold• Maximize margin• Minimize the number of patterns with ,

That is

where C a constant and

• I(.) not differentiable. In practice, we use an approximation

• Following similar procedure as before we obtain

iiICwwwJ

1) , ,(

iiCwwwJ

1) , ,(

KKT conditions

Niwxwy

,...,2,1 ,0, )6(

,...,2,1 ,0 )5(

,...,2,1 ,0]1)([ )4(

,...,2,1 ,0 )3(

The associated dual problem

Maximize

subject to

Remarks: The only difference with the separable

class case is the existence of C in the constraints

i jiii xxyy

N,...,2,1i,C0

1 Problem: Consider a two class task with ω 1, ω 2 LINEAR CLASSIFIERS.

Documents

Transcript of 1 Problem: Consider a two class task with ω 1, ω 2 LINEAR CLASSIFIERS.

H Ω Ω Ω - kpe-lavriou.att.sch.grkpe-lavriou.att.sch.gr/documents/hmer12sep_fotismos-moullou.pdf · γ) στην επιμήκυνση του μυκτήρα με στόχο την

LABORATÓRIO DE SISTEMAS MECATRÔNICOS E ROBÓTICA ] - LAB.pdf · Resistores - 1,0 Ω - 100k Ω 1,2 Ω - 120k Ω 1,5 Ω - 150k Ω 1,8 Ω- 180k Ω 2,2 Ω– 220k Ω 2,7 Ω– 270k

L5: Quadratic classifierscourses.cs.tamu.edu/rgutier/csce666_f13/l5.pdf · L5: Quadratic classifiers • Bayes classifiers for Normally distributed classes –Case 1: Σ =𝜎2𝐼

Architecting for Intermittence · 2019-10-04 · resume until the device has harvested sufficient energy time progress task A task B task C ... Di Wu Giri Prasanna Mugunda Krishnan

Manual do Usuário - dam-assets.fluke.com · (Detector da temperatura da resistência) Pt100 Ω (385) Pt100 Ω (3926) Pt100 Ω (3916) Pt200 Ω (385) Pt500 Ω (385) ... consulte a

Task Force on Integrated Assessment Modelling 45th session ... · Task Force on Integrated Assessment Modelling 45th session, 23-25 May 2016 ... Development residential comb. share

SCI 519 - Hybrid Classifiers (Backmatter Pages)978-3-642-40997-4/1.pdf · A Appendix A.1 Hypothesis Testing Letusassumethatwewouldliketoestimatethemeanμofnormaldensity fromsample[314]

0 / - $ * )Ω Ω Ω - , * , 1 * , / + * % . - . Ω» · + , * , ' ' . . - . Ω ( Χορήγηση μεθαδόνης, βοʑπρενορφίνης πό ιαρική παρακολούθηση

MLE’s, Bayesian Classifiers and Naïve Bayesand Naïve Bayestom/10601_sp08/slides/NBayes-1-28-2008-plus.pdf · MLE’s, Bayesian Classifiers and Naïve Bayesand Naïve Bayes Required

1 Hypothesis Testing. 2 Greene: App. C:892-897 Statistical Test: Divide parameter space (Ω) into two disjoint sets: Ω 0, Ω 1 Ω 0 ∩ Ω 1 = and Ω.

6 - Task 3: Effects of Thermal Tides in SPICAM Data 6.1 ...

Vom Liebhaberobjekt zur Investition. - SÜDWESTBANK · Ω Auto Union (Audi, DKW, Horch, Wanderer) Ω BMW Ω Daimler-Benz Ω Messerschmitt Ω NSU Ω Opel Ω Porsche Ω Glas 4 ... der

Kernel Methods: Support Vector Machines Maximum Margin Classifiers and Support Vector Machines.

Task 1. Plane stress tension of a plate

Periodic task model · 6 11 Earliest Deadline First (EDF) Task model o a set of independent periodic tasks EDF: o Whenever a new task arrive, sort the ready queue so that the task

Download 7E Version of Task (PDF)

The Problem: Consider a two class task with ω 1 , ω 2

³ ΓΔΔΑ ¶ ΘΔ ° Ω ± ΗΑ ®Ω ± ¶ ¸ ¶ ·ΗΑ ·Ω ± SOLAR THERMAL ...ΔΘΛΗΘΟ ΘΔΛΣΡΟ ΔΡΔΤΛΑ ΦΤΗΘΩΛ ΔΠΗΣΖΚΩΛ “ΓΖΚΟΘΡΗΣΟ” / “demokritos”

Multimodal Model Agnostic Meta-Learning via Task-Aware ... · Multimodal Model Agnostic Meta-Learning via Task-Aware Modulation Risto Vuorio* Shao-Hua Sun* Hexiang Hu Joseph J. Lim

1 The Problem: Consider a two class task with ω 1, ω 2 LINEAR CLASSIFIERS.