Source: math.bu.edu/people/mkon/MA751/L11DSolvingSVMOptional.pdf

MA 751 Part 7

Solving SVM: Quadratic Programming

1. Quadratic programming (QP): Introducing Lagrange multipliers $\alpha_j$ and $\mu_j$ (this can be justified in QP for inequality as well as equality constraints), we define the Lagrangian

$$L(\mathbf{a}, b, \boldsymbol{\xi}, \boldsymbol{\alpha}, \boldsymbol{\mu}) \equiv \frac{1}{n}\sum_{j=1}^{n}\xi_j + \lambda\,\mathbf{a}^T K\mathbf{a} - \sum_{j=1}^{n}\alpha_j\left[y_j\left(\sum_{i=1}^{n} a_i K(\mathbf{x}_i,\mathbf{x}_j) + b\right) - 1 + \xi_j\right] - \sum_{j=1}^{n}\mu_j\xi_j.$$

By Lagrange multiplier theory for constraints with inequalities, the minimum of the original problem corresponds to a stationary point of this Lagrangian (derivatives vanish) in

$$\mathbf{a},\ b,\ \boldsymbol{\xi},\ \boldsymbol{\alpha} = (\alpha_1,\ldots,\alpha_n),\ \boldsymbol{\mu} = (\mu_1,\ldots,\mu_n):$$

the Lagrangian is minimized with respect to $\mathbf{a}, b, \boldsymbol{\xi}$ and maximized with respect to the Lagrange multipliers $\boldsymbol{\alpha}, \boldsymbol{\mu}$, subject to the constraints

$$\alpha_i,\ \mu_i \ge 0. \qquad (5)$$

Derivatives:

$$\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{j=1}^{n}\alpha_j y_j = 0; \qquad (6a)$$

$$\frac{\partial L}{\partial \xi_j} = 0 \;\Rightarrow\; \frac{1}{n} - \alpha_j - \mu_j = 0. \qquad (6b)$$

Plugging in, we get the reduced Lagrangian

$$L^*(\mathbf{a}, \boldsymbol{\alpha}) = \lambda\,\mathbf{a}^T K\mathbf{a} - \sum_{j=1}^{n}\alpha_j\left[y_j\sum_{i=1}^{n} a_i K(\mathbf{x}_i,\mathbf{x}_j) - 1\right]$$

$$= \lambda\,\mathbf{a}^T K\mathbf{a} - \mathbf{a}^T K Y\boldsymbol{\alpha} + \sum_{j=1}^{n}\alpha_j,$$

where

$$Y = \begin{pmatrix} y_1 & 0 & 0 & \cdots & 0\\ 0 & y_2 & 0 & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & y_{n-1} & 0\\ 0 & 0 & 0 & \cdots & y_n \end{pmatrix}$$

(note that (6) eliminates the $\xi_j$ terms), with the same constraints (5).

Now:

$$\frac{\partial L^*}{\partial a_j} = 0 \;\Rightarrow\; 2\lambda K\mathbf{a} - KY\boldsymbol{\alpha} = 0 \qquad (7)$$

$$\Rightarrow\; a_i = \frac{\alpha_i y_i}{2\lambda}.$$

Plug in for $a_i$ using (7), replacing $K\mathbf{a}$ by $\frac{1}{2\lambda}KY\boldsymbol{\alpha}$ everywhere:

$$L^*(\boldsymbol{\alpha}) = -\frac{1}{4\lambda}\,\boldsymbol{\alpha}^T Y K Y^T\boldsymbol{\alpha} + \sum_{j=1}^{n}\alpha_j = -\frac{1}{4\lambda}\,\boldsymbol{\alpha}^T Q\,\boldsymbol{\alpha} + \sum_{j=1}^{n}\alpha_j,$$

where $Q = YKY^T$.

Constraints: $\alpha_j, \mu_j \ge 0$; by (6b) this implies

$$0 \le \alpha_j \le \frac{1}{n}.$$

Define $C = \frac{1}{2\lambda n}$, $\bar{\boldsymbol{\alpha}} = \frac{1}{2\lambda}\boldsymbol{\alpha}$.

[note this does not mean complex conjugate!]

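We then want to maximize the following (equivalently, minimize its negative); division by the constant $2\lambda$ is OK, since it does not change the optimizing $\boldsymbol{\alpha}$:

$$\frac{1}{2\lambda}\sum_{j=1}^{n}\alpha_j - \frac{1}{2\lambda}\cdot\frac{1}{4\lambda}\,\boldsymbol{\alpha}^T Q\,\boldsymbol{\alpha} = \sum_{j=1}^{n}\bar{\alpha}_j - \frac{1}{2\lambda}\cdot\frac{(2\lambda)^2}{4\lambda}\,\bar{\boldsymbol{\alpha}}^T Q\,\bar{\boldsymbol{\alpha}} = \sum_{j=1}^{n}\bar{\alpha}_j - \frac{1}{2}\,\bar{\boldsymbol{\alpha}}^T Q\,\bar{\boldsymbol{\alpha}}, \qquad (8)$$

subject to the constraint $0 \le \bar{\alpha}_i \le C$; it is also convenient to include (6a) as a constraint: $\bar{\boldsymbol{\alpha}}\cdot\mathbf{y} = 0$. Thus the constraints are:

$$0 \le \bar{\alpha}_i \le C; \qquad \bar{\boldsymbol{\alpha}}\cdot\mathbf{y} = 0.$$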
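As a quick numerical sanity check of the rescaling that leads to (8), the sketch below builds $Q = YKY^T$ and compares $\frac{1}{2\lambda}L^*(\boldsymbol{\alpha})$ with the rescaled objective in $\bar{\boldsymbol{\alpha}}$. The kernel matrix, labels, $\lambda$ and $\boldsymbol{\alpha}$ are made-up illustrative values, and the helper names are hypothetical; this is a check of the algebra, not a QP solver.

```python
def quad_form(Q, v):
    """Return v^T Q v for a square matrix Q given as a list of rows."""
    n = len(v)
    return sum(v[i] * Q[i][j] * v[j] for i in range(n) for j in range(n))


def build_Q(K, y):
    """Q = Y K Y^T with Y = diag(y), i.e. Q[i][j] = y[i] * K[i][j] * y[j]."""
    n = len(y)
    return [[y[i] * K[i][j] * y[j] for j in range(n)] for i in range(n)]


def reduced_lagrangian(alpha, Q, lam):
    """L*(alpha) = sum_j alpha_j - (1 / (4 lam)) * alpha^T Q alpha."""
    return sum(alpha) - quad_form(Q, alpha) / (4.0 * lam)


def rescaled_objective(alpha_bar, Q):
    """Objective (8): sum_j alpha_bar_j - (1/2) * alpha_bar^T Q alpha_bar."""
    return sum(alpha_bar) - 0.5 * quad_form(Q, alpha_bar)


def is_feasible(alpha_bar, y, C, tol=1e-12):
    """Check the box constraint 0 <= alpha_bar_j <= C and alpha_bar . y = 0."""
    in_box = all(-tol <= a <= C + tol for a in alpha_bar)
    balanced = abs(sum(a * yi for a, yi in zip(alpha_bar, y))) <= tol
    return in_box and balanced


# Hypothetical illustrative data (not taken from the notes).
K = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
y = [1, -1, 1]
lam = 0.25
alpha = [0.1, 0.3, 0.2]

alpha_bar = [a / (2.0 * lam) for a in alpha]   # alpha_bar = alpha / (2 lam)
C = 1.0 / (2.0 * lam * len(y))                 # C = 1 / (2 lam n)
Q = build_Q(K, y)

lhs = reduced_lagrangian(alpha, Q, lam) / (2.0 * lam)
rhs = rescaled_objective(alpha_bar, Q)
print(lhs, rhs, is_feasible(alpha_bar, y, C))
```

The two printed values should agree up to rounding, which is the content of the chain of equalities in (8).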

Summarizing the above relationships:

$$f(\mathbf{x}) = \sum_{j=1}^{n} a_j K(\mathbf{x}, \mathbf{x}_j) + b,$$

where

$$a_j = \frac{\alpha_j y_j}{2\lambda}, \qquad \alpha_j = 2\lambda\,\bar{\alpha}_j,$$

and the $\bar{\alpha}_j$ maximize the concave objective (8) subject to the constraints above, with $Q = YKY^T$.

After the $a_j$ are determined, $b$ must be computed directly by plugging into the original optimization problem.

More briefly,

$$f(\mathbf{x}) = \sum_{j=1}^{n}\bar{\alpha}_j\, y_j K(\mathbf{x}, \mathbf{x}_j) + b,$$

where the $\bar{\alpha}_j$ maximize (8).
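Once the dual variables are known, evaluating the classifier is a single weighted sum over the training points. A minimal sketch, assuming the $\bar{\alpha}_j$ and $b$ have already been computed elsewhere; the function names and the two-point data set are hypothetical and only illustrate the formula.

```python
def linear_kernel(u, v):
    """Linear kernel K(u, v) = u . v."""
    return sum(ui * vi for ui, vi in zip(u, v))


def decision_function(x, X_train, y_train, alpha_bar, b, kernel=linear_kernel):
    """f(x) = sum_j alpha_bar_j * y_j * K(x, x_j) + b."""
    total = 0.0
    for xj, yj, aj in zip(X_train, y_train, alpha_bar):
        total += aj * yj * kernel(x, xj)
    return total + b


# Made-up two-point training set and dual solution (illustrative only).
X_train = [(1.0, 0.0), (-1.0, 0.0)]
y_train = [1, -1]
alpha_bar = [0.5, 0.5]
b = 0.0

value = decision_function((2.0, 3.0), X_train, y_train, alpha_bar, b)
# Sign rule: f(x) > 0 gives +1. The case f(x) = 0 is sent to -1 here,
# which is an arbitrary choice made for this sketch.
label = 1 if value > 0 else -1
print(value, label)
```

Passing a different `kernel` callable changes the classifier without touching the rest of the code, which mirrors how the kernel enters the formula.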

Finally, to find $b$, we must plug into the original optimization problem: that is, we minimize

$$\frac{1}{n}\sum_{j=1}^{n}\bigl(1 - y_j f(\mathbf{x}_j)\bigr)_+ + \lambda\,\|f\|_K^2 = \frac{1}{n}\sum_{j=1}^{n}\left[1 - y_j\left(\sum_{i=1}^{n} a_i K(\mathbf{x}_j,\mathbf{x}_i) + b\right)\right]_+ + \lambda\,\mathbf{a}^T K\mathbf{a}$$

with respect to $b$, holding the $a_i$ fixed.
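With the $a_i$ fixed, the objective above is a convex, piecewise-linear function of the single variable $b$, so a minimum is attained at one of its kinks $b = y_j - g_j$, where $g_j = \sum_i a_i K(\mathbf{x}_j,\mathbf{x}_i)$ (this uses $y_j = \pm 1$). The following sketch searches those kinks; it is one simple way to do the one-dimensional minimization, not necessarily the method intended in the notes, and the values of `g` and `y` are made up.

```python
def hinge_part(b, g, y):
    """(1/n) * sum_j [1 - y_j (g_j + b)]_+ ; the lambda a^T K a term does not depend on b."""
    n = len(y)
    return sum(max(0.0, 1.0 - yj * (gj + b)) for gj, yj in zip(g, y)) / n


def best_offset(g, y):
    """Return a minimizer of hinge_part over b by checking every kink y_j - g_j."""
    candidates = [yj - gj for gj, yj in zip(g, y)]
    return min(candidates, key=lambda b: hinge_part(b, g, y))


# Hypothetical kernel expansions g_j and labels y_j (illustrative only).
g = [2.0, 0.5, -1.0, -2.5]
y = [1, 1, -1, -1]

b_star = best_offset(g, y)
print(b_star, hinge_part(b_star, g, y))
```

The minimizer need not be unique (the objective can be flat between two kinks), so the sketch returns one minimizer rather than claiming it is the only one.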

2. The RKHS for SVM

General SVM: the solution function is (see (4) above)

$$f(\mathbf{x}) = \sum_{j} a_j K(\mathbf{x}, \mathbf{x}_j) + b,$$

with the solution for the $a_j$ given by quadratic programming as above.

Consider a simple case (linear kernel):

$$K(\mathbf{x}, \mathbf{x}_j) = \mathbf{x}\cdot\mathbf{x}_j.$$

Then we have

$$f(\mathbf{x}) = \sum_{j}(a_j\mathbf{x}_j)\cdot\mathbf{x} + b \equiv \mathbf{w}\cdot\mathbf{x} + b,$$

where

$$\mathbf{w} \equiv \sum_{j} a_j\mathbf{x}_j.$$

This gives the kernel. What class of functions is the corresponding space $\mathcal{H}$?

Claim: it is the set of linear functions of $\mathbf{x}$,

$$\mathcal{H} = \{\mathbf{w}\cdot\mathbf{x} \mid \mathbf{w}\in\mathbb{R}^d\},$$

which, with inner product

$$\langle \mathbf{w}_1\cdot\mathbf{x},\ \mathbf{w}_2\cdot\mathbf{x}\rangle_{\mathcal{H}} = \mathbf{w}_1\cdot\mathbf{w}_2,$$

is the RKHS of $K(\mathbf{x},\mathbf{y})$ above.

Indeed, to show that $K(\mathbf{x},\mathbf{y}) = \mathbf{x}\cdot\mathbf{y}$ is the reproducing kernel for $\mathcal{H}$, note that if $f(\mathbf{x}) = \mathbf{w}\cdot\mathbf{x}\in\mathcal{H}$, then, recalling $K(\mathbf{x},\mathbf{y}) = \mathbf{x}\cdot\mathbf{y}$,

$$\langle f(\cdot),\ K(\cdot,\mathbf{y})\rangle_{\mathcal{H}} = \mathbf{w}\cdot\mathbf{y} = f(\mathbf{y}),$$

as desired.
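The reproducing property can be unpacked in two small steps, using only the definitions already given: for fixed $\mathbf{y}$, the function $K(\cdot,\mathbf{y})$ is itself the element of $\mathcal{H}$ whose weight vector is $\mathbf{y}$, so the inner product of $\mathcal{H}$ applies to it directly. Written out (a restatement of the argument, not an additional result):

```latex
\begin{align*}
K(\cdot,\mathbf{y}) &: \ \mathbf{x} \mapsto \mathbf{y}\cdot\mathbf{x} \ \in \mathcal{H}
  \qquad \text{(weight vector } \mathbf{y}\text{)},\\
\langle f,\ K(\cdot,\mathbf{y})\rangle_{\mathcal{H}}
  &= \langle \mathbf{w}\cdot\mathbf{x},\ \mathbf{y}\cdot\mathbf{x}\rangle_{\mathcal{H}}
   = \mathbf{w}\cdot\mathbf{y}
   = f(\mathbf{y}).
\end{align*}
```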

Thus the matrix $K_{ij} = \mathbf{x}_i\cdot\mathbf{x}_j$, and we find the optimal separator

$$f(\mathbf{x}) = \mathbf{w}\cdot\mathbf{x}$$

by solving for $\mathbf{w}$ as before.

Note that when we add $b$ to $f(\mathbf{x})$ (as done earlier), we have all affine functions $f(\mathbf{x}) = \mathbf{w}\cdot\mathbf{x} + b$.

Note that the above inner product gives the norm

$$\|\mathbf{w}\cdot\mathbf{x}\|_{\mathcal{H}}^2 = \|\mathbf{w}\|_{\mathbb{R}^n}^2 = \sum_{j=1}^{n} w_j^2.$$

Why use this norm? A priori information content.

Final classification rule: $f(\mathbf{x}) > 0 \Rightarrow y = 1$; $f(\mathbf{x}) < 0 \Rightarrow y = -1$.

Learning from training data:

$$Nf = (f(\mathbf{x}_1),\ldots,f(\mathbf{x}_n)) = (y_1,\ldots,y_n).$$

Thus

$$\mathcal{H} = \{f(\mathbf{x}) = \mathbf{w}\cdot\mathbf{x} : \mathbf{w}\in\mathbb{R}^n\}$$

is the set of linear separator functions (known as perceptrons in neural network theory).

Consider the separating hyperplane $H: f(\mathbf{x}) = 0$.

3. Toy example:

Information

$$Nf = \{[(1,1),\,1],\ [(-1,1),\,-1],\ [(-1,-1),\,-1],\ [(1,-1),\,1]\}$$

(red $= +1$; blue $= -1$);

$$f = \mathbf{w}\cdot\mathbf{x} + b = \sum_{i} a_i\underbrace{(\mathbf{x}_i\cdot\mathbf{x})}_{K(\mathbf{x},\mathbf{x}_i)} + b,$$

so

$$\mathbf{w} = \sum_{i} a_i\mathbf{x}_i.$$

$$L(f) = \frac{1}{4}\sum_{j}\bigl(1 - f(\mathbf{x}_j)\,y_j\bigr)_+ + \frac{1}{2}|\mathbf{w}|^2 \qquad (*)$$

(we let $\lambda = 1/2$; minimize with respect to $\mathbf{w}, b$).

Equivalent:

$$L(f) = \frac{1}{4}\sum_{j=1}^{4}\xi_j + \frac{1}{2}|\mathbf{w}|^2,$$

$$y_j f(\mathbf{x}_j) \ge 1 - \xi_j; \qquad \xi_j \ge 0.$$

[Note effectively $\xi_i = \bigl(1 - (\mathbf{w}\cdot\mathbf{x}_i + b)\,y_i\bigr)_+$.]

Define the kernel matrix

$$K = (K_{ij}) = \bigl(K(\mathbf{x}_i,\mathbf{x}_j)\bigr) = (\mathbf{x}_i\cdot\mathbf{x}_j) = \begin{pmatrix} 2 & 0 & -2 & 0\\ 0 & 2 & 0 & -2\\ -2 & 0 & 2 & 0\\ 0 & -2 & 0 & 2 \end{pmatrix}.$$

$$\|f\|_{\mathcal{H}}^2 = |\mathbf{w}|^2 = \mathbf{a}^T K\mathbf{a} = 2\sum_{i=1}^{4} a_i^2 - 4a_1a_3 - 4a_2a_4,$$

where

$$\mathbf{a} = \begin{pmatrix} a_1\\ a_2\\ \vdots\\ a_4 \end{pmatrix}.$$
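The kernel matrix and the identity $|\mathbf{w}|^2 = \mathbf{a}^T K\mathbf{a}$ are easy to confirm numerically. The sketch below assumes the four toy points are $(1,1), (-1,1), (-1,-1), (1,-1)$ in that order (the ordering that reproduces the matrix $K$ above); the coefficient vector `a` is an arbitrary test value and the helper names are hypothetical.

```python
# Toy points, in the order assumed to match the kernel matrix in the text.
X = [(1, 1), (-1, 1), (-1, -1), (1, -1)]


def gram_matrix(points):
    """Linear-kernel Gram matrix: K[i][j] = x_i . x_j."""
    return [[sum(p * q for p, q in zip(xi, xj)) for xj in points] for xi in points]


def quadratic_form(M, v):
    """Return v^T M v."""
    n = len(v)
    return sum(v[i] * M[i][j] * v[j] for i in range(n) for j in range(n))


K = gram_matrix(X)

a = [0.3, -0.1, 0.2, 0.5]  # arbitrary coefficients, for checking the identity only
w = [sum(ai * xi[k] for ai, xi in zip(a, X)) for k in range(2)]  # w = sum_i a_i x_i

norm_w_sq = sum(wk * wk for wk in w)
via_kernel = quadratic_form(K, a)
expanded = 2 * sum(ai * ai for ai in a) - 4 * a[0] * a[2] - 4 * a[1] * a[3]

for row in K:
    print(row)
print(norm_w_sq, via_kernel, expanded)
```

All three printed numbers should coincide up to rounding, matching the chain $\|f\|^2 = |\mathbf{w}|^2 = \mathbf{a}^T K \mathbf{a}$.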

Formulate

$$L(f) = L(\mathbf{a}, b, \boldsymbol{\xi}) = \frac{1}{4}\sum_{j=1}^{4}\xi_j + \frac{1}{2}\,\mathbf{a}^T K\mathbf{a} = \frac{1}{4}\sum_{j=1}^{4}\xi_j + \sum_{i=1}^{4} a_i^2 - 2a_1a_3 - 2a_2a_4$$

subject to (Eq. 4a):

$$\xi_j \ge 1 - y_j\bigl[(K\mathbf{a})_j + b\bigr], \qquad \xi_j \ge 0.$$

Lagrange multipliers $\boldsymbol{\alpha} = (\alpha_1,\ldots,\alpha_n)^T$, $\boldsymbol{\mu} = (\mu_1,\ldots,\mu_n)^T$ (see (4b)):

optimize

$$L(\mathbf{a}, b, \boldsymbol{\xi}, \boldsymbol{\alpha}, \boldsymbol{\mu}) = \frac{1}{4}\sum_{j=1}^{4}\xi_j + \frac{1}{2}\,\mathbf{a}^T K\mathbf{a} - \sum_{j=1}^{4}\alpha_j\Bigl[\bigl((K\mathbf{a})_j + b\bigr)y_j - 1 + \xi_j\Bigr] - \sum_{j=1}^{4}\mu_j\xi_j$$

$$= \frac{1}{4}\sum_{j=1}^{4}\xi_j + \frac{1}{2}\,\mathbf{a}^T K\mathbf{a} - \mathbf{a}^T K Y\boldsymbol{\alpha} - b\,\mathbf{y}\cdot\boldsymbol{\alpha} + \sum_{j=1}^{4}\alpha_j - \boldsymbol{\alpha}\cdot\boldsymbol{\xi} - \boldsymbol{\mu}\cdot\boldsymbol{\xi} \qquad (10)$$

with constraints

$$\alpha_i,\ \mu_i \ge 0.$$

Solution has (see (7) above)

$$\boldsymbol{\alpha} = 2\lambda\,Y^{-1}\mathbf{a}$$

(recall

$$Y = \begin{pmatrix} y_1 & 0 & \cdots & 0\\ 0 & y_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & y_n \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix})$$

and (7a above)

$$\bar{\boldsymbol{\alpha}} = \frac{1}{2\lambda}\boldsymbol{\alpha} = \boldsymbol{\alpha}.$$

Finally optimize (8):

$$L_1 = \sum_{i=1}^{4}\bar{\alpha}_i - \frac{1}{2}\,\bar{\boldsymbol{\alpha}}^T Q\,\bar{\boldsymbol{\alpha}},$$

where

$$Q = YKY^T = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 2 & 0 & -2 & 0\\ 0 & 2 & 0 & -2\\ -2 & 0 & 2 & 0\\ 0 & -2 & 0 & 2 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}$$

$$= \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 2 & 0 & 2 & 0\\ 0 & -2 & 0 & -2\\ -2 & 0 & -2 & 0\\ 0 & 2 & 0 & 2 \end{pmatrix} = \begin{pmatrix} 2 & 0 & 2 & 0\\ 0 & 2 & 0 & 2\\ 2 & 0 & 2 & 0\\ 0 & 2 & 0 & 2 \end{pmatrix}.$$

The constraint is

$$0 \le \bar{\alpha}_i \le C \equiv \frac{1}{2\lambda n} = \frac{1}{4}. \qquad (10a)$$

Thus optimize

$$L_1 = \sum_{i=1}^{4}\bar{\alpha}_i - \sum_{i=1}^{4}\bar{\alpha}_i^2 - 2\bar{\alpha}_1\bar{\alpha}_3 - 2\bar{\alpha}_2\bar{\alpha}_4$$

$$= \sum_{i=1}^{4}\bar{\alpha}_i - (\bar{\alpha}_1 + \bar{\alpha}_3)^2 - (\bar{\alpha}_2 + \bar{\alpha}_4)^2$$

$$= u + v - u^2 - v^2,$$

where

$$u = \bar{\alpha}_1 + \bar{\alpha}_3; \qquad v = \bar{\alpha}_2 + \bar{\alpha}_4.$$

Setting the derivatives to zero (since $L_1$ is concave in $u$ and $v$, the stationary point is its maximum):

$$1 - 2u = 0; \quad 1 - 2v = 0 \;\Rightarrow\; u = v = \frac{1}{2}.$$

Thus we have

$$\bar{\alpha}_i = \frac{1}{4}$$

for all $i$ (recall the constraint (10a): since $\bar{\alpha}_1 + \bar{\alpha}_3 = \frac{1}{2}$ with each term at most $\frac{1}{4}$, both must equal $\frac{1}{4}$, and similarly for $\bar{\alpha}_2, \bar{\alpha}_4$). Then

$$\boldsymbol{\alpha} = 2\lambda\,\bar{\boldsymbol{\alpha}} = \bar{\boldsymbol{\alpha}} = \begin{pmatrix} 1/4\\ 1/4\\ 1/4\\ 1/4 \end{pmatrix}.$$

Thus

$$\mathbf{a} = \frac{Y\boldsymbol{\alpha}}{2\lambda} = \begin{pmatrix} 1/4\\ -1/4\\ -1/4\\ 1/4 \end{pmatrix}.$$

Thus

$$\mathbf{w} = \sum_{i} a_i\mathbf{x}_i = \frac{1}{4}(\mathbf{x}_1 - \mathbf{x}_2 - \mathbf{x}_3 + \mathbf{x}_4) = \frac{1}{4}(4, 0) = (1, 0).$$

Margin $= \dfrac{1}{|\mathbf{w}|} = 1$.
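The hand computation for the toy problem can be replayed numerically. The sketch below assumes the point ordering $(1,1), (-1,1), (-1,-1), (1,-1)$ with labels $(1,-1,-1,1)$ and $\lambda = 1/2$; it checks that $\bar{\alpha}_i = 1/4$ is feasible and makes the gradient of the dual objective vanish, then recovers $\mathbf{a}$, $\mathbf{w}$ and the margin. Variable names are illustrative, and no QP solver is involved.

```python
import math

# Toy data (ordering and labels as assumed in the lead-in).
X = [(1, 1), (-1, 1), (-1, -1), (1, -1)]
y = [1, -1, -1, 1]
lam = 0.5
n = len(y)
C = 1.0 / (2.0 * lam * n)  # box bound C = 1 / (2 lam n) = 1/4

# Linear kernel matrix and Q = Y K Y^T.
K = [[sum(p * q for p, q in zip(xi, xj)) for xj in X] for xi in X]
Q = [[y[i] * K[i][j] * y[j] for j in range(n)] for i in range(n)]

alpha_bar = [0.25] * n

# Gradient of sum(alpha_bar) - 0.5 * alpha_bar^T Q alpha_bar is 1 - Q alpha_bar.
grad = [1.0 - sum(Q[i][j] * alpha_bar[j] for j in range(n)) for i in range(n)]

in_box = all(0.0 <= ab <= C for ab in alpha_bar)
balance = sum(ab * yi for ab, yi in zip(alpha_bar, y))  # alpha_bar . y

# Back-substitution: alpha = 2 lam alpha_bar, a = Y alpha / (2 lam), w = sum_i a_i x_i.
alpha = [2.0 * lam * ab for ab in alpha_bar]
a = [yi * al / (2.0 * lam) for yi, al in zip(y, alpha)]
w = [sum(a[i] * X[i][k] for i in range(n)) for k in range(2)]
margin = 1.0 / math.hypot(w[0], w[1])

print(grad, in_box, balance)
print(a, w, margin)
```

Note that the stationarity condition alone only fixes $u$ and $v$; it is the box constraint that pins every $\bar{\alpha}_i$ to $1/4$, which the `in_box` check reflects.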

Now we find $b$ separately from the original equation (*); we minimize with respect to $b$ the original functional,

$$L(f) = \frac{1}{4}\sum_{j}\bigl[1 - (\mathbf{w}\cdot\mathbf{x}_j + b)\,y_j\bigr]_+ + \frac{1}{2}|\mathbf{w}|^2$$

$$= \frac{1}{4}\Bigl\{\bigl[1 - (1 + b)(1)\bigr]_+ + \bigl[1 - (-1 + b)(-1)\bigr]_+ + \bigl[1 - (-1 + b)(-1)\bigr]_+ + \bigl[1 - (1 + b)(1)\bigr]_+\Bigr\} + \frac{1}{2}$$

$$= \frac{1}{4}\Bigl\{[-b]_+ + [b]_+ + [b]_+ + [-b]_+\Bigr\} + \frac{1}{2}$$

$$= \frac{1}{2}\Bigl\{[b]_+ + [-b]_+\Bigr\} + \frac{1}{2}.$$

Clearly the above is minimized when $b = 0$.

Thus $\mathbf{w} = (1, 0)$; $b = 0$ $\Rightarrow$

$$f(\mathbf{x}) = \mathbf{w}\cdot\mathbf{x} + b = x_1.$$
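The reduction of the functional to $\frac{1}{2}\{[b]_+ + [-b]_+\} + \frac{1}{2} = \frac{|b|}{2} + \frac{1}{2}$ can be checked by evaluating (*) directly over a range of $b$. A short sketch, again assuming the toy points $(1,1), (-1,1), (-1,-1), (1,-1)$ with labels $(1,-1,-1,1)$, $\mathbf{w} = (1,0)$ and $\lambda = 1/2$; the function name is hypothetical.

```python
X = [(1, 1), (-1, 1), (-1, -1), (1, -1)]
y = [1, -1, -1, 1]


def toy_objective(b, w=(1.0, 0.0), lam=0.5):
    """(1/4) * sum_j [1 - (w . x_j + b) y_j]_+ + lam * |w|^2 for the toy data."""
    hinge = 0.0
    for xj, yj in zip(X, y):
        score = w[0] * xj[0] + w[1] * xj[1] + b
        hinge += max(0.0, 1.0 - score * yj)
    return hinge / len(X) + lam * (w[0] ** 2 + w[1] ** 2)


# Compare with the closed form |b|/2 + 1/2 on a grid of offsets.
grid = [t / 10.0 for t in range(-20, 21)]
max_gap = max(abs(toy_objective(b) - (abs(b) / 2.0 + 0.5)) for b in grid)
best_b = min(grid, key=toy_objective)

# With b = 0 the classifier is f(x) = x_1; check it reproduces the labels.
predictions = [1 if xj[0] > 0 else -1 for xj in X]
print(max_gap, best_b, predictions == y)
```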

4. Other choices of kernel

Recall in SVM we have used the kernel

$$K(\mathbf{x},\mathbf{y}) = \mathbf{x}\cdot\mathbf{y}.$$

There are many other choices of kernel, e.g.,

$$K(\mathbf{x},\mathbf{y}) = e^{-|\mathbf{x}-\mathbf{y}|} \quad\text{or}\quad K(\mathbf{x},\mathbf{y}) = (1 + |\mathbf{x}\cdot\mathbf{y}|)^n.$$

Note: we must choose a kernel function which is positive definite. How do these choices change the discrimination function $f(\mathbf{x})$ in SVM?

Ex 1: Gaussian kernel

$$K_\sigma(\mathbf{x},\mathbf{y}) = e^{-\frac{|\mathbf{x}-\mathbf{y}|^2}{2\sigma^2}}$$

[one can show this is a positive definite Mercer kernel]

SVM: from (4) above we have

$$f(\mathbf{x}) = \sum_{j} a_j K(\mathbf{x},\mathbf{x}_j) + b = \sum_{j} a_j\, e^{-\frac{|\mathbf{x}-\mathbf{x}_j|^2}{2\sigma^2}} + b,$$

where the examples $\mathbf{x}_j$ in $F$ have known classifications $y_j$, and $a_j, b$ are obtained by quadratic programming.
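To make the role of $\sigma$ concrete, the following sketch evaluates the Gaussian-kernel expansion for fixed coefficients. The two centers, the coefficients `a` and the offset `b` are made-up values standing in for the output of the quadratic program; the function names are hypothetical.

```python
import math


def gaussian_kernel(x, z, sigma):
    """K_sigma(x, z) = exp(-|x - z|^2 / (2 sigma^2))."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def rbf_decision(x, centers, a, b, sigma):
    """f(x) = sum_j a_j * K_sigma(x, x_j) + b."""
    return sum(aj * gaussian_kernel(x, xj, sigma) for aj, xj in zip(a, centers)) + b


# Hypothetical expansion: one positive and one negative example.
centers = [(0.0, 0.0), (2.0, 0.0)]
a = [1.0, -1.0]
b = 0.0

for sigma in (0.5, 1.0, 5.0):
    values = [rbf_decision(p, centers, a, b, sigma) for p in [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]]
    print(sigma, values)
```

For small `sigma` each training point only influences its immediate neighbourhood, while for large `sigma` the bumps overlap and $f$ varies slowly between the two centers.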

What kind of classifier is this? It depends on $\sigma$ (see Vert movie).

Note Movie1 varies $\sigma$ in the Gaussian ($\sigma = \infty$ corresponds to a linear SVM); then Movie2 varies the margin $\frac{1}{|\mathbf{w}|}$ (in linear feature space $F_2$) as determined by changing $\lambda$ or, equivalently, $C = \frac{1}{2\lambda n}$.

5. Software available

Software which implements the quadratic programming algorithm above includes:

• SVMLight: http://svmlight.joachims.org
• SVMTorch: http://www.idiap.ch/learning/SVMTorch.html
• LIBSVM: http://wws.csie.ntu.edu.tw/~cjlin/libsvm

A Matlab package which implements most of these is Spider:

http://www.kyb.mpg.de/bs/people/spider/whatisit.html

6. Example application: handwritten digit recognition - USPS (Scholkopf, Burges, Vapnik)

Handwritten digits: [figure]

Training set: 7300; Test set: 2000

10-class classifier; the $i$th class has a separating SVM function

$$f_i(\mathbf{x}) = \mathbf{w}_i\cdot\mathbf{x} + b_i.$$

The chosen class is

$$\text{Class} = \operatorname*{argmax}_{i\in\{0,\ldots,9\}} f_i(\mathbf{x}).$$
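The argmax rule is straightforward to express in code. A minimal sketch with three classes instead of ten; the weight vectors and offsets are invented for illustration and would in practice come from training one SVM per class.

```python
def predict_class(x, weights, offsets):
    """Return the index i maximizing f_i(x) = w_i . x + b_i."""
    scores = [sum(wk * xk for wk, xk in zip(w, x)) + b for w, b in zip(weights, offsets)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical per-class separators (not from the USPS experiment).
weights = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
offsets = [0.0, 0.0, 0.5]

for point in [(2.0, 1.0), (0.0, 3.0), (-1.0, -1.0)]:
    print(point, predict_class(point, weights, offsets))
```

If two classes tie for the largest score, `max` returns the lowest index; the notes do not specify a tie-breaking rule, so that behaviour is just a consequence of this sketch.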

$$\Phi: g \to \Phi(g) = \mathbf{x}\in F \qquad (\text{digit } g \mapsto \text{feature vector } \mathbf{x})$$

Kernels in feature space $F$:

RBF: $K(\mathbf{x}_i,\mathbf{x}_j) = e^{-\frac{|\mathbf{x}_i-\mathbf{x}_j|^2}{2\sigma^2}}$

Polynomial: $K = (\mathbf{x}_i\cdot\mathbf{x}_j + \theta)^d$

Sigmoidal: $K = \tanh\bigl(\kappa\,(\mathbf{x}_i\cdot\mathbf{x}_j) + \theta\bigr)$
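The polynomial and sigmoidal kernels can be written in a few lines each. The sketch below uses the parameter names `d`, `theta` and `kappa` for the exponent, shift and scale; the sample inputs are arbitrary. It also illustrates a known caveat relevant to the positive-definiteness requirement: for some parameter choices the sigmoidal kernel gives $K(\mathbf{x},\mathbf{x}) < 0$, which a positive definite kernel cannot do.

```python
import math


def dot(u, v):
    """Euclidean inner product u . v."""
    return sum(ui * vi for ui, vi in zip(u, v))


def polynomial_kernel(u, v, d, theta=0.0):
    """K(u, v) = (u . v + theta)^d."""
    return (dot(u, v) + theta) ** d


def sigmoid_kernel(u, v, kappa, theta):
    """K(u, v) = tanh(kappa * (u . v) + theta)."""
    return math.tanh(kappa * dot(u, v) + theta)


u, v = (1.0, 2.0), (3.0, 4.0)
print(polynomial_kernel(u, v, d=2, theta=1.0))   # (11 + 1)^2
print(sigmoid_kernel(u, v, kappa=0.1, theta=0.0))

# With theta < 0 the sigmoidal "kernel" is negative on the diagonal at the origin,
# so it is not positive definite for this parameter choice.
origin = (0.0, 0.0)
print(sigmoid_kernel(origin, origin, kappa=1.0, theta=-1.0))
```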

Results: [figure]

Computational Biology Applications

References:

T. Golub et al., Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring. Science 1999.

S. Ramaswamy et al., Multiclass Cancer Diagnosis Using Tumor Gene Expression Signatures. PNAS 2001.

7. Microarrays for cancer classification

Goal: infer cancer genetics by looking at microarray data.

A microarray reveals expression patterns and can hopefully be used to discriminate similar cancers, and thus lead to better treatments.

Usual problem: small sample size (e.g., 50 cancer tissue samples), high dimensionality (e.g., 20,000-30,000). Curse of dimensionality.

Example 1: Myeloid vs. Lymphoblastic leukemias

ALL: acute lymphoblastic leukemia
AML: acute myeloblastic leukemia

SVM training: leave one out cross-validation
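Leave-one-out cross-validation trains on all samples but one, tests on the held-out sample, and averages the errors over all choices of held-out sample. The sketch below shows the loop with a pluggable `train`/`predict` pair. To stay self-contained it uses a nearest-centroid classifier as a stand-in, which is NOT the SVM used in the study; the data are synthetic, and all names are hypothetical.

```python
def leave_one_out_error(X, y, train, predict):
    """Fraction of samples misclassified when each is held out in turn."""
    errors = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = train(X_train, y_train)
        if predict(model, X[i]) != y[i]:
            errors += 1
    return errors / len(X)


def train_centroids(X, y):
    """Stand-in learner: store the mean vector of each class."""
    centroids = {}
    for label in set(y):
        members = [x for x, yi in zip(X, y) if yi == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict_centroid(model, x):
    """Assign x to the class with the nearest centroid (squared Euclidean distance)."""
    return min(model, key=lambda label: sum((xi - ci) ** 2 for xi, ci in zip(x, model[label])))


# Synthetic, well-separated two-class data (illustrative only).
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
y = [-1, -1, -1, 1, 1, 1]

loo_error = leave_one_out_error(X, y, train_centroids, predict_centroid)
print(loo_error)
```

Swapping in an SVM would only require replacing the `train` and `predict` callables; the loop itself is independent of the learner.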

[Figure (S. Mukherjee)] fig. 1: Myeloid and Lymphoblastic Leukemia classification by SV

[Figure (S. Mukherjee)] fig. 2: AML vs. ALL error rates with increasing sample size

In the above figure the curves represent error rates with a split between training and test sets. The red dot represents leave-one-out cross-validation.