Post on 05-Jan-2017
Exercises in Machine Learning: Playing with Kernels
Zdenek Zabokrtsky, Ondrej Bojar
Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics
Charles University, Prague
Tue Apr 19, 2016
1 / 78
Outline

- Regularization parameter C in SVM.
- Linear kernel: k(x, y) = x · y
- Polynomial kernel: k(x, y) = (γ x · y + coef0)^degree
- RBF kernel: k(x, y) = exp(−γ‖x − y‖²), γ > 0
  . . . including their parameters
- Cross-validation heatmap
- Multi-class SVM
  - For the PAMAP-easy dataset.
  - Regularization parameters.
  - Inseparable classes.

Based on http://scikit-learn.org/stable/modules/svm.html and other scikit-learn demos.
Regularization (C) in linear SVM
k(x, y) = x · y
(Linear kernel = no kernel)
The parameter C in (linear) SVM:
- sets the weight of the sum of slack variables,
- serves as a regularization parameter,
- controls the number of support vectors.

  C    | Penalty for errors | Number of points considered | Margin | Bias | Variance
  Low  | Low                | Many                        | Wide   | High | Low
  High | High               | Few                         | Narrow | Low  | High
Think C for VarianCe.
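The effect of C on the number of support vectors can be seen directly; a minimal sketch on synthetic two-class data (the dataset and the C values are illustrative, not from the slides):

```python
# A minimal sketch (toy data, illustrative values): lower C widens the
# margin, so more training points end up counted as support vectors.
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, cluster_std=2.0, random_state=0)

counts = {C: int(SVC(kernel="linear", C=C).fit(X, y).n_support_.sum())
          for C in (0.1, 1.0, 100.0)}
print(counts)  # support-vector counts shrink as C grows
```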
[Plots: linear-kernel SVM decision boundaries for C = 0.1, 0.2, 0.5, 1, 5, 10, 20, 50, 100]
Polynomial Kernel
k(x, y) = (γ x · y + coef0)^degree
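The formula can be checked against scikit-learn's own implementation; a small sketch with made-up input vectors and parameter values:

```python
# Sketch: sklearn's polynomial_kernel computes (gamma * x.y + coef0) ** degree.
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

x = np.array([[1.0, 2.0]])
y = np.array([[3.0, 0.5]])
gamma, coef0, degree = 0.5, 1.0, 3

k_lib = polynomial_kernel(x, y, degree=degree, gamma=gamma, coef0=coef0)[0, 0]
k_manual = (gamma * np.dot(x[0], y[0]) + coef0) ** degree
print(k_lib, k_manual)  # both 27.0: (0.5 * 4 + 1) ** 3
```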
[Plots: polynomial-kernel SVM for degree = 1 through 9]
[Plots: polynomial-kernel SVM, degree 3, gamma = 0.05, 0.1, 0.2, 0.5, 0.7, 1, 2]
[Plots: polynomial-kernel SVM, degree 3, gamma 0.5, coef0 = −2, −1, −0.5, 0, 0.5, 1, 2]
RBF Kernels
k(x, y) = exp(−γ‖x − y‖²), γ > 0
[Plot: Gaussian bells exp(−γx²) for γ = 0.5, 1, 2, 3]
- Each training point creates its bell.
- Overall shape is the sum of the bells.
- Kind of "all nearest neighbours".
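The bell shape can again be verified against scikit-learn; a short sketch with illustrative vectors:

```python
# Sketch: the RBF kernel is exp(-gamma * ||x - y||^2), a Gaussian bell
# centred on each training point; checked against sklearn's rbf_kernel.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

x = np.array([[0.0, 0.0]])
y = np.array([[1.0, 2.0]])
gamma = 0.5

k_lib = rbf_kernel(x, y, gamma=gamma)[0, 0]
k_manual = np.exp(-gamma * np.sum((x - y) ** 2))
print(k_lib, k_manual)  # both exp(-0.5 * 5) = exp(-2.5)
```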
RBF Kernel Parameters
  C    | Decision surface | Model   | Bias | Variance
  Low  | Smooth           | Simple  | High | Low
  High | Peaked           | Complex | Low  | High

  gamma | Affected points
  Low   | can be far from the training examples
  High  | must be close to the training examples

- Does higher gamma lead to higher variance?
- The choice is critical for SVM performance.
- Advised to use GridSearchCV for C and gamma:
  - exponentially spaced probes
  - wide range
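The advised search looks roughly like this — a sketch on synthetic data, with illustrative (and in practice wider) logarithmic ranges:

```python
# Sketch: exponentially spaced probes over a wide range of C and gamma,
# searched with GridSearchCV (synthetic data; ranges are illustrative).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
param_grid = {"C": np.logspace(-2, 2, 5), "gamma": np.logspace(-3, 1, 5)}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```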
[Plots: RBF-kernel SVM: C sweep (0.05, 0.1, 0.2, 0.5, 0.6, 0.7, 1, 2) at gamma = 2, then gamma sweep (10, 5, 2, 1, 0.7, 0.5, 0.2, 0.1, 0.05) at C = 0.5]
Cross-validation Heatmap
http://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html
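The data behind such a heatmap is just the grid of cross-validation scores; a sketch on synthetic data (plotting it is then a single matplotlib `imshow` call):

```python
# Sketch: mean cross-validation scores reshaped into a C x gamma grid,
# which is exactly what the heatmap displays (synthetic data here).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
Cs = np.logspace(-2, 2, 5)
gammas = np.logspace(-3, 1, 5)
search = GridSearchCV(SVC(kernel="rbf"), {"C": Cs, "gamma": gammas}, cv=3)
search.fit(X, y)
# The grid iterates parameter names in sorted order, C outermost, so the
# flat score array reshapes cleanly into rows of C and columns of gamma.
scores = search.cv_results_["mean_test_score"].reshape(len(Cs), len(gammas))
print(scores.shape)  # (5, 5)
```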
Multi-class SVM
Two implementations in scikit-learn:
- SVC: one-against-one
  - n(n − 1)/2 classifiers constructed
  - supports various kernels, incl. custom ones
- LinearSVC: one-vs-the-rest
  - n classifiers trained
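The difference is visible in the width of the decision function; a sketch on a made-up 4-class problem:

```python
# Sketch: with n = 4 classes, SVC's "ovo" decision function has
# n(n-1)/2 = 6 columns (pairwise classifiers), while LinearSVC's
# one-vs-the-rest decision function has n = 4 columns.
from sklearn.datasets import make_blobs
from sklearn.svm import SVC, LinearSVC

X, y = make_blobs(n_samples=200, centers=4, random_state=0)

ovo = SVC(kernel="linear", decision_function_shape="ovo").fit(X, y)
ovr = LinearSVC(max_iter=10000).fit(X, y)

print(ovo.decision_function(X).shape[1])  # 6 pairwise votes
print(ovr.decision_function(X).shape[1])  # 4 one-vs-rest scores
```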
PAMAP-easy Training Data
[Scatter plot: PAMAP-easy training data, one colour per activity class]
[Plots: multi-class SVM on PAMAP-easy, default view subsampled every 200, 300, 400 points]
[Plots: multi-class SVM on PAMAP-easy with regularization C = 0.5, 1, 5, 10, 20, 50, 500, 5000]
[Plots: inseparable classes 12 and 13, subsampled every 200, 100, 80, 60, 55 points]
Task and Homework #07

- Required minimum: for PAMAP-easy as divided into train+test:
  - Cross-validate to choose between the linear, poly and RBF kernels.
  - Create the heatmap for RBF (i.e. plot the score for all values of C and gamma).
  - Use GridSearchCV to find the best C and gamma (i.e. find the best without plotting anything).
  - Enter the accuracy of the best setting into classification results.txt; mention C and gamma in the comment.
- Optional:
  - Use GridSearchCV on all our datasets.
  - Enter the accuracies of the best settings into classification results.txt; mention C and gamma in the comment.

Due: 2 weeks from now, i.e. May 3.