Exercises in Machine Learning: Playing with Kernels
Zdeněk Žabokrtský, Ondřej Bojar
Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics
Charles University, Prague
Tue Apr 19, 2016
Outline
- Regularization parameter C in SVM.
- Linear Kernel: k(x, y) = x · y
- Polynomial Kernel: k(x, y) = (γ · x · y + coef0)^degree
- RBF Kernel: k(x, y) = exp(−γ‖x − y‖²), γ > 0
  ... including their parameters
- Cross-validation Heatmap
- Multi-class SVM
  - For the PAMAP-easy dataset.
  - Regularization parameters.
  - Inseparable classes.

Based on http://scikit-learn.org/stable/modules/svm.html and other scikit-learn demos.
Regularization (C) in linear SVM
k(x, y) = x · y
(Linear kernel = no kernel)
The parameter C in (linear) SVM:
- sets the weight of the sum of slack variables.
- serves as a regularization parameter.
- controls the number of support vectors.

C      Penalty for Errors   Points considered   Margin   Bias   Variance
Low    Low                  Many                Wide     High   Low
High   High                 Few                 Narrow   Low    High

Think C for VarianCe.
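The effect of C on the number of support vectors can be checked directly with scikit-learn's SVC; the overlapping two-class data below is hypothetical, chosen only for illustration:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical overlapping two-class data (illustration only).
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(40, 2) - 1, rng.randn(40, 2) + 1])
y = np.array([0] * 40 + [1] * 40)

# Lower C -> wider margin -> more points end up as support vectors.
n_sv = {}
for C in (0.1, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    n_sv[C] = int(clf.n_support_.sum())
    print(f"C={C}: {n_sv[C]} support vectors")
```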
[Plots: linear-SVM decision boundaries for C = 0.1, 0.2, 0.5, 1, 5, 10, 20, 50, 100]
Polynomial Kernel
k(x, y) = (γ · x · y + coef0)^degree
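The three parameters of the formula map directly onto SVC's degree, gamma and coef0 arguments. A minimal sketch on hypothetical XOR-style data, which no linear kernel can separate but a degree-2 polynomial can:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical XOR-style data: the label is the sign of x1 * x2.
rng = np.random.RandomState(1)
X = rng.randn(200, 2)
y = (X[:, 0] * X[:, 1] > 0).astype(int)

# (gamma * x . y + coef0) ** degree -- the degree-2 expansion contains
# the cross-term x1 * x2, which is exactly what separates these classes.
clf = SVC(kernel="poly", degree=2, gamma=0.5, coef0=1.0).fit(X, y)
print(f"training accuracy: {clf.score(X, y):.2f}")
```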
[Plots: polynomial-kernel decision boundaries for degree = 1, 2, ..., 9]
[Plots: degree-3 polynomial kernel with gamma = 0.05, 0.1, 0.2, 0.5, 0.7, 1, 2]
[Plots: degree-3 polynomial kernel, gamma = 0.5, with coef0 = −2, −1, −0.5, 0, 0.5, 1, 2]
RBF Kernels
k(x, y) = exp(−γ‖x − y‖²), γ > 0
[Plot: the RBF “bell” exp(−γx²) for γ = 0.5, 1, 2, 3 over x ∈ [−10, 10]]
- Each training point creates its bell.
- The overall shape is the sum of the bells.
- Kind of “all nearest neighbours”.
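The “sum of bells” picture takes only a few lines of numpy to reproduce; the centres below are hypothetical 1-D training points:

```python
import numpy as np

# Each training point x_i contributes a bell exp(-gamma * (x - x_i)^2);
# the overall shape is the sum of these bells.
def rbf_bell(x, center, gamma):
    return np.exp(-gamma * (x - center) ** 2)

xs = np.linspace(-10, 10, 201)
centers = [-3.0, 0.0, 4.0]   # hypothetical training points
gamma = 0.5
shape = sum(rbf_bell(xs, c, gamma) for c in centers)
print(f"peak of the summed shape: {shape.max():.3f}")
```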
RBF Kernel Parameters
C      Decision Surface   Model     Bias   Variance
Low    Smooth             Simple    High   Low
High   Peaked             Complex   Low    High

gamma   Affected Points
Low     can be far from training examples
High    must be close to training examples

- Does higher gamma lead to higher variance?
- The choice is critical for SVM performance.
- Advised to use GridSearchCV for C and gamma:
  - exponentially spaced probes
  - wide range
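The advice above translates into a GridSearchCV over logarithmically (exponentially) spaced values of C and gamma; the dataset here is a synthetic stand-in for the real exercise data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in data (the exercise itself uses PAMAP-easy).
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

# Exponentially spaced probes over a wide range, as advised above.
param_grid = {"C": np.logspace(-2, 3, 6), "gamma": np.logspace(-3, 2, 6)}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```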
[Plots: RBF decision boundaries, gamma = 2, with C swept up from 0.05 to 2 and back down to 0.5]
[Plots: RBF decision boundaries, C = 0.5, with gamma swept from 5 up to 10 and down to 0.05]
Cross-validation Heatmap
http://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html
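The linked example plots mean cross-validation scores over a C × gamma grid. A stripped-down sketch of the same idea on synthetic data (the actual plotting call is left as a comment so the snippet stays self-contained):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=150, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

Cs = np.logspace(-2, 2, 5)
gammas = np.logspace(-2, 2, 5)
search = GridSearchCV(SVC(kernel="rbf"), {"C": Cs, "gamma": gammas}, cv=3)
search.fit(X, y)

# cv_results_ is flat; parameters iterate in sorted key order ("C" outer,
# "gamma" inner), so the scores reshape into a (C, gamma) grid.
# matplotlib.pyplot.imshow(scores) would then render the heatmap.
scores = search.cv_results_["mean_test_score"].reshape(len(Cs), len(gammas))
print(scores.round(2))
```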
Multi-class SVM
Two implementations in scikit-learn:
- SVC: one-against-one
  - n(n − 1)/2 classifiers constructed
  - supports various kernels, incl. custom ones
- LinearSVC: one-vs-the-rest
  - n classifiers trained
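Both strategies look the same from the caller's side; a small sketch on the classic 3-class iris data, where SVC trains 3·2/2 = 3 underlying machines and LinearSVC trains 3:

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC, LinearSVC

X, y = load_iris(return_X_y=True)           # 3 classes

ovo = SVC(kernel="linear").fit(X, y)        # one-against-one: n(n-1)/2 classifiers
ovr = LinearSVC(max_iter=10000).fit(X, y)   # one-vs-the-rest: n classifiers
print(ovo.score(X, y), ovr.score(X, y))
```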
PAMAP-easy Training Data
[Scatter plot: PAMAP-easy training data, one colour per activity class]
[Plots: PAMAP-easy default view, sampling every 200th, 300th and 400th point]
[Plots: PAMAP-easy decision boundaries for C = 0.5, 1, 5, 10, 20, 50, 500, 5000]
[Plots: inseparable classes 12 and 13, sampling every 200th, 100th, 80th, 60th and 55th point]
Task and Homework #07

Required minimum: For PAMAP-easy as divided into train+test:
- Cross-validate to choose between linear, poly and RBF.
- Create the heatmap for RBF (i.e. plot the score for all values of C and gamma).
- Use GridSearchCV to find the best C and gamma (i.e. find the best without plotting anything).
- Enter the accuracy of the best setting into classification results.txt; mention C and gamma in the comment.

Optional:
- Use GridSearchCV on all our datasets.
- Enter the accuracies of the best settings into classification results.txt; mention C and gamma in the comment.

Due: 2 weeks from now, i.e. May 3.