

• Exercises in Machine Learning: Playing with Kernels

Zdenek Zabokrtsky, Ondrej Bojar

Institute of Formal and Applied Linguistics
Faculty of Mathematics and Physics

Charles University, Prague

Tue Apr 19, 2016

1 / 78

• Outline

- Regularization parameter C in SVM.
- Linear kernel: k(x, y) = xᵀy
- Polynomial kernel: k(x, y) = (γ xᵀy + coef0)^degree
- RBF kernel: k(x, y) = exp(−γ ‖x − y‖²); γ > 0

... including their parameters

- Cross-validation heatmap
- Multi-class SVM
  - For the PAMAP-easy dataset.
  - Regularization parameters.
  - Inseparable classes.

Based on http://scikit-learn.org/stable/modules/svm.html and other scikit-learn demos.

2 / 78


• Regularization (C) in linear SVM

k(x, y) = xᵀy

(Linear kernel = no kernel)

The parameter C in (linear) SVM:
- sets the weight of the sum of slack variables,
- serves as a regularization parameter,
- controls the number of support vectors.

C      Penalty for errors   Points considered   Margin   Bias   Variance
Low    Low                  Many                Wide     High   Low
High   High                 Few                 Narrow   Low    High

Think C for VarianCe.
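The table can be checked empirically. A minimal sketch, assuming a synthetic two-blob dataset (my own stand-in via make_blobs, not the slides' data): lower C tends to widen the margin and keep more support vectors.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping blobs; an illustrative stand-in for the slides' data.
X, y = make_blobs(n_samples=100, centers=2, cluster_std=2.0, random_state=0)

n_sv = {}
for C in (0.1, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    n_sv[C] = int(clf.n_support_.sum())  # support vectors across both classes

# Lower C tends to keep more points inside the (wider) margin,
# hence more support vectors.
print(n_sv)
```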

3 / 78

• SVM Linear C=0.1

4 / 78

• SVM Linear C=0.2

5 / 78

• SVM Linear C=0.5

6 / 78

• SVM Linear C=1

7 / 78

• SVM Linear C=5

8 / 78

• SVM Linear C=10

9 / 78

• SVM Linear C=20

10 / 78

• SVM Linear C=50

11 / 78

• SVM Linear C=100

12 / 78

• Polynomial Kernel

k(x, y) = (γ xᵀy + coef0)^degree
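This is scikit-learn's parametrisation of the polynomial kernel; a tiny hand computation (input vectors are my own example) confirms the formula against the library's implementation.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

def poly_kernel(x, y, gamma=0.5, coef0=1.0, degree=3):
    """Polynomial kernel in scikit-learn's parametrisation."""
    return (gamma * np.dot(x, y) + coef0) ** degree

x = np.array([1.0, 2.0])
y = np.array([0.5, -1.0])

# x.y = -1.5;  0.5 * (-1.5) + 1 = 0.25;  0.25**3 = 0.015625
print(poly_kernel(x, y))                                        # 0.015625
print(polynomial_kernel([x], [y], degree=3, gamma=0.5, coef0=1.0))  # same value
```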

13 / 78

• SVM Poly (degree 1)

14 / 78

• SVM Poly (degree 2)

15 / 78

• SVM Poly (degree 3)

16 / 78

• SVM Poly (degree 4)

17 / 78

• SVM Poly (degree 5)

18 / 78

• SVM Poly (degree 6)

19 / 78

• SVM Poly (degree 7)

20 / 78

• SVM Poly (degree 8)

21 / 78

• SVM Poly (degree 9)

22 / 78

• SVM Poly (degree 3, gamma 0.05)

23 / 78

• SVM Poly (degree 3, gamma 0.1)

24 / 78

• SVM Poly (degree 3, gamma 0.2)

25 / 78

• SVM Poly (degree 3, gamma 0.5)

26 / 78

• SVM Poly (degree 3, gamma 0.7)

27 / 78

• SVM Poly (degree 3, gamma 1)

28 / 78

• SVM Poly (degree 3, gamma 2)

29 / 78

• SVM Poly (d=3, g=0.5, coef=-2.0)

30 / 78

• SVM Poly (d=3, g=0.5, coef=-1.0)

31 / 78

• SVM Poly (d=3, g=0.5, coef=-0.50)

32 / 78

• SVM Poly (d=3, g=0.5, coef=0)

33 / 78

• SVM Poly (d=3, g=0.5, coef=0.5)

34 / 78

• SVM Poly (d=3, g=0.5, coef=1)

35 / 78

• SVM Poly (d=3, g=0.5, coef=2)

36 / 78

• RBF Kernels

k(x, y) = exp(−γ ‖x − y‖²); γ > 0

[Plot: Gaussian bells exp(−γx²) for γ = 0.5, 1, 2 and 3, over x ∈ [−10, 10].]

- Each training point creates its own bell.
- The overall shape is the sum of the bells.
- In effect, a kind of all-nearest-neighbours model.
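The "sum of bells" view can be verified directly: a fitted RBF-SVM's decision function is a weighted sum of one Gaussian bell per support vector, plus an intercept. A sketch on synthetic data (the dataset and γ are my own choices):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)  # inside/outside unit circle

gamma = 2.0
clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, y)

def sum_of_bells(x):
    # One bell exp(-gamma * ||sv - x||^2) per support vector,
    # weighted by the dual coefficients, plus the intercept.
    bells = np.exp(-gamma * np.sum((clf.support_vectors_ - x) ** 2, axis=1))
    return float(clf.dual_coef_[0] @ bells + clf.intercept_[0])

x0 = np.array([0.3, -0.2])
print(sum_of_bells(x0), clf.decision_function([x0])[0])  # the two values agree
```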

37 / 78

• RBF Kernel Parameters

C      Decision surface   Model     Bias   Variance
Low    Smooth             Simple    High   Low
High   Peaked             Complex   Low    High

gamma   Affected points
Low     can be far from training examples
High    must be close to training examples

- Does higher gamma lead to higher variance?
- The choice is critical for SVM performance.
- Advised to use GridSearchCV for C and gamma:
  - exponentially spaced probes
  - a wide range
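The advice translates directly into code, with np.logspace supplying the exponentially spaced probes; a sketch, assuming a stand-in dataset from make_moons:

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Exponentially spaced probes over a wide range, as advised above.
param_grid = {"C": np.logspace(-2, 3, 6), "gamma": np.logspace(-3, 2, 6)}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```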

38 / 78

• SVM RBF (C=0.05, gamma=2)

39 / 78

• SVM RBF (C=0.1, gamma=2)

40 / 78

• SVM RBF (C=0.2, gamma=2)

41 / 78

• SVM RBF (C=0.5, gamma=2)

42 / 78

• SVM RBF (C=0.6, gamma=2)

43 / 78

• SVM RBF (C=0.7, gamma=2)

44 / 78

• SVM RBF (C=1, gamma=2)

45 / 78

• SVM RBF (C=2, gamma=2)

46 / 78

• SVM RBF (C=1, gamma=2)

47 / 78

• SVM RBF (C=0.5, gamma=2)

48 / 78

• SVM RBF (C=0.5, gamma=5)

49 / 78

• SVM RBF (C=0.5, gamma=10)

50 / 78

• SVM RBF (C=0.5, gamma=5)

51 / 78

• SVM RBF (C=0.5, gamma=2)

52 / 78

• SVM RBF (C=0.5, gamma=1)

53 / 78

• SVM RBF (C=0.5, gamma=0.7)

54 / 78

• SVM RBF (C=0.5, gamma=0.5)

55 / 78

• SVM RBF (C=0.5, gamma=0.2)

56 / 78

• SVM RBF (C=0.5, gamma=0.1)

57 / 78

• SVM RBF (C=0.5, gamma=0.05)

58 / 78

• Cross-validation Heatmap

http://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html
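Following the idea of that example, the heatmap's score grid can be recovered from cv_results_; a sketch, with a placeholder dataset:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

Cs = np.logspace(-1, 2, 4)
gammas = np.logspace(-3, 0, 4)
search = GridSearchCV(SVC(kernel="rbf"), {"C": Cs, "gamma": gammas}, cv=3)
search.fit(X, y)

# The grid iterates with the last parameter ('gamma') varying fastest, so the
# flat score array reshapes into a (C, gamma) matrix, ready for plt.imshow.
scores = search.cv_results_["mean_test_score"].reshape(len(Cs), len(gammas))
print(scores.shape)  # (4, 4)
```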

59 / 78


• Multi-class SVM

Two implementations in scikit-learn:
- SVC: one-against-one
  - n(n − 1)/2 classifiers constructed
  - supports various kernels, incl. custom ones
- LinearSVC: one-vs-the-rest
  - n classifiers trained
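The two schemes yield differently shaped models, which can be checked on a 10-class dataset (digits is my own choice of example):

```python
from sklearn.datasets import load_digits
from sklearn.svm import SVC, LinearSVC

X, y = load_digits(return_X_y=True)  # 10 classes, 64 features

# SVC trains one-against-one: 10 * 9 / 2 = 45 pairwise classifiers.
ovo = SVC(kernel="linear", decision_function_shape="ovo").fit(X, y)
print(ovo.decision_function(X[:1]).shape)  # (1, 45): one score per class pair

# LinearSVC trains one-vs-the-rest: one weight vector per class.
ovr = LinearSVC(max_iter=20000).fit(X, y)
print(ovr.coef_.shape)  # (10, 64): one classifier per class
```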

60 / 78

• PAMAP-easy Training Data

[Scatter plot of the PAMAP-easy training data: x axis roughly 60–200, y axis roughly 30–34, points coloured by class label.]

61 / 78

• Default View (every 200)

62 / 78

• Default View (every 300)

63 / 78

• Default View (every 400)

64 / 78

• Regularization C=0.5

65 / 78

• Regularization C=1

66 / 78

• Regularization C=5

67 / 78

• Regularization C=10

68 / 78

• Regularization C=20

69 / 78

• Regularization C=50

70 / 78

• Regularization C=500

71 / 78

• Regularization C=5000

72 / 78

• Inseparable classes 12,13 (every 200)

73 / 78

• Inseparable classes 12,13 (every 100)

74 / 78

• Inseparable classes 12,13 (every 80)

75 / 78

• Inseparable classes 12,13 (every 60)

76 / 78

• Inseparable classes 12,13 (every 55)

77 / 78

• Task and Homework #07

- Required minimum: for PAMAP-easy as divided into train+test:
  - Cross-validate to choose between linear, poly and RBF.
  - Create the heatmap for RBF (i.e. plot the score for all values of C and gamma).
  - Use GridSearchCV to find the best C and gamma (i.e. find the best without plotting anything).
  - Enter the accuracy of the best setting to classification results.txt; mention C and gamma in the comment.
- Optional:
  - Use GridSearchCV on all our datasets.
  - Enter the accuracies of the best settings to classification results.txt; mention C and gamma in the comment.

Due: 2 weeks from now, i.e. May 3.

78 / 78