Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning...

98
1 UVA CS 6316: Machine Learning Lecture 15: Neural Network / Deep Learning Basics

Transcript of Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning...

Page 1: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

1

UVA CS 6316:Machine Learning

Lecture 15: Neural Network / Deep Learning Basics

Page 2: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression
Page 3: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

3

Page 4: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

ewx+b

1 + ewx+b

Logistic Regression

Sigmoid Function(aka logistic, logit, “S”, soft-step)

4

P(Y=1|x) =

x

Page 5: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

ez

1 + ez

Logistic Regression

x1

x2

x3

Σ

+1

z

ŷ = sigmoid(z) =

5

w1

w2

w3

bSummingFunction

Multiply by weights

ŷ = P(Y=1|x;w)

Σ x wib+=z i

i

OutputSigmoidFunction

Input

Page 6: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

ez

1 + ez

Logistic Regression

x1

x2

x3

Σ

+1

= wT x + b

ŷ = sigmoid(z) =6

px11xpp = 3

1x1

w1

w2

w3

bSummingFunction

SigmoidFunction

Multiply by weights

Σ x wib+=z i

i

Output

z

1x1Input

ŷ = P(Y=1|x;w)

Page 7: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

“Neuron” or “Perceptron”

x1

x2

x3

Σ

+1

z

7

w1

w2

w3

b

Page 8: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

“Neuron” or “Perceptron”

x1

x2

x3

Σ ŷ

8

From here on, we leave out bias for simplicity

w1

w2

w3

z

Page 9: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Neurons

9http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 10: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

“Block View” of a Neuron

10

x *w z

ŷ

parameterized block

ez

1 + ez

z = wT x

ŷ = sigmoid(z) =px11xp1x1

Dot Product SigmoidInput Output

Page 11: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Neuron Representation

Σ

11

The linear transformation and nonlinearity together is typically considered a single neuron

ŷ

x1

x2

x3

x

w1

w2

w3

Page 12: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Neuron Representation

*w

12

ŷx

Page 13: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

13

Page 14: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Multiple Neurons

x1

x2

x3

Σ

Σ

Σ

Σ

Input x

14

Wz

ŷ1

ŷ2

ŷ3

ŷ4

Page 15: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

x1

x2

x3

Σ

Σ

Σ

Σ

Input x

15

d = 4

p = 3

W

ez

1 + ez

z =WT x

ŷ= sigmoid(z) =px1dxpdx1

dx1 dx1

ŷ1

ŷ2

ŷ3

ŷ4

Multiple Neurons

Wmatrix

vectorz

Page 16: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

x1

x2

x3

Σ

Σ

Σ

Σ

Input x

16

W

ŷ = P(Y=1|x;W)

z1

Page 17: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Neural Network (1 Hidden Layer)

x1

x2

x3

Σ

Σ

Σ

Σ

Input x

17

W1

w2

Σ

z1

z2

h1subscript represents layer

ŷ = P(Y=1|x;W)

Page 18: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Neural Network (1 Hidden Layer)

x1

x2

x3

Σ

Σ

Σ

Σ

Input x

18

z1

Σz2

W1

w2

ŷ = P(Y=1|x;W,w)

z1 =W1T x

h1= sigmoid(z1) px1dxpdx1

dx1 dx1

z2 =w2T h1

ŷ= sigmoid(z2) px11xp1x1

1x1 1x1

h1

Page 19: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

x1

x2

x3

Input x

19

p = 3

W1Σ

Σ

Σ

Σ

Σ

Neural Network (1 Hidden Layer)

ŷ = P(Y=1|x;W,w)

z1 =W1T x

h1= sigmoid(z1) px1dxpdx1

dx1 dx1

z2 =w2T h1

ŷ= sigmoid(z2) px11xp1x1

1x1 1x1

h1

w2

Page 20: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Dot Product Sigmoid

20

Input

x *W1 z1

W1 is a matrix z1 is a vector

ŷ

Output

Neural Network (1 Hidden Layer)

Dot Product Sigmoid

*w2 z2

h1

z1 =W1T x

h1= sigmoid(z1) px1dxpdx1

dx1 dx1

z2 =w2T h1

ŷ= sigmoid(z2) px11xp1x1

1x1 1x1

Page 21: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Dot Product Sigmoid

21

Input

x *W1 z1

W1 is a matrix z1 is a vector

ŷ

Output

Neural Network (1 Hidden Layer)

Dot Product Sigmoid

*w2 z2

h1

z1 =W1T x

h1= sigmoid(z1) px1dxpdx1

dx1 dx1

z2 =w2T h1

ŷ= sigmoid(z2) px11xp1x1

1x1 1x1

Page 22: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Neural Network (2 Hidden Layers)

22

1st hidden layer

2nd hiddenlayer

Outputlayer

x1

x2

x3

W1

w3

W2

h1 h2

ŷ x

Page 23: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Neural Network (2 Hidden Layers)

23

z1 =WT xh1 = sigmoid(z1)z2 =WT h1 h2 = sigmoid(z2)z3 =wT h2 ŷ = sigmoid(z3)

h1 h2W1

w3

W2

1

2

3

hidden layer 1 output

hidden layer 2 output

ŷ

x1

x2

x3

x

Page 24: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Multi-Class Neural Network (2 Hidden Layers)

x

24

h1 h2W1 w3

W2

z1 =WT xh1 = sigmoid(z1)z2 =WT h1 h2 = sigmoid(z2)z3 =wT h2 ŷ = sigmoid(z3)

1

2

3

ŷ1

ŷ2

ŷ3

hidden layer 1 output

hidden layer 2 output

x1

x2

x3

Page 25: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

“Block View” Of Multi-Layer Neural Network

x

1st hidden layer

2nd hidden layer

Output layer

25

*W1

*W2

*w3

z1 z2 z3h1 h2 ŷ

Page 26: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

“Deep” Neural Networks (i.e. > 1 hidden layer)

26

Page 27: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

27

Page 28: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

28

Toy Example: Will I pass this class?

http://introtodeeplearning.com/2019/materials/2019_6S191_L1.pdf

Page 29: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

29

Toy Example: Will I pass this class?

http://introtodeeplearning.com/2019/materials/2019_6S191_L1.pdf

Page 30: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

30

Toy Example: Will I pass this class?

http://introtodeeplearning.com/2019/materials/2019_6S191_L1.pdf

Page 31: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

31

4

5

W1 w3

W2

Toy Example: Will I pass this class?

ŷ

http://introtodeeplearning.com/2019/materials/2019_6S191_L1.pdf

Page 32: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

32

W1 w3

W2

4

5

Toy Example: Will I pass this class?

http://introtodeeplearning.com/2019/materials/2019_6S191_L1.pdf

ŷ

Page 33: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

33

Toy Example: Will I pass this class?

W1 w3

W2

4

5

http://introtodeeplearning.com/2019/materials/2019_6S191_L1.pdf

ŷ

Page 34: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Binary Classification LossBinary Cross Entropy Loss

ŷ = P(Y=1|X,W)

34

x1

x2

x

E= loss = - logP(Y = ŷ | X= x ) = - y log(ŷ) - (1 - y) log(1-ŷ)

W1

w3

W2

true output

Page 35: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

z1

z2

z3

35

x1

x2

x3

x

Σ

Σ

Σ

ŷ1

ŷ2

ŷ3

Multi-Class Classification LossCross Entropy Loss

“Softmax” function. Normalizing function which converts each class output to a probability.

E = loss = - yj log ŷjΣj = 1...K

= P( yi = 1 | x )

W1 W3

W2

ŷi

“0” for all except true class

K = 3

010

0.10.70.2

ŷ y

Page 36: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

36

x1

x2

Σ

W1w3

W2

x

Regression LossMean Squared Error Loss

E = loss= ( y - ŷ )212

true output

ŷ

Page 37: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

37

Page 38: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

38

x1

x2

x3

W1

w3

W2

x ŷ E (ŷ; W)

W* = argmin E( ŷ ; W)W

Page 39: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Training Neural Networks

39

How do we learn the optimal weights WL for our task??● Gradient descent:

LeCun et. al. Efficient Backpropagation. 1998

WL(t+1) = WL(t) - 𝜂 𝝏 E𝝏 WL(t)

x ŷ

W1w3

W2

Page 40: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Gradient Descent

40

W* = argmin E( ŷ ; W)W

http://introtodeeplearning.com/2019/materials/2019_6S191_L1.pdf

Page 41: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

41

Gradient Descent

http://introtodeeplearning.com/2019/materials/2019_6S191_L1.pdf

Page 42: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

42

Gradient Descent

𝝏 E𝝏 W(t)

http://introtodeeplearning.com/2019/materials/2019_6S191_L1.pdf

Page 43: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

43

Gradient Descent

𝝏 E𝝏 W(t)

http://introtodeeplearning.com/2019/materials/2019_6S191_L1.pdf

Page 44: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

44

Gradient Descent

http://introtodeeplearning.com/2019/materials/2019_6S191_L1.pdf

Page 45: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Training Neural Networks

45LeCun et. al. Efficient Backpropagation. 1998

But how do we get gradients of lower layers (e.g. W1) ?● Backpropagation!

○ Repeated application of chain rule of calculus○ Locally minimize the objective○ Requires all “blocks” of the network to be differentiable

x ŷ

W1w3

W2

Page 46: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

46

Backpropagation Intro

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 47: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

47

Backpropagation Intro

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 48: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

48

Backpropagation Intro

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 49: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

49

Backpropagation Intro

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 50: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

50

Backpropagation Intro

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 51: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

51

Backpropagation Intro

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 52: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

52

Backpropagation Intro

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 53: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

53

Backpropagation Intro

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 54: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

54

Backpropagation Intro

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 55: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

55

Backpropagation Intro

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 56: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

56

Backpropagation Intro

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 57: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

57

Backpropagation Intro

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 58: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

58

Backpropagation Intro

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 59: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

59

Backpropagation Intro

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 60: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

60

Backpropagation Intro

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 61: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

61

Backpropagation Intro

Tells us: by increasing x by a scale of 1, we decrease f by a scale of 4

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 62: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

62

ŷ = P(y=1|X,W)

x1

x2

x3

W1

w2

x

Backpropagation(binary classification example)

Example on 1-hidden layer network for binary classification

h1

Page 63: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

63

Backpropagation(binary classification example)

x ŷ*W1

*w2

z1z2h1

Page 64: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

64

x ŷ*W1

*w2

z1 z2h1

E = loss =

Backpropagation(binary classification example)

Gradient Descent to Minimize loss: Need to find these!

Page 65: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

65

Backpropagation(binary classification example)

x ŷ*W1

*w2

z1 z2h1

Page 66: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

66

Backpropagation(binary classification example)

= ??

= ??

x ŷ*W1

*w2

z1 h1 z2

Page 67: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

67

Backpropagation(binary classification example)

= ??

= ??

Exploit the chain rule!

x *W1

*w2

z1 ŷh1 z2

Page 68: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

68

Backpropagation(binary classification example)

x *W1

*w2

z1 ŷh1 z2

Page 69: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

69

Backpropagation(binary classification example)

chain rule

x *W1

*w2

z1 ŷh1 z2

Page 70: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

70

Backpropagation(binary classification example)

x *W1

*w2

z1 ŷh1 z2

Page 71: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

71

Backpropagation(binary classification example)

x *W1

*w2

z1 ŷh1 z2

Page 72: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

72

Backpropagation(binary classification example)

x *W1

*w2

z1 ŷh1 z2

Page 73: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

73

Backpropagation(binary classification example)

x *W1

*w2

z1 ŷh1 z2

Page 74: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

74

Backpropagation(binary classification example)

x *W1

*w2

z1 ŷh1 z2

Page 75: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

75

Backpropagation(binary classification example)

x *W1

*w2

z1 ŷh1 z2

Page 76: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

76

Backpropagation(binary classification example)

x *W1

*w2

z1 ŷh1 z2

Page 77: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

77

Backpropagation(binary classification example)

already computed!

x *W1

*w2

z1 ŷh1 z2

Page 78: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

78

“Local-ness” of Backpropagation

fx y

“local gradients”activations

gradients

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 79: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

79

“Local-ness” of Backpropagation

fx y

activations

gradients

“local gradients”

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 80: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

x

80

Example: Sigmoid Block

sigmoid(x) = 𝛔 (x)

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 81: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

f3f2f1

Deep Learning: Layers of Differentiable Parameterized Functions (with nonlinearities)

x ŷ

81

E(ŷ,y)

Page 82: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

82

Building Deep Neural Nets

fx

y

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 83: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

83

Block Example Implementation

http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf

Page 84: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

84

Deep Learning Frameworks

Page 85: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

85

Pytorch Sample Code

Page 86: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

86

Building Deep Neural Nets

“GoogLeNet” for Object Classification

Page 87: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Backprop Demo

x1

x2

Σ

Σ

ŷ

87

w1 z1

w2w3

w4z2

h1

h2

Σ

w5

w6

h1 =exp(z1)

1 + exp(z1)exp(z2)

1 + exp(z2)h2 =

E = ( y - ŷ )2

f2

f4

w(t+1) = w(t) - 𝜂 𝝏 E𝝏 w(t)

𝝏 E𝝏 wi

= ??

z1 = x1w1 + x2w3 z2 = x1w2 + x2w4

f1

ŷ = h1w5 + h2w6 f3

Page 88: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

88

Page 89: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Nonlinearity Functions (i.e. transfer or activation functions)

x1

x2

x3

Σ

SummingFunction

w1

w2

w3

z

x﹡w

89

Page 90: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Nonlinearity Functions (i.e. transfer or activation functions)

x﹡w

90http://introtodeeplearning.com/2019/materials/2019_6S191_L1.pdf

Page 91: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Nonlinearity Functions (i.e. transfer or activation functions)

x﹡w

91https://en.wikipedia.org/wiki/Activation_function#Comparison_of_activation_functions

Name Plot Equation Derivative ( w.r.t x )

Page 92: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Nonlinearity Functions (i.e. transfer or activation functions)

x﹡w

92

Name Plot Equation Derivative ( w.r.t x )

https://en.wikipedia.org/wiki/Activation_function#Comparison_of_activation_functions

Page 93: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

Nonlinearity Functions (i.e. transfer or activation functions)

x﹡w

93

Name Plot Equation Derivative ( w.r.t x )

usually works best in practicehttps://en.wikipedia.org/wiki/Activation_function#Comparison_of_activation_functions

Page 94: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

94

Page 95: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

95

Neural Net Pipeline

x ŷ

W1w3

1. Initialize weights2. For each batch of input x samples S:

a. Run the network “Forward” on S to compute outputs and lossb. Run the network “Backward” using outputs and loss to compute gradientsc. Update weights using SGD (or a similar method)

3. Repeat step 2 until loss convergence

W2

E

Page 96: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

96

Non-Convexity of Neural Nets

In very high dimensions, there exists many local minimum which are about the same.

Pascanu, et. al. On the saddle point problem for non-convex optimization 2014

Page 97: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

97

Advantage of Neural Nets

As long as it’s fully differentiable, we can train the model to automatically learn features for us.

Page 98: Lecture 15: Learning Basics Neural Network / Deep Machine Learning …€¦ · Machine Learning Lecture 15: Neural Network / Deep Learning Basics. 3. ewx+b 1 + e wx+b Logistic Regression

98