Machine Learning - egrcc's blog · egrcc.github.io/docs/Machine-Learning-8-Neural-Netw… · 2017-06-20

Machine Learning
Neural Networks

1. Basics

sigmoid function:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

$$\sigma'(x) = \sigma(x) \cdot [1 - \sigma(x)]$$

hyperbolic function:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

$$\tanh'(x) = 1 - \tanh^{2}(x)$$
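The derivative identities for the sigmoid and tanh can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names (`sigmoid_prime`, `tanh_prime`, `central_difference`) are illustrative choices, not part of the original notes.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid 1 / (1 + exp(-x)), applied elementwise."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    """Closed form: sigma'(x) = sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_prime(x):
    """Closed form: tanh'(x) = 1 - tanh(x)^2."""
    return 1.0 - np.tanh(x) ** 2

def central_difference(f, x, h=1e-5):
    """Numerical derivative of f at x (hypothetical helper for the check)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Compare the closed forms against numerical derivatives on a small grid.
xs = np.linspace(-4.0, 4.0, 9)
sigmoid_gap = np.max(np.abs(sigmoid_prime(xs) - central_difference(sigmoid, xs)))
tanh_gap = np.max(np.abs(tanh_prime(xs) - central_difference(np.tanh, xs)))
print("max |closed form - numerical| for sigmoid:", sigmoid_gap)
print("max |closed form - numerical| for tanh:", tanh_gap)
```

Both gaps should be tiny, limited only by the finite-difference step and floating-point rounding.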


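softmax function:

$$y = \mathrm{softmax}(x)$$

$$y_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}$$

$$\frac{\partial y_i}{\partial x_j} = \begin{cases} -y_i \cdot y_j, & i \neq j \\ y_i \cdot (1 - y_i), & i = j \end{cases}$$

2. Model

input:

$$x \in \mathbb{R}^{n}$$

layer 1:

$$a^{1} = x$$

layer l:

$$a^{l} = \sigma(w^{l} a^{l-1} + b^{l}) \quad (l = 2, \dots, L)$$

layer L:

$$\hat{y} = a^{L}$$

output:

$$\hat{y} \in \mathbb{R}^{m}$$

3. Backpropagation

cost function:

$$C = C(\hat{y})$$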
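The softmax Jacobian and the layer-by-layer model can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the notes' own code: the layer sizes, the random initialization, and the function names (`softmax_jacobian`, `forward`) are made up for the example, and the max-subtraction in `softmax` is a common numerical-stability trick that leaves the result unchanged.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # Subtracting max(x) does not change the result but avoids overflow.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def softmax_jacobian(x):
    # J[i, j] = dy_i/dx_j = y_i * (1 - y_i) if i == j, else -y_i * y_j.
    y = softmax(x)
    return np.diag(y) - np.outer(y, y)

def forward(x, weights, biases):
    # a^1 = x; a^l = sigma(w^l a^{l-1} + b^l) for l = 2, ..., L.
    # weights[k] and biases[k] hold w^l and b^l for l = k + 2.
    a = x
    for w, b in zip(weights, biases):
        a = sigmoid(w @ a + b)
    return a  # y_hat = a^L

# Hypothetical sizes: n = 3 inputs, one hidden layer of 4 units, m = 2 outputs.
rng = np.random.default_rng(0)
sizes = [3, 4, 2]
weights = [rng.normal(size=(n_out, n_in)) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=n_out) for n_out in sizes[1:]]

x = np.array([0.5, -1.0, 2.0])
y_hat = forward(x, weights, biases)
y = softmax(x)
J = softmax_jacobian(x)
print("y_hat:", y_hat)
print("softmax(x):", y)
```

Because the softmax outputs always sum to 1, every column of `J` sums to zero, which is a quick sanity check on the two-case derivative formula.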


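definition:

$$z^{l} = w^{l} a^{l-1} + b^{l} \quad (l = 2, \dots, L)$$

$$\delta^{l} = \frac{\partial C}{\partial z^{l}} \quad (l = 2, \dots, L)$$

output error $\delta^{L}$:

$$\delta^{L} = \frac{\partial C}{\partial z^{L}} = \frac{\partial C}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial z^{L}} = \frac{\partial C}{\partial a^{L}} \cdot \frac{\partial a^{L}}{\partial z^{L}} = \frac{\partial C}{\partial a^{L}} \odot \sigma'(z^{L}) \quad (\text{need } a^{L}, z^{L})$$

backpropagate the error:

$$\delta^{l} = ((w^{l+1})^{T} \delta^{l+1}) \odot \sigma'(z^{l}) \quad (\text{need } z^{l};\ l = L-1, L-2, \dots, 2)$$

output:

$$\frac{\partial C}{\partial b^{l}} = \delta^{l} \quad (l = L, L-1, \dots, 2)$$

$$\frac{\partial C}{\partial w^{l}} = \delta^{l} \cdot (a^{l-1})^{T} \quad (\text{need } a^{l-1};\ l = L, L-1, \dots, 2)$$

4. The Vanishing Gradient Problem

the simplest deep neural network: [figure]

the expression for $\frac{\partial C}{\partial b^{l}}$: [figure]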
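Assuming the simplest network is a chain with a single neuron in each layer, as in Nielsen's treatment (so every $w^{l}$, $b^{l}$ and $z^{l}$ is a scalar), the backpropagation recursion of Section 3 unrolls into a product. A sketch of that expression:

$$\frac{\partial C}{\partial b^{l}} = \delta^{l} = \sigma'(z^{l}) \cdot \left[ \prod_{k=l+1}^{L} w^{k} \, \sigma'(z^{k}) \right] \cdot \frac{\partial C}{\partial a^{L}}$$

For the sigmoid, $\sigma'(z) \le \sigma'(0) = \frac{1}{4}$, so whenever $|w^{k}| < 1$ each bracketed factor has magnitude below $\frac{1}{4}$, and the gradient shrinks at least geometrically as $l$ moves toward the input.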


approaches to overcome the problem:

Usage of GPU

Usage of better activation functions
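The effect can be reproduced with a short sketch that implements the backpropagation equations of Section 3 and compares bias-gradient magnitudes per layer. Several things here are assumptions for illustration only: the quadratic cost $C = \frac{1}{2}\lVert a^{L} - y \rVert^{2}$, the chain of six single-neuron layers with all weights 1 and all biases 0, and ReLU as one example of a "better" activation function (the notes do not name a specific one).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    return np.maximum(z, 0.0)

def relu_prime(z):
    return (z > 0).astype(float)

def cost(x, y, weights, biases, act):
    """Assumed quadratic cost C = 0.5 * ||a^L - y||^2."""
    a = x
    for w, b in zip(weights, biases):
        a = act(w @ a + b)
    return 0.5 * float(np.sum((a - y) ** 2))

def backprop(x, y, weights, biases, act, act_prime):
    """Return (grad_w, grad_b); index k corresponds to layer l = k + 2."""
    # Forward pass: keep every z^l and a^l, the backward pass needs them.
    activations, zs = [x], []
    for w, b in zip(weights, biases):
        zs.append(w @ activations[-1] + b)
        activations.append(act(zs[-1]))
    # Output error: delta^L = dC/da^L (elementwise) act'(z^L).
    delta = (activations[-1] - y) * act_prime(zs[-1])
    grad_w = [None] * len(weights)
    grad_b = [None] * len(weights)
    for k in reversed(range(len(weights))):
        grad_b[k] = delta                              # dC/db^l = delta^l
        grad_w[k] = np.outer(delta, activations[k])    # dC/dw^l = delta^l (a^{l-1})^T
        if k > 0:
            # delta^{l-1} = ((w^l)^T delta^l) (elementwise) act'(z^{l-1})
            delta = (weights[k].T @ delta) * act_prime(zs[k - 1])
    return grad_w, grad_b

# Chain of six single-neuron layers (illustrative values).
depth = 6
weights = [np.ones((1, 1)) for _ in range(depth)]
biases = [np.zeros(1) for _ in range(depth)]
x, y = np.array([1.0]), np.array([0.0])

_, sig_grad_b = backprop(x, y, weights, biases, sigmoid, sigmoid_prime)
_, relu_grad_b = backprop(x, y, weights, biases, relu, relu_prime)
sig_norms = [float(np.linalg.norm(g)) for g in sig_grad_b]
relu_norms = [float(np.linalg.norm(g)) for g in relu_grad_b]
print("sigmoid |dC/db^l|, first to last layer:", sig_norms)
print("ReLU    |dC/db^l|, first to last layer:", relu_norms)
```

In this setup the sigmoid gradients shrink by a factor of at most 1/4 per layer toward the input, while the ReLU gradients keep the same magnitude in every layer because all pre-activations are positive and the ReLU derivative is 1 there.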

Reference

1. Michael Nielsen. Neural Networks and Deep Learning. http://neuralnetworksanddeeplearning.com/