Some slides and images are taken from: David Wolfe...

58
Introduction to Deep Learning Some slides and images are taken from: David Wolfe Corne https://www.macs.hw.ac.uk/~dwcorne/Teaching/introdl.ppt Wikipedia Geoffrey A. Hinton

Transcript of Some slides and images are taken from: David Wolfe...

Page 1: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Introduction to Deep Learning

Some slides and images are taken from:

David Wolfe Corne https://www.macs.hw.ac.uk/~dwcorne/Teaching/introdl.pptWikipediaGeoffrey A. Hinton

Page 2: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Feedforward networks for function approximation and classification

• Feedforward networks • Deep tensor networks • ConvNets

Page 3: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Axon

Terminal Branches of Axon

Dendrites

Σ

x1

x2w1

w2

wnxn

x3 w3

Page 4: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

1 w0

w1

wn

x1

xn

A single artificial neuron

Page 5: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

1 w0

w1

wn

x1

xn

A single artificial neuron

inputs

weights

summation

nonlinearactivationfunction

biasnode

output / activation of the neuron

Page 6: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Activation functionshttps://en.wikipedia.org/wiki/Activation_function

Page 7: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Activation functionshttps://en.wikipedia.org/wiki/Activation_function

Traditionallyused

Page 8: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Activation functionshttps://en.wikipedia.org/wiki/Activation_function

Nair and Hinton: Rectified linear units improve restricted Boltzmann machines ICLM’10, 807-814 (2010)Glorot, Bordes and Bengio: Deep sparse rectifier neural networks. PMLR 15:315-323 (2011)

Currently most widely used. Empirically easier to train and results in sparse networks.

Page 9: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

1 w0

w1

wn

x1

xn

Perceptron (Rosenblatt 1957)Used as a binary classifier - equivalent to support vector machine (SVM)

Page 10: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

1 w0

w1

wn

x1

xn

Perceptron (Rosenblatt 1957)

Page 11: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

input vector

hidden layers

outputs

Back-propagate error signal to get derivatives for learning

Compare outputs with correct answer to get error signal

Second-generation neural networks (~1985) Slide credit : Geoffrey Hinton

Page 12: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

input vector

hidden layers

outputs

Compare outputs with correct answer to get error signal

Slide credit : Geoffrey Hinton

Second-generation neural networks (~1985)Error at output for a given example:

j

i

wji

Page 13: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

input vector

hidden layers

outputs

Compare outputs with correct answer to get error signal

Slide credit : Geoffrey Hinton

Second-generation neural networks (~1985)Error at output for a given example:

Error sensitivity at output neuron j:

j

i

Backpropagate error sensitivity to neuron i:wji

Sensitivity on weight wji:

Page 14: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

input vector

hidden layers

outputs

Compare outputs with correct answer to get error signal

Slide credit : Geoffrey Hinton

Second-generation neural networks (~1985)Error at output for a given example:

Error sensitivity at output neuron j:

j

i

Backpropagate error sensitivity to neuron i:wji

Sensitivity on weight wji:

Update weight wji:

learning rate

Page 15: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Initial random weights

A decision boundary perspective on learning

Page 16: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Present a training instance / adjust the weights

A decision boundary perspective on learning

Page 17: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Present a training instance / adjust the weights

A decision boundary perspective on learning

Page 18: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Present a training instance / adjust the weights

A decision boundary perspective on learning

Page 19: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Present a training instance / adjust the weights

A decision boundary perspective on learning

Page 20: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Eventually ….

A decision boundary perspective on learning

Page 21: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

NNs use nonlinear f(x) so they can draw complex boundaries, but keep the data unchanged

Nonlinear versus linear models

Kernel methods only draw straight lines, but transform the data first in a way that makes it linearly separable

Page 22: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Universal Representation Theorem

• Networks with a single hidden layer can represent any function F(x)with arbitrary precision in the large hidden layer size limit

• However, that doesn‘t mean, networks with single hidden layers are efficient in representing arbitrary functions. For many datasets,deep networks can represent the function F(x) even with narrow layers.

Page 23: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

What does a neural network learn?

Page 24: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Feature detectors

Page 25: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

What is this unit doing?

Page 26: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

1

63

1 5 10 15 20 25 …

strong +ve weight

low/zero weight

Hidden layer units become self-organized feature detectors

Page 27: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

1

63

1 5 10 15 20 25 …

strong +ve weight

low/zero weight

what does this unit detect?

Page 28: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

1

63

1 5 10 15 20 25 …

strong +ve weight

low/zero weight

it will send strong signal for a horizontal line in the top row, ignoring everywhere else

what does this unit detect?

Page 29: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

1

63

1 5 10 15 20 25 …

strong +ve weight

low/zero weight

what does this unit detect?

Page 30: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

1

63

1 5 10 15 20 25 …

strong +ve weight

low/zero weight

Strong signal for a dark area in the top left corner

what does this unit detect?

Page 31: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

What features might you expect a good NN to learn, when trained with data like this?

Page 32: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

63

1

Horizontal lines

Page 33: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

63

1

Horizontal lines

Page 34: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

63

1

Small circles

Page 35: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

1

But what about position invariance?Our example unit detectors were tied to specific parts of the image

Small circles

Page 36: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Deep Networks

Page 37: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

etc …detect lines in

Specific positions

v

Higher level detetors ( horizontal line, “RHS vertical lune” “upper loop”, etc…

etc …

successive layers can detect higher-level features

Page 38: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

etc …detect lines in

Specific positions

v

Higher level detetors ( horizontal line, “RHS vertical lune” “upper loop”, etc…

etc …

What does this unit detect?

successive layers can detect higher-level features

Page 39: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

So: multiple layers make sense

Page 40: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

So: multiple layers make sense

Multiple layers are also found in the brain, e.g. visual cortex

Page 41: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

But: until recently deep networks could not be efficiently trained

Page 42: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights
Page 43: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Convolutional Neural Networks

Page 44: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights
Page 45: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Convolutional Kernel / Filter

Apply convolutions

Page 46: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights
Page 47: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Convolutional filters perform image processing

Page 48: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Example: filters in face recognition

Page 49: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

MNIST dataset

Page 50: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Misclassified examples

Page 51: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

Applications to molecular systems

Page 52: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

1) Learning to represent (effective) energy function

Behler-Parrinello network

Cartesiancoordinates

Internalcoordinates

Neural networks(may be shared for same atom types)

Atomic energiesTotal energy

Behler and Parrinello, PRL 98, 146401 (2007)

Page 53: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

1) Learning to represent (effective) energy function

Schuett, Arbabzadah, Chmiela, Müller & Tkatchenko, Nature Communications 8, 13890 (2017)

Page 54: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

2) Generator networks

Gómez-Bombarelli, …, Aspuru-Guzik: Automatic Chemical Design using Variational Autoencoders (2016)

Page 55: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

2) Generator networks

Gómez-Bombarelli, …, Aspuru-Guzik: Automatic Chemical Design using Variational Autoencoders (2016)

Page 56: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

2) Generator networks

Gómez-Bombarelli, …, Aspuru-Guzik: Automatic Chemical Design using Variational Autoencoders (2016)

Page 57: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

3) VAMPnets

Mardt, Pasquali, Wu & Noé: VAMPnets - deep learning of molecular kinetics (2017)

Page 58: Some slides and images are taken from: David Wolfe …helper.ipam.ucla.edu/publications/eltut/eltut_14764.pdf · 1 w 0 w 1 w n x 1 x n … A single artificial neuron inputs weights

3) VAMPnets

Mardt, Pasquali, Wu & Noé: VAMPnets - deep learning of molecular kinetics (2017)