Download - Lecture 9 VAE variants 40mins - GitHub Pages

Transcript
Page 1: Lecture 9 VAE variants 40mins - GitHub Pages

VAE variantsHao Dong

Peking University

1

Page 2: Lecture 9 VAE variants 40mins - GitHub Pages

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

2

VAE variants

Hierarchical representation learning

Temporal representa2on learning

Representation learning

Page 3: Lecture 9 VAE variants 40mins - GitHub Pages

• Convolutional VAE• Conditional VAE• IWAE• β-VAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

3

VAE variants

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 4: Lecture 9 VAE variants 40mins - GitHub Pages

4

Convolutional Variational Autoencoder

• Limitations of vanilla VAE• The size of weight of fully connected layer == input size x output size• If VAE uses fully connected layers only, will lead to curse of dimensionality

when the input dimension is large (e.g., image).

• Solution

Image is modified from: Deep Clustering with Convolutional Autoencoder. NIPS 2017.

Fully connected layers here(independent variables)

Page 5: Lecture 9 VAE variants 40mins - GitHub Pages

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

5

VAE variants

Hierarchical representa2on learning

Temporal representation learning

Representation learning

Page 6: Lecture 9 VAE variants 40mins - GitHub Pages

6

Conditional Variational Autoencoder

• Train and inference with labelled data.

Learning structured output representation using deep conditional generative models. NeurIPS 2015.

Page 7: Lecture 9 VAE variants 40mins - GitHub Pages

7

Recap: Variational Autoencoder

Auto-Encoding Variational Bayes. Diederik P. Kingma, Max Welling. ICLR 2013

• Recap: Se>ng up the objecAve• Maximise P(X)• Set Q(z) to be an arbitrary distribuGon

Page 8: Lecture 9 VAE variants 40mins - GitHub Pages

8

Recap: Variational Autoencoder

Auto-Encoding Variational Bayes. Diederik P. Kingma, Max Welling. ICLR 2013

• Recap: Setting up the objective

encoder ideal reconstruction KLD

Page 9: Lecture 9 VAE variants 40mins - GitHub Pages

9

Condi6onal Varia6onal Autoencoder

Learning structured output representation using deep conditional generative models. NeurIPS 2015.

• Setting up the objective with labels• Maximise P(Y|X)• Set Q(z) to be an arbitrary distribution

Page 10: Lecture 9 VAE variants 40mins - GitHub Pages

10

Conditional Variational Autoencoder

Learning structured output representation using deep conditional generative models. NeurIPS 2015.

• Setting up the objective

encoder ideal

reconstruction KLD

Page 11: Lecture 9 VAE variants 40mins - GitHub Pages

11

Conditional Variational Autoencoder

• Train and inference without labelled data i.e., vanilla VAE

Learning structured output representation using deep conditional generative models. NeurIPS 2015.

Page 12: Lecture 9 VAE variants 40mins - GitHub Pages

12

Conditional Variational Autoencoder

• Train and inference with labelled data.

Learning structured output representation using deep conditional generative models. NeurIPS 2015.

Page 13: Lecture 9 VAE variants 40mins - GitHub Pages

13

Conditional Variational Autoencoder

• Train and inference with labelled data.

Learning structured output representation using deep conditional generative models. NeurIPS 2015.

Page 14: Lecture 9 VAE variants 40mins - GitHub Pages

14

Conditional Variational Autoencoder

• Train and inference with labeled data.

Learning structured output representation using deep conditional generative models. NeurIPS 2015.

Page 15: Lecture 9 VAE variants 40mins - GitHub Pages

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

15

VAE variants

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 16: Lecture 9 VAE variants 40mins - GitHub Pages

16

Before we start

• Disentangled / Factorised representation• Each variable in the inferred latent representation is only sensitive to one

single generative factor and relatively invariant to other factors• Good interpretability and easy generalization to a variety of tasks

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 17: Lecture 9 VAE variants 40mins - GitHub Pages

17

Before we start

• Unsupervised hierarchical representation learning

Rethinking Style and Content Disentanglement in Variational Autoencoders. ICLR 2018 Workshops.

Page 18: Lecture 9 VAE variants 40mins - GitHub Pages

18

β-VAE

• Unsupervised representation learning• Augment the original VAE framework with a single hyper-parameter β

that modulates the learning constraints• Impose a limit on the capacity of the latent information channel

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 19: Lecture 9 VAE variants 40mins - GitHub Pages

19

β-VAE

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 20: Lecture 9 VAE variants 40mins - GitHub Pages

20

β-VAE

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 21: Lecture 9 VAE variants 40mins - GitHub Pages

21

β-VAE

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 22: Lecture 9 VAE variants 40mins - GitHub Pages

22

β-VAE

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 23: Lecture 9 VAE variants 40mins - GitHub Pages

23

β-VAE

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 24: Lecture 9 VAE variants 40mins - GitHub Pages

24

β-VAE

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 25: Lecture 9 VAE variants 40mins - GitHub Pages

25

β-VAE

• Discussion: It is really unsupervised?

• It is unsupervised/self-supervised learning, because it does not need any label data

• It is not fully unsupervised learning, it works because of the inductive bias of the neural network model, the hierarchical design introduces prior knowledge about the data

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 26: Lecture 9 VAE variants 40mins - GitHub Pages

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

26

VAE variants

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 27: Lecture 9 VAE variants 40mins - GitHub Pages

27

IWAE(Importance Weighted Autoencoder)

Importance Weighted Autoencoders. ICLR 2016.

• Optimise a tighter lower bound than VAE• VAE just optimises a lower bound of log P(X)

encoder ideal reconstruction KLD

-ELBO

ELBO

Page 28: Lecture 9 VAE variants 40mins - GitHub Pages

28

IWAE

Importance Weighted Autoencoders. ICLR 2016.

• Optimise a tighter lower bound than VAE

Page 29: Lecture 9 VAE variants 40mins - GitHub Pages

29

IWAE

Importance Weighted Autoencoders. ICLR 2016.

Page 30: Lecture 9 VAE variants 40mins - GitHub Pages

30

IWAE

Importance Weighted Autoencoders. ICLR 2016.

• Why “Importance weighted”

where

Page 31: Lecture 9 VAE variants 40mins - GitHub Pages

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

31

VAE variants

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 32: Lecture 9 VAE variants 40mins - GitHub Pages

32

Ladder VAE

• To learn hierarchical latent representation• Deep models with several layers of dependent stochastic variables are difficult to train

• Limiting the improvements obtained using these highly expressive models

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

Page 33: Lecture 9 VAE variants 40mins - GitHub Pages

33

Ladder VAE

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

encode decode encode decode

Hierarchical VAE Ladder VAE

Page 34: Lecture 9 VAE variants 40mins - GitHub Pages

34

Ladder VAE

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

Page 35: Lecture 9 VAE variants 40mins - GitHub Pages

35

Ladder VAE

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

Page 36: Lecture 9 VAE variants 40mins - GitHub Pages

36

Ladder VAE

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

Page 37: Lecture 9 VAE variants 40mins - GitHub Pages

37

Ladder VAE

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

Page 38: Lecture 9 VAE variants 40mins - GitHub Pages

38

Ladder VAE

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

Page 39: Lecture 9 VAE variants 40mins - GitHub Pages

39

Ladder VAE

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

Page 40: Lecture 9 VAE variants 40mins - GitHub Pages

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

40

VAE variants

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 41: Lecture 9 VAE variants 40mins - GitHub Pages

41

Progressive + Fade-in VAE

• Discussion

Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.

encoders decoders

high-level

middle-level

low-level

Can we directly train a hierarchical VAE with ladder structure like that?

Page 42: Lecture 9 VAE variants 40mins - GitHub Pages

42

Progressive + Fade-in VAE

• Discussion

Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.

encoders decoders

high-level

middle-level

low-level

Information SHORTCUT problemall information go through the low-level path,other paths are ignored.model is lazy…

Page 43: Lecture 9 VAE variants 40mins - GitHub Pages

43

Progressive + Fade-in VAE

• Progressive + Fade-in

Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.

Stage 1: enable high-level path Stage 2: enable high-level and middle paths

Stage 3: enable all paths …

𝛼 increase from 0 to 1 gradually

Page 44: Lecture 9 VAE variants 40mins - GitHub Pages

44

Progressive + Fade-in VAE

• Results

Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.

High-level: background and foreground colors

Middle-level: shape

Low-level: …

Page 45: Lecture 9 VAE variants 40mins - GitHub Pages

45

Progressive + Fade-in VAE

• Results

Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.

High-level Middle-level Low-level

Page 46: Lecture 9 VAE variants 40mins - GitHub Pages

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

46

VAE variants

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 47: Lecture 9 VAE variants 40mins - GitHub Pages

47

VAE in speech

• Learning latent representations for style control and transfer in end-to-end speech synthesis

• RNN as encoder

Learning latent representations for style control and transfer in end-to-end speech synthesis. ICASSP 2019.

Page 48: Lecture 9 VAE variants 40mins - GitHub Pages

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

48

VAE variants

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 49: Lecture 9 VAE variants 40mins - GitHub Pages

50

TD-VAE

• State-space model as a Markov Chain model

Temporal Difference VAE. ICLR 2019.

Page 50: Lecture 9 VAE variants 40mins - GitHub Pages

51

TD-VAE

Temporal Difference VAE. ICLR 2019.

Page 51: Lecture 9 VAE variants 40mins - GitHub Pages

52

TD-VAE

Temporal Difference VAE. ICLR 2019.

Page 52: Lecture 9 VAE variants 40mins - GitHub Pages

53

TD-VAE

Temporal Difference VAE. ICLR 2019.

Page 53: Lecture 9 VAE variants 40mins - GitHub Pages

54

TD-VAE

Temporal Difference VAE. ICLR 2019.

Page 54: Lecture 9 VAE variants 40mins - GitHub Pages

55

TD-VAE

Temporal Difference VAE. ICLR 2019.

Page 55: Lecture 9 VAE variants 40mins - GitHub Pages

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

56

Summary

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 56: Lecture 9 VAE variants 40mins - GitHub Pages

Thanks

57