
TTIC 31230, Fundamentals of Deep Learning

David McAllester, April 2017

Generative Adversarial Networks (GANs)

The Generator and The Discriminator

A GAN consists of two networks: a generator $P^{\rm gen}_\Theta(x)$ and a discriminator $P^{\rm disc}_\Psi(y|x)$.

$$\Theta^* = \operatorname*{argmax}_\Theta \;\min_\Psi\; E_{(x,y)\sim\left(D \,\uplus\, P^{\rm gen}_\Theta\right)}\left[\log \frac{1}{P^{\rm disc}_\Psi(y|x)}\right]$$

Here x is drawn from the data distribution D or the generator distribution $P^{\rm gen}_\Theta$ with equal probability, and y = 1 if x is drawn from D and y = −1 if x is drawn from $P^{\rm gen}_\Theta$.

The discriminator tries to determine which source x came from and the generator tries to fool the discriminator.
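As a concrete illustration of this objective, here is a minimal PyTorch-style sketch of the alternating optimization, assuming a generator module `gen` mapping noise z to samples and a discriminator module `disc` producing a single logit with $P^{\rm disc}_\Psi(y=1|x) = \sigma(\text{logit})$. All module and variable names are illustrative, not part of the slides.

```python
import torch
import torch.nn.functional as F

def gan_step(gen, disc, opt_gen, opt_disc, x_real, z):
    """One alternating update for the minimax objective above (illustrative sketch).

    gen  : noise z -> generated sample x
    disc : sample x -> a single logit s(x), with P_disc(y=1|x) = sigmoid(s(x))
    Real samples carry label y = 1, generated samples y = -1 (binary targets 1 and 0).
    """
    # Discriminator step: minimize log loss on the equal-weight mixture of
    # real and generated samples (the inner min over Psi).
    x_fake = gen(z).detach()  # block gradients into the generator on this step
    s_real, s_fake = disc(x_real), disc(x_fake)
    disc_loss = 0.5 * (
        F.binary_cross_entropy_with_logits(s_real, torch.ones_like(s_real))
        + F.binary_cross_entropy_with_logits(s_fake, torch.zeros_like(s_fake)))
    opt_disc.zero_grad()
    disc_loss.backward()
    opt_disc.step()

    # Generator step: maximize the discriminator's log loss on generated samples
    # (the outer argmax over Theta taken literally; see "The Standard Fix" later).
    s_fake = disc(gen(z))
    gen_loss = -F.binary_cross_entropy_with_logits(s_fake, torch.zeros_like(s_fake))
    opt_gen.zero_grad()
    gen_loss.backward()
    opt_gen.step()
    return disc_loss.item(), gen_loss.item()
```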

Consistency

If the discriminator is perfect, then the only way to fool it is to exactly copy the data distribution.

Consistency Theorem: If $P^{\rm gen}_\Theta(x)$ and $P^{\rm disc}_\Psi(y|x)$ are both universally expressive (any distribution can be represented), then $P^{\rm gen}_{\Theta^*} = D$. A derivation, via the optimal discriminator and the Jensen-Shannon divergence, is given in the slides below.

DCGANs, Radford, Metz and Chintala, ICLR 2016

The Generator

Generated Bedrooms

Interpolated Faces

[Ayan Chakrabarti]

Conditional Distribution Modeling

All distribution modeling methods apply to conditional distributions.

For conditional GANs we allow the generator to take x as an input and generate a conditional value c.

$$\Theta^* = \operatorname*{argmax}_\Theta \;\min_\Psi\; E_{x\sim D,\;(c,y)\sim\left(D(c|x)\,\uplus\,P^{\rm gen}_\Theta(c|x)\right)}\left[\log\frac{1}{P^{\rm disc}_\Psi(y|c,x)}\right]$$

Here y = 1 if c is drawn from D(c|x) and y = −1 if c is drawn from $P^{\rm gen}_\Theta(c|x)$.
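A minimal sketch of what "the generator takes x as an input" can look like in code, assuming flat feature vectors that are simply concatenated. The module names, dimensions, and concatenation-based conditioning are illustrative assumptions; the training step is then the same as the unconditional sketch above, with both networks also fed x.

```python
import torch
import torch.nn as nn

class CondGenerator(nn.Module):
    """Illustrative conditional generator: maps (x, noise z) to a candidate c."""
    def __init__(self, x_dim, z_dim, c_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, c_dim))

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=-1))

class CondDiscriminator(nn.Module):
    """Illustrative conditional discriminator: scores a candidate c given x.

    Returns one logit per example, with P_disc(y=1 | c, x) = sigmoid(logit).
    Conditioning here is plain concatenation; real conditional GANs may
    instead condition via channel concatenation, embeddings, etc.
    """
    def __init__(self, c_dim, x_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(c_dim + x_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, c, x):
        return self.net(torch.cat([c, x], dim=-1))
```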

The Case of Imperfect Generation

$$\Theta^* = \operatorname*{argmax}_\Theta \;\min_\Psi\; E_{(x,y)\sim\left(D\,\uplus\,P^{\rm gen}_\Theta\right)}\left[\log\frac{1}{P^{\rm disc}_\Psi(y|x)}\right]$$

$$\Psi^*(\Theta) = \operatorname*{argmin}_\Psi\; E_{(x,y)\sim\left(D\,\uplus\,P^{\rm gen}_\Theta\right)}\left[\log_2\frac{1}{P^{\rm disc}_\Psi(y|x)}\right]$$

$$P^{\rm disc}_{\Psi^*(\Theta)}(y=1\,|\,x) \;=\; \frac{P(x,\,y=1)}{P(x)} \;=\; \frac{D(x)}{D(x)+P^{\rm gen}_\Theta(x)}$$

$$\Theta^* = \operatorname*{argmax}_\Theta\; E_{(x,y)\sim\left(D\,\uplus\,P^{\rm gen}_\Theta\right)}\left[-\log_2 P^{\rm disc}_{\Psi^*(\Theta)}(y|x)\right]$$

$$= \operatorname*{argmax}_\Theta\; \frac{1}{2}\,E_{x\sim D}\left[\log_2\frac{D(x)+P^{\rm gen}_\Theta(x)}{D(x)}\right] \;+\; \frac{1}{2}\,E_{x\sim P^{\rm gen}_\Theta}\left[\log_2\frac{D(x)+P^{\rm gen}_\Theta(x)}{P^{\rm gen}_\Theta(x)}\right]$$

$$= \operatorname*{argmax}_\Theta\; 1 - \frac{1}{2}KL(D,A) - \frac{1}{2}KL(P^{\rm gen}_\Theta,A)$$

$$A(x) = \frac{1}{2}\left(D(x) + P^{\rm gen}_\Theta(x)\right)$$
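Spelled out, the step from the two expectations to the KL form uses $D(x) + P^{\rm gen}_\Theta(x) = 2A(x)$ and $\log_2 2 = 1$; for the first term,

$$\frac{1}{2}\,E_{x\sim D}\left[\log_2\frac{D(x)+P^{\rm gen}_\Theta(x)}{D(x)}\right]
= \frac{1}{2}\,E_{x\sim D}\left[1 + \log_2\frac{A(x)}{D(x)}\right]
= \frac{1}{2} - \frac{1}{2}KL(D,A),$$

and similarly for the second term, giving $1 - \frac{1}{2}KL(D,A) - \frac{1}{2}KL(P^{\rm gen}_\Theta, A)$ with KL measured in bits.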

Jensen-Shannon Divergence (JSD)

We have arrived at the Jensen-Shannon divergence.

$$\Theta^* = \operatorname*{argmin}_\Theta\; JSD(D,\,P^{\rm gen}_\Theta)$$

$$JSD(P,Q) = \frac{1}{2}KL\!\left(P,\;\frac{P+Q}{2}\right) + \frac{1}{2}KL\!\left(Q,\;\frac{P+Q}{2}\right)$$

$$0 \le JSD(P,Q) = JSD(Q,P) \le 1$$

(The upper bound of 1 holds when KL is measured in bits, matching the $\log_2$ used above.)
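As a quick numeric sanity check of the two slides above, the sketch below computes the JSD between two made-up discrete distributions and verifies that the optimal discriminator's expected $\log_2$ loss on the 50/50 mixture equals $1 - JSD$; the distributions and function names are illustrative.

```python
import numpy as np

def kl_bits(p, q):
    """KL(p, q) in bits for discrete distributions with full shared support."""
    return float(np.sum(p * np.log2(p / q)))

def jsd_bits(p, q):
    """Jensen-Shannon divergence in bits; always between 0 and 1."""
    a = 0.5 * (p + q)
    return 0.5 * kl_bits(p, a) + 0.5 * kl_bits(q, a)

# Made-up "data" and "generator" distributions over four outcomes.
d    = np.array([0.40, 0.30, 0.20, 0.10])
pgen = np.array([0.10, 0.20, 0.30, 0.40])

# Optimal discriminator: P(y=1 | x) = D(x) / (D(x) + Pgen(x)).
p_real = d / (d + pgen)

# Expected log2 loss of the optimal discriminator on the equal-weight mixture.
loss = (0.5 * np.sum(d    * -np.log2(p_real)) +
        0.5 * np.sum(pgen * -np.log2(1.0 - p_real)))

print(f"JSD(D, Pgen)              = {jsd_bits(d, pgen):.4f} bits")  # within [0, 1]
print(f"1 - optimal disc log loss = {1.0 - loss:.4f} bits")         # same value
```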

The Discriminator Tends to Win

If the discriminator “wins”, the discriminator log loss goes to zero (becomes exponentially small) and there is no gradient to guide the generator.

In this case the learning stops and the generator is blocked from minimizing $JSD(D, P^{\rm gen}_\Theta)$.

The Standard Fix

The standard fix is to replace the loss

$$\ell = -\log P^{\rm disc}_\Psi(y|x)$$

with

$$\tilde{\ell} = -y \log P^{\rm disc}_\Psi(1|x)$$

These two loss functions agree when y = 1 (the case where x is drawn from D) but are very different when x is drawn from the generator (y = −1) and $P^{\rm disc}_\Psi(1|x)$ is exponentially close to zero.
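To make the difference concrete, here is a small PyTorch check (illustrative, not from the slides) of the generator's gradient signal under the two losses, differentiating with respect to the discriminator logit s on a generated sample, where $P^{\rm disc}_\Psi(1|x) = \sigma(s)$.

```python
import torch
import torch.nn.functional as F

# Discriminator logits on generated samples (y = -1); a very negative s means
# the discriminator confidently rejects the sample, i.e. P_disc(1|x) = sigmoid(s) ~ 0.
s = torch.tensor([-20.0, -5.0, 0.0], requires_grad=True)

# Original loss: the generator maximizes -log P_disc(-1|x),
# i.e. it does gradient descent on log P_disc(-1|x) = logsigmoid(-s).
orig_objective = F.logsigmoid(-s).sum()
orig_objective.backward()
print("original loss, d/ds:", s.grad)   # ~0 for very negative s: vanishing gradient

s.grad = None

# Standard fix: the generator maximizes log P_disc(1|x),
# i.e. it does gradient descent on -log P_disc(1|x) = -logsigmoid(s).
fixed_objective = (-F.logsigmoid(s)).sum()
fixed_objective.backward()
print("fixed loss, d/ds:   ", s.grad)   # ~ -1 for very negative s: a useful gradient
```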

A Margin Interpretation of the Standard Fix

The standard fix can be interpreted in terms of the “margin” of binary classification.

For y ∈ {−1, 1} we typically have $s_\Psi(1|x) = -s_\Psi(-1|x)$, and softmax over 1 and −1 gives

$$P_\Psi(y|x) = \frac{1}{1 + e^{-m}}$$

where the margin is $m = 2\,y\,s_\Psi(x)$ (writing $s_\Psi(x)$ for $s_\Psi(1|x)$).

The margin is large when the prediction is confidently correct.

A Margin Interpretation of the Standard Fix

In the standard fix we (essentially) take the loss to be the margin of the discriminator.

The generator wants to reduce the discriminator’s margin.

The direction of the update is the same, but the step is much larger under margin loss for generated inputs and large discriminator margins.
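To make the last claim concrete under the softmax parameterization above: for a generated input (y = −1) the margin is $m = -2\,s_\Psi(x)$, and one can check that

$$\ell = -\log P_\Psi(-1|x) = \log\!\left(1 + e^{-m}\right), \qquad
\frac{\partial \ell}{\partial m} = -\frac{1}{1+e^{m}} \approx -e^{-m},$$

$$\tilde\ell = \log P^{\rm disc}_\Psi(1|x) = -\log\!\left(1 + e^{m}\right), \qquad
\frac{\partial \tilde\ell}{\partial m} = -\frac{1}{1+e^{-m}} \approx -1,$$

so for a large discriminator margin both losses move the margin in the same direction, but the gradient of $\ell$ is exponentially small while the gradient of $\tilde\ell$ stays close to one in magnitude.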

END