
15_generative-model

Qiuhong Ke

Generative models
COMP90051 Statistical Machine Learning

“What I Cannot Create, I Do Not Understand” – Richard Feynman

Copyright: University of Melbourne

So far…

Classifiers:

• SVM

• Perceptron

• Multi-layer perceptron

• CNN

Feature extraction:

• Pretrained CNN model

Using a pretrained model in Keras
Feature extraction before the last classifier

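The slides' Keras screenshots are not reproduced here, so below is a minimal sketch of the idea, assuming a VGG16 backbone pretrained on ImageNet (the model choice is an assumption, not the lecture's exact code):

```python
# Extract the feature vector that feeds the final classifier of a
# pretrained VGG16: include_top=False drops the classifier head, and
# pooling='avg' returns one global-average-pooled vector per image.
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

model = VGG16(weights='imagenet', include_top=False, pooling='avg')

x = np.random.rand(1, 224, 224, 3) * 255.0   # placeholder image batch
features = model.predict(preprocess_input(x))
print(features.shape)                        # (1, 512)
```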

Using a pretrained model in Keras
Feature extraction from arbitrary layer

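A minimal sketch of extracting features from an arbitrary intermediate layer, assuming VGG16 and its layer name 'block3_pool' (any name listed by model.summary() works):

```python
# Build a sub-model whose output is an intermediate layer's activations.
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.models import Model

base = VGG16(weights='imagenet')
feature_extractor = Model(inputs=base.input,
                          outputs=base.get_layer('block3_pool').output)
# feature_extractor.predict(batch) now returns block3_pool activations.
```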

Using a pretrained model in Keras
Perform classification


Predict the class using a pretrained ResNet-50
Pretrained on ImageNet (1000 classes)

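A minimal sketch of this prediction step ('cat.jpg' is a placeholder path):

```python
# Classify an image with ResNet50 pretrained on the 1000 ImageNet classes.
import numpy as np
from tensorflow.keras.applications.resnet50 import (
    ResNet50, preprocess_input, decode_predictions)
from tensorflow.keras.preprocessing import image

model = ResNet50(weights='imagenet')

img = image.load_img('cat.jpg', target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])   # top-3 (id, class name, prob)
```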

Beyond classification
Image generation


Beyond classification
Image generation

Figure 8.10 in Deep Learning with Python by François Chollet

Beyond classification
Image editing

Shen, Wei, and Rujie Liu. “Learning residual images for face attribute manipulation.” CVPR 2017.

Generative Models
Cope with all of the above tasks

• Variational Autoencoder (VAE)

• Generative Adversarial Network (GAN)


Outline

• Autoencoder (AE)

• Variational Autoencoder (VAE)

• Generative Adversarial Network (GAN)


Meaningful representations can reconstruct the input data

AE VAE GAN

[Figure: an orange; a meaningful representation such as “orange” (colour) and “round” (shape) is enough to reconstruct it]


Encoder – decoder

AE VAE GAN

Extract meaningful representation to reconstruct the input data

Data → Encoder → latent representation → Decoder → Data



Network architecture

AE VAE GAN

[Figure: fully connected autoencoder. The encoder maps the input x₁, …, xₙ through FC layers to the latent code z; the decoder maps z through FC layers back to the reconstruction x̂₁, …, x̂ₙ]

Network architecture

AE VAE GAN

[Figure: convolutional autoencoder. Encoder: Conv → Max-pooling → … → Flatten → FC; decoder: FC → … → Up-sampling → Conv]

Upsampling

AE VAE GAN

size: the upsampling factors for rows and columns.
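
A small sketch showing the effect of Keras's UpSampling2D and its size argument:

```python
# Each input row/column is repeated size[0]/size[1] times.
import numpy as np
from tensorflow.keras.layers import UpSampling2D

x = np.arange(4, dtype='float32').reshape(1, 2, 2, 1)  # a 2x2 "image"
y = UpSampling2D(size=(2, 2))(x)                       # upsample to 4x4
print(y.shape)                                         # (1, 4, 4, 1)
```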

Training and testing

AE VAE GAN

Training: minimise the reconstruction loss between the predictions and the targets, where the targets are the inputs themselves.

Loss choices:

• 'mse': L = E[(X − y_pre)²]

• 'binary_crossentropy': when the input values lie between 0 and 1

Testing: keep only the encoder, Data → Encoder → latent representation.
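
A minimal, self-contained training sketch (the 784-32-784 architecture is an assumption for MNIST-like inputs scaled to [0, 1]):

```python
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

inp = Input(shape=(784,))
code = Dense(32, activation='relu')(inp)       # encoder: 784 -> 32
out = Dense(784, activation='sigmoid')(code)   # decoder: 32 -> 784
autoencoder = Model(inp, out)
encoder = Model(inp, code)                     # shares the encoder layers

# Inputs in [0, 1], so binary cross-entropy is valid ('mse' also works).
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
# Training: the targets are the inputs themselves.
# autoencoder.fit(x_train, x_train, epochs=20, batch_size=128)
# Testing: codes = encoder.predict(x_test)
```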

Dimension reduction

AE VAE GAN

Data → Encoder (encoding) → low-dimensional latent representation → Decoder (decoding) → Data

Image denoising

AE VAE GAN

[Figure: noisy inputs and their clean ground-truth images, alongside the autoencoder's predicted (denoised) outputs]

Denoising Autoencoder

AE VAE GAN

Training (we have access to clean data): noisy image → autoencoder → prediction, with the reconstruction loss computed against the corresponding clean image.
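
A minimal sketch of this setup, reusing the autoencoder and clean x_train from the earlier sketch (the noise level 0.3 is an arbitrary choice):

```python
import numpy as np

# Corrupt the inputs; keep the clean images as targets.
noise = np.random.normal(loc=0.0, scale=0.3, size=x_train.shape)
x_train_noisy = np.clip(x_train + noise, 0.0, 1.0)

# Same reconstruction loss as before, but targets differ from inputs.
autoencoder.fit(x_train_noisy, x_train, epochs=20, batch_size=128)
```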

Denoising Autoencoder

AE VAE GAN

Testing: noisy image → trained autoencoder → denoised output.

Generate new data?

AE VAE GAN

Data (existing) → Encoder → z (fixed) → Decoder → Data (fixed)

NO: a plain autoencoder only reconstructs the latent codes of existing inputs; it gives no principled way to sample new data.

Data generation from latent variables

AE VAE GAN

[Figure: a digit image is encoded to a 2-D latent code z, e.g. (−3.71, −2.36); decoding that code, or nearby codes such as (−2.89, −1.74) and (−3.21, −2.83), produces digit images]

Data generation from latent variable

AE VAE GAN

p(x): probability of the data x
p(z): probability of the latent variable z
p(x|z): probability of x given z

p(x) = ∫ p(x|z) p(z) dz

Turn latent variable into data using a decoder (network)

AE VAE GAN

Assume p(z) = N(z|0, I), where I is the identity matrix.

Randomly sample z ∼ p(z) → Decoder p(x|z) → x̂

Question: does x̂ follow the same distribution as the training data x?

Encoder: Reduce sampling space

AE VAE GAN

Sampling z from the posterior distribution p(z|x) is more likely to produce x, but this true distribution is intractable. Approximate it with an encoder network that outputs μ_z and σ_z:

q(z|x) = N(z | μ_z, σ_z² I)

Data x → Encoder q(z|x) → sample z → Decoder p(x|z) → x̂

Math is coming…

AE VAE GAN


Network optimisation: Maximize log likelihood and gradient ascent

AE VAE GAN

log p(x^(i)) = E_{z∼q(z|x)}[ log p(x^(i)) ]

= E_{z∼q(z|x)}[ log ( p(x^(i)|z) p(z) / p(z|x^(i)) ) ]

= E_{z∼q(z|x)}[ log ( p(x^(i)|z) p(z) q(z|x^(i)) / ( p(z|x^(i)) q(z|x^(i)) ) ) ]

= E_{z∼q(z|x)}[ log p(x^(i)|z) ] − E_{z∼q(z|x)}[ log ( q(z|x^(i)) / p(z) ) ] + E_{z∼q(z|x)}[ log ( q(z|x^(i)) / p(z|x^(i)) ) ]

= E_{z∼q(z|x)}[ log p(x^(i)|z) ] − D_KL( q(z|x^(i)) || p(z) ) + D_KL( q(z|x^(i)) || p(z|x^(i)) )

where D_KL denotes the KL divergence.

Maximize lower bound

AE VAE GAN

log p(x^(i)) = E_{z∼q(z|x)}[ log p(x^(i)|z) ] − D_KL( q(z|x^(i)) || p(z) ) + D_KL( q(z|x^(i)) || p(z|x^(i)) )
≥ E_{z∼q(z|x)}[ log p(x^(i)|z) ] − D_KL( q(z|x^(i)) || p(z) )   (lower bound)

The dropped term D_KL( q(z|x^(i)) || p(z|x^(i)) ) involves the unknown true posterior, but a KL divergence is always ≥ 0, so what remains is a lower bound we can maximise instead.

• max E_{z∼q(z|x)}[ log p(x^(i)|z) ]: minimise the reconstruction loss between output and input.

• q(z|x): output of the encoder, q(z|x) = N(z|μ, σ²I).

• p(z): prior distribution, assumed to be a standard Gaussian, p(z) = N(z|0, I).

• D_KL( q(z|x^(i)) || p(z) ): KL divergence between two Gaussian distributions (closed form, derived next).

AE VAE GAN

Math is coming AGAIN…


Warning of math: KL divergence

AE VAE GAN

KL(p₁ || p₂) = ∫ p₁(x) log( p₁(x) / p₂(x) ) dx

with the Gaussian densities

p₁(x) = 1/(σ₁√(2π)) · exp( −(1/2) ((x − μ₁)/σ₁)² )
p₂(x) = 1/(σ₂√(2π)) · exp( −(1/2) ((x − μ₂)/σ₂)² )

Substituting the log densities:

KL(p₁ || p₂) = ∫ [ −log σ₁ − (1/2) log(2π) − (1/2) ((x − μ₁)/σ₁)² − ( −log σ₂ − (1/2) log(2π) − (1/2) ((x − μ₂)/σ₂)² ) ] p₁(x) dx

= −log σ₁ + log σ₂ + 1/(2σ₂²) · E₁[(x − μ₂)²] − 1/(2σ₁²) · E₁[(x − μ₁)²]

= −log σ₁ + log σ₂ + 1/(2σ₂²) · E₁[(x − μ₂)²] − 1/2

(using E₁[(x − μ₁)²] = σ₁², the variance under p₁)

Warning of math: KL divergence

AE VAE GAN

Now specialise to p₁(x) = N(μ, σ²) and p₂(x) = N(0, 1), i.e. μ₂ = 0, σ₂ = 1:

KL(p₁ || p₂) = −log σ₁ + log σ₂ + 1/(2σ₂²) · E₁[(x − μ₂)²] − 1/2

= −log σ + (1/2) E₁[x²] − 1/2

= −log σ + (1/2)(σ² + μ²) − 1/2

= (1/2)( −log σ² + σ² + μ² − 1 )

(using E₁[x²] = σ² + μ²)

(Summary) Different from AE: estimate the distribution of z

AE VAE GAN

Training: Data x → Encoder → μ and log σ² → randomly sample z using the reparameterization trick → Decoder → x̂

Unlike the AE, where the latent code is a fixed function of the input, the VAE samples z from the estimated distribution. For easy and stable training, the encoder outputs log σ² (rather than σ²).

Sample z: Reparameterization trick to make the network differentiable

AE VAE GAN

Sample z₀ ∼ N(0, I), then set z = μ + σ z₀, so that z ∼ N(μ, σ²I). The randomness is isolated in z₀, so the network stays differentiable with respect to μ and σ.
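
A minimal Keras sketch of the trick (the 2-D latent size and layer sizes are assumptions):

```python
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Dense, Input, Lambda

inp = Input(shape=(784,))
h = Dense(256, activation='relu')(inp)
mu = Dense(2)(h)         # latent mean
log_var = Dense(2)(h)    # latent log-variance

def sampling(args):
    mu, log_var = args
    z0 = K.random_normal(shape=K.shape(mu))   # z0 ~ N(0, I)
    return mu + K.exp(0.5 * log_var) * z0     # z ~ N(mu, sigma^2 I)

z = Lambda(sampling)([mu, log_var])
```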

https://www.jeremyjordan.me/variational-autoencoders/

(Summary) Different from AE: Two losses

AE VAE GAN

Training:

Data → Encoder → μ, log σ² → randomly sample z ∼ N(μ, σ²) → Decoder → reconstruction

Two losses:

• Reconstruction loss (targets = input)

• KL divergence between q(z|x) and the prior p(z) = N(z|0, I):

KL = (1/2)( −log σ² + σ² + μ² − 1 )

https://www.jeremyjordan.me/variational-autoencoders/

Testing

AE VAE GAN

Randomly sample z ∼ N(0, I) → Decoder → new data
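
A minimal testing sketch, assuming a trained decoder model with latent dimension latent_dim:

```python
import numpy as np

z = np.random.normal(size=(16, latent_dim))  # 16 random latent codes
generated = decoder.predict(z)               # 16 newly generated images
```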

Testing

AE VAE GAN

[Figure: each random sample z ∼ N(0, I) decodes to a random digit]

AE VAE GAN

Conditional VAE

Data x, concatenated with the condition c → Encoder → μ, log σ² → randomly sample z → z, concatenated with c → Decoder → Data

condition: a one-hot label vector, e.g. c = [1, 0, 0, ⋯, 0]ᵀ
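
A minimal sketch of the two concatenations, assuming tensors x (flattened input), c (one-hot condition) and z (sampled latent code) already exist:

```python
from tensorflow.keras.layers import Concatenate

encoder_in = Concatenate(axis=-1)([x, c])  # condition the encoder on c
decoder_in = Concatenate(axis=-1)([z, c])  # condition the decoder on c
```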

https://www.jeremyjordan.me/variational-autoencoders/

AE VAE GAN

Two-player game

z ∼ N(0, I) → Generator → generated image → Discriminator → real or fake?

• Generator: aims to generate more realistic images to cheat the discriminator.

• Discriminator: aims to tell whether an image is real or generated.

Source: https://pixabay.com/photos/tiger-cub-tiger-cub-sumatran-cute-164992/


AE VAE GAN

Discriminator: binary cross-entropy

• A real image from the training dataset → Discriminator → target y = 1

• A generated image (produced by the generator from noise) → Discriminator → target y = 0

The discriminator is trained with the binary cross-entropy loss on both kinds of input.

Source: https://pixabay.com/photos/tiger-cub-tiger-cub-sumatran-cute-164992/


AE VAE GAN

Keras code for discriminator

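The slide's code is not reproduced here; the following is a minimal sketch of a plausible discriminator for 28×28 grayscale images (all architecture details are assumptions):

```python
from tensorflow.keras.layers import Conv2D, Dense, Flatten, LeakyReLU
from tensorflow.keras.models import Sequential

discriminator = Sequential([
    Conv2D(64, 3, strides=2, padding='same', input_shape=(28, 28, 1)),
    LeakyReLU(0.2),
    Conv2D(128, 3, strides=2, padding='same'),
    LeakyReLU(0.2),
    Flatten(),
    Dense(1, activation='sigmoid'),   # probability that the input is real
])
discriminator.compile(optimizer='adam', loss='binary_crossentropy')
```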

AE VAE GAN

Generator: Fool discriminator by adversarial loss

z → Generator → generated image → Discriminator (weights fixed, used only for prediction) → score

Adversarial loss: binary cross-entropy (BCE) between the discriminator's score and the target y = 1; only the generator is trained on this loss.

Is the generated image similar to a real image? Use the discriminator to check by predicting a score: if the image looks real, the discriminator outputs a high score, and the BCE against the target 1 (the adversarial loss) is small. The generator therefore updates its weights to minimise the adversarial loss.

AE VAE GAN

Keras code for generator

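Again a minimal sketch rather than the slide's exact code, reusing the discriminator above and assuming a 100-dimensional latent space:

```python
from tensorflow.keras.layers import (Conv2D, Dense, LeakyReLU, Reshape,
                                     UpSampling2D)
from tensorflow.keras.models import Sequential

latent_dim = 100
generator = Sequential([
    Dense(128 * 7 * 7, input_dim=latent_dim),
    LeakyReLU(0.2),
    Reshape((7, 7, 128)),
    UpSampling2D(), Conv2D(64, 3, padding='same'), LeakyReLU(0.2),
    UpSampling2D(), Conv2D(1, 3, padding='same', activation='sigmoid'),
])

# Freeze the discriminator inside the combined model: when training on
# the adversarial loss (BCE against y=1), only the generator is updated.
discriminator.trainable = False
gan = Sequential([generator, discriminator])
gan.compile(optimizer='adam', loss='binary_crossentropy')
# gan.train_on_batch(z_batch, ones) pushes generated images towards "real".
```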

AE VAE GAN

Conditional generative adversarial nets

Mirza, Mehdi, and Simon Osindero. “Conditional generative adversarial nets.” arXiv preprint arXiv:1411.1784 (2014).


AE VAE GAN

Context encoder

Pathak, Deepak, et al. “Context encoders: Feature learning by inpainting.” CVPR 2016.

AE VAE GAN

Context encoder

Pathak, Deepak, et al. “Context encoders: Feature learning by inpainting.” CVPR 2016.

• The generator inpaints the missing region and is trained with two losses: an L2 reconstruction loss against the ground truth, and an adversarial loss (BCE with target y = 1, obtained by passing the generator's output through the frozen discriminator).

• The discriminator itself is trained in the same way as in the original GAN.

AE VAE GAN

Other applications

Jin, Yanghua, et al. “Towards the automatic anime characters creation with generative adversarial networks.” arXiv preprint arXiv:1708.05509 (2017).

AE VAE GAN

Other applications

Turning a horse video into a zebra video (by CycleGAN)


Summary

• How do we train an AE to learn meaningful representations and to denoise images?

• How do we train a VAE?

• How do we train a GAN?