深度学习入门(8) - Generative models 生成模型

Generative models

Supervised vs. Unsupervised
Discriminative Model vs. Generative Models vs. Conditional Generative

Discriminative: only label compete for probability mass, no competition between images

Generative: images compete with each other for probability mass

usage

Discriminative:

  1. Feature learning
  2. Assign labels to data

Generative:

  1. detect outliers
  2. feature learning
  3. sample to generate new data

Conditional Generative:

  1. assign labels while rejecting outliers
  2. generate new data conditioned on input labels

请添加图片描述

Autoregressive

Goal: Write down an explicit function for p(x)=f(x,W)p(x) = f(x,W)p(x)=f(x,W)

We can break down the probability function to get p(x)=p(x1,x2,...,xT)=Πt=1Tp(xt∣x1,x2,...xt−1)p(x) = p(x_1,x_2,...,x_T) = \Pi_{t=1}^Tp(x_t|x_1,x_2,...x_{t-1})p(x)=p(x1,x2,...,xT)=Πt=1Tp(xtx1,x2,...xt1)

We can use RNN to train a density function

PixelRNN

generate image pixels one at a time, staring at the upper left corner

compute a hidden state for each pixel that depends on hidden state and RGB values from the left and above (LSTM recurrence)

hx,y=f(hx−1,y,hx,y−1,W)h_{x,y} = f(h_{x-1,y},h_{x,y-1},W)hx,y=f(hx1,y,hx,y1,W)

At each pixel, predict red, then blue, then green

Each pixel depends implicitly on all pixels above and to the left

Problem: Really slow both training and testing

PixelCNN

Dependency on previous pixel now modeled using a CNN over context

on new CIFAR images

Autoregressive Models: PixelRNN/CNN

Pros:

  1. can explicitly compute likelihood
  2. gives good evaluation metric
  3. good samples

Cons:

  1. sequential generation -> slow

Variational Autoencoders

VAE define an intractable density.

we can optimize a lower bound on this density instead.

(non-variational) Autoencoders

Features should extract useful information that can use for downstream tasks

Problem: how can we learn this feature transform from raw data?

idea: use the features to reconstruct the input data with a decoder

loss: L2 distance between input and reconstructed data

Variational Autoencoders

  1. learn latent features z from raw data
  2. sample from the model to generate new data

请添加图片描述

How to train this model: maximize the likelihood of data

However, we can’t get full access to all the x

so, we need to find a computable lower bound of the likelihood

Hopefully, with the growing of lower-bound, the real likelihood will increase

请添加图片描述

Process:

  1. run input data through encoder to get a distribution over latent codes
  2. encoder output should match the prior p(z)p(z)p(z)

​ Here, we assume the distribution we learn and the prior p(z)p(z)p(z) are both diagonal Gaussian distribution for computational convenience.

  1. sample code z from encoder output
  2. run sampled code through decoder to get a distribution over data samples
  3. original input data should be likely under the distribution output from step 4

Step 2, 5 are the two terms of loss we try to minimize

Generative Adversarial Networks

Setup: Assume we have data xxx drawn from distribution pdata(x)p_{data}(x)pdata(x). We want to sample from pdata(x)p_{data}(x)pdata(x).

Idea: Introduce a latent variable z with simple prior p(z)p(z)p(z).

sample z from p(z)p(z)p(z) and pass to a Generator Network x=G(z)x = G(z)x=G(z)

Then x is a sample from the generator distribution pGp_GpG , we want pG=pdatap_G = p_{data}pG=pdata

We train the Generator Network to convert z into fake data x sampled from pGp_GpG

by fooling the discriminator D

Train D to classify data as real or fake

We will train the two networks jointly, they are fighting against each other.

loss

请添加图片描述

problem: At the beginning of training, vanishing gradients for G

solution: change minimize log(1-D(G(z))) to maximize -log(D(G(z))) then G gets strong gradients at the beginning of training ! nice idea

We can do some math calculations to prove GAN can get the optimal pdatap_{data}pdata.

But there are still caveats about the capability of our fixed architecture to reach the optimal and its convergence.

DC-GAN

interpolating between points in latent z space

We can even do latent vector math!
Conditional GANs

we can use conditional Batch Normalization

learn parameters for different labels

Spectral Normalization

we can generate images in specific labels

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值