DCGAN

Part two, on porting the Keras model to the browser and publishing it with TensorFlow.js, has already been written; see it for the details of converting to the web version and the JavaScript side.

Deep Convolutional Generative Adversarial Network

A deep convolutional generative adversarial network based on TensorFlow

Generative adversarial networks (GANs) are one of the hot areas of modern machine learning. A GAN has two sub-networks, a generator and a discriminator. The generator's job is to turn random noise into fake data that closely resembles the original dataset, while the discriminator's job is to pick the fake data out of the mix. The two networks are trained together and co-evolve, with the ultimate goal of producing data that can pass for the real thing.

Taking images as an example, the two models are trained simultaneously through an adversarial process. A generator (the "artist") learns to create images that look real, while a discriminator (the "art critic") learns to tell real images from fakes. As training progresses, the generator gets better at producing realistic images and the discriminator gets better at spotting them. Training reaches equilibrium when the discriminator can no longer distinguish real images from forgeries.
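For reference, the original GAN paper formalizes this tug-of-war as a minimax game between the generator G and the discriminator D:

\min_G \max_D \, V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]

In practice (and in the code later in this post) the generator is trained with the non-saturating variant, maximizing log D(G(z)) instead of minimizing log(1 − D(G(z))), which is exactly what the cross-entropy generator loss defined below computes.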

We will now build and train a DCGAN that generates 96×96 images, using the TF2 Keras high-level API. The training dataset comes from https://zhuanlan.zhihu.com/p/24767059

Reasons for choosing this dataset:

  1. Anime-style face images have simple structure, strong and obvious features, and a high tolerance for error, so visible results can be produced in a reasonable amount of time even with limited compute
  2. It is ready-made, so there is no preprocessing to do myself

The hardware is a lowly laptop: i7 6700HQ, 16 GB RAM, and a GTX 950M with 4 GB of VRAM.

The software environment is Ubuntu 18.04 LTS, CUDA 10.2, and TensorFlow 2.1.0.

First, import the necessary Python libraries.

The following part enables on-demand GPU memory allocation, which prevents out-of-memory errors on low-VRAM devices.


from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()

config.gpu_options.allow_growth = True

session = InteractiveSession(config=config)

tqdm is imported to add a progress bar.

from __future__ import absolute_import, division, print_function, unicode_literals

# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt

from tensorflow.keras import datasets, layers, models

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

print(tf.__version__)
WARNING:root:Limited tf.compat.v2.summary API due to missing TensorBoard installation.
2.1.0
import glob
import imageio
import PIL
import time
from IPython import display
import os

from tqdm import trange  # progress bar for the training loop

Generator network

With the Keras high-level API it is easy to put together a simple convolutional network.

The network takes a length-100 random vector as input and connects it to a dense layer (layers.Dense) of 6×6×1024 units. After batch normalization and the activation, the output is reshaped into a 6×6×1024 tensor, which then passes through four upsampling transposed-convolution layers that grow the image step by step from 6×6 to 96×96×3.

(Schematic of the generator architecture)

Now take a look at the model summary: 20,974,720 trainable parameters in total.

gnet = tf.keras.Sequential()
# 100-dim noise -> dense layer of 6*6*1024 units, then reshape to a 6x6x1024 tensor
gnet.add(layers.Dense(6*6*1024, use_bias=False, input_shape=(100,)))
gnet.add(layers.BatchNormalization())
gnet.add(layers.LeakyReLU())
gnet.add(layers.Reshape((6, 6, 1024)))

# Four stride-2 transposed convolutions: 6x6 -> 12x12 -> 24x24 -> 48x48 -> 96x96
gnet.add(layers.Conv2DTranspose(512, (5, 5), strides=(2, 2), padding='same', use_bias=False))
gnet.add(layers.BatchNormalization())
gnet.add(layers.LeakyReLU())

gnet.add(layers.Conv2DTranspose(256, (5, 5), strides=(2, 2), padding='same', use_bias=False))
gnet.add(layers.BatchNormalization())
gnet.add(layers.LeakyReLU())

gnet.add(layers.Conv2DTranspose(128, (5, 5), strides=(2, 2), padding='same', use_bias=False))
gnet.add(layers.BatchNormalization())
gnet.add(layers.LeakyReLU())

# Final layer outputs a 96x96x3 image with tanh activation
gnet.add(layers.Conv2DTranspose(3, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
gnet.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_5 (Dense)              (None, 36864)             3686400   
_________________________________________________________________
batch_normalization (BatchNo (None, 36864)             147456    
_________________________________________________________________
leaky_re_lu_25 (LeakyReLU)   (None, 36864)             0         
_________________________________________________________________
reshape_5 (Reshape)          (None, 6, 6, 1024)        0         
_________________________________________________________________
conv2d_transpose_6 (Conv2DTr (None, 12, 12, 512)       13107200  
_________________________________________________________________
batch_normalization_1 (Batch (None, 12, 12, 512)       2048      
_________________________________________________________________
leaky_re_lu_26 (LeakyReLU)   (None, 12, 12, 512)       0         
_________________________________________________________________
conv2d_transpose_7 (Conv2DTr (None, 24, 24, 256)       3276800   
_________________________________________________________________
batch_normalization_2 (Batch (None, 24, 24, 256)       1024      
_________________________________________________________________
leaky_re_lu_27 (LeakyReLU)   (None, 24, 24, 256)       0         
_________________________________________________________________
conv2d_transpose_8 (Conv2DTr (None, 48, 48, 128)       819200    
_________________________________________________________________
batch_normalization_3 (Batch (None, 48, 48, 128)       512       
_________________________________________________________________
leaky_re_lu_28 (LeakyReLU)   (None, 48, 48, 128)       0         
_________________________________________________________________
conv2d_transpose_9 (Conv2DTr (None, 96, 96, 3)         9600      
=================================================================
Total params: 21,050,240
Trainable params: 20,974,720
Non-trainable params: 75,520
_________________________________________________________________
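As a quick sanity check that this stack really turns a 100-dimensional noise vector into a 96×96×3 image, you can push a random vector through the untrained generator (a minimal sketch; the shape in the comment is the only thing to look for):

noise = tf.random.normal([1, 100])        # a single random latent vector
fake_image = gnet(noise, training=False)  # run the untrained generator
print(fake_image.shape)                   # expected: (1, 96, 96, 3)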

Discriminator network

The discriminator is essentially the generator run in reverse.

Three convolutional layers plus a 20-unit dense layer. The final activation lives inside the loss function (from_logits=True), so it is not added to the model here.
Dropout layers are placed between layers so the discriminator cannot simply memorize the images in the dataset.

def make_discriminator_model():
    model = tf.keras.Sequential()
    model.add(layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same',
                                     input_shape=[96, 96, 3]))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))

    model.add(layers.Conv2D(256, (5, 5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))
    
    model.add(layers.Conv2D(512, (5, 5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Flatten())
    model.add(layers.Dense(20))
#     model.add(layers.Flatten())
    model.add(layers.Dense(1))
    return model
dent = make_discriminator_model()
dent.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_21 (Conv2D)           (None, 48, 48, 128)       9728      
_________________________________________________________________
leaky_re_lu_32 (LeakyReLU)   (None, 48, 48, 128)       0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 48, 48, 128)       0         
_________________________________________________________________
conv2d_22 (Conv2D)           (None, 24, 24, 256)       819456    
_________________________________________________________________
leaky_re_lu_33 (LeakyReLU)   (None, 24, 24, 256)       0         
_________________________________________________________________
dropout_3 (Dropout)          (None, 24, 24, 256)       0         
_________________________________________________________________
conv2d_23 (Conv2D)           (None, 12, 12, 512)       3277312   
_________________________________________________________________
leaky_re_lu_34 (LeakyReLU)   (None, 12, 12, 512)       0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 73728)             0         
_________________________________________________________________
dense_8 (Dense)              (None, 20)                1474580   
_________________________________________________________________
dense_9 (Dense)              (None, 1)                 21        
=================================================================
Total params: 5,581,097
Trainable params: 5,581,097
Non-trainable params: 0
_________________________________________________________________
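Similarly, a quick check (a minimal sketch) that the discriminator maps a 96×96×3 image to a single real/fake logit:

dummy = tf.random.normal([1, 96, 96, 3])   # a fake "image" of the right shape
print(dent(dummy, training=False).shape)   # expected: (1, 1) — one logit per image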

The two networks together have over twenty-five million parameters; the GTX 950M is already trembling. Once they are built, save them right away as insurance against losing everything to a crash.

gnet.save('g.h5')
dent.save("d.h5")

Importing the dataset

Here we use a dataset that someone else has already prepared, so there is essentially no preprocessing. Keras ships with an image data generator that, given an image directory, discovers the files automatically and can even resize them on the fly.

train_datagen.flow_from_directory returns a Python generator; see the Keras documentation and any Python tutorial on generators for more details.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
                './face',              # target directory
                target_size=(96, 96),  # resize every image to 96x96
                batch_size=50,         # 50 images per batch; any more and VRAM would overflow
                class_mode=None)       # no labels
Found 51223 images belonging to 1 classes.

A total of 51,223 images were found in the dataset directory. Let's pull one out and take a look.

for ll in train_generator:   # equivalent to fu = next(train_generator)
    fu = ll
    break
plt.imshow(fu[19, :])        # show image 19 of the 50-image batch
<matplotlib.image.AxesImage at 0x7ff1edd9dc88>

output_19_1.png (a sample image from the dataset)

Loss functions

This DCGAN is not a simple network you can just drop into Keras's built-in training; the loss functions have to be defined by hand. Cross-entropy loss is used here.

discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)
generator_optimizer = tf.keras.optimizers.Adam(1e-4)

These are the optimizer settings; the learning rate can be tuned here.
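As a point of reference, the original DCGAN paper used Adam with a learning rate of 2e-4 and beta_1 = 0.5; if you want to try that instead of the defaults, the change is a one-liner (shown only as an optional tweak, not what the rest of this post uses):

generator_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
discriminator_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)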

cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)
def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss
def generator_loss(fake_output):
    return cross_entropy(tf.ones_like(fake_output), fake_output)
generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)
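Because the discriminator ends with a plain Dense(1) and no sigmoid, from_logits=True is what tells the loss to interpret those raw outputs as logits. A minimal sketch to convince yourself that this matches applying a sigmoid yourself (the numbers are made up purely for illustration):

bce_logits = tf.keras.losses.BinaryCrossentropy(from_logits=True)
bce_probs = tf.keras.losses.BinaryCrossentropy(from_logits=False)

logits = tf.constant([[2.0], [-1.0]])   # raw discriminator outputs (logits)
labels = tf.constant([[1.0], [0.0]])    # 1 = real, 0 = fake

print(bce_logits(labels, logits).numpy())
print(bce_probs(labels, tf.sigmoid(logits)).numpy())   # same value as above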

Training functions

With the loss functions in place, the next step is the actual training procedure. A network like this cannot simply use model.compile and model.fit; the training step has to be written by hand. It also pays to have a visualization function, so you can inspect the output during training and keep a record of how learning progresses. As for the loss values themselves, in a DCGAN they carry little practical meaning: the two networks are constantly fighting and changing, so the generated output is what really decides whether things are working.

EPOCHS = 50
noise_dim = 100
num_examples_to_generate = 16
# BATCH_SIZE = 40

# We will reuse this seed throughout training (it makes progress easier to visualize in an animated GIF)
seed = tf.random.normal([num_examples_to_generate, noise_dim])
BATCH_SIZE = 50

The @tf.function decorator above the function means it gets "compiled" into a TensorFlow graph.

@tf.function
def train_step(images):
    noise = tf.random.normal([BATCH_SIZE, noise_dim])

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = gnet(noise, training=True)

        real_output = dent(images, training=True)
        fake_output = dent(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    gradients_of_generator = gen_tape.gradient(gen_loss, gnet.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, dent.trainable_variables)
#     print('genloss: '+str(gen_loss)+' dis_loss '+str(disc_loss))

    generator_optimizer.apply_gradients(zip(gradients_of_generator, gnet.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, dent.trainable_variables))
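A note on the commented-out print above: inside a @tf.function, a plain Python print only runs while the function is being traced, so it would not log the losses on every step. tf.print is the graph-friendly alternative; a tiny self-contained sketch of the difference:

import tensorflow as tf

@tf.function
def traced_demo(x):
    print('tracing...')    # Python print: fires only while the graph is traced
    tf.print('x =', x)     # tf.print: fires on every call inside the graph
    return x + 1

traced_demo(tf.constant(1))
traced_demo(tf.constant(2))   # 'tracing...' does not appear again, 'x = 2' does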

A visualization-and-saving function that plots the generated images and writes them to disk.

def generate_and_save_images(model, epoch, test_input):
    # Note that training is set to False,
    # so every layer runs in inference mode (batchnorm).
    predictions = model(test_input, training=False)

    fig = plt.figure(figsize=(4,4))

    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i+1)
        plt.imshow(np.array(predictions[i, :] * 127.5 + 127.5).astype(np.int))
        plt.axis('off')

    plt.savefig('./pic/019kimage_at_epoch_{:04d}.png'.format(epoch), bbox_inches='tight')
    plt.show()
generate_and_save_images(gnet,2,seed)

output_32_0.png (4×4 grid of samples produced by generate_and_save_images)

The main training loop, for running training continuously.

def train(dataset, epochs):

    for epoch in trange(epochs):
        start = time.time()

        # The Keras generator loops forever, so take just one batch per "epoch"
        for image_batch in dataset:
            train_step(image_batch)
            break

        display.clear_output(wait=True)

        # Generate and save sample images every 100 epochs
        if (epoch + 1) % 100 == 0:
            generate_and_save_images(gnet, (epoch + 1), seed)
            print ('Time for epoch {} is {} sec'.format(epoch + 1, time.time()-start))

        # Save both models every 1000 epochs
        if (epoch + 1) % 1000 == 0:
            gnet.save('./pic2/gnet6000c{}.h5'.format(epoch+1))
            dent.save('./pic2/dent6000c{}.h5'.format(epoch+1))

    gnet.save('gn.h5')
    dent.save('dn.h5')
#             checkpoint.save(file_prefix = checkpoint_prefix)

#         print ('Time for epoch {} is {} sec'.format(epoch + 1, time.time()-start))

Run a 30-epoch test training to make sure everything works.

train(train_generator,30)
# gnet.save('gnet.h5')
# dent.save('dnet.h5')



100%|██████████| 30/30 [01:05<00:00,  2.98s/it]




Time for epoch 30 is 4.958022832870483 sec
gnet.save('g.h5')
dent.save('d.h5')

Start training

Once everything checks out, the rest is handed over to the GPU. If the results look unsatisfying you can tweak the parameters a bit, but in many cases the real cause is simply not enough training iterations; after all, the two models together have more than twenty-five million parameters.

The result after running on and off for about two weeks:

train(train_generator,2000)
gnet.save('g.h5')
dent.save('d.h5')

output_39_0.png (faces generated after the long training run)

Time for epoch 2000 is 4.741416692733765 sec



100%|██████████| 2000/2000 [1:09:34<00:00,  3.03s/it]

Finally, the model can be published to a web page through TensorFlow.js; a demo is at https://www.dogcraft.top/dcgan/
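For reference, one way to get the trained generator into TensorFlow.js is the converter that ships with the tensorflowjs pip package; a hedged sketch (the output directory name here is just an example):

import tensorflowjs as tfjs

# Convert the saved Keras generator into the TF.js Layers format
# that the browser demo can load.
tfjs.converters.save_keras_model(gnet, './web_model')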