
In the last part, we parsed the Dodo Pizza website and downloaded the data about the ingredients. The most important part is the pizza photos: in total there were 20 pizzas at our disposal. Obviously, you cannot build a training set out of 20 images. We can, however, exploit the axial symmetry of a pizza: rotating a photo one degree at a time and adding a vertical reflection turns one photo into a set of 720 images. Still not much, but let's give it a try.
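A minimal sketch of this augmentation (assuming the photos are height x width x channels numpy arrays; the helper name is mine, not from the original code):

import numpy as np
from scipy.ndimage import rotate

def augment_pizza(img):
    # 360 one-degree rotations, each with a vertically reflected copy: 720 images
    out = []
    for angle in range(360):
        rotated = rotate(img, angle, reshape=False, mode='nearest')
        out.append(rotated)
        out.append(np.flipud(rotated))
    return np.array(out)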
Having trained a conditional variational autoencoder, we will then move on to the main course: generative adversarial networks.
CVAE - Conditional Variational Autoencoder
An excellent series of articles helped me understand how autoencoders work; I highly recommend reading it. Here I will get straight to the point.
The difference between a CVAE and a plain VAE is that both the encoder and the decoder must be fed an additional label as input. In our case the label is the recipe vector produced by the OneHotEncoder.
There is a nuance, though: at what point does it make sense to feed in the label?
I tried two approaches:
- at the end, after all the convolutions, right before the fully connected layers;
- at the beginning, right after the first convolution, as an additional channel.
In principle, both approaches have a right to exist. When the label is added at the end, it logically attaches to the high-level features of the image; added at the beginning, it instead associates with the low-level features. Let's compare the two.
Recall that a recipe consists of at most 9 ingredients, and there are 28 ingredients in total, so the recipe code is a 9x29 matrix (the extra, 29th value apparently standing for an empty slot); flattened, it gives a 261-dimensional vector.
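For illustration, here is a sketch of how such a code could be produced; the helper and the exact "empty slot" convention are my assumptions, not the author's code:

import numpy as np

N_SLOTS, N_VALUES = 9, 29   # 9 ingredient slots, 28 ingredients + 1 "empty" value

def encode_recipe(ingredient_ids):
    # ingredient_ids: up to 9 integers in 1..28; unused slots stay "empty" (0)
    code = np.zeros((N_SLOTS, N_VALUES), dtype=np.float32)
    for slot in range(N_SLOTS):
        value = ingredient_ids[slot] if slot < len(ingredient_ids) else 0
        code[slot, value] = 1.0
    return code

vec = encode_recipe([24, 20, 17]).flatten()   # shape (261,)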
For 32x32 images I chose a latent space of size 512. A smaller number could be used, but, as we will see below, that leads to blurrier results.
The encoder code for the first way of adding the label, after all the convolutions:
def create_conv_cvae(channels, height, width, code_h, code_w):
    input_img = Input(shape=(channels, height, width))
    input_code = Input(shape=(code_h, code_w))
    flatten_code = Flatten()(input_code)
    latent_dim = 512
    m_height, m_width = int(height/4), int(width/4)

    x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
    x = MaxPooling2D(pool_size=(2, 2), padding='same')(x)
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D(pool_size=(2, 2), padding='same')(x)
    flatten_img_features = Flatten()(x)
    x = concatenate([flatten_img_features, flatten_code])
    x = Dense(1024, activation='relu')(x)

    z_mean = Dense(latent_dim)(x)
    z_log_var = Dense(latent_dim)(x)
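In both variants the listing stops at z_mean and z_log_var; the sampling step that follows is not shown. A minimal sketch of the standard reparameterization trick that a VAE needs here (names are assumptions):

from keras import backend as K
from keras.layers import Lambda

def sampling(args):
    z_mean, z_log_var = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim),
                              mean=0.0, stddev=1.0)
    # z = mu + sigma * epsilon, with sigma = exp(log_var / 2)
    return z_mean + K.exp(z_log_var / 2) * epsilon

z_sampled = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_var])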
The encoder code for the second way of adding the label, after the first convolution, as an additional channel:
def create_conv_cvae2(channels, height, width, code_h, code_w):
    input_img = Input(shape=(channels, height, width))
    input_code = Input(shape=(code_h, code_w))
    flatten_code = Flatten()(input_code)
    latent_dim = 512
    m_height, m_width = int(height/4), int(width/4)

    def add_units_to_conv2d(conv2, units):
        # tile the flat label until it covers one spatial plane of the feature map
        dim1 = K.int_shape(conv2)[2]
        dim2 = K.int_shape(conv2)[3]
        dimc = K.int_shape(units)[1]
        count = int(dim1*dim2 / dimc)
        units_repeat = RepeatVector(count+1)(units)
        units_repeat = Flatten()(units_repeat)
        # cut only the needed length of code
        units_repeat = Lambda(lambda x: x[:, :dim1*dim2],
                              output_shape=(dim1*dim2,))(units_repeat)
        units_repeat = Reshape((1, dim1, dim2))(units_repeat)
        return concatenate([conv2, units_repeat], axis=1)

    x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
    x = add_units_to_conv2d(x, flatten_code)
    # size here: (17, 32, 32)
    x = MaxPooling2D(pool_size=(2, 2), padding='same')(x)
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D(pool_size=(2, 2), padding='same')(x)
    x = Flatten()(x)
    x = Dense(1024, activation='relu')(x)

    z_mean = Dense(latent_dim)(x)
    z_log_var = Dense(latent_dim)(x)
The decoder code is the same in both cases; the label is added right at the start:
z = Input(shape=(latent_dim, ))
input_code_d = Input(shape=(code_h, code_w))
flatten_code_d = Flatten()(input_code_d)

x = concatenate([z, flatten_code_d])
x = Dense(1024)(x)
x = Dense(16*m_height*m_width)(x)
x = Reshape((16, m_height, m_width))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(channels, (3, 3), activation='sigmoid', padding='same')(x)
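The listing does not show how the pieces are assembled and compiled. A sketch of the usual wiring, reusing z_sampled from above; the author's exact loss is not shown either (the negative loss values reported below suggest a normalization different from this textbook version):

from keras import metrics
from keras.models import Model
from keras.optimizers import Adam

def vae_loss(x_true, x_decoded):
    x_true = K.batch_flatten(x_true)
    x_decoded = K.batch_flatten(x_decoded)
    # reconstruction term plus KL divergence to the unit Gaussian prior
    xent = channels * height * width * metrics.binary_crossentropy(x_true, x_decoded)
    kl = -0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    return xent + kl

decoder = Model([z, input_code_d], decoded, name='Decoder')
# three inputs: the image, the recipe for the encoder, the recipe for the decoder
cvae = Model([input_img, input_code, input_code_d],
             decoder([z_sampled, input_code_d]))
cvae.compile(optimizer=Adam(lr=1e-4), loss=vae_loss)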
The number of network parameters:
- first method: 4'221'987
- second method: 3'954'867
Training speed, one epoch:
- first method: 60 seconds
- second method: 63 seconds
Results after 40 epochs of training:
- first method: loss: -0.3232, val_loss: -0.3164
- second method: loss: -0.3245, val_loss: -0.3191
As you can see, the second method needs less memory and gives slightly better results, although it trains a little longer.
It remains to compare the results visually:
- the original images (32x32)
- the results of the first method (latent_dim = 64)
- the results of the first method (latent_dim = 512)
- the results of the second method (latent_dim = 512)

Now let's see what an application of pizza style transfer looks like: a pizza is encoded with its native recipe and then decoded with a different one.
i = 0
for label in labels:
    i += 1
    lbls = []
    for j in range(batch_size):
        lbls.append(label)
    lbls = np.array(lbls, dtype=np.float32)
    print(i, lbls.shape)
    stt_imgs = stt.predict([orig_images, orig_labels, lbls], batch_size=batch_size)
    save_images(stt_imgs, dst='temp/cvae_stt', comment='_'+str(i))
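The stt model itself is not defined in the listing; presumably it is the same three-input CVAE, fed the original recipe on the encoder side and the target recipe on the decoder side, deterministic at inference time; something like:

# use z_mean instead of the sampled z for a deterministic encoding
stt = Model([input_img, input_code, input_code_d],
            decoder([z_mean, input_code_d]))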
The style transfer results (second encoding method):

GAN - Generative Adversarial Network
I could not find a well-established Russian name for this kind of network.
Options:
- generative competing networks
- generation of rival networks
- generation of competing nets
I like this one best:
- generative adversarial networks
The same series of excellent articles was also helpful for the theory behind GANs.
For a deeper understanding, there is a recent article on the ODS blog: the neural network imitation game.
However, when I tried to understand generative networks and implement one on my own, I ran into a number of difficulties. For instance, the generator would persistently produce downright psychedelic pictures.
Various examples helped in understanding the implementation:
The Keras MNIST generative adversarial model (mnist_gan.py).
An article from late 2015 by Facebook research on DCGAN (Deep Convolutional GAN), with recommendations for this architecture: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks.
A set of recommendations for getting a GAN to work: How to Train a GAN? Tips and tricks to make GANs work.
The GAN design:
def make_trainable(net, val):
    net.trainable = val
    for l in net.layers:
        l.trainable = val

def create_gan(channels, height, width):
    input_img = Input(shape=(channels, height, width))
    m_height, m_width = int(height/8), int(width/8)

    # generator
    z = Input(shape=(latent_dim, ))
    x = Dense(256*m_height*m_width)(z)
    #x = BatchNormalization()(x)
    x = Activation('relu')(x)
    #x = Dropout(0.3)(x)
    x = Reshape((256, m_height, m_width))(x)
    x = Conv2DTranspose(256, kernel_size=(5, 5), strides=(2, 2), padding='same', activation='relu')(x)
    x = Conv2DTranspose(128, kernel_size=(5, 5), strides=(2, 2), padding='same', activation='relu')(x)
    x = Conv2DTranspose(64, kernel_size=(5, 5), strides=(2, 2), padding='same', activation='relu')(x)
    x = Conv2D(channels, (5, 5), padding='same')(x)
    g = Activation('tanh')(x)
    generator = Model(z, g, name='Generator')

    # discriminator
    x = Conv2D(128, (5, 5), padding='same')(input_img)
    #x = BatchNormalization()(x)
    x = LeakyReLU()(x)
    #x = Dropout(0.3)(x)
    x = MaxPooling2D(pool_size=(2, 2), padding='same')(x)
    x = Conv2D(256, (5, 5), padding='same')(x)
    x = LeakyReLU()(x)
    x = MaxPooling2D(pool_size=(2, 2), padding='same')(x)
    x = Conv2D(512, (5, 5), padding='same')(x)
    x = LeakyReLU()(x)
    x = MaxPooling2D(pool_size=(2, 2), padding='same')(x)
    x = Flatten()(x)
    x = Dense(2048)(x)
    x = LeakyReLU()(x)
    x = Dense(1)(x)
    d = Activation('sigmoid')(x)
    discriminator = Model(input_img, d, name='Discriminator')

    # stacked model: the discriminator is frozen while training the generator
    gan = Sequential()
    gan.add(generator)
    make_trainable(discriminator, False)
    gan.add(discriminator)

    return generator, discriminator, gan

gan_gen, gan_ds, gan = create_gan(channels, height, width)
gan_gen.summary()
gan_ds.summary()
gan.summary()

opt = Adam(lr=1e-3)
gopt = Adam(lr=1e-4)
dopt = Adam(lr=1e-4)

gan_gen.compile(loss='binary_crossentropy', optimizer=gopt)
gan.compile(loss='binary_crossentropy', optimizer=opt)
make_trainable(gan_ds, True)
gan_ds.compile(loss='binary_crossentropy', optimizer=dopt)
As you can see, the discriminator is an ordinary binary classifier that outputs 1 for real photos and 0 for fakes.
The training procedure:
- take a batch of real photos;
- generate noise and let the generator produce images from it;
- form a batch for training the discriminator out of the real images (assigned the label 1) and the generator's fakes (label 0);
- train the discriminator;
- train the GAN (discriminator training is disabled, so only the generator learns): feed noise to the input and expect the label 1 at the output.
d_loss, g_loss = [], []   # loss histories

for epoch in range(epochs):
    print('Epoch {} from {} ...'.format(epoch, epochs))
    n = x_train.shape[0]
    image_batch = x_train[np.random.randint(0, n, size=batch_size),:,:,:]
    noise_gen = np.random.uniform(-1, 1, size=[batch_size, latent_dim])
    generated_images = gan_gen.predict(noise_gen, batch_size=batch_size)

    if epoch % 10 == 0:
        print('Save gens ...')
        save_images(generated_images)
        gan_gen.save_weights('temp/gan_gen_weights_'+str(height)+'.h5', True)
        gan_ds.save_weights('temp/gan_ds_weights_'+str(height)+'.h5', True)
        # save loss
        df = pd.DataFrame( {'d_loss': d_loss, 'g_loss': g_loss} )
        df.to_csv('temp/gan_loss.csv', index=False)

    # train the discriminator on half real, half generated images
    x_train2 = np.concatenate( (image_batch, generated_images) )
    y_tr2 = np.zeros( [2*batch_size, 1] )
    y_tr2[:batch_size] = 1
    d_history = gan_ds.train_on_batch(x_train2, y_tr2)
    print('d:', d_history)
    d_loss.append( d_history )

    # train the generator: the stacked GAN should call the fakes real (label 1)
    noise_gen = np.random.uniform(-1, 1, size=[batch_size, latent_dim])
    g_history = gan.train_on_batch(noise_gen, np.ones([batch_size, 1]))
    print('g:', g_history)
    g_loss.append( g_history )
Note that, unlike with the variational autoencoder, no real images are used to train the generator, only the discriminator's labels. In other words, the generator is trained on the error gradients coming from the discriminator.
The most interesting part is that "adversarial" is not just a word in the name: the networks really do compete, and it is fun to watch the loss readings of the discriminator and the generator.
Looking at the loss curves: at first the discriminator quickly learns to tell the real images from the initial garbage produced by the generator, and then the curves start to oscillate as the generator learns to produce more and more plausible images.
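Since the training loop above dumps both histories to temp/gan_loss.csv, the curves can be plotted in a few lines:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('temp/gan_loss.csv')
plt.plot(df['d_loss'], label='discriminator')
plt.plot(df['g_loss'], label='generator')
plt.xlabel('batch')
plt.ylabel('binary cross-entropy')
plt.legend()
plt.show()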

A gif showing the training process of the generator (32x32) on a single pizza (the first one in the list):

As expected, the GAN produces noticeably sharper images than the variational autoencoder.
CVAE + GAN - Conditional Variational Autoencoder plus Generative Adversarial Network
Let's combine the CVAE and the GAN to take the best from both networks. The idea behind the union is simple: the VAE decoder performs exactly the same function as the GAN generator, but it works and learns differently.
Apart from the fact that it was not entirely clear how to make all of this work together, it was also unclear how to use several different loss functions in Keras. The search for an answer was helped by an example on github:
- Keras VAE and GAN
It turns out that applying arbitrary loss functions in Keras can be implemented by writing a custom layer (see "Writing your own Keras layers") with a call() method: inside call() you implement the required computation and then pass the result to the add_loss() method.
An example:
class DiscriminatorLossLayer(Layer):
    __name__ = 'discriminator_loss_layer'

    def __init__(self, **kwargs):
        self.is_placeholder = True
        super(DiscriminatorLossLayer, self).__init__(**kwargs)

    def lossfun(self, y_real, y_fake_f, y_fake_p):
        y_pos = K.ones_like(y_real)
        y_neg = K.zeros_like(y_real)
        loss_real = keras.metrics.binary_crossentropy(y_pos, y_real)
        loss_fake_f = keras.metrics.binary_crossentropy(y_neg, y_fake_f)
        loss_fake_p = keras.metrics.binary_crossentropy(y_neg, y_fake_p)
        return K.mean(loss_real + loss_fake_f + loss_fake_p)

    def call(self, inputs):
        y_real = inputs[0]
        y_fake_f = inputs[1]
        y_fake_p = inputs[2]
        loss = self.lossfun(y_real, y_fake_f, y_fake_p)
        self.add_loss(loss, inputs=inputs)
        return y_real
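Such a layer returns its input unchanged and only registers the loss, so the model built on top of it is compiled without an ordinary loss function. A rough sketch of the wiring (the tensor and input names here are placeholders, not the full graph from the referenced github example):

# y_real, y_fake_f, y_fake_p: discriminator outputs on real images,
# on generator fakes and on decoder fakes, respectively
d_out = DiscriminatorLossLayer()([y_real, y_fake_f, y_fake_p])
dis_trainer = Model(inputs=[input_img, z_input], outputs=d_out)
# every loss was registered via add_loss(), so compile with loss=None
dis_trainer.compile(loss=None, optimizer=Adam(lr=1e-4))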
A gif showing the training process (64x64):

The results of the style transfer exercise:

And now the fun part!
This is, after all, what it was all for: generating pizzas for a chosen set of ingredients.
Let's look at pizzas whose recipe consists of a single ingredient (that is, codes 1 through 27):
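With the hypothetical encode_recipe() helper from above, these single-ingredient labels are a one-liner:

# one recipe per ingredient code 1..27
labels = np.array([encode_recipe([i]) for i in range(1, 28)], dtype=np.float32)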

As expected, only the pizzas with the most popular ingredients, 24, 20 and 17 (tomatoes, …, mozzarella), look more or less recognizable; all the other options are just a round shape with vague gray spots in which you can make something out only if you really want to.
Conclusion
Overall, the experiment can be considered a partial success. But even on a toy example like this, one can feel that the well-worn phrase "data is the new oil" has a right to exist, especially when it comes to machine learning.
In the end, the quality of an application based on machine learning depends primarily on the quality and quantity of the data.
Generative networks are a very interesting thing, and I think we will see many different applications of them in the near future.
By the way, if the rights to a photo belong to its author, then who owns the rights to an image created by a neural network?
Thank you for your attention!
NB. While this article was being written, the pizzas became one fewer.
References