Basic Framework of GAN: Generative Adversarial Network, Part 1

袁晗 | Luo, Yuan Han
Published in Geek Culture
4 min read · Sep 27, 2021


Img src: Sedki Alimam, https://society6.com/product/spy-vs-spy-ma7_print

Like many breakthroughs in history, GAN isn’t complicated. It’s brilliant. If you are familiar with object-oriented programming and rudimentary machine learning, GAN is the obvious next step: the dynamics of multiple models working together. Not only does GAN create amazing results, it also teaches us an effective learning method: healthy competition. And pitting a classifier model against a generator model is only the tip of the iceberg. Imagine the possibilities of using more than two models, or different kinds of relationships instead of competition. The implications are limitless.

Generator, Discriminator, and loss functions

Let me make an analogy. A cell is composed of molecules, which in turn consist of chemical elements, and so on all the way down to subatomic particles. GAN is similar: it is composed of models, which themselves consist of deep neural networks, all the way down to a single line of code. I hope you see what I am getting at. GAN is nothing more than the interaction between a handful of deep neural models: a Generator, a Discriminator, and a loss function, at least for the very simple GAN. If you understand this, you understand the basic architecture of GAN. Of course, in reality GAN has evolved with many nuances that improve its results, much like a 2021 Audi engine is nothing like the first combustion engine, but this series will focus on the simplest combustion engine. If I have time, I might write about more contemporary GANs in the future.
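To make that interaction concrete, here is a minimal sketch of one training step, assuming a `gen` model that maps noise vectors to fake images, a `disc` model that outputs one logit per image, and a binary cross-entropy loss. The names, the noise dimension, and the optimizers are placeholders of my own; the actual Generator and Discriminator are built over the course of this series.

```python
import torch
from torch import nn

criterion = nn.BCEWithLogitsLoss()
z_dim = 64  # assumed noise dimension, for illustration only

def train_step(gen, disc, gen_opt, disc_opt, real):
    batch_size = real.size(0)

    # Discriminator step: learn to label real images 1 and fake images 0.
    disc_opt.zero_grad()
    noise = torch.randn(batch_size, z_dim)
    fake = gen(noise).detach()  # detach so this step does not update the generator
    disc_loss = (criterion(disc(fake), torch.zeros(batch_size, 1)) +
                 criterion(disc(real), torch.ones(batch_size, 1))) / 2
    disc_loss.backward()
    disc_opt.step()

    # Generator step: learn to make the discriminator label fakes as 1.
    gen_opt.zero_grad()
    fake = gen(torch.randn(batch_size, z_dim))
    gen_loss = criterion(disc(fake), torch.ones(batch_size, 1))
    gen_loss.backward()
    gen_opt.step()
```

The competition lives entirely in those two loss terms: the Discriminator is rewarded for telling real from fake, the Generator for fooling it.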

Generator

There are two components to the Generator: a get_generator_block helper and the Generator itself. Let’s take a close look at what’s inside get_generator_block.

Img source: Coursera, Generative Adversarial Networks (GANs) Specialization
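For readers who want something they can run, here is roughly what the block in the screenshot looks like. This is a sketch, assuming the standard layout from the Coursera course; the parameter names are my own.

```python
from torch import nn

def get_generator_block(input_dim, output_dim):
    # One generator "block": a linear layer, batch normalization, then ReLU.
    return nn.Sequential(
        nn.Linear(input_dim, output_dim),
        nn.BatchNorm1d(output_dim),
        nn.ReLU(inplace=True),
    )
```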

We begin by importing PyTorch on the first line, which will save us a lot of time. Of course, scikit-learn and TensorFlow can achieve the same thing. The only problem with PyTorch is that it tucks away many parts that are key to understanding, so I will go over the block line by line.

Img src: http://www.sharetechnote.com/html/Python_PyTorch_nn_Linear_01.html

This is what an nn.Linear(p1, p2) layer looks like. The first parameter controls the number of coefficients in each linear node (purple circle), to which a bias term (aka y-intercept) is added. In visual data, the inputs that interact with these coefficients are pixel values. The second parameter controls how many linear nodes the layer will have.
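A quick way to see both parameters at work (the sizes here, 10 inputs and 25 nodes, are hypothetical):

```python
import torch
from torch import nn

linear = nn.Linear(10, 25)      # 10 input values per example, 25 nodes

x = torch.randn(128, 10)        # a batch of 128 examples, 10 values each
out = linear(x)
print(out.shape)                # torch.Size([128, 25]) -- one output per node
print(linear.weight.shape)      # torch.Size([25, 10]) -- 10 coefficients per node
print(linear.bias.shape)        # torch.Size([25])     -- one bias term per node
```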

Img src: Coursera Deep learning specialization

The second layer, nn.BatchNorm1d(), normalizes the values flowing through the block. The figure above shows the simplest form of normalization: each value in a row is divided by that row’s norm, the square root of the sum of its squared values (the square root of 0*0 + 3*3 + 4*4 equals 5). BatchNorm1d applies the same idea per feature: it subtracts the batch mean and divides by the batch standard deviation, then rescales the result with learnable parameters. Normalizing the data speeds up training significantly because it speeds up convergence.
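You can see the standardizing effect directly; the feature count and batch statistics below are made up for the example:

```python
import torch
from torch import nn

bn = nn.BatchNorm1d(25)            # one mean/variance estimate per feature
x = torch.randn(128, 25) * 4 + 3   # a batch whose features have mean ~3, std ~4
out = bn(x)

print(out.mean(dim=0))  # roughly 0 for every feature
print(out.std(dim=0))   # roughly 1 for every feature
```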

Img src: https://towardsdatascience.com/gradient-descent-algorithm-and-its-variants-10f652806a3

We use nn.ReLU() because substitutes such as the sigmoid and tanh functions create the vanishing gradient problem. Since this problem is outside the scope of this post, I will post links below to two videos that do a very good job of explaining it.
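As a small illustration of why ReLU sidesteps the issue (the input values are arbitrary):

```python
import torch
from torch import nn

relu = nn.ReLU()
x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0], requires_grad=True)
y = relu(x)
print(y)        # tensor([0.0000, 0.0000, 0.0000, 0.5000, 2.0000], ...)

# The gradient is exactly 1 for positive inputs and 0 otherwise, so stacking
# many layers does not shrink it toward zero the way sigmoid/tanh layers can.
y.sum().backward()
print(x.grad)   # tensor([0., 0., 0., 1., 1.])
```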

Conclusion

Keep in mind that although get_generator_block gives us one layer of the deep neural network, it itself consists of three layers: Linear, BatchNorm, and ReLU. I understand this can be confusing, which is why you will see some literature call it a block instead of a layer; the name get_generator_block itself reflects that. But I do believe there is value in calling it a layer, because it is easier to envision something passing through a layer than through a block. That is all for this block. I will finish up the Generator model in the next blog, so stay tuned.
