Introduction to the principles of AI painting

Image Principle

RGB Principle

Each color is described by three values: R (red), G (green), and B (blue). Mixed in different proportions, these three primary colors can reproduce most of the colors visible to the naked eye.


  1. A pixel is the smallest unit that makes up a picture;

  2. Each pixel is a small square;

  3. A pixel cannot be subdivided;

  4. Each pixel has exactly one color, that is, one fixed set of R, G, B values, each of which is a number from 0 to 255;

  5. In a picture, each pixel has a unique coordinate value;

For example, a 512 * 512 picture is composed of 512 * 512 = 262144 pixels. Because each pixel has three values (R, G, B), the picture contains 512 * 512 * 3 = 786432 numbers that describe its color.
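The pixel rules and the arithmetic above can be checked with a small sketch. It assumes a picture stored as a nested list where `image[y][x]` holds one `(R, G, B)` tuple; the names `image` and `count_color_values` are made up for illustration and do not come from any particular library.

```python
# A picture as a grid of pixels: image[y][x] is one (R, G, B) tuple.
WIDTH, HEIGHT = 512, 512

# Fill the whole picture with one color (pure red) to keep the sketch simple.
image = [[(255, 0, 0) for _ in range(WIDTH)] for _ in range(HEIGHT)]

# Each pixel has a unique coordinate; change the one at x=10, y=20 to white.
image[20][10] = (255, 255, 255)

def count_color_values(img):
    """Count every R, G, and B number stored in the picture."""
    return sum(len(pixel) for row in img for pixel in row)

total = count_color_values(image)
print(total)  # 512 * 512 * 3 = 786432
```

Real image libraries usually store the same data as one flat array for speed, but the idea is the same: a coordinate selects a pixel, and the pixel holds three numbers.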

Image Display Principle

Diffusion (figuring out how AI generates images)

Researchers add noise to an image step by step until it becomes pure noise.

The AI then learns the inverse of this process: how to recover a clear, information-rich picture from a noisy one.

A model trained this way is called a Diffusion Model, the kind used in AI painting.

Michelangelo: The statue was already in the stone, I just removed the unwanted parts.
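The noising half of the process can be sketched in a few lines. This is a minimal illustration, not how any specific model is implemented: the picture is a toy list of four numbers, the linear `keep` schedule is an assumption chosen for simplicity, and the function names `add_noise` and `forward_diffusion` are hypothetical. The inverse process is learned by a neural network and is not shown here.

```python
import math
import random

def add_noise(pixels, keep):
    """Blend pixel values with Gaussian noise.

    keep = 1.0 returns the original values, keep = 0.0 returns pure noise.
    """
    signal_scale = math.sqrt(keep)
    noise_scale = math.sqrt(1.0 - keep)
    return [signal_scale * p + noise_scale * random.gauss(0.0, 1.0) for p in pixels]

def forward_diffusion(pixels, steps):
    """Return the picture at each step, from clean (first) to pure noise (last)."""
    # Assumed linear schedule: keep falls evenly from 1.0 to 0.0.
    return [add_noise(pixels, 1.0 - t / steps) for t in range(steps + 1)]

random.seed(0)
clean = [0.8, -0.2, 0.5, 0.1]  # toy "picture": four values scaled to about [-1, 1]
trajectory = forward_diffusion(clean, steps=4)
for t, noisy in enumerate(trajectory):
    print(t, [round(v, 2) for v in noisy])
```

The first printed row is the clean picture and the last is noise with no trace of the original. Training gives the model many such noisy versions and asks it to work back toward the clean one.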

CLIP (understanding what kind of images AI generates)

CLIP stands for Contrastive Language-Image Pre-Training: a model is trained on paired pictures and text so that matching pairs score higher than mismatched ones;

Humans can easily match pictures with their text descriptions, as in the following example:

AI needs to learn from a massive number of picture-text pairs to achieve the same effect.

How big is "massive"?

For example, the LAION-5B dataset, which has been used to train large image generation models, contains about 5.85 billion picture-text pairs.
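Picture-text matching of this kind is commonly done by turning both the picture and the text into vectors and comparing them. The sketch below assumes that setup: the three-number embeddings are invented by hand for illustration, whereas a real model would compute much longer vectors from the pixels and the words. The helper names `cosine_similarity` and `best_caption` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made embeddings; a real model would compute these from pixels and text.
image_embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.1],
    "car.jpg": [0.0, 0.1, 0.9],
}
text_embeddings = {
    "a photo of a cat": [1.0, 0.0, 0.1],
    "a photo of a dog": [0.0, 1.0, 0.0],
    "a photo of a car": [0.1, 0.0, 1.0],
}

def best_caption(image_name):
    """Pick the text whose embedding is most similar to the image's embedding."""
    image_vec = image_embeddings[image_name]
    return max(
        text_embeddings,
        key=lambda text: cosine_similarity(image_vec, text_embeddings[text]),
    )

for name in image_embeddings:
    print(name, "->", best_caption(name))
```

Training pushes the vectors of matching pairs together and those of mismatched pairs apart, which is why so many examples are needed.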

Latent Space (understanding the shortcut AI uses when learning to recognize pictures)

While learning to recognize pictures, the AI maps each picture to a position in Latent Space according to its text description, and only then learns through diffusion and inverse diffusion. Figuratively, this can be pictured as in the figure below:

The more similar the text descriptions are, the closer the objects are in this Latent Space.

The benefits of this approach are:

  1. Text descriptions can be used to classify, index, and generate images;

  2. Less computation is needed when the AI generates pictures, because the text description quickly locates the picture closest to it;
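A toy lookup shows why closeness in Latent Space is useful. The coordinates below are invented and two-dimensional purely for readability; real latent spaces have many more dimensions, and `latent_points` and `nearest` are hypothetical names rather than part of any library.

```python
import math

# Invented 2-D latent coordinates; similar descriptions sit close together.
latent_points = {
    "a tabby cat on a sofa": (1.0, 1.2),
    "a kitten playing with yarn": (1.4, 0.8),
    "a red sports car": (8.0, 7.5),
    "a truck on a highway": (7.6, 8.1),
}

def nearest(query, k=2):
    """Return the k descriptions whose latent points lie closest to the query point."""
    ranked = sorted(
        latent_points,
        key=lambda name: math.dist(latent_points[name], query),
    )
    return ranked[:k]

# A query point that lands in the "cat" region of the space.
print(nearest((1.1, 1.0)))
```

Because related items cluster, a description only has to be compared against its neighborhood of the space rather than against every picture ever seen.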

Posted on November 8, 2023
Licensed under