New AI system DALL·E 2 generates almost any image from text

In April 2022, OpenAI (co-founded by Elon Musk, among others) introduced DALL·E 2, an AI system that creates images from text descriptions. The more detailed the request, the better the resulting picture tends to be.

It is popularly called an “artificial intelligence”, but in essence it is a finished product assembled from several earlier OpenAI developments.

The output image does not just match the query; it reproduces the query's context with frightening accuracy.

  • DALL·E 2 consists of three large parts. They build on ideas from the wider research community (the Transformer architecture, for example, came from Google), but the parts themselves were developed and “assembled” at OpenAI.
  • The first model “reads” the text and produces a “draft” of the future image.
  • The second model turns the “draft” into a small final image.
  • The third model enlarges this small picture 16 times, adding the necessary details.
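
The “16 times” figure is easiest to see as arithmetic on image sizes. The sketch below assumes the resolutions reported in the DALL·E 2 paper (a 64×64 base image upsampled in two stages of 4× each); these numbers do not come from this article, and the names `BASE_SIDE` and `UPSCALE_STAGES` are purely illustrative.

```python
# Assumed resolutions, taken from the DALL·E 2 paper rather than from this article.
BASE_SIDE = 64            # side of the small image, in pixels
UPSCALE_STAGES = [4, 4]   # two upsampling stages: 64 -> 256 -> 1024


def final_side(base_side, stages):
    """Multiply the side length by each stage's factor in turn."""
    side = base_side
    for factor in stages:
        side *= factor
    return side


side = final_side(BASE_SIDE, UPSCALE_STAGES)
linear_factor = side // BASE_SIDE                          # growth per side
pixel_factor = (side * side) // (BASE_SIDE * BASE_SIDE)    # growth in pixel count

print(side, linear_factor, pixel_factor)  # 1024 16 256
```

Under these assumptions, 16 is the enlargement per side; the number of pixels grows 256-fold.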

Step by step, it works like this:

1. The first model is called CLIP. It translates our written (human) text into a form the computer can work with: a set of numbers.

2. Next, this set of numbers is converted into another table of numbers that describes the image. This table plays the role of a “sketch” or “skeleton” from which the final image is created. To make this work, CLIP was trained on about 600 million pictures and their captions.

3. The “draft” is passed to the second model, called GLIDE.

4. GLIDE takes the numerical form of the text from step 1 and the “sketch” from step 2 and combines them. From this mix it creates a gray, grainy square and then gradually removes the grain, revealing a low-resolution picture. This approach is called a “diffusion model”.

5. The third model increases the image resolution 16 times and shows us the final result.
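
The idea of “removing the grain” in step 4 can be illustrated with a toy sketch. It is not GLIDE and not a real diffusion model: a real model uses a trained neural network to estimate the noise, while here the hypothetical `toy_denoise` function is simply handed the target picture and moves toward it a little at each step. The 8-value `target` list stands in for an image.

```python
import random


def toy_denoise(target, steps=50, seed=0):
    """Start from pure noise and remove a share of the remaining noise each step.

    `target` stands in for what a trained network would predict; a real
    diffusion model never sees the finished image and must estimate the noise.
    """
    rng = random.Random(seed)
    image = [rng.gauss(0.0, 1.0) for _ in target]  # the "gray grainy square"
    history = [image]
    for step in range(steps):
        remaining = steps - step
        # Move each pixel 1/remaining of the way to the target,
        # so the final step lands on the target itself.
        image = [p + (t - p) / remaining for p, t in zip(image, target)]
        history.append(image)
    return history


def mean_abs_error(image, target):
    """Average absolute difference between an image and the target."""
    return sum(abs(p - t) for p, t in zip(image, target)) / len(target)


target = [0.0, 0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25]  # a tiny 8-pixel "picture"
history = toy_denoise(target)
errors = [mean_abs_error(img, target) for img in history]

print(f"start: {errors[0]:.3f}  halfway: {errors[25]:.3f}  end: {errors[-1]:.3f}")
```

The error shrinks at every step, which mirrors how the picture gradually emerges from the grain.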

However perfect the published images look, remember that people simply don’t post the bad shots. This generative model does not always cope with requests and often gets details of the prompt wrong, although it still renders objects with realistic lighting and plausible relationships to each other.

The errors stem from the CLIP component, which represents the relationships between words as vectors in a latent space.

As a result, DALL·E 2 can get confused about which object a described characteristic belongs to.

For example, given the query “A red cube lies on a blue cube”, it does not always understand which cube should be red and which should be blue.
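
An extreme illustration of this kind of confusion: if a text representation ignored word order entirely, the two cube arrangements would become indistinguishable. CLIP is not literally a bag of words, so the sketch below is only an analogy; the `bag_of_words` helper is hypothetical and is not part of DALL·E 2.

```python
from collections import Counter


def bag_of_words(prompt):
    """Count words while ignoring their order (a deliberately crude 'embedding')."""
    return Counter(prompt.lower().split())


a = bag_of_words("A red cube lies on a blue cube")
b = bag_of_words("A blue cube lies on a red cube")

print(a == b)  # True: once order is lost, the two prompts look identical
```

Any representation that loses part of the link between “red” and the first “cube” will run into a milder version of the same problem.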

We also decided to run our own experiment and generate various pictures related to the ScoogeFrog.

To put it briefly, such visual content is completely original, and you do not have to worry about image copyright. That is, DALL·E 2 can be used as an effortless way to quickly generate advertising creatives without stock images, which simplifies the work of designers and creatives. AI offers various advantages in marketing; for game advertising in particular, it draws striking fantasy pictures. You can also create a logo: there are online services that generate unique logos once the user specifies the company name, field of activity, references, and colors.

Artificial intelligence can create an interesting image; you just need to come up with an idea and get some practice writing descriptions.