Over the past month and a half, I have been working with Stable Diffusion models, exploring both their technical and creative potential as "AI artists." As I reflected on this term, I realized that the design of these deep learning models brings together several years of professional breakthroughs, training on enormous numbers of images, and fine-tuning techniques. Although the AI can generate astonishing images, making it generate what I want is a lot of work. In this article, I want to share some of my thoughts on the creative process involved, not as a tutorial or a step-by-step guide, but as a reflection on my own experiences.

The kid who hated to draw

As a child, I often had to do drawing tasks in kindergarten, which I hated, especially when it came to drawing people. Most of the people around me were white, and whenever I tried to draw their faces using a brown pencil, they turned out black. Even when I copied other kids' technique and used a pink pencil, it still looked unnatural. But that was fine; everybody did the same.

Then something happened. It was an ordinary drawing session in the kindergarten, with one difference: we got red paper. It made me so excited. It was so interesting. The task was to draw our gymnastics lessons, where we wore white T-shirts. Of course, my drawings were still weird and unnatural, like any kid's drawing, but when I tried to use the pink pencil trick on the faces, it didn't work. You can imagine how the pink color looked on red paper. So my entire drawing career as a five-year-old boy was a tragedy.

Later, nothing changed

I enjoyed school and was good at math, but I still struggled with art. I remember folding an origami ship with a chimney for a crafts lesson and receiving a 3 out of 5, which was my worst grade at the time. I followed the instructions carefully, but the result did not look the way it was supposed to.

My fiancée also points out how bad my photos are. Once, I stopped on a bridge to take a picture of the big moon in the night sky. I thought it would be simple.

  1. You see the big moon with your eyes.
  2. You hold the camera up to your eye, so the camera sees the same thing.
  3. You take a picture.

But no matter how hard I tried, the huge moon always turned out small in my photos. I have a collection of photos of small moons from over the years. This is an example of how different observers see the world in different ways.

An absolutely bad picture of the moon

Directing a short film

In secondary school, I directed a short film about the Hungarian Revolution of 1848. The task was to create a movie a few minutes long around a given narration. My feelings about the result are mixed. I didn't know enough about politics at that age, so my concept was a simple conflict between rich and poor, which was too superficial. My friends and I played the roles, and the acting was what it was. But I enjoyed the process of directing.

I have always preferred movies to theater because they seem more real to me, even with their visual effects. The loud speaking and exaggerated gestures in theater seem too artificial. But as I became an adult, I learned that the language of theater is not meant to copy reality, and that movies are far, far away from reality, even with their characters. Many characters are simple archetypes, without any real personality.

Whatever the result, the experience of directing a film, where the cameraman recorded the picture well and the images didn't look like a five-year-old kid's drawing, meant a lot to me.

The joy I find in the AI artist's process

I believe you can now guess what AI art means to me. It has the potential to create visually appealing results while I focus mainly on the high-level creative process rather than on line drawing, color techniques, and so on. Unlike in photography, I do not have to travel to the subject's location.

Of course, it requires new skills, such as putting together a good dataset for fine-tuning or chaining several generation steps. For instance, my initial cover photo idea was Michelangelo painting on a computer screen with brushes. I had to find the appropriate prompt for text-to-image generation, fix the resulting image using image-to-image translation, and then outpaint it to fit the cover photo size. I used Midjourney to get a base image, then fed it into a Stable Diffusion v1.5 model for the image-to-image step.
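The outpainting step starts with a bit of arithmetic: how much empty canvas has to be added around the image so that it reaches the cover's aspect ratio. Below is a minimal sketch of that calculation in Python. The function name and the sizes are my own illustration, not part of any tool I used: 512x512 is the native resolution of Stable Diffusion v1.5, while the 1600x840 cover size is only an assumed example.

```python
def outpaint_padding(width, height, target_width, target_height):
    """Return (left, top, right, bottom) padding in pixels.

    The padding extends a width x height image to the aspect ratio of
    target_width x target_height without cropping or scaling the image.
    Hypothetical helper for illustration only.
    """
    if min(width, height, target_width, target_height) <= 0:
        raise ValueError("all dimensions must be positive")

    # Compare aspect ratios by cross-multiplication to stay in integers.
    if width * target_height < height * target_width:
        # The image is too narrow: add canvas on the left and right.
        new_width = -(-height * target_width // target_height)  # ceiling division
        extra = new_width - width
        return (extra // 2, 0, extra - extra // 2, 0)

    # The image is too flat (or already matches): add canvas above and below.
    new_height = -(-width * target_height // target_width)  # ceiling division
    extra = new_height - height
    return (0, extra // 2, 0, extra - extra // 2)


# Assumed example: a square 512x512 image extended to a 1600x840 cover ratio.
padding = outpaint_padding(512, 512, 1600, 840)
print(padding)
```

The padded regions are what the outpainting model then has to fill in; the final canvas can be resized to the exact cover dimensions afterwards.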

Creating good AI art is no different from creating any other art. You have to be a good craftsman in the techniques of your art. You have to be able to define what you want to show, not just at the level of the prompt, but at the level of the message you want to convey through your image. Perhaps you cannot yet deliver as many messages in one image as Hieronymus Bosch did in one painting. Artists have always used the tools available in their era. We live in an era where AI is an available tool.