Talking To Machines, Or "Serendipitous Prompting for AI Image Generators"
Talking about how Midjourney has changed how I approach creative work. October 24, 2022 Midjourney AI Art
I’ve been actively using Midjourney since I wrote that post back in June. The experience has been staggering: there is so clearly my creative process before AI Art and my creative process after AI Art. The thing is, you’ll probably never see anything I generate (besides blog header images).
This is maybe counter to a lot of the “protests” around AI Art. Roughly stated, I think there is a (somewhat) legitimate worry about the displacement of artists by image generators like Midjourney, Stable Diffusion, DALL-E, etc. The imagined displacement is where people just generate copies of, or works similar to, other artists’ work and then sell that work, or produce it in such quantity that it devalues the original artist’s work.
I have very little faith that this will happen. I suspect that, in our eternal craving for authenticity, it will drive up the cost of original work (look at the traditional art market!), and if there are thousands of AI artists creating work that directly references your work (to the point of copying it), I would think that maybe you would generate some inbound interest to the “source”? IDK (I’m aware this sounds very “trickle down”). I also come from the world of games, and if the worry is about people buying cheap knockoffs (or “stealing” art through generative models), those people would likely not be “buyers” in the first place, which was the case with game piracy. Someone pirating your game isn’t a lost sale, and I think the same is true for generated art.
AS A SIDE NOTE, I think the consent model (or total lack thereof) for how the models themselves are trained IS BAD. Hoovering up art with no seeming interest in license restrictions for the express purpose of reproducing that art is BAD. Artists should have not “some” say but “all” the say in this. Maybe artists get comped based on how often their name is used in prompts? IDK. Some people seem to be working on this.
Midjourney, for me, is not some tool to generate a real mid-ass Sublime Titty Anime Girl Ghibli 4K Beautiful image. It’s not a tool that I use to produce art with intent to share. Instead it’s become something of a generative moodboard tool for me, or something like a visual sketchpad for nascent ideas. It helps me “see what I am trying to see” in my head quicker than any other method.
Here’s an example. Let’s imagine I want to create a chess set with intent to publish it on Kickstarter. I know a few things about this chess set:
- It should be roughly “medieval”
- I want it to use creatures instead of traditional pieces
- The pieces themselves should be sort of “weird”
In a world before AI art, this would mean I would now hit up Google and just start looking for tons of reference. If I was actually doing this I would still probably do that. But for now, let’s take that to MJ and see what we get:
Look at what’s going on here — MJ has synthesized a chess set that is not only stylistically cohesive but iterates on common themes within the pieces themselves. Look at how the motif of that flat smooth wood is repeated on all the pieces. How the second and last piece have a similar arch brow and eyes but are clearly evoking different characters. By now we shouldn’t be surprised that these generators can generate compelling imagery, but also every time I’m like DAMN. IT REALLY DID IT.
The consistently amazing part of using MJ is that it will give me things I didn’t ask for but that work with what I’m getting. Or said differently, serendipity. This is the killer app of all this stuff IMO, as creative tools before AI art largely lacked this ability (except for generative art tools, but even then they are often explicitly parameterized). MJ’s ability to produce serendipitous results is what makes using MJ feel like working with another (human) collaborator. The “latent space” of the model is able to pull in things that you wouldn’t expect but that nevertheless “work”, much like any good creative partner.
Here’s the issue though — I think most people don’t know how to prompt for serendipity. Maybe most people don’t want it, but imagine how you would prompt MJ for the above image, based on the criteria above. I’d guess it’s something like:
smooth wooden chess set with medieval fantasy characters, dark wood square bases, light wood pieces, beady eyes, ink wash, pointy heads, all unique, miniatures photography, black background, well lit
But remember, I don’t actually know that I wanted what I ended up getting. I didn’t know I wanted square bases, or beady eyes, ink washes, or even smooth wood. Instead I wanted something medieval, with creatures, and weird. The actual prompt?
beskinki goblin chess set, painted wood
Did I get a great result on the first prompting? Of course not. I got some AI-generated ass looking art:
But what matters is that I get the right seed of the idea to MJ. If you overload it with parameters it will keep trying to fit the generated output to the prompt, meaning it becomes harder and harder for it to do something beyond what it first generates. It will continue to double down on aspects of your prompt, pigeonholing you into what it thinks you want. As said here, “anything left unsaid might surprise you”.
I think what people often do for MJ and other tools (and I’m mostly guessing here but also based on what people do in the Discord, prompts I see on the MJ site) is that they try to “solve” for the “correct” image with the prompt itself. If the prompt generates a bad image, they will try to make a better prompt. Most all prompt guides right now are targeted to the idea of specificity, that you are trying to get the tool to generate a specific image. Maybe if I knew EXACTLY what I wanted, this would be relevant. But if I knew exactly what I wanted, I would just do that in some other tool.
“Trying to craft the perfect prompt”, for me, is like Prompting 1.0. Stone Age Prompting. It implies that the human knows how to traverse the latent space. Spoilers: we don’t. What I try to do instead is surrender myself to the latent space. I try to get something in the right vein from a “minimum viable prompt” in order to make room for serendipity.
Instead of producing an image from a prompt and maybe a variation or two before quitting and changing the prompt, I start with a super simple prompt and produce variations of variations with the goal of exploring the possibility space in hopes of finding something that catches my eye. The key is that, when I use MJ, I don’t know what I want. I prompt it, and see what happens. I’ll then follow different version threads along the way, generating a corpus of possibility of stuff that I like.
This also means that I can easily spend 20-30 minutes for a single image. But I also produce a wake of possibility as part of that process, with other strands that can be pulled and returned to if desired.
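If it helps to see the “variations of variations” idea as a structure, here is a minimal sketch in Python. To be loud about the assumptions: none of this talks to Midjourney. `Node`, `fake_variations`, and `explore` are hypothetical names made up for illustration, the four-variations-per-image count is an assumption, and the `pick` function is a stand-in for “this one catches my eye”. The point is only that every branch is kept in the tree, so the strands you did not follow are still there to return to.

```python
from dataclasses import dataclass, field
import random


@dataclass
class Node:
    """One generated image in the exploration tree (hypothetical structure)."""
    label: str
    liked: bool = False
    children: list["Node"] = field(default_factory=list)


def fake_variations(parent: Node, count: int = 4) -> list[Node]:
    """Stand-in for asking the generator for variations of one image."""
    return [Node(f"{parent.label}.{i}") for i in range(1, count + 1)]


def explore(root: Node, pick, depth: int) -> list[Node]:
    """Follow variations of variations, keeping every branch for later."""
    liked = []
    frontier = [root]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            # Every variation is stored on the tree, liked or not.
            node.children = fake_variations(node)
            for child in node.children:
                if pick(child):
                    child.liked = True
                    liked.append(child)
                    # Only liked images get varied again in the next round.
                    next_frontier.append(child)
        frontier = next_frontier
    return liked


# Start from the minimum viable prompt and wander for a few rounds.
root = Node("beskinki goblin chess set, painted wood")
rng = random.Random(0)  # seeded coin flip standing in for human taste
liked = explore(root, pick=lambda node: rng.random() < 0.5, depth=3)
```

The design choice worth noticing is that the prompt never changes. All the steering happens through which variations get followed, not through rewriting the text.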
I think “prompting for undefined ideas” is maybe the minority of use cases (anecdotally, Discords I’m in that have an MJ bot often have people clearly seeking The Correct Result for some image in their head). BUT here are some reasons you may want to instead do what I do:
- Exploring an aesthetic (what would a Beksiński desk look like? What about Lichtenstein?)
- “Target Keyframes” for media — I do this a ton. I use MJ to generate screenshots of game ideas I have to get a sense of what it would look like. The serendipity here is off the charts, as MJ will often produce images that directly feed back into the design itself.
- “Finding” cool “new” art. I’m ashamed of how much stuff I’ve generated that I think looks really cool. The thing about these models is that they, theoretically, can basically serve any image to suit any taste. In this way I’m able to generate bookoo amounts of stuff I “like”. It’s great. I’m ordering some prints. It feels like an aesthetic Instagram page or Twitter account where I favorite every picture. I’m telling it roughly what I want, so the idea that I get back something that matches is delightful.
Pursuing the above also makes MJ feel more “playful”. Instead of trying to wrest control of it into a specific form, it’s delightful to use and explore with and just see what happens. It has the effect of feeling not like you’re “creating” new art, but instead something akin to digging in the crates. It feels like finding a pitch deck for some unproduced movie, or stills from some unknown photographer. It’s less about “creating” and more about “discovering”. It’s the best game I’ve played in 2022.