How to use AI image generators DALL-E and Midjourney
Learn how to create AI generated images using the DALL-E and Midjourney image generator tools while keeping risks in mind.
Apr 11, 2024 • 7 Minute Read
Have you ever heard the phrase, “A picture is worth a thousand words”?
Images enhance understanding, evoke emotions, and create more visual appeal. But finding the right image isn’t always a simple task. While there are sites like Getty Images and iStockPhoto, these photos can be expensive—and you might not be able to find the exact photo you’re looking for.
What if you could describe what you’re looking for via text and get the image that you want? Welcome to AI image generation.
Before we begin: The risks of creating and using AI generated images
In just a few moments, I’ll show you how to create images that look as if they came straight from your imagination. But there are a few things I need to tell you first.
An AI model must be trained on existing data in order to create new things, whether that’s text, predictions, or images. Many of the models used by image-based generative AI platforms have been trained on publicly available data. However, just because an image is publicly available doesn’t mean that it’s freely available.
Some images created by AI image generators may be too similar to copyrighted images. Before using AI generated images, check the rules of the platform where you plan to post or use them. Some platforms require you to add a watermark or disclaimer to any images created with AI.
In some cases, you shouldn’t use AI generated images at all. You also have to bear in mind that you are not legally protected in any way.
I don’t want to stop you from experimenting with these tools, but you should be aware of the risks before you dive in. That said, let me introduce you to DALL-E and Midjourney.
How to create images with DALL-E and ChatGPT-4
DALL-E is a series of AI models you can use to create images from natural language. At the time of writing this, the latest version is DALL-E 3. It’s integrated into ChatGPT Plus subscriptions and available via the OpenAI API.
Interesting fact: The name DALL-E comes from the name of the animated Pixar robot WALL-E and the Spanish artist Salvador Dalí.
To generate an image in DALL-E:
1. Open ChatGPT-4. (You must be a ChatGPT Plus subscriber to gain access.)
2. Write a prompt that describes the image you want.
3. Hit enter.
Yep, that’s pretty much it!
The simplicity is what makes image-based generative AI so valuable (and why it’s important to be cautious about how you use these images). Now, let me give you a few tips on how to improve your image generation game.
4 tips for creating better images with DALL-E
There are several things that you can do to improve the quality of your generated images.
1. Ask ChatGPT to help you write the prompt
You can ask ChatGPT to create a clear prompt for image generation. Creating prompts to generate other prompts is a common practice in text-based generative AI.
2. Create a clear prompt for the image generator
ChatGPT can’t read your mind. When you request an image, include all the details. For example, you can specify the style, background, number of people, and any other detail.
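One way to make sure no detail gets left out is to assemble the prompt from named pieces. The following is only a minimal sketch: the function name `build_image_prompt` and its parameters are hypothetical helpers made up for illustration, not part of ChatGPT or DALL-E.

```python
def build_image_prompt(subject, style=None, background=None, people=None, extra_details=()):
    """Combine a subject with optional details into one descriptive prompt."""
    parts = [subject.strip()]
    if style:
        parts.append(f"in the style of {style}")
    if background:
        parts.append(f"with {background} in the background")
    if people is not None:
        parts.append(f"number of people: {people}")
    # Any other free-form details are appended as-is
    parts.extend(detail.strip() for detail in extra_details)
    return ", ".join(parts)


prompt = build_image_prompt(
    "A unicorn flying on top of the rainbow on the moon",
    style="a watercolor painting",
    background="a starry sky",
)
print(prompt)
```

The resulting text can be pasted into ChatGPT as the image request, so every detail you care about is stated explicitly.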
If you aren’t clear, you may receive unexpected outputs. For example, in the case below, I asked ChatGPT to create a prompt for my image, but since I wasn’t clear enough, it started creating the image right away.
3. Change the aspect ratio of an AI generated image
Once you’ve created a picture using an image generator, you can ask for changes. For example, you can ask for a different aspect ratio. In this case, I asked for 16:9, which is commonly used for presentations and videos.
4. Change individual components of an AI generated image
You can modify individual elements of AI generated images. For example, let’s say I want a pink unicorn instead of a white one.
How to use DALL-E using the OpenAI API
If you need to generate images programmatically, you can use the OpenAI API to generate new images with DALL-E via the image generations endpoint. Here’s what you need to do to get started.
1. Get access to the OpenAI API
Make sure you can access the API. To do this, you need to sign up and obtain an API secret key.
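A common way to keep the secret key out of your source code is to read it from an environment variable. The sketch below assumes the key is stored in a variable named `OPENAI_API_KEY`; the helper name `load_api_key` is hypothetical.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API secret key from an environment variable, or raise if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API secret key.")
    return key
```

Calling `load_api_key()` once at startup fails early with a clear message if the key has not been configured.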
2. Determine how you’ll access the API
Accessing the image generation endpoint directly is the most straightforward way to use the API (and it works with all languages). You can also use one of the officially supported or community managed OpenAI libraries which provide an easy way to use the API.
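If you prefer the library route, a call through the official Python library might look roughly like the sketch below. It assumes version 1 or later of the `openai` package; the wrapper name `generate_image_urls` is a hypothetical helper, and the client is passed in as an argument so the function itself has no hard dependency.

```python
def generate_image_urls(client, prompt, model="dall-e-2", n=2, size="1024x1024"):
    """Request images through an OpenAI client object and return their URLs."""
    response = client.images.generate(model=model, prompt=prompt, n=n, size=size)
    return [image.url for image in response.data]


# Usage (requires `pip install openai` and the OPENAI_API_KEY environment variable):
#   from openai import OpenAI
#   client = OpenAI()
#   urls = generate_image_urls(client, "A unicorn flying on top of the rainbow on the moon")
```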
3. Prepare the request, make the API call, and parse the response
Once you’ve got an API secret key and decided how you’ll access the API, it’s time to get coding. In this case, I’m going to use Python and call the endpoint via an HTTP call.
First, I specify the URL for the endpoint. Then, I describe the data. This includes the prompt, the model, the number of images I want, and the image size.
URL = "https://api.openai.com/v1/images/generations"
data = {
    "prompt": "A unicorn flying on top of the rainbow on the moon",
    "model": "dall-e-2",
    "n": 2,
    "size": "1024x1024"
}
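Before sending the payload, it can help to check it locally, because the accepted sizes and image counts differ between models. The sketch below is an illustration only: `validate_image_request` is a hypothetical helper, and the limits in it reflect my understanding of the API at the time of writing, so treat them as assumptions and confirm them against the current API reference.

```python
# Assumed limits per model; check the current API reference before relying on them.
SUPPORTED_SIZES = {
    "dall-e-2": {"256x256", "512x512", "1024x1024"},
    "dall-e-3": {"1024x1024", "1792x1024", "1024x1792"},
}
MAX_IMAGES = {"dall-e-2": 10, "dall-e-3": 1}


def validate_image_request(data):
    """Return a list of problems found in the request payload (empty if none)."""
    problems = []
    model = data.get("model", "dall-e-2")
    if not data.get("prompt"):
        problems.append("prompt is required")
    if model not in SUPPORTED_SIZES:
        problems.append(f"unknown model: {model}")
        return problems
    size = data.get("size", "1024x1024")
    if size not in SUPPORTED_SIZES[model]:
        problems.append(f"size {size} is not supported by {model}")
    n = data.get("n", 1)
    if not 1 <= n <= MAX_IMAGES[model]:
        problems.append(f"n must be between 1 and {MAX_IMAGES[model]} for {model}")
    return problems
```

With these assumed limits, asking for two images passes for dall-e-2 but is flagged for dall-e-3.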
Next, I create a function to make an HTTP request. This doesn’t contain anything special; it just makes a POST call to the URL it received as a parameter and sets the authorization header and data.
import os
import requests

# Assumes the API secret key is stored in the OPENAI_API_KEY environment variable
key = os.environ["OPENAI_API_KEY"]

def make_openai_request(url, data=None):
    """Sends a POST request with an optional JSON payload to the OpenAI API"""
    headers = {
        'Authorization': f'Bearer {key}'
    }
    if data:
        headers['Content-Type'] = 'application/json'
    response = requests.post(url, headers=headers, json=data, timeout=20)
    if response.status_code == 200:
        result = response.json()
        return result
    print('Request failed with status code:', response.status_code)
    return None
Now I make the call.
request_result = make_openai_request(URL, data)
The return value is the parsed JSON response. Its data list holds one entry per image, and each entry contains a URL I can use to download that image. Here’s a sample result with two images.
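To turn that result into files on disk, you could pull the URLs out of the response and download them. This is a sketch under a few assumptions: the response has the shape `{"data": [{"url": ...}, ...]}`, the images are PNG files, and the helper names `extract_image_urls` and `download_images` are hypothetical. It uses the standard library's `urllib` for the download, but `requests` would work just as well.

```python
import urllib.request
from pathlib import Path


def extract_image_urls(result):
    """Return the image URLs from a parsed response, or an empty list if there are none."""
    if not result:
        return []
    return [item["url"] for item in result.get("data", []) if "url" in item]


def download_images(urls, folder="images"):
    """Download each URL into the folder and return the saved file paths."""
    target = Path(folder)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls, start=1):
        path = target / f"image_{index}.png"  # assumes the images are PNG files
        with urllib.request.urlopen(url, timeout=30) as response:
            path.write_bytes(response.read())
        saved.append(path)
    return saved
```

For example, `download_images(extract_image_urls(request_result))` would save the generated images as image_1.png, image_2.png, and so on. Download them soon after the request, since the returned URLs are typically only valid for a limited time.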
You can now experiment with image variations, inpainting, and other features which vary from model to model. For a more detailed explanation of how to generate images using the OpenAI API with DALL-E, check out my course Developing Generative AI Applications with Python and Open AI.
How to use the Midjourney AI image generator
Midjourney is another great option for generating images using AI and natural language prompts. Midjourney works similarly to OpenAI’s DALL-E platform. Here’s what you need to get started.
1. Get access to Discord and Midjourney
You use Midjourney inside the Discord platform, so you need to create a Discord account or log in to an existing account to use it. Because Midjourney isn’t free, you’ll also need to pick a subscription plan. Plans vary based on the amount of resources and GPU time, which determines how many images you can generate.
2. Navigate to a newbies channel in Discord
Once you’re inside the Midjourney Discord server, navigate to a newbies channel. Each channel has a number next to it, but it doesn’t matter which channel you choose.
These channels are designed for beginners to start using the Midjourney bot. (You can also generate images on other channels or servers where you’ve invited the Midjourney bot, but let’s stick to the newbies channel for now.)
3. Generate your first image with Midjourney
Next, generate the image. Type /imagine and then write your prompt.
Just as I did with DALL-E, I’ll ask for, “A unicorn flying on top of the rainbow on the moon.”
4. Upscale or change the generated image
Once you’ve generated images with Midjourney, you’ll notice several buttons below them. The numbers correspond to the quadrant. For example, 1 refers to the top left image, 2 represents the top right image, 3 is the bottom left image, and 4 is the bottom right image.
Use the U buttons to upscale an image.
Use the V buttons to create variations of an image. For example, I might like the image in quadrant 2 (the top right image) but want to see some different variations of it.
The circular arrow icon is called the re-roll. This reruns the job and creates a new selection of images.
5. Use parameters to edit the AI generated image
You can use several parameters to control the generated image. For example, you can use --ar to indicate the aspect ratio. Midjourney accepts values like 4:5, 2:3, 4:7, 1:1, and 16:9. A prompt with this parameter would look like this:
/imagine prompt: a unicorn flying over the rainbow --ar 4:5
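If you reuse the same description with different parameters, a small helper can assemble the prompt text for you. The sketch below only builds the string to type after /imagine; it does not talk to Midjourney or Discord. The function name `build_imagine_prompt` is hypothetical, and the check only confirms that the aspect ratio has a width:height shape.

```python
import re


def build_imagine_prompt(description, aspect_ratio=None):
    """Return the prompt text to type after /imagine, with an optional --ar parameter."""
    prompt = description.strip()
    if aspect_ratio is not None:
        # Accept only values shaped like 4:5 or 16:9
        if not re.fullmatch(r"[1-9]\d*:[1-9]\d*", aspect_ratio):
            raise ValueError(f"aspect ratio must look like 16:9, got {aspect_ratio!r}")
        prompt = f"{prompt} --ar {aspect_ratio}"
    return prompt


print(build_imagine_prompt("a unicorn flying over the rainbow", "4:5"))
```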
You can also zoom in and out on an image and indicate how much images should favor artistic color, composition, and form. The sky's the limit. Actually, your imagination is the limit. Go ahead and start experimenting with Midjourney to generate images.
A few final words on using image generators
Image-based generative AI is a new field that will have a deep impact on how we work. In this blog post, I covered DALL-E with ChatGPT Plus, DALL-E with the OpenAI API, and Midjourney, but there are lots of other options.
Regardless of the platform you use, remember to use disclaimers and watermarks when needed. If you’re in doubt, consult a legal professional. That said, I wish you the best when generating content using AI.