6.5 DALL-E Image Generation

Key Takeaways

  • Azure OpenAI generates images through the Images API using DALL-E 3 and the newer GPT-image-1 family; DALL-E 3 supports sizes 1024x1024, 1792x1024, and 1024x1792 only.
  • DALL-E 3 accepts n=1 per request (one image), with quality standard or hd and style vivid or natural.
  • DALL-E 3 automatically rewrites your prompt for detail and safety and returns the revised_prompt actually used in the response.
  • Content filtering runs on BOTH the input prompt and the generated image, using the same harm categories (hate, sexual, violence, self-harm) as text.
  • Responsible-AI restrictions block photorealistic images of identifiable real people, copyrighted characters, and trademarked logos.
Last updated: June 2026

Quick Answer: Azure OpenAI creates images via the Images API with DALL-E 3 (and the newer GPT-image-1 family). DALL-E 3 supports exactly three sizes (1024x1024, 1792x1024, 1024x1792), n=1 per call, quality standard or hd, and style vivid or natural. DALL-E 3 auto-revises the prompt and returns the rewritten text as revised_prompt. Content filters screen both the prompt and the output image.

Generating an Image

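from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

# Microsoft Entra ID auth: the provider fetches and refreshes bearer tokens
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default")

client = AzureOpenAI(api_version="2024-10-21",
                     azure_endpoint="https://my-openai.openai.azure.com/",
                     azure_ad_token_provider=token_provider)

resp = client.images.generate(
    model="dalle3",                 # deployment name, not the model family name
    prompt="A serene mountain lake at sunset, digital art",
    n=1,                            # DALL-E 3 supports only 1
    size="1024x1024",               # or 1792x1024 / 1024x1792
    quality="hd",                   # or standard
    style="vivid"                   # or natural
)
image_url = resp.data[0].url                  # temporary URL, valid ~24 hours
revised = resp.data[0].revised_prompt          # the rewritten prompt DALL-E actually used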

On the Exam: The returned url is temporary (expires roughly 24 hours) — download and persist the image (e.g. to Blob Storage) if you need it long term. You can also request response_format="b64_json" to get the bytes inline instead of a URL.

DALL-E 3 Parameters

| Parameter | Allowed values | Notes |
| --- | --- | --- |
| size | 1024x1024, 1792x1024, 1024x1792 | No custom sizes; DALL-E 3 dropped 256/512 |
| quality | standard, hd | hd adds detail at higher cost |
| style | vivid, natural | vivid = dramatic/hyper-real; natural = photographic |
| n | 1 only | Multiple images = multiple calls |

Trap: Options offering 256x256 or 512x512 are testing whether you remember those are DALL-E 2 sizes, removed in DALL-E 3. Likewise, asking for n=4 in one DALL-E 3 call fails.
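The parameter rules above can be checked before a request is ever sent. The following is a minimal sketch, not part of any SDK: the function name validate_dalle3_params and the constant names are hypothetical, and the allowed values are simply the ones listed in the table.

```python
# Allowed DALL-E 3 values as listed in the parameter table above.
ALLOWED_SIZES = {"1024x1024", "1792x1024", "1024x1792"}
ALLOWED_QUALITIES = {"standard", "hd"}
ALLOWED_STYLES = {"vivid", "natural"}


def validate_dalle3_params(size: str, quality: str = "standard",
                           style: str = "vivid", n: int = 1) -> list[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    if size not in ALLOWED_SIZES:
        problems.append(f"size {size!r} is not one of {sorted(ALLOWED_SIZES)}")
    if quality not in ALLOWED_QUALITIES:
        problems.append(f"quality {quality!r} must be 'standard' or 'hd'")
    if style not in ALLOWED_STYLES:
        problems.append(f"style {style!r} must be 'vivid' or 'natural'")
    if n != 1:
        problems.append(f"n={n} is not supported; DALL-E 3 returns one image per call")
    return problems
```

Calling validate_dalle3_params("512x512", n=4) reports two problems (the DALL-E 2 size and the n value), which mirrors the two traps described above.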

Prompt Revision

DALL-E 3 rewrites your prompt before generating, adding descriptive detail, artistic direction for vague requests, and safety-oriented phrasing. The revised_prompt field returns the exact text used, so you can log it, display it, or detect when the model substantially reinterpreted your intent. DALL-E 2 did not rewrite prompts this way, which makes the behavior a favorite exam fact.
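One rough way to detect reinterpretation is to check which meaningful words from the original prompt no longer appear in revised_prompt. This is a crude heuristic sketched under assumptions: the helper name dropped_terms and the four-character cutoff are hypothetical, and it compares words literally, so it cannot recognize synonyms.

```python
import re


def _words(text: str) -> set[str]:
    """Lowercase the text and split it into a set of simple word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def dropped_terms(original: str, revised: str, min_length: int = 4) -> list[str]:
    """List words from the original prompt that are missing from the revised one.

    Words shorter than min_length (articles, prepositions) are ignored.
    """
    revised_words = _words(revised)
    return sorted(
        word for word in _words(original)
        if len(word) >= min_length and word not in revised_words
    )
```

For the prompt "A serene mountain lake at sunset, digital art" and a revision that swaps "serene" for "tranquil", the function returns ['serene']. A long list of dropped terms is a hint to log the pair or show the revised prompt to the user.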

Dual Content Filtering

As with text (where both the prompt and the completion are filtered), image generation is screened at two points, before and after the image is produced:

| Stage | When | What it checks |
| --- | --- | --- |
| Prompt filter | Before generation | Blocks requests for harmful imagery |
| Image filter | After generation | Analyzes the produced image for harmful content |
| Copyright / IP check | After generation | Blocks copyrighted characters & trademarked logos |

Both filters use the same four harm categories as text (hate, sexual, violence, self-harm) at configurable severity thresholds. A prompt may pass the input filter and still have its output image blocked, so a content-filter block can occur at either stage.

Responsible-AI Restrictions

  • No photorealistic images of identifiable real people (celebrities, public figures).
  • No reproduction of copyrighted characters or trademarked logos.
  • Refuses prompts for misleading, deceptive, or otherwise harmful imagery.
  • A blocked request returns an error / content_filter flag rather than an image.

Scenario: an app sends "a photorealistic portrait of [a named celebrity]". The prompt filter rejects it under the real-person rule — the fix is to describe a fictional or non-identifiable subject, not to raise any parameter.

Reading the finish reason

When a request is blocked, you do not get a usable image. The Images API surfaces this either as an HTTP error carrying a content-filter code or, in chat-style flows, as a content_filter finish reason. Your application must handle this gracefully — show a friendly message and let the user revise the prompt rather than silently failing. This dual-stage screening means defensive code should anticipate a block at either the prompt or the image stage, not just on submission.
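The graceful-handling advice above can be sketched as a small wrapper. Everything named here is hypothetical: safe_generate, ImageResult, and the generate callable stand in for your own code. The two error codes are an assumption about what the service returns for a filtered image request, so confirm them against the API version you use. The wrapper relies on the raised exception exposing a code attribute, as the openai SDK's API errors do.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumed error codes for a filtered image request; verify for your API version.
CONTENT_FILTER_CODES = {"content_policy_violation", "contentFilter"}


@dataclass
class ImageResult:
    ok: bool
    url: Optional[str] = None
    message: Optional[str] = None


def safe_generate(generate: Callable[[str], str], prompt: str) -> ImageResult:
    """Call generate(prompt) and turn a content-filter block into a friendly message."""
    try:
        return ImageResult(ok=True, url=generate(prompt))
    except Exception as exc:
        # Only errors carrying a known content-filter code are handled here;
        # anything else (network failures, bugs) is re-raised unchanged.
        if getattr(exc, "code", None) in CONTENT_FILTER_CODES:
            return ImageResult(
                ok=False,
                message="That request was blocked by the content filter. "
                        "Please revise your prompt and try again.",
            )
        raise
```

With a real client, generate could be a small function that calls client.images.generate and returns resp.data[0].url. Because the block may come from either the prompt or the image stage, the wrapper treats both the same way and leaves the retry decision to the user.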

Persisting generated images

Because the returned url expires (~24 hours), production apps follow a consistent pattern: generate the image, immediately download the bytes (or request response_format="b64_json"), and store them in durable storage such as Azure Blob Storage with your own access controls. Relying on the temporary URL for anything beyond an immediate preview is a frequently-tested mistake. If you need several variations, loop and call the API multiple times with n=1 each, since DALL-E 3 never returns more than one image per request.
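The generate-then-persist pattern, including the one-call-per-image loop, might look like the sketch below. It writes to a local directory to stay self-contained; persist_variations and the generate_b64 callable are hypothetical names, and the callable is assumed to return the base64 string you would get from a response_format="b64_json" request.

```python
import base64
from pathlib import Path
from typing import Callable


def persist_variations(generate_b64: Callable[[str], str], prompt: str,
                       count: int, out_dir: Path) -> list[Path]:
    """Generate `count` images, one API call each, and write them to out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for index in range(count):
        b64_data = generate_b64(prompt)  # one image per call (n=1)
        path = out_dir / f"image_{index:02d}.png"
        # Decode immediately so nothing depends on a temporary URL.
        path.write_bytes(base64.b64decode(b64_data))
        saved.append(path)
    return saved
```

In production, the write_bytes step is where an upload to durable storage such as Azure Blob Storage would go instead of a local file.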

DALL-E 3 vs. GPT-image-1

| Capability | DALL-E 3 | GPT-image-1 (newer) |
| --- | --- | --- |
| Text-to-image | Yes | Yes |
| Auto prompt revision | Yes | Model-managed |
| Image editing / inpainting | No | Yes (image + mask input) |
| Transparent backgrounds, better text rendering | Limited | Improved |
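The image-plus-mask editing capability in the comparison above could be called roughly as follows. This is a sketch under assumptions: edit_with_mask is a hypothetical helper, the deployment name is whatever you gave your GPT-image-1 deployment, and the code assumes the response carries base64 data in b64_json. It uses the openai SDK's images.edit method on a client you construct yourself, and editing may require a newer API version than the generation example earlier in this section.

```python
from pathlib import Path
from typing import Any


def edit_with_mask(client: Any, deployment: str, image_path: Path,
                   mask_path: Path, prompt: str) -> str:
    """Send an image and a mask to the Images edit endpoint; return base64 data."""
    with open(image_path, "rb") as image_file, open(mask_path, "rb") as mask_file:
        response = client.images.edit(
            model=deployment,   # hypothetical GPT-image-1 deployment name
            image=image_file,   # the existing photo to edit
            mask=mask_file,     # marks the region to be replaced
            prompt=prompt,      # describes the new content for that region
        )
    # Assumption: the edited image comes back inline as base64.
    return response.data[0].b64_json
```

The returned string can be decoded with base64.b64decode and stored like any other generated image.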

Quality vs. style, decoded

The two creative knobs are independent and often confused. quality (standard vs. hd) controls how much rendering effort and detail go into the image — hd costs more and takes longer but sharpens fine detail. style (vivid vs. natural) controls the aesthetic: vivid pushes dramatic, saturated, hyper-real results, while natural yields more muted, photographic output. A scenario asking for "realistic, true-to-life product photos" points to style=natural; one asking for "eye-catching, dramatic marketing art" points to style=vivid. Neither knob changes the allowed sizes or the n=1 limit.

Responsible-AI summary for images

| Restriction | Practical effect |
| --- | --- |
| No identifiable real people | Photorealistic celebrity/public-figure portraits blocked |
| No copyrighted characters | Branded cartoon/film characters refused |
| No trademarked logos | Company marks not reproduced |
| Harm-category filters | Hate, sexual, violence, self-harm screened on prompt and image |

These are enforced by the service, not optional toggles a developer can disable, and they are the most likely responsible-AI image questions on the exam.

On the Exam: Remember the dual filter (prompt AND image), the fixed three sizes with n=1, the revised_prompt field, the temporary URL that must be persisted, and the real-person / copyright restrictions. If a scenario requires editing an existing image with a mask, that points to GPT-image-1, not DALL-E 3, which only generates from text.

Test Your Knowledge

Which image sizes does DALL-E 3 support on Azure OpenAI?

What does the revised_prompt field in a DALL-E 3 response contain?

How does content filtering apply to image generation on Azure OpenAI?

An application must edit an existing photo by replacing a masked region with new content. Which model is appropriate?
