Multimodal Models


GPT-4o: Native Image Generation

Launched in March 2025, it introduced a new axis for thinking about image generators:

Instruction-following: follows complex, specific prompts with high fidelity
Context-aware: understands and preserves context across multi-turn edits
Reference image uploads: accepts your own images as input
Precise editing: modifies specific parts of an image without touching the rest

Midjourney optimizes for aesthetic quality
GPT-4o optimizes for instruction precision

Two different tools for different creative goals.