
How Our AI Agent Actually Works: Intent, Modes, and Automatic Model Selection
Go behind the scenes of the AI Agent: how it reads your intent, picks the right mode and model, enhances your prompt, and chains multi-step plans for you.
From a Sentence to a Finished Image
The AI Agent lets you skip the menus. You describe what you want in plain language and it handles model choice, settings, and execution. Here is what actually happens under the hood between your message and the finished result.
Step 1: Understanding Your Intent
The agent reads your message together with its context (any images you uploaded and the conversation so far) to work out what you are really asking for, not just the literal words.
Step 2: Detecting the Mode
Next it picks the right mode: create a brand-new image, transform or edit an existing one, upscale, generate a video, or animate a still. If it thinks you want to switch modes, it tells you first so you are never surprised.
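To make the idea concrete, here is a toy sketch of mode detection. The five `Mode` values mirror the modes listed above, but everything else is an assumption for illustration: the keyword rules and the `detect_mode` and `needs_confirmation` helpers are hypothetical, and the real agent almost certainly relies on something richer than keyword matching.

```python
from enum import Enum


class Mode(Enum):
    CREATE = "create"    # brand-new image
    EDIT = "edit"        # transform or edit an existing image
    UPSCALE = "upscale"
    VIDEO = "video"      # generate a video from text
    ANIMATE = "animate"  # animate a still


def detect_mode(message: str, has_image: bool) -> Mode:
    """Hypothetical keyword-based stand-in for the agent's mode detector."""
    text = message.lower()
    if "upscale" in text and has_image:
        return Mode.UPSCALE
    if "animate" in text and has_image:
        return Mode.ANIMATE
    if "video" in text:
        # A video request with an uploaded still is treated as "animate".
        return Mode.ANIMATE if has_image else Mode.VIDEO
    if has_image:
        return Mode.EDIT
    return Mode.CREATE


def needs_confirmation(current: Mode, detected: Mode) -> bool:
    """True when the agent should announce a mode switch before acting."""
    return current != detected
```

In this sketch, a call such as `needs_confirmation(Mode.CREATE, detect_mode("make the sky purple", has_image=True))` is the point where the agent would tell you it is about to switch modes.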
Step 3: Choosing the Model
This is where the agent earns its keep. It routes your request to the model best suited for the job:
- Text inside the image, like posters or signage: GPT Image 2 or Flux 2 Flex
- Highest quality or a hero shot: Flux 2 Max or Imagen 4 Ultra
- Speed and high volume: Flux 2 Klein or Nano Banana
- Real-world or current subjects: Flux 2 Max with web-search grounding
- Editing or combining references: a Flux 2 Edit model, with up to eight reference images
- Video from a still: Veo, Kling, or Seedance image-to-video
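The routing list above can be pictured as a simple lookup table. This is only a sketch: the model names come from the list, while the dictionary keys, the `candidate_models` helper, and the fallback to the highest-quality tier are assumptions made for the example, not the agent's actual routing rules.

```python
from typing import Dict, List

# Keys are hypothetical labels; values are the models named in the list above.
ROUTES: Dict[str, List[str]] = {
    "text_in_image": ["GPT Image 2", "Flux 2 Flex"],
    "highest_quality": ["Flux 2 Max", "Imagen 4 Ultra"],
    "speed_volume": ["Flux 2 Klein", "Nano Banana"],
    "real_world": ["Flux 2 Max"],          # used with web-search grounding
    "edit_references": ["Flux 2 Edit model"],
    "video_from_still": ["Veo", "Kling", "Seedance"],
}


def candidate_models(need: str) -> List[str]:
    """Return the models suited to a need.

    Falling back to the highest-quality tier for unknown needs is an
    assumption of this sketch.
    """
    return list(ROUTES.get(need, ROUTES["highest_quality"]))
```

For example, `candidate_models("text_in_image")` returns the two models the list recommends for posters and signage.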
Step 4: Enhancing Your Prompt
Before generating, the agent automatically translates non-English prompts to English and enriches them with style, lighting, composition, and mood detail. A three-word request becomes a precise brief the model can actually follow.
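Here is a rough sketch of what "enriching" a prompt could look like. The `PromptBrief` class, the `enhance` function, and every default value are hypothetical placeholders; the real agent presumably picks style, lighting, composition, and mood to fit each request, and the `translate` argument stands in for whatever translation step it uses.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PromptBrief:
    subject: str
    # The defaults below are illustrative placeholders only.
    style: str = "photorealistic"
    lighting: str = "soft natural light"
    composition: str = "centered subject, shallow depth of field"
    mood: str = "calm"

    def render(self) -> str:
        """Join the fields into a single prompt string."""
        parts = [
            self.subject,
            f"{self.style} style",
            self.lighting,
            self.composition,
            f"{self.mood} mood",
        ]
        return ", ".join(parts)


def enhance(prompt: str, translate: Callable[[str], str] = lambda s: s) -> str:
    """Translate to English (identity by default), then expand into a brief."""
    english = translate(prompt.strip())
    return PromptBrief(subject=english).render()
```

With these placeholder defaults, a short request like `enhance("cat on sofa")` grows into a single comma-separated brief that carries the style, lighting, composition, and mood alongside the subject.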
Step 5: Multi-Step Plans
For bigger requests the agent builds a plan and executes it step by step (for example, generate an image and then upscale it, or produce a set of variations), showing progress as it goes instead of making you run each step yourself.
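A multi-step plan can be thought of as an ordered list of steps, each feeding its output to the next. The sketch below assumes exactly that and nothing more: `run_plan`, `generate`, and `upscale` are hypothetical names, and the two step functions just build strings so the example runs without any real model calls.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[str], str]]


def run_plan(steps: List[Step], initial: str,
             report: Callable[[str], None] = print) -> str:
    """Run steps in order, passing each result on and reporting progress."""
    artifact = initial
    total = len(steps)
    for index, (name, action) in enumerate(steps, start=1):
        report(f"[{index}/{total}] {name}")
        artifact = action(artifact)
    return artifact


# Stand-in steps: real ones would call an image model and an upscaler.
def generate(prompt: str) -> str:
    return f"image({prompt})"


def upscale(image: str) -> str:
    return f"upscaled({image})"
```

Calling `run_plan([("generate", generate), ("upscale", upscale)], "red fox")` reports each step as it starts and returns the final artifact, which is the "generate, then upscale" example from the paragraph above in miniature.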
Multi-Reference: Consistency Across Scenes
When you need the same character in different scenes, a person dropped into a new background, or a product placed into a mockup, the agent passes multiple reference images to a Flux 2 Edit model and keeps the key elements consistent.
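As a sketch of how a multi-reference request might be assembled, the snippet below enforces the eight-image limit mentioned in the model list. The request shape, the `"flux-2-edit"` identifier, and the `build_edit_request` helper are all assumptions for illustration; the article does not describe the actual API.

```python
from typing import Dict, List

MAX_REFERENCES = 8  # limit stated in the model list above


def build_edit_request(prompt: str, reference_paths: List[str]) -> Dict[str, object]:
    """Assemble a hypothetical multi-reference edit request."""
    if not reference_paths:
        raise ValueError("at least one reference image is required")
    if len(reference_paths) > MAX_REFERENCES:
        raise ValueError(
            f"got {len(reference_paths)} references; the limit is {MAX_REFERENCES}"
        )
    return {
        "model": "flux-2-edit",  # hypothetical identifier
        "prompt": prompt,
        "references": list(reference_paths),
    }
```

For instance, `build_edit_request("same character on a beach", ["hero.png", "beach.png"])` bundles a character reference and a background reference into one request.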
You Stay in Control
Every decision is transparent. You can see which model the agent chose, override it whenever you like, or step into the Studio for full manual control. The agent is there to remove busywork, not to take away your choices.
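The override rule is simple enough to write down in a few lines. This is a minimal sketch, assuming a hypothetical `resolve_model` helper: whatever the agent picked, an explicit choice from you takes priority, and the reason is returned so the decision stays visible.

```python
from typing import Optional, Tuple


def resolve_model(agent_choice: str,
                  user_override: Optional[str] = None) -> Tuple[str, str]:
    """Return (model, reason); an explicit user pick always wins."""
    if user_override:
        return user_override, "chosen by you"
    return agent_choice, "chosen by the agent"
```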
