Experiment with Gemini 2.0 Flash native image generation

In December we first launched native image output in Gemini 2.0 Flash to trusted testers. Today, we’re making it available for developer experimentation across all regions currently supported by Google AI Studio. You can test this new capability using an experimental version of Gemini 2.0 Flash (gemini-2.0-flash-exp) in Google AI Studio and via the Gemini API.

Gemini 2.0 Flash combines multimodal input, enhanced reasoning, and natural language understanding to create images.

Here are some examples of where 2.0 Flash’s multimodal outputs shine:

1. Text and images together

Use Gemini 2.0 Flash to tell a story and it will illustrate it with pictures, keeping the characters and settings consistent throughout. Give it feedback and the model will retell the story or change the style of its drawings.

Story and illustration generation in Google AI Studio

2. Conversational image editing

Gemini 2.0 Flash helps you edit images through many turns of a natural language dialogue, which is great for iterating towards a perfect image or exploring different ideas together.

Multi-turn conversational image editing that maintains context throughout the conversation in Google AI Studio
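As a rough illustration of what multi-turn editing could look like through the API, the sketch below sends each editing instruction as its own turn of one chat session. The helper name `run_edit_session` and the example prompts are hypothetical. The commented usage assumes the google-genai Python SDK's `client.chats.create(...)` and `chat.send_message(...)` calls, and that the chat session keeps the earlier turns so later instructions can refer back to the image produced so far.

```python
def run_edit_session(chat, instructions):
    """Send each editing instruction as a separate turn and collect the replies.

    `chat` is any object with a send_message(text) method. The session object
    itself is expected to carry the earlier turns (an assumption about the SDK
    chat object), so this helper does not track history on its own.
    """
    replies = []
    for instruction in instructions:
        replies.append(chat.send_message(instruction))
    return replies


# Assumed usage with the google-genai SDK (needs an API key, not executed here):
#
#   from google import genai
#   from google.genai import types
#
#   client = genai.Client(api_key="GEMINI_API_KEY")
#   chat = client.chats.create(
#       model="gemini-2.0-flash-exp",
#       config=types.GenerateContentConfig(response_modalities=["Text", "Image"]),
#   )
#   replies = run_edit_session(chat, [
#       "Draw a croissant on a plate.",            # hypothetical prompt
#       "Now add a chocolate drizzle on top.",     # hypothetical follow-up edit
#   ])
```

Keeping every instruction in the same session, instead of issuing independent requests, is what would let a follow-up such as "now add a drizzle" apply to the previous image.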

3. World understanding

Unlike many other image generation models, Gemini 2.0 Flash leverages world knowledge and enhanced reasoning to create the right image. This makes it perfect for creating detailed imagery that’s realistic, like illustrating a recipe. While it strives for accuracy, like all language models, its knowledge is broad and general, not absolute or complete.

Interleaved text and image output for a recipe in Google AI Studio

4. Text rendering

Most image generation models struggle to accurately render long sequences of text, often resulting in poorly formatted or illegible characters, or misspellings. Internal benchmarks show that 2.0 Flash has stronger text rendering compared to leading competitive models, and it is great for creating advertisements, social posts, or even invitations.

Image outputs with long text rendering in Google AI Studio

Start making images with Gemini today

Get started with Gemini 2.0 Flash via the Gemini API. Read more about image generation in our docs.

from google import genai
from google.genai import types

# Replace the placeholder with your own API key.
client = genai.Client(api_key="GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=(
        "Generate a story about a cute baby turtle in a 3d digital art style. "
        "For each scene, generate an image."
    ),
    # Request both text and image parts in the response.
    config=types.GenerateContentConfig(
        response_modalities=["Text", "Image"]
    ),
)
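The request above returns text and images interleaved in a single response. A minimal sketch of how that response might be unpacked follows. It assumes the google-genai SDK's response shape, where `response.candidates[0].content.parts` holds parts that carry either `text` or `inline_data` (with raw image bytes in `data` and a `mime_type`). The helper name `save_parts`, the `out_dir` default, and the `scene_N` file naming are hypothetical choices made for illustration.

```python
from pathlib import Path


def save_parts(response, out_dir="story_output"):
    """Collect text parts and write image parts of the first candidate to disk.

    Assumes each part exposes `text` and/or `inline_data` (with `data` as bytes
    and `mime_type` such as "image/png"), which is an assumption about the SDK
    response objects and is not stated in the article.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    texts, image_paths = [], []
    for part in response.candidates[0].content.parts:
        if getattr(part, "text", None):
            texts.append(part.text)
        elif getattr(part, "inline_data", None) is not None:
            # Derive a file extension from the MIME type, e.g. "image/png" -> "png".
            ext = part.inline_data.mime_type.split("/")[-1]
            path = out / f"scene_{len(image_paths) + 1}.{ext}"
            path.write_bytes(part.inline_data.data)
            image_paths.append(path)
    return texts, image_paths
```

With the `response` from the request above, `texts, image_paths = save_parts(response)` would give the story text in order plus the paths of the saved scene images.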

Whether you are building AI agents, developing apps with beautiful visuals like illustrated interactive stories, or brainstorming visual ideas in conversation, Gemini 2.0 Flash allows you to add text and image generation with just a single model. We’re eager to see what developers create with native image output, and your feedback will help us finalize a production-ready version soon.
