Google releases new AI video model Veo 3.1 in Flow and API: what it means for enterprises

Metro Loud

As expected after days of leaks and rumors online, Google has unveiled Veo 3.1, its latest AI video generation model, bringing a suite of creative and technical upgrades aimed at improving narrative control, audio integration, and realism in AI-generated video.

While the updates expand possibilities for hobbyists and content creators using Google’s online AI creation app, Flow, the release also signals a growing opportunity for enterprises, developers, and creative teams seeking scalable, customizable video tools.

The quality is higher, the physics better, the pricing the same as before, and the control and editing features more robust and varied.

My initial tests showed it to be a powerful and performant model that immediately delights with each generation. However, the look is more cinematic, polished and a little more "artificial" by default than rivals such as OpenAI's new Sora 2, released late last month, which may or may not be what a particular user is going after (Sora excels at handheld and "candid" style videos).

Expanded Control Over Narrative and Audio

Veo 3.1 builds on its predecessor, Veo 3 (released back in May 2025), with enhanced support for dialogue, ambient sound, and other audio effects.

Native audio generation is now available across several key features in Flow, including “Frames to Video,” “Ingredients to Video,” and “Extend,” which give users the ability to, respectively: turn still images into video; use items, characters and objects from multiple images in a single video; and generate longer clips than the initial 8 seconds, to more than 30 seconds and even a minute or more when continuing from a prior clip's final frame.

Before, you had to add audio manually after using these features.

This addition gives users greater command over tone, emotion, and storytelling, capabilities that have previously required post-production work.

In enterprise contexts, this level of control may reduce the need for separate audio pipelines, offering an integrated way to create training content, marketing videos, or digital experiences with synchronized sound and visuals.

Google noted in a blog post that the updates reflect user feedback calling for deeper artistic control and improved audio support. Gallegos emphasizes the importance of making edits and refinements possible directly in Flow, without reworking scenes from scratch.

Richer Inputs and Editing Capabilities

With Veo 3.1, Google introduces support for multiple input types and more granular control over generated outputs. The model accepts text prompts, images, and video clips as input, and also supports:

  • Reference images (up to three) to guide appearance and style in the final output

  • First and last frame interpolation to generate seamless scenes between fixed endpoints

  • Scene extension that continues a video’s action or motion beyond its current duration

These tools aim to give enterprise users a way to fine-tune the look and feel of their content, which is useful for brand consistency or adherence to creative briefs.

Additional capabilities like “Insert” (add objects to scenes) and “Remove” (delete elements or characters) are also being introduced, though not all are immediately available through the Gemini API.

Deployment Across Platforms

Veo 3.1 is available through several of Google’s existing AI services:

  • Flow, Google’s own interface for AI-assisted filmmaking

  • Gemini API, targeted at developers building video capabilities into applications

  • Vertex AI, where enterprise integration will soon support Veo’s “Scene Extension” and other key features

Availability through these platforms allows enterprise customers to choose the right environment, GUI-based or programmatic, based on their teams and workflows.

Pricing and Access

The Veo 3.1 model is currently in preview and available only on the paid tier of the Gemini API. The cost structure is the same as for Veo 3, the preceding generation of AI video models from Google.

  • Standard model: $0.40 per second of video

  • Fast model: $0.15 per second of video

There is no free tier, and users are charged only if a video is successfully generated. This pricing model is consistent with previous Veo versions and gives budget-conscious enterprise teams predictable costs.
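To make the per-second rates concrete, here is a minimal budgeting sketch. It uses only the two prices quoted above; the function name `estimate_cost`, the tier labels, and the idea of counting only successful clips are illustrative assumptions, not part of any Google SDK.

```python
from decimal import Decimal

# Per-second preview prices quoted in the article (USD).
# The tier labels "standard" and "fast" are illustrative names only.
PRICE_PER_SECOND = {
    "standard": Decimal("0.40"),
    "fast": Decimal("0.15"),
}


def estimate_cost(seconds: int, tier: str = "standard", successful_clips: int = 1) -> Decimal:
    """Estimate the charge for a number of successfully generated clips.

    Failed generations are left out, since the article says users are
    charged only when a video is successfully generated.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    if successful_clips < 0:
        raise ValueError("successful_clips must be non-negative")
    return PRICE_PER_SECOND[tier] * seconds * successful_clips


if __name__ == "__main__":
    # One 8-second clip on the standard model.
    print(estimate_cost(8))
    # A batch of 100 successful 8-second clips on the fast model.
    print(estimate_cost(8, tier="fast", successful_clips=100))
```

At these rates a single 8-second standard clip comes to $3.20, and a batch of 100 fast clips of the same length comes to $120, which is the kind of figure a team would plug into a campaign budget.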

Technical Specs and Output Control

Veo 3.1 outputs video at 720p or 1080p resolution, with a 24 fps frame rate.

Duration options include 4, 6, or 8 seconds from a text prompt or uploaded images, with the ability to extend videos up to 148 seconds (nearly two and a half minutes) when using the “Extend” feature.
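The numbers in this section can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are assumptions made for this example, and the validity check simply encodes the limits stated above rather than any rule taken from Google's API.

```python
FPS = 24                      # frame rate stated in the article
BASE_DURATIONS = (4, 6, 8)    # seconds allowed for an initial generation
MAX_EXTENDED_SECONDS = 148    # stated ceiling when using the Extend feature


def frame_count(seconds: int) -> int:
    """Number of frames in a clip of the given length at 24 fps."""
    return seconds * FPS


def minutes_and_seconds(seconds: int) -> tuple[int, int]:
    """Split a duration in seconds into (minutes, remaining seconds)."""
    return divmod(seconds, 60)


def within_stated_limits(seconds: int, extended: bool = False) -> bool:
    """Check a requested duration against the limits quoted in the article."""
    if extended:
        return 0 < seconds <= MAX_EXTENDED_SECONDS
    return seconds in BASE_DURATIONS


if __name__ == "__main__":
    print(frame_count(8))                        # frames in an 8-second clip
    print(frame_count(MAX_EXTENDED_SECONDS))     # frames in a maximum-length clip
    print(minutes_and_seconds(MAX_EXTENDED_SECONDS))
```

An 8-second clip is 192 frames, and the 148-second maximum works out to 3,552 frames, or 2 minutes 28 seconds of footage.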

New functionality also includes tighter control over subjects and environments. For example, enterprises can upload a product image or visual reference, and Veo 3.1 will generate scenes that preserve its appearance and stylistic cues across the video. This could streamline creative production pipelines for retail, advertising, and digital content production teams.

Initial Reactions

The broader creator and developer community has responded to Veo 3.1’s launch with a mix of optimism and tempered critique, particularly when comparing it to rival models like OpenAI’s Sora 2.

Matt Shumer, founder of Otherside AI/Hyperwrite and an early adopter, described his initial reaction as “disappointment,” noting that Veo 3.1 is “noticeably worse than Sora 2” and also “quite a bit more expensive.”

However, he acknowledged that Google’s tooling, such as support for references and scene extension, is a bright spot in the release.

Travis Davids, a 3D digital artist and AI content creator, echoed some of that sentiment. While he noted improvements in audio quality, particularly in sound effects and dialogue, he raised concerns about limitations that remain in the system.

These include the lack of custom voice support, an inability to select generated voices directly, and the continued cap at 8-second generations, despite some public claims about longer outputs.

Davids also pointed out that character consistency across changing camera angles still requires careful prompting, whereas other models like Sora 2 handle this more automatically. He questioned the absence of 1080p resolution for users on paid tiers like Flow Pro and expressed skepticism over feature parity.

On the more positive end, @kimmonismus, an AI newsletter writer, stated that “Veo 3.1 is superb,” though still concluded that OpenAI’s latest model remains preferable overall.

Together, these early impressions suggest that while Veo 3.1 delivers meaningful tooling improvements and new creative control features, expectations have shifted as rivals raise the bar on both quality and value.

Adoption and Scale

Since launching Flow five months ago, Google says over 275 million videos have been generated across various Veo models.

The pace of adoption suggests significant interest not only from individuals but also from developers and businesses experimenting with automated content creation.

Thomas Iljic, Director of Product Management at Google Labs, highlights that Veo 3.1’s release brings capabilities closer to how human filmmakers plan and shoot. These include scene composition, continuity across shots, and coordinated audio, all areas that enterprises increasingly look to automate or streamline.

Safety and Responsible AI Use

Videos generated with Veo 3.1 are watermarked using Google’s SynthID technology, which embeds an imperceptible identifier to signal that the content is AI-generated.

Google applies safety filters and moderation across its APIs to help minimize privacy and copyright risks. Generated content is stored temporarily and deleted after two days unless downloaded.

For developers and enterprises, these features provide reassurance around provenance and compliance, which is critical in regulated or brand-sensitive industries.

Where Veo 3.1 Stands in a Crowded AI Video Model Space

Veo 3.1 is not just an iteration on prior models; it represents a deeper integration of multimodal inputs, storytelling control, and enterprise-level tooling. While creative professionals may see immediate benefits in editing workflows and fidelity, businesses exploring automation in training, advertising, or digital experiences may find even greater value in the model’s composability and API support.

The early user feedback highlights that while Veo 3.1 offers valuable tooling, expectations around realism, voice control, and generation length are evolving rapidly. As Google expands access through Vertex AI and continues refining Veo, its competitive positioning in enterprise video generation will hinge on how quickly these user pain points are addressed.
