Amazon Touts AI Milestone, Says New Model Outshines GPT-4o In Real-Time Speech

Amazon.com Inc. (AMZN) has released a new speech-based AI model called Amazon Nova Sonic, designed to change real-time voice interactions in AI-powered applications.

This system integrates both speech comprehension and voice generation within one unified architecture, removing the need for multiple standalone models to manage each task separately.

Nova Sonic streamlines speech processing by replacing the conventional multi-step approach, in which separate systems handle recognition, interpretation, and speech output, with a single, integrated framework.

This all-in-one model enables smoother and more lifelike interactions. Accessible via Amazon Bedrock through a bi-directional streaming API, the technology is poised to support diverse sectors, including healthcare, travel, and hospitality.

Nova Sonic captures subtle elements of speech, including intonation, rhythm, and pauses, allowing it to respond with a level of sensitivity that closely resembles human dialogue, the company says.

It adapts to real-time interruptions, holding off on replies until it's contextually appropriate to speak. This conversational awareness creates a more lifelike and engaging interaction, making it especially effective for roles in customer service and AI-driven assistance.


“From the invention of the world’s best personal AI assistant with Alexa, to developing AWS services like Connect, Lex, and Polly that are used across a wide range of industries, Amazon has long believed that voice-powered applications can make all of our customers’ lives better and easier,” said SVP of Amazon Artificial General Intelligence, Rohit Prasad.

In standardized industry evaluations, Nova Sonic outperformed competitors, including OpenAI's GPT-4o (Realtime) and Google's Gemini Flash 2.0, in several categories, according to Amazon.

Notably, Nova Sonic achieved higher win rates for both its masculine and feminine American English voices, as well as its British English voice, when measured on datasets like Common Eval and Multilingual LibriSpeech, according to Amazon.

In speech recognition tests across five key languages, Nova Sonic recorded a word error rate of 4.2%, a more than 36% improvement over OpenAI's equivalent offering.
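For readers unfamiliar with the metric, word error rate is conventionally the number of word-level substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The Python sketch below illustrates that standard definition. It is not Amazon's evaluation code; the function names and sample sentences are hypothetical, and the normalization (lowercasing and whitespace splitting) is a simplifying assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fractional reduction in WER relative to a baseline system."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical sample: one dropped word out of five reference words.
wer = word_error_rate("book a flight to seattle", "book flight to seattle")
print(f"WER: {wer:.1%}")
```

A "percent improvement" in word error rate is usually a relative figure of the kind `relative_improvement` computes, not a difference in percentage points; the article does not specify which reading Amazon intends.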

It also held up under challenging audio conditions, outperforming competitors by roughly 47% in noisy, real-world tests. Its average response time is just above one second, and it costs nearly 80% less than GPT-4o.

In February, Amazon said that around 1,000 generative AI projects were underway or already built across its business divisions, spanning operational areas from customer service improvements to inventory management.

The company is investing about $100 billion in artificial intelligence initiatives this year, compared with rivals such as Alphabet ($75 billion) and Microsoft ($80 billion).

The tech companies’ race for AI dominance intensified after the launch of Chinese AI startup DeepSeek’s R1, which made waves with its performance and lower costs.

Price Action: AMZN shares traded higher by 1.6% at $178.07 at last check on Tuesday.


This content was partially produced with the help of AI tools and was reviewed and published by Benzinga editors.
