A recent Financial Times article about Amazon's plans to upgrade Alexa with generative AI capabilities sparked some fascinating discussions about the future of voice commerce. The comments section, often a goldmine of consumer sentiment, painted a telling picture of current user perception:
"Ours is used mainly for setting a timer for boiled eggs."
"Turn off Alexa 3 years ago, never looked back and saved some electricity."
"So it's slow, expensive, unreliable, and makes stuff up. What's not to like?"
These sentiments reflect a broader challenge in voice AI adoption, not just for Alexa but across platforms like Siri and Google Assistant. When I shared my thoughts about how Amazon's Rufus AI could change how we search and shop on Amazon, industry expert Martin Huebel pushed back. (Shout-out to Martin, who knows more about Amazon's relationships with vendors than anyone I know.)
![](https://www.retailmediabreakfastclub.com/content/images/2025/01/Screenshot-2025-01-28-at-12.21.18-PM.png)
I can see how he made the comparison between Alexa and Rufus. But there are two big differences between these technologies:
- Only a limited set of purchases lends itself well to a voice-only medium. Comparing options and researching high-consideration purchases, or those with a visual element like a new TV or apparel, is confusing and tedious by voice alone.
- Alexa turned out to be pretty average at understanding user questions and requests when it came to shopping. Alexa could read the label, but she couldn’t tell you if it was a good product for your needs.
However, the landscape may be shifting. There's a big initiative at Amazon to redesign Alexa and make her smarter. Let's examine why this time might be different and what it means for the industry.
Amazon's New Vision for Alexa
Amazon is reimagining Alexa as an AI "agent" powered by its in-house Nova large language model, which is optimized for conversational AI, and supplemented by Anthropic's Claude (backed by Amazon's $8B investment). This transformation signals a shift from simple command-response interactions to more nuanced, context-aware exchanges.
What was wrong with Alexa?
With all the advancement in generative AI today, and with Alexa's technology seeming to have stalled, I was curious to dig into why.
With all of Amazon's engineering power, why didn't Alexa get significantly smarter over the past 8 years?
I trawled through a series of GeekWire articles to understand the key reasons:
a) Alexa was built on an older framework. When Amazon first created Alexa, they quickly pieced together a system (sometimes called the “scaffolding”) to handle voice requests like checking weather, playing music, or setting timers. This early design was good enough to launch the product, but it wasn’t built for deep, free-flowing conversations. Over time, it’s been hard to rebuild these “foundations” without disrupting existing features.
b) Voice is harder to get right. Text-based chatbots like ChatGPT can afford to pause briefly and “think” as they generate their answers. By contrast, Alexa is supposed to respond within seconds (or less) as soon as you say “Alexa…” People notice even small delays when speaking out loud. Managing this real-time voice interaction (including wake word detection, speech recognition, and immediate replies) is more complex and places heavier constraints on the AI; a sketch of that pipeline follows this list.
c) Users expect reliability. If ChatGPT makes a mistake—like stating something incorrectly—it’s not the end of the world. But if Alexa mishears your command to turn off the oven or unlock the door, that’s a bigger deal. So Alexa’s creators have spent much of their effort on ensuring it can reliably handle day-to-day tasks rather than dabbling in more experimental “chat” features.
d) Limited conversation context. Voice queries tend to be short, one-off requests. That’s partly because of habit (we’re used to giving Alexa basic commands) and partly because long-form speech in everyday life can be cumbersome. This means Alexa often isn’t fed the lengthy context that a chatbot uses to generate more nuanced answers.
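To make point (b) concrete, here's a minimal sketch of the kind of latency budget a real-time voice pipeline has to live within. Everything here is my own illustration: the stages, the per-stage numbers, and the function names are assumptions, not Alexa's actual architecture or targets.

```python
import time

# Hypothetical per-stage latency budget (seconds) for one voice turn.
# Illustrative assumptions only -- not Alexa's real stages or targets.
LATENCY_BUDGET = {
    "wake_word": 0.05,  # on-device detection of "Alexa..."
    "asr": 0.30,        # streaming speech-to-text
    "nlu": 0.15,        # intent classification / entity extraction
    "action": 0.30,     # call the underlying service (timer, music, ...)
    "tts": 0.20,        # synthesize the spoken reply
}
TOTAL_BUDGET = sum(LATENCY_BUDGET.values())  # ~1 second end to end

def run_stage(stage: str) -> None:
    """Stand-in for real wake-word / ASR / NLU / service / TTS work."""
    time.sleep(0.01)

def handle_utterance() -> dict:
    """Walk one request through the pipeline, tracking elapsed time.

    A text chatbot can "think" for several seconds; a voice assistant
    that blows this ~1s budget just sounds broken to the user.
    """
    start = time.monotonic()
    timings = {}
    for stage in LATENCY_BUDGET:
        stage_start = time.monotonic()
        run_stage(stage)
        timings[stage] = time.monotonic() - stage_start
        if time.monotonic() - start > TOTAL_BUDGET:
            print(f"over budget after {stage!r}: the user hears a pause")
    return timings

if __name__ == "__main__":
    print(handle_utterance())
```

The point of the toy budget: a generative model that needs two or three seconds to produce a good answer simply doesn't fit inside a loop like this without streaming tricks, which is part of why bolting an LLM onto a voice assistant is harder than shipping a chatbot.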
When Rohit Prasad, Senior Vice President of Amazon Alexa, was asked to compare Alexa to ChatGPT, he pointed to ChatGPT's trust issues, namely hallucinations. "Alexa is orchestrating or interacting with thousands of services in real time," he added. "And what makes it seemingly simple is incredibly complex, because underneath, it has more than 30 machine learning systems working together to give you that outcome in less than a second, often."
How Amazon plans to make Alexa smarter
The technical challenges are significant: addressing AI hallucinations, reducing latency, and maintaining reliability at scale. Additionally, Amazon faces the complex task of transitioning Alexa's legacy infrastructure to accommodate these new capabilities.
Here's what I gleaned about how Amazon is thinking about Alexa's 'brain transplant':
a) Building bigger language models. Amazon has hinted that they’re working on a new LLM for Alexa to better understand context, handle more varied questions, and even show some creativity.
b) Partnering with outside AI models. Recent reports say Amazon is using Anthropic’s Claude to enhance Alexa’s conversational skills.
c) Layering new AI on top of existing features. Amazon isn’t scrapping Alexa’s current approach. Instead, they’re weaving new generative AI capabilities into the existing voice infrastructure. The goal is to keep Alexa’s speed and reliability while adding deeper conversation and reasoning.
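One plausible way to picture that layering (and this is my speculation, with invented intents and handler names, not Amazon's actual design) is a router: deterministic, latency-critical commands keep flowing through the fast legacy intent handlers, and only open-ended requests fall through to a generative model.

```python
# Speculative sketch of "layering" generative AI onto a legacy voice
# stack. The intents, keywords, and routing rule are invented for
# illustration -- not Amazon's actual design.

LEGACY_INTENTS = {
    "set_timer": lambda req: f"Timer set: {req}",
    "play_music": lambda req: f"Playing: {req}",
    "smart_home": lambda req: f"Done: {req}",
}

def classify_intent(request: str) -> str | None:
    """Stand-in for the fast, narrow legacy NLU classifier."""
    keywords = {"timer": "set_timer", "play": "play_music", "lights": "smart_home"}
    for word, intent in keywords.items():
        if word in request.lower():
            return intent
    return None  # no known command -> treat as open-ended conversation

def call_llm(request: str) -> str:
    """Stand-in for a generative model (e.g. Nova or Claude)."""
    return f"[generative answer about: {request!r}]"

def route(request: str) -> str:
    intent = classify_intent(request)
    if intent is not None:
        # Old path: fast and reliable, exactly what users expect today.
        return LEGACY_INTENTS[intent](request)
    # New path: slower, but can actually converse and reason.
    return call_llm(request)

if __name__ == "__main__":
    print(route("set a timer for 7 minutes"))
    print(route("which of these two TVs is better for a bright room?"))
```

The appeal of a design like this is that the upgrade is additive: the timer-and-music behavior people rely on keeps its speed and reliability, while the LLM only takes over when the legacy path has nothing to offer.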
Where Voice Commerce Makes Sense
With this context around Alexa's past shortcomings and potential upgrades, we can look again at the promise of a brave new world of voice commerce.
When Alexa launched, many thought voice shopping would take off—just say “Alexa, order more paper towels” and done. But in practice, people usually like to see pictures, read reviews, and compare prices before buying. Voice alone hasn’t matched how most of us prefer to shop.
The most promising applications align with low-consideration, routine purchases. Think consumables, subscriptions, and reorders – situations where convenience outweighs the need for detailed comparison. For instance, imagine receiving contextual alerts about products you regularly use: "Your favorite household items are on sale" or "Time to restock your pantry essentials?"
With advanced AI, Alexa might be able to do a better job of recommending products, summarizing reviews, or guiding you through options in a more conversational way. For quick reorders of known products (“Buy the same laundry detergent as last time”), voice can still be handy.
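Reorders are the friendliest case for voice precisely because the request carries almost no ambiguity: the order history resolves it to a single answer, so there's nothing to compare visually. Here's a toy sketch of that idea; the data and function are entirely made up for illustration.

```python
from datetime import date

# Made-up order history, purely for illustration.
ORDER_HISTORY = [
    {"item": "laundry detergent",
     "product": "Brand X Detergent, 64 loads",
     "ordered": date(2025, 1, 2)},
    {"item": "paper towels",
     "product": "Brand Y Towels, 12 rolls",
     "ordered": date(2024, 12, 15)},
]

def reorder(utterance: str) -> str:
    """Resolve 'buy the same <item> as last time' against order history.

    Voice works here because history removes the need to compare
    options on a screen: there is exactly one right answer.
    """
    most_recent_first = sorted(ORDER_HISTORY,
                               key=lambda o: o["ordered"], reverse=True)
    for order in most_recent_first:
        if order["item"] in utterance.lower():
            return (f"Reordering {order['product']} "
                    f"(last bought {order['ordered']}).")
    return "I couldn't find that in your past orders. Want me to search instead?"

if __name__ == "__main__":
    print(reorder("Buy the same laundry detergent as last time"))
```

Contrast that with "which 65-inch TV should I buy?", where no amount of history collapses the decision to one answer, and the voice-only channel starts to strain.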
But shopping may remain a niche use case for Alexa unless Amazon can make voice interactions feel as detailed and trustworthy as browsing on a phone or computer. AI could help by giving better product summaries or personalized suggestions, but it must overcome people’s natural habit of wanting visual confirmation before buying.
The Path Forward
Let's consider the bull and bear case for Alexa’s future.
Bull Case:
• Massive user base: Millions of households already have Alexa devices, giving Amazon a built-in audience for any upgrades.
• Integration with everyday life: Alexa can control lights, TVs, music, and more, meaning it’s well-placed to become a “central hub” if new AI makes it more conversational.
• Amazon’s resources: Amazon has deep pockets, owns the e-commerce giant we all know, and also runs AWS—one of the biggest cloud providers—which could give Alexa the computing power it needs to compete in generative AI.
• Ongoing improvements: If new LLMs (Large Language Models) and outside partnerships successfully layer in advanced capabilities, Alexa could finally feel less “dumb” and more like a genuine personal assistant.
Bear Case:
• Voice limitations: Even if generative AI gets better, voice assistants face a tougher job than text-based apps. Real-time listening, immediate responses, and high user expectations for accuracy can limit how “intelligent” Alexa appears in everyday use.
• Lack of compelling new use cases: For many people, Alexa remains a way to set timers and play music. Advanced AI might not shift that habit if people still find voice awkward for complex tasks like online shopping or detailed Q&A.
• Competitive landscape: Google Assistant, Apple’s Siri, and third-party AI assistants are all racing to improve. OpenAI’s ChatGPT itself is also expanding into voice, potentially overshadowing Alexa’s enhancements.
• Profitability challenges: Amazon’s devices division has reportedly lost billions on Alexa over the years. Without a clear path to making money, future investments could slow or stall.
While Amazon is investing heavily in transplanting Alexa's brain, we don't know how long the surgery will take. And once Alexa has recovered from her brain transplant, will she really be up for those shopping trips, or will she prefer to stay at home and talk about the weather?