Voice Commerce Is Back. And It Is About Your Q&A Schema, Not Your Skill.
Voice commerce failed in 2018 because Alexa skills were the wrong abstraction. In 2026 it is back, on different rails — and the winning surface is your Q&A schema, not a voice app.
Voice commerce had its first wave around 2017-2019. Alexa skills, Google Actions, every DTC brand building a "voice app" that almost no one ever used. The wave broke because the abstraction was wrong: shoppers do not want a brand-specific voice app. They want their existing assistant to know about your products.
In 2026 voice is back, on different rails. ChatGPT Voice, Claude Voice, Gemini Live and Siri's LLM-powered upgrade all give shoppers spoken access to commerce queries against your catalogue — without any brand-specific app. The winning surface is your Q&A schema, not a voice skill.
Why this wave will not break
Three differences from the 2018 wave.
- Voice now runs through general LLM assistants, not brand-specific apps. Zero adoption friction.
- Latency dropped below the conversational threshold: a 2026 voice interaction runs sub-300 ms end-to-end, versus 1.5-2 seconds in 2018.
- Multimodal context. The assistant knows the user is looking at their phone, can see what is on screen, and can blend visual + voice in the same query.
The net effect is a category of interactions that did not exist in 2018: a shopper standing in front of a wardrobe asks their phone "what should I wear for a 12-degree run later", gets a recommendation, and orders the missing piece — all spoken, in roughly 40 seconds.
Where the data the assistant reads lives
Voice assistants do not "have" your product catalogue. They retrieve it on demand from the web, the same way text-based AI agents do. The retrieval favours:
- Q&A schema (QAPage) — read aloud almost verbatim.
- FAQ schema (FAQPage) — read aloud with light paraphrasing.
- Product attribute tables — referenced for specific spec questions.
- AI-summarised review blocks — quoted to convey "what shoppers say".
- Aggregate ratings — read with the count and average.
Notice what is not on this list: long-form product descriptions, marketing copy, video descriptions, hero imagery. Voice strips everything visual and pulls the structured, machine-readable text. If your PDP is 80% imagery and 20% structured data, your voice presence is small.
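To make the contrast concrete, here is what the machine-readable layer of a PDP can look like: a minimal FAQPage JSON-LD sketch, built in Python so it could be templated from catalogue data. The question and answer text are illustrative, not from a real store.

```python
import json

# Minimal FAQPage JSON-LD sketch. Field values are illustrative
# placeholders, not real catalogue data.
faq_jsonld = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "What is the return window?",
            "acceptedAnswer": {
                "@type": "Answer",
                # Front-loaded, numeral-first answer: voice assistants
                # tend to read only the first sentence aloud.
                "text": "14 days from delivery. Items must be unworn "
                        "with tags attached.",
            },
        }
    ],
}

print(json.dumps(faq_jsonld, indent=2))
```

The same shape works for QAPage; the structural point is that every answer the assistant might speak lives in a plain `text` field rather than in markup or imagery.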
Optimising for voice
Make every answer voice-friendly
Read each FAQ answer aloud. If it sounds like a marketing brochure when spoken, rewrite. If it contains acronyms a voice assistant would mispronounce ("EPDM", "GSM"), expand them. If it is longer than 30 spoken seconds, split it.
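A rough lint pass can catch the length and acronym problems before publication. A sketch, assuming roughly 150 spoken words per minute and treating any all-caps token of two or more letters as a potential acronym; the thresholds and the `KNOWN_EXPANSIONS` table are assumptions, not a standard:

```python
import re

# Assumed reading rate: ~150 spoken words per minute.
WORDS_PER_SECOND = 150 / 60
# 30-second ceiling per answer, per the guideline above.
MAX_SPOKEN_SECONDS = 30
# Expansions for acronyms a TTS engine may mispronounce.
KNOWN_EXPANSIONS = {
    "EPDM": "ethylene propylene diene monomer",
    "GSM": "grams per square metre",
}

def check_answer(text: str) -> list[str]:
    """Return a list of voice-friendliness issues for one FAQ answer."""
    issues = []
    spoken_seconds = len(text.split()) / WORDS_PER_SECOND
    if spoken_seconds > MAX_SPOKEN_SECONDS:
        issues.append(f"too long: ~{spoken_seconds:.0f}s spoken; split it")
    # Flag all-caps tokens of 2+ letters as acronyms to expand.
    for acronym in sorted(set(re.findall(r"\b[A-Z]{2,}\b", text))):
        hint = KNOWN_EXPANSIONS.get(acronym, "expand on first use")
        issues.append(f"acronym {acronym}: {hint}")
    return issues

print(check_answer("The upper is 300 GSM mesh."))
# → ['acronym GSM: grams per square metre']
```

Run it over every `acceptedAnswer.text` in the schema export; an empty list per answer is the target.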
Front-load the answer
Voice assistants quote the first 1-2 sentences and skip the rest. Get the answer in the first sentence. Reserve elaboration for sentences 2-3.
Use numerals, not spelled-out numbers
"14 days" beats "fourteen days": numerals are unambiguous, scan well when quoted in answer snippets, and modern TTS engines expand them correctly when read aloud.
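A sketch of that normalisation, using a deliberately small word list. A production pass would lean on a number-parsing library rather than this hand-rolled mapping:

```python
import re

# Deliberately small mapping for illustration; a real pass would use
# a number-parsing library to cover compounds like "twenty-one".
NUMBER_WORDS = {
    "one": "1", "two": "2", "three": "3", "four": "4", "five": "5",
    "ten": "10", "fourteen": "14", "thirty": "30",
}

def numeralise(text: str) -> str:
    """Replace spelled-out numbers from NUMBER_WORDS with numerals."""
    def repl(match: re.Match) -> str:
        return NUMBER_WORDS[match.group(0).lower()]
    pattern = r"\b(" + "|".join(NUMBER_WORDS) + r")\b"
    return re.sub(pattern, repl, text, flags=re.IGNORECASE)

print(numeralise("Returns accepted within fourteen days."))
# → Returns accepted within 14 days.
```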
Pronunciation overrides
If your brand name or a key product name has an unusual pronunciation, ship a pronunciation hint via the text-to-speech x-pronunciation header. Idukki, for example, would emit a hint that the stress falls on the second syllable.
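A sketch of emitting such a hint. The x-pronunciation header name comes from the description above, but its value format here is hypothetical, as is the IPA string for Idukki; verify what the assistants you target actually consume (SSML `<phoneme>` tags are the common alternative for TTS engines) before shipping:

```python
# Hypothetical header format: "<name>; ipa=\"<IPA transcription>\"".
# Neither the format nor the IPA value below is from a published spec.
def pronunciation_headers(name: str, ipa: str) -> dict[str, str]:
    """Build response headers carrying a TTS pronunciation hint."""
    return {"x-pronunciation": f'{name}; ipa="{ipa}"'}

# Example: stress on the second syllable of "Idukki".
print(pronunciation_headers("Idukki", "ɪˈdʊki"))
```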
What not to build
Three things teams get tempted to build that they should skip:
- A brand-specific voice app. Same reason it failed in 2018; nobody installs it.
- A custom wake-word. Adoption is a non-starter; users say "Hey ChatGPT" or "Hey Siri", not "Hey BrandName".
- An on-device voice assistant. Heavy engineering for a feature the OS-level assistants do better.
Measurable impact
Voice referrals are currently 1-3% of AI-engine referrals on mid-market stores — small, but growing 30-40% quarter-on-quarter through 2026. Brands that invest in voice-friendly Q&A schema now are seeing voice-driven sessions convert at 1.5-2x the rate of typed-AI sessions, presumably because the shopper has already verbally committed by the time they reach the PDP.
Closing
Voice commerce in 2026 is not a category to staff a team against. It is a property of well-structured Q&A and FAQ content. If you have already done the AEO work on schema, you are 80% of the way to a voice-friendly catalogue. The remaining 20% is half a week of editorial polish.
Related reading
The Future Trends of Conversational Commerce
Ten trends shaping conversational commerce through 2028: agent personalities, multimodal storefronts, voice-first cart, agentic loyalty, and the disappearance of the checkout button. From an Idukki product perspective.
Anatomy of a Conversational PDP: The 9 Components Every Shop Needs by Q3
The reference architecture for a conversational product page in 2026. Nine components, what each one does, and the order to ship them in.
Multilingual UGC at Scale: Why Translation Kills Conversion, and What to Do Instead
Auto-translating reviews into the buyer's language is the obvious move. It is also wrong. Here is the data, the alternative model, and the rollout playbook for ten-locale stores.