CatalogSignal White Paper

The AI Shelf

How agentic shopping is rewriting product discovery, and what commerce leaders must do now

CatalogSignal White Paper · For CMOs, CDOs, and Heads of eCommerce · June 2026

Executive summary

The front door to commerce is moving from the search-results page to the AI assistant. Shoppers are increasingly asking ChatGPT, Google's AI Mode, Perplexity, and Amazon's Rufus to find, compare, and recommend products, and those assistants are sending real, high-intent traffic to retailers. During the 2025 holiday season, referrals from generative-AI platforms to U.S. retail sites surged 693% year over year, and that traffic converted 31% more than other traffic sources (Adobe Analytics, reported by Digital Commerce 360, January 2026).

But there is a catch most commerce teams have not yet priced in: being mentioned by an AI is not the same as being represented accurately by it. In a CatalogSignal CEI Benchmark of 100 brands across 10 verticals and four AI providers (more than 100,000 AI shopping queries), mean funnel accuracy was 47.7%, mean hallucination was 9.3%, and mean commercial harm was 10.6%. The pattern was not simple familiarity: some high-familiarity brands still fell into hallucination-risk positions, while some cleaner brands were under-discovered.

This paper makes a straightforward argument for commerce leaders. Discovery is being intermediated by machines that read your catalog very differently than a human does, the revenue at stake is already material, and the brands that win will be the ones who make their product data legible, trustworthy, and recommendable to AI, deliberately and soon.

1. The shelf has moved

For two decades, the job was to rank on the search-results page and win the click. That job is changing. A large share of U.S. shoppers now use generative AI somewhere in the buying journey: 61% of consumers say they have used generative-AI tools like ChatGPT for online shopping (Capital One Shopping, 2026). And the appetite is growing: 71% of consumers want generative AI integrated into their shopping experiences, and roughly three-quarters are open to GenAI product recommendations, up from 63% in 2023 (Capgemini, January 2025).

The money is following the behavior. Adobe's analysis of the 2025 U.S. holiday season found that generative-AI referral traffic to retail sites rose 693% year over year, and those visitors were better customers: revenue per visit from AI referrals grew 254% year over year, shoppers spent 45% more time on site, and AI referrals converted 31% more than other traffic sources (Adobe Analytics, via Digital Commerce 360, January 2026). The growth continued into early 2026: Adobe reported traffic from AI sources to U.S. retail sites up 393% year over year in Q1 (Adobe, April 2026).

Figure 1. The old path versus the new path: shopper to search to website to product, versus shopper to AI agent to recommendation.

Figure 2. AI-driven retail traffic growth (693% over the 2025 holiday, 393% in Q1 2026) alongside the 31% conversion lift. Source: Adobe.

And this is before the agents arrive in force. OpenAI and Stripe co-developed the Agentic Commerce Protocol and open-sourced it in 2025, and in January 2026 Google, Shopify, and partners launched the Universal Commerce Protocol (Fast Company, 2026). Standards and integrations are emerging to make real-time product data, checkout, scheduling, and returns agent-compatible. The shopper is increasingly represented by an agent acting on their behalf, and that agent does not browse your beautifully designed product detail page (PDP). It parses your data.

2. Why this is a different game

Search engines reward pages built for humans: keywords, backlinks, page experience. AI assistants do something else. They assemble a picture of your product from structured data, descriptions, reviews, feeds, and the broader web, then decide, in a single conversational turn, whether to put your product on a very short list. There is no page two. There is often no page one. There is a recommendation, or there is silence.

That shift turns an old assumption on its head. Being online is no longer the same as being eligible to be recommended. A catalog can be fully indexed, perfectly merchandised for people, and still be effectively invisible to an assistant because the machine cannot confirm the product meets a shopper's stated constraint: the right size, the right material, the right use case, under the right price.
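To make the eligibility point concrete, here is a minimal sketch of a constraint check that separates "does not match" from "cannot be confirmed." The field names, the sample products, and the `meets_constraints` helper are illustrative assumptions, not a description of how any particular assistant or feed works.

```python
from typing import Optional

# Hypothetical product records; field names are illustrative, not a real feed schema.
def meets_constraints(product: dict, max_price: float, required: dict) -> Optional[bool]:
    """Return True or False when the data settles the question, None when it cannot."""
    price = product.get("price")
    if price is None:
        return None  # no price in the data: the constraint cannot be confirmed
    if price > max_price:
        return False
    unknown = False
    for attr, wanted in required.items():
        value = product.get(attr)
        if value is None:
            unknown = True  # attribute missing: neither a match nor a mismatch
        elif value != wanted:
            return False
    return None if unknown else True

complete = {"name": "Trail Boot A", "price": 139.0, "waterproof": True, "fit": "wide"}
thin = {"name": "Trail Boot B", "price": 129.0}  # indexed, but attributes are missing

query = {"waterproof": True, "fit": "wide"}
print(meets_constraints(complete, 150.0, query))  # True
print(meets_constraints(thin, 150.0, query))      # None
```

The second product may well be waterproof and wide-fitting, but nothing in its data says so. Under this sketch it is not rejected; it simply cannot be confirmed, which for a short recommendation list amounts to the same outcome.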

Adobe named the gap directly. AI traffic is surging, but many U.S. retail sites are not entirely readable by machines, with product pages averaging just 66% readability in Adobe's AI visibility benchmark (Adobe, April 2026). The demand is arriving faster than catalogs are getting ready for it.

Is this just SEO for AI? It is tempting to file this under search, but the mechanics differ. Search engines rank pages; assistants recommend products, and they do it by interrogating data, not prose. What moves an assistant is not keywords or backlinks but attribute completeness, claim consistency, and whether a product can satisfy a stated constraint, measured across the whole catalog. That is the layer traditional SEO never instrumented. Treat this as the discipline that begins where technical SEO ends, running alongside your search program rather than instead of it.

3. The risk leaders are not pricing in: recognition is not accuracy

Here is the finding that should reframe how commerce leaders think about AI readiness. In CatalogSignal's June 2026 CEI Benchmark, 100 brands were assessed across 10 verticals and four AI providers, drawing on more than 100,000 AI shopping queries. The results were sobering and, importantly, uneven:

Mean funnel accuracy across the measured panel was 47.7%.
Mean hallucination was 9.3% and mean commercial harm was 10.6%.
The accuracy gap between assistants was large: provider-level mean accuracy ranged from 12.5% at the low end to 53.2% at the high end. AI readiness is neither uniform nor self-correcting.

(These are directional metrics from a benchmark panel, not shopper-level probabilities, and the panel is not a census of the retail sector. We report patterns, not guarantees, and we do not publish individual brands' scores.)

The most counterintuitive pattern: high familiarity did not automatically mean high quality. Some high-familiarity brands landed in hallucination-risk positions, while some accurately represented brands were under-discovered. Teams that treat "we get mentioned a lot" as proof of AI readiness are very likely misreading their exposure.

Figure 3. A familiarity-versus-accuracy matrix: the high-mention, low-accuracy "hallucination risk" quadrant and the high-mention, high-accuracy "healthy" quadrant. CatalogSignal CEI Benchmark, June 2026, directional.

Figure 4. Panel snapshot: 47.7% mean funnel accuracy, 9.3% hallucination, 10.6% commercial harm, provider accuracy range 12.5% to 53.2%. CatalogSignal CEI Benchmark, June 2026, directional.

Why does this matter commercially? Because trust is conditional and shoppers verify. Yext's 2026 consumer-search research found that even among high-trust AI users (those who rate their trust in AI recommendations 4 or 5 out of 5), more than 90% take at least one verification step before acting (Yext, 2026). And willingness to trust is fragile: 75% of Americans say they would trust an AI agent less if its recommendations were swayed by brand dollars (Quad / Harris Poll, April 2026). On the AI shelf, the brand the assistant can describe correctly and verifiably wins, not the brand that shouts loudest.

4. What's at stake

Put the two halves together, surging and high-converting AI traffic on one side and a meaningful accuracy and eligibility gap on the other, and the exposure is straightforward. Every shopping query an assistant answers about your category is a moment where your product is on the shortlist, misrepresented, or absent. Assistants are more likely to recommend products whose claims and constraints they can verify; when the underlying data is weak, they can misread your product, omit it, or substitute a competitor's product they can confirm.

This is not a far-future problem. Online holiday sales hit a record $257.8 billion in 2025 (Digital Commerce 360, 2026), and AI accounts for a growing slice of the highest-intent traffic. A brand that is invisible or misrepresented inside AI shopping is cut off from a fast-growing share of revenue. And unlike a ranking dip, the brand often never sees the lost sale, because the shopper never lands on the site.

Figure 5. Revenue moving upstream: the decision happening inside the assistant, before the click.

5. What commerce leaders must do now

The good news: this is a data-readiness problem, and data readiness is measurable and fixable. The brands getting ahead are treating AI legibility as a deliberate program, organized around five plain-English questions an assistant implicitly asks before it recommends a product:

Can it read my catalog? Is the product data complete, consistently named, and machine-readable, or thin and ambiguous?
Can it tell my products apart? Do descriptions differentiate, or do they blur into boilerplate the assistant cannot choose between?
Can it find the right product? When a shopper asks for something specific, like "waterproof, wide fit, under $150," does the right item surface and honor the constraint?
Can it trust the claims? Are the claims in your catalog internally consistent and backed by evidence, so the assistant can assert them confidently?
Does the wider internet vouch for the brand? Do reviews, mentions, and authoritative sources reinforce what your catalog says?
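As a rough illustration of the first and fourth questions, the sketch below scores attribute completeness for one product and flags a claim that contradicts the structured data. The required-attribute list, the field names, and both helper functions are assumptions made for the example; this is not the CEI scoring method.

```python
# Assumed list of attributes a footwear catalog might require; adjust per category.
REQUIRED_ATTRIBUTES = ["title", "price", "material", "fit", "waterproof"]

def completeness(product: dict) -> float:
    """Share of required attributes that are present and non-empty."""
    filled = sum(1 for a in REQUIRED_ATTRIBUTES if product.get(a) not in (None, ""))
    return filled / len(REQUIRED_ATTRIBUTES)

def claim_conflicts(product: dict) -> list:
    """Flag boolean attributes set to False that the description still asserts."""
    text = product.get("description", "").lower()
    conflicts = []
    for attr in ("waterproof",):  # illustrative; extend with other boolean claims
        if product.get(attr) is False and attr in text and f"not {attr}" not in text:
            conflicts.append(attr)
    return conflicts

product = {
    "title": "Trail Boot",
    "price": 139.0,
    "material": "",          # present but empty, so it counts as missing
    "waterproof": False,     # structured data says one thing...
    "description": "A fully waterproof boot for wet trails.",  # ...the copy says another
}

print(completeness(product))     # 0.6
print(claim_conflicts(product))  # ['waterproof']
```

A simple substring match like this would miss paraphrased claims; the point is only that completeness and consistency can be checked per product and tracked over time.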

None of this requires re-platforming or handing over IT access. It requires measuring how AI actually reads your catalog today, finding the specific gaps, and fixing them in priority order, then keeping them fixed as products change.

That is precisely what the Commerce Eligibility Index™ (CEI) is built to do. CEI Diagnose™ scores whether AI shopping systems can find, understand, trust, and recommend your products on a 0 to 100 scale, backed by product-level evidence. CEI Activate™ turns the findings into ready-to-apply fixes, and CEI Protect™ gates new and updated products so readiness does not drift backward. Measure the gap, close the gap, keep it closed.

The shelf has moved. The shoppers have moved with it. The only open question for commerce leaders is whether your catalog is ready to be recommended before your competitor's is.

See how AI reads your own catalog. A baseline CEI assessment turns AI catalog readiness into an executive scorecard, product-level evidence, and a prioritized fix queue. Request a CEI assessment at catalogsignal.com.

Sources

Capital One Shopping. AI Shopping Statistics (2026 Report). https://capitaloneshopping.com/research/ai-shopping-statistics/
Capgemini. 71% of consumers want generative AI integrated into their shopping experiences (January 9, 2025). https://www.capgemini.com/news/press-releases/71-of-consumers-want-generative-ai-integrated-into-their-shopping-experiences/
Digital Commerce 360. Generative AI shifts online holiday shopping traffic in 2025 (January 2026). https://www.digitalcommerce360.com/2026/01/13/generative-ai-online-holiday-shopping-traffic-2025/
Adobe. AI traffic grows but retail sites lag in AI search visibility (April 16, 2026). https://business.adobe.com/blog/ai-traffic-surge-retail-sites-not-machine-readable
Fast Company. Shop 'til you bot: Google, OpenAI, and the race to build agentic commerce (2026). https://www.fastcompany.com/91533534/shop-til-you-bot-google-openai-and-the-race-to-build-agentic-commerce
Yext. 7 Data-Backed Facts on AI Trust and Consumer Decision-Making in 2026. https://www.yext.com/blog/7-data-backed-facts-on-ai-trust-and-consumer-decision-making-in-2026
Quad / Harris Poll. Americans would lose trust in AI shopping if results were sponsored (April 13, 2026). https://www.quad.com/newsroom/americans-say-they-would-lose-trust-in-ai-shopping-if-results-were-sponsored
CatalogSignal. Commerce Eligibility Index Benchmark, June 2026 (100 brands, 10 verticals, four AI providers, 100,000+ queries; figures directional).
