CatalogSignal Field Notes
What making our own site AI-ready taught us
We spend our days measuring whether AI shopping assistants can find, understand, trust, and recommend product catalogs. So when we rebuilt catalogsignal.com on a mainstream hosted website builder, we decided to hold our own site to the same standard we hold client catalogs to. It was a useful exercise in humility. A site can look finished to a human and still read as half-built to a machine, and ours was no exception. Here is what we hit, and what we changed.
Structured data that only exists after JavaScript runs is structured data an agent may never see. Plenty of platforms inject schema, forms, and even content on the client side, after the page loads. A person with a browser sees everything. A crawler or retrieval pipeline that reads the raw HTML sees a shell. We moved our important markup, the Organization, Service, FAQ, and Article schema, into the server-returned HTML so it is present on first read, with no JavaScript required. If an AI system has to execute your page to understand it, assume some of them will not.
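One way to check this on your own pages is to parse the raw HTML without running any scripts and list the schema types it contains. The following is a minimal sketch, not the tooling we used: it assumes the markup is JSON-LD inside script tags, and the names JsonLdCollector and schema_types_in_raw_html are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the text of each <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buffer = None  # a list while inside a JSON-LD script, else None

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append("".join(self._buffer))
            self._buffer = None


def schema_types_in_raw_html(html):
    """Return the set of @type values found in JSON-LD present in the raw HTML.

    Nothing is executed, so markup injected later by JavaScript is not seen.
    """
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    found = set()
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # malformed JSON-LD is treated as absent
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict):
                continue
            graph = item.get("@graph")
            nodes = graph if isinstance(graph, list) else [item]
            for node in nodes:
                if not isinstance(node, dict):
                    continue
                types = node.get("@type", [])
                if isinstance(types, str):
                    types = [types]
                if not isinstance(types, list):
                    types = []
                found.update(t for t in types if isinstance(t, str))
    return found
```

Feed it the HTML exactly as the server returns it, then subtract the result from the set of types you expect. Note that schema.org names its FAQ type FAQPage, so that is the string to look for.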
Your sitemap probably does not contain the pages you care about most. Hosted builders generate a sitemap for you, which sounds convenient until you notice what is missing. Ours listed the original pages and quietly omitted newer ones, including several we most wanted surfaced. The cause was mundane: the builder only enumerates pages it treats as part of the site's navigation structure, and our newer pages had been created outside it. The lesson is to check the generated sitemap against the list of URLs you actually want found, rather than assuming the platform got it right.
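That comparison is easy to script. The sketch below assumes a standard sitemaps.org urlset document and a hand-maintained list of the URLs you want found; the function names and the example.com URLs are placeholders, not our real addresses.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_in_sitemap(sitemap_xml):
    """Return the set of <loc> URLs listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    }


def missing_from_sitemap(sitemap_xml, wanted_urls):
    """Return the wanted URLs that the sitemap does not list, sorted."""
    listed = urls_in_sitemap(sitemap_xml)
    return sorted(set(wanted_urls) - listed)


# Illustrative input: a sitemap that lists two pages and omits a newer one.
sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc>https://example.com/about</loc></url>"
    "</urlset>"
)
wanted = ["https://example.com/", "https://example.com/notes"]
print(missing_from_sitemap(sample_sitemap, wanted))
```

The comparison is an exact string match, so a trailing slash or an http versus https mismatch will show up as a missing page. That is usually worth knowing anyway.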
A contact form a browser can use is not the same as a contact path an agent can use. Our form works perfectly for people. It is also injected by script, so a reader that does not run JavaScript sees no fields at all. We could not change how the platform renders the form, so we made sure a plain, crawlable contact path, a visible email address in server-rendered text, sits right beside it. The principle generalizes: every action you want a machine to be able to take should have a text-readable version.
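A rough way to see what a non-JavaScript reader gets is to count the form fields and email addresses present in the raw HTML. This is a sketch under simple assumptions: the email pattern is deliberately loose, and ContactPathScanner and scan_contact_paths are hypothetical names.

```python
import re
from html.parser import HTMLParser

# Loose pattern for spotting addresses in visible text; not a full validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class ContactPathScanner(HTMLParser):
    """Counts form fields and collects email addresses in server-returned HTML."""

    def __init__(self):
        super().__init__()
        self.form_fields = 0
        self.emails = set()
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in ("input", "textarea", "select"):
            self.form_fields += 1
        elif tag == "a":
            href = dict(attrs).get("href") or ""
            if href.lower().startswith("mailto:"):
                # Keep only the address, dropping any ?subject=... part.
                self.emails.add(href[len("mailto:"):].split("?")[0])

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Text inside scripts is not visible content, so ignore it.
        if not self._skip_depth:
            self.emails.update(EMAIL_RE.findall(data))


def scan_contact_paths(html):
    """Summarize the contact options a reader that runs no JavaScript would see."""
    scanner = ContactPathScanner()
    scanner.feed(html)
    scanner.close()
    return {"form_fields": scanner.form_fields, "emails": sorted(scanner.emails)}
```

A result with zero form fields and no email addresses means the page offers a script-free reader no way to reach you.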
Not every machine-readable endpoint is possible on every platform, so do not promise one you cannot serve. We wanted a clean plain-text brief at a predictable address for AI agents. At custom paths, however, our platform serves HTML pages, not raw text files. Rather than advertise a file that does not behave as promised, we kept a single honest, well-structured AI brief and described it accurately. Saying what is true matters more than checking a box, especially for a company whose whole category is machine readability.
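Before advertising any endpoint, it is worth confirming what the server actually returns at that address. The sketch below reads the Content-Type header with a HEAD request. The helper names are hypothetical, and some servers reject HEAD, in which case a GET would be needed instead.

```python
import urllib.request


def media_type(content_type_header):
    """Return the bare media type, e.g. 'text/html' from 'text/html; charset=UTF-8'."""
    return content_type_header.split(";", 1)[0].strip().lower()


def serves_plain_text(content_type_header):
    """True only when the header declares text/plain."""
    return media_type(content_type_header) == "text/plain"


def served_media_type(url, timeout=10):
    """Ask the server what it serves at url, without downloading the body."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return media_type(response.headers.get("Content-Type", ""))
```

If served_media_type returns text/html for an address you planned to describe as a plain-text file, describe it as an HTML page instead.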
Repeated chrome and duplicate URLs are quiet noise that dilutes your real content. When every page carries its own copy of the header and footer, and when a staging address serves the same pages as your primary domain, crawlers and extraction pipelines spend effort on boilerplate and can get confused about which URL is canonical. We leaned on canonical tags pointing at the primary domain, and kept the unique substance of each page as early and as clean as the platform allowed.
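The canonical signal can be checked from raw HTML in the same way. This sketch assumes you know your primary hostname and want every page, on any address, to declare a single canonical URL on that host. CanonicalFinder, canonical_status, and the example hostnames are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in the document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def canonical_status(html, primary_host):
    """Classify a page as 'ok', 'missing', 'conflicting', or 'wrong-host'."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    if not finder.canonicals:
        return "missing"
    if len(set(finder.canonicals)) > 1:
        return "conflicting"
    # A relative href has no hostname and is reported as 'wrong-host'.
    host = (urlsplit(finder.canonicals[0]).hostname or "").lower()
    return "ok" if host == primary_host.lower() else "wrong-host"
```

Running this against the same page fetched from both the staging address and the primary domain should return ok in both cases. Anything else means crawlers are getting mixed signals about which URL counts.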
None of this is an argument against hosted builders. They get a credible site live quickly, and most commerce teams, including the brands we work with, run on exactly these platforms. The point is narrower and more useful: the things that make a site easy for a person are not the things that make it legible to an agent, and the gap is invisible until you measure it. Server-rendered meaning, an honest and complete sitemap, text-readable actions, and clean canonical signals are not exotic. They are just easy to skip when the human preview looks fine.
That is the same gap we find in product catalogs every day. A page that reads beautifully to a shopper can still be missing the one attribute, the one parseable claim, the one machine-readable signal that decides whether an assistant puts it on the shortlist. Building our own site to the standard we measure for others was a good reminder that AI-readiness is something you check on purpose, not something you inherit from a platform.
Curious where your catalog stands? A Commerce Eligibility Index™ assessment shows exactly where AI assistants can, and cannot, find, trust, and recommend your products. Request one at catalogsignal.com.
About this note
These are practitioner notes from rebuilding catalogsignal.com. The readiness checks described here are the same ones the Commerce Eligibility Index applies to product catalogs.