We Replaced Shopify Search With AI — Here’s What Actually Happened

How we built a custom MCP + OpenSearch product discovery system—and what real customers are already doing with it.

The Moment Shopify Search Stopped Feeling Like Enough

There were two moments that pushed us to build this system.

The first was at Shopify Editions.dev.

Shopify showcased an early version of their own AI-powered product discovery. It worked—but it felt limited. It mostly relied on product titles and descriptions, with little ability to leverage structured data like metafields. If your product attributes weren’t written into the description, they were effectively invisible.

It also didn’t incorporate broader site content—policies, blog posts, educational material—and there was very little control over how the system actually behaved.

The second moment was more subtle, but more important.

While using ChatGPT for product research, we noticed something shift:
AI wasn’t just answering questions—it was recommending products, summarizing options, and guiding decisions.

That’s when it clicked:

This is how people are going to shop.

Not by filtering. Not by browsing.
By describing what they want—and letting the system figure it out.

The Problem With Traditional Ecommerce Search

Shopify’s native search and filtering works fine—if the user already knows what they’re looking for.

But real customers don’t think in filters like:

  • “6-seat aluminum dining set”

  • “sunbrella fabric”

  • “freestanding umbrella”

They think in terms of:

  • “Something that can handle Arizona sun”

  • “A set that fits a small patio”

  • “Something comfortable and modern”

That gap—between how users think and how search works—is where traditional ecommerce breaks down.

And it gets worse when:

  • Products have nuanced differences

  • Customers need education before buying

  • Important information lives outside product pages

At that point, search isn’t the problem.

The interaction model itself is.

What We Built Instead

We built a conversational product discovery system that sits on top of Shopify—but doesn’t rely on Shopify search.

At a high level, the system is:

A controlled loop between the user, the AI, and a structured retrieval layer

Here’s how it works.

How the System Works (Simplified)

  1. User opens chat and asks a question
    Anything from product discovery to shipping to materials.

  2. Request is sent to OpenAI (Responses API)
    Along with:

    • A structured system prompt

    • A set of MCP tools (search + fetch across catalog, pages, policies)

  3. AI classifies the user’s intent
    We force the model to interpret each query as one of several “turn types”:

    • Product discovery

    • Product selection

    • Educational

    • Policy / logistics

  4. AI decides whether to use a tool
    Based on intent, it queries:

    • Product index

    • Blog / education index

    • Policy index

  5. MCP routes the request to our backend
    Built in Next.js, backed by OpenSearch

  6. We shape the data before it reaches the AI
    This is critical:

    • Strip unnecessary fields

    • Keep only reasoning-relevant content

    • Cache full data separately for UI rendering

  7. AI generates a structured response (JSON)
    Including:

    • Product recommendations

    • Supporting reasoning

    • UI-friendly data

  8. UI renders the result
    Chat + product cards + structured output
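
A minimal sketch of how steps 3 and 4 could be pictured in TypeScript. The turn types come from the list above; the index names and the routing table are assumptions for illustration. In the system described here the model makes the tool choice itself, so this table only makes the intended mapping explicit.

```typescript
// Turn types from the list above; the string labels are hypothetical.
type TurnType =
  | "product_discovery"
  | "product_selection"
  | "educational"
  | "policy_logistics";

// Hypothetical index names standing in for the three indexes in step 4.
type SearchIndex = "products" | "education" | "policies";

// Which index each turn type is expected to query.
const INDEXES_BY_TURN: Record<TurnType, SearchIndex[]> = {
  product_discovery: ["products"],
  product_selection: ["products"],
  educational: ["education"],
  policy_logistics: ["policies"],
};

// Runtime check, because the label arrives as an arbitrary string from the model.
function isTurnType(value: string): value is TurnType {
  return Object.prototype.hasOwnProperty.call(INDEXES_BY_TURN, value);
}

// Unknown labels get no retrieval at all rather than a best guess.
function indexesForTurn(rawTurnType: string): SearchIndex[] {
  return isTurnType(rawTurnType) ? INDEXES_BY_TURN[rawTurnType] : [];
}
```

Returning an empty list for an unrecognized label keeps a misclassified turn from triggering a search the prompt never asked for.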

Why We Didn’t Use Shopify Search, Algolia, or a Vector DB

We didn’t build this because existing tools are bad.

We built it because they solve the wrong problem.

1. Shopify Search Was Too Constrained

  • Limited use of metafields

  • Minimal control over ranking and retrieval

  • No ability to shape outputs for AI

If the data isn’t in the description, it effectively doesn’t exist.

2. We Needed Full Control Over the Index

With OpenSearch, we control:

  • What gets indexed

  • How it’s structured

  • How it’s retrieved

We can include:

  • Product attributes

  • Metafields

  • Blog content

  • Policies

We stopped thinking about search as “finding products” and started thinking about it as:

Retrieving context for an AI system
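
To make "retrieving context" concrete, here is a sketch of a function that builds an OpenSearch request body. The `bool`, `multi_match`, `term`, and `_source` keys are standard query DSL; every field name (`title`, `summary`, `metafields.*`) and the boost values are assumptions, since the article does not show the index mapping.

```typescript
interface ProductQueryOptions {
  text: string; // the customer's wording, passed through by the model
  filters?: Record<string, string | number | boolean>; // exact-match attributes
  size?: number; // how many hits to return
}

// Builds the JSON body for a search request against a hypothetical product index.
function buildProductQuery(opts: ProductQueryOptions) {
  const filters = opts.filters ?? {};
  // term filters need keyword or numeric mappings on the target fields.
  const filterClauses = Object.keys(filters).map((field) => ({
    term: { [field]: filters[field] },
  }));

  return {
    size: opts.size ?? 5,
    // Return only the fields the model needs to reason with.
    _source: ["id", "title", "summary", "metafields"],
    query: {
      bool: {
        must: [
          {
            multi_match: {
              query: opts.text,
              // Metafields are searched alongside title and summary.
              fields: ["title^3", "summary", "metafields.material^2", "metafields.features"],
            },
          },
        ],
        filter: filterClauses,
      },
    },
  };
}
```

Searching metafields next to the title and summary is what makes attributes visible even when they never appear in the product description.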

3. Token Efficiency Changes Everything

This is the part most people miss.

Search results are no longer just for users—they’re inputs to an LLM.

That means:

  • Too much data = slow + expensive

  • Poor structure = bad answers

  • Uncontrolled output = token bloat

So we had to design for:

  • Minimal inputs

  • Structured outputs

  • Tight control over what the AI sees

The system isn’t just about retrieval—it’s about deciding what the AI is allowed to see.
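
As an illustration of "deciding what the AI is allowed to see", the sketch below projects an indexed product down to a handful of fields before it is handed to the model. The record shape and the choice of which fields count as reasoning-relevant are assumptions, not the actual data model.

```typescript
// Hypothetical shape of a product as stored in the index.
interface IndexedProduct {
  id: string;
  title: string;
  summary: string;
  price: number;
  attributes: Record<string, string>;
  imageUrls: string[];
  bodyHtml: string;
  seo: Record<string, string>;
}

// The subset the model is allowed to see.
type ModelProduct = Pick<IndexedProduct, "id" | "title" | "summary" | "price" | "attributes">;

// Copy fields explicitly so a newly added index field never leaks into the prompt by default.
function shapeForModel(product: IndexedProduct): ModelProduct {
  return {
    id: product.id,
    title: product.title,
    summary: product.summary,
    price: product.price,
    attributes: product.attributes,
  };
}

// Serialized size in characters, a rough proxy for how much prompt space a record takes.
function serializedSize(value: unknown): number {
  return JSON.stringify(value).length;
}
```

Comparing `serializedSize(product)` with `serializedSize(shapeForModel(product))` on real records gives a quick sense of how much each search result costs before and after shaping.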

4. MCP Gave Us Control Over Behavior

MCP isn’t just a connector—it’s a control layer.

It lets us:

  • Define tools

  • Control when they’re used

  • Shape inputs and outputs

  • Enforce boundaries

The hardest part wasn’t getting AI to answer questions—it was controlling what it sees, when it sees it, and how it responds.
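
A sketch of what "define tools" and "enforce boundaries" can look like in practice. The tool object uses the `name`, `description`, and `inputSchema` fields from the MCP tool definition format; the tool name matches the search tool mentioned earlier, while the index names, the result cap, and the `parseSearchArgs` helper are assumptions. The point is that the server re-validates arguments instead of trusting what the model sends.

```typescript
const MAX_RESULTS = 5; // hypothetical hard cap on hits per call
const ALLOWED_INDEXES = ["products", "education", "policies"] as const;
type AllowedIndex = (typeof ALLOWED_INDEXES)[number];

// Tool definition as advertised to the model.
const searchTool = {
  name: "search",
  description: "Search the store's products, educational content, or policies.",
  inputSchema: {
    type: "object",
    properties: {
      index: { type: "string", enum: [...ALLOWED_INDEXES] },
      query: { type: "string" },
      limit: { type: "integer", minimum: 1, maximum: MAX_RESULTS },
    },
    required: ["index", "query"],
  },
};

interface SearchArgs {
  index: AllowedIndex;
  query: string;
  limit: number;
}

// Server-side validation: reject unknown indexes, clamp the limit.
function parseSearchArgs(raw: unknown): SearchArgs {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("arguments must be an object");
  }
  const args = raw as Record<string, unknown>;

  const index = ALLOWED_INDEXES.find((candidate) => candidate === args.index);
  if (index === undefined) {
    throw new Error(`index not allowed: ${String(args.index)}`);
  }
  if (typeof args.query !== "string" || args.query.trim() === "") {
    throw new Error("query must be a non-empty string");
  }
  const requested =
    typeof args.limit === "number" && Number.isInteger(args.limit) ? args.limit : MAX_RESULTS;
  const limit = Math.min(Math.max(requested, 1), MAX_RESULTS);

  return { index, query: args.query.trim(), limit };
}
```

Clamping the limit on the server means a model that asks for 50 results still receives at most `MAX_RESULTS`, whatever the schema says.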

What Didn’t Work (And Why It Matters)

This system didn’t work on the first try. Or the second.

Some of the biggest lessons:

🚫 Unconstrained AI = Wrong Answers + Higher Cost

Without guardrails:

  • The AI answers off-topic questions

  • It pulls from general knowledge

  • It becomes expensive quickly

We had to explicitly force:

  • Domain-specific behavior

  • Context-only answers
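
One way these two constraints could be expressed is in the request itself. The sketch below only builds a request body and does not call any API. The `instructions`, `input`, `max_output_tokens`, and `mcp` tool fields follow the Responses API request shape as understood here and should be checked against the current reference; the prompt wording, server label, URL, and model name are placeholders, not the production values.

```typescript
interface GuardrailConfig {
  storeName: string;
  model: string; // placeholder; use whichever model the deployment targets
  mcpServerUrl: string; // hypothetical URL of the MCP endpoint
  maxOutputTokens: number; // hard ceiling on response length
}

// Hypothetical system prompt enforcing domain-specific, context-only behavior.
function buildSystemPrompt(storeName: string): string {
  return [
    `You are the product assistant for ${storeName}.`,
    "Answer only questions about this store's products, educational content, and policies.",
    "Use only facts returned by the search and fetch tools; do not rely on general knowledge.",
    "If the tools return nothing relevant, say you do not have that information.",
    "Politely decline anything unrelated to the store.",
  ].join("\n");
}

// Builds the request body with the guardrails baked in.
function buildGuardedRequest(config: GuardrailConfig, userMessage: string) {
  return {
    model: config.model,
    instructions: buildSystemPrompt(config.storeName),
    input: userMessage,
    max_output_tokens: config.maxOutputTokens,
    tools: [
      {
        type: "mcp",
        server_label: "store_search",
        server_url: config.mcpServerUrl,
        // Only the tools the prompt mentions are exposed to the model.
        allowed_tools: ["search", "fetch"],
        require_approval: "never",
      },
    ],
  };
}
```

The prompt handles scope, while the tool allow-list and the output cap bound what a single turn can cost.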

🧾 Freeform Responses Are Useless for UI

Letting the AI respond in plain text:

  • Breaks consistency

  • Produces output that can’t be reliably rendered

  • Creates unpredictable outputs

We moved to strict JSON schemas:

  • Product IDs

  • Structured recommendations

  • Controlled prose

AI responses aren’t just for humans—they’re inputs to your UI.
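
A sketch of what a strict reply contract might look like: a JSON Schema describing the reply, plus a runtime check before anything reaches the UI. The field names (`turnType`, `message`, `recommendations`, `productId`, `reason`) are assumptions; the article lists only product IDs, structured recommendations, and controlled prose.

```typescript
// Hypothetical schema for the assistant's reply; every field is required.
const replySchema = {
  type: "object",
  additionalProperties: false,
  required: ["turnType", "message", "recommendations"],
  properties: {
    turnType: { type: "string" },
    message: { type: "string" }, // the controlled prose shown in the chat
    recommendations: {
      type: "array",
      items: {
        type: "object",
        additionalProperties: false,
        required: ["productId", "reason"],
        properties: {
          productId: { type: "string" },
          reason: { type: "string" },
        },
      },
    },
  },
} as const;

interface Recommendation {
  productId: string;
  reason: string;
}

interface AssistantReply {
  turnType: string;
  message: string;
  recommendations: Recommendation[];
}

// Parses and validates the raw model output; throws instead of rendering a malformed reply.
function parseAssistantReply(rawJson: string): AssistantReply {
  const data: unknown = JSON.parse(rawJson);
  if (typeof data !== "object" || data === null) {
    throw new Error("reply must be a JSON object");
  }
  const obj = data as Record<string, unknown>;
  if (typeof obj.turnType !== "string") throw new Error("turnType must be a string");
  if (typeof obj.message !== "string") throw new Error("message must be a string");
  if (!Array.isArray(obj.recommendations)) throw new Error("recommendations must be an array");

  const recommendations = obj.recommendations.map((item: unknown, i: number) => {
    const rec = (typeof item === "object" && item !== null ? item : {}) as Record<string, unknown>;
    const { productId, reason } = rec;
    if (typeof productId !== "string" || typeof reason !== "string") {
      throw new Error(`recommendations[${i}] is malformed`);
    }
    return { productId, reason };
  });

  return { turnType: obj.turnType, message: obj.message, recommendations };
}
```

Validating at runtime, even when the schema is also passed to the model, keeps a malformed reply from reaching the product-card renderer.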

🎯 “Correct” Isn’t Enough—It Has to Be Predictable

Even when answers were right:

  • Formatting varied

  • Data was inconsistent

  • Results were hard to map to real products

We had to enforce:

  • Structured outputs

  • Clear schemas

  • Tight alignment with our data model
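
One simple way to enforce alignment with the data model is to keep only recommendations whose IDs were actually returned by retrieval in the same turn. The sketch below assumes each recommendation carries a `productId` and that the set of retrieved IDs is available server-side; both names are hypothetical.

```typescript
interface Rec {
  productId: string;
  reason: string;
}

interface AlignmentResult {
  kept: Rec[]; // safe to render as product cards
  dropped: string[]; // IDs the model produced that retrieval never returned
}

// Keeps recommendations that map to retrieved products, in the model's order, without duplicates.
function alignWithCatalog(recs: Rec[], retrievedIds: Set<string>): AlignmentResult {
  const kept: Rec[] = [];
  const dropped: string[] = [];
  const seen = new Set<string>();

  for (const rec of recs) {
    if (!retrievedIds.has(rec.productId)) {
      dropped.push(rec.productId); // unknown or invented ID
      continue;
    }
    if (seen.has(rec.productId)) continue; // same product recommended twice
    seen.add(rec.productId);
    kept.push(rec);
  }
  return { kept, dropped };
}
```

Logging the `dropped` list is a cheap signal for how often the model drifts away from the retrieved context.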

💸 Token Management Is a First-Class Problem

With large catalogs:

  • Input tokens explode

  • Output tokens vary

  • Costs and latency spike

We solved this by:

  • Stripping unnecessary fields

  • Sending only reasoning-relevant data

  • Separating AI data from UI data
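
Beyond stripping fields, a hard budget on how many results reach the model keeps input size bounded as the catalog grows. The sketch below is an assumption, not the production logic: it estimates tokens at roughly four characters each (a crude heuristic, not a real tokenizer) and keeps ranked hits until the budget is spent.

```typescript
// Crude estimate: about four characters per token. Swap in a real tokenizer for accuracy.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Takes hits in ranked order and stops at the first one that would exceed the budget.
function fitToBudget<T>(rankedHits: T[], maxTokens: number): T[] {
  const selected: T[] = [];
  let used = 0;

  for (const hit of rankedHits) {
    const cost = estimateTokens(JSON.stringify(hit));
    if (used + cost > maxTokens) break; // preserve ranking: never skip ahead to a smaller hit
    selected.push(hit);
    used += cost;
  }
  return selected;
}
```

Stopping at the first hit that does not fit keeps the best-ranked results together; skipping ahead to smaller hits would save tokens but change which products the model sees.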

🔄 AI and UI Need Different Data

The AI doesn’t need:

  • Images

  • Full metadata

  • Presentation details

The UI does.

So we separated them:

  • AI gets minimal data

  • UI pulls from local cache

We separated the data layer for reasoning from the data layer for rendering.
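
The split can be pictured as a small cache keyed by product ID: full records are stored when search results come back, the model sees only the shaped subset, and the UI looks up the full record for each ID in the model's reply. The in-memory `Map` and the record fields below are assumptions; the article says only that the UI pulls from a local cache.

```typescript
// Hypothetical full record with everything a product card needs.
interface FullProduct {
  id: string;
  title: string;
  price: number;
  imageUrl: string;
  url: string;
}

class ProductCardCache {
  private readonly byId = new Map<string, FullProduct>();

  // Called when search results come back, before they are shaped for the model.
  remember(products: FullProduct[]): void {
    for (const product of products) {
      this.byId.set(product.id, product);
    }
  }

  // Called with the IDs from the model's reply; unknown IDs are silently skipped.
  hydrate(ids: string[]): FullProduct[] {
    return ids
      .map((id) => this.byId.get(id))
      .filter((product): product is FullProduct => product !== undefined);
  }
}
```

Because the model only ever returns IDs, images and URLs never pass through the prompt, and a card can render only for a product that was actually retrieved.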

What Real Customers Are Already Doing

Even in early usage, one thing is clear:

Customers aren’t using this like a search bar—they’re using it like a sales associate.

They ask:

  • Detailed product questions

  • Material and configuration questions

  • Shipping and delivery questions

  • Replacement and compatibility questions

These are the kinds of questions that normally require:

  • Digging through product pages

  • Reading policies

  • Contacting support

Now they’re answered instantly.

And interestingly:

Customers seem more comfortable asking AI than asking a human.

They’re:

  • More direct

  • More detailed

  • Less inhibited

  • More willing to ask follow-ups

The Unexpected Benefit: A Built-In Research Engine

The biggest insight so far isn’t about conversion rates.

It’s this:

We can now see exactly what customers are thinking.

Not inferred from clicks.
Not guessed from analytics.

But expressed directly, in their own words.

This creates a feedback loop that didn’t exist before:

  • What customers care about

  • What confuses them

  • What information is missing

  • How they describe products

For the first time, we’re not guessing user intent—we’re observing it.

What This Means for Shopify Merchants

AI product discovery isn’t just a better search experience.

It’s a shift in how customers interact with your store.

From:

  • Search → Filter → Browse → Decide

To:

  • Describe → Refine → Learn → Decide → Confirm

And more importantly:

It collapses product discovery, education, and customer support into a single interface.

Final Thought

We didn’t set out to replace search.

We set out to build something that works the way customers already think.

And what we’re seeing early on is this:

Customers don’t want better filters.
They want better answers.