The Real Reason AI Analytics Fails in Production
You have a semantic layer. You spent months building it: curating metrics, encoding business logic, documenting joins. You pointed an AI analytics tool at your data. The demo looked great.
Then it hit production. Users got wrong numbers. "Revenue" meant three different things depending on who was asking. The AI joined tables incorrectly. People stopped trusting it. The project stalled.
This isn't a model problem. It's a context problem, and most vendors are actively making it worse.
More Documentation Made Performance Worse
Recent research tested natural-language-to-SQL systems on hundreds of real questions, comparing three approaches:
- Bare schema only (column names): 16.1% accuracy
- Verbose documentation (detailed descriptions): 13.8% accuracy
- High-signal context (usage patterns, guardrails, examples): 22.2% accuracy
The counterintuitive finding: adding more documentation made things worse (13.8% versus the 16.1% bare-schema baseline), with 52% higher token costs on top of it.
The reason is subtle but important. AI models can't skim. They don't extract key signals from prose the way humans do. When you stuff a context window with detailed field descriptions, you're diluting the signals that actually matter. The model drowns.
Metadata Is Not Context
Here's the distinction that matters:
Metadata tells a model what exists.
"revenue_net: Total revenue after refunds and discounts."
Context tells a model how to use it.
"Use revenue_net for all financial reporting. Never use revenue_gross in customer-facing reports; it includes returns not yet processed. When querying monthly revenue, always join on order_date, never created_at, which reflects system entry time."
Same underlying information. Completely different signal quality.
The research found that effective context needs four ingredients:
- Usage examples: actual SQL showing how fields get combined in practice
- Explicit guardrails: what NOT to do, not just what to do
- Disambiguation: contrasting look-alike fields side by side
- Your terminology: the exact language your warehouse and your team actually use
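As a minimal sketch, the four ingredients can be held in one compact structure per field and flattened into a prompt block. Everything here is illustrative: the field names (revenue_net, revenue_gross, order_date) come from the example above, and the rendering format is an assumption, not a description of any particular tool.

```python
# One high-signal context entry combining the four ingredients.
# All field names and the rendering format are illustrative.
CONTEXT = {
    "field": "revenue_net",
    # Ingredient 1: a usage example showing the field in real SQL.
    "usage_example": (
        "SELECT DATE_TRUNC('month', order_date) AS month,\n"
        "       SUM(revenue_net) AS revenue\n"
        "FROM orders GROUP BY 1"
    ),
    # Ingredient 2: explicit guardrails -- what NOT to do.
    "guardrails": [
        "Never use revenue_gross in customer-facing reports.",
        "Join on order_date, not created_at.",
    ],
    # Ingredient 3: disambiguation of look-alike fields.
    "disambiguation": "revenue_net excludes refunds; revenue_gross does not.",
    # Ingredient 4: the team's own terminology, mapped to warehouse fields.
    "terminology": {"revenue": "revenue_net", "sales": "revenue_net"},
}

def render_context(entry):
    """Flatten one entry into a compact prompt block for the model."""
    lines = [f"Field: {entry['field']}"]
    lines.append("Example:\n" + entry["usage_example"])
    lines += [f"Do NOT: {g}" for g in entry["guardrails"]]
    lines.append("Note: " + entry["disambiguation"])
    for term, field in entry["terminology"].items():
        lines.append(f'"{term}" means {field}')
    return "\n".join(lines)
```

The point of the structure is density: each line carries one signal the model can act on, rather than prose it has to skim.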
Your Semantic Layer Already Has This
Here's what most teams don't realize: organizations that have invested in LookML or dbt have already done most of this work.
LookML encodes how dimensions and measures are actually used. It captures which fields belong to which explores, how joins are structured, what aggregations are valid, what filters are required. Your dbt models document transformations, business logic, and the lineage of every metric. That's not just metadata; it's exactly the kind of high-signal context the research says improves accuracy by 38% relative to a bare schema.
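To make the idea concrete, here is a hedged sketch of pulling signals out of dbt-style model metadata. The `model` dict below is a heavily simplified stand-in for a node in dbt's manifest.json (the real artifact has far more structure), and `extract_signals` is a hypothetical helper, not any vendor's actual pipeline. The idea it shows: descriptions become one-line signals, and accepted-values tests become guardrails, rather than being dumped in as prose.

```python
# Simplified stand-in for one model node from dbt's manifest.json.
# Real manifests are much richer; this shows only the shape of the idea.
model = {
    "name": "fct_revenue",
    "depends_on": ["stg_orders", "stg_refunds"],
    "columns": {
        "revenue_net": {
            "description": "Revenue after refunds and discounts",
            "tests": ["not_null"],
        },
        "order_status": {
            "description": "Lifecycle state of the order",
            "tests": [{"accepted_values": ["placed", "shipped", "returned"]}],
        },
    },
}

def extract_signals(node):
    """Turn model metadata into compact one-line signals for SQL generation."""
    signals = [f"{node['name']} builds on: {', '.join(node['depends_on'])}"]
    for col, meta in node["columns"].items():
        signals.append(f"{col}: {meta['description']}")
        for test in meta.get("tests", []):
            # accepted_values tests double as guardrails: the model learns
            # the only legal values to filter or group by.
            if isinstance(test, dict) and "accepted_values" in test:
                vals = ", ".join(test["accepted_values"])
                signals.append(f"{col} is always one of: {vals}")
    return signals
```

Lineage, column descriptions, and data tests already live in the project; the extraction step just changes their form from documentation into signals.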
The problem is that most AI analytics tools don't know what to do with it. They either:
- Connect directly to raw warehouse schemas, ignoring your semantic layer entirely
- Dump all your documentation in as verbose prose, which the research shows hurts performance
Both approaches waste the investment your data team has already made.
Why Orion Was Built Around Your Semantic Layer
When we talk to prospects, we often hear the same question: "Can Orion use our LookML?"
The answer is yes, and that's exactly the point.
Orion extracts the high-signal context from your LookML and dbt models: field usage patterns, join logic, business rules, metric definitions, required filters. It structures that context the way the research shows actually works, not as verbose documentation, but as precise, machine-parseable signals that guide SQL generation.
Your three years of encoded business logic becomes an accuracy advantage, not an onboarding headache.
This is why we tell prospects: "Bring your LookML. We'll use it to write better SQL." It isn't positioning. It's what the research says works, and what most vendors are actively failing to do.
The Business Case Is Simple
The cost of high-signal context engineering is modest: roughly $4 per 1,000 queries. The cost of a wrong answer in a business analytics context (a campaign sized on incorrect cohort data, a retention decision made on the wrong churn metric) is orders of magnitude higher.
The teams getting real value from AI analytics aren't those with the most sophisticated models. They're the ones who've figured out that context is the work. And the companies that have already encoded that context in Looker or dbt?
They have a head start worth protecting.