AI for Structured Data: Building Production-Ready Agents

By David Kucher

Published on June 5, 2025

At Numbers Station, our mission has always been to bring AI to structured data—and make it actually work in production. It’s a vision we formed years ago at the Stanford AI Lab and pursued at scale, building practical AI agents that could bridge the gap between natural language and highly structured enterprise data.

In this blog, I’ll share the technical foundation of what we’ve built: an agentic platform that transforms LLMs from passive text generators into intelligent, decision-making systems that drive real business outcomes. You’ll also learn how our approach is evolving as part of Alation, where we’re building the future of enterprise AI on a foundation of rich, trusted metadata.

The problem: The modern data stack has created more questions than answers

In the past decade, enterprises have done the hard work of modernizing their data stacks. They’ve invested in better pipelines, scalable storage, and powerful compute. But on the analytics side? The experience has only grown more fragmented.

Slide from Numbers Station presentation: the modern data stack has created more questions

Data teams are overwhelmed with questions like:

  • “Can you find me a dashboard that shows sales by region?”

  • “How is this metric calculated?”

  • “Can you write a query to compare performance this year vs. last?”

These aren’t one-off requests—they’re constant. The complexity of the modern data stack has made it harder, not easier, for users to find the answers they need. As a result, data teams are stuck in a support role, spending their time triaging tickets instead of building for the future. Rather than developing well-defined data products that foster trust and empower users, they’re forced into a never-ending game of data-request whack-a-mole.

And now, the pressure is even greater. With the rise of LLMs, everyone has seen a demo that writes SQL or builds a chart. Expectations are sky-high. But the reality? Very few of those demos actually work in production.

Our architecture: Turning complexity into intelligence

To meet those expectations—and relieve pressure on data teams—we built a platform from the ground up that can deliver intelligent, AI-powered analytics at scale.

Slide from Numbers Station presentation: How layers transform data into data products for trusted AI

The architecture has three essential layers:

  1. Data Connection + Ingestion: We connect to your data sources to gather as much information about your business as possible—the technical definitions found in dimensions, metrics, tables, dashboards, and query logs, along with the metadata and descriptions found in catalogs, messages, and other sources.

  2. Knowledge Layer: From that rich trove of business data, we automatically construct a knowledge layer. Used with RAG (Retrieval-Augmented Generation), it brings the power of LLMs to your business by grounding their responses in your trusted business definitions.

  3. Agent Layer: We provide tailor-made user experiences by composing our suite of agents for data-specific tasks. Examples include Chat, where users can understand and visualize their data in natural language, and Workflows, where data-dependent actions automate data-specific tasks end to end.

This architecture doesn’t just streamline analytics—it fundamentally transforms it. By combining the flexibility of LLMs with the precision of structured data, and grounding it all in a rich metadata foundation, we empower AI agents to act with confidence and context. 

This means users don’t just get answers—they get the right answers, delivered through intuitive interfaces, governed by enterprise rules, and tailored to the unique semantics of their business.

From text to SQL: Why actionability matters

Generating SQL from natural language is often considered the low-hanging fruit of enterprise AI—but turning that into a seamless, trustworthy, and production-ready experience is far more complex.

At Numbers Station, we began with this foundational idea: enable business users to ask data questions in plain English and have an AI agent translate that into executable SQL. This “Text2SQL” approach holds huge promise for reducing dependency on technical teams and democratizing data access. But as we quickly discovered, generating a SQL query is only part of the solution.

Slide from Numbers Station Alation presentation: Example of data products for AI via Text2SQL generation

So, how does it work? The basic workflow here is straightforward: feed the model a prompt that includes the user’s question and optional schema context, and receive a SQL query in return. It’s fast and impressive—until something breaks.
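That single-shot workflow can be sketched in a few lines. This is a minimal illustration, not Numbers Station's implementation: the `complete` function is a stub standing in for any LLM client, and the schema and question are invented for the example.

```python
# Minimal single-shot Text2SQL sketch. `complete` is a stub standing in
# for a real LLM client call; it returns a canned query so the example
# is self-contained and runnable.

def complete(prompt: str) -> str:
    """Stub LLM: in production this would call a model API with `prompt`."""
    return "SELECT region, SUM(amount) AS sales FROM orders GROUP BY region;"

def text2sql(question: str, schema: str) -> str:
    """Assemble a prompt from the user's question plus optional schema context."""
    prompt = (
        "You are a SQL assistant. Given the schema below, write one SQL "
        "query that answers the question.\n\n"
        f"Schema:\n{schema}\n\n"
        f"Question: {question}\nSQL:"
    )
    return complete(prompt).strip()

schema = "orders(order_id INT, region TEXT, amount NUMERIC)"
sql = text2sql("What are sales by region?", schema)
print(sql)
```

Note that nothing here validates the query or checks the result—exactly the gap described next.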

That’s where most early systems have fallen short. The query might use incorrect syntax, reference a non-existent table, or return no results at all. Worse, there’s no feedback loop or awareness built into the process. Users are left to debug or rephrase, and the system becomes more of a prototype than a partner. That is why integrating tooling into our platform has been critical.

To overcome these limitations, we moved from a static generation paradigm to one centered on agentic action—where an AI agent doesn’t just predict a query, but actively executes it, checks for success, and adapts when issues arise.

Slide from Numbers Station presentation: LLMs and tools for AI development

This shift required embedding control logic into the agent workflow. Now, when a query fails, the agent can identify the cause—whether it’s a syntax error, empty result, or a division-by-zero—and choose an appropriate next step. It can retry, correct, or reframe the query based on real-time context.
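A loop of that shape can be sketched against a toy SQLite database. This is an illustrative simplification: in the real agent, each retry would come from the LLM conditioned on the error, whereas here the candidate queries are precomputed so the control flow is visible.

```python
import sqlite3

def run_with_retries(conn, sql_candidates, max_attempts=3):
    """Execute candidate queries in turn, classifying each failure.
    In a real agent, the next candidate would be generated by the LLM
    from the error message; here the candidates are fixed for clarity."""
    for attempt, sql in enumerate(sql_candidates[:max_attempts], start=1):
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            # Syntax error or missing table: this message would be fed
            # back to the model to produce a corrected query.
            print(f"attempt {attempt}: execution failed ({exc}); retrying")
            continue
        if not rows:
            # Query ran but returned nothing: reframe (e.g. relax filters).
            print(f"attempt {attempt}: empty result; reframing")
            continue
        return rows
    return None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("Ohio", 10.0), ("Texas", 20.0)])

candidates = [
    "SELECT * FROM salez",                     # missing table -> retry
    "SELECT * FROM sales WHERE amount > 100",  # empty result  -> reframe
    "SELECT region, amount FROM sales",        # succeeds
]
result = run_with_retries(conn, candidates)
print(result)
```

The key design point is that each failure mode gets a distinct reaction, rather than surfacing a raw error to the user.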

Slide from Numbers Station presentation: How to empower LLMs with context for best results

The result is a cyclical execution paradigm that handles the edge cases inevitably encountered in production.

By integrating tools and intelligent decision-making into our agents, we transformed Text2SQL from a novelty into a reliable, repeatable process—one that adapts dynamically to user needs, database constraints, and organizational context.

This foundation has become the springboard for everything that followed: multi-agent collaboration, context-aware retrieval, and ultimately, a platform that doesn’t just generate answers, but delivers them with confidence. 

Why accuracy demands metadata (and RAG)

Even when a generated SQL query executes successfully, that doesn’t guarantee it’s delivering the right answer. In fact, one of the most persistent challenges in operationalizing LLMs for analytics is ensuring that outputs are not only syntactically correct but also semantically and contextually accurate.

The root of the problem lies in how language models interpret structured data. Without deep business context—definitions of metrics, relationships between tables, and known dimensions—models are prone to hallucinate or make assumptions. This creates a gap between a working query and a trusted, business-relevant result.

That’s why we built our platform around metadata. For AI agents to operate responsibly and accurately, they must be grounded in a unified source of truth: a knowledge layer that encodes organizational definitions of terms like churn, profit margin, or active customer. These aren’t just labels—they often map to precise SQL expressions and filters that, if misrepresented, lead to costly errors.

To make this metadata accessible at runtime, we use RAG. By embedding metric definitions, reports, dashboards, and schema information, we enable hybrid search—both vector and keyword-based—so the agent can retrieve and reference the most relevant entities when formulating responses. This added context ensures that what the agent generates aligns with how your business operates.
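Hybrid search of this kind can be illustrated with a toy retriever that blends a keyword score with a similarity score. This is a sketch only: the metadata entries are invented, and the bag-of-words cosine stands in for real embedding vectors.

```python
import math
from collections import Counter

# Toy hybrid retrieval over metadata entries. Entry texts are invented
# examples; the cosine over word counts is a stand-in for embeddings.
METADATA = {
    "metric:churn": "churn rate = customers lost / customers at period start",
    "metric:active_customer": "active customer = purchase in last 90 days",
    "dashboard:regional_sales": "dashboard showing sales by region and month",
}

def _vec(text):
    """Bag-of-words vector (a real system would use an embedding model)."""
    return Counter(text.lower().split())

def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query, k=2, alpha=0.5):
    """Blend keyword overlap and vector similarity, return top-k keys."""
    qv, qtok = _vec(query), set(query.lower().split())
    scored = []
    for key, text in METADATA.items():
        keyword = len(qtok & set(text.lower().split())) / len(qtok)
        vector = _cosine(qv, _vec(text))
        scored.append((alpha * keyword + (1 - alpha) * vector, key))
    return [key for _, key in sorted(scored, reverse=True)[:k]]

print(hybrid_search("sales by region"))
```

The retrieved keys would then be expanded into their full definitions and injected into the agent's prompt, so the generated SQL uses the organization's own metric logic.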

Slide from Numbers Station presentation: How RAG supports AI model accuracy

The result is a system that doesn’t just guess—it knows. It knows how your company defines its KPIs. It knows what dashboards your analysts have built. And it knows how to reason through that landscape to deliver answers you can act on with confidence.

Slide from Numbers Station presentation: How the RAG tool works in the data catalog

Multi-agent systems: Specialized and orchestrated

As powerful as a single agent can be, enterprise workflows are rarely limited to one action. Real-world analytics use cases often require a sequence of operations—retrieving reports, refining queries, generating visualizations, and communicating results to collaborators. To meet this complexity, we extended our platform into a modular, multi-agent system.

Slide from presentation: How multi-agent systems would work (flowchart)

Each agent in this ecosystem is specialized. The query agent handles SQL generation and execution. A dashboard agent can retrieve and rank existing dashboards. A charting agent builds visual summaries from tabular outputs. A slide agent assembles insights into presentations, and a messaging agent pushes results to stakeholders via tools like Slack or email.

To coordinate these capabilities, we introduced a planner agent—a higher-order agent responsible for breaking down a user request, assigning tasks to the appropriate agents, and maintaining shared context. This orchestration model allows agents to collaborate, build on each other’s outputs, and dynamically adjust the workflow based on real-time results.
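The orchestration pattern can be sketched as a dispatch loop over specialist agents sharing one context. Everything here is illustrative: the agent functions are stubs, and a real planner would use an LLM to produce and revise the plan rather than keyword matching.

```python
# Sketch of planner-style orchestration. Each specialist agent reads and
# extends a shared context dict; the planner decides the sequence.
# Agent bodies are stubs standing in for the real query/chart/messaging agents.

def query_agent(ctx):
    ctx["rows"] = [("Ohio", 10.0), ("Texas", 20.0)]  # pretend SQL result
    return ctx

def chart_agent(ctx):
    ctx["chart"] = f"bar chart of {len(ctx['rows'])} rows"
    return ctx

def messaging_agent(ctx):
    ctx["sent"] = f"posted '{ctx['chart']}' to Slack"  # stand-in for a real send
    return ctx

AGENTS = {"query": query_agent, "chart": chart_agent, "message": messaging_agent}

def planner(request):
    """Naive planner: decompose the request into agent tasks.
    A production planner would be LLM-driven and able to replan
    mid-flight based on intermediate results."""
    plan = ["query"]
    if "chart" in request or "visualize" in request:
        plan.append("chart")
    if "share" in request or "send" in request:
        plan.append("message")
    ctx = {"request": request}
    for step in plan:
        ctx = AGENTS[step](ctx)  # each agent builds on the shared context
    return ctx

result = planner("visualize sales by region and share with the team")
print(result["sent"])
```

Because agents only communicate through the shared context, swapping one implementation for another (or inserting a new agent into the plan) leaves the rest of the system untouched.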

This design unlocks far more than efficiency—it enables compositional reasoning. For example, a user might ask, “Show me a dashboard on profitability,” and then follow up with, “Why is Ohio underperforming?” The planner agent understands that the second query builds on the first. It coordinates a deeper dive, pulling the relevant data and pushing the query back through the system with the appropriate context.

Because agents are independent but interoperable, we can flexibly tailor workflows to specific use cases—whether chat-based, form-based, or fully automated. This modularity makes the platform not only powerful but highly extensible. New agents can be added, old ones swapped, and the system evolves as the needs of the enterprise grow.

The results speak for themselves. Internally, we’ve seen strong improvements in benchmark accuracy, but more importantly, users now get a smarter, more helpful experience: the agentic approach is iterative and lets them take far more powerful actions.

But the real reason this works is metadata.

Looking ahead: The future we’re building with Alation

At Numbers Station, we’ve always believed that the real power in enterprise AI comes not just from the models themselves, but from metadata. While others focused on refining LLMs, we built infrastructure to make AI truly useful for structured data—enabling agents to reason with context, precision, and trust.

That belief is what led us to Alation. As the leader in metadata intelligence, Alation provides the perfect foundation for operationalizing AI at scale. Together, we’re combining agentic AI with the industry’s most trusted metadata platform to deliver AI systems that not only understand your data—but act on it reliably.

This partnership allows enterprises to build AI-ready data products that are contextual, governed, and difficult to replicate. And it redefines how users interact with data: through intuitive natural language interfaces that power complex, end-to-end workflows—no technical expertise required.

By uniting our strengths, we’re ushering in a new era where metadata becomes the enabler of AI-driven transformation. And we’re just getting started.

Ready to start building? Let your AI agents do more—with trusted context. 
