At most large organizations, the word "revenue" means at least three different things depending on who you ask. Finance calculates it one way. Sales ops pulls it from Salesforce with a different set of filters. The data warehouse has its own version, built by an engineer who left two years ago and documented in a README no one can find. When those three numbers end up in the same board presentation, someone has a bad morning.
This is the definition problem at enterprise scale, and it gets worse as organizations grow, merge, and move more of their decision-making onto data. When business teams don't share a common vocabulary, reports conflict, AI models train on misaligned labels, and compliance teams can't certify that the data they're signing off on means what they think it means.
A business glossary is the governance layer designed to solve this. Not a spreadsheet of terms maintained by one team on a shared drive — a governed, searchable, collaboratively maintained collection of definitions that lives inside the data catalog, linked directly to the data assets that implement each concept. This post explains how Alation's business glossary works at enterprise scale: how it's structured, how terms move from proposal to trusted definition, how teams resolve definition conflicts across domains, how to manage thousands of terms at once, and how a glossary connects to the lineage, policies, and data quality signals that depend on it.
Alation's business glossary is built on a three-tier hierarchy: Document Hubs, Folders (which function as Glossaries), and Documents (which function as Terms). Understanding this structure is the foundation for understanding how glossary management scales.
Document Hubs are the top-level containers. Alation ships with a built-in Glossary Hub — the default home for business term definitions — and organizations can create additional custom Document Hubs for other documentation types, such as policies and procedures, technical documentation, or domain-specific knowledge bases. Document Hubs appear in the catalog's left navigation and are searchable across the platform. The built-in Glossary Hub is permanent and cannot be deleted or unpublished, which gives the organization's canonical terminology a stable home regardless of how the platform is configured over time.
Folders within a Document Hub function as Glossaries. A large enterprise might organize these by business domain — a Finance Glossary, a Marketing Glossary, an Operations Glossary — each scoped to a specific function and managed by subject matter experts in that area. Folders can contain subfolders, enabling hierarchical organization for complex domains. Permissions, domains, and workflows are configured at the folder level and inherited by the documents within; this means a governance policy applied to the Finance Glossary folder automatically governs every term it contains.
Documents within a Folder are the individual Terms. Each term follows a template — called a Term Type — that defines what fields it contains. Catalog Admins configure these templates with the fields that matter to their organization: a standard definition field, yes, but also custom fields like "Data Owner," "Regulatory Classification," "Business Unit," "Last Reviewed Date," or "Linked Data Products." The template is how the glossary becomes an organizational standard rather than a freeform wiki.
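One way to picture the three tiers is as a small data model. The sketch below is illustrative only: the class names, the field names, and the `effective_stewards` helper are assumptions made for this post, not Alation's API or internal schema. It also assumes that a subfolder with no stewards of its own falls back to its parent folder, which the description above implies but does not spell out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Folder:
    """A glossary. Governance settings (here, just stewards) are set at this level."""
    name: str
    stewards: List[str] = field(default_factory=list)
    parent: Optional["Folder"] = field(default=None, repr=False, compare=False)
    subfolders: List["Folder"] = field(default_factory=list)
    terms: List["Term"] = field(default_factory=list)

    def add_subfolder(self, name: str) -> "Folder":
        child = Folder(name=name, parent=self)
        self.subfolders.append(child)
        return child

    def add_term(self, title: str, term_type: str, fields: Dict[str, str]) -> "Term":
        term = Term(title, term_type, fields, folder=self)
        self.terms.append(term)
        return term

    def effective_stewards(self) -> List[str]:
        """Walk up the folder tree until a folder with its own stewards is found."""
        node: Optional[Folder] = self
        while node is not None:
            if node.stewards:
                return node.stewards
            node = node.parent
        return []


@dataclass
class Term:
    """A document. Its fields come from the Term Type template."""
    title: str
    term_type: str
    fields: Dict[str, str]
    folder: Folder = field(repr=False, compare=False)

    def stewards(self) -> List[str]:
        # Terms carry no governance settings of their own; they inherit.
        return self.folder.effective_stewards()


@dataclass
class DocumentHub:
    """Top-level container, such as the built-in Glossary Hub."""
    name: str
    folders: List[Folder] = field(default_factory=list)


hub = DocumentHub("Glossary Hub")
finance = Folder("Finance Glossary", stewards=["finance-stewards"])
hub.folders.append(finance)
reporting = finance.add_subfolder("Revenue Reporting")
revenue = reporting.add_term("Revenue", "Business Term", {"Data Owner": "Finance"})
print(revenue.stewards())
```

The point of the sketch is the direction of the lookup: a term never stores who governs it, so changing the stewards on the Finance Glossary folder changes the answer for every term underneath it.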
Every glossary term in Alation moves through a defined lifecycle. When workflows are enabled on a glossary, this lifecycle is enforced — preventing ungoverned definitions from becoming authoritative before they've been reviewed. The stages are:
1. Draft. A contributor creates a new term. The term page displays "Draft pending submission" at the top. The author fills in the title, definition, and any relevant custom fields, and selects which glossary or glossaries the term should belong to. A single term can be a member of multiple glossaries — useful when a concept like "customer" crosses domain boundaries and belongs in both a Finance Glossary and a CRM Glossary simultaneously.
2. Submit for review. When the author is satisfied with the draft, they click "Submit for Review." The term status changes to "Under Review."
3. Review. Designated reviewers receive an email notification. The term appears in the "Under Review" tab of any glossary it belongs to. Reviewers open the term and see an "Approve or Reject" button. They can view the approval path, add comments, and confirm their decision in the Approve Membership dialog.
4. Published. An approved term becomes the organization's canonical definition — searchable and discoverable by all users with appropriate access. A rejected term is returned to the author.
5. Ongoing maintenance. Terms are not static documents. As business definitions evolve — due to regulatory changes, business model shifts, or new data infrastructure — Stewards update them. Change history is preserved at the term level, which matters in regulated industries where definition changes must be auditable. Starting in version 2025.1.1, Stewards can apply Trust Flags to terms and glossary folders: a Deprecation flag applied to a folder cascades automatically to all its child terms, making it practical to deprecate an entire domain's terminology at once when a business unit is restructured.
This lifecycle — enforced by Alation's workflow engine, tracked in the audit log, and surfaced directly in the catalog UI — is what separates a governed business glossary from a terminology document that someone emailed around last quarter.
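The lifecycle above behaves like a small state machine, and sketching it that way shows what "enforced" means in practice. This is a simplified model, not Alation's workflow engine: the `Status` values mirror the stage names in this post, while the action names, the `TermWorkflow` class, and the choice to send a rejected term back to Draft are assumptions made for illustration.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "Draft"
    UNDER_REVIEW = "Under Review"
    PUBLISHED = "Published"


# (current status, action) -> next status. Anything not listed is refused.
TRANSITIONS = {
    (Status.DRAFT, "submit"): Status.UNDER_REVIEW,
    (Status.UNDER_REVIEW, "approve"): Status.PUBLISHED,
    (Status.UNDER_REVIEW, "reject"): Status.DRAFT,  # assumption: rejection reopens the draft
}


class TermWorkflow:
    def __init__(self, title: str) -> None:
        self.title = title
        self.status = Status.DRAFT
        self.history = []  # list of (actor, action, resulting status)

    def apply(self, actor: str, action: str) -> Status:
        key = (self.status, action)
        if key not in TRANSITIONS:
            raise ValueError(f"'{action}' is not allowed while {self.status.value}")
        self.status = TRANSITIONS[key]
        self.history.append((actor, action, self.status))
        return self.status


wf = TermWorkflow("Revenue")
wf.apply("author", "submit")
wf.apply("reviewer", "approve")
print(wf.status.value, len(wf.history))
```

Because every change goes through `apply`, a draft cannot jump straight to Published, and the `history` list plays the role of the audit trail.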
No central team knows every nuance of every term. The most accurate business glossary is one where the people closest to the data contribute to defining it — with Stewards maintaining governance accountability over what gets published.
Alation supports this distributed model in several ways.
Conversations on Terms. Any Alation user can start a Conversation directly on a term's catalog page — asking a question, flagging an ambiguity, or suggesting a correction. Stewards respond in context. The discussion is preserved on the term, creating a permanent record of how the definition was debated and refined. This turns the glossary into a living document that improves with use rather than decaying between annual reviews.
Starring and watching. Users can star a glossary to add it to their Alation favorites, making their most-used domain glossaries instantly accessible. Watching a glossary signs them up for email notifications whenever a change is made — so Finance analysts are automatically informed when a Finance term is updated, without needing to check the platform proactively.
@-mentions. Terms and folders can be @-mentioned anywhere in the Alation catalog — in a table description, a query annotation, or a data quality note. And the relationship works bidirectionally: from version 2024.1.3, when a catalog object's template includes a Mentions field, it displays all the documents and folders that reference it. A column in Snowflake can surface which glossary terms point to it, connecting the physical data to the business vocabulary in both directions.
Navigation Links. A term can be linked to multiple glossary folders via Navigation Links, making it discoverable from multiple domain contexts without duplication. A "Net Promoter Score" term maintained in the Customer Experience Glossary can surface in the Marketing Glossary as a navigation link — users in either domain find the same canonical definition.
Trust Flags. Users can mark terms as Trusted, Warning, or Deprecated, giving consumers an immediate signal about whether a definition should be relied upon. A Deprecated flag on a term is a clear message to analysts: find the successor before using this in production.
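The folder-level cascade mentioned in the lifecycle section can be sketched as a recursive walk. The `TermNode` and `FolderNode` classes and the `deprecate_folder` function are hypothetical names for this illustration, and the sketch assumes the cascade also reaches terms in subfolders, which may not match the product's exact behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TermNode:
    title: str
    flag: Optional[str] = None


@dataclass
class FolderNode:
    name: str
    terms: List[TermNode] = field(default_factory=list)
    subfolders: List["FolderNode"] = field(default_factory=list)
    flag: Optional[str] = None


def deprecate_folder(folder: FolderNode) -> List[str]:
    """Flag the folder and every term beneath it; return the affected term titles."""
    folder.flag = "Deprecated"
    affected = []
    for term in folder.terms:
        term.flag = "Deprecated"
        affected.append(term.title)
    for sub in folder.subfolders:
        affected.extend(deprecate_folder(sub))
    return affected


legacy = FolderNode(
    "Legacy Retail Glossary",
    terms=[TermNode("Store Revenue")],
    subfolders=[FolderNode("Loyalty", terms=[TermNode("Points Balance")])],
)
print(deprecate_folder(legacy))
```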
The hardest part of building an enterprise business glossary isn't the technology. It's what happens when two teams have been working from different definitions of the same word for years, and someone finally puts both definitions in the same room.
"Customer" is the most common example. The CRM team defines a customer as anyone who has ever made a purchase. Finance defines a customer as an account with a currently active, revenue-generating contract. Marketing defines a customer as anyone in the database with a marketing opt-in. All three definitions are reasonable. All three produce different numbers. And in most large organizations, all three have been quietly coexisting in separate systems — until someone tries to build a single customer dashboard and the numbers don't match.
This is not a data quality problem. It's a definition problem, and it won't be solved by better pipelines or cleaner schemas. It requires a deliberate process for surfacing the conflict, making a governance decision, and documenting the outcome in a place where everyone can find it.
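A toy example makes the distinction concrete. The records below are invented for illustration and are perfectly clean, and the three rules are simplified readings of the CRM, Finance, and Marketing definitions described above. The counts still disagree.

```python
# Invented, fully clean account records.
accounts = [
    {"id": 1, "has_purchased": True,  "active_contract": True,  "opt_in": True},
    {"id": 2, "has_purchased": True,  "active_contract": False, "opt_in": True},
    {"id": 3, "has_purchased": False, "active_contract": False, "opt_in": True},
    {"id": 4, "has_purchased": True,  "active_contract": True,  "opt_in": False},
    {"id": 5, "has_purchased": False, "active_contract": False, "opt_in": True},
]

# One rule per team, each a reasonable definition of "customer".
definitions = {
    "CRM": lambda a: a["has_purchased"],          # anyone who has ever purchased
    "Finance": lambda a: a["active_contract"],    # active, revenue-generating contract
    "Marketing": lambda a: a["opt_in"],           # anyone with a marketing opt-in
}

counts = {
    team: sum(1 for account in accounts if rule(account))
    for team, rule in definitions.items()
}
print(counts)  # {'CRM': 3, 'Finance': 2, 'Marketing': 4}
```

No pipeline fix changes these numbers, because nothing in the data is wrong. Only a decision about which rule the word "customer" refers to can reconcile them.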
Here is what that process tends to look like in practice at mature data organizations.
Surface the conflict explicitly before trying to resolve it. The most counterproductive thing a data governance team can do is quietly choose one definition and implement it as the default, hoping no one notices. The business units whose definitions get overridden will notice — and they'll stop trusting the glossary. Instead, document all the competing definitions in the catalog immediately. Create a term for each variant, note the domain it comes from, and make the disagreement visible. In Alation, you can open a Conversation directly on a term to flag that multiple definitions exist and invite the relevant stakeholders to weigh in. The goal at this stage is clarity, not consensus.
Identify who actually has decision rights. Most definition conflicts persist because no one with authority has been asked to make a call. The data team can document the conflict, but they typically can't resolve it — because the resolution requires a business decision about what the organization actually wants to measure. That decision belongs to a governance council, a CDO, or the business owners of the relevant domains. Part of establishing a healthy glossary practice is being explicit about who can approve a canonical enterprise definition and who gets consulted. Alation's workflow supports this by requiring designated reviewers to formally approve a term before it becomes authoritative, which forces the governance question rather than leaving it to drift.
Don't force false consensus when legitimate variation is real. Sometimes the right resolution isn't a single definition. Finance's "customer" and Marketing's "customer" may both be legitimate concepts that serve different business purposes — they've just been sharing a name that made the difference invisible. In that case, the governance decision isn't "pick one" — it's "name them differently and document the relationship." In Alation, terms can be linked as synonyms or related terms, making it explicit that these are distinct concepts with a known relationship. This is often the most honest and most durable resolution: the organization acknowledges the variation rather than papering over it.
Prioritize ruthlessly. A large enterprise could spend years resolving every definition conflict in its vocabulary. Most organizations are better served by identifying the ten or twenty terms that cause the most pain in reporting — the ones that create meeting conflicts, make dashboards untrustworthy, or delay quarter-close — and resolving those first. A narrow, high-quality glossary that everyone trusts is worth more than a comprehensive glossary that people have learned to work around.
Treat the outcome as a living record, not a final verdict. Business definitions evolve. A term approved two years ago may need to be revised as the business model changes or new regulatory requirements arise. The governance process should include a periodic review cycle — and the glossary should make change history visible, so that anyone who needs to understand why a definition changed has access to the record. When an old definition is superseded, deprecating it in the catalog rather than deleting it preserves that history while clearly signaling to users which version to rely on.
The most successful data governance programs treat the business glossary not as a documentation project that gets done once, but as the ongoing practice through which the organization maintains shared understanding of its own data. The technology makes the practice scalable. The governance process makes it stick.
The term lifecycle described above works well for individual terms. At enterprise scale, where an organization might need to onboard an existing corporate glossary of 5,000 terms or update steward assignments across 800 finance terms after a reorg, a different approach is needed. Alation's Bulk Utility makes this tractable.
The Bulk Utility supports bulk create, bulk edit, bulk move, and soft-delete of glossary terms. The workflows for the two most common operations, bulk create and bulk edit, are as follows:
Bulk create:
1. Ensure the Glossary Terms GA and Bulk Utility (Beta) feature flags are enabled in Admin Settings.
2. Create one sample term of the desired Term Type. This serves as the template row.
3. Use full-page search to find your sample term, filter for Terms, and save the search.
4. Navigate to the Bulk Utility (via Admin Settings, or by appending /bulk-utility/ to your Alation URL).
5. Select your saved search to generate the editable spreadsheet, then download the zip file.
6. Open the file named "use_this_for_upload" and add one row per new term, leaving the id column blank.
7. Upload the completed file through the Bulk Utility, then download the feedback file to confirm the results.
Bulk edit: The process is the same as for bulk create, with one difference: keep the existing term IDs in the id column instead of leaving it blank. Alation uses the ID to match each upload row to an existing term and apply your changes, so do not edit or remove the id column.
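The difference between create rows and edit rows can be sketched with Python's standard `csv` module. This is a hedged illustration: it assumes the upload file can be treated as CSV, and every column name other than `id` (`title`, `description`, `Data Owner`) is hypothetical. The real columns come from the spreadsheet the Bulk Utility generates for your Term Type.

```python
import csv
import io

# Only "id" is taken from the workflow above; the other columns are placeholders.
COLUMNS = ["id", "title", "description", "Data Owner"]


def build_upload(existing_rows, new_terms, edits):
    """Return CSV text: edited rows keep their id, new rows leave id blank."""
    rows = []
    for row in existing_rows:
        updated = dict(row)
        updated.update(edits.get(row["id"], {}))
        updated["id"] = row["id"]  # the id is the match key, so never overwrite it
        rows.append(updated)
    for term in new_terms:
        rows.append({**term, "id": ""})  # blank id marks the row as a new term
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


existing = [
    {"id": "101", "title": "Revenue", "description": "Old wording", "Data Owner": "Finance"},
]
csv_text = build_upload(
    existing,
    new_terms=[
        {"title": "Net Revenue", "description": "Revenue after returns", "Data Owner": "Finance"},
    ],
    edits={"101": {"description": "Recognized income from contracts"}},
)
print(csv_text)
```

Generating the file from code rather than by hand is mainly useful for edits at scale, such as reassigning a steward across hundreds of terms, where a stray change to the id column would silently mismatch rows.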
Scale benchmarks: Alation supports bulk creation of up to 15,000 terms per day on the Deluxe tier and up to 50,000 terms per day on the Enterprise+ tier. For organizations migrating a legacy spreadsheet-based glossary into Alation, even a glossary of 50,000 terms can be fully onboarded in a single day at Enterprise+ scale.
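For planning a migration, the arithmetic is a ceiling division. The sketch below uses the two daily figures quoted above and assumes, for simplicity, that they act as hard daily caps and that every uploaded row succeeds on the first attempt.

```python
import math

# Daily bulk-create figures quoted in this post.
DAILY_LIMITS = {"Deluxe": 15_000, "Enterprise+": 50_000}


def days_to_onboard(term_count: int, tier: str) -> int:
    """Whole days needed to create term_count terms at the tier's daily figure."""
    return math.ceil(term_count / DAILY_LIMITS[tier])


print(days_to_onboard(50_000, "Enterprise+"))  # 1
print(days_to_onboard(50_000, "Deluxe"))       # 4
print(days_to_onboard(5_000, "Deluxe"))        # 1
```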
Conflict resolution across domains. When two business units define the same concept differently — both legitimately — Alation supports multiple terms with explicit synonym and related term relationships rather than forcing a false consensus. A "Revenue" term in the Finance Glossary and a "Recognized Revenue" term in the Accounting Glossary can coexist with a documented relationship between them. Stewards can designate one as the preferred enterprise definition while preserving domain-specific variants with clear labeling.
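Conceptually, this is a small graph of terms with labeled edges plus a marker for the preferred variant. The `TermRelations` class below is a hypothetical model of that idea, not how Alation stores relationships, and the relation labels are plain strings chosen for the example.

```python
from collections import defaultdict


class TermRelations:
    def __init__(self) -> None:
        self._links = defaultdict(set)  # term -> {(relation, other term)}
        self.preferred = {}             # concept -> preferred enterprise term

    def relate(self, a: str, b: str, relation: str) -> None:
        """Record a symmetric relationship such as 'synonym' or 'related'."""
        self._links[a].add((relation, b))
        self._links[b].add((relation, a))

    def related_to(self, term: str):
        return sorted(self._links[term])

    def set_preferred(self, concept: str, term: str) -> None:
        self.preferred[concept] = term


relations = TermRelations()
relations.relate("Finance: Revenue", "Accounting: Recognized Revenue", "related")
relations.set_preferred("revenue", "Finance: Revenue")

print(relations.related_to("Accounting: Recognized Revenue"))
print(relations.preferred["revenue"])
```

Both terms survive, each can be found from the other, and a consumer who only needs "the" enterprise number has an explicit pointer to the preferred definition.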
A business glossary that exists in isolation from the data it describes is an expensive dictionary. Alation connects the glossary to the rest of the data intelligence platform in four ways that make definitions actionable.
Terms linked to catalog assets. Stewards can link any glossary term directly to the tables, columns, BI reports, and data products where that concept is implemented. A user browsing a revenue table in Snowflake sees the governing "Revenue" term inline — and a user browsing the "Revenue" term in the Finance Glossary can navigate directly to the three tables that calculate it. Definition and data are one click apart.
Terms and data lineage. When a term is linked to catalog assets that participate in Alation's lineage graph, the term becomes a semantic annotation on that graph. Teams can answer not just "where does this data come from?" but "where does this business concept flow through our pipelines?" — connecting the business vocabulary to the technical data movement that implements it. For regulated industries, this is foundational to answering audit questions about where a specific metric originates.
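One possible reading of "where does this business concept flow?" is a downstream walk over the lineage graph, starting from the assets a term is linked to. The asset names, the `LINEAGE` and `TERM_LINKS` mappings, and the `concept_flow` function are all invented for this sketch. In practice the edges would come from the catalog's lineage graph.

```python
from collections import deque

# Hypothetical lineage edges: upstream asset -> downstream assets.
LINEAGE = {
    "raw.orders": ["staging.orders_clean"],
    "staging.orders_clean": ["mart.revenue_daily"],
    "mart.revenue_daily": ["bi.revenue_dashboard"],
    "raw.customers": ["mart.customer_dim"],
}

# Hypothetical term-to-asset links maintained by stewards.
TERM_LINKS = {"Revenue": ["staging.orders_clean"]}


def concept_flow(term: str) -> list:
    """Breadth-first walk downstream from every asset linked to the term."""
    visited = []
    queue = deque(TERM_LINKS.get(term, []))
    while queue:
        asset = queue.popleft()
        if asset in visited:
            continue
        visited.append(asset)
        queue.extend(LINEAGE.get(asset, []))
    return visited


print(concept_flow("Revenue"))
# ['staging.orders_clean', 'mart.revenue_daily', 'bi.revenue_dashboard']
```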
Terms and data policies. Alation's policy framework allows organizations to attach governance policies to glossary terms. A term like "Customer Email" can carry a policy stating that any data asset tagged with that term must be treated as PII and subject to specific access controls. When a new column is tagged with "Customer Email," the governing policy follows automatically — without requiring a manual governance review of every new data asset.
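The "policy follows the term" idea amounts to a lookup at tagging time. In this sketch the policy fields, the column names, and the `effective_policies` function are assumptions for illustration and do not reflect Alation's policy model.

```python
# Hypothetical policies attached to glossary terms.
TERM_POLICIES = {
    "Customer Email": {"classification": "PII", "access": "restricted"},
}

# Hypothetical columns and the terms tagged on each.
columns = {
    "crm.contacts.email": ["Customer Email"],
    "crm.contacts.signup_date": [],
}


def effective_policies(column_terms):
    """Collect the policies a column picks up from the terms tagged on it."""
    return {term: TERM_POLICIES[term] for term in column_terms if term in TERM_POLICIES}


# Tagging a new column is the only step; no per-column policy is written.
columns["billing.invoices.contact_email"] = ["Customer Email"]

for name, terms in columns.items():
    print(name, effective_policies(terms))
```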
Terms and AI. As organizations build AI and analytics products on enterprise data, the business glossary becomes the semantic contract between business intent and data implementation. A model trained on "revenue" data is only as reliable as the organization's clarity about what "revenue" means. A governed, catalog-linked glossary makes that contract explicit and maintainable.
The organizations that get the most value from their data are the ones where everyone — from the data engineer building the pipeline to the executive reading the dashboard — is working from the same definitions. That alignment doesn't happen by accident. It requires a governance structure that is deliberately built, actively maintained, and embedded in the tools people use to work with data every day.
Alation's business glossary — built on Document Hubs, managed through a structured term lifecycle, scaled with the Bulk Utility, and connected to catalog assets, lineage, and policy — is designed to make that alignment achievable at enterprise scale. Not as a one-time project, but as a living practice that improves as the organization's data culture matures.
For more on enabling Document Hubs and Glossaries in Alation, see the Alation documentation. To see how Alation's business glossary works in your environment, request a demo.
A business glossary in a data catalog is a centrally governed collection of business term definitions — names, descriptions, synonyms, and related terms — that give all users of an organization's data a shared vocabulary. Unlike a standalone spreadsheet, a data catalog's business glossary is linked directly to the physical data assets that implement each term, making definitions discoverable at the moment of data use, not just in a separate reference document.
Alation's business glossary is built on a three-tier structure: Document Hubs (top-level containers), Folders (which function as glossaries within a Hub), and Documents (which function as individual terms). Catalog Admins configure the structure and term templates. Stewards manage the term lifecycle, from draft through enrichment, review, and publication. Terms can be linked to catalog assets, connected to data lineage, and associated with data policies. The built-in Glossary Hub is available by default in Alation version 2025.3 and later.
Alation supports domain-scoped Glossary Folders within Document Hubs, each with its own Stewards, permissions, and workflows. Bulk create and edit via the Bulk Utility allows large term sets to be onboarded at once — up to 50,000 terms per day on the Enterprise+ tier. Cross-domain term relationships are made explicit through synonym links and Navigation Links, which surface a term in multiple domain contexts without duplicating the definition. This federated model lets global enterprises manage thousands of terms across dozens of domains without centralizing all governance decisions.
Yes. Glossary terms can be linked to the catalog assets — tables, columns, BI reports — that implement them. Because Alation tracks data lineage across those assets, a linked term becomes a semantic annotation on the lineage graph, showing not just how data flows but which business concepts it carries through the pipeline.
Alation does not force a single definition. Multiple terms can coexist with explicit synonym and related term relationships, allowing organizations to acknowledge legitimate variation — a Finance definition of "Revenue" and an Accounting definition of "Recognized Revenue" can both exist, with a documented relationship between them. Stewards can designate one as the preferred enterprise definition while preserving domain-specific variants.