Why Implement a Data Catalog?

In a word: insight. Nowadays, businesses have more data than they know what to do with. Cutting-edge enterprises use their data to glean insights, make decisions, and drive value. In other words, they have a system in place for a data-driven strategy.

But let’s rewind: how do you know you need a data catalog in the first place? Good question. You may need a data catalog if you suffer from what we like to call “the V⁴ data headache.”

The 4 Vs of Big Data: Volume, Variety, Veracity, Velocity

The V⁴ Data Headache

  • Volume makes finding trusted data difficult
  • Variety makes lineage & transformation tough to track
  • Veracity (or accuracy) is time-consuming to verify
  • Velocity (the high rate at which data accumulates) makes keeping up difficult

Sound familiar? Folks who work with data face these challenges every day. But they don’t have to. A data catalog helps people find, understand, trust, and govern data.

The catalog gathers metadata (data about data) to add context to every asset. Users can see asset popularity and top users. Data people love data catalogs for a reason: they build trust in data, which builds trust across your enterprise. However, before you start shopping, be sure to clarify what your enterprise hopes to gain from the catalog. What problems will it solve?
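To make "data about data" concrete, here is a minimal Python sketch of a single catalog entry and of how asset popularity and top users might be derived from an access log. The `CatalogEntry` fields, the `usage_stats` helper, and the sample log are all hypothetical illustrations, not the schema or API of any particular catalog product.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata describing one data asset (hypothetical schema)."""
    name: str
    description: str
    owner: str
    tags: list = field(default_factory=list)


def usage_stats(query_log, asset_name, top_n=3):
    """Return (popularity, top users) for one asset.

    query_log is an iterable of (user, asset_name) pairs.
    Popularity is simply the number of logged accesses.
    """
    users = Counter(user for user, asset in query_log if asset == asset_name)
    popularity = sum(users.values())
    top_users = [user for user, _ in users.most_common(top_n)]
    return popularity, top_users


entry = CatalogEntry(
    name="sales.orders",
    description="One row per customer order",
    owner="finance-team",
    tags=["sales", "certified"],
)

# Illustrative access log, not real data.
query_log = [
    ("ana", "sales.orders"),
    ("ben", "sales.orders"),
    ("ana", "sales.orders"),
    ("cy", "hr.payroll"),
]

popularity, top_users = usage_stats(query_log, entry.name)
print(entry.name, popularity, top_users)
```

In practice a catalog collects this kind of metadata automatically; the point of the sketch is only that the description, owner, tags, and usage counts live alongside the asset rather than in someone’s head.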

How do you implement a data catalog?

Data catalogs are extremely useful for enterprises prepared to utilize them. They make data findable, accessible, interoperable, and reusable, which encourages employees to work with data.

Implementing a data catalog enhances an organization’s data management and allows for the democratization of that data. But how are data catalogs implemented? Three phases guide the process. In phase one, an enterprise must create a data strategy, which will inform later plans. With a strategy in place, the next two phases are preparation and implementation.

During the preparation phase, an enterprise determines goals and use cases, then applies them to select the right data catalog tool. Once the business is ready, the implementation phase starts. Implementation means configuring the selected data catalog tool and eventually feeding data into it.

Build a Data Strategy (Phase One)

Before an enterprise goes shopping for a data catalog, it must first determine a roadmap.

How to Develop a Data Catalog Strategy

  • Clarify mission and vision for data
  • Set goals accordingly
  • Identify use cases
  • Apply these concepts to the selection process
  • Consider offense v. defense

What is the intended outcome? How will the catalog help the enterprise grow? What future goals will it accomplish? What current challenges will it help overcome? These conversations are often difficult but illuminating. Aligning stakeholders around a shared mission and vision is a vital first step. This shared mission will inform more specific goals.

Once goals and scope are outlined, an enterprise should identify use cases to build evidence for the necessity of a data catalog. This evidence will help convince key stakeholders of a catalog’s long-term value. Further, identifying top use cases will narrow your search to the best data catalog tools for your enterprise. With these boxes ticked, you will have enough information to select the best data catalog.

Business Goals & Scope

Perhaps the most important part of preparing for a data catalog is setting the roadmap for how it will be used. In other words, what should usage achieve? In this stage, decision makers should create goals based on the need to manage data and create data transparency. They should also consider depth: How deeply will the catalog connect to data domains, for example?

Keep your employees in mind. This will be a big change for them at first! If you demonstrate how the catalog will make their jobs easier (and less stressful), you’ll get more of them on board. Clarify how a data catalog supports business goals, both short- and long-term. Equally important is how it will support employee goals for growth and education.

Encourage user adoption, with an eye to making it fun. Request feedback and invite questions. Transparency will build trust, and employee buy-in will influence stakeholder approval.

Data Strategy: Offense v. Defense

Your industry, regulatory environment, and long-term goals influence your data strategy. Will the data catalog provide a single source of truth (SSOT)? Or multiple versions of truth (MVOTs), derived from one source?

That depends, again, on your goals and industry. When it comes to picking a plan of attack, “the CDO must determine the right trade-offs while dynamically adjusting the balance by leveraging the SSOT and MVOTs architectures” (Davenport).

This checklist from Harvard Business Review can help you determine where your firm falls on the strategy spectrum.

HBR Data Strategy Checklist

Again, your industry, regulatory climate, and goals will determine where you fall on the offense-defense spectrum.

“What’s critical is that single sources of the truth remain unique and valid, and that multiple versions of the truth diverge from the original source only in carefully controlled ways” (Davenport).

Data governance ensures that “new truths” are produced systematically and with total transparency.

Data Catalog Preparation (Phase Two)

Identifying & Documenting Use Cases

Once general goals and scope are identified, plan out your specific use cases. Leaders should decide how the data catalog tool will support vital areas.

Data Catalog Use Case Examples

Documenting how the tool will solve problems makes it easy to show stakeholders the key benefits. Which use cases are top of mind? Which can you back burner… for now? These conversations help you further define your enterprise’s data strategy; namely, whether you prioritize offense or defense.

Choosing a Data Catalog

Clarifying and documenting use cases, business goals, and scope will help you select the right tool. Again, keep your potential users in mind. How will a data catalog support their daily work? Encourage your team to shop around and explore resources, like demos and video walkthroughs. Gather feedback to flesh out your use cases. Which tools excite your team?

For lasting success, you should also consider how you will scale. Can the data catalog scale as the company grows? This initial information gathering phase will ensure the chosen tool meets the needs of your team and your enterprise at large.
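One way to turn documented use cases and team feedback into a comparison is a simple weighted score. The sketch below is an assumption about how such a comparison could be run, not a method prescribed by this article: the use cases, weights, tool names, and 0-5 ratings are all made up for illustration.

```python
# Hypothetical use cases, weighted by priority (higher = more important).
use_case_weights = {
    "data discovery": 5,
    "lineage tracking": 3,
    "compliance reporting": 4,
}

# Illustrative 0-5 ratings gathered from team feedback on demos.
# "Tool A" and "Tool B" are placeholders, not real products.
tool_ratings = {
    "Tool A": {"data discovery": 4, "lineage tracking": 2, "compliance reporting": 5},
    "Tool B": {"data discovery": 5, "lineage tracking": 4, "compliance reporting": 2},
}


def weighted_score(ratings, weights):
    """Weighted average rating across use cases; missing ratings count as 0."""
    total_weight = sum(weights.values())
    return sum(weights[uc] * ratings.get(uc, 0) for uc in weights) / total_weight


def rank_tools(tool_ratings, weights):
    """Return (tool, score) pairs sorted from best to worst."""
    scores = {tool: weighted_score(r, weights) for tool, r in tool_ratings.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_tools(tool_ratings, use_case_weights)
for tool, score in ranking:
    print(f"{tool}: {score:.2f}")
```

Changing the weights is a quick way to see how an offense-leaning or defense-leaning strategy would shift the ranking. A score like this supports the conversation with your team; it does not replace it.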

Data Catalog Implementation (Phase Three)

Switching people to a new working environment doesn’t happen overnight. Before you kick off implementation, you’ll need a clear plan to transition workflows as you ease people into the new day-to-day operations around a data catalog. So how do you get started?

Four Steps to Data Catalog Implementation

  1. Early user involvement
  2. Iterative application
  3. Role establishment
  4. Share benefits and outcomes

Adoption success depends heavily on how well the chosen data catalog meets the needs of data users. In step one, early user involvement, leaders should find willing “guinea pigs.” Attitude is everything, so pick open-minded folks hungry to learn, who will ask questions if they grow confused. Their input will guide customization as more employees use the tool.

Be sure to provide robust training and orientation to early users. Educate them on how data catalogs work in general. Orientation empowers data users to teach other team members about the catalog.

Each early user should have a clear role and responsibilities, such as data steward. But this can change, and that’s OK! Leaders should leverage iterative application along the way, assessing projects and developing key areas for improvement.

Work with your guinea pigs to ensure business goals are met. When you’re just starting out, it’s important to maintain a dialogue around expectations. Who does what? Equally important: “Who doesn’t do what?” In this step, you establish roles and accountability. Celebrate success, and be patient as people learn and adapt. Share business benefits and outcomes with leadership.

Keep in mind that certain data catalog features must be present for this process to succeed. Identifying these features early can mean the difference between successful implementation and failure.

Iterative Application

In the early stages of implementation, use the data catalog to solve the use cases identified during the preparation phase. This tends to deliver some quick initial wins and allows team members to see the benefits of the new tool immediately.

As users become more familiar with the data catalog, the responsibility shifts to those team members to highlight their successes to colleagues and stoke interest in new use cases. These use cases will inevitably expand as the data catalog is further implemented.

Establish Roles & Processes

You need high-quality data ASAP, and a data catalog should deliver it. If the data catalog is not populated with relevant data quickly, the implementation process can become drawn out, and perceived value suffers.

To prevent this, assign data stewards early. Specifying roles in the adoption team creates stakeholders and a sense of responsibility – they want the catalog to succeed. With data stewards assigned, teams have gatekeepers. These folks are responsible for data accuracy and documentation during the early stages.
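As a rough illustration of what early stewardship checks might look like, the following Python sketch flags assets that lack a steward or a description. The asset records and the field names `steward` and `description` are assumptions made for the example; a real catalog would expose this information through its own interface.

```python
# Hypothetical catalog assets; field names are assumed for illustration.
assets = [
    {"name": "sales.orders", "steward": "ana", "description": "One row per order"},
    {"name": "hr.payroll", "steward": None, "description": "Monthly payroll runs"},
    {"name": "web.clicks", "steward": "ben", "description": ""},
]


def stewardship_gaps(assets):
    """Map each asset name to the list of stewardship problems found."""
    gaps = {}
    for asset in assets:
        problems = []
        if not asset.get("steward"):
            problems.append("no steward assigned")
        # Treat a missing, None, or whitespace-only description as absent.
        if not (asset.get("description") or "").strip():
            problems.append("missing description")
        if problems:
            gaps[asset["name"]] = problems
    return gaps


gaps = stewardship_gaps(assets)
for name, problems in gaps.items():
    print(f"{name}: {', '.join(problems)}")
```

A report like this gives newly assigned stewards a concrete starting list, which helps the catalog fill with documented, trusted data sooner.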

Share Business Benefits & Outcomes

Implementation is a continuous process. The data catalog needs your social support throughout. Share success stories, lessons learned, and overall business benefits. The more you document your success during implementation, the more data users will be inspired to use the catalog.

The goal of communication is to alleviate pain points. Be open with your team about how best to evolve the data catalog to solve business needs, while keeping in mind the needs of the data users.

Are you in the market for a data catalog? Learn how data intelligence leverages the collective brainpower of your enterprise. This makes the best, most trusted data easy to find and use quickly. See our white paper, “The Catalog Is the Platform,” for insights into how the catalog works.

The Catalog is the Platform for Data Intelligence white paper