Common Data Governance Challenges
Every enterprise runs into data governance challenges eventually. Issues like data visibility, quality, and security are common and complex. Data governance is often introduced as a potential solution. Further, many industries are under increasing regulatory scrutiny. Add to that the pressure of complying with an endless alphabet soup of regulations, like the GDPR, HIPAA, SOX, and CCPA. Small wonder that leaders are overwhelmed!
Hefty fines penalize data consumers who violate privacy laws, like the GDPR.
Yet tech is catching up. There are new ways to quickly and effectively overcome these data governance challenges. Obviously, leadership is an important component. A person or team with influence must take responsibility for reducing data governance risks.
That influencer, however, needs support. They should have resources, tools for connectivity and integration, and insights into data usage and needs. Finally, they need control and authority to make decisions that improve data governance. But first, they need to understand the top challenges to data governance, unique to their organization.
Source: Gartner: Adaptive Data and Analytics Governance to Achieve Digital Business Success
As data collection and volume surge, so too does the need for data strategy. As enterprises struggle to juggle all three, data governance offers a vital framework. The world is collectively generating trillions of gigabytes of new data. By 2025, the volume of new data generated annually is expected to exceed 175 trillion gigabytes.
And one enterprise alone can generate a world of data. This means that leaders confront challenges within and without. They face external pressures from the world of compliance; they face internal pressures to do more with the data they have.
The logistics of collecting, storing, and accessing so much data (from many sources) create myriad issues: operational, security, and compliance challenges, to name a few. Another challenge compounds them: limited resources can stymie an adequate response.
Mitigating data governance risks requires resources. But finding the budget and people for an ongoing data governance program is not easy. It means competing against other projects and priorities. Ingrained perceptions can also pose issues.
Many leaders may think that “IT owns the data,” so data is IT’s sole responsibility or domain. This means other teams must wait for IT to devote resources to data governance. This is a lot of pressure for one department! IT doesn’t want to be the bottleneck, nor does it have the time to “manage all the data.”
A strong case to leadership starts with pain. Those making the case for data governance should highlight the business pain caused by its lack. The business value of data governance is vast — with the right tools. Governance that empowers data access can speed up processes. If analysts can access the data they need instantly, that can save an enterprise months of lost labor. If tech leaders can quantify a governance solution with cost savings, they can add it to the budget. With governance in the budget, leadership can prioritize it.
Modern data governance relies on automation, which reduces costs. Automated tools make data governance processes very cost-effective. Machine learning plays a key role, as it can increase the speed and accuracy of metadata capture and categorization. These features add context to the data for effective “hands-free” governance.
Automation streamlines a range of governance tasks. New business terms are auto-added to glossaries, aligning teams on shared definitions. Automated governance tracks data lineage so users can see data’s origin and transformation. Auto-tracked metrics guide governance efforts, based on insights around data quality and profiling.
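To make the lineage idea concrete, here is a minimal Python sketch. It assumes lineage is recorded as a simple mapping from each dataset to the datasets it was built from. The dataset names and the `upstream_sources` helper are hypothetical and do not reflect any particular catalog product's API.

```python
from collections import deque

# Hypothetical lineage records: each dataset maps to the datasets it was built from.
LINEAGE = {
    "sales_dashboard": ["sales_clean"],
    "sales_clean": ["sales_raw", "currency_rates"],
    "sales_raw": [],
    "currency_rates": [],
}


def upstream_sources(dataset, lineage):
    """Return every dataset that feeds into `dataset`, directly or indirectly."""
    seen = set()
    queue = deque(lineage.get(dataset, []))
    while queue:
        parent = queue.popleft()
        if parent in seen:
            continue  # already visited; also guards against cycles
        seen.add(parent)
        queue.extend(lineage.get(parent, []))
    return seen


origins = upstream_sources("sales_dashboard", LINEAGE)
print(sorted(origins))
```

A breadth-first walk like this lets a user who is looking at a dashboard trace it back to the raw sources behind it.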
Modern governance can track current usage behaviors and surface them. This empowers leaders to see and refine human processes around data. The result? Deeper knowledge of how data is used powers deeper understanding of the data itself.
Silos exist in every enterprise, and they never fail to cause data governance challenges. Data silos arise for a range of reasons.
Why Do Data Silos Happen?
- The fast pace of data collection
- Constant turnover of technologies
- New data sources
- Evolving infrastructures
- Corporate cultures
- Internal friction
- Communication barriers
The proliferation of tech has caused an explosion of data. If corporate cultures don’t change to meet that challenge, friction and communication problems only get worse.
To build a solution, a change in thinking may be just as necessary as a change in process. A data catalog supports both. By making processes transparent, a catalog lets newcomers learn from leaders. And as users dismantle silos, a catalog will unify all disparate data into one platform.
Data catalogs solve the technical data governance challenges caused by silos. By analyzing metadata, the catalog streamlines data management and search. This creates a complete inventory of data in a single place. It also provides key background metrics, lineage, and context. This empowers better decision-making and reduces risk. Data catalogs also provide insights into data quality with usage reports, warnings, and quality flags. These indicators alert users to potential data governance issues.
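As a rough illustration of a single searchable inventory, the Python sketch below keeps a few catalog entries in memory and searches them by keyword. The `CatalogEntry` fields, the sample assets, and the `search` helper are all assumptions made for this example. A real catalog would index far richer metadata.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One asset in a hypothetical catalog inventory."""
    name: str
    source_system: str
    description: str
    quality_flags: list = field(default_factory=list)


def search(entries, keyword):
    """Case-insensitive keyword match on an entry's name and description."""
    needle = keyword.lower()
    return [
        e for e in entries
        if needle in e.name.lower() or needle in e.description.lower()
    ]


inventory = [
    CatalogEntry("crm.customers", "CRM", "Customer master records"),
    CatalogEntry("erp.invoices", "ERP", "Invoices billed to each customer",
                 quality_flags=["stale: not refreshed in 90 days"]),
    CatalogEntry("web.clicks", "Web analytics", "Raw clickstream events"),
]

for entry in search(inventory, "customer"):
    warning = "; ".join(entry.quality_flags) or "no warnings"
    print(f"{entry.name} ({entry.source_system}): {warning}")
```

Because each result carries its quality flags, a user searching across formerly siloed systems sees the warning alongside the asset.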
No Data Leadership
Data governance challenges are often exacerbated by a few patterns. A lack of strong data leaders is one. Not all leaders are data literate! They may need education around data governance: the risks it mitigates and its business value.
Second, misconceptions around data are common. Since every enterprise uses data daily, this can cause a misconception that data is “in good shape” and governance is thus not needed. However, every organization needs data governance. It ensures workers are finding, understanding, and using the right data to make the right decisions.
The first step is to set up a data governance team with the appropriate structure. This should include a knowledgeable and communicative leader. This role is commonly the chief data officer, or CDO. This person will be responsible for keeping leadership in the know. They will articulate the need for data governance and keep stakeholders informed.
The CDO needs a solid team. Direct reports include project managers, responsible for data governance initiatives. They may also call on strong communicators. These folks will articulate the critical elements of data to data consumers. They can play a key role in training teams on new processes. Lastly, they may report on data governance progress to top decision makers in the organization.
Indeed, the importance of clear communication with decision makers cannot be overstated. Every role in the enterprise has a unique relationship to data governance. Executive leadership will be interested in high-level metrics. They seek an answer to an important question: what business value will data governance bring? Business users and data analysts need governance guidance at the point of access. Data governance roles want tools to ensure enterprise-wide compliance. As it turns out, a data catalog caters to each of these needs.
Data governance may be best understood by athletic analogy: offense or defense. Industry plays a large role in determining which side an enterprise plays. Finance and healthcare, for example, face many compliance challenges, so companies in these sectors typically practice more defensive governance. Retail, by contrast, may lean toward more offensive governance, which emphasizes data democratization and access rather than a preventive mindset.
Each enterprise will have its own data governance challenges. Such challenges, along with company goals, will define the data governance team’s framework. Collaboration with executives, leadership, and stakeholders is important. This maintains wide support and focus on data governance risks.
Making data governance a formal role is also important. This signals to the organization the value of data. It clarifies data governance as a key objective. Finally, it’s critical to have the right person in the role of CDO. This person is a vital leader, who may well become the face of data governance.
Data governance challenges depend upon context. What goals does the business have? What industry regulations are at play? Governance priorities arise from such realities. Overlooking data governance issues may lead to trouble. For example, compliance issues could result in regulatory fines. Security risks may result in a data breach. And inappropriate business usage could lead to poor decisions and wasted resources.
“Metadata” describes data about the data. How often is it accessed? Who accesses it? What are its contents? Does it include PII (personally identifiable information)? Metadata answers each of these questions. In this way, metadata provides crucial context around data for other users. It tells people who uses a dataset and how. It can even flag compliance areas, if, for example, an asset contains PII, which often merits a unique process.
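The questions above map naturally onto a small metadata record. The Python sketch below is illustrative only: the `DatasetMetadata` fields, the `build_metadata` helper, and the two regular expressions are assumptions made for this example. Real PII detection needs far more than a pair of patterns.

```python
import re
from dataclasses import dataclass

# Simplified patterns for illustration only; real PII detection needs far more care.
EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class DatasetMetadata:
    name: str
    access_count: int    # how often is it accessed?
    accessed_by: set     # who accesses it?
    columns: list        # what are its contents?
    contains_pii: bool   # does it include PII?


def looks_like_pii(sample_values):
    """Return True if any sampled string matches one of the simple patterns."""
    return any(EMAIL.search(v) or US_SSN.search(v) for v in sample_values)


def build_metadata(name, rows, access_log):
    """Derive metadata from sample rows (dicts) and an access log (user names)."""
    columns = sorted({col for row in rows for col in row})
    values = [str(v) for row in rows for v in row.values()]
    return DatasetMetadata(
        name=name,
        access_count=len(access_log),
        accessed_by=set(access_log),
        columns=columns,
        contains_pii=looks_like_pii(values),
    )


rows = [{"id": 1, "email": "ana@example.com"}, {"id": 2, "email": "li@example.com"}]
meta = build_metadata("crm.contacts", rows, ["dana", "omar", "dana"])
print(meta)
```

Here the email column trips the PII flag, which is the kind of signal that could route the asset into a stricter handling process.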
In this way, metadata is critical for data governance: insights into data usage provide vital context. Data catalogs provide those insights through popularity and usage metrics. Data consumers can also have inline conversations about the data where it lives.
A data catalog may even host wiki-like articles, where people can document details about the data. These articles form a living document of a given asset’s history and past applications. Is it deprecated? Is it usable? So often, the ideas that fuel a dataset’s application make it valuable to future users. These are important details to document and share!
In this way, data catalogs use feedback to flag potential risks and tribal knowledge to capture wisdom. Catalogs provide real-time warnings to users when they detect a governance process at play. They can even aid compliance by automatically concealing sensitive, classified, or private information from those without the right credentials.
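The concealment step could be sketched roughly as follows in Python. The field names, the role names, and the `mask_record` helper are hypothetical. A production system would take its policy from the catalog or an access-control service rather than from hard-coded sets.

```python
# Hypothetical policy: which fields are sensitive, and which roles may see them.
SENSITIVE_FIELDS = {"ssn", "email"}
ROLES_WITH_PII_ACCESS = {"privacy_officer"}


def mask_record(record, role):
    """Return a copy of `record` with sensitive fields concealed for unauthorized roles."""
    if role in ROLES_WITH_PII_ACCESS:
        return dict(record)
    return {
        key: "***" if key in SENSITIVE_FIELDS else value
        for key, value in record.items()
    }


record = {"name": "Ana", "email": "ana@example.com", "ssn": "123-45-6789"}
print(mask_record(record, "analyst"))
print(mask_record(record, "privacy_officer"))
```

The function returns a copy rather than editing the record in place, so the underlying data stays intact for users who do hold the right credentials.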
Data quality is relative. Few would argue that “bad data is better than no data at all.” Or that “yesterday’s complete data is better than today’s incomplete data.” The question is no longer “is this data high quality?” but rather “what level of quality is this data?” Acceptance levels now gauge overall data quality. Errors arise from inaccuracies, age, and improper usage. For example, how old is too old?
Data governance challenges often arise from a relative perception of data quality. This is what makes data catalogs (and data profiling) so important to data governance. A data catalog profiles data quality, characteristics, usage, access, storage locations, and more. Such profiling enables comprehensive visibility; how did an asset change, when, and why? A complete picture of data quality and transformation empowers smart governance. In this way, leaders can address risks with proper policies, enforced in the catalog.
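A very small profiling pass might look like the Python sketch below. It measures completeness and cardinality for one column and checks freshness against an agreed threshold. The function names, the sample values, and the 30-day threshold are assumptions made for illustration, not a description of how any catalog profiles data.

```python
from datetime import date


def profile_column(values):
    """Summarize completeness and cardinality for one column of values."""
    total = len(values)
    present = [v for v in values if v is not None]
    return {
        "rows": total,
        "null_rate": (total - len(present)) / total if total else 0.0,
        "distinct": len(set(present)),
    }


def is_fresh(last_updated, today, max_age_days):
    """Answer "how old is too old?" against an agreed acceptance threshold."""
    return (today - last_updated).days <= max_age_days


ages = [34, None, 51, 34]
print(profile_column(ages))
print(is_fresh(date(2021, 5, 1), today=date(2021, 5, 20), max_age_days=30))
```

Treating quality as a set of measured levels, rather than a yes-or-no verdict, lets each team set its own acceptance threshold for the same asset.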
Lack of Control
Control over data, or lack thereof, has become a common data governance challenge. Lack of control can result in noncompliance, which occurs when people process data unlawfully. Remember, strict rules are in place! Personal, healthcare, payments, and other sensitive data are tightly regulated. Laws and standards like the GDPR, HIPAA, PCI DSS, and the CCPA mandate proper data usage. Data consumers who run afoul of these rules risk hefty fines. But processes change often, and changing folks’ behavior in tandem is a challenge.
Big data challenges complicate governance risks. Lack of control over data is common when you’re drowning in it. These days, data challenges relating to variety, veracity, and volume abound. Indeed, each “V” may create its own data governance challenges. Variety muddles lineage and makes transformation tough to track. Veracity makes data difficult to verify. Sheer volume makes it difficult to find the right data. A data catalog cuts through the noise. It supports compliance by helping workers find, evaluate, and understand data.
The GDPR and CCPA are two particularly impactful regulations when it comes to data governance. These laws give individuals more control over their personal data; they regulate how organizations can use that information. Organizations must manage and process data in line with both pieces of legislation. They must abide by the law, as well as each individual consumer’s choice to opt in to or out of certain processes.
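Honoring each individual's choice can be pictured as a filter applied before any processing. The Python sketch below is a simplification under stated assumptions: the `CONSENT` store, the purpose names, and the `allowed_for` helper are hypothetical. The sketch also treats a missing consent record as "not allowed," which is a conservative design choice rather than a statement of what either law requires.

```python
# Hypothetical consent store: the purposes each individual has opted in to.
CONSENT = {
    "user-1": {"marketing", "analytics"},
    "user-2": {"analytics"},
    "user-3": set(),
}


def allowed_for(purpose, records, consent):
    """Keep only records whose owner has opted in to `purpose`."""
    return [r for r in records if purpose in consent.get(r["user_id"], set())]


records = [
    {"user_id": "user-1", "email": "a@example.com"},
    {"user_id": "user-2", "email": "b@example.com"},
    {"user_id": "user-3", "email": "c@example.com"},
    {"user_id": "user-4", "email": "d@example.com"},  # no consent on file
]

marketing_list = allowed_for("marketing", records, CONSENT)
print([r["user_id"] for r in marketing_list])
```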
This forces organizations to control how data is processed. They must have data traceability and lineage capabilities to track compliance. It also forces stringent cybersecurity rules, requiring organizations not only to take action to prevent data breaches but also to alert authorities when breaches do occur.
Data governance challenges arise when enterprises lack the ability to monitor and control how data is used, or to provide insights after noncompliance has occurred. In terms of business value, data governance risks associated with these regulations are decidedly material.
In fact, the CCPA carries a maximum penalty of $7,500 per intentional violation, which can quickly add up. Under the GDPR, penalties reach up to €20 million or 4% of annual global revenue, whichever is higher. For example, Google and retailer H&M were hit with GDPR fines of $57 million and $41 million, respectively. CCPA enforcement began on July 1, 2020, and as of May 2021, there have yet to be any fines imposed.
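To see how quickly these numbers add up, here is a short Python sketch of the two ceilings quoted above. The violation count and revenue figures are invented for illustration. The functions compute theoretical upper bounds only and do not predict what a regulator would actually impose.

```python
CCPA_MAX_PER_VIOLATION_USD = 7_500
GDPR_FLAT_CAP_EUR = 20_000_000
GDPR_REVENUE_PERCENT = 4


def ccpa_max_exposure(violations):
    """Upper bound if every violation drew the maximum per-violation penalty."""
    return violations * CCPA_MAX_PER_VIOLATION_USD


def gdpr_max_fine(annual_revenue_eur):
    """Upper bound: EUR 20 million or 4% of annual revenue, whichever is higher."""
    return max(GDPR_FLAT_CAP_EUR, annual_revenue_eur * GDPR_REVENUE_PERCENT / 100)


# Illustrative inputs only.
print(ccpa_max_exposure(10_000))      # 10,000 affected records
print(gdpr_max_fine(100_000_000))     # EUR 100 million in annual revenue
print(gdpr_max_fine(2_000_000_000))   # EUR 2 billion in annual revenue
```

With 10,000 violations the CCPA ceiling reaches $75 million. For the smaller company the GDPR flat cap of €20 million applies, and for the larger one the 4% rule pushes the ceiling to €80 million.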
In Conclusion: A Data Catalog Solves Common Data Governance Challenges
A data catalog offers a vital bird’s-eye view of all data in an enterprise. Data catalogs capture metadata and combine it with data management, collaboration, and search tools to help data users quickly find and use the data they need. A data catalog solves the biggest data governance challenges by providing a fast, efficient means for connecting siloed data, empowering governance leaders, informing governance efforts, and controlling data for compliance.