The Role of the Data Catalog in Data Security

By Myles Suer

Published on June 14, 2021

The Role of Catalog in Data Security

As the facilitator of the #CIOChat, I get to have fascinating conversations with CIOs and other technology leaders. We discuss how they are running the business of IT and cover subjects like digital transformation, business/IT alignment, IT leadership, and leading innovation.

Recently, I dug in with CIOs on the topic of data security. What came as no surprise was the importance CIOs place on taking a broader approach to data protection. What did come as a surprise was the central role of the data catalog for CIOs in data protection.

In this post, I will touch on three things. First is why CIOs want their CISOs leading data protection efforts. Second is the role that CIOs ascribe to the data catalog within data security. And third is what factors CIOs and CISOs should consider when evaluating a catalog – especially one used for data governance.

The Role of the CISO in Data Governance and Security

Without question, CIOs want their CISOs thinking about more than moats, castles, and simple access control and security. They want CISOs putting in place the data governance needed to actively protect data. According to CIO Martin Davis, “if they aren’t doing this then they are missing the point.”

“The ultimate prize for threat actors is the data,” CIO Jason James agrees. “So CISOs must protect data. They should be held accountable because someone must be held accountable. It’s not a HIPAA regulation, but a finger-pointing culture. Granted, the same is true for many public companies. Someone takes the fall for the breach and it could be the CIO/CISO/CEO, or all. It seems that way these days. Granted the same may be true for the CIO when breaches occur. Look at how much turnover happens at companies post-breach. Breaches are resumé generating events.”

James’ comments demonstrate how fear and company culture can affect how a company handles a security breach. Dan Kirsch, Analyst, Hurwitz Associates, agrees that CISOs must take responsibility, when he says that “data protection is absolutely part of the CISO’s job. For this reason, smart CISOs are making sure that analytics and AI teams have data security in mind and are using secure data platforms. CISOs do not want to be thought of as Mr./Ms. No.”

Protecting the Sensitive Data Your Organization Creates

“Everything starts by answering basic questions,” says Wayne Anderson, Principal, Security Architect, Microsoft. “What am I required to do? What do we know? Do we know the business outcomes tied to data risk management? These questions drive classification. They drive labeling. Once you have data classification then you can talk about whether you need to tokenize and why, or anonymize and why, or encrypt and why, etc.” Indeed, defining key terms and assigning accountability are two essential first steps to data governance.

Isaac Sacolick, Former Business Week CIO, agrees with the importance of data governance. He says IT teams must do three key things.

IT Teams Must Do 3 Key Things to Implement Data Governance

  1. Catalog and label data (they need to know where it is, and the data’s sensitivity)

  2. Identify data owners (even if they’re not ready to handle the responsibilities), and

  3. Limit usage for BI, applications, etc. (until 1 and 2 are addressed).
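
Sacolick's three steps can be sketched in a few lines of Python. This is only an illustration, not any catalog product's API: the `CatalogEntry` class, the `Sensitivity` levels, and the `may_use_for_bi` function are hypothetical names invented for this example. It assumes a dataset is cleared for BI or application use only once it has both a sensitivity label and a named owner.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Sensitivity(Enum):
    """Hypothetical sensitivity labels; real schemes vary by organization."""
    PUBLIC = "public"
    INTERNAL = "internal"
    RESTRICTED = "restricted"


@dataclass
class CatalogEntry:
    """One cataloged dataset (illustrative structure only)."""
    name: str
    location: str                              # step 1: where the data lives
    sensitivity: Optional[Sensitivity] = None  # step 1: how sensitive it is
    owner: Optional[str] = None                # step 2: who is accountable


def may_use_for_bi(entry: CatalogEntry) -> bool:
    """Step 3: block usage until the entry is labeled and has an owner."""
    return entry.sensitivity is not None and entry.owner is not None


orders = CatalogEntry(name="orders", location="warehouse.sales.orders")
print(may_use_for_bi(orders))  # False: no label or owner yet

orders.sensitivity = Sensitivity.INTERNAL
orders.owner = "sales-data-steward"
print(may_use_for_bi(orders))  # True: steps 1 and 2 are complete
```

The design choice here is that the gate is derived from the catalog record itself, so access decisions follow from how the data is cataloged rather than from a separate list.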

“The main challenge is articulating the importance and responsibilities to get people actively involved in data governance,” Sacolick concludes.

Locating data, too, is key. “The first thing in protecting sensitive data is to find it,” points out Carrie Shumaker, CIO, University of Michigan Dearborn. Deb Gildersleeve, CIO at Quickbase, agrees: “Catalog it, build guardrails/governance around access to the data based on how it’s catalogued, and then limit overall access.”

“It is a little chicken and egg,” Shumaker admits. “You must start with where you are putting the data and how you access it, and then establish the data governance and privacy policies.”

Agreeing with these CIOs, Dion Hinchcliffe, of Constellation Research, asserts that “maintaining an accurate data ownership picture and catalog is crucial for effective data security. Yet, it is now growing ever-more difficult quickly with cloud, SaaS, Shadow IT sprawl. Ultimately, automated data discovery is the only answer.” Indeed, automation is a key element to data catalog features, which enhance data security.

Selecting a Data Catalog

To support data security, an effective data catalog should have features such as a business glossary, wiki-like articles, and metadata management. It is essential that the catalog makes it easier for data stewards and data security professionals to perform their jobs.

For this reason, effective data catalog software should include five things: data intelligence, data collaboration, guided navigation, active data governance, and broad data connectivity. Let’s take a look at each of these.

The Five Key Features of a Data Catalog

1. Data Intelligence

Finding sensitive data must be effective and efficient for everyone involved in the data governance process. Intelligence is critical to achieving these ends, as it makes data searches relevant and curation of sensitive data scalable. Intelligence automatically surfaces clues in the data to remove the manual effort otherwise required for discovery; it can also flag sensitive data within the huge volume, variety, and veracity of data facing the modern enterprise.

Intelligent systems powered by machine learning are necessary for overcoming the challenges of data management. According to a 2020 451 Research report, “data catalogs are rapidly building out automated functionality,” including “automated suggestions, automated discovery and tagging, and automated data-quality scoring.” These are essential to enabling a more rapid process of sensitive data discovery.
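
To make "automated discovery and tagging" concrete, here is a deliberately simple sketch in Python. It swaps in plain pattern matching for the machine learning that real catalogs rely on, and every name in it (`PATTERNS`, `suggest_tags`, the tag labels, the 0.8 threshold) is an assumption made for illustration rather than a feature of any particular product.

```python
import re

# Hypothetical patterns; a production catalog would use far richer
# classifiers than these two regular expressions.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def suggest_tags(sample_rows, threshold=0.8):
    """Suggest a sensitivity tag per column from sampled values.

    sample_rows: list of dicts mapping column name -> string value.
    A column gets a tag when at least `threshold` of its non-empty
    sampled values match that tag's pattern.
    """
    # Group the non-empty sampled values by column.
    columns = {}
    for row in sample_rows:
        for col, value in row.items():
            if value:
                columns.setdefault(col, []).append(value)

    # Keep the first pattern that matches enough of a column's values.
    suggestions = {}
    for col, values in columns.items():
        for tag, pattern in PATTERNS.items():
            hits = sum(1 for v in values if pattern.match(v))
            if hits / len(values) >= threshold:
                suggestions[col] = tag
                break
    return suggestions


sample = [
    {"contact": "ana@example.com", "tax_id": "123-45-6789", "city": "Austin"},
    {"contact": "raj@example.org", "tax_id": "987-65-4321", "city": "Boston"},
]
print(suggest_tags(sample))  # {'contact': 'email', 'tax_id': 'us_ssn'}
```

The output is a set of suggestions, not decisions: in keeping with the automated suggestions the report describes, a data steward would still confirm or reject each proposed tag.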

2. Data Collaboration

Data discovery has increasingly become a team sport. As more people move to remote work environments, collaboration becomes even more vital to data discovery and protection. Data catalogs should spur collaboration, not only across geographies but also across expertise. This should include the identification of potentially sensitive data and warnings for data users.

With effective collaboration, each contributor works toward a common goal: building on the work of others and opening the door for more complete data governance. Without collaboration, the work of stewards is siloed and needlessly recreated. To create a data governance culture, collaboration needs to be a seamless part of data governance and application.

3. Guided Navigation

Guided navigation helps data stewards locate sensitive data. This includes finding the most exposed sensitive data and ensuring it is used properly. There are many locations where sensitive data can reside, from data lakes, databases, and reports to APIs and queries. As Hinchcliffe points out, this makes finding sensitive data even more difficult. And finding data is only half the battle. It is also critical to understand how data is used (or misused).

A data catalog with guided navigation not only provides data stewards with the right data and context, it also ensures that the data is being used properly. Rather than giving data stewards an atlas, guided navigation provides them with turn-by-turn directions.

4. Active Data Governance

According to Robert Seiner, author of Non-Invasive Data Governance: The Path of Least Resistance and Greatest Success, data governance is “the formalization of behavior around the definition, production, and usage of data to manage risk and improve quality and usability of selected data.”

The data catalog software should empower an active approach that encourages those who are working with data to be active in its governance. This should be a people-centric approach that helps to formalize (rather than require) behaviors. Specifically, this means applying governance at the point of data use and enabling stewards to prioritize their efforts.

An active approach to data governance includes people-friendly features, such as agile approval processes and human-centric policy enforcement. Stewards have a role to play in data governance, but they cannot be the only ones involved in helping to shape and protect data. Citizen data stewards are critical to closing the gap between guidelines, policies, and the way that data is being used.

5. Broad and Deep Connectivity

Clearly, the number of locations where sensitive data resides has only increased as organizations have become truly multi-cloud. It is critical, therefore, to have connectivity to all data sources, as well as to the lineage of data. Pre-built connectors, an open connector framework, and an SDK are all encouraged as ways to access all data sources. Deep connectivity into the data, including query log processing, is equally recommended.

Parting Words

Without question, CIOs are looking for CISOs who grok data and data governance. Today, this means understanding the central role that data catalogs play in data governance. I have suggested here that there are five capabilities that CIOs and their CISOs should consider in a data catalog being used for data governance. Today, there is no perimeter. You can’t trust much of anything anymore, even inside a perimeter. For this reason, CISOs need to move towards protecting data. And according to Adam Martin, IT Director, “you can’t protect what you do not know about.”
