5 Data Catalog Features that Improve Time-to-Decision and Analyst Productivity
Companies are discovering that embracing self-service analytics can lead to faster insights and better decisions. In fact, experience has shown that data catalogs can improve analyst and business user productivity by up to 50% and increase the speed of accurate documentation by up to 40%. And, unlike the old days (when IT exerted tight control over access to company data), these benefits are within reach of any data consumer in a data-driven organization. To positively influence the behavior of your analysts, or frankly anyone who self-serves data in your organization, consider the following five features of a data catalog that can lead to improved time-to-decision and productivity:
1. Unified view for all your data
A data catalog provides the most value when it has a robust search capability that covers both captured metadata and observed user behavior across all datasets. Make sure that the catalog you choose offers a combined view of all your data, not just a subset of your data or one type of data. For example, a catalog that covers only Hadoop, or only relational databases, will have limited functionality. To find the right data, you need to be able to search through all your data without exception.
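As a rough illustration of what searching across metadata and observed usage can mean, here is a minimal Python sketch. Everything in it is hypothetical: the `Dataset` fields, the `search` function, and the sample entries are invented for this example and do not describe any particular catalog product. It matches a search term against names and descriptions from every source, then ranks the hits by how often users have queried each dataset.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """One catalog entry (hypothetical fields, for illustration only)."""
    name: str
    source: str        # where the data lives, e.g. "hadoop" or "postgres"
    description: str   # captured metadata
    query_count: int   # observed user behavior: how often it was queried


def search(catalog, term):
    """Return entries whose name or description contains the term,
    ordered so the most frequently queried datasets come first."""
    term = term.lower()
    hits = [
        d for d in catalog
        if term in d.name.lower() or term in d.description.lower()
    ]
    return sorted(hits, key=lambda d: d.query_count, reverse=True)


# A tiny catalog spanning more than one kind of data store.
catalog = [
    Dataset("orders_raw", "hadoop", "Raw order events", 12),
    Dataset("orders", "postgres", "Cleaned customer orders", 340),
    Dataset("web_logs", "hadoop", "Clickstream logs", 95),
]

results = search(catalog, "order")
for d in results:
    print(d.name, d.source, d.query_count)
```

Because the search runs over one combined list, a single query surfaces candidates from both the Hadoop and the relational source, and usage data helps the analyst pick the one colleagues actually rely on.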
2. Machine-human collaboration to enhance data context
Some data catalogs function as a simple data inventory – without the ability to observe or learn from user behavior. It’s important to find a catalog that not only provides an automated repository of all your data, but also incorporates a machine-human learning system with algorithms designed to provide context about your data and how it is used. A feedback loop, in which behavior is observed and learned from in order to suggest other behavior and best practices, allows your catalog to become smarter over time. As more human user behavior is observed and confirmed, the contextual information you receive becomes more finely tuned. Data context derived from machine-human collaboration enhances decision making.
3. Verification of sources so you can trust your data
Most data catalogs will provide a data lineage feature to allow you to trace the sources of your data. As you choose a data catalog, look also for additional measures of verification, such as data flags or annotations. This capability allows users to endorse an asset of value, or provide a warning or deprecation if an asset is outdated or inaccurate. Direct human verification increases trust in data.
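To make the idea of flags and annotations concrete, the following Python sketch models endorsements, warnings, and deprecations on a catalog asset. The class names, the `annotate` and `trust_summary` methods, and the precedence rule (deprecation outranks a warning, which outranks an endorsement) are all assumptions made for this example, not features of a specific catalog.

```python
from dataclasses import dataclass, field
from enum import Enum


class Flag(Enum):
    ENDORSED = "endorsed"
    WARNING = "warning"
    DEPRECATED = "deprecated"


@dataclass
class Annotation:
    flag: Flag
    author: str
    note: str = ""


@dataclass
class Asset:
    """A catalog asset that users can flag (hypothetical model)."""
    name: str
    annotations: list = field(default_factory=list)

    def annotate(self, flag, author, note=""):
        """Record a human verification signal on this asset."""
        self.annotations.append(Annotation(flag, author, note))

    def trust_summary(self):
        """Summarize the flags. Assumed rule: the most cautious flag wins."""
        flags = {a.flag for a in self.annotations}
        if Flag.DEPRECATED in flags:
            return "deprecated"
        if Flag.WARNING in flags:
            return "use with caution"
        if Flag.ENDORSED in flags:
            return "endorsed"
        return "unreviewed"


revenue = Asset("quarterly_revenue")
status_before = revenue.trust_summary()
revenue.annotate(Flag.ENDORSED, "data_steward", "Reconciled with finance")
status_endorsed = revenue.trust_summary()
revenue.annotate(Flag.WARNING, "analyst", "Q3 figures are incomplete")
status_warned = revenue.trust_summary()
print(status_before, status_endorsed, status_warned)
```

Letting the most cautious flag win is one possible design choice; a real catalog might instead show every annotation side by side and let the reader judge.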
4. Just-in-time guidance to help you better understand your data
Significant value is added when a catalog can provide “just-in-time” suggestions based on the behavior of other users. A catalog that is responsive to user input – providing usage-based guidance at the point of consumption – can save you significant amounts of time, and may help you bring new analysts and business users on board more quickly.
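One simple way to picture usage-based guidance is a suggestion derived from what other users have queried together. The Python sketch below assumes a hypothetical query log in which each entry is the set of tables used in one past query; the `suggest_companions` function and the sample log are invented for illustration, and real catalogs may use far richer signals.

```python
from collections import Counter


def suggest_companions(query_log, table, top_n=2):
    """Suggest tables that other users most often queried together
    with the given table, based on a log of past queries."""
    counts = Counter()
    for tables in query_log:
        if table in tables:
            counts.update(t for t in tables if t != table)
    return [name for name, _ in counts.most_common(top_n)]


# Each entry is the set of tables touched by one past query (made-up data).
query_log = [
    {"orders", "customers"},
    {"orders", "customers", "products"},
    {"orders", "products"},
    {"orders", "customers"},
    {"web_logs"},
]

suggestions = suggest_companions(query_log, "orders")
print(suggestions)
```

A new analyst who opens the `orders` table would see which tables colleagues usually combine it with, at the moment that guidance is useful.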
5. Collaborative capabilities to break down organizational silos
Collaborative capabilities built into the fabric of your data catalog can transform the way your teams interact, and may materially impact the insights they discover. When teams of analysts work in silos, work is re-created instead of re-used, and a great amount of organizational knowledge remains unshared. Look for Wikipedia-like capabilities to share information across teams, and integrated communication tools to enable direct dialogue between team members and other experts. It’s an advantage if the shared articles and conversations are themselves searchable. In this way, tribal knowledge can be captured, codified, and shared with others across geographies and across time.