Supporting Remote Workers with a Data Catalog in the WFH Era

By Dave Wells

Published on April 23, 2020

Many companies have responded to the difficulties and uncertainties of COVID-19 by asking their employees to work remotely. The shift to work-from-home is a necessary and practical step to combat the spread of the virus. But even if many industries were already becoming more remote-worker friendly, the abruptness of this crisis has created new challenges for those who depend on data and information to fulfill their responsibilities. Business decision makers at all levels need current and reliable information to navigate the turbulence of economic uncertainties, rapidly changing social conditions, and strategic volatility.

As the importance of rapid and real-time data analysis increases, the challenges and complexities of analytics are also expanding. For many organizations, the collaborative nature of data analysis is inhibited when decision makers, data subject matter experts, data engineers, and data analysts are not able to meet face-to-face.

The Challenges of Remote Data Analytics

Data analysis and data science are complex processes in the best of conditions. The work begins with understanding the requirements—knowing what information is needed—not as a project phase, but as an ongoing process of exploration and discovery that demands collaboration between decision makers (information consumers) and data analysts (information producers). Communication and continuous feedback are essential but difficult to achieve when everyone involved works remotely.

Getting from requirements to analysis-ready data is a multi-step and iterative process that involves finding data for analysis, understanding and evaluating the data, and preparing data for analysis. Data analysts often rely on tribal knowledge to help them find data. The marginal effectiveness of tribal knowledge networks is certain to suffer when those with knowledge all work in different remote locations.

Next, the analyst works intensely to explore, understand, cleanse, blend, and prepare data for analysis. By some estimates, this preparation accounts for 80% of analytics work: time that is not spent analyzing data and that is a barrier to rapid and agile analytics. Sharing of data, data knowledge, and data preparation processes helps to accelerate data preparation work. But this kind of sharing typically depends on tribal knowledge networks.

With prepared data, the core data analysis work occurs: choosing analysis techniques and statistical methods, designing data visualizations, and interpreting the data. Data analysis at its best is an iterative process where each cycle of analysis finds meaning, generates ideas, and sparks insights that lead to deeper and richer analysis. But this kind of analysis is as much a human and cultural endeavor as it is a technical process. Interaction and collaboration among multiple people with diverse perspectives and thinking styles are the keys to ideation and discovery of insights. It is a social process that can be greatly hindered by the social barriers that working remotely creates.

Finally, it is time to put the analysis to work—to devise strategies and tactics, to make decisions, and to take actions based on discoveries and insights from data. At this point the results of analysis are shared, and socialization is critical. The real power of data analytics is in the ways that it drives communications, conversations, and collective understanding. Bob Duniway, Assistant Vice President for University Planning at Seattle University, once said to me: “Business Intelligence doesn’t happen in computers. It happens between the ears and in the conversations between people.” That’s a powerful statement that emphasizes the futility of analytics without socialization. Clearly the social barriers of working remotely are as significant for information consumers as for information producers.

Data-informed decisions and actions are the goals of data analysis, but they are not the end of the story. With each insight, and with every decision and action, come new questions and new analysis needs. Decision and action are the end of one analytics cycle and the beginning of another. Cycle time accelerates in times of turbulence and uncertainty.

The Role of the Data Catalog

Analytics speed and agility are essential for businesses to navigate successfully through the coming months. Collaboration, communication, knowledge sharing, and socialization are the keys to enable agile analytics. Across the entire data analysis cycle, data catalog capabilities fill important roles in analytics agility. (See Figure 1.)

Figure 1. People and Culture in Data Analytics

The right data catalog used in the right ways will make a real difference in your organization’s capabilities for listening to the data and making data-informed decisions—critical capabilities to navigate the societal and economic uncertainties of the future. Give special attention to these guidelines to maximize the impact of data cataloging:

  • The catalog must support collaboration and crowdsourcing as well as communication, comments and reviews, and knowledge sharing about data.

  • The data catalog should be widely adopted and valued by data stakeholders.

  • The data catalog should be supported with formal data curation practices.

  • The data catalog should provide features and functions to catalog more than datasets. It should support cataloging and sharing of data preparation processes, reports, and analysis results.

  • The data catalog should support critical data governance needs such as identification and tagging of privacy-sensitive data.

And finally, two big “must have” items. These are more than guidelines: they are essentials for the data catalog as a core business management technology.

  • The data catalog must recognize and integrate people as core elements of data ecosystems.

  • The data catalog must be an enterprise data catalog aware of all shared data regardless of locations and technologies.
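To make the guidelines above more concrete, here is a minimal Python sketch of what a single catalog entry might record. It is purely illustrative and does not reflect any particular catalog product: the names CatalogEntry, AssetKind, Comment, find_entries, and the "privacy-sensitive" tag are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class AssetKind(Enum):
    # The catalog covers more than datasets (hypothetical categories).
    DATASET = "dataset"
    PREPARATION_PROCESS = "preparation_process"
    REPORT = "report"
    ANALYSIS_RESULT = "analysis_result"


@dataclass
class Comment:
    author: str
    text: str


@dataclass
class CatalogEntry:
    name: str
    kind: AssetKind
    location: str   # where the asset lives, on any platform or technology
    steward: str    # the person responsible for curating this entry
    experts: List[str] = field(default_factory=list)      # people to ask
    tags: Set[str] = field(default_factory=set)           # governance and topic tags
    comments: List[Comment] = field(default_factory=list) # crowdsourced knowledge

    def add_comment(self, author: str, text: str) -> None:
        """Record a comment so knowledge is shared rather than tribal."""
        self.comments.append(Comment(author, text))

    @property
    def is_privacy_sensitive(self) -> bool:
        # Assumes a tagging convention; real catalogs define their own.
        return "privacy-sensitive" in self.tags


def find_entries(catalog: List[CatalogEntry], keyword: str) -> List[CatalogEntry]:
    """Return entries whose name contains the keyword or that carry it as a tag."""
    kw = keyword.lower()
    return [
        entry for entry in catalog
        if kw in entry.name.lower() or kw in {t.lower() for t in entry.tags}
    ]
```

In this sketch, the steward and experts fields are how people become part of the entry, and the location field lets one catalog describe shared data wherever it is stored. A remote analyst could call find_entries(catalog, "sales") and then read the comments and contact the listed experts instead of relying on hallway conversations.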

Final Thoughts

The current COVID-19 situation won’t last forever. But it will have long-term economic effects, and it may have lasting impact on the way that we work. Data and analytics become more important than ever as businesses make critical decisions to adapt and adjust to continuously changing conditions. It is likely that work-from-home, when no longer a mandate, will become a preferred option for many employers and employees. Serving data and analytics needs in a world of remote workers brings many new challenges. The data catalog has an important role in meeting those challenges.
