By Michael Meyer
Published on May 15, 2024
No matter how you slice it, data leaders face an uphill battle. In fact, a 2023 Gartner survey of CDOs found that 69% of data and analytics leaders are still struggling to deliver measurable ROI. Data products have emerged as a potential solution. This principle, which borrows from the world of retail and data mesh to align minds across departments, has found favor with successful CDOs like Steve Pimblett of The Very Group. In a recent Alation brief, I reviewed the ins and outs of data products and how you can get started. Here’s a summary of the video talk (which you can also view here).
When people search for data, what they often find does not answer their questions, like: What is this? How can I use it? How do I know I can trust it? Who can help me? This raises a larger question: When people seek out data, how do they find it, grasp it, and use it well? Data products have arisen to meet the needs of such seekers and, by extension, help the business.
From a formal definition perspective, a data product is the packaging of data assets so that they are discoverable, understandable, and trustworthy – meeting a critical business need. Data products ensure people are able to access key data assets in your organization and consume them.
Allow me to use an analogy: Data products are like cakes. Imagine the cake as your final data product, with its ingredients symbolizing various data assets like tables and views. Just as you blend ingredients for cake batter, you prepare data assets during the initial phase, forming what we call data pipelines. This stage involves transforming raw data from its source systems into a state ready for its final purpose.
Once a data product is “baked”, the focus shifts to ensuring its findability and consumability. Recall the formal definition: a data product packages data assets to address a specific business need, and it must be discoverable, understandable, and trustworthy. Just as a cake delights the palate when well-made, a well-packaged data product satisfies a business user’s requirements effectively.
Data products are crucial because they streamline the process of accessing and utilizing data. They do this by putting the responsibility of data’s distribution into the hands of the domain experts, or those who know it best (not a central IT team).
As someone who has been on these central IT teams, I can tell you how this problem began, and how it compounded. Essentially, charging the people who build data infrastructure with also distributing that data proved inefficient.
Centralized teams face particular challenges with data discovery. In agile projects, like the ones I've worked on, time is divided into 60-day epics. However, without proper data systems, what should be a 5-day data discovery phase would often extend to 10 or even 20 days. This is akin to telling customers they must wait days for a cake or settle for a thin slice! So you can imagine the frustration and “hanger” that results. This delay forces a difficult choice between extending timelines and cutting project scope, highlighting the critical need for efficient data access methods.
Data products emerge as a solution, ensuring efficient data access and empowering teams to deliver timely and impactful insights, thereby alleviating these challenges and fostering smoother operations.
The data products mindset borrows from the world of software and agile programming for building apps, which prioritizes speed and collaboration. It empowers product managers and self-sufficient teams to deliver end products more quickly. So why not do this with data?
A data product must possess inherent value while also complementing other elements. But how can we determine its value? This comes down to what domain experts know and can learn about data.
Within a data intelligence platform, direct conversations between consumers and data product owners facilitate feedback loops. This feedback enables enhancements to existing products as well as the creation of new ones.
Additionally, understanding usage metrics within the data product portfolio aids in gauging value. Leveraging advanced analytics, tracking usage becomes streamlined, empowering data product owners with actionable insights for refining their offerings.
Discoverability is the next key aspect of effective data products. Just as a bakery may position enticing cakes in the front window, data product owners should package and display their offerings to attract and compel usage.
A robust platform is essential to create a marketplace where data products are easily accessible and searchable. Implementing product registries further enhances discoverability, categorizing offerings by domains, business units, or departments. This streamlined approach empowers users to swiftly locate the data products they need, fostering efficiency and productivity within the organization.
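To make the registry idea concrete, here is a minimal sketch of a product registry that categorizes offerings by domain and supports keyword search. Everything in it (the `DataProduct` and `ProductRegistry` classes, the field names, and the sample products) is hypothetical and invented for illustration; it is not how any particular platform implements its marketplace.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    name: str
    domain: str        # hypothetical category: a domain, business unit, or department
    description: str


class ProductRegistry:
    """Hypothetical in-memory registry that groups data products by domain."""

    def __init__(self) -> None:
        self._by_domain = defaultdict(list)

    def register(self, product: DataProduct) -> None:
        self._by_domain[product.domain].append(product)

    def browse(self, domain: str) -> list:
        # Return a copy so callers cannot mutate the registry's internal list.
        return list(self._by_domain.get(domain, []))

    def search(self, keyword: str) -> list:
        # Case-insensitive match against the product name and description.
        needle = keyword.lower()
        return [
            product
            for products in self._by_domain.values()
            for product in products
            if needle in product.name.lower() or needle in product.description.lower()
        ]


registry = ProductRegistry()
registry.register(DataProduct("Customer 360", "Marketing", "Unified customer profile"))
registry.register(DataProduct("Daily Sales", "Finance", "Revenue by store and day"))
registry.register(DataProduct("Churn Scores", "Marketing", "Monthly customer churn risk"))

marketing = [p.name for p in registry.browse("Marketing")]
matches = [p.name for p in registry.search("customer")]
print(marketing)
print(matches)
```

Browsing by domain mirrors walking to the right shelf in the bakery, while keyword search serves people who know what they want but not where it lives.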
Let's talk about addressability in data products, which is all about making them easy to find and interact with. It's not just about having a slick user interface but also ensuring there's smooth communication through application programming interfaces (APIs).
Imagine you need to automate tasks like deploying or updating a data product. Having unique identifiers for each product allows you to address them programmatically. For example, when updating, you might want to track the last refresh date and time with a timestamp. These features make managing data products a breeze, saving time and effort while ensuring smooth operations.
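As a rough illustration of programmatic addressability, the sketch below gives each data product a unique identifier and records a last-refresh timestamp when an automated job updates it. The `catalog` dictionary and the `publish` and `mark_refreshed` functions are assumptions made up for this example; a real platform would expose its own API for these operations.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class DataProductRecord:
    name: str
    # Unique identifier so automation can address the product unambiguously.
    product_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # Date and time of the most recent refresh (None until the first one).
    last_refreshed: Optional[datetime] = None


# Hypothetical catalog keyed by product identifier.
catalog = {}


def publish(name: str) -> str:
    """Register a new product and return the identifier used to address it."""
    record = DataProductRecord(name=name)
    catalog[record.product_id] = record
    return record.product_id


def mark_refreshed(product_id: str, when: Optional[datetime] = None) -> DataProductRecord:
    """Stamp the product with its refresh time, defaulting to the current UTC time."""
    record = catalog[product_id]  # raises KeyError for an unknown identifier
    record.last_refreshed = when or datetime.now(timezone.utc)
    return record


sales_id = publish("Daily Sales")
refreshed = mark_refreshed(sales_id, datetime(2024, 5, 15, 6, 0, tzinfo=timezone.utc))
print(refreshed.product_id, refreshed.last_refreshed.isoformat())
```

Because the identifier, not the display name, is the key, a product can be renamed without breaking the scripts that deploy or update it.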
Understandability is another crucial aspect of effective data products. It's not just about having them available; it's about ensuring users can grasp what they offer. Clear and comprehensive descriptions are key, shedding light on the purpose and contents of the data product.
Furthermore, providing examples of the underlying data assets further enhances comprehension, giving users a tangible sense of the information a data asset contains (and how it might be used most effectively). Perhaps most importantly, any limitations or constraints associated with the product should be transparently communicated. This allows users to assess whether the data product aligns with their specific needs and objectives, facilitating informed decision-making and maximizing utility.
Trust is at the heart of effective data products, ensuring users can rely on the information they're working with. As data moves from its source to becoming a finished data product, maintaining its quality is crucial every step of the way.
With our open data quality framework and partnerships with providers focusing on data observability and quality, we ensure that quality is maintained from start to finish.
On the user end, trust flags that mimic stoplights give a quick snapshot of the data's current state—whether it's good to go (green), needs a second look (yellow), or should be avoided (red), helping users make informed decisions.
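One way to picture the stoplight idea is a small function that turns quality-check results into a flag. This is only a sketch under stated assumptions: the pass-rate thresholds (95% for green, 80% for yellow) and the `trust_flag` function are invented for illustration and do not describe how Alation or any other product actually assigns its flags.

```python
from enum import Enum


class TrustFlag(Enum):
    GREEN = "good to go"
    YELLOW = "needs a second look"
    RED = "should be avoided"


def trust_flag(checks_passed: int, checks_total: int,
               green_at: float = 0.95, yellow_at: float = 0.80) -> TrustFlag:
    """Map a quality-check pass rate to a stoplight flag (thresholds are assumed)."""
    if checks_total <= 0:
        # Nothing has been measured yet, so ask for a review rather than guess.
        return TrustFlag.YELLOW
    pass_rate = checks_passed / checks_total
    if pass_rate >= green_at:
        return TrustFlag.GREEN
    if pass_rate >= yellow_at:
        return TrustFlag.YELLOW
    return TrustFlag.RED


for passed, total in [(98, 100), (85, 100), (50, 100)]:
    flag = trust_flag(passed, total)
    print(f"{passed}/{total} checks passed: {flag.name} ({flag.value})")
```

In practice the inputs could come from whatever data quality or observability checks an organization already runs; the point is that a single, visible signal summarizes them for the consumer.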
Transparency is key too, giving users insights into things like how often the data changes or its level of accuracy. Understanding where the data comes from and how it's been handled is important for everyone, from data analysts to business users. That's why we provide business lineage, so users can track the data's journey and feel confident in its reliability. And let's not forget about federated policies, which give users the power to apply consistent rules, further boosting trust in data products.
Another crucial aspect is ensuring native accessibility to data products. This means that once you discover a product, you can immediately access it without any hurdles.
Whether it's a dashboard report or a dataset, being able to seamlessly launch it into your preferred platform is essential. For data analysis, our query tool, Compose, allows for quick analysis, while our Compose query forms cater to business users' needs.
Additionally, for those who work extensively with spreadsheets like Excel or Google Sheets, our Connected Sheets feature enables direct analysis of datasets right within these familiar environments, facilitating faster insights at the point of consumption.
Finally, security is paramount in data products, ensuring that stringent security controls are in place for data access and classification of sensitive information. This is essential not only for internal products but also for external data marketplaces, where additional layers of security planning are necessary.
By prioritizing security measures, we uphold the integrity and confidentiality of data, safeguarding it against unauthorized access and potential risks.
In the realm of data management, terms like "data mesh" and "data fabric" have been gaining traction. These concepts aren't disparate; rather, they complement each other.
When I think of data fabric, I envision a centralized data intelligence platform, such as Alation, which aggregates active metadata and provides a comprehensive view of data across the organization. This rich metadata not only facilitates locating data but also streamlines processes like automation and coding, expediting development phases.
When we bring together these three concepts (data mesh, data fabric, and data products), we see a holistic approach to achieving business outcomes. Data mesh serves as the strategy guiding our approach, while data fabric forms the supportive infrastructure.
By treating data as a product and assigning data product owners, we foster collaboration between engineers, business stakeholders, and IT, aligning efforts with overarching business initiatives. This shift from siloed development to collaborative, business-driven practices ensures that solutions meet real needs, rather than relying on the outdated "build it and they will come" mentality. For consumers, this means faster access to trusted data and quicker insights, empowering them to make informed, data-driven decisions with confidence.
Curious to see the entire video brief? Watch it here.