Multiple Sources of Truth:
Decentralization and the Data Mesh

with Zhamak Dehghani, Creator of the Data Mesh

Zhamak Dehghani, Creator of the Data Mesh

In 2018, Zhamak Dehghani created the data mesh concept, a paradigm shift focused on data decentralization. As a former ThoughtWorks technology advisory board member, Zhamak has contributed to multiple patents in distributed computing communications, as well as embedded device technologies.

Satyen Sangani, Co-founder & CEO of Alation

As the Co-founder and CEO of Alation, Satyen lives his passion of empowering a curious and rational world by fundamentally improving the way data consumers, creators, and stewards find, understand, and trust data. Industry insiders call him a visionary entrepreneur. Those who meet him call him warm and down-to-earth. His kids call him “Dad.”

Satyen Sangani: (00:03) The data mesh is an enterprise data architecture that enables — and, in fact, encourages — the decentralization of data. Historically, decentralization is a touchy subject in data. In some sense, the history of data architecture comes down to what to centralize and what not to centralize.

Indeed, for decades, we've talked about building a single source of truth. Often, this form of the truth came in the form of a departmental data mart, or in some cases, an enterprise data warehouse. A more recent version of a central source of truth is an enterprise data lake. So take any data noun and put enterprise in front of it, and you have a single thing to rule them all. You theoretically have total power, total consistency, and complete control — theoretically.

In practice, you'd really just have entropy. Ultimately, it's kind of like squeezing the middle of a balloon. The more pressure you apply in one spot, the more the air just moves in a different direction.

Satyen Sangani: (01:06) Time after time, our guest saw this false sense of control and finally called BS on the entire endeavor. She proposed a different alternative. So today we're going to talk with the architect of the data mesh herself, Zhamak Dehghani. She first proposed the data mesh in a blog post in May 2019 and today is one of the leading voices for decentralized technology solutions. She's a former member of the ThoughtWorks technology advisory board and a founding member of several tech advisory boards. So if you're even remotely interested in the data mesh, you won't want to miss this fascinating conversation.

Producer: (01:49) Welcome to Data Radicals, a show about the people who use data to see things that nobody else can. This episode features an interview with Zhamak Dehghani, the creator of the data mesh. In this episode, she and Satyen discussed decentralization, how architecture can empower data professionals, and the difference between data mesh and data fabric. This podcast is brought to you by Alation. Alation enables people to find, understand, trust, and use data with confidence. Our data governance solution delivers trusted certified data fast. As one customer said, "Alation is like Google search for your data. It helps identify what data we have and more importantly, how to properly make use of that data." Learn more about Alation at alation.com.

 
 

Introduction to Data Mesh (And Why You Need a Decentralized Approach)

Satyen Sangani: (02:45) Let's get into it. You're associated with the data mesh. What is a data mesh?

Zhamak Dehghani: (02:49) Right. I read the one-line definition that I have. It may not be the best one. I may have to double-click into it, but let me read it out: Data mesh is a decentralized socio-technical approach to share access-managed analytical data in complex and large scale environments, whether it's within an organization or across organizations. So that's the one-line definition of what it is and it's a mouthful.

Satyen Sangani: (03:16) It is, but I think the key word is decentralization. Why do you need a decentralized approach? What is it that inspired you to think about a decentralized approach and proposing one?

Zhamak Dehghani: (03:27) I think it was a discord with the real-world scale of complexity, the inherent complexity of organizations and business. Your problem space is that you want to get meaning out of data, find patterns within it, and analyze it in a very, very complex world that is constantly changing. The sources can vary. The data itself can vary. The discord was between that and the solutions that we had put in place to solve that problem.

I think the observations that I had in the early days, when I entered the world of data, were around the repeated patterns and the stories that I was hearing from clients that were technologically fairly advanced, as in they had invested in multiple revisions of their Hadoop infrastructure. They had migrated into cloud data infrastructure, but the stories were the same. The problems were the same. They had invested in yet another data platform, and the application teams, or let's say those cross-functional domain teams that were building the business functions, were kind of oblivious to data.

Zhamak Dehghani: (04:42) Data engineers are stuck in the middle where, under sheer pressure, they just move data from one bucket to another bucket without really understanding the data itself or understanding how the data's going to get used. And the data users, the analysts and data scientists that I was talking to and working with, were kind of frustrated because they were still stuck without access to the right, trustworthy data for the ultimate value that they were intending to deliver. And the executives were burning money without showing value, so changing jobs, ultimately.

Zhamak Dehghani: (05:16) And everybody was looking for another silver bullet. So I think seeing the real-world problems made me curious about the data space. Scratching the surface and seeing the discord between the reality of the complex world we live in with data and the solutions that weren't really up to the task of dealing with that complexity got me to look further for a solution. And then inspiration, I think, came from my experience with large-scale … kind of microservices and distributed systems in the world of functional, I guess, systems. I tried to apply those techniques and learnings to the world of data, and that inspired, I guess, data mesh as a solution.

Satyen Sangani: (05:58) People who are app developers are often oblivious to the fact that they're developing data. Talk a little bit about that, because I think that may be one of the root causes, or at least to me is one of the root causes, of the problems.

Zhamak Dehghani: (06:09) I don't blame the developers. The current state of affairs is that developers are building applications to satisfy a certain function of the business. And many of those functions today are built without being informed about the data or augmented by intelligence or ML informed by the data. So application developers are saying, "I'm building a GUI. I'm building an e-commerce application, whatever else. I'm building a microservice. My relationship to the data is to store the current state of the data, to satisfy my application's behavior, and optimally store that so I have a really fast read for my application refreshing this screen, or I have transactional storage of the data for this particular transaction of the payment or purchase that got performed." So the data is modeled and hidden away behind the application, optimized for a particular feature. And the more we decentralize kind of the application world with microservices-type architectures, the more fragmented that data becomes, because that data is only keeping the current state of that application.

Zhamak Dehghani: (07:22) And that's perfectly fine from an application developer's point of view. On the other hand, we said, "Okay, actually we need to have this other team, other parts of the organization, dealing with gathering insights and training machine learning models so that we can intelligently act upon some of these transactions and respond and optimize, personalize for the user's behavior and so on," but that is a problem that a different team needs to solve, so we have to get this data somehow extracted. I feel like that extraction, it's an insult almost. It's such a pathological coupling we create. Anyway, the data is extracted from these application databases, then moved to some other place, then modeled in that other place, and then enriched with some metadata so that somebody else can solve the problem. So the application developer has no relationship with that world. And then sometimes that world comes back, as in, "Now we've built an ML model that you can invoke as a microservice to get the best, I don't know, recommendation, or as a database to show the proposed product catalog."

Zhamak Dehghani: (08:28) They just see that as yet another service somebody else is providing them. So I think there are two problems, two points of, I guess, friction or tension that really don't let the application developer care about the data. One, they're so far away from the intelligent aspects of their application. They're not posed with the problem of, "Actually, your application needs to be intelligently augmented with ML, with insights, with analytics," so that they have an intrinsic motivation to care about the data being modeled in a way that not only supports the application, but also supports the insights and the analytical view of the world. That's one problem. And then the other problem is that the people that are responsible for that insight gathering are just so far away. They're siloed somewhere else. So you don't have that natural empathy even built into the ecosystem.

Satyen Sangani: (09:21) My background is in economics, and I guess I always think about it as an externality. So there's this concept of, you have a car and the car is driving along and it creates carbon exhaust and the carbon exhaust has an impact on the environment. It's not like the car is moving in a vacuum. And therefore this carbon, it creates its own economic exhaust and all these bad things. And in some cases the externalities can be good. But I think of data as an externality.

Satyen Sangani: (09:54) The app developer is just trying to drive their car and they don't really think about the fact that there is all of this information that's being produced that somebody can do something with. And in many cases, I don't even think they know what could be done. They don't have the imagination or the intention. It's almost like asking a contractor or a plumber to know how somebody wants to use a steam bath. I mean, that's a simple case, but I think a lot of software development is so complicated and it's hard to know. So I don't know. I guess my thought is I've observed the same thing, but I don't know if it's easy to tell... I think it's hard to tell somebody who's developing an application to think about how to measure it.

Zhamak Dehghani: (10:36) I love the car analogy and externalizing the cost of running that car to somebody else. I think maybe you are right. But if we said that the app developer is the car manufacturer or car builder, we would say, "Your car should not only run fast and smooth and delight the driver, but it should also be a sustainable car." So you've got to build that functionality of sustainability into the application for the app developer to have any motivation. And then it's not really that particular app developer that is optimizing the driving experience. Perhaps it's an augmented team. That's why in data mesh you have kind of this cross-functional team with data embedded, to say, "Actually, there is a part of this team that cares about the sustainability," which then requires data, not only from the car emitting data but also from the rest of the organization. So I think, going to Maslow's hierarchy of needs, we've got to put some inspiration, in terms of implementing the need for data, back into the application, and that probably is the next-generation set of applications we're building, not today's generation of applications. Today's generation of applications solves a lot of problems with human reasoning and logic and not data-driven exploration of patterns.

Satyen Sangani: (11:55) That to me sounds right. I just totally learned something. But I do think that analytics, data science, the consumers of the data, have to be a first-class citizen in the design process for the software, day one.

Zhamak Dehghani: (12:12) Absolutely.

Satyen Sangani: (12:13) And that won't necessarily solve all the problems because maybe they'll think through the first-order analytical problems, but it's almost like having an environmental engineer consult on building a house. That would seem to be a part of the design process that doesn't really exist. So then you went from there to say, "Well, there's this new possibility."

 
 

The Roles Within Data Mesh

Satyen Sangani: (12:35) So then from there, it goes to that software developer. And then of course, in the other, old architecture, there's this data engineer who's literally moving data around as if you were moving around boxes in a warehouse, blind to what's inside of the box. And then of course there's the analyst. And I guess there's this new theoretical role that's kind of this analytical engineer, where the idea is that the analytics person becomes the data engineer. Does that factor into the thinking of how you consider the mesh?

Zhamak Dehghani: (13:08) As you described, the current organizational structure, the roles that we have defined, stem from this pipeline architecture, this manufacturing mindset that data's in the source, in a format that nobody can use, and then there are people along the way that take this data through the manufacturing pipeline to pop out meaningful insights and trained ML models on the other end. And along the way we have these steps of cleansing and enriching and a bunch of ETL pipelines, storing, modeling, then overlaying kind of metadata and governance. And then somebody along the way, the ML engineers, picks that up, and yet again they transform it to put it into their feature stores, or analysts put it into BI databases so they can put reports on it. So you've got this kind of manufacturing mindset, and every time we hop one of these steps, there's a handshake.

Zhamak Dehghani: (14:06) There is a loss of either context or a new skill that needs to be developed. We realize that between the pure software engineer on the left and the pure analyst on the right, there is a huge gap. And then we create these intermediary roles (analytics engineers and ML engineers) to kind of fill this long gap. What data mesh does is try to shorten this gap as much as possible and bring what is considered valuable, consumable data as close as possible to the point of origin (and the point of origin could be an app, or it could be a completely new set of data sets being created), and really close that consumer/provider relationship so they can talk to each other directly. And then, following that line, a lot of those intermediary roles go away and get embedded into the boundary of this concept of a domain-oriented data product team that is doing all of the massaging and whatever needs to happen within the boundary of the domain to present data in a way that a pure analyst can directly use.

Zhamak Dehghani: (15:16) So maybe some of those analytics engineers and ML engineers, those roles, get embedded, and I just simplify them and say they become data product developers in a way. Their job is not just to do one step of the transformation or one step of the translation. Their job is to directly talk to the consumers of the data, and you said something really important early in the conversation: that the analysts and data scientists should be the first-class customers or users of the data. So then they really get embedded into the domains and focus on presenting the data, sharing the data, in a way that an analyst or data scientist can directly use it and get value from it, without yet another movement of steps in the process.

Satyen Sangani: (15:59) And I guess then, in that world where there are data product developers, the skills required by that individual dictate what data products they can develop, or the data products that they're trying to develop dictate the skills that are needed maybe is the other way to say it. And so if you need to know clustering or you need to understand neural nets, those are things that you can have as skills depending on the problem that you're trying to solve in the product that you're ultimately trying to develop.

Zhamak Dehghani: (16:29) Absolutely. And I think this is a kind of missed point in a lot of the conversations. What data mesh tries to create is some sort of a meta architecture for data sharing: not pure data sharing, but data-as-a-product sharing, and connecting these data products to create higher-value products. So a scale-out, network-based value-creation model that is purely about data sharing. But as you said, some of those data products could have a simple transformation. Let's say I'm getting customer information from a variety of source applications, because I've got a call center and an e-commerce system and a bunch of other systems that generate information about the customer, and I want to have a customer representation as a data product (which, by the way, I do not recommend), but let's go with that example.

Zhamak Dehghani: (17:18) So my transformation to provide analytical data and a historical view of customer information across all of my channels, all of my regions, and all of my touchpoints with the customer becomes perhaps relatively easy data processing, like typical pipeline processing. Conversely, as you mentioned, if a data product is now a classification of the customers so that we can have different marketing strategies for them, a profiling of the customers, that data product embeds an ML model inferring information and patterns from this other data product. And they can be chained together as a graph at a macro level. People kind of forget that the transformation inside a data product could have such a diverse implementation, and for some reason we got stuck on this super simple idea: you run SQL and that's your data product.
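The chaining described above can be sketched in a few lines of Python. This is only an illustration and not something from the conversation: the `DataProduct` class, the product names, the sample rows, and the spend threshold that stands in for a trained ML model are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class DataProduct:
    """Hypothetical mesh node: a name, upstream products, and its own transformation."""
    name: str
    transform: Callable[[List[List[Record]]], List[Record]]
    upstream: List["DataProduct"] = field(default_factory=list)

    def output(self) -> List[Record]:
        # Read every upstream product's output, then apply this node's own logic.
        inputs = [product.output() for product in self.upstream]
        return self.transform(inputs)


def source(name: str, rows: List[Record]) -> DataProduct:
    """A source-aligned product with no upstream; it simply serves its rows."""
    return DataProduct(name, lambda _inputs: [dict(r) for r in rows])


def merge_by_customer(inputs: List[List[Record]]) -> List[Record]:
    """Simple pipeline-style transformation: merge rows that share a customer_id."""
    merged: Dict[object, Record] = {}
    for rows in inputs:
        for row in rows:
            merged.setdefault(row["customer_id"], {}).update(row)
    return [merged[key] for key in sorted(merged)]


def classify(inputs: List[List[Record]]) -> List[Record]:
    """Stand-in for an embedded ML model: a fixed spend threshold (an assumption)."""
    (customers,) = inputs
    return [
        {
            "customer_id": c["customer_id"],
            "segment": "high_value" if c.get("spend", 0) >= 500 else "standard",
        }
        for c in customers
    ]


call_center = source("call_center_contacts", [
    {"customer_id": "c1", "calls": 4},
    {"customer_id": "c2", "calls": 0},
])
ecommerce = source("ecommerce_orders", [
    {"customer_id": "c1", "spend": 120.0},
    {"customer_id": "c2", "spend": 900.0},
])

# Two chained products: one with a simple merge, one with a model-like step.
customer_view = DataProduct("customer_view", merge_by_customer, [call_center, ecommerce])
customer_segments = DataProduct("customer_segments", classify, [customer_view])

segments = customer_segments.output()
```

The point of the sketch is that both downstream nodes expose the same `output()` shape even though one runs a plain merge and the other runs something model-like, which is what lets them be composed as a graph.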

 
 

Data Mesh: The Socio-Technical (Or ‘Techno-Social’) Approach

Satyen Sangani: (18:22) So let's go back to the definition of data mesh because we talked about this idea of decentralization, but there were other words in your definition, like socio-technical approach. I didn't write down exactly what the definition is, but I'd love to unpack those pieces as well, because I think there's a lot of instruction in the follow-on portions of the definition. So maybe let's talk about what a socio-technical approach is, because that's not something that I had heard of before.

Zhamak Dehghani: (18:54) Sure. And some people that might be listening said, "No, it should be techno-social," and whatever slant of it. So what I meant, and I'm sorry that I had to pick one of the two, but —

Satyen Sangani: (19:09) I really hope there isn't anybody who's listening that's saying that, but let's go with that.

Zhamak Dehghani: (19:13) I have been told that “you've got to rotate this because….” It concerns both the organizational design decisions (organizational structure, roles) and the technical solutions, so the excellence or efficacy of the technical solutions as well as the social structure are both part of the solution. In fact, I started with just an architecture; my first writings were more focused on the architecture. Then I realized I cannot just talk about architecture. This goes beyond that. There are, I guess, some first principles of data mesh that focus on the architecture and technology and some that focus on the people side. For example, the domain-oriented ownership of the data. Data as a product itself has a lot of social, I guess, design concerns: how you organize your people and their responsibility around the data in a decentralized fashion. And then the federated kind of computational governance.

Zhamak Dehghani: (20:18) The federated part of it, again, has a social element: your governance shifts left and gets embedded into those domains. And a lot of the governance concerns we have today (not all, but many) get, again, decentralized into the domains, so that federated aspect is, again, organizational structure. On the technical side, I think it's kind of obvious, even though it's technology agnostic: the architecture and the choice of technology have to fit this, I guess, social structure that we are setting up. The self-serve platform part of it, the architecture of the data product, the architecture around the data sharing. That's the technical part of it.

 
 

Is Anything Shared in the Data Mesh Framework?

Satyen Sangani: (21:01) Decentralized architectures talk a lot about this sort of shared-nothing concept. Is anything shared, and if there are things that are shared in the data mesh framework, what would they be?

Zhamak Dehghani: (21:15) So for the mesh to work, the interoperability standards are shared: the data sharing standards that we have to put in place at three layers. First, at the layer of simply communication: How do I create a data product that is actually consuming data from five other upstream data products, computing, and creating a new one? How do these things connect with each other? So the input, output, egress type of relationship needs a standard that is shared across all of the data products. And then there is data modeling itself: how do you model independent nodes? How do I model the order that a customer placed in one data product versus the customer's call with a call center in another, in a way that I know it's the same customer across two independent nodes? So the semantic modeling and the semantic kind of linking of these data products is another shared concept.

Zhamak Dehghani: (22:15) There are many cross-functional, I think, concerns that can fit into the standards to get a distributed system to work, and then there's the infrastructure as well. We don't really want to be in a situation where every domain team is spending really valuable resources and people's time and effort on building full-stack infrastructure. So the infrastructure is shared, and I sometimes think centralization and decentralization are really the yin and yang that need to exist to actually have a functioning system. So there are aspects that need to be centralized and shared. There are challenges to address to make sure those centralized pieces of the infrastructure don't become bottlenecks. But put that aside: I mean, you want to share technology so you have some sort of economy of scale. So I think these two are certainly shared.
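As a rough illustration of the two shared standards described in this answer, the Python sketch below pairs a common output contract with a shared customer identifier. None of it comes from the conversation: the `OutputPort` protocol, the `customer_global_id` field, and the sample rows are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol

Row = Dict[str, object]

# Assumption: one agreed identifier that every domain attaches to customer data.
GLOBAL_CUSTOMER_KEY = "customer_global_id"


class OutputPort(Protocol):
    """Hypothetical shared contract: every data product describes and serves its data."""

    def schema(self) -> Dict[str, str]: ...

    def read(self) -> List[Row]: ...


@dataclass
class InMemoryPort:
    """Toy output port backed by a list; a real one would sit in front of storage."""

    fields: Dict[str, str]
    rows: List[Row]

    def schema(self) -> Dict[str, str]:
        return dict(self.fields)

    def read(self) -> List[Row]:
        return [dict(r) for r in self.rows]


def link(left: OutputPort, right: OutputPort, key: str = GLOBAL_CUSTOMER_KEY) -> List[Row]:
    """Join two independently owned products on the shared identifier."""
    for port in (left, right):
        if key not in port.schema():
            raise ValueError(f"output port does not declare shared key {key!r}")
    right_index: Dict[object, List[Row]] = {}
    for row in right.read():
        right_index.setdefault(row[key], []).append(row)
    return [
        {**left_row, **right_row}
        for left_row in left.read()
        for right_row in right_index.get(left_row[key], [])
    ]


orders = InMemoryPort(
    {GLOBAL_CUSTOMER_KEY: "string", "order_total": "float"},
    [{GLOBAL_CUSTOMER_KEY: "cust-42", "order_total": 99.5}],
)
calls = InMemoryPort(
    {GLOBAL_CUSTOMER_KEY: "string", "call_reason": "string"},
    [
        {GLOBAL_CUSTOMER_KEY: "cust-42", "call_reason": "late delivery"},
        {GLOBAL_CUSTOMER_KEY: "cust-7", "call_reason": "billing"},
    ],
)

linked = link(orders, calls)
```

Because both products publish through the same contract and carry the same identifier, a consumer can tell that the order and the call belong to one customer without either domain knowing about the other's internal model.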

 
 

Key Enabling Technologies for the Data Mesh

Satyen Sangani: (23:14) What do you think are the key enabling technologies for the data mesh? What are the standard ingredients that an org unit would need to have in place in order to get a data mesh going?

Zhamak Dehghani: (23:26) To get data mesh going? I think if I categorize technology into two general buckets, the first bucket enables implementation inside each data product: you would need transformation, you would need storage, and you need these in a way that allows a diversity of transformations to be implemented, and so on. I think I put that bucket aside. We need that technology, but we also need that technology today for lakes or warehouses; it may have a slightly different nature or be configured differently, but we need that. So let's put the technology that implements what's inside the node aside. I think the pieces of technology that are really critical and perhaps missing are the pieces that enable interconnectivity between the nodes. That is what really enables running these analytical workloads, like generating a report or training your machine learning model, without data movement, in a distributed fashion over the mesh. And for that to work, I think, again, if I work backwards, I say, okay, what does that look like?

Zhamak Dehghani: (24:38) I think one piece of it, again, is the standards: the language with which we would express those distributed computational workloads, and the APIs on each node that enable the request and response for those. So again, I'm putting my distributed kind of system engineer hat on answering these questions. So I would think about the protocols as really the essential pieces: the protocols around discovery, addressability, data sharing, distributed analytical computation.

Zhamak Dehghani: (25:22) I mean, we have distributed or federated SQL, but what's the next phase, what's beyond that? I think those are the really key pieces that are missing. And then the second, as part of those protocols, again, if you unpack the protocols: I think that analytical data is, by definition, temporal data. And if it's distributed, it kind of needs to be immutable, because if you have a stateful distributed system, it's a disaster. So then the next-level protocol is, what is a sensible, easy-to-use, high-level kind of temporal representation of analytical data that enables streaming-like processing? I'm not talking about event processing, but even if it's windows of time that we analyze, what does that protocol look like? So yeah, long answer short: protocols, data protocols, data contracts.
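One possible reading of "temporal" and immutable analytical data is an append-only log in which every record carries two timestamps and is read by time window. The Python sketch below assumes that reading; the `Fact` and `AppendOnlyLog` names, the two-timestamp layout, and the sample values are hypothetical and are not a protocol named in the conversation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Fact:
    """An immutable record with two time axes (an assumed representation)."""
    key: str
    value: float
    actual_time: datetime      # when the event happened in the real world
    processing_time: datetime  # when the data product recorded it


class AppendOnlyLog:
    """Facts are only ever appended; a correction is a new fact, never an update."""

    def __init__(self) -> None:
        self._facts: Tuple[Fact, ...] = ()

    def append(self, fact: Fact) -> None:
        self._facts = self._facts + (fact,)

    def window(self, start: datetime, end: datetime, as_of: datetime) -> List[Fact]:
        """Facts with actual_time in [start, end), as they were known at `as_of`."""
        latest: Dict[Tuple[str, datetime], Fact] = {}
        for fact in self._facts:
            if start <= fact.actual_time < end and fact.processing_time <= as_of:
                slot = (fact.key, fact.actual_time)
                current = latest.get(slot)
                if current is None or fact.processing_time > current.processing_time:
                    latest[slot] = fact
        return sorted(latest.values(), key=lambda f: (f.actual_time, f.key))


log = AppendOnlyLog()
event_time = datetime(2022, 1, 1, 12, 0)
log.append(Fact("store-1", 10.0, event_time, datetime(2022, 1, 1, 13, 0)))
# A later correction for the same event is appended; the first fact is untouched.
log.append(Fact("store-1", 12.0, event_time, datetime(2022, 1, 3, 9, 0)))

day_start, day_end = datetime(2022, 1, 1), datetime(2022, 1, 2)
early_view = log.window(day_start, day_end, as_of=datetime(2022, 1, 2))
late_view = log.window(day_start, day_end, as_of=datetime(2022, 1, 4))
```

Here a consumer that reads the same window twice with the same `as_of` gets the same answer, while a later `as_of` picks up the correction, so no node has to coordinate in-place updates with any other.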

Satyen Sangani: (26:21) It's like the data warehouse and the centralization idea, and then there's this idea of distributed products. And I don't think you're saying, "Oh, well there's never going to be a warehouse where information is centralized." Some subset of information is centralized. That will exist, but perhaps give up the ghost around there being a singularity around truth or …

Zhamak Dehghani: (26:44) Absolutely.

Satyen Sangani: (26:45) … that everything is going to be centralized. And I think that to me is the key aha, the insight, because if you let go of the idea that everything needs to be centralized, it just frees you up to be a lot more dynamic in how you treat data, and in who can own it and who can use it. And that's really empowering. And all the implications of the consumers of data being close to the design are, I think, obviously super useful and interesting. Since you've written the paper, have there been major evolutions in your thinking?

Zhamak Dehghani: (27:16) I think I had to introduce the fourth principle since the first paper. So that was a big change, because I had it as one little line item, "standards," somewhere. And then I realized the governance, the federated computational governance, had to be introduced as its own pillar for the pillars to be complementary and complete. So I think that was one big mutation that happened. I intentionally stayed away from practices and technical implementations and designs early on because I knew that would keep changing forever and improving, for good reason. I stuck to the first principles, and I think those first principles haven't changed much beyond that one change that happened early on, but the interpretation and the shape of the solutions vary a lot.

Zhamak Dehghani: (28:11) And I don't try to claim ownership of what is the best way. We still have to discover what's the best way to design. My own personal thinking around the design is crystallizing over time. Maybe it starts fuzzy and then I just dig into the details, and with each iteration of implementation, the design gets a little more crystallized, but there are still a lot of fuzzy areas. I mean, what the heck is this distributed analytical data sharing when talking about APIs? So, yeah, I think there has been just incremental improvement of practices, but the first principles have pretty much stayed the same.

Satyen Sangani: (28:49) There's an ongoing debate about the merits of the data fabric versus the data mesh. You can guess which side Zhamak is on, but even still I asked her to break down the differences.

Zhamak Dehghani: (28:59) I think there are similarities and differences. So data fabric, the way I understand it, going back to the origin of it, really started as a system to allow access to data at a physical level across the boundaries of different platforms. So it was kind of the NetApps of the world and systems that really created a layer for you to access data in a hybrid cloud environment. It was kind of a cloud migration strategy for many. So again, it's a physical layer concern versus data mesh, which is a logical layer concern. And then fast forward, Gartner started overlaying on top of that idea and saying, "Well, what if we just assume data can be anywhere it is, if there's a physical layer that gives you access to the data, and let's just start overlaying magical AI intelligence metadata to then make meaning out of this data." So pretty much keep doing whatever you are doing, app developers. Put the data wherever it is, and then someone else, on some other layer, with machine learning derives meaning out of this data that you have.

Zhamak Dehghani: (30:14) So data mesh: on the first front, it's not a physical layer. It's a logical layer concern. And philosophically I think it’s in conflict with this idea that “keep doing whatever you want to do, unintentionally share data, and then some magical AI will make sense out of it.” In fact, data mesh ... I mean, my kind of value system puts responsibility on humans, saying, "In fact, let's get the human in the loop to intentionally build data as a product, to design and emit the right metadata to serve meaningful data."

Zhamak Dehghani: (30:50) And yes, if you want to sprinkle intelligence on top and have these next-generation, I don't know, data catalogs or whatever it is that gives you more meaning out of that, so be it, but don't build a system that is garbage in, intelligence on top of it, and then garbage out. I don't know, that's an apocalyptically dark world. So I think it depends on what you really mean by data fabric. Do you mean the system of data integration across multiple environments? I think that could absolutely be part of data mesh. Or do we mean this sort of philosophy of sharing data as is and then kind of overlaying that with intelligence for it to be meaningful? I think that perhaps there is a conflict in philosophy around that.

Satyen Sangani: (31:33) That the AI pixie dust will solve all your problems.

Zhamak Dehghani: (31:39) AI and metadata pixie dust, yeah.

Satyen Sangani: (31:41) Well, I think the metadata has to exist in the mesh framework and does exist in the fabric framework. I think it just is a question of how you generate it and how you actually produce it. And I think you're basically saying, "Look, as a product owner, that's your job." So I guess now looking forward a little bit, are you spending the majority or the vast majority of your time talking about data mesh? Is this where you spend your time? And is it more on ... I mean, are you in a place where you're spending more time explaining the concepts, or are you in a place where you're spending more time implementing the concept, or is it a mix, and how have you seen that change over the last couple of years?

Zhamak Dehghani: (32:22) So I think I'm still talking about it a lot, but the talking is based on the experience of seeing different patterns and has moved from what is data mesh to how to do data mesh and how to evaluate technology so that I can do data mesh right; what are the guidelines? So it's moving from “what” to “how” and “how to do it right,” and I hope that I at least stay close enough. And again, I'm not a developer on one implementation, but I do have touchpoints with multiple development teams that are implementing it globally. So I hope I keep that connection so I can get the feedback coming in. What I hope to happen in the next few years is to kind of shift focus a little bit more to how to enable it: to ask what is missing, and to contribute to the standards and to the technology to fill the gap.

 
 

Final (Data Mesh) Thoughts

Satyen Sangani: (33:11) I mean, look, I'm right there with you. I think there's a lot to do to enable these architectures and — well, not architectures — these approaches so that people have a roadmap to get to this end state faster, which is just, I guess, the fun of all of our jobs in all of our work. Any warnings, common pitfalls, risks? If I were interested in data mesh and just hearing this podcast for the first time, what would you have me walk away with?

Zhamak Dehghani: (33:50) I think having a big, deep introspection: “Is this the right thing for me, and is this the right moment in time?” Don't assume that you can just buy it and over time you get to all the outcomes that you want. So a deep self-assessment of, should I make a move toward data mesh now, and what does that entail for me? And if there are risks along the way that are applicable to my unique situation, what are those risks for me to mitigate? And in fact, I put that in the book because I had to cover the organizational side of it. And I thought the first step is, shall I do that now? That would be one thing I would say for people to take away. And the other is to go beyond the hype, whether you are a researcher trying to understand, a vendor trying to implement, or an organization trying to adopt. Really understand the motivation behind it (do those motivations apply to me?) and then assess your technology choices, the solutions that you are building for the long run, based on a long-term kind of transformation, not based on a short-term, "I'm just going to plug this in and ‘bam’ I've got data mesh."

Zhamak Dehghani: (35:14) So really treat it as a transformation. And the last part I would say is that socio-technical aspect we talked about. Organizational design is absolutely crucial because we could have the best distributed architecture and mesh, but yet a centralized data organization, and that architecture, that solution, very quickly devolves to mimic the organizational structure, right? Conway's law. So even if you have to do a kind of inverse Conway maneuver of reshuffling your organization, even with a monolithic architecture, that would force changing and decoupling your architecture, because, as you mentioned earlier, that becomes a bottleneck. So don't forget the organizational part, very, very early.

Satyen Sangani: (36:04) It was such a pleasure to meet you and to have this conversation. We’ll look forward to having you back on at some point in the future, but thank you for taking the time.

Zhamak Dehghani: (36:15) Thank you for lending your platform and getting my whole voice heard. It was a pleasure.

Satyen Sangani: (36:26) So many data leaders have grown up believing in the value of centralization. But I think the data mesh shows that this focus is actually what's holding us back. As Zhamak put it, the data mesh is a socio-technical solution. It's not just an alternative technical architecture. It's an approach that requires you to consider how people actually behave. At the core, the data mesh seeks to enable people to work with data in a way that's more consistent with how they already behave and act. And if the work of building a data culture is hard, do we really need to make it any harder? This is Satyen Sangani, CEO and co-founder of Alation. Thank you Zhamak for proposing some truly radical thinking. And thank you for listening.

Producer: (37:12) This podcast is brought to you by Alation. Is your organization ready for its next compliance audit? Data governance can help you pass that audit while also supporting innovation, accelerating analytics, and mitigating risk. Read this evaluation of 12 data governance solutions at alation.com/DGQ3.
