By Myles Suer
Published on July 28, 2022
The cloud is no longer synonymous with risk. There was a time when most CIOs would never consider putting their crown jewels — AKA customer data and associated analytics — into the cloud. But today, there is a magic quadrant for cloud databases and warehouses comprising more than 20 vendors. And adoption is so significant that many participants have earned notable market capitalization.
As enterprises migrate to the cloud, two key questions emerge: What’s driving this change? And what must organizations overcome to succeed at cloud data warehousing?
It is natural to assume the biggest drivers are time and money. It’s costly and time-consuming to manage on-premises data warehouses — and modern cloud data architectures can deliver business agility and innovation. However, CIOs declare that agility, innovation, security, adopting new capabilities, and time to value — never cost — are the top drivers for cloud data warehousing.
“Cloud data warehouses can provide a lot of upfront agility, especially with serverless databases,” says former CIO and author Isaac Sacolick. “There are tools to replicate and snapshot data, plus tools to scale and improve performance.” Yet the cloud, according to Sacolick, doesn’t come cheap. “A misconception is cloud data warehouses and lakes are cheap or don’t require IT ops support.”
Many see the cloud as the most secure option. Much to my surprise, CIO Paige Francis claims that in her organization, “the number one driver is security, given the wide range of secure data types. I am not interested in owning that risk internally.” As cloud security has become more reliable, cloud usage has grown well beyond early estimates, in part because the cloud can actually be more secure than an on-premises data center.
Data aggregation is another key benefit the cloud delivers. “Businesses we work with have so many different types of data on different systems and infrastructure, that the cloud makes sense as a single aggregation point,” shares industry analyst Dan Kirsch.
Those planning their migration to a cloud data warehouse would be wise to map out a strategy. What do you migrate, how, and when?
CIOs agree that organizations should avoid lift and shift migrations, as this approach often leads to fixing what is there, fixing it again, and finally getting it reengineered — and maybe still getting it wrong again. Analyst Dion Hinchcliffe succinctly summarizes the problem: “Lift and shift is usually the worst way to move anything to the cloud. It means you’re going to do one migration to get into the cloud. And then a second migration to get there right.”
Sacolick agrees, “Sadly, lift and shift often leads to more than two migrations.” Luckily, with the right planning, this migration can be done in one fell swoop. It’s critical IT leaders “define the problem, find the value, and architect a solution that meets the objectives,” he argues.
For this reason, CIOs recommend only migrating data with downstream business value. This means organizations need to develop a data warehouse plan early in the process, while establishing holistic, enterprise-level governance and management across both infrastructure and warehouse components.
Kirsch likewise argues that organizations should modernize their data before they move it: “To lift and shift then modernize is expensive and means you’ll be moving useless data. Clean out your closet before you move into a new house!”
Organizations are better off redesigning for the cloud and then focusing on building in small manageable chunks focused on business needs. For this redesign to succeed, it is critical to remember data governance becomes even more essential to understanding where your data is at all times.
Like any data migration, cloud data migration requires careful planning, design, and execution. But be warned: CIOs say IT leaders should not assume the cloud works like a data center. It is important to understand the unique responsibilities for a cloud data warehouse, and to include data governance. Indeed, data governance ensures data is labeled, giving migration leaders a clear view of what data is useful, usable, and popular, and therefore worth migrating at all. A common big issue, says Francis, is “bringing over the same garbage data or broken integrations. Where possible, IT teams should start as clean and fresh as they can.”
In other words, CIOs can’t just wing it and migrate the entire data landscape. They need a plan. “Incorrect scoping of the migration poses significant risk to the migration, especially around cost,” points out CIO Anthony McMahon.
Lift and shift perpetuates the same data problems, albeit in a new location. In many cases, businesses have tons of data, but the data can’t be trusted. If you don’t have a well-defined business problem, your analytics or data science project will be an expensive failure. Where the old data warehouse model was driven by feeding data into it, the new model is about providing a view across lots of data with a specific purpose.
Cloud data warehouses offer the potential to solve larger and more complex business data problems that could not be addressed via on-premises software and hardware. Cloud data should remove the infrastructure discussions and return attention to business, data, and outcomes.
With this said, Hinchcliffe summarizes the biggest cloud data migration risks as:
Source/target vendor lock-in
Consistent performance
Lower control
Data quality/wrangling
Regulatory issues
Cost monitoring/limiting
Ability to move out/costs of data egress
CIOs claim that understanding data relationships is vital, both for building the business side of the data warehouse and for understanding the resulting infrastructure requirements. Clearly, moving data isn’t free. Nor is architecting new solutions or changing how users access data.
“It’s key to understand data and data relationships, but so is data governance and data management,” says CIO Martin Davis. “Unless you understand all of these things, you will end up with issues and problems that will cause rework.”
Data discovery and relationship mapping are among the top ways to achieve high value from any kind of data warehouse. “You really need to understand the metadata and data definitions around different data sets,” Kirsch says. “Packaged analytics and data warehousing solutions are getting smarter, but just dumping into a cloud data warehouse will give you a swamp.”
And once data is in the cloud data warehouse, security, risk, and compliance are critical. This can bring to the foreground people, skills, and culture. In many cases, a culture change is also essential for success. A new data environment introduces new responsibilities. How will your team train and transition people to make your cloud data warehouse successful?
So how do you choose what to migrate? Migration leaders would be wise to use a clear policy to filter out data that should not be migrated. CIO Martin Davis stresses the importance of up-front policy planning: Decisions should be made “based on business need and data integrity requirements. If the business justifies it then it’s going in, if it is integral to the end results or to maintaining relationships within the data then you need it. But you must be tough!”
Hinchcliffe says it is important to define a policy with filters that remove:
Inaccurate data
Aged-out data
Data that violates privacy or regulatory compliance requirements
Unwieldy data badly out of scope
In this process, organizations should be guided by:
Data regulations and compliance
Cloud costs
Latency
Current business processes
Cloud provider technology
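As a rough illustration, a filter policy along the lines Hinchcliffe describes could be sketched in Python as below. This is only a sketch under stated assumptions: the Dataset fields, the three-year age limit, and the 5% error-rate threshold are hypothetical placeholders, not values from any of the CIOs quoted here. A real policy would set them with the business and would also weigh the guiding factors listed above, such as cloud costs and latency.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Dataset:
    """Hypothetical catalog entry describing one candidate dataset."""
    name: str
    last_updated: date
    error_rate: float           # share of records failing validation (assumed metric)
    has_compliance_issue: bool  # flagged by a privacy/regulatory review
    in_scope: bool              # tied to a defined business need


# Assumed thresholds for illustration only.
MAX_AGE = timedelta(days=365 * 3)
MAX_ERROR_RATE = 0.05


def rejection_reasons(ds: Dataset, today: date) -> list[str]:
    """Return every policy filter the dataset trips; empty means migrate."""
    reasons = []
    if ds.error_rate > MAX_ERROR_RATE:
        reasons.append("inaccurate")
    if today - ds.last_updated > MAX_AGE:
        reasons.append("aged out")
    if ds.has_compliance_issue:
        reasons.append("compliance violation")
    if not ds.in_scope:
        reasons.append("out of scope")
    return reasons


def plan_migration(datasets: list[Dataset], today: date):
    """Split datasets into those to migrate and those to skip, with reasons."""
    migrate, skip = [], {}
    for ds in datasets:
        reasons = rejection_reasons(ds, today)
        if reasons:
            skip[ds.name] = reasons
        else:
            migrate.append(ds.name)
    return migrate, skip
```

Recording the reasons for each skipped dataset, rather than silently dropping it, gives the business a list it can review and challenge before anything is left behind.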
Leaders would also be wise to envision migration, not as a mere move, but as a chance to re-architect a better data environment. According to Capgemini Chief Data Architect Steve Jones, “If we accept that data-driven business is the future, then there’s nothing left behind that has value. But that doesn’t mean you are migrating an existing data warehouse to the cloud, but rather building a new data landscape enabling the business to drive from data.”
“This is the fundamental question on cloud data warehouses,” he adds. “Are you building a better data warehouse, or is a cloud data warehouse one of the technologies you are using to surface data to the business from a collaborative data mesh? I’d argue if it’s not the latter, you might as well save your money.”
Cloud data warehousing has matured — and so has the market. However, launching your own cloud data warehouse requires a clearly defined business impact. Cost is no longer — if it ever has been — an adequate justification, and lift and shift is a losing strategy.
Just as important, IT leaders must make conscious decisions about which data to move. In many cases, this means ensuring the newly moved data is trustworthy and adds business value. And this needs to be done in a way that ensures data going forward is governed, protected, and supports compliance requirements. Check these boxes, and you’ll do more than widen access to enterprise data: you’ll launch a smarter foundation to support data-driven decision making across your entire organization.