Trusted AI Needs Trusted Data

By Salima Mangalji

Published on December 20, 2023


Generative AI has quickly gained widespread adoption, prompting teams to experiment with and implement AI initiatives to increase organizational productivity. What we are seeing today is an unprecedented shift: Generative AI is driving advancements across industries, boosting productivity, fueling organizational growth, and influencing business decisions.

A McKinsey study estimates that, with the implementation of Generative AI, 30 percent of hours worked today could be automated by 2030. Enterprise organizations recognize the value of AI and are looking to seize the opportunity. According to the IBM Institute for Business Value, 64% of CEOs say they face substantial pressure from investors, creditors, and lenders to accelerate the adoption of GenAI. Yet despite the seemingly straightforward path to adoption, concerns remain about implementing artificial intelligence, notably around AI governance, data privacy, and accuracy.

Challenges of the current AI workflow

Digging into the data from IDC, we can see where those challenges lie. Highlighted below are the data-related aspects of the most significant challenges to maximizing the value of organizational AI/ML initiatives.

Screenshot of IDC’s slide deck, titled, “Data, data intelligence, and tooling are at the top of the list of AI/ML challenges.”
Quote image from the Vice President of IDC’s Data Integration and Intelligence Software Service, Stewart Bond

*Alation webcast, "From Data Culture Maturity Model to Business Value," with guest analyst speaker Stewart Bond, Vice President of IDC's Data Integration and Intelligence Software Service

The chart highlights several main challenges: data availability, data quality, lack of trust in data, lack of cross-team collaboration, GDPR/compliance issues, and even the lack of a data culture. There is growing concern about the data used to train LLMs, as well as about ensuring that the resulting AI models can be trusted.

At Alation, we see three main buckets of challenges in the current AI workflow:

  • FIND

  • TRUST

  • GOVERN

Icon images displaying, FIND, TRUST, and GOVERN as the AI workflow

Data intelligence as a solution

In essence, the effectiveness of AI — and BI — hinges on the quality of their inputs. The adage “garbage in, garbage out” applies here: Only high-quality data inputs for these technologies will yield meaningful and reliable results.

If you feed your models a foundation of bad data, you get bad model decision-making and, ultimately, negative outcomes for the organization: missed opportunities, wasted money, and even lawsuits.

Diagram showcasing AI & BI are only as good as their inputs

Organizations need to start with a foundation of quality and governed data. Data intelligence is a critical aspect of this process, allowing users to gain context on the data that is available.

Good data leads to smart model decision-making and, ultimately, innovation and growth for the organization. Transparency throughout the model lifecycle makes that outcome far more likely.

Diagram showcasing Good Data > Good AI & BI > Good Outcomes

So that leads to the question: How do we ensure trusted data is used for AI?

When building out an AI model, starting with a data intelligence platform is essential. Organizations want to get up and running with AI, but often, their enterprise data is simply not ready. Poor-quality data must be addressed before AI initiatives begin.
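As a generic illustration of what "addressing bad-quality data first" can mean in practice, here is a minimal pre-training data-quality check. This sketch is not Alation-specific; the function name, thresholds, and checks are hypothetical examples of common gates applied before a dataset feeds a model:

```python
import pandas as pd

def validate_training_data(df: pd.DataFrame, required_columns: list,
                           max_null_fraction: float = 0.05) -> list:
    """Return a list of data-quality issues found before model training."""
    issues = []
    # Required columns must be present before any training run.
    for col in required_columns:
        if col not in df.columns:
            issues.append(f"missing column: {col}")
    # Excessive nulls are a common sign of poor-quality data.
    for col in df.columns:
        null_frac = df[col].isna().mean()
        if null_frac > max_null_fraction:
            issues.append(f"{col}: {null_frac:.0%} null values")
    # Exact duplicate rows inflate apparent data volume and bias the model.
    dupes = int(df.duplicated().sum())
    if dupes:
        issues.append(f"{dupes} duplicate rows")
    return issues
```

An empty result would mean the dataset passed these basic gates; any issue found is a reason to pause the AI initiative and fix the data first.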

Alation is a foundational solution for enabling AI initiatives by helping you find the trusted data sets you need, trust that your model is being trained on accurate data, and actively govern your datasets and AI models.

Find governed assets

When it comes to data ingestion, it is critical to understand where to find the correct datasets needed for specific AI models. Search and discovery are at the heart of Alation’s Data Intelligence Platform. Alation acts as a single source of reference for data teams to find datasets and relevant metadata needed for models.

Our Behavioral Analysis Engine and recently announced Intelligent Search democratize this process by allowing cross-functional teams to self-serve the critical data they need, enabling a stronger data culture throughout the organization.

With the growing amount of data needed for AI initiatives, data curation is more important than ever. These initiatives come with the same problem we’ve always had: There is just too much data, and not enough time or stewards to curate it.

We recently introduced ALLIE AI and, as part of it, our new Intelligent Curation features. Teams can auto-generate descriptions with GenAI based on metadata and find suggested stewards to curate data quickly and efficiently. These features will automatically understand what data objects represent and deliver contextual information about the data.

Trust accurate AI model training

Alation enables trust by ensuring teams use the right data for AI models. Teams can start with a foundation of clean and relevant datasets to ensure the model learns from reliable information. Users can also monitor data quality insights using the Alation Data Quality Health tab and use Data Quality Flags to flag deprecated or noncompliant datasets.

Collaboration is a huge part of trust — users can record metadata related to transformations and versioning information. They can also collaborate with cross-functional team members through Alation’s Conversations feature and share data objects with Alation Anywhere on Slack and Microsoft Teams.
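To make the idea of recording transformation and versioning metadata concrete, here is a generic, platform-agnostic sketch. The schema and field names are hypothetical illustrations, not Alation's data model; the point is that each derived dataset carries a record of how it was produced and what it was derived from:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional
import hashlib

@dataclass
class DatasetVersion:
    """A minimal lineage record for one version of a dataset (hypothetical schema)."""
    name: str
    transformation: str            # e.g. the SQL or pipeline step applied
    source_version: Optional[str]  # hash of the version this one was derived from
    content_hash: str
    created_at: str

def record_version(name, transformation, raw_bytes, source_version=None):
    # Hash the dataset contents so any change yields a new, comparable version id.
    digest = hashlib.sha256(raw_bytes).hexdigest()[:12]
    return DatasetVersion(
        name=name,
        transformation=transformation,
        source_version=source_version,
        content_hash=digest,
        created_at=datetime.now(timezone.utc).isoformat(),
    )
```

A teammate reviewing a model can then trace any training table back through its recorded transformations, which is the collaboration-and-trust loop the paragraph above describes.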

Govern your datasets and AI models

With the growing AI regulatory landscape, governance and compliance need to be a focus. Alation helps ensure active data governance for your enterprise datasets and AI models. When it comes to datasets, we help you curate your data in the catalog and show the lineage between source and transformed data.

Today, you can catalog your AI model in Alation as a data table. From there, you can see trust flags, descriptions, the upstream lineage to the data sources, and the downstream lineage to where the model is being used. On top of that, you can view your custom business fields and metadata, which provide valuable, contextual information about your data.

Next year, Alation will introduce AI asset types, allowing users to catalog journals and AI models directly within Alation — the next step in our AI governance journey. To learn more about how Alation can help catalog AI models, take a look at our AI solutions or watch our ‘Trusted Data for Trusted AI - Seizing the AI Opportunity’ webinar.
