This blog shares takeaways from the HIMSS whitepaper, Building an AI-Ready Healthcare Organization with Data Intelligence to Drive Clinical Success. Primary voices include Beth Senay (Director of Data Trust, Children’s Hospital of Philadelphia) and Abdul Tariq (Associate Vice President of Data Science, CHOP).
Most healthcare leaders agree that data quality matters for AI. But “quality” in a clinical analytics context—where we scrub anomalies and remove outliers—isn’t the same as AI readiness. For AI, “clean” can actually mean incomplete. If models never see the messy reality of care delivery—edge cases, rare events, transcription quirks, emerging patterns—they learn a simplified world and stumble at the bedside.
Gartner puts it plainly: AI-ready data must be representative of the use case, inclusive of “every pattern, errors, outliers and unexpected emergence” the model will encounter in the wild. For this reason, AI-ready data isn’t a one-time hygiene project; it’s an ongoing practice of aligning, qualifying, and governing data for specific AI uses.
Below, we break down what “real, not just clean” looks like in practice—drawing on the HIMSS whitepaper and a joint interview with CHOP’s Beth Senay and Abdul Tariq—connecting it all back to Gartner’s guidance on AI-ready data for data & analytics leaders.
Traditional business intelligence favors tidy datasets. Analysts de-duplicate, impute, standardize, and trim outliers to produce metrics humans can trust and act on. But in model training and evaluation, those “imperfections” are often the signal:
Outliers may represent rare yet critical clinical states.
Errors and anomalies can encode operational realities (e.g., documentation lags or device quirks) that models must learn to handle.
Emergent patterns (new codes, novel care pathways, seasonality shifts) are exactly what AI must recognize early.
Gartner’s guidance reinforces that AI readiness is contextual and iterative: the only way to prove readiness is to align data with a specific use case, qualify that data against confidence requirements, and demonstrate appropriate governance over time.
CHOP’s leaders have operationalized this shift—starting with a cultural and architectural foundation that treats data as an institutional asset and places stewardship close to the work.
CHOP’s branded data catalog (“Gene”) centralizes definitions, lineage, quality signals, and governance implications so clinicians, researchers, and operations can find and trust the same assets. “Working from one centralized catalog… builds confidence and accelerates insights,” says Beth Senay, Director of Data Governance and Literacy. When everyone can see certified sources, usage notes, and lineage, confusion drops—and trust rises.
Rather than a “compliance gate,” CHOP’s Data Trust Office positions governance as an enabler. Senay describes the stance as patient-centered custodianship: the hospital is the steward of data owned by patients.
That ethos is reflected in practical controls, including data classification, least-privilege role-based access to the Helix data warehouse, automated and retrospective quality reviews, and audit trails. Such controls enable teams to innovate responsibly without compromising privacy.
As the Associate VP of Data Science at CHOP, Abdul Tariq has outlined a repeatable method for deploying AI models that begins with workflow first:
Define the insertion point. Where exactly in the clinical or operational workflow will the model act? If an algorithm could exacerbate inequities (e.g., using a “missed appointment” prediction to overbook), redesign the use to prevent harm.
Quantify subgroup performance. Evaluate bias thoroughly across cohorts; the model may perform better or worse for some groups than it does on average, and both directions matter.
Mitigate via math and workflow. Use statistical methods (e.g., oversampling) where feasible; otherwise, constrain or adjust deployment (e.g., ignore outputs for certain subpopulations or reserve human-only paths) until performance is acceptable. A minimal sketch of subgroup checks and a simple oversampling mitigation follows this list.
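To make the second and third steps concrete, here is a minimal sketch assuming a scored validation set in a pandas DataFrame with illustrative cohort, label, and score columns. The column names and the naive upsampling approach are our assumptions for illustration, not CHOP's implementation.

```python
import pandas as pd
from sklearn.metrics import roc_auc_score
from sklearn.utils import resample

def subgroup_auc(scored: pd.DataFrame, group_col: str = "cohort") -> pd.Series:
    """AUC per cohort, so better- and worse-than-average groups both surface."""
    return scored.groupby(group_col).apply(
        lambda g: roc_auc_score(g["label"], g["score"])
        if g["label"].nunique() > 1 else float("nan")
    )

def oversample_small_cohorts(train: pd.DataFrame, group_col: str = "cohort") -> pd.DataFrame:
    """Naive mitigation: upsample smaller cohorts to the size of the largest
    one before retraining; workflow constraints remain the fallback if this
    is not enough."""
    target = train[group_col].value_counts().max()
    upsampled = [
        resample(g, replace=True, n_samples=target, random_state=0)
        for _, g in train.groupby(group_col)
    ]
    return pd.concat(upsampled, ignore_index=True)
```

In practice, the per-cohort table feeds the "quantify" step, and any retraining on the rebalanced frame (or any deployment constraint) would be re-checked with the same subgroup evaluation before go-live.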
AI-ready data isn’t just statistically sound; it’s operationally relevant. As Tariq puts it, the biggest challenge is not prediction accuracy—it’s proving how and why a solution improves the current state in practice.
HIMSS readers will recognize the industry trend from data assets to data products—curated, reusable datasets aligned to specific outcomes (e.g., feature libraries, trusted data layers). At CHOP and peer institutions, the move toward certified, reusable datasets:
Speeds up experimentation by giving data scientists a vetted starting point
Reduces technical debt and “one-off” pipelines
Bakes governance signals (lineage, sensitivity, policies) into the discovery experience
Gartner’s view echoes this: readiness hinges on metadata-rich alignment, qualification, and governance—including versioning, observability, and continuous regression testing—as models and data evolve.
Use this quick rubric—derived from CHOP’s approach and Gartner’s framework—to pressure-test whether a dataset is truly AI-ready for a specific clinical or operational use case:
1) Alignment to the use case
Have you specified the decision this model supports and where it fits in the workflow (ordering, triage, discharge, RevCycle action)?
Does your data include real-world variability you expect at the point of use (device drift, documentation gaps, population diversity, edge cases)?
Is the dataset matched to the technique (e.g., RAG for unstructured notes vs. time-series for vitals)?
2) Qualification for confidence
Have you validated performance by subgroup and across sites/units?
Do you track data and model versions, with the ability to revert? Are observability metrics (freshness, drift, latency, cost) monitored in dev and in production? (A sketch of freshness and drift checks follows this rubric.)
3) Contextual governance
Are policies and stewardship roles explicit for this use (PHI handling, model access, handoffs)?
Can users see lineage from source to feature to model output inside the catalog?
Have you documented intended use, contraindications, and fairness constraints, and designed the workflow to prevent harm if bias persists?
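For the qualification item above, a minimal sketch of two of the observability signals it names, freshness and drift, could look like the following. Timestamp handling, column names, and the PSI binning are illustrative assumptions rather than guidance from the whitepaper.

```python
import numpy as np
import pandas as pd

def freshness_hours(df: pd.DataFrame, ts_col: str = "event_ts") -> float:
    """Hours since the newest record landed (assumes timezone-naive timestamps)."""
    latest = pd.to_datetime(df[ts_col]).max()
    return (pd.Timestamp("now") - latest).total_seconds() / 3600.0

def population_stability_index(reference: pd.Series, current: pd.Series, bins: int = 10) -> float:
    """Population Stability Index between a reference feature distribution and
    the same feature in production; values above roughly 0.2 are a common drift flag."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_pct = np.histogram(reference, bins=edges)[0] / max(len(reference), 1) + 1e-6
    cur_pct = np.histogram(current, bins=edges)[0] / max(len(current), 1) + 1e-6
    return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))
```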
Tariq cautions against the “graveyard of prediction models” that never see daylight. Even when a model outperforms clinicians in a retrospective study, it can still fail in practice if it doesn’t change decisions. His analogy is memorable: “If you don’t have an umbrella, it doesn’t matter that you know it’s going to rain—you’ll still get wet. Knowing the storm is coming isn’t the same as being prepared for it.” CHOP’s response is to pilot in small cohorts, measure outcomes in “silent mode,” validate definitions and performance, then scale deliberately.
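A minimal sketch of what "silent mode" scoring could look like, assuming a scikit-learn style classifier and an illustrative encounter_id key; none of these table or column names come from CHOP's systems.

```python
import pandas as pd

def run_silent_pilot(encounters: pd.DataFrame, model,
                     log_path: str = "silent_pilot.parquet") -> pd.DataFrame:
    """Score a small pilot cohort and log predictions without surfacing any alerts."""
    scored = encounters.copy()
    scored["risk_score"] = model.predict_proba(encounters[model.feature_names_in_])[:, 1]
    scored["surfaced_to_clinician"] = False  # silent mode: clinicians never see the score
    scored[["encounter_id", "risk_score", "surfaced_to_clinician"]].to_parquet(log_path, index=False)
    return scored

def evaluate_silent_pilot(logged: pd.DataFrame, outcomes: pd.DataFrame) -> pd.DataFrame:
    """Once outcomes are known, join them back to the logged scores for retrospective review."""
    return logged.merge(outcomes, on="encounter_id", how="inner")
```

Only after this retrospective comparison validates definitions and performance would a score be surfaced to users, and then only for the pilot cohort before scaling deliberately.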
This is where a catalog like Gene pays off again: frontline users don’t just see a score—they see the definition behind the metric, where the data came from, how current it is, and what governance applies. That transparency delivers the human trust necessary to act on AI outputs.
If you’re framing your AI roadmap for 2026, start with the data. The HIMSS whitepaper, Building an AI-Ready Healthcare Organization with Data Intelligence to Drive Clinical Success, details CHOP’s journey—from patient-centered custodianship and role-based access in Helix, to Gene as a single front door for trusted data, to the three-step method for bias mitigation and real-world utility. It’s a practical playbook for moving from “clean” to real—and from pilots to practice.
Download the whitepaper to get the full story.