Data Quality Score¶
The Data Quality Score is a key indicator that helps data stewards, engineers, and analysts prioritize remediation efforts and build trust in governed data assets. It reflects the overall quality of monitored data across your catalog. It provides a quick overview of how many quality checks are passing successfully and is updated daily or weekly as scheduled to reflect the most up-to-date quality status.
A higher score indicates better data quality and fewer issues detected, while a lower score highlights potential data reliability concerns that may require attention.
The Data Quality Score is designed to help users quickly assess whether data meets expectations across multiple quality dimensions like completeness, uniqueness, validity, and freshness.
The quality score appears on the following pages:
Catalog pages
Data Quality dashboards
Alation Chrome Extension
The Data Quality Score, also referred to as the health score, is the number of active checks that passed divided by the total number of enabled active checks. The calculation follows these rules:
Only executed checks are counted, meaning checks with a result of pass or fail.
All checks contribute equally.
Score Calculation¶
Health Score = (Number of Passed Active Checks / Total Number of Enabled Active Checks) x 100
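The formula above can be sketched in a few lines of Python. This is a minimal illustration, not Alation's implementation: the function name `health_score`, the lowercase outcome strings, and the choice to return `None` when no check has executed are all assumptions made for the example.

```python
from typing import Iterable, Optional


def health_score(results: Iterable[str]) -> Optional[float]:
    """Return the percentage of executed checks that passed.

    `results` holds one outcome string per enabled active check.
    Only "pass" and "fail" outcomes count as executed; any other
    outcome (hypothetical labels such as "error") is ignored.
    """
    executed = [r.lower() for r in results if r.lower() in ("pass", "fail")]
    if not executed:
        # Assumption: with no executed checks there is no score to report.
        return None
    passed = sum(1 for r in executed if r == "pass")
    # Every check carries the same weight, so the score is a plain ratio.
    return 100 * passed / len(executed)


print(health_score(["pass", "pass", "fail", "pass"]))  # 75.0
```

Because every check contributes equally, a single failing check lowers the score by the same amount regardless of which table or quality dimension it belongs to.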
Score Interpretation Guidelines¶
90-100%: Excellent data quality with minimal issues (Green)
70-89%: Good quality with some attention required (Yellow)
Below 70%: Poor quality requiring immediate attention (Red)
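The bands above could be applied programmatically along these lines. The function name `score_band` is hypothetical, and treating fractional scores between 89 and 90 as Yellow is an assumption, since the guidelines list whole-number ranges only.

```python
def score_band(score: float) -> str:
    """Map a score from 0 to 100 onto the color bands in the guidelines."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "Green"   # Excellent data quality with minimal issues
    if score >= 70:
        return "Yellow"  # Good quality with some attention required
    return "Red"         # Poor quality requiring immediate attention


print(score_band(80))  # Yellow
```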
Example¶
Consider tables customer_profiles and orders with 5 data quality checks:
Table | Check | Result |
---|---|---|
customer_profiles | missing_percent(email) < 2% | Fail |
customer_profiles | duplicate_count(customer_id) = 0 | Pass |
customer_profiles | freshness(updated_at) < 1d | Pass |
orders | row_count > 1000 | Pass |
orders | invalid_count(phone) = 0 | Pass |
Data Quality Score = (4 / 5) x 100 = 80%
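As a rough check of the arithmetic, the example results can be reproduced in Python. The tuple layout and variable names are illustrative assumptions; only the table names, check expressions, and results come from the example.

```python
# Each entry: (table, check expression, result), copied from the example.
checks = [
    ("customer_profiles", "missing_percent(email) < 2%", "Fail"),
    ("customer_profiles", "duplicate_count(customer_id) = 0", "Pass"),
    ("customer_profiles", "freshness(updated_at) < 1d", "Pass"),
    ("orders", "row_count > 1000", "Pass"),
    ("orders", "invalid_count(phone) = 0", "Pass"),
]

passed = [c for c in checks if c[2] == "Pass"]
failed = [(table, check) for table, check, result in checks if result == "Fail"]

# 4 of the 5 checks passed, so the score is 400 / 5 = 80.
score = 100 * len(passed) / len(checks)
print(f"Data Quality Score: {score:.0f}%")  # Data Quality Score: 80%

# List the failing checks, which are the candidates for remediation.
for table, check in failed:
    print(f"Needs attention: {table}: {check}")
```

Listing the failing checks next to the score shows which check is responsible for the drop, here the `missing_percent(email)` check on customer_profiles.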
To determine action, consider the following:
Warning or Medium Score (70-89%): Some checks have failed
Failing or Low Score (<70%): Significant issues are detected in the monitored data
0% Score: All checks have failed and an immediate review is required