Manage Checks¶
Alation Cloud Service Applies to Alation Cloud Service instances of Alation
Alation Data Quality provides a no-code interface for configuring data quality checks across multiple categories. All checks are executed as SQL queries on the data using the Connector’s Query Service and return a status of Pass, Fail, or Error.
A data quality check is a rule applied to a table or column that evaluates whether the data meets an expected condition and within the defined threshold value.
A threshold is the expected value or range of values that a data quality metric must satisfy during a check. If the observed data meets the threshold, the check passes; if it violates the threshold, the check fails.
Managing Checks in SDK Monitors¶
The process for defining data quality checks is identical for both Manual and SDK-Enabled monitors. You continue to author, configure, and govern all checks centrally within the Alation UI.
Regardless of the monitor type, the actual query execution follows a pushdown model, meaning the SQL always runs directly on your database (for example, Snowflake, Databricks). The difference lies in how these checks are triggered:
For Manual Monitors: Alation’s internal scheduler orchestrates the process, triggering the pushdown queries automatically based on the schedule you define in the UI.
For SDK-Enabled Monitors: Your external pipeline orchestrates the process. The SDK runs within your orchestration environment (such as Apache Airflow or GitHub Actions), where it fetches the check definitions from Alation and triggers the execution of the queries on your target database.
Comparison Operators¶
Operator |
Meaning |
|---|---|
= |
Equal to |
< |
Less than |
> |
Greater than |
<= |
Less than or equal to |
>= |
Greater than or equal to |
!=, <> |
Not equal to |
between |
Value is within a specified range |
not between |
Value is outside a specified range |
Each check is designed to detect specific types of data quality issues such as missing values, invalid formats, duplicate records, or outdated timestamps.
Result |
Meaning |
Contribution to Score |
|---|---|---|
Pass |
The data meets the check’s condition |
Positive |
Fail |
The data violates the check’s condition |
Negative |
Error |
The check failed to execute due to a syntax or runtime issue |
Negative |
The data quality checks are grouped into two main categories:
Table-level check: Includes numerical and custom SQL query check.
Column-level check: Includes numerical, uniqueness, completeness, validity, and custom (common table expressions and SQL query) checks.
Data Quality Check Types by Column Data Type¶
Check Category & Metric |
Numerical |
Text |
Time/Date |
Table Level |
|---|---|---|---|---|
ACCURACY CHECKS |
||||
Average ( |
Yes |
— |
— |
— |
Average Length ( |
— |
Yes |
— |
— |
Maximum ( |
Yes |
— |
— |
— |
Minimum ( |
Yes |
— |
— |
— |
Maximum Length ( |
— |
Yes |
— |
— |
Minimum Length ( |
— |
Yes |
— |
— |
Percentile ( |
Yes |
— |
— |
— |
Standard Deviation ( |
Yes |
— |
— |
— |
Standard Deviation Population ( |
Yes |
— |
— |
— |
Standard Deviation Sample ( |
Yes |
— |
— |
— |
Sum ( |
Yes |
— |
— |
— |
Variance ( |
Yes |
— |
Yes |
— |
Variance Population ( |
Yes |
— |
Yes |
— |
Variance Sample ( |
Yes |
— |
Yes |
— |
Row Count ( |
— |
— |
— |
Yes |
UNIQUENESS CHECKS |
||||
Duplicate Count ( |
Yes |
Yes |
Yes |
— |
Duplicate Percentage ( |
Yes |
Yes |
Yes |
— |
COMPLETENESS CHECKS |
||||
Missing Count ( |
Yes |
Yes |
Yes |
— |
Missing Percentage ( |
Yes |
Yes |
Yes |
— |
VALIDITY CHECKS |
||||
Invalid Count ( |
Yes |
Yes |
Yes |
— |
Invalid Percentage ( |
Yes |
Yes |
Yes |
— |
TIMELINESS CHECKS |
||||
Freshness ( |
— |
— |
Yes |
— |
CUSTOM CHECKS |
||||
Common Table Expression ( |
Yes |
Yes |
Yes |
— |
SQL Query ( |
Yes |
Yes |
Yes |
Yes |