Configure Lineage

Applies to Alation Cloud Service instances of Alation

To configure lineage for dbt Cloud and dbt Core in Alation, follow these steps:

  1. Customize Lineage Extraction Scope (Optional)

  2. Fetch dbt projects/environments and Map Datasources

  3. Run Lineage Extraction

Customize Lineage Extraction Scope (Optional)

Applies from connector version 1.4.0

You may have dbt projects deployed across multiple dbt Cloud environments (such as development, staging, and production). Depending on your cataloging goals, it may not be necessary to extract lineage from all environments. For example, if your focus is on production data, you can configure the connector to limit extraction to production-tagged environments only. Development and staging environments often contain experimental, duplicate, or temporary data flows that can clutter lineage visualization and complicate catalog curation and asset management efforts.

Note

This option doesn’t apply to dbt Core.

To limit the lineage extraction scope to production-only environments:

  1. Open Lineage Settings in your dbt Gen2 ELT source.

  2. Go to the Lineage Configuration tab.

  3. Under the Customize lineage extraction scope section, turn on the Fetch only production tagged environments toggle.

When the toggle is on, the connector selects only the dbt Cloud environments that have a production tag, then fetches projects and extracts lineage from those environments alone.

Note

By default, this toggle is turned off, and lineage is extracted from all dbt Cloud environments.
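The effect of the toggle can be pictured as a simple filter over the list of environments. The sketch below is only an illustration of that behavior, not the connector's implementation: the `Environment` class, its `deployment_type` field, and the sample project and environment names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Environment:
    project: str
    name: str
    # Hypothetical field standing in for the environment's tag in dbt Cloud.
    deployment_type: Optional[str]


def select_environments(environments, production_only=False):
    """Return the environments that are in scope for lineage extraction."""
    if not production_only:
        # Toggle off (the default): every environment is in scope.
        return list(environments)
    # Toggle on: keep only environments tagged as production.
    return [env for env in environments if env.deployment_type == "production"]


environments = [
    Environment("analytics", "dev", None),
    Environment("analytics", "staging", "staging"),
    Environment("analytics", "prod", "production"),
]

print([env.name for env in select_environments(environments)])
# ['dev', 'staging', 'prod']
print([env.name for env in select_environments(environments, production_only=True)])
# ['prod']
```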

Fetch dbt Projects/Environments and Map Datasources

To fetch dbt projects/environments, perform these steps:

  1. Open Lineage Settings in your dbt Gen2 ELT source.

  2. Go to the Lineage Configuration tab.

  3. Under the Fetch dbt projects/environments and map datasources section, click Run. This fetches all Alation data sources that match projects, or environments within projects, present in dbt and maps them to the corresponding data sources.

The retrieved list of projects/environments appears in the Lineage extracted sources table.

../../../_images/dbt-lineage-production-tagged-env.png

If the host name of a data source isn’t configured in the dbt project, or the host name configured in Alation doesn’t match the one configured in the dbt project, Alation lists the affected objects as unmapped. Unmapped objects are displayed in the Unmapped objects alert table under the Lineage extracted sources table.

../../../_images/dbt-lineage-unmapped-objects.png

To generate lineage for unmapped objects:

  • Ensure that the host and port in the JDBC URIs match exactly between Alation and dbt.

  • Navigate to the General Settings tab of the data source that appears as unmapped. In the Additional data source connections field, enter the connection details for the sources required to establish lineage.

  • Refetch the projects. See Fetch dbt Projects/Environments and Map Datasources.
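Before refetching, it can help to compare the host and port on both sides outside of the UI. The following is a minimal sketch, assuming URIs of the form `scheme://host:port/...` with an optional `jdbc:` prefix. The helper names and sample URIs are hypothetical, and the comparison is deliberately strict (case-sensitive, and a missing port counts as a mismatch). How Alation itself normalizes these values is not covered here.

```python
from urllib.parse import urlsplit


def host_and_port(uri):
    """Return (host, port) as raw strings from a scheme://host:port/... URI."""
    if uri.startswith("jdbc:"):
        uri = uri[len("jdbc:"):]
    parts = urlsplit(uri)
    if not parts.netloc:
        raise ValueError(f"no host found in {uri!r}")
    # Drop any "user:password@" prefix, then split host from port.
    netloc = parts.netloc.rpartition("@")[2]
    host, _, port = netloc.partition(":")
    return host, port


def endpoints_match(alation_uri, dbt_uri):
    """True only if host and port are character-for-character identical."""
    return host_and_port(alation_uri) == host_and_port(dbt_uri)


# Hypothetical URIs for illustration only.
alation_uri = "mysql://db.example.com:3306/sales"
print(endpoints_match(alation_uri, "jdbc:mysql://db.example.com:3306/analytics"))  # True
print(endpoints_match(alation_uri, "mysql://DB.example.com:3306/sales"))           # False
print(endpoints_match(alation_uri, "mysql://db.example.com/sales"))                # False
```

URIs that don't follow the `scheme://host:port` layout (for example, some Oracle thin-driver formats or bracketed IPv6 hosts) are outside the scope of this sketch.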

Run Lineage Extraction

Lineage is extracted from the projects you fetched in the lineage configuration.

Lineage extraction fetches lineage metadata from dbt, including additional metadata such as Jinja code and SQL queries, and generates lineage between sources and targets such as RDBMS tables, views, and columns.

To run lineage extraction, perform these steps:

  1. Open Lineage Settings in your dbt Gen2 ELT source.

  2. Go to the Lineage Configuration tab.

  3. Click Run extraction.

You can schedule extraction depending on your requirements. See Schedule Extraction.

Schedule Extraction

To schedule the extraction, perform these steps:

  1. Under the Run extraction section, turn on the Enable extraction schedule toggle.

  2. Using the date and time widgets, select the recurrence period and day and time for the desired extraction schedule. The next lineage extraction job for your data source will run on the schedule you have specified.

Note

Here are some of the recommended schedules:

  • Schedule extraction to run every 12 hours at the 30th minute of the hour.

  • Schedule extraction to run every 2 days at 11:30 PM.

  • Schedule extraction to run every week on Sunday and Wednesday.

  • Schedule extraction to run every 3 months on the 15th day of the month.

View the Lineage Job History

You can view the status of lineage extraction jobs after running an extraction or when Alation runs the extraction on a scheduled basis.

To view the status of extraction, go to the Lineage Settings > Lineage Job History tab on the Settings page of your dbt Gen2 ELT source. The Extraction job status table displays the status of project extraction and the corresponding lineage extraction job.

Click the View Details link to view a detailed report of lineage extraction. If there are errors, the Job errors table displays the error category, error message, and a hint (ways to resolve the issue). Follow the instructions under the Hints column to resolve the error.

In some cases, a Generate Error Report link is displayed above the Job errors table. Click this link to generate an archive (.zip) containing CSV files for different error categories, such as Data and Connection errors. Then click Download Error Report to download the file.
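Once downloaded, the archive can be summarized with a short script to see which error categories hold the most entries. This is a sketch under stated assumptions: the file names inside the archive and the presence of one header row per CSV are hypothetical, since the report layout isn't described here. The demo builds a small in-memory archive rather than reading a real report.

```python
import csv
import io
import zipfile


def summarize_error_report(zip_source):
    """Return {csv_name: data_row_count} for every CSV file in the archive."""
    summary = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                rows = list(csv.reader(text))
            # Assumption: the first row of each CSV is a header.
            summary[name] = max(len(rows) - 1, 0)
    return summary


# Build a tiny stand-in archive; file names and columns are made up.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("data_errors.csv", "object,message\norders,missing column\n")
    archive.writestr("connection_errors.csv", "object,message\n")
buffer.seek(0)

print(summarize_error_report(buffer))
# {'data_errors.csv': 1, 'connection_errors.csv': 0}
```

To run it against a real download, pass the path of the .zip file instead of the in-memory buffer.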

Understand Lineage Extraction from dbt

Lineage extraction from dbt involves capturing the relationships between various data objects defined within dbt projects. Lineage is generated for data objects such as schemas, tables, views, and columns that are part of the dbt models present within a database. The lineage information includes details about how data flows from source tables to target tables through transformations defined in dbt models.

To view the lineage for the tables associated with a model, you can select the table from the Lineage tab on the model catalog page.

By default, the catalog page displays the lineage for the first table listed in the Source System information section.

Note

Objects that are related to the model but aren’t cataloged yet are listed without links in the Source System information section.

To learn more about viewing lineage in the Alation catalog, see Discover Lineage.

For information on how to configure lineage using the Alation user interface, see Configure Lineage.

The following walkthrough summarizes the steps to configure lineage extraction using the dbt Gen2 OCF connector and shows the end-to-end lineage in the Alation data catalog.

Configure dbt Gen2 Lineage

View Data Health Information from dbt

The connector pushes the MDE model execution and test run results to the associated data source tables. To view the health information for any table with an associated Data Health rule, navigate to the source table using the link in the Source System information section on the model catalog page and click Data Health.

Any failure in model execution is propagated to downstream objects in the lineage graph, making it easier to trace the failure back to its root cause.
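Conceptually, this propagation is a walk along the lineage edges starting at the failed object. The sketch below illustrates the idea with a breadth-first traversal over a small, made-up lineage graph. The object names and the `downstream_of` helper are hypothetical, and this is not how Alation computes Data Health internally.

```python
from collections import deque


def downstream_of(failed, edges):
    """Return every object reachable downstream of `failed`.

    `edges` maps each object to the list of objects that read from it directly.
    """
    affected = set()
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in affected:
                affected.add(child)
                queue.append(child)
    affected.discard(failed)  # the failed object itself is the root cause, not a downstream
    return affected


# Hypothetical lineage: source -> targets.
lineage = {
    "stg_orders": ["orders"],
    "orders": ["revenue_daily", "customer_ltv"],
    "revenue_daily": ["exec_dashboard"],
}

print(sorted(downstream_of("orders", lineage)))
# ['customer_ltv', 'exec_dashboard', 'revenue_daily']
```

Reading the result the other way around, any object flagged in this set points back to `orders` as the place to start investigating.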


To understand more about viewing data health within the Alation data catalog, see View Data Health.