NEW YORK, New York – September 27, 2016 – Alation, Inc., the data catalog company, today announced Q4 plans to release the Alation Data Catalog 4.0 with Alation Connect, a new connectivity layer that catalogs queries from popular compute engines, including Presto, SparkSQL and IBM Watson DataWorks. Alation Connect uses machine learning algorithms to automatically catalog queries executed through popular compute engines and track patterns in how joins, filters and query logic are used by analysts to interpret data. Alation Connect provides a more complete picture of analyst workflow, promoting the reuse of query logic and making it easier to enforce data governance policies.
“With the introduction of Alation Connect, we catalog queries alongside reports, dashboards and data. Most people access data through views, queries, reports and dashboards; so it’s critical for a data catalog to move beyond an inventory of only physical data assets like tables and files. Queries contain critical context about an analyst’s assumptions, calculations and methods. Cataloging those queries provides exponentially more knowledge than cataloging data alone. We can now provide more depth into the intent behind the analysis, promoting greater reproducibility, transparency, and productivity for how data is used within an organization,” said Satyen Sangani, CEO, Alation.
Alation Connect adds a common connectivity layer that extends across databases and Hadoop. The layer gives users a more complete view of how data is used within their organization, promoting greater reuse of queries and even of the formulas, joins and filters referenced within a query. With these assets, analysts can produce analytics faster, and data stewards can enforce policies more easily. Alation Connect enables Alation to consistently and automatically:
- Access the technical metadata that describes the different types of data stored
- Parse and normalize the query logs of popular compute engines and catalog analyst queries
- Extend existing Alation support for major databases and Hadoop query engines, including Hive, Impala, Tez and Teradata QueryGrid, with the addition of support for SparkSQL, Presto and IBM Watson DataWorks
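As a rough illustration of what parsing and normalizing query logs can involve, the Python sketch below tallies which tables are joined and which columns are filtered across a few log lines. It is not Alation's implementation and uses no machine learning. The tab-separated log format, the sample queries and the function names (`normalize`, `joined_tables`, `filter_columns`, `build_catalog`) are all assumptions made for the example, and the regular expressions handle only simple SQL.

```python
import re
from collections import Counter

# Hypothetical log format: "<engine>\t<user>\t<sql>". Real engine logs differ.
SAMPLE_LOG = [
    "presto\tana\tSELECT o.id FROM orders o JOIN customers c "
    "ON o.cust_id = c.id WHERE c.region = 'EU'",
    "sparksql\tben\tselect *  from orders o join customers c "
    "on o.cust_id = c.id where o.total > 100",
    "hive\tana\tSELECT count(*) FROM orders WHERE total > 100",
]


def normalize(sql):
    """Collapse whitespace and lowercase the query (literals included)."""
    return " ".join(sql.split()).lower()


def joined_tables(sql):
    """Return sorted (table, table) pairs for each JOIN on the first FROM table."""
    base = re.search(r"\bfrom\s+(\w+)", sql)
    if base is None:
        return []
    joins = re.findall(r"\bjoin\s+(\w+)", sql)
    return [tuple(sorted((base.group(1), other))) for other in joins]


def filter_columns(sql):
    """Return column names compared in the WHERE clause, without table aliases."""
    _, _, where = sql.partition(" where ")
    columns = re.findall(r"([a-z_][\w.]*)\s*(?:=|<>|!=|<=|>=|<|>)", where)
    return [column.split(".")[-1] for column in columns]


def build_catalog(lines):
    """Count join pairs and filter columns across all log lines."""
    catalog = {"joins": Counter(), "filters": Counter()}
    for line in lines:
        _engine, _user, sql = line.split("\t", 2)
        sql = normalize(sql)
        catalog["joins"].update(joined_tables(sql))
        catalog["filters"].update(filter_columns(sql))
    return catalog


catalog = build_catalog(SAMPLE_LOG)
print(catalog["joins"].most_common())
print(catalog["filters"].most_common())
```

Counting normalized join pairs and filter columns is one simple way to surface which query logic is reused most often. A production system would need a real SQL parser and engine-specific log readers.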
Alation Connect includes support for two query engines in high demand from analysts: Presto and SparkSQL. Presto is an open source SQL query engine that was first developed by Facebook. Analysts use Presto to query petabytes of data on demand. Spark is a critical open source component and the foundation of many modern data architectures. Within those architectures, SparkSQL has grown in popularity as a means for analysts to query structured data inside Spark programs, using either SQL or a familiar DataFrame API. Alation Connect automatically creates data catalog entries for queries executed through Presto and SparkSQL, as well as Hive, Impala, Tez and Teradata QueryGrid.
In addition to native support for query engines, the Alation Data Catalog 4.0 has deepened Alation’s support for the cloud. Alation is available to data professionals on IBM Watson DataWorks, the industry’s first cloud-based integrated data and analytics environment infused with cognitive capabilities. Alation provides users with a way to automatically generate a catalog of all data and queries. This automatically generated catalog captures key context around the data and how it is used, promoting collaboration and productivity and making it easier to enforce data governance policies.
“When we were designing IBM Watson DataWorks, we knew it would be important for organizations to collaborate effectively on data while also governing the data in a way that supports self-service data environments,” said Ritika Gunnar, vice president of offering management, IBM Analytics. “Alation provides a crucial repository for information on data and queries, helping to promote greater productivity and usage across teams.”
Find out why Gartner named Alation a Cool Vendor in Data Integration and Data Quality, 2016.