What’s the data environment like at Square?
Ryan Mason (Data Scientist): Square uses MySQL, Hive, and Vertica. We work with a variety of data sources, each with a separate interface.
What are your current data initiatives and projects?
Ryan: It’s really all about making our analysts productive and enabling them to perform analysis as easily as possible.
Our analysts query across multiple databases, so we’re always looking to make analysis faster. Switching contexts and interfaces can slow work down, so anything that eases that friction is great.
Collaboration between analysts is really important for us here at Square. We encourage sharing work such as best-practice methods, great queries, recommended filters, and efficient JOINs so that everyone can learn from the more experienced analysts.
Documenting data and keeping that documentation up to date is a major initiative. We want to give analysts full information and promote knowledge sharing around data, so they don’t have to send as many emails asking about certain data sets or spend as much time trying to find experts on specific items.
Square + Alation
At Square, Alation has been able to:
- provide analysts a consistent interface and query tool across all data sources, with power query features such as SmartSuggest
- enable query and result search, sharing, and discovery, with all results updated in real time
- deliver a full documentation engine that analysts and data scientists rely on for up-to-date knowledge on the data across all data sources
How has Alation helped Square continue to be data-driven?
Ryan: Alation has really helped boost productivity and the analyst experience here at Square. We use Alation as a data dictionary, a business glossary, and an analyst tool for day-to-day work.
I’m using Alation to document all our data sources. I love profiling, automated metadata extraction, and the ability to get usage stats from the query log.
Ryan Mason, Data Scientist