As I mentioned in an earlier blog post, besides being a Snowflake Data Cloud enthusiast and one of the Snowflake ❄️ Data Superheroes 2023, I am also Director Data & AI at Pong. We provide Professional Services to help organizations solve data-related challenges. We believe that building data products with, for example, Snowflake can only succeed when the foundation is in place. For us, this foundation is achieved by applying Data Governance, which is all about people, processes, methods and technology.
At Snowflake Summit 2023, I was very curious to see which innovations were in the pipeline. Here is a short overview of the new Data Governance capabilities in Snowflake.
Know your Data
- Data Quality Monitoring (in Private Preview soon)
Poor Data Quality will cost an organization time, money and reputation. Data owners need a consistent, secure and cost-effective solution to proactively monitor Data Quality issues. Data Quality monitoring in Snowflake will enable organizations to define, measure and monitor Data Quality natively in the Data Cloud.
A Data Metric Function, a new schema-level object, lets you define your own custom Data Quality metrics. There will also be some ready-to-use, system-generated Data Quality metrics such as Freshness, Volume, Accuracy and Statistics.
Snowflake takes care of the measurement in the background, incrementally. There is no need to define tasks; you just apply a specific metric to a column. All measurements are automatically recorded in a centralized table for alerting, visualization and troubleshooting. Snowflake manages this process, but does not have access to the metrics data.
Data Quality metrics can be defined once and reused consistently across multiple objects. There is no management overhead: no tasks, stored procedures or jobs to manage. The Snowflake solution can be cost-effective because it evaluates only incremental data.
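To make the "define once, reuse, measure incrementally" idea a bit more tangible, here is a minimal sketch in plain Python. It is not the Snowflake API: `MetricMonitor`, `null_count` and the in-memory lists are hypothetical stand-ins that I made up for illustration. The sketch only shows a metric being defined once, applied to two columns, and evaluated on new rows only, with every result appended to one central log.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# A metric takes a batch of column values and returns one number.
Metric = Callable[[List[Optional[str]]], int]


def null_count(values: List[Optional[str]]) -> int:
    """Custom metric: how many values in the batch are missing."""
    return sum(1 for value in values if value is None)


@dataclass
class MetricMonitor:
    """Hypothetical monitor; stands in for the managed background process."""

    # Central results log: (column name, metric name, measured value).
    results: List[Tuple[str, str, int]] = field(default_factory=list)
    # Rows already measured per (column, metric), so each run sees new rows only.
    offsets: Dict[Tuple[str, str], int] = field(default_factory=dict)

    def measure(self, column_name: str, metric: Metric,
                rows: List[Optional[str]]) -> int:
        key = (column_name, metric.__name__)
        new_rows = rows[self.offsets.get(key, 0):]  # incremental slice
        self.offsets[key] = len(rows)
        value = metric(new_rows)
        self.results.append((column_name, metric.__name__, value))
        return value


monitor = MetricMonitor()
emails = ["a@example.com", None, "b@example.com"]
phones = [None, None]

monitor.measure("email", null_count, emails)  # first run: all three rows
monitor.measure("phone", null_count, phones)  # same metric, another column
emails.extend([None, "c@example.com"])
monitor.measure("email", null_count, emails)  # second run: two new rows only
```

The design choice worth noting is the per-column offset: because the monitor remembers how far it has read, a second measurement costs only as much as the rows added since the first one.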
The Snowflake Partner ecosystem is very important. The idea is that Snowflake will deliver building blocks for Data Quality Monitoring that the Snowflake Partner ecosystem can leverage and extend further.
Governance and Privacy UI experience in Snowsight
The next two developments improve the Snowsight UI experience.
- Data Governance UI (in GA soon)
A summary dashboard in Snowsight of tagged and protected assets, with workflows to take action.
- Classification UI (in Private Preview)
This provides an intuitive, interactive workflow to classify data at scale in a chosen schema: you accept or reject the semantic category tags that Snowflake recommends.
- UK, Australia, Canada-based PII Classification (in Private Preview)
Snowflake has expanded PII Classification (additional categories and grouping) to support UK-, Australia- and Canada-based data.
Advanced Data Access Audit
The following two developments provide granular details about object access and modifications, which are recorded for auditability.
- Tags & Policies History (in Public Preview)
‘Masking Policy assigned to Queried Data’ audits whether an accessed sensitive column in a table or a view had a masking policy at the time it was queried.
‘Tag and Policy modification history’ tracks modifications (who and when) to tag and policy associations: tag value, policy body and assignment.
- Table Schema Change History (in Public Preview)
Tracks new tables in sensitive schemas or new columns in monitored tables. This enables immediate action, e.g. triggering classification.
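The "detect a schema change, then act on it" pattern can be sketched as a simple comparison of two schema snapshots. This is plain Python, not a Snowflake feature or view: the function `new_columns` and the snapshot dictionaries are assumptions of mine, purely for illustration. In practice the output would be the list of columns you hand over to classification.

```python
from typing import Dict, List

# A snapshot maps a table name to its list of column names (hypothetical shape).
Snapshot = Dict[str, List[str]]


def new_columns(before: Snapshot, after: Snapshot) -> Dict[str, List[str]]:
    """Return the columns that exist in `after` but not in `before`.

    A brand-new table shows up with all of its columns.
    """
    changes: Dict[str, List[str]] = {}
    for table, columns in after.items():
        added = set(columns) - set(before.get(table, []))
        if added:
            changes[table] = sorted(added)
    return changes


before = {"customers": ["id", "name"]}
after = {"customers": ["id", "name", "email"], "orders": ["id", "total"]}

# Candidates for (re)classification: one new column and one new table.
to_classify = new_columns(before, after)
```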
Protect your Data
- Schema-wide Tag-based Masking (in Public Preview)
This is a continuation of Tag-based Masking, where you assign a masking policy to a tag and apply that tag to a column. With the new Schema-wide Tag-based Masking you can enable the same for all objects in a schema, including future ones.
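Why does setting the tag at schema level also cover future objects? A small Python sketch of the lookup logic may help. Everything here is hypothetical (the dictionaries, the `DATA_STEWARD` role, the function names) and it deliberately ignores details such as data types; it only illustrates that a column without a tag of its own falls back to the tags on its schema.

```python
from typing import Callable, Dict, FrozenSet, Set, Tuple

# Masking policy attached to a tag (tag name -> masking function).
TAG_POLICIES: Dict[str, Callable[[str], str]] = {
    "pii": lambda value: "***MASKED***",
}
# The tag is set once, on the schema.
SCHEMA_TAGS: Dict[str, Set[str]] = {"crm": {"pii"}}
# Tags set directly on individual columns (none needed in this sketch).
COLUMN_TAGS: Dict[Tuple[str, str, str], Set[str]] = {}


def effective_tags(schema: str, table: str, column: str) -> Set[str]:
    """Tags set on the column itself plus tags inherited from the schema."""
    direct = COLUMN_TAGS.get((schema, table, column), set())
    inherited = SCHEMA_TAGS.get(schema, set())
    return direct | inherited


def read_value(schema: str, table: str, column: str, value: str, role: str,
               unmasked_roles: FrozenSet[str] = frozenset({"DATA_STEWARD"})) -> str:
    """Return the value as the given role would see it."""
    if role in unmasked_roles:
        return value
    for tag in sorted(effective_tags(schema, table, column)):
        policy = TAG_POLICIES.get(tag)
        if policy is not None:
            return policy(value)
    return value


# A table created after the tag was set is still protected via the schema.
seen_by_analyst = read_value("crm", "new_table", "email", "a@example.com", "ANALYST")
seen_by_steward = read_value("crm", "new_table", "email", "a@example.com", "DATA_STEWARD")
```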
- Query Constraints (in Private Preview)
Query Constraint policies enable organizations to enhance their Data Privacy with policies that constrain the types of queries allowed on protected data. This is especially useful in Data Clean Rooms.
The Aggregation Constraint policy preserves individual privacy by allowing queries about groups, but not individuals.
The Projection Constraint policy lets you use a column in a WHERE clause or a join, while preventing that column from being exposed in the query result.
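The two constraint types described above can be illustrated with a few lines of plain Python. This is a conceptual sketch, not Snowflake syntax: the function names, the minimum group size of 3 and the choice to simply drop small groups are all assumptions on my side.

```python
from collections import defaultdict
from typing import Callable, Dict, FrozenSet, List, Optional

Row = Dict[str, object]


def average_by_group(rows: List[Row], group_key: str, value_key: str,
                     min_group_size: int = 3) -> Dict[object, float]:
    """Aggregation constraint: only groups of at least `min_group_size` rows
    are returned, so a result never describes a single individual."""
    groups: Dict[object, List[float]] = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(float(row[value_key]))
    return {
        group: sum(values) / len(values)
        for group, values in groups.items()
        if len(values) >= min_group_size
    }


def select(rows: List[Row], columns: List[str],
           where: Optional[Callable[[Row], bool]] = None,
           protected: FrozenSet[str] = frozenset()) -> List[Row]:
    """Projection constraint: protected columns may be used in `where`,
    but must not appear in the output columns."""
    blocked = sorted(set(columns) & set(protected))
    if blocked:
        raise PermissionError(f"columns cannot be projected: {blocked}")
    return [
        {column: row[column] for column in columns}
        for row in rows
        if where is None or where(row)
    ]


rows = [
    {"city": "Utrecht", "email": "a@example.com", "spend": 10},
    {"city": "Utrecht", "email": "b@example.com", "spend": 20},
    {"city": "Utrecht", "email": "c@example.com", "spend": 30},
    {"city": "Delft", "email": "d@example.com", "spend": 40},
]

# Only Utrecht has enough rows to be reported.
averages = average_by_group(rows, "city", "spend")

# Filtering on the protected column is fine; selecting it would raise.
cities = select(rows, ["city"],
                where=lambda row: row["email"] == "d@example.com",
                protected=frozenset({"email"}))
```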
- Data Access Policies – What’s new
Connect your Ecosystem
- Policies on Shared Objects (in Private Preview)
This allows a provider to create a policy on the provider side and, at the same time, make it relevant on the consumer side. Database Roles are necessary to make this happen.
Alation Data Governance Partner of the Year
Snowflake and Databricks do not always agree, but this time they were unanimous: both tech giants awarded Alation Data Governance Partner of the Year 2023! For Alation it is the third time in a row. As Alation and Pong are partners in the Netherlands, I thought I should mention that here.
Check the Snowflake Announcement here.
I have briefly covered the integration between Alation and Snowflake in a previous blog post. During Snowflake Summit 2023, Alation also announced new innovations.
- Alation Connected Sheets
This self-service capability enables users to pull up-to-date, governed and trusted Snowflake data directly into a spreadsheet (Microsoft Excel, Google Sheets) while working natively in that tool.
Find a demo on the Alation website.
- Alation Open Data Quality Framework
This makes Data Quality information from Snowflake visible in Alation via:
- Data Health Tab
- Trust Flags
- Data Profiling
- Data Quality Policies
Read more in this press release.
Till next time.