Control access to metadata and data
You can customize access for users through several mechanisms.
User roles
The most general mechanism is a user role. Roles define the very broad permissions a user has in Atlan: for example, whether they can administer other users or only discover metadata. When it comes to which metadata and data a user can access, though, you need to use the additional mechanisms below.
Connection admins
Connection admins are users who manage connectivity to a data source. By default, these users can:
- Read and write all metadata on assets from that connection.
- Preview and query the data in all data assets from that connection.
- Manage access policies to grant others access to the assets from that connection.
You define the connection admin when crawling a new data source for the first time. A connection admin can also extend the list of connection admins on their connection at any time.
Access policies
A user must be both an admin user and a connection admin to define access policies for the connection's assets.
Access policies either enable or restrict access to certain assets. They let you be much more creative (and granular) about access than the all-or-nothing privileges of connection admins.
You start by defining which assets to control with each policy. There are two complementary mechanisms to do this in Atlan: personas and purposes.
Once you have defined the subset of assets, you can then define granular access to both metadata and data:
Metadata policies
Metadata policies control what users can do with the assets' metadata. Through them, you can control who can:
- Read: view an asset's activity log, custom metadata, and SQL queries
- Update: change asset metadata, including description, certification, owners, README, and resources
- Update Custom Metadata Values for the assets
- Add Tags to the assets
- Remove Tags from the assets
- Add Terms to the assets
- Remove Terms from the assets
- Create: create new assets within the selected connection (via API)
- Delete: delete assets within the selected connection (via API)
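To make the shape of a metadata policy more concrete, here is a minimal sketch in Python. It is not Atlan's API: the class names, the action names, and the idea of selecting assets by a qualified-name prefix are all assumptions made for illustration. It only shows that a policy pairs a set of assets with a set of the permissions listed above, and either allows or restricts them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class MetadataAction(Enum):
    """Hypothetical names mirroring the permissions listed above."""
    READ = "read"
    UPDATE = "update"
    UPDATE_CUSTOM_METADATA = "update-custom-metadata"
    ADD_TAGS = "add-tags"
    REMOVE_TAGS = "remove-tags"
    ADD_TERMS = "add-terms"
    REMOVE_TERMS = "remove-terms"
    CREATE = "create"
    DELETE = "delete"


@dataclass(frozen=True)
class MetadataPolicy:
    name: str
    asset_prefix: str                     # assumption: assets are selected by name prefix
    actions: FrozenSet[MetadataAction]    # which permissions the policy talks about
    allow: bool = True                    # False models an explicit restriction

    def covers(self, action: MetadataAction, asset: str) -> bool:
        """Return True if this policy applies to `action` on `asset`."""
        return action in self.actions and asset.startswith(self.asset_prefix)


# Made-up policy and asset names, for illustration only.
stewards = MetadataPolicy(
    name="Stewards can curate Snowflake",
    asset_prefix="snowflake/",
    actions=frozenset(
        {MetadataAction.READ, MetadataAction.UPDATE, MetadataAction.ADD_TAGS}
    ),
)

print(stewards.covers(MetadataAction.ADD_TAGS, "snowflake/db/schema/table"))
print(stewards.covers(MetadataAction.DELETE, "snowflake/db/schema/table"))
```

A policy that does not cover an action says nothing about it, which is why the second check is false: the permission was never granted.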
Data policies
Data policies control what users can do with the assets' data. Through them, you can control who can:
- Query and preview the data within the assets
- Whether to hide any data, through various masking techniques:
  - Show first 4: replaces all the data with X except the first 4 characters of data. For example, 1234 5678 9012 3456 becomes 1234XXXX.
  - Show last 4: replaces all the data with X except the last 4 characters of data. For example, 1234 5678 9012 3456 becomes XXXX3456.
  - Hash: replaces the data with a consistent hashed value. Because the hash is consistent you can still join on it across assets. For example, 1234 5678 9012 3456 becomes f43jknscakc12nk21ak.
  - Nullify: replaces the data with the null value. For example, 1234 5678 9012 3456 becomes null.
  - Redact: replaces all alphabetic data with x and all numeric data with 0. For example, 1234 Street Name becomes 0000 Xxxxxx Xxxx.
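The masking techniques above can be illustrated with a short Python sketch. This is not how Atlan implements masking. The function names are hypothetical, the fixed XXXX padding and the preserved letter case simply mirror the examples in the list, and SHA-256 is an assumed stand-in for whatever hash is actually used, so its output will not match the sample hash shown above.

```python
import hashlib
from typing import Optional


def show_first_4(value: str) -> str:
    # Keep the first 4 characters; the fixed "XXXX" suffix mirrors the example above.
    return value[:4] + "XXXX"


def show_last_4(value: str) -> str:
    # Keep the last 4 characters; the fixed "XXXX" prefix mirrors the example above.
    return "XXXX" + value[-4:]


def hash_value(value: str) -> str:
    # Assumed algorithm (SHA-256). The same input always gives the same output,
    # which is what keeps joins on the masked column possible.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def nullify(value: str) -> Optional[str]:
    # Every value is replaced by null (None in Python).
    return None


def redact(value: str) -> str:
    # Digits become 0, letters become x (case kept, as in the example above),
    # and everything else, such as spaces, is left alone.
    masked = []
    for ch in value:
        if ch.isdigit():
            masked.append("0")
        elif ch.isalpha():
            masked.append("X" if ch.isupper() else "x")
        else:
            masked.append(ch)
    return "".join(masked)


card = "1234 5678 9012 3456"
print(show_first_4(card))          # 1234XXXX
print(show_last_4(card))           # XXXX3456
print(redact("1234 Street Name"))  # 0000 Xxxxxx Xxxx
print(hash_value(card) == hash_value(card))  # True
```

Only the hash keeps a one-to-one relationship between original and masked values, which is why it is the technique to choose when masked columns still need to be joined.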
Glossary policies
Glossary policies control what users can do with glossary metadata—terms and categories. Through them, you can control who can do the following within each glossary:
- Read permission on terms, categories, and glossaries exists by default and can't be modified. Glossary policies don't restrict users from viewing any glossary and its contents within the Glossary section.
- Create terms and categories inside the glossary
- Update descriptions, certification, owners, READMEs, and resources for the glossary, terms and categories
- Link terms in the glossary with all other assets
- Delete terms and categories inside the glossary
- Add tags to the terms
- Remove tags from the terms
- Update custom metadata values for the terms and categories inside the glossary
Glossary policies can only be defined through personas.
Domain policies
Domain policies are only available if your workspace has the data products module enabled. If your team isn't using data domains, this section doesn't apply to you.
Domain policies govern access to the domain itself: its metadata, structure, subdomains, and products. They don't control access to the individual assets (tables, columns, dashboards, etc.) that live inside the domain. To control access to assets, use metadata policies or data policies.
Through them, you can control who can:
- Read: view metadata, resources, and READMEs for data domains
- Update Domains: update metadata, resources, and READMEs for data domains
- Create Sub-domains: create new data subdomains within a domain
- Update Sub-domains: update metadata, resources, and READMEs for data subdomains
- Create Products: create new data products within a domain
- Update Products: update metadata, resources, and READMEs for data products
- Delete Products: delete data products within a domain
- Update Custom Metadata For Domains: update custom metadata for data domains
- Update Custom Metadata For Sub-Domains: update custom metadata for data subdomains
- Update Custom Metadata For Products: update custom metadata for data products
Domain policies can only be defined through personas. For a full walkthrough, see Create domain policies.
Interactions
All these mechanisms can coexist. This is powerful, but it can also be a bit overwhelming to think about. What takes priority when a user is under the control of all these mechanisms?
It's actually not as bad as you might think. There are only these three rules:
Access is denied by default (implicitly)
By default, users won't have the permissions listed here. This remains true until you explicitly grant a user a permission.
For example, imagine you haven't set up any access policies and a new user joins.
- They won't have any of the permissions listed here for any assets in Atlan.
The one exception is glossaries: users have read permission on terms, categories, and glossaries by default in Atlan.
Explicit grants are combined
When you grant a user a permission, this is combined with all other permissions you have granted the user.
Continuing this example, imagine you add the new user to a group defined as the connection admins for Snowflake.
- The user now has full read/write access to all metadata for Snowflake assets, and can query and preview the data in those assets.
Then you add the user to a persona that gives read/write access to a Looker project.
- The user now has access to all Snowflake assets and a Looker project's assets.
Explicit restrictions (denies) take priority
When you explicitly deny a user a permission, this takes priority over all other permissions you have granted the user.
Continuing this example, imagine you define a purpose with a data policy that masks PII data.
- The user still has full read/write access to all metadata for Snowflake assets and a Looker project's assets.
- In general, they can still query and preview the data in the Snowflake assets.
- However, any PII data in Snowflake is now masked.
Then you add a metadata policy to the purpose that denies permission to remove the PII tag.
- The user no longer has full read/write access to all metadata for Snowflake assets and a Looker project's assets.
- The user can't remove the PII tag from any of these assets.
The combination of mechanisms in this example shows their power. Through a small number of controls, you can define wide-ranging but granular access permissions.
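The three rules can be sketched as a small evaluation function. This is an illustrative model in Python, not Atlan's policy engine: the Asset and Rule classes, the action names, and the asset names are all hypothetical, and connection admin privileges, personas, and purposes are flattened into a single list of rules.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Asset:
    qualified_name: str
    tags: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Rule:
    action: str                          # e.g. "read", "update", "remove-tag"
    applies_to: Callable[[Asset], bool]  # which assets the rule selects
    allow: bool = True                   # False = explicit restriction (deny)


def is_allowed(rules: List[Rule], action: str, asset: Asset) -> bool:
    matching = [r for r in rules if r.action == action and r.applies_to(asset)]
    if any(not r.allow for r in matching):
        return False                     # rule 3: an explicit deny takes priority
    # Rule 2: any matching grant is enough. Rule 1: no match means denied.
    return any(r.allow for r in matching)


def in_snowflake(asset: Asset) -> bool:
    return asset.qualified_name.startswith("snowflake/")


def in_looker_project(asset: Asset) -> bool:
    return asset.qualified_name.startswith("looker/sales/")


def is_pii(asset: Asset) -> bool:
    return "PII" in asset.tags


customers = Asset("snowflake/db/customers", frozenset({"PII"}))
orders = Asset("snowflake/db/orders")
dashboard = Asset("looker/sales/revenue")

rules: List[Rule] = []
print(is_allowed(rules, "read", customers))        # False: denied by default

# The user becomes a connection admin for Snowflake.
rules += [Rule(a, in_snowflake) for a in ("read", "update", "remove-tag")]
# The user joins a persona with read/write access to a Looker project.
rules += [Rule(a, in_looker_project) for a in ("read", "update")]
print(is_allowed(rules, "update", dashboard))      # True: grants are combined

# A purpose denies removing tags from PII-tagged assets.
rules.append(Rule("remove-tag", is_pii, allow=False))
print(is_allowed(rules, "remove-tag", customers))  # False: the deny wins
print(is_allowed(rules, "remove-tag", orders))     # True: not tagged PII
```

Checking denies before grants is what makes the order in which rules were added irrelevant: a single matching restriction is enough to block the action.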