Auto-PII Detection

Automated PII classification of data assets by the Atlan Bot

🔖 Auto-PII classification via the Atlan Bot

A classification is a tag 🏷️ that helps you group data assets that share similar access policies.

We don't deny that classifying each data asset can be a bit daunting. To make the process easier, say hi 👋 to the Atlan Bot 🤖

The Atlan Bot supports you by intelligently identifying data assets that contain Personally Identifiable Information (PII), and then attaching the PII classification and its access policies.

Let's see how 👀

Atlan Bot auto-detecting PII columns in a table

🤖 How auto-classification works in Atlan

The Atlan Bot uses two checks to auto-detect Personally Identifiable Information (PII): one on column metadata and one on column values.

It first checks column metadata, such as column headers, against our internal master database of PII terms (for example, credit card or bank account). If the matching (Levenshtein) score between the column header and any PII term in the master data is above a threshold value, the column is tagged as PII.
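The header check can be sketched in Python as follows. This is a minimal illustration, not the Atlan Bot's actual implementation: the term list `PII_TERMS`, the `THRESHOLD` value of 0.8, the `normalise` helper, and the way the edit distance is turned into a 0 to 1 score are all assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (dynamic programming, two rows)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Turn the edit distance into a score from 0.0 to 1.0 (1.0 = identical)."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def normalise(header: str) -> str:
    """Lower-case the header and treat underscores and hyphens as spaces."""
    return header.lower().replace("_", " ").replace("-", " ").strip()


# Hypothetical stand-ins for the internal PII terms master data and threshold.
PII_TERMS = ["credit card", "bank account", "email address"]
THRESHOLD = 0.8


def header_looks_like_pii(header: str) -> bool:
    """True if the header scores above the threshold against any PII term."""
    cleaned = normalise(header)
    return any(similarity(cleaned, term) > THRESHOLD for term in PII_TERMS)
```

With these assumed values, a header such as `Credit_Card` or `creditcard` would be flagged, while `order_total` would not.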

As column headers are not always explicit, the Atlan Bot also checks for patterns inside the column values. It checks sample values from the column for the presence of any type of PII value (such as credit card numbers or email addresses).

For example, to detect credit card numbers, we have converted pattern guidelines given by credit card providers like Visa, Mastercard, AMEX, and others around the world into regular expressions that machines can understand and use for detection.
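As a rough sketch of this kind of value check, the Python below tests sample values against a few regular expressions. The patterns are simplified versions based on publicly known card number prefixes and lengths (Visa, Mastercard, AMEX) plus a loose email pattern; they are assumptions for illustration, not the expressions the Atlan Bot uses, and the function names are hypothetical.

```python
import re
from typing import Iterable, Optional, Set

# Simplified card patterns based on publicly known prefixes and lengths.
CARD_PATTERNS = {
    # Visa: starts with 4, 13 or 16 digits.
    "visa": re.compile(r"4\d{12}(?:\d{3})?"),
    # Mastercard: 51-55 or 2221-2720 prefix, 16 digits.
    "mastercard": re.compile(
        r"(?:5[1-5]\d{14}"
        r"|2(?:22[1-9]|2[3-9]\d|[3-6]\d{2}|7[01]\d|720)\d{12})"
    ),
    # AMEX: starts with 34 or 37, 15 digits.
    "amex": re.compile(r"3[47]\d{13}"),
}

# Deliberately loose email pattern: something@something.something
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def classify_value(value: str) -> Optional[str]:
    """Return a PII label for a single value, or None if nothing matches."""
    text = value.strip()
    if EMAIL_PATTERN.fullmatch(text):
        return "email"
    # Card numbers are often written with spaces or hyphens between groups.
    digits = re.sub(r"[ -]", "", text)
    for name, pattern in CARD_PATTERNS.items():
        if pattern.fullmatch(digits):
            return f"credit_card:{name}"
    return None


def pii_types_in_sample(values: Iterable[object]) -> Set[str]:
    """Collect every PII label found among a column's sample values."""
    found = set()
    for value in values:
        label = classify_value(str(value))
        if label is not None:
            found.add(label)
    return found
```

A column would then be a PII candidate whenever `pii_types_in_sample` returns a non-empty set for its sample values. A stricter check could also validate card numbers with a checksum, which this sketch leaves out.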

If either of the two methods above indicates PII, the asset is tagged with the PII classification.

▶️ How to trigger the bot to auto-classify assets as PII

STEP 1: Go to the dedicated data table page

Go to the Discover list, and click the name of the data table you want to auto-classify.

STEP 2: Go to the "Profile" tab

Click the third tab from the right, labeled "Profile".

STEP 3: Click "Configure & Run"

This blue button is on the right-hand side. As soon as you click it, a configuration set-up modal will open.

STEP 4: Choose "Yes" to auto-classify

In the set-up modal, choose "Yes" for the auto-classify option, and click "Update".

The data quality profile will then run and auto-classify the table's columns as PII if they match the conditions given in the bot's algorithm (see the previous section).

Want to know more about classification? Read the article on this topic below 👇