Asset Import for data products

The Asset Import app loads metadata for data domains and data products from a CSV file into Atlan. It's designed to support large-scale enrichment and migration of data product metadata without manual entry.

CSV files can be provided either by uploading them directly from your local machine or by fetching them from a cloud object store. Supported object storage systems include Amazon S3, Google Cloud Storage (GCS), and Azure Data Lake Storage (ADLS).

This reference provides complete configuration details for data product imports, including input handling rules, options for updates versus creation, and behavior when working with hierarchical structures like parent domains or data products.

You can also use the same Asset Import app to import other types of metadata, such as Assets, Tags and Glossaries. See their respective references for details.

Access

The Asset Import app isn't enabled by default. To use this app, contact Atlan support and request it be added to your tenant. Once enabled, data product imports can be set up and run by admins or users with workflow permissions.

Source

This section defines how the input CSV file for data product metadata is provided and identified in Atlan.

Workflow name

Specifies the display name for the workflow in Atlan. This name is used to identify the import job in the UI and logs. Choose a name that clearly reflects the purpose or scope of the data product import.

Example: If you're importing data products for the finance domain, you might set:

Finance data products import

Import metadata from

This property defines how the CSV file containing data product metadata is provided to the workflow. The file format must match the CSV file format for data products.

There are two ways to provide the file:

Direct file uploads: Upload a CSV file directly from your local machine. This is useful for smaller files or ad-hoc imports. See Direct file uploads.
Object storage: Fetch the CSV file from a supported cloud object store (S3, GCS, or ADLS). This is recommended for larger files or recurring imports. See Object storage.

Direct file uploads

Upload a CSV file directly from your local machine. This option is best for smaller files or ad-hoc imports that are run manually.

File size limit

Direct file uploads are limited to ~10 MB. Only one file can be uploaded per run. For larger or recurring imports, use object storage.
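
A quick pre-flight check can suggest which input method fits a given file. The sketch below is illustrative and not part of the app: it assumes the approximate limit can be treated as 10 × 1024 × 1024 bytes, and the name `suggest_input_method` is hypothetical.

```python
import os
import tempfile

# Assumption: "~10 MB" is treated as 10 * 1024 * 1024 bytes; the real cutoff is approximate.
DIRECT_UPLOAD_LIMIT_BYTES = 10 * 1024 * 1024

def suggest_input_method(csv_path: str, limit: int = DIRECT_UPLOAD_LIMIT_BYTES) -> str:
    """Return 'direct upload' if the file fits within the limit, else 'object storage'."""
    size = os.path.getsize(csv_path)
    return "direct upload" if size <= limit else "object storage"

# Usage: write a tiny sample file and check it.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as handle:
    handle.write("typeName,name\nDataDomain,Finance\n")
    sample_path = handle.name

print(suggest_input_method(sample_path))
```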

Object storage

This option imports the data product CSV file from a cloud object store rather than a local upload. It's recommended for large files and for recurring imports. Supported providers are Amazon S3, Google Cloud Storage (GCS), and Azure Data Lake Storage (ADLS). When this option is selected, additional storage-specific properties such as bucket, project ID, or container become available.

Amazon S3
Google Cloud Storage
Azure Data Lake Storage

Amazon S3 enables you to store and retrieve objects at scale. You can use this option when the data product CSV file is stored in an S3 bucket.

AWS access key

The access key for your AWS account. You can find this in the AWS Management Console > IAM > Users > Security credentials tab.

Must have a value if you are using the access/secret key authentication method.
Must be blank if your setup is tenant-backed, cross-account, or role-based.

Example:

AKIAIOSFODNN7EXAMPLE

AWS secret key

The secret key that pairs with your access key. This is generated when you create an access key in IAM. You must download it at creation time or rotate and generate a new one if lost.

Must have a value if you are using the access/secret key authentication method.
Must be blank if your setup is tenant-backed, cross-account, or role-based.

Example:

wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

AWS role ARN

The ARN of the AWS role to use to access S3. You must set this up separately in AWS, and grant permissions for Atlan to assume this role.

Must have a value if you are using the role-based authentication method.
Must be blank if your setup is tenant-backed, cross-account, or access/secret key based.

Example:

arn:aws:iam::123456789012:role/roleName

Region

The AWS region in which your bucket is located (for example, us-east-1). You can find this in the S3 service dashboard when selecting your bucket.

Must have a value if you are using the access/secret key authentication method.
Must be blank in all other scenarios, where the region is inferred from the tenant or role.

Example:

ap-southeast-1

Bucket

The name of the S3 bucket that contains your data product CSV file. The bucket name is listed in the S3 service dashboard.

Must be blank to use the tenant-backed object store's bucket.
Must have a value in all other scenarios.

Example:

my-company-data-products
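
The "must have a value" and "must be blank" rules for the five S3 properties can be summarized as a small consistency check. This is a sketch of the rules as stated in this section, not the app's own validation. The setup labels (`access_secret_key`, `role_based`, `cross_account`, `tenant_backed`) and field keys are hypothetical identifiers chosen for the example.

```python
from typing import Dict, List

# Fields that must be filled for each setup, per the rules in this section.
# Every field not listed for a setup must be left blank.
REQUIRED_FIELDS = {
    "access_secret_key": {"access_key", "secret_key", "region", "bucket"},
    "role_based": {"role_arn", "bucket"},
    "cross_account": {"bucket"},
    "tenant_backed": set(),
}
ALL_FIELDS = ["access_key", "secret_key", "role_arn", "region", "bucket"]

def check_s3_fields(setup: str, values: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the combination is consistent."""
    required = REQUIRED_FIELDS[setup]
    problems = []
    for field in ALL_FIELDS:
        filled = bool(values.get(field, "").strip())
        if field in required and not filled:
            problems.append(f"{field} must have a value for {setup}")
        elif field not in required and filled:
            problems.append(f"{field} must be blank for {setup}")
    return problems

print(check_s3_fields("role_based", {"role_arn": "arn:aws:iam::123456789012:role/roleName"}))
```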

Google Cloud Storage (GCS) provides durable, secure storage for objects. Use this option if your data product CSV file is stored in a GCS bucket.

Project ID

The ID of your Google Cloud project. You can find this in the Google Cloud Console > Home Dashboard > Project info panel.

Must have a value if you are using your own managed GCS bucket.
Must be blank if your setup is tenant-backed (Atlan-managed).

Example:

my-data-products-project-123456

Service account JSON

A JSON key file containing service account credentials with permission to access the bucket. You can create this in the Google Cloud Console > IAM & Admin > Service accounts.

Must have a value if you are using your own managed GCS bucket.
Must be blank if your setup is tenant-backed (Atlan-managed).

Example:

{
  "type": "service_account",
  "project_id": "my-data-products-project-123456",
  "private_key_id": "abc123def456...",
  "private_key": "-----BEGIN PRIVATE KEY-----\nMIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQC...\n-----END PRIVATE KEY-----\n",
  "client_email": "data-products-import@my-data-products-project-123456.iam.gserviceaccount.com",
  "client_id": "123456789012345678901",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token"
}

Bucket

The name of the GCS bucket containing your data product CSV file. You can find this in the Cloud Storage > Buckets page.

Must have a value if you are using your own managed GCS bucket.
Must be blank if your setup is tenant-backed (Atlan-managed).

Example:

my-company-data-products-storage

Azure Data Lake Storage (ADLS) provides scalable, secure storage for files and objects. Use this option if your data product CSV file is stored in an ADLS container.

Azure client ID

The application (client) ID for the app registered in Azure AD. You can find this in the Azure Portal > Azure Active Directory > App registrations.

Must have a value if you are using your own managed ADLS container.
Must be blank if your setup is tenant-backed (Atlan-managed).

Example:

12345678-1234-1234-1234-123456789012

Azure client secret

The client secret value for the registered app. You can generate this in the Azure Portal > Azure Active Directory > App registrations > Certificates & secrets.

Must have a value if you are using your own managed ADLS container.
Must be blank if your setup is tenant-backed (Atlan-managed).

Example:

abc123d45pqr678stu901vwx234

Azure tenant ID

The tenant ID of your Azure Active Directory instance. This is available in the Azure Portal > Azure Active Directory > Overview page.

Must have a value if you are using your own managed ADLS container.
Must be blank if your setup is tenant-backed (Atlan-managed).

Example:

87654321-4321-4321-4321-210987654321

Storage account name

The name of your storage account. You can find this in the Azure Portal > Storage accounts list.

Must have a value if you are using your own managed ADLS container.
Must be blank if your setup is tenant-backed (Atlan-managed).

Example:

mydataproductsstorage

Container

The ADLS container that contains your data product CSV file. The container name is available under Azure Portal > Storage accounts > Containers.

Must have a value if you are using your own managed ADLS container.
Must be blank if your setup is tenant-backed (Atlan-managed).

Example:

data-products

Data products file

This property is available only when the Direct file uploads option is selected under Import metadata from. It defines the CSV file that contains data product metadata for import into Atlan.

The data product CSV follows a structured format with columns for data domains, data products, and their attributes. Each row in the file represents a single data product object and its attributes, while parent–child relationships (such as data products under domains) are encoded in specific columns.

You can upload one CSV file per workflow run. The file must follow the Data Products CSV format. Only CSV files are supported; formats such as JSON or Excel aren't accepted.

Duplicates cause errors

Duplicate data domain or product entries in the same file cause errors and workflow failure.

For detailed information on the required structure and field definitions, see the Data Products CSV format.
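
Because duplicates fail the whole workflow, it can help to scan a file before submitting it. The sketch below assumes that "duplicate" means two rows with the same typeName, name, and parent (parentDomain for domains, dataDomain for products). That mirrors the matching rules in this reference but is an assumption about how the app detects duplicates. The function name is hypothetical.

```python
import csv
import io
from collections import Counter

def find_duplicates(csv_text: str):
    """Return identity keys that appear more than once in the CSV text."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        type_name = row.get("typeName", "")
        # Assumption: products are identified within dataDomain, domains within parentDomain.
        parent_column = "dataDomain" if type_name == "DataProduct" else "parentDomain"
        key = (type_name, row.get("name", ""), row.get(parent_column) or "")
        counts[key] += 1
    return [key for key, count in counts.items() if count > 1]

sample = (
    "typeName,name,parentDomain,dataDomain\n"
    "DataDomain,Customer Analytics,,\n"
    "DataProduct,Customer 360 Dashboard,,Customer Analytics\n"
    "DataProduct,Customer 360 Dashboard,,Customer Analytics\n"
)
print(find_duplicates(sample))
```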

Prefix (path)

This property is available only when the Object storage option is selected under Import metadata from. It specifies the directory or path within your selected cloud object store where the data product CSV file is located.

If left blank, the system searches from the root of the bucket or container.

With prefix: Only files under the specified path are processed.
Without prefix: System searches from the root of the storage location.
Format: Use forward slashes (/) as path separators.
Trailing slash: Not required; Atlan appends one automatically if it's missing.

Example: If your data product file is stored in a folder called finance inside the data-products directory of your bucket, set the prefix to:

data-products/finance

Object key (filename)

This property is available only when the Object storage option is selected under Import metadata from. It specifies the exact CSV file to import from your chosen cloud object store. The value entered here is combined with the optional Prefix (path) to form the complete location of the file.

The object key must include the file name and extension. Each configuration can reference only one file, so if you have multiple CSV files, run the workflow separately for each one, even if they're stored under the same prefix or folder.

Single file: Only one CSV file per workflow configuration
Multiple files: Requires separate workflow runs for each file
File extension: Must include the .csv extension
Path combination: Prefix + object key = complete file location (within the bucket)

Example: If your data product file is stored under a folder called data-products/finance in your object store, you can configure:

Prefix:

data-products/finance

Object key:

finance-products.csv

Complete path: {{bucket}}/data-products/finance/finance-products.csv
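
The way the two properties combine can be sketched as a small helper. This is an illustration only: the function name is hypothetical, and stripping leading slashes from the prefix is an assumption made for the example rather than documented behavior.

```python
def build_object_path(prefix: str, object_key: str) -> str:
    """Join an optional prefix and an object key with exactly one slash between them."""
    if not object_key.lower().endswith(".csv"):
        raise ValueError("object key must include the .csv extension")
    cleaned_prefix = prefix.strip("/")  # tolerate a trailing (or leading) slash
    if not cleaned_prefix:
        return object_key  # blank prefix: the file sits at the root of the bucket
    return f"{cleaned_prefix}/{object_key}"

print(build_object_path("data-products/finance", "finance-products.csv"))
```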

Input handling

The Input handling property defines how the workflow processes data product metadata from the CSV file when matching it with existing assets in Atlan. It controls whether new data products are created or only existing ones are updated.

Data product matching rules

Data domains and products have specific matching behavior:

Data domains: qualifiedName is ignored; only name is used for matching
Subdomains: qualifiedName is ignored; the unique combination of name and parentDomain is used
Data products: qualifiedName is ignored; the unique combination of name and dataDomain is used

This means you can't use qualifiedName to control updates, so you can't use this app to change the names of existing domains or data products.

Create and update

This option creates new data product assets from the CSV file and updates any that already exist in Atlan. With this configuration, the workflow reads each row in the CSV and either:

Creates a data domain or data product if it doesn't already exist, or
Updates the corresponding data domain or data product if it's already present.

This option is commonly used when loading data products into Atlan for the first time, or when expanding an existing data product catalog with additional products while refreshing the details of existing entries.

Update only

This option only updates data domains and data products that already exist in Atlan. With this configuration, the workflow applies changes from the CSV file to matching data domains or data products, but doesn't create any new ones.

This option is useful when you want to be sure you are only enriching or correcting existing data domains or data products, and avoid ever creating new ones.
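
The interaction between the matching rules and the two input handling options can be modeled in a few lines. This is a conceptual sketch, not the app's implementation: the mode labels and the `plan_action` helper are hypothetical, and the identity tuple simply encodes the matching rules described above.

```python
from typing import Set, Tuple

# (typeName, name, parent): parent is parentDomain for subdomains,
# dataDomain for data products, and "" for top-level domains.
Identity = Tuple[str, str, str]

def plan_action(identity: Identity, existing: Set[Identity], mode: str) -> str:
    """Return 'update', 'create', or 'skip' for one CSV row."""
    if mode not in ("create_and_update", "update_only"):
        raise ValueError(f"unknown mode: {mode}")
    if identity in existing:
        return "update"
    return "create" if mode == "create_and_update" else "skip"

existing = {("DataDomain", "Customer Analytics", "")}
new_product = ("DataProduct", "Customer 360 Dashboard", "Customer Analytics")
print(plan_action(new_product, existing, "update_only"))
```

Under this model, a renamed domain never matches its old entry, which is why names can't be changed through an import.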

Options

The Options property defines how the workflow interprets and applies data from the data product CSV file. These settings control error handling, attribute overwrites, and how multi-valued fields like tags or custom metadata are managed.

Default
Advanced

When Default is selected, the following behaviors apply:

Blank fields in the CSV are ignored and don't overwrite existing values.
Any invalid value in a field causes the import to fail.
Comma (,) is used as the field separator.
Up to 20 records are processed in each API request.
Custom metadata attribute values are merged with any existing values.
Links are updated if their URL is the same, otherwise new links are added.

Atlan tags behavior

If the atlanTags column exists in your CSV, its values completely overwrite existing tags. Where the cell is empty on a given row, all existing tags are removed from that asset.

Selecting Advanced provides more control over how data product imports are processed.

Remove attributes, if empty

This setting lets you remove specific attributes when the CSV field is blank. For example, if you select Description and leave the description field empty in the CSV, the description is cleared from the corresponding data domain or product asset in Atlan.

Fail on errors?

Defines how the workflow responds when encountering invalid values in the CSV file.

Yes: The workflow stops and fails when an error is found.
No: The workflow skips the invalid value, logs a warning, and continues processing.

Example: If a data product contains an invalid classification, selecting No lets the workflow continue importing other products.

Field separator

Specifies the character used to separate fields in the CSV file.

Default is , (comma).
Other common options include ; (semicolon) or | (pipe), depending on your CSV.

Example: If your CSV uses semicolons, set the field separator to ;.
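
To confirm that a file parses as expected with a non-default separator, you can read it locally with Python's standard `csv` module before importing. The sample content below is made up for illustration.

```python
import csv
import io

# A semicolon-separated sample; commas inside values need no quoting here.
semicolon_csv = (
    "typeName;name;userDescription\n"
    "DataDomain;Finance;Budgets, forecasts, and actuals\n"
)

rows = list(csv.DictReader(io.StringIO(semicolon_csv), delimiter=";"))
print(rows[0]["userDescription"])
```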

Batch size

Defines the maximum number of rows to process in a single API request.

Default: 20 records.
Increasing this value can improve performance but may risk hitting API limits.

Example: If you set batch size to 50, each request processes 50 data product rows.
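
The effect of the batch size on the number of requests can be illustrated with a simple chunking helper. This models only how rows are grouped; how the app actually issues requests is not shown here, and the helper name is hypothetical.

```python
from typing import Iterator, List, Sequence

def batches(rows: Sequence[dict], batch_size: int = 20) -> Iterator[List[dict]]:
    """Yield consecutive groups of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        yield list(rows[start:start + batch_size])

rows = [{"name": f"Product {i}"} for i in range(45)]
print([len(group) for group in batches(rows)])                 # default of 20 per request
print([len(group) for group in batches(rows, batch_size=50)])  # larger batches, fewer requests
```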

Custom metadata handling

Controls how custom metadata attributes are applied when multiple values exist.

Ignore: Custom metadata in the CSV is ignored.
Merge: Merges new metadata attributes from the CSV with existing attributes in Atlan (attribute-level merge, not value concatenation).
Overwrite: Replaces all existing metadata attributes with those from the CSV.

Example: If a data product already has Data Quality custom metadata with attribute completeness=80 and the CSV provides attribute Data Quality::accuracy=90:

Ignore → product keeps only its existing attribute: completeness=80.
Merge → product keeps both attributes: completeness=80 and accuracy=90.
Overwrite → product ends with only accuracy=90 (completeness is empty).
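
The three outcomes above can be reproduced with plain dictionaries. This is a conceptual model rather than the app's code. It assumes that, when merging, a value from the CSV replaces an existing value for the same attribute, which the description implies but doesn't state outright.

```python
def apply_custom_metadata(existing: dict, incoming: dict, mode: str) -> dict:
    """Model the Ignore / Merge / Overwrite options for one custom metadata set."""
    if mode == "ignore":
        return dict(existing)
    if mode == "merge":
        # Attribute-level merge; assumption: CSV values win for shared attributes.
        return {**existing, **incoming}
    if mode == "overwrite":
        return dict(incoming)
    raise ValueError(f"unknown mode: {mode}")

existing = {"completeness": 80}
incoming = {"accuracy": 90}
for mode in ("ignore", "merge", "overwrite"):
    print(mode, apply_custom_metadata(existing, incoming, mode))
```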

Atlan tag association handling

Controls how Atlan tag associations from the CSV are applied.

Ignore: Tags in the CSV are ignored.
Append: Adds tags from the CSV to existing tags.
Replace: Replaces all existing tags with those from the CSV.
Remove: Removes all tags from the asset.

Example: If a data product already has tag PII and the CSV lists Sensitive:

Ignore → existing PII only
Append → PII, Sensitive
Replace → Sensitive only
Remove → no tags remain
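
The same example can be expressed as a small function over lists of tag names. It is a simplified model: propagation settings and source tag values are left out, and the function name is hypothetical.

```python
def apply_tags(existing: list, incoming: list, mode: str) -> list:
    """Model the Ignore / Append / Replace / Remove options for tag names."""
    if mode == "ignore":
        return list(existing)
    if mode == "append":
        return list(existing) + [tag for tag in incoming if tag not in existing]
    if mode == "replace":
        return list(incoming)
    if mode == "remove":
        return []
    raise ValueError(f"unknown mode: {mode}")

for mode in ("ignore", "append", "replace", "remove"):
    print(mode, apply_tags(["PII"], ["Sensitive"], mode))
```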

Linked resource idempotency

Defines how related resources (links) are handled when they already exist.

URL: Identifies unique links by their URL.
Name: Identifies unique links by their name.

Example: If a data product already has a link named Additional info with a URL https://example.com/addl-info, and the CSV specifies a link value with the name Additional info and a URL https://example.com/some-other-info:

URL → product has two links: both named Additional info but each with a different URL.
Name → product has one link named Additional info with the URL https://example.com/some-other-info.
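
The difference between the two settings comes down to which field acts as the link's identity. A minimal model, using the same `name` and `link` keys as the links column format (the helper itself is hypothetical):

```python
def upsert_link(links: list, new_link: dict, match_on: str) -> list:
    """Update the link that shares the identity field, or add a new one."""
    if match_on not in ("url", "name"):
        raise ValueError(f"unknown match_on: {match_on}")
    field = "link" if match_on == "url" else "name"
    result = [dict(link) for link in links]  # copy so the input stays untouched
    for link in result:
        if link[field] == new_link[field]:
            link.update(new_link)
            return result
    result.append(dict(new_link))
    return result

current = [{"name": "Additional info", "link": "https://example.com/addl-info"}]
from_csv = {"name": "Additional info", "link": "https://example.com/some-other-info"}
print(len(upsert_link(current, from_csv, "url")))
print(upsert_link(current, from_csv, "name"))
```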

Data products CSV file

The data product CSV file defines the metadata for data domains and data products to be imported. Each row represents one data product object and its attributes. Parent–child relationships are encoded using specific columns such as dataDomain (for data products) and parentDomain (for subdomains).

Required fields

Data domains

typeName: Must be DataDomain for data domains.
name: The name of the data domain.

Example: Customer Analytics (creates or updates a data domain called "Customer Analytics")

Subdomains

typeName: Must be DataDomain for subdomains.
name: The name of the subdomain.

Example: Customer Segmentation (creates or updates a subdomain called "Customer Segmentation")
parentDomain: The name of the parent domain.

Example: Customer Analytics (creates or updates the "Customer Segmentation" subdomain only under the "Customer Analytics" domain)

Data products

typeName: Must be DataProduct for data products.
name: The name of the data product.

Example: Customer 360 Dashboard (creates or updates a data product called "Customer 360 Dashboard")
dataDomain: The domain (or subdomain path) that contains the data product.

Example: Customer Analytics@Customer Segmentation (creates or updates the "Customer 360 Dashboard" product only under the "Customer Segmentation" subdomain of the "Customer Analytics" domain)

dataProductAssetsDSL: The Elasticsearch DSL specifying the criteria for selecting which assets are part of the data product. You can either export this or use one of the SDKs to generate it, rather than trying to write it by hand.

Example: this query includes all tables with a name of "DIM_CUSTOMER" or "Customers"

{"query":{"dsl":{"from":0,"query":{"bool":{"filter":{"bool":{"minimum_should_match":"1","must":[{"term":{"__superTypeNames.keyword":{"value":"Referenceable","case_insensitive":false}}},{"term":{"__state":{"value":"ACTIVE"}}}],"should":[{"bool":{"filter":[{"term":{"__typeName.keyword":{"value":"Table","case_insensitive":false}}},{"term":{"__state":{"value":"ACTIVE"}}},{"term":{"name.keyword":{"value":"DIM_CUSTOMER","case_insensitive":false}}}]}},{"bool":{"filter":[{"term":{"__typeName.keyword":{"value":"Table","case_insensitive":false}}},{"term":{"__state":{"value":"ACTIVE"}}},{"term":{"name.keyword":{"value":"Customers","case_insensitive":false}}}]}}]}}}}},"attributes":["__traitNames","connectorName","__customAttributes","certificateStatus","tenantId","anchor","parentQualifiedName","Query.parentQualifiedName","AtlasGlossaryTerm.anchor","databaseName","schemaName","parent","connectionQualifiedName","collectionQualifiedName","announcementMessage","announcementTitle","announcementType","announcementUpdatedAt","announcementUpdatedBy","allowQuery","allowQueryPreview","adminGroups","adminRoles","adminUsers","category","credentialStrategy","connectionSSOCredentialGuid","certificateStatus","certificateUpdatedAt","certificateUpdatedBy","classifications","connectionId","connectionQualifiedName","connectorName","dataType","defaultDatabaseQualifiedName","defaultSchemaQualifiedName","description","displayName","links","link","meanings","name","ownerGroups","ownerUsers","qualifiedName","typeName","userDescription","displayDescription","subDataType","rowLimit","queryTimeout","previewCredentialStrategy","policyStrategy","policyStrategyForSamplePreview","useObjectStorage","objectStorageUploadThreshold","outputPortDataProducts"],"suppressLogs":true,"requestRelationshipAttrsForSearch":false,"showSearchMetadata":false,"showHighlights":false,"includeClassificationNames":true},"filterScrubbed":true}
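
The required fields for the three object types can be assembled into a file with Python's standard `csv` module. The sketch below reuses the example names from this section. The column order is an arbitrary choice for the example, and `EXPORTED_DSL` is a placeholder that you would replace with DSL exported from Atlan or generated with an SDK.

```python
import csv
import io

FIELDS = ["typeName", "name", "parentDomain", "dataDomain", "dataProductAssetsDSL"]

# Placeholder only: paste the DSL you exported or generated with an SDK.
EXPORTED_DSL = "<exported asset-selection DSL>"

rows = [
    {"typeName": "DataDomain", "name": "Customer Analytics"},
    {"typeName": "DataDomain", "name": "Customer Segmentation",
     "parentDomain": "Customer Analytics"},
    {"typeName": "DataProduct", "name": "Customer 360 Dashboard",
     "dataDomain": "Customer Analytics@Customer Segmentation",
     "dataProductAssetsDSL": EXPORTED_DSL},
]

buffer = io.StringIO()
# restval="" leaves columns blank when a row doesn't use them.
writer = csv.DictWriter(buffer, fieldnames=FIELDS, restval="", lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Letting the `csv` module do the writing matters once a real DSL value is used, because the JSON contains commas and double quotes that must be escaped inside a CSV cell.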

Matching behavior

Data domains: only name is used for matching
Subdomains: unique combination of name and parentDomain is used
Data products: unique combination of name and dataDomain is used

Can't update names

The qualifiedName field is completely ignored, so you can't use this app to change the names of existing domains or data products.

Common fields

You can use these common fields on data domains and data products.

userDescription: Description or definition of the asset.

Example: Assets available for use for analytics of our marketing campaigns. (gives the definition of a product)
assetThemeHex: Hexadecimal RGB value to use as the colored theme for the asset.

Example: #525C73
assetIcon: Name of the Phosphor icon to use to represent the asset, in CamelCase.

Example: PhAirplaneTakeoff

Certificates

certificateStatus: Optional certificate on the asset. Must be one of (case-sensitive):
- VERIFIED
- DRAFT
- DEPRECATED
- or empty.
certificateStatusMessage: Optional message to associate with the certificate. Atlan only shows this if the certificateStatus is non-empty.

Example: Confirmed by reviewing the description and readme.

Announcements

announcementType: Optional type of announcement on the asset. Must be one of (case-sensitive):
- information
- warning
- issue
- or empty.
announcementTitle: Optional heading line for the announcement. Atlan only shows this if the announcementType is non-empty.

Example: Unconfirmed quality
announcementMessage: Optional detailed message that can be associated with the announcement. Atlan only shows this if the announcementType is non-empty.

Example: The quality of this asset has not been validated by either automated or manual checks. Use at your own risk.

Owners

ownerUsers: Optional list of individual users who are owners of the asset. Separate each username by a newline within the cell.

Example: this assigns both "jane" and "joe" as individual owners of the asset
```
jane
joe
```
ownerGroups: Optional list of groups who are owners of the asset. Separate each group name by a newline within the cell.

Example: this assigns both "finance" and "marketing" as group owners of the asset
```
finance
marketing
```
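
Multi-valued cells rely on newlines inside a single CSV field, which only works if the field is quoted. Python's standard `csv` module handles the quoting automatically, as this sketch shows. The `multi_value_cell` helper and the sample row are illustrative.

```python
import csv
import io

def multi_value_cell(values):
    """Join values with newlines, the separator this format uses inside a cell."""
    return "\n".join(values)

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["typeName", "name", "ownerUsers", "ownerGroups"])
writer.writerow([
    "DataDomain",
    "Finance",
    multi_value_cell(["jane", "joe"]),
    multi_value_cell(["finance", "marketing"]),
])

# Reading the text back shows the newlines survive inside the quoted cells.
buffer.seek(0)
header, row = list(csv.reader(buffer))
print(row[2].split("\n"))
```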

Atlan tags

atlanTags: Optional list of the tags to assign to the asset. Separate each tag by a newline within the cell, and format as one of the following:
- Tag Name to directly assign the tag to the asset but not propagate it.
- Tag Name>>FULL to directly assign the tag to the asset and propagate it both down the hierarchy and through lineage.
- Tag Name>>HIERARCHY_ONLY to directly assign the tag to the asset and only propagate down the hierarchy (not through lineage).
- Tag Name<<PROPAGATED to indicate the tag has been propagated to the asset.
  
  Propagated tags are ignored
  Any tag marked propagated (Tag Name<<PROPAGATED) is ignored by an import. Only those tags that are directly applied are imported, though of course any tags applied up-hierarchy or upstream that are marked to propagate are still propagated accordingly.
For source tags (with values), you can extend the tag name portion as follows:
- Tag Name {{connector-type/Connection Name@@sourceTagLocation??key=value}}, where:
  - connector-type is the type of the source tag (snowflake, dbt, etc)
  - Connection Name is the name of the connection for the source the tag is synced from
  - sourceTagLocation is the path within that connection where the source tag exists
  - key is an optional key for the associated value for the tag
  - value is the value for the associated tag
Example: this associates the Confidential Atlan tag, which is synced with the CONFIDENTIAL Snowflake tag in the DEMO database's CUSTOMER schema as part of the Production Snowflake connection. It has a value of Not Restricted in Snowflake, and the tag itself is fully propagated in Atlan.
```
Confidential {{snowflake/Production@@DEMO/CUSTOMER/CONFIDENTIAL??=Not Restricted}}>>FULL
```
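
If you generate these values programmatically, a small formatter helps keep the delimiters straight. The function below is a hypothetical helper that follows the pattern described above; it covers directly assigned tags only, since propagated tags are ignored on import.

```python
from typing import Optional

def format_atlan_tag(name: str, propagation: Optional[str] = None,
                     source: Optional[dict] = None) -> str:
    """Build one atlanTags entry; propagation is None, 'FULL', or 'HIERARCHY_ONLY'."""
    text = name
    if source is not None:
        connector = source["connector"]
        connection = source["connection"]
        location = source["location"]
        key = source.get("key", "")  # the key is optional
        value = source["value"]
        text += " {{" + f"{connector}/{connection}@@{location}??{key}={value}" + "}}"
    if propagation is not None:
        if propagation not in ("FULL", "HIERARCHY_ONLY"):
            raise ValueError(f"unsupported propagation: {propagation}")
        text += f">>{propagation}"
    return text

print(format_atlan_tag(
    "Confidential",
    propagation="FULL",
    source={"connector": "snowflake", "connection": "Production",
            "location": "DEMO/CUSTOMER/CONFIDENTIAL", "value": "Not Restricted"},
))
```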

Links

links: Optional list of the links to assign to the asset. Separate each link by a newline within the cell, and format as embedded JSON.
- typeName: set to Link
- attributes: containing a substructure of
  - name: the name (title) to give the link
  - link: the URL of the link
Example: this creates 2 links for the asset, one named Example and the other named Google
```
{"typeName":"Link","attributes":{"name":"Example","link":"https://www.example.com"}}
{"typeName":"Link","attributes":{"name":"Google","link":"https://www.google.com"}}
```
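
Rather than typing the embedded JSON by hand, you can generate the cell value with Python's `json` module. The `links_cell` helper is a hypothetical name; compact separators are used only so the output matches the example above.

```python
import json

def links_cell(links):
    """Build the links cell: one compact JSON object per line."""
    return "\n".join(
        json.dumps(
            {"typeName": "Link", "attributes": {"name": name, "link": url}},
            separators=(",", ":"),
        )
        for name, url in links
    )

print(links_cell([
    ("Example", "https://www.example.com"),
    ("Google", "https://www.google.com"),
]))
```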

Readmes

readme: Optional HTML-formatted contents to use as the README for the asset. (You can limit this to only what's inside the <body></body> HTML tags.)

Example: this sets the README to include a heading of "Overview" with some descriptive content underneath
```
<h1>Overview</h1>
<p>
  Some descriptive content about this asset,
  including <a href="https://example.com">links</a>
  and other rich HTML content.
</p>
```

Starred details

starredDetails: Optional list of the users you want to star the asset for. Separate each entry by a newline within the cell, and format as embedded JSON.
- assetStarredBy: set to the username
- assetStarredAt: set to the (epoch-style) timestamp of when to star it for them
Example: this ensures the asset is starred for two users, "joe" and "jane"
```
{"assetStarredBy":"joe","assetStarredAt":1698769268966}
{"assetStarredBy":"jane","assetStarredAt":1698769268966}
```
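
The timestamps in the example have 13 digits, which suggests milliseconds since the Unix epoch; the sketch below works on that assumption. The helper names are hypothetical, and the datetime passed in must be timezone-aware.

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def epoch_millis(moment: datetime) -> int:
    """Whole milliseconds since the Unix epoch, using exact integer arithmetic."""
    return (moment - EPOCH) // timedelta(milliseconds=1)

def starred_cell(usernames, moment: datetime) -> str:
    """Build the starredDetails cell: one compact JSON object per user."""
    stamp = epoch_millis(moment)
    return "\n".join(
        json.dumps({"assetStarredBy": user, "assetStarredAt": stamp}, separators=(",", ":"))
        for user in usernames
    )

when = datetime(2023, 10, 31, 16, 21, 8, 966000, tzinfo=timezone.utc)
print(starred_cell(["joe", "jane"], when))
```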

Product-specific fields

You can use these fields on data products (only).

daapCriticality: Optional criticality of the data product. Must be one of (case-sensitive):
- High
- Medium
- Low
- or empty.
daapSensitivity: Optional sensitivity of the data product. Must be one of (case-sensitive):
- Public
- Internal
- Confidential
- or empty.
daapVisibility: Optional visibility of the data product. Must be one of (case-sensitive):
- Private
- Protected (shows as "Restricted" in the UI)
- Public
- or empty.
daapVisibilityUsers: Optional list of usernames for users who are allowed to see this data product. Separate each username by a newline within the cell.

Example: this gives users jane and joe permission to view the data product
```
jane
joe
```
daapVisibilityGroups: Optional list of group names for groups who are allowed to see this data product. Separate each group name by a newline within the cell.

Example: this gives groups finance and marketing permission to view the data product
```
finance
marketing
```
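
Because the enumerated fields are case-sensitive, a value such as `high` or `Verified` counts as invalid. A pre-import check along these lines can catch such values early. The allowed values are copied from this reference, while the `invalid_values` helper is a hypothetical name.

```python
# Allowed values per column, as listed in this reference ("" means the cell is empty).
ALLOWED = {
    "certificateStatus": {"VERIFIED", "DRAFT", "DEPRECATED", ""},
    "announcementType": {"information", "warning", "issue", ""},
    "daapCriticality": {"High", "Medium", "Low", ""},
    "daapSensitivity": {"Public", "Internal", "Confidential", ""},
    "daapVisibility": {"Private", "Protected", "Public", ""},
}

def invalid_values(row: dict) -> dict:
    """Return {column: value} for every enumerated column holding a disallowed value."""
    return {
        column: row[column]
        for column, allowed in ALLOWED.items()
        if column in row and row[column] not in allowed
    }

row = {"name": "Customer 360 Dashboard", "certificateStatus": "Verified",
       "daapCriticality": "High", "daapVisibility": "Restricted"}
print(invalid_values(row))
```

Note that `Restricted` is flagged here: it is the label shown in the UI, but the CSV value must be `Protected`.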

Custom metadata

You can also manage custom metadata attributes through the app. For these, use the name of the custom metadata and attribute as the name of the column in the CSV in the format Custom Metadata Name::Attribute Name.

Example:

Data Quality::Completeness: this column sets values for the Completeness attribute in custom metadata named Data Quality.

For any attribute that supports multiple values, separate each value by a newline within the cell.
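
When preparing a file programmatically, custom metadata columns can be told apart from regular ones by the `::` separator. A minimal sketch, with a hypothetical helper name:

```python
def parse_custom_metadata_column(header: str):
    """Split 'Custom Metadata Name::Attribute Name' into its parts, or return None."""
    set_name, separator, attribute = header.partition("::")
    if not separator or not set_name or not attribute:
        return None  # not a custom metadata column
    return set_name, attribute

print(parse_custom_metadata_column("Data Quality::Completeness"))
print(parse_custom_metadata_column("userDescription"))
```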

Sample CSV file

Download a sample CSV file to understand the required structure: Download sample data products CSV

Sample file disclaimer

This sample file shows the structure and format only. It may not import as-is and is merely a template for creating your own CSV files.
