Data Model Ingestion App
The Data Model Ingestion app loads logical and relational metadata from an input file into Atlan. It’s designed to help teams define how their data is structured, organized, and related within Atlan without manual setup.
Files can be provided either by uploading them directly from your local machine or by fetching them from a cloud object store. Supported object storage systems include Amazon S3, Google Cloud Storage (GCS), and Azure Data Lake Storage (ADLS).
This reference provides complete configuration details for data model ingestion, including input handling rules and the structure of the required input file.
Access
The Data Model Ingestion app isn't enabled by default. To use this app, contact Atlan support and request it be added to your tenant. Once enabled, data model ingestion workflows can be set up and run by admins or users with workflow permissions.
Source
This section defines how the input file for data model ingestion is provided and identified in Atlan.
Workflow name
Specifies the display name for the workflow in Atlan. This name is used to identify the ingestion job in the UI and logs. Choose a name that clearly reflects the purpose or scope of the import.
Example:
Production Finance Data Model
Import metadata from
This property defines how the file containing the data model is provided to the workflow. The file must match the Data Model file format.
There are two ways to provide the file:
- Direct file uploads: Upload an Excel file directly from your local machine. This is useful for smaller files or ad-hoc imports. See Direct file uploads.
- Object storage: Fetch the file from a supported cloud object store (S3, GCS, or ADLS). This is recommended for larger files or recurring imports. See Object storage.
Direct file uploads
Upload a file directly from your local machine. This option is best for smaller files or ad-hoc imports that are run manually.
Direct file uploads are limited to ~10 MB. Only one file can be uploaded per run. For larger or recurring imports, use object storage.
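The size rule above can be checked before choosing an import method. The following is a minimal sketch, not part of the app: the function names are hypothetical, and it assumes "~10 MB" means 10 * 1024 * 1024 bytes, since the exact cutoff isn't stated here.

```python
from pathlib import Path

# Assumption: "~10 MB" is interpreted as 10 * 1024 * 1024 bytes.
DIRECT_UPLOAD_LIMIT_BYTES = 10 * 1024 * 1024


def choose_import_method(size_bytes: int) -> str:
    """Return 'direct-upload' for files within the limit, else 'object-storage'."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes <= DIRECT_UPLOAD_LIMIT_BYTES:
        return "direct-upload"
    return "object-storage"


def choose_import_method_for_file(path: str) -> str:
    """Convenience wrapper that reads the size of a local file."""
    return choose_import_method(Path(path).stat().st_size)
```

For example, `choose_import_method_for_file("sales-data-model-template.xlsx")` returns the suggested method for a local file of that name.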
Object storage
This property imports the data model file from a cloud object store rather than a local upload. It's recommended for large files and recurring ingestion jobs. Supported providers are Amazon S3, Google Cloud Storage (GCS), and Azure Data Lake Storage (ADLS). When this option is selected, additional storage-specific properties such as bucket, project ID, or container become available.
- Amazon S3
- Google Cloud Storage
- Azure Data Lake Storage
Amazon S3 enables you to store and retrieve objects at scale. You can use this option when the data model Excel file is stored in an S3 bucket.
AWS access key
The access key for your AWS account. You can find this in the AWS Management Console > IAM > Users > Security credentials tab.
- Required if you are using access/secret key authentication.
- Keep empty if you are using tenant-backed, cross-account, or role-based authentication.
Example:
AKIAIOSFODNN7EXAMPLE
AWS secret key
The secret key that pairs with your access key. This is generated when you create an access key in IAM. You must download it at creation time or rotate and generate a new one if lost.
- Required if you are using access/secret key authentication.
- Keep empty if you are using tenant-backed, cross-account, or role-based authentication.
Example:
wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Region
The AWS region in which your bucket is located (for example, us-east-1). You can find this in the S3 service dashboard when selecting your bucket.
- Required if you are using access/secret key authentication.
- Keep empty if you are using tenant-backed, cross-account, or role-based authentication (region is inferred automatically).
Example:
ap-southeast-1
Bucket
The name of the S3 bucket that contains your data model file. The bucket name is listed in the S3 service dashboard.
- Required for cross-account, access/secret key, and role-based authentication.
- Keep empty if you are using tenant-backed authentication and don't know the specific bucket name.
Example:
my-company-data-models
Prefix
Optional. Directory path within the bucket where your data model file is stored.
Example:
data-models/sales/
Object key
The file name (including extension) of your data model file within the bucket or prefix.
Example:
sales-data-model-template.xlsx
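The per-field rules for the S3 properties can be summarized in a small pre-flight check. This is a sketch under stated assumptions: the auth-mode labels, the dictionary keys, and both function names are hypothetical and aren't part of Atlan's configuration. It also shows how an optional prefix and an object key combine into one path inside the bucket.

```python
from typing import Dict, List

# Hypothetical labels for the four authentication modes named in this reference.
ACCESS_SECRET_KEY = "access-secret-key"
CROSS_ACCOUNT = "cross-account"
ROLE_BASED = "role-based"
TENANT_BACKED = "tenant-backed"

# Fields that must be filled in for each mode, following the rules listed above.
REQUIRED_FIELDS = {
    ACCESS_SECRET_KEY: ["access_key", "secret_key", "region", "bucket"],
    CROSS_ACCOUNT: ["bucket"],
    ROLE_BASED: ["bucket"],
    TENANT_BACKED: [],
}


def missing_s3_fields(auth_mode: str, config: Dict[str, str]) -> List[str]:
    """Return the required fields that are absent or blank for the given mode."""
    if auth_mode not in REQUIRED_FIELDS:
        raise ValueError(f"unknown auth mode: {auth_mode!r}")
    return [
        name
        for name in REQUIRED_FIELDS[auth_mode]
        if not (config.get(name) or "").strip()
    ]


def object_path(prefix: str, object_key: str) -> str:
    """Join the optional prefix and the object key into one path in the bucket."""
    prefix = prefix.strip("/")
    return f"{prefix}/{object_key}" if prefix else object_key
```

With the example values above, `object_path("data-models/sales/", "sales-data-model-template.xlsx")` yields `data-models/sales/sales-data-model-template.xlsx`.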
Google Cloud Storage (GCS) provides durable, secure storage for objects. Use this option if your data model Excel file is stored in a GCS bucket.
Project ID
The ID of your Google Cloud project. You can find this in the Google Cloud Console > Home Dashboard > Project info panel.
- Required if you are using a customer-managed GCS bucket.
- Keep empty if you are using tenant-backed (Atlan-managed) storage.
Example:
my-data-model-project-123456
Service account JSON
A JSON key file containing service account credentials with permission to access the bucket. You can create this in the Google Cloud Console > IAM & Admin > Service accounts.
- Required if you are using a customer-managed GCS bucket.
- Keep empty if you are using tenant-backed (Atlan-managed) storage.
Example:
{
"type": "service_account",
"project_id": "my-data-model-project-123456",
"private_key_id": "abc123def456...",
"private_key": "-----BEGIN PRIVATE KEY-----\nMIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQC...\n-----END PRIVATE KEY-----\n",
"client_email": "data-model-import@my-data-model-project-123456.iam.gserviceaccount.com",
"client_id": "123456789012345678901",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://oauth2.googleapis.com/token"
}
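Before pasting a key file into the workflow, it can help to confirm that it parses and has the expected shape. The sketch below only checks for the keys shown in the example above; whether Atlan requires exactly this set is an assumption, and the function name is hypothetical.

```python
import json
from typing import List

# Keys shown in the example above. Assumption: this reference does not state
# which keys are strictly required, so this list is illustrative only.
EXPECTED_KEYS = [
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
    "client_id",
    "auth_uri",
    "token_uri",
]


def check_service_account_json(text: str) -> List[str]:
    """Return a list of problems found in a service account key; empty if none."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]
    problems = [f"missing key: {key}" for key in EXPECTED_KEYS if key not in data]
    if data.get("type") != "service_account":
        problems.append('"type" should be "service_account"')
    return problems
```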
Bucket
The name of the GCS bucket containing your data model file. You can find this in the Cloud Storage > Buckets page.
- Required if you are using a customer-managed GCS bucket.
- Keep empty if you are using tenant-backed (Atlan-managed) storage and don't know the specific bucket name.
Example:
my-company-data-model-storage
Prefix
Optional. Directory path within the bucket where your data model file is stored.
Example:
data-models/finance/
Object key
The file name (including extension) of your data model file within the bucket or prefix.
Example:
finance-data-model-template.xlsx
Azure Data Lake Storage (ADLS) provides scalable data lake storage for big data analytics. Use this option if your data model Excel file is stored in an ADLS container.
Client ID
The application (client) ID of your Azure Active Directory app registration. You can find this in the Azure Portal > Azure Active Directory > App registrations.
- Required if you are using customer-managed Azure storage.
- Keep empty if you are using tenant-backed (Atlan-managed) storage.
Example:
12345678-1234-1234-1234-123456789012
Client secret
The client secret value for your Azure AD app registration. This is generated when you create a new client secret in the app registration.
- Required if you are using customer-managed Azure storage.
- Keep empty if you are using tenant-backed (Atlan-managed) storage.
Example:
abc123def456ghi789jkl012mno345pqr678stu
Tenant ID
The Azure Active Directory tenant ID. You can find this in the Azure Portal > Azure Active Directory > Overview.
- Required if you are using customer-managed Azure storage.
- Keep empty if you are using tenant-backed (Atlan-managed) storage.
Example:
87654321-4321-4321-4321-210987654321
Account name
The name of your Azure Data Lake Storage account. You can find this in the Azure Portal > Storage accounts.
- Required if you are using customer-managed Azure storage.
- Keep empty if you are using tenant-backed (Atlan-managed) storage and don't know the specific account name.
Example:
mydatamodelstorage
Container
The name of the container within your ADLS account where your data model file is stored.
- Required if you are using customer-managed Azure storage.
- Keep empty if you are using tenant-backed (Atlan-managed) storage and don't know the specific container name.
Example:
data-models
Prefix
Optional. Directory path within the container where your data model file is stored.
Example:
sales-models/
Object key
The file name (including extension) of your data model file within the container or prefix.
Example:
sales-data-model-template.xlsx
Excel input
The input file for Data Model Ingestion must follow the provided Excel format. The Data Model Ingestion app works by reading a structured Excel file that describes your data model. This file contains up to four sheets, each serving a different purpose in defining and enriching your model:
- Objects: defines the core building blocks: models, entities, and attributes.
- Relations: captures how entities are connected to each other.
- Mappings: links entities and attributes across tiers.
- Implementations: binds your model to real-world assets in Atlan (Snowflake, BigQuery, etc.).
You can download the sample Excel template for the correct structure and example data.
At a minimum, you must populate the Objects sheet. The other sheets (Relations, Mappings, and Implementations) are optional, but recommended if you want to capture richer metadata.
If any of the optional sheets are left empty, those aspects of the model simply won't be ingested. Below are the required formats for each sheet.
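The sheet rules above can be sketched as a quick check on a workbook's sheet names. This is illustrative only: it assumes the sheets are named exactly Objects, Relations, Mappings, and Implementations, and it checks for presence by name rather than whether a sheet has rows. The names could come from a library such as openpyxl (`load_workbook(path).sheetnames`), which is not needed to run the sketch itself.

```python
from typing import Iterable, List, Tuple

# Assumption: sheet names in the template match these labels exactly.
REQUIRED_SHEETS = ["Objects"]
OPTIONAL_SHEETS = ["Relations", "Mappings", "Implementations"]


def check_sheets(sheet_names: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (missing required sheets, optional sheets that would be skipped)."""
    present = set(sheet_names)
    missing_required = [name for name in REQUIRED_SHEETS if name not in present]
    skipped_optional = [name for name in OPTIONAL_SHEETS if name not in present]
    return missing_required, skipped_optional
```

A non-empty first list means the file can't describe a model at all; the second list only reports which aspects won't be ingested.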
- Objects Sheet
- Relations
- Mappings
- Implementations
The Objects sheet defines the core building blocks of the data model—models, entities, and attributes. Each row in the Objects sheet represents either an attribute within an entity, or just the entity itself (if no attributes are defined).
- Model-level fields (Model Name, Model Display Name, Model Description, Model Type) remain the same for all entities that belong to that model.
- Entity-level fields (Entity Name, Entity Display Name, Entity Description) remain the same for all attributes of that entity.
- Attribute-level fields (Attribute Name, Attribute Display Name, Attribute Description, Attribute Data Type, Attribute Is PK) vary for each attribute.
Structure:
- Models are the top-level layer
- Each model can contain entities
- Each entity can contain attributes
Columns:
Column | Description | Example |
---|---|---|
Model Name* | The technical name of the model | SalesOverview |
Model Display Name* | The human-readable name shown in Atlan | Sales Overview |
Model Description | A summary of what this model represents | Comprehensive view of sales data and performance metrics |
Model Type* | Classification of the model | CONCEPTUAL , LOGICAL , or PHYSICAL |
Entity Name | The technical name of the entity | Account , Opportunity |
Entity Display Name | The human-readable name of the entity | Account , Sales Opportunity |
Entity Description | A business-friendly description of what this entity represents | Customer account information and details |
Attribute Name | The technical field/column name | AccountNumber , AccountName |
Attribute Display Name | A user-friendly name for the attribute | Account Number , Account Name |
Attribute Description | Describes what the attribute means | Unique identifier for the account |
Attribute Data Type | The type of data stored (must be all caps) | NUMBER , STRING , DATE |
Attribute Is PK | Marks whether this attribute is the Primary Key | TRUE or FALSE |
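The column rules in this table can be expressed as a row-level check. The sketch below assumes each row is a dictionary keyed by column name and that every row carries its own model-level values; the function name is hypothetical, and accepting only the literal strings TRUE and FALSE for Attribute Is PK is an assumption based on the example column.

```python
from typing import Dict, List

MODEL_TYPES = {"CONCEPTUAL", "LOGICAL", "PHYSICAL"}
# Columns marked with * in the table above.
REQUIRED_COLUMNS = ["Model Name", "Model Display Name", "Model Type"]


def validate_object_row(row: Dict[str, str]) -> List[str]:
    """Return a list of rule violations for one Objects row; empty if none."""
    errors = [
        f"{column} is required"
        for column in REQUIRED_COLUMNS
        if not (row.get(column) or "").strip()
    ]
    model_type = (row.get("Model Type") or "").strip()
    if model_type and model_type not in MODEL_TYPES:
        errors.append("Model Type must be one of CONCEPTUAL, LOGICAL, PHYSICAL")
    data_type = (row.get("Attribute Data Type") or "").strip()
    if data_type and data_type != data_type.upper():
        errors.append("Attribute Data Type must be all caps")
    is_pk = (row.get("Attribute Is PK") or "").strip()
    if is_pk and is_pk not in {"TRUE", "FALSE"}:
        errors.append("Attribute Is PK must be TRUE or FALSE")
    return errors
```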
Example: Sales overview with accounts and opportunities
Example: A SalesOverview model can include entities such as Account for customer details and Opportunity for deals, with attributes like AccountNumber or OpportunityId as unique identifiers.
| Model Name | Model Display Name | Model Description | Model Type | Entity Name | Entity Display Name | Entity Description | Attribute Name | Attribute Display Name | Attribute Description | Attribute Data Type | Attribute Is PK |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SalesOverview | Sales Overview | Comprehensive view of sales data | CONCEPTUAL | Account | Account | Customer account information | AccountNumber | Account Number | Unique identifier for the account | STRING | TRUE |
| | | | | | | | AccountName | Account Name | Name of the account | STRING | FALSE |
| | | | | Opportunity | Sales Opportunity | Sales opportunities and deals | OpportunityId | Opportunity ID | Unique identifier for opportunity | STRING | TRUE |
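To illustrate how rows like these describe a model, entity, and attribute hierarchy, the sketch below folds Objects rows into a nested dictionary. It assumes that a blank Model Name or Entity Name cell means "same as the nearest filled row above", which is how the example reads but isn't stated as a rule in this reference. The function name is hypothetical.

```python
from typing import Dict, List


def _cell(row: Dict[str, str], column: str) -> str:
    """Return a trimmed cell value, treating missing cells as blank."""
    return (row.get(column) or "").strip()


def build_hierarchy(rows: List[Dict[str, str]]) -> Dict[str, Dict[str, List[str]]]:
    """Group Objects rows into {model: {entity: [attribute names]}}."""
    tree: Dict[str, Dict[str, List[str]]] = {}
    model = ""
    entity = ""
    for row in rows:
        new_model = _cell(row, "Model Name")
        if new_model and new_model != model:
            # A new model starts; do not carry the previous entity into it.
            model, entity = new_model, ""
        entity = _cell(row, "Entity Name") or entity
        if not model:
            continue  # no model seen yet, nothing to attach this row to
        entities = tree.setdefault(model, {})
        if not entity:
            continue  # model-only row
        attributes = entities.setdefault(entity, [])
        attribute = _cell(row, "Attribute Name")
        if attribute:
            attributes.append(attribute)
    return tree
```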
The Relations sheet defines how entities connect to each other within models. Each row in the Relations sheet defines a relationship between two entities.
- Model-level fields (Model Type, Source Model, Destination Model) often stay the same for multiple rows.
- Entity-level fields (Source Entity, Destination Entity) remain the same for all attributes or sub-relationships under that entity pairing.
- Attribute-level fields (Relation Source Attribute, Relation Destination Attribute) are optional; fill them only if you want to define relationships at the attribute level instead of the entity level.
- Cardinality fields (Source Cardinality, Source Min Cardinality, Destination Cardinality, Destination Min Cardinality) use 1 for one and * for many.
Columns:
Column | Description | Example |
---|---|---|
Model Type* | Specifies the model type | CONCEPTUAL , LOGICAL , or PHYSICAL |
Source Model* | The model that contains the starting entity | SalesOverview |
Source Entity* | The entity from which the relationship originates | Account |
Destination Model* | The model containing the related entity | SalesOverview |
Destination Entity* | The entity to which the relationship points | Opportunity |
Relation Type* | Defines the type of relationship | Association , Inheritance |
Cardinality Nullability | Defines the relationship's multiplicity | ONE-TO-MANY , MANY-TO-MANY |
Relation Name | A descriptive name for the relationship | AccountHasOpportunities |
Relation Source Attribute | Attribute in the source entity used for the relationship (leave empty for entity-level relationships) | AccountNumber |
Relation Destination Attribute | Attribute in the destination entity used for the relationship (leave empty for entity-level relationships) | AccountNumber |
Source-to-Destination Label* | A user-friendly label | parentOf |
Destination-to-Source Label* | Reverse relationship label | childOf |
Source Cardinality | Defines the cardinality on the source side | ONE , MANY |
Source Min Cardinality | Defines the minimum occurrences on the source side | 0 , 1 |
Destination Cardinality | Defines the cardinality on the destination side | ONE , MANY |
Destination Min Cardinality | Defines the minimum occurrences on the destination side | 0 , 1 |
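As a reading aid for the cardinality and attribute columns, the sketch below turns one side's cardinality pair into a "min..max" range and classifies a row as entity-level or attribute-level. Both helpers are hypothetical. The range notation is borrowed from UML-style multiplicities rather than from Atlan, both spellings shown in this reference (1 / * and ONE / MANY) are accepted, and requiring the two attribute columns to be filled together is an assumption.

```python
from typing import Dict

# Both spellings appear in this reference, so both are accepted here.
CARDINALITY_ALIASES = {"1": "1", "ONE": "1", "*": "*", "MANY": "*"}


def multiplicity(cardinality: str, min_cardinality: str) -> str:
    """Format one side of a relation as 'min..max', for example '0..*'."""
    maximum = CARDINALITY_ALIASES.get(cardinality.strip().upper())
    if maximum is None:
        raise ValueError(f"unrecognized cardinality: {cardinality!r}")
    minimum = min_cardinality.strip()
    if not minimum.isdigit():
        raise ValueError(f"min cardinality must be a whole number: {min_cardinality!r}")
    return f"{minimum}..{maximum}"


def relation_level(row: Dict[str, str]) -> str:
    """Return 'attribute' if both attribute columns are filled, 'entity' if neither."""
    source = (row.get("Relation Source Attribute") or "").strip()
    destination = (row.get("Relation Destination Attribute") or "").strip()
    if source and destination:
        return "attribute"
    if not source and not destination:
        return "entity"
    # Assumption: a half-filled pair is treated as an input error.
    raise ValueError("fill both relation attribute columns or neither")
```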
Example: Defining relationships between entities
Example: The SalesOverview model can define relationships between Account and Contact. An Account may have many Contacts, while join tables such as AccountContactRelation capture the detailed associations.
| Model Type | Source Model | Source Entity | Destination Model | Destination Entity | Relation Type | Cardinality Nullability | Relation Name | Relation Source Attribute | Relation Destination Attribute | Source-to-Destination Label | Destination-to-Source Label | Source Cardinality | Source Min Cardinality | Destination Cardinality | Destination Min Cardinality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LOGICAL | SalesOverview | Account | SalesOverview | Contact | Association | ONE-TO-MANY | Enables | | | parentOf | childOf | 1 | 0 | * | 0 |
| | | Account | | AccountContactRelation | Association | | | AccountNumber | AccountNumber | relatedTo | relatedTo | 1 | 1 | * | 0 |
| | | Contact | | AccountContactRelation | Association | | | ContactId | ContactId | relatedTo | relatedTo | 1 | 1 | * | 0 |
The Mappings sheet defines how the layers of your data model connect across the three model tiers: Conceptual, Logical, and Physical. Each row links entities (and optionally attributes) in one model to their counterparts in another model, across tiers: Conceptual ↔ Logical ↔ Physical.
- Model-level fields (Source Model Type, Source Model, Destination Model Type, Destination Model) usually stay the same across multiple rows.
- Entity-level fields (Source Entity, Destination Entity) remain the same for all attributes in that mapping.
- Attribute-level fields (Source Attribute, Destination Attribute) are optional. Leave them blank if the mapping is at the entity level only.
Supported mapping modes:
- Conceptual → Logical → Physical (full hierarchy mapping)
- Conceptual → Logical
- Logical → Physical
- Standalone (within the same model type, mapping attributes/entities within itself)
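The modes listed above can be recognized from the Source Model Type and Destination Model Type of a single mapping row. The sketch below is illustrative: the function name and return labels are hypothetical, and it reports any other combination as "unlisted" rather than assuming it is rejected. The full Conceptual → Logical → Physical hierarchy is simply two such rows chained together.

```python
TIER_ORDER = {"CONCEPTUAL": 0, "LOGICAL": 1, "PHYSICAL": 2}


def mapping_mode(source_type: str, destination_type: str) -> str:
    """Classify one mapping row by its source and destination model types."""
    source = source_type.strip()
    destination = destination_type.strip()
    for value in (source, destination):
        if value not in TIER_ORDER:
            raise ValueError(f"unknown model type: {value!r}")
    if source == destination:
        return "standalone"
    if TIER_ORDER[destination] - TIER_ORDER[source] == 1:
        # One tier down: Conceptual -> Logical or Logical -> Physical.
        return f"{source.lower()}-to-{destination.lower()}"
    # Assumption: combinations not in the list above are only flagged here.
    return "unlisted"
```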
Columns:
Column | Description | Example |
---|---|---|
Source Model Type | Defines the type of the source model | CONCEPTUAL , LOGICAL , or PHYSICAL |
Source Model* | The name of the model from which the mapping originates | SalesOverview |
Source Entity* | The entity in the source model that you are mapping | Account |
Destination Model Type* | Specifies the type of model being mapped to | CONCEPTUAL , LOGICAL , or PHYSICAL |
Destination Model* | The target model name | SalesLogical |
Destination Entity* | The target entity name in the destination model | Customer |
Source Attribute | The attribute in the source entity (leave blank for entity-level mappings) | AccountNumber |
Destination Attribute | The attribute in the destination entity (leave blank for entity-level mappings) | CustomerId |
Example: Mapping conceptual, logical, and physical models
Example: The Agreement entity from the conceptual model maps to multiple entities in the SalesOverview logical model, including Contact, Opportunity, and Account. Logical entities such as Contact and Opportunity are then linked to their physical implementations, with attributes like ContactId or Amount mapped to columns in the physical schema.
| Source Model Type | Source Model | Source Entity | Destination Model Type | Destination Model | Destination Entity | Source Attribute | Destination Attribute |
|---|---|---|---|---|---|---|---|
| CONCEPTUAL | Concepts | Agreement | LOGICAL | SalesOverview | Contact | | |
| | | Agreement | | | Opportunity | | |
| | | Agreement | | | Account | | |
| LOGICAL | SalesOverview | Contact | PHYSICAL | Implementation | CONT | ContactId | CONT_ID |
| | | | | | | EmailAddress | |
| LOGICAL | SalesOverview | Opportunity | PHYSICAL | Implementation | OPP | Name | NAME |
| | | | | | | Amount | AMT |
| | | | | | | CloseDate | CLOSE_DT |
The Implementations sheet connects your logical or physical entities and attributes to actual system assets in Atlan (for example, Snowflake tables and columns).
- Model-level fields (Source Model Type, Source Model) remain the same for all rows within the same model.
- Entity-level field (Source Entity) stays the same for all attributes under that entity.
- Source Attribute is optional; leave it blank if you are linking at the entity level only.
- Destination fields (Destination Asset Type, Destination Connector Type, Destination Connection Name, Destination Asset Qualified Name) are always required.
Currently supported asset types are: MaterialisedView, Table, Column, GCS Object, S3Object, S3 Bucket, GCS Bucket, ADLSObject. If you need support for additional asset types, raise a support request.
Columns:
Column | Description | Example |
---|---|---|
Source Model Type* | Type of model | CONCEPTUAL , LOGICAL , or PHYSICAL |
Source Model* | The name of the model | SalesLogical |
Source Entity* | The entity being mapped | Customer |
Source Attribute | The specific attribute in that entity (leave empty for entity-level mappings) | CustomerId |
Destination Asset Type* | Type of asset in Atlan | Table , Column , MaterialisedView , GCS Object , S3Object , S3 Bucket , GCS Bucket , ADLSObject |
Destination Connector Type* | The system type | Snowflake , BigQuery , PostgreSQL |
Destination Connection Name* | The name of the connection configured in Atlan | production , staging |
Destination Asset Qualified Name* | The qualified name of the asset (excluding connection QN prefix) | ANALYTICS/WIDE_WORLD_IMPORTERS/people_snfw |
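The destination rules for this sheet can be checked row by row. The sketch below assumes each row is a dictionary keyed by column name and that asset type values match the supported list exactly as spelled in this reference; the function name is hypothetical. It checks only the destination columns, which are described as always required.

```python
from typing import Dict, List

# Asset types as spelled in this reference.
SUPPORTED_ASSET_TYPES = {
    "MaterialisedView",
    "Table",
    "Column",
    "GCS Object",
    "S3Object",
    "S3 Bucket",
    "GCS Bucket",
    "ADLSObject",
}
DESTINATION_COLUMNS = [
    "Destination Asset Type",
    "Destination Connector Type",
    "Destination Connection Name",
    "Destination Asset Qualified Name",
]


def validate_implementation_row(row: Dict[str, str]) -> List[str]:
    """Return a list of problems with the destination fields of one row."""
    errors = [
        f"{column} is required"
        for column in DESTINATION_COLUMNS
        if not (row.get(column) or "").strip()
    ]
    asset_type = (row.get("Destination Asset Type") or "").strip()
    if asset_type and asset_type not in SUPPORTED_ASSET_TYPES:
        errors.append(f"unsupported asset type: {asset_type}")
    return errors
```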
Example: Implementing logical entities in physical assets
Example: The CONT entity from the logical model is implemented in Snowflake as both a table and columns. Attributes such as CONT_ID and EMAIL are linked directly to their physical counterparts in the PEOPLE table, ensuring the logical model is connected to real assets in Atlan.
| Source Model Type | Source Model | Source Entity | Source Attribute | Destination Asset Type | Destination Connector Type | Destination Connection Name | Destination Asset Qualified Name |
|---|---|---|---|---|---|---|---|
| PHYSICAL | Implementation | CONT | | Table | Snowflake | production | WIDE_WORLD_IMPORTERS/BRONZE_APPLICATION/PEOPLE |
| | | | CONT_ID | Column | Snowflake | production | WIDE_WORLD_IMPORTERS/BRONZE_APPLICATION/PEOPLE/PERSONID |
| | | | | Column | Snowflake | production | WIDE_WORLD_IMPORTERS/BRONZE_APPLICATION/PEOPLE/EMAILADDRESS |
| | | | | Table | Snowflake | production | WIDE_WORLD_IMPORTERS/PROCESSED_SILVER/SILVER_PEOPLE |
Connection
The Connection property defines where the ingested data model is stored and how it links to assets in Atlan. A connection represents a technical integration with a data source such as Snowflake, BigQuery, or Redshift.
There are two options for configuring the connection:
- Create: This option establishes a new connection in Atlan. It's used when onboarding a new system or domain, or when the data model must remain logically separate from existing connections. The workflow collects connection details such as type, credentials, and configuration. Once created, the connection is available for reuse in future ingestion jobs.
  Example: Creating a dedicated Finance_Prod connection to hold the finance data model for the first time.
- Reuse: This option links the data model to an existing connection in Atlan. It's used when enriching or extending metadata for assets that are already cataloged under a configured connection. The ingestion job attaches the new metadata to the selected connection, keeping related information grouped under one integration.
  Example: Reusing an existing Snowflake_Prod connection when ingesting a sales conceptual model that maps onto Snowflake assets already present in Atlan.
Connection name
The connection name applies only when Create is selected and specifies the display name of the new connection. This name is used to identify the connection in the Atlan UI and across workflows.
The name must be unique and descriptive, reflecting the source system or domain being modeled (for example, Finance_Prod or Sales_Snowflake). Once the connection is created, the name is permanent and available for reuse in future ingestion workflows.
Connection admins
Applies only when Create is selected.
Defines the list of users who administer the connection. Connection admins have full control over the connection, including configuration and management of associated workflows.
- Assign specific users: Add one or more users as connection admins by specifying their Atlan usernames.
- Include all admins: Select this option to grant all tenant admins administrative rights on the connection.
Example: When creating a new connection named Finance_Prod, add specific users as connection admins by their Atlan usernames, or select Include all admins to grant rights to all tenant administrators.
Connection
The connection applies only when Reuse is selected and specifies the existing connection in Atlan where the data model is ingested. The ingestion process attaches the new metadata to the selected connection, keeping all related information grouped under one integration.
You must select the connection name from the list of available connections in the tenant and verify that it corresponds to the data source or domain that the ingested model is intended to enrich. Using an existing connection avoids duplication and maintains consistency across ingestion jobs. For example, when ingesting a sales conceptual model that maps to Snowflake assets already cataloged in Atlan, you can select the existing connection Snowflake_Prod.