
Data Model Ingestion
App

The Data Model Ingestion app loads logical and relational metadata from an input file into Atlan. It’s designed to help teams define how their data is structured, organized, and related within Atlan without manual setup.

Files can be provided either by uploading them directly from your local machine or by fetching them from a cloud object store. Supported object storage systems include Amazon S3, Google Cloud Storage (GCS), and Azure Data Lake Storage (ADLS).

This reference provides complete configuration details for data model ingestion, including input handling rules and the structure of the required input file.

Access

The Data Model Ingestion app isn't enabled by default. To use this app, contact Atlan support and request it be added to your tenant. Once enabled, data model ingestion workflows can be set up and run by admins or users with workflow permissions.

Source

This section defines how the input file for data model ingestion is provided and identified in Atlan.

Workflow name

Specifies the display name for the workflow in Atlan. This name is used to identify the ingestion job in the UI and logs. Choose a name that clearly reflects the purpose or scope of the import.

Example:

Production Finance Data Model

Import metadata from

This property defines how the file containing the data model is provided to the workflow. The file must match the Data Model file format.

There are two ways to provide the file:

  • Direct file uploads: Upload an Excel file directly from your local machine. This is useful for smaller files or ad-hoc imports. See Direct file uploads.
  • Object storage: Fetch the file from a supported cloud object store (S3, GCS, or ADLS). This is recommended for larger files or recurring imports. See Object storage.

Direct file uploads

Upload a file directly from your local machine. This option is best for smaller files or ad-hoc imports that are run manually.

File size limit

Direct file uploads are limited to ~10 MB. Only one file can be uploaded per run. For larger or recurring imports, use object storage.
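As a rough illustration of the guidance above, a pre-flight check could pick the import method from the file size and whether the import recurs. This is only a sketch: the documentation gives the limit as approximately 10 MB, so the exact constant, and the names `DIRECT_UPLOAD_LIMIT_BYTES` and `choose_import_method`, are assumptions for illustration and not part of Atlan.

```python
import os

# Assumed threshold: the limit is documented only as "~10 MB".
DIRECT_UPLOAD_LIMIT_BYTES = 10 * 1024 * 1024


def choose_import_method(size_bytes: int, recurring: bool = False) -> str:
    """Return 'direct_upload' or 'object_storage' for a data model file."""
    # Larger or recurring imports are steered to object storage.
    if recurring or size_bytes > DIRECT_UPLOAD_LIMIT_BYTES:
        return "object_storage"
    return "direct_upload"


def choose_import_method_for_file(path: str, recurring: bool = False) -> str:
    """Same decision, based on the size of a file on local disk."""
    return choose_import_method(os.path.getsize(path), recurring)
```

For example, a 2 MB workbook imported once would map to a direct upload, while the same workbook on a recurring schedule would map to object storage.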

Object storage

This property imports the data model file from a cloud object store rather than a local upload. It's recommended for large files and recurring ingestion jobs. Supported providers are Amazon S3, Google Cloud Storage (GCS), and Azure Data Lake Storage (ADLS). When this option is selected, additional storage-specific properties such as bucket, project ID, or container become available.

Amazon S3 enables you to store and retrieve objects at scale. You can use this option when the data model Excel file is stored in an S3 bucket.

AWS access key

The access key for your AWS account. You can find this in the AWS Management Console > IAM > Users > Security credentials tab.

  • Required if you are using access/secret key authentication.
  • Keep empty if you are using tenant-backed, cross-account, or role-based authentication.

Example:

AKIAIOSFODNN7EXAMPLE

AWS secret key

The secret key that pairs with your access key. This is generated when you create an access key in IAM. You must download it at creation time or rotate and generate a new one if lost.

  • Required if you are using access/secret key authentication.
  • Keep empty if you are using tenant-backed, cross-account, or role-based authentication.

Example:

wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

Region

The AWS region in which your bucket is located (for example, us-east-1). You can find this in the S3 service dashboard when selecting your bucket.

  • Required if you are using access/secret key authentication.
  • Keep empty if you are using tenant-backed, cross-account, or role-based authentication (region is inferred automatically).

Example:

ap-southeast-1
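The rules for the access key, secret key, and region can be summarized in a small check. This is a hedged sketch of those rules only: the function name `check_s3_auth_fields` and the `uses_key_auth` flag are hypothetical, and the app's own validation may behave differently.

```python
from typing import List, Optional


def check_s3_auth_fields(
    uses_key_auth: bool,
    access_key: Optional[str] = None,
    secret_key: Optional[str] = None,
    region: Optional[str] = None,
) -> List[str]:
    """Return a list of problems with the key-related S3 fields."""
    fields = {
        "AWS access key": access_key,
        "AWS secret key": secret_key,
        "Region": region,
    }
    problems = []
    for name, value in fields.items():
        filled = bool(value and value.strip())
        if uses_key_auth and not filled:
            # All three fields are required with access/secret key authentication.
            problems.append(f"{name} is required for access/secret key authentication")
        elif not uses_key_auth and filled:
            # Tenant-backed, cross-account, and role-based setups keep them empty.
            problems.append(f"{name} should be left empty for this authentication type")
    return problems
```

An empty list means the three fields are consistent with the chosen authentication type.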

Bucket

The name of the S3 bucket that contains your data model file. The bucket name is listed in the S3 service dashboard.

  • Required for cross-account, access/secret key, and role-based authentication.
  • Leave empty only if you are using tenant-backed authentication and don't know the specific bucket name.

Example:

my-company-data-models

Prefix

Optional. Directory path within the bucket where your data model file is stored.

Example:

data-models/sales/

Object key

The file name (including extension) of your data model file within the bucket or prefix.

Example:

sales-data-model-template.xlsx
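To show how the bucket, prefix, and object key fit together, the sketch below joins them into a standard `s3://` URI. The helper name `s3_object_path` is hypothetical, and how the app resolves the location internally is not described here, so treat this as a way to sanity-check the values you plan to enter.

```python
def s3_object_path(bucket: str, object_key: str, prefix: str = "") -> str:
    """Join bucket, optional prefix, and object key into an s3:// URI."""
    # Strip stray slashes so "data-models/sales/" and "data-models/sales" behave alike.
    parts = [p.strip("/") for p in (prefix, object_key) if p and p.strip("/")]
    return f"s3://{bucket}/" + "/".join(parts)


# Values taken from the examples in this section.
path = s3_object_path(
    bucket="my-company-data-models",
    object_key="sales-data-model-template.xlsx",
    prefix="data-models/sales/",
)
print(path)
```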

Excel input

The Data Model Ingestion app reads a structured Excel file that describes your data model, and the file must follow the provided Excel format. It contains up to four sheets, each serving a different purpose in defining and enriching your model:

  • Objects: defines the core building blocks: models, entities, and attributes.

  • Relations: captures how entities are connected to each other.

  • Mappings: links entities and attributes across tiers.

  • Implementations: binds your model to real-world assets in Atlan (Snowflake, BigQuery, etc.).

You can download the sample Excel template to see the correct structure and example data.

At a minimum, you must populate the Objects sheet. The other sheets (Relations, Mappings, and Implementations) are optional, but recommended if you want to capture richer metadata.

If any of the optional sheets are left empty, those aspects of the model simply won't be ingested. Below are the required formats for each sheet.

The Objects sheet defines the core building blocks of the data model: models, entities, and attributes. Each row in the Objects sheet represents either an attribute within an entity, or just the entity itself (if no attributes are defined).

  • Model-level fields (Model Name, Model Display Name, Model Description, Model Type) remain the same for all entities that belong to that model.
  • Entity-level fields (Entity Name, Entity Display Name, Entity Description) remain the same for all attributes of that entity.
  • Attribute-level fields (Attribute Name, Attribute Display Name, Attribute Description, Attribute Data Type, Attribute Is PK) vary for each attribute.

Structure:

  • Models are the top-level layer
  • Each model can contain entities
  • Each entity can contain attributes
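The model, entity, and attribute hierarchy can be pictured by grouping flat Objects rows into a nested structure. The sketch below assumes each row has already been read into a dictionary keyed by column name and that every row repeats its model and entity names. The function name `nest_rows` is hypothetical and this is not how the app itself is implemented.

```python
from typing import Dict, List


def nest_rows(rows: List[Dict[str, str]]) -> Dict[str, Dict[str, List[str]]]:
    """Group flat Objects rows into {model: {entity: [attribute, ...]}}."""
    tree: Dict[str, Dict[str, List[str]]] = {}
    for row in rows:
        entities = tree.setdefault(row["Model Name"], {})
        entity = row.get("Entity Name", "")
        if not entity:
            continue  # row describes a model with no entity
        attributes = entities.setdefault(entity, [])
        attribute = row.get("Attribute Name", "")
        if attribute:  # entity-only rows carry no attribute
            attributes.append(attribute)
    return tree


rows = [
    {"Model Name": "SalesOverview", "Entity Name": "Account", "Attribute Name": "AccountNumber"},
    {"Model Name": "SalesOverview", "Entity Name": "Account", "Attribute Name": "AccountName"},
    {"Model Name": "SalesOverview", "Entity Name": "Opportunity", "Attribute Name": "OpportunityId"},
]
tree = nest_rows(rows)
```

Here `tree` holds one model, `SalesOverview`, with two entities and their attributes listed in row order.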

Columns:

| Column | Description | Example |
| --- | --- | --- |
| Model Name* | The technical name of the model | SalesOverview |
| Model Display Name* | The human-readable name shown in Atlan | Sales Overview |
| Model Description | A summary of what this model represents | Comprehensive view of sales data and performance metrics |
| Model Type* | Classification of the model | CONCEPTUAL, LOGICAL, or PHYSICAL |
| Entity Name | The technical name of the entity | Account, Opportunity |
| Entity Display Name | The human-readable name of the entity | Account, Sales Opportunity |
| Entity Description | A business-friendly description of what this entity represents | Customer account information and details |
| Attribute Name | The technical field/column name | AccountNumber, AccountName |
| Attribute Display Name | A user-friendly name for the attribute | Account Number, Account Name |
| Attribute Description | Describes what the attribute means | Unique identifier for the account |
| Attribute Data Type | The type of data stored (must be all caps) | NUMBER, STRING, DATE |
| Attribute Is PK | Marks whether this attribute is the Primary Key | TRUE or FALSE |

Example: Sales overview with accounts and opportunities

Example: A SalesOverview model can include entities such as Account for customer details and Opportunity for deals, with attributes like AccountNumber or OpportunityId as unique identifiers.

| Model Name | Model Display Name | Model Description | Model Type | Entity Name | Entity Display Name | Entity Description | Attribute Name | Attribute Display Name | Attribute Description | Attribute Data Type | Attribute Is PK |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SalesOverview | Sales Overview | Comprehensive view of sales data | CONCEPTUAL | Account | Account | Customer account information | AccountNumber | Account Number | Unique identifier for the account | STRING | TRUE |
| | | | | | | | AccountName | Account Name | Name of the account | STRING | FALSE |
| | | | | Opportunity | Sales Opportunity | Sales opportunities and deals | OpportunityId | Opportunity ID | Unique identifier for opportunity | STRING | TRUE |
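The column rules for the Objects sheet can be checked before a file is submitted. The sketch below is a minimal, unofficial validator. It assumes the asterisk marks a required column, that each row is a dictionary keyed by column name, and that any blank cells carried over from merged cells have already been filled in. The names `validate_objects_row`, `REQUIRED_COLUMNS`, and `MODEL_TYPES` are hypothetical.

```python
from typing import Dict, List

# Assumption: columns marked with * in the Columns table are required.
REQUIRED_COLUMNS = ("Model Name", "Model Display Name", "Model Type")
MODEL_TYPES = {"CONCEPTUAL", "LOGICAL", "PHYSICAL"}


def validate_objects_row(row: Dict[str, str]) -> List[str]:
    """Return a list of rule violations for one Objects sheet row."""
    errors = []
    for column in REQUIRED_COLUMNS:
        if not (row.get(column) or "").strip():
            errors.append(f"{column} is required")
    model_type = (row.get("Model Type") or "").strip()
    if model_type and model_type not in MODEL_TYPES:
        errors.append("Model Type must be CONCEPTUAL, LOGICAL, or PHYSICAL")
    data_type = (row.get("Attribute Data Type") or "").strip()
    if data_type and data_type != data_type.upper():
        errors.append("Attribute Data Type must be all caps")
    is_pk = (row.get("Attribute Is PK") or "").strip()
    if is_pk and is_pk not in {"TRUE", "FALSE"}:
        errors.append("Attribute Is PK must be TRUE or FALSE")
    return errors
```

An empty list means the row satisfies the rules listed in the Columns table. It does not guarantee that the app accepts the file.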

Connection

The Connection property defines where the ingested data model is stored and how it links to assets in Atlan. A connection represents a technical integration with a data source such as Snowflake, BigQuery, or Redshift.

There are two options for configuring the connection:

  • Create: This option establishes a new connection in Atlan. It's used when onboarding a new system or domain, or when the data model must remain logically separate from existing connections. The workflow collects connection details such as type, credentials, and configuration. Once created, the connection is available for reuse in future ingestion jobs.

    Example: Creating a dedicated Finance_Prod connection to hold the finance data model for the first time.

  • Reuse: This option links the data model to an existing connection in Atlan. It's used when enriching or extending metadata for assets that are already cataloged under a configured connection. The ingestion job attaches the new metadata to the selected connection, keeping related information grouped under one integration.

    Example: Reusing an existing Snowflake_Prod connection when ingesting a sales conceptual model that maps onto Snowflake assets already present in Atlan.

Connection name

The connection name applies only when Create is selected and specifies the display name of the new connection. This name is used to identify the connection in the Atlan UI and across workflows.

The name must be unique and descriptive, reflecting the source system or domain being modeled (for example, Finance_Prod or Sales_Snowflake). Once the connection is created, the name is permanent and available for reuse in future ingestion workflows.

Connection admins

Applies only when Create is selected.

Defines the list of users who administer the connection. Connection admins have full control over the connection, including configuration and management of associated workflows.

  • Assign specific users: Add one or more users as connection admins by specifying their Atlan usernames.
  • Include all admins: Select this option to grant all tenant admins administrative rights on the connection.

Example: When creating a new connection named Finance_Prod, specify the Atlan usernames of two users as connection admins, or select Include all admins to grant rights to all tenant administrators.

Connection

The connection applies only when Reuse is selected and specifies the existing connection in Atlan where the data model is ingested. The ingestion process attaches the new metadata to the selected connection, keeping all related information grouped under one integration.

You must select the connection name from the list of available connections in the tenant and verify that it corresponds to the data source or domain that the ingested model is intended to enrich. Using an existing connection avoids duplication and maintains consistency across ingestion jobs. For example, when ingesting a sales conceptual model that maps to Snowflake assets already cataloged in Atlan, you can select the existing connection Snowflake_Prod.
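The way the Create and Reuse options determine which connection properties apply can be sketched as a small settings object. Everything here is illustrative: the class `ConnectionSettings`, its field names, and the rule that a new connection needs at least one admin are assumptions rather than documented behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConnectionSettings:
    mode: str  # "create" or "reuse"
    connection_name: Optional[str] = None       # applies only to "create"
    admin_usernames: List[str] = field(default_factory=list)
    include_all_admins: bool = False
    existing_connection: Optional[str] = None   # applies only to "reuse"

    def problems(self) -> List[str]:
        """Return inconsistencies between the mode and the filled-in fields."""
        issues = []
        if self.mode == "create":
            if not self.connection_name:
                issues.append("Connection name is required when creating a connection")
            # Assumption: a new connection needs at least one admin.
            if not self.admin_usernames and not self.include_all_admins:
                issues.append("Assign specific users or include all admins")
        elif self.mode == "reuse":
            if not self.existing_connection:
                issues.append("Select an existing connection to reuse")
            if self.connection_name or self.admin_usernames or self.include_all_admins:
                issues.append("Connection name and admins apply only when creating")
        else:
            issues.append("mode must be 'create' or 'reuse'")
        return issues
```

For instance, `ConnectionSettings(mode="reuse", existing_connection="Snowflake_Prod").problems()` returns an empty list, matching the reuse example above.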
