FAQ
This FAQ answers common questions about the Lakehouse platform, including what data it exposes, how fresh it is, and current limitations.
What metadata is included in the Lakehouse?
The Lakehouse is designed to include all of the metadata stored in Atlan's metastore. For a detailed overview of Atlan's metadata model and assets, see the metadata assets model.
Each table in the Lakehouse corresponds to an entity in Atlan's metadata model. The data in each table includes metadata for that entity, including:
- Metadata on attributes inherited from the entity's supertypes. For example: asset type, GUID, created by.
- Metadata on attributes specific to instances of the entity (and all of its subtypes). For example, for a table asset: column count and row count.
- Metadata on relationships, both inherited from the entity's supertypes and specific to instances of the entity (and all of its subtypes).

For example, the column table contains Atlan metadata for all instances of the column entity in Atlan.
The data types for each table are equivalent to the data types defined in the corresponding entity's documentation. For example, see the column entity model.
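To make that shape concrete, here is a minimal sketch of what one row of the column table might look like, with inherited and entity-specific attributes side by side. The field names and values are illustrative assumptions, not Atlan's exact Lakehouse schema; consult the column entity model for the authoritative attribute list.

```python
# Illustrative sketch only: a single row of the hypothetical "column"
# table. Field names below are assumptions, not the documented schema.
column_row = {
    # Attributes inherited from the entity's supertypes
    "type_name": "Column",          # asset type
    "guid": "b4e8c9d2-0000-0000",   # hypothetical GUID
    "created_by": "jsmith",
    # Attributes specific to column instances
    "data_type": "VARCHAR",         # the column's data type in its source
    "order": 3,                     # the column's position in its table
}
```

The point is the layout: every table flattens the entity's inherited and entity-specific attributes into columns, with types matching the entity's documentation.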
What is the latency of data in the Lakehouse?
The data in the Lakehouse usually has a latency of up to 15 minutes. This is due to the process of extracting data from Atlan's internal metadata store into the Lakehouse.
What typedef coverage is currently supported?
The Lakehouse currently includes entity attributes with primitive types, such as strings, timestamps, and numerical types like int and long. Entity attributes with more complex types, such as enums, structs, nested structs, or arrays (except arrays of strings), are currently unsupported. The team plans to achieve full asset and attribute coverage over time.
What cloud storage options are available?
All files for a customer's Lakehouse reside in an Atlan-managed Amazon S3 bucket, regardless of the cloud provider where the customer's Atlan tenant is deployed. These files include Iceberg metadata (such as metadata files, manifest lists, and snapshots) and Parquet data files.
Support for cloud provider-specific storage is on the roadmap. For example, if a customer's Atlan tenant is deployed on GCP, the goal is for that tenant's Lakehouse data to be stored in GCS. The same may apply to Azure and ADLS.
Which query engines support Iceberg REST Catalog compatibility?
Any Iceberg REST-compatible client can query the Lakehouse, because its catalog is based on Apache Polaris (incubating), which implements the Apache Iceberg REST Catalog API.
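Because the catalog speaks the standard Iceberg REST Catalog API, even a plain HTTP client can reach it. The sketch below, using only the Python standard library, builds the LoadTable endpoint URL defined by the Iceberg REST Catalog OpenAPI spec and fetches table metadata. The base URI, bearer token, and the `atlan`/`column` namespace and table names are placeholder assumptions, not documented Lakehouse values.

```python
# Minimal sketch of calling an Iceberg REST catalog with only the
# standard library. Endpoint paths follow the Iceberg REST Catalog
# OpenAPI spec; URIs, tokens, and names below are placeholders.
import json
import urllib.request


def table_metadata_url(base_uri: str, namespace: str, table: str,
                       prefix: str = "") -> str:
    """Build the LoadTable endpoint URL from the Iceberg REST spec."""
    parts = ["v1"]
    if prefix:  # some catalogs return a path prefix from GET /v1/config
        parts.append(prefix)
    parts += ["namespaces", namespace, "tables", table]
    return base_uri.rstrip("/") + "/" + "/".join(parts)


def load_table(base_uri: str, token: str, namespace: str, table: str) -> dict:
    """Fetch table metadata (schema, snapshots, location) as JSON."""
    req = urllib.request.Request(
        table_metadata_url(base_uri, namespace, table),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

In practice you would use an Iceberg-aware client (PyIceberg, Spark, Trino, and so on) rather than raw HTTP, but the same endpoints are what those clients call under the hood.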
Today, you can query the Lakehouse catalog directly by creating an external catalog integration in any Iceberg REST-compatible engine. However, some vendors are still building compatibility with Iceberg REST catalogs. For example, as of July 1, 2025, Databricks' federation capabilities don't support Iceberg REST catalogs.
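As one illustration of such an external catalog integration, Apache Spark can register an Iceberg REST catalog through standard Iceberg configuration properties. This is a hedged sketch, not Atlan's documented setup: the catalog name (lakehouse), endpoint URI, and credential are placeholders to replace with your tenant's values.

```properties
# Register the Lakehouse as a Spark catalog via Iceberg's REST catalog
# support. The URI and credential below are placeholders.
spark.sql.catalog.lakehouse=org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.lakehouse.type=rest
spark.sql.catalog.lakehouse.uri=https://<your-lakehouse-catalog-endpoint>
spark.sql.catalog.lakehouse.credential=<client-id>:<client-secret>
```

Other Iceberg REST-compatible engines expose equivalent settings under their own configuration syntax.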
Atlan is working with vendors to accelerate their Iceberg REST catalog support. For example, Databricks has shared a private preview timeline for Q3 2025, and Atlan is also working with Google to define a similar timeline for BigQuery.
What are the cloud provider requirements for compute?
To query the Lakehouse, your compute must be on the same cloud provider where your Atlan tenant is deployed. For example, if your Atlan tenant is deployed on AWS, your Snowflake instance must also be deployed on AWS. Cross-cloud querying of the Lakehouse is currently unsupported.
Need help?
If you have questions about the Lakehouse that aren't covered in this FAQ, contact Atlan support by submitting a support request.