What does Atlan crawl from Iceberg?
Atlan crawls metadata for the following Iceberg asset types: catalogs, namespaces (including nested namespaces), tables, and columns (including nested columns).
Lineage
Atlan establishes the following lineage between Iceberg assets:
- Catalog -> Namespaces: Each catalog contains multiple namespaces.
- Namespace -> Tables: Each namespace contains multiple tables.
- Namespace -> Namespaces: Nested namespaces have parent-child relationships.
- Table -> Columns: Each table contains multiple columns.
- Column -> Columns: Nested columns have parent-child relationships (for STRUCT, LIST, and MAP types).
Assets
Atlan crawls the following Iceberg assets and metadata fields.
IcebergCatalog
Iceberg catalogs represent the top-level catalog instances that contain namespaces and tables.
| Source field | Atlan field | Description |
|---|---|---|
| catalog_name | name | Catalog name |
| catalog_name | qualifiedName | Unique qualified name for the catalog |
| catalog_type | icebergCatalogType | Type of catalog (for example, rest) |
| uri | icebergUri | REST catalog URI |
| iceberg_warehouse | icebergWarehouse | Warehouse identifier |
| scope | icebergScope | Access scope configuration |
| total_namespaces | schemaCount | Number of namespaces in the catalog |
| iceberg_catalog_properties | icebergCatalogProperties | Catalog configuration properties |
IcebergNamespace
Namespaces represent logical containers for organizing tables within a catalog. Iceberg supports nested namespaces.
| Source field | Atlan field | Description |
|---|---|---|
| namespace_str | name | Namespace name (leaf segment for nested namespaces) |
| namespace_str | qualifiedName | Unique qualified name for the namespace |
| namespace_hierarchy | icebergNamespaceHierarchy | Ordered namespace hierarchy path |
| namespace_str | icebergParentNamespaceQualifiedName | Parent namespace qualified name (for nested namespaces) |
| table_count | tableCount | Number of tables in the namespace |
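
The nested-namespace fields above can be illustrated with a minimal sketch. It assumes namespace levels are joined with a dot (`analytics.sales.emea`); the `NamespaceInfo` class and `describe_namespace` function are hypothetical names for illustration, not part of Atlan's crawler, and the actual qualified-name format may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NamespaceInfo:
    name: str              # leaf segment of the namespace
    hierarchy: List[str]   # ordered levels from the root to the leaf
    parent: Optional[str]  # parent namespace string, None at the top level


def describe_namespace(namespace_str: str, separator: str = ".") -> NamespaceInfo:
    """Split a nested namespace string into leaf name, hierarchy, and parent.

    Assumption: levels are joined with `separator`. This is illustrative and
    not necessarily how the crawler encodes nested namespaces.
    """
    levels = [part for part in namespace_str.split(separator) if part]
    if not levels:
        raise ValueError("namespace_str must contain at least one level")
    parent = separator.join(levels[:-1]) or None
    return NamespaceInfo(name=levels[-1], hierarchy=levels, parent=parent)


info = describe_namespace("analytics.sales.emea")
print(info.name, info.hierarchy, info.parent)
```

A top-level namespace such as `analytics` has no parent, so `parent` is `None` and the hierarchy holds a single level.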
IcebergTable
Iceberg tables represent table assets with metadata including partitions, snapshots, and table-level properties.
| Source field | Atlan field | Description |
|---|---|---|
| table_name | name | Table name |
| table_name + namespace context | qualifiedName | Unique qualified name for the table |
| table_uuid | assetSourceId | Source identifier for the table |
| location | externalLocation | Storage location of table data |
| location | externalLocationRegion | Parsed storage region (when derivable) |
| schema_fields | columnCount | Number of columns |
| snapshots.summary.total-records | rowCount | Number of records |
| snapshots.summary.total-files-size | sizeBytes | Table size in bytes |
| partitions | isPartitioned | Whether the table is partitioned |
| current_snapshot_id | icebergCurrentSnapshotId | Current snapshot identifier |
| last_updated_ms | sourceUpdatedAt | Last updated timestamp on source |
| source_created_at | sourceCreatedAt | Created timestamp on source |
| format_version | icebergFormatVersion | Iceberg format version |
| properties | icebergTableProperties | Table-level properties |
| partitions | icebergTablePartitions | Partition specification details |
| snapshots | icebergSnapshots | Snapshot metadata |
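
As a rough illustration of how `rowCount` and `sizeBytes` relate to snapshot metadata, the sketch below reads `total-records` and `total-files-size` from the summary of the current snapshot. The input shape (a list of snapshot dictionaries keyed by `snapshot-id`) and the function name `current_snapshot_stats` are assumptions for illustration; Iceberg stores summary values as strings, so the sketch converts them to integers. Choosing the current snapshot as the source of these values is also an assumption.

```python
from typing import Any, Dict, List, Optional, Tuple


def current_snapshot_stats(
    snapshots: List[Dict[str, Any]],
    current_snapshot_id: Optional[int],
) -> Tuple[Optional[int], Optional[int]]:
    """Return (row_count, size_bytes) from the current snapshot's summary.

    Returns (None, None) when the table has no current snapshot or the
    snapshot cannot be found. Missing summary keys yield None for that value.
    """
    if current_snapshot_id is None:
        return None, None
    for snapshot in snapshots:
        if snapshot.get("snapshot-id") == current_snapshot_id:
            summary = snapshot.get("summary", {})
            rows = summary.get("total-records")
            size = summary.get("total-files-size")
            return (
                int(rows) if rows is not None else None,
                int(size) if size is not None else None,
            )
    return None, None


# Hypothetical snapshot metadata for a table with two snapshots.
snapshots = [
    {"snapshot-id": 1, "summary": {"operation": "append",
                                   "total-records": "100",
                                   "total-files-size": "2048"}},
    {"snapshot-id": 2, "summary": {"operation": "append",
                                   "total-records": "250",
                                   "total-files-size": "5120"}},
]
row_count, size_bytes = current_snapshot_stats(snapshots, current_snapshot_id=2)
print(row_count, size_bytes)
```

A newly created table with no snapshots yields `(None, None)` rather than zero, which keeps "unknown" distinct from "empty".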
IcebergColumn
Columns represent table fields, including nested field metadata for complex types.
| Source field | Atlan field | Description |
|---|---|---|
| column_name | name | Column name |
| column_name / column_path + table context | qualifiedName | Unique qualified name for the column |
| data_type | dataType | Column data type |
| nullable | isNullable | Whether the column accepts null values |
| is_partition | isPartition | Whether the column is a partition column |
| description | description | Column description |
| sub_type | subType | Nested subtype marker |
| column_depth_level | columnDepthLevel | Nesting depth level |
| column_order | order | Column order within parent scope |
| nested_column_order | nestedColumnOrder | Hierarchical order for nested fields |
| nested_column_count | nestedColumnCount | Number of child columns |
| parent_column_name | parentColumnName | Parent column name |
| parent_column_qualified_name | parentColumnQualifiedName | Parent column qualified name |
| column_hierarchy | columnHierarchy | Ancestor hierarchy for nested columns |
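
To make the nested-column fields more concrete, here is a minimal sketch that flattens a nested schema into one record per column. The input shape (dictionaries with `name` and an optional `fields` list), the `flatten_columns` function, the dot-separated `column_path`, and the choice to number depth from 1 at the top level are all assumptions for illustration; the values Atlan actually stores may be encoded differently.

```python
from typing import Any, Dict, Iterator, List, Optional


def flatten_columns(
    fields: List[Dict[str, Any]],
    ancestors: Optional[List[str]] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield one record per column, parents before their children."""
    ancestors = ancestors or []
    for order, field in enumerate(fields, start=1):
        children = field.get("fields", [])
        yield {
            "column_name": field["name"],
            "column_path": ".".join(ancestors + [field["name"]]),
            "column_depth_level": len(ancestors) + 1,  # assumed: top level is 1
            "column_order": order,                     # position within the parent
            "nested_column_count": len(children),
            "parent_column_name": ancestors[-1] if ancestors else None,
            "column_hierarchy": list(ancestors),       # ancestors, root first
        }
        # Recurse into child fields, extending the ancestor path.
        yield from flatten_columns(children, ancestors + [field["name"]])


# Hypothetical schema: a STRUCT column containing another STRUCT.
schema = [
    {"name": "id"},
    {"name": "address", "fields": [
        {"name": "street"},
        {"name": "geo", "fields": [{"name": "lat"}, {"name": "lon"}]},
    ]},
]
rows = list(flatten_columns(schema))
for row in rows:
    print(row["column_path"], row["column_depth_level"], row["parent_column_name"])
```

In this sketch `address.geo.lat` sits at depth 3 with `geo` as its parent and `["address", "geo"]` as its ancestor hierarchy.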