S3 Inventory Report Structure
This reference outlines the expected folder layout and file format for Amazon S3 inventory reports used by Atlan's S3 crawler when running inventory-based ingestion.
Important: The crawler supports a single destination bucket, with an optional prefix, that stores the inventory reports from all source buckets.
Folder structure
To enable successful inventory-based crawling, your destination bucket must follow this structure once inventory reports are generated:
📦 <destination-bucket>/
├── [📁 <optional-prefix>/] (applies to all buckets)
│   ├── 📁 <source-bucket-1>/
│   │   ├── 📁 <inventory-config-1>/
│   │   │   ├── 📁 <YYYY-MM-DDTHH-MMZ>/ (timestamp folders)
│   │   │   │   └── 📄 manifest.json
│   │   │   └── 📁 data/ (.csv.gz or .parquet files)
│   │   │       └── 📄 <inventory-file>.csv.gz
│   │   └── 📁 <inventory-config-2>/
│   └── 📁 <source-bucket-2>/
└── ...
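The layout above can be expressed as a few string-building helpers. This is a minimal sketch for reasoning about where objects are expected to land. The function names (`inventory_root`, `manifest_key`, `data_prefix`) are hypothetical and are not part of Atlan or AWS tooling.

```python
from typing import Optional


def inventory_root(source_bucket: str, config_name: str,
                   prefix: Optional[str] = None) -> str:
    """Key prefix for one inventory configuration of one source bucket."""
    # The prefix is optional; skip it when it is None or empty.
    parts = [p.strip("/") for p in (prefix, source_bucket, config_name) if p]
    return "/".join(parts) + "/"


def manifest_key(source_bucket: str, config_name: str, timestamp: str,
                 prefix: Optional[str] = None) -> str:
    """Key of manifest.json inside a timestamp folder."""
    root = inventory_root(source_bucket, config_name, prefix)
    return f"{root}{timestamp}/manifest.json"


def data_prefix(source_bucket: str, config_name: str,
                prefix: Optional[str] = None) -> str:
    """Key prefix of the data/ folder, a sibling of the timestamp folders."""
    return inventory_root(source_bucket, config_name, prefix) + "data/"


if __name__ == "__main__":
    print(manifest_key("source-bucket-1", "daily-inventory",
                       "2024-01-16T00-00Z", prefix="inventory-reports"))
    # inventory-reports/source-bucket-1/daily-inventory/2024-01-16T00-00Z/manifest.json
```

Because the same prefix applies to every source bucket, it is passed separately rather than being baked into the bucket or configuration name.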
Required | Component | Description | Example |
---|---|---|---|
Yes | Destination bucket | Single S3 bucket that stores all inventory reports | atlan-inventory-reports |
No | Prefix | Optional folder prefix to organize reports (same for all source buckets) | inventory-reports |
Yes | Source bucket folder | Folder named after each source bucket | source-bucket-1 |
Yes | Inventory config folder | Folder named after the inventory configuration | daily-inventory |
Yes | Timestamp folder | Folder named with the report generation timestamp | 2024-01-16T00-00Z |
Yes | Manifest file | manifest.json containing report metadata | manifest.json |
Yes | Data folder | Folder containing compressed inventory files | data |
Yes | Inventory files | .csv.gz or .parquet files with the actual inventory data | inventory-2024-01-16-00-00Z.csv.gz |
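As a rough self-check, the components in the table can be verified against a list of object keys from the destination bucket. The sketch below is illustrative only: `find_missing_components` is a hypothetical helper, the timestamp pattern is inferred from the example `2024-01-16T00-00Z`, and the keys are assumed to have been listed already (for example, with an S3 client of your choice). It does not reproduce the crawler's own validation.

```python
import re
from typing import Dict, Iterable, List, Optional, Set, Tuple

# Assumption: timestamp folders look like 2024-01-16T00-00Z.
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}-\d{2}Z")
DATA_SUFFIXES = (".csv.gz", ".parquet")
REQUIRED = {"manifest", "data"}


def find_missing_components(
    keys: Iterable[str], prefix: Optional[str] = None
) -> Dict[Tuple[str, str], List[str]]:
    """Map each (source bucket, config) pair to the components it lacks."""
    base = prefix.strip("/") + "/" if prefix else ""
    seen: Dict[Tuple[str, str], Set[str]] = {}
    for key in keys:
        if not key.startswith(base):
            continue  # outside the shared prefix
        parts = key[len(base):].split("/")
        if len(parts) != 4:
            continue  # not <bucket>/<config>/<folder>/<file>
        bucket, config, folder, name = parts
        found = seen.setdefault((bucket, config), set())
        if TIMESTAMP_RE.fullmatch(folder) and name == "manifest.json":
            found.add("manifest")
        elif folder == "data" and name.endswith(DATA_SUFFIXES):
            found.add("data")
    return {
        pair: sorted(REQUIRED - found)
        for pair, found in seen.items()
        if found != REQUIRED
    }
```

An empty result means every source bucket and configuration pair seen in the listing has at least one manifest and one data file. Keys that do not fit the four-level layout are ignored rather than reported.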
Examples
Basic structure (single source bucket)
📦 atlan-inventory-reports/
└── 📁 source-bucket-1/
    └── 📁 inventory-config-1/
        ├── 📁 2024-01-16T00-00Z/
        │   └── 📄 manifest.json (required metadata file)
        └── 📁 data/
            └── 📄 inventory-2024-01-16-00-00Z.csv.gz
Multiple source buckets
📦 atlan-inventory-reports/
├── 📁 source-bucket-1/
│   └── 📁 inventory-config-1/
│       ├── 📁 2024-01-16T00-00Z/
│       │   └── 📄 manifest.json (required metadata file)
│       └── 📁 data/
│           └── 📄 inventory-2024-01-16-00-00Z.csv.gz
└── 📁 source-bucket-2/
    └── 📁 inventory-config-1/
        ├── 📁 2024-01-16T00-00Z/
        │   └── 📄 manifest.json (required metadata file)
        └── 📁 data/
            └── 📄 inventory-2024-01-16-00-00Z.csv.gz
With optional prefix
📦 atlan-inventory-reports/
└── 📁 inventory-reports/
    ├── 📁 source-bucket-1/
    │   └── 📁 inventory-config-1/
    │       ├── 📁 2024-01-16T00-00Z/
    │       │   └── 📄 manifest.json (required metadata file)
    │       └── 📁 data/
    │           └── 📄 inventory-2024-01-16-00-00Z.csv.gz
    └── 📁 source-bucket-2/
        └── 📁 inventory-config-1/
            ├── 📁 2024-01-16T00-00Z/
            │   └── 📄 manifest.json (required metadata file)
            └── 📁 data/
                └── 📄 inventory-2024-01-16-00-00Z.csv.gz
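When several timestamp folders accumulate under one inventory configuration, it can help to locate the most recent manifest for manual inspection. The sketch below assumes the folder names follow the `2024-01-16T00-00Z` pattern shown in the examples and that the keys passed in belong to a single configuration. `latest_manifest` is a hypothetical helper, and nothing here implies which report the crawler itself reads.

```python
from datetime import datetime
from typing import Iterable, Optional

# Assumption: timestamp folders are named like 2024-01-16T00-00Z (UTC).
TIMESTAMP_FORMAT = "%Y-%m-%dT%H-%MZ"


def latest_manifest(keys: Iterable[str]) -> Optional[str]:
    """Return the manifest.json key with the newest timestamp folder."""
    best = None  # (parsed timestamp, key)
    for key in keys:
        parts = key.split("/")
        if len(parts) < 2 or parts[-1] != "manifest.json":
            continue
        try:
            stamp = datetime.strptime(parts[-2], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # parent folder is not a timestamp folder
        if best is None or stamp > best[0]:
            best = (stamp, key)
    return best[1] if best else None
```

Parsing the folder name as a date, rather than comparing strings, makes the helper skip folders that merely resemble a timestamp.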
See also
- Set up inventory reports for S3