
S3 Inventory Report Structure

This reference outlines the expected folder layout and file format for Amazon S3 inventory reports used by Atlan’s S3 crawler when running inventory-based ingestion.

Important

The crawler supports only one destination bucket, with an optional prefix. Inventory reports from all source buckets must be stored in that bucket.

Folder structure

To enable successful inventory-based crawling, your destination bucket must follow this structure once inventory reports are generated:

📦 <destination-bucket>/
├── [📁 <optional-prefix>/] (applies to all buckets)
│   ├── 📁 <source-bucket-1>/
│   │   ├── 📁 <inventory-config-1>/
│   │   │   ├── 📁 <YYYY-MM-DDTHH-MMZ>/ (timestamp folders)
│   │   │   │   └── 📄 manifest.json
│   │   │   └── 📁 data/ (compressed inventory files)
│   │   │       └── 📄 <inventory-file>.csv.gz
│   │   └── 📁 <inventory-config-2>/
│   └── 📁 <source-bucket-2>/
└── ...

| Required | Component | Description | Example |
| --- | --- | --- | --- |
| ✅ | Destination bucket | Single S3 bucket to store all inventory reports | atlan-inventory-reports |
| ❌ | Prefix | Folder prefix to organize reports (same for all source buckets) | inventory-reports |
| ✅ | Source bucket folder | Folder named after each source bucket | source-bucket-1 |
| ✅ | Inventory config folder | Folder named after the inventory configuration | daily-inventory |
| ✅ | Timestamp folder | Folder with report generation timestamp | 2024-01-16T00-00Z |
| ✅ | Manifest file | manifest.json containing report metadata | manifest.json |
| ✅ | Data folder | Folder containing compressed inventory files | data |
| ✅ | Inventory files | .csv.gz or .parquet files with actual data | inventory-2024-01-16-00-00Z.csv.gz |
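
To make the layout concrete, here is a minimal Python sketch that splits an S3 object key into the components listed above and classifies it as a manifest or a data file. The names `parse_inventory_key` and `InventoryKey` are hypothetical and not part of Atlan or AWS. The timestamp pattern is an assumption based on the example `2024-01-16T00-00Z`, and the sketch does not claim to mirror the crawler's own logic.

```python
import re
from typing import NamedTuple, Optional

# Assumed timestamp folder shape, e.g. 2024-01-16T00-00Z.
TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}-\d{2}Z$")
# File extensions named in the table above.
DATA_SUFFIXES = (".csv.gz", ".parquet")


class InventoryKey(NamedTuple):
    source_bucket: str
    inventory_config: str
    kind: str                  # "manifest" or "data"
    timestamp: Optional[str]   # set only for manifest keys
    filename: str


def parse_inventory_key(key: str, prefix: str = "") -> Optional[InventoryKey]:
    """Return the layout components of `key`, or None if it does not fit the layout."""
    parts = key.strip("/").split("/")
    prefix_parts = [p for p in prefix.strip("/").split("/") if p]
    # The optional prefix, when given, must lead the key.
    if parts[: len(prefix_parts)] != prefix_parts:
        return None
    parts = parts[len(prefix_parts):]
    # Expect exactly: <source-bucket>/<inventory-config>/<folder>/<file>
    if len(parts) != 4:
        return None
    source_bucket, config, folder, filename = parts
    if TIMESTAMP_RE.match(folder) and filename == "manifest.json":
        return InventoryKey(source_bucket, config, "manifest", folder, filename)
    if folder == "data" and filename.endswith(DATA_SUFFIXES):
        return InventoryKey(source_bucket, config, "data", None, filename)
    return None
```

Any object that does not sit at one of the two expected positions returns `None`, which makes stray or misplaced files easy to spot when checking a destination bucket by hand.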

Examples

📁 Basic structure (single source bucket)

📦 atlan-inventory-reports/
└── 📁 source-bucket-1/
    └── 📁 inventory-config-1/
        ├── 📁 2024-01-16T00-00Z/
        │   └── 📄 manifest.json (required metadata file)
        └── 📁 data/
            └── 📄 inventory-2024-01-16-00-00Z.csv.gz

📁 Multiple source buckets

📦 atlan-inventory-reports/
├── 📁 source-bucket-1/
│   └── 📁 inventory-config-1/
│       ├── 📁 2024-01-16T00-00Z/
│       │   └── 📄 manifest.json (required metadata file)
│       └── 📁 data/
│           └── 📄 inventory-2024-01-16-00-00Z.csv.gz
└── 📁 source-bucket-2/
    └── 📁 inventory-config-1/
        ├── 📁 2024-01-16T00-00Z/
        │   └── 📄 manifest.json (required metadata file)
        └── 📁 data/
            └── 📄 inventory-2024-01-16-00-00Z.csv.gz

📁 With optional prefix

📦 atlan-inventory-reports/
└── 📁 inventory-reports/
    ├── 📁 source-bucket-1/
    │   └── 📁 inventory-config-1/
    │       ├── 📁 2024-01-16T00-00Z/
    │       │   └── 📄 manifest.json (required metadata file)
    │       └── 📁 data/
    │           └── 📄 inventory-2024-01-16-00-00Z.csv.gz
    └── 📁 source-bucket-2/
        └── 📁 inventory-config-1/
            ├── 📁 2024-01-16T00-00Z/
            │   └── 📄 manifest.json (required metadata file)
            └── 📁 data/
                └── 📄 inventory-2024-01-16-00-00Z.csv.gz
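
The example layouts can also be checked programmatically. The sketch below takes a list of object keys from the destination bucket and reports every source bucket and inventory config pair that lacks a manifest or a data file. The function name `find_layout_problems` is hypothetical, and the check is an illustration of the structure described here, not the validation Atlan's crawler performs.

```python
from collections import defaultdict
from typing import Iterable, List


def find_layout_problems(keys: Iterable[str], prefix: str = "") -> List[str]:
    """List (source bucket, inventory config) pairs missing a manifest or data files."""
    prefix_parts = [p for p in prefix.strip("/").split("/") if p]
    seen = defaultdict(lambda: {"manifest": False, "data": False})

    for key in keys:
        parts = key.strip("/").split("/")
        # Ignore keys outside the optional prefix.
        if parts[: len(prefix_parts)] != prefix_parts:
            continue
        parts = parts[len(prefix_parts):]
        # Expect <source-bucket>/<inventory-config>/<folder>/<file>.
        if len(parts) != 4:
            continue
        bucket, config, folder, filename = parts
        if folder == "data":
            seen[(bucket, config)]["data"] = True
        elif filename == "manifest.json":
            seen[(bucket, config)]["manifest"] = True

    problems = []
    for (bucket, config), found in sorted(seen.items()):
        for kind in ("manifest", "data"):
            if not found[kind]:
                problems.append(f"{bucket}/{config}: missing {kind}")
    return problems


# Keys taken from the "With optional prefix" example, with one data file left out.
example_keys = [
    "inventory-reports/source-bucket-1/inventory-config-1/2024-01-16T00-00Z/manifest.json",
    "inventory-reports/source-bucket-1/inventory-config-1/data/inventory-2024-01-16-00-00Z.csv.gz",
    "inventory-reports/source-bucket-2/inventory-config-1/2024-01-16T00-00Z/manifest.json",
]
print(find_layout_problems(example_keys, prefix="inventory-reports"))
```

In practice the keys would come from listing the destination bucket, for instance with the AWS SDK; that step is left out so the sketch runs without credentials.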

See also