What does Atlan crawl from SageMaker
Atlan crawls metadata from Amazon SageMaker AI, including models, model groups, jobs, datasets, feature groups, features, and model deployments.
Assets
Atlan crawls the following SageMaker assets, each with specific metadata fields.
Model
SageMaker models represent trained machine learning models that can be deployed for inference. Atlan maps models from Amazon SageMaker to its AIModel asset type.
| Source field | Atlan field | Description |
|---|---|---|
| ModelName | name | Model name |
| ModelArn | sagemakerArn | AWS ARN of the model |
| CreationTime | sourceCreatedAt | When the model was created |
| ContainerImage | sagemakerModelContainerImage | Docker container image for the model |
| ContainerModelData | sagemakerModelContainerModelData | S3 URI of model artifacts |
| ExecutionRoleArn | sagemakerModelExecutionRoleArn | IAM role for model execution |
| ModelStatus | sagemakerModelStatus | Current status of the model |
| ExternalUrl | sourceURL | Link to AWS SageMaker console |
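
The mapping in this table can be sketched as a small function over a response shaped like the one the SageMaker `DescribeModel` API returns. This is an illustrative sketch, not Atlan's crawler: the function name `map_model` is hypothetical, and reading `ContainerImage` and `ContainerModelData` from `PrimaryContainer.Image` and `PrimaryContainer.ModelDataUrl` is an assumption. `ModelStatus` and `ExternalUrl` are left out because the source does not say how they are derived.

```python
def map_model(resp: dict) -> dict:
    """Map a DescribeModel-style response to the Atlan fields listed above.

    Hypothetical helper for illustration only.
    """
    # Assumption: single-container model. Multi-container models use a
    # "Containers" list instead, in which case these values stay None.
    container = resp.get("PrimaryContainer", {})
    return {
        "name": resp["ModelName"],
        "sagemakerArn": resp["ModelArn"],
        "sourceCreatedAt": resp["CreationTime"],
        "sagemakerModelContainerImage": container.get("Image"),
        "sagemakerModelContainerModelData": container.get("ModelDataUrl"),
        "sagemakerModelExecutionRoleArn": resp.get("ExecutionRoleArn"),
    }
```

With `boto3`, the input could come from `boto3.client("sagemaker").describe_model(ModelName=...)`, which requires AWS credentials and is therefore not shown running here.
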
Model Group
SageMaker model groups represent collections of versioned model packages that can be organized and managed together. Atlan maps model package groups from Amazon SageMaker to its SagemakerV3ModelGroup asset type.
| Source field | Atlan field | Description |
|---|---|---|
| ModelPackageGroupName | name | Model package group name |
| ModelPackageGroupArn | sagemakerArn | AWS ARN of the model package group |
| ModelPackageGroupDescription | description | Description of the model package group |
| CreationTime | sourceCreatedAt | When the model package group was created |
| CreationTime | sourceUpdatedAt | Last update time (defaults to creation time) |
| ModelPackageGroupStatus | sagemakerModelGroupStatus | Current status of the model package group |
Job
SageMaker jobs represent ML job executions, including training, processing, and transform jobs. Atlan maps jobs from Amazon SageMaker to its SageMakerJob asset type.
| Source field | Atlan field | Description |
|---|---|---|
| TrainingJobName | name | Job name |
| TrainingJobArn | sagemakerArn | AWS ARN of the job |
| JobType | sagemakerJobType | Type of job (training, processing, transform) |
| TrainingJobStatus | sagemakerJobStatus | Current status of the job |
| CreationTime | sourceCreatedAt | When the job was created |
| TrainingEndTime | sagemakerJobEndTime | When the job completed |
| ExternalUrl | sourceURL | Link to AWS SageMaker console |
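
For a training job, the rows above line up with fields in a `DescribeTrainingJob`-style response. The sketch below is a hypothetical illustration of that mapping, not Atlan's implementation: the function name `map_training_job` is invented, and passing `job_type` in from the caller is an assumption, since the source does not say how `JobType` is derived. `ExternalUrl` is omitted for the same reason.

```python
def map_training_job(resp: dict, job_type: str = "training") -> dict:
    """Map a DescribeTrainingJob-style response to the Atlan job fields.

    Hypothetical helper; job_type is supplied by the caller (assumption).
    """
    return {
        "name": resp["TrainingJobName"],
        "sagemakerArn": resp["TrainingJobArn"],
        "sagemakerJobType": job_type,
        "sagemakerJobStatus": resp["TrainingJobStatus"],
        "sourceCreatedAt": resp["CreationTime"],
        # TrainingEndTime is absent while a job is still running.
        "sagemakerJobEndTime": resp.get("TrainingEndTime"),
    }
```

Processing and transform jobs use differently named response fields, so a real mapping would need one such function per job type.
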
Dataset
SageMaker datasets represent curated or intermediate datasets (a Glue table or an S3 prefix) that SageMaker uses for training and inference. Atlan maps datasets from Amazon SageMaker to its SageMakerDataset asset type.
| Source field | Atlan field | Description |
|---|---|---|
| DatasetName | name | Dataset name |
| DatasetArn | sagemakerArn | AWS ARN of the dataset |
| Platform | sagemakerDatasetPlatform | Platform where dataset is stored (S3, Glue) |
| OfflineStoreS3Uri | sagemakerDatasetOfflineStoreS3Uri | S3 URI of offline store data |
| DataCatalogTableName | sagemakerDatasetDataCatalogTableName | AWS Glue Data Catalog table name |
Feature Group
SageMaker Feature Groups represent collections of related features for machine learning training and inference. Atlan maps feature groups from Amazon SageMaker to its SageMakerFeatureGroup asset type.
| Source field | Atlan field | Description |
|---|---|---|
| FeatureGroupName | name | Feature group name |
| FeatureGroupArn | sagemakerArn | AWS ARN of the feature group |
| Description | description | Feature group description |
| CreationTime | sourceCreatedAt | When the feature group was created |
| FeatureGroupStatus | sagemakerFeatureGroupStatus | Current status of the feature group |
| RecordIdName | sagemakerFeatureGroupRecordIdName | Name of the record identifier feature |
| OfflineStoreS3Uri | sagemakerFeatureGroupOfflineStoreS3Uri | S3 URI of offline store data |
| GlueDatabase | sagemakerFeatureGroupGlueDatabase | AWS Glue database name |
| GlueTable | sagemakerFeatureGroupGlueTable | AWS Glue table name |
| ExternalUrl | sourceURL | Link to AWS SageMaker console |
Model Deployment
SageMaker endpoints represent deployed models that serve real-time inference requests. Atlan maps model deployments from Amazon SageMaker to its SagemakerModelDeployment asset type.
| Source field | Atlan field | Description |
|---|---|---|
| EndpointName | name | Endpoint name |
| EndpointArn | sagemakerArn | AWS ARN of the endpoint |
| CreatedAt | sourceCreatedAt | When the endpoint was created |
| ModelDeploymentStatus | sagemakerModelDeploymentStatus | Current status of the endpoint |
| EndpointConfigName | sagemakerModelDeploymentEndpointConfigName | Associated endpoint configuration |
| ExternalUrl | sourceURL | Link to AWS SageMaker console |
Feature
SageMaker features represent individual features within Feature Groups, including their data type and metadata. Atlan maps features from Amazon SageMaker to its SagemakerFeature asset type.
| Source field | Atlan field | Description |
|---|---|---|
| FeatureName | name | Feature name |
| FeatureGroupArn | sagemakerArn | AWS ARN of the feature group |
| FeatureGroupName | sagemakerFeatureFeatureGroupName | Name of the containing feature group |
| DataType | sagemakerFeatureDataType | Data type of the feature |
| IsRecordIdentifier | sagemakerFeatureIsRecordIdentifier | Whether this feature serves as record identifier |
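
Because a feature carries its parent group's ARN and name, one feature group description can be expanded into one record per feature. The sketch below illustrates this over a `DescribeFeatureGroup`-style response. It is a hypothetical example rather than Atlan's logic: the function name `map_features` is invented, and two derivations are assumptions, namely reading `DataType` from each definition's `FeatureType` and computing `IsRecordIdentifier` by comparing the feature name with `RecordIdentifierFeatureName`.

```python
from __future__ import annotations


def map_features(resp: dict) -> list[dict]:
    """Expand a DescribeFeatureGroup-style response into per-feature records.

    Hypothetical helper for illustration only.
    """
    record_id = resp.get("RecordIdentifierFeatureName")
    return [
        {
            "name": definition["FeatureName"],
            # Each feature carries the ARN and name of its parent group.
            "sagemakerArn": resp["FeatureGroupArn"],
            "sagemakerFeatureFeatureGroupName": resp["FeatureGroupName"],
            # Assumption: the feature's data type is its FeatureType.
            "sagemakerFeatureDataType": definition["FeatureType"],
            # Assumption: a feature is the record identifier when its name
            # matches RecordIdentifierFeatureName.
            "sagemakerFeatureIsRecordIdentifier": definition["FeatureName"] == record_id,
        }
        for definition in resp.get("FeatureDefinitions", [])
    ]
```
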
Dataset Column
SageMaker dataset columns represent individual columns within SageMaker datasets. Atlan maps dataset columns from Amazon SageMaker to its SagemakerDatasetColumn asset type.
| Source field | Atlan field | Description |
|---|---|---|
| ColumnName | name | Column name |
| ColumnArn | sagemakerArn | AWS ARN of the column |
| DataType | dataType | Column data type |
| IsNullable | isNullable | Whether the column accepts null values |
| IsPartitionColumn | isPartitionColumn | Whether this is a partition column |
| IsPrimaryKey | isPrimaryKey | Whether this is a primary key column |
| OrdinalPosition | ordinalPosition | Column position in the dataset |
| Precision | precision | Precision for numeric columns |
| Scale | scale | Scale for numeric columns |