What does Atlan crawl from SageMaker

Atlan crawls metadata from your Amazon SageMaker AI platform, including information about models, model groups, jobs, datasets, feature groups, and model deployments.

Assets

Atlan crawls the following SageMaker assets, each with specific metadata fields.

Model

SageMaker models represent trained machine learning models that can be deployed for inference. Atlan maps models from Amazon SageMaker to its SageMakerModel asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| ModelName | name | Model name |
| ModelArn | sageMakerArn | AWS ARN of the model |
| CreationTime | sourceCreatedAt | When the model was created |
| ContainerImage | sageMakerModelContainerImage | Docker container image for the model |
| ContainerModelData | sageMakerS3Uri | S3 URI of model artifacts |
| ExecutionRoleArn | sageMakerModelExecutionRoleArn | IAM role for model execution |
| ModelStatus | aiModelStatus | Current status of the model |
| ModelPackageGroupName | sageMakerModelModelGroupName | Name of the parent Model Group |
| ModelPackageGroupArn | sageMakerModelModelGroupQualifiedName | Qualified name of the parent Model Group |
| ExternalUrl | sourceURL | Link to AWS SageMaker console |
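
As a rough illustration of how to read the table, the sketch below renames the keys of a source record to the Atlan field names listed above. `MODEL_FIELD_MAP` and `to_atlan_attributes` are hypothetical names used only for this example, the sample values are made up, and a plain rename is an assumption: Atlan may derive some values (such as qualified names) rather than copy them. This isn't Atlan's crawler implementation.

```python
# Source field -> Atlan field, copied from the Model table above.
MODEL_FIELD_MAP = {
    "ModelName": "name",
    "ModelArn": "sageMakerArn",
    "CreationTime": "sourceCreatedAt",
    "ContainerImage": "sageMakerModelContainerImage",
    "ContainerModelData": "sageMakerS3Uri",
    "ExecutionRoleArn": "sageMakerModelExecutionRoleArn",
    "ModelStatus": "aiModelStatus",
    "ModelPackageGroupName": "sageMakerModelModelGroupName",
    "ModelPackageGroupArn": "sageMakerModelModelGroupQualifiedName",
    "ExternalUrl": "sourceURL",
}


def to_atlan_attributes(source_record: dict) -> dict:
    """Rename known source keys to Atlan field names (hypothetical helper).

    Keys that are absent from the record or not in the table are skipped.
    """
    return {
        atlan_field: source_record[source_field]
        for source_field, atlan_field in MODEL_FIELD_MAP.items()
        if source_field in source_record
    }


# Made-up sample record, for illustration only.
sample = {
    "ModelName": "churn-classifier",
    "ModelArn": "arn:aws:sagemaker:us-east-1:123456789012:model/churn-classifier",
    "ContainerModelData": "s3://example-bucket/churn/model.tar.gz",
}
attributes = to_atlan_attributes(sample)
print(attributes)
```

The same pattern applies to the other asset types on this page by swapping in the corresponding table.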

Model Group

SageMaker model groups represent collections of versioned model packages that can be organized and managed together. Atlan maps model package groups from Amazon SageMaker to its SageMakerModelGroup asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| ModelPackageGroupName | name | Model package group name |
| ModelPackageGroupArn | sageMakerArn | AWS ARN of the model package group |
| ModelPackageGroupDescription | description | Description of the model package group |
| CreationTime | sourceCreatedAt | When the model package group was created |
| CreationTime | sourceUpdatedAt | Last update time (defaults to creation time) |
| ModelPackageGroupStatus | sageMakerModelGroupStatus | Current status of the model package group |

Job

SageMaker jobs represent various types of ML job executions including training, processing, and transform jobs. Atlan maps jobs from Amazon SageMaker to its SageMakerJob asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| TrainingJobName | name | Job name |
| TrainingJobArn | sageMakerArn | AWS ARN of the job |
| JobType | sageMakerJobType | Type of job (training, processing, transform) |
| TrainingJobStatus | flowStatus | Current status of the job |
| CreationTime | sourceCreatedAt | When the job was created |
| TrainingStartTime | flowStartedAt | When the job started |
| TrainingEndTime | flowFinishedAt | When the job completed |
| ExternalUrl | sourceURL | Link to AWS SageMaker console |
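
The start and finish fields above can be combined to work out how long a job ran. The sketch below is a minimal example under stated assumptions: `job_duration_seconds` is a hypothetical helper, the sample job is made up, and the timestamps are assumed to be Python `datetime` objects (if your tooling returns them in another form, such as epoch milliseconds, convert them first).

```python
from datetime import datetime, timezone
from typing import Optional


def job_duration_seconds(job: dict) -> Optional[float]:
    """Return seconds between flowStartedAt and flowFinishedAt.

    Returns None when either timestamp is missing, for example when a
    job is still running. Hypothetical helper, not part of any Atlan SDK.
    """
    started = job.get("flowStartedAt")
    finished = job.get("flowFinishedAt")
    if started is None or finished is None:
        return None
    return (finished - started).total_seconds()


# Made-up job record using the Atlan field names from the Job table.
sample_job = {
    "name": "example-training-job",
    "sageMakerJobType": "training",
    "flowStartedAt": datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc),
    "flowFinishedAt": datetime(2024, 1, 1, 11, 30, tzinfo=timezone.utc),
}
print(job_duration_seconds(sample_job))  # 5400.0
```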

Dataset

SageMaker datasets represent curated or intermediate datasets, stored as a Glue table or an S3 prefix, that SageMaker uses for training and inference. Atlan maps datasets from Amazon SageMaker to its SageMakerDataset asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| DatasetName | name | Dataset name |
| DatasetArn | sageMakerArn | AWS ARN of the dataset |
| Platform | sageMakerDatasetPlatform | Platform where dataset is stored (S3, Glue) |
| OfflineStoreS3Uri | sageMakerS3Uri | S3 URI of offline store data |
| DatabaseName | databaseName | Database name |

Feature Group

SageMaker Feature Groups represent collections of related features for machine learning training and inference. Atlan maps feature groups from Amazon SageMaker to its SageMakerFeatureGroup asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| FeatureGroupName | name | Feature group name |
| FeatureGroupArn | sageMakerArn | AWS ARN of the feature group |
| Description | description | Feature group description |
| CreationTime | sourceCreatedAt | When the feature group was created |
| FeatureGroupStatus | sageMakerFeatureGroupStatus | Current status of the feature group |
| RecordIdName | sageMakerFeatureGroupRecordIdName | Name of the record identifier feature |
| OfflineStoreS3Uri | sageMakerS3Uri | S3 URI of offline store data |
| GlueDatabase | sageMakerFeatureGroupGlueDatabase | AWS Glue database name |
| GlueTable | sageMakerFeatureGroupGlueTable | AWS Glue table name |
| ExternalUrl | sourceURL | Link to AWS SageMaker console |

Model Deployment

SageMaker endpoints represent deployed models that serve real-time inference requests. Atlan maps model deployments from Amazon SageMaker to its SageMakerModelDeployment asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| EndpointName | name | Endpoint name |
| EndpointArn | sageMakerArn | AWS ARN of the endpoint |
| CreatedAt | sourceCreatedAt | When the endpoint was created |
| ModelDeploymentStatus | sageMakerModelDeploymentStatus | Current status of the endpoint |
| EndpointConfigName | sageMakerModelDeploymentEndpointConfigName | Associated endpoint configuration |
| ModelName | sageMakerModelDeploymentModelName | Name of the parent Model |
| ModelArn | sageMakerModelDeploymentModelQualifiedName | Qualified name of the parent Model |
| ExternalUrl | sourceURL | Link to AWS SageMaker console |

Feature

SageMaker features represent individual features within Feature Groups, including their data type and metadata. Atlan maps features from Amazon SageMaker to its SageMakerFeature asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| FeatureName | name | Feature name |
| FeatureGroupArn | sageMakerArn | AWS ARN of the feature group |
| FeatureGroupName | sageMakerFeatureFeatureGroupName | Name of the containing feature group |
| FeatureGroupQualifiedName | sageMakerFeatureFeatureGroupQualifiedName | Qualified name of the containing feature group |
| DataType | sageMakerFeatureDataType | Data type of the feature |
| IsRecordIdentifier | sageMakerFeatureIsRecordIdentifier | Whether this feature serves as record identifier |
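
Because each feature carries the name of its containing feature group and a record identifier flag, the two fields together are enough to find the identifier feature for every group. The following is a small sketch of that lookup; `record_identifier_by_group` is a hypothetical helper and the feature records are made-up examples keyed by the Atlan field names above.

```python
# Made-up feature records, keyed by Atlan field names from the Feature table.
features = [
    {
        "name": "customer_id",
        "sageMakerFeatureFeatureGroupName": "customers",
        "sageMakerFeatureDataType": "String",
        "sageMakerFeatureIsRecordIdentifier": True,
    },
    {
        "name": "lifetime_value",
        "sageMakerFeatureFeatureGroupName": "customers",
        "sageMakerFeatureDataType": "Fractional",
        "sageMakerFeatureIsRecordIdentifier": False,
    },
    {
        "name": "order_id",
        "sageMakerFeatureFeatureGroupName": "orders",
        "sageMakerFeatureDataType": "String",
        "sageMakerFeatureIsRecordIdentifier": True,
    },
]


def record_identifier_by_group(feature_records: list) -> dict:
    """Map each feature group name to the name of its record identifier feature."""
    identifiers = {}
    for feature in feature_records:
        if feature.get("sageMakerFeatureIsRecordIdentifier"):
            group = feature["sageMakerFeatureFeatureGroupName"]
            identifiers[group] = feature["name"]
    return identifiers


identifiers = record_identifier_by_group(features)
print(identifiers)  # {'customers': 'customer_id', 'orders': 'order_id'}
```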

Dataset column

SageMaker dataset columns represent individual columns within SageMaker datasets. Atlan maps dataset columns from Amazon SageMaker to its SageMakerDatasetColumn asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| ColumnName | name | Column name |
| ColumnArn | sageMakerArn | AWS ARN of the column |
| DataType | dataType | Column data type |
| IsNullable | isNullable | Whether the column accepts null values |
| IsPartitionColumn | isPartitionColumn | Whether this is a partition column |
| IsPrimaryKey | isPrimaryKey | Whether this is a primary key column |
| OrdinalPosition | ordinalPosition | Column position in the dataset |
| Precision | precision | Precision for numeric columns |
| Scale | scale | Scale for numeric columns |
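
To show how the column fields fit together, the sketch below orders columns by `ordinalPosition` and renders each type with its precision and scale where present. `describe_columns` is a hypothetical helper, the column records are made up, and treating a missing `isNullable` value as nullable is an assumption of this example.

```python
def describe_columns(columns: list) -> list:
    """Return one summary string per column, ordered by ordinalPosition."""
    described = []
    for column in sorted(columns, key=lambda c: c["ordinalPosition"]):
        type_text = column["dataType"]
        precision = column.get("precision")
        scale = column.get("scale")
        # Numeric columns may carry precision alone or precision and scale.
        if precision is not None and scale is not None:
            type_text = f"{type_text}({precision},{scale})"
        elif precision is not None:
            type_text = f"{type_text}({precision})"
        # Assumption: a column with no isNullable value is treated as nullable.
        nullability = "NULL" if column.get("isNullable", True) else "NOT NULL"
        described.append(f"{column['name']} {type_text} {nullability}")
    return described


# Made-up column records using the Atlan field names from the table above.
columns = [
    {"name": "amount", "dataType": "decimal", "ordinalPosition": 2,
     "precision": 10, "scale": 2, "isNullable": True},
    {"name": "order_id", "dataType": "string", "ordinalPosition": 1,
     "isNullable": False},
]
for line in describe_columns(columns):
    print(line)
```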

See also