What does Atlan crawl from SageMaker

Atlan crawls metadata from Amazon SageMaker AI, including models, model groups, jobs, datasets, feature groups, features, and model deployments.

Asset

Atlan crawls the following SageMaker assets, each with specific metadata fields.

Model

SageMaker models represent trained machine learning models that can be deployed for inference. Atlan maps models from Amazon SageMaker to its AIModel asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| ModelName | name | Model name |
| ModelArn | sagemakerArn | AWS ARN of the model |
| CreationTime | sourceCreatedAt | When the model was created |
| ContainerImage | sagemakerModelContainerImage | Docker container image for the model |
| ContainerModelData | sagemakerModelContainerModelData | S3 URI of model artifacts |
| ExecutionRoleArn | sagemakerModelExecutionRoleArn | IAM role for model execution |
| ModelStatus | sagemakerModelStatus | Current status of the model |
| ExternalUrl | sourceURL | Link to AWS SageMaker console |
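
The mapping in the table above can be pictured as a field rename. The following is a minimal sketch, assuming a flat dictionary keyed by the source field names listed above. The `MODEL_FIELD_MAP` constant, the `to_atlan_model` helper, and the sample record are hypothetical and are not part of Atlan or the AWS SDK.

```python
# Illustrative only: mirrors the Model table as a lookup from source field to Atlan field.
MODEL_FIELD_MAP = {
    "ModelName": "name",
    "ModelArn": "sagemakerArn",
    "CreationTime": "sourceCreatedAt",
    "ContainerImage": "sagemakerModelContainerImage",
    "ContainerModelData": "sagemakerModelContainerModelData",
    "ExecutionRoleArn": "sagemakerModelExecutionRoleArn",
    "ModelStatus": "sagemakerModelStatus",
    "ExternalUrl": "sourceURL",
}


def to_atlan_model(source: dict) -> dict:
    """Rename known source fields; fields missing from the source are skipped."""
    return {
        atlan_field: source[source_field]
        for source_field, atlan_field in MODEL_FIELD_MAP.items()
        if source_field in source
    }


# Hypothetical sample record (all values are made up for illustration).
sample = {
    "ModelName": "churn-model",
    "ModelArn": "arn:aws:sagemaker:us-east-1:123456789012:model/churn-model",
    "ModelStatus": "InService",
}
mapped = to_atlan_model(sample)
print(mapped["name"], mapped["sagemakerModelStatus"])
```

The same lookup-table pattern would apply to the other asset types on this page, each with its own field map.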

Model Group

SageMaker model groups represent collections of versioned model packages that can be organized and managed together. Atlan maps model package groups from Amazon SageMaker to its SagemakerV3ModelGroup asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| ModelPackageGroupName | name | Model package group name |
| ModelPackageGroupArn | sagemakerArn | AWS ARN of the model package group |
| ModelPackageGroupDescription | description | Description of the model package group |
| CreationTime | sourceCreatedAt | When the model package group was created |
| CreationTime | sourceUpdatedAt | Last update time (defaults to creation time) |
| ModelPackageGroupStatus | sagemakerModelGroupStatus | Current status of the model package group |
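
Note that the table lists `CreationTime` twice: it populates both `sourceCreatedAt` and `sourceUpdatedAt`. A minimal sketch of that behavior follows, assuming a flat source dictionary keyed by the field names above. The `to_atlan_model_group` helper and the sample values are hypothetical.

```python
from datetime import datetime, timezone


def to_atlan_model_group(source: dict) -> dict:
    """Map a model package group record to the Atlan fields in the table."""
    created = source["CreationTime"]
    return {
        "name": source["ModelPackageGroupName"],
        "sagemakerArn": source["ModelPackageGroupArn"],
        "description": source.get("ModelPackageGroupDescription"),
        "sourceCreatedAt": created,
        # Per the table, the update time defaults to the creation time.
        "sourceUpdatedAt": created,
        "sagemakerModelGroupStatus": source.get("ModelPackageGroupStatus"),
    }


# Hypothetical sample record (all values are made up for illustration).
group = to_atlan_model_group({
    "ModelPackageGroupName": "churn-models",
    "ModelPackageGroupArn": "arn:aws:sagemaker:us-east-1:123456789012:model-package-group/churn-models",
    "CreationTime": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "ModelPackageGroupStatus": "Completed",
})
print(group["sourceCreatedAt"] == group["sourceUpdatedAt"])
```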

Job

SageMaker jobs represent various types of ML job executions including training, processing, and transform jobs. Atlan maps jobs from Amazon SageMaker to its SageMakerJob asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| TrainingJobName | name | Job name |
| TrainingJobArn | sagemakerArn | AWS ARN of the job |
| JobType | sagemakerJobType | Type of job (training, processing, transform) |
| TrainingJobStatus | sagemakerJobStatus | Current status of the job |
| CreationTime | sourceCreatedAt | When the job was created |
| TrainingEndTime | sagemakerJobEndTime | When the job completed |
| ExternalUrl | sourceURL | Link to AWS SageMaker console |
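
As one example of using these fields together, the time between job creation and completion can be derived from `sourceCreatedAt` and `sagemakerJobEndTime`. This is a minimal sketch that assumes both timestamps are available as Python `datetime` values and that a job still in progress has no end time. The `elapsed_since_creation` helper and the sample jobs are hypothetical, and the stored timestamp format in Atlan may differ.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def elapsed_since_creation(job: dict) -> Optional[timedelta]:
    """Return end time minus creation time, or None if the job has no end time."""
    end = job.get("sagemakerJobEndTime")
    if end is None:
        return None
    return end - job["sourceCreatedAt"]


# Hypothetical sample jobs keyed by the Atlan field names from the table.
jobs = [
    {
        "name": "train-churn",
        "sagemakerJobType": "training",
        "sagemakerJobStatus": "Completed",
        "sourceCreatedAt": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
        "sagemakerJobEndTime": datetime(2024, 1, 1, 12, 45, tzinfo=timezone.utc),
    },
    {
        "name": "transform-churn",
        "sagemakerJobType": "transform",
        "sagemakerJobStatus": "InProgress",
        "sourceCreatedAt": datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc),
        "sagemakerJobEndTime": None,
    },
]

for job in jobs:
    print(job["name"], job["sagemakerJobType"], elapsed_since_creation(job))
```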

Dataset

SageMaker datasets represent curated/intermediate datasets (Glue table or S3 prefix) used by SageMaker for training and inference. Atlan maps datasets from Amazon SageMaker to its SageMakerDataset asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| DatasetName | name | Dataset name |
| DatasetArn | sagemakerArn | AWS ARN of the dataset |
| Platform | sagemakerDatasetPlatform | Platform where dataset is stored (S3, Glue) |
| OfflineStoreS3Uri | sagemakerDatasetOfflineStoreS3Uri | S3 URI of offline store data |
| DataCatalogTableName | sagemakerDatasetDataCatalogTableName | AWS Glue Data Catalog table name |

Feature Group

SageMaker Feature Groups represent collections of related features for machine learning training and inference. Atlan maps feature groups from Amazon SageMaker to its SageMakerFeatureGroup asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| FeatureGroupName | name | Feature group name |
| FeatureGroupArn | sagemakerArn | AWS ARN of the feature group |
| Description | description | Feature group description |
| CreationTime | sourceCreatedAt | When the feature group was created |
| FeatureGroupStatus | sagemakerFeatureGroupStatus | Current status of the feature group |
| RecordIdName | sagemakerFeatureGroupRecordIdName | Name of the record identifier feature |
| OfflineStoreS3Uri | sagemakerFeatureGroupOfflineStoreS3Uri | S3 URI of offline store data |
| GlueDatabase | sagemakerFeatureGroupGlueDatabase | AWS Glue database name |
| GlueTable | sagemakerFeatureGroupGlueTable | AWS Glue table name |
| ExternalUrl | sourceURL | Link to AWS SageMaker console |

Model Deployment

SageMaker endpoints represent deployed models that serve real-time inference requests. Atlan maps model deployments from Amazon SageMaker to its SagemakerModelDeployment asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| EndpointName | name | Endpoint name |
| EndpointArn | sagemakerArn | AWS ARN of the endpoint |
| CreatedAt | sourceCreatedAt | When the endpoint was created |
| ModelDeploymentStatus | sagemakerModelDeploymentStatus | Current status of the endpoint |
| EndpointConfigName | sagemakerModelDeploymentEndpointConfigName | Associated endpoint configuration |
| ExternalUrl | sourceURL | Link to AWS SageMaker console |

Feature

SageMaker features represent individual features within Feature Groups, including their data type and metadata. Atlan maps features from Amazon SageMaker to its SagemakerFeature asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| FeatureName | name | Feature name |
| FeatureGroupArn | sagemakerArn | AWS ARN of the feature group |
| FeatureGroupName | sagemakerFeatureFeatureGroupName | Name of the containing feature group |
| DataType | sagemakerFeatureDataType | Data type of the feature |
| IsRecordIdentifier | sagemakerFeatureIsRecordIdentifier | Whether this feature serves as record identifier |
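
Because a feature's `sagemakerArn` holds the ARN of its feature group, features in the same group share that value, so the feature name is needed alongside it to tell them apart. The sketch below groups features by their containing feature group and finds the record identifier in each group. It assumes features are available as dictionaries keyed by the Atlan field names above. The helper names and sample data are hypothetical.

```python
from collections import defaultdict
from typing import Optional

# Hypothetical ARN and features (all values are made up for illustration).
GROUP_ARN = "arn:aws:sagemaker:us-east-1:123456789012:feature-group/customers"

features = [
    {
        "name": "customer_id",
        "sagemakerArn": GROUP_ARN,
        "sagemakerFeatureFeatureGroupName": "customers",
        "sagemakerFeatureDataType": "String",
        "sagemakerFeatureIsRecordIdentifier": True,
    },
    {
        "name": "lifetime_value",
        "sagemakerArn": GROUP_ARN,
        "sagemakerFeatureFeatureGroupName": "customers",
        "sagemakerFeatureDataType": "Fractional",
        "sagemakerFeatureIsRecordIdentifier": False,
    },
]


def features_by_group(items: list) -> dict:
    """Group feature records by the name of their containing feature group."""
    groups = defaultdict(list)
    for feature in items:
        groups[feature["sagemakerFeatureFeatureGroupName"]].append(feature)
    return dict(groups)


def record_identifier(group_features: list) -> Optional[str]:
    """Return the name of the first feature flagged as record identifier, if any."""
    for feature in group_features:
        if feature.get("sagemakerFeatureIsRecordIdentifier"):
            return feature["name"]
    return None


grouped = features_by_group(features)
for group_name, members in grouped.items():
    print(group_name, len(members), record_identifier(members))
```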

Dataset Column

SageMaker dataset columns represent individual columns within SageMaker datasets. Atlan maps dataset columns from Amazon SageMaker to its SagemakerDatasetColumn asset type.

| Source field | Atlan field | Description |
| --- | --- | --- |
| ColumnName | name | Column name |
| ColumnArn | sagemakerArn | AWS ARN of the column |
| DataType | dataType | Column data type |
| IsNullable | isNullable | Whether the column accepts null values |
| IsPartitionColumn | isPartitionColumn | Whether this is a partition column |
| IsPrimaryKey | isPrimaryKey | Whether this is a primary key column |
| OrdinalPosition | ordinalPosition | Column position in the dataset |
| Precision | precision | Precision for numeric columns |
| Scale | scale | Scale for numeric columns |
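
The column fields above are enough to reconstruct a dataset's schema in order. The following is a minimal sketch, assuming columns are available as dictionaries keyed by the Atlan field names in the table and that `precision` and `scale` are absent for non-numeric columns. The `ordered_columns` and `render_type` helpers and the sample columns are hypothetical.

```python
def ordered_columns(columns: list) -> list:
    """Sort column records by their position in the dataset."""
    return sorted(columns, key=lambda column: column["ordinalPosition"])


def render_type(column: dict) -> str:
    """Format the data type, appending precision and scale when present."""
    data_type = column["dataType"]
    precision = column.get("precision")
    scale = column.get("scale")
    if precision is None:
        return data_type
    if scale is None:
        return f"{data_type}({precision})"
    return f"{data_type}({precision},{scale})"


# Hypothetical sample columns (all values are made up for illustration).
columns = [
    {"name": "amount", "dataType": "decimal", "ordinalPosition": 2,
     "precision": 10, "scale": 2, "isNullable": True},
    {"name": "order_id", "dataType": "string", "ordinalPosition": 1,
     "isNullable": False, "isPrimaryKey": True},
]

for column in ordered_columns(columns):
    nullability = "NULL" if column.get("isNullable") else "NOT NULL"
    print(column["name"], render_type(column), nullability)
```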

See also