Set up SageMaker
Configure AWS credentials and permissions to enable Atlan to connect to your SageMaker environment and extract lineage from your machine learning workflows.
Prerequisites
Before you begin, make sure you have:
- AWS account with SageMaker access and IAM permissions to create users or roles
- AWS region codes where your SageMaker resources are located (for example, us-east-1, eu-west-1) for Atlan configuration
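Before entering region codes in Atlan, it can help to normalize the list you collected. The following is a minimal sketch, not part of Atlan or AWS tooling: `parse_regions` is a hypothetical helper, and the pattern only checks the general shape of a region code (it doesn't confirm that the region exists or is enabled in your account).

```python
import re

# Shape check only: two-letter prefix, one or more lowercase words, one digit
# (for example us-east-1 or ap-southeast-2). This is an assumption about the
# code format, not a list of valid AWS regions.
REGION_PATTERN = re.compile(r"^[a-z]{2}(-[a-z]+)+-[0-9]$")


def parse_regions(raw: str) -> list[str]:
    """Split a comma-separated string into unique, validated region codes."""
    regions = []
    for part in raw.split(","):
        code = part.strip().lower()
        if not code:
            continue  # ignore empty entries such as a trailing comma
        if not REGION_PATTERN.match(code):
            raise ValueError(f"not a recognizable region code: {code!r}")
        if code not in regions:
            regions.append(code)  # keep first-seen order, drop duplicates
    return regions


print(parse_regions("us-east-1, eu-west-1"))
```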
Configure AWS authentication
Choose your authentication method and configure the necessary credentials for SageMaker access.
- IAM user
- IAM role
To authenticate with an IAM user:
- Sign in to the AWS Management Console.
- Navigate to IAM > Users > Create user.
- Enter the user details:
- User name: atlan-sagemaker-service
- Select Provide user access to the AWS Management Console if console access is needed
- Choose I want to create an IAM user
- Click Next and attach these AWS managed policies:
- AmazonSageMakerReadOnlyAccess
- AmazonS3ReadOnlyAccess
- AmazonGlueReadOnlyAccess
- Complete the user creation by clicking Create user.
- Create access keys for programmatic access:
- In the user details page, go to the Security credentials tab
- Click Create access key
- Choose Application running outside AWS
- Click Create access key
- Copy and securely store the Access Key ID and Secret Access Key
To authenticate with an IAM role:
- Navigate to IAM > Roles > Create role.
- Select the trusted entity:
- Trusted entity type: AWS account
- Account ID: Enter the Atlan AWS account ID (provided by Atlan support)
- (Optional) Check Require external ID and enter the external ID provided by Atlan support
- Click Next and attach these AWS managed policies:
- AmazonSageMakerReadOnlyAccess
- AmazonS3ReadOnlyAccess
- AmazonGlueReadOnlyAccess
- Enter a role name (for example, AtlanSageMakerRole) and click Create role.
- Copy the Role ARN from the role summary page for use in Atlan configuration.
When using an IAM role, you can require an External ID to enhance security:
- Contact Atlan support to obtain your organization's External ID.
- Enter this External ID when configuring the IAM role trust relationship.
- Use the same External ID in Atlan when configuring the workflow. See Configure authentication.
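The trusted-entity and External ID settings described above end up as the role's trust policy. The sketch below shows roughly what that policy looks like, assuming the standard cross-account pattern (an account principal allowed to call `sts:AssumeRole`, with an optional `sts:ExternalId` condition). `build_trust_policy` is a hypothetical helper, and the account ID and External ID are placeholders; use the values provided by Atlan support.

```python
import json
import re


def build_trust_policy(atlan_account_id, external_id=None):
    """Return a trust policy document allowing another account to assume the role."""
    if not re.fullmatch(r"[0-9]{12}", atlan_account_id):
        raise ValueError("AWS account IDs are 12 digits")
    statement = {
        "Effect": "Allow",
        # The account root principal delegates access to the trusted account.
        "Principal": {"AWS": f"arn:aws:iam::{atlan_account_id}:root"},
        "Action": "sts:AssumeRole",
    }
    if external_id:
        # Only callers that present the matching External ID may assume the role.
        statement["Condition"] = {"StringEquals": {"sts:ExternalId": external_id}}
    return {"Version": "2012-10-17", "Statement": [statement]}


# Placeholder values for illustration only.
print(json.dumps(build_trust_policy("123456789012", "example-external-id"), indent=2))
```

You can compare the printed document with the Trust relationships tab of the role in the IAM console to confirm that the account ID and External ID match what you entered in Atlan.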
Configure S3 access for artifacts
Grant your IAM user or role access to the S3 buckets where SageMaker stores model artifacts, training data, and job outputs.
- Find your SageMaker S3 buckets in the AWS Console:
- Navigate to Amazon SageMaker > Models
- Click on the desired model to view its details
- Copy the S3 bucket path from the Model artifacts section. For example, s3://my-sagemaker-bucket/models/
- For training job buckets, navigate to Training > Training jobs
- Click on a completed training job to view its details
- Copy the S3 bucket path from the Output data configuration section. For example, s3://my-sagemaker-bucket/training-output/
- Add S3 permissions to your IAM user or role:
- Go back to IAM and select your user or role
- Click Add permissions > Attach policies directly
- Click Create policy and paste this JSON, replacing <your-sagemaker-bucket> with your actual bucket name:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket",
        "s3:GetObjectVersion"
      ],
      "Resource": [
        "arn:aws:s3:::<your-sagemaker-bucket>",
        "arn:aws:s3:::<your-sagemaker-bucket>/*"
      ]
    }
  ]
}
- Give the policy a unique and meaningful name (for example, AtlanSageMakerS3Access) and click Create policy.
- Return to your user or role and attach the newly created policy (AtlanSageMakerS3Access).
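If your models and training jobs write to several buckets, the policy needs a pair of resource entries for each one. As a rough sketch, the hypothetical helper below takes the S3 paths copied from the SageMaker console and produces a policy document with the same actions as the JSON above. The bucket names are examples, and the function names are not part of any AWS or Atlan tool.

```python
import json
from urllib.parse import urlparse

# Same read-only actions as the policy shown in the steps above.
READ_ACTIONS = ["s3:GetObject", "s3:ListBucket", "s3:GetObjectVersion"]


def bucket_from_s3_path(path):
    """Extract the bucket name from a path such as s3://bucket/prefix/."""
    parsed = urlparse(path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 path: {path!r}")
    return parsed.netloc


def build_s3_read_policy(s3_paths):
    """Build a read-only policy covering every distinct bucket in s3_paths."""
    buckets = []
    for path in s3_paths:
        bucket = bucket_from_s3_path(path)
        if bucket not in buckets:
            buckets.append(bucket)
    if not buckets:
        raise ValueError("at least one S3 path is required")
    resources = []
    for bucket in buckets:
        resources.append(f"arn:aws:s3:::{bucket}")    # the bucket itself (ListBucket)
        resources.append(f"arn:aws:s3:::{bucket}/*")  # objects in it (GetObject*)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": READ_ACTIONS, "Resource": resources}
        ],
    }


paths = [
    "s3://my-sagemaker-bucket/models/",
    "s3://my-sagemaker-bucket/training-output/",
]
print(json.dumps(build_s3_read_policy(paths), indent=2))
```

The printed JSON can be pasted into the policy editor in place of the single-bucket template. Both example paths point at the same bucket, so the output contains only one pair of resource entries.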
Troubleshooting
If you encounter connection or authentication issues during the crawl setup, see Connection and authentication issues for detailed troubleshooting steps.
Next steps
- Crawl SageMaker assets: Configure and run the crawler to extract lineage from SageMaker