Athena assets app
The Athena assets app crawls Amazon Athena databases, tables, views, and columns and
publishes them to Atlan. Build it with the AtlanAthena builder.
Creating an app creates a new connection
Each time you create an app, a new connection and new assets are created. To re-crawl an existing source, re-run the existing workflow instead (see Re-run an existing app).
Athena supports two authentication methods: access key/secret and IAM role.
The port is optional and defaults to 443.
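As a rough illustration of the host and port settings, the sketch below builds a hostname in the same shape as the one used in the examples on this page (athena.us-east-1.amazonaws.com) and pairs it with the default port. The helper name athena_endpoint is hypothetical and is not part of pyatlan; it also assumes the standard athena.<region>.amazonaws.com hostname pattern.

```python
DEFAULT_ATHENA_PORT = 443  # the documented default when port= is omitted


def athena_endpoint(region: str, port: int = DEFAULT_ATHENA_PORT) -> tuple[str, int]:
    """Return (host, port) for an Athena endpoint in the given region.

    Hypothetical helper: assumes the athena.<region>.amazonaws.com pattern.
    """
    region = region.strip().lower()
    if not region:
        raise ValueError("region must not be empty")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"athena.{region}.amazonaws.com", port


host, port = athena_endpoint("us-east-1")
print(host, port)  # athena.us-east-1.amazonaws.com 443
```

The returned host could then be passed as the host= argument shown in the examples, leaving the port unset unless it differs from 443.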
Access key authentication
- Python
Athena crawling with access key/secret
from pyatlan.client.atlan import AtlanClient
from pyatlan.model.apps import AtlanAthena

client = AtlanClient()

response = (
    AtlanAthena(client)
    .basic(  # (1)
        username="AKIA...",  # (2)
        password="••••••",  # (3)
        s3_output_location="s3://my-athena-results/",  # (4)
        workgroup="primary",  # (5)
        host="athena.us-east-1.amazonaws.com",  # (6)
    )
    .connection(
        name="production-athena",
        admin_roles=[client.role_cache.get_id_for_name("$admin")],
    )
    .include_metadata({"my_catalog": ["default"]})  # (7)
    .run(name="athena-prod")
)
print(response.slug, response.run_id)
- Step 1: Credential. AWS access key/secret authentication; the secret is vaulted.
- Required. AWS access key.
- Required. AWS secret key.
- Required. The S3 location where Athena writes query results.
- Optional. The Athena workgroup.
- Required. The Athena host. The port (port=) defaults to 443.
- Databases/schemas to crawl, as {database: [schema, ...]}.
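The {database: [schema, ...]} shape can be tedious to write by hand when the schema list comes from elsewhere. The following is a minimal sketch, assuming the input is a list of "database.schema" strings; build_metadata_filter is a hypothetical helper, not a pyatlan function.

```python
from collections import defaultdict


def build_metadata_filter(qualified_schemas: list[str]) -> dict[str, list[str]]:
    """Group "database.schema" strings into {database: [schema, ...]}."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for name in qualified_schemas:
        # Split on the first dot only: left side is the database, right the schema.
        database, sep, schema = name.partition(".")
        if not sep or not database or not schema:
            raise ValueError(f"expected 'database.schema', got {name!r}")
        if schema not in grouped[database]:  # skip duplicates, keep input order
            grouped[database].append(schema)
    return dict(grouped)


include = build_metadata_filter(["my_catalog.default", "my_catalog.sales"])
print(include)  # {'my_catalog': ['default', 'sales']}
```

The resulting dictionary has the same shape as the argument passed to include_metadata in the example above.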
IAM role authentication
- Python
Athena crawling with an IAM role
(
    AtlanAthena(client)
    .role(
        aws_role_arn="arn:aws:iam::123456789012:role/atlan",  # (1)
        aws_external_id="...",  # (2)
        s3_output_location="s3://my-athena-results/",  # (3)
        workgroup="primary",
        host="athena.us-east-1.amazonaws.com",
    )
    .connection(name="production-athena", admin_roles=[...])
    .run(name="athena-prod")
)
- Optional. The IAM role ARN to assume.
- Optional. AWS external id for the role.
- Required. The S3 output location.
workgroup and port are optional.
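A malformed role ARN is easy to miss until the crawl fails, so a local sanity check before building the app may help. This is only a sketch: parse_role_arn is a hypothetical helper, and the pattern assumes the standard "aws" partition (other partitions, such as aws-cn or aws-us-gov, are not handled).

```python
import re

# Assumed shape of an IAM role ARN in the standard "aws" partition:
# arn:aws:iam::<12-digit account id>:role/<optional path/><role name>
_ROLE_ARN = re.compile(r"arn:aws:iam::(\d{12}):role/([\w+=,.@/-]+)")


def parse_role_arn(arn: str) -> tuple[str, str]:
    """Return (account_id, role_name_with_path) or raise ValueError."""
    match = _ROLE_ARN.fullmatch(arn)
    if match is None:
        raise ValueError(f"not an IAM role ARN: {arn!r}")
    return match.group(1), match.group(2)


print(parse_role_arn("arn:aws:iam::123456789012:role/atlan"))
# ('123456789012', 'atlan')
```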
Configuration options
All metadata options are optional:
- Python
Athena metadata configuration
(
    AtlanAthena(client)
    .basic(
        username="AKIA...",
        password="••••••",
        s3_output_location="s3://...",
        host="...",
    )
    .connection(name="production-athena", admin_roles=[...])
    .include_metadata({"my_catalog": ["default"]})  # (1)
    .exclude_metadata({"my_catalog": ["staging"]})  # (2)
    .exclude_regex_for_tables_views(".*_tmp$")  # (3)
    .enable_source_level_filtering(False)  # (4)
    .advanced_config("default")  # (5)
    .run(name="athena-prod")
)
- Databases/schemas to include, as {database: [schema, ...]}.
- Databases/schemas to exclude.
- Regex of tables/views to ignore.
- Whether to apply schema-level filtering at the source. When enabled, only schemas matched by the include filter are fetched.
- Set the crawler's advanced configuration.
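Before committing to an exclude regex such as ".*_tmp$", it can be useful to preview which names it would drop. The sketch below uses Python's re module and assumes the pattern must match the whole table or view name; the crawler's own regex engine and matching rules may differ, and split_by_exclude_regex is a hypothetical helper rather than part of pyatlan.

```python
import re


def split_by_exclude_regex(names: list[str], pattern: str) -> tuple[list[str], list[str]]:
    """Return (kept, ignored) table/view names for an exclude regex.

    Assumption: the pattern must match the whole name (re.fullmatch).
    """
    compiled = re.compile(pattern)
    kept: list[str] = []
    ignored: list[str] = []
    for name in names:
        (ignored if compiled.fullmatch(name) else kept).append(name)
    return kept, ignored


kept, ignored = split_by_exclude_regex(
    ["orders", "orders_tmp", "tmp_orders"], r".*_tmp$"
)
print(kept)     # ['orders', 'tmp_orders']
print(ignored)  # ['orders_tmp']
```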