BigQuery miner app
The BigQuery miner app mines query history from BigQuery to generate lineage and
usage (popularity) metrics. Build it with the BigqueryMiner builder.
Unlike crawlers, a miner doesn't create a connection or take a credential.
It runs on an existing BigQuery connection and reuses that connection's
own credential, so you only supply the connection's qualifiedName.
Source extraction
To mine query history from an existing BigQuery connection:
- Python
Mine query history from BigQuery
from pyatlan.client.atlan import AtlanClient
from pyatlan.model.apps import BigqueryMiner

client = AtlanClient()
response = (
    BigqueryMiner(client)  # (1)
    .connection(  # (2)
        qualified_name="default/bigquery/1700000000",
    )
    .start_date(1704067200)  # (3)
    .calculate_popularity("true")  # (4)
    .popularity_window_days(30)  # (5)
    .excluded_users(["system@my-project.iam.gserviceaccount.com"])  # (6)
    .run(name="bigquery-prod-miner")  # (7)
)
print(response.slug, response.run_id)
- Base configuration for a new BigQuery miner. You must provide a client.
- The exact qualifiedName of the existing BigQuery connection to mine. The builder resolves that connection's credential automatically, so no credential step is needed.
- The date (as an epoch) from which to start mining query history.
- Generate popularity metrics from the mined query history.
- Number of days of history to consider when calculating popularity.
- Optionally exclude users (for example, service accounts) from usage metrics.
- Always pass an explicit name for miners. A miner has no connection display name to derive one from, so a bare .run() defaults the workflow name to the app id (bigquery-miner) and a second run collides (409 already exists).
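The start date in the example is an epoch value, and 1704067200 corresponds to 2024-01-01 00:00:00 UTC when read as seconds. The following is a minimal sketch of deriving such a value from a calendar date with the standard library. It assumes the builder expects whole seconds since the Unix epoch, as the sample value suggests; to_epoch_seconds is a hypothetical helper and not part of pyatlan.

```python
from datetime import datetime, timezone


def to_epoch_seconds(year: int, month: int, day: int) -> int:
    """Return midnight UTC of the given date as whole seconds since the Unix epoch.

    Hypothetical helper: assumes start_date takes epoch seconds.
    """
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


start = to_epoch_seconds(2024, 1, 1)
print(start)  # 1704067200
```

The result can then be passed as .start_date(start). Pinning the timezone to UTC keeps the value independent of the machine the script runs on.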
Region
By default the miner mines from the connection's region. To target a specific BigQuery region:
- Python
Mine from a custom region
(
    BigqueryMiner(client)
    .connection(qualified_name="default/bigquery/1700000000")
    .start_date(1704067200)
    .region("custom")  # (1)
    .custom_big_query_region("region-us")  # (2)
    .run(name="bigquery-prod-miner")
)
- Switch region selection to custom.
- The BigQuery region to mine from.
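The sample value region-us follows BigQuery's region qualifier form, which is region- followed by the lowercase location name. The following is a minimal sketch of building that string from a location. region_qualifier is a hypothetical helper, and it assumes the miner expects this qualifier form for every location, which the example above shows only for the US multi-region.

```python
def region_qualifier(location: str) -> str:
    """Build a region qualifier such as "region-us" from a location name.

    Hypothetical helper: assumes the miner accepts the "region-<location>" form.
    """
    loc = location.strip().lower()
    if not loc:
        raise ValueError("location must not be empty")
    # Leave an already-qualified value untouched.
    return loc if loc.startswith("region-") else f"region-{loc}"


print(region_qualifier("US"))
print(region_qualifier("europe-west2"))
```

The returned string would be passed to .custom_big_query_region(...) after .region("custom").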
Advanced config
- Python
Custom feature-flag config
(
    BigqueryMiner(client)
    .connection(qualified_name="default/bigquery/1700000000")
    .start_date(1704067200)
    .control_config("custom")  # (1)
    .custom_config('{"flag": true}')  # (2)
    .run(name="bigquery-prod-miner")
)
- Switch advanced config to custom.
- Supply experimental feature-flag config as a JSON string.
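Because the custom config is passed as a JSON string, it may be less error-prone to serialize a dictionary than to write the string by hand. The following is a minimal sketch using the standard json module. The flag key is only the placeholder from the example above and not a known feature flag.

```python
import json

# Placeholder flag taken from the example above; real flag names are not listed here.
flags = {"flag": True}

# json.dumps emits valid JSON, including lowercase true/false for Python booleans.
custom_config = json.dumps(flags)
print(custom_config)  # {"flag": true}
```

The resulting string can be passed as .custom_config(custom_config). This avoids quoting mistakes such as Python-style True or single-quoted keys, which are not valid JSON.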