AtlasTransformer
Class: application_sdk.transformers.atlas
Converts raw metadata into Atlas entities using pyatlan library classes. Processes metadata row by row, creating properly structured Atlas entities with relationships, qualified names, and workflow metadata enrichment. Uses entity class definitions to transform raw metadata into Atlas-compatible entities for each entity type (DATABASE, SCHEMA, TABLE, etc.).
Methods (4)
__init__
__init__(self, connector_name, tenant_id, current_epoch='0', connection_qualified_name=None)
Parameters
- connector_name (str)
- tenant_id (str)
- current_epoch (str, default '0')
- connection_qualified_name (str, default None)

transform_metadata
transform_metadata(self, typename, dataframe, workflow_id, workflow_run_id, entity_class_definitions=None, **kwargs)
Parameters
- typename (str)
- dataframe (daft.DataFrame)
- workflow_id (str)
- workflow_run_id (str)
- entity_class_definitions (Optional[Dict[str, Type[Any]]])
- **kwargs (dict)
Returns
daft.DataFrame - Transformed DataFrame with Atlas entities as dictionaries

transform_row
transform_row(self, typename, data, workflow_id, workflow_run_id, entity_class_definitions=None, **kwargs)
Parameters
- typename (str)
- data (Dict[str, Any])
- workflow_id (str)
- workflow_run_id (str)
- entity_class_definitions (Optional[Dict[str, Type[Any]]])
- **kwargs (dict)
Returns
Optional[Dict[str, Any]] - Transformed entity as dictionary, or None if transformation fails

_enrich_entity_with_metadata
_enrich_entity_with_metadata(self, workflow_id, workflow_run_id, data)
Parameters
- workflow_id (str)
- workflow_run_id (str)
- data (Dict[str, Any])
Returns
dict - Dictionary with attributes and custom_attributes keys containing enriched metadata

Usage Examples
Basic transformation
Initialize transformer and transform metadata for tables
from application_sdk.transformers.atlas import AtlasTransformer

# Initialize transformer
transformer = AtlasTransformer(
    connector_name="postgresql-connector",
    tenant_id="tenant-123",
)

# Transform metadata
transformed_df = transformer.transform_metadata(
    typename="TABLE",
    dataframe=raw_table_df,
    workflow_id="extract-tables",
    workflow_run_id="run-001",
    connection={
        "connection_name": "production",
        "connection_qualified_name": "tenant/postgresql/1",
    },
)
Processing multiple entity types
Transform different entity types in sequence
# Transform different entity types
databases_df = transformer.transform_metadata(
    typename="DATABASE",
    dataframe=raw_databases_df,
    workflow_id="workflow-123",
    workflow_run_id="run-456",
    connection=connection_info,
)

schemas_df = transformer.transform_metadata(
    typename="SCHEMA",
    dataframe=raw_schemas_df,
    workflow_id="workflow-123",
    workflow_run_id="run-456",
    connection=connection_info,
)

tables_df = transformer.transform_metadata(
    typename="TABLE",
    dataframe=raw_tables_df,
    workflow_id="workflow-123",
    workflow_run_id="run-456",
    connection=connection_info,
)
Error handling
Row-level errors do not stop the transformation: invalid rows are logged as warnings and skipped. To handle failures that affect the whole call, wrap transform_metadata in a try/except block.
import logging

logger = logging.getLogger(__name__)

try:
    transformed_df = transformer.transform_metadata(
        typename="TABLE",
        dataframe=raw_df,
        workflow_id="workflow-123",
        workflow_run_id="run-456",
        connection=connection_info,
    )
except Exception as e:
    logger.error(f"Transformation failed: {e}")
    # Handle error
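The skip-and-continue behavior for invalid rows can be pictured with a small standalone sketch. It does not use the SDK: transform_rows and demo_transform_row are hypothetical names, and the only behavior taken from this page is that a row transformation may return None on failure and that failed rows are logged as warnings and skipped.

```python
import logging
from typing import Any, Callable, Dict, Iterable, List, Optional

logger = logging.getLogger(__name__)

Row = Dict[str, Any]


def transform_rows(
    rows: Iterable[Row],
    transform_row: Callable[[Row], Optional[Row]],
) -> List[Row]:
    """Apply transform_row to every row, skipping rows that fail."""
    transformed: List[Row] = []
    for index, row in enumerate(rows):
        try:
            entity = transform_row(row)
        except Exception as exc:
            # A single bad row is logged and skipped, not re-raised.
            logger.warning("Skipping row %d: %s", index, exc)
            continue
        if entity is None:
            logger.warning("Skipping row %d: transformation returned None", index)
            continue
        transformed.append(entity)
    return transformed


def demo_transform_row(row: Row) -> Optional[Row]:
    """Hypothetical stand-in for a row transformer (not the SDK method)."""
    if "table_name" not in row:
        return None
    return {"typeName": "Table", "attributes": {"name": row["table_name"]}}


results = transform_rows(
    [{"table_name": "orders"}, {"wrong_key": "x"}, {"table_name": "users"}],
    demo_transform_row,
)
```

In this sketch the second row is dropped and the remaining two are returned in order, so one malformed record does not cost the whole batch.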
Default entity classes
The transformer includes default entity class definitions:
| Entity Type | Class | Description |
|---|---|---|
| DATABASE | Database | Database entities |
| SCHEMA | Schema | Schema entities |
| TABLE | Table | Table entities |
| VIEW | Table | View entities (uses Table class) |
| MATERIALIZED VIEW | Table | Materialized view entities (uses Table class) |
| COLUMN | Column | Column entities |
| FUNCTION | Function | Function entities |
| PROCEDURE | Procedure | Stored procedure entities |
| TAG_REF | TagAttachment | Tag attachment entities |
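One way to picture how a typename is resolved to an entity class, and how the entity_class_definitions argument might override a default, is the sketch below. The classes are empty placeholders rather than the SDK's, resolve_entity_class is a hypothetical helper, and merging overrides on top of the defaults is an assumption that this page does not confirm.

```python
from typing import Any, Dict, Optional, Type


# Placeholder classes standing in for the SDK's entity classes.
class Database: ...
class Table: ...
class CustomTable(Table): ...


# Subset of the default mapping from the table above.
DEFAULT_ENTITY_CLASSES: Dict[str, Type[Any]] = {
    "DATABASE": Database,
    "TABLE": Table,
    "VIEW": Table,
    "MATERIALIZED VIEW": Table,
}


def resolve_entity_class(
    typename: str,
    entity_class_definitions: Optional[Dict[str, Type[Any]]] = None,
) -> Type[Any]:
    """Look up the class for a typename, letting overrides win over defaults."""
    # Assumption: overrides are merged over the defaults, not a full replacement.
    definitions = {**DEFAULT_ENTITY_CLASSES, **(entity_class_definitions or {})}
    key = typename.upper()
    if key not in definitions:
        raise ValueError(f"No entity class defined for typename {typename!r}")
    return definitions[key]


default_view_class = resolve_entity_class("VIEW")
overridden_table_class = resolve_entity_class("TABLE", {"TABLE": CustomTable})
```

A plain dictionary keeps the mapping easy to inspect and makes the shared use of Table for views explicit.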
Entity classes
Entity classes are defined in application_sdk.transformers.atlas.sql and extend pyatlan asset classes.
Database
Transforms database metadata into Database entities.
Required fields:
- database_name: Name of the database
- connection_qualified_name: Connection qualified name
Attributes created:
- qualified_name: Built from connection and database name
- name: Database name
- connection_qualified_name: Connection reference
Schema
Transforms schema metadata into Schema entities.
Required fields:
- schema_name: Name of the schema
- database_name: Name of the parent database
- connection_qualified_name: Connection qualified name
Attributes created:
- qualified_name: Built from connection, database, and schema name
- name: Schema name
- database_qualified_name: Parent database reference
Table
Transforms table metadata into Table entities.
Required fields:
- table_name: Name of the table
- table_schema: Name of the parent schema
- table_catalog: Name of the parent database
- connection_qualified_name: Connection qualified name
Attributes created:
- qualified_name: Built from connection, database, schema, and table name
- name: Table name
- schema_qualified_name: Parent schema reference
- database_qualified_name: Parent database reference
- Additional table-specific attributes (row_count, column_count, etc.)
Column
Transforms column metadata into Column entities.
Required fields:
- column_name: Name of the column
- table_name: Name of the parent table
- table_schema: Name of the parent schema
- table_catalog: Name of the parent database
- connection_qualified_name: Connection qualified name
Attributes created:
- qualified_name: Built from connection, database, schema, table, and column name
- name: Column name
- table_qualified_name: Parent table reference
- Additional column-specific attributes (data_type, nullable, etc.)
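The hierarchical qualified names described for Database, Schema, Table, and Column can be illustrated with a minimal sketch. It assumes the parts are joined with "/" onto the connection qualified name, which matches the "tenant/postgresql/1" style used in the examples on this page but is not confirmed here; build_qualified_name is a hypothetical helper, not an SDK function.

```python
def build_qualified_name(connection_qualified_name: str, *parts: str) -> str:
    """Join the connection qualified name and child names with '/'.

    Assumption: '/' is the separator; the SDK's actual rule may differ.
    """
    return "/".join([connection_qualified_name, *parts])


connection_qn = "tenant/postgresql/1"

# Each level extends its parent's qualified name by one segment.
database_qn = build_qualified_name(connection_qn, "sales")
schema_qn = build_qualified_name(connection_qn, "sales", "public")
table_qn = build_qualified_name(connection_qn, "sales", "public", "orders")
column_qn = build_qualified_name(connection_qn, "sales", "public", "orders", "id")
```

Under this assumption a child's parent reference (for example table_qualified_name on a column) is simply the child's qualified name minus its last segment.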
Function
Transforms function metadata into Function entities.
Required fields:
- function_name: Name of the function
- function_definition: Source code of the function
- function_catalog: Database containing the function
- function_schema: Schema containing the function
- connection_qualified_name: Connection qualified name
Procedure
Transforms stored procedure metadata into Procedure entities.
Required fields:
- procedure_name: Name of the procedure
- procedure_definition: Source code of the procedure
- procedure_catalog: Database containing the procedure
- procedure_schema: Schema containing the procedure
- connection_qualified_name: Connection qualified name
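Before handing rows to the transformer, it can help to check them against the required fields listed for each entity class. The sketch below collects those lists into one dictionary and reports what is missing. missing_required_fields is a hypothetical helper, and treating None or an empty string as "missing" is an assumption rather than documented SDK behavior.

```python
from typing import Any, Dict, List

# Required fields per entity type, copied from the entity class sections above.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "DATABASE": ["database_name", "connection_qualified_name"],
    "SCHEMA": ["schema_name", "database_name", "connection_qualified_name"],
    "TABLE": [
        "table_name", "table_schema", "table_catalog",
        "connection_qualified_name",
    ],
    "COLUMN": [
        "column_name", "table_name", "table_schema", "table_catalog",
        "connection_qualified_name",
    ],
    "FUNCTION": [
        "function_name", "function_definition", "function_catalog",
        "function_schema", "connection_qualified_name",
    ],
    "PROCEDURE": [
        "procedure_name", "procedure_definition", "procedure_catalog",
        "procedure_schema", "connection_qualified_name",
    ],
}


def missing_required_fields(typename: str, data: Dict[str, Any]) -> List[str]:
    """Return the required fields that are absent, None, or empty in data."""
    return [
        name
        for name in REQUIRED_FIELDS[typename]
        if data.get(name) in (None, "")
    ]


row = {
    "table_name": "orders",
    "table_schema": "public",
    "connection_qualified_name": "tenant/postgresql/1",
}
missing = missing_required_fields("TABLE", row)
```

Here the row lacks table_catalog, so it would be flagged before transformation instead of being skipped with a warning later.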
Entity enrichment
All entities are automatically enriched with:
Workflow metadata
- last_sync_workflow_name: Workflow identifier
- last_sync_run: Workflow run identifier
- last_sync_run_at: Timestamp of last sync
Connection metadata
- connection_name: Name of the connection
- connector_name: Connector type derived from qualified name
- connection_qualified_name: Full connection qualified name
Source metadata (when available)
- description: Processed from remarks or comment
- source_created_by: Original creator
- source_created_at: Creation timestamp
- source_updated_at: Last update timestamp
- source_id: Source system identifier
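As a rough illustration of the enrichment described above, the sketch below builds a dictionary with attributes and custom_attributes keys, the return shape documented for _enrich_entity_with_metadata. Everything else is an assumption: enrich_entity and connector_name_from_qualified_name are hypothetical helpers, the connector is taken to be the second "/"-separated segment of the connection qualified name, remarks is preferred over comment for the description, and custom_attributes is left empty because its contents are not specified on this page.

```python
from typing import Any, Dict


def connector_name_from_qualified_name(connection_qualified_name: str) -> str:
    """Assumption: the name has the shape '<tenant>/<connector>/<id>'."""
    return connection_qualified_name.split("/")[1]


def enrich_entity(
    workflow_id: str,
    workflow_run_id: str,
    data: Dict[str, Any],
    synced_at: int,
) -> Dict[str, Dict[str, Any]]:
    """Build workflow, connection, and source metadata for one entity."""
    connection_qn = data["connection_qualified_name"]
    attributes: Dict[str, Any] = {
        # Workflow metadata
        "last_sync_workflow_name": workflow_id,
        "last_sync_run": workflow_run_id,
        "last_sync_run_at": synced_at,
        # Connection metadata
        "connection_name": data.get("connection_name"),
        "connector_name": connector_name_from_qualified_name(connection_qn),
        "connection_qualified_name": connection_qn,
    }
    # Source metadata: only set the description when a source value exists.
    for source_key in ("remarks", "comment"):
        if data.get(source_key):
            attributes["description"] = data[source_key]
            break
    # Placeholder: the contents of custom_attributes are not documented here.
    return {"attributes": attributes, "custom_attributes": {}}


enriched = enrich_entity(
    workflow_id="workflow-123",
    workflow_run_id="run-456",
    data={
        "connection_name": "production",
        "connection_qualified_name": "tenant/postgresql/1",
        "comment": "Customer orders",
    },
    synced_at=1700000000000,
)
```

Passing the timestamp in as an argument keeps the sketch deterministic; a real implementation would more likely read the clock at sync time.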
See also
- Transformers: Overview of all transformers and the TransformerInterface
- Query-based transformer: Transform metadata using SQL queries defined in YAML templates
- Application SDK README: Overview of the Application SDK and its components