Metadata Completeness App
The Metadata Completeness workflow computes a score based on the completeness and enrichment of metadata for each asset. The score is attached to each asset as custom metadata, helping you identify which assets need additional metadata to meet your governance standards.
The scoring criteria and the assets in scope are configurable during workflow setup. The workflow can run in full mode to score all assets, or incremental mode to score only assets that have been updated since the last run.
Access
The Metadata Completeness workflow isn't enabled by default. To use this workflow, contact Atlan support and request that it be added to your tenant. Once enabled, the workflow can be set up and run by admins or users with workflow permissions.
Configuration
This section defines the fields required for workflow setup.
Workflow name
Specifies the display name for the workflow in Atlan. This name is used to identify the workflow in the UI and logs. Choose a name that clearly reflects the purpose or scope of the metadata completeness scoring.
Example:
Production metadata completeness scoring
Mode
Defines how the workflow processes assets when computing completeness scores.
- Full: Computes scores for all assets that match the in-scope criteria. This mode is recommended for initial setup or periodic full recalculations. When selected, the workflow:
- Fetches all assets based on the filters defined in the Configuration section.
- Computes the completeness score for each asset.
- Identifies missing metadata for each asset.
- Updates the custom metadata attributes (score, missing metadata, and score contributors).
Example: If you have 10,000 tables and only 200 changed this week, full mode evaluates all 10,000.
- Incremental: Computes scores only for assets that have been updated since the last workflow run. This mode is recommended for regular scheduled runs to improve performance. During the first execution, the workflow automatically behaves as full mode. When selected, the workflow:
- Fetches assets updated since the last workflow run.
- Computes the completeness score for each updated asset.
- Identifies missing metadata for each updated asset.
- Updates the custom metadata attributes (score, missing metadata, and score contributors).
Example: If only 50 glossary terms were modified since yesterday, incremental mode evaluates just those 50 instead of the entire glossary.
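The two modes can be illustrated with a short sketch. This is not the workflow's actual implementation: the function name, the dict shape, and the `updated_at` field are assumptions for illustration; the real workflow queries assets through Atlan rather than filtering an in-memory list.

```python
def select_assets(all_assets, mode, last_run_at=None):
    """Choose which in-scope assets to score in this run (illustrative sketch).

    `all_assets` is assumed to be a list of dicts, each with a comparable
    `updated_at` timestamp.
    """
    # Full mode, or the very first incremental run, scores everything.
    if mode == "full" or last_run_at is None:
        return all_assets
    # Incremental mode keeps only assets changed since the last run.
    return [a for a in all_assets if a["updated_at"] > last_run_at]
```

With this sketch, running in incremental mode against a last-run timestamp returns only the assets updated after that timestamp, while full mode (or a first incremental run with no prior timestamp) returns the full list.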
In scope assets
Specifies which asset types are included in the scoring calculation. Enter a comma-separated list of asset type names.
If you select non-glossary asset types and don't specify a connection, all connections are included. If you select glossary asset types, connection filtering doesn't apply. For a complete list of available asset types, see the Atlan data model reference.
Example:
Table,Column,View,AtlasGlossaryTerm
Connection
Specifies which connections to include when scoring physical assets. This field is a multi-select dropdown and is optional. It only applies to non-glossary asset types.
- If you don't select any connections and select non-glossary asset types in In scope assets, all connections are included.
- If you select one or more connections, only assets from those connections are scored.
- This field doesn't apply to glossary assets.
Custom metadata name
Specifies the name of the custom metadata structure used to store completeness results. This custom metadata is automatically created during the first workflow run if it doesn't already exist. If it already exists, the workflow updates its definition based on the in-scope assets and connections.
This field is required, as the workflow uses this custom metadata structure to store the score, missing metadata, and score contributors for each asset. The custom metadata structure created by the workflow can't be edited from the Atlan UI.
Example:
Metadata Completeness Score
Custom metadata score attribute name
Specifies the name of the attribute within the custom metadata structure that stores the numerical score value.
If you leave this field empty, the default attribute name Score is used.
Example:
Completeness Score
Generic score configuration
Score value components applicable to both physical and glossary assets. These settings define how much each metadata element contributes to the overall completeness score.
Description score
Specifies the score value added when an asset contains a description. If the asset doesn't have a description, this component isn't included in its overall completeness score.
Use this field to emphasize the importance of descriptive metadata in your scoring model. A higher value encourages teams to provide clear, useful descriptions for their assets.
Example: If you set Description Score to 10, any asset that includes a description receives an additional 10 points toward its completeness score. Assets without a description receive 0 points for this component.
Owner score
Specifies the score value added when an asset has at least one owner assigned. If the asset has no owners, this component isn't included in its overall completeness score.
Use this field to emphasize the importance of asset ownership in your scoring model. A higher value encourages teams to assign clear ownership for their assets.
Example: A table with an assigned owner receives 15 points. A table without an owner receives 0 points for this component, helping you identify assets that need ownership assignment.
Tag score
Specifies the score value added when an asset has at least the number of tags specified in Number of tags. If the asset has fewer tags than the threshold, this component isn't included in its overall completeness score.
Use this field to emphasize the importance of tagging in your scoring model. A higher value encourages teams to apply relevant tags to their assets for better discoverability and classification.
Example: With Tag Score set to 8 and Number of tags set to 2, a column tagged with "PII" and "Customer Data" earns 8 points. A column with only one tag or no tags earns 0 points for this component.
Number of tags
Specifies the minimum number of tags that must be assigned to an asset for the Tag score to be applied. If an asset has fewer tags than this threshold, the tag score component isn't included in its overall completeness score.
Use this field to set the minimum tagging requirement for your assets. A higher threshold encourages more comprehensive tagging, while a lower threshold makes it easier for assets to qualify for the tag score.
If you leave this field empty, it defaults to 1.
Example: Setting Number of tags to 5 means a view must have at least 5 tags (such as "Production," "Sales," "Quarterly," "Validated," "Critical") to qualify for the Tag score. Views with 4 or fewer tags don't receive points from this component.
Verified score
Specifies the score value added when an asset has a Verified certificate status. If the asset has a different certificate status or no certificate, this component isn't included in its overall completeness score.
Use this field to reward assets that have been verified as meeting your data quality standards. A higher value encourages teams to complete the verification process for their assets.
Example: If Draft score is set to 5 and Verified score is set to 10, a dashboard with a Verified certificate earns 10 points (the Verified score), while a dashboard with a Draft certificate earns 5 points (the Draft score). Dashboards with Deprecated status or no certificate earn 0 points for this component, highlighting which assets still need verification.
Draft score
Specifies the score value added when an asset has a Draft certificate status. If the asset has a different certificate status or no certificate, this component isn't included in its overall completeness score.
Use this field to provide partial credit for assets that are in the certification process but not yet verified. A lower value than Verified score encourages teams to complete verification.
Example: A pipeline with a Draft certificate receives 7 points, acknowledging progress toward verification. Pipelines that are Verified, Deprecated, or uncertified receive 0 points for this component, encouraging teams to move Draft assets to Verified status.
Resource score
Specifies the score value added when an asset has at least one linked resource. If the asset has no linked resources, this component isn't included in its overall completeness score.
Use this field to encourage teams to link relevant resources such as documentation, dashboards, or external references to their assets. A higher value emphasizes the importance of providing additional context and resources.
Example: A table linked to a Confluence documentation page or a Looker dashboard earns 12 points from this component. Tables without any linked resources earn 0 points, helping identify assets that benefit from additional context.
Readme characters threshold
Specifies the minimum number of characters in a readme that determines which readme score component applies. Assets with readmes containing more characters than this threshold use Readme exceeding threshold score, while assets with readmes at or below this threshold use Readme below threshold score.
Use this field to differentiate between basic and comprehensive readme documentation. A higher threshold encourages more detailed documentation, while a lower threshold makes it easier for assets to qualify for the higher readme score.
Example: With Readme characters threshold set to 750, a dataset with a detailed 1,200-character readme explaining its purpose, schema, and usage patterns qualifies for the Readme exceeding threshold score. A dataset with a brief 300-character readme qualifies only for the Readme below threshold score.
Readme exceeding threshold score
Specifies the score value added when an asset has at least one readme with more characters than the Readme characters threshold. If the asset has no readme or a readme at or below the threshold, this component isn't included in its overall completeness score.
Use this field to reward assets with comprehensive readme documentation. A higher value encourages teams to provide detailed, thorough documentation for their assets.
Example: With Readme exceeding threshold score set to 18 and threshold at 600, a model with an 800-character readme covering business logic, data sources, and refresh schedules earns 18 points. Models with shorter readmes or no readme earn 0 points, incentivizing comprehensive documentation.
Readme below threshold score
Specifies the score value added when an asset has at least one readme whose character count is at or below the Readme characters threshold. If the asset has no readme or its readme exceeds the threshold, this component isn't included in its overall completeness score.
Use this field to provide partial credit for assets with basic readme documentation. A lower value than Readme exceeding threshold score encourages teams to expand their documentation to qualify for the higher score.
Example: With Readme below threshold score set to 6 and threshold at 400, a report with a 250-character readme providing basic context earns 6 points. Reports with readmes exceeding 400 characters earn the higher Readme exceeding threshold score, while reports without readmes earn 0 points.
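The generic components described above combine additively into a single score. The sketch below illustrates that logic under stated assumptions: the field and key names are made up for illustration (they are not the workflow's internal schema), and the Verified and Draft components are treated as mutually exclusive, as their definitions above state.

```python
def generic_score(asset, cfg):
    """Sum the generic completeness components for one asset (illustrative)."""
    score = 0
    if asset.get("description"):
        score += cfg["description_score"]
    if asset.get("owners"):
        score += cfg["owner_score"]
    # Tag score applies only at or above the configured tag threshold.
    if len(asset.get("tags", [])) >= cfg.get("number_of_tags", 1):
        score += cfg["tag_score"]
    # Verified and Draft are mutually exclusive certificate statuses.
    if asset.get("certificate") == "Verified":
        score += cfg["verified_score"]
    elif asset.get("certificate") == "Draft":
        score += cfg["draft_score"]
    if asset.get("resources"):
        score += cfg["resource_score"]
    # Exactly one readme component applies, depending on length.
    readme = asset.get("readme", "")
    if readme:
        if len(readme) > cfg["readme_threshold"]:
            score += cfg["readme_exceeding_score"]
        else:
            score += cfg["readme_below_score"]
    return score
```

For example, with the sample values used in the sections above (description 10, owner 15, tag 8 with a threshold of 2, Verified 10, resource 12, readme 18 above a 600-character threshold), a verified table with a description, an owner, two tags, and an 800-character readme would score 10 + 15 + 8 + 10 + 18 = 61 points.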
Asset score configuration
Score value components applicable to physical assets only. These settings don't apply to glossary assets.
Term score
Specifies the score value added when a physical asset has at least one linked glossary term. If the asset has no linked terms, this component isn't included in its overall completeness score.
Use this field to encourage teams to link physical assets to glossary terms, establishing semantic relationships between technical assets and business concepts. A higher value emphasizes the importance of connecting technical metadata to business terminology.
Example: A column linked to the "Customer ID" glossary term earns 12 points, showing alignment between technical implementation and business definitions. Columns without linked glossary terms earn 0 points, indicating opportunities to connect technical assets to your business glossary.
Custom metadata scores
A comma-separated list of custom metadata scores to include in the completeness calculation. Each element must follow the format:
Custom metadata name@@@Attribute name@@@Score
The workflow evaluates each specified custom metadata attribute. If an asset contains a value for that attribute, the defined score is added to its total completeness score.
Example: Multiple business attributes
Products@@@Transactions@@@50,SLA@@@Update Frequency@@@5
In this example:
- If an asset has a value in the Transactions attribute of the Products custom metadata, add 50 points.
- If an asset has a value in the Update Frequency attribute of the SLA custom metadata, add 5 points.
Example 2: Compliance and quality-related attributes
Compliance@@@PII Classification@@@25,QualityMetrics@@@Freshness Score@@@10
In this example:
- If an asset contains a value in the PII Classification attribute of the Compliance custom metadata, add 25 points.
- If an asset has a value in the Freshness Score attribute of the QualityMetrics custom metadata, add 10 points.
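The `@@@`-delimited format above can be validated with a short parsing sketch. This is illustrative only: the workflow parses this field internally, and the function name here is an assumption. Note that this naive split assumes none of the names contain a comma.

```python
def parse_custom_metadata_scores(raw):
    """Parse a comma-separated list of `name@@@attribute@@@score` entries."""
    entries = []
    for item in raw.split(","):
        cm_name, attr_name, score = item.split("@@@")
        entries.append((cm_name.strip(), attr_name.strip(), int(score)))
    return entries

print(parse_custom_metadata_scores(
    "Products@@@Transactions@@@50,SLA@@@Update Frequency@@@5"
))
# → [('Products', 'Transactions', 50), ('SLA', 'Update Frequency', 5)]
```

Running the sketch against your configured value before saving it is an easy way to catch a missing `@@@` separator or a non-numeric score.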
Glossary score configuration
Score value components applicable to glossary assets only. These settings don't apply to physical assets.
Linked asset score
Specifies the score value added when a glossary term has at least one linked physical asset. If the term has no linked assets, this component isn't included in its overall completeness score.
Use this field to encourage teams to link glossary terms to actual physical assets, demonstrating that business concepts are implemented in the data. A higher value emphasizes the importance of connecting business terminology to technical implementations.
Example: A glossary term "Revenue" linked to tables like sales_fact and revenue_summary earns 18 points, demonstrating that the business concept has real data implementations. Terms without linked physical assets earn 0 points, highlighting business concepts that may need data mapping.
Related term score
Specifies the score value added when a glossary term has at least one linked related term. If the term has no related terms, this component isn't included in its overall completeness score.
Use this field to encourage teams to establish relationships between related glossary terms, building a more connected and navigable business glossary. A higher value emphasizes the importance of creating semantic relationships within your glossary.
Example: A glossary term "Customer Lifetime Value" linked to related terms like "Customer Acquisition Cost" and "Customer Retention Rate" earns 11 points, showing interconnected business concepts. Terms without related term relationships earn 0 points, indicating opportunities to build a more connected glossary.
Custom metadata scores
A comma-separated list of custom metadata scores to include in the completeness calculation for glossary assets. Each element must follow the format: Custom metadata name@@@Attribute name@@@Score.
The workflow evaluates each custom metadata attribute specified. If the glossary asset has a value for that attribute, the corresponding score is added to the total.
Example:
Business Context@@@Data Domain@@@20,Classification@@@Sensitivity@@@15
In this example:
- If a glossary term has a value in the Data Domain attribute of the Business Context custom metadata, add 20 points.
- If a glossary term has a value in the Sensitivity attribute of the Classification custom metadata, add 15 points.