Extraction pipeline
When you upload a knowledge file, a context agent reads the document and extracts the business rules and glossary terms it contains. The agent produces two outputs: glossary terms with their business rules attached, and a skill file. Both outputs appear in the catalog as assets linked back to the source knowledge file.
How processing works
The context agent parses the document's natural-language content to identify discrete rules and definitions. For each rule it finds, the agent:
- Creates or updates a glossary term in Atlan with the rule's name and a generated description
- Attaches the extracted business rule text to that term
- Links the relevant catalog assets to the term
For the document as a whole, the agent generates one skill file that packages the rules and logic extracted from the document in a form that other agents can directly consume.
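The steps above can be sketched in a few lines of Python. This is an illustrative model only, not Atlan's implementation or API: the names ExtractedRule, GlossaryTerm, SkillFile, and process_knowledge_file are hypothetical, the glossary is a plain dictionary, and the choice to refresh a term's description on update is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ExtractedRule:
    """One rule found in the document (hypothetical shape)."""
    name: str          # becomes the glossary term name
    description: str   # generated description for the term
    rule_text: str     # business rule text attached to the term
    asset_names: list[str] = field(default_factory=list)  # catalog assets to link


@dataclass
class GlossaryTerm:
    name: str
    description: str
    source_file: str
    business_rules: list[str] = field(default_factory=list)
    linked_assets: list[str] = field(default_factory=list)


@dataclass
class SkillFile:
    source_file: str
    rules: list[str]


def process_knowledge_file(source_file, rules, glossary):
    """Create or update one term per rule, then build one skill file per document."""
    for rule in rules:
        term = glossary.get(rule.name)
        if term is None:
            # Create the term and record which file it came from.
            term = GlossaryTerm(rule.name, rule.description, source_file)
            glossary[rule.name] = term
        else:
            # Assumption: an update refreshes the generated description.
            term.description = rule.description
        # Attach the rule text and link assets, skipping duplicates.
        if rule.rule_text not in term.business_rules:
            term.business_rules.append(rule.rule_text)
        for asset in rule.asset_names:
            if asset not in term.linked_assets:
                term.linked_assets.append(asset)
    # One skill file packages every rule extracted from the document.
    return SkillFile(source_file, [rule.rule_text for rule in rules])
```

In this sketch, processing the same file twice leaves one term per rule rather than duplicates, which is the behavior "creates or updates" implies.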
Glossary terms
Extracted terms appear in your Atlan glossary like any other term. The difference is their source: each term's lineage traces back to the knowledge file it came from, so you can always see which document a rule originated in.
Business rules are attached directly to each term.
Skills
A skill file packages the rules and logic extracted from a document in a form that agents can use directly. Context agents create skills, and each skill is visible in the catalog as an asset. Agents consume skills to apply the encoded rules when answering questions or executing tasks.
Lineage
The catalog records the full processing trail as lineage on the knowledge file asset. You can see the source knowledge file, the context agent that processed it, and the glossary terms and skill file that the agent produced.
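One way to picture this trail is as a small directed graph running from the knowledge file, through the context agent, to the outputs. The sketch below is a conceptual model only: LineageEdge, downstream, and the node names are hypothetical and do not reflect how Atlan stores or exposes lineage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageEdge:
    source: str  # upstream node
    target: str  # downstream node


def downstream(edges, start):
    """Return every node reachable from start, in breadth-first order."""
    seen = {start}
    queue = [start]
    order = []
    while queue:
        node = queue.pop(0)
        for edge in edges:
            if edge.source == node and edge.target not in seen:
                seen.add(edge.target)
                order.append(edge.target)
                queue.append(edge.target)
    return order


# Hypothetical trail for a single uploaded file.
edges = [
    LineageEdge("file:refund-policy.pdf", "agent:context-agent"),
    LineageEdge("agent:context-agent", "term:Refund window"),
    LineageEdge("agent:context-agent", "term:Eligible order"),
    LineageEdge("agent:context-agent", "skill:refund-policy"),
]

trail = downstream(edges, "file:refund-policy.pdf")
```

Walking the graph from the file yields the agent first, then the two terms and the skill file it produced, which mirrors what the lineage view shows.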
See also
- Knowledge folders and files: What knowledge folders and files are as catalog assets
- Create knowledge folder and upload files: How to upload files and review extracted terms and skill files
- Context Agents Studio: Configure and manage the context agents that process knowledge files