Atlan AI security

Atlan AI is designed with multiple security controls to protect metadata, credentials, and communication between systems. This document outlines the AI architecture, security practices, data handling, encryption, and compliance frameworks for Atlan AI.

Architecture

What services does Atlan AI use?

Atlan AI uses a multi-model architecture powered by a centralized AI gateway. Instead of relying on a single LLM provider, Atlan routes all AI requests through an open-source LLM proxy. This gateway supports multiple LLM providers, including:

Anthropic (Claude models via AWS Bedrock)
OpenAI (GPT models)
Google (Gemini models)
Open-source models hosted by Atlan

The AI gateway acts as a centralized control plane that handles model routing, load balancing, rate limiting, and observability across all LLM providers. Model versions may change as providers release improvements.
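As an illustration of what model routing and load balancing at a gateway can look like, here is a minimal sketch. It isn't Atlan's implementation: the `ModelRoute` and `Gateway` classes, the logical model names, and the deployment labels are all hypothetical, and round-robin is only one possible balancing strategy.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Dict, Iterator, List


@dataclass
class ModelRoute:
    """A logical model name mapped to one or more provider deployments (hypothetical)."""
    deployments: List[str]
    _rotation: Iterator[str] = field(init=False, repr=False)

    def __post_init__(self) -> None:
        if not self.deployments:
            raise ValueError("a route needs at least one deployment")
        self._rotation = cycle(self.deployments)

    def next_deployment(self) -> str:
        # Round-robin load balancing across the configured deployments.
        return next(self._rotation)


class Gateway:
    """Single entry point that resolves a logical model name to a deployment."""

    def __init__(self, routes: Dict[str, ModelRoute]) -> None:
        self._routes = routes

    def route(self, model: str) -> str:
        try:
            return self._routes[model].next_deployment()
        except KeyError:
            raise LookupError(f"no route configured for model {model!r}") from None


# Deployment labels below are placeholders, not real endpoints.
gateway = Gateway({
    "claude": ModelRoute(["bedrock/region-a", "bedrock/region-b"]),
    "gpt": ModelRoute(["openai/default"]),
})
```

Because callers only ever name a logical model, provider changes stay behind the gateway, which is the main benefit of a centralized control plane.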

Anthropic Claude models are accessed via AWS Bedrock, which keeps traffic within AWS private networking and supports AWS compliance certifications.

How's Atlan's AI gateway deployed?

The AI gateway is deployed as a Kubernetes-based service within Atlan's infrastructure:

Multi-region deployment: The gateway is hosted across multiple regions (United States, EU, APAC) to support data residency and compliance with regional regulations.
Tenant isolation: Each tenant receives a unique API key for the gateway, with per-tenant budgets and rate limits enforced at the gateway level.
Secure connectivity: All traffic between tenant environments and the AI gateway uses VPC peering or PrivateLink to keep data within private network boundaries.
Observability: The gateway provides built-in logging, metrics, traces, and cost tracking. All observability data is pushed to a centralized monitoring system.
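The per-tenant budgets and rate limits listed above could be enforced along these lines. This is a sketch under stated assumptions: the `TenantLimits` and `TenantGate` names, the fixed 60-second window, and the dollar-denominated budget are all illustrative choices, not details from Atlan's gateway.

```python
from dataclasses import dataclass


@dataclass
class TenantLimits:
    requests_per_minute: int  # hypothetical rate limit
    budget_usd: float         # hypothetical spend ceiling


class TenantGate:
    """Fixed-window rate limit plus a running spend budget for one tenant."""

    def __init__(self, limits: TenantLimits) -> None:
        self.limits = limits
        self.window_start = 0.0
        self.requests_in_window = 0
        self.spent_usd = 0.0

    def allow(self, now: float, estimated_cost_usd: float) -> bool:
        # Start a new 60-second window once the current one has elapsed.
        if now - self.window_start >= 60.0:
            self.window_start = now
            self.requests_in_window = 0
        if self.requests_in_window >= self.limits.requests_per_minute:
            return False  # rate limit reached for this window
        if self.spent_usd + estimated_cost_usd > self.limits.budget_usd:
            return False  # request would exceed the tenant budget
        self.requests_in_window += 1
        self.spent_usd += estimated_cost_usd
        return True
```

In use, the gateway would hold one `TenantGate` per tenant API key and call `allow(time.monotonic(), cost)` before forwarding a request. A token bucket would smooth bursts better than a fixed window; the fixed window is used here only because it is the shortest correct example.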

How's Atlan's vector store secured?

Atlan uses TurboPuffer, a serverless vector and full-text search database, as the primary vector store for powering semantic search across AI applications. TurboPuffer stores vector embeddings and associated metadata used by capabilities like conversational AI and the MCP server.

Security controls for the vector store include:

Tenant isolation: Each tenant has a dedicated namespace within TurboPuffer, segmented by application and use case (for example, tenant-name/application-name/use-case). Tenants can't access each other's namespaces.
Encryption: Each namespace is encrypted using a Customer Managed Encryption Key (CMEK).
Access control: Access to each namespace is controlled via per-tenant API keys. Applications only have read/write access to their own namespaces.
Network security: TurboPuffer is deployed in a dedicated Kubernetes cluster within Atlan's cloud account, with PrivateLink connectivity to keep all traffic private.
Regional data residency: Namespaces are co-located in the region closest to the tenant, supporting GDPR, CCPA, and other data residency requirements.
Bring-your-own-bucket: Customers can choose to store vector data in their own cloud storage bucket for additional data isolation.
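To make the namespace convention above concrete, the following sketch builds a `tenant-name/application-name/use-case` namespace and checks whether a tenant-scoped key may access it. The function names and the allowed character set for segments are assumptions made for illustration; the actual validation rules aren't described here.

```python
import re

# Assumed segment format: lowercase letters, digits, and hyphens only.
_SEGMENT = re.compile(r"[a-z0-9][a-z0-9-]*")


def build_namespace(tenant: str, application: str, use_case: str) -> str:
    """Join the three segments, rejecting anything that could escape its segment."""
    segments = (tenant, application, use_case)
    for segment in segments:
        if not _SEGMENT.fullmatch(segment):
            raise ValueError(f"invalid namespace segment: {segment!r}")
    return "/".join(segments)


def can_access(key_tenant: str, namespace: str) -> bool:
    # Compare the whole first segment so "acme" doesn't match "acme-eu/...".
    return namespace.split("/", 1)[0] == key_tenant
```

Rejecting `/` inside a segment matters: without it, a tenant named `acme/other` could produce a namespace that appears to belong to `acme`.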

How are embeddings generated?

Vector embeddings are generated using embedding models managed within Atlan's infrastructure. Atlan uses Cloudflare Workers AI for embedding generation. Embedding requests are routed over TLS 1.2+ and are subject to Atlan's enterprise data processing agreements with Cloudflare. Only metadata is embedded. Atlan doesn't embed or process your actual data.

AI capabilities and infrastructure

Capability	What it processes	Key infrastructure
Atlan AI enrichment	Asset/term metadata, SQL transformations	AI gateway, LLM providers
Context Agents Studio	Asset metadata, lineage, SQL patterns, usage signals	AI gateway, embedding models, vector store, Atlan governed write-back
Conversational AI	User queries, catalog metadata, lineage, glossary, conversation history	AI gateway, LLM providers, vector store, embedding models, AI observability tooling, conversation storage
Atlan MCP / agentic retrieval	User queries, retrieved context, permission-scoped metadata	Conversational AI APIs, retrieval systems, Atlan access control

All capabilities share Atlan's common security controls: tenant isolation, encryption in transit and at rest, no training on customer data, and compliance with Atlan's data processing agreements.

Data handling

What data does Atlan AI send to external services?

Atlan doesn't send the data stored in your sources to any AI service. Only metadata and, for conversational capabilities, the user's query text are sent. By capability:

Asset descriptions: table, view, column, database, or schema name
Term descriptions: glossary name and description, category name and description, and term name
Lineage explanations: SQL transformations with upstream and downstream asset names
Aliases: table, view, column, database, or schema name
Term READMEs: glossary, category, and term name and description, and existing READMEs within the same glossary
Conversational AI: user natural-language query text, retrieved catalog metadata, lineage context, glossary context, and conversation history for the active session
Context Agents Studio: asset metadata including table/column names, lineage relationships, SQL patterns, and glossary terms used to generate descriptions, READMEs, and term linkages
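One way to enforce a metadata-only rule like the one above is an explicit allow-list per capability, so that anything not listed is dropped before a request leaves the tenant. The sketch below assumes hypothetical capability keys and field names; only the idea of sending names and descriptions rather than data comes from the list above.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical field names; each capability lists the only keys it may send.
ALLOWED_FIELDS: Dict[str, FrozenSet[str]] = {
    "asset_description": frozenset({"asset_type", "name"}),
    "term_description": frozenset({
        "glossary_name", "glossary_description",
        "category_name", "category_description",
        "term_name",
    }),
}


def build_payload(capability: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the allow-listed fields for the given capability."""
    try:
        allowed = ALLOWED_FIELDS[capability]
    except KeyError:
        raise ValueError(f"unsupported capability: {capability!r}") from None
    return {key: value for key, value in record.items() if key in allowed}
```

An allow-list fails closed: a new field added to a record is not sent until someone deliberately adds it to the list.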

Conversational AI also uses AI observability tooling for LLM monitoring (see Does Atlan use AI observability tooling? below).

Does Atlan use customer metadata or data to train AI models?

No. Customer metadata, prompts, and outputs processed through Atlan AI capabilities aren't used by Atlan or by Atlan's AI providers to fine-tune or train foundation models.

How's tenant isolation enforced for conversational and retrieval-based AI?

Tenant isolation is enforced at multiple layers:

AI gateway: Each tenant has a unique API key with per-tenant rate limits and budget controls. Requests are authenticated and scoped to the tenant.
Vector store: Each tenant has a dedicated TurboPuffer namespace, encrypted with CMEK. Tenants can't access each other's namespaces.
Retrieval: Context retrieved during conversational AI queries is constrained by the requesting user's Atlan permissions. Users only see metadata they're authorized to access.
Conversation storage: Conversation history is stored per-tenant in the tenant's own database, not in any shared store.
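The retrieval layer above can be pictured as a filter applied to search results before any context reaches the model. This is a simplified sketch: `RetrievedAsset`, `filter_by_permissions`, and the idea of a precomputed set of readable asset names are assumptions, and a real system would evaluate Atlan's access policies rather than a plain set.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class RetrievedAsset:
    qualified_name: str
    score: float  # similarity score from the vector search


def filter_by_permissions(
    candidates: Iterable[RetrievedAsset],
    readable: Set[str],
    limit: int,
) -> List[RetrievedAsset]:
    """Keep only assets the requesting user may read, best matches first."""
    permitted = [c for c in candidates if c.qualified_name in readable]
    permitted.sort(key=lambda c: c.score, reverse=True)
    return permitted[:limit]
```

Filtering happens before truncating to `limit`, so a user with narrow permissions still receives their best permitted matches instead of an empty result.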

What is Atlan AI's data retention policy?

Atlan doesn't retain prompts or responses in the centralized AI control plane. Retention by component:

Prompts and responses: Stored in the tenant's own database for operational purposes.
Conversation history: For conversational AI, multi-turn conversation context is stored per-tenant to support the active session. This storage is tenant-scoped and encrypted.
Vector embeddings: Stored in the tenant's dedicated TurboPuffer namespace, encrypted with CMEK. Embeddings can be selectively removed to support data deletion requests.
AI-generated metadata: Only the metadata generated using Atlan AI (such as descriptions and READMEs) is cataloged in Atlan and marked as AI-generated in the activity log.
LLM provider retention: Atlan's agreements with LLM providers make sure that prompts and responses aren't used for model training or retained beyond the scope of the API request.
AI observability tooling: Conversational AI prompts and completions are retained for 30 days for evaluation and performance monitoring. Data is tagged with a tenant identifier for isolation. This data isn't used for model training.

Does Atlan use AI observability tooling?

Yes. Atlan uses AI observability tooling for LLM evaluation and monitoring across Conversational AI capabilities. The observability tooling's data plane is hosted within Atlan's infrastructure. This enables Atlan to monitor response quality, latency, and cost across AI interactions.

Data captured for each Conversational AI request:

Full prompt content (user query, retrieved catalog metadata, lineage context, conversation history)
Full completion content (AI-generated response)
Tenant identifier, latency, token counts, and model metadata

Retention: Observability data is retained for 30 days. This data isn't used for model training.

Tenant isolation: Data is logically separated by tenant identifier. Because the observability tooling's data plane runs within Atlan's infrastructure, data doesn't leave Atlan's environment.

Scope: All Conversational AI interactions, including those in the Slack and Microsoft Teams integrations, are captured by the AI observability tooling.
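The 30-day retention and tenant tagging described in this section can be sketched as follows. The `TraceRecord` shape and both helper functions are hypothetical; they only illustrate a time-based purge and a tenant-scoped read, not the observability tooling's actual storage model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

RETENTION = timedelta(days=30)  # retention period stated above


@dataclass(frozen=True)
class TraceRecord:
    tenant_id: str
    created_at: datetime
    prompt: str
    completion: str


def purge_expired(records: List[TraceRecord], now: datetime) -> List[TraceRecord]:
    """Return only records created within the retention period."""
    cutoff = now - RETENTION
    return [r for r in records if r.created_at > cutoff]


def for_tenant(records: List[TraceRecord], tenant_id: str) -> List[TraceRecord]:
    # Logical separation: every read is scoped by tenant identifier.
    return [r for r in records if r.tenant_id == tenant_id]
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the purge deterministic and easy to test.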

Encryption

Is data processed through Atlan AI encrypted?

Yes. Data is encrypted both in transit and at rest:

In transit: TLS 1.2 or higher for all communication
At rest: AES-256 encryption
CMEK: Customer Managed Encryption Keys are used for vector store namespaces and tenant-level data
HTTPS: All requests are made over HTTPS from your tenant across all supported cloud platforms
Network isolation: PrivateLink and VPC peering make sure that traffic between tenants and the AI control plane remains within private network boundaries
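For readers who want to see what a TLS 1.2 minimum looks like on the client side, here is a small sketch using Python's standard `ssl` module. The function name is hypothetical, and this shows a generic client configuration rather than how Atlan's services are configured.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client TLS context that refuses anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The resulting context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=make_client_context())`.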

Model management

What AI models does Atlan support?

Atlan supports multiple LLM providers through the centralized AI gateway. Supported providers include Anthropic (Claude), OpenAI (GPT), Google (Gemini), and select open-source models. The specific model used for each capability may vary and is managed centrally by Atlan.

Customers can also bring their own models by deploying a dedicated AI gateway instance within their tenant environment. This keeps model credentials and data entirely within the customer's infrastructure.

How are model API keys managed?

Each tenant is provisioned a unique API key for the AI gateway. This key:

Controls which models the tenant can access
Enforces per-tenant budget and rate limits
Is managed centrally and rotated according to Atlan's key management policies

For customers who bring their own models, API keys are managed within the customer's own environment and never shared with Atlan.
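The role of a tenant key as a policy object can be illustrated with a short sketch. Everything here is an assumption made for the example: the `GatewayKey` class, its fields, and the rotation check with a caller-supplied maximum age. The rotation interval itself is not stated in this document.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet


@dataclass(frozen=True)
class GatewayKey:
    """Hypothetical policy attached to one tenant's gateway key."""
    tenant_id: str
    allowed_models: FrozenSet[str]
    issued_at: datetime

    def permits(self, model: str) -> bool:
        # The key controls which models the tenant can access.
        return model in self.allowed_models

    def needs_rotation(self, now: datetime, max_age: timedelta) -> bool:
        # max_age is supplied by the caller; no interval is assumed here.
        return now - self.issued_at >= max_age
```

Keeping the policy on the key means a single lookup at the gateway answers both "who is this?" and "what may they call?".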

AI feature security

How's conversational AI secured?

Conversational AI lets users ask natural-language questions about their metadata. Security controls include:

Access control: Conversational AI respects Atlan's existing access controls and permission policies. Users only see metadata they're authorized to view.
Feature gating: Enablement is controlled via feature flags and requires explicit admin approval.
Data isolation: Each tenant's data is isolated, and AI model access is managed through tenant-level AI gateway keys.
Audit trail: All changes made through conversational AI are marked as "Updated using Atlan AI" in the activity log.
MCP action controls: If MCP actions are enabled in conversational AI, admins control which users or groups can access them.

How's Context Agents Studio secured?

Context Agents Studio automates metadata enrichment using specialized AI agents. Security controls include:

No overwrites: AI agents only enrich assets that are missing the target metadata attribute. Existing values are never overwritten.
Activity logging: All enrichment activity is logged, including when generation was triggered, how many assets were updated, and by whom.
Admin control: Context Agents Studio requires Atlan Lakehouse to be enabled and is accessible from the Governance Center.
Cost awareness: Each agent run consumes AI credits, and credit usage is tracked per tenant.
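The no-overwrite and activity-logging rules above amount to a simple guard around the write path. The sketch below assumes assets are plain dictionaries and that `generate` stands in for the AI agent; both are simplifications, and the log entry fields are hypothetical names for the three facts the list says are recorded.

```python
from datetime import datetime
from typing import Callable, Dict, List, Optional

Asset = Dict[str, Optional[str]]


def enrich_missing(
    assets: List[Asset],
    attribute: str,
    generate: Callable[[Asset], str],
    triggered_by: str,
    triggered_at: datetime,
    activity_log: List[dict],
) -> int:
    """Fill `attribute` only where it is missing or empty; never overwrite."""
    updated = 0
    for asset in assets:
        if asset.get(attribute):
            continue  # an existing value is left untouched
        asset[attribute] = generate(asset)
        updated += 1
    # Record when the run was triggered, by whom, and how many assets changed.
    activity_log.append({
        "attribute": attribute,
        "triggered_at": triggered_at,
        "triggered_by": triggered_by,
        "updated_count": updated,
    })
    return updated
```

Note that this sketch treats an empty string the same as a missing value; whether that matches the product's behavior isn't stated here.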

How's Atlan's MCP server secured?

The Atlan MCP server provides a secure bridge between Atlan's metadata platform and external AI tools. Security controls include:

Authentication: The Remote MCP server uses the same authentication and authorization policies already configured in Atlan. Users authenticate with their Atlan credentials.
Per-tenant isolation: Each tenant has its own hosted MCP server instance. There's no shared state between tenants.
Permission enforcement: All MCP tool calls respect Atlan's access control policies. Users can only search, view, or update metadata they're authorized to access.
Admin controls: Admins control which users or groups can access MCP capabilities, including which tools are available in conversational AI.

Compliance

Does Atlan AI comply with any governance or legal frameworks?

Yes. Atlan AI operates within Atlan's established security, privacy, and compliance programs. Atlan is fully compliant with major data protection frameworks, including:

HIPAA (Health Insurance Portability and Accountability Act)
GDPR (General Data Protection Regulation)
CCPA/CPRA (California Consumer Privacy Act / California Privacy Rights Act)

These frameworks provide safeguards around the collection, processing, and handling of sensitive and personal data, including data used by AI features.

The multi-region AI gateway architecture supports regional data residency requirements. Tenant data is processed and stored in the region closest to the tenant (the United States, EU, or APAC), and customers can request selective data removal to comply with data deletion obligations.

Regular security and privacy assessments are conducted across the platform, including new AI features, to maintain continued compliance and risk mitigation. AI development processes are governed by internal policies that align with emerging standards around AI transparency, fairness, and accountability.

For detailed compliance information, certifications, audit reports, and security documentation, see the Atlan Trust Portal.

Does Atlan AI process PII or other sensitive data?

Atlan AI processes user input and metadata, which typically doesn't contain PII or sensitive data. Organizations are responsible for making sure that PII or sensitive data isn't available in metadata or shared via user input.

The AI gateway includes guardrails for prompt injection prevention and PII detection at the gateway level to provide an additional layer of protection.
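As a rough picture of what gateway-level PII detection can involve, the sketch below redacts two easily recognized patterns before a prompt is forwarded. It is not Atlan's guardrail: the patterns, labels, and `redact_pii` function are hypothetical, and production detectors typically cover far more categories than two regular expressions.

```python
import re
from typing import List, Tuple

# Illustrative patterns only; real detectors are much broader.
_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact_pii(text: str) -> Tuple[str, List[str]]:
    """Replace matches with a label and report which categories were found."""
    found: List[str] = []
    for label, pattern in _PATTERNS.items():
        if pattern.search(text):
            found.append(label)
            text = pattern.sub(f"[{label}]", text)
    return text, found
```

Returning the detected categories alongside the redacted text lets the caller decide whether to forward the prompt, block it, or log the event.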

Development and operations

How does Atlan manage secure development of Atlan AI?

Atlan AI development follows OWASP Top 10 security practices, including application security reviews and the use of Static Application Security Testing (SAST) tools.

How does Atlan manage security vulnerabilities for Atlan AI?

Vulnerabilities and incidents affecting Atlan AI are managed in accordance with Atlan's existing vulnerability and incident management program and policies.

How does Atlan manage performance and scale for Atlan AI?

Atlan AI leverages the scalability of its cloud infrastructure and the centralized AI gateway. The multi-region gateway architecture, combined with per-tenant rate limiting and budget controls, ensures consistent performance across tenants. The vector store autoscales to handle embedding and search workloads.
