Data handling (Private Preview)
This page covers where your data goes during each stage of Context Engineering Studio. Use it for internal security reviews and governance assessments.
Data flows
Each row below maps a CES operation to where the data travels and where it stays.
| Flow | Where your data goes |
|---|---|
| Asset discovery | Atlan catalog only. No warehouse query runs. |
| Column metadata fetch | Warehouse INFORMATION_SCHEMA → Atlan backend → your browser. |
| AI model generation | Catalog metadata → Atlan AI → structured suggestions back to your browser. Atlan AI generates synthetic examples rather than consuming real warehouse rows. |
| Snowflake chat and simulation | Question + YAML → Cortex Analyst (in-account) → SQL + results. Judge runs in Snowflake via SNOWFLAKE.CORTEX.COMPLETE(). Result data stays in Snowflake. |
| Databricks chat | Question → Genie Space (in-workspace) → SQL + results. Result data stays in Databricks. |
| Databricks simulation judge | Natural-language question + generated SQL + verified SQL → Atlan AI → semantic-equivalence verdict. |
| Deployment | DDL + YAML → Snowflake (in-account via SYSTEM$CREATE_SEMANTIC_VIEW_FROM_YAML) or Databricks (in-workspace via CREATE OR REPLACE VIEW ... WITH METRICS LANGUAGE YAML plus Genie Space API calls). |
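For a security review, it can help to encode the table above as data and ask which flows touch a given system. The following is a minimal sketch, not part of CES: the `FLOW_SYSTEMS` mapping and the `flows_touching` helper are hypothetical names, and the system labels are simplified from the table rows.

```python
# Hypothetical encoding of the data-flow table; not a CES API.
# Each flow maps to the set of systems its data touches, per the table.
FLOW_SYSTEMS = {
    "Asset discovery": {"Atlan catalog"},
    "Column metadata fetch": {"Warehouse", "Atlan backend", "Browser"},
    "AI model generation": {"Atlan catalog", "Atlan AI", "Browser"},
    "Snowflake chat and simulation": {"Snowflake"},
    "Databricks chat": {"Databricks"},
    "Databricks simulation judge": {"Atlan AI"},
    "Deployment": {"Snowflake", "Databricks"},
}


def flows_touching(system: str) -> list[str]:
    """Return the names of all flows whose data touches the given system, sorted."""
    return sorted(name for name, systems in FLOW_SYSTEMS.items() if system in systems)


if __name__ == "__main__":
    # Which flows send anything to Atlan AI?
    print(flows_touching("Atlan AI"))
```

Under this encoding, only AI model generation and the Databricks simulation judge involve Atlan AI, which matches the table.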
Data residency
Where data resides and how it moves at each stage, per engine.
- Snowflake
  - Chat and Simulate: question text + YAML travel to Cortex Analyst, which runs inside your Snowflake account. SQL generation, execution, and the simulation judge all happen in-account. No query result data leaves Snowflake.
  - Deployment: CES calls SYSTEM$CREATE_SEMANTIC_VIEW_FROM_YAML in-account. The deployed Semantic View lives in the database and schema you chose.
  - Observability: CES reads production traces from SNOWFLAKE.LOCAL.CORTEX_ANALYST_REQUESTS_V, a built-in system view, and syncs them into the CES Observe tab.
- Databricks
  - Chat: question text travels to your Genie Space in your workspace. SQL generation and execution happen inside Databricks. No query result data leaves Databricks during chat.
  - Simulate: questions run through the Genie Space the same way. The LLM judge for semantic-equivalence verdicts runs through Atlan AI, which receives the natural-language question and both SQL statements (generated and verified). No warehouse row data is sent.
  - Deployment: Metric Views and the companion Genie Space are created entirely inside your workspace. Atlan writes Unity Catalog table and column comments in place so descriptions surface in Catalog Explorer.
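The Databricks judge flow described above sends exactly three things to Atlan AI: the question, the generated SQL, and the verified SQL. The sketch below illustrates that boundary with an allowlist. It is an assumption-laden illustration rather than the actual CES implementation: the field names, `JUDGE_FIELDS`, and `build_judge_payload` are all hypothetical.

```python
# Hypothetical sketch of the judge payload boundary; not the CES implementation.
# Only the three fields named in the text are allowed through.
JUDGE_FIELDS = ("question", "generated_sql", "verified_sql")


def build_judge_payload(simulation_record: dict) -> dict:
    """Copy only the allowlisted fields, dropping anything else (such as result rows)."""
    missing = [field for field in JUDGE_FIELDS if field not in simulation_record]
    if missing:
        raise ValueError(f"missing judge fields: {missing}")
    return {field: simulation_record[field] for field in JUDGE_FIELDS}


if __name__ == "__main__":
    record = {
        "question": "What was total revenue last month?",
        "generated_sql": "SELECT SUM(amount) FROM orders",
        "verified_sql": "SELECT SUM(o.amount) FROM orders o",
        "result_rows": [(1234.5,)],  # stays in the warehouse; never copied
    }
    print(build_judge_payload(record))
```

An allowlist is used here instead of a blocklist so that any field added to the record later is excluded by default.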
Authentication
CES authenticates to each engine using the credentials configured on your existing Atlan connection.
- Snowflake: Atlan authenticates via the service account role on your existing Snowflake connection (Personal Access Token or RSA keypair).
- Databricks: Atlan authenticates via Service Principal + OAuth M2M on your existing Databricks connection. PAT is supported as a fallback but isn't recommended for production.
See Grant Snowflake permissions and Grant Databricks permissions for setup details.
Atlan AI security model
Atlan AI operates under a metadata-only security model:
- Atlan AI doesn't have direct query access to your warehouse and can't issue arbitrary queries on your tables.
- Atlan AI operates on metadata (schemas, column names and types, catalog descriptions, glossary terms, lineage, and query-history patterns), not row data.
- Where sample examples are needed, Atlan AI generates synthetic examples based on sample-value characteristics already captured in the catalog. Actual rows from your source tables aren't sent to Atlan AI.
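To make the last point concrete, the sketch below generates synthetic values from summary characteristics alone, without reading any source rows. This page does not describe how Atlan AI actually synthesizes examples, so everything here is an assumption for illustration: the profile shape (`kind`, `min`, `max`, `prefix`, `digits`) and the `synthesize_examples` function are hypothetical.

```python
import random


def synthesize_examples(profile: dict, n: int = 3, seed: int = 0) -> list:
    """Generate n synthetic values from a hypothetical column profile.

    The profile holds only summary characteristics (range or string shape),
    never actual rows from the source table.
    """
    rng = random.Random(seed)  # seeded so output is reproducible
    if profile["kind"] == "numeric":
        low, high = profile["min"], profile["max"]
        return [round(rng.uniform(low, high), 2) for _ in range(n)]
    if profile["kind"] == "string":
        # Build identifiers shaped like the profile: a fixed prefix plus random digits.
        return [
            profile["prefix"] + "".join(rng.choice("0123456789") for _ in range(profile["digits"]))
            for _ in range(n)
        ]
    raise ValueError(f"unsupported profile kind: {profile['kind']!r}")


if __name__ == "__main__":
    print(synthesize_examples({"kind": "numeric", "min": 0, "max": 100}))
    print(synthesize_examples({"kind": "string", "prefix": "ORD-", "digits": 5}))
```

The generated values share the shape of the column (range, prefix, length) but are not drawn from it.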
Security review reference
Common vendor security and governance questions, mapped to CES-specific answers.
| Concern | Detail |
|---|---|
| Atlan AI access scope | Metadata-only. Atlan AI has no direct query access to your warehouse and doesn't consume row data. Sample examples are synthesized from catalog metadata. |
| Result data during simulation judging | Snowflake keeps result data in-account via SNOWFLAKE.CORTEX.COMPLETE(). On Databricks, Atlan AI receives only the natural-language question and both SQL statements. No warehouse row data is sent. |
| Data stored by Atlan | CES persists session state only (selected assets, question sets, evaluation results, current YAML), in object storage managed by Atlan's infrastructure layer. No warehouse query results are stored. |
| Deployed artifacts | All deployed Semantic Views (Snowflake), Metric Views, and Genie Spaces (Databricks) live inside your account, under your governance. Atlan holds no copy. |
For a signed data-handling attestation, contact Atlan support.
See also
- Grant Snowflake permissions: service role setup for Snowflake.
- Grant Databricks permissions: service principal setup for Databricks.
- Simulation diagnostics: how simulation judging works per engine.