Skip to main content

Set up Snowflake Public Preview

This guide walks through configuring Snowflake to work with Atlan's data quality studio by creating the required roles, setting up database objects, and granting the necessary privileges.

System requirements

Before setting up the integration, make sure you meet the following requirements:

  • Snowflake Enterprise or Business Critical edition
  • Dedicated Snowflake warehouse for running DQ-related queries

Prerequisites

Before you begin, complete the following steps:

  • Obtain ACCOUNTADMIN role or equivalent administrative privileges in Snowflake
  • Identify your dedicated warehouse name for DQ operations
  • Have access to create a new Snowflake user for Atlan
  • Review Data Quality permissions to understand required privileges

Create roles

Create two roles for the integration:

-- Create DQ Admin Role
CREATE ROLE IF NOT EXISTS dq_admin;
GRANT OPERATE, USAGE ON WAREHOUSE "<warehouse-name>" TO ROLE dq_admin;

-- Create Atlan Service Role
CREATE ROLE IF NOT EXISTS atlan_dq_service_role;
GRANT OPERATE, USAGE ON WAREHOUSE "<warehouse-name>" TO ROLE atlan_dq_service_role;

Create user

Create a dedicated Snowflake user for Atlan following your organization's standards, then grant the service role:

GRANT ROLE atlan_dq_service_role TO USER <atlan_dq_user>;

Set up database objects

Create the database, schemas, and stored procedure required for Atlan data quality operations.

  1. Create the required database objects:

    -- Create database
    CREATE DATABASE ATLAN_DQ;

    -- Create schemas
    CREATE SCHEMA ATLAN_DQ.SHARED;

    The ATLAN_DQ database serves as a container for all objects related to Atlan Data Quality. The ATLAN_DQ.SHARED schema provides a separate namespace for shared procedures and functions.

  2. Create stored procedure for DMF schedule management:

    View procedure code
    /**
    * Updates the Data Metric Function (DMF) schedule for a specified Snowflake entity.
    *
    * This secure procedure validates inputs, verifies entity existence, and applies
    * the appropriate DMF schedule configuration (MINUTES, CRON, ON_DATA_CHANGE, NOT_SCHEDULED).
    * It enforces strict validation for:
    * - Entity type (TABLE, VIEW, MATERIALIZED VIEW, EXTERNAL TABLE, ICEBERG TABLE)
    * - Schedule type and required values (including CRON format and timezone validity)
    * - Supported minute intervals for MINUTES-based schedules
    *
    * The procedure executes with OWNER privileges to ensure proper permissions and
    * returns a structured JSON response indicating success or failure with details.
    *
    * @param {string} ENTITY_TYPE - Type of entity (TABLE, VIEW, MATERIALIZED VIEW, EXTERNAL TABLE, ICEBERG TABLE)
    * @param {string} ENTITY_NAME - Fully qualified name of the entity (database.schema.name)
    * @param {string} SCHEDULE_TYPE - Schedule type (MINUTES, CRON, ON_DATA_CHANGE, NOT_SCHEDULED)
    * @param {string} [SCHEDULE_VALUE=null] - Schedule value based on type (minutes or CRON expression)
    * @returns {string} - JSON string with:
    * - isSuccessful: Boolean indicating operation success
    * - message: Success or error details
    */
    CREATE OR REPLACE SECURE PROCEDURE ATLAN_DQ.SHARED.MANAGE_DMF_SCHEDULE(
    ENTITY_TYPE STRING,
    ENTITY_NAME STRING,
    SCHEDULE_TYPE STRING,
    SCHEDULE_VALUE STRING DEFAULT NULL
    )
    RETURNS STRING
    LANGUAGE JAVASCRIPT
    EXECUTE AS OWNER
    AS
    $$
    // -----------------------------------------------------Constants-----------------------------------------------------
    const VALID_ENTITY_TYPES = new Set([
    "TABLE",
    "VIEW",
    "MATERIALIZED VIEW",
    "EXTERNAL TABLE",
    "ICEBERG TABLE",
    ]);
    const VALID_SCHEDULE_TYPES = new Set([
    "MINUTES",
    "CRON",
    "ON_DATA_CHANGE",
    "NOT_SCHEDULED",
    ]);
    const MINUTES_ALLOWED = new Set(["5", "15", "30", "60", "720", "1440"]);
    const SCHEDULE_TYPES_THAT_REQUIRE_VALUE = new Set(["MINUTES", "CRON"]);
    // -----------------------------------------------------UTILITY FUNCTIONS-----------------------------------------------------
    /**
    * Executes a SQL query with optional bind parameters
    * Returns an object with execution status and results/error details
    * @param {string} sqlText - SQL statement to execute
    * @param {Array} [binds=[]] - Array of bind parameters for parameterized queries
    * @returns {Object} Object containing:
    * - isErrored: boolean indicating if execution failed
    * - message: error details if failed, empty string if successful
    * - result: query result object if successful, null if failed
    */
    function executeQuery(sqlText, binds = []) {
    try {
    if (!sqlText) throw new Error("SQL Text is required");
    const statement = snowflake.createStatement({ sqlText, binds });
    const result = statement.execute();
    return {
    isErrored: false,
    message: "",
    result,
    };
    } catch (err) {
    const errorMessage = `${err.code ?? ''} - ${err.message} - ${sqlText} with binds: ${binds.join(", ")}`;
    return {
    isErrored: true,
    message: errorMessage.trim(),
    result: null,
    };
    }
    }
    // -----------------------------------------------------VALIDATION Helpers-----------------------------------------------------
    /**
    * Validates if a timezone string is recognized by Snowflake
    * Uses CONVERT_TIMEZONE to test validity without side effects
    * @param {string} timezone - Timezone to validate (e.g., 'America/New_York', 'UTC')
    * @returns {boolean} True if timezone is valid, false otherwise
    */
    function isTimezoneValid(timezone) {
    const result = executeQuery(
    `SELECT CONVERT_TIMEZONE(?, CURRENT_TIMESTAMP())`,
    [timezone]
    );
    return !result.isErrored;
    }
    /**
    * Validates if a string represents a number within specified bounds
    * @param {string} value - String value to parse and validate
    * @param {number} min - Minimum allowed value (inclusive)
    * @param {number} max - Maximum allowed value (inclusive)
    * @returns {boolean} True if value is a valid number within range
    */
    function isNumberInRange(value, min, max) {
    const num = parseInt(value, 10);
    return !isNaN(num) && num >= min && num <= max;
    }
    /**
    * Validates Snowflake CRON expression format and component values
    * Expected format: minute hour dayOfMonth month dayOfWeek timezone
    * Example: "0 9 * * MON America/New_York" (Every Monday at 9 AM EST)
    * @param {string} cronExpression - CRON expression to validate
    * @throws {Error} If any component of the CRON expression is invalid
    */
    function validateCronExpression(cronExpression) {
    if (cronExpression.length > 100)
    throw new Error("Cron expression is too long");
    const cronFields = cronExpression.trim().split(/\s+/);
    if (cronFields.length !== 6)
    throw new Error("Invalid cron expression. Expected 6 fields");
    const [minute, hour, dayOfMonth, month, dayOfWeek, timezone] = cronFields;
    const isTimezoneValidResult = isTimezoneValid(timezone);
    if (!isTimezoneValidResult)
    throw new Error("Invalid timezone provided in the cron expression");
    const regexPatterns = {
    minute: /^(\*|\d+|\*\/\d+|\d+\-\d+|\d+(,\d+)*)$/,
    hour: /^(\*|\d+|\*\/\d+|\d+\-\d+|\d+(,\d+)*)$/,
    dayOfMonth: /^(\*|L|\d+|\*\/\d+|\d+\-\d+|\d+(,\d+)*)$/,
    month:
    /^(\*|\d+|JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC|\*\/\d+|\d+\-\d+|[A-Z]{3}\-[A-Z]{3}|\d+(,\d+)*|([A-Z]{3}(,[A-Z]{3})*))$/i,
    dayOfWeek:
    /^(\*|\d+|SUN|MON|TUE|WED|THU|FRI|SAT|\d+L|[A-Z]{3}L|\*\/\d+|\d+\-\d+|[A-Z]{3}\-[A-Z]{3}|\d+(,\d+)*|([A-Z]{3}(,[A-Z]{3})*))$/i,
    };
    if (minute.match(/^\d+$/))
    if (!isNumberInRange(minute, 0, 59))
    throw new Error("Invalid minute value");
    if (hour.match(/^\d+$/))
    if (!isNumberInRange(hour, 0, 23)) throw new Error("Invalid hour value");
    if (dayOfMonth.match(/^\d+$/))
    if (!isNumberInRange(dayOfMonth, 1, 31))
    throw new Error("Invalid day of month value");
    if (month.match(/^\d+$/))
    if (!isNumberInRange(month, 1, 12)) throw new Error("Invalid month value");
    if (dayOfWeek.match(/^\d+$/))
    if (!isNumberInRange(dayOfWeek, 0, 6))
    throw new Error("Invalid day of week value");
    if (
    !regexPatterns.minute.test(minute) ||
    !regexPatterns.hour.test(hour) ||
    !regexPatterns.dayOfMonth.test(dayOfMonth) ||
    !regexPatterns.month.test(month) ||
    !regexPatterns.dayOfWeek.test(dayOfWeek)
    )
    throw new Error("Invalid cron expression");
    }
    // -----------------------------------------------------VALIDATION Functions-----------------------------------------------------
    /**
    * Verifies that the specified entity exists and is accessible
    * Uses SHOW COLUMNS to check entity accessibility without modifying data
    * @param {string} entityName - Fully qualified entity name to verify
    * @throws {Error} If entity doesn't exist or isn't accessible
    */
    function verifyEntityExists(entityName) {
    const query = `SHOW COLUMNS IN IDENTIFIER(?)`;
    const result = executeQuery(query, [entityName]);
    if (result.isErrored) throw new Error(result.message);
    }
    /**
    * Main validation orchestrator - validates all input parameters
    * @param {string} entityType - Type of database entity
    * @param {string} entityName - Fully qualified entity name
    * @param {string} scheduleType - Type of schedule to set
    * @param {string} scheduleValue - Schedule configuration value
    * @throws {Error} If any validation fails
    */
    function validateArguments(
    entityType,
    entityName,
    scheduleType,
    scheduleValue
    ) {
    if (!entityType) throw new Error("ENTITY_TYPE cannot be empty");
    if (!entityName) throw new Error("ENTITY_NAME cannot be empty");
    if (!scheduleType) throw new Error("SCHEDULE_TYPE cannot be empty");
    if (!VALID_ENTITY_TYPES.has(entityType))
    throw new Error(
    `Invalid ENTITY_TYPE: "${entityType}". Valid options are ${Array.from(
    VALID_ENTITY_TYPES
    ).join(", ")}`
    );
    const parts = entityName
    .split(".")
    .map((part) => part.trim())
    .filter(Boolean);
    if (parts.length !== 3)
    throw new Error(
    `Invalid fully qualified name: ${entityName}. Expected format: database.schema.name`
    );
    if (!VALID_SCHEDULE_TYPES.has(scheduleType))
    throw new Error(
    `Invalid schedule type: "${scheduleType}". Valid options are ${Array.from(
    VALID_SCHEDULE_TYPES
    ).join(", ")}`
    );
    if (SCHEDULE_TYPES_THAT_REQUIRE_VALUE.has(scheduleType) && !scheduleValue)
    throw new Error("SCHEDULE_VALUE is required");
    verifyEntityExists(entityName);
    if (scheduleType === "MINUTES" && !MINUTES_ALLOWED.has(scheduleValue))
    throw new Error(
    `Invalid SCHEDULE_VALUE for MINUTES. Valid options are ${Array.from(
    MINUTES_ALLOWED
    ).join(", ")}`
    );
    if (scheduleType === "CRON") validateCronExpression(scheduleValue);
    }
    // -----------------------------------------------------MAIN FUNCTION-----------------------------------------------------
    /**
    * Main entry point - manages DMF schedule configuration
    * Validates inputs, constructs appropriate SQL, and executes the schedule change
    * @returns {string} JSON response with operation status and message
    */
    function main() {
    const entityType = ENTITY_TYPE.trim();
    const entityName = ENTITY_NAME.trim();
    const scheduleType = SCHEDULE_TYPE.trim();
    const scheduleValue = SCHEDULE_VALUE?.trim() ?? null;
    validateArguments(entityType, entityName, scheduleType, scheduleValue);
    const SQL_TEMPLATES = {
    MINUTES: `ALTER ${entityType} IDENTIFIER(?) SET DATA_METRIC_SCHEDULE = '${scheduleValue} MINUTE'`,
    CRON: `ALTER ${entityType} IDENTIFIER(?) SET DATA_METRIC_SCHEDULE = 'USING CRON ${scheduleValue}'`,
    ON_DATA_CHANGE: `ALTER ${entityType} IDENTIFIER(?) SET DATA_METRIC_SCHEDULE = 'TRIGGER_ON_CHANGES'`,
    NOT_SCHEDULED: `ALTER ${entityType} IDENTIFIER(?) UNSET DATA_METRIC_SCHEDULE`,
    };
    const sqlText = SQL_TEMPLATES[scheduleType];
    const returnMessage = `Successfully updated schedule for ${entityType} ${entityName} to ${scheduleType} ${scheduleValue ?? ''}`;
    const result = executeQuery(sqlText, [entityName]);
    return JSON.stringify({
    isSuccessful: !result.isErrored,
    message: result.isErrored ? result.message : returnMessage.trim(),
    });
    }
    try {
    return main();
    } catch (err) {
    return JSON.stringify({
    isSuccessful: false,
    message: err.message,
    });
    }
    $$;
  3. Transfer ownership to dq_admin role:

    GRANT OWNERSHIP ON DATABASE ATLAN_DQ TO ROLE dq_admin;
    GRANT OWNERSHIP ON SCHEMA ATLAN_DQ.SHARED TO ROLE dq_admin;
    GRANT OWNERSHIP ON PROCEDURE ATLAN_DQ.SHARED.MANAGE_DMF_SCHEDULE(
    STRING, STRING, STRING, STRING
    ) TO ROLE dq_admin;

Grant privileges

Grant the necessary permissions to enable data quality operations and maintain proper access control.

  1. System privileges: Grant Snowflake system-level permissions to enable data metric functions and monitoring capabilities.

    -- For Atlan Service Role
    GRANT APPLICATION ROLE SNOWFLAKE.DATA_QUALITY_MONITORING_VIEWER TO ROLE atlan_dq_service_role;
    GRANT EXECUTE DATA METRIC FUNCTION ON ACCOUNT TO ROLE atlan_dq_service_role;
    GRANT DATABASE ROLE SNOWFLAKE.DATA_METRIC_USER TO ROLE atlan_dq_service_role;
    GRANT EXECUTE TASK ON ACCOUNT TO ROLE atlan_dq_service_role;
    GRANT EXECUTE MANAGED TASK ON ACCOUNT TO ROLE atlan_dq_service_role;
  2. Table owner privileges: For every role that owns tables in your environment (denoted by <table_owner>), grant the following privileges:

    GRANT ROLE <table_owner> TO ROLE dq_admin;
    To identify table owner roles in your environment:
    -- Find table owners
    SELECT TABLE_CATALOG, TABLE_OWNER
    FROM SNOWFLAKE.ACCOUNT_USAGE.TABLES
    WHERE DELETED IS NULL AND TABLE_OWNER IS NOT NULL
    GROUP BY TABLE_CATALOG, TABLE_OWNER;
  3. DQ Database access: Grant Atlan's service role access to the created objects:

    GRANT USAGE ON DATABASE ATLAN_DQ TO ROLE atlan_dq_service_role;
    GRANT USAGE ON SCHEMA ATLAN_DQ.SHARED TO ROLE atlan_dq_service_role;
    GRANT USAGE ON PROCEDURE ATLAN_DQ.SHARED.MANAGE_DMF_SCHEDULE(
    STRING, STRING, STRING, STRING
    ) TO ROLE atlan_dq_service_role;
    GRANT CREATE SCHEMA ON DATABASE ATLAN_DQ TO ROLE atlan_dq_service_role;
  4. Read access to tables and views: Grant access to the service role to attach rules to tables. Replace <database-name> and <schema-name> with your own object names.

    1. Database and schema usage: Enables Atlan to discover and reference objects.

      GRANT USAGE ON DATABASE <database-name> TO ROLE atlan_dq_service_role;
      GRANT USAGE ON ALL SCHEMAS IN DATABASE <database-name> TO ROLE atlan_dq_service_role;
    2. Table and view read access: Required to attach and execute rules on these objects.

      GRANT SELECT ON ALL TABLES IN DATABASE <database-name> TO ROLE atlan_dq_service_role;
      GRANT SELECT ON ALL VIEWS IN DATABASE <database-name> TO ROLE atlan_dq_service_role;
    3. Set up future grants to retain grants when objects are recreated and apply them to new objects automatically:

      GRANT USAGE  ON FUTURE SCHEMAS  IN DATABASE <database-name> TO ROLE atlan_dq_service_role;
      GRANT SELECT ON FUTURE TABLES IN DATABASE <database-name> TO ROLE atlan_dq_service_role;
      GRANT SELECT ON FUTURE VIEWS IN DATABASE <database-name> TO ROLE atlan_dq_service_role;

      If you define future grants at both database and schema levels, schema-level grants take precedence. For more information, see the Snowflake documentation about Considerations for future grants.

      Schema-level future grants

      If you use schema-level future grants, grant the following to the service role:

      GRANT SELECT ON FUTURE TABLES IN SCHEMA <database-name>.<schema-name> TO ROLE atlan_dq_service_role;
      GRANT SELECT ON FUTURE VIEWS IN SCHEMA <database-name>.<schema-name> TO ROLE atlan_dq_service_role;

Next steps

Need help

If you have questions or need assistance with setting up Snowflake for data quality, reach out to Atlan Support by submitting a support request.

See also