Most recent day scans

Large analytic tables often accumulate years of historical data, so scanning the full table on every quality check is slow and computationally expensive. Most recent day scans limit rule evaluation to the trailing 24 hours of data, measured back from the latest timestamp in the table. Monitoring stays focused on recently ingested data and skips unnecessary scans of historical partitions.

Key concepts

Most recent day scans are built around three key concepts that determine how the filtering works:

🕐

Row creation timestamp column

A column in your table that stores the time when each row was inserted or updated

Determines which rows are considered "recent"
Shared at the table level across all rules
Changing the column for one rule automatically updates it for every rule on the table

🔄

Rolling 24-hour window

The filter dynamically adjusts to always include the most recent 24 hours relative to the latest data

Automatically shifts forward as new data arrives
Ensures you always monitor the freshest data
No manual configuration updates required
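As a rough illustration of the rolling window, the sketch below computes the 24-hour window from a list of row timestamps and shows it moving forward when a newer row arrives. The function names and sample timestamps are hypothetical and only model the concept in plain Python; they are not part of Atlan.

```python
from datetime import datetime, timedelta


def rolling_window(timestamps, hours=24):
    """Return (start, end) of the window that ends at the latest timestamp."""
    end = max(timestamps)
    start = end - timedelta(hours=hours)
    return start, end


def rows_in_window(timestamps, hours=24):
    """Keep only timestamps strictly later than the window start."""
    start, _ = rolling_window(timestamps, hours)
    return [ts for ts in timestamps if ts > start]


if __name__ == "__main__":
    rows = [
        datetime(2024, 1, 1, 9, 0),
        datetime(2024, 1, 2, 10, 0),
        datetime(2024, 1, 3, 9, 30),
    ]
    print(rolling_window(rows))

    # A newer row arrives: the window shifts forward with no configuration change.
    rows.append(datetime(2024, 1, 4, 8, 0))
    print(rolling_window(rows))
```

The window is derived from the data on every call, which is why no manual update is needed as new rows land.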

⏱️

Schedule independence

The rule's execution schedule remains unchanged while only the data slice becomes smaller

Execution schedule stays the same
Only the evaluated data slice changes
Faster execution without altering triggers
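To make schedule independence concrete, the sketch below contrasts a window anchored to the latest row with the time at which a run happens to fire. The function name and dates are hypothetical; the point is only that the run time does not enter the calculation.

```python
from datetime import datetime, timedelta


def scanned_window(latest_row_ts, run_time):
    """Return the (start, end) window a run would evaluate.

    run_time is accepted only to show that it plays no part:
    the window is anchored to the data, not to the clock.
    """
    return latest_row_ts - timedelta(hours=24), latest_row_ts


if __name__ == "__main__":
    latest = datetime(2024, 3, 8, 17, 0)          # latest row in the table
    saturday_run = datetime(2024, 3, 9, 6, 0)     # one scheduled run
    monday_run = datetime(2024, 3, 11, 6, 0)      # a later scheduled run
    # Both runs evaluate the same slice as long as no newer rows arrive.
    print(scanned_window(latest, saturday_run) == scanned_window(latest, monday_run))
```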

How it works

When enabled, Atlan automatically adds a WHERE clause to the rule query that filters rows based on a timestamp column you specify. The filter selects rows where the timestamp is within the last 24 hours of the maximum timestamp value in the table.

The filter takes effect as soon as you save the rule with the toggle enabled. No workflow changes are required. Once the filter is active, run history shows reduced row counts and shorter execution times. The rule schedule stays the same: the filter only limits which rows qualify during each run.
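The effect of the added filter can be tried locally. The sketch below uses an in-memory SQLite database as a stand-in for a warehouse, with a hypothetical orders table, created_at column, and null-count rule. How Atlan combines the filter with a rule's existing conditions is an assumption here (a simple AND), and the SQLite datetime() call replaces the platform-specific functions.

```python
import sqlite3


def make_demo_connection():
    """Create an in-memory table with one old row and three recent rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, created_at TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (1, None, "2024-03-01 12:00:00"),  # old row with a null amount
            (2, 10.0, "2024-03-07 18:00:00"),
            (3, None, "2024-03-08 09:00:00"),  # recent row with a null amount
            (4, 5.0, "2024-03-08 17:00:00"),   # latest row in the table
        ],
    )
    return conn


def count_null_amounts(conn, recent_only):
    """Hypothetical rule: count rows whose amount is NULL."""
    query = "SELECT COUNT(*) FROM orders WHERE amount IS NULL"
    if recent_only:
        # Assumed shape of the injected filter, written in SQLite syntax.
        query += (
            " AND created_at > "
            "(SELECT datetime(MAX(created_at), '-24 hours') FROM orders)"
        )
    return conn.execute(query).fetchone()[0]


if __name__ == "__main__":
    conn = make_demo_connection()
    print("full table:", count_null_amounts(conn, recent_only=False))
    print("recent day:", count_null_amounts(conn, recent_only=True))
```

With the filter on, the old row falls outside the window, so the rule evaluates fewer rows and reports only the recent null.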

The execution follows this logic:

Timestamp identification: Atlan identifies the maximum timestamp value in your selected row creation timestamp column across the entire table.
Window calculation: The system calculates the 24-hour window ending at that maximum timestamp value.

Filter application: A WHERE clause is automatically injected into the rule's SQL query, filtering to include only rows within that 24-hour window. The filtering logic varies by platform:

BigQuery:

timestamp_col > (SELECT TIMESTAMP_SUB(MAX(timestamp_col), INTERVAL 24 HOUR) FROM table_name)

Databricks:

timestamp_col > ((SELECT MAX(timestamp_col) FROM table_name) - INTERVAL 24 HOURS)

Snowflake:

timestamp_col > (SELECT DATEADD(hour, -24, MAX(timestamp_col)) FROM table_name)

Rule execution: The rule runs on this filtered dataset on the same schedule you configured. Only the data slice becomes smaller.

For example, if your table's most recent row was inserted at 5 PM Friday, the next run (whether that's Saturday, Monday, or any other day) scans only the 24 hours leading up to that row: after 5 PM Thursday through 5 PM Friday. Because the filter uses a strict greater-than comparison, rows timestamped at or before 5 PM Thursday are excluded from the scan.
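The three platform filters listed above differ only in their date arithmetic, which a small lookup can capture. The helper below is a hypothetical sketch, not Atlan's implementation: the templates copy the predicates shown above, while the function name, the dictionary, and the lowercase platform keys are assumptions made for illustration.

```python
# Predicate templates copied from the platform filters listed above.
PREDICATE_TEMPLATES = {
    "bigquery": (
        "{col} > (SELECT TIMESTAMP_SUB(MAX({col}), INTERVAL 24 HOUR) FROM {table})"
    ),
    "databricks": (
        "{col} > ((SELECT MAX({col}) FROM {table}) - INTERVAL 24 HOURS)"
    ),
    "snowflake": (
        "{col} > (SELECT DATEADD(hour, -24, MAX({col})) FROM {table})"
    ),
}


def recent_day_predicate(platform, col, table):
    """Return the recent-day predicate text for a platform (hypothetical helper).

    Identifiers are inserted as-is; a real implementation would need to
    quote or validate them before building SQL.
    """
    try:
        template = PREDICATE_TEMPLATES[platform.lower()]
    except KeyError:
        raise ValueError(f"unsupported platform: {platform!r}") from None
    return template.format(col=col, table=table)


if __name__ == "__main__":
    for name in PREDICATE_TEMPLATES:
        print(name, "->", recent_day_predicate(name, "created_at", "orders"))
```

Each predicate compares the timestamp column against a subquery on the same table, so the window is recomputed from the data on every run.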

What you get from most recent day scans

Most recent day scans deliver immediate performance and cost benefits by eliminating redundant historical data processing:

Reduced query execution time: Each run processes only the most recent 24-hour window instead of the full table, which shortens rule execution times
Lower compute costs: Minimize warehouse resource consumption on Snowflake and Databricks by avoiding scans of stale partitions
Focused alerting: Monitor fresh data that's most likely to contain issues in batch-based ingestion workflows, reducing alert noise from historical data
