Create your question set (Private Preview)

Once you've built and refined your context repository, you need a question set before you can run simulations. A question set is a list of business questions, each paired with a verified SQL query that produces the known-correct answer. This is how CES measures whether the model answers correctly.
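The pairing of a question with a verified query can be pictured with a small sketch. This page does not describe how CES compares answers internally, so everything here is an assumption for illustration: the `QuestionEntry` class, the `same_answer` helper, the `orders` table, and the choice to compare result rows while ignoring row order are all hypothetical.

```python
import sqlite3
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class QuestionEntry:
    """Hypothetical shape of one question-set entry (not the CES schema)."""
    question: str       # natural-language business question
    verified_sql: str   # query a domain expert has confirmed is correct


def same_answer(conn, verified_sql, candidate_sql):
    """Compare two queries' result rows as multisets, ignoring row order."""
    expected = Counter(conn.execute(verified_sql).fetchall())
    actual = Counter(conn.execute(candidate_sql).fetchall())
    return expected == actual


# Tiny in-memory table, made up for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (region TEXT, amount REAL);
    INSERT INTO orders VALUES ('EMEA', 100.0), ('EMEA', 50.0), ('APAC', 70.0);
""")

entry = QuestionEntry(
    question="What is the total order amount by region?",
    verified_sql="SELECT region, SUM(amount) FROM orders GROUP BY region",
)

# Same logic with a different row order: should count as a match.
good_candidate = (
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region DESC"
)
# Wrong aggregate: should not match the verified answer.
bad_candidate = "SELECT region, MAX(amount) FROM orders GROUP BY region"

matches_good = same_answer(conn, entry.verified_sql, good_candidate)
matches_bad = same_answer(conn, entry.verified_sql, bad_candidate)
print(matches_good, matches_bad)
```

The point of the sketch is that the verified query, not a hand-typed expected value, defines what "correct" means, so the expected answer stays current as the underlying data changes.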

Prerequisites

Before you begin, make sure:

You've built your context repository. See Build your context repository.
If you're using Databricks, your repository is deployed. See Deploy to Databricks.

Add questions

CES gives you three ways to add questions: auto-generate them, pull them from Data Exploration, or enter them manually. Use them in combination: start by generating questions for coverage, then layer in real questions from Data Exploration or your domain expert.

What makes a good question set

Use questions the business actually asks, phrased the way business users phrase them, covering simple lookups, aggregations, multi-table joins, and time-window questions. Start with 10 to 20 entries, and make sure every question has a verified SQL query that a domain expert has confirmed is correct.
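As a rough illustration, the guidance above can be turned into a checklist you run over a draft before simulating. This is a sketch under assumptions, not a CES feature: the `Entry` class, its `category` tag, the category names, and `review_question_set` are hypothetical names invented for this example.

```python
from dataclasses import dataclass

# Question types named in the guidance; the tag spellings are assumptions.
RECOMMENDED_CATEGORIES = {"lookup", "aggregation", "join", "time_window"}


@dataclass
class Entry:
    question: str
    verified_sql: str
    category: str  # hypothetical tag, one of RECOMMENDED_CATEGORIES


def review_question_set(entries):
    """Return human-readable warnings; an empty list means nothing was flagged."""
    warnings = []
    if len(entries) < 10:
        warnings.append(
            f"{len(entries)} entries; the guidance suggests starting with 10 to 20"
        )
    for entry in entries:
        if not entry.verified_sql.strip():
            warnings.append(f"missing verified SQL: {entry.question!r}")
    covered = {entry.category for entry in entries}
    for category in sorted(RECOMMENDED_CATEGORIES - covered):
        warnings.append(f"no {category} question")
    return warnings


draft = [
    Entry("How many active accounts do we have?",
          "SELECT COUNT(*) FROM dim_accounts", "aggregation"),
    Entry("What was ARR last month?", "", "time_window"),
]

warnings = review_question_set(draft)
for line in warnings:
    print(line)
```

A check like this only catches structural gaps. Whether each query is actually correct still depends on the domain expert's review.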

Auto-generate

CES drafts an initial question set by reading your semantic model. This is the fastest way to get started. Use it to confirm coverage across the metrics and dimensions your model exposes.

In your context repository, click the Simulate tab.
Click Auto-generate.

Review each suggestion. Accept the ones that look representative, edit the SQL if the logic is off, and discard anything unrealistic.

Generating questions automatically provides broad coverage but doesn't capture business-specific nuance. For the most important queries, layer in questions from Data Exploration or add them manually.

Manual

Some questions only your domain expert can write: multi-step calculations, edge-case filters, and business-critical metrics where a silently wrong answer is expensive.

In your context repository, click the Simulate tab.
Click Add example.

Enter the natural-language question:

What is the month-over-month NRR trend for customers acquired in 2024?

Enter the verified SQL query your domain expert confirms produces the correct answer:

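-- Cohort: accounts whose first active date falls in calendar year 2024.
WITH cohort AS (
  SELECT account_id
  FROM dim_accounts
  WHERE first_active_date >= DATE '2024-01-01'
    AND first_active_date < DATE '2025-01-01'
),
-- ARR per cohort account per calendar month.
monthly_arr AS (
  SELECT
    account_id,
    DATE_TRUNC('month', as_of_date) AS month,
    SUM(arr) AS arr
  FROM fct_arr_monthly
  WHERE account_id IN (SELECT account_id FROM cohort)
  GROUP BY 1, 2
)
-- NRR here is cohort ARR divided by the previous month's cohort ARR.
-- NULLIF avoids a divide-by-zero error when the previous month's ARR is 0.
SELECT
  month,
  SUM(arr) / NULLIF(LAG(SUM(arr)) OVER (ORDER BY month), 0) AS nrr
FROM monthly_arr
GROUP BY month
ORDER BY month;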

Optionally add tags and notes for context, then click Save.
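Before saving a hand-written query like the NRR example, it can help to trace its logic on a few made-up rows where the answer is easy to compute by hand. The sketch below mirrors that logic in plain Python: a 2024 cohort filter, a monthly ARR sum, then each month divided by the previous month. The account IDs, dates, and ARR values are invented for illustration and do not come from any real table.

```python
from collections import defaultdict
from datetime import date

# Made-up stand-in for dim_accounts: account_id -> first_active_date.
first_active = {
    "a1": date(2024, 2, 10),
    "a2": date(2024, 5, 3),
    "a3": date(2023, 11, 20),  # acquired in 2023, so outside the cohort
}

# Made-up stand-in for fct_arr_monthly: (account_id, as_of_date, arr).
arr_rows = [
    ("a1", date(2024, 6, 1), 100.0),
    ("a2", date(2024, 6, 1), 100.0),
    ("a3", date(2024, 6, 1), 500.0),
    ("a1", date(2024, 7, 1), 150.0),
    ("a2", date(2024, 7, 1), 100.0),
    ("a1", date(2024, 8, 1), 150.0),
    ("a2", date(2024, 8, 1), 50.0),
]

# Cohort: accounts first active in calendar year 2024.
cohort = {acct for acct, first in first_active.items() if first.year == 2024}

# Sum ARR per month for cohort accounts only.
monthly = defaultdict(float)
for account_id, as_of, arr in arr_rows:
    if account_id in cohort:
        monthly[as_of.replace(day=1)] += arr

# Month-over-month ratio; a missing or zero previous month yields None.
trend = []
previous = None
for month in sorted(monthly):
    ratio = monthly[month] / previous if previous else None
    trend.append((month, ratio))
    previous = monthly[month]

for month, ratio in trend:
    print(month.isoformat(), ratio)
```

With these rows the cohort ARR is 200, 250, and 200 for June through August, so the ratios are undefined, 1.25, and 0.8. If the SQL returns something different on equivalent test data, the query logic needs another look before it is marked as verified.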

Next steps

Simulate: run your question set and act on the diagnostics.
