Create example queries¶
The queries.yaml file teaches the AI how to write correct SQL for your
database. Each entry pairs a natural language question with its SQL answer.
How examples are used¶
flowchart TD
A[queries.yaml] --> B[System prompt<br>few-shot examples]
A --> C[Sidebar<br>clickable queries]
B --> D[AI writes SQL]
style A fill:#8a7d55,stroke:#6b6040,color:#fff
style B fill:#15a8a8,stroke:#023d60,color:#fff
style C fill:#fe5d26,stroke:#c44a1e,color:#fff
style D fill:#2e7ebb,stroke:#1a5c8a,color:#fff
At startup, datasight:
- Appends formatted examples to the AI system prompt as few-shot context
- Displays them in the sidebar, filterable by selected table
When a user asks a question, the AI references these examples to write accurate SQL. Users can also click any example query in the sidebar to send it directly to the chat.
File format¶
# queries.yaml
- question: What are the top 10 states by total generation?
sql: |
SELECT state, SUM(net_generation_mwh) AS total_mwh
FROM generation_fuel
GROUP BY state
ORDER BY total_mwh DESC
LIMIT 10
- question: Monthly generation trend
sql: |
SELECT DATE_TRUNC('month', report_date) AS month,
SUM(net_generation_mwh) AS total_mwh
FROM generation_fuel
GROUP BY month
ORDER BY month
Writing good examples¶
Cover common questions. What will your users ask most often?
Show tricky patterns. Joins, enum filters, and aggregations specific to your schema are where examples help the most.
Keep SQL readable. The AI adapts the pattern — it doesn't copy verbatim. Clear SQL teaches better than clever SQL.
Include chart-friendly queries. Two or three columns (one category/date + one numeric) produce the cleanest visualizations.
Aim for 5-15 examples. Too few gives the AI little to work with. Too many dilutes the signal and bloats the system prompt.
Add expected results for verification¶
Each entry can include an expected block used by datasight verify to
validate that the AI generates correct SQL. See
Verify queries across models for full details.
- question: What are the top 5 states by solar generation?
sql: |
SELECT state, SUM(mwh) AS total
FROM generation
GROUP BY state
ORDER BY total DESC
LIMIT 5
expected:
row_count: 5
columns: [state, total]
contains: ["CA"]
Available checks: row_count, min_row_count, max_row_count, columns,
contains, not_contains.
File location¶
By default, datasight looks for queries.yaml in the project directory.
Override with the EXAMPLE_QUERIES_PATH environment variable.