CLI reference¶

This page is generated from the Click command tree in datasight.cli. Update it with python scripts/generate_cli_reference.py.

Common workflows¶

Run batch questions¶

datasight ask --file questions.txt --output-dir batch-output
datasight ask --file questions.yaml --output-dir batch-output
datasight ask --file questions.jsonl --output-dir batch-output

Inspect a project without the LLM¶

datasight profile
datasight profile --table generation_fuel
datasight profile --column generation_fuel.report_date

Run deterministic audits and suggestions¶

datasight quality --table generation_fuel
datasight dimensions --table generation_fuel
datasight trends --table generation_fuel
datasight recipes list --table generation_fuel

Check project health¶

datasight doctor
datasight doctor --format markdown -o doctor.md

`datasight`¶

datasight — AI-powered data exploration with natural language.

datasight [OPTIONS] COMMAND [ARGS]...

Parameters

Name	Details
`--version`	Show the version and exit.
`-v`, `--verbose`	Enable debug logging for the invoked command.

Subcommands

init: Create blank datasight project template files.
config: Manage user-global datasight configuration.
demo: Create ready-to-run demo projects with sample datasets.
generate: Generate schema_description.md, queries.yaml, measures.yaml, and time_series.yaml from your database.
run: Start the datasight web UI.
session: Export and import shareable datasight session archives.
verify: Verify LLM-generated SQL against expected results.
ask: Ask a question about your data from the command line.
profile: Profile your dataset - row counts, date coverage, and column statistics.
measures: Surface likely measures and default aggregations.
quality: Audit data quality - nulls, suspicious ranges, and date coverage.
tidy: Detect untidy column shapes and reshape into long form.
grounding: Detect and repair drift between grounding files and the live schema.
integrity: Audit cross-table referential integrity - keys, orphans, and join risks.
distribution: Profile value distributions - percentiles, outliers, and measure flags.
validate: Run declarative validation rules against the database.
audit-report: Generate a comprehensive audit report combining all checks.
dimensions: Surface likely grouping dimensions and category breakdowns.
trends: Surface likely trend analyses and chart recommendations.
inspect: Run all analyses on Parquet, CSV, Excel, or DuckDB files and print results.
recipes: Generate and run reusable deterministic prompt recipes.
doctor: Check project configuration, local files, and database connectivity.
export: Export a conversation session as a self-contained HTML page or Python script.
log: Display the SQL query log in a formatted table.
report: Manage saved reports.
templates: Save and re-apply dashboards as templates across datasets.

`datasight init`¶

Create blank datasight project template files.

PROJECT_DIR defaults to the current directory.

Use this when you want to fill in .env, schema_description.md, queries.yaml, and time_series.yaml by hand.

If you already have a DuckDB/SQLite database or CSV/Parquet/Excel files and want datasight to inspect them and draft these files, use:

datasight generate <file>...

datasight init [OPTIONS] [PROJECT_DIR]

Parameters

Name	Details
`PROJECT_DIR`
`--overwrite`	Overwrite existing files.

`datasight config`¶

Manage user-global datasight configuration.

The user-global config file (~/.config/datasight/.env) holds API keys and tokens shared across every datasight project. Per-project .env files override its values, so each project can still pick its own LLM provider, model, and database.
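The override rule can be pictured as a two-layer merge. The sketch below is an illustration only, not datasight's loader: the parser handles plain KEY=VALUE lines, and the variable names (EXAMPLE_API_KEY, EXAMPLE_MODEL) are hypothetical placeholders rather than real datasight settings.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve(global_env: str, project_env: str) -> dict[str, str]:
    """Merge two .env texts; per-project values override global ones."""
    merged = parse_env(global_env)
    merged.update(parse_env(project_env))
    return merged


# Hypothetical variable names, used only for illustration.
GLOBAL_ENV = "EXAMPLE_API_KEY=from-global\nEXAMPLE_MODEL=global-model\n"
PROJECT_ENV = "# per-project override\nEXAMPLE_MODEL=project-model\n"

resolved = resolve(GLOBAL_ENV, PROJECT_ENV)
print(resolved)
```

To see the values datasight actually resolved and which file each came from, run datasight config show.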

Examples:

datasight config init
datasight config show

datasight config [OPTIONS] COMMAND [ARGS]...

Subcommands

init: Create the user-global config file (~/.config/datasight/.env).
show: Show the resolved datasight configuration and where it loaded from.

`datasight config init`¶

Create the user-global config file (~/.config/datasight/.env).

Stores API keys and tokens in one place so per-project .env files only need to set provider, model, and database settings.

datasight config init [OPTIONS]

Parameters

Name	Details
`--overwrite`	Overwrite the existing global config file.

`datasight config show`¶

Show the resolved datasight configuration and where it loaded from.

datasight config show [OPTIONS]

`datasight demo`¶

Create ready-to-run demo projects with sample datasets.

Examples:

datasight demo eia-generation eia-demo
datasight demo dsgrid-tempo tempo-demo
datasight demo time-validation time-demo

datasight demo [OPTIONS] COMMAND [ARGS]...

Subcommands

eia-generation: Download an EIA energy demo dataset and create a ready-to-run project.
dsgrid-tempo: Download dsgrid TEMPO EV charging demand projections.
time-validation: Generate a synthetic energy consumption dataset with planted time errors.

`datasight demo eia-generation`¶

Download an EIA energy demo dataset and create a ready-to-run project.

Downloads cleaned EIA-923 and EIA-860 data from the PUDL project's public data releases. Creates a DuckDB database with generation, fuel consumption, and plant data, along with pre-written schema descriptions and example queries.

PROJECT_DIR defaults to the current directory.

Example:

datasight demo eia-generation eia-demo --min-year 2021

datasight demo eia-generation [OPTIONS] [PROJECT_DIR]

Parameters

Name	Details
`PROJECT_DIR`
`--min-year`	Earliest year to include (default: 2020). Default: `2020`.

`datasight demo dsgrid-tempo`¶

Download dsgrid TEMPO EV charging demand projections.

Downloads hourly and annual EV charging demand data from NLR's TEMPO project (published on OEDI). Creates a DuckDB database with charging profiles at census-division level, plus annual summaries by state and county. Covers three adoption scenarios from 2024 to 2050.

Data source: s3://nrel-pds-dsgrid/tempo/tempo-2022/v1.0.0 (public, no credentials needed).

PROJECT_DIR defaults to the current directory.

Example:

datasight demo dsgrid-tempo tempo-demo

datasight demo dsgrid-tempo [OPTIONS] [PROJECT_DIR]

Parameters

Name	Details
`PROJECT_DIR`

`datasight demo time-validation`¶

Generate a synthetic energy consumption dataset with planted time errors.

Creates hourly electricity consumption data across sectors, end uses, and US states for future projection years (2038, 2039, 2040). The dataset contains intentional gaps, duplicates, and DST anomalies that datasight's time series quality checks can detect.

Run "datasight quality" or "datasight run" after setup to find the errors.

PROJECT_DIR defaults to the current directory.

Example:

datasight demo time-validation time-demo

datasight demo time-validation [OPTIONS] [PROJECT_DIR]

Parameters

Name	Details
`PROJECT_DIR`

`datasight generate`¶

Generate schema_description.md, queries.yaml, measures.yaml, and time_series.yaml from your database.

Connects to the database, inspects tables and columns, samples code/enum columns, and asks the LLM to produce documentation and example queries.

Use datasight init for blank templates; use datasight generate to create project files from an existing database or data files.

Examples:

# Use the database configured in .env
datasight generate
# Reference an existing DuckDB or SQLite database directly
datasight generate grid.duckdb
datasight generate generation.sqlite
# Build ./database.duckdb from CSV inputs
datasight generate generation.csv plants.csv
# Build ./database.duckdb from Parquet inputs
datasight generate generation.parquet plants.parquet
# Build ./database.duckdb from Excel inputs (one table per sheet)
datasight generate generation.xlsx
# Build a custom project DuckDB from CSV, Parquet, or Excel inputs
datasight generate generation.csv --db-path project.duckdb
datasight generate generation.parquet --db-path project.duckdb

FILES are input data. --db-path is only the output DuckDB path used when datasight needs to build a project database from CSV/Parquet/Excel or mixed file inputs.

datasight generate [OPTIONS] [FILES]...

Parameters

Name	Details
`FILES`
`--project-dir`	Project directory containing .env. Default: `.`.
`--model`	Model name (overrides .env).
`--overwrite`	Overwrite existing files.
`--table`, `-t`	Table or view to include (can be specified multiple times). If omitted, all tables are included.
`--db-path`	Output DuckDB path to create from CSV/Parquet/Excel or mixed file inputs (default: database.duckdb). Do not use this with a single existing DuckDB or SQLite database; those are referenced directly.
`--import-mode`	When FILES are CSV/Parquet inputs, choose whether datasight creates source-backed views or materialized DuckDB tables. 'auto' preserves the existing cheap behavior and keeps CSV/Parquet source-backed; use 'table' to opt into materialization. Excel workbooks are always materialized as tables. Default: `auto`.
`--compact-schema`	Write schema.yaml with table names only. Default adds an empty 'excluded_columns: []' placeholder per table so you can fill in glob patterns for columns to hide.
`--max-tokens`	Output token budget for the documentation LLM call. Defaults to a per-provider safe value. Reasoning models that hide tokens may need a larger value (subject to provider limits — GitHub Models caps output around 8K).

`datasight run`¶

Start the datasight web UI.

If the current directory contains schema_description.md, it will be auto-loaded as the project. Otherwise, use the UI to select a project, or pass --project-dir to specify one explicitly.

Examples:

datasight run
datasight run --project-dir eia-demo
datasight run --port 9000 --model gpt-4o
datasight run --unix-socket /tmp/datasight.sock

datasight run [OPTIONS]

Parameters

Name	Details
`--port`	Web UI port (default: 8084).
`--host`	Bind address for TCP mode. Default: `127.0.0.1`.
`--unix-socket`	Listen on this UNIX domain socket instead of TCP.
`--model`	LLM model name (overrides .env).
`--project-dir`	Auto-load this project on startup (optional).

`datasight session`¶

Export and import shareable datasight session archives.

Archives carry the conversation transcript and per-session dashboard only — never .env or LLM credentials, and never the underlying database. Recipients need to bring their own data.

Examples:

datasight session list
datasight session export abc123 --output-path analysis.zip
datasight session import analysis.zip
datasight session import analysis.zip --session-id copied-session --overwrite

datasight session [OPTIONS] COMMAND [ARGS]...

Subcommands

list: List saved sessions available for export.
export: Export SESSION_ID as a versioned datasight session archive.
import: Import a datasight session archive into PROJECT_DIR.

`datasight session list`¶

List saved sessions available for export.

datasight session list [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .datasight/ state (default: cwd). Default: `.`.

`datasight session export`¶

Export SESSION_ID as a versioned datasight session archive.

datasight session export [OPTIONS] SESSION_ID

Parameters

Name	Details
`SESSION_ID`
`--project-dir`	Project directory containing .datasight/ state (default: cwd). Default: `.`.
`--output-path`	Output archive path. Defaults to .zip in the current directory.

`datasight session import`¶

Import a datasight session archive into PROJECT_DIR.

datasight session import [OPTIONS] ARCHIVE_PATH

Parameters

Name	Details
`ARCHIVE_PATH`
`--project-dir`	Project directory containing .datasight/ state (default: cwd). Default: `.`.
`--session-id`	Import under this session ID instead of the archived ID.
`--overwrite`	Replace an existing session with the same ID.

`datasight verify`¶

Verify LLM-generated SQL against expected results.

Runs each question from queries.yaml through the full LLM pipeline, executes the generated SQL, and compares results against expected values. Use this to validate correctness across different models and providers.

Before the LLM phase, runs a static schema-drift check that flags references to columns or tables that no longer exist in the live database. --static-only skips the LLM phase entirely; --skip-grounding-check skips the static check.

Examples:

datasight verify
datasight verify --static-only
datasight verify --queries verification.yaml
datasight verify --model gpt-4o

Add expected results to queries.yaml entries:

- question: "Top 3 states by generation"
  sql: |
    SELECT state, SUM(mwh) AS total
    FROM generation GROUP BY state
    ORDER BY total DESC LIMIT 3
  expected:
    row_count: 3
    columns: [state, total]
    contains: ["CA", "TX"]
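As a rough picture of what such an expected block asserts, the sketch below runs the same query against an in-memory SQLite table and checks the three keys. It is not datasight's comparison code: the check_expected helper is hypothetical, and the reading of contains as "each value appears somewhere in the result rows" is an assumption.

```python
import sqlite3


def check_expected(columns, rows, expected):
    """Return a list of failure messages; an empty list means the check passed."""
    failures = []
    if "row_count" in expected and len(rows) != expected["row_count"]:
        failures.append(f"row_count: expected {expected['row_count']}, got {len(rows)}")
    if "columns" in expected and list(columns) != list(expected["columns"]):
        failures.append(f"columns: expected {expected['columns']}, got {list(columns)}")
    if "contains" in expected:
        # Assumed semantics: each value must appear in some cell of the result.
        seen = {value for row in rows for value in row}
        for wanted in expected["contains"]:
            if wanted not in seen:
                failures.append(f"contains: {wanted!r} not found")
    return failures


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE generation (state TEXT, mwh REAL)")
conn.executemany(
    "INSERT INTO generation VALUES (?, ?)",
    [("TX", 500.0), ("CA", 300.0), ("FL", 200.0), ("NY", 100.0), ("TX", 50.0)],
)
cursor = conn.execute(
    "SELECT state, SUM(mwh) AS total FROM generation "
    "GROUP BY state ORDER BY total DESC LIMIT 3"
)
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

expected = {"row_count": 3, "columns": ["state", "total"], "contains": ["CA", "TX"]}
failures = check_expected(columns, rows, expected)
print(failures or "all expectations met")
```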

datasight verify [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and queries.yaml. Default: `.`.
`--model`	Model name (overrides .env).
`--queries`	Path to queries YAML file (default: queries.yaml in project dir).
`--static-only`	Run only the cheap schema-drift check (no LLM, no query execution). Reports unresolved column/table references in queries.yaml, schema_description.md, and time_series.yaml against the live DB.
`--skip-grounding-check`	Skip the static drift check that normally runs before the LLM phase.

`datasight ask`¶

Ask a question about your data from the command line.

Runs the full LLM agent loop without starting a web server. Results are printed to the console.

Examples:

datasight ask "What are the top 5 states by generation?"
datasight ask "Show generation by year" --chart-format html -o chart.html
datasight ask "Top 5 states" --format csv -o results.csv
datasight ask --file questions.txt --output-dir batch-output
datasight ask "Top 5 states" --print-sql
datasight ask "Top 5 states" --provenance
datasight ask "Top 5 states" --sql-script top-states.sql

datasight ask [OPTIONS] [QUESTION]

Parameters

Name	Details
`QUESTION`
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--model`	Model name (overrides .env).
`--format`	Output format for query results (default: table). Default: `table`.
`--chart-format`	Save chart output in this format (requires --output).
`--output`, `-o`	Output file path for chart or data export.
`--file`	Read one question per line from a text file.
`--output-dir`	Directory for per-question batch outputs (only with --file).
`--print-sql`	Print the SQL queries executed by the agent to the console.
`--provenance`	Print run provenance as JSON to stdout (suppresses human-readable answer).
`--sql-script`	Write executed queries to a SQL script that materializes results into auto-named tables (CREATE OR REPLACE).

`datasight profile`¶

Profile your dataset - row counts, date coverage, and column statistics.

Use this before asking questions to understand table sizes, candidate measures, dimensions, null rates, and date ranges.

Examples:

datasight profile
datasight profile --table generation_fuel
datasight profile --column generation_fuel.net_generation_mwh
datasight profile --format markdown -o profile.md

datasight profile [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--table`	Profile a specific table.
`--column`	Profile a specific column as table.column.
`--format`	Output format (default: table). Default: `table`.
`--output`, `-o`	Write the profile output to a file instead of stdout.

`datasight measures`¶

Surface likely measures and default aggregations.

Measures are numeric columns that should usually be summed, averaged, or otherwise aggregated in generated SQL. Use --scaffold to create an editable measures.yaml override file.

Examples:

datasight measures
datasight measures --table generation_fuel
datasight measures --scaffold
datasight measures --format markdown -o measures.md

datasight measures [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--table`	Inspect measures for a specific table.
`--scaffold`	Write an editable measures.yaml scaffold and exit.
`--overwrite`	Overwrite an existing scaffold file.
`--format`	Output format (default: table). Default: `table`.
`--output`, `-o`	Write the measure overview to a file instead of stdout.

`datasight quality`¶

Audit data quality - nulls, suspicious ranges, and date coverage.

Also checks temporal completeness when time_series.yaml defines expected time series structure.
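A temporal completeness check of this kind boils down to comparing observed timestamps against the expected grid. The sketch below assumes an hourly series and plain Python datetimes; how datasight reads the expected structure from time_series.yaml is not shown on this page, and the audit_hourly helper is hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def audit_hourly(timestamps):
    """Return (missing, duplicated) timestamps for a series expected to be hourly."""
    counts = Counter(timestamps)
    duplicated = sorted(ts for ts, seen in counts.items() if seen > 1)
    missing = []
    if counts:
        current, end = min(counts), max(counts)
        step = timedelta(hours=1)
        # Walk the expected hourly grid and record every hour with no rows.
        while current <= end:
            if current not in counts:
                missing.append(current)
            current += step
    return missing, duplicated


start = datetime(2038, 1, 1, 0)
# Hour 2 appears twice and hour 3 is absent.
series = [start + timedelta(hours=h) for h in (0, 1, 2, 2, 4, 5)]
missing, duplicated = audit_hourly(series)
print("missing:", missing)
print("duplicated:", duplicated)
```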

Examples:

datasight quality
datasight quality --table generation_fuel
datasight quality --format markdown -o quality.md

datasight quality [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--table`	Audit a specific table.
`--format`	Output format (default: table). Default: `table`.
`--output`, `-o`	Write the quality audit to a file instead of stdout.

`datasight tidy`¶

Detect untidy column shapes and reshape into long form.

Two paths:

Deterministic: 'tidy suggest' lists candidates, 'tidy view' creates long-form views, 'tidy table' materializes long-form tables. These run on column-name pattern matching and never call an LLM.
LLM-augmented: 'tidy review' adds an advisor that proposes pivots the regex misses (fuel-type-as-column, geography-as-column, multi-axis pivots) for the developer to approve before applying.
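The deterministic path can be illustrated with a small stand-alone sketch: match column names against a pattern, then emit a long-form view. Everything here is an assumption for illustration, including the year-suffix regex, the sales_wide columns, and the use of SQLite with UNION ALL in place of DuckDB; datasight's actual detector and DDL may differ.

```python
import re
import sqlite3
from collections import defaultdict

# Assumed pattern: a shared stem followed by a four-digit year (mwh_2021, mwh_2022).
YEAR_SUFFIX = re.compile(r"^(?P<stem>.+)_(?P<year>\d{4})$")


def detect_year_columns(columns):
    """Group year-suffixed columns by stem; keep stems with at least two columns."""
    groups = defaultdict(list)
    for name in columns:
        match = YEAR_SUFFIX.match(name)
        if match:
            groups[match["stem"]].append((name, int(match["year"])))
    return {stem: cols for stem, cols in groups.items() if len(cols) >= 2}


def long_view_sql(table, id_columns, stem, year_columns):
    """Build a view that stacks one SELECT per wide column (assumes one stem per table)."""
    ids = ", ".join(id_columns)
    selects = [
        f"SELECT {ids}, {year} AS year, {name} AS {stem} FROM {table}"
        for name, year in year_columns
    ]
    return f"CREATE VIEW {table}_long AS\n" + "\nUNION ALL\n".join(selects)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_wide (region TEXT, mwh_2021 REAL, mwh_2022 REAL)")
conn.executemany(
    "INSERT INTO sales_wide VALUES (?, ?, ?)", [("east", 10, 12), ("west", 20, 25)]
)

columns = [row[1] for row in conn.execute("PRAGMA table_info(sales_wide)")]
candidates = detect_year_columns(columns)
for stem, year_columns in candidates.items():
    matched = {name for name, _ in year_columns}
    id_columns = [name for name in columns if name not in matched]
    conn.execute(long_view_sql("sales_wide", id_columns, stem, year_columns))

long_rows = conn.execute(
    "SELECT region, year, mwh FROM sales_wide_long ORDER BY region, year"
).fetchall()
print(long_rows)
```

A column such as fuel_coal or state_tx would not match this pattern, which is the kind of case the LLM-augmented path is described as covering.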

Examples:

datasight tidy suggest
datasight tidy suggest --table sales_wide
datasight tidy view --dry-run
datasight tidy view
datasight tidy table --table sales_wide
datasight tidy review --from plan.json --apply-all

datasight tidy [OPTIONS] COMMAND [ARGS]...

Subcommands

suggest: List detected untidy column shapes without changing the database.
view: Create CREATE OR REPLACE VIEW _long for each detected pattern.
table: Materialize CREATE OR REPLACE TABLE _long for each detected pattern.
review: LLM-augmented advisor that proposes reshapes for the developer to review.

`datasight tidy suggest`¶

List detected untidy column shapes without changing the database.

Pass one or more CSV / Parquet / Excel / DuckDB files as positional arguments to inspect them in an ephemeral session — no project setup required. With no files, runs against the current project's database. Detection is deterministic: column names plus dtypes plus row counts. No LLM is involved. For pivots the regex misses, see 'tidy review'.

Examples:

datasight tidy suggest                           # current project
datasight tidy suggest monthly_generation.csv    # standalone file
datasight tidy suggest gen.csv plants.parquet    # multiple files
datasight tidy suggest --table sales_wide
datasight tidy suggest --format markdown -o tidy.md

datasight tidy suggest [OPTIONS] [FILES]...

Parameters

Name	Details
`FILES`
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--table`	Scope tidy detection to a specific source table.
`--format`	Output format (default: table). Default: `table`.
`--output`, `-o`	Write the tidy listing to a file instead of stdout.

`datasight tidy view`¶

Create CREATE OR REPLACE VIEW _long for each detected pattern.

Deterministic — applies the regex detector's hits without consulting an LLM. For LLM-augmented proposals (fuel-type-as-column, multi-axis pivots), use 'tidy review'.

Examples:

datasight tidy view
datasight tidy view --dry-run
datasight tidy view --table sales_wide

datasight tidy view [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--table`	Scope tidy detection to a specific source table.
`--dry-run`	Print the DDL without executing it.

`datasight tidy table`¶

Materialize CREATE OR REPLACE TABLE _long for each detected pattern.

Deterministic — applies the regex detector's hits without consulting an LLM. For LLM-augmented proposals (fuel-type-as-column, multi-axis pivots), use 'tidy review'.

Examples:

datasight tidy table
datasight tidy table --dry-run
datasight tidy table --table sales_wide

datasight tidy table [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--table`	Scope tidy detection to a specific source table.
`--dry-run`	Print the DDL without executing it.

`datasight tidy review`¶

LLM-augmented advisor that proposes reshapes for the developer to review.

Runs the deterministic detector first, then asks the configured LLM provider for additional candidates the regex misses (fuel-type-as-column, geography-as-column, scenario-as-column, multi-axis pivots). The developer approves each candidate before it is applied.

Use --from PLAN to skip the LLM call and apply a pre-built plan. Use --out PLAN to write proposals to a file instead of applying; without --from this writes the deterministic detector's hits, giving a starting point that can be hand-edited and fed back via --from.

Calls the configured LLM provider whenever --from is not set.

Examples:

datasight tidy review --from plan.json --apply-all
datasight tidy review --from plan.json --dry-run
datasight tidy review --out detector.json
datasight tidy review --from plan.json --apply-all --drop-source
datasight tidy review --from plan.json --apply-all --replace-source
datasight tidy review --from plan.json --apply-all --rename-source sales_wide_raw

datasight tidy review [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--table`	Scope tidy detection to a specific source table.
`--from`	Load proposals from a JSON plan file (no LLM call).
`--out`	Write proposals to a JSON plan file instead of applying. Without --from, writes the deterministic detector's hits.
`--apply-all`	Apply every proposal without prompting. Required for non-interactive use.
`--dry-run`	Print DDL and proposed dispositions without changing the database.
`--as`	Materialize the long form as a table or view (default: view). Default: `view`.
`--keep-source`	Leave the source object (table/view) unchanged after the reshape (default).
`--rename-source`	Rename the source object (table/view) to NAME after a successful reshape. Requires '--as table' — a view's body references its source by name.
`--replace-source`	Drop the source after a successful reshape and rename the long-form table to take the source's old name. Downstream code that referenced the source keeps working without edits. Requires '--as table' — a view's body references its source by name.
`--drop-source`	Drop the source after a successful reshape; the long form keeps its target name. Pick this when the new shape is the canonical one going forward and you don't need to preserve the source's name. Requires '--as table'. NOTE: previously this flag carried the semantics now moved to '--replace-source'; scripts depending on the old behavior should switch to '--replace-source'.
`--sample`	Send N sample rows per candidate to the configured LLM provider (default 0). Sample values get sent over the network — opt in only when the LLM seeing the values is acceptable.
`--model`	LLM model name to use for the propose-reshapes call and the post-apply grounding-repair call (overrides .env). Useful when different models suit each workload — see docs/use/concepts/choosing-an-llm.md.
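The four source dispositions differ only in which object names exist after the reshape. The sketch below mimics the descriptions above on an in-memory SQLite database; it is not datasight's implementation, and apply_disposition, the mode strings, and the table names are hypothetical.

```python
import sqlite3


def apply_disposition(conn, source, long_table, mode, new_name=None):
    """Apply one of the four source dispositions after a table reshape."""
    if mode == "keep":
        return  # source left unchanged
    if mode == "rename":
        conn.execute(f'ALTER TABLE "{source}" RENAME TO "{new_name}"')
    elif mode == "replace":
        # The long form takes over the source's old name.
        conn.execute(f'DROP TABLE "{source}"')
        conn.execute(f'ALTER TABLE "{long_table}" RENAME TO "{source}"')
    elif mode == "drop":
        # The long form keeps its own target name.
        conn.execute(f'DROP TABLE "{source}"')
    else:
        raise ValueError(f"unknown mode: {mode}")


def tables_after(mode, new_name=None):
    """Build a fresh database, apply the mode, and list the remaining tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales_wide (region TEXT, mwh_2021 REAL, mwh_2022 REAL)")
    conn.execute("CREATE TABLE sales_wide_long (region TEXT, year INTEGER, mwh REAL)")
    apply_disposition(conn, "sales_wide", "sales_wide_long", mode, new_name)
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [name for (name,) in rows]


outcomes = {
    "keep": tables_after("keep"),
    "rename": tables_after("rename", "sales_wide_raw"),
    "replace": tables_after("replace"),
    "drop": tables_after("drop"),
}
for mode, tables in outcomes.items():
    print(mode, tables)
```

Under replace, downstream queries against sales_wide keep resolving but now see the long-form columns; under drop they must be pointed at sales_wide_long.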

`datasight grounding`¶

Detect and repair drift between grounding files and the live schema.

Grounding files (queries.yaml, schema_description.md, time_series.yaml) describe the database to the LLM. When the schema changes (typically after datasight tidy review), these files fall out of sync and the agent silently hallucinates against columns that no longer exist.

check reports drift without changing anything.
repair asks the configured LLM to rewrite the stale files against the current schema, validates each proposed query, and writes atomically after you confirm the diff.

Examples:

datasight grounding check
datasight grounding repair
datasight grounding repair --model qwen3.6
datasight grounding repair --from-csv load_data.csv
datasight grounding repair --dry-run

datasight grounding [OPTIONS] COMMAND [ARGS]...

Subcommands

check: Report stale references in grounding files against the live schema.
repair: Run the LLM grounding repair against an existing drift.

`datasight grounding check`¶

Report stale references in grounding files against the live schema.

Static — no LLM, no query execution. Exits 0 when grounding is clean, 1 when drift is detected. Use datasight grounding repair to fix what this command finds.
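Because the command reports its result through the exit status, it can gate a script or CI step. The wrapper below is a minimal sketch: grounding_is_clean is a hypothetical helper, and treating any status other than 0 or 1 as an error is an assumption. The demonstration uses stand-in Python commands so it runs without datasight installed.

```python
import subprocess
import sys


def grounding_is_clean(command=("datasight", "grounding", "check")):
    """Run the check and map its exit status to True (clean) or False (drift)."""
    result = subprocess.run(list(command), capture_output=True, text=True)
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # Assumption: any other status means the check itself could not run.
    raise RuntimeError(
        f"unexpected exit status {result.returncode}: {result.stderr.strip()}"
    )


# Stand-in commands that only exercise the exit-status mapping.
clean_stub = (sys.executable, "-c", "raise SystemExit(0)")
drift_stub = (sys.executable, "-c", "raise SystemExit(1)")

print("clean:", grounding_is_clean(clean_stub))
print("drift:", grounding_is_clean(drift_stub))
```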

Examples:

datasight grounding check
datasight grounding check --project-dir /path/to/project

datasight grounding check [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and grounding files. Default: `.`.

`datasight grounding repair`¶

Run the LLM grounding repair against an existing drift.

Reads the pre-tidy schema snapshot persisted by the most recent apply (.datasight/grounding_snapshot.json). When no snapshot is on file, --from-csv lets you supply the wide-form schema by pointing at the source CSV(s).

Shows the unified diff and prompts for confirmation before writing. Use --dry-run to skip the write entirely.

Examples:

datasight grounding repair
datasight grounding repair --model qwen3.6
datasight grounding repair --from-csv load_data.csv
datasight grounding repair --dry-run

datasight grounding repair [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and grounding files. Default: `.`.
`--model`	LLM model name to use for the repair (overrides .env). Useful for retrying with a different model after a timeout.
`--from-csv`	Derive the pre-tidy schema from CSV headers when no snapshot is available. Pass once per source file (e.g. the wide-format input the apply consumed). Each CSV becomes a single table named after the file stem. Combinable with the snapshot — snapshot tables win on conflict.
`--dry-run`	Show drift + LLM proposal + diff, but don't write any files.
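The --from-csv rule (one table per CSV, named after the file stem, with snapshot tables winning on conflict) can be sketched as follows. The table-name-to-column-list mapping is an assumed shape used for illustration; the real layout of .datasight/grounding_snapshot.json is not described here, and both helper functions are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def schema_from_csv(paths):
    """Map each CSV's file stem to the column names in its header row."""
    schema = {}
    for path in map(Path, paths):
        with path.open(newline="") as handle:
            schema[path.stem] = next(csv.reader(handle), [])
    return schema


def merge_with_snapshot(csv_schema, snapshot):
    """Combine both sources; snapshot tables win when a name appears in both."""
    return {**csv_schema, **snapshot}


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "load_data.csv"
    source.write_text("region,mwh_2021,mwh_2022\neast,10,12\n")
    csv_schema = schema_from_csv([source])

# Assumed snapshot shape: table name mapped to a list of column names.
snapshot = {"load_data": ["region", "mwh_2021"], "plants": ["plant_id"]}
merged = merge_with_snapshot(csv_schema, snapshot)
print(merged)
```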

`datasight integrity`¶

Audit cross-table referential integrity - keys, orphans, and join risks.

Use this to find likely primary keys, duplicate keys, orphaned foreign keys, and joins that may multiply rows unexpectedly.
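The three findings correspond to familiar SQL patterns. The sketch below reproduces them on an in-memory SQLite database; the plant_id column and the sample rows are hypothetical, and the queries illustrate the checks rather than the SQL datasight runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plants (plant_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE generation_fuel (plant_id INTEGER, mwh REAL)")
# plant_id 1 is duplicated in plants; plant_id 9 has no matching plant.
conn.executemany("INSERT INTO plants VALUES (?, ?)", [(1, "A"), (1, "A dup"), (2, "B")])
conn.executemany(
    "INSERT INTO generation_fuel VALUES (?, ?)",
    [(1, 100.0), (1, 20.0), (2, 50.0), (9, 75.0)],
)

# Duplicate keys: candidate key values that occur more than once.
duplicates = conn.execute(
    "SELECT plant_id, COUNT(*) FROM plants GROUP BY plant_id HAVING COUNT(*) > 1"
).fetchall()

# Orphans: fact rows whose key has no match in the referenced table.
orphans = conn.execute(
    "SELECT DISTINCT f.plant_id FROM generation_fuel AS f "
    "LEFT JOIN plants AS p ON p.plant_id = f.plant_id "
    "WHERE p.plant_id IS NULL"
).fetchall()

# Join risk: a join that returns more rows than the fact table holds.
fact_rows = conn.execute("SELECT COUNT(*) FROM generation_fuel").fetchone()[0]
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM generation_fuel AS f "
    "JOIN plants AS p ON p.plant_id = f.plant_id"
).fetchone()[0]

print("duplicate keys:", duplicates)
print("orphaned keys:", orphans)
print("rows before/after join:", fact_rows, joined_rows)
```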

Examples:

datasight integrity
datasight integrity --table plants
datasight integrity --format json -o integrity.json

datasight integrity [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--table`	Focus integrity checks on a specific table.
`--format`	Output format (default: table). Default: `table`.
`--output`, `-o`	Write the integrity audit to a file instead of stdout.

`datasight distribution`¶

Profile value distributions - percentiles, outliers, and measure flags.

Use this to inspect numeric ranges, skew, zero/negative rates, outliers, and measure-semantic flags before building charts or validation rules.
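For a sense of what these statistics mean, the sketch below computes quartiles, zero and negative rates, and outliers for a small sample. The 1.5 x IQR fence is a common convention chosen here as an assumption; this page does not state which outlier rule datasight applies, and summarize is a hypothetical helper.

```python
import statistics


def summarize(values):
    """Compute quartiles, zero/negative rates, and IQR-fence outliers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed rule: flag values more than 1.5 * IQR outside the quartiles.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "p25": q1,
        "p50": median,
        "p75": q3,
        "zero_rate": sum(v == 0 for v in values) / len(values),
        "negative_rate": sum(v < 0 for v in values) / len(values),
        "outliers": [v for v in values if v < low or v > high],
    }


values = [0, 10, 12, 11, 13, 12, -5, 400]
summary = summarize(values)
for key, value in summary.items():
    print(key, value)
```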

Examples:

datasight distribution
datasight distribution --table generation_fuel
datasight distribution --column generation_fuel.net_generation_mwh
datasight distribution --format markdown -o distributions.md

datasight distribution [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--table`	Profile distributions for a specific table.
`--column`	Focus on a specific column as table.column.
`--format`	Output format (default: table). Default: `table`.
`--output`, `-o`	Write the distribution profile to a file instead of stdout.

`datasight validate`¶

Run declarative validation rules against the database.

Rules live in validation.yaml. Use --scaffold to create a starter file, edit it for your dataset, then run validate to produce pass/fail output.

Examples:

datasight validate --scaffold
datasight validate
datasight validate --table generation_fuel
datasight validate --format markdown -o validation.md

datasight validate [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--table`	Run rules for a specific table only.
`--config`	Path to validation.yaml (default: project_dir/validation.yaml).
`--format`	Output format (default: table). Default: `table`.
`--output`, `-o`	Write the validation report to a file instead of stdout.
`--scaffold`	Write an example validation.yaml to the project directory and exit.
`--overwrite`	Overwrite an existing validation.yaml.

`datasight audit-report`¶

Generate a comprehensive audit report combining all checks.

Combines profile, measures, quality, integrity, distribution, and validation results into one HTML, Markdown, or JSON artifact.

Examples:

datasight audit-report
datasight audit-report -o audit.html
datasight audit-report --format markdown -o audit.md
datasight audit-report --table generation_fuel -o generation-audit.html

datasight audit-report [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--table`	Scope the audit to a specific table.
`--output`, `-o`	Output path (.html, .md, or .json). Default: `report.html`.
`--format`	Output format (default: inferred from file extension).

`datasight dimensions`¶

Surface likely grouping dimensions and category breakdowns.

Use this to find text/code columns that are good GROUP BY candidates, such as fuel codes, states, sectors, plants, or scenario labels.

Examples:

datasight dimensions
datasight dimensions --table generation_fuel
datasight dimensions --format json -o dimensions.json

datasight dimensions [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--table`	Inspect dimensions for a specific table.
`--format`	Output format (default: table). Default: `table`.
`--output`, `-o`	Write the dimension overview to a file instead of stdout.

`datasight trends`¶

Surface likely trend analyses and chart recommendations.

Run inside a configured project, or pass one or more Parquet, CSV, Excel, or DuckDB files directly for a quick file-only trend scan.

Examples:

datasight trends
datasight trends --table generation_fuel
datasight trends generation.parquet plants.parquet
datasight trends --format markdown -o trends.md

datasight trends [OPTIONS] [FILES]...

Parameters

Name	Details
`FILES`
`--project-dir`	Project directory containing .env and config files.
`--table`	Suggest trends for a specific table.
`--format`	Output format (default: table). Default: `table`.
`--output`, `-o`	Write the trend overview to a file instead of stdout.

`datasight inspect`¶

Run all analyses on Parquet, CSV, Excel, or DuckDB files and print results.

Creates a file-backed session and runs profile, quality, measures, dimensions, trends, and recipes — printing everything to the console without creating a project. When the current directory contains a .env with DB_MODE=spark, the files are registered as Spark temp views and all queries run on the cluster; otherwise an ephemeral in-memory DuckDB session is used.

Examples:

datasight inspect generation.parquet
datasight inspect generation.csv plants.csv
datasight inspect data_dir/
datasight inspect generation.parquet --format markdown -o inspect.md

datasight inspect [OPTIONS] FILES...

Parameters

Name	Details
`FILES`
`--format`	Output format (default: table). Default: `table`.
`--output`, `-o`	Write the full report to a file instead of stdout.

`datasight recipes`¶

Generate and run reusable deterministic prompt recipes.

Recipes are suggested natural-language questions derived from the schema. Listing recipes does not call an LLM; running one sends the recipe prompt through the normal ask pipeline.

Examples:

datasight recipes list
datasight recipes list --table generation_fuel
datasight recipes run 1

datasight recipes [OPTIONS] COMMAND [ARGS]...

Subcommands

list: List reusable deterministic prompt recipes for a project.
run: Run a generated recipe by ID through the normal ask pipeline.

`datasight recipes list`¶

List reusable deterministic prompt recipes for a project.

Examples:

datasight recipes list
datasight recipes list --table generation_fuel
datasight recipes list --format markdown -o recipes.md

datasight recipes list [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--table`	Generate recipes for a specific table.
`--format`	Output format. Default: `table`.
`--output`, `-o`	Write the recipes output to a file instead of stdout.

`datasight recipes run`¶

Run a generated recipe by ID through the normal ask pipeline.

RECIPE_ID is the numeric ID shown by datasight recipes list.

Examples:

datasight recipes run 1
datasight recipes run 2 --format csv -o recipe.csv
datasight recipes run 3 --chart-format html -o recipe.html

datasight recipes run [OPTIONS] RECIPE_ID

Parameters

Name	Details
`RECIPE_ID`
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--table`	Use recipes generated for a specific table.
`--model`	Model name (overrides .env).
`--format`	Output format for query results. Default: `table`.
`--chart-format`	Save chart output in this format (requires --output).
`--output`, `-o`	Output file path for chart or data export.

`datasight doctor`¶

Check project configuration, local files, and database connectivity.

Use this when a project will not load, an API key is missing, a database path is wrong, or the web UI cannot write state under .datasight/.

Examples:

datasight doctor
datasight doctor --format markdown -o doctor.md
datasight doctor --project-dir eia-demo

datasight doctor [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--format`	Output format. Default: `table`.
`--output`, `-o`	Write doctor output to a file instead of stdout.

`datasight export`¶

Export a conversation session as a self-contained HTML page or Python script.

SESSION_ID is the conversation ID (use --list-sessions to see available IDs).

Examples:

datasight export --list-sessions
datasight export abc123def -o my-analysis.html
datasight export abc123def --format py -o my-analysis.py
datasight export abc123def --format bundle -o analysis-bundle.zip
datasight export abc123def --exclude 2,3

datasight export [OPTIONS] [SESSION_ID]

Parameters

Name	Details
`SESSION_ID`
`--output`, `-o`	Output file path. Defaults to . with the session ID truncated to 20 characters.
`--format`	html (viewer), py (runnable script), or bundle (zip archive with artifacts). Default: `html`.
`--include`	Bundle-only: comma-separated artifacts to include. Choices: html,sql,python,csv,charts,metadata. Default: all.
`--project-dir`	Project directory containing .datasight/conversations/. Default: `.`.
`--exclude`	Comma-separated turn indices to exclude (0-based, each turn is a Q&A pair).
`--list-sessions`	List available sessions and exit.
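
The `--exclude` indexing can be illustrated with a short sketch. The helpers `parse_indices` and `drop_turns` and the `question`/`answer` keys are hypothetical, chosen only to show how 0-based turn indices select Q&A pairs. They are not part of datasight.

```python
def parse_indices(spec: str) -> set[int]:
    """Parse a comma-separated list such as "2,3" into a set of ints."""
    return {int(part) for part in spec.split(",") if part.strip()}


def drop_turns(turns: list[dict], exclude: str) -> list[dict]:
    """Return the turns whose 0-based position is not listed in `exclude`."""
    skipped = parse_indices(exclude)
    return [turn for i, turn in enumerate(turns) if i not in skipped]


# Four Q&A pairs; indices 2 and 3 are the third and fourth turns.
turns = [
    {"question": "q0", "answer": "a0"},
    {"question": "q1", "answer": "a1"},
    {"question": "q2", "answer": "a2"},
    {"question": "q3", "answer": "a3"},
]
kept = drop_turns(turns, "2,3")
print([t["question"] for t in kept])  # ['q0', 'q1']
```

Because indexing starts at 0, `--exclude 2,3` drops the third and fourth turns, not the second and third.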

`datasight log`¶

Display the SQL query log in a formatted table.

Shows recent SQL queries generated by datasight. Use --sql N to print one raw SQL statement for copy/paste into DuckDB, SQLite, or another SQL client.

Examples:

datasight log
datasight log --tail 50 --full
datasight log --errors
datasight log --cost
datasight log --sql 1

datasight log [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing query_log.jsonl. Default: `.`.
`--tail`	Show the last N entries. Default: `20`.
`--errors`	Show only failed queries.
`--full`	Show full SQL and user question.
`--cost`	Show LLM cost summary.
`--sql`	Print the raw SQL for query number N (shown in the # column), ready to copy and paste.
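
For scripted access to the same log, a minimal sketch of reading the last entries of a JSON Lines file follows. It assumes `query_log.jsonl` holds one JSON object per line. The `error` key and the choice to filter before taking the tail are assumptions, not datasight's documented schema or behavior.

```python
import json
from collections import deque
from pathlib import Path


def tail_log(path, n=20, errors_only=False):
    """Return the last `n` entries of a JSON Lines log.

    The "error" key is a hypothetical field name used for illustration.
    """
    entries = deque(maxlen=n)  # keeps only the most recent n items
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue
            entry = json.loads(line)
            if errors_only and not entry.get("error"):
                continue
            entries.append(entry)
    return list(entries)
```

Inspect a real log entry before relying on any field name; the CLI flags above remain the supported interface.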

`datasight report`¶

Manage saved reports.

Reports are saved from the web UI and can be listed, re-run against fresh data, exported, or deleted from the CLI.

Examples:

datasight report list
datasight report run 1
datasight report run 1 --format csv -o report.csv
datasight report delete 1

datasight report [OPTIONS] COMMAND [ARGS]...

Subcommands

list: List all saved reports.
run: Re-execute a saved report against fresh data.
delete: Delete a saved report.

`datasight report list`¶

List all saved reports.

Example:

datasight report list

datasight report list [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory. Default: `.`.

`datasight report run`¶

Re-execute a saved report against fresh data.

REPORT_ID is the numeric ID shown by 'datasight report list'.

Examples:

datasight report run 1
datasight report run 1 --format csv -o report.csv
datasight report run 2 --chart-format html -o chart.html

datasight report run [OPTIONS] REPORT_ID

Parameters

Name	Details
`REPORT_ID`
`--project-dir`	Project directory containing .env and config files. Default: `.`.
`--format`	Output format for query results. Default: `table`.
`--chart-format`	Save chart output in this format (requires --output).
`--output`, `-o`	Output file path for chart or data export.

`datasight report delete`¶

Delete a saved report.

REPORT_ID is the numeric ID shown by 'datasight report list'.

Example:

datasight report delete 1

datasight report delete [OPTIONS] REPORT_ID

Parameters

Name	Details
`REPORT_ID`
`--project-dir`	Project directory. Default: `.`.

`datasight templates`¶

Save and re-apply dashboards as templates across datasets.

Templates capture dashboard cards from the web UI so the same SQL and charts can be applied to another dataset with matching tables.

Examples:

datasight templates save generation-dashboard
datasight templates list
datasight templates apply generation-dashboard --output out.html

datasight templates [OPTIONS] COMMAND [ARGS]...

Subcommands

save: Save the current project dashboard as a reusable template.
list: List dashboard templates saved in this project.
show: Print a saved template as JSON.
apply: Apply a saved template to parquet files and export HTML dashboards.
delete: Delete a saved template.

`datasight templates save`¶

Save the current project dashboard as a reusable template.

The dashboard must already exist in the project, usually from building and saving cards in the web UI.

Examples:

datasight templates save generation-dashboard
datasight templates save generation-dashboard --description "Monthly generation cards"
datasight templates save generation-dashboard --table generation_fuel --overwrite
datasight templates save by-scenario --var SCENARIO=reference

datasight templates save [OPTIONS] NAME

Parameters

Name	Details
`NAME`
`--project-dir`	Project directory containing .datasight/templates/. Default: `.` (the current working directory).
`--description`	Template description.
`--table`	Table the template requires. Repeat once per table. When omitted, tables are inferred from each card's SQL.
`--var`	Declare a template variable: --var NAME=VALUE. Every occurrence of VALUE in each card's SQL is rewritten to {{NAME}}, and NAME becomes a placeholder that must be resolved at apply time.
`--var-from-filename`	Attach a filename-extraction regex to a variable: --var-from-filename NAME=REGEX. At apply time the regex is run against each input parquet's filename and its first capture group (or whole match) becomes the variable value. Use with --var to also set the save-time literal and default.
`--overwrite`	Replace an existing template.
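
The `--var` and `--var-from-filename` behavior described in the table can be sketched as follows. This is an illustration under stated assumptions, not datasight's code: the helper names `templatize` and `value_from_filename` are hypothetical, plain substring replacement stands in for the SQL rewrite, and `re.search` is assumed for how the regex is run against the filename.

```python
import re
from typing import Optional


def templatize(sql: str, variables: dict[str, str]) -> str:
    """Replace every occurrence of each VALUE with a {{NAME}} placeholder."""
    for name, value in variables.items():
        sql = sql.replace(value, "{{" + name + "}}")
    return sql


def value_from_filename(pattern: str, filename: str) -> Optional[str]:
    """Return the first capture group, or the whole match if there is none."""
    match = re.search(pattern, filename)
    if match is None:
        return None
    return match.group(1) if match.groups() else match.group(0)


sql = "SELECT * FROM generation_fuel WHERE scenario = 'reference'"
print(templatize(sql, {"SCENARIO": "reference"}))
# SELECT * FROM generation_fuel WHERE scenario = '{{SCENARIO}}'
print(value_from_filename(r"scenario_(\w+)\.parquet", "scenario_high.parquet"))  # high
print(value_from_filename(r"\d{4}", "generation_2023.parquet"))  # 2023
```

Since every occurrence of VALUE is rewritten, a short or common literal can match text it was not meant to; a distinctive save-time value avoids that.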

`datasight templates list`¶

List dashboard templates saved in this project.

Example:

datasight templates list

datasight templates list [OPTIONS]

Parameters

Name	Details
`--project-dir`	Project directory containing .datasight/templates/. Default: `.` (the current working directory).

`datasight templates show`¶

Print a saved template as JSON.

Example:

datasight templates show generation-dashboard

datasight templates show [OPTIONS] NAME

Parameters

Name	Details
`NAME`
`--project-dir`	Project directory containing .datasight/templates/. Default: `.` (the current working directory).

`datasight templates apply`¶

Apply a saved template to parquet files and export HTML dashboards.

Each required table is registered as a view inside an in-memory DuckDB connection. Tables not passed via --table fall back to the project's own DuckDB (from .env DB_PATH) — so fixed lookup tables like plants don't need to be re-supplied. A single --table mapping may use a shell glob, in which case the template is applied once per matching file and written to --export-dir.

Examples:

# Render once, mapping one required table to a parquet file
datasight templates apply generation-by-fuel \
    --table generation_fuel=data/generation.parquet \
    --output generation.html
# Render once per matching parquet, writing one HTML per file
datasight templates apply generation-by-fuel \
    --table 'generation_fuel=data/*.parquet' \
    --export-dir out/

datasight templates apply [OPTIONS] NAME

Parameters

Name	Details
`NAME`
`--project-dir`	Project directory containing .datasight/templates/. Default: `.` (the current working directory).
`--table`	Map a required table to a parquet file: --table NAME=PATH. Repeat per table. One mapping may use a glob to iterate the template across many files. Tables not mapped here are looked up in the project's DuckDB.
`--output`	HTML output path for a single-shot run (no globbing).
`--export-dir`	Directory for per-file HTML output when a --table mapping globs.
`--var`	Override a template variable: --var NAME=VALUE. Takes precedence over the variable's filename-derived value and default.
`--fail-fast`	Stop on the first failure instead of continuing.
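
The variable precedence stated for `--var` can be sketched as below. The helpers `resolve_variable` and `render` are hypothetical. The page states only that an override beats the filename-derived value and the default; the ordering of filename-derived value ahead of default is an assumption.

```python
import re


def resolve_variable(default, from_filename, override):
    """Pick the --var override first, then the filename value, then the default."""
    for candidate in (override, from_filename, default):
        if candidate is not None:
            return candidate
    raise ValueError("variable has no value")


def render(sql, values):
    """Substitute {{NAME}} placeholders; an unknown NAME raises KeyError."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: values[m.group(1)], sql)


template_sql = "SELECT * FROM generation_fuel WHERE scenario = '{{SCENARIO}}'"
scenario = resolve_variable(default="reference", from_filename="high", override=None)
print(render(template_sql, {"SCENARIO": scenario}))
# SELECT * FROM generation_fuel WHERE scenario = 'high'
```

Passing `override="low"` in the same call would yield `low`, mirroring a `--var SCENARIO=low` on the command line.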

`datasight templates delete`¶

Delete a saved template.

Example:

datasight templates delete generation-dashboard

datasight templates delete [OPTIONS] NAME

Parameters

Name	Details
`NAME`
`--project-dir`	Project directory containing .datasight/templates/. Default: `.` (the current working directory).
