# Contributing

## Development setup
The project uses uv to manage the Python environment and pin dependencies via `uv.lock`. Install uv first (see the uv install guide), then:

```shell
git clone https://github.com/dsgrid/datasight.git
cd datasight
uv sync --extra dev
. .venv/bin/activate
```

`uv sync` creates `.venv/` automatically and installs the project plus the dev extras pinned in `uv.lock`.
## Project structure

```text
src/datasight/
├── cli.py           # Click CLI commands (run, ask, init, demo, generate, verify, profile, quality, doctor, export, log)
├── agent.py         # Shared agent loop and tool execution
├── config.py        # Configuration helpers
├── data_profile.py  # Deterministic dataset overviews and CLI/web recipes
├── schema.py        # Database introspection
├── llm.py           # LLM client abstraction
├── chart.py         # Plotly chart generator
├── runner.py        # SQL execution backends (DuckDB, SQLite, Postgres, Flight SQL)
├── export.py        # Session-to-HTML export
├── verify.py        # Query verification engine
├── demo.py          # Demo dataset generator
└── web/
    ├── app.py       # FastAPI server + SSE streaming
    ├── static/      # Vite build output (assets/) + icons
    └── templates/
        └── index.html  # Generated by Vite build

frontend/            # Svelte 5 + TypeScript + Tailwind source
├── src/
│   ├── App.svelte   # Root component
│   ├── main.ts      # Entry point
│   ├── app.css      # Tailwind + design tokens
│   └── lib/
│       ├── stores/      # Svelte 5 rune-based stores
│       ├── api/         # Typed API client functions
│       ├── components/  # ~40 Svelte components
│       └── utils/       # Search, format, markdown utilities
├── tests/           # Vitest unit tests
└── e2e/             # Playwright E2E tests
```
## Running locally

```shell
# Build the generated web assets once after a clean checkout
bash scripts/build-frontend.sh

# Start with a demo project
datasight demo ./dev-project
cd dev-project
# Edit .env with your API key (Anthropic, GitHub token, or Ollama)
datasight run -v
```
The `-v` flag enables debug logging, which shows the full LLM request/response cycle, including tool calls.
The FastAPI app serves generated files from `src/datasight/web/static/` and `src/datasight/web/templates/index.html`. Those files are ignored by git. Run `bash scripts/build-frontend.sh` after a clean checkout and whenever you want `datasight run` to serve a freshly built production UI.
## Pre-commit hooks

The project uses prek, a drop-in replacement for pre-commit, to run checks automatically on every commit. It reads the same `.pre-commit-config.yaml`. Install the hooks after cloning:
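Since prek is a drop-in replacement for pre-commit, hook installation should follow the same CLI (a sketch, assuming prek mirrors pre-commit's `install` subcommand):

```shell
# Install the git hooks defined in .pre-commit-config.yaml
prek install
```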
Hooks run ruff (linting), ruff-format (formatting), ty (type checking), and a docs CLI reference drift check. If a hook fails, it either auto-fixes the file (as ruff-format does) or shows you what to fix. Stage the fixes and commit again.
To run all hooks manually against every file:
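Again assuming prek mirrors pre-commit's `run` subcommand:

```shell
# Run every configured hook against the whole repository
prek run --all-files
```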
## Code style
The project uses ruff for linting and formatting, and ty for type checking.
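The tools can also be invoked directly; these are the standard Astral-tool invocations, not verified against this repository's specific configuration:

```shell
ruff check .    # lint
ruff format .   # format in place
ty check        # type check
```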
## Frontend
The frontend is built with Svelte 5 + TypeScript + Tailwind CSS and uses
Vite as the build tool. Source lives in frontend/.
```shell
cd frontend
npm install
npm run dev     # Vite dev server on :5173 (proxies /api to :8084)
npm run check   # Svelte + TypeScript checks
npm test        # Vitest unit tests
npm run build   # Production build
```
For frontend development, run `datasight run` in another terminal for the API server, then use `npm run dev` from `frontend/`. To test the exact production UI served by FastAPI, build and copy the frontend output into FastAPI's serving directories:
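The build-and-copy step is what `scripts/build-frontend.sh` (mentioned above) automates, so the simplest route is:

```shell
# Builds the frontend and copies the output into src/datasight/web/
bash scripts/build-frontend.sh
```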
Release builds run this script before packaging. Hatch includes the generated assets in the sdist and wheel as build artifacts, so the repository stays free of generated frontend bundles while published packages still contain the web UI.
## Testing
```shell
# Run the full Python test suite
pytest

# Run the CI-safe Python suite, excluding local Ollama integration tests
pytest -m "not integration"

# Run only tests that require a local Ollama model
pytest -m integration

# Frontend unit tests (Vitest)
cd frontend && npm test

# Frontend E2E tests (Playwright, requires datasight run)
cd frontend && npm run test:e2e

# Run the verification suite against a demo project
datasight demo ./test-project
cd test-project
datasight verify -v
```
Tests marked `integration` require a running local Ollama server with the `qwen2.5:7b` model available:
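A typical setup, using the standard Ollama CLI commands:

```shell
# Start the Ollama server (skip if it already runs as a service)
ollama serve &
# Pull the model used by the integration tests
ollama pull qwen2.5:7b
```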
Keep live-provider tests behind that marker so CI can run deterministically without a local model.
## Documentation
Docs use Zensical.
When you change Click commands or help text, regenerate the static CLI docs:
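The exact regeneration command is repo-specific; the script path below is hypothetical and stands in for whatever the docs CLI reference drift-check hook invokes:

```shell
# Hypothetical script name -- check .pre-commit-config.yaml for the real one
uv run python scripts/generate_cli_docs.py
```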
### Regenerating UI screenshots
Screenshots embedded in the docs live in `docs/assets/screenshots/` and are captured by a dedicated Playwright spec at `frontend/e2e/screenshots.spec.ts`. They're excluded from the regular `npm run test:e2e` run (via `--grep-invert screenshots`) because they need a project-loaded server and specific UI state.
Regenerate when the UI changes in a way that makes the committed images stale.
- Start a server with a demo project in one terminal:
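Reusing the demo flow from the Running locally section (the project directory name is arbitrary):

```shell
datasight demo ./screenshot-project
cd screenshot-project
datasight run
```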
- In the live browser, run a chart-producing query (e.g. "Show monthly generation by fuel type as a line chart") and pin at least one card to the dashboard. The capture spec loads the first saved conversation for the chart screenshot, and reads dashboard state for the dashboard screenshot.
- Run the capture spec in a second terminal:
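A sketch using Playwright's standard file-path filtering (the repo may also define an npm script for this):

```shell
cd frontend
npx playwright test e2e/screenshots.spec.ts
```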
To regenerate a single view, filter by test name:
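Playwright's `-g` flag filters by test title; the title `dashboard` here is illustrative:

```shell
cd frontend
npx playwright test e2e/screenshots.spec.ts -g "dashboard"
```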
All tests capture in dark mode at a fixed 1280×800 viewport for visual consistency. The landing test intercepts `GET /api/project` so it renders the no-project landing page even against a server that has a project loaded.