Install datasight
datasight is distributed as a Python CLI on PyPI. The recommended installer is
uv, which handles Python version management and installs
datasight as a global tool. If you already have a Python toolchain you prefer, pip
works too.
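Either command installs the same CLI; these mirror the export-extra variants shown at the end of this page:

```
# Install as a global tool with uv (recommended):
uv tool install datasight

# Or with pip, into your user environment:
pip install --user datasight
```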
Both options include support for DuckDB, SQLite, PostgreSQL, and all AI providers (Anthropic, OpenAI, GitHub Models, Ollama) — no extra packages needed.
Configure an AI provider
datasight needs an AI provider to translate your questions into SQL. If you're choosing for the first time, start with Anthropic — it's the default, the models are good, and you can get a key in about two minutes. GitHub Models is a free alternative if you have a GitHub account.
Run datasight config init to create a shared credentials file that every project on
this machine will pick up automatically:
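```
datasight config init
```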
Alternatively, paste the key directly into a project's .env file.
- Go to console.anthropic.com → API Keys → Create Key.
- Copy the key (starts with sk-ant-).
- Add it to your .env, as shown below.
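A minimal .env sketch; the variable name below assumes the standard Anthropic convention, since the exact name isn't shown on this page (see the Configuration reference):

```
# Assumed variable name (standard Anthropic convention); replace with your real key.
ANTHROPIC_API_KEY=sk-ant-your-key-here
```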
No other settings are needed — anthropic is the default provider and
Claude Haiku is the default model, which handles SQL generation well at low cost.
Anthropic accounts require a payment method
New accounts come with trial credit, but you'll need to add a payment method before that runs out. If you'd rather skip billing entirely, see GitHub Models below — it's free with just a GitHub account.
For Azure OpenAI or a corporate gateway, also set OPENAI_BASE_URL.
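For OpenAI or an OpenAI-compatible gateway, a hedged .env sketch (OPENAI_BASE_URL is the variable named above; the key variable is assumed to follow the standard OpenAI SDK convention):

```
# Key variable name assumed from the standard OpenAI SDK convention.
OPENAI_API_KEY=sk-your-key-here
# Only needed for Azure OpenAI or a corporate gateway:
OPENAI_BASE_URL=https://your-gateway.example.com/v1
```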
GitHub Models provides free access to GPT and other models using your GitHub account — no billing setup required.
To get a GITHUB_TOKEN:
- GitHub CLI (quickest): run gh auth token and paste the output.
- Personal access token: go to GitHub → Settings → Developer settings → Personal access tokens → Fine-grained tokens → Generate new token. Grant the Models: read account permission. Classic tokens do not work.
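Then add the token to your .env (GITHUB_TOKEN is the variable named above; any further provider-selection settings are covered in the Configuration reference):

```
# GITHUB_TOKEN as referenced above; a gh auth token or a fine-grained PAT both work here.
GITHUB_TOKEN=your-token-here
```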
Free tier context limit
GitHub Models caps requests at 8,000 tokens. Databases with more than ~20 tables can exceed this. If you hit "request too large" errors, see Limit schema sent to the LLM.
Ollama runs AI models on your own hardware — nothing leaves your machine and there's no per-query cost. Use it when data sensitivity or offline use requires it; for most users, a hosted provider gives better results with less setup.
Install Ollama, then pull a tool-calling model:
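For example, using Ollama's published install script and the model recommended below:

```
# Official Linux install script from ollama.com (on macOS, install the desktop app instead):
curl -fsSL https://ollama.com/install.sh | sh

# Pull a tool-calling model; qwen2.5:7b is the cross-platform default recommended below.
ollama pull qwen2.5:7b
```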
Then configure .env:
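The exact variable names for pointing datasight at Ollama aren't shown on this page, so treat the following as an illustrative sketch only and confirm the real names in the Configuration reference:

```
# Hypothetical variable names, for illustration only; check the Configuration reference.
# Ollama's default local endpoint is http://localhost:11434.
DATASIGHT_PROVIDER=ollama
DATASIGHT_MODEL=qwen2.5:7b
OLLAMA_HOST=http://localhost:11434
```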
qwen2.5:7b works well for CLI queries (datasight ask) and is the safest
cross-platform default — it uses ~2 GB resident on Apple Silicon, so it fits
even on 16 GB Macs. For the web UI with chart generation, qwen2.5:14b handles
the more complex interactions better. On Apple Silicon with 48 GB+ of unified
memory, qwen3-coder:30b produces noticeably richer answers with
comparable decode speed (sparse MoE). See
Choosing an AI provider
for measured per-model memory footprints and Apple-Silicon-specific guidance.
See the Configuration reference for every supported variable.
PNG chart export
datasight ask --chart-format png needs the optional export extra.
Reinstall with uv tool install "datasight[export]" or
pip install --user "datasight[export]".
The web UI does not need it.
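Copy-paste versions of those reinstall commands:

```
uv tool install "datasight[export]"
# or
pip install --user "datasight[export]"
```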