>
---
name: duckdb-docs
description: >
Search DuckDB and DuckLake documentation and blog posts. Returns relevant
doc chunks for a question or keyword using full-text search against a
locally cached index.
argument-hint: <question or keyword>
allowed-tools: Bash
---
You are helping the user find relevant DuckDB or DuckLake documentation.
Query: `$@`
Follow these steps in order.
## Step 1 — Check DuckDB is installed
```bash
command -v duckdb
```
If not found, delegate to `/duckdb-skills:install-duckdb` and then continue.
## Step 2 — Ensure required extensions are installed
```bash
duckdb :memory: -c "INSTALL httpfs; INSTALL fts;"
```
If this fails, report the error and stop.
## Step 3 — Choose the data source and extract search terms
The query is: `$@`
### Data source selection
There are two search indexes available:
| Index | Remote URL | Local cache filename | Versions | Use when |
|-------|-----------|---------------------|----------|----------|
| **DuckDB docs + blog** | `https://duckdb.org/data/docs-search.duckdb` | `duckdb-docs.duckdb` | `lts`, `current`, `blog` | Default — any DuckDB question |
| **DuckLake docs** | `https://ducklake.select/data/docs-search.duckdb` | `ducklake-docs.duckdb` | `stable`, `preview` | Query mentions DuckLake, catalogs, or DuckLake-specific features |
Both indexes share the same schema:
| Column | Type | Description |
|--------|------|-------------|
| `chunk_id` | `VARCHAR` (PK) | e.g. `stable/sql/functions/numeric#absx` |
| `page_title` | `VARCHAR` | Page title from front matter |
| `section` | `VARCHAR` | Section heading (null for page intros) |
| `breadcrumb` | `VARCHAR` | e.g. `SQL > Functions > Numeric` |
| `url` | `VARCHAR` | URL path with anchor |
| `version` | `VARCHAR` | See table above |
| `text` | `TEXT` | Full markdown of the chunk |
By default, search **DuckDB docs** and filter to `version = 'lts'`. Use different versions when:
- The user explicitly asks about `current`/nightly features → `version = 'current'`
- The user asks about a blog post or wants background/motivation → `version = 'blog'`
- The user asks about DuckLake → search the DuckLake index with `version = 'stable'`
- When unsure, omit the version filter to search across all versions.
### Search terms
If the input is a **natural language question** (e.g. "how do I find the most frequent value"), extract the key technical terms (nouns, function names, SQL keywords) to form a compact BM25 query string. Drop stop words like "how", "do", "I", "the".
If the input is already a **function name or technical term** (e.g. `arg_max`, `GROUP BY ALL`), use it as-is.
Use the extracted terms as `SEARCH_QUERY` in the next step.
## Step 4 — Ensure local cache is fresh
The cache lives at `$HOME/.duckdb/docs/CACHE_FILENAME` (where `CACHE_FILENAME` is `duckdb-docs.duckdb` or `ducklake-docs.duckdb` per Step 3).
First, ensure the directory exists:
```bash
mkdir -p "$HOME/.duckdb/docs"
```
Then check whether the cache file exists and is fresh (≤2 days old):
```bash
CACHE_FILE="$HOME/.duckdb/docs/CACHE_FILENAME"
if [ -f "$CACHE_FILE" ]; then
MTIME=$(stat -f %m "$CACHE_FILE" 2>/dev/null || stat -c %Y "$CACHE_FILE")
CACHE_AGE_DAYS=$(( ( $(date +%s) - MTIME ) / 86400 ))
else
CACHE_AGE_DAYS=999
fi
echo "Cache age: $CACHE_AGE_DAYS days"
```
**If `CACHE_AGE_DAYS` ≤ 2** → skip to Step 5.
**Otherwise** (stale or missing) → fetch the index:
```bash
duckdb -c "
LOAD httpfs;
LOAD fts;
ATTACH 'REMOTE_URL' AS remote (READ_ONLY);
ATTACH '$HOME/.duckdb/docs/CACHE_FILENAME.tmp' AS tmp;
COPY FROM DATABASE remote TO tmp;
" && mv "$HOME/.duckdb/docs/CACHE_FILENAME.tmp" "$HOME/.duckdb/docs/CACHE_FILENAME"
```
Replace `REMOTE_URL` and `CACHE_FILENAME` per Step 3. If the fetch fails (network error), report the error and stop.
## Step 5 — Search the docs
```bash
duckdb "$HOME/.duckdb/docs/CACHE_FILENAME" -readonly -json -c "
LOAD fts;
SELECT
chunk_id, page_title, section, breadcrumb, url, version, text,
fts_main_docs_chunks.match_bm25(chunk_id, 'SEARCH_QUERY') AS score
FROM docs_chunks
WHERE score IS NOT NULL
AND version = 'VERSION'
ORDER BY score DESC
LIMIT 8;
"
```
Replace `CACHE_FILENAME`, `SEARCH_QUERY`, and `VERSION` per Step 3. Remove the `AND version = 'VERSION'` line if searching across all versions.
If the user's question could benefit from both DuckDB docs and blog results, run two queries (one with `version = 'stable'`, one with `version = 'blog'`) or omit the version filter entirely.
## Step 6 — Handle errors
- **Extension not installed** (`httpfs` or `fts` not found): run `duckdb :memory: -c "INSTALL httpfs; INSTALL fts;"` and retry.
- **ATTACH fails / network unreachable**: inform the user that the docs index is unavailable and suggest checking their internet connection. The DuckDB index is hosted at `https://duckdb.org/data/docs-search.duckdb` and the DuckLake index at `https://ducklake.select/data/docs-search.duckdb`.
- **No results** (all scores NULL or empty result set): try broadening the query — drop the least specific term, or try a single-word version of the query — then retry Step 5. If still no results, tell the user no matching documentation was found and suggest visiting https://duckdb.org/docs or https://ducklake.select/docs directly.
## Step 7 — Present results
For each result chunk returned (ordered by score descending), format as:
```
### {section} — {page_title}
{url}
{text}
---
```
After presenting all chunks, synthesize a concise answer to the user's original question (`$@`) based on the retrieved documentation. If the chunks directly answer the question, lead with the answer before showing the sources.
Creator's repository · duckdb/duckdb-skills