116 lines
3.1 KiB
Markdown
116 lines
3.1 KiB
Markdown
# Danbooru MCP Tag Validator — Plan
|
|
|
|
## Overview
|
|
|
|
A Python MCP server backed by a SQLite database that allows an LLM to search, validate, and get suggestions for valid Danbooru tags. This enables an Illustrious/Stable Diffusion ComfyUI workflow to ensure all generated prompts contain only real Danbooru tags.
|
|
|
|
---
|
|
|
|
## Architecture
|
|
|
|
```mermaid
|
|
graph TD
|
|
A[LLM / ComfyUI Workflow] -->|MCP protocol| B[danbooru-mcp Python server]
|
|
B -->|SQL queries| C[SQLite DB: tags.db]
|
|
C -->|FTS5 index| D[fast text search]
|
|
E[all_tags.csv 900k rows] -->|import script one-time| C
|
|
```
|
|
|
|
---
|
|
|
|
## Project Structure
|
|
|
|
```
|
|
danbooru-mcp/
|
|
├── data/
|
|
│ └── all_tags.csv # source data (existing, ~900k entries)
|
|
├── db/
|
|
│ └── tags.db # generated SQLite DB (gitignored)
|
|
├── scripts/
|
|
│ └── import_tags.py # one-time CSV → SQLite importer
|
|
├── src/
|
|
│ └── server.py # MCP server entry point
|
|
├── pyproject.toml # deps: mcp[cli] only (sqlite3 is stdlib)
|
|
├── .gitignore
|
|
└── README.md
|
|
```
|
|
|
|
---
|
|
|
|
## Database Design
|
|
|
|
### Table: `tags`
|
|
```sql
|
|
CREATE TABLE tags (
|
|
id INTEGER PRIMARY KEY,
|
|
name TEXT UNIQUE NOT NULL
|
|
);
|
|
```
|
|
|
|
### FTS5 Virtual Table: `tags_fts`
|
|
```sql
|
|
CREATE VIRTUAL TABLE tags_fts USING fts5(
|
|
name,
|
|
content='tags',
|
|
content_rowid='id'
|
|
);
|
|
```
|
|
|
|
- Allows fast prefix queries: `tags_fts MATCH 'blue_*'`
|
|
- Exact-match validation uses B-tree index on `tags.name UNIQUE`
|
|
|
|
---
|
|
|
|
## MCP Tools
|
|
|
|
| Tool | Input | Output | Purpose |
|
|
|------|-------|--------|---------|
|
|
| `search_tags` | `query: str`, `limit: int = 20` | `list[str]` | Prefix/substring search for tag discovery |
|
|
| `validate_tags` | `tags: list[str]` | `{valid: list, invalid: list}` | Bulk exact-match validation of a prompt |
|
|
| `suggest_tags` | `partial: str`, `limit: int = 10` | `list[str]` | Autocomplete-style correction of near-miss tags |
|
|
|
|
---
|
|
|
|
## Key Design Decisions
|
|
|
|
- **SQLite over PostgreSQL** — zero operational overhead, file-portable, sufficient for read-only tag lookup
|
|
- **FTS5 over LIKE** — `LIKE '%query%'` on 900k rows is O(n); FTS5 prefix matching is O(log n) via inverted index
|
|
- **Exact-match bulk validation** — `WHERE name IN (...)` leverages SQLite's B-tree index on the UNIQUE constraint
|
|
- **No vector/embedding search** — Danbooru tags are well-defined strings; exact + prefix matching is appropriate and zero-dependency
|
|
|
|
---
|
|
|
|
## Dependencies
|
|
|
|
- Python 3.10+
|
|
- `mcp[cli]` — official Python MCP SDK
|
|
- `sqlite3` — Python stdlib, no install required
|
|
|
|
---
|
|
|
|
## MCP Client Config Example
|
|
|
|
```json
|
|
{
|
|
"mcpServers": {
|
|
"danbooru-tags": {
|
|
"command": "python",
|
|
"args": ["/path/to/danbooru-mcp/src/server.py"]
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
---
|
|
|
|
## Todo List
|
|
|
|
- [ ] Write `pyproject.toml` with `mcp[cli]` dependency
|
|
- [ ] Write `scripts/import_tags.py` — CSV to SQLite with FTS5 index
|
|
- [ ] Write `src/server.py` — MCP server with 3 tools
|
|
- [ ] `search_tags` tool
|
|
- [ ] `validate_tags` tool
|
|
- [ ] `suggest_tags` tool
|
|
- [ ] Write `.gitignore` (exclude `db/tags.db`)
|
|
- [ ] Write `README.md` with setup and usage instructions
|