Initial commit

Aodhan Collins
2026-03-02 23:29:58 +00:00
commit 08c6e14616
12 changed files with 2121 additions and 0 deletions

plans/danbooru-mcp-plan.md Normal file

@@ -0,0 +1,115 @@
# Danbooru MCP Tag Validator — Plan
## Overview
A Python MCP server, backed by a SQLite database, that lets an LLM search, validate, and get suggestions for Danbooru tags. This lets an Illustrious/Stable Diffusion ComfyUI workflow ensure that every generated prompt contains only real Danbooru tags.

---
## Architecture
```mermaid
graph TD
A[LLM / ComfyUI Workflow] -->|MCP protocol| B[danbooru-mcp Python server]
B -->|SQL queries| C[SQLite DB: tags.db]
C -->|FTS5 index| D[fast text search]
E[all_tags.csv 900k rows] -->|import script one-time| C
```
---
## Project Structure
```
danbooru-mcp/
├── data/
│   └── all_tags.csv          # source data (existing, ~900k entries)
├── db/
│   └── tags.db               # generated SQLite DB (gitignored)
├── scripts/
│   └── import_tags.py        # one-time CSV → SQLite importer
├── src/
│   └── server.py             # MCP server entry point
├── pyproject.toml            # deps: mcp[cli] only (sqlite3 is stdlib)
├── .gitignore
└── README.md
```
---
## Database Design
### Table: `tags`
```sql
CREATE TABLE tags (
    id   INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
```
### FTS5 Virtual Table: `tags_fts`
```sql
CREATE VIRTUAL TABLE tags_fts USING fts5(
    name,
    content='tags',
    content_rowid='id'
);
```
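- Allows fast prefix queries: `tags_fts MATCH 'blue_*'`. FTS5's default `unicode61` tokenizer treats `_` as a separator, so matching is per word: a prefix query on `blue` also finds a tag like `dark_blue_hair`, if present
- Because `tags_fts` is an external-content table (`content='tags'`), it is not updated automatically; the importer must populate or rebuild the index after loading `tags`
- Exact-match validation uses the B-tree index that SQLite creates automatically for the `UNIQUE` constraint on `tags.name`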
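
A minimal sketch of how the one-time import could populate both tables. It assumes the tag name sits in the first CSV column with no header row (the layout of `all_tags.csv` is not specified here), and `import_tags` is a hypothetical helper name. The demo uses an in-memory database and a three-tag sample instead of the real file, and it needs a SQLite build with FTS5 enabled.

```python
import csv
import io
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS tags (
    id   INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE VIRTUAL TABLE IF NOT EXISTS tags_fts USING fts5(
    name,
    content='tags',
    content_rowid='id'
);
"""


def import_tags(conn: sqlite3.Connection, rows) -> int:
    """Load tag names into `tags`, rebuild the FTS5 index, return the tag count."""
    conn.executescript(SCHEMA)
    # Assumption: the tag name is the first column of each CSV row.
    names = ((row[0].strip(),) for row in rows if row and row[0].strip())
    with conn:  # commit on success, roll back on error
        # OR IGNORE skips duplicates that would violate the UNIQUE constraint.
        conn.executemany("INSERT OR IGNORE INTO tags(name) VALUES (?)", names)
        # External-content FTS5 tables are not kept in sync automatically;
        # the special 'rebuild' command re-indexes from the content table.
        conn.execute("INSERT INTO tags_fts(tags_fts) VALUES ('rebuild')")
    return conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]


# Tiny in-memory stand-in for data/all_tags.csv (note the duplicate row).
sample_csv = io.StringIO("blue_eyes\nblue_hair\nred_eyes\nblue_eyes\n")
conn = sqlite3.connect(":memory:")
count = import_tags(conn, csv.reader(sample_csv))

# Prefix query against the FTS5 index.
matches = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM tags_fts WHERE tags_fts MATCH ? ORDER BY name",
        ("blue*",),
    )
]
print(count, matches)
```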
---
## MCP Tools
| Tool | Input | Output | Purpose |
|------|-------|--------|---------|
| `search_tags` | `query: str`, `limit: int = 20` | `list[str]` | Prefix/substring search for tag discovery |
| `validate_tags` | `tags: list[str]` | `{valid: list, invalid: list}` | Bulk exact-match validation of a prompt |
| `suggest_tags` | `partial: str`, `limit: int = 10` | `list[str]` | Autocomplete-style correction of near-miss tags |
---
## Key Design Decisions
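- **SQLite over PostgreSQL**: zero operational overhead, file-portable, and sufficient for read-only tag lookup
- **FTS5 over LIKE**: `LIKE '%query%'` cannot use an index, so every search scans all ~900k rows (O(n)); an FTS5 prefix query is a lookup in the sorted term index (roughly O(log n) plus the number of matches). FTS5 matches word prefixes only; arbitrary substring search would need something else, such as FTS5's `trigram` tokenizer
- **Exact-match bulk validation**: `WHERE name IN (...)` uses the B-tree index that SQLite creates for the `UNIQUE` constraint
- **No vector/embedding search**: Danbooru tags are well-defined strings, so exact and prefix matching is appropriate and adds no dependencies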
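
The index-versus-scan distinction can be checked with `EXPLAIN QUERY PLAN`. The sketch below builds an empty `tags` table in memory and prints the plan for each query shape; `plan` is a hypothetical helper, and the exact wording of the plan text varies between SQLite versions, though the `IN` lookup is normally reported as a `SEARCH` on the automatic index and the `LIKE` query as a `SCAN`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL)")


def plan(sql: str, params=()) -> str:
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return "; ".join(row[3] for row in rows)


# Exact-match bulk validation: served by the index behind UNIQUE(name).
in_plan = plan(
    "SELECT name FROM tags WHERE name IN (?, ?)", ("blue_eyes", "red_eyes")
)

# Substring match with a leading wildcard: no index lookup is possible.
like_plan = plan("SELECT name FROM tags WHERE name LIKE ?", ("%blue%",))

print(in_plan)
print(like_plan)
```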
---
## Dependencies
- Python 3.10+
- `mcp[cli]` — official Python MCP SDK
- `sqlite3` — Python stdlib, no install required
---
## MCP Client Config Example
```json
{
  "mcpServers": {
    "danbooru-tags": {
      "command": "python",
      "args": ["/path/to/danbooru-mcp/src/server.py"]
    }
  }
}
```
---
## Todo List
- [ ] Write `pyproject.toml` with `mcp[cli]` dependency
- [ ] Write `scripts/import_tags.py` — CSV to SQLite with FTS5 index
- [ ] Write `src/server.py` — MCP server with 3 tools
  - [ ] `search_tags` tool
  - [ ] `validate_tags` tool
  - [ ] `suggest_tags` tool
- [ ] Write `.gitignore` (exclude `db/tags.db`)
- [ ] Write `README.md` with setup and usage instructions