Files
danbooru-mcp/plans/danbooru-mcp-plan.md
Aodhan Collins 08c6e14616 Initial commit
2026-03-02 23:29:58 +00:00

3.1 KiB

Danbooru MCP Tag Validator — Plan

Overview

A Python MCP server backed by a SQLite database that allows an LLM to search, validate, and get suggestions for valid Danbooru tags. This enables an Illustrious/Stable Diffusion ComfyUI workflow to ensure all generated prompts contain only real Danbooru tags.


Architecture

graph TD
    A[LLM / ComfyUI Workflow] -->|MCP protocol| B[danbooru-mcp Python server]
    B -->|SQL queries| C[SQLite DB: tags.db]
    C -->|FTS5 index| D[fast text search]
    E[all_tags.csv 900k rows] -->|import script one-time| C

Project Structure

danbooru-mcp/
├── data/
│   └── all_tags.csv          # source data (existing, ~900k entries)
├── db/
│   └── tags.db               # generated SQLite DB (gitignored)
├── scripts/
│   └── import_tags.py        # one-time CSV → SQLite importer
├── src/
│   └── server.py             # MCP server entry point
├── pyproject.toml            # deps: mcp[cli] only (sqlite3 is stdlib)
├── .gitignore
└── README.md

Database Design

Table: tags

CREATE TABLE tags (
    id   INTEGER PRIMARY KEY,
    name TEXT    UNIQUE NOT NULL
);

FTS5 Virtual Table: tags_fts

CREATE VIRTUAL TABLE tags_fts USING fts5(
    name,
    content='tags',
    content_rowid='id'
);
  • Allows fast prefix queries: tags_fts MATCH 'blue_*'
  • Exact-match validation uses B-tree index on tags.name UNIQUE

MCP Tools

Tool Input Output Purpose
search_tags query: str, limit: int = 20 list[str] Prefix/substring search for tag discovery
validate_tags tags: list[str] {valid: list, invalid: list} Bulk exact-match validation of a prompt
suggest_tags partial: str, limit: int = 10 list[str] Autocomplete-style correction of near-miss tags

Key Design Decisions

  • SQLite over PostgreSQL — zero operational overhead, file-portable, sufficient for read-only tag lookup
  • FTS5 over LIKELIKE '%query%' on 900k rows is O(n); FTS5 prefix matching is O(log n) via inverted index
  • Exact-match bulk validationWHERE name IN (...) leverages SQLite's B-tree index on the UNIQUE constraint
  • No vector/embedding search — Danbooru tags are well-defined strings; exact + prefix matching is appropriate and zero-dependency

Dependencies

  • Python 3.10+
  • mcp[cli] — official Python MCP SDK
  • sqlite3 — Python stdlib, no install required

MCP Client Config Example

{
  "mcpServers": {
    "danbooru-tags": {
      "command": "python",
      "args": ["/path/to/danbooru-mcp/src/server.py"]
    }
  }
}

Todo List

  • Write pyproject.toml with mcp[cli] dependency
  • Write scripts/import_tags.py — CSV to SQLite with FTS5 index
  • Write src/server.py — MCP server with 3 tools
    • search_tags tool
    • validate_tags tool
    • suggest_tags tool
  • Write .gitignore (exclude db/tags.db)
  • Write README.md with setup and usage instructions