# Danbooru MCP Tag Validator: User Guide

This guide explains how to integrate and use the `danbooru-mcp` server with an LLM to generate valid, high-quality prompts for Illustrious / Stable Diffusion models trained on Danbooru data.

---

## Table of Contents

1. [What is this?](#what-is-this)
2. [Quick Start](#quick-start)
3. [Tool Reference](#tool-reference)
   - [search_tags](#search_tags)
   - [validate_tags](#validate_tags)
   - [suggest_tags](#suggest_tags)
4. [Prompt Engineering Workflow](#prompt-engineering-workflow)
5. [Category Reference](#category-reference)
6. [Best Practices](#best-practices)
7. [Common Scenarios](#common-scenarios)
8. [Troubleshooting](#troubleshooting)

---

## What is this?

Illustrious (and similar Danbooru-trained Stable Diffusion models) uses **Danbooru tags** as its prompt language. Tags like `1girl`, `blue_hair`, `looking_at_viewer` are meaningful because the model was trained on images annotated with them.

The problem: there are hundreds of thousands of valid Danbooru tags, and misspelled or invented tags produce no useful signal, so the model generates less accurate images.

**This MCP server** lets an LLM:

- **Search** the full tag database for tag discovery
- **Validate** a proposed prompt's tags against the real Danbooru database
- **Suggest** corrections for typos or near-miss tags

The database contains **292,500 tags**, all with ≥10 posts on Danbooru, which filters out one-off or misspelled entries.

---

## Quick Start

### 1. Add to your MCP client (Claude Desktop example)

**Using Docker (recommended):**

```json
{
  "mcpServers": {
    "danbooru-tags": {
      "command": "docker",
      "args": ["run", "--rm", "-i", "danbooru-mcp:latest"]
    }
  }
}
```

**Using Python directly:**

```json
{
  "mcpServers": {
    "danbooru-tags": {
      "command": "/path/to/danbooru-mcp/.venv/bin/python",
      "args": ["/path/to/danbooru-mcp/src/server.py"]
    }
  }
}
```

### 2. Instruct the LLM

Add a system prompt telling the LLM to use the server:

```
You have access to the danbooru-tags MCP server for validating Stable Diffusion prompts.
Before generating any final prompt:
1. Use validate_tags to check all proposed tags are real Danbooru tags.
2. Use suggest_tags to fix any invalid tags.
3. Only output the validated, corrected tag list.
```

---

## Tool Reference

### `search_tags`

Find tags by name using full-text / prefix search.

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `query` | `string` | *required* | Search string. A trailing `*` is added automatically for prefix matching. Supports FTS5 syntax. |
| `limit` | `integer` | `20` | Max results (1–200) |
| `category` | `string` | `null` | Optional filter: `"general"`, `"artist"`, `"copyright"`, `"character"`, `"meta"` |

**Returns:** List of tag objects:

```json
[
  {
    "name": "blue_hair",
    "post_count": 1079925,
    "category": "general",
    "is_deprecated": false
  }
]
```

**Examples:**

```
Search for hair colour tags:
  search_tags("blue_hair")
  → blue_hair, blue_hairband, blue_hair-chan_(ramchi), …

Search only character tags for a Vocaloid:
  search_tags("hatsune", category="character")
  → hatsune_miku, hatsune_mikuo, hatsune_miku_(append), …

Boolean search:
  search_tags("hair AND blue")
  → tags matching both "hair" and "blue"
```

**FTS5 query syntax:**

| Syntax | Meaning |
|--------|---------|
| `blue_ha*` | prefix match (added automatically) |
| `"blue hair"` | phrase match |
| `hair AND blue` | both terms present |
| `hair NOT red` | exclusion |

---

### `validate_tags`

Check a list of tags against the full Danbooru database. Returns three groups: valid, deprecated, and invalid.

**Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `tags` | `list[string]` | Tags to validate, e.g. `["1girl", "blue_hair", "sword"]` |

**Returns:**

```json
{
  "valid": ["1girl", "blue_hair", "sword"],
  "deprecated": [],
  "invalid": ["blue_hairs", "not_a_real_tag"]
}
```

| Key | Meaning |
|-----|---------|
| `valid` | Exists in Danbooru and is not deprecated; safe to use |
| `deprecated` | Exists but has been deprecated (an updated canonical tag exists) |
| `invalid` | Not found; likely misspelled, hallucinated, or too niche (<10 posts) |

**Important:** Always run `validate_tags` before finalising a prompt. Invalid tags carry no trained meaning for the model; they waste token budget and reduce prompt clarity.

---

### `suggest_tags`

Autocomplete-style suggestions for a partial or approximate tag. Results are sorted by post count (most commonly used first). Deprecated tags are **excluded**.

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `partial` | `string` | *required* | Partial tag or rough approximation |
| `limit` | `integer` | `10` | Max suggestions (1–50) |
| `category` | `string` | `null` | Optional category filter |

**Returns:** Same format as `search_tags`, sorted by `post_count` descending.

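
Because `validate_tags` reports which tags need attention and `suggest_tags` returns candidates ordered by `post_count`, the two calls chain naturally. The sketch below is one possible way to wire them together, not the server's own code. `repair_tags`, `fake_validate`, and `fake_suggest` are hypothetical names; the fakes stand in for however your MCP client actually invokes the tools, and their data is illustrative only. Taking the top suggestion is a simplification: in practice the LLM should pick the candidate that best matches the user's intent.

```python
from typing import Callable


def repair_tags(
    tags: list[str],
    validate: Callable[[list[str]], dict],
    suggest: Callable[[str], list[dict]],
) -> tuple[list[str], list[str]]:
    """Return (final_tags, unresolved) after one validate/suggest pass.

    `validate` must return a dict shaped like the validate_tags response;
    `suggest` must return tag objects sorted by post_count, descending.
    """
    report = validate(tags)
    needs_fix = set(report["invalid"]) | set(report["deprecated"])

    final: list[str] = []
    unresolved: list[str] = []
    for tag in tags:
        if tag in needs_fix:
            candidates = suggest(tag)
            # Results are sorted by post_count, so the first entry is the
            # most widely used candidate (assumed "best" in this sketch).
            replacement = candidates[0]["name"] if candidates else None
        else:
            replacement = tag

        if replacement is None:
            unresolved.append(tag)
        elif replacement not in final:  # avoid duplicate tags in the prompt
            final.append(replacement)
    return final, unresolved


# --- Illustrative in-memory stand-ins for the MCP tools (hypothetical) ---
KNOWN = {"1girl", "long_hair", "kimono"}
SUGGESTIONS = {
    "silver_hair": [{"name": "white_hair", "post_count": 800000}],
    "japanese_garden": [
        {"name": "garden", "post_count": 45000},
        {"name": "japanese_clothes", "post_count": 12000},
    ],
}


def fake_validate(tags: list[str]) -> dict:
    return {
        "valid": [t for t in tags if t in KNOWN],
        "deprecated": [],
        "invalid": [t for t in tags if t not in KNOWN],
    }


def fake_suggest(partial: str) -> list[dict]:
    return SUGGESTIONS.get(partial, [])


final, unresolved = repair_tags(
    ["1girl", "silver_hair", "long_hair", "kimono",
     "japanese_garden", "not_a_real_tag"],
    fake_validate,
    fake_suggest,
)
print(final)       # ['1girl', 'white_hair', 'long_hair', 'kimono', 'garden']
print(unresolved)  # ['not_a_real_tag']
```

Tags that end up in `unresolved` have no suggestion at all; dropping them or retrying with a broader `search_tags` query are both reasonable follow-ups.
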

**Examples:**

```
Fix a typo:
  suggest_tags("looking_at_vewer")
  → ["looking_at_viewer", …]

Find the most popular sword-related tags:
  suggest_tags("sword", limit=5, category="general")
  → sword (337,737), sword_behind_back (7,203), …

Find character tags for a partial name:
  suggest_tags("miku", category="character")
  → hatsune_miku (129,806), yuki_miku (4,754), …
```

---

## Prompt Engineering Workflow

This is the recommended workflow for an LLM building Illustrious prompts:

### Step 1: Draft

The LLM drafts an initial list of conceptual tags based on the user's description:

```
User: "A girl with long silver hair wearing a kimono in a Japanese garden"

Draft tags:
  1girl, silver_hair, long_hair, kimono, japanese_garden, cherry_blossoms,
  sitting, looking_at_viewer, outdoors, traditional_clothes
```

### Step 2: Validate

```
validate_tags([
  "1girl", "silver_hair", "long_hair", "kimono", "japanese_garden",
  "cherry_blossoms", "sitting", "looking_at_viewer", "outdoors",
  "traditional_clothes"
])
```

Response:

```json
{
  "valid": ["1girl", "long_hair", "kimono", "cherry_blossoms", "sitting",
            "looking_at_viewer", "outdoors", "traditional_clothes"],
  "deprecated": [],
  "invalid": ["silver_hair", "japanese_garden"]
}
```

### Step 3: Fix invalid tags

```
suggest_tags("silver_hair", limit=3)
→ [{"name": "white_hair", "post_count": 800000}, ...]

suggest_tags("japanese_garden", limit=3)
→ [{"name": "garden", "post_count": 45000},
   {"name": "japanese_clothes", "post_count": 12000}, ...]
```

### Step 4: Finalise

```
Final prompt:
  1girl, white_hair, long_hair, kimono, garden, cherry_blossoms,
  sitting, looking_at_viewer, outdoors, traditional_clothes
```

All tags are validated. The prompt is ready to send to ComfyUI.

---

## Category Reference

Danbooru organises tags into five categories. Understanding them helps scope searches:

| Category | Value | Description | Examples |
|----------|-------|-------------|----------|
| **general** | `0` | Descriptive tags for image content | `1girl`, `blue_hair`, `sword`, `outdoors` |
| **artist** | `1` | Artist/creator names | `wlop`, `natsuki_subaru` |
| **copyright** | `3` | Source material / franchise | `fate/stay_night`, `touhou`, `genshin_impact` |
| **character** | `4` | Specific character names | `hatsune_miku`, `hakurei_reimu` |
| **meta** | `5` | Image quality / format tags | `highres`, `absurdres`, `commentary` |

**Tips:**

- For generating images, focus on **general** tags (colours, poses, clothing, expressions)
- Add **character** and **copyright** tags when depicting a specific character
- **meta** tags like `highres` and `best_quality` can improve output quality
- Avoid **artist** tags unless intentionally mimicking a specific art style

---

## Best Practices

### ✅ Always validate before generating

```python
# Always run this before finalising
result = validate_tags(your_proposed_tags)
# Fix everything in result["invalid"] before sending to ComfyUI
```

### ✅ Use suggest_tags for discoverability

Even for tags you think you know, run `suggest_tags` to find the canonical form:

- `standing` vs `standing_on_one_leg` vs `standing_split`
- `smile` vs `small_smile` vs `evil_smile`

The tag with the highest `post_count` is almost always the right one for your intent.

### ✅ Prefer high-post-count tags

Higher post count = more training data = more consistent model response.

```python
# Get the top 5 most established hair colour tags
suggest_tags("hair_color", limit=5, category="general")
```

### ✅ Layer specificity

Good prompts move from general to specific:

```
# General → Specific
1girl,                        # subject count
solo,                         # composition
long_hair, blue_hair,         # hair
white_dress, off_shoulder,    # clothing
smile, looking_at_viewer,     # expression/pose
outdoors, garden, daytime,    # setting
masterpiece, best_quality     # quality
```

### ❌ Avoid deprecated tags

If `validate_tags` reports a tag as `deprecated`, use `suggest_tags` to find the current replacement:

```python
# If "nude" is deprecated, find the current tag:
suggest_tags("nude", category="general")
```

### ❌ Don't invent tags

The model doesn't understand arbitrary natural language in prompts, only tags it was trained on. `beautiful_landscape` is not a Danbooru tag; `scenery` and `landscape` are.

---

## Common Scenarios

### Scenario: Character in a specific pose

```
# 1. Search for pose tags
search_tags("sitting", category="general", limit=10)
→ sitting, sitting_on_ground, kneeling, seiza, wariza, …

# 2. Validate the full tag set
validate_tags(["1girl", "hatsune_miku", "sitting", "looking_at_viewer", "smile"])
```

### Scenario: Specific franchise or character

```
# Find copyright tags for a franchise
search_tags("genshin", category="copyright", limit=5)
→ genshin_impact, …

# Find character from that franchise
search_tags("hu_tao", category="character", limit=3)
→ hu_tao_(genshin_impact), …
```

### Scenario: Quality boosting tags

```
# Find commonly used meta/quality tags
search_tags("quality", category="meta", limit=5)
→ best_quality, high_quality, …

search_tags("res", category="meta", limit=5)
→ highres, absurdres, ultra-high_res, …
```

### Scenario: Unknown misspelling

```
# You typed "haor" instead of "hair"
suggest_tags("haor", limit=5)
→ [] (no prefix match)

# Try a broader search
search_tags("long hair")
→ long_hair, long_hair_between_eyes, wavy_hair, …
```

---

## Troubleshooting

### "invalid" tags that should be valid

The database contains only tags with **≥10 posts**. Tags with fewer posts are intentionally excluded as they are likely misspellings, very niche, or one-off annotations.

If a tag you expect to be valid shows as invalid:

1. Try `suggest_tags` to find a close variant
2. Use `search_tags` to explore the tag space
3. The tag may genuinely have <10 posts; use a broader synonym instead

### Server not responding

Check that the MCP server is running and that the `db/tags.db` file exists:

```bash
# Local
python src/server.py

# Docker
docker run --rm -i danbooru-mcp:latest
```

Environment variable override:

```bash
DANBOORU_TAGS_DB=/custom/path/tags.db python src/server.py
```

### Database needs rebuilding / updating

Re-run the scraper (it's resumable):

```bash
# Refresh all tags
python scripts/scrape_tags.py --no-resume

# Update changed tags only (re-scrapes from scratch, stops at ≥10 posts boundary)
python scripts/scrape_tags.py
```

Then rebuild the Docker image:

```bash
docker build -f Dockerfile.prebuilt -t danbooru-mcp:latest .
```
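
If the server still fails to start after a rebuild, checking the database file directly can narrow the problem down. The following is a minimal sketch, not part of the project: it assumes `tags.db` is a SQLite file (suggested by the FTS5 support described above) and that `db/tags.db` is resolved relative to the current working directory. `resolve_db_path` and `list_tables` are hypothetical helper names. The check only lists table names, so it makes no assumption about the schema.

```python
import os
import sqlite3
from pathlib import Path

# Assumption: the default mirrors the `db/tags.db` path mentioned in this guide.
DEFAULT_DB = Path("db/tags.db")


def resolve_db_path(env=None) -> Path:
    """Pick the database path, honouring the DANBOORU_TAGS_DB override."""
    env = os.environ if env is None else env
    return Path(env.get("DANBOORU_TAGS_DB", DEFAULT_DB))


def list_tables(db_path: Path) -> list[str]:
    """Return the table names in the database, opened read-only."""
    if not db_path.is_file():
        raise FileNotFoundError(f"tag database not found: {db_path}")
    # mode=ro guarantees this check can never modify the database.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    path = resolve_db_path()
    try:
        print(f"{path}: tables = {list_tables(path)}")
    except (FileNotFoundError, sqlite3.DatabaseError) as exc:
        print(f"database check failed: {exc}")
```

A missing file points to a path or volume-mount problem, while a `DatabaseError` suggests an incomplete or corrupted build, in which case re-running the scraper is the likely fix.
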