# P8: homeai-images — Image Generation

> Phase 6 | Depends on: P4 (OpenClaw skill runner) | Independent of P6, P7

---

## Goal

ComfyUI running natively on the Mac Mini with SDXL and Flux.1 models. A character LoRA trained for consistent appearance. An OpenClaw skill exposes image generation as a callable tool. Saved workflows cover the most common use cases.

---

## Why Native (not Docker)

Same reasoning as Ollama: ComfyUI needs Metal GPU acceleration, and Docker on Mac can't access the GPU. ComfyUI therefore runs natively as a launchd service.

---

## Installation

```bash
# Clone ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI ~/ComfyUI
cd ~/ComfyUI

# Install dependencies (Python 3.11+, venv recommended)
python3 -m venv venv
source venv/bin/activate
pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
pip install -r requirements.txt

# Launch
python main.py --listen 0.0.0.0 --port 8188
```

**Note:** PyTorch's MPS backend provides GPU acceleration on Apple Silicon:

```python
# ComfyUI auto-detects MPS — no extra config needed
# Verify by checking ComfyUI startup logs for "Using device: mps"
```

### launchd plist — `com.homeai.comfyui.plist`

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.homeai.comfyui</string>
  <key>ProgramArguments</key>
  <array>
    <string>/Users/USERNAME/ComfyUI/venv/bin/python</string>
    <string>/Users/USERNAME/ComfyUI/main.py</string>
    <string>--listen</string>
    <string>0.0.0.0</string>
    <string>--port</string>
    <string>8188</string>
  </array>
  <key>WorkingDirectory</key>
  <string>/Users/USERNAME/ComfyUI</string>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
  <key>StandardOutPath</key>
  <string>/tmp/comfyui.log</string>
  <key>StandardErrorPath</key>
  <string>/tmp/comfyui.err</string>
</dict>
</plist>
```

Replace `USERNAME` with your macOS username.

---

## Model Downloads

### Model Manifest

`~/ComfyUI/models/` structure:

```
checkpoints/
├── sd_xl_base_1.0.safetensors        # SDXL base
├── flux1-dev.safetensors             # Flux.1-dev (high quality)
└── flux1-schnell.safetensors         # Flux.1-schnell (fast drafts)
vae/
├── sdxl_vae.safetensors
└── ae.safetensors                    # Flux VAE
clip/
├── clip_l.safetensors
└── t5xxl_fp16.safetensors            # Flux text encoder
controlnet/
├── controlnet-canny-sdxl.safetensors
└── controlnet-depth-sdxl.safetensors
loras/
└── aria-v1.safetensors               # Character LoRA (trained locally)
```

### Download Script — `scripts/download-models.sh`

```bash
#!/usr/bin/env bash
set -euo pipefail

MODELS_DIR=~/ComfyUI/models

# HuggingFace downloads via the huggingface_hub Python package
pip install huggingface_hub

MODELS_DIR="$MODELS_DIR" python3 - <<'EOF'
import os
from huggingface_hub import hf_hub_download

models_dir = os.environ["MODELS_DIR"]
downloads = [
    ('stabilityai/stable-diffusion-xl-base-1.0', 'sd_xl_base_1.0.safetensors', 'checkpoints'),
    ('black-forest-labs/FLUX.1-schnell', 'flux1-schnell.safetensors', 'checkpoints'),
]
for repo, filename, subdir in downloads:
    hf_hub_download(repo_id=repo, filename=filename, local_dir=f'{models_dir}/{subdir}')
EOF
```

> Flux.1-dev requires accepting the HuggingFace license agreement. Download it manually if the script fails.

---

## Saved Workflows

All workflows are stored as ComfyUI JSON in `homeai-images/workflows/`.

### `portrait.json` — Character Portrait

Standard character portrait with expression control. Key nodes:

- **CheckpointLoader:** SDXL base
- **LoraLoader:** aria character LoRA
- **CLIPTextEncode:** positive prompt includes character description + expression
- **KSampler:** 25 steps, DPM++ 2M, CFG 7
- **VAEDecode → SaveImage**

Positive prompt template:

```
aria, (character lora), 1girl, solo, portrait, looking at viewer, soft lighting, detailed face, high quality, masterpiece,
```

### `scene.json` — Character in Scene with ControlNet

Uses ControlNet depth/canny for pose control. Key nodes:

- **LoadImage:** input pose reference image
- **ControlNetLoader:** canny or depth model
- **ControlNetApply:** apply to conditioning
- **KSampler** with ControlNet guidance

### `quick.json` — Fast Draft via Flux.1-schnell

Low-step, fast generation for quick previews. Key nodes:

- **CheckpointLoader:** flux1-schnell
- **KSampler:** 4 steps, Euler, CFG 1 (Flux uses CFG=1)
- Output: 512×512 or 768×768

### `upscale.json` — 2× Upscale

Takes an existing image and upscales it 2× with detail enhancement.
Key nodes:

- **LoadImage**
- **UpscaleModelLoader:** `4x_NMKD-Siax_200k.pth` (download separately)
- **ImageUpscaleWithModel**
- **KSampler img2img** for detail pass

---

## `comfyui.py` Skill — OpenClaw Integration

Full implementation (replaces the stub from P4). File: `homeai-images/skills/comfyui.py`

```python
"""
ComfyUI image generation skill for OpenClaw.
Submits workflow JSON via the ComfyUI REST API and returns the generated image path.
"""
import json
import time
import uuid
from pathlib import Path

import requests

COMFYUI_URL = "http://localhost:8188"
WORKFLOWS_DIR = Path(__file__).parent.parent / "workflows"
OUTPUT_DIR = Path.home() / "ComfyUI" / "output"


def generate(workflow_name: str, params: dict | None = None) -> str:
    """
    Submit a named workflow to ComfyUI. Returns the path of the generated image.

    Args:
        workflow_name: Name of workflow JSON (without .json extension)
        params: Dict of node overrides, e.g. {"positive_prompt": "..."}

    Returns:
        Absolute path to generated image file
    """
    workflow_path = WORKFLOWS_DIR / f"{workflow_name}.json"
    if not workflow_path.exists():
        raise ValueError(f"Workflow '{workflow_name}' not found at {workflow_path}")

    workflow = json.loads(workflow_path.read_text())

    # Apply param overrides
    if params:
        workflow = _apply_params(workflow, params)

    # Submit to ComfyUI queue
    client_id = str(uuid.uuid4())
    prompt_id = _queue_prompt(workflow, client_id)

    # Poll for completion
    image_path = _wait_for_output(prompt_id)
    return str(image_path)


def _queue_prompt(workflow: dict, client_id: str) -> str:
    resp = requests.post(
        f"{COMFYUI_URL}/prompt",
        json={"prompt": workflow, "client_id": client_id},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["prompt_id"]


def _wait_for_output(prompt_id: str, timeout: int = 120) -> Path:
    start = time.time()
    while time.time() - start < timeout:
        resp = requests.get(f"{COMFYUI_URL}/history/{prompt_id}", timeout=10)
        history = resp.json()
        if prompt_id in history:
            outputs = history[prompt_id]["outputs"]
            for node_output in outputs.values():
                if "images" in node_output:
                    img = node_output["images"][0]
                    return OUTPUT_DIR / img["subfolder"] / img["filename"]
        time.sleep(2)
    raise TimeoutError(f"ComfyUI generation timed out after {timeout}s")


def _apply_params(workflow: dict, params: dict) -> dict:
    """
    Apply parameter overrides to workflow nodes.
    Expects workflow nodes (ComfyUI API format) to carry a '_meta.title' field
    for addressing, e.g. params={"positive_prompt": "new prompt"} updates the
    node titled "positive_prompt".
    """
    for node in workflow.values():
        title = node.get("_meta", {}).get("title", "")
        if title in params:
            inputs = node["inputs"]
            # CLIPTextEncode nodes take a 'text' input; LoadImage nodes
            # (e.g. the ControlNet reference) take 'image' instead
            key = "image" if "image" in inputs and "text" not in inputs else "text"
            inputs[key] = params[title]
    return workflow


# Convenience wrappers for OpenClaw

def portrait(expression: str = "neutral", extra_prompt: str = "") -> str:
    return generate("portrait", {"positive_prompt": f"aria, {expression}, {extra_prompt}"})


def quick(prompt: str) -> str:
    return generate("quick", {"positive_prompt": prompt})


def scene(prompt: str, controlnet_image_path: str | None = None) -> str:
    params = {"positive_prompt": prompt}
    if controlnet_image_path:
        params["controlnet_image"] = controlnet_image_path
    return generate("scene", params)
```

---

## Character LoRA Training

A LoRA trains the model to consistently generate the character's appearance.

### Dataset Preparation

1. Collect 20–50 reference images of the character (or commission a character sheet)
2. Consistent style, multiple angles/expressions
3. Resize to 1024×1024, square crop
4. Write captions: `aria, 1girl, solo, `
5.
Store in `~/lora-training/aria/`

### Training

Use **kohya_ss** or **SimpleTuner** for LoRA training on Apple Silicon:

```bash
# kohya_ss (SDXL LoRA) — clone with submodules to pull in sd-scripts
git clone --recursive https://github.com/bmaltais/kohya_ss
cd kohya_ss
pip install -r requirements.txt

# Training config — key params for MPS (note: SDXL uses sdxl_train_network.py,
# not the SD1.5 train_network.py)
python sd-scripts/sdxl_train_network.py \
  --pretrained_model_name_or_path="$HOME/ComfyUI/models/checkpoints/sd_xl_base_1.0.safetensors" \
  --train_data_dir="$HOME/lora-training/aria" \
  --output_dir="$HOME/ComfyUI/models/loras" \
  --output_name=aria-v1 \
  --network_module=networks.lora \
  --network_dim=32 \
  --network_alpha=16 \
  --max_train_epochs=10 \
  --learning_rate=1e-4
```

> Training on M4 Pro via MPS: expect 1–4 hours for a 20-image dataset at 10 epochs.

---

## Directory Layout

```
homeai-images/
├── workflows/
│   ├── portrait.json
│   ├── scene.json
│   ├── quick.json
│   └── upscale.json
└── skills/
    └── comfyui.py
```

---

## Interface Contracts

**Consumes:**

- ComfyUI REST API: `http://localhost:8188`
- Workflows from `homeai-images/workflows/`
- Character LoRA from `~/ComfyUI/models/loras/aria-v1.safetensors`

**Exposes:**

- `comfyui.generate(workflow, params)` → image path — called by P4 OpenClaw

**Add to `.env.services`:**

```dotenv
COMFYUI_URL=http://localhost:8188
```

---

## Implementation Steps

- [ ] Clone ComfyUI to `~/ComfyUI/`, install deps in venv
- [ ] Verify MPS is detected at launch (`Using device: mps` in logs)
- [ ] Write and load launchd plist
- [ ] Download SDXL base model via `scripts/download-models.sh`
- [ ] Download Flux.1-schnell
- [ ] Test basic generation via the ComfyUI web UI (browse to port 8188)
- [ ] Build and save `quick.json` workflow in the ComfyUI UI, export JSON
- [ ] Build and save `portrait.json` workflow, export JSON
- [ ] Build and save `scene.json` workflow with ControlNet, export JSON
- [ ] Write `skills/comfyui.py` full implementation
- [ ] Test skill: `comfyui.quick("a cat sitting on a couch")` → image file
- [ ] Collect character reference images for LoRA training
- [ ] Train SDXL LoRA with kohya_ss
- [ ] Load LoRA in `portrait.json` workflow, verify character consistency
- [ ] Symlink `skills/` to `~/.openclaw/skills/`
- [ ] Test via OpenClaw: "Generate a portrait of Aria looking happy"

---

## Success Criteria

- [ ] ComfyUI UI accessible at `http://localhost:8188` after reboot
- [ ] `quick.json` workflow generates an image in <30s on M4 Pro
- [ ] `portrait.json` with character LoRA produces a consistent character appearance
- [ ] `comfyui.generate("quick", {"positive_prompt": "test"})` returns a valid image path
- [ ] Generated images are saved to `~/ComfyUI/output/`
- [ ] ComfyUI survives Mac Mini reboot via launchd
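---

## Appendix: Testing Parameter Overrides Offline

The title-based override used by the skill's `_apply_params` can be exercised standalone, without a running ComfyUI instance. This is a minimal sketch: the node ids, titles, and prompt strings below are hypothetical examples mirroring the shape of a ComfyUI API-format workflow export, not contents of the actual saved workflows.

```python
# Standalone sketch of the title-based parameter override from skills/comfyui.py.
# Node ids and titles are hypothetical; real ones come from the exported workflow JSON.

def apply_params(workflow: dict, params: dict) -> dict:
    """Set the 'text' input of any node whose _meta.title matches a param key."""
    for node in workflow.values():
        title = node.get("_meta", {}).get("title", "")
        if title in params:
            node["inputs"]["text"] = params[title]
    return workflow

# Minimal workflow in ComfyUI API (prompt) format: node id -> node definition
workflow = {
    "6": {
        "class_type": "CLIPTextEncode",
        "inputs": {"text": "aria, neutral", "clip": ["4", 1]},
        "_meta": {"title": "positive_prompt"},
    },
    "7": {
        "class_type": "CLIPTextEncode",
        "inputs": {"text": "blurry, low quality", "clip": ["4", 1]},
        "_meta": {"title": "negative_prompt"},
    },
}

updated = apply_params(workflow, {"positive_prompt": "aria, smiling, garden"})
print(updated["6"]["inputs"]["text"])  # aria, smiling, garden
print(updated["7"]["inputs"]["text"])  # blurry, low quality (untouched)
```

Because the override addresses nodes by their `_meta.title`, renaming nodes in the ComfyUI editor breaks the skill's params; keep titles like `positive_prompt` stable when re-exporting workflow JSON.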