feat: character system v2 — schema upgrade, memory system, per-character TTS routing

Character schema v2: background, dialogue_style, appearance, skills, gaze_presets
with automatic v1→v2 migration. LLM-assisted character creation via Character MCP
server. Two-tier memory system (personal per-character + general shared) with
budget-based injection into LLM system prompt. Per-character TTS voice routing via
state file — Wyoming TTS server reads active config to route between Kokoro (local)
and ElevenLabs (cloud PCM 24kHz). Dashboard: memories page, conversation history,
character profile on cards, auto-TTS engine selection from character config.
Also includes VTube Studio expression bridge and ComfyUI API guide.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author: Aodhan Collins
Date: 2026-03-17 19:15:46 +00:00
Commit: 60eb89ea42 (parent 1e52c002c2)
39 changed files with 3846 additions and 409 deletions


@@ -9,6 +9,7 @@ OPENAI_API_KEY=
 DEEPSEEK_API_KEY=
 GEMINI_API_KEY=
 ELEVENLABS_API_KEY=
+GAZE_API_KEY=

 # ─── Data & Paths ──────────────────────────────────────────────────────────────
 DATA_DIR=${HOME}/homeai-data

@@ -40,10 +41,14 @@ OPEN_WEBUI_URL=http://localhost:3030
 OLLAMA_PRIMARY_MODEL=llama3.3:70b
 OLLAMA_FAST_MODEL=qwen2.5:7b
+# Medium model kept warm for voice pipeline (override per persona)
+# Used by preload-models.sh keep-warm daemon
+HOMEAI_MEDIUM_MODEL=qwen3.5:35b-a3b

 # ─── P3: Voice ─────────────────────────────────────────────────────────────────
 WYOMING_STT_URL=tcp://localhost:10300
 WYOMING_TTS_URL=tcp://localhost:10301
-ELEVENLABS_API_KEY= # Create at elevenlabs.io if using elevenlabs TTS engine
+# ELEVENLABS_API_KEY is set above in API Keys section

 # ─── P4: Agent ─────────────────────────────────────────────────────────────────
 OPENCLAW_URL=http://localhost:8080


@@ -26,6 +26,7 @@ All AI inference runs locally on this machine. No cloud dependency required (clo
 ### AI & LLM
 - **Ollama** — local LLM runtime (target models: Llama 3.3 70B, Qwen 2.5 72B)
+- **Model keep-warm daemon** — `preload-models.sh` runs as a loop, checks every 5 min, re-pins evicted models with `keep_alive=-1`. Keeps `qwen2.5:7b` (small/fast) and `$HOMEAI_MEDIUM_MODEL` (default: `qwen3.5:35b-a3b`) always loaded in VRAM. Medium model is configurable via env var for per-persona model assignment.
 - **Open WebUI** — browser-based chat interface, runs as Docker container

 ### Image Generation

@@ -35,7 +36,8 @@ All AI inference runs locally on this machine. No cloud dependency required (clo
 ### Speech
 - **Whisper.cpp** — speech-to-text, optimised for Apple Silicon/Neural Engine
-- **Kokoro TTS** — fast, lightweight text-to-speech (primary, low-latency)
+- **Kokoro TTS** — fast, lightweight text-to-speech (primary, low-latency, local)
+- **ElevenLabs TTS** — cloud voice cloning/synthesis (per-character voice ID, routed via state file)
 - **Chatterbox TTS** — voice cloning engine (Apple Silicon MPS optimised)
 - **Qwen3-TTS** — alternative voice cloning via MLX
 - **openWakeWord** — always-on wake word detection

@@ -49,11 +51,13 @@ All AI inference runs locally on this machine. No cloud dependency required (clo
 ### AI Agent / Orchestration
 - **OpenClaw** — primary AI agent layer; receives voice commands, calls tools, manages personality
 - **n8n** — visual workflow automation (Docker), chains AI actions
-- **mem0** — long-term memory layer for the AI character
+- **Character Memory System** — two-tier JSON-based memories (personal per-character + general shared), injected into LLM system prompt with budget truncation

 ### Character & Personality
-- **Character Manager** (built — see `character-manager.jsx`) — single config UI for personality, prompts, models, Live2D mappings, and notes
-- Character config exports to JSON, consumed by OpenClaw system prompt and pipeline
+- **Character Schema v2** — JSON spec with background, dialogue_style, appearance, skills, gaze_presets (v1 auto-migrated)
+- **HomeAI Dashboard** — unified web app: character editor, chat, memory manager, service dashboard
+- **Character MCP Server** — LLM-assisted character creation via Fandom wiki/Wikipedia lookup (Docker)
+- Character config stored as JSON files in `~/homeai-data/characters/`, consumed by bridge for system prompt construction

 ### Visual Representation
 - **VTube Studio** — Live2D model display on desktop (macOS) and mobile (iOS/Android)
@@ -85,47 +89,79 @@ All AI inference runs locally on this machine. No cloud dependency required (clo
 ESP32-S3-BOX-3 (room)
 → Wake word detected (openWakeWord, runs locally on device or Mac Mini)
 → Audio streamed to Mac Mini via Wyoming Satellite
-→ Whisper.cpp transcribes speech to text
-→ OpenClaw receives text + context
-→ Ollama LLM generates response (with character persona from system prompt)
-→ mem0 updates long-term memory
+→ Whisper MLX transcribes speech to text
+→ HA conversation agent → OpenClaw HTTP Bridge
+→ Bridge resolves character (satellite_id → character mapping)
+→ Bridge builds system prompt (profile + memories) and writes TTS config to state file
+→ OpenClaw CLI → Ollama LLM generates response
 → Response dispatched:
-    Kokoro/Chatterbox renders TTS audio
+    Wyoming TTS reads state file → routes to Kokoro (local) or ElevenLabs (cloud)
 → Audio sent back to ESP32-S3-BOX-3 (spoken response)
 → VTube Studio API triggered (expression + lip sync on desktop/mobile)
 → Home Assistant action called if applicable (lights, music, etc.)
 ```

+### Timeout Strategy
+
+The HTTP bridge checks Ollama `/api/ps` before each request to determine if the LLM is already loaded:
+
+| Layer | Warm (model loaded) | Cold (model loading) |
+|---|---|---|
+| HA conversation component | 200s | 200s |
+| OpenClaw HTTP bridge | 120s | 180s |
+| OpenClaw agent | 60s | 60s |
+
+The keep-warm daemon ensures models stay loaded, so cold starts should be rare (only after Ollama restarts or VRAM pressure).
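The warm/cold decision can be sketched standalone. The `/api/ps` probe and timeout values mirror the bridge code in this commit; treat this as an illustrative sketch rather than the exact implementation:

```python
import json
import urllib.request

OLLAMA_PS_URL = "http://localhost:11434/api/ps"
TIMEOUT_WARM = 120   # seconds; model already resident in VRAM
TIMEOUT_COLD = 180   # seconds; allow for model load + inference

def is_model_warm() -> bool:
    """Ask Ollama which models are loaded; any entry means a warm start."""
    try:
        with urllib.request.urlopen(OLLAMA_PS_URL, timeout=2) as resp:
            return bool(json.loads(resp.read()).get("models"))
    except Exception:
        return False  # Ollama unreachable → assume cold (longer, safer timeout)

def request_timeout() -> int:
    """Pick the per-request timeout based on model state."""
    return TIMEOUT_WARM if is_model_warm() else TIMEOUT_COLD

# With Ollama down or unreachable, the check fails closed and the
# longer cold timeout is used.
```

Failing toward the cold timeout is deliberate: a too-long timeout wastes nothing when the response arrives early, while a too-short one aborts a legitimate cold start.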
 ---

 ## Character System

-The AI assistant has a defined personality managed via the Character Manager tool.
+The AI assistant has a defined personality managed via the HomeAI Dashboard (character editor + memory manager).

-Key config surfaces:
-- **System prompt** — injected into every Ollama request
-- **Voice clone reference** — `.wav` file path for Chatterbox/Qwen3-TTS
-- **Live2D expression mappings** — idle, speaking, thinking, happy, error states
-- **VTube Studio WebSocket triggers** — JSON map of events to expressions
-- **Custom prompt rules** — trigger/response overrides for specific contexts
-- **mem0** — persistent memory that evolves over time
-
-Character config JSON (exported from Character Manager) is the single source of truth consumed by all pipeline components.
+### Character Schema v2
+
+Each character is a JSON file in `~/homeai-data/characters/` with:
+- **System prompt** — core personality, injected into every LLM request
+- **Profile fields** — background, appearance, dialogue_style, skills array
+- **TTS config** — engine (kokoro/elevenlabs), kokoro_voice, elevenlabs_voice_id, elevenlabs_model, speed
+- **GAZE presets** — array of `{preset, trigger}` for image generation styles
+- **Custom prompt rules** — trigger/response overrides for specific contexts
+
+### Memory System
+
+Two-tier memory stored as JSON in `~/homeai-data/memories/`:
+- **Personal memories** (`personal/{character_id}.json`) — per-character, about user interactions
+- **General memories** (`general.json`) — shared operational knowledge (tool usage, device info, routines)
+
+Memories are injected into the system prompt by the bridge with budget truncation (personal: 4000 chars, general: 3000 chars, newest first).
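A standalone sketch of the budget truncation: the `memories`/`content`/`createdAt` keys match what the bridge's loader reads, while the example entries and the `inject` helper name are invented for illustration:

```python
# Illustrative contents of ~/homeai-data/memories/personal/aria_default.json —
# only the "memories"/"content"/"createdAt" keys are real; the entries are made up
personal = {
    "memories": [
        {"content": "User prefers metric units.", "createdAt": "2026-03-15T10:00:00Z"},
        {"content": "User's desk lamp is 'light.office_lamp'.", "createdAt": "2026-03-16T09:30:00Z"},
    ]
}

def inject(memories: list, budget: int) -> list:
    """Newest first; stop once the next memory would exceed the character budget."""
    items = sorted(memories, key=lambda m: m.get("createdAt", ""), reverse=True)
    out, used = [], 0
    for m in items:
        content = m.get("content", "").strip()
        if not content:
            continue
        if used + len(content) > budget:
            break
        out.append(content)
        used += len(content)
    return out

# inject(personal["memories"], 4000) → both entries fit; newest memory comes first
```

Sorting before truncating means the budget always favours recent memories: when the file grows past the budget, the oldest entries silently drop out of the prompt.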
+### TTS Voice Routing
+
+The bridge writes the active character's TTS config to `~/homeai-data/active-tts-voice.json` before each request. The Wyoming TTS server reads this state file to determine which engine/voice to use:
+- **Kokoro** — local, fast, uses `kokoro_voice` field (e.g., `af_heart`)
+- **ElevenLabs** — cloud, uses `elevenlabs_voice_id` + `elevenlabs_model`, returns PCM 24kHz
+
+This works for both the ESP32/HA pipeline and dashboard chat.
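The consuming side can be sketched as follows. The state-file path and field names come from this commit; `pick_engine`, the defaults, and the fallback behaviour are illustrative stand-ins for the Wyoming server's actual dispatch:

```python
import json
from pathlib import Path

# Path written by the bridge before each request
STATE_PATH = Path.home() / "homeai-data" / "active-tts-voice.json"

# Fallback values are assumptions for this sketch
DEFAULTS = {"engine": "kokoro", "kokoro_voice": "af_heart", "speed": 1}

def read_tts_state() -> dict:
    """Read the state file the bridge wrote; fall back to Kokoro defaults."""
    try:
        state = json.loads(STATE_PATH.read_text())
    except (OSError, ValueError):
        return dict(DEFAULTS)
    return {**DEFAULTS, **state}

def pick_engine(state: dict) -> str:
    """Route to ElevenLabs only when a cloud voice is actually configured."""
    if state.get("engine") == "elevenlabs" and state.get("elevenlabs_voice_id"):
        return "elevenlabs"
    return "kokoro"
```

Guarding on `elevenlabs_voice_id` means a character configured for the cloud engine but missing a voice ID degrades to local Kokoro instead of failing the request.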
 ---

 ## Project Priorities

 1. **Foundation** — Docker stack up (Home Assistant, Open WebUI, Portainer, Uptime Kuma)
 2. **LLM** — Ollama running with target models, Open WebUI connected
 3. **Voice pipeline** — Whisper → Ollama → Kokoro → Wyoming → Home Assistant
 4. **OpenClaw** — installed, onboarded, connected to Ollama and Home Assistant
-5. **ESP32-S3-BOX-3** — ESPHome flash, Wyoming Satellite, LVGL face
-6. **Character system** — system prompt wired up, mem0 integrated, voice cloned
-7. **VTube Studio** — model loaded, WebSocket API bridge written as OpenClaw skill
-8. **ComfyUI** — image generation online, character-consistent model workflows
-9. **Extended integrations** — n8n workflows, Music Assistant, Snapcast, Gitea, code-server
-10. **Polish** — Authelia, Tailscale hardening, mobile companion, iOS widgets
+5. **ESP32-S3-BOX-3** — ESPHome flash, Wyoming Satellite, display faces ✅
+6. **Character system** — schema v2, dashboard editor, memory system, per-character TTS routing ✅
+7. **Animated visual** — PNG/GIF character visual for the web assistant (initial visual layer)
+8. **Android app** — companion app for mobile access to the assistant
+9. **ComfyUI** — image generation online, character-consistent model workflows
+10. **Extended integrations** — n8n workflows, Music Assistant, Snapcast, Gitea, code-server
+11. **Polish** — Authelia, Tailscale hardening, iOS widgets
+
+### Stretch Goals
+
+- **Live2D / VTube Studio** — full Live2D model with WebSocket API bridge (requires learning Live2D tooling)

 ---
@@ -133,7 +169,11 @@ Character config JSON (exported from Character Manager) is the single source of
 - All Docker compose files: `~/server/docker/`
 - OpenClaw skills: `~/.openclaw/skills/`
-- Character configs: `~/.openclaw/characters/`
+- Character configs: `~/homeai-data/characters/`
+- Character memories: `~/homeai-data/memories/`
+- Conversation history: `~/homeai-data/conversations/`
+- Active TTS state: `~/homeai-data/active-tts-voice.json`
+- Satellite → character map: `~/homeai-data/satellite-map.json`
 - Whisper models: `~/models/whisper/`
 - Ollama models: managed by Ollama at `~/.ollama/models/`
 - ComfyUI models: `~/ComfyUI/models/`
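For illustration, the satellite map file has this shape — the `default`/`satellites` keys match the bridge's resolver, while the satellite and character IDs below are invented:

```python
# Illustrative ~/homeai-data/satellite-map.json contents (IDs are made up)
SAT_MAP = {
    "default": "aria_default",
    "satellites": {
        "esp32_living_room": "aria_default",
        "rpi_kitchen": "chef_character",
    },
}

def resolve_character_id(satellite_id=None) -> str:
    """Map a satellite to its character, falling back to the default."""
    if satellite_id and satellite_id in SAT_MAP.get("satellites", {}):
        return SAT_MAP["satellites"][satellite_id]
    return SAT_MAP.get("default", "aria_default")
```

Any unmapped or missing satellite ID resolves to the default character, so adding a new satellite never breaks the voice pipeline before it is mapped.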

TODO.md

@@ -26,7 +26,7 @@
 - [x] Register local GGUF models via Modelfiles (no download): llama3.3:70b, qwen3:32b, codestral:22b, qwen2.5:7b
 - [x] Register additional models: EVA-LLaMA-3.33-70B, Midnight-Miqu-70B, QwQ-32B, Qwen3.5-35B, Qwen3-Coder-30B, Qwen3-VL-30B, GLM-4.6V-Flash, DeepSeek-R1-8B, gemma-3-27b
 - [x] Add qwen3.5:35b-a3b (MoE, Q8_0) — 26.7 tok/s, recommended for voice pipeline
-- [x] Write model preload script + launchd service (keeps voice model in VRAM permanently)
+- [x] Write model keep-warm daemon + launchd service (pins qwen2.5:7b + $HOMEAI_MEDIUM_MODEL in VRAM, checks every 5 min)
 - [x] Deploy Open WebUI via Docker compose (port 3030)
 - [x] Verify Open WebUI connected to Ollama, all models available
 - [x] Run pipeline benchmark (homeai-voice/scripts/benchmark_pipeline.py) — STT/LLM/TTS latency profiled

@@ -82,7 +82,7 @@
 - [x] Verify full voice → agent → HA action flow
 - [x] Add OpenClaw to Uptime Kuma monitors (Manual user action required)

-### P5 · homeai-character *(can start alongside P4)*
+### P5 · homeai-dashboard *(character system + dashboard)*

 - [x] Define and write `schema/character.schema.json` (v1)
 - [x] Write `characters/aria.json` — default character
@@ -100,6 +100,15 @@
 - [x] Add character profile management to dashboard — store/switch character configs with attached profile images
 - [x] Add TTS voice preview in character editor — Kokoro preview via OpenClaw bridge with loading state, custom text, stop control
 - [x] Merge homeai-character + homeai-desktop into unified homeai-dashboard (services, chat, characters, editor)
+- [x] Upgrade character schema to v2 — background, dialogue_style, appearance, skills, gaze_presets (auto-migrate v1)
+- [x] Add LLM-assisted character creation via Character MCP server (Fandom/Wikipedia lookup)
+- [x] Add character memory system — personal (per-character) + general (shared) memories with dashboard UI
+- [x] Add conversation history with per-conversation persistence
+- [x] Wire character_id through full pipeline (dashboard → bridge → LLM system prompt)
+- [x] Add TTS text cleaning — strip tags, asterisks, emojis, markdown before synthesis
+- [x] Add per-character TTS voice routing — bridge writes state file, Wyoming server reads it
+- [x] Add ElevenLabs TTS support in Wyoming server — cloud voice synthesis via state file routing
+- [x] Dashboard auto-selects character's TTS engine/voice (Kokoro or ElevenLabs)
 - [ ] Deploy dashboard as Docker container or static site on Mac Mini

 ---
@@ -123,50 +132,71 @@
 - [ ] Flash remaining units (bedroom, kitchen)
 - [ ] Document MAC address → room name mapping

+### P6b · homeai-rpi (Kitchen Satellite)
+
+- [x] Set up Wyoming Satellite on Raspberry Pi 5 (SELBINA) with ReSpeaker 2-Mics pHAT
+- [x] Write setup.sh — full Pi provisioning (venv, drivers, systemd, scripts)
+- [x] Write deploy.sh — remote deploy/manage from Mac Mini (push-wrapper, test-logs, etc.)
+- [x] Write satellite_wrapper.py — monkey-patches fixing TTS echo, writer race, streaming timeout
+- [x] Test multi-command voice loop without freezing
+
 ---
 ## Phase 5 — Visual Layer

 ### P7 · homeai-visual

-- [ ] Install VTube Studio (Mac App Store)
-- [ ] Enable WebSocket API on port 8001
-- [ ] Source/purchase a Live2D model (nizima.com or booth.pm)
-- [ ] Load model in VTube Studio
-- [ ] Create hotkeys for all 8 expression states
-- [ ] Write `skills/vtube_studio` SKILL.md + implementation
-- [ ] Run auth flow — click Allow in VTube Studio, save token
-- [ ] Test all 8 expressions via test script
-- [ ] Update `aria.json` with real VTube Studio hotkey IDs
-- [ ] Write `lipsync.py` amplitude-based helper
-- [ ] Integrate lip sync into OpenClaw TTS dispatch
-- [ ] Test full pipeline: voice → thinking expression → speaking with lip sync
+#### VTube Studio Expression Bridge
+- [x] Write `vtube-bridge.py` — persistent WebSocket ↔ HTTP bridge daemon (port 8002)
+- [x] Write `vtube-ctl` CLI wrapper + OpenClaw skill (`~/.openclaw/skills/vtube-studio/`)
+- [x] Wire expression triggers into `openclaw-http-bridge.py` (thinking → idle, speaking → idle)
+- [x] Add amplitude-based lip sync to `wyoming_kokoro_server.py` (RMS → MouthOpen parameter)
+- [x] Write `test-expressions.py` — auth flow, expression cycle, lip sync sweep, latency test
+- [x] Write launchd plist + setup.sh for venv creation and service registration
+- [ ] Install VTube Studio from Mac App Store, enable WebSocket API (port 8001)
+- [ ] Source/purchase Live2D model, load in VTube Studio
+- [ ] Create 8 expression hotkeys, record UUIDs
+- [ ] Run `setup.sh` to create venv, install websockets, load launchd service
+- [ ] Run `vtube-ctl auth` — click Allow in VTube Studio
+- [ ] Update `aria.json` with real hotkey UUIDs (replace placeholders)
+- [ ] Run `test-expressions.py --all` — verify expressions + lip sync + latency
 - [ ] Set up VTube Studio mobile (iPhone/iPad) on Tailnet
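For reference, the WebSocket messages behind `vtube-ctl` follow the VTube Studio Public API envelope. A minimal sketch of triggering an expression hotkey — the hotkey UUID is a placeholder until the "record UUIDs" step is done:

```python
import json
import uuid

def vts_request(message_type: str, data: dict) -> str:
    """Wrap a payload in the VTube Studio Public API envelope (sent over ws://...:8001)."""
    return json.dumps({
        "apiName": "VTubeStudioPublicAPI",
        "apiVersion": "1.0",
        "requestID": str(uuid.uuid4()),
        "messageType": message_type,
        "data": data,
    })

# Trigger an expression hotkey by UUID (placeholder ID — real UUIDs come
# from VTube Studio once the hotkeys exist)
msg = vts_request("HotkeyTriggerRequest", {"hotkeyID": "your-hotkey-uuid"})
```

The same envelope carries the auth handshake and parameter-injection (lip sync) messages; only `messageType` and `data` change per call.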
+#### Web Visuals (Dashboard)
+- [ ] Design PNG/GIF character visuals for web assistant (idle, thinking, speaking, etc.)
+- [ ] Integrate animated visuals into homeai-dashboard chat view
+- [ ] Sync visual state to voice pipeline events (listening, processing, responding)
+- [ ] Add expression transitions and idle animations
+
+### P8 · homeai-android
+
+- [ ] Build Android companion app for mobile assistant access
+- [ ] Integrate with OpenClaw bridge API (chat, TTS, STT)
+- [ ] Add character visual display
+- [ ] Push notification support via ntfy/FCM

 ---
 ## Phase 6 — Image Generation

-### P8 · homeai-images
+### P9 · homeai-images (ComfyUI)

 - [ ] Clone ComfyUI to `~/ComfyUI/`, install deps in venv
 - [ ] Verify MPS is detected at launch
 - [ ] Write and load launchd plist (`com.homeai.comfyui.plist`)
-- [ ] Download SDXL base model
-- [ ] Download Flux.1-schnell
-- [ ] Download ControlNet models (canny, depth)
+- [ ] Download SDXL base model + Flux.1-schnell + ControlNet models
 - [ ] Test generation via ComfyUI web UI (port 8188)
-- [ ] Build and export `quick.json`, `portrait.json`, `scene.json`, `upscale.json` workflows
+- [ ] Build and export workflow JSONs (quick, portrait, scene, upscale)
 - [ ] Write `skills/comfyui` SKILL.md + implementation
-- [ ] Test skill: "Generate a portrait of Aria looking happy"
 - [ ] Collect character reference images for LoRA training
+- [ ] Train SDXL LoRA with kohya_ss, verify character consistency
 - [ ] Add ComfyUI to Uptime Kuma monitors
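A sketch of how a `skills/comfyui` implementation might queue one of the exported workflows. `/prompt` is ComfyUI's standard queueing endpoint; the node graph shown is a placeholder, and a real request would carry a full API-format workflow export:

```python
import json
import urllib.request

COMFYUI_URL = "http://localhost:8188"  # port from the checklist above

def queue_workflow(workflow: dict) -> urllib.request.Request:
    """Build the POST that queues an API-format workflow on ComfyUI's /prompt endpoint."""
    body = json.dumps({"prompt": workflow}).encode()
    return urllib.request.Request(
        f"{COMFYUI_URL}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# An exported workflow is a node graph keyed by node ID; the skill would
# load quick.json etc., patch the prompt text, then send the request.
req = queue_workflow({"3": {"class_type": "KSampler", "inputs": {}}})
```

Note that the web UI's "Save" export and the API-format export differ; only the API format is accepted by `/prompt`.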
 ---

 ## Phase 7 — Extended Integrations & Polish

+### P10 · Integrations & Polish
+
 - [ ] Deploy Music Assistant (Docker), integrate with Home Assistant
 - [ ] Write `skills/music` SKILL.md for OpenClaw
 - [ ] Deploy Snapcast server on Mac Mini

@@ -183,10 +213,24 @@
 ---
+## Stretch Goals
+
+### Live2D / VTube Studio
+
+- [ ] Learn Live2D modelling toolchain (Live2D Cubism Editor)
+- [ ] Install VTube Studio (Mac App Store), enable WebSocket API on port 8001
+- [ ] Source/commission a Live2D model (nizima.com or booth.pm)
+- [ ] Create hotkeys for expression states
+- [ ] Write `skills/vtube_studio` SKILL.md + implementation
+- [ ] Write `lipsync.py` amplitude-based helper
+- [ ] Integrate lip sync into OpenClaw TTS dispatch
+- [ ] Set up VTube Studio mobile (iPhone/iPad) on Tailnet
+
+---

 ## Open Decisions

 - [ ] Confirm character name (determines wake word training)
+- [ ] Live2D model: purchase off-the-shelf or commission custom?
 - [ ] mem0 backend: Chroma (simple) vs Qdrant Docker (better semantic search)?
 - [ ] Snapcast output: ESP32 built-in speakers or dedicated audio hardware per room?
 - [ ] Authelia user store: local file vs LDAP?


@@ -12,7 +12,7 @@ CONF_TIMEOUT = "timeout"
 DEFAULT_HOST = "10.0.0.101"
 DEFAULT_PORT = 8081  # OpenClaw HTTP Bridge (not 8080 gateway)
 DEFAULT_AGENT = "main"
-DEFAULT_TIMEOUT = 120
+DEFAULT_TIMEOUT = 200  # Must exceed bridge cold timeout (180s)

 # API endpoints
 OPENCLAW_API_PATH = "/api/agent/message"


@@ -77,7 +77,11 @@ class OpenClawAgent(AbstractConversationAgent):
         _LOGGER.debug("Processing message: %s", text)

         try:
-            response_text = await self._call_openclaw(text)
+            response_text = await self._call_openclaw(
+                text,
+                satellite_id=getattr(user_input, "satellite_id", None),
+                device_id=getattr(user_input, "device_id", None),
+            )

             # Create proper IntentResponse for Home Assistant
             intent_response = IntentResponse(language=user_input.language or "en")

@@ -96,13 +100,14 @@ class OpenClawAgent(AbstractConversationAgent):
             conversation_id=conversation_id,
         )

-    async def _call_openclaw(self, message: str) -> str:
+    async def _call_openclaw(self, message: str, satellite_id: str = None, device_id: str = None) -> str:
         """Call OpenClaw API and return the response."""
         url = f"http://{self.host}:{self.port}{OPENCLAW_API_PATH}"

         payload = {
             "message": message,
             "agent": self.agent_name,
+            "satellite_id": satellite_id or device_id,
         }

         session = async_get_clientsession(self.hass)


@@ -35,6 +35,8 @@
 	<dict>
 		<key>PATH</key>
 		<string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin</string>
+		<key>ELEVENLABS_API_KEY</key>
+		<string>sk_ec10e261c6190307a37aa161a9583504dcf25a0cabe5dbd5</string>
 	</dict>
 </dict>
 </plist>


@@ -28,6 +28,8 @@
 		<string>eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJmZGQ1NzZlYWNkMTU0ZTY2ODY1OTkzYTlhNTIxM2FmNyIsImlhdCI6MTc3MjU4ODYyOCwiZXhwIjoyMDg3OTQ4NjI4fQ.CTAU1EZgpVLp_aRnk4vg6cQqwS5N-p8jQkAAXTxFmLY</string>
 		<key>HASS_TOKEN</key>
 		<string>eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJmZGQ1NzZlYWNkMTU0ZTY2ODY1OTkzYTlhNTIxM2FmNyIsImlhdCI6MTc3MjU4ODYyOCwiZXhwIjoyMDg3OTQ4NjI4fQ.CTAU1EZgpVLp_aRnk4vg6cQqwS5N-p8jQkAAXTxFmLY</string>
+		<key>GAZE_API_KEY</key>
+		<string>e63401f17e4845e1059f830267f839fe7fc7b6083b1cb1730863318754d799f4</string>
 	</dict>

 	<key>RunAtLoad</key>


@@ -24,9 +24,12 @@ Endpoints:
 import argparse
 import json
+import os
 import subprocess
 import sys
 import asyncio
+import urllib.request
+import threading
 from http.server import HTTPServer, BaseHTTPRequestHandler
 from socketserver import ThreadingMixIn
 from urllib.parse import urlparse
@@ -40,19 +43,222 @@ from wyoming.asr import Transcribe, Transcript
from wyoming.audio import AudioStart, AudioChunk, AudioStop from wyoming.audio import AudioStart, AudioChunk, AudioStop
from wyoming.info import Info from wyoming.info import Info
# Timeout settings (seconds)
TIMEOUT_WARM = 120 # Model already loaded in VRAM
TIMEOUT_COLD = 180 # Model needs loading first (~10-20s load + inference)
OLLAMA_PS_URL = "http://localhost:11434/api/ps"
VTUBE_BRIDGE_URL = "http://localhost:8002"
def load_character_prompt() -> str:
"""Load the active character system prompt.""" def _vtube_fire_and_forget(path: str, data: dict):
character_path = Path.home() / ".openclaw" / "characters" / "aria.json" """Send a non-blocking POST to the VTube Studio bridge. Failures are silent."""
def _post():
try:
body = json.dumps(data).encode()
req = urllib.request.Request(
f"{VTUBE_BRIDGE_URL}{path}",
data=body,
headers={"Content-Type": "application/json"},
method="POST",
)
urllib.request.urlopen(req, timeout=2)
except Exception:
pass # bridge may not be running — that's fine
threading.Thread(target=_post, daemon=True).start()
def is_model_warm() -> bool:
"""Check if the default Ollama model is already loaded in VRAM."""
try:
req = urllib.request.Request(OLLAMA_PS_URL)
with urllib.request.urlopen(req, timeout=2) as resp:
data = json.loads(resp.read())
return len(data.get("models", [])) > 0
except Exception:
# If we can't reach Ollama, assume cold (safer longer timeout)
return False
CHARACTERS_DIR = Path("/Users/aodhan/homeai-data/characters")
SATELLITE_MAP_PATH = Path("/Users/aodhan/homeai-data/satellite-map.json")
MEMORIES_DIR = Path("/Users/aodhan/homeai-data/memories")
ACTIVE_TTS_VOICE_PATH = Path("/Users/aodhan/homeai-data/active-tts-voice.json")
def clean_text_for_tts(text: str) -> str:
"""Strip content that shouldn't be spoken: tags, asterisks, emojis, markdown."""
# Remove HTML/XML tags and their content for common non-spoken tags
text = re.sub(r'<[^>]+>', '', text)
# Remove content between asterisks (actions/emphasis markup like *sighs*)
text = re.sub(r'\*[^*]+\*', '', text)
# Remove markdown bold/italic markers that might remain
text = re.sub(r'[*_]{1,3}', '', text)
# Remove markdown headers
text = re.sub(r'^#{1,6}\s+', '', text, flags=re.MULTILINE)
# Remove markdown links [text](url) → keep text
text = re.sub(r'\[([^\]]+)\]\([^)]+\)', r'\1', text)
# Remove bare URLs
text = re.sub(r'https?://\S+', '', text)
# Remove code blocks and inline code
text = re.sub(r'```[\s\S]*?```', '', text)
text = re.sub(r'`[^`]+`', '', text)
# Remove emojis
text = re.sub(
r'[\U0001F600-\U0001F64F\U0001F300-\U0001F5FF\U0001F680-\U0001F6FF'
r'\U0001F1E0-\U0001F1FF\U0001F900-\U0001F9FF\U0001FA00-\U0001FAFF'
r'\U00002702-\U000027B0\U0000FE00-\U0000FE0F\U0000200D'
r'\U00002600-\U000026FF\U00002300-\U000023FF]+', '', text
)
# Collapse multiple spaces/newlines
text = re.sub(r'\n{2,}', '\n', text)
text = re.sub(r'[ \t]{2,}', ' ', text)
return text.strip()
def load_satellite_map() -> dict:
"""Load the satellite-to-character mapping."""
try:
with open(SATELLITE_MAP_PATH) as f:
return json.load(f)
except Exception:
return {"default": "aria_default", "satellites": {}}
def set_active_tts_voice(character_id: str, tts_config: dict):
"""Write the active TTS config to a state file for the Wyoming TTS server to read."""
try:
ACTIVE_TTS_VOICE_PATH.parent.mkdir(parents=True, exist_ok=True)
state = {
"character_id": character_id,
"engine": tts_config.get("engine", "kokoro"),
"kokoro_voice": tts_config.get("kokoro_voice", ""),
"elevenlabs_voice_id": tts_config.get("elevenlabs_voice_id", ""),
"elevenlabs_model": tts_config.get("elevenlabs_model", "eleven_multilingual_v2"),
"speed": tts_config.get("speed", 1),
}
with open(ACTIVE_TTS_VOICE_PATH, "w") as f:
json.dump(state, f)
except Exception as e:
print(f"[OpenClaw Bridge] Warning: could not write active TTS config: {e}")
def resolve_character_id(satellite_id: str = None) -> str:
"""Resolve a satellite ID to a character profile ID."""
sat_map = load_satellite_map()
if satellite_id and satellite_id in sat_map.get("satellites", {}):
return sat_map["satellites"][satellite_id]
return sat_map.get("default", "aria_default")
def load_character(character_id: str = None) -> dict:
"""Load a character profile by ID. Returns the full character data dict."""
if not character_id:
character_id = resolve_character_id()
safe_id = character_id.replace("/", "_")
character_path = CHARACTERS_DIR / f"{safe_id}.json"
if not character_path.exists(): if not character_path.exists():
return "" return {}
try: try:
with open(character_path) as f: with open(character_path) as f:
data = json.load(f) profile = json.load(f)
return data.get("system_prompt", "") return profile.get("data", {})
except Exception: except Exception:
return {}
def load_character_prompt(satellite_id: str = None, character_id: str = None) -> str:
"""Load the full system prompt for a character, resolved by satellite or explicit ID.
Builds a rich prompt from system_prompt + profile fields (background, dialogue_style, etc.)."""
if not character_id:
character_id = resolve_character_id(satellite_id)
char = load_character(character_id)
if not char:
return "" return ""
sections = []
# Core system prompt
prompt = char.get("system_prompt", "")
if prompt:
sections.append(prompt)
# Character profile fields
profile_parts = []
if char.get("background"):
profile_parts.append(f"## Background\n{char['background']}")
if char.get("appearance"):
profile_parts.append(f"## Appearance\n{char['appearance']}")
if char.get("dialogue_style"):
profile_parts.append(f"## Dialogue Style\n{char['dialogue_style']}")
if char.get("skills"):
skills = char["skills"]
if isinstance(skills, list):
skills_text = ", ".join(skills[:15])
else:
skills_text = str(skills)
profile_parts.append(f"## Skills & Interests\n{skills_text}")
if profile_parts:
sections.append("[Character Profile]\n" + "\n\n".join(profile_parts))
# Character metadata
meta_lines = []
if char.get("display_name"):
meta_lines.append(f"Your name is: {char['display_name']}")
# Support both v1 (gaze_preset string) and v2 (gaze_presets array)
gaze_presets = char.get("gaze_presets", [])
if gaze_presets and isinstance(gaze_presets, list):
for gp in gaze_presets:
preset = gp.get("preset", "")
trigger = gp.get("trigger", "self-portrait")
if preset:
meta_lines.append(f"GAZE preset '{preset}' — use for: {trigger}")
elif char.get("gaze_preset"):
meta_lines.append(f"Your gaze_preset for self-portraits is: {char['gaze_preset']}")
if meta_lines:
sections.append("[Character Metadata]\n" + "\n".join(meta_lines))
# Memories (personal + general)
personal, general = load_memories(character_id)
if personal:
sections.append("[Personal Memories]\n" + "\n".join(f"- {m}" for m in personal))
if general:
sections.append("[General Knowledge]\n" + "\n".join(f"- {m}" for m in general))
return "\n\n".join(sections)
def load_memories(character_id: str) -> tuple[list[str], list[str]]:
"""Load personal (per-character) and general memories.
Returns (personal_contents, general_contents) truncated to fit context budget."""
PERSONAL_BUDGET = 4000 # max chars for personal memories in prompt
GENERAL_BUDGET = 3000 # max chars for general memories in prompt
def _read_memories(path: Path, budget: int) -> list[str]:
try:
with open(path) as f:
data = json.load(f)
except Exception:
return []
memories = data.get("memories", [])
# Sort newest first
memories.sort(key=lambda m: m.get("createdAt", ""), reverse=True)
result = []
used = 0
for m in memories:
content = m.get("content", "").strip()
if not content:
continue
if used + len(content) > budget:
break
result.append(content)
used += len(content)
return result
safe_id = character_id.replace("/", "_")
personal = _read_memories(MEMORIES_DIR / "personal" / f"{safe_id}.json", PERSONAL_BUDGET)
general = _read_memories(MEMORIES_DIR / "general.json", GENERAL_BUDGET)
return personal, general
class OpenClawBridgeHandler(BaseHTTPRequestHandler):
"""HTTP request handler for OpenClaw bridge."""
@@ -95,7 +301,7 @@ class OpenClawBridgeHandler(BaseHTTPRequestHandler):
self._send_json_response(404, {"error": "Not found"})
def _handle_tts_request(self):
"""Handle TTS request and return audio. Routes to Kokoro or ElevenLabs based on engine."""
content_length = int(self.headers.get("Content-Length", 0))
if content_length == 0:
self._send_json_response(400, {"error": "Empty body"})
@@ -109,30 +315,64 @@ class OpenClawBridgeHandler(BaseHTTPRequestHandler):
return
text = data.get("text", "Hello, this is a test.")
text = clean_text_for_tts(text)
voice = data.get("voice", "af_heart")
engine = data.get("engine", "kokoro")
try:
# Signal avatar: speaking
_vtube_fire_and_forget("/expression", {"event": "speaking"})
if engine == "elevenlabs":
audio_bytes, content_type = self._synthesize_elevenlabs(text, voice, data.get("model"))
else:
# Default: local Kokoro via Wyoming
audio_bytes = asyncio.run(self._synthesize_audio(text, voice))
content_type = "audio/wav"
# Signal avatar: idle
_vtube_fire_and_forget("/expression", {"event": "idle"})
self.send_response(200)
self.send_header("Content-Type", content_type)
self.send_header("Access-Control-Allow-Origin", "*")
self.end_headers()
self.wfile.write(audio_bytes)
except Exception as e:
_vtube_fire_and_forget("/expression", {"event": "error"})
self._send_json_response(500, {"error": str(e)})
def _synthesize_elevenlabs(self, text: str, voice_id: str, model: str = None) -> tuple[bytes, str]:
"""Call ElevenLabs TTS API and return (audio_bytes, content_type)."""
api_key = os.environ.get("ELEVENLABS_API_KEY", "")
if not api_key:
raise RuntimeError("ELEVENLABS_API_KEY not set in environment")
if not voice_id:
raise RuntimeError("No ElevenLabs voice ID provided")
model = model or "eleven_multilingual_v2"
url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
payload = json.dumps({
"text": text,
"model_id": model,
"voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
}).encode()
req = urllib.request.Request(
url,
data=payload,
headers={
"Content-Type": "application/json",
"xi-api-key": api_key,
"Accept": "audio/mpeg",
},
method="POST",
)
with urllib.request.urlopen(req, timeout=30) as resp:
audio_bytes = resp.read()
return audio_bytes, "audio/mpeg"
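The request body built above can be isolated as a pure helper for illustration. `build_payload` is a hypothetical name; the `model_id` default and `voice_settings` values mirror `_synthesize_elevenlabs`:

```python
import json

# Illustrative pure helper extracted from _synthesize_elevenlabs: builds the
# JSON body sent to the ElevenLabs text-to-speech endpoint. build_payload is
# a hypothetical name used only for this sketch.
def build_payload(text, model=None):
    model = model or "eleven_multilingual_v2"
    return json.dumps({
        "text": text,
        "model_id": model,
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
    }).encode()

body = build_payload("Hello there")
```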
def do_OPTIONS(self):
"""Handle CORS preflight requests."""
self.send_response(204)
@@ -264,6 +504,43 @@ class OpenClawBridgeHandler(BaseHTTPRequestHandler):
print(f"[OpenClaw Bridge] Wake word detected: {wake_word_data.get('wake_word', 'unknown')}")
self._send_json_response(200, {"status": "ok", "message": "Wake word received"})
@staticmethod
def _call_openclaw(message: str, agent: str, timeout: int) -> str:
"""Call OpenClaw CLI and return stdout."""
result = subprocess.run(
["/opt/homebrew/bin/openclaw", "agent", "--message", message, "--agent", agent],
capture_output=True,
text=True,
timeout=timeout,
check=True,
)
return result.stdout.strip()
@staticmethod
def _needs_followup(response: str) -> bool:
"""Detect if the model promised to act but didn't actually do it.
Returns True if the response looks like a 'will do' without a result."""
if not response:
return False
resp_lower = response.lower()
# If the response contains a URL or JSON-like output, it probably completed
if "http://" in response or "https://" in response or '"status"' in response:
return False
# If it contains a tool result indicator (ha-ctl output, gaze-ctl output)
if any(kw in resp_lower for kw in ["image_url", "seed", "entity_id", "state:", "turned on", "turned off"]):
return False
# Detect promise-like language without substance
promise_phrases = [
"let me", "i'll ", "i will ", "sure thing", "sure,", "right away",
"generating", "one moment", "working on", "hang on", "just a moment",
"on it", "let me generate", "let me create",
]
has_promise = any(phrase in resp_lower for phrase in promise_phrases)
# Short responses with promise language are likely incomplete
if has_promise and len(response) < 200:
return True
return False
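To see the heuristic in action, here is a stand-alone copy of the logic exercised with two representative responses (the sample strings are invented):

```python
# Stand-alone copy of the promise-detection heuristic for illustration;
# the sample responses below are invented.
def needs_followup(response):
    if not response:
        return False
    resp_lower = response.lower()
    # URLs, JSON, or tool-result keywords mean the work was likely done.
    if "http://" in response or "https://" in response or '"status"' in response:
        return False
    if any(kw in resp_lower for kw in ["image_url", "seed", "entity_id",
                                       "state:", "turned on", "turned off"]):
        return False
    promise_phrases = [
        "let me", "i'll ", "i will ", "sure thing", "sure,", "right away",
        "generating", "one moment", "working on", "hang on", "just a moment",
        "on it", "let me generate", "let me create",
    ]
    # Short responses with promise language are likely incomplete.
    return any(p in resp_lower for p in promise_phrases) and len(response) < 200

print(needs_followup("Sure thing, let me generate that image!"))  # True
print(needs_followup("Done: https://example.com/out.png"))        # False
```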
def _handle_agent_request(self):
"""Handle agent message request."""
content_length = int(self.headers.get("Content-Length", 0))
@@ -280,29 +557,63 @@ class OpenClawBridgeHandler(BaseHTTPRequestHandler):
message = data.get("message")
agent = data.get("agent", "main")
satellite_id = data.get("satellite_id")
explicit_character_id = data.get("character_id")
if not message:
self._send_json_response(400, {"error": "Message is required"})
return
# Resolve character: explicit ID > satellite mapping > default
if explicit_character_id:
character_id = explicit_character_id
else:
character_id = resolve_character_id(satellite_id)
system_prompt = load_character_prompt(character_id=character_id)
# Set the active TTS config for the Wyoming server to pick up
char = load_character(character_id)
tts_config = char.get("tts", {})
if tts_config:
set_active_tts_voice(character_id, tts_config)
engine = tts_config.get("engine", "kokoro")
voice_label = tts_config.get("kokoro_voice", "") if engine == "kokoro" else tts_config.get("elevenlabs_voice_id", "")
print(f"[OpenClaw Bridge] Active TTS: {engine} / {voice_label}")
if satellite_id:
print(f"[OpenClaw Bridge] Satellite: {satellite_id} → character: {character_id}")
elif explicit_character_id:
print(f"[OpenClaw Bridge] Character: {character_id}")
if system_prompt:
message = f"System Context: {system_prompt}\n\nUser Request: {message}"
# Check if model is warm to set appropriate timeout
warm = is_model_warm()
timeout = TIMEOUT_WARM if warm else TIMEOUT_COLD
print(f"[OpenClaw Bridge] Model {'warm' if warm else 'cold'}, timeout={timeout}s")
# Signal avatar: thinking
_vtube_fire_and_forget("/expression", {"event": "thinking"})
# Call OpenClaw CLI (use full path for launchd compatibility)
try:
response_text = self._call_openclaw(message, agent, timeout)
# Re-prompt if the model promised to act but didn't call a tool.
# Detect "I'll do X" / "Let me X" responses that lack any result.
if self._needs_followup(response_text):
print("[OpenClaw Bridge] Response looks like a promise without action, re-prompting")
followup = (
"You just said you would do something but didn't actually call the exec tool. "
"Do NOT explain what you will do — call the tool NOW using exec and return the result."
)
response_text = self._call_openclaw(followup, agent, timeout)
# Signal avatar: idle (TTS handler will override to 'speaking' if voice is used)
_vtube_fire_and_forget("/expression", {"event": "idle"})
self._send_json_response(200, {"response": response_text})
except subprocess.TimeoutExpired:
self._send_json_response(504, {"error": f"OpenClaw command timed out after {timeout}s (model was {'warm' if warm else 'cold'})"})
except subprocess.CalledProcessError as e:
error_msg = e.stderr.strip() if e.stderr else "OpenClaw command failed"
self._send_json_response(500, {"error": error_msg})

View File

@@ -24,6 +24,8 @@
<string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin</string>
<key>HOME</key>
<string>/Users/aodhan</string>
<key>GAZE_API_KEY</key>
<string>e63401f17e4845e1059f830267f839fe7fc7b6083b1cb1730863318754d799f4</string>
</dict>
<key>RunAtLoad</key>

View File

@@ -1,15 +1,24 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "HomeAI Character Config",
"version": "2",
"type": "object",
"required": ["schema_version", "name", "system_prompt", "tts"],
"properties": {
"schema_version": { "type": "integer", "enum": [1, 2] },
"name": { "type": "string" },
"display_name": { "type": "string" },
"description": { "type": "string" },
"background": { "type": "string", "description": "Backstory, lore, or general prompt enrichment" },
"dialogue_style": { "type": "string", "description": "How the persona speaks or reacts, with example lines" },
"appearance": { "type": "string", "description": "Physical description, also used for image prompting" },
"skills": {
"type": "array",
"description": "Topics the persona specialises in or enjoys talking about",
"items": { "type": "string" }
},
"system_prompt": { "type": "string" },
"model_overrides": {
@@ -31,35 +40,21 @@
"voice_ref_path": { "type": "string" },
"kokoro_voice": { "type": "string" },
"elevenlabs_voice_id": { "type": "string" },
"elevenlabs_voice_name": { "type": "string" },
"elevenlabs_model": { "type": "string", "default": "eleven_monolingual_v1" },
"speed": { "type": "number", "default": 1.0 }
}
},
"gaze_presets": {
"type": "array",
"description": "GAZE image generation presets with trigger conditions",
"items": {
"type": "object",
"required": ["preset"],
"properties": {
"preset": { "type": "string" },
"trigger": { "type": "string", "default": "self-portrait" }
}
}
},
@@ -78,5 +73,6 @@
},
"notes": { "type": "string" }
},
"additionalProperties": true
}
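For reference, a hypothetical profile that would validate against the v2 schema; every value below is illustrative and the voice ID is a placeholder:

```json
{
  "schema_version": 2,
  "name": "aria_default",
  "display_name": "Aria",
  "description": "Default household persona",
  "background": "Household AI who runs the smart home and keeps the family on schedule.",
  "dialogue_style": "Warm and concise. Example: 'Lights are off downstairs. Sleep well.'",
  "appearance": "Short silver hair, indigo eyes, casual hoodie.",
  "skills": ["home automation", "music", "cooking"],
  "system_prompt": "You are Aria, the household assistant.",
  "tts": {
    "engine": "elevenlabs",
    "elevenlabs_voice_id": "YOUR_VOICE_ID",
    "elevenlabs_model": "eleven_multilingual_v2"
  },
  "gaze_presets": [
    { "preset": "aria_portrait", "trigger": "self-portrait" }
  ]
}
```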

View File

@@ -3,6 +3,7 @@ import Dashboard from './pages/Dashboard';
import Chat from './pages/Chat';
import Characters from './pages/Characters';
import Editor from './pages/Editor';
import Memories from './pages/Memories';
function NavItem({ to, children, icon }) {
return (
@@ -77,6 +78,17 @@ function Layout({ children }) {
Characters
</NavItem>
<NavItem
to="/memories"
icon={
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 18v-5.25m0 0a6.01 6.01 0 001.5-.189m-1.5.189a6.01 6.01 0 01-1.5-.189m3.75 7.478a12.06 12.06 0 01-4.5 0m3.75 2.383a14.406 14.406 0 01-3 0M14.25 18v-.192c0-.983.658-1.823 1.508-2.316a7.5 7.5 0 10-7.517 0c.85.493 1.509 1.333 1.509 2.316V18" />
</svg>
}
>
Memories
</NavItem>
<NavItem
to="/editor"
icon={
@@ -113,6 +125,7 @@ function App() {
<Route path="/" element={<div className="flex-1 overflow-y-auto p-8"><div className="max-w-6xl mx-auto"><Dashboard /></div></div>} />
<Route path="/chat" element={<Chat />} />
<Route path="/characters" element={<div className="flex-1 overflow-y-auto p-8"><div className="max-w-6xl mx-auto"><Characters /></div></div>} />
<Route path="/memories" element={<div className="flex-1 overflow-y-auto p-8"><div className="max-w-6xl mx-auto"><Memories /></div></div>} />
<Route path="/editor" element={<div className="flex-1 overflow-y-auto p-8"><div className="max-w-6xl mx-auto"><Editor /></div></div>} />
</Routes>
</Layout>

View File

@@ -2,8 +2,10 @@ import { useEffect, useRef } from 'react'
import MessageBubble from './MessageBubble'
import ThinkingIndicator from './ThinkingIndicator'
export default function ChatPanel({ messages, isLoading, onReplay, character }) {
const bottomRef = useRef(null)
const name = character?.name || 'AI'
const image = character?.image || null
useEffect(() => {
bottomRef.current?.scrollIntoView({ behavior: 'smooth' })
@@ -13,10 +15,14 @@ export default function ChatPanel({ messages, isLoading, onReplay }) {
return (
<div className="flex-1 flex items-center justify-center">
<div className="text-center">
{image ? (
<img src={image} alt={name} className="w-20 h-20 rounded-full object-cover mx-auto mb-4 ring-2 ring-indigo-500/30" />
) : (
<div className="w-20 h-20 rounded-full bg-indigo-600/20 flex items-center justify-center mx-auto mb-4">
<span className="text-indigo-400 text-2xl">{name[0]}</span>
</div>
)}
<h2 className="text-xl font-medium text-gray-200 mb-2">Hi, I'm {name}</h2>
<p className="text-gray-500 text-sm">Type a message or press the mic to talk</p>
</div>
</div>
@@ -26,9 +32,9 @@ export default function ChatPanel({ messages, isLoading, onReplay }) {
return (
<div className="flex-1 overflow-y-auto py-4">
{messages.map((msg) => (
<MessageBubble key={msg.id} message={msg} onReplay={onReplay} character={character} />
))}
{isLoading && <ThinkingIndicator character={character} />}
<div ref={bottomRef} />
</div>
)

View File

@@ -0,0 +1,70 @@
function timeAgo(dateStr) {
if (!dateStr) return ''
const diff = Date.now() - new Date(dateStr).getTime()
const mins = Math.floor(diff / 60000)
if (mins < 1) return 'just now'
if (mins < 60) return `${mins}m ago`
const hours = Math.floor(mins / 60)
if (hours < 24) return `${hours}h ago`
const days = Math.floor(hours / 24)
return `${days}d ago`
}
export default function ConversationList({ conversations, activeId, onCreate, onSelect, onDelete }) {
return (
<div className="w-72 border-r border-gray-800 flex flex-col bg-gray-950 shrink-0">
{/* New chat button */}
<div className="p-3 border-b border-gray-800">
<button
onClick={onCreate}
className="w-full flex items-center justify-center gap-2 px-3 py-2 bg-indigo-600 hover:bg-indigo-500 text-white text-sm rounded-lg transition-colors"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4.5v15m7.5-7.5h-15" />
</svg>
New chat
</button>
</div>
{/* Conversation list */}
<div className="flex-1 overflow-y-auto">
{conversations.length === 0 ? (
<p className="text-xs text-gray-600 text-center py-6">No conversations yet</p>
) : (
conversations.map(conv => (
<div
key={conv.id}
onClick={() => onSelect(conv.id)}
className={`group flex items-start gap-2 px-3 py-2.5 cursor-pointer border-b border-gray-800/50 transition-colors ${
conv.id === activeId
? 'bg-gray-800 text-white'
: 'text-gray-400 hover:bg-gray-800/50 hover:text-gray-200'
}`}
>
<div className="flex-1 min-w-0">
<p className="text-sm truncate">
{conv.title || 'New conversation'}
</p>
<div className="flex items-center gap-2 mt-0.5">
{conv.characterName && (
<span className="text-xs text-indigo-400/70">{conv.characterName}</span>
)}
<span className="text-xs text-gray-600">{timeAgo(conv.updatedAt)}</span>
</div>
</div>
<button
onClick={(e) => { e.stopPropagation(); onDelete(conv.id) }}
className="opacity-0 group-hover:opacity-100 p-1 text-gray-500 hover:text-red-400 transition-all shrink-0 mt-0.5"
title="Delete"
>
<svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M14.74 9l-.346 9m-4.788 0L9.26 9m9.968-3.21c.342.052.682.107 1.022.166m-1.022-.165L18.16 19.673a2.25 2.25 0 01-2.244 2.077H8.084a2.25 2.25 0 01-2.244-2.077L4.772 5.79m14.456 0a48.108 48.108 0 00-3.478-.397m-12 .562c.34-.059.68-.114 1.022-.165m0 0a48.11 48.11 0 013.478-.397m7.5 0v-.916c0-1.18-.91-2.164-2.09-2.201a51.964 51.964 0 00-3.32 0c-1.18.037-2.09 1.022-2.09 2.201v.916m7.5 0a48.667 48.667 0 00-7.5 0" />
</svg>
</button>
</div>
))
)}
</div>
</div>
)
}

View File

@@ -1,14 +1,100 @@
import { useState } from 'react'
function Avatar({ character }) {
const name = character?.name || 'AI'
const image = character?.image || null
if (image) {
return <img src={image} alt={name} className="w-8 h-8 rounded-full object-cover shrink-0 mt-0.5 ring-1 ring-gray-700" />
}
return (
<div className="w-8 h-8 rounded-full bg-indigo-600/20 flex items-center justify-center shrink-0 mt-0.5">
<span className="text-indigo-400 text-sm">{name[0]}</span>
</div>
)
}
function ImageOverlay({ src, onClose }) {
return (
<div
className="fixed inset-0 z-50 bg-black/80 flex items-center justify-center cursor-zoom-out"
onClick={onClose}
>
<img
src={src}
alt="Full size"
className="max-w-[90vw] max-h-[90vh] object-contain rounded-lg shadow-2xl"
onClick={(e) => e.stopPropagation()}
/>
<button
onClick={onClose}
className="absolute top-4 right-4 text-white/70 hover:text-white transition-colors p-2"
>
<svg className="w-6 h-6" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
</div>
)
}
const IMAGE_URL_RE = /(https?:\/\/[^\s]+\.(?:png|jpg|jpeg|gif|webp))/gi
function RichContent({ text }) {
const [overlayImage, setOverlayImage] = useState(null)
const parts = []
let lastIndex = 0
let match
IMAGE_URL_RE.lastIndex = 0
while ((match = IMAGE_URL_RE.exec(text)) !== null) {
if (match.index > lastIndex) {
parts.push({ type: 'text', value: text.slice(lastIndex, match.index) })
}
parts.push({ type: 'image', value: match[1] })
lastIndex = IMAGE_URL_RE.lastIndex
}
if (lastIndex < text.length) {
parts.push({ type: 'text', value: text.slice(lastIndex) })
}
if (parts.length === 1 && parts[0].type === 'text') {
return <>{text}</>
}
return (
<>
{parts.map((part, i) =>
part.type === 'image' ? (
<button
key={i}
onClick={() => setOverlayImage(part.value)}
className="block my-2 cursor-zoom-in"
>
<img
src={part.value}
alt="Generated image"
className="rounded-xl max-w-full max-h-80 object-contain"
loading="lazy"
/>
</button>
) : (
<span key={i}>{part.value}</span>
)
)}
{overlayImage && <ImageOverlay src={overlayImage} onClose={() => setOverlayImage(null)} />}
</>
)
}
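The same split-into-parts pass, sketched in Python for clarity (the pattern mirrors IMAGE_URL_RE above; `split_rich` is a hypothetical name used only for this sketch):

```python
import re

# The RichContent splitting pass: text is divided into ("text", ...) and
# ("image", url) parts using the same extension-anchored pattern.
IMAGE_URL_RE = re.compile(r"(https?://\S+\.(?:png|jpg|jpeg|gif|webp))", re.IGNORECASE)

def split_rich(text):
    parts, last = [], 0
    for m in IMAGE_URL_RE.finditer(text):
        if m.start() > last:
            parts.append(("text", text[last:m.start()]))
        parts.append(("image", m.group(1)))
        last = m.end()
    if last < len(text):
        parts.append(("text", text[last:]))
    return parts

print(split_rich("Here: https://x.test/cat.png done"))
```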
export default function MessageBubble({ message, onReplay, character }) {
const isUser = message.role === 'user'
return (
<div className={`flex ${isUser ? 'justify-end' : 'justify-start'} px-4 py-1.5`}>
<div className={`flex items-start gap-3 max-w-[80%] ${isUser ? 'flex-row-reverse' : ''}`}>
{!isUser && <Avatar character={character} />}
<div>
<div
className={`rounded-2xl px-4 py-2.5 text-sm leading-relaxed whitespace-pre-wrap ${
@@ -19,7 +105,7 @@ export default function MessageBubble({ message, onReplay }) {
: 'bg-gray-800 text-gray-100'
}`}
>
{isUser ? message.content : <RichContent text={message.content} />}
</div>
{!isUser && !message.isError && onReplay && (
<button

View File

@@ -1,8 +1,10 @@
import { VOICES, TTS_ENGINES } from '../lib/constants'
export default function SettingsDrawer({ isOpen, onClose, settings, onUpdate }) {
if (!isOpen) return null
const isKokoro = !settings.ttsEngine || settings.ttsEngine === 'kokoro'
return (
<>
<div className="fixed inset-0 bg-black/50 z-40" onClick={onClose} />
@@ -16,9 +18,24 @@ export default function SettingsDrawer({ isOpen, onClose, settings, onUpdate })
</button>
</div>
<div className="flex-1 overflow-y-auto p-4 space-y-5">
{/* TTS Engine */}
<div>
<label className="block text-xs font-medium text-gray-400 mb-1.5">TTS Engine</label>
<select
value={settings.ttsEngine || 'kokoro'}
onChange={(e) => onUpdate('ttsEngine', e.target.value)}
className="w-full bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
{TTS_ENGINES.map((e) => (
<option key={e.id} value={e.id}>{e.label}</option>
))}
</select>
</div>
{/* Voice */}
<div>
<label className="block text-xs font-medium text-gray-400 mb-1.5">Voice</label>
{isKokoro ? (
<select
value={settings.voice}
onChange={(e) => onUpdate('voice', e.target.value)}
@@ -28,6 +45,21 @@ export default function SettingsDrawer({ isOpen, onClose, settings, onUpdate })
<option key={v.id} value={v.id}>{v.label}</option>
))}
</select>
) : (
<div>
<input
type="text"
value={settings.voice || ''}
onChange={(e) => onUpdate('voice', e.target.value)}
className="w-full bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
placeholder={settings.ttsEngine === 'elevenlabs' ? 'ElevenLabs voice ID' : 'Voice identifier'}
readOnly
/>
<p className="text-xs text-gray-500 mt-1">
Set via active character profile
</p>
</div>
)}
</div>
{/* Auto TTS */}

View File

@@ -1,9 +1,16 @@
export default function ThinkingIndicator({ character }) {
const name = character?.name || 'AI'
const image = character?.image || null
return (
<div className="flex items-start gap-3 px-4 py-3">
{image ? (
<img src={image} alt={name} className="w-8 h-8 rounded-full object-cover shrink-0 ring-1 ring-gray-700" />
) : (
<div className="w-8 h-8 rounded-full bg-indigo-600/20 flex items-center justify-center shrink-0">
<span className="text-indigo-400 text-sm">{name[0]}</span>
</div>
)}
<div className="flex items-center gap-1 pt-2.5">
<span className="w-2 h-2 rounded-full bg-gray-400 animate-[bounce_1.4s_ease-in-out_infinite]" />
<span className="w-2 h-2 rounded-full bg-gray-400 animate-[bounce_1.4s_ease-in-out_0.2s_infinite]" />

View File

@@ -0,0 +1,28 @@
import { useState, useEffect } from 'react'
const ACTIVE_KEY = 'homeai_active_character'
export function useActiveCharacter() {
const [character, setCharacter] = useState(null)
useEffect(() => {
const activeId = localStorage.getItem(ACTIVE_KEY)
if (!activeId) return
fetch(`/api/characters/${activeId}`)
.then(r => r.ok ? r.json() : null)
.then(profile => {
if (profile) {
setCharacter({
id: profile.id,
name: profile.data.display_name || profile.data.name || 'AI',
image: profile.image || null,
tts: profile.data.tts || null,
})
}
})
.catch(() => {})
}, [])
return character
}

View File

@@ -1,45 +1,124 @@
import { useState, useCallback, useEffect, useRef } from 'react'
import { sendMessage } from '../lib/api'
import { getConversation, saveConversation } from '../lib/conversationApi'
export function useChat(conversationId, conversationMeta, onConversationUpdate) {
const [messages, setMessages] = useState([])
const [isLoading, setIsLoading] = useState(false)
const [isLoadingConv, setIsLoadingConv] = useState(false)
const convRef = useRef(null)
const idRef = useRef(conversationId)
// Keep idRef in sync
useEffect(() => { idRef.current = conversationId }, [conversationId])
// Load conversation from server when ID changes
useEffect(() => {
if (!conversationId) {
setMessages([])
convRef.current = null
return
}
let cancelled = false
setIsLoadingConv(true)
getConversation(conversationId).then(conv => {
if (cancelled) return
if (conv) {
convRef.current = conv
setMessages(conv.messages || [])
} else {
convRef.current = null
setMessages([])
}
setIsLoadingConv(false)
}).catch(() => {
if (!cancelled) {
convRef.current = null
setMessages([])
setIsLoadingConv(false)
}
})
return () => { cancelled = true }
}, [conversationId])
// Persist conversation to server
const persist = useCallback(async (updatedMessages, title, overrideId) => {
const id = overrideId || idRef.current
if (!id) return
const now = new Date().toISOString()
const conv = {
id,
title: title || convRef.current?.title || '',
characterId: conversationMeta?.characterId || convRef.current?.characterId || '',
characterName: conversationMeta?.characterName || convRef.current?.characterName || '',
createdAt: convRef.current?.createdAt || now,
updatedAt: now,
messages: updatedMessages,
}
convRef.current = conv
await saveConversation(conv).catch(() => {})
if (onConversationUpdate) {
onConversationUpdate(id, {
title: conv.title,
updatedAt: conv.updatedAt,
messageCount: conv.messages.length,
})
}
}, [conversationMeta, onConversationUpdate])
// send accepts an optional overrideId for when the conversation was just created
const send = useCallback(async (text, overrideId) => {
if (!text.trim() || isLoading) return null
const userMsg = { id: Date.now(), role: 'user', content: text.trim(), timestamp: new Date().toISOString() }
const isFirstMessage = messages.length === 0
const newMessages = [...messages, userMsg]
setMessages(newMessages)
setIsLoading(true)
try {
const response = await sendMessage(text.trim(), conversationMeta?.characterId || null)
const assistantMsg = {
id: Date.now() + 1,
role: 'assistant',
content: response,
timestamp: new Date().toISOString(),
}
const allMessages = [...newMessages, assistantMsg]
setMessages(allMessages)
const title = isFirstMessage
? text.trim().slice(0, 80) + (text.trim().length > 80 ? '...' : '')
: undefined
await persist(allMessages, title, overrideId)
return response
} catch (err) {
const errorMsg = {
id: Date.now() + 1,
role: 'assistant',
content: `Error: ${err.message}`,
timestamp: new Date().toISOString(),
isError: true,
}
const allMessages = [...newMessages, errorMsg]
setMessages(allMessages)
await persist(allMessages, undefined, overrideId)
return null
} finally {
setIsLoading(false)
}
}, [isLoading, messages, persist])
const clearHistory = useCallback(async () => {
setMessages([])
if (idRef.current) {
await persist([], undefined)
}
}, [persist])
return { messages, isLoading, isLoadingConv, send, clearHistory }
}
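One detail worth calling out from `send` above: the conversation title is derived from the first user message, truncated to 80 characters with a trailing ellipsis. A standalone sketch of that rule (the helper name `deriveTitle` is ours, not part of the hook):

```javascript
// Title rule used on the first message in send(): cap at 80 chars,
// append '...' only when the text was actually truncated.
// deriveTitle is an illustrative name, not an export of useChat.
function deriveTitle(text) {
  const t = text.trim()
  return t.slice(0, 80) + (t.length > 80 ? '...' : '')
}

const short = deriveTitle('Turn on the kitchen lights')
const long = deriveTitle('x'.repeat(100))
// short keeps the full text; long is 80 chars plus '...'
```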


@@ -0,0 +1,66 @@
import { useState, useEffect, useCallback } from 'react'
import { listConversations, saveConversation, deleteConversation as deleteConv } from '../lib/conversationApi'
const ACTIVE_KEY = 'homeai_active_conversation'
export function useConversations() {
const [conversations, setConversations] = useState([])
const [activeId, setActiveId] = useState(() => localStorage.getItem(ACTIVE_KEY) || null)
const [isLoading, setIsLoading] = useState(true)
const loadList = useCallback(async () => {
try {
const list = await listConversations()
setConversations(list)
} catch {
setConversations([])
} finally {
setIsLoading(false)
}
}, [])
useEffect(() => { loadList() }, [loadList])
const select = useCallback((id) => {
setActiveId(id)
if (id) {
localStorage.setItem(ACTIVE_KEY, id)
} else {
localStorage.removeItem(ACTIVE_KEY)
}
}, [])
const create = useCallback(async (characterId, characterName) => {
const id = `conv_${Date.now()}`
const now = new Date().toISOString()
const conv = {
id,
title: '',
characterId: characterId || '',
characterName: characterName || '',
createdAt: now,
updatedAt: now,
messages: [],
}
await saveConversation(conv)
setConversations(prev => [{ ...conv, messageCount: 0 }, ...prev])
select(id)
return id
}, [select])
const remove = useCallback(async (id) => {
await deleteConv(id)
setConversations(prev => prev.filter(c => c.id !== id))
if (activeId === id) {
select(null)
}
}, [activeId, select])
const updateMeta = useCallback((id, updates) => {
setConversations(prev => prev.map(c =>
c.id === id ? { ...c, ...updates } : c
))
}, [])
return { conversations, activeId, isLoading, select, create, remove, updateMeta, refresh: loadList }
}


@@ -1,7 +1,7 @@
import { useState, useRef, useCallback } from 'react'
import { synthesize } from '../lib/api'
export function useTtsPlayback(voice, engine = 'kokoro', model = null) {
const [isPlaying, setIsPlaying] = useState(false)
const audioCtxRef = useRef(null)
const sourceRef = useRef(null)
@@ -23,7 +23,7 @@ export function useTtsPlayback(voice) {
setIsPlaying(true)
try {
const audioData = await synthesize(text, voice, engine, model)
const ctx = getAudioContext()
if (ctx.state === 'suspended') await ctx.resume()
@@ -42,7 +42,7 @@ export function useTtsPlayback(voice) {
console.error('TTS playback error:', err)
setIsPlaying(false)
}
}, [voice, engine, model])
const stop = useCallback(() => {
if (sourceRef.current) {


@@ -4,7 +4,43 @@ import schema from '../../schema/character.schema.json'
const ajv = new Ajv({ allErrors: true, strict: false })
const validate = ajv.compile(schema)
/**
* Migrate a v1 character config to v2 in-place.
* Removes live2d/vtube fields, converts gaze_preset to gaze_presets array,
* and initialises new persona fields.
*/
export function migrateV1toV2(config) {
config.schema_version = 2
// Remove deprecated fields
delete config.live2d_expressions
delete config.vtube_ws_triggers
// Convert single gaze_preset string → gaze_presets array
if ('gaze_preset' in config) {
const old = config.gaze_preset
config.gaze_presets = old ? [{ preset: old, trigger: 'self-portrait' }] : []
delete config.gaze_preset
}
if (!config.gaze_presets) {
config.gaze_presets = []
}
// Initialise new fields if absent
if (config.background === undefined) config.background = ''
if (config.dialogue_style === undefined) config.dialogue_style = ''
if (config.appearance === undefined) config.appearance = ''
if (config.skills === undefined) config.skills = []
return config
}
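A standalone usage sketch of the migration above, with the same field logic re-stated inline so it runs on its own (the sample v1 config is made up for illustration):

```javascript
// Minimal re-statement of the v1 → v2 migration for illustration.
// Mirrors migrateV1toV2 above; the sample config below is hypothetical.
function migrateV1toV2(config) {
  config.schema_version = 2
  delete config.live2d_expressions
  delete config.vtube_ws_triggers
  // Single gaze_preset string becomes a gaze_presets array
  if ('gaze_preset' in config) {
    const old = config.gaze_preset
    config.gaze_presets = old ? [{ preset: old, trigger: 'self-portrait' }] : []
    delete config.gaze_preset
  }
  if (!config.gaze_presets) config.gaze_presets = []
  // New v2 persona fields default to empty
  if (config.background === undefined) config.background = ''
  if (config.dialogue_style === undefined) config.dialogue_style = ''
  if (config.appearance === undefined) config.appearance = ''
  if (config.skills === undefined) config.skills = []
  return config
}

const v2 = migrateV1toV2({ schema_version: 1, name: 'aria', gaze_preset: 'window_left' })
// v2.gaze_presets: [{ preset: 'window_left', trigger: 'self-portrait' }]
```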
export function validateCharacter(config) {
// Auto-migrate v1 → v2
if (config.schema_version === 1 || config.schema_version === undefined) {
migrateV1toV2(config)
}
const valid = validate(config)
if (!valid) {
throw new Error(ajv.errorsText(validate.errors))


@@ -1,8 +1,30 @@
const MAX_RETRIES = 3
const RETRY_DELAY_MS = 2000
async function fetchWithRetry(url, options, retries = MAX_RETRIES) {
for (let attempt = 1; attempt <= retries; attempt++) {
try {
const res = await fetch(url, options)
if (res.status === 502 && attempt < retries) {
// Bridge unreachable — wait and retry
await new Promise(r => setTimeout(r, RETRY_DELAY_MS * attempt))
continue
}
return res
} catch (err) {
if (attempt >= retries) throw err
await new Promise(r => setTimeout(r, RETRY_DELAY_MS * attempt))
}
}
}
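The loop above retries only on HTTP 502 and backs off linearly (delay × attempt number); a final-attempt 502 is returned to the caller rather than retried. A synchronous model of that policy, so the behavior can be checked without real network calls or timers (`retryPlan` and its shape are illustrative, not part of the module):

```javascript
// Synchronous model of the retry policy in fetchWithRetry above.
// `statuses` is the status each attempt would have observed.
// Returns the final status, how many attempts were made, and the
// backoff delays that would have been slept (2s, 4s, ... by default).
function retryPlan(statuses, retries = 3, delayMs = 2000) {
  const delays = []
  for (let attempt = 1; attempt <= retries; attempt++) {
    const status = statuses[attempt - 1]
    if (status === 502 && attempt < retries) {
      delays.push(delayMs * attempt) // linear backoff: 1x, 2x, ...
      continue
    }
    // Non-502, or the last allowed attempt: hand the result back
    return { status, attempts: attempt, delays }
  }
}

const plan = retryPlan([502, 502, 200])
// plan: { status: 200, attempts: 3, delays: [2000, 4000] }
```

Note the deliberate asymmetry with thrown errors: in `fetchWithRetry` a 502 on the final attempt is returned as-is, while a thrown network error on the final attempt propagates to the caller.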
export async function sendMessage(text, characterId = null) {
const payload = { message: text, agent: 'main' }
if (characterId) payload.character_id = characterId
const res = await fetchWithRetry('/api/agent/message', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(payload),
})
if (!res.ok) {
const err = await res.json().catch(() => ({ error: 'Request failed' }))
@@ -12,11 +34,13 @@ export async function sendMessage(text) {
return data.response
}
export async function synthesize(text, voice, engine = 'kokoro', model = null) {
const payload = { text, voice, engine }
if (model) payload.model = model
const res = await fetch('/api/tts', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(payload),
})
if (!res.ok) throw new Error('TTS failed')
return await res.arrayBuffer()


@@ -30,7 +30,15 @@ export const VOICES = [
{ id: 'bm_lewis', label: 'Lewis (M, UK)' },
]
export const TTS_ENGINES = [
{ id: 'kokoro', label: 'Kokoro (local)' },
{ id: 'chatterbox', label: 'Chatterbox (voice clone)' },
{ id: 'qwen3', label: 'Qwen3 TTS' },
{ id: 'elevenlabs', label: 'ElevenLabs (cloud)' },
]
export const DEFAULT_SETTINGS = {
ttsEngine: 'kokoro',
voice: DEFAULT_VOICE,
autoTts: true,
sttMode: 'bridge',


@@ -0,0 +1,25 @@
export async function listConversations() {
const res = await fetch('/api/conversations')
if (!res.ok) throw new Error(`Failed to list conversations: ${res.status}`)
return res.json()
}
export async function getConversation(id) {
const res = await fetch(`/api/conversations/${encodeURIComponent(id)}`)
if (!res.ok) return null
return res.json()
}
export async function saveConversation(conversation) {
const res = await fetch('/api/conversations', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(conversation),
})
if (!res.ok) throw new Error(`Failed to save conversation: ${res.status}`)
}
export async function deleteConversation(id) {
const res = await fetch(`/api/conversations/${encodeURIComponent(id)}`, { method: 'DELETE' })
if (!res.ok) throw new Error(`Failed to delete conversation: ${res.status}`)
}


@@ -0,0 +1,45 @@
export async function getPersonalMemories(characterId) {
const res = await fetch(`/api/memories/personal/${encodeURIComponent(characterId)}`)
if (!res.ok) return { characterId, memories: [] }
return res.json()
}
export async function savePersonalMemory(characterId, memory) {
const res = await fetch(`/api/memories/personal/${encodeURIComponent(characterId)}`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(memory),
})
if (!res.ok) throw new Error(`Failed to save memory: ${res.status}`)
return res.json()
}
export async function deletePersonalMemory(characterId, memoryId) {
const res = await fetch(`/api/memories/personal/${encodeURIComponent(characterId)}/${encodeURIComponent(memoryId)}`, {
method: 'DELETE',
})
if (!res.ok) throw new Error(`Failed to delete memory: ${res.status}`)
}
export async function getGeneralMemories() {
const res = await fetch('/api/memories/general')
if (!res.ok) return { memories: [] }
return res.json()
}
export async function saveGeneralMemory(memory) {
const res = await fetch('/api/memories/general', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(memory),
})
if (!res.ok) throw new Error(`Failed to save memory: ${res.status}`)
return res.json()
}
export async function deleteGeneralMemory(memoryId) {
const res = await fetch(`/api/memories/general/${encodeURIComponent(memoryId)}`, {
method: 'DELETE',
})
if (!res.ok) throw new Error(`Failed to delete memory: ${res.status}`)
}


@@ -1,23 +1,9 @@
import { useState, useEffect, useCallback } from 'react';
import { useNavigate } from 'react-router-dom';
import { validateCharacter } from '../lib/SchemaValidator';
const ACTIVE_KEY = 'homeai_active_character';
function getActiveId() {
return localStorage.getItem(ACTIVE_KEY) || null;
}
@@ -27,15 +13,52 @@ function setActiveId(id) {
}
export default function Characters() {
const [profiles, setProfiles] = useState([]);
const [activeId, setActive] = useState(getActiveId);
const [error, setError] = useState(null);
const [dragOver, setDragOver] = useState(false);
const [loading, setLoading] = useState(true);
const [satMap, setSatMap] = useState({ default: '', satellites: {} });
const [newSatId, setNewSatId] = useState('');
const [newSatChar, setNewSatChar] = useState('');
const navigate = useNavigate();
// Load profiles and satellite map on mount
useEffect(() => {
Promise.all([
fetch('/api/characters').then(r => r.json()),
fetch('/api/satellite-map').then(r => r.json()),
])
.then(([chars, map]) => {
setProfiles(chars);
setSatMap(map);
setLoading(false);
})
.catch(err => { setError(`Failed to load: ${err.message}`); setLoading(false); });
}, []);
const saveSatMap = useCallback(async (updated) => {
setSatMap(updated);
await fetch('/api/satellite-map', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(updated),
});
}, []);
const saveProfile = useCallback(async (profile) => {
const res = await fetch('/api/characters', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(profile),
});
if (!res.ok) throw new Error('Failed to save profile');
}, []);
const deleteProfile = useCallback(async (id) => {
const safeId = id.replace(/[^a-zA-Z0-9_\-\.]/g, '_');
await fetch(`/api/characters/${safeId}`, { method: 'DELETE' });
}, []);
const handleImport = (e) => {
const files = Array.from(e.target?.files || []);
@@ -47,12 +70,14 @@ export default function Characters() {
files.forEach(file => {
if (!file.name.endsWith('.json')) return;
const reader = new FileReader();
reader.onload = async (ev) => {
try {
const data = JSON.parse(ev.target.result);
validateCharacter(data);
const id = data.name + '_' + Date.now();
const profile = { id, data, image: null, addedAt: new Date().toISOString() };
await saveProfile(profile);
setProfiles(prev => [...prev, profile]);
setError(null);
} catch (err) {
setError(`Import failed for ${file.name}: ${err.message}`);
@@ -73,15 +98,17 @@ export default function Characters() {
const file = e.target.files[0];
if (!file) return;
const reader = new FileReader();
reader.onload = async (ev) => {
const updated = profiles.map(p => p.id === profileId ? { ...p, image: ev.target.result } : p);
const profile = updated.find(p => p.id === profileId);
if (profile) await saveProfile(profile);
setProfiles(updated);
};
reader.readAsDataURL(file);
};
const removeProfile = async (id) => {
await deleteProfile(id);
setProfiles(prev => prev.filter(p => p.id !== id));
if (activeId === id) {
setActive(null);
@@ -92,6 +119,28 @@ export default function Characters() {
const activateProfile = (id) => {
setActive(id);
setActiveId(id);
// Sync active character's TTS settings to chat settings
const profile = profiles.find(p => p.id === id);
if (profile?.data?.tts) {
const tts = profile.data.tts;
const engine = tts.engine || 'kokoro';
let voice;
if (engine === 'kokoro') voice = tts.kokoro_voice || 'af_heart';
else if (engine === 'elevenlabs') voice = tts.elevenlabs_voice_id || '';
else if (engine === 'chatterbox') voice = tts.voice_ref_path || '';
else voice = '';
try {
const raw = localStorage.getItem('homeai_dashboard_settings');
const settings = raw ? JSON.parse(raw) : {};
localStorage.setItem('homeai_dashboard_settings', JSON.stringify({
...settings,
ttsEngine: engine,
voice: voice,
}));
} catch { /* ignore */ }
}
};
const exportProfile = (profile) => {
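The engine-to-voice branch inside `activateProfile` above can be expressed as a small pure function; a sketch under the field names from the character TTS config (`resolveVoice` is our name, not part of the component):

```javascript
// Maps a character's tts config to the voice value the dashboard stores,
// mirroring the branch in activateProfile. Illustrative helper name.
function resolveVoice(tts = {}) {
  const engine = tts.engine || 'kokoro'
  if (engine === 'kokoro') return { engine, voice: tts.kokoro_voice || 'af_heart' }
  if (engine === 'elevenlabs') return { engine, voice: tts.elevenlabs_voice_id || '' }
  if (engine === 'chatterbox') return { engine, voice: tts.voice_ref_path || '' }
  // Engines with no voice field (e.g. qwen3) store an empty voice
  return { engine, voice: '' }
}

const a = resolveVoice({ engine: 'elevenlabs', elevenlabs_voice_id: 'EXAVITQu' })
const b = resolveVoice({}) // defaults to kokoro / af_heart
```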
@@ -125,14 +174,29 @@ export default function Characters() {
)}
</p>
</div>
<div className="flex gap-3">
<button
onClick={() => {
sessionStorage.removeItem('edit_character');
sessionStorage.removeItem('edit_character_profile_id');
navigate('/editor');
}}
className="flex items-center gap-2 px-4 py-2 bg-indigo-600 hover:bg-indigo-500 text-white rounded-lg transition-colors"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4.5v15m7.5-7.5h-15" />
</svg>
New Character
</button>
<label className="flex items-center gap-2 px-4 py-2 bg-gray-800 hover:bg-gray-700 text-gray-300 rounded-lg cursor-pointer border border-gray-700 transition-colors">
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M3 16.5v2.25A2.25 2.25 0 005.25 21h13.5A2.25 2.25 0 0021 18.75V16.5m-13.5-9L12 3m0 0l4.5 4.5M12 3v13.5" />
</svg>
Import JSON
<input type="file" accept=".json" multiple className="hidden" onChange={handleImport} />
</label>
</div> </div>
</div>
{error && (
<div className="bg-red-900/30 border border-red-500/50 text-red-300 px-4 py-3 rounded-lg text-sm">
@@ -158,7 +222,11 @@ export default function Characters() {
</div>
{/* Profile grid */}
{loading ? (
<div className="text-center py-16">
<p className="text-gray-500">Loading characters...</p>
</div>
) : profiles.length === 0 ? (
<div className="text-center py-16">
<svg className="w-16 h-16 mx-auto text-gray-700 mb-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1}>
<path strokeLinecap="round" strokeLinejoin="round" d="M15.75 6a3.75 3.75 0 11-7.5 0 3.75 3.75 0 017.5 0zM4.501 20.118a7.5 7.5 0 0114.998 0A17.933 17.933 0 0112 21.75c-2.676 0-5.216-.584-7.499-1.632z" />
@@ -230,11 +298,32 @@ export default function Characters() {
<span className="px-2 py-0.5 bg-gray-700/70 text-gray-400 text-xs rounded-full">
{char.model_overrides?.primary || 'default'}
</span>
{char.tts?.engine === 'kokoro' && char.tts?.kokoro_voice && (
<span className="px-2 py-0.5 bg-gray-700/70 text-gray-400 text-xs rounded-full">
{char.tts.kokoro_voice}
</span>
)}
{char.tts?.engine === 'elevenlabs' && char.tts?.elevenlabs_voice_id && (
<span className="px-2 py-0.5 bg-gray-700/70 text-gray-400 text-xs rounded-full" title={char.tts.elevenlabs_voice_id}>
{char.tts.elevenlabs_voice_name || char.tts.elevenlabs_voice_id.slice(0, 8) + '…'}
</span>
)}
{char.tts?.engine === 'chatterbox' && char.tts?.voice_ref_path && (
<span className="px-2 py-0.5 bg-gray-700/70 text-gray-400 text-xs rounded-full" title={char.tts.voice_ref_path}>
{char.tts.voice_ref_path.split('/').pop()}
</span>
)}
{(() => {
const defaultPreset = char.gaze_presets?.find(gp => gp.trigger === 'self-portrait')?.preset
|| char.gaze_presets?.[0]?.preset
|| char.gaze_preset
|| null;
return defaultPreset ? (
<span className="px-2 py-0.5 bg-violet-500/20 text-violet-300 text-xs rounded-full border border-violet-500/30" title={`GAZE: ${defaultPreset}`}>
{defaultPreset}
</span>
) : null;
})()}
</div>
<div className="flex gap-2 pt-1">
@@ -287,6 +376,96 @@ export default function Characters() {
})}
</div>
)}
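The gaze badge in the card markup above resolves its preset through a fallback chain: the preset whose trigger is `self-portrait`, else the first entry in `gaze_presets`, else the legacy v1 `gaze_preset` field, else nothing. The same chain as a standalone helper (`defaultGazePreset` is our name for the inline IIFE):

```javascript
// Gaze badge fallback, mirroring the inline IIFE in the card markup:
// self-portrait preset → first preset → legacy v1 gaze_preset → null.
function defaultGazePreset(char) {
  return char.gaze_presets?.find(gp => gp.trigger === 'self-portrait')?.preset
    || char.gaze_presets?.[0]?.preset
    || char.gaze_preset
    || null
}

const v2char = { gaze_presets: [{ preset: 'desk', trigger: 'idle' }, { preset: 'portrait', trigger: 'self-portrait' }] }
const v1char = { gaze_preset: 'window' }
// v2char resolves to 'portrait'; v1char falls back to 'window'
```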
{/* Satellite Assignment */}
{!loading && profiles.length > 0 && (
<div className="bg-gray-900 border border-gray-800 rounded-xl p-5 space-y-4">
<div>
<h2 className="text-lg font-semibold text-gray-200">Satellite Routing</h2>
<p className="text-xs text-gray-500 mt-1">Assign characters to voice satellites. Unmapped satellites use the default.</p>
</div>
{/* Default character */}
<div className="flex items-center gap-3">
<label className="text-sm text-gray-400 w-32 shrink-0">Default</label>
<select
value={satMap.default || ''}
onChange={(e) => saveSatMap({ ...satMap, default: e.target.value })}
className="flex-1 bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
<option value="">-- None --</option>
{profiles.map(p => (
<option key={p.id} value={p.id}>{p.data.display_name || p.data.name}</option>
))}
</select>
</div>
{/* Per-satellite assignments */}
{Object.entries(satMap.satellites || {}).map(([satId, charId]) => (
<div key={satId} className="flex items-center gap-3">
<span className="text-sm text-gray-300 w-32 shrink-0 truncate font-mono" title={satId}>{satId}</span>
<select
value={charId}
onChange={(e) => {
const updated = { ...satMap, satellites: { ...satMap.satellites, [satId]: e.target.value } };
saveSatMap(updated);
}}
className="flex-1 bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
{profiles.map(p => (
<option key={p.id} value={p.id}>{p.data.display_name || p.data.name}</option>
))}
</select>
<button
onClick={() => {
const { [satId]: _, ...rest } = satMap.satellites;
saveSatMap({ ...satMap, satellites: rest });
}}
className="px-2 py-1.5 bg-gray-700 hover:bg-red-600 text-gray-400 hover:text-white rounded-lg transition-colors"
title="Remove"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
</div>
))}
{/* Add new satellite */}
<div className="flex items-center gap-3 pt-2 border-t border-gray-800">
<input
type="text"
value={newSatId}
onChange={(e) => setNewSatId(e.target.value)}
placeholder="Satellite ID (from bridge log)"
className="w-32 shrink-0 bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500 font-mono"
/>
<select
value={newSatChar}
onChange={(e) => setNewSatChar(e.target.value)}
className="flex-1 bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
<option value="">-- Select Character --</option>
{profiles.map(p => (
<option key={p.id} value={p.id}>{p.data.display_name || p.data.name}</option>
))}
</select>
<button
onClick={() => {
if (newSatId && newSatChar) {
saveSatMap({ ...satMap, satellites: { ...satMap.satellites, [newSatId]: newSatChar } });
setNewSatId('');
setNewSatChar('');
}
}}
disabled={!newSatId || !newSatChar}
className="px-3 py-1.5 bg-indigo-600 hover:bg-indigo-500 disabled:bg-gray-700 disabled:text-gray-500 text-white text-sm rounded-lg transition-colors"
>
Add
</button>
</div>
</div>
)}
</div>
);
}


@@ -1,56 +1,81 @@
import { useState, useCallback } from 'react'
import ChatPanel from '../components/ChatPanel'
import InputBar from '../components/InputBar'
import StatusIndicator from '../components/StatusIndicator'
import SettingsDrawer from '../components/SettingsDrawer'
import ConversationList from '../components/ConversationList'
import { useSettings } from '../hooks/useSettings'
import { useBridgeHealth } from '../hooks/useBridgeHealth'
import { useChat } from '../hooks/useChat'
import { useTtsPlayback } from '../hooks/useTtsPlayback'
import { useVoiceInput } from '../hooks/useVoiceInput'
import { useActiveCharacter } from '../hooks/useActiveCharacter'
import { useConversations } from '../hooks/useConversations'
export default function Chat() {
const { settings, updateSetting } = useSettings()
const isOnline = useBridgeHealth()
const character = useActiveCharacter()
const {
conversations, activeId, isLoading: isLoadingList,
select, create, remove, updateMeta,
} = useConversations()
const convMeta = {
characterId: character?.id || '',
characterName: character?.name || '',
}
const { messages, isLoading, isLoadingConv, send, clearHistory } = useChat(activeId, convMeta, updateMeta)
// Use character's TTS config if available, fall back to global settings
const ttsEngine = character?.tts?.engine || settings.ttsEngine
const ttsVoice = ttsEngine === 'elevenlabs'
? (character?.tts?.elevenlabs_voice_id || settings.voice)
: (character?.tts?.kokoro_voice || settings.voice)
const ttsModel = ttsEngine === 'elevenlabs' ? (character?.tts?.elevenlabs_model || null) : null
const { isPlaying, speak, stop } = useTtsPlayback(ttsVoice, ttsEngine, ttsModel)
const { isRecording, isTranscribing, startRecording, stopRecording } = useVoiceInput(settings.sttMode)
const [settingsOpen, setSettingsOpen] = useState(false)
// Send a message and optionally speak the response
const handleSend = useCallback(async (text) => {
// Auto-create a conversation if none is active
let newId = null
if (!activeId) {
newId = await create(convMeta.characterId, convMeta.characterName)
}
const response = await send(text, newId)
if (response && settings.autoTts) {
speak(response)
}
}, [activeId, create, convMeta, send, settings.autoTts, speak])
// Toggle voice recording
const handleVoiceToggle = useCallback(async () => {
if (isRecording) {
const text = await stopRecording()
if (text) handleSend(text)
} else {
startRecording()
}
}, [isRecording, stopRecording, startRecording, handleSend])
const handleNewChat = useCallback(() => {
create(convMeta.characterId, convMeta.characterName)
}, [create, convMeta])
return (
<div className="flex-1 flex min-h-0">
{/* Conversation sidebar */}
<ConversationList
conversations={conversations}
activeId={activeId}
onCreate={handleNewChat}
onSelect={select}
onDelete={remove}
/>
{/* Chat area */}
<div className="flex-1 flex flex-col min-h-0 min-w-0">
{/* Status bar */}
<header className="flex items-center justify-between px-4 py-2 border-b border-gray-800/50 shrink-0">
<div className="flex items-center gap-2">
@@ -91,8 +116,13 @@ export default function Chat() {
</div>
</header>
{/* Messages */}
<ChatPanel
messages={messages}
isLoading={isLoading || isLoadingConv}
onReplay={speak}
character={character}
/>
{/* Input */}
<InputBar
@@ -111,5 +141,6 @@ export default function Chat() {
onUpdate={updateSetting}
/>
</div>
</div>
)
}


@@ -1,14 +1,18 @@
import React, { useState, useEffect, useRef } from 'react';
import { validateCharacter, migrateV1toV2 } from '../lib/SchemaValidator';
const DEFAULT_CHARACTER = {
schema_version: 2,
name: "",
display_name: "",
description: "",
background: "",
dialogue_style: "",
appearance: "",
skills: [],
system_prompt: "",
model_overrides: {
primary: "qwen3.5:35b-a3b",
fast: "qwen2.5:7b"
},
tts: {
@@ -16,24 +20,8 @@ const DEFAULT_CHARACTER = {
kokoro_voice: "af_heart",
speed: 1.0
},
gaze_presets: [],
custom_rules: [],
notes: ""
};
@@ -43,7 +31,12 @@ export default function Editor() {
  if (editData) {
    sessionStorage.removeItem('edit_character');
    try {
-       return JSON.parse(editData);
+       const parsed = JSON.parse(editData);
+       // Auto-migrate v1 data
+       if (parsed.schema_version === 1 || !parsed.schema_version) {
+         migrateV1toV2(parsed);
+       }
+       return parsed;
    } catch {
      return DEFAULT_CHARACTER;
    }
@@ -52,6 +45,7 @@ export default function Editor() {
  });
  const [error, setError] = useState(null);
  const [saved, setSaved] = useState(false);
+   const isEditing = !!sessionStorage.getItem('edit_character_profile_id');

  // TTS preview state
  const [ttsState, setTtsState] = useState('idle');
@@ -65,6 +59,19 @@ export default function Editor() {
  const [elevenLabsModels, setElevenLabsModels] = useState([]);
  const [isLoadingElevenLabs, setIsLoadingElevenLabs] = useState(false);
// GAZE presets state (from API)
const [availableGazePresets, setAvailableGazePresets] = useState([]);
const [isLoadingGaze, setIsLoadingGaze] = useState(false);
// Character lookup state
const [lookupName, setLookupName] = useState('');
const [lookupFranchise, setLookupFranchise] = useState('');
const [isLookingUp, setIsLookingUp] = useState(false);
const [lookupDone, setLookupDone] = useState(false);
// Skills input state
const [newSkill, setNewSkill] = useState('');
  const fetchElevenLabsData = async (key) => {
    if (!key) return;
    setIsLoadingElevenLabs(true);
@@ -95,6 +102,16 @@ export default function Editor() {
    }
  }, [character.tts.engine]);
// Fetch GAZE presets on mount
useEffect(() => {
setIsLoadingGaze(true);
fetch('/api/gaze/presets')
.then(r => r.ok ? r.json() : { presets: [] })
.then(data => setAvailableGazePresets(data.presets || []))
.catch(() => {})
.finally(() => setIsLoadingGaze(false));
}, []);
  useEffect(() => {
    return () => {
      if (audioRef.current) { audioRef.current.pause(); audioRef.current = null; }
@@ -119,27 +136,35 @@ export default function Editor() {
    }
  };

-   const handleSaveToProfiles = () => {
+   const handleSaveToProfiles = async () => {
    try {
      validateCharacter(character);
      setError(null);

      const profileId = sessionStorage.getItem('edit_character_profile_id');
-       const storageKey = 'homeai_characters';
-       const raw = localStorage.getItem(storageKey);
-       let profiles = raw ? JSON.parse(raw) : [];
+       let profile;
      if (profileId) {
-         profiles = profiles.map(p =>
-           p.id === profileId ? { ...p, data: character } : p
-         );
-         sessionStorage.removeItem('edit_character_profile_id');
+         const res = await fetch('/api/characters');
+         const profiles = await res.json();
+         const existing = profiles.find(p => p.id === profileId);
+         profile = existing
+           ? { ...existing, data: character }
+           : { id: profileId, data: character, image: null, addedAt: new Date().toISOString() };
+         // Keep the profile ID in sessionStorage so subsequent saves update the same file
      } else {
        const id = character.name + '_' + Date.now();
-         profiles.push({ id, data: character, image: null, addedAt: new Date().toISOString() });
+         profile = { id, data: character, image: null, addedAt: new Date().toISOString() };
+         // Store the new ID so subsequent saves update the same file
+         sessionStorage.setItem('edit_character_profile_id', profile.id);
      }
-       localStorage.setItem(storageKey, JSON.stringify(profiles));
+       await fetch('/api/characters', {
+         method: 'POST',
+         headers: { 'Content-Type': 'application/json' },
+         body: JSON.stringify(profile),
+       });

      setSaved(true);
      setTimeout(() => setSaved(false), 2000);
    } catch (err) {
@@ -164,6 +189,59 @@ export default function Editor() {
    reader.readAsText(file);
  };
// Character lookup from MCP
const handleCharacterLookup = async () => {
if (!lookupName || !lookupFranchise) return;
setIsLookingUp(true);
setError(null);
try {
const res = await fetch('/api/character-lookup', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ name: lookupName, franchise: lookupFranchise }),
});
if (!res.ok) {
const err = await res.json().catch(() => ({ error: 'Lookup failed' }));
throw new Error(err.error || `Lookup returned ${res.status}`);
}
const data = await res.json();
// Build dialogue_style from personality + notable quotes
let dialogueStyle = data.personality || '';
if (data.notable_quotes?.length) {
dialogueStyle += '\n\nExample dialogue:\n' + data.notable_quotes.map(q => `"${q}"`).join('\n');
}
// Filter abilities to clean text-only entries (skip image captions)
const skills = (data.abilities || [])
.filter(a => a.length > 20 && !a.includes('.jpg') && !a.includes('.png'))
.slice(0, 10);
// Auto-generate system prompt
const promptName = character.display_name || lookupName;
const personality = data.personality ? data.personality.split('.').slice(0, 3).join('.') + '.' : '';
const systemPrompt = `You are ${promptName} from ${lookupFranchise}. ${personality} Stay in character at all times. Respond naturally and conversationally.`;
setCharacter(prev => ({
...prev,
name: prev.name || lookupName.toLowerCase().replace(/\s+/g, '_'),
display_name: prev.display_name || lookupName,
description: data.description ? data.description.split('.').slice(0, 2).join('.') + '.' : prev.description,
background: data.background || prev.background,
appearance: data.appearance || prev.appearance,
dialogue_style: dialogueStyle || prev.dialogue_style,
skills: skills.length > 0 ? skills : prev.skills,
system_prompt: prev.system_prompt || systemPrompt,
}));
setLookupDone(true);
} catch (err) {
setError(`Character lookup failed: ${err.message}`);
} finally {
setIsLookingUp(false);
}
};
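For reference, the `/api/character-lookup` response shape that `handleCharacterLookup` consumes can be inferred from the fields it reads (`description`, `background`, `appearance`, `personality`, `notable_quotes`, `abilities`). The dialogue-style construction is easy to factor out and unit-test; `buildDialogueStyle` below is an illustrative extraction of that logic, not an existing export:

```javascript
// Illustrative pure extraction (not an existing export) of the
// dialogue_style construction in handleCharacterLookup:
// personality text first, then quoted example lines.
function buildDialogueStyle(personality, quotes) {
  let style = personality || '';
  if (quotes?.length) {
    style += '\n\nExample dialogue:\n' + quotes.map((q) => `"${q}"`).join('\n');
  }
  return style;
}
```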
  const handleChange = (field, value) => {
    setCharacter(prev => ({ ...prev, [field]: value }));
  };
@@ -175,6 +253,50 @@ export default function Editor() {
    }));
  };
// Skills helpers
const addSkill = () => {
const trimmed = newSkill.trim();
if (!trimmed) return;
setCharacter(prev => ({
...prev,
skills: [...(prev.skills || []), trimmed]
}));
setNewSkill('');
};
const removeSkill = (index) => {
setCharacter(prev => {
const updated = [...(prev.skills || [])];
updated.splice(index, 1);
return { ...prev, skills: updated };
});
};
// GAZE preset helpers
const addGazePreset = () => {
setCharacter(prev => ({
...prev,
gaze_presets: [...(prev.gaze_presets || []), { preset: '', trigger: 'self-portrait' }]
}));
};
const removeGazePreset = (index) => {
setCharacter(prev => {
const updated = [...(prev.gaze_presets || [])];
updated.splice(index, 1);
return { ...prev, gaze_presets: updated };
});
};
const handleGazePresetChange = (index, field, value) => {
setCharacter(prev => {
const updated = [...(prev.gaze_presets || [])];
updated[index] = { ...updated[index], [field]: value };
return { ...prev, gaze_presets: updated };
});
};
// Custom rules helpers
  const handleRuleChange = (index, field, value) => {
    setCharacter(prev => {
      const newRules = [...(prev.custom_rules || [])];
@@ -198,37 +320,40 @@ export default function Editor() {
    });
  };
// TTS preview
- const stopPreview = () => {
-   if (audioRef.current) {
-     audioRef.current.pause();
-     audioRef.current = null;
-   }
-   if (objectUrlRef.current) {
-     URL.revokeObjectURL(objectUrlRef.current);
-     objectUrlRef.current = null;
-   }
+ const stopPreview = () => {
+   if (audioRef.current) { audioRef.current.pause(); audioRef.current = null; }
+   if (objectUrlRef.current) { URL.revokeObjectURL(objectUrlRef.current); objectUrlRef.current = null; }
  window.speechSynthesis.cancel();
  setTtsState('idle');
};

const previewTTS = async () => {
  stopPreview();
-   const text = previewText || `Hi, I am ${character.display_name}. This is a preview of my voice.`;
+   const text = previewText || `Hi, I am ${character.display_name || character.name}. This is a preview of my voice.`;
+   const engine = character.tts.engine;

-   if (character.tts.engine === 'kokoro') {
+   let bridgeBody = null;
+   if (engine === 'kokoro') {
+     bridgeBody = { text, voice: character.tts.kokoro_voice, engine: 'kokoro' };
+   } else if (engine === 'elevenlabs' && character.tts.elevenlabs_voice_id) {
+     bridgeBody = { text, voice: character.tts.elevenlabs_voice_id, engine: 'elevenlabs', model: character.tts.elevenlabs_model };
+   }

+   if (bridgeBody) {
    setTtsState('loading');
    let blob;
    try {
      const response = await fetch('/api/tts', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
-         body: JSON.stringify({ text, voice: character.tts.kokoro_voice })
+         body: JSON.stringify(bridgeBody)
      });
      if (!response.ok) throw new Error('TTS bridge returned ' + response.status);
      blob = await response.blob();
    } catch (err) {
      setTtsState('idle');
-       setError(`Kokoro preview failed: ${err.message}. Falling back to browser TTS.`);
+       setError(`${engine} preview failed: ${err.message}. Falling back to browser TTS.`);
      runBrowserTTS(text);
      return;
    }
@@ -269,7 +394,9 @@ export default function Editor() {
  <div>
    <h1 className="text-3xl font-bold text-gray-100">Character Editor</h1>
    <p className="text-sm text-gray-500 mt-1">
-       Editing: {character.display_name || character.name}
+       {character.display_name || character.name
+         ? `Editing: ${character.display_name || character.name}`
+         : 'New character'}
    </p>
  </div>
  <div className="flex gap-3">
@@ -311,6 +438,64 @@ export default function Editor() {
  {error && (
    <div className="bg-red-900/30 border border-red-500/50 text-red-300 px-4 py-3 rounded-lg text-sm">
      {error}
<button onClick={() => setError(null)} className="ml-2 text-red-400 hover:text-red-300">&times;</button>
</div>
)}
{/* Character Lookup — auto-fill from fictional character wiki */}
{!isEditing && (
<div className={cardClass}>
<div className="flex items-center gap-2">
<svg className="w-5 h-5 text-indigo-400" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M21 21l-5.197-5.197m0 0A7.5 7.5 0 105.196 5.196a7.5 7.5 0 0010.607 10.607z" />
</svg>
<h2 className="text-lg font-semibold text-gray-200">Auto-fill from Character</h2>
</div>
<p className="text-xs text-gray-500">Fetch character data from Fandom/Wikipedia to auto-populate fields. You can edit everything after.</p>
<div className="flex gap-3 items-end">
<div className="flex-1">
<label className={labelClass}>Character Name</label>
<input
type="text"
className={inputClass}
value={lookupName}
onChange={(e) => setLookupName(e.target.value)}
placeholder="e.g. Tifa Lockhart"
/>
</div>
<div className="flex-1">
<label className={labelClass}>Franchise / Series</label>
<input
type="text"
className={inputClass}
value={lookupFranchise}
onChange={(e) => setLookupFranchise(e.target.value)}
placeholder="e.g. Final Fantasy VII"
/>
</div>
<button
onClick={handleCharacterLookup}
disabled={isLookingUp || !lookupName || !lookupFranchise}
className={`flex items-center gap-2 px-5 py-2 rounded-lg text-white transition-colors whitespace-nowrap ${
isLookingUp
? 'bg-indigo-800 cursor-wait'
: lookupDone
? 'bg-emerald-600 hover:bg-emerald-500'
: 'bg-indigo-600 hover:bg-indigo-500 disabled:bg-gray-700 disabled:text-gray-500'
}`}
>
{isLookingUp && (
<svg className="w-4 h-4 animate-spin" viewBox="0 0 24 24" fill="none">
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z" />
</svg>
)}
{isLookingUp ? 'Fetching...' : lookupDone ? 'Fetched' : 'Lookup'}
</button>
</div>
{lookupDone && (
<p className="text-xs text-emerald-400">Fields populated from wiki data. Review and edit below.</p>
)}
    </div>
  )}
@@ -324,11 +509,11 @@ export default function Editor() {
  </div>
  <div>
    <label className={labelClass}>Display Name</label>
-     <input type="text" className={inputClass} value={character.display_name} onChange={(e) => handleChange('display_name', e.target.value)} />
+     <input type="text" className={inputClass} value={character.display_name || ''} onChange={(e) => handleChange('display_name', e.target.value)} />
  </div>
  <div>
    <label className={labelClass}>Description</label>
-     <input type="text" className={inputClass} value={character.description} onChange={(e) => handleChange('description', e.target.value)} />
+     <input type="text" className={inputClass} value={character.description || ''} onChange={(e) => handleChange('description', e.target.value)} />
  </div>
</div>
@@ -359,7 +544,14 @@ export default function Editor() {
  <div>
    <label className={labelClass}>Voice ID</label>
    {elevenLabsVoices.length > 0 ? (
-       <select className={selectClass} value={character.tts.elevenlabs_voice_id || ''} onChange={(e) => handleNestedChange('tts', 'elevenlabs_voice_id', e.target.value)}>
+       <select className={selectClass} value={character.tts.elevenlabs_voice_id || ''} onChange={(e) => {
+         const voiceId = e.target.value;
+         const voice = elevenLabsVoices.find(v => v.voice_id === voiceId);
+         setCharacter(prev => ({
+           ...prev,
+           tts: { ...prev.tts, elevenlabs_voice_id: voiceId, elevenlabs_voice_name: voice?.name || '' }
+         }));
+       }}>
        <option value="">-- Select Voice --</option>
        {elevenLabsVoices.map(v => (
          <option key={v.voice_id} value={v.voice_id}>{v.name} ({v.category})</option>
@@ -439,7 +631,7 @@ export default function Editor() {
  className={inputClass}
  value={previewText}
  onChange={(e) => setPreviewText(e.target.value)}
-   placeholder={`Hi, I am ${character.display_name}. This is a preview of my voice.`}
+   placeholder={`Hi, I am ${character.display_name || character.name || 'your character'}. This is a preview of my voice.`}
/>
</div>
<div className="flex gap-2">
@@ -474,6 +666,8 @@ export default function Editor() {
  <p className="text-xs text-gray-600">
    {character.tts.engine === 'kokoro'
      ? 'Previews via local Kokoro TTS bridge (port 8081).'
+       : character.tts.engine === 'elevenlabs'
+         ? 'Previews via ElevenLabs through bridge.'
      : 'Uses browser TTS for preview. Local TTS available with Kokoro engine.'}
  </p>
</div>
@@ -483,33 +677,162 @@ export default function Editor() {
<div className={cardClass}>
  <div className="flex justify-between items-center">
    <h2 className="text-lg font-semibold text-gray-200">System Prompt</h2>
-     <span className="text-xs text-gray-600">{character.system_prompt.length} chars</span>
+     <span className="text-xs text-gray-600">{(character.system_prompt || '').length} chars</span>
  </div>
  <textarea
    className={inputClass + " h-32 resize-y"}
    value={character.system_prompt}
    onChange={(e) => handleChange('system_prompt', e.target.value)}
+     placeholder="You are [character name]. Describe their personality, behaviour, and role..."
  />
</div>

- <div className="grid grid-cols-1 md:grid-cols-2 gap-6">
-   {/* Live2D Expressions */}
-   <div className={cardClass}>
-     <h2 className="text-lg font-semibold text-gray-200">Live2D Expressions</h2>
-     {Object.entries(character.live2d_expressions).map(([key, val]) => (
-       <div key={key} className="flex justify-between items-center gap-4">
-         <label className="text-sm font-medium text-gray-400 w-1/3 capitalize">{key}</label>
-         <input type="text" className={inputClass + " w-2/3"} value={val} onChange={(e) => handleNestedChange('live2d_expressions', key, e.target.value)} />
+ {/* Character Profile — new v2 fields */}
+ <div className={cardClass}>
+   <h2 className="text-lg font-semibold text-gray-200">Character Profile</h2>
+   <div>
+     <label className={labelClass}>Background / Backstory</label>
+     <textarea
+       className={inputClass + " h-28 resize-y text-sm"}
+       value={character.background || ''}
+       onChange={(e) => handleChange('background', e.target.value)}
+       placeholder="Character history, origins, key life events..."
+     />
+   </div>
<div>
<label className={labelClass}>Appearance</label>
<textarea
className={inputClass + " h-24 resize-y text-sm"}
value={character.appearance || ''}
onChange={(e) => handleChange('appearance', e.target.value)}
placeholder="Physical description — also used for image generation prompts..."
/>
</div>
<div>
<label className={labelClass}>Dialogue Style & Examples</label>
<textarea
className={inputClass + " h-24 resize-y text-sm"}
value={character.dialogue_style || ''}
onChange={(e) => handleChange('dialogue_style', e.target.value)}
placeholder="How the persona speaks, their tone, mannerisms, and example lines..."
/>
</div>
<div>
<label className={labelClass}>Skills & Interests</label>
<div className="flex flex-wrap gap-2 mb-2">
{(character.skills || []).map((skill, idx) => (
<span
key={idx}
className="inline-flex items-center gap-1 px-3 py-1 bg-indigo-500/20 text-indigo-300 text-sm rounded-full border border-indigo-500/30"
>
{skill.length > 80 ? skill.slice(0, 80) + '...' : skill}
<button
onClick={() => removeSkill(idx)}
className="ml-1 text-indigo-400 hover:text-red-400 transition-colors"
>
<svg className="w-3 h-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={3}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
</span>
))}
</div>
<div className="flex gap-2">
<input
type="text"
className={inputClass + " text-sm"}
value={newSkill}
onChange={(e) => setNewSkill(e.target.value)}
onKeyDown={(e) => { if (e.key === 'Enter') { e.preventDefault(); addSkill(); } }}
placeholder="Add a skill or interest..."
/>
<button
onClick={addSkill}
disabled={!newSkill.trim()}
className="px-3 py-2 bg-indigo-600 hover:bg-indigo-500 disabled:bg-gray-700 disabled:text-gray-500 text-white text-sm rounded-lg transition-colors whitespace-nowrap"
>
Add
</button>
</div>
</div>
</div>
<div className="grid grid-cols-1 md:grid-cols-2 gap-6">
{/* Image Generation — GAZE presets */}
<div className={cardClass}>
<div className="flex justify-between items-center">
<h2 className="text-lg font-semibold text-gray-200">GAZE Presets</h2>
<button onClick={addGazePreset} className="flex items-center gap-1 bg-indigo-600 hover:bg-indigo-500 text-white px-3 py-1.5 rounded-lg text-sm transition-colors">
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4.5v15m7.5-7.5h-15" />
</svg>
Add Preset
</button>
</div>
<p className="text-xs text-gray-500">Image generation presets with trigger conditions. Default trigger is "self-portrait".</p>
{(!character.gaze_presets || character.gaze_presets.length === 0) ? (
<p className="text-sm text-gray-600 italic">No GAZE presets configured.</p>
) : (
<div className="space-y-3">
{character.gaze_presets.map((gp, idx) => (
<div key={idx} className="flex items-center gap-2 border border-gray-700 p-3 rounded-lg bg-gray-800/50">
<div className="flex-1">
<label className="block text-xs text-gray-500 mb-1">Preset</label>
{isLoadingGaze ? (
<p className="text-sm text-gray-500">Loading...</p>
) : availableGazePresets.length > 0 ? (
<select
className={selectClass + " text-sm"}
value={gp.preset || ''}
onChange={(e) => handleGazePresetChange(idx, 'preset', e.target.value)}
>
<option value="">-- Select --</option>
{availableGazePresets.map(p => (
<option key={p.slug} value={p.slug}>{p.name} ({p.slug})</option>
))}
</select>
) : (
<input
type="text"
className={inputClass + " text-sm"}
value={gp.preset || ''}
onChange={(e) => handleGazePresetChange(idx, 'preset', e.target.value)}
placeholder="Preset slug"
/>
)}
</div>
<div className="flex-1">
<label className="block text-xs text-gray-500 mb-1">Trigger</label>
<input
type="text"
className={inputClass + " text-sm"}
value={gp.trigger || ''}
onChange={(e) => handleGazePresetChange(idx, 'trigger', e.target.value)}
placeholder="e.g. self-portrait, battle scene"
/>
</div>
<button
onClick={() => removeGazePreset(idx)}
className="mt-5 px-2 py-1.5 text-gray-500 hover:text-red-400 transition-colors"
title="Remove"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
        </div>
      ))}
    </div>
)}
</div>
  {/* Model Overrides */}
  <div className={cardClass}>
    <h2 className="text-lg font-semibold text-gray-200">Model Overrides</h2>
    <div>
      <label className={labelClass}>Primary Model</label>
-       <select className={selectClass} value={character.model_overrides?.primary || 'llama3.3:70b'} onChange={(e) => handleNestedChange('model_overrides', 'primary', e.target.value)}>
+       <select className={selectClass} value={character.model_overrides?.primary || 'qwen3.5:35b-a3b'} onChange={(e) => handleNestedChange('model_overrides', 'primary', e.target.value)}>
        <option value="llama3.3:70b">llama3.3:70b</option>
        <option value="qwen3.5:35b-a3b">qwen3.5:35b-a3b</option>
        <option value="qwen2.5:7b">qwen2.5:7b</option>
@@ -576,6 +899,17 @@ export default function Editor() {
    </div>
  )}
</div>
{/* Notes */}
<div className={cardClass}>
<h2 className="text-lg font-semibold text-gray-200">Notes</h2>
<textarea
className={inputClass + " h-20 resize-y text-sm"}
value={character.notes || ''}
onChange={(e) => handleChange('notes', e.target.value)}
placeholder="Internal notes, reminders, or references..."
/>
</div>
  </div>
  );
}
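The editor relies on `migrateV1toV2` from `lib/SchemaValidator` to upgrade stored v1 profiles in place before rendering them. The real implementation lives in that module; below is a minimal sketch of what such an in-place migration could look like, assuming the field set shown in `DEFAULT_CHARACTER`:

```javascript
// Sketch only: the real migrateV1toV2 lives in lib/SchemaValidator.
// Upgrades a v1 character object in place to the v2 field set: adds the
// new profile fields, drops the Live2D/VTube config superseded by
// gaze_presets, and bumps schema_version.
function migrateV1toV2(character) {
  character.schema_version = 2;
  // New v2 profile fields, defaulted when absent.
  character.background = character.background || '';
  character.dialogue_style = character.dialogue_style || '';
  character.appearance = character.appearance || '';
  character.skills = character.skills || [];
  character.gaze_presets = character.gaze_presets || [];
  character.custom_rules = character.custom_rules || [];
  // v1-only sections no longer part of the schema.
  delete character.live2d_expressions;
  delete character.vtube_ws_triggers;
  return character;
}
```

Because the caller mutates `parsed` and then returns it, the migration must modify the object in place rather than returning a copy.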

View File

@@ -0,0 +1,346 @@
import { useState, useEffect, useCallback } from 'react';
import {
getPersonalMemories, savePersonalMemory, deletePersonalMemory,
getGeneralMemories, saveGeneralMemory, deleteGeneralMemory,
} from '../lib/memoryApi';
const PERSONAL_CATEGORIES = [
{ value: 'personal_info', label: 'Personal Info', color: 'bg-blue-500/20 text-blue-300 border-blue-500/30' },
{ value: 'preference', label: 'Preference', color: 'bg-amber-500/20 text-amber-300 border-amber-500/30' },
{ value: 'interaction', label: 'Interaction', color: 'bg-emerald-500/20 text-emerald-300 border-emerald-500/30' },
{ value: 'emotional', label: 'Emotional', color: 'bg-pink-500/20 text-pink-300 border-pink-500/30' },
{ value: 'other', label: 'Other', color: 'bg-gray-500/20 text-gray-300 border-gray-500/30' },
];
const GENERAL_CATEGORIES = [
{ value: 'system', label: 'System', color: 'bg-indigo-500/20 text-indigo-300 border-indigo-500/30' },
{ value: 'tool_usage', label: 'Tool Usage', color: 'bg-cyan-500/20 text-cyan-300 border-cyan-500/30' },
{ value: 'home_layout', label: 'Home Layout', color: 'bg-emerald-500/20 text-emerald-300 border-emerald-500/30' },
{ value: 'device', label: 'Device', color: 'bg-amber-500/20 text-amber-300 border-amber-500/30' },
{ value: 'routine', label: 'Routine', color: 'bg-purple-500/20 text-purple-300 border-purple-500/30' },
{ value: 'other', label: 'Other', color: 'bg-gray-500/20 text-gray-300 border-gray-500/30' },
];
const ACTIVE_KEY = 'homeai_active_character';
function CategoryBadge({ category, categories }) {
const cat = categories.find(c => c.value === category) || categories[categories.length - 1];
return (
<span className={`px-2 py-0.5 text-xs rounded-full border ${cat.color}`}>
{cat.label}
</span>
);
}
function MemoryCard({ memory, categories, onEdit, onDelete }) {
return (
<div className="border border-gray-700 rounded-lg p-4 bg-gray-800/50 space-y-2">
<div className="flex items-start justify-between gap-3">
<p className="text-sm text-gray-200 flex-1 whitespace-pre-wrap">{memory.content}</p>
<div className="flex gap-1 shrink-0">
<button
onClick={() => onEdit(memory)}
className="p-1.5 text-gray-500 hover:text-gray-300 transition-colors"
title="Edit"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M16.862 4.487l1.687-1.688a1.875 1.875 0 112.652 2.652L10.582 16.07a4.5 4.5 0 01-1.897 1.13L6 18l.8-2.685a4.5 4.5 0 011.13-1.897l8.932-8.931z" />
</svg>
</button>
<button
onClick={() => onDelete(memory.id)}
className="p-1.5 text-gray-500 hover:text-red-400 transition-colors"
title="Delete"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M14.74 9l-.346 9m-4.788 0L9.26 9m9.968-3.21c.342.052.682.107 1.022.166m-1.022-.165L18.16 19.673a2.25 2.25 0 01-2.244 2.077H8.084a2.25 2.25 0 01-2.244-2.077L4.772 5.79m14.456 0a48.108 48.108 0 00-3.478-.397m-12 .562c.34-.059.68-.114 1.022-.165m0 0a48.11 48.11 0 013.478-.397m7.5 0v-.916c0-1.18-.91-2.164-2.09-2.201a51.964 51.964 0 00-3.32 0c-1.18.037-2.09 1.022-2.09 2.201v.916m7.5 0a48.667 48.667 0 00-7.5 0" />
</svg>
</button>
</div>
</div>
<div className="flex items-center gap-2">
<CategoryBadge category={memory.category} categories={categories} />
<span className="text-xs text-gray-600">
{memory.createdAt ? new Date(memory.createdAt).toLocaleDateString() : ''}
</span>
</div>
</div>
);
}
function MemoryForm({ categories, editing, onSave, onCancel }) {
const [content, setContent] = useState(editing?.content || '');
const [category, setCategory] = useState(editing?.category || categories[0].value);
const handleSubmit = () => {
if (!content.trim()) return;
const memory = {
...(editing?.id ? { id: editing.id } : {}),
content: content.trim(),
category,
};
onSave(memory);
setContent('');
setCategory(categories[0].value);
};
return (
<div className="border border-indigo-500/30 rounded-lg p-4 bg-indigo-500/5 space-y-3">
<textarea
className="w-full bg-gray-800 border border-gray-700 text-gray-200 p-2 rounded-lg text-sm h-20 resize-y focus:border-indigo-500 focus:ring-1 focus:ring-indigo-500 outline-none"
value={content}
onChange={(e) => setContent(e.target.value)}
placeholder="Enter memory content..."
autoFocus
/>
<div className="flex items-center gap-3">
<select
className="bg-gray-800 border border-gray-700 text-gray-200 text-sm p-2 rounded-lg focus:border-indigo-500 outline-none"
value={category}
onChange={(e) => setCategory(e.target.value)}
>
{categories.map(c => (
<option key={c.value} value={c.value}>{c.label}</option>
))}
</select>
<div className="flex gap-2 ml-auto">
<button
onClick={onCancel}
className="px-3 py-1.5 bg-gray-700 hover:bg-gray-600 text-gray-300 text-sm rounded-lg transition-colors"
>
Cancel
</button>
<button
onClick={handleSubmit}
disabled={!content.trim()}
className="px-3 py-1.5 bg-indigo-600 hover:bg-indigo-500 disabled:bg-gray-700 disabled:text-gray-500 text-white text-sm rounded-lg transition-colors"
>
{editing?.id ? 'Update' : 'Add Memory'}
</button>
</div>
</div>
</div>
);
}
export default function Memories() {
const [tab, setTab] = useState('personal'); // 'personal' | 'general'
const [characters, setCharacters] = useState([]);
const [selectedCharId, setSelectedCharId] = useState('');
const [memories, setMemories] = useState([]);
const [loading, setLoading] = useState(false);
const [showForm, setShowForm] = useState(false);
const [editing, setEditing] = useState(null);
const [error, setError] = useState(null);
const [filter, setFilter] = useState('');
// Load characters list
useEffect(() => {
fetch('/api/characters')
.then(r => r.json())
.then(chars => {
setCharacters(chars);
const activeId = localStorage.getItem(ACTIVE_KEY);
if (activeId && chars.some(c => c.id === activeId)) {
setSelectedCharId(activeId);
} else if (chars.length > 0) {
setSelectedCharId(chars[0].id);
}
})
.catch(() => {});
}, []);
// Load memories when tab or selected character changes
const loadMemories = useCallback(async () => {
setLoading(true);
setError(null);
try {
if (tab === 'personal' && selectedCharId) {
const data = await getPersonalMemories(selectedCharId);
setMemories(data.memories || []);
} else if (tab === 'general') {
const data = await getGeneralMemories();
setMemories(data.memories || []);
} else {
setMemories([]);
}
} catch (err) {
setError(err.message);
} finally {
setLoading(false);
}
}, [tab, selectedCharId]);
useEffect(() => { loadMemories(); }, [loadMemories]);
const handleSave = async (memory) => {
try {
if (tab === 'personal') {
await savePersonalMemory(selectedCharId, memory);
} else {
await saveGeneralMemory(memory);
}
setShowForm(false);
setEditing(null);
await loadMemories();
} catch (err) {
setError(err.message);
}
};
const handleDelete = async (memoryId) => {
try {
if (tab === 'personal') {
await deletePersonalMemory(selectedCharId, memoryId);
} else {
await deleteGeneralMemory(memoryId);
}
await loadMemories();
} catch (err) {
setError(err.message);
}
};
const handleEdit = (memory) => {
setEditing(memory);
setShowForm(true);
};
const categories = tab === 'personal' ? PERSONAL_CATEGORIES : GENERAL_CATEGORIES;
const filteredMemories = filter
? memories.filter(m => m.content?.toLowerCase().includes(filter.toLowerCase()) || m.category === filter)
: memories;
// Sort newest first
const sortedMemories = [...filteredMemories].sort(
(a, b) => (b.createdAt || '').localeCompare(a.createdAt || '')
);
const selectedChar = characters.find(c => c.id === selectedCharId);
return (
<div className="space-y-6">
{/* Header */}
<div className="flex items-center justify-between">
<div>
<h1 className="text-3xl font-bold text-gray-100">Memories</h1>
<p className="text-sm text-gray-500 mt-1">
{sortedMemories.length} {tab} memor{sortedMemories.length !== 1 ? 'ies' : 'y'}
{tab === 'personal' && selectedChar && (
<span className="ml-1 text-indigo-400">
for {selectedChar.data?.display_name || selectedChar.data?.name || selectedCharId}
</span>
)}
</p>
</div>
<button
onClick={() => { setEditing(null); setShowForm(!showForm); }}
className="flex items-center gap-2 px-4 py-2 bg-indigo-600 hover:bg-indigo-500 text-white rounded-lg transition-colors"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4.5v15m7.5-7.5h-15" />
</svg>
Add Memory
</button>
</div>
{error && (
<div className="bg-red-900/30 border border-red-500/50 text-red-300 px-4 py-3 rounded-lg text-sm">
{error}
<button onClick={() => setError(null)} className="ml-2 text-red-400 hover:text-red-300">&times;</button>
</div>
)}
{/* Tabs */}
<div className="flex gap-1 bg-gray-900 p-1 rounded-lg border border-gray-800 w-fit">
<button
onClick={() => { setTab('personal'); setShowForm(false); setEditing(null); }}
className={`px-4 py-2 text-sm font-medium rounded-md transition-colors ${
tab === 'personal'
? 'bg-gray-800 text-white'
: 'text-gray-400 hover:text-gray-200'
}`}
>
Personal
</button>
<button
onClick={() => { setTab('general'); setShowForm(false); setEditing(null); }}
className={`px-4 py-2 text-sm font-medium rounded-md transition-colors ${
tab === 'general'
? 'bg-gray-800 text-white'
: 'text-gray-400 hover:text-gray-200'
}`}
>
General
</button>
</div>
{/* Character selector (personal tab only) */}
{tab === 'personal' && (
<div className="flex items-center gap-3">
<label className="text-sm text-gray-400">Character</label>
<select
value={selectedCharId}
onChange={(e) => setSelectedCharId(e.target.value)}
className="bg-gray-800 border border-gray-700 text-gray-200 text-sm p-2 rounded-lg focus:border-indigo-500 outline-none"
>
{characters.map(c => (
<option key={c.id} value={c.id}>
{c.data?.display_name || c.data?.name || c.id}
</option>
))}
</select>
</div>
)}
{/* Search filter */}
<div>
<input
type="text"
className="w-full bg-gray-800 border border-gray-700 text-gray-200 p-2 rounded-lg text-sm focus:border-indigo-500 focus:ring-1 focus:ring-indigo-500 outline-none"
value={filter}
onChange={(e) => setFilter(e.target.value)}
placeholder="Search memories..."
/>
</div>
{/* Add/Edit form */}
{showForm && (
<MemoryForm
categories={categories}
editing={editing}
onSave={handleSave}
onCancel={() => { setShowForm(false); setEditing(null); }}
/>
)}
{/* Memory list */}
{loading ? (
<div className="text-center py-12">
<p className="text-gray-500">Loading memories...</p>
</div>
) : sortedMemories.length === 0 ? (
<div className="text-center py-12">
<svg className="w-12 h-12 mx-auto text-gray-700 mb-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 18v-5.25m0 0a6.01 6.01 0 001.5-.189m-1.5.189a6.01 6.01 0 01-1.5-.189m3.75 7.478a12.06 12.06 0 01-4.5 0m3.75 2.383a14.406 14.406 0 01-3 0M14.25 18v-.192c0-.983.658-1.823 1.508-2.316a7.5 7.5 0 10-7.517 0c.85.493 1.509 1.333 1.509 2.316V18" />
</svg>
<p className="text-gray-500 text-sm">
{filter ? 'No memories match your search.' : 'No memories yet. Add one to get started.'}
</p>
</div>
) : (
<div className="space-y-3">
{sortedMemories.map(memory => (
<MemoryCard
key={memory.id}
memory={memory}
categories={categories}
onEdit={handleEdit}
onDelete={handleDelete}
/>
))}
</div>
)}
</div>
);
}

View File

@@ -2,6 +2,267 @@ import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
import tailwindcss from '@tailwindcss/vite'
const CHARACTERS_DIR = '/Users/aodhan/homeai-data/characters'
const SATELLITE_MAP_PATH = '/Users/aodhan/homeai-data/satellite-map.json'
const CONVERSATIONS_DIR = '/Users/aodhan/homeai-data/conversations'
const MEMORIES_DIR = '/Users/aodhan/homeai-data/memories'
const GAZE_HOST = 'http://10.0.0.101:5782'
const GAZE_API_KEY = process.env.GAZE_API_KEY || ''
function characterStoragePlugin() {
return {
name: 'character-storage',
configureServer(server) {
const ensureDir = async () => {
const { mkdir } = await import('fs/promises')
await mkdir(CHARACTERS_DIR, { recursive: true })
}
// GET /api/characters — list all profiles
server.middlewares.use('/api/characters', async (req, res, next) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET,POST,DELETE', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
const { readdir, readFile, writeFile, unlink } = await import('fs/promises')
await ensureDir()
// req.url has the mount prefix stripped by connect, so "/" means /api/characters
const url = new URL(req.url, 'http://localhost')
const subPath = url.pathname.replace(/^\/+/, '')
// GET /api/characters/:id — single profile
if (req.method === 'GET' && subPath) {
try {
const safeId = subPath.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
const raw = await readFile(`${CHARACTERS_DIR}/${safeId}.json`, 'utf-8')
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(raw)
} catch {
res.writeHead(404, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ error: 'Not found' }))
}
return
}
if (req.method === 'GET' && !subPath) {
try {
const files = (await readdir(CHARACTERS_DIR)).filter(f => f.endsWith('.json'))
const profiles = []
for (const file of files) {
try {
const raw = await readFile(`${CHARACTERS_DIR}/${file}`, 'utf-8')
profiles.push(JSON.parse(raw))
} catch { /* skip corrupt files */ }
}
// Sort by addedAt descending
profiles.sort((a, b) => (b.addedAt || '').localeCompare(a.addedAt || ''))
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify(profiles))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
if (req.method === 'POST' && !subPath) {
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const profile = JSON.parse(Buffer.concat(chunks).toString())
if (!profile.id) {
res.writeHead(400, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'Missing profile id' }))
return
}
// Sanitize filename — only allow alphanumeric, underscore, dash, dot
const safeId = profile.id.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
await writeFile(`${CHARACTERS_DIR}/${safeId}.json`, JSON.stringify(profile, null, 2))
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
if (req.method === 'DELETE' && subPath) {
try {
const safeId = subPath.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
await unlink(`${CHARACTERS_DIR}/${safeId}.json`).catch(() => {})
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
next()
})
},
}
}
function satelliteMapPlugin() {
return {
name: 'satellite-map',
configureServer(server) {
server.middlewares.use('/api/satellite-map', async (req, res, next) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET,POST', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
const { readFile, writeFile } = await import('fs/promises')
if (req.method === 'GET') {
try {
const raw = await readFile(SATELLITE_MAP_PATH, 'utf-8')
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(raw)
} catch {
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ default: 'aria_default', satellites: {} }))
}
return
}
if (req.method === 'POST') {
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const data = JSON.parse(Buffer.concat(chunks).toString())
await writeFile(SATELLITE_MAP_PATH, JSON.stringify(data, null, 2))
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
next()
})
},
}
}
function conversationStoragePlugin() {
return {
name: 'conversation-storage',
configureServer(server) {
const ensureDir = async () => {
const { mkdir } = await import('fs/promises')
await mkdir(CONVERSATIONS_DIR, { recursive: true })
}
server.middlewares.use('/api/conversations', async (req, res, next) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET,POST,DELETE', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
const { readdir, readFile, writeFile, unlink } = await import('fs/promises')
await ensureDir()
const url = new URL(req.url, 'http://localhost')
const subPath = url.pathname.replace(/^\/+/, '')
// GET /api/conversations/:id — single conversation with messages
if (req.method === 'GET' && subPath) {
try {
const safeId = subPath.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
const raw = await readFile(`${CONVERSATIONS_DIR}/${safeId}.json`, 'utf-8')
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(raw)
} catch {
res.writeHead(404, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ error: 'Not found' }))
}
return
}
// GET /api/conversations — list metadata (no messages)
if (req.method === 'GET' && !subPath) {
try {
const files = (await readdir(CONVERSATIONS_DIR)).filter(f => f.endsWith('.json'))
const list = []
for (const file of files) {
try {
const raw = await readFile(`${CONVERSATIONS_DIR}/${file}`, 'utf-8')
const conv = JSON.parse(raw)
list.push({
id: conv.id,
title: conv.title || '',
characterId: conv.characterId || '',
characterName: conv.characterName || '',
createdAt: conv.createdAt || '',
updatedAt: conv.updatedAt || '',
messageCount: (conv.messages || []).length,
})
} catch { /* skip corrupt files */ }
}
list.sort((a, b) => (b.updatedAt || '').localeCompare(a.updatedAt || ''))
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify(list))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
// POST /api/conversations — create or update
if (req.method === 'POST' && !subPath) {
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const conv = JSON.parse(Buffer.concat(chunks).toString())
if (!conv.id) {
res.writeHead(400, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'Missing conversation id' }))
return
}
const safeId = conv.id.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
await writeFile(`${CONVERSATIONS_DIR}/${safeId}.json`, JSON.stringify(conv, null, 2))
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
// DELETE /api/conversations/:id
if (req.method === 'DELETE' && subPath) {
try {
const safeId = subPath.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
await unlink(`${CONVERSATIONS_DIR}/${safeId}.json`).catch(() => {})
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
next()
})
},
}
}
function healthCheckPlugin() {
return {
name: 'health-check-proxy',
@@ -121,6 +382,273 @@ function healthCheckPlugin() {
};
}
function gazeProxyPlugin() {
return {
name: 'gaze-proxy',
configureServer(server) {
server.middlewares.use('/api/gaze/presets', async (req, res) => {
if (!GAZE_API_KEY) {
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ presets: [] }))
return
}
try {
const http = await import('http')
const url = new URL(`${GAZE_HOST}/api/v1/presets`)
const proxyRes = await new Promise((resolve, reject) => {
const r = http.default.get(url, { headers: { 'X-API-Key': GAZE_API_KEY }, timeout: 5000 }, resolve)
r.on('error', reject)
r.on('timeout', () => { r.destroy(); reject(new Error('timeout')) })
})
const chunks = []
for await (const chunk of proxyRes) chunks.push(chunk)
res.writeHead(proxyRes.statusCode, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(Buffer.concat(chunks))
} catch {
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ presets: [] }))
}
})
},
}
}
function memoryStoragePlugin() {
return {
name: 'memory-storage',
configureServer(server) {
const ensureDirs = async () => {
const { mkdir } = await import('fs/promises')
await mkdir(`${MEMORIES_DIR}/personal`, { recursive: true })
}
const readJsonFile = async (path, fallback) => {
const { readFile } = await import('fs/promises')
try {
return JSON.parse(await readFile(path, 'utf-8'))
} catch {
return fallback
}
}
const writeJsonFile = async (path, data) => {
const { writeFile } = await import('fs/promises')
await writeFile(path, JSON.stringify(data, null, 2))
}
// Personal memories: /api/memories/personal/:characterId[/:memoryId]
server.middlewares.use('/api/memories/personal', async (req, res, next) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET,POST,DELETE', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
await ensureDirs()
const url = new URL(req.url, 'http://localhost')
const parts = url.pathname.replace(/^\/+/, '').split('/')
const characterId = parts[0] ? parts[0].replace(/[^a-zA-Z0-9_\-\.]/g, '_') : null
const memoryId = parts[1] || null
if (!characterId) {
res.writeHead(400, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'Missing character ID' }))
return
}
const filePath = `${MEMORIES_DIR}/personal/${characterId}.json`
if (req.method === 'GET') {
const data = await readJsonFile(filePath, { characterId, memories: [] })
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify(data))
return
}
if (req.method === 'POST') {
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const memory = JSON.parse(Buffer.concat(chunks).toString())
const data = await readJsonFile(filePath, { characterId, memories: [] })
if (memory.id) {
const idx = data.memories.findIndex(m => m.id === memory.id)
if (idx >= 0) {
data.memories[idx] = { ...data.memories[idx], ...memory }
} else {
data.memories.push(memory)
}
} else {
memory.id = 'm_' + Date.now()
memory.createdAt = memory.createdAt || new Date().toISOString()
data.memories.push(memory)
}
await writeJsonFile(filePath, data)
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true, memory }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
if (req.method === 'DELETE' && memoryId) {
try {
const data = await readJsonFile(filePath, { characterId, memories: [] })
data.memories = data.memories.filter(m => m.id !== memoryId)
await writeJsonFile(filePath, data)
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
next()
})
// General memories: /api/memories/general[/:memoryId]
server.middlewares.use('/api/memories/general', async (req, res, next) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET,POST,DELETE', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
await ensureDirs()
const url = new URL(req.url, 'http://localhost')
const memoryId = url.pathname.replace(/^\/+/, '') || null
const filePath = `${MEMORIES_DIR}/general.json`
if (req.method === 'GET') {
const data = await readJsonFile(filePath, { memories: [] })
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify(data))
return
}
if (req.method === 'POST') {
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const memory = JSON.parse(Buffer.concat(chunks).toString())
const data = await readJsonFile(filePath, { memories: [] })
if (memory.id) {
const idx = data.memories.findIndex(m => m.id === memory.id)
if (idx >= 0) {
data.memories[idx] = { ...data.memories[idx], ...memory }
} else {
data.memories.push(memory)
}
} else {
memory.id = 'm_' + Date.now()
memory.createdAt = memory.createdAt || new Date().toISOString()
data.memories.push(memory)
}
await writeJsonFile(filePath, data)
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true, memory }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
if (req.method === 'DELETE' && memoryId) {
try {
const data = await readJsonFile(filePath, { memories: [] })
data.memories = data.memories.filter(m => m.id !== memoryId)
await writeJsonFile(filePath, data)
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
next()
})
},
}
}
function characterLookupPlugin() {
return {
name: 'character-lookup',
configureServer(server) {
server.middlewares.use('/api/character-lookup', async (req, res) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'POST', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
if (req.method !== 'POST') {
res.writeHead(405, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'POST only' }))
return
}
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const { name, franchise } = JSON.parse(Buffer.concat(chunks).toString())
if (!name || !franchise) {
res.writeHead(400, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'Missing name or franchise' }))
return
}
const { execFile } = await import('child_process')
const { promisify } = await import('util')
const execFileAsync = promisify(execFile)
// Call the MCP fetcher inside the running Docker container
const safeName = name.replace(/'/g, "\\'")
const safeFranchise = franchise.replace(/'/g, "\\'")
const pyScript = `
import asyncio, json
from character_details.fetcher import fetch_character
c = asyncio.run(fetch_character('${safeName}', '${safeFranchise}'))
print(json.dumps(c.model_dump(), default=str))
`.trim()
const { stdout } = await execFileAsync(
'docker',
['exec', 'character-browser-character-mcp-1', 'python', '-c', pyScript],
{ timeout: 30000 }
)
const data = JSON.parse(stdout.trim())
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({
name: data.name || name,
franchise: data.franchise || franchise,
description: data.description || '',
background: data.background || '',
appearance: data.appearance || '',
personality: data.personality || '',
abilities: data.abilities || [],
notable_quotes: data.notable_quotes || [],
relationships: data.relationships || [],
sources: data.sources || [],
}))
} catch (err) {
console.error('[character-lookup] failed:', err?.message || err)
const status = err?.message?.includes('timeout') ? 504 : 500
res.writeHead(status, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ error: err?.message || 'Lookup failed' }))
}
})
},
}
}
function bridgeProxyPlugin() {
return {
name: 'bridge-proxy',
@@ -172,10 +700,11 @@ function bridgeProxyPlugin() {
proxyReq.write(body)
proxyReq.end()
})
} catch (err) {
console.error(`[bridge-proxy] ${targetPath} failed:`, err?.message || err)
if (!res.headersSent) {
res.writeHead(502, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: `Bridge unreachable: ${err?.message || 'unknown'}` }))
}
}
}
@@ -189,6 +718,12 @@ function bridgeProxyPlugin() {
export default defineConfig({
plugins: [
characterStoragePlugin(),
satelliteMapPlugin(),
conversationStoragePlugin(),
memoryStoragePlugin(),
gazeProxyPlugin(),
characterLookupPlugin(),
healthCheckPlugin(),
bridgeProxyPlugin(),
tailwindcss(),

homeai-images/API_GUIDE.md Normal file
View File

@@ -0,0 +1,219 @@
# GAZE REST API Guide
## Setup
1. Open **Settings** in the GAZE web UI
2. Scroll to **REST API Key** and click **Regenerate**
3. Copy the key — you'll need it for all API requests
## Authentication
Every request must include your API key via one of:
- **Header (recommended):** `X-API-Key: <your-key>`
- **Query parameter:** `?api_key=<your-key>`
Responses for auth failures:
| Status | Meaning |
|--------|---------|
| `401` | Missing API key |
| `403` | Invalid API key |
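
Both styles can be expressed with Python's `urllib.request`; the host and key below are placeholders:

```python
from urllib.request import Request
from urllib.parse import urlencode

HOST = "http://localhost:5782"
API_KEY = "your-key-here"

# Header style (recommended): the key travels in the X-API-Key header
header_req = Request(f"{HOST}/api/v1/presets", headers={"X-API-Key": API_KEY})

# Query-parameter style: the key is appended to the URL
query_req = Request(f"{HOST}/api/v1/presets?{urlencode({'api_key': API_KEY})}")
```

The header style keeps the key out of URLs, and therefore out of server access logs.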
## Endpoints
### List Presets
```
GET /api/v1/presets
```
Returns all available presets.
**Response:**
```json
{
"presets": [
{
"preset_id": "example_01",
"slug": "example_01",
"name": "Example Preset",
"has_cover": true
}
]
}
```
### Generate Image
```
POST /api/v1/generate/<preset_slug>
```
Queue one or more image generations using a preset's configuration. All body parameters are optional — when omitted, the preset's own settings are used.
**Request body (JSON):**
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `count` | int | `1` | Number of images to generate (1–20) |
| `checkpoint` | string | — | Override checkpoint path (e.g. `"Illustrious/model.safetensors"`) |
| `extra_positive` | string | `""` | Additional positive prompt tags appended to the generated prompt |
| `extra_negative` | string | `""` | Additional negative prompt tags |
| `seed` | int | random | Fixed seed for reproducible generation |
| `width` | int | — | Output width in pixels (must provide both width and height) |
| `height` | int | — | Output height in pixels (must provide both width and height) |
**Response (202):**
```json
{
"jobs": [
{ "job_id": "783f0268-ba85-4426-8ca2-6393c844c887", "status": "queued" }
]
}
```
**Errors:**
| Status | Cause |
|--------|-------|
| `400` | Invalid parameters (bad count, seed, or mismatched width/height) |
| `404` | Preset slug not found |
| `500` | Internal generation error |
### Check Job Status
```
GET /api/v1/job/<job_id>
```
Poll this endpoint to track generation progress.
**Response:**
```json
{
"id": "783f0268-ba85-4426-8ca2-6393c844c887",
"label": "Preset: Example Preset preview",
"status": "done",
"error": null,
"result": {
"image_url": "/static/uploads/presets/example_01/gen_1773601346.png",
"relative_path": "presets/example_01/gen_1773601346.png",
"seed": 927640517599332
}
}
```
**Job statuses:**
| Status | Meaning |
|--------|---------|
| `pending` | Waiting in queue |
| `processing` | Currently generating |
| `done` | Complete — `result` contains image info |
| `failed` | Error occurred — check `error` field |
The `result` object is only present when status is `done`. Use `seed` from the result to reproduce the exact same image later.
**Retrieving the image:** The `image_url` is a path relative to the server root. Fetch it directly:
```
GET http://<host>:5782/static/uploads/presets/example_01/gen_1773601346.png
```
Image retrieval does not require authentication.
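
Since `image_url` is root-relative, join it with the host before fetching. A quick sketch, reusing the sample values from the response above:

```python
from urllib.parse import urljoin

HOST = "http://localhost:5782"
image_url = "/static/uploads/presets/example_01/gen_1773601346.png"

# urljoin handles the leading slash: the absolute path replaces
# everything after the host portion of the base URL
full_url = urljoin(HOST, image_url)
print(full_url)  # http://localhost:5782/static/uploads/presets/example_01/gen_1773601346.png
```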
## Examples
### Generate a single image and wait for it
```bash
API_KEY="your-key-here"
HOST="http://localhost:5782"
# Queue generation
JOB_ID=$(curl -s -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d '{}' \
"$HOST/api/v1/generate/example_01" | python3 -c "import sys,json; print(json.load(sys.stdin)['jobs'][0]['job_id'])")
echo "Job: $JOB_ID"
# Poll until done
while true; do
RESULT=$(curl -s -H "X-API-Key: $API_KEY" "$HOST/api/v1/job/$JOB_ID")
STATUS=$(echo "$RESULT" | python3 -c "import sys,json; print(json.load(sys.stdin)['status'])")
echo "Status: $STATUS"
if [ "$STATUS" = "done" ] || [ "$STATUS" = "failed" ]; then
echo "$RESULT" | python3 -m json.tool
break
fi
sleep 5
done
```
### Generate 3 images with extra prompts
```bash
curl -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"count": 3,
"extra_positive": "smiling, outdoors",
"extra_negative": "blurry"
}' \
"$HOST/api/v1/generate/example_01"
```
### Reproduce a specific image
```bash
curl -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d '{"seed": 927640517599332}' \
"$HOST/api/v1/generate/example_01"
```
### Python example
```python
import requests
import time
HOST = "http://localhost:5782"
API_KEY = "your-key-here"
HEADERS = {"X-API-Key": API_KEY, "Content-Type": "application/json"}
# List presets
presets = requests.get(f"{HOST}/api/v1/presets", headers=HEADERS).json()
print(f"Available presets: {[p['name'] for p in presets['presets']]}")
# Generate
resp = requests.post(
f"{HOST}/api/v1/generate/{presets['presets'][0]['slug']}",
headers=HEADERS,
json={"count": 1},
).json()
job_id = resp["jobs"][0]["job_id"]
print(f"Queued job: {job_id}")
# Poll
while True:
status = requests.get(f"{HOST}/api/v1/job/{job_id}", headers=HEADERS).json()
print(f"Status: {status['status']}")
if status["status"] in ("done", "failed"):
break
time.sleep(5)
if status["status"] == "done":
image_url = f"{HOST}{status['result']['image_url']}"
print(f"Image: {image_url}")
print(f"Seed: {status['result']['seed']}")
```
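
The polling loops in these examples run until the job finishes; unattended scripts may want a deadline as well. A small helper along those lines (`wait_for_job` is a local sketch, not part of the API):

```python
import time

def wait_for_job(fetch_status, timeout=300.0, interval=5.0,
                 clock=time.monotonic, sleep=time.sleep):
    """Poll fetch_status() until the job is done/failed or the deadline passes.

    fetch_status: zero-arg callable returning the GET /api/v1/job/<id> dict.
    Returns the terminal job dict; raises TimeoutError on deadline.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_status()
        if job.get("status") in ("done", "failed"):
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job still {job.get('status')!r} after {timeout}s")
        sleep(interval)
```

Usage with the earlier snippet would be something like `wait_for_job(lambda: requests.get(f"{HOST}/api/v1/job/{job_id}", headers=HEADERS).json())`.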

View File

@@ -12,17 +12,27 @@
<string>/Users/aodhan/gitea/homeai/homeai-llm/scripts/preload-models.sh</string>
</array>
<key>EnvironmentVariables</key>
<dict>
<!-- Override to change which medium model stays warm -->
<key>HOMEAI_MEDIUM_MODEL</key>
<string>qwen3.5:35b-a3b</string>
</dict>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/tmp/homeai-preload-models.log</string>
<key>StandardErrorPath</key>
<string>/tmp/homeai-preload-models-error.log</string>
<!-- If the script exits/crashes, wait 30s before restarting -->
<key>ThrottleInterval</key>
<integer>30</integer>
</dict>
</plist>

View File

@@ -1,19 +1,73 @@
#!/bin/bash
# Keep voice pipeline models warm in Ollama VRAM.
# Runs as a loop — checks every 5 minutes, re-pins any model that got evicted.
# Only pins lightweight/MoE models — large dense models (70B) use default expiry.
OLLAMA_URL="http://localhost:11434"
CHECK_INTERVAL=300  # seconds between checks
# Medium model can be overridden via env var (e.g. by persona config)
HOMEAI_MEDIUM_MODEL="${HOMEAI_MEDIUM_MODEL:-qwen3.5:35b-a3b}"
# Models to keep warm: "name|description"
MODELS=(
"qwen2.5:7b|small (4.7GB) — fast fallback"
"${HOMEAI_MEDIUM_MODEL}|medium — persona default"
)
wait_for_ollama() {
for i in $(seq 1 30); do
curl -sf "$OLLAMA_URL/api/tags" > /dev/null 2>&1 && return 0
sleep 2
done
return 1
}
is_model_loaded() {
local model="$1"
curl -sf "$OLLAMA_URL/api/ps" 2>/dev/null \
| python3 -c "
import json, sys
data = json.load(sys.stdin)
names = [m['name'] for m in data.get('models', [])]
sys.exit(0 if '$model' in names else 1)
" 2>/dev/null
}
pin_model() {
local model="$1"
local desc="$2"
if is_model_loaded "$model"; then
echo "[keepwarm] $model already loaded — skipping"
return 0
fi
echo "[keepwarm] Loading $model ($desc) with keep_alive=-1..."
curl -sf "$OLLAMA_URL/api/generate" \
-d "{\"model\":\"$model\",\"prompt\":\"ready\",\"stream\":false,\"keep_alive\":-1,\"options\":{\"num_ctx\":512}}" \
> /dev/null 2>&1
if [ $? -eq 0 ]; then
echo "[keepwarm] $model pinned in VRAM"
else
echo "[keepwarm] ERROR: failed to load $model"
fi
}
# --- Main loop ---
echo "[keepwarm] Starting model keep-warm daemon (interval: ${CHECK_INTERVAL}s)"
# Initial wait for Ollama
if ! wait_for_ollama; then
echo "[keepwarm] ERROR: Ollama not reachable after 60s, exiting"
exit 1
fi
echo "[keepwarm] Ollama is online"
while true; do
for entry in "${MODELS[@]}"; do
IFS='|' read -r model desc <<< "$entry"
pin_model "$model" "$desc"
done
sleep "$CHECK_INTERVAL"
done

View File

@@ -0,0 +1,40 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.homeai.vtube-bridge</string>
<key>ProgramArguments</key>
<array>
<string>/Users/aodhan/homeai-visual-env/bin/python3</string>
<string>/Users/aodhan/gitea/homeai/homeai-visual/vtube-bridge.py</string>
<string>--port</string>
<string>8002</string>
<string>--character</string>
<string>/Users/aodhan/gitea/homeai/homeai-dashboard/characters/aria.json</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/tmp/homeai-vtube-bridge.log</string>
<key>StandardErrorPath</key>
<string>/tmp/homeai-vtube-bridge-error.log</string>
<key>ThrottleInterval</key>
<integer>10</integer>
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin</string>
</dict>
</dict>
</plist>

View File

@@ -0,0 +1,170 @@
#!/usr/bin/env python3
"""
Test script for VTube Studio Expression Bridge.
Usage:
python3 test-expressions.py # test all expressions
python3 test-expressions.py --auth # run auth flow first
python3 test-expressions.py --lipsync # test lip sync parameter
python3 test-expressions.py --latency # measure round-trip latency
Requires the vtube-bridge to be running on port 8002.
"""
import argparse
import json
import sys
import time
import urllib.request
BRIDGE_URL = "http://localhost:8002"
EXPRESSIONS = ["idle", "listening", "thinking", "speaking", "happy", "sad", "surprised", "error"]
def _post(path: str, data: dict | None = None) -> dict:
body = json.dumps(data or {}).encode()
req = urllib.request.Request(
f"{BRIDGE_URL}{path}",
data=body,
headers={"Content-Type": "application/json"},
method="POST",
)
with urllib.request.urlopen(req, timeout=10) as resp:
return json.loads(resp.read())
def _get(path: str) -> dict:
req = urllib.request.Request(f"{BRIDGE_URL}{path}")
with urllib.request.urlopen(req, timeout=10) as resp:
return json.loads(resp.read())
def check_bridge():
"""Verify bridge is running and connected."""
try:
status = _get("/status")
print(f"Bridge status: connected={status['connected']}, authenticated={status['authenticated']}")
print(f" Expressions: {', '.join(status.get('expressions', []))}")
if not status["connected"]:
print("\n WARNING: Not connected to VTube Studio. Is it running?")
if not status["authenticated"]:
print(" WARNING: Not authenticated. Run with --auth to initiate auth flow.")
return status
except Exception as e:
print(f"ERROR: Cannot reach bridge at {BRIDGE_URL}: {e}")
print(" Is vtube-bridge.py running?")
sys.exit(1)
def run_auth():
"""Initiate auth flow — user must click Allow in VTube Studio."""
print("Requesting authentication token...")
print(" >>> Click 'Allow' in VTube Studio when prompted <<<")
result = _post("/auth")
print(f" Result: {json.dumps(result, indent=2)}")
return result
def test_expressions(delay: float = 2.0):
"""Cycle through all expressions with a pause between each."""
print(f"\nCycling through {len(EXPRESSIONS)} expressions ({delay}s each):\n")
for expr in EXPRESSIONS:
print(f"{expr}...", end=" ", flush=True)
t0 = time.monotonic()
result = _post("/expression", {"event": expr})
dt = (time.monotonic() - t0) * 1000
if result.get("ok"):
print(f"OK ({dt:.0f}ms)")
else:
print(f"FAILED: {result.get('error', 'unknown')}")
time.sleep(delay)
# Return to idle
_post("/expression", {"event": "idle"})
print("\n Returned to idle.")
def test_lipsync(duration: float = 3.0):
"""Simulate lip sync by sweeping MouthOpen 0→1→0."""
import math
print(f"\nTesting lip sync (MouthOpen sweep, {duration}s)...\n")
fps = 20
frames = int(duration * fps)
for i in range(frames):
t = i / frames
# Sine wave for smooth open/close
value = abs(math.sin(t * math.pi * 4))
value = round(value, 3)
_post("/parameter", {"name": "MouthOpen", "value": value})
print(f"\r MouthOpen = {value:.3f}", end="", flush=True)
time.sleep(1.0 / fps)
_post("/parameter", {"name": "MouthOpen", "value": 0.0})
print("\r MouthOpen = 0.000 (done) ")
def test_latency(iterations: int = 20):
"""Measure expression trigger round-trip latency."""
print(f"\nMeasuring latency ({iterations} iterations)...\n")
times = []
for i in range(iterations):
expr = "thinking" if i % 2 == 0 else "idle"
t0 = time.monotonic()
_post("/expression", {"event": expr})
dt = (time.monotonic() - t0) * 1000
times.append(dt)
print(f" {i+1:2d}. {expr:10s}{dt:.1f}ms")
avg = sum(times) / len(times)
mn = min(times)
mx = max(times)
print(f"\n Avg: {avg:.1f}ms Min: {mn:.1f}ms Max: {mx:.1f}ms")
if avg < 100:
print(" PASS: Average latency under 100ms target")
else:
print(" WARNING: Average latency exceeds 100ms target")
# Return to idle
_post("/expression", {"event": "idle"})
def main():
parser = argparse.ArgumentParser(description="VTube Studio Expression Bridge Tester")
parser.add_argument("--auth", action="store_true", help="Run auth flow")
parser.add_argument("--lipsync", action="store_true", help="Test lip sync parameter sweep")
parser.add_argument("--latency", action="store_true", help="Measure round-trip latency")
parser.add_argument("--delay", type=float, default=2.0, help="Delay between expressions (default: 2s)")
parser.add_argument("--all", action="store_true", help="Run all tests")
args = parser.parse_args()
print("VTube Studio Expression Bridge Tester")
print("=" * 42)
status = check_bridge()
if args.auth:
run_auth()
print()
status = check_bridge()
if not status.get("authenticated") and not args.auth:
print("\nNot authenticated — skipping expression tests.")
print("Run with --auth to authenticate, or start VTube Studio first.")
return
if args.all:
test_expressions(args.delay)
test_lipsync()
test_latency()
elif args.lipsync:
test_lipsync()
elif args.latency:
test_latency()
else:
test_expressions(args.delay)
if __name__ == "__main__":
main()

View File

@@ -1,17 +1,16 @@
#!/usr/bin/env bash #!/usr/bin/env bash
# homeai-visual/setup.sh — P7: VTube Studio bridge + Live2D expressions # homeai-visual/setup.sh — P7: VTube Studio Expression Bridge
# #
# Components: # Sets up:
# - vtube_studio.py — WebSocket client skill for OpenClaw # - Python venv with websockets
# - lipsync.py — amplitude-based lip sync # - vtube-bridge daemon (HTTP ↔ WebSocket bridge)
# - auth.py — VTube Studio token management # - vtube-ctl CLI (symlinked to PATH)
# - launchd service
# #
# Prerequisites: # Prerequisites:
# - P4 (homeai-agent) — OpenClaw running # - P4 (homeai-agent) — OpenClaw running
# - P5 (homeai-character) — aria.json with live2d_expressions set # - P5 (homeai-character) — aria.json with live2d_expressions set
# - macOS: VTube Studio installed (Mac App Store) # - VTube Studio installed (Mac App Store) with WebSocket API enabled
# - Linux: N/A — VTube Studio is macOS/Windows/iOS only
# Linux dev can test the skill code but not the VTube Studio side
set -euo pipefail set -euo pipefail
@@ -19,42 +18,61 @@ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)" REPO_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)"
source "${REPO_DIR}/scripts/common.sh" source "${REPO_DIR}/scripts/common.sh"
log_section "P7: VTube Studio Bridge" VENV_DIR="$HOME/homeai-visual-env"
detect_platform PLIST_SRC="${SCRIPT_DIR}/launchd/com.homeai.vtube-bridge.plist"
PLIST_DST="$HOME/Library/LaunchAgents/com.homeai.vtube-bridge.plist"
VTUBE_CTL_SRC="$HOME/.openclaw/skills/vtube-studio/scripts/vtube-ctl"
if [[ "$OS_TYPE" == "linux" ]]; then log_section "P7: VTube Studio Expression Bridge"
log_warn "VTube Studio is not available on Linux."
log_warn "This sub-project requires macOS (Mac Mini)." # ─── Python venv ──────────────────────────────────────────────────────────────
if [[ ! -d "$VENV_DIR" ]]; then
log_info "Creating Python venv at $VENV_DIR..."
python3 -m venv "$VENV_DIR"
fi fi
# ─── TODO: Implementation ────────────────────────────────────────────────────── log_info "Installing dependencies..."
cat <<'EOF' "$VENV_DIR/bin/pip" install --upgrade pip -q
"$VENV_DIR/bin/pip" install websockets -q
log_ok "Python venv ready ($(${VENV_DIR}/bin/python3 --version))"
┌─────────────────────────────────────────────────────────────────┐ # ─── vtube-ctl symlink ───────────────────────────────────────────────────────
│ P7: homeai-visual — NOT YET IMPLEMENTED │
│ │
│ macOS only (VTube Studio is macOS/iOS/Windows) │
│ │
│ Implementation steps: │
│ 1. Install VTube Studio from Mac App Store │
│ 2. Enable WebSocket API in VTube Studio (Settings → port 8001) │
│ 3. Source/purchase Live2D model │
│ 4. Create expression hotkeys for 8 states │
│ 5. Implement skills/vtube_studio.py (WebSocket client) │
│ 6. Implement skills/lipsync.py (amplitude → MouthOpen param) │
│ 7. Implement skills/auth.py (token request + persistence) │
│ 8. Register vtube_studio skill with OpenClaw │
│ 9. Update aria.json live2d_expressions with hotkey IDs │
│ 10. Test all 8 expression states │
│ │
│ On Linux: implement Python skills, test WebSocket protocol │
│ with a mock server before connecting to real VTube Studio. │
│ │
│ Interface contracts: │
│ VTUBE_WS_URL=ws://localhost:8001 │
└─────────────────────────────────────────────────────────────────┘
EOF if [[ -f "$VTUBE_CTL_SRC" ]]; then
chmod +x "$VTUBE_CTL_SRC"
ln -sf "$VTUBE_CTL_SRC" /opt/homebrew/bin/vtube-ctl
log_ok "vtube-ctl symlinked to /opt/homebrew/bin/vtube-ctl"
else
log_warn "vtube-ctl not found at $VTUBE_CTL_SRC — skipping symlink"
fi
log_info "P7 is not yet implemented. See homeai-visual/PLAN.md for details." # ─── launchd service ─────────────────────────────────────────────────────────
exit 0
if [[ -f "$PLIST_SRC" ]]; then
# Unload if already loaded
launchctl bootout "gui/$(id -u)/com.homeai.vtube-bridge" 2>/dev/null || true
cp "$PLIST_SRC" "$PLIST_DST"
launchctl bootstrap "gui/$(id -u)" "$PLIST_DST"
log_ok "launchd service loaded: com.homeai.vtube-bridge"
else
log_warn "Plist not found at $PLIST_SRC — skipping launchd setup"
fi
# ─── Status ──────────────────────────────────────────────────────────────────
echo ""
log_info "VTube Bridge setup complete."
log_info ""
log_info "Next steps:"
log_info " 1. Install VTube Studio from Mac App Store"
log_info " 2. Enable WebSocket API: Settings > WebSocket API > port 8001"
log_info " 3. Load a Live2D model"
log_info " 4. Create expression hotkeys (idle, listening, thinking, speaking, happy, sad, surprised, error)"
log_info " 5. Run: vtube-ctl auth (click Allow in VTube Studio)"
log_info " 6. Run: python3 ${SCRIPT_DIR}/scripts/test-expressions.py --all"
log_info " 7. Update aria.json with real hotkey UUIDs"
log_info ""
log_info "Logs: /tmp/homeai-vtube-bridge.log"
log_info "Bridge: http://localhost:8002/status"

View File

@@ -0,0 +1,454 @@
#!/usr/bin/env python3
"""
VTube Studio Expression Bridge — persistent WebSocket ↔ HTTP bridge.
Maintains a long-lived WebSocket connection to VTube Studio and exposes
a simple HTTP API so other HomeAI components can trigger expressions and
inject parameters (lip sync) without managing their own WS connections.
HTTP API (port 8002):
POST /expression {"event": "thinking"} → trigger hotkey
POST /parameter {"name": "MouthOpen", "value": 0.5} → inject param
POST /parameters [{"name": "MouthOpen", "value": 0.5}, ...]
POST /auth {} → request new token
GET /status → connection info
GET /expressions → list available expressions
Requires: pip install websockets
"""
import argparse
import asyncio
import json
import logging
import signal
import sys
import time
from http import HTTPStatus
from pathlib import Path
try:
import websockets
from websockets.exceptions import ConnectionClosed
except ImportError:
print("ERROR: 'websockets' package required. Install with: pip install websockets", file=sys.stderr)
sys.exit(1)
# ---------------------------------------------------------------------------
# Config
# ---------------------------------------------------------------------------
DEFAULT_VTUBE_WS_URL = "ws://localhost:8001"
DEFAULT_HTTP_PORT = 8002
TOKEN_PATH = Path.home() / ".openclaw" / "vtube_token.json"
DEFAULT_CHARACTER_PATH = (
Path.home() / "gitea" / "homeai" / "homeai-dashboard" / "characters" / "aria.json"
)
logger = logging.getLogger("vtube-bridge")
# ---------------------------------------------------------------------------
# VTube Studio WebSocket Client
# ---------------------------------------------------------------------------
class VTubeClient:
"""Persistent async WebSocket client for VTube Studio API."""
def __init__(self, ws_url: str, character_path: Path):
self.ws_url = ws_url
self.character_path = character_path
self._ws = None
self._token: str | None = None
self._authenticated = False
self._current_expression: str | None = None
self._connected = False
self._request_id = 0
self._lock = asyncio.Lock()
self._load_token()
self._load_character()
# ── Character config ──────────────────────────────────────────────
def _load_character(self):
"""Load expression mappings from character JSON."""
self.expression_map: dict[str, str] = {}
self.ws_triggers: dict = {}
try:
if self.character_path.exists():
cfg = json.loads(self.character_path.read_text())
self.expression_map = cfg.get("live2d_expressions", {})
self.ws_triggers = cfg.get("vtube_ws_triggers", {})
logger.info("Loaded %d expressions from %s", len(self.expression_map), self.character_path.name)
else:
logger.warning("Character file not found: %s", self.character_path)
except Exception as e:
logger.error("Failed to load character config: %s", e)
def reload_character(self):
"""Hot-reload character config without restarting."""
self._load_character()
return {"expressions": self.expression_map, "triggers": self.ws_triggers}
# ── Token persistence ─────────────────────────────────────────────
def _load_token(self):
try:
if TOKEN_PATH.exists():
data = json.loads(TOKEN_PATH.read_text())
self._token = data.get("token")
logger.info("Loaded auth token from %s", TOKEN_PATH)
except Exception as e:
logger.warning("Could not load token: %s", e)
def _save_token(self, token: str):
TOKEN_PATH.parent.mkdir(parents=True, exist_ok=True)
TOKEN_PATH.write_text(json.dumps({"token": token}, indent=2))
self._token = token
logger.info("Saved auth token to %s", TOKEN_PATH)
# ── WebSocket comms ───────────────────────────────────────────────
def _next_id(self) -> str:
self._request_id += 1
return f"homeai-{self._request_id}"
async def _send(self, message_type: str, data: dict | None = None) -> dict:
"""Send a VTube Studio API message and return the response."""
payload = {
"apiName": "VTubeStudioPublicAPI",
"apiVersion": "1.0",
"requestID": self._next_id(),
"messageType": message_type,
"data": data or {},
}
await self._ws.send(json.dumps(payload))
resp = json.loads(await asyncio.wait_for(self._ws.recv(), timeout=10))
return resp
# ── Connection lifecycle ──────────────────────────────────────────
async def connect(self):
"""Connect and authenticate to VTube Studio."""
try:
self._ws = await websockets.connect(self.ws_url, ping_interval=20, ping_timeout=10)
self._connected = True
logger.info("Connected to VTube Studio at %s", self.ws_url)
if self._token:
await self._authenticate()
else:
logger.warning("No auth token — call POST /auth to initiate authentication")
except Exception as e:
self._connected = False
self._authenticated = False
logger.error("Connection failed: %s", e)
raise
async def _authenticate(self):
"""Authenticate with an existing token."""
resp = await self._send("AuthenticationRequest", {
"pluginName": "HomeAI",
"pluginDeveloper": "HomeAI",
"authenticationToken": self._token,
})
self._authenticated = resp.get("data", {}).get("authenticated", False)
if self._authenticated:
logger.info("Authenticated successfully")
else:
logger.warning("Token rejected — request a new one via POST /auth")
self._authenticated = False
async def request_new_token(self) -> dict:
"""Request a new auth token. User must click Allow in VTube Studio."""
if not self._connected:
return {"error": "Not connected to VTube Studio"}
resp = await self._send("AuthenticationTokenRequest", {
"pluginName": "HomeAI",
"pluginDeveloper": "HomeAI",
"pluginIcon": None,
})
token = resp.get("data", {}).get("authenticationToken")
if token:
self._save_token(token)
await self._authenticate()
return {"authenticated": self._authenticated, "token_saved": True}
return {"error": "No token received", "response": resp}
async def disconnect(self):
if self._ws:
await self._ws.close()
self._connected = False
self._authenticated = False
async def ensure_connected(self):
"""Reconnect if the connection dropped."""
if not self._connected or self._ws is None or self._ws.closed:
logger.info("Reconnecting...")
await self.connect()
# ── Expression & parameter API ────────────────────────────────────
async def trigger_expression(self, event: str) -> dict:
"""Trigger a named expression from the character config."""
async with self._lock:
await self.ensure_connected()
if not self._authenticated:
return {"error": "Not authenticated"}
hotkey_id = self.expression_map.get(event)
if not hotkey_id:
return {"error": f"Unknown expression: {event}", "available": list(self.expression_map.keys())}
resp = await self._send("HotkeyTriggerRequest", {"hotkeyID": hotkey_id})
self._current_expression = event
return {"ok": True, "expression": event, "hotkey_id": hotkey_id}
async def set_parameter(self, name: str, value: float, weight: float = 1.0) -> dict:
"""Inject a single VTube Studio parameter value."""
async with self._lock:
await self.ensure_connected()
if not self._authenticated:
return {"error": "Not authenticated"}
resp = await self._send("InjectParameterDataRequest", {
"parameterValues": [{"id": name, "value": value, "weight": weight}],
})
return {"ok": True, "name": name, "value": value}
async def set_parameters(self, params: list[dict]) -> dict:
"""Inject multiple VTube Studio parameters at once."""
async with self._lock:
await self.ensure_connected()
if not self._authenticated:
return {"error": "Not authenticated"}
param_values = [
{"id": p["name"], "value": p["value"], "weight": p.get("weight", 1.0)}
for p in params
]
resp = await self._send("InjectParameterDataRequest", {
"parameterValues": param_values,
})
return {"ok": True, "count": len(param_values)}
async def list_hotkeys(self) -> dict:
"""List all hotkeys available in the current model."""
async with self._lock:
await self.ensure_connected()
if not self._authenticated:
return {"error": "Not authenticated"}
resp = await self._send("HotkeysInCurrentModelRequest", {})
return resp.get("data", {})
async def list_parameters(self) -> dict:
"""List all input parameters for the current model."""
async with self._lock:
await self.ensure_connected()
if not self._authenticated:
return {"error": "Not authenticated"}
resp = await self._send("InputParameterListRequest", {})
return resp.get("data", {})
def status(self) -> dict:
return {
"connected": self._connected,
"authenticated": self._authenticated,
"ws_url": self.ws_url,
"current_expression": self._current_expression,
"expression_count": len(self.expression_map),
"expressions": list(self.expression_map.keys()),
}
# ---------------------------------------------------------------------------
# HTTP Server (asyncio-based, no external deps)
# ---------------------------------------------------------------------------
class BridgeHTTPHandler:
"""Simple async HTTP request handler for the bridge API."""
def __init__(self, client: VTubeClient):
self.client = client
async def handle(self, reader: asyncio.StreamReader, writer: asyncio.StreamWriter):
try:
request_line = await asyncio.wait_for(reader.readline(), timeout=5)
if not request_line:
writer.close()
return
method, path, _ = request_line.decode().strip().split(" ", 2)
path = path.split("?")[0] # strip query params
# Read headers
content_length = 0
while True:
line = await reader.readline()
if line == b"\r\n" or not line:
break
if line.lower().startswith(b"content-length:"):
content_length = int(line.split(b":")[1].strip())
# Read body
body = None
if content_length > 0:
body = await reader.read(content_length)
# Route
try:
result = await self._route(method, path, body)
await self._respond(writer, 200, result)
except Exception as e:
logger.error("Handler error: %s", e, exc_info=True)
await self._respond(writer, 500, {"error": str(e)})
except asyncio.TimeoutError:
writer.close()
except Exception as e:
logger.error("Connection error: %s", e)
try:
writer.close()
except Exception:
pass
async def _route(self, method: str, path: str, body: bytes | None) -> dict:
data = {}
if body:
try:
data = json.loads(body)
except json.JSONDecodeError:
return {"error": "Invalid JSON"}
if method == "GET" and path == "/status":
return self.client.status()
if method == "GET" and path == "/expressions":
return {
"expressions": self.client.expression_map,
"triggers": self.client.ws_triggers,
}
if method == "GET" and path == "/hotkeys":
return await self.client.list_hotkeys()
if method == "GET" and path == "/parameters":
return await self.client.list_parameters()
if method == "POST" and path == "/expression":
event = data.get("event")
if not event:
return {"error": "Missing 'event' field"}
return await self.client.trigger_expression(event)
if method == "POST" and path == "/parameter":
name = data.get("name")
value = data.get("value")
if name is None or value is None:
return {"error": "Missing 'name' or 'value' field"}
return await self.client.set_parameter(name, float(value), float(data.get("weight", 1.0)))
if method == "POST" and path == "/parameters":
if not isinstance(data, list):
return {"error": "Expected JSON array of {name, value} objects"}
return await self.client.set_parameters(data)
if method == "POST" and path == "/auth":
return await self.client.request_new_token()
if method == "POST" and path == "/reload":
return self.client.reload_character()
return {"error": f"Unknown route: {method} {path}"}
async def _respond(self, writer: asyncio.StreamWriter, status: int, data: dict):
body = json.dumps(data, indent=2).encode()
status_text = HTTPStatus(status).phrase
header = (
f"HTTP/1.1 {status} {status_text}\r\n"
f"Content-Type: application/json\r\n"
f"Content-Length: {len(body)}\r\n"
f"Access-Control-Allow-Origin: *\r\n"
f"Access-Control-Allow-Methods: GET, POST, OPTIONS\r\n"
f"Access-Control-Allow-Headers: Content-Type\r\n"
f"\r\n"
)
writer.write(header.encode() + body)
await writer.drain()
writer.close()
# ---------------------------------------------------------------------------
# Auto-reconnect loop
# ---------------------------------------------------------------------------
async def reconnect_loop(client: VTubeClient, interval: float = 5.0):
"""Background task that keeps the VTube Studio connection alive."""
while True:
try:
if not client._connected or client._ws is None or client._ws.closed:
logger.info("Connection lost — attempting reconnect...")
await client.connect()
except Exception as e:
logger.debug("Reconnect failed: %s (retrying in %.0fs)", e, interval)
await asyncio.sleep(interval)
# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------
async def main(args):
logging.basicConfig(
level=logging.DEBUG if args.verbose else logging.INFO,
format="%(asctime)s [%(name)s] %(levelname)s: %(message)s",
datefmt="%H:%M:%S",
)
character_path = Path(args.character)
client = VTubeClient(args.vtube_url, character_path)
# Try initial connection (don't fail if VTube Studio isn't running yet)
try:
await client.connect()
except Exception as e:
logger.warning("Initial connection failed: %s (will keep retrying)", e)
# Start reconnect loop
reconnect_task = asyncio.create_task(reconnect_loop(client, interval=5.0))
# Start HTTP server
handler = BridgeHTTPHandler(client)
server = await asyncio.start_server(handler.handle, "0.0.0.0", args.port)
logger.info("HTTP API listening on http://0.0.0.0:%d", args.port)
logger.info("Endpoints: /status /expression /parameter /parameters /auth /reload /hotkeys")
# Graceful shutdown
stop = asyncio.Event()
def _signal_handler():
logger.info("Shutting down...")
stop.set()
loop = asyncio.get_running_loop()
for sig in (signal.SIGINT, signal.SIGTERM):
loop.add_signal_handler(sig, _signal_handler)
async with server:
await stop.wait()
reconnect_task.cancel()
await client.disconnect()
logger.info("Goodbye.")
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="VTube Studio Expression Bridge")
parser.add_argument("--port", type=int, default=DEFAULT_HTTP_PORT, help="HTTP API port (default: 8002)")
parser.add_argument("--vtube-url", default=DEFAULT_VTUBE_WS_URL, help="VTube Studio WebSocket URL")
parser.add_argument("--character", default=str(DEFAULT_CHARACTER_PATH), help="Path to character JSON")
parser.add_argument("--verbose", "-v", action="store_true", help="Debug logging")
args = parser.parse_args()
asyncio.run(main(args))

View File

@@ -18,6 +18,12 @@
<string>1.0</string> <string>1.0</string>
</array> </array>
<key>EnvironmentVariables</key>
<dict>
<key>ELEVENLABS_API_KEY</key>
<string>REDACTED_ELEVENLABS_API_KEY</string>
</dict>
<key>RunAtLoad</key> <key>RunAtLoad</key>
<true/> <true/>

View File

@@ -7,8 +7,10 @@ Usage:
import argparse import argparse
import asyncio import asyncio
import json
import logging import logging
import os import os
import urllib.request
import numpy as np import numpy as np
@@ -20,10 +22,76 @@ from wyoming.tts import Synthesize
_LOGGER = logging.getLogger(__name__) _LOGGER = logging.getLogger(__name__)
ACTIVE_TTS_VOICE_PATH = os.path.expanduser("~/homeai-data/active-tts-voice.json")
SAMPLE_RATE = 24000 SAMPLE_RATE = 24000
SAMPLE_WIDTH = 2 # int16 SAMPLE_WIDTH = 2 # int16
CHANNELS = 1 CHANNELS = 1
CHUNK_SECONDS = 1 # stream in 1-second chunks CHUNK_SECONDS = 1 # stream in 1-second chunks
VTUBE_BRIDGE_URL = "http://localhost:8002"
LIPSYNC_ENABLED = True
LIPSYNC_FRAME_SAMPLES = 1200 # 50ms frames at 24kHz → 20 updates/sec
LIPSYNC_SCALE = 10.0 # amplitude multiplier (tuned for Kokoro output levels)
def _send_lipsync(value: float):
"""Fire-and-forget POST to vtube-bridge with mouth open value."""
try:
body = json.dumps({"name": "MouthOpen", "value": value}).encode()
req = urllib.request.Request(
f"{VTUBE_BRIDGE_URL}/parameter",
data=body,
headers={"Content-Type": "application/json"},
method="POST",
)
with urllib.request.urlopen(req, timeout=0.5):
    pass  # response body unused; close the socket promptly
except Exception:
pass # bridge may not be running
def _compute_lipsync_frames(samples_int16: np.ndarray) -> list[float]:
"""Compute per-frame RMS amplitude scaled to 01 for lip sync."""
frames = []
for i in range(0, len(samples_int16), LIPSYNC_FRAME_SAMPLES):
frame = samples_int16[i : i + LIPSYNC_FRAME_SAMPLES].astype(np.float32)
rms = np.sqrt(np.mean(frame ** 2)) / 32768.0
mouth = min(rms * LIPSYNC_SCALE, 1.0)
frames.append(round(mouth, 3))
return frames
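As a rough worked check of the framing math above (a standalone sketch with an illustrative `lipsync_frames` helper, not the server's numpy implementation): at 24 kHz with 1200-sample frames, one second of audio yields 20 mouth updates, and a full-scale constant signal clamps to mouth = 1.0:

```python
import math

SAMPLE_RATE = 24000
FRAME_SAMPLES = 1200   # 50 ms frames -> 20 updates/sec
SCALE = 10.0           # same amplitude multiplier as LIPSYNC_SCALE

def lipsync_frames(samples: list[int]) -> list[float]:
    """Per-frame RMS of int16 samples, scaled and clamped to 0..1."""
    frames = []
    for i in range(0, len(samples), FRAME_SAMPLES):
        frame = samples[i:i + FRAME_SAMPLES]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame)) / 32768.0
        frames.append(round(min(rms * SCALE, 1.0), 3))
    return frames

one_second_full_scale = [32767] * SAMPLE_RATE  # constant full-scale int16 signal
frames = lipsync_frames(one_second_full_scale)
print(len(frames))  # 20
print(frames[0])    # 1.0
```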
def _get_active_tts_config() -> dict | None:
"""Read the active TTS config set by the OpenClaw bridge."""
try:
with open(ACTIVE_TTS_VOICE_PATH) as f:
return json.load(f)
except Exception:
return None
def _synthesize_elevenlabs(text: str, voice_id: str, model: str = "eleven_multilingual_v2") -> bytes:
"""Call ElevenLabs TTS API and return raw PCM audio bytes (24kHz 16-bit mono)."""
api_key = os.environ.get("ELEVENLABS_API_KEY", "")
if not api_key:
raise RuntimeError("ELEVENLABS_API_KEY not set")
url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}?output_format=pcm_24000"
payload = json.dumps({
"text": text,
"model_id": model,
"voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
}).encode()
req = urllib.request.Request(
url,
data=payload,
headers={
"Content-Type": "application/json",
"xi-api-key": api_key,
},
method="POST",
)
with urllib.request.urlopen(req, timeout=30) as resp:
return resp.read()
def _load_kokoro(): def _load_kokoro():
@@ -76,26 +144,53 @@ class KokoroEventHandler(AsyncEventHandler):
synthesize = Synthesize.from_event(event) synthesize = Synthesize.from_event(event)
text = synthesize.text text = synthesize.text
voice = self._default_voice voice = self._default_voice
use_elevenlabs = False
if synthesize.voice and synthesize.voice.name: # Bridge state file takes priority (set per-request by OpenClaw bridge)
tts_config = _get_active_tts_config()
if tts_config and tts_config.get("engine") == "elevenlabs":
use_elevenlabs = True
voice = tts_config.get("elevenlabs_voice_id", "")
_LOGGER.debug("Synthesizing %r with ElevenLabs voice=%s", text, voice)
elif tts_config and tts_config.get("kokoro_voice"):
voice = tts_config["kokoro_voice"]
elif synthesize.voice and synthesize.voice.name:
voice = synthesize.voice.name voice = synthesize.voice.name
_LOGGER.debug("Synthesizing %r with voice=%s speed=%.1f", text, voice, self._speed)
try: try:
loop = asyncio.get_event_loop() loop = asyncio.get_event_loop()
if use_elevenlabs and voice:
# ElevenLabs returns PCM 24kHz 16-bit mono
model = tts_config.get("elevenlabs_model", "eleven_multilingual_v2")
_LOGGER.info("Using ElevenLabs TTS (model=%s, voice=%s)", model, voice)
pcm_bytes = await loop.run_in_executor(
None, lambda: _synthesize_elevenlabs(text, voice, model)
)
samples_int16 = np.frombuffer(pcm_bytes, dtype=np.int16)
audio_bytes = pcm_bytes
else:
_LOGGER.debug("Synthesizing %r with Kokoro voice=%s speed=%.1f", text, voice, self._speed)
samples, sample_rate = await loop.run_in_executor( samples, sample_rate = await loop.run_in_executor(
None, lambda: self._tts.create(text, voice=voice, speed=self._speed) None, lambda: self._tts.create(text, voice=voice, speed=self._speed)
) )
samples_int16 = (np.clip(samples, -1.0, 1.0) * 32767).astype(np.int16) samples_int16 = (np.clip(samples, -1.0, 1.0) * 32767).astype(np.int16)
audio_bytes = samples_int16.tobytes() audio_bytes = samples_int16.tobytes()
# Pre-compute lip sync frames for the entire utterance
lipsync_frames = []
if LIPSYNC_ENABLED:
lipsync_frames = _compute_lipsync_frames(samples_int16)
await self.write_event( await self.write_event(
AudioStart(rate=SAMPLE_RATE, width=SAMPLE_WIDTH, channels=CHANNELS).event() AudioStart(rate=SAMPLE_RATE, width=SAMPLE_WIDTH, channels=CHANNELS).event()
) )
chunk_size = SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS * CHUNK_SECONDS chunk_size = SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS * CHUNK_SECONDS
lipsync_idx = 0
samples_per_chunk = SAMPLE_RATE * CHUNK_SECONDS
frames_per_chunk = samples_per_chunk // LIPSYNC_FRAME_SAMPLES
for i in range(0, len(audio_bytes), chunk_size): for i in range(0, len(audio_bytes), chunk_size):
await self.write_event( await self.write_event(
AudioChunk( AudioChunk(
@@ -106,8 +201,22 @@ class KokoroEventHandler(AsyncEventHandler):
).event() ).event()
) )
# Send lip sync frames for this audio chunk
if LIPSYNC_ENABLED and lipsync_frames:
chunk_frames = lipsync_frames[lipsync_idx : lipsync_idx + frames_per_chunk]
for mouth_val in chunk_frames:
await loop.run_in_executor(None, _send_lipsync, mouth_val)
lipsync_idx += frames_per_chunk
# Close mouth after speech
if LIPSYNC_ENABLED:
await loop.run_in_executor(None, _send_lipsync, 0.0)
await self.write_event(AudioStop().event()) await self.write_event(AudioStop().event())
_LOGGER.info("Synthesized %.1fs of audio", len(samples) / sample_rate) duration = len(samples_int16) / SAMPLE_RATE
_LOGGER.info("Synthesized %.1fs of audio (%d lipsync frames)", duration, len(lipsync_frames))
except Exception: except Exception:
_LOGGER.exception("Synthesis error") _LOGGER.exception("Synthesis error")