feat: character system v2 — schema upgrade, memory system, per-character TTS routing

Character schema v2: background, dialogue_style, appearance, skills, gaze_presets
with automatic v1→v2 migration. LLM-assisted character creation via Character MCP
server. Two-tier memory system (personal per-character + general shared) with
budget-based injection into LLM system prompt. Per-character TTS voice routing via
state file — Wyoming TTS server reads active config to route between Kokoro (local)
and ElevenLabs (cloud PCM 24kHz). Dashboard: memories page, conversation history,
character profile on cards, auto-TTS engine selection from character config.
Also includes VTube Studio expression bridge and ComfyUI API guide.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Aodhan Collins
2026-03-17 19:15:46 +00:00
parent 1e52c002c2
commit 60eb89ea42
39 changed files with 3846 additions and 409 deletions


@@ -9,6 +9,7 @@ OPENAI_API_KEY=
DEEPSEEK_API_KEY=
GEMINI_API_KEY=
ELEVENLABS_API_KEY=
GAZE_API_KEY=
# ─── Data & Paths ──────────────────────────────────────────────────────────────
DATA_DIR=${HOME}/homeai-data
@@ -40,10 +41,14 @@ OPEN_WEBUI_URL=http://localhost:3030
OLLAMA_PRIMARY_MODEL=llama3.3:70b
OLLAMA_FAST_MODEL=qwen2.5:7b
# Medium model kept warm for voice pipeline (override per persona)
# Used by preload-models.sh keep-warm daemon
HOMEAI_MEDIUM_MODEL=qwen3.5:35b-a3b
# ─── P3: Voice ─────────────────────────────────────────────────────────────────
WYOMING_STT_URL=tcp://localhost:10300
WYOMING_TTS_URL=tcp://localhost:10301
ELEVENLABS_API_KEY= # Create at elevenlabs.io if using elevenlabs TTS engine
# ELEVENLABS_API_KEY is set above in API Keys section
# ─── P4: Agent ─────────────────────────────────────────────────────────────────
OPENCLAW_URL=http://localhost:8080


@@ -26,6 +26,7 @@ All AI inference runs locally on this machine. No cloud dependency required (clo
### AI & LLM
- **Ollama** — local LLM runtime (target models: Llama 3.3 70B, Qwen 2.5 72B)
- **Model keep-warm daemon** — `preload-models.sh` runs as a loop, checks every 5 min, re-pins evicted models with `keep_alive=-1`. Keeps `qwen2.5:7b` (small/fast) and `$HOMEAI_MEDIUM_MODEL` (default: `qwen3.5:35b-a3b`) always loaded in VRAM. Medium model is configurable via env var for per-persona model assignment.
- **Open WebUI** — browser-based chat interface, runs as Docker container
### Image Generation
@@ -35,7 +36,8 @@ All AI inference runs locally on this machine. No cloud dependency required (clo
### Speech
- **Whisper.cpp** — speech-to-text, optimised for Apple Silicon/Neural Engine
- **Kokoro TTS** — fast, lightweight text-to-speech (primary, low-latency)
- **Kokoro TTS** — fast, lightweight text-to-speech (primary, low-latency, local)
- **ElevenLabs TTS** — cloud voice cloning/synthesis (per-character voice ID, routed via state file)
- **Chatterbox TTS** — voice cloning engine (Apple Silicon MPS optimised)
- **Qwen3-TTS** — alternative voice cloning via MLX
- **openWakeWord** — always-on wake word detection
@@ -49,11 +51,13 @@ All AI inference runs locally on this machine. No cloud dependency required (clo
### AI Agent / Orchestration
- **OpenClaw** — primary AI agent layer; receives voice commands, calls tools, manages personality
- **n8n** — visual workflow automation (Docker), chains AI actions
- **mem0** — long-term memory layer for the AI character
- **Character Memory System** — two-tier JSON-based memories (personal per-character + general shared), injected into LLM system prompt with budget truncation
### Character & Personality
- **Character Manager** (built — see `character-manager.jsx`) — single config UI for personality, prompts, models, Live2D mappings, and notes
- Character config exports to JSON, consumed by OpenClaw system prompt and pipeline
- **Character Schema v2** — JSON spec with background, dialogue_style, appearance, skills, gaze_presets (v1 auto-migrated)
- **HomeAI Dashboard** — unified web app: character editor, chat, memory manager, service dashboard
- **Character MCP Server** — LLM-assisted character creation via Fandom wiki/Wikipedia lookup (Docker)
- Character config stored as JSON files in `~/homeai-data/characters/`, consumed by bridge for system prompt construction
### Visual Representation
- **VTube Studio** — Live2D model display on desktop (macOS) and mobile (iOS/Android)
@@ -85,47 +89,79 @@ All AI inference runs locally on this machine. No cloud dependency required (clo
ESP32-S3-BOX-3 (room)
→ Wake word detected (openWakeWord, runs locally on device or Mac Mini)
→ Audio streamed to Mac Mini via Wyoming Satellite
→ Whisper.cpp transcribes speech to text
OpenClaw receives text + context
Ollama LLM generates response (with character persona from system prompt)
mem0 updates long-term memory
→ Whisper MLX transcribes speech to text
HA conversation agent → OpenClaw HTTP Bridge
Bridge resolves character (satellite_id → character mapping)
Bridge builds system prompt (profile + memories) and writes TTS config to state file
→ OpenClaw CLI → Ollama LLM generates response
→ Response dispatched:
Kokoro/Chatterbox renders TTS audio
Wyoming TTS reads state file → routes to Kokoro (local) or ElevenLabs (cloud)
→ Audio sent back to ESP32-S3-BOX-3 (spoken response)
→ VTube Studio API triggered (expression + lip sync on desktop/mobile)
→ Home Assistant action called if applicable (lights, music, etc.)
```
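The character-resolution step in the flow above can be sketched as follows. This is a minimal illustration, assuming `satellite-map.json` holds a `default` character ID plus a `satellites` object keyed by satellite ID. The satellite names and the `glados` character ID are hypothetical sample data.

```python
import json
from typing import Optional

# Hypothetical contents of ~/homeai-data/satellite-map.json (sample data only).
SAMPLE_MAP = json.loads("""
{
  "default": "aria_default",
  "satellites": {
    "kitchen-pi": "glados",
    "living-room-box": "aria_default"
  }
}
""")


def resolve_character_id(sat_map: dict, satellite_id: Optional[str] = None) -> str:
    """Return the character mapped to a satellite, else the map's default."""
    satellites = sat_map.get("satellites", {})
    if satellite_id and satellite_id in satellites:
        return satellites[satellite_id]
    return sat_map.get("default", "aria_default")


if __name__ == "__main__":
    print(resolve_character_id(SAMPLE_MAP, "kitchen-pi"))
    print(resolve_character_id(SAMPLE_MAP, "unknown-device"))
    print(resolve_character_id(SAMPLE_MAP))
```

An unknown or missing satellite ID falls through to the default character, so a newly flashed satellite still gets a usable persona before it is added to the map.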
### Timeout Strategy
The HTTP bridge checks Ollama `/api/ps` before each request to determine if the LLM is already loaded:
| Layer | Warm (model loaded) | Cold (model loading) |
|---|---|---|
| HA conversation component | 200s | 200s |
| OpenClaw HTTP bridge | 60s | 180s |
| OpenClaw agent | 60s | 60s |
The keep-warm daemon ensures models stay loaded, so cold starts should be rare (only after Ollama restarts or VRAM pressure).
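
The warm/cold decision for the bridge row can be sketched as below. This is an illustration rather than the bridge's actual code: the helper names `timeout_from_ps` and `pick_timeout` are hypothetical, and the 60s/180s values are taken from the table above.

```python
import json
import urllib.request

OLLAMA_PS_URL = "http://localhost:11434/api/ps"
TIMEOUT_WARM = 60    # seconds, bridge row of the table (model loaded)
TIMEOUT_COLD = 180   # seconds, bridge row of the table (model loading)


def timeout_from_ps(ps_payload: dict) -> int:
    """Pick a timeout from a parsed /api/ps response body."""
    # /api/ps lists currently loaded models under the "models" key.
    return TIMEOUT_WARM if ps_payload.get("models") else TIMEOUT_COLD


def pick_timeout(url: str = OLLAMA_PS_URL) -> int:
    """Query Ollama and fall back to the cold timeout if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return timeout_from_ps(json.loads(resp.read()))
    except Exception:
        # Unreachable or malformed response: assume cold, which is the safer choice.
        return TIMEOUT_COLD
```

Note that this only checks whether any model is loaded, not whether it is the model the request will use, so a request for a different model can still hit a cold load under the warm timeout.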
---
## Character System
The AI assistant has a defined personality managed via the Character Manager tool.
The AI assistant has a defined personality managed via the HomeAI Dashboard (character editor + memory manager).
Key config surfaces:
- **System prompt** — injected into every Ollama request
- **Voice clone reference** — `.wav` file path for Chatterbox/Qwen3-TTS
- **Live2D expression mappings** — idle, speaking, thinking, happy, error states
- **VTube Studio WebSocket triggers** — JSON map of events to expressions
### Character Schema v2
Each character is a JSON file in `~/homeai-data/characters/` with:
- **System prompt** — core personality, injected into every LLM request
- **Profile fields** — background, appearance, dialogue_style, skills array
- **TTS config** — engine (kokoro/elevenlabs), kokoro_voice, elevenlabs_voice_id, elevenlabs_model, speed
- **GAZE presets** — array of `{preset, trigger}` for image generation styles
- **Custom prompt rules** — trigger/response overrides for specific contexts
- **mem0** — persistent memory that evolves over time
Character config JSON (exported from Character Manager) is the single source of truth consumed by all pipeline components.
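
To make the v1 to v2 upgrade concrete, here is a minimal sketch of what such a migration could look like. The real migration code is not shown in this commit, so `migrate_v1_to_v2` is a hypothetical helper. It assumes v1 stored a single `gaze_preset` string, that v2 stores `gaze_presets` as a list of `{preset, trigger}` objects, and that missing profile fields default to empty values.

```python
def migrate_v1_to_v2(char: dict) -> dict:
    """Return a v2-shaped copy of a character dict (illustrative only)."""
    out = dict(char)  # shallow copy so the caller's dict is not modified

    # v2 profile fields: default to empty values when absent.
    out.setdefault("background", "")
    out.setdefault("appearance", "")
    out.setdefault("dialogue_style", "")
    out.setdefault("skills", [])

    # v1 kept one preset string; v2 keeps a list of {preset, trigger} entries.
    if "gaze_presets" not in out:
        old_preset = out.pop("gaze_preset", "")
        out["gaze_presets"] = (
            [{"preset": old_preset, "trigger": "self-portrait"}] if old_preset else []
        )
    return out


if __name__ == "__main__":
    # Hypothetical v1 character, for demonstration.
    v1 = {"display_name": "Aria", "system_prompt": "You are Aria.", "gaze_preset": "aria_portrait"}
    print(migrate_v1_to_v2(v1))
```

Running the migration on an already migrated dict leaves it unchanged, which keeps an auto-migrate-on-load step safe to repeat.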
### Memory System
Two-tier memory stored as JSON in `~/homeai-data/memories/`:
- **Personal memories** (`personal/{character_id}.json`) — per-character, about user interactions
- **General memories** (`general.json`) — shared operational knowledge (tool usage, device info, routines)
Memories are injected into the system prompt by the bridge with budget truncation (personal: 4000 chars, general: 3000 chars, newest first).
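
The budget truncation described above can be sketched as a standalone function. This is a simplified illustration, assuming each memory is an object with `content` and `createdAt` keys; the sample memories are invented for the example.

```python
def select_memories(memories: list, budget: int) -> list:
    """Return memory contents, newest first, until the character budget is hit."""
    ordered = sorted(memories, key=lambda m: m.get("createdAt", ""), reverse=True)
    picked = []
    used = 0
    for memory in ordered:
        content = memory.get("content", "").strip()
        if not content:
            continue  # skip blank entries without spending budget
        if used + len(content) > budget:
            break  # stop at the first memory that does not fit
        picked.append(content)
        used += len(content)
    return picked


# Hypothetical sample data.
SAMPLE = [
    {"content": "Prefers tea over coffee", "createdAt": "2026-03-10T09:00:00Z"},
    {"content": "Works from home on Fridays", "createdAt": "2026-03-15T18:30:00Z"},
    {"content": "   ", "createdAt": "2026-03-16T08:00:00Z"},
    {"content": "Has a cat named Miso", "createdAt": "2026-03-01T12:00:00Z"},
]

if __name__ == "__main__":
    print(select_memories(SAMPLE, budget=50))
```

Because the loop stops at the first memory that overflows the budget, an older short memory is dropped even if it would have fit. That favours recency over packing efficiency.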
### TTS Voice Routing
The bridge writes the active character's TTS config to `~/homeai-data/active-tts-voice.json` before each request. The Wyoming TTS server reads this state file to determine which engine/voice to use:
- **Kokoro** — local, fast, uses `kokoro_voice` field (e.g., `af_heart`)
- **ElevenLabs** — cloud, uses `elevenlabs_voice_id` + `elevenlabs_model`, returns PCM 24kHz
This works for both ESP32/HA pipeline and dashboard chat.
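
The reader side of this hand-off might look like the sketch below. The Wyoming server's actual logic is not part of this excerpt, so `read_active_tts` and `choose_engine` are hypothetical helpers, and the fallback to Kokoro with `af_heart` when the file is missing or incomplete is an assumption.

```python
import json
from pathlib import Path

DEFAULTS = {"engine": "kokoro", "kokoro_voice": "af_heart"}


def read_active_tts(path: Path) -> dict:
    """Read the state file, falling back to Kokoro defaults if missing or invalid."""
    try:
        state = json.loads(path.read_text())
    except (OSError, json.JSONDecodeError):
        return dict(DEFAULTS)
    if not isinstance(state, dict):
        return dict(DEFAULTS)
    return {**DEFAULTS, **state}


def choose_engine(state: dict) -> tuple:
    """Return (engine, voice). ElevenLabs is used only when a voice ID is present."""
    if state.get("engine") == "elevenlabs" and state.get("elevenlabs_voice_id"):
        return "elevenlabs", state["elevenlabs_voice_id"]
    return "kokoro", state.get("kokoro_voice") or "af_heart"


if __name__ == "__main__":
    state_path = Path.home() / "homeai-data" / "active-tts-voice.json"
    print(choose_engine(read_active_tts(state_path)))
```

One consequence of routing through a single shared state file is that two near-simultaneous requests for different characters can overwrite each other's voice selection. Per-request routing would avoid this at the cost of changing the Wyoming request path.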
---
## Project Priorities
1. **Foundation** — Docker stack up (Home Assistant, Open WebUI, Portainer, Uptime Kuma)
2. **LLM** — Ollama running with target models, Open WebUI connected
3. **Voice pipeline** — Whisper → Ollama → Kokoro → Wyoming → Home Assistant
4. **OpenClaw** — installed, onboarded, connected to Ollama and Home Assistant
5. **ESP32-S3-BOX-3** — ESPHome flash, Wyoming Satellite, LVGL face
6. **Character system** — system prompt wired up, mem0 integrated, voice cloned
7. **VTube Studio** — model loaded, WebSocket API bridge written as OpenClaw skill
8. **ComfyUI** — image generation online, character-consistent model workflows
9. **Extended integrations** — n8n workflows, Music Assistant, Snapcast, Gitea, code-server
10. **Polish** — Authelia, Tailscale hardening, mobile companion, iOS widgets
1. **Foundation** — Docker stack up (Home Assistant, Open WebUI, Portainer, Uptime Kuma)
2. **LLM** — Ollama running with target models, Open WebUI connected
3. **Voice pipeline** — Whisper → Ollama → Kokoro → Wyoming → Home Assistant
4. **OpenClaw** — installed, onboarded, connected to Ollama and Home Assistant
5. **ESP32-S3-BOX-3** — ESPHome flash, Wyoming Satellite, display faces ✅
6. **Character system** — schema v2, dashboard editor, memory system, per-character TTS routing ✅
7. **Animated visual** — PNG/GIF character visual for the web assistant (initial visual layer)
8. **Android app** — companion app for mobile access to the assistant
9. **ComfyUI** — image generation online, character-consistent model workflows
10. **Extended integrations** — n8n workflows, Music Assistant, Snapcast, Gitea, code-server
11. **Polish** — Authelia, Tailscale hardening, iOS widgets
### Stretch Goals
- **Live2D / VTube Studio** — full Live2D model with WebSocket API bridge (requires learning Live2D tooling)
---
@@ -133,7 +169,11 @@ Character config JSON (exported from Character Manager) is the single source of
- All Docker compose files: `~/server/docker/`
- OpenClaw skills: `~/.openclaw/skills/`
- Character configs: `~/.openclaw/characters/`
- Character configs: `~/homeai-data/characters/`
- Character memories: `~/homeai-data/memories/`
- Conversation history: `~/homeai-data/conversations/`
- Active TTS state: `~/homeai-data/active-tts-voice.json`
- Satellite → character map: `~/homeai-data/satellite-map.json`
- Whisper models: `~/models/whisper/`
- Ollama models: managed by Ollama at `~/.ollama/models/`
- ComfyUI models: `~/ComfyUI/models/`

TODO.md

@@ -26,7 +26,7 @@
- [x] Register local GGUF models via Modelfiles (no download): llama3.3:70b, qwen3:32b, codestral:22b, qwen2.5:7b
- [x] Register additional models: EVA-LLaMA-3.33-70B, Midnight-Miqu-70B, QwQ-32B, Qwen3.5-35B, Qwen3-Coder-30B, Qwen3-VL-30B, GLM-4.6V-Flash, DeepSeek-R1-8B, gemma-3-27b
- [x] Add qwen3.5:35b-a3b (MoE, Q8_0) — 26.7 tok/s, recommended for voice pipeline
- [x] Write model preload script + launchd service (keeps voice model in VRAM permanently)
- [x] Write model keep-warm daemon + launchd service (pins qwen2.5:7b + $HOMEAI_MEDIUM_MODEL in VRAM, checks every 5 min)
- [x] Deploy Open WebUI via Docker compose (port 3030)
- [x] Verify Open WebUI connected to Ollama, all models available
- [x] Run pipeline benchmark (homeai-voice/scripts/benchmark_pipeline.py) — STT/LLM/TTS latency profiled
@@ -82,7 +82,7 @@
- [x] Verify full voice → agent → HA action flow
- [x] Add OpenClaw to Uptime Kuma monitors (Manual user action required)
### P5 · homeai-character *(can start alongside P4)*
### P5 · homeai-dashboard *(character system + dashboard)*
- [x] Define and write `schema/character.schema.json` (v1)
- [x] Write `characters/aria.json` — default character
@@ -100,6 +100,15 @@
- [x] Add character profile management to dashboard — store/switch character configs with attached profile images
- [x] Add TTS voice preview in character editor — Kokoro preview via OpenClaw bridge with loading state, custom text, stop control
- [x] Merge homeai-character + homeai-desktop into unified homeai-dashboard (services, chat, characters, editor)
- [x] Upgrade character schema to v2 — background, dialogue_style, appearance, skills, gaze_presets (auto-migrate v1)
- [x] Add LLM-assisted character creation via Character MCP server (Fandom/Wikipedia lookup)
- [x] Add character memory system — personal (per-character) + general (shared) memories with dashboard UI
- [x] Add conversation history with per-conversation persistence
- [x] Wire character_id through full pipeline (dashboard → bridge → LLM system prompt)
- [x] Add TTS text cleaning — strip tags, asterisks, emojis, markdown before synthesis
- [x] Add per-character TTS voice routing — bridge writes state file, Wyoming server reads it
- [x] Add ElevenLabs TTS support in Wyoming server — cloud voice synthesis via state file routing
- [x] Dashboard auto-selects character's TTS engine/voice (Kokoro or ElevenLabs)
- [ ] Deploy dashboard as Docker container or static site on Mac Mini
---
@@ -123,50 +132,71 @@
- [ ] Flash remaining units (bedroom, kitchen)
- [ ] Document MAC address → room name mapping
### P6b · homeai-rpi (Kitchen Satellite)
- [x] Set up Wyoming Satellite on Raspberry Pi 5 (SELBINA) with ReSpeaker 2-Mics pHAT
- [x] Write setup.sh — full Pi provisioning (venv, drivers, systemd, scripts)
- [x] Write deploy.sh — remote deploy/manage from Mac Mini (push-wrapper, test-logs, etc.)
- [x] Write satellite_wrapper.py — monkey-patches fixing TTS echo, writer race, streaming timeout
- [x] Test multi-command voice loop without freezing
---
## Phase 5 — Visual Layer
### P7 · homeai-visual
- [ ] Install VTube Studio (Mac App Store)
- [ ] Enable WebSocket API on port 8001
- [ ] Source/purchase a Live2D model (nizima.com or booth.pm)
- [ ] Load model in VTube Studio
- [ ] Create hotkeys for all 8 expression states
- [ ] Write `skills/vtube_studio` SKILL.md + implementation
- [ ] Run auth flow — click Allow in VTube Studio, save token
- [ ] Test all 8 expressions via test script
- [ ] Update `aria.json` with real VTube Studio hotkey IDs
- [ ] Write `lipsync.py` amplitude-based helper
- [ ] Integrate lip sync into OpenClaw TTS dispatch
- [ ] Test full pipeline: voice → thinking expression → speaking with lip sync
#### VTube Studio Expression Bridge
- [x] Write `vtube-bridge.py` — persistent WebSocket ↔ HTTP bridge daemon (port 8002)
- [x] Write `vtube-ctl` CLI wrapper + OpenClaw skill (`~/.openclaw/skills/vtube-studio/`)
- [x] Wire expression triggers into `openclaw-http-bridge.py` (thinking → idle, speaking → idle)
- [x] Add amplitude-based lip sync to `wyoming_kokoro_server.py` (RMS → MouthOpen parameter)
- [x] Write `test-expressions.py` — auth flow, expression cycle, lip sync sweep, latency test
- [x] Write launchd plist + setup.sh for venv creation and service registration
- [ ] Install VTube Studio from Mac App Store, enable WebSocket API (port 8001)
- [ ] Source/purchase Live2D model, load in VTube Studio
- [ ] Create 8 expression hotkeys, record UUIDs
- [ ] Run `setup.sh` to create venv, install websockets, load launchd service
- [ ] Run `vtube-ctl auth` — click Allow in VTube Studio
- [ ] Update `aria.json` with real hotkey UUIDs (replace placeholders)
- [ ] Run `test-expressions.py --all` — verify expressions + lip sync + latency
- [ ] Set up VTube Studio mobile (iPhone/iPad) on Tailnet
#### Web Visuals (Dashboard)
- [ ] Design PNG/GIF character visuals for web assistant (idle, thinking, speaking, etc.)
- [ ] Integrate animated visuals into homeai-dashboard chat view
- [ ] Sync visual state to voice pipeline events (listening, processing, responding)
- [ ] Add expression transitions and idle animations
### P8 · homeai-android
- [ ] Build Android companion app for mobile assistant access
- [ ] Integrate with OpenClaw bridge API (chat, TTS, STT)
- [ ] Add character visual display
- [ ] Push notification support via ntfy/FCM
---
## Phase 6 — Image Generation
### P8 · homeai-images
### P9 · homeai-images (ComfyUI)
- [ ] Clone ComfyUI to `~/ComfyUI/`, install deps in venv
- [ ] Verify MPS is detected at launch
- [ ] Write and load launchd plist (`com.homeai.comfyui.plist`)
- [ ] Download SDXL base model
- [ ] Download Flux.1-schnell
- [ ] Download ControlNet models (canny, depth)
- [ ] Download SDXL base model + Flux.1-schnell + ControlNet models
- [ ] Test generation via ComfyUI web UI (port 8188)
- [ ] Build and export `quick.json`, `portrait.json`, `scene.json`, `upscale.json` workflows
- [ ] Build and export workflow JSONs (quick, portrait, scene, upscale)
- [ ] Write `skills/comfyui` SKILL.md + implementation
- [ ] Test skill: "Generate a portrait of Aria looking happy"
- [ ] Collect character reference images for LoRA training
- [ ] Train SDXL LoRA with kohya_ss, verify character consistency
- [ ] Add ComfyUI to Uptime Kuma monitors
---
## Phase 7 — Extended Integrations & Polish
### P10 · Integrations & Polish
- [ ] Deploy Music Assistant (Docker), integrate with Home Assistant
- [ ] Write `skills/music` SKILL.md for OpenClaw
- [ ] Deploy Snapcast server on Mac Mini
@@ -183,10 +213,24 @@
---
## Stretch Goals
### Live2D / VTube Studio
- [ ] Learn Live2D modelling toolchain (Live2D Cubism Editor)
- [ ] Install VTube Studio (Mac App Store), enable WebSocket API on port 8001
- [ ] Source/commission a Live2D model (nizima.com or booth.pm)
- [ ] Create hotkeys for expression states
- [ ] Write `skills/vtube_studio` SKILL.md + implementation
- [ ] Write `lipsync.py` amplitude-based helper
- [ ] Integrate lip sync into OpenClaw TTS dispatch
- [ ] Set up VTube Studio mobile (iPhone/iPad) on Tailnet
---
## Open Decisions
- [ ] Confirm character name (determines wake word training)
- [ ] Live2D model: purchase off-the-shelf or commission custom?
- [ ] mem0 backend: Chroma (simple) vs Qdrant Docker (better semantic search)?
- [ ] Snapcast output: ESP32 built-in speakers or dedicated audio hardware per room?
- [ ] Authelia user store: local file vs LDAP?


@@ -12,7 +12,7 @@ CONF_TIMEOUT = "timeout"
DEFAULT_HOST = "10.0.0.101"
DEFAULT_PORT = 8081 # OpenClaw HTTP Bridge (not 8080 gateway)
DEFAULT_AGENT = "main"
DEFAULT_TIMEOUT = 120
DEFAULT_TIMEOUT = 200 # Must exceed bridge cold timeout (180s)
# API endpoints
OPENCLAW_API_PATH = "/api/agent/message"


@@ -77,12 +77,16 @@ class OpenClawAgent(AbstractConversationAgent):
        _LOGGER.debug("Processing message: %s", text)
        try:
            response_text = await self._call_openclaw(text)
            response_text = await self._call_openclaw(
                text,
                satellite_id=getattr(user_input, "satellite_id", None),
                device_id=getattr(user_input, "device_id", None),
            )
            # Create proper IntentResponse for Home Assistant
            intent_response = IntentResponse(language=user_input.language or "en")
            intent_response.async_set_speech(response_text)
            return ConversationResult(
                response=intent_response,
                conversation_id=conversation_id,
@@ -96,13 +100,14 @@ class OpenClawAgent(AbstractConversationAgent):
                conversation_id=conversation_id,
            )
    async def _call_openclaw(self, message: str) -> str:
    async def _call_openclaw(self, message: str, satellite_id: str = None, device_id: str = None) -> str:
        """Call OpenClaw API and return the response."""
        url = f"http://{self.host}:{self.port}{OPENCLAW_API_PATH}"
        payload = {
            "message": message,
            "agent": self.agent_name,
            "satellite_id": satellite_id or device_id,
        }
        session = async_get_clientsession(self.hass)


@@ -35,6 +35,8 @@
<dict>
<key>PATH</key>
<string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin</string>
<key>ELEVENLABS_API_KEY</key>
<string>sk_ec10e261c6190307a37aa161a9583504dcf25a0cabe5dbd5</string>
</dict>
</dict>
</plist>


@@ -28,6 +28,8 @@
<string>eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJmZGQ1NzZlYWNkMTU0ZTY2ODY1OTkzYTlhNTIxM2FmNyIsImlhdCI6MTc3MjU4ODYyOCwiZXhwIjoyMDg3OTQ4NjI4fQ.CTAU1EZgpVLp_aRnk4vg6cQqwS5N-p8jQkAAXTxFmLY</string>
<key>HASS_TOKEN</key>
<string>eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJmZGQ1NzZlYWNkMTU0ZTY2ODY1OTkzYTlhNTIxM2FmNyIsImlhdCI6MTc3MjU4ODYyOCwiZXhwIjoyMDg3OTQ4NjI4fQ.CTAU1EZgpVLp_aRnk4vg6cQqwS5N-p8jQkAAXTxFmLY</string>
<key>GAZE_API_KEY</key>
<string>e63401f17e4845e1059f830267f839fe7fc7b6083b1cb1730863318754d799f4</string>
</dict>
<key>RunAtLoad</key>


@@ -24,9 +24,12 @@ Endpoints:
import argparse
import json
import os
import subprocess
import sys
import asyncio
import urllib.request
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler
from socketserver import ThreadingMixIn
from urllib.parse import urlparse
@@ -40,19 +43,222 @@ from wyoming.asr import Transcribe, Transcript
from wyoming.audio import AudioStart, AudioChunk, AudioStop
from wyoming.info import Info
# Timeout settings (seconds)
TIMEOUT_WARM = 120 # Model already loaded in VRAM
TIMEOUT_COLD = 180 # Model needs loading first (~10-20s load + inference)
OLLAMA_PS_URL = "http://localhost:11434/api/ps"
VTUBE_BRIDGE_URL = "http://localhost:8002"
def load_character_prompt() -> str:
    """Load the active character system prompt."""
    character_path = Path.home() / ".openclaw" / "characters" / "aria.json"
def _vtube_fire_and_forget(path: str, data: dict):
    """Send a non-blocking POST to the VTube Studio bridge. Failures are silent."""
    def _post():
        try:
            body = json.dumps(data).encode()
            req = urllib.request.Request(
                f"{VTUBE_BRIDGE_URL}{path}",
                data=body,
                headers={"Content-Type": "application/json"},
                method="POST",
            )
            urllib.request.urlopen(req, timeout=2)
        except Exception:
            pass  # bridge may not be running — that's fine
    threading.Thread(target=_post, daemon=True).start()
def is_model_warm() -> bool:
    """Check whether Ollama reports any model loaded in VRAM (it does not check which one)."""
    try:
        req = urllib.request.Request(OLLAMA_PS_URL)
        with urllib.request.urlopen(req, timeout=2) as resp:
            data = json.loads(resp.read())
            return len(data.get("models", [])) > 0
    except Exception:
        # If we can't reach Ollama, assume cold (safer longer timeout)
        return False
CHARACTERS_DIR = Path("/Users/aodhan/homeai-data/characters")
SATELLITE_MAP_PATH = Path("/Users/aodhan/homeai-data/satellite-map.json")
MEMORIES_DIR = Path("/Users/aodhan/homeai-data/memories")
ACTIVE_TTS_VOICE_PATH = Path("/Users/aodhan/homeai-data/active-tts-voice.json")
def clean_text_for_tts(text: str) -> str:
    """Strip content that shouldn't be spoken: tags, asterisks, emojis, markdown."""
    # Remove HTML/XML tag markup (text between tags is kept)
    text = re.sub(r'<[^>]+>', '', text)
    # Remove content between asterisks (actions/emphasis markup like *sighs*)
    text = re.sub(r'\*[^*]+\*', '', text)
    # Remove markdown bold/italic markers that might remain
    text = re.sub(r'[*_]{1,3}', '', text)
    # Remove markdown headers
    text = re.sub(r'^#{1,6}\s+', '', text, flags=re.MULTILINE)
    # Remove markdown links [text](url) → keep text
    text = re.sub(r'\[([^\]]+)\]\([^)]+\)', r'\1', text)
    # Remove bare URLs
    text = re.sub(r'https?://\S+', '', text)
    # Remove code blocks and inline code
    text = re.sub(r'```[\s\S]*?```', '', text)
    text = re.sub(r'`[^`]+`', '', text)
    # Remove emojis
    text = re.sub(
        r'[\U0001F600-\U0001F64F\U0001F300-\U0001F5FF\U0001F680-\U0001F6FF'
        r'\U0001F1E0-\U0001F1FF\U0001F900-\U0001F9FF\U0001FA00-\U0001FAFF'
        r'\U00002702-\U000027B0\U0000FE00-\U0000FE0F\U0000200D'
        r'\U00002600-\U000026FF\U00002300-\U000023FF]+', '', text
    )
    # Collapse multiple spaces/newlines
    text = re.sub(r'\n{2,}', '\n', text)
    text = re.sub(r'[ \t]{2,}', ' ', text)
    return text.strip()
def load_satellite_map() -> dict:
    """Load the satellite-to-character mapping."""
    try:
        with open(SATELLITE_MAP_PATH) as f:
            return json.load(f)
    except Exception:
        return {"default": "aria_default", "satellites": {}}
def set_active_tts_voice(character_id: str, tts_config: dict):
    """Write the active TTS config to a state file for the Wyoming TTS server to read."""
    try:
        ACTIVE_TTS_VOICE_PATH.parent.mkdir(parents=True, exist_ok=True)
        state = {
            "character_id": character_id,
            "engine": tts_config.get("engine", "kokoro"),
            "kokoro_voice": tts_config.get("kokoro_voice", ""),
            "elevenlabs_voice_id": tts_config.get("elevenlabs_voice_id", ""),
            "elevenlabs_model": tts_config.get("elevenlabs_model", "eleven_multilingual_v2"),
            "speed": tts_config.get("speed", 1),
        }
        with open(ACTIVE_TTS_VOICE_PATH, "w") as f:
            json.dump(state, f)
    except Exception as e:
        print(f"[OpenClaw Bridge] Warning: could not write active TTS config: {e}")
def resolve_character_id(satellite_id: str = None) -> str:
    """Resolve a satellite ID to a character profile ID."""
    sat_map = load_satellite_map()
    if satellite_id and satellite_id in sat_map.get("satellites", {}):
        return sat_map["satellites"][satellite_id]
    return sat_map.get("default", "aria_default")
def load_character(character_id: str = None) -> dict:
    """Load a character profile by ID. Returns the full character data dict."""
    if not character_id:
        character_id = resolve_character_id()
    safe_id = character_id.replace("/", "_")
    character_path = CHARACTERS_DIR / f"{safe_id}.json"
    if not character_path.exists():
        return ""
        return {}
    try:
        with open(character_path) as f:
            data = json.load(f)
        return data.get("system_prompt", "")
            profile = json.load(f)
        return profile.get("data", {})
    except Exception:
        return {}
def load_character_prompt(satellite_id: str = None, character_id: str = None) -> str:
    """Load the full system prompt for a character, resolved by satellite or explicit ID.
    Builds a rich prompt from system_prompt + profile fields (background, dialogue_style, etc.)."""
    if not character_id:
        character_id = resolve_character_id(satellite_id)
    char = load_character(character_id)
    if not char:
        return ""
    sections = []
    # Core system prompt
    prompt = char.get("system_prompt", "")
    if prompt:
        sections.append(prompt)
    # Character profile fields
    profile_parts = []
    if char.get("background"):
        profile_parts.append(f"## Background\n{char['background']}")
    if char.get("appearance"):
        profile_parts.append(f"## Appearance\n{char['appearance']}")
    if char.get("dialogue_style"):
        profile_parts.append(f"## Dialogue Style\n{char['dialogue_style']}")
    if char.get("skills"):
        skills = char["skills"]
        if isinstance(skills, list):
            skills_text = ", ".join(skills[:15])
        else:
            skills_text = str(skills)
        profile_parts.append(f"## Skills & Interests\n{skills_text}")
    if profile_parts:
        sections.append("[Character Profile]\n" + "\n\n".join(profile_parts))
    # Character metadata
    meta_lines = []
    if char.get("display_name"):
        meta_lines.append(f"Your name is: {char['display_name']}")
    # Support both v1 (gaze_preset string) and v2 (gaze_presets array)
    gaze_presets = char.get("gaze_presets", [])
    if gaze_presets and isinstance(gaze_presets, list):
        for gp in gaze_presets:
            preset = gp.get("preset", "")
            trigger = gp.get("trigger", "self-portrait")
            if preset:
                meta_lines.append(f"GAZE preset '{preset}' — use for: {trigger}")
    elif char.get("gaze_preset"):
        meta_lines.append(f"Your gaze_preset for self-portraits is: {char['gaze_preset']}")
    if meta_lines:
        sections.append("[Character Metadata]\n" + "\n".join(meta_lines))
    # Memories (personal + general)
    personal, general = load_memories(character_id)
    if personal:
        sections.append("[Personal Memories]\n" + "\n".join(f"- {m}" for m in personal))
    if general:
        sections.append("[General Knowledge]\n" + "\n".join(f"- {m}" for m in general))
    return "\n\n".join(sections)
def load_memories(character_id: str) -> tuple[list[str], list[str]]:
    """Load personal (per-character) and general memories.
    Returns (personal_contents, general_contents) truncated to fit context budget."""
    PERSONAL_BUDGET = 4000  # max chars for personal memories in prompt
    GENERAL_BUDGET = 3000  # max chars for general memories in prompt
    def _read_memories(path: Path, budget: int) -> list[str]:
        try:
            with open(path) as f:
                data = json.load(f)
        except Exception:
            return []
        memories = data.get("memories", [])
        # Sort newest first
        memories.sort(key=lambda m: m.get("createdAt", ""), reverse=True)
        result = []
        used = 0
        for m in memories:
            content = m.get("content", "").strip()
            if not content:
                continue
            if used + len(content) > budget:
                break
            result.append(content)
            used += len(content)
        return result
    safe_id = character_id.replace("/", "_")
    personal = _read_memories(MEMORIES_DIR / "personal" / f"{safe_id}.json", PERSONAL_BUDGET)
    general = _read_memories(MEMORIES_DIR / "general.json", GENERAL_BUDGET)
    return personal, general
class OpenClawBridgeHandler(BaseHTTPRequestHandler):
    """HTTP request handler for OpenClaw bridge."""
@@ -95,44 +301,78 @@ class OpenClawBridgeHandler(BaseHTTPRequestHandler):
self._send_json_response(404, {"error": "Not found"})
    def _handle_tts_request(self):
        """Handle TTS request and return wav audio."""
        """Handle TTS request and return audio. Routes to Kokoro or ElevenLabs based on engine."""
        content_length = int(self.headers.get("Content-Length", 0))
        if content_length == 0:
            self._send_json_response(400, {"error": "Empty body"})
            return
        try:
            body = self.rfile.read(content_length).decode()
            data = json.loads(body)
        except json.JSONDecodeError:
            self._send_json_response(400, {"error": "Invalid JSON"})
            return
        text = data.get("text", "Hello, this is a test.")
        # Strip emojis so TTS doesn't try to read them out
        text = re.sub(
            r'[\U0001F600-\U0001F64F\U0001F300-\U0001F5FF\U0001F680-\U0001F6FF'
            r'\U0001F1E0-\U0001F1FF\U0001F900-\U0001F9FF\U0001FA00-\U0001FAFF'
            r'\U00002702-\U000027B0\U0000FE00-\U0000FE0F\U0000200D'
            r'\U00002600-\U000026FF\U00002300-\U000023FF]+', '', text
        ).strip()
        text = clean_text_for_tts(text)
        voice = data.get("voice", "af_heart")
        engine = data.get("engine", "kokoro")
        try:
            # Run the async Wyoming client
            audio_bytes = asyncio.run(self._synthesize_audio(text, voice))
            # Send WAV response
            # Signal avatar: speaking
            _vtube_fire_and_forget("/expression", {"event": "speaking"})
            if engine == "elevenlabs":
                audio_bytes, content_type = self._synthesize_elevenlabs(text, voice, data.get("model"))
            else:
                # Default: local Kokoro via Wyoming
                audio_bytes = asyncio.run(self._synthesize_audio(text, voice))
                content_type = "audio/wav"
            # Signal avatar: idle
            _vtube_fire_and_forget("/expression", {"event": "idle"})
            self.send_response(200)
            self.send_header("Content-Type", "audio/wav")
            # Allow CORS for local testing from Vite
            self.send_header("Content-Type", content_type)
            self.send_header("Access-Control-Allow-Origin", "*")
            self.end_headers()
            self.wfile.write(audio_bytes)
        except Exception as e:
            _vtube_fire_and_forget("/expression", {"event": "error"})
            self._send_json_response(500, {"error": str(e)})
    def _synthesize_elevenlabs(self, text: str, voice_id: str, model: str = None) -> tuple[bytes, str]:
        """Call ElevenLabs TTS API and return (audio_bytes, content_type)."""
        api_key = os.environ.get("ELEVENLABS_API_KEY", "")
        if not api_key:
            raise RuntimeError("ELEVENLABS_API_KEY not set in environment")
        if not voice_id:
            raise RuntimeError("No ElevenLabs voice ID provided")
        model = model or "eleven_multilingual_v2"
        url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
        payload = json.dumps({
            "text": text,
            "model_id": model,
            "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
        }).encode()
        req = urllib.request.Request(
            url,
            data=payload,
            headers={
                "Content-Type": "application/json",
                "xi-api-key": api_key,
                "Accept": "audio/mpeg",
            },
            method="POST",
        )
        with urllib.request.urlopen(req, timeout=30) as resp:
            audio_bytes = resp.read()
        return audio_bytes, "audio/mpeg"
    def do_OPTIONS(self):
        """Handle CORS preflight requests."""
        self.send_response(204)
@@ -264,6 +504,43 @@ class OpenClawBridgeHandler(BaseHTTPRequestHandler):
        print(f"[OpenClaw Bridge] Wake word detected: {wake_word_data.get('wake_word', 'unknown')}")
        self._send_json_response(200, {"status": "ok", "message": "Wake word received"})
    @staticmethod
    def _call_openclaw(message: str, agent: str, timeout: int) -> str:
        """Call OpenClaw CLI and return stdout."""
        result = subprocess.run(
            ["/opt/homebrew/bin/openclaw", "agent", "--message", message, "--agent", agent],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,
        )
        return result.stdout.strip()
    @staticmethod
    def _needs_followup(response: str) -> bool:
        """Detect if the model promised to act but didn't actually do it.
        Returns True if the response looks like a 'will do' without a result."""
        if not response:
            return False
        resp_lower = response.lower()
        # If the response contains a URL or JSON-like output, it probably completed
        if "http://" in response or "https://" in response or '"status"' in response:
            return False
        # If it contains a tool result indicator (ha-ctl output, gaze-ctl output)
        if any(kw in resp_lower for kw in ["image_url", "seed", "entity_id", "state:", "turned on", "turned off"]):
            return False
        # Detect promise-like language without substance
        promise_phrases = [
            "let me", "i'll ", "i will ", "sure thing", "sure,", "right away",
            "generating", "one moment", "working on", "hang on", "just a moment",
            "on it", "let me generate", "let me create",
        ]
        has_promise = any(phrase in resp_lower for phrase in promise_phrases)
        # Short responses with promise language are likely incomplete
        if has_promise and len(response) < 200:
            return True
        return False
def _handle_agent_request(self):
"""Handle agent message request."""
content_length = int(self.headers.get("Content-Length", 0))
@@ -280,29 +557,63 @@ class OpenClawBridgeHandler(BaseHTTPRequestHandler):
message = data.get("message")
agent = data.get("agent", "main")
satellite_id = data.get("satellite_id")
explicit_character_id = data.get("character_id")
if not message:
self._send_json_response(400, {"error": "Message is required"})
return
# Inject system prompt
system_prompt = load_character_prompt()
# Resolve character: explicit ID > satellite mapping > default
if explicit_character_id:
character_id = explicit_character_id
else:
character_id = resolve_character_id(satellite_id)
system_prompt = load_character_prompt(character_id=character_id)
# Set the active TTS config for the Wyoming server to pick up
char = load_character(character_id)
tts_config = char.get("tts", {})
if tts_config:
set_active_tts_voice(character_id, tts_config)
engine = tts_config.get("engine", "kokoro")
voice_label = tts_config.get("kokoro_voice", "") if engine == "kokoro" else tts_config.get("elevenlabs_voice_id", "")
print(f"[OpenClaw Bridge] Active TTS: {engine} / {voice_label}")
if satellite_id:
print(f"[OpenClaw Bridge] Satellite: {satellite_id} → character: {character_id}")
elif explicit_character_id:
print(f"[OpenClaw Bridge] Character: {character_id}")
if system_prompt:
message = f"System Context: {system_prompt}\n\nUser Request: {message}"
# Check if model is warm to set appropriate timeout
warm = is_model_warm()
timeout = TIMEOUT_WARM if warm else TIMEOUT_COLD
print(f"[OpenClaw Bridge] Model {'warm' if warm else 'cold'}, timeout={timeout}s")
# Signal avatar: thinking
_vtube_fire_and_forget("/expression", {"event": "thinking"})
# Call OpenClaw CLI (use full path for launchd compatibility)
try:
result = subprocess.run(
["/opt/homebrew/bin/openclaw", "agent", "--message", message, "--agent", agent],
capture_output=True,
text=True,
timeout=120,
check=True
)
response_text = result.stdout.strip()
response_text = self._call_openclaw(message, agent, timeout)
# Re-prompt if the model promised to act but didn't call a tool.
# Detect "I'll do X" / "Let me X" responses that lack any result.
if self._needs_followup(response_text):
print("[OpenClaw Bridge] Response looks like a promise without action, re-prompting")
followup = (
"You just said you would do something but didn't actually call the exec tool. "
"Do NOT explain what you will do — call the tool NOW using exec and return the result."
)
response_text = self._call_openclaw(followup, agent, timeout)
# Signal avatar: idle (TTS handler will override to 'speaking' if voice is used)
_vtube_fire_and_forget("/expression", {"event": "idle"})
self._send_json_response(200, {"response": response_text})
except subprocess.TimeoutExpired:
self._send_json_response(504, {"error": "OpenClaw command timed out"})
self._send_json_response(504, {"error": f"OpenClaw command timed out after {timeout}s (model was {'warm' if warm else 'cold'})"})
except subprocess.CalledProcessError as e:
error_msg = e.stderr.strip() if e.stderr else "OpenClaw command failed"
self._send_json_response(500, {"error": error_msg})


@@ -24,6 +24,8 @@
<string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin</string>
<key>HOME</key>
<string>/Users/aodhan</string>
<key>GAZE_API_KEY</key>
<string>e63401f17e4845e1059f830267f839fe7fc7b6083b1cb1730863318754d799f4</string>
</dict>
<key>RunAtLoad</key>


@@ -1,15 +1,24 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "HomeAI Character Config",
"version": "1",
"version": "2",
"type": "object",
"required": ["schema_version", "name", "system_prompt", "tts"],
"properties": {
"schema_version": { "type": "integer", "const": 1 },
"schema_version": { "type": "integer", "enum": [1, 2] },
"name": { "type": "string" },
"display_name": { "type": "string" },
"description": { "type": "string" },
"background": { "type": "string", "description": "Backstory, lore, or general prompt enrichment" },
"dialogue_style": { "type": "string", "description": "How the persona speaks or reacts, with example lines" },
"appearance": { "type": "string", "description": "Physical description, also used for image prompting" },
"skills": {
"type": "array",
"description": "Topics the persona specialises in or enjoys talking about",
"items": { "type": "string" }
},
"system_prompt": { "type": "string" },
"model_overrides": {
@@ -31,35 +40,21 @@
"voice_ref_path": { "type": "string" },
"kokoro_voice": { "type": "string" },
"elevenlabs_voice_id": { "type": "string" },
"elevenlabs_voice_name": { "type": "string" },
"elevenlabs_model": { "type": "string", "default": "eleven_monolingual_v1" },
"speed": { "type": "number", "default": 1.0 }
}
},
"live2d_expressions": {
"type": "object",
"description": "Maps semantic state to VTube Studio hotkey ID",
"properties": {
"idle": { "type": "string" },
"listening": { "type": "string" },
"thinking": { "type": "string" },
"speaking": { "type": "string" },
"happy": { "type": "string" },
"sad": { "type": "string" },
"surprised": { "type": "string" },
"error": { "type": "string" }
}
},
"vtube_ws_triggers": {
"type": "object",
"description": "VTube Studio WebSocket actions keyed by event name",
"additionalProperties": {
"gaze_presets": {
"type": "array",
"description": "GAZE image generation presets with trigger conditions",
"items": {
"type": "object",
"required": ["preset"],
"properties": {
"type": { "type": "string", "enum": ["hotkey", "parameter"] },
"id": { "type": "string" },
"value": { "type": "number" }
"preset": { "type": "string" },
"trigger": { "type": "string", "default": "self-portrait" }
}
}
},
@@ -78,5 +73,6 @@
},
"notes": { "type": "string" }
}
}
},
"additionalProperties": true
}
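For orientation, a v2 character document that satisfies the `required` list in this schema might look like the sketch below. All values are illustrative. The `engine` key under `tts` is an assumption taken from how the bridge reads `tts_config`, since that part of the schema is outside the visible hunk. `missingRequired` is a hypothetical helper that checks only key presence; it is not a substitute for the Ajv validation the dashboard performs.

```javascript
// Required keys as listed in the schema's "required" array.
const REQUIRED_KEYS = ['schema_version', 'name', 'system_prompt', 'tts']

// Illustrative v2 character; every value here is made up for the example.
const exampleCharacter = {
  schema_version: 2,
  name: 'aria',
  display_name: 'Aria',
  background: 'Grew up in a lighthouse on the west coast.',
  dialogue_style: 'Warm and brief. Example: "Lights are on, anything else?"',
  appearance: 'Short silver hair, green coat.',
  skills: ['home automation', 'cooking'],
  system_prompt: 'You are Aria, a helpful home assistant.',
  // `engine` is assumed from the bridge code, not shown in this schema hunk.
  tts: { engine: 'kokoro', kokoro_voice: 'bm_lewis', speed: 1.0 },
  gaze_presets: [{ preset: 'studio-portrait', trigger: 'self-portrait' }],
}

// Hypothetical helper: returns the required keys that are absent.
function missingRequired(config) {
  return REQUIRED_KEYS.filter((key) => !(key in config))
}

console.log(missingRequired(exampleCharacter))
console.log(missingRequired({ name: 'aria' }))
```

Because the schema sets `additionalProperties: true`, extra keys on the top-level object would not be rejected.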


@@ -3,6 +3,7 @@ import Dashboard from './pages/Dashboard';
import Chat from './pages/Chat';
import Characters from './pages/Characters';
import Editor from './pages/Editor';
import Memories from './pages/Memories';
function NavItem({ to, children, icon }) {
return (
@@ -77,6 +78,17 @@ function Layout({ children }) {
Characters
</NavItem>
<NavItem
to="/memories"
icon={
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 18v-5.25m0 0a6.01 6.01 0 001.5-.189m-1.5.189a6.01 6.01 0 01-1.5-.189m3.75 7.478a12.06 12.06 0 01-4.5 0m3.75 2.383a14.406 14.406 0 01-3 0M14.25 18v-.192c0-.983.658-1.823 1.508-2.316a7.5 7.5 0 10-7.517 0c.85.493 1.509 1.333 1.509 2.316V18" />
</svg>
}
>
Memories
</NavItem>
<NavItem
to="/editor"
icon={
@@ -113,6 +125,7 @@ function App() {
<Route path="/" element={<div className="flex-1 overflow-y-auto p-8"><div className="max-w-6xl mx-auto"><Dashboard /></div></div>} />
<Route path="/chat" element={<Chat />} />
<Route path="/characters" element={<div className="flex-1 overflow-y-auto p-8"><div className="max-w-6xl mx-auto"><Characters /></div></div>} />
<Route path="/memories" element={<div className="flex-1 overflow-y-auto p-8"><div className="max-w-6xl mx-auto"><Memories /></div></div>} />
<Route path="/editor" element={<div className="flex-1 overflow-y-auto p-8"><div className="max-w-6xl mx-auto"><Editor /></div></div>} />
</Routes>
</Layout>


@@ -2,8 +2,10 @@ import { useEffect, useRef } from 'react'
import MessageBubble from './MessageBubble'
import ThinkingIndicator from './ThinkingIndicator'
export default function ChatPanel({ messages, isLoading, onReplay }) {
export default function ChatPanel({ messages, isLoading, onReplay, character }) {
const bottomRef = useRef(null)
const name = character?.name || 'AI'
const image = character?.image || null
useEffect(() => {
bottomRef.current?.scrollIntoView({ behavior: 'smooth' })
@@ -13,10 +15,14 @@ export default function ChatPanel({ messages, isLoading, onReplay }) {
return (
<div className="flex-1 flex items-center justify-center">
<div className="text-center">
<div className="w-16 h-16 rounded-full bg-indigo-600/20 flex items-center justify-center mx-auto mb-4">
<span className="text-indigo-400 text-2xl">AI</span>
</div>
<h2 className="text-xl font-medium text-gray-200 mb-2">Hi, I'm Aria</h2>
{image ? (
<img src={image} alt={name} className="w-20 h-20 rounded-full object-cover mx-auto mb-4 ring-2 ring-indigo-500/30" />
) : (
<div className="w-20 h-20 rounded-full bg-indigo-600/20 flex items-center justify-center mx-auto mb-4">
<span className="text-indigo-400 text-2xl">{name[0]}</span>
</div>
)}
<h2 className="text-xl font-medium text-gray-200 mb-2">Hi, I'm {name}</h2>
<p className="text-gray-500 text-sm">Type a message or press the mic to talk</p>
</div>
</div>
@@ -26,9 +32,9 @@ export default function ChatPanel({ messages, isLoading, onReplay }) {
return (
<div className="flex-1 overflow-y-auto py-4">
{messages.map((msg) => (
<MessageBubble key={msg.id} message={msg} onReplay={onReplay} />
<MessageBubble key={msg.id} message={msg} onReplay={onReplay} character={character} />
))}
{isLoading && <ThinkingIndicator />}
{isLoading && <ThinkingIndicator character={character} />}
<div ref={bottomRef} />
</div>
)


@@ -0,0 +1,70 @@
function timeAgo(dateStr) {
if (!dateStr) return ''
const diff = Date.now() - new Date(dateStr).getTime()
const mins = Math.floor(diff / 60000)
if (mins < 1) return 'just now'
if (mins < 60) return `${mins}m ago`
const hours = Math.floor(mins / 60)
if (hours < 24) return `${hours}h ago`
const days = Math.floor(hours / 24)
return `${days}d ago`
}
export default function ConversationList({ conversations, activeId, onCreate, onSelect, onDelete }) {
return (
<div className="w-72 border-r border-gray-800 flex flex-col bg-gray-950 shrink-0">
{/* New chat button */}
<div className="p-3 border-b border-gray-800">
<button
onClick={onCreate}
className="w-full flex items-center justify-center gap-2 px-3 py-2 bg-indigo-600 hover:bg-indigo-500 text-white text-sm rounded-lg transition-colors"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4.5v15m7.5-7.5h-15" />
</svg>
New chat
</button>
</div>
{/* Conversation list */}
<div className="flex-1 overflow-y-auto">
{conversations.length === 0 ? (
<p className="text-xs text-gray-600 text-center py-6">No conversations yet</p>
) : (
conversations.map(conv => (
<div
key={conv.id}
onClick={() => onSelect(conv.id)}
className={`group flex items-start gap-2 px-3 py-2.5 cursor-pointer border-b border-gray-800/50 transition-colors ${
conv.id === activeId
? 'bg-gray-800 text-white'
: 'text-gray-400 hover:bg-gray-800/50 hover:text-gray-200'
}`}
>
<div className="flex-1 min-w-0">
<p className="text-sm truncate">
{conv.title || 'New conversation'}
</p>
<div className="flex items-center gap-2 mt-0.5">
{conv.characterName && (
<span className="text-xs text-indigo-400/70">{conv.characterName}</span>
)}
<span className="text-xs text-gray-600">{timeAgo(conv.updatedAt)}</span>
</div>
</div>
<button
onClick={(e) => { e.stopPropagation(); onDelete(conv.id) }}
className="opacity-0 group-hover:opacity-100 p-1 text-gray-500 hover:text-red-400 transition-all shrink-0 mt-0.5"
title="Delete"
>
<svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M14.74 9l-.346 9m-4.788 0L9.26 9m9.968-3.21c.342.052.682.107 1.022.166m-1.022-.165L18.16 19.673a2.25 2.25 0 01-2.244 2.077H8.084a2.25 2.25 0 01-2.244-2.077L4.772 5.79m14.456 0a48.108 48.108 0 00-3.478-.397m-12 .562c.34-.059.68-.114 1.022-.165m0 0a48.11 48.11 0 013.478-.397m7.5 0v-.916c0-1.18-.91-2.164-2.09-2.201a51.964 51.964 0 00-3.32 0c-1.18.037-2.09 1.022-2.09 2.201v.916m7.5 0a48.667 48.667 0 00-7.5 0" />
</svg>
</button>
</div>
))
)}
</div>
</div>
)
}
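The relative-time buckets in `timeAgo` can be exercised in isolation. The sketch below uses a hypothetical variant, `timeAgoAt`, that takes the current time as a second argument. That parameter is an assumption added for determinism; the component itself reads `Date.now()`.

```javascript
// Hypothetical test-friendly variant of timeAgo: `nowMs` is injected
// instead of calling Date.now(), so results are deterministic.
function timeAgoAt(dateStr, nowMs) {
  if (!dateStr) return ''
  const diff = nowMs - new Date(dateStr).getTime()
  const mins = Math.floor(diff / 60000)
  if (mins < 1) return 'just now'
  if (mins < 60) return `${mins}m ago`
  const hours = Math.floor(mins / 60)
  if (hours < 24) return `${hours}h ago`
  return `${Math.floor(hours / 24)}d ago`
}

const referenceNow = Date.parse('2026-03-17T19:15:46Z')
console.log(timeAgoAt('2026-03-17T19:15:30Z', referenceNow)) // just now
console.log(timeAgoAt('2026-03-17T18:00:00Z', referenceNow)) // 1h ago
console.log(timeAgoAt('2026-03-10T19:15:46Z', referenceNow)) // 7d ago
```

Note that there is no upper bucket: a conversation last updated a year ago would render as `365d ago`.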


@@ -1,14 +1,100 @@
export default function MessageBubble({ message, onReplay }) {
import { useState } from 'react'
function Avatar({ character }) {
const name = character?.name || 'AI'
const image = character?.image || null
if (image) {
return <img src={image} alt={name} className="w-8 h-8 rounded-full object-cover shrink-0 mt-0.5 ring-1 ring-gray-700" />
}
return (
<div className="w-8 h-8 rounded-full bg-indigo-600/20 flex items-center justify-center shrink-0 mt-0.5">
<span className="text-indigo-400 text-sm">{name[0]}</span>
</div>
)
}
function ImageOverlay({ src, onClose }) {
return (
<div
className="fixed inset-0 z-50 bg-black/80 flex items-center justify-center cursor-zoom-out"
onClick={onClose}
>
<img
src={src}
alt="Full size"
className="max-w-[90vw] max-h-[90vh] object-contain rounded-lg shadow-2xl"
onClick={(e) => e.stopPropagation()}
/>
<button
onClick={onClose}
className="absolute top-4 right-4 text-white/70 hover:text-white transition-colors p-2"
>
<svg className="w-6 h-6" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
</div>
)
}
const IMAGE_URL_RE = /(https?:\/\/[^\s]+\.(?:png|jpg|jpeg|gif|webp))/gi
function RichContent({ text }) {
const [overlayImage, setOverlayImage] = useState(null)
const parts = []
let lastIndex = 0
let match
IMAGE_URL_RE.lastIndex = 0
while ((match = IMAGE_URL_RE.exec(text)) !== null) {
if (match.index > lastIndex) {
parts.push({ type: 'text', value: text.slice(lastIndex, match.index) })
}
parts.push({ type: 'image', value: match[1] })
lastIndex = IMAGE_URL_RE.lastIndex
}
if (lastIndex < text.length) {
parts.push({ type: 'text', value: text.slice(lastIndex) })
}
if (parts.length === 1 && parts[0].type === 'text') {
return <>{text}</>
}
return (
<>
{parts.map((part, i) =>
part.type === 'image' ? (
<button
key={i}
onClick={() => setOverlayImage(part.value)}
className="block my-2 cursor-zoom-in"
>
<img
src={part.value}
alt="Generated image"
className="rounded-xl max-w-full max-h-80 object-contain"
loading="lazy"
/>
</button>
) : (
<span key={i}>{part.value}</span>
)
)}
{overlayImage && <ImageOverlay src={overlayImage} onClose={() => setOverlayImage(null)} />}
</>
)
}
export default function MessageBubble({ message, onReplay, character }) {
const isUser = message.role === 'user'
return (
<div className={`flex ${isUser ? 'justify-end' : 'justify-start'} px-4 py-1.5`}>
<div className={`flex items-start gap-3 max-w-[80%] ${isUser ? 'flex-row-reverse' : ''}`}>
{!isUser && (
<div className="w-8 h-8 rounded-full bg-indigo-600/20 flex items-center justify-center shrink-0 mt-0.5">
<span className="text-indigo-400 text-sm">AI</span>
</div>
)}
{!isUser && <Avatar character={character} />}
<div>
<div
className={`rounded-2xl px-4 py-2.5 text-sm leading-relaxed whitespace-pre-wrap ${
@@ -19,7 +105,7 @@ export default function MessageBubble({ message, onReplay }) {
: 'bg-gray-800 text-gray-100'
}`}
>
{message.content}
{isUser ? message.content : <RichContent text={message.content} />}
</div>
{!isUser && !message.isError && onReplay && (
<button
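The splitting step inside `RichContent` can be looked at without React. The sketch below is a standalone approximation: `splitImageParts` and `IMAGE_URL_PATTERN` are hypothetical names, and it uses `String.prototype.matchAll` rather than the `exec` loop in the diff, so it does not depend on resetting `lastIndex` on a shared global regex.

```javascript
// Same pattern as IMAGE_URL_RE in the diff, minus the capture group.
const IMAGE_URL_PATTERN = /https?:\/\/[^\s]+\.(?:png|jpg|jpeg|gif|webp)/gi

// Split a message into alternating text and image parts.
function splitImageParts(text) {
  const parts = []
  let lastIndex = 0
  for (const match of text.matchAll(IMAGE_URL_PATTERN)) {
    if (match.index > lastIndex) {
      parts.push({ type: 'text', value: text.slice(lastIndex, match.index) })
    }
    parts.push({ type: 'image', value: match[0] })
    lastIndex = match.index + match[0].length
  }
  if (lastIndex < text.length) {
    parts.push({ type: 'text', value: text.slice(lastIndex) })
  }
  return parts
}

console.log(splitImageParts('Here you go: https://example.test/out/portrait.png Enjoy!'))
```

One consequence of the pattern worth keeping in mind: the match ends at the file extension, so a URL with a query string after `.png` would be split, with the query string left in the following text part.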


@@ -1,8 +1,10 @@
import { VOICES } from '../lib/constants'
import { VOICES, TTS_ENGINES } from '../lib/constants'
export default function SettingsDrawer({ isOpen, onClose, settings, onUpdate }) {
if (!isOpen) return null
const isKokoro = !settings.ttsEngine || settings.ttsEngine === 'kokoro'
return (
<>
<div className="fixed inset-0 bg-black/50 z-40" onClick={onClose} />
@@ -16,18 +18,48 @@ export default function SettingsDrawer({ isOpen, onClose, settings, onUpdate })
</button>
</div>
<div className="flex-1 overflow-y-auto p-4 space-y-5">
{/* TTS Engine */}
<div>
<label className="block text-xs font-medium text-gray-400 mb-1.5">TTS Engine</label>
<select
value={settings.ttsEngine || 'kokoro'}
onChange={(e) => onUpdate('ttsEngine', e.target.value)}
className="w-full bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
{TTS_ENGINES.map((e) => (
<option key={e.id} value={e.id}>{e.label}</option>
))}
</select>
</div>
{/* Voice */}
<div>
<label className="block text-xs font-medium text-gray-400 mb-1.5">Voice</label>
<select
value={settings.voice}
onChange={(e) => onUpdate('voice', e.target.value)}
className="w-full bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
{VOICES.map((v) => (
<option key={v.id} value={v.id}>{v.label}</option>
))}
</select>
{isKokoro ? (
<select
value={settings.voice}
onChange={(e) => onUpdate('voice', e.target.value)}
className="w-full bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
{VOICES.map((v) => (
<option key={v.id} value={v.id}>{v.label}</option>
))}
</select>
) : (
<div>
<input
type="text"
value={settings.voice || ''}
onChange={(e) => onUpdate('voice', e.target.value)}
className="w-full bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
placeholder={settings.ttsEngine === 'elevenlabs' ? 'ElevenLabs voice ID' : 'Voice identifier'}
readOnly
/>
<p className="text-xs text-gray-500 mt-1">
Set via active character profile
</p>
</div>
)}
</div>
{/* Auto TTS */}


@@ -1,9 +1,16 @@
export default function ThinkingIndicator() {
export default function ThinkingIndicator({ character }) {
const name = character?.name || 'AI'
const image = character?.image || null
return (
<div className="flex items-start gap-3 px-4 py-3">
<div className="w-8 h-8 rounded-full bg-indigo-600/20 flex items-center justify-center shrink-0">
<span className="text-indigo-400 text-sm">AI</span>
</div>
{image ? (
<img src={image} alt={name} className="w-8 h-8 rounded-full object-cover shrink-0 ring-1 ring-gray-700" />
) : (
<div className="w-8 h-8 rounded-full bg-indigo-600/20 flex items-center justify-center shrink-0">
<span className="text-indigo-400 text-sm">{name[0]}</span>
</div>
)}
<div className="flex items-center gap-1 pt-2.5">
<span className="w-2 h-2 rounded-full bg-gray-400 animate-[bounce_1.4s_ease-in-out_infinite]" />
<span className="w-2 h-2 rounded-full bg-gray-400 animate-[bounce_1.4s_ease-in-out_0.2s_infinite]" />


@@ -0,0 +1,28 @@
import { useState, useEffect } from 'react'
const ACTIVE_KEY = 'homeai_active_character'
export function useActiveCharacter() {
const [character, setCharacter] = useState(null)
useEffect(() => {
const activeId = localStorage.getItem(ACTIVE_KEY)
if (!activeId) return
fetch(`/api/characters/${activeId}`)
.then(r => r.ok ? r.json() : null)
.then(profile => {
if (profile) {
setCharacter({
id: profile.id,
name: profile.data.display_name || profile.data.name || 'AI',
image: profile.image || null,
tts: profile.data.tts || null,
})
}
})
.catch(() => {})
}, [])
return character
}
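The shape the hook hands to `ChatPanel`, `MessageBubble` and `ThinkingIndicator` is easier to see as a pure function. In the sketch below, `toCharacter` is a hypothetical name (the hook builds this object inline), and the sample profile is made up; only the field names come from the diff.

```javascript
// Pure version of the mapping done inside the hook's fetch callback.
function toCharacter(profile) {
  return {
    id: profile.id,
    // Fallback chain: display_name, then name, then the literal 'AI'.
    name: profile.data.display_name || profile.data.name || 'AI',
    image: profile.image || null,
    tts: profile.data.tts || null,
  }
}

// Illustrative API response; values are invented for the example.
const sampleProfile = { id: 'aria', data: { name: 'aria', display_name: 'Aria' } }
console.log(toCharacter(sampleProfile))
```

Since `name` always falls back to a non-empty string, the `name[0]` initial rendered by the avatar components is safe for profiles returned through this mapping.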


@@ -1,45 +1,124 @@
import { useState, useCallback } from 'react'
import { useState, useCallback, useEffect, useRef } from 'react'
import { sendMessage } from '../lib/api'
import { getConversation, saveConversation } from '../lib/conversationApi'
export function useChat() {
export function useChat(conversationId, conversationMeta, onConversationUpdate) {
const [messages, setMessages] = useState([])
const [isLoading, setIsLoading] = useState(false)
const [isLoadingConv, setIsLoadingConv] = useState(false)
const convRef = useRef(null)
const idRef = useRef(conversationId)
const send = useCallback(async (text) => {
// Keep idRef in sync
useEffect(() => { idRef.current = conversationId }, [conversationId])
// Load conversation from server when ID changes
useEffect(() => {
if (!conversationId) {
setMessages([])
convRef.current = null
return
}
let cancelled = false
setIsLoadingConv(true)
getConversation(conversationId).then(conv => {
if (cancelled) return
if (conv) {
convRef.current = conv
setMessages(conv.messages || [])
} else {
convRef.current = null
setMessages([])
}
setIsLoadingConv(false)
}).catch(() => {
if (!cancelled) {
convRef.current = null
setMessages([])
setIsLoadingConv(false)
}
})
return () => { cancelled = true }
}, [conversationId])
// Persist conversation to server
const persist = useCallback(async (updatedMessages, title, overrideId) => {
const id = overrideId || idRef.current
if (!id) return
const now = new Date().toISOString()
const conv = {
id,
title: title || convRef.current?.title || '',
characterId: conversationMeta?.characterId || convRef.current?.characterId || '',
characterName: conversationMeta?.characterName || convRef.current?.characterName || '',
createdAt: convRef.current?.createdAt || now,
updatedAt: now,
messages: updatedMessages,
}
convRef.current = conv
await saveConversation(conv).catch(() => {})
if (onConversationUpdate) {
onConversationUpdate(id, {
title: conv.title,
updatedAt: conv.updatedAt,
messageCount: conv.messages.length,
})
}
}, [conversationMeta, onConversationUpdate])
// send accepts an optional overrideId for when the conversation was just created
const send = useCallback(async (text, overrideId) => {
if (!text.trim() || isLoading) return null
const userMsg = { id: Date.now(), role: 'user', content: text.trim(), timestamp: new Date() }
setMessages((prev) => [...prev, userMsg])
const userMsg = { id: Date.now(), role: 'user', content: text.trim(), timestamp: new Date().toISOString() }
const isFirstMessage = messages.length === 0
const newMessages = [...messages, userMsg]
setMessages(newMessages)
setIsLoading(true)
try {
const response = await sendMessage(text.trim())
const response = await sendMessage(text.trim(), conversationMeta?.characterId || null)
const assistantMsg = {
id: Date.now() + 1,
role: 'assistant',
content: response,
timestamp: new Date(),
timestamp: new Date().toISOString(),
}
setMessages((prev) => [...prev, assistantMsg])
const allMessages = [...newMessages, assistantMsg]
setMessages(allMessages)
const title = isFirstMessage
? text.trim().slice(0, 80) + (text.trim().length > 80 ? '...' : '')
: undefined
await persist(allMessages, title, overrideId)
return response
} catch (err) {
const errorMsg = {
id: Date.now() + 1,
role: 'assistant',
content: `Error: ${err.message}`,
timestamp: new Date(),
timestamp: new Date().toISOString(),
isError: true,
}
setMessages((prev) => [...prev, errorMsg])
const allMessages = [...newMessages, errorMsg]
setMessages(allMessages)
await persist(allMessages, undefined, overrideId)
return null
} finally {
setIsLoading(false)
}
}, [isLoading])
}, [isLoading, messages, persist])
const clearHistory = useCallback(() => {
const clearHistory = useCallback(async () => {
setMessages([])
}, [])
if (idRef.current) {
await persist([], undefined)
}
}, [persist])
return { messages, isLoading, send, clearHistory }
return { messages, isLoading, isLoadingConv, send, clearHistory }
}
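The conversation title is derived from the first user message, capped at 80 characters. A minimal sketch of that rule follows, with `deriveTitle` as a hypothetical helper name; the hook computes the same expression inline inside `send`.

```javascript
// Title rule used for the first message: trim, keep at most maxLen
// characters, and append '...' only when something was cut off.
function deriveTitle(text, maxLen = 80) {
  const trimmed = text.trim()
  return trimmed.slice(0, maxLen) + (trimmed.length > maxLen ? '...' : '')
}

console.log(deriveTitle('  Turn on the kitchen lights  ')) // Turn on the kitchen lights
console.log(deriveTitle('x'.repeat(100)).length)           // 83
```

For later messages the hook passes `undefined` as the title, so `persist` keeps whatever title is already stored on the conversation.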


@@ -0,0 +1,66 @@
import { useState, useEffect, useCallback } from 'react'
import { listConversations, saveConversation, deleteConversation as deleteConv } from '../lib/conversationApi'
const ACTIVE_KEY = 'homeai_active_conversation'
export function useConversations() {
const [conversations, setConversations] = useState([])
const [activeId, setActiveId] = useState(() => localStorage.getItem(ACTIVE_KEY) || null)
const [isLoading, setIsLoading] = useState(true)
const loadList = useCallback(async () => {
try {
const list = await listConversations()
setConversations(list)
} catch {
setConversations([])
} finally {
setIsLoading(false)
}
}, [])
useEffect(() => { loadList() }, [loadList])
const select = useCallback((id) => {
setActiveId(id)
if (id) {
localStorage.setItem(ACTIVE_KEY, id)
} else {
localStorage.removeItem(ACTIVE_KEY)
}
}, [])
const create = useCallback(async (characterId, characterName) => {
const id = `conv_${Date.now()}`
const now = new Date().toISOString()
const conv = {
id,
title: '',
characterId: characterId || '',
characterName: characterName || '',
createdAt: now,
updatedAt: now,
messages: [],
}
await saveConversation(conv)
setConversations(prev => [{ ...conv, messageCount: 0 }, ...prev])
select(id)
return id
}, [select])
const remove = useCallback(async (id) => {
await deleteConv(id)
setConversations(prev => prev.filter(c => c.id !== id))
if (activeId === id) {
select(null)
}
}, [activeId, select])
const updateMeta = useCallback((id, updates) => {
setConversations(prev => prev.map(c =>
c.id === id ? { ...c, ...updates } : c
))
}, [])
return { conversations, activeId, isLoading, select, create, remove, updateMeta, refresh: loadList }
}


@@ -1,7 +1,7 @@
import { useState, useRef, useCallback } from 'react'
import { synthesize } from '../lib/api'
export function useTtsPlayback(voice) {
export function useTtsPlayback(voice, engine = 'kokoro', model = null) {
const [isPlaying, setIsPlaying] = useState(false)
const audioCtxRef = useRef(null)
const sourceRef = useRef(null)
@@ -23,7 +23,7 @@ export function useTtsPlayback(voice) {
setIsPlaying(true)
try {
const audioData = await synthesize(text, voice)
const audioData = await synthesize(text, voice, engine, model)
const ctx = getAudioContext()
if (ctx.state === 'suspended') await ctx.resume()
@@ -42,7 +42,7 @@ export function useTtsPlayback(voice) {
console.error('TTS playback error:', err)
setIsPlaying(false)
}
}, [voice])
}, [voice, engine, model])
const stop = useCallback(() => {
if (sourceRef.current) {


@@ -4,7 +4,43 @@ import schema from '../../schema/character.schema.json'
const ajv = new Ajv({ allErrors: true, strict: false })
const validate = ajv.compile(schema)
/**
* Migrate a v1 character config to v2 in-place.
* Removes live2d/vtube fields, converts gaze_preset to gaze_presets array,
* and initialises new persona fields.
*/
export function migrateV1toV2(config) {
config.schema_version = 2
// Remove deprecated fields
delete config.live2d_expressions
delete config.vtube_ws_triggers
// Convert single gaze_preset string → gaze_presets array
if ('gaze_preset' in config) {
const old = config.gaze_preset
config.gaze_presets = old ? [{ preset: old, trigger: 'self-portrait' }] : []
delete config.gaze_preset
}
if (!config.gaze_presets) {
config.gaze_presets = []
}
// Initialise new fields if absent
if (config.background === undefined) config.background = ''
if (config.dialogue_style === undefined) config.dialogue_style = ''
if (config.appearance === undefined) config.appearance = ''
if (config.skills === undefined) config.skills = []
return config
}
export function validateCharacter(config) {
// Auto-migrate v1 → v2
if (config.schema_version === 1 || config.schema_version === undefined) {
migrateV1toV2(config)
}
const valid = validate(config)
if (!valid) {
throw new Error(ajv.errorsText(validate.errors))


@@ -1,8 +1,30 @@
export async function sendMessage(text) {
const res = await fetch('/api/agent/message', {
const MAX_RETRIES = 3
const RETRY_DELAY_MS = 2000
async function fetchWithRetry(url, options, retries = MAX_RETRIES) {
for (let attempt = 1; attempt <= retries; attempt++) {
try {
const res = await fetch(url, options)
if (res.status === 502 && attempt < retries) {
// Bridge unreachable — wait and retry
await new Promise(r => setTimeout(r, RETRY_DELAY_MS * attempt))
continue
}
return res
} catch (err) {
if (attempt >= retries) throw err
await new Promise(r => setTimeout(r, RETRY_DELAY_MS * attempt))
}
}
}
export async function sendMessage(text, characterId = null) {
const payload = { message: text, agent: 'main' }
if (characterId) payload.character_id = characterId
const res = await fetchWithRetry('/api/agent/message', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ message: text, agent: 'main' }),
body: JSON.stringify(payload),
})
if (!res.ok) {
const err = await res.json().catch(() => ({ error: 'Request failed' }))
@@ -12,11 +34,13 @@ export async function sendMessage(text) {
return data.response
}
export async function synthesize(text, voice) {
export async function synthesize(text, voice, engine = 'kokoro', model = null) {
const payload = { text, voice, engine }
if (model) payload.model = model
const res = await fetch('/api/tts', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ text, voice }),
body: JSON.stringify(payload),
})
if (!res.ok) throw new Error('TTS failed')
return await res.arrayBuffer()
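The retry loop in `fetchWithRetry` uses a linear backoff: it waits `RETRY_DELAY_MS * attempt` after each failed attempt except the last. The sketch below computes that schedule; `retryDelays` is a hypothetical helper for illustration and is not part of the commit.

```javascript
// Waits performed before each retry: baseDelayMs * attempt for
// attempts 1 .. retries-1. The final attempt returns or throws
// without waiting, so it contributes no delay.
function retryDelays(retries = 3, baseDelayMs = 2000) {
  const delays = []
  for (let attempt = 1; attempt < retries; attempt++) {
    delays.push(baseDelayMs * attempt)
  }
  return delays
}

console.log(retryDelays()) // [ 2000, 4000 ]
```

With the defaults in the diff this adds at most 6 seconds of waiting on top of the three requests themselves. Only `sendMessage` goes through the retry wrapper; `synthesize` still calls `fetch` directly.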


@@ -30,7 +30,15 @@ export const VOICES = [
{ id: 'bm_lewis', label: 'Lewis (M, UK)' },
]
export const TTS_ENGINES = [
{ id: 'kokoro', label: 'Kokoro (local)' },
{ id: 'chatterbox', label: 'Chatterbox (voice clone)' },
{ id: 'qwen3', label: 'Qwen3 TTS' },
{ id: 'elevenlabs', label: 'ElevenLabs (cloud)' },
]
export const DEFAULT_SETTINGS = {
ttsEngine: 'kokoro',
voice: DEFAULT_VOICE,
autoTts: true,
sttMode: 'bridge',


@@ -0,0 +1,25 @@
export async function listConversations() {
const res = await fetch('/api/conversations')
if (!res.ok) throw new Error(`Failed to list conversations: ${res.status}`)
return res.json()
}
export async function getConversation(id) {
const res = await fetch(`/api/conversations/${encodeURIComponent(id)}`)
if (!res.ok) return null
return res.json()
}
export async function saveConversation(conversation) {
const res = await fetch('/api/conversations', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(conversation),
})
if (!res.ok) throw new Error(`Failed to save conversation: ${res.status}`)
}
export async function deleteConversation(id) {
const res = await fetch(`/api/conversations/${encodeURIComponent(id)}`, { method: 'DELETE' })
if (!res.ok) throw new Error(`Failed to delete conversation: ${res.status}`)
}


@@ -0,0 +1,45 @@
export async function getPersonalMemories(characterId) {
const res = await fetch(`/api/memories/personal/${encodeURIComponent(characterId)}`)
if (!res.ok) return { characterId, memories: [] }
return res.json()
}
export async function savePersonalMemory(characterId, memory) {
const res = await fetch(`/api/memories/personal/${encodeURIComponent(characterId)}`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(memory),
})
if (!res.ok) throw new Error(`Failed to save memory: ${res.status}`)
return res.json()
}
export async function deletePersonalMemory(characterId, memoryId) {
const res = await fetch(`/api/memories/personal/${encodeURIComponent(characterId)}/${encodeURIComponent(memoryId)}`, {
method: 'DELETE',
})
if (!res.ok) throw new Error(`Failed to delete memory: ${res.status}`)
}
export async function getGeneralMemories() {
const res = await fetch('/api/memories/general')
if (!res.ok) return { memories: [] }
return res.json()
}
export async function saveGeneralMemory(memory) {
const res = await fetch('/api/memories/general', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(memory),
})
if (!res.ok) throw new Error(`Failed to save memory: ${res.status}`)
return res.json()
}
export async function deleteGeneralMemory(memoryId) {
const res = await fetch(`/api/memories/general/${encodeURIComponent(memoryId)}`, {
method: 'DELETE',
})
if (!res.ok) throw new Error(`Failed to delete memory: ${res.status}`)
}


@@ -1,23 +1,9 @@
import { useState, useEffect } from 'react';
import { useState, useEffect, useCallback } from 'react';
import { useNavigate } from 'react-router-dom';
import { validateCharacter } from '../lib/SchemaValidator';
const STORAGE_KEY = 'homeai_characters';
const ACTIVE_KEY = 'homeai_active_character';
function loadProfiles() {
try {
const raw = localStorage.getItem(STORAGE_KEY);
return raw ? JSON.parse(raw) : [];
} catch {
return [];
}
}
function saveProfiles(profiles) {
localStorage.setItem(STORAGE_KEY, JSON.stringify(profiles));
}
function getActiveId() {
return localStorage.getItem(ACTIVE_KEY) || null;
}
@@ -27,15 +13,52 @@ function setActiveId(id) {
}
export default function Characters() {
const [profiles, setProfiles] = useState(loadProfiles);
const [profiles, setProfiles] = useState([]);
const [activeId, setActive] = useState(getActiveId);
const [error, setError] = useState(null);
const [dragOver, setDragOver] = useState(false);
const [loading, setLoading] = useState(true);
const [satMap, setSatMap] = useState({ default: '', satellites: {} });
const [newSatId, setNewSatId] = useState('');
const [newSatChar, setNewSatChar] = useState('');
const navigate = useNavigate();
// Load profiles and satellite map on mount
useEffect(() => {
saveProfiles(profiles);
}, [profiles]);
Promise.all([
fetch('/api/characters').then(r => r.json()),
fetch('/api/satellite-map').then(r => r.json()),
])
.then(([chars, map]) => {
setProfiles(chars);
setSatMap(map);
setLoading(false);
})
.catch(err => { setError(`Failed to load: ${err.message}`); setLoading(false); });
}, []);
const saveSatMap = useCallback(async (updated) => {
setSatMap(updated);
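const res = await fetch('/api/satellite-map', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(updated),
});
// State was updated optimistically above; surface a failed write instead of hiding it
if (!res.ok) setError(`Failed to save satellite map: ${res.status}`);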
}, []);
const saveProfile = useCallback(async (profile) => {
const res = await fetch('/api/characters', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(profile),
});
if (!res.ok) throw new Error('Failed to save profile');
}, []);
const deleteProfile = useCallback(async (id) => {
const safeId = id.replace(/[^a-zA-Z0-9_\-\.]/g, '_');
await fetch(`/api/characters/${safeId}`, { method: 'DELETE' });
}, []);
const handleImport = (e) => {
const files = Array.from(e.target?.files || []);
@@ -47,12 +70,14 @@ export default function Characters() {
files.forEach(file => {
if (!file.name.endsWith('.json')) return;
const reader = new FileReader();
reader.onload = (ev) => {
reader.onload = async (ev) => {
try {
const data = JSON.parse(ev.target.result);
validateCharacter(data);
const id = data.name + '_' + Date.now();
setProfiles(prev => [...prev, { id, data, image: null, addedAt: new Date().toISOString() }]);
const profile = { id, data, image: null, addedAt: new Date().toISOString() };
await saveProfile(profile);
setProfiles(prev => [...prev, profile]);
setError(null);
} catch (err) {
setError(`Import failed for ${file.name}: ${err.message}`);
@@ -73,15 +98,17 @@ export default function Characters() {
const file = e.target.files[0];
if (!file) return;
const reader = new FileReader();
reader.onload = (ev) => {
setProfiles(prev =>
prev.map(p => p.id === profileId ? { ...p, image: ev.target.result } : p)
);
reader.onload = async (ev) => {
const updated = profiles.map(p => p.id === profileId ? { ...p, image: ev.target.result } : p);
const profile = updated.find(p => p.id === profileId);
if (profile) await saveProfile(profile);
setProfiles(updated);
};
reader.readAsDataURL(file);
};
const removeProfile = (id) => {
const removeProfile = async (id) => {
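try {
await deleteProfile(id);
} catch (err) {
// Keep the card in the list if the server-side delete did not go through
setError(`Delete failed: ${err.message}`);
return;
}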
setProfiles(prev => prev.filter(p => p.id !== id));
if (activeId === id) {
setActive(null);
@@ -92,6 +119,28 @@ export default function Characters() {
const activateProfile = (id) => {
setActive(id);
setActiveId(id);
// Sync active character's TTS settings to chat settings
const profile = profiles.find(p => p.id === id);
if (profile?.data?.tts) {
const tts = profile.data.tts;
const engine = tts.engine || 'kokoro';
let voice;
if (engine === 'kokoro') voice = tts.kokoro_voice || 'af_heart';
else if (engine === 'elevenlabs') voice = tts.elevenlabs_voice_id || '';
else if (engine === 'chatterbox') voice = tts.voice_ref_path || '';
else voice = '';
try {
const raw = localStorage.getItem('homeai_dashboard_settings');
const settings = raw ? JSON.parse(raw) : {};
localStorage.setItem('homeai_dashboard_settings', JSON.stringify({
...settings,
ttsEngine: engine,
voice: voice,
}));
} catch { /* ignore */ }
}
};
const exportProfile = (profile) => {
@@ -125,13 +174,28 @@ export default function Characters() {
)}
</p>
</div>
<label className="flex items-center gap-2 px-4 py-2 bg-indigo-600 hover:bg-indigo-500 text-white rounded-lg cursor-pointer transition-colors">
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4.5v15m7.5-7.5h-15" />
</svg>
Import JSON
<input type="file" accept=".json" multiple className="hidden" onChange={handleImport} />
</label>
<div className="flex gap-3">
<button
onClick={() => {
sessionStorage.removeItem('edit_character');
sessionStorage.removeItem('edit_character_profile_id');
navigate('/editor');
}}
className="flex items-center gap-2 px-4 py-2 bg-indigo-600 hover:bg-indigo-500 text-white rounded-lg transition-colors"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4.5v15m7.5-7.5h-15" />
</svg>
New Character
</button>
<label className="flex items-center gap-2 px-4 py-2 bg-gray-800 hover:bg-gray-700 text-gray-300 rounded-lg cursor-pointer border border-gray-700 transition-colors">
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M3 16.5v2.25A2.25 2.25 0 005.25 21h13.5A2.25 2.25 0 0021 18.75V16.5m-13.5-9L12 3m0 0l4.5 4.5M12 3v13.5" />
</svg>
Import JSON
<input type="file" accept=".json" multiple className="hidden" onChange={handleImport} />
</label>
</div>
</div>
{error && (
@@ -158,7 +222,11 @@ export default function Characters() {
</div>
{/* Profile grid */}
{profiles.length === 0 ? (
{loading ? (
<div className="text-center py-16">
<p className="text-gray-500">Loading characters...</p>
</div>
) : profiles.length === 0 ? (
<div className="text-center py-16">
<svg className="w-16 h-16 mx-auto text-gray-700 mb-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1}>
<path strokeLinecap="round" strokeLinejoin="round" d="M15.75 6a3.75 3.75 0 11-7.5 0 3.75 3.75 0 017.5 0zM4.501 20.118a7.5 7.5 0 0114.998 0A17.933 17.933 0 0112 21.75c-2.676 0-5.216-.584-7.499-1.632z" />
@@ -230,11 +298,32 @@ export default function Characters() {
<span className="px-2 py-0.5 bg-gray-700/70 text-gray-400 text-xs rounded-full">
{char.model_overrides?.primary || 'default'}
</span>
{char.tts?.kokoro_voice && (
{char.tts?.engine === 'kokoro' && char.tts?.kokoro_voice && (
<span className="px-2 py-0.5 bg-gray-700/70 text-gray-400 text-xs rounded-full">
{char.tts.kokoro_voice}
</span>
)}
{char.tts?.engine === 'elevenlabs' && char.tts?.elevenlabs_voice_id && (
<span className="px-2 py-0.5 bg-gray-700/70 text-gray-400 text-xs rounded-full" title={char.tts.elevenlabs_voice_id}>
{char.tts.elevenlabs_voice_name || char.tts.elevenlabs_voice_id.slice(0, 8) + '…'}
</span>
)}
{char.tts?.engine === 'chatterbox' && char.tts?.voice_ref_path && (
<span className="px-2 py-0.5 bg-gray-700/70 text-gray-400 text-xs rounded-full" title={char.tts.voice_ref_path}>
{char.tts.voice_ref_path.split('/').pop()}
</span>
)}
{(() => {
const defaultPreset = char.gaze_presets?.find(gp => gp.trigger === 'self-portrait')?.preset
|| char.gaze_presets?.[0]?.preset
|| char.gaze_preset
|| null;
return defaultPreset ? (
<span className="px-2 py-0.5 bg-violet-500/20 text-violet-300 text-xs rounded-full border border-violet-500/30" title={`GAZE: ${defaultPreset}`}>
{defaultPreset}
</span>
) : null;
})()}
</div>
<div className="flex gap-2 pt-1">
@@ -287,6 +376,96 @@ export default function Characters() {
})}
</div>
)}
{/* Satellite Assignment */}
{!loading && profiles.length > 0 && (
<div className="bg-gray-900 border border-gray-800 rounded-xl p-5 space-y-4">
<div>
<h2 className="text-lg font-semibold text-gray-200">Satellite Routing</h2>
<p className="text-xs text-gray-500 mt-1">Assign characters to voice satellites. Unmapped satellites use the default.</p>
</div>
{/* Default character */}
<div className="flex items-center gap-3">
<label className="text-sm text-gray-400 w-32 shrink-0">Default</label>
<select
value={satMap.default || ''}
onChange={(e) => saveSatMap({ ...satMap, default: e.target.value })}
className="flex-1 bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
<option value="">-- None --</option>
{profiles.map(p => (
<option key={p.id} value={p.id}>{p.data.display_name || p.data.name}</option>
))}
</select>
</div>
{/* Per-satellite assignments */}
{Object.entries(satMap.satellites || {}).map(([satId, charId]) => (
<div key={satId} className="flex items-center gap-3">
<span className="text-sm text-gray-300 w-32 shrink-0 truncate font-mono" title={satId}>{satId}</span>
<select
value={charId}
onChange={(e) => {
const updated = { ...satMap, satellites: { ...satMap.satellites, [satId]: e.target.value } };
saveSatMap(updated);
}}
className="flex-1 bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
{profiles.map(p => (
<option key={p.id} value={p.id}>{p.data.display_name || p.data.name}</option>
))}
</select>
<button
onClick={() => {
const { [satId]: _, ...rest } = satMap.satellites;
saveSatMap({ ...satMap, satellites: rest });
}}
className="px-2 py-1.5 bg-gray-700 hover:bg-red-600 text-gray-400 hover:text-white rounded-lg transition-colors"
title="Remove"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
</div>
))}
{/* Add new satellite */}
<div className="flex items-center gap-3 pt-2 border-t border-gray-800">
<input
type="text"
value={newSatId}
onChange={(e) => setNewSatId(e.target.value)}
placeholder="Satellite ID (from bridge log)"
className="w-32 shrink-0 bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500 font-mono"
/>
<select
value={newSatChar}
onChange={(e) => setNewSatChar(e.target.value)}
className="flex-1 bg-gray-800 text-gray-200 text-sm rounded-lg px-3 py-2 border border-gray-700 focus:outline-none focus:border-indigo-500"
>
<option value="">-- Select Character --</option>
{profiles.map(p => (
<option key={p.id} value={p.id}>{p.data.display_name || p.data.name}</option>
))}
</select>
<button
onClick={() => {
if (newSatId && newSatChar) {
saveSatMap({ ...satMap, satellites: { ...satMap.satellites, [newSatId]: newSatChar } });
setNewSatId('');
setNewSatChar('');
}
}}
disabled={!newSatId || !newSatChar}
className="px-3 py-1.5 bg-indigo-600 hover:bg-indigo-500 disabled:bg-gray-700 disabled:text-gray-500 text-white text-sm rounded-lg transition-colors"
>
Add
</button>
</div>
</div>
)}
</div>
);
}
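
The GAZE badge on each character card picks its preset through a fallback chain. The sketch below pulls that chain out as a standalone function so the order is easier to see. `defaultGazePreset` is a hypothetical name that is not in this commit, and reading `gaze_preset` as a legacy single-preset field is an assumption based on the fallback order.

```javascript
// Hypothetical helper (not in the commit): mirrors the inline expression used
// by the card badge. Order of preference:
//   1. the first entry whose trigger is 'self-portrait'
//   2. the first entry in gaze_presets
//   3. the single gaze_preset field (assumed to be a legacy field)
//   4. null, in which case no badge is rendered
function defaultGazePreset(char) {
  return char.gaze_presets?.find(gp => gp.trigger === 'self-portrait')?.preset
    || char.gaze_presets?.[0]?.preset
    || char.gaze_preset
    || null;
}
```

Because the chain uses `||`, a `self-portrait` entry with an empty `preset` string falls through to the next option rather than producing an empty badge.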

@@ -1,115 +1,146 @@
import { useState, useEffect, useCallback } from 'react'
import { useState, useCallback } from 'react'
import ChatPanel from '../components/ChatPanel'
import InputBar from '../components/InputBar'
import StatusIndicator from '../components/StatusIndicator'
import SettingsDrawer from '../components/SettingsDrawer'
import ConversationList from '../components/ConversationList'
import { useSettings } from '../hooks/useSettings'
import { useBridgeHealth } from '../hooks/useBridgeHealth'
import { useChat } from '../hooks/useChat'
import { useTtsPlayback } from '../hooks/useTtsPlayback'
import { useVoiceInput } from '../hooks/useVoiceInput'
import { useActiveCharacter } from '../hooks/useActiveCharacter'
import { useConversations } from '../hooks/useConversations'
export default function Chat() {
const { settings, updateSetting } = useSettings()
const isOnline = useBridgeHealth()
const { messages, isLoading, send, clearHistory } = useChat()
const { isPlaying, speak, stop } = useTtsPlayback(settings.voice)
const character = useActiveCharacter()
const {
conversations, activeId, isLoading: isLoadingList,
select, create, remove, updateMeta,
} = useConversations()
const convMeta = {
characterId: character?.id || '',
characterName: character?.name || '',
}
const { messages, isLoading, isLoadingConv, send, clearHistory } = useChat(activeId, convMeta, updateMeta)
// Use character's TTS config if available, fall back to global settings
const ttsEngine = character?.tts?.engine || settings.ttsEngine
const ttsVoice = ttsEngine === 'elevenlabs'
? (character?.tts?.elevenlabs_voice_id || settings.voice)
: (character?.tts?.kokoro_voice || settings.voice)
const ttsModel = ttsEngine === 'elevenlabs' ? (character?.tts?.elevenlabs_model || null) : null
const { isPlaying, speak, stop } = useTtsPlayback(ttsVoice, ttsEngine, ttsModel)
const { isRecording, isTranscribing, startRecording, stopRecording } = useVoiceInput(settings.sttMode)
const [settingsOpen, setSettingsOpen] = useState(false)
// Send a message and optionally speak the response
const handleSend = useCallback(async (text) => {
const response = await send(text)
// Auto-create a conversation if none is active
let newId = null
if (!activeId) {
newId = await create(convMeta.characterId, convMeta.characterName)
}
const response = await send(text, newId)
if (response && settings.autoTts) {
speak(response)
}
}, [send, settings.autoTts, speak])
}, [activeId, create, convMeta.characterId, convMeta.characterName, send, settings.autoTts, speak])
// Toggle voice recording
const handleVoiceToggle = useCallback(async () => {
if (isRecording) {
const text = await stopRecording()
if (text) {
handleSend(text)
}
if (text) handleSend(text)
} else {
startRecording()
}
}, [isRecording, stopRecording, startRecording, handleSend])
// Space bar push-to-talk when input not focused
useEffect(() => {
const handleKeyDown = (e) => {
if (e.code === 'Space' && e.target.tagName !== 'TEXTAREA' && e.target.tagName !== 'INPUT') {
e.preventDefault()
handleVoiceToggle()
}
}
window.addEventListener('keydown', handleKeyDown)
return () => window.removeEventListener('keydown', handleKeyDown)
}, [handleVoiceToggle])
const handleNewChat = useCallback(() => {
create(convMeta.characterId, convMeta.characterName)
}, [create, convMeta.characterId, convMeta.characterName])
return (
<div className="flex-1 flex flex-col min-h-0">
{/* Status bar */}
<header className="flex items-center justify-between px-4 py-2 border-b border-gray-800/50 shrink-0">
<div className="flex items-center gap-2">
<StatusIndicator isOnline={isOnline} />
<span className="text-xs text-gray-500">
{isOnline === null ? 'Connecting...' : isOnline ? 'Connected' : 'Offline'}
</span>
</div>
<div className="flex items-center gap-2">
{messages.length > 0 && (
<button
onClick={clearHistory}
className="text-xs text-gray-500 hover:text-gray-300 transition-colors px-2 py-1"
title="Clear conversation"
>
Clear
</button>
)}
{isPlaying && (
<button
onClick={stop}
className="text-xs text-indigo-400 hover:text-indigo-300 transition-colors px-2 py-1"
title="Stop speaking"
>
Stop audio
</button>
)}
<button
onClick={() => setSettingsOpen(true)}
className="text-gray-500 hover:text-gray-300 transition-colors p-1"
title="Settings"
>
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M9.594 3.94c.09-.542.56-.94 1.11-.94h2.593c.55 0 1.02.398 1.11.94l.213 1.281c.063.374.313.686.645.87.074.04.147.083.22.127.325.196.72.257 1.075.124l1.217-.456a1.125 1.125 0 011.37.49l1.296 2.247a1.125 1.125 0 01-.26 1.431l-1.003.827c-.293.241-.438.613-.43.992a7.723 7.723 0 010 .255c-.008.378.137.75.43.991l1.004.827c.424.35.534.955.26 1.43l-1.298 2.247a1.125 1.125 0 01-1.369.491l-1.217-.456c-.355-.133-.75-.072-1.076.124a6.47 6.47 0 01-.22.128c-.331.183-.581.495-.644.869l-.213 1.281c-.09.543-.56.941-1.11.941h-2.594c-.55 0-1.019-.398-1.11-.94l-.213-1.281c-.062-.374-.312-.686-.644-.87a6.52 6.52 0 01-.22-.127c-.325-.196-.72-.257-1.076-.124l-1.217.456a1.125 1.125 0 01-1.369-.49l-1.297-2.247a1.125 1.125 0 01.26-1.431l1.004-.827c.292-.24.437-.613.43-.991a6.932 6.932 0 010-.255c.007-.38-.138-.751-.43-.992l-1.004-.827a1.125 1.125 0 01-.26-1.43l1.297-2.247a1.125 1.125 0 011.37-.491l1.216.456c.356.133.751.072 1.076-.124.072-.044.146-.086.22-.128.332-.183.582-.495.644-.869l.214-1.28z" />
<path strokeLinecap="round" strokeLinejoin="round" d="M15 12a3 3 0 11-6 0 3 3 0 016 0z" />
</svg>
</button>
</div>
</header>
<div className="flex-1 flex min-h-0">
{/* Conversation sidebar */}
<ConversationList
conversations={conversations}
activeId={activeId}
onCreate={handleNewChat}
onSelect={select}
onDelete={remove}
/>
{/* Chat area */}
<ChatPanel messages={messages} isLoading={isLoading} onReplay={speak} />
<div className="flex-1 flex flex-col min-h-0 min-w-0">
{/* Status bar */}
<header className="flex items-center justify-between px-4 py-2 border-b border-gray-800/50 shrink-0">
<div className="flex items-center gap-2">
<StatusIndicator isOnline={isOnline} />
<span className="text-xs text-gray-500">
{isOnline === null ? 'Connecting...' : isOnline ? 'Connected' : 'Offline'}
</span>
</div>
<div className="flex items-center gap-2">
{messages.length > 0 && (
<button
onClick={clearHistory}
className="text-xs text-gray-500 hover:text-gray-300 transition-colors px-2 py-1"
title="Clear conversation"
>
Clear
</button>
)}
{isPlaying && (
<button
onClick={stop}
className="text-xs text-indigo-400 hover:text-indigo-300 transition-colors px-2 py-1"
title="Stop speaking"
>
Stop audio
</button>
)}
<button
onClick={() => setSettingsOpen(true)}
className="text-gray-500 hover:text-gray-300 transition-colors p-1"
title="Settings"
>
<svg className="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M9.594 3.94c.09-.542.56-.94 1.11-.94h2.593c.55 0 1.02.398 1.11.94l.213 1.281c.063.374.313.686.645.87.074.04.147.083.22.127.325.196.72.257 1.075.124l1.217-.456a1.125 1.125 0 011.37.49l1.296 2.247a1.125 1.125 0 01-.26 1.431l-1.003.827c-.293.241-.438.613-.43.992a7.723 7.723 0 010 .255c-.008.378.137.75.43.991l1.004.827c.424.35.534.955.26 1.43l-1.298 2.247a1.125 1.125 0 01-1.369.491l-1.217-.456c-.355-.133-.75-.072-1.076.124a6.47 6.47 0 01-.22.128c-.331.183-.581.495-.644.869l-.213 1.281c-.09.543-.56.941-1.11.941h-2.594c-.55 0-1.019-.398-1.11-.94l-.213-1.281c-.062-.374-.312-.686-.644-.87a6.52 6.52 0 01-.22-.127c-.325-.196-.72-.257-1.076-.124l-1.217.456a1.125 1.125 0 01-1.369-.49l-1.297-2.247a1.125 1.125 0 01.26-1.431l1.004-.827c.292-.24.437-.613.43-.991a6.932 6.932 0 010-.255c.007-.38-.138-.751-.43-.992l-1.004-.827a1.125 1.125 0 01-.26-1.43l1.297-2.247a1.125 1.125 0 011.37-.491l1.216.456c.356.133.751.072 1.076-.124.072-.044.146-.086.22-.128.332-.183.582-.495.644-.869l.214-1.28z" />
<path strokeLinecap="round" strokeLinejoin="round" d="M15 12a3 3 0 11-6 0 3 3 0 016 0z" />
</svg>
</button>
</div>
</header>
{/* Input */}
<InputBar
onSend={handleSend}
onVoiceToggle={handleVoiceToggle}
isLoading={isLoading}
isRecording={isRecording}
isTranscribing={isTranscribing}
/>
{/* Messages */}
<ChatPanel
messages={messages}
isLoading={isLoading || isLoadingConv}
onReplay={speak}
character={character}
/>
{/* Settings drawer */}
<SettingsDrawer
isOpen={settingsOpen}
onClose={() => setSettingsOpen(false)}
settings={settings}
onUpdate={updateSetting}
/>
{/* Input */}
<InputBar
onSend={handleSend}
onVoiceToggle={handleVoiceToggle}
isLoading={isLoading}
isRecording={isRecording}
isTranscribing={isTranscribing}
/>
{/* Settings drawer */}
<SettingsDrawer
isOpen={settingsOpen}
onClose={() => setSettingsOpen(false)}
settings={settings}
onUpdate={updateSetting}
/>
</div>
</div>
)
}
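
The chat page resolves its TTS engine, voice, and model from the active character first and the global settings second. The sketch below restates that resolution as a pure function. `resolveChatTts` is a hypothetical name that is not in this commit; the field names are the ones used in the component above.

```javascript
// Hypothetical helper (not in the commit): same precedence as the inline
// expressions in Chat(). Character config wins; global settings are the fallback.
function resolveChatTts(character, settings) {
  const engine = character?.tts?.engine || settings.ttsEngine;
  const voice = engine === 'elevenlabs'
    ? (character?.tts?.elevenlabs_voice_id || settings.voice)
    : (character?.tts?.kokoro_voice || settings.voice);
  // Only ElevenLabs carries a model; every other engine passes null
  const model = engine === 'elevenlabs'
    ? (character?.tts?.elevenlabs_model || null)
    : null;
  return { engine, voice, model };
}
```

Written this way, it is easier to see that any engine other than `elevenlabs` (including `chatterbox`) takes the `kokoro_voice` branch, so a Chatterbox character would fall back to `settings.voice` unless it also sets `kokoro_voice`.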

@@ -1,14 +1,18 @@
import React, { useState, useEffect, useRef } from 'react';
import { validateCharacter } from '../lib/SchemaValidator';
import { validateCharacter, migrateV1toV2 } from '../lib/SchemaValidator';
const DEFAULT_CHARACTER = {
schema_version: 1,
name: "aria",
display_name: "Aria",
description: "Default HomeAI assistant persona",
system_prompt: "You are Aria, a warm, curious, and helpful AI assistant living in the home. You speak naturally and conversationally — never robotic. You are knowledgeable but never condescending. You remember the people you live with and build on those memories over time. Keep responses concise when controlling smart home devices; be more expressive in casual conversation. Never break character.",
schema_version: 2,
name: "",
display_name: "",
description: "",
background: "",
dialogue_style: "",
appearance: "",
skills: [],
system_prompt: "",
model_overrides: {
primary: "llama3.3:70b",
primary: "qwen3.5:35b-a3b",
fast: "qwen2.5:7b"
},
tts: {
@@ -16,24 +20,8 @@ const DEFAULT_CHARACTER = {
kokoro_voice: "af_heart",
speed: 1.0
},
live2d_expressions: {
idle: "expr_idle",
listening: "expr_listening",
thinking: "expr_thinking",
speaking: "expr_speaking",
happy: "expr_happy",
sad: "expr_sad",
surprised: "expr_surprised",
error: "expr_error"
},
vtube_ws_triggers: {
thinking: { type: "hotkey", id: "expr_thinking" },
speaking: { type: "hotkey", id: "expr_speaking" },
idle: { type: "hotkey", id: "expr_idle" }
},
custom_rules: [
{ trigger: "good morning", response: "Good morning! How did you sleep?", condition: "time_of_day == morning" }
],
gaze_presets: [],
custom_rules: [],
notes: ""
};
@@ -43,7 +31,12 @@ export default function Editor() {
if (editData) {
sessionStorage.removeItem('edit_character');
try {
return JSON.parse(editData);
const parsed = JSON.parse(editData);
// Auto-migrate v1 data
if (parsed.schema_version === 1 || !parsed.schema_version) {
migrateV1toV2(parsed);
}
return parsed;
} catch {
return DEFAULT_CHARACTER;
}
@@ -52,6 +45,7 @@ export default function Editor() {
});
const [error, setError] = useState(null);
const [saved, setSaved] = useState(false);
const isEditing = !!sessionStorage.getItem('edit_character_profile_id');
// TTS preview state
const [ttsState, setTtsState] = useState('idle');
@@ -65,6 +59,19 @@ export default function Editor() {
const [elevenLabsModels, setElevenLabsModels] = useState([]);
const [isLoadingElevenLabs, setIsLoadingElevenLabs] = useState(false);
// GAZE presets state (from API)
const [availableGazePresets, setAvailableGazePresets] = useState([]);
const [isLoadingGaze, setIsLoadingGaze] = useState(false);
// Character lookup state
const [lookupName, setLookupName] = useState('');
const [lookupFranchise, setLookupFranchise] = useState('');
const [isLookingUp, setIsLookingUp] = useState(false);
const [lookupDone, setLookupDone] = useState(false);
// Skills input state
const [newSkill, setNewSkill] = useState('');
const fetchElevenLabsData = async (key) => {
if (!key) return;
setIsLoadingElevenLabs(true);
@@ -95,6 +102,16 @@ export default function Editor() {
}
}, [character.tts.engine]);
// Fetch GAZE presets on mount
useEffect(() => {
setIsLoadingGaze(true);
fetch('/api/gaze/presets')
.then(r => r.ok ? r.json() : { presets: [] })
.then(data => setAvailableGazePresets(data.presets || []))
.catch(() => {})
.finally(() => setIsLoadingGaze(false));
}, []);
useEffect(() => {
return () => {
if (audioRef.current) { audioRef.current.pause(); audioRef.current = null; }
@@ -119,27 +136,35 @@ export default function Editor() {
}
};
const handleSaveToProfiles = () => {
const handleSaveToProfiles = async () => {
try {
validateCharacter(character);
setError(null);
const profileId = sessionStorage.getItem('edit_character_profile_id');
const storageKey = 'homeai_characters';
const raw = localStorage.getItem(storageKey);
let profiles = raw ? JSON.parse(raw) : [];
let profile;
if (profileId) {
profiles = profiles.map(p =>
p.id === profileId ? { ...p, data: character } : p
);
sessionStorage.removeItem('edit_character_profile_id');
const res = await fetch('/api/characters');
if (!res.ok) throw new Error(`Failed to load profiles: ${res.status}`);
const profiles = await res.json();
const existing = profiles.find(p => p.id === profileId);
profile = existing
? { ...existing, data: character }
: { id: profileId, data: character, image: null, addedAt: new Date().toISOString() };
// Keep the profile ID in sessionStorage so subsequent saves update the same file
} else {
const id = character.name + '_' + Date.now();
profiles.push({ id, data: character, image: null, addedAt: new Date().toISOString() });
profile = { id, data: character, image: null, addedAt: new Date().toISOString() };
// Store the new ID so subsequent saves update the same file
sessionStorage.setItem('edit_character_profile_id', profile.id);
}
localStorage.setItem(storageKey, JSON.stringify(profiles));
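const saveRes = await fetch('/api/characters', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(profile),
});
// Do not show the "saved" state when the server rejected the write
if (!saveRes.ok) throw new Error(`Failed to save profile: ${saveRes.status}`);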
setSaved(true);
setTimeout(() => setSaved(false), 2000);
} catch (err) {
@@ -164,6 +189,59 @@ export default function Editor() {
reader.readAsText(file);
};
// Character lookup from MCP
const handleCharacterLookup = async () => {
if (!lookupName || !lookupFranchise) return;
setIsLookingUp(true);
setError(null);
try {
const res = await fetch('/api/character-lookup', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ name: lookupName, franchise: lookupFranchise }),
});
if (!res.ok) {
const err = await res.json().catch(() => ({ error: 'Lookup failed' }));
throw new Error(err.error || `Lookup returned ${res.status}`);
}
const data = await res.json();
// Build dialogue_style from personality + notable quotes
let dialogueStyle = data.personality || '';
if (data.notable_quotes?.length) {
dialogueStyle += '\n\nExample dialogue:\n' + data.notable_quotes.map(q => `"${q}"`).join('\n');
}
// Filter abilities to clean text-only entries (skip image captions)
const skills = (data.abilities || [])
.filter(a => typeof a === 'string' && a.length > 20 && !a.includes('.jpg') && !a.includes('.png'))
.slice(0, 10);
// Auto-generate system prompt
const promptName = character.display_name || lookupName;
const personality = data.personality ? data.personality.split('.').slice(0, 3).join('.') + '.' : '';
const systemPrompt = `You are ${promptName} from ${lookupFranchise}. ${personality} Stay in character at all times. Respond naturally and conversationally.`;
setCharacter(prev => ({
...prev,
name: prev.name || lookupName.toLowerCase().replace(/\s+/g, '_'),
display_name: prev.display_name || lookupName,
description: data.description ? data.description.split('.').slice(0, 2).join('.') + '.' : prev.description,
background: data.background || prev.background,
appearance: data.appearance || prev.appearance,
dialogue_style: dialogueStyle || prev.dialogue_style,
skills: skills.length > 0 ? skills : prev.skills,
system_prompt: prev.system_prompt || systemPrompt,
}));
setLookupDone(true);
} catch (err) {
setError(`Character lookup failed: ${err.message}`);
} finally {
setIsLookingUp(false);
}
};
const handleChange = (field, value) => {
setCharacter(prev => ({ ...prev, [field]: value }));
};
@@ -175,6 +253,50 @@ export default function Editor() {
}));
};
// Skills helpers
const addSkill = () => {
const trimmed = newSkill.trim();
if (!trimmed) return;
setCharacter(prev => ({
...prev,
skills: [...(prev.skills || []), trimmed]
}));
setNewSkill('');
};
const removeSkill = (index) => {
setCharacter(prev => {
const updated = [...(prev.skills || [])];
updated.splice(index, 1);
return { ...prev, skills: updated };
});
};
// GAZE preset helpers
const addGazePreset = () => {
setCharacter(prev => ({
...prev,
gaze_presets: [...(prev.gaze_presets || []), { preset: '', trigger: 'self-portrait' }]
}));
};
const removeGazePreset = (index) => {
setCharacter(prev => {
const updated = [...(prev.gaze_presets || [])];
updated.splice(index, 1);
return { ...prev, gaze_presets: updated };
});
};
const handleGazePresetChange = (index, field, value) => {
setCharacter(prev => {
const updated = [...(prev.gaze_presets || [])];
updated[index] = { ...updated[index], [field]: value };
return { ...prev, gaze_presets: updated };
});
};
// Custom rules helpers
const handleRuleChange = (index, field, value) => {
setCharacter(prev => {
const newRules = [...(prev.custom_rules || [])];
@@ -198,37 +320,40 @@ export default function Editor() {
});
};
// TTS preview
const stopPreview = () => {
if (audioRef.current) {
audioRef.current.pause();
audioRef.current = null;
}
if (objectUrlRef.current) {
URL.revokeObjectURL(objectUrlRef.current);
objectUrlRef.current = null;
}
if (audioRef.current) { audioRef.current.pause(); audioRef.current = null; }
if (objectUrlRef.current) { URL.revokeObjectURL(objectUrlRef.current); objectUrlRef.current = null; }
window.speechSynthesis.cancel();
setTtsState('idle');
};
const previewTTS = async () => {
stopPreview();
const text = previewText || `Hi, I am ${character.display_name}. This is a preview of my voice.`;
const text = previewText || `Hi, I am ${character.display_name || character.name}. This is a preview of my voice.`;
const engine = character.tts.engine;
if (character.tts.engine === 'kokoro') {
let bridgeBody = null;
if (engine === 'kokoro') {
bridgeBody = { text, voice: character.tts.kokoro_voice, engine: 'kokoro' };
} else if (engine === 'elevenlabs' && character.tts.elevenlabs_voice_id) {
bridgeBody = { text, voice: character.tts.elevenlabs_voice_id, engine: 'elevenlabs', model: character.tts.elevenlabs_model };
}
if (bridgeBody) {
setTtsState('loading');
let blob;
try {
const response = await fetch('/api/tts', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ text, voice: character.tts.kokoro_voice })
body: JSON.stringify(bridgeBody)
});
if (!response.ok) throw new Error('TTS bridge returned ' + response.status);
blob = await response.blob();
} catch (err) {
setTtsState('idle');
setError(`Kokoro preview failed: ${err.message}. Falling back to browser TTS.`);
setError(`${engine} preview failed: ${err.message}. Falling back to browser TTS.`);
runBrowserTTS(text);
return;
}
@@ -269,7 +394,9 @@ export default function Editor() {
<div>
<h1 className="text-3xl font-bold text-gray-100">Character Editor</h1>
<p className="text-sm text-gray-500 mt-1">
Editing: {character.display_name || character.name}
{character.display_name || character.name
? `Editing: ${character.display_name || character.name}`
: 'New character'}
</p>
</div>
<div className="flex gap-3">
@@ -311,6 +438,64 @@ export default function Editor() {
{error && (
<div className="bg-red-900/30 border border-red-500/50 text-red-300 px-4 py-3 rounded-lg text-sm">
{error}
<button onClick={() => setError(null)} className="ml-2 text-red-400 hover:text-red-300">&times;</button>
</div>
)}
{/* Character Lookup — auto-fill from fictional character wiki */}
{!isEditing && (
<div className={cardClass}>
<div className="flex items-center gap-2">
<svg className="w-5 h-5 text-indigo-400" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M21 21l-5.197-5.197m0 0A7.5 7.5 0 105.196 5.196a7.5 7.5 0 0010.607 10.607z" />
</svg>
<h2 className="text-lg font-semibold text-gray-200">Auto-fill from Character</h2>
</div>
<p className="text-xs text-gray-500">Fetch character data from Fandom/Wikipedia to auto-populate fields. You can edit everything after.</p>
<div className="flex gap-3 items-end">
<div className="flex-1">
<label className={labelClass}>Character Name</label>
<input
type="text"
className={inputClass}
value={lookupName}
onChange={(e) => setLookupName(e.target.value)}
placeholder="e.g. Tifa Lockhart"
/>
</div>
<div className="flex-1">
<label className={labelClass}>Franchise / Series</label>
<input
type="text"
className={inputClass}
value={lookupFranchise}
onChange={(e) => setLookupFranchise(e.target.value)}
placeholder="e.g. Final Fantasy VII"
/>
</div>
<button
onClick={handleCharacterLookup}
disabled={isLookingUp || !lookupName || !lookupFranchise}
className={`flex items-center gap-2 px-5 py-2 rounded-lg text-white transition-colors whitespace-nowrap ${
isLookingUp
? 'bg-indigo-800 cursor-wait'
: lookupDone
? 'bg-emerald-600 hover:bg-emerald-500'
: 'bg-indigo-600 hover:bg-indigo-500 disabled:bg-gray-700 disabled:text-gray-500'
}`}
>
{isLookingUp && (
<svg className="w-4 h-4 animate-spin" viewBox="0 0 24 24" fill="none">
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z" />
</svg>
)}
{isLookingUp ? 'Fetching...' : lookupDone ? 'Fetched' : 'Lookup'}
</button>
</div>
{lookupDone && (
<p className="text-xs text-emerald-400">Fields populated from wiki data. Review and edit below.</p>
)}
</div>
)}
@@ -324,11 +509,11 @@ export default function Editor() {
</div>
<div>
<label className={labelClass}>Display Name</label>
<input type="text" className={inputClass} value={character.display_name} onChange={(e) => handleChange('display_name', e.target.value)} />
<input type="text" className={inputClass} value={character.display_name || ''} onChange={(e) => handleChange('display_name', e.target.value)} />
</div>
<div>
<label className={labelClass}>Description</label>
<input type="text" className={inputClass} value={character.description} onChange={(e) => handleChange('description', e.target.value)} />
<input type="text" className={inputClass} value={character.description || ''} onChange={(e) => handleChange('description', e.target.value)} />
</div>
</div>
@@ -359,7 +544,14 @@ export default function Editor() {
<div>
<label className={labelClass}>Voice ID</label>
{elevenLabsVoices.length > 0 ? (
<select className={selectClass} value={character.tts.elevenlabs_voice_id || ''} onChange={(e) => handleNestedChange('tts', 'elevenlabs_voice_id', e.target.value)}>
<select className={selectClass} value={character.tts.elevenlabs_voice_id || ''} onChange={(e) => {
const voiceId = e.target.value;
const voice = elevenLabsVoices.find(v => v.voice_id === voiceId);
setCharacter(prev => ({
...prev,
tts: { ...prev.tts, elevenlabs_voice_id: voiceId, elevenlabs_voice_name: voice?.name || '' }
}));
}}>
<option value="">-- Select Voice --</option>
{elevenLabsVoices.map(v => (
<option key={v.voice_id} value={v.voice_id}>{v.name} ({v.category})</option>
@@ -439,7 +631,7 @@ export default function Editor() {
className={inputClass}
value={previewText}
onChange={(e) => setPreviewText(e.target.value)}
placeholder={`Hi, I am ${character.display_name}. This is a preview of my voice.`}
placeholder={`Hi, I am ${character.display_name || character.name || 'your character'}. This is a preview of my voice.`}
/>
</div>
<div className="flex gap-2">
@@ -474,7 +666,9 @@ export default function Editor() {
<p className="text-xs text-gray-600">
{character.tts.engine === 'kokoro'
? 'Previews via local Kokoro TTS bridge (port 8081).'
: 'Uses browser TTS for preview. Local TTS available with Kokoro engine.'}
: character.tts.engine === 'elevenlabs'
? 'Previews via ElevenLabs through bridge.'
: 'Uses browser TTS for preview. Local TTS available with Kokoro engine.'}
</p>
</div>
</div>
@@ -483,25 +677,154 @@ export default function Editor() {
<div className={cardClass}>
<div className="flex justify-between items-center">
<h2 className="text-lg font-semibold text-gray-200">System Prompt</h2>
<span className="text-xs text-gray-600">{character.system_prompt.length} chars</span>
<span className="text-xs text-gray-600">{(character.system_prompt || '').length} chars</span>
</div>
<textarea
className={inputClass + " h-32 resize-y"}
value={character.system_prompt || ''}
onChange={(e) => handleChange('system_prompt', e.target.value)}
placeholder="You are [character name]. Describe their personality, behaviour, and role..."
/>
</div>
{/* Character Profile — new v2 fields */}
<div className={cardClass}>
<h2 className="text-lg font-semibold text-gray-200">Character Profile</h2>
<div>
<label className={labelClass}>Background / Backstory</label>
<textarea
className={inputClass + " h-28 resize-y text-sm"}
value={character.background || ''}
onChange={(e) => handleChange('background', e.target.value)}
placeholder="Character history, origins, key life events..."
/>
</div>
<div>
<label className={labelClass}>Appearance</label>
<textarea
className={inputClass + " h-24 resize-y text-sm"}
value={character.appearance || ''}
onChange={(e) => handleChange('appearance', e.target.value)}
placeholder="Physical description — also used for image generation prompts..."
/>
</div>
<div>
<label className={labelClass}>Dialogue Style & Examples</label>
<textarea
className={inputClass + " h-24 resize-y text-sm"}
value={character.dialogue_style || ''}
onChange={(e) => handleChange('dialogue_style', e.target.value)}
placeholder="How the persona speaks, their tone, mannerisms, and example lines..."
/>
</div>
<div>
<label className={labelClass}>Skills & Interests</label>
<div className="flex flex-wrap gap-2 mb-2">
{(character.skills || []).map((skill, idx) => (
<span
key={idx}
className="inline-flex items-center gap-1 px-3 py-1 bg-indigo-500/20 text-indigo-300 text-sm rounded-full border border-indigo-500/30"
>
{skill.length > 80 ? skill.slice(0, 80) + '...' : skill}
<button
onClick={() => removeSkill(idx)}
className="ml-1 text-indigo-400 hover:text-red-400 transition-colors"
>
<svg className="w-3 h-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={3}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
</span>
))}
</div>
<div className="flex gap-2">
<input
type="text"
className={inputClass + " text-sm"}
value={newSkill}
onChange={(e) => setNewSkill(e.target.value)}
onKeyDown={(e) => { if (e.key === 'Enter') { e.preventDefault(); addSkill(); } }}
placeholder="Add a skill or interest..."
/>
<button
onClick={addSkill}
disabled={!newSkill.trim()}
className="px-3 py-2 bg-indigo-600 hover:bg-indigo-500 disabled:bg-gray-700 disabled:text-gray-500 text-white text-sm rounded-lg transition-colors whitespace-nowrap"
>
Add
</button>
</div>
</div>
</div>
<div className="grid grid-cols-1 md:grid-cols-2 gap-6">
{/* Live2D Expressions */}
{/* Image Generation — GAZE presets */}
<div className={cardClass}>
<h2 className="text-lg font-semibold text-gray-200">Live2D Expressions</h2>
{Object.entries(character.live2d_expressions).map(([key, val]) => (
<div key={key} className="flex justify-between items-center gap-4">
<label className="text-sm font-medium text-gray-400 w-1/3 capitalize">{key}</label>
<input type="text" className={inputClass + " w-2/3"} value={val} onChange={(e) => handleNestedChange('live2d_expressions', key, e.target.value)} />
<div className="flex justify-between items-center">
<h2 className="text-lg font-semibold text-gray-200">GAZE Presets</h2>
<button onClick={addGazePreset} className="flex items-center gap-1 bg-indigo-600 hover:bg-indigo-500 text-white px-3 py-1.5 rounded-lg text-sm transition-colors">
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4.5v15m7.5-7.5h-15" />
</svg>
Add Preset
</button>
</div>
<p className="text-xs text-gray-500">Image generation presets with trigger conditions. Default trigger is "self-portrait".</p>
{(!character.gaze_presets || character.gaze_presets.length === 0) ? (
<p className="text-sm text-gray-600 italic">No GAZE presets configured.</p>
) : (
<div className="space-y-3">
{character.gaze_presets.map((gp, idx) => (
<div key={idx} className="flex items-center gap-2 border border-gray-700 p-3 rounded-lg bg-gray-800/50">
<div className="flex-1">
<label className="block text-xs text-gray-500 mb-1">Preset</label>
{isLoadingGaze ? (
<p className="text-sm text-gray-500">Loading...</p>
) : availableGazePresets.length > 0 ? (
<select
className={selectClass + " text-sm"}
value={gp.preset || ''}
onChange={(e) => handleGazePresetChange(idx, 'preset', e.target.value)}
>
<option value="">-- Select --</option>
{availableGazePresets.map(p => (
<option key={p.slug} value={p.slug}>{p.name} ({p.slug})</option>
))}
</select>
) : (
<input
type="text"
className={inputClass + " text-sm"}
value={gp.preset || ''}
onChange={(e) => handleGazePresetChange(idx, 'preset', e.target.value)}
placeholder="Preset slug"
/>
)}
</div>
<div className="flex-1">
<label className="block text-xs text-gray-500 mb-1">Trigger</label>
<input
type="text"
className={inputClass + " text-sm"}
value={gp.trigger || ''}
onChange={(e) => handleGazePresetChange(idx, 'trigger', e.target.value)}
placeholder="e.g. self-portrait, battle scene"
/>
</div>
<button
onClick={() => removeGazePreset(idx)}
className="mt-5 px-2 py-1.5 text-gray-500 hover:text-red-400 transition-colors"
title="Remove"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
</div>
))}
</div>
))}
)}
</div>
{/* Model Overrides */}
@@ -509,7 +832,7 @@ export default function Editor() {
<h2 className="text-lg font-semibold text-gray-200">Model Overrides</h2>
<div>
<label className={labelClass}>Primary Model</label>
<select className={selectClass} value={character.model_overrides?.primary || 'llama3.3:70b'} onChange={(e) => handleNestedChange('model_overrides', 'primary', e.target.value)}>
<select className={selectClass} value={character.model_overrides?.primary || 'qwen3.5:35b-a3b'} onChange={(e) => handleNestedChange('model_overrides', 'primary', e.target.value)}>
<option value="llama3.3:70b">llama3.3:70b</option>
<option value="qwen3.5:35b-a3b">qwen3.5:35b-a3b</option>
<option value="qwen2.5:7b">qwen2.5:7b</option>
@@ -576,6 +899,17 @@ export default function Editor() {
</div>
)}
</div>
{/* Notes */}
<div className={cardClass}>
<h2 className="text-lg font-semibold text-gray-200">Notes</h2>
<textarea
className={inputClass + " h-20 resize-y text-sm"}
value={character.notes || ''}
onChange={(e) => handleChange('notes', e.target.value)}
placeholder="Internal notes, reminders, or references..."
/>
</div>
</div>
);
}

@@ -0,0 +1,346 @@
import { useState, useEffect, useCallback } from 'react';
import {
getPersonalMemories, savePersonalMemory, deletePersonalMemory,
getGeneralMemories, saveGeneralMemory, deleteGeneralMemory,
} from '../lib/memoryApi';
const PERSONAL_CATEGORIES = [
{ value: 'personal_info', label: 'Personal Info', color: 'bg-blue-500/20 text-blue-300 border-blue-500/30' },
{ value: 'preference', label: 'Preference', color: 'bg-amber-500/20 text-amber-300 border-amber-500/30' },
{ value: 'interaction', label: 'Interaction', color: 'bg-emerald-500/20 text-emerald-300 border-emerald-500/30' },
{ value: 'emotional', label: 'Emotional', color: 'bg-pink-500/20 text-pink-300 border-pink-500/30' },
{ value: 'other', label: 'Other', color: 'bg-gray-500/20 text-gray-300 border-gray-500/30' },
];
const GENERAL_CATEGORIES = [
{ value: 'system', label: 'System', color: 'bg-indigo-500/20 text-indigo-300 border-indigo-500/30' },
{ value: 'tool_usage', label: 'Tool Usage', color: 'bg-cyan-500/20 text-cyan-300 border-cyan-500/30' },
{ value: 'home_layout', label: 'Home Layout', color: 'bg-emerald-500/20 text-emerald-300 border-emerald-500/30' },
{ value: 'device', label: 'Device', color: 'bg-amber-500/20 text-amber-300 border-amber-500/30' },
{ value: 'routine', label: 'Routine', color: 'bg-purple-500/20 text-purple-300 border-purple-500/30' },
{ value: 'other', label: 'Other', color: 'bg-gray-500/20 text-gray-300 border-gray-500/30' },
];
const ACTIVE_KEY = 'homeai_active_character';
function CategoryBadge({ category, categories }) {
const cat = categories.find(c => c.value === category) || categories[categories.length - 1];
return (
<span className={`px-2 py-0.5 text-xs rounded-full border ${cat.color}`}>
{cat.label}
</span>
);
}
function MemoryCard({ memory, categories, onEdit, onDelete }) {
return (
<div className="border border-gray-700 rounded-lg p-4 bg-gray-800/50 space-y-2">
<div className="flex items-start justify-between gap-3">
<p className="text-sm text-gray-200 flex-1 whitespace-pre-wrap">{memory.content}</p>
<div className="flex gap-1 shrink-0">
<button
onClick={() => onEdit(memory)}
className="p-1.5 text-gray-500 hover:text-gray-300 transition-colors"
title="Edit"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M16.862 4.487l1.687-1.688a1.875 1.875 0 112.652 2.652L10.582 16.07a4.5 4.5 0 01-1.897 1.13L6 18l.8-2.685a4.5 4.5 0 011.13-1.897l8.932-8.931z" />
</svg>
</button>
<button
onClick={() => onDelete(memory.id)}
className="p-1.5 text-gray-500 hover:text-red-400 transition-colors"
title="Delete"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M14.74 9l-.346 9m-4.788 0L9.26 9m9.968-3.21c.342.052.682.107 1.022.166m-1.022-.165L18.16 19.673a2.25 2.25 0 01-2.244 2.077H8.084a2.25 2.25 0 01-2.244-2.077L4.772 5.79m14.456 0a48.108 48.108 0 00-3.478-.397m-12 .562c.34-.059.68-.114 1.022-.165m0 0a48.11 48.11 0 013.478-.397m7.5 0v-.916c0-1.18-.91-2.164-2.09-2.201a51.964 51.964 0 00-3.32 0c-1.18.037-2.09 1.022-2.09 2.201v.916m7.5 0a48.667 48.667 0 00-7.5 0" />
</svg>
</button>
</div>
</div>
<div className="flex items-center gap-2">
<CategoryBadge category={memory.category} categories={categories} />
<span className="text-xs text-gray-600">
{memory.createdAt ? new Date(memory.createdAt).toLocaleDateString() : ''}
</span>
</div>
</div>
);
}
function MemoryForm({ categories, editing, onSave, onCancel }) {
const [content, setContent] = useState(editing?.content || '');
const [category, setCategory] = useState(editing?.category || categories[0].value);
const handleSubmit = () => {
if (!content.trim()) return;
const memory = {
...(editing?.id ? { id: editing.id } : {}),
content: content.trim(),
category,
};
onSave(memory);
setContent('');
setCategory(categories[0].value);
};
return (
<div className="border border-indigo-500/30 rounded-lg p-4 bg-indigo-500/5 space-y-3">
<textarea
className="w-full bg-gray-800 border border-gray-700 text-gray-200 p-2 rounded-lg text-sm h-20 resize-y focus:border-indigo-500 focus:ring-1 focus:ring-indigo-500 outline-none"
value={content}
onChange={(e) => setContent(e.target.value)}
placeholder="Enter memory content..."
autoFocus
/>
<div className="flex items-center gap-3">
<select
className="bg-gray-800 border border-gray-700 text-gray-200 text-sm p-2 rounded-lg focus:border-indigo-500 outline-none"
value={category}
onChange={(e) => setCategory(e.target.value)}
>
{categories.map(c => (
<option key={c.value} value={c.value}>{c.label}</option>
))}
</select>
<div className="flex gap-2 ml-auto">
<button
onClick={onCancel}
className="px-3 py-1.5 bg-gray-700 hover:bg-gray-600 text-gray-300 text-sm rounded-lg transition-colors"
>
Cancel
</button>
<button
onClick={handleSubmit}
disabled={!content.trim()}
className="px-3 py-1.5 bg-indigo-600 hover:bg-indigo-500 disabled:bg-gray-700 disabled:text-gray-500 text-white text-sm rounded-lg transition-colors"
>
{editing?.id ? 'Update' : 'Add Memory'}
</button>
</div>
</div>
</div>
);
}
export default function Memories() {
const [tab, setTab] = useState('personal'); // 'personal' | 'general'
const [characters, setCharacters] = useState([]);
const [selectedCharId, setSelectedCharId] = useState('');
const [memories, setMemories] = useState([]);
const [loading, setLoading] = useState(false);
const [showForm, setShowForm] = useState(false);
const [editing, setEditing] = useState(null);
const [error, setError] = useState(null);
const [filter, setFilter] = useState('');
// Load characters list
useEffect(() => {
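fetch('/api/characters')
.then(r => {
// Surface HTTP failures instead of trying to parse an error body as a list
if (!r.ok) throw new Error(`Failed to load characters (HTTP ${r.status})`);
return r.json();
})
.then(chars => {
setCharacters(chars);
const activeId = localStorage.getItem(ACTIVE_KEY);
if (activeId && chars.some(c => c.id === activeId)) {
setSelectedCharId(activeId);
} else if (chars.length > 0) {
setSelectedCharId(chars[0].id);
}
})
.catch(err => setError(err.message));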
}, []);
// Load memories when tab or selected character changes
const loadMemories = useCallback(async () => {
setLoading(true);
setError(null);
try {
if (tab === 'personal' && selectedCharId) {
const data = await getPersonalMemories(selectedCharId);
setMemories(data.memories || []);
} else if (tab === 'general') {
const data = await getGeneralMemories();
setMemories(data.memories || []);
} else {
setMemories([]);
}
} catch (err) {
setError(err.message);
} finally {
setLoading(false);
}
}, [tab, selectedCharId]);
useEffect(() => { loadMemories(); }, [loadMemories]);
const handleSave = async (memory) => {
try {
if (tab === 'personal') {
await savePersonalMemory(selectedCharId, memory);
} else {
await saveGeneralMemory(memory);
}
setShowForm(false);
setEditing(null);
await loadMemories();
} catch (err) {
setError(err.message);
}
};
const handleDelete = async (memoryId) => {
try {
if (tab === 'personal') {
await deletePersonalMemory(selectedCharId, memoryId);
} else {
await deleteGeneralMemory(memoryId);
}
await loadMemories();
} catch (err) {
setError(err.message);
}
};
const handleEdit = (memory) => {
setEditing(memory);
setShowForm(true);
};
const categories = tab === 'personal' ? PERSONAL_CATEGORIES : GENERAL_CATEGORIES;
const filteredMemories = filter
? memories.filter(m => m.content?.toLowerCase().includes(filter.toLowerCase()) || m.category === filter)
: memories;
// Sort newest first
const sortedMemories = [...filteredMemories].sort(
(a, b) => (b.createdAt || '').localeCompare(a.createdAt || '')
);
const selectedChar = characters.find(c => c.id === selectedCharId);
return (
<div className="space-y-6">
{/* Header */}
<div className="flex items-center justify-between">
<div>
<h1 className="text-3xl font-bold text-gray-100">Memories</h1>
<p className="text-sm text-gray-500 mt-1">
{sortedMemories.length} {tab} memor{sortedMemories.length !== 1 ? 'ies' : 'y'}
{tab === 'personal' && selectedChar && (
<span className="ml-1 text-indigo-400">
for {selectedChar.data?.display_name || selectedChar.data?.name || selectedCharId}
</span>
)}
</p>
</div>
<button
onClick={() => { setEditing(null); setShowForm(!showForm); }}
className="flex items-center gap-2 px-4 py-2 bg-indigo-600 hover:bg-indigo-500 text-white rounded-lg transition-colors"
>
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4.5v15m7.5-7.5h-15" />
</svg>
Add Memory
</button>
</div>
{error && (
<div className="bg-red-900/30 border border-red-500/50 text-red-300 px-4 py-3 rounded-lg text-sm">
{error}
<button onClick={() => setError(null)} className="ml-2 text-red-400 hover:text-red-300">&times;</button>
</div>
)}
{/* Tabs */}
<div className="flex gap-1 bg-gray-900 p-1 rounded-lg border border-gray-800 w-fit">
<button
onClick={() => { setTab('personal'); setShowForm(false); setEditing(null); }}
className={`px-4 py-2 text-sm font-medium rounded-md transition-colors ${
tab === 'personal'
? 'bg-gray-800 text-white'
: 'text-gray-400 hover:text-gray-200'
}`}
>
Personal
</button>
<button
onClick={() => { setTab('general'); setShowForm(false); setEditing(null); }}
className={`px-4 py-2 text-sm font-medium rounded-md transition-colors ${
tab === 'general'
? 'bg-gray-800 text-white'
: 'text-gray-400 hover:text-gray-200'
}`}
>
General
</button>
</div>
{/* Character selector (personal tab only) */}
{tab === 'personal' && (
<div className="flex items-center gap-3">
<label className="text-sm text-gray-400">Character</label>
<select
value={selectedCharId}
onChange={(e) => setSelectedCharId(e.target.value)}
className="bg-gray-800 border border-gray-700 text-gray-200 text-sm p-2 rounded-lg focus:border-indigo-500 outline-none"
>
{characters.map(c => (
<option key={c.id} value={c.id}>
{c.data?.display_name || c.data?.name || c.id}
</option>
))}
</select>
</div>
)}
{/* Search filter */}
<div>
<input
type="text"
className="w-full bg-gray-800 border border-gray-700 text-gray-200 p-2 rounded-lg text-sm focus:border-indigo-500 focus:ring-1 focus:ring-indigo-500 outline-none"
value={filter}
onChange={(e) => setFilter(e.target.value)}
placeholder="Search memories..."
/>
</div>
{/* Add/Edit form */}
{showForm && (
<MemoryForm
key={editing?.id || 'new'}
categories={categories}
editing={editing}
onSave={handleSave}
onCancel={() => { setShowForm(false); setEditing(null); }}
/>
)}
{/* Memory list */}
{loading ? (
<div className="text-center py-12">
<p className="text-gray-500">Loading memories...</p>
</div>
) : sortedMemories.length === 0 ? (
<div className="text-center py-12">
<svg className="w-12 h-12 mx-auto text-gray-700 mb-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 18v-5.25m0 0a6.01 6.01 0 001.5-.189m-1.5.189a6.01 6.01 0 01-1.5-.189m3.75 7.478a12.06 12.06 0 01-4.5 0m3.75 2.383a14.406 14.406 0 01-3 0M14.25 18v-.192c0-.983.658-1.823 1.508-2.316a7.5 7.5 0 10-7.517 0c.85.493 1.509 1.333 1.509 2.316V18" />
</svg>
<p className="text-gray-500 text-sm">
{filter ? 'No memories match your search.' : 'No memories yet. Add one to get started.'}
</p>
</div>
) : (
<div className="space-y-3">
{sortedMemories.map(memory => (
<MemoryCard
key={memory.id}
memory={memory}
categories={categories}
onEdit={handleEdit}
onDelete={handleDelete}
/>
))}
</div>
)}
</div>
);
}

@@ -2,6 +2,267 @@ import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
import tailwindcss from '@tailwindcss/vite'
const CHARACTERS_DIR = '/Users/aodhan/homeai-data/characters'
const SATELLITE_MAP_PATH = '/Users/aodhan/homeai-data/satellite-map.json'
const CONVERSATIONS_DIR = '/Users/aodhan/homeai-data/conversations'
const MEMORIES_DIR = '/Users/aodhan/homeai-data/memories'
const GAZE_HOST = 'http://10.0.0.101:5782'
const GAZE_API_KEY = process.env.GAZE_API_KEY || ''
function characterStoragePlugin() {
return {
name: 'character-storage',
configureServer(server) {
const ensureDir = async () => {
const { mkdir } = await import('fs/promises')
await mkdir(CHARACTERS_DIR, { recursive: true })
}
// /api/characters: list, fetch, save, and delete character profiles
server.middlewares.use('/api/characters', async (req, res, next) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET,POST,DELETE', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
const { readdir, readFile, writeFile, unlink } = await import('fs/promises')
await ensureDir()
// req.url has the mount prefix stripped by connect, so "/" means /api/characters
const url = new URL(req.url, 'http://localhost')
const subPath = url.pathname.replace(/^\/+/, '')
// GET /api/characters/:id — single profile
if (req.method === 'GET' && subPath) {
try {
const safeId = subPath.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
const raw = await readFile(`${CHARACTERS_DIR}/${safeId}.json`, 'utf-8')
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(raw)
} catch {
res.writeHead(404, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ error: 'Not found' }))
}
return
}
if (req.method === 'GET' && !subPath) {
try {
const files = (await readdir(CHARACTERS_DIR)).filter(f => f.endsWith('.json'))
const profiles = []
for (const file of files) {
try {
const raw = await readFile(`${CHARACTERS_DIR}/${file}`, 'utf-8')
profiles.push(JSON.parse(raw))
} catch { /* skip corrupt files */ }
}
// Sort by addedAt descending
profiles.sort((a, b) => (b.addedAt || '').localeCompare(a.addedAt || ''))
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify(profiles))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
if (req.method === 'POST' && !subPath) {
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const profile = JSON.parse(Buffer.concat(chunks).toString())
if (!profile.id) {
res.writeHead(400, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'Missing profile id' }))
return
}
// Sanitize filename — only allow alphanumeric, underscore, dash, dot
const safeId = profile.id.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
await writeFile(`${CHARACTERS_DIR}/${safeId}.json`, JSON.stringify(profile, null, 2))
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
if (req.method === 'DELETE' && subPath) {
try {
const safeId = subPath.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
await unlink(`${CHARACTERS_DIR}/${safeId}.json`).catch(() => {})
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
next()
})
},
}
}
function satelliteMapPlugin() {
return {
name: 'satellite-map',
configureServer(server) {
server.middlewares.use('/api/satellite-map', async (req, res, next) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET,POST', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
const { readFile, writeFile } = await import('fs/promises')
if (req.method === 'GET') {
try {
const raw = await readFile(SATELLITE_MAP_PATH, 'utf-8')
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(raw)
} catch {
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ default: 'aria_default', satellites: {} }))
}
return
}
if (req.method === 'POST') {
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const data = JSON.parse(Buffer.concat(chunks).toString())
await writeFile(SATELLITE_MAP_PATH, JSON.stringify(data, null, 2))
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
next()
})
},
}
}
function conversationStoragePlugin() {
return {
name: 'conversation-storage',
configureServer(server) {
const ensureDir = async () => {
const { mkdir } = await import('fs/promises')
await mkdir(CONVERSATIONS_DIR, { recursive: true })
}
server.middlewares.use('/api/conversations', async (req, res, next) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET,POST,DELETE', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
const { readdir, readFile, writeFile, unlink } = await import('fs/promises')
await ensureDir()
const url = new URL(req.url, 'http://localhost')
const subPath = url.pathname.replace(/^\/+/, '')
// GET /api/conversations/:id — single conversation with messages
if (req.method === 'GET' && subPath) {
try {
const safeId = subPath.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
const raw = await readFile(`${CONVERSATIONS_DIR}/${safeId}.json`, 'utf-8')
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(raw)
} catch {
res.writeHead(404, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ error: 'Not found' }))
}
return
}
// GET /api/conversations — list metadata (no messages)
if (req.method === 'GET' && !subPath) {
try {
const files = (await readdir(CONVERSATIONS_DIR)).filter(f => f.endsWith('.json'))
const list = []
for (const file of files) {
try {
const raw = await readFile(`${CONVERSATIONS_DIR}/${file}`, 'utf-8')
const conv = JSON.parse(raw)
list.push({
id: conv.id,
title: conv.title || '',
characterId: conv.characterId || '',
characterName: conv.characterName || '',
createdAt: conv.createdAt || '',
updatedAt: conv.updatedAt || '',
messageCount: (conv.messages || []).length,
})
} catch { /* skip corrupt files */ }
}
list.sort((a, b) => (b.updatedAt || '').localeCompare(a.updatedAt || ''))
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify(list))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
// POST /api/conversations — create or update
if (req.method === 'POST' && !subPath) {
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const conv = JSON.parse(Buffer.concat(chunks).toString())
if (!conv.id) {
res.writeHead(400, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'Missing conversation id' }))
return
}
const safeId = conv.id.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
await writeFile(`${CONVERSATIONS_DIR}/${safeId}.json`, JSON.stringify(conv, null, 2))
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
// DELETE /api/conversations/:id
if (req.method === 'DELETE' && subPath) {
try {
const safeId = subPath.replace(/[^a-zA-Z0-9_\-\.]/g, '_')
await unlink(`${CONVERSATIONS_DIR}/${safeId}.json`).catch(() => {})
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
next()
})
},
}
}
function healthCheckPlugin() {
return {
name: 'health-check-proxy',
@@ -121,6 +382,273 @@ function healthCheckPlugin() {
};
}
function gazeProxyPlugin() {
return {
name: 'gaze-proxy',
configureServer(server) {
server.middlewares.use('/api/gaze/presets', async (req, res) => {
if (!GAZE_API_KEY) {
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ presets: [] }))
return
}
try {
const http = await import('http')
const url = new URL(`${GAZE_HOST}/api/v1/presets`)
const proxyRes = await new Promise((resolve, reject) => {
const r = http.default.get(url, { headers: { 'X-API-Key': GAZE_API_KEY }, timeout: 5000 }, resolve)
r.on('error', reject)
r.on('timeout', () => { r.destroy(); reject(new Error('timeout')) })
})
const chunks = []
for await (const chunk of proxyRes) chunks.push(chunk)
res.writeHead(proxyRes.statusCode, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(Buffer.concat(chunks))
} catch {
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ presets: [] }))
}
})
},
}
}
function memoryStoragePlugin() {
return {
name: 'memory-storage',
configureServer(server) {
const ensureDirs = async () => {
const { mkdir } = await import('fs/promises')
await mkdir(`${MEMORIES_DIR}/personal`, { recursive: true })
}
const readJsonFile = async (path, fallback) => {
const { readFile } = await import('fs/promises')
try {
return JSON.parse(await readFile(path, 'utf-8'))
} catch {
return fallback
}
}
const writeJsonFile = async (path, data) => {
const { writeFile } = await import('fs/promises')
await writeFile(path, JSON.stringify(data, null, 2))
}
// Personal memories: /api/memories/personal/:characterId[/:memoryId]
server.middlewares.use('/api/memories/personal', async (req, res, next) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET,POST,DELETE', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
await ensureDirs()
const url = new URL(req.url, 'http://localhost')
const parts = url.pathname.replace(/^\/+/, '').split('/')
const characterId = parts[0] ? parts[0].replace(/[^a-zA-Z0-9_\-\.]/g, '_') : null
const memoryId = parts[1] || null
if (!characterId) {
res.writeHead(400, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'Missing character ID' }))
return
}
const filePath = `${MEMORIES_DIR}/personal/${characterId}.json`
if (req.method === 'GET') {
const data = await readJsonFile(filePath, { characterId, memories: [] })
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify(data))
return
}
if (req.method === 'POST') {
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const memory = JSON.parse(Buffer.concat(chunks).toString())
const data = await readJsonFile(filePath, { characterId, memories: [] })
if (memory.id) {
const idx = data.memories.findIndex(m => m.id === memory.id)
if (idx >= 0) {
data.memories[idx] = { ...data.memories[idx], ...memory }
} else {
data.memories.push(memory)
}
} else {
memory.id = 'm_' + Date.now()
memory.createdAt = memory.createdAt || new Date().toISOString()
data.memories.push(memory)
}
await writeJsonFile(filePath, data)
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true, memory }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
if (req.method === 'DELETE' && memoryId) {
try {
const data = await readJsonFile(filePath, { characterId, memories: [] })
data.memories = data.memories.filter(m => m.id !== memoryId)
await writeJsonFile(filePath, data)
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
next()
})
// General memories: /api/memories/general[/:memoryId]
server.middlewares.use('/api/memories/general', async (req, res, next) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET,POST,DELETE', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
await ensureDirs()
const url = new URL(req.url, 'http://localhost')
const memoryId = url.pathname.replace(/^\/+/, '') || null
const filePath = `${MEMORIES_DIR}/general.json`
if (req.method === 'GET') {
const data = await readJsonFile(filePath, { memories: [] })
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify(data))
return
}
if (req.method === 'POST') {
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const memory = JSON.parse(Buffer.concat(chunks).toString())
const data = await readJsonFile(filePath, { memories: [] })
if (memory.id) {
const idx = data.memories.findIndex(m => m.id === memory.id)
if (idx >= 0) {
data.memories[idx] = { ...data.memories[idx], ...memory }
} else {
data.memories.push(memory)
}
} else {
memory.id = 'm_' + Date.now()
memory.createdAt = memory.createdAt || new Date().toISOString()
data.memories.push(memory)
}
await writeJsonFile(filePath, data)
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true, memory }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
if (req.method === 'DELETE' && memoryId) {
try {
const data = await readJsonFile(filePath, { memories: [] })
data.memories = data.memories.filter(m => m.id !== memoryId)
await writeJsonFile(filePath, data)
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ ok: true }))
} catch (err) {
res.writeHead(500, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: err.message }))
}
return
}
next()
})
},
}
}
function characterLookupPlugin() {
return {
name: 'character-lookup',
configureServer(server) {
server.middlewares.use('/api/character-lookup', async (req, res) => {
if (req.method === 'OPTIONS') {
res.writeHead(204, { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'POST', 'Access-Control-Allow-Headers': 'Content-Type' })
res.end()
return
}
if (req.method !== 'POST') {
res.writeHead(405, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'POST only' }))
return
}
try {
const chunks = []
for await (const chunk of req) chunks.push(chunk)
const { name, franchise } = JSON.parse(Buffer.concat(chunks).toString())
if (!name || !franchise) {
res.writeHead(400, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'Missing name or franchise' }))
return
}
const { execFile } = await import('child_process')
const { promisify } = await import('util')
const execFileAsync = promisify(execFile)
// Call the MCP fetcher inside the running Docker container.
// name/franchise are passed as argv instead of being interpolated into the
// script, so quotes or backslashes in user input cannot alter the Python code.
const pyScript = `
import asyncio, json, sys
from character_details.fetcher import fetch_character
c = asyncio.run(fetch_character(sys.argv[1], sys.argv[2]))
print(json.dumps(c.model_dump(), default=str))
`.trim()
const { stdout } = await execFileAsync(
'docker',
['exec', 'character-browser-character-mcp-1', 'python', '-c', pyScript, String(name), String(franchise)],
{ timeout: 30000 }
)
const data = JSON.parse(stdout.trim())
res.writeHead(200, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({
name: data.name || name,
franchise: data.franchise || franchise,
description: data.description || '',
background: data.background || '',
appearance: data.appearance || '',
personality: data.personality || '',
abilities: data.abilities || [],
notable_quotes: data.notable_quotes || [],
relationships: data.relationships || [],
sources: data.sources || [],
}))
} catch (err) {
console.error('[character-lookup] failed:', err?.message || err)
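// execFile signals a timeout by killing the child (err.killed); the message alone may not mention it
const status = (err?.killed || err?.message?.includes('timeout')) ? 504 : 500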
res.writeHead(status, { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' })
res.end(JSON.stringify({ error: err?.message || 'Lookup failed' }))
}
})
},
}
}
function bridgeProxyPlugin() {
return {
name: 'bridge-proxy',
@@ -172,10 +700,11 @@ function bridgeProxyPlugin() {
proxyReq.write(body)
proxyReq.end()
})
} catch {
} catch (err) {
console.error(`[bridge-proxy] ${targetPath} failed:`, err?.message || err)
if (!res.headersSent) {
res.writeHead(502, { 'Content-Type': 'application/json' })
res.end(JSON.stringify({ error: 'Bridge unreachable' }))
res.end(JSON.stringify({ error: `Bridge unreachable: ${err?.message || 'unknown'}` }))
}
}
}
@@ -189,6 +718,12 @@ function bridgeProxyPlugin() {
export default defineConfig({
plugins: [
characterStoragePlugin(),
satelliteMapPlugin(),
conversationStoragePlugin(),
memoryStoragePlugin(),
gazeProxyPlugin(),
characterLookupPlugin(),
healthCheckPlugin(),
bridgeProxyPlugin(),
tailwindcss(),

homeai-images/API_GUIDE.md Normal file

@@ -0,0 +1,219 @@
# GAZE REST API Guide
## Setup
1. Open **Settings** in the GAZE web UI
2. Scroll to **REST API Key** and click **Regenerate**
3. Copy the key — you'll need it for all API requests
## Authentication
Every request must include your API key via one of:
- **Header (recommended):** `X-API-Key: <your-key>`
- **Query parameter:** `?api_key=<your-key>`
Responses for auth failures:
| Status | Meaning |
|--------|---------|
| `401` | Missing API key |
| `403` | Invalid API key |
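
A client can turn these two status codes into actionable hints. The sketch below is illustrative only: `auth_headers` and `describe_auth_failure` are hypothetical helper names that are not part of GAZE, and the hint texts are our own wording.

```python
from typing import Optional


# Hypothetical client-side helpers; the names are illustrative, not part of GAZE.
def auth_headers(api_key: str) -> dict:
    """Build the recommended header-based authentication."""
    if not api_key:
        raise ValueError("API key is empty; regenerate one in Settings")
    return {"X-API-Key": api_key}


def describe_auth_failure(status_code: int) -> Optional[str]:
    """Map the documented auth status codes to a hint, or None if unrelated."""
    hints = {
        401: "Missing API key: send the X-API-Key header or ?api_key=",
        403: "Invalid API key: check for typos or regenerate the key",
    }
    return hints.get(status_code)
```

Calling `describe_auth_failure(response.status_code)` right after a request lets a script print the hint and stop early instead of failing later on a missing field.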
## Endpoints
### List Presets
```
GET /api/v1/presets
```
Returns all available presets.
**Response:**
```json
{
"presets": [
{
"preset_id": "example_01",
"slug": "example_01",
"name": "Example Preset",
"has_cover": true
}
]
}
```
### Generate Image
```
POST /api/v1/generate/<preset_slug>
```
Queue one or more image generations using a preset's configuration. All body parameters are optional — when omitted, the preset's own settings are used.
**Request body (JSON):**
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `count` | int | `1` | Number of images to generate (1–20) |
| `checkpoint` | string | — | Override checkpoint path (e.g. `"Illustrious/model.safetensors"`) |
| `extra_positive` | string | `""` | Additional positive prompt tags appended to the generated prompt |
| `extra_negative` | string | `""` | Additional negative prompt tags |
| `seed` | int | random | Fixed seed for reproducible generation |
| `width` | int | — | Output width in pixels (must provide both width and height) |
| `height` | int | — | Output height in pixels (must provide both width and height) |
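
The constraints in this table can be checked on the client before a request is sent. This is a sketch under assumptions, not GAZE's own validation: `build_generate_body` is a hypothetical helper, it only checks the lower bound of `count` and the width/height pairing, and the server remains the authority on what it accepts.

```python
from typing import Any, Optional


def build_generate_body(
    count: int = 1,
    checkpoint: Optional[str] = None,
    extra_positive: str = "",
    extra_negative: str = "",
    seed: Optional[int] = None,
    width: Optional[int] = None,
    height: Optional[int] = None,
) -> dict:
    """Assemble a JSON body, omitting fields that should fall back to the preset."""
    if count < 1:
        raise ValueError("count must be at least 1")
    # Width and height are only meaningful as a pair.
    if (width is None) != (height is None):
        raise ValueError("provide both width and height, or neither")

    body: dict = {"count": count}
    optional_fields: dict = {
        "checkpoint": checkpoint,
        "extra_positive": extra_positive or None,
        "extra_negative": extra_negative or None,
        "seed": seed,
        "width": width,
        "height": height,
    }
    # Drop unset fields so the preset's own settings apply.
    body.update({k: v for k, v in optional_fields.items() if v is not None})
    return body
```

Leaving unset fields out of the body, rather than sending empty values, keeps the request aligned with the rule that omitted parameters use the preset's settings.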
**Response (202):**
```json
{
"jobs": [
{ "job_id": "783f0268-ba85-4426-8ca2-6393c844c887", "status": "queued" }
]
}
```
**Errors:**
| Status | Cause |
|--------|-------|
| `400` | Invalid parameters (bad count, seed, or mismatched width/height) |
| `404` | Preset slug not found |
| `500` | Internal generation error |
### Check Job Status
```
GET /api/v1/job/<job_id>
```
Poll this endpoint to track generation progress.
**Response:**
```json
{
"id": "783f0268-ba85-4426-8ca2-6393c844c887",
"label": "Preset: Example Preset preview",
"status": "done",
"error": null,
"result": {
"image_url": "/static/uploads/presets/example_01/gen_1773601346.png",
"relative_path": "presets/example_01/gen_1773601346.png",
"seed": 927640517599332
}
}
```
**Job statuses:**
| Status | Meaning |
|--------|---------|
| `pending` | Waiting in queue |
| `processing` | Currently generating |
| `done` | Complete — `result` contains image info |
| `failed` | Error occurred — check `error` field |
The `result` object is only present when status is `done`. Use `seed` from the result to reproduce the exact same image later.
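
Because a job may stay in the queue for a while, a polling loop benefits from an upper time limit. The following is a minimal sketch, assuming the caller supplies a `fetch_status` callable that returns the parsed JSON of the job endpoint; `wait_for_job`, its parameters, and the default interval and timeout are hypothetical choices, not part of GAZE.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval: float = 5.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job is done or failed; give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status()
        status = job.get("status")
        if status == "done":
            # `result` is only present once the job has finished.
            return job["result"]
        if status == "failed":
            raise RuntimeError(f"Generation failed: {job.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still '{status}' after {timeout}s")
        sleep(interval)
```

With `requests`, `fetch_status` could be a lambda that performs the `GET` on the job URL and returns `.json()`. The `sleep` parameter exists only so the loop can be exercised without real delays.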
**Retrieving the image:** The `image_url` is a path relative to the server root. Fetch it directly:
```
GET http://<host>:5782/static/uploads/presets/example_01/gen_1773601346.png
```
Image retrieval does not require authentication.
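
Since `image_url` is server-relative, it has to be joined with the host before it can be fetched. A small sketch using only the standard library; `absolute_image_url` is a hypothetical helper name.

```python
from urllib.parse import urljoin


def absolute_image_url(host: str, image_url: str) -> str:
    """Resolve the server-relative `image_url` against the GAZE host."""
    # Normalise to exactly one trailing slash so urljoin treats host as the root.
    return urljoin(host.rstrip("/") + "/", image_url)
```

The returned URL can be passed to any HTTP client without an API key, as noted above.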
## Examples
### Generate a single image and wait for it
```bash
API_KEY="your-key-here"
HOST="http://localhost:5782"
# Queue generation
JOB_ID=$(curl -s -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d '{}' \
"$HOST/api/v1/generate/example_01" | python3 -c "import sys,json; print(json.load(sys.stdin)['jobs'][0]['job_id'])")
echo "Job: $JOB_ID"
# Poll until done
while true; do
RESULT=$(curl -s -H "X-API-Key: $API_KEY" "$HOST/api/v1/job/$JOB_ID")
STATUS=$(echo "$RESULT" | python3 -c "import sys,json; print(json.load(sys.stdin)['status'])")
echo "Status: $STATUS"
if [ "$STATUS" = "done" ] || [ "$STATUS" = "failed" ]; then
echo "$RESULT" | python3 -m json.tool
break
fi
sleep 5
done
```
### Generate 3 images with extra prompts
```bash
curl -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"count": 3,
"extra_positive": "smiling, outdoors",
"extra_negative": "blurry"
}' \
"$HOST/api/v1/generate/example_01"
```
### Reproduce a specific image
```bash
curl -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d '{"seed": 927640517599332}' \
"$HOST/api/v1/generate/example_01"
```
### Python example
```python
import requests
import time
HOST = "http://localhost:5782"
API_KEY = "your-key-here"
HEADERS = {"X-API-Key": API_KEY, "Content-Type": "application/json"}
# List presets
presets = requests.get(f"{HOST}/api/v1/presets", headers=HEADERS).json()
print(f"Available presets: {[p['name'] for p in presets['presets']]}")
# Generate
resp = requests.post(
f"{HOST}/api/v1/generate/{presets['presets'][0]['slug']}",
headers=HEADERS,
json={"count": 1},
).json()
job_id = resp["jobs"][0]["job_id"]
print(f"Queued job: {job_id}")
# Poll
while True:
status = requests.get(f"{HOST}/api/v1/job/{job_id}", headers=HEADERS).json()
print(f"Status: {status['status']}")
if status["status"] in ("done", "failed"):
break
time.sleep(5)
if status["status"] == "done":
image_url = f"{HOST}{status['result']['image_url']}"
print(f"Image: {image_url}")
print(f"Seed: {status['result']['seed']}")
```


@@ -12,17 +12,27 @@
<string>/Users/aodhan/gitea/homeai/homeai-llm/scripts/preload-models.sh</string>
</array>
<key>EnvironmentVariables</key>
<dict>
<!-- Override to change which medium model stays warm -->
<key>HOMEAI_MEDIUM_MODEL</key>
<string>qwen3.5:35b-a3b</string>
</dict>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/tmp/homeai-preload-models.log</string>
<key>StandardErrorPath</key>
<string>/tmp/homeai-preload-models-error.log</string>
<!-- Delay 15s to let Ollama start first -->
<!-- If the script exits/crashes, wait 30s before restarting -->
<key>ThrottleInterval</key>
<integer>15</integer>
<integer>30</integer>
</dict>
</plist>


@@ -1,19 +1,73 @@
#!/bin/bash
# Pre-load voice pipeline models into Ollama with infinite keep_alive.
# Run after Ollama starts (called by launchd or manually).
# Keep voice pipeline models warm in Ollama VRAM.
# Runs as a loop — checks every 5 minutes, re-pins any model that got evicted.
# Only pins lightweight/MoE models — large dense models (70B) use default expiry.
OLLAMA_URL="http://localhost:11434"
CHECK_INTERVAL=300 # seconds between checks
# Wait for Ollama to be ready
for i in $(seq 1 30); do
curl -sf "$OLLAMA_URL/api/tags" > /dev/null 2>&1 && break
sleep 2
# Medium model can be overridden via env var (e.g. by persona config)
HOMEAI_MEDIUM_MODEL="${HOMEAI_MEDIUM_MODEL:-qwen3.5:35b-a3b}"
# Models to keep warm: "name|description"
MODELS=(
"qwen2.5:7b|small (4.7GB) — fast fallback"
"${HOMEAI_MEDIUM_MODEL}|medium — persona default"
)
wait_for_ollama() {
for i in $(seq 1 30); do
curl -sf "$OLLAMA_URL/api/tags" > /dev/null 2>&1 && return 0
sleep 2
done
return 1
}
is_model_loaded() {
local model="$1"
curl -sf "$OLLAMA_URL/api/ps" 2>/dev/null \
| python3 -c "
import json, sys
data = json.load(sys.stdin)
names = [m['name'] for m in data.get('models', [])]
# Model name arrives as argv[1] so it is never spliced into the Python source
sys.exit(0 if sys.argv[1] in names else 1)
" "$model" 2>/dev/null
}
pin_model() {
local model="$1"
local desc="$2"
if is_model_loaded "$model"; then
echo "[keepwarm] $model already loaded — skipping"
return 0
fi
echo "[keepwarm] Loading $model ($desc) with keep_alive=-1..."
if curl -sf "$OLLAMA_URL/api/generate" \
-d "{\"model\":\"$model\",\"prompt\":\"ready\",\"stream\":false,\"keep_alive\":-1,\"options\":{\"num_ctx\":512}}" \
> /dev/null 2>&1; then
echo "[keepwarm] $model pinned in VRAM"
else
echo "[keepwarm] ERROR: failed to load $model"
fi
}
# --- Main loop ---
echo "[keepwarm] Starting model keep-warm daemon (interval: ${CHECK_INTERVAL}s)"
# Initial wait for Ollama
if ! wait_for_ollama; then
echo "[keepwarm] ERROR: Ollama not reachable after 60s, exiting"
exit 1
fi
echo "[keepwarm] Ollama is online"
while true; do
for entry in "${MODELS[@]}"; do
IFS='|' read -r model desc <<< "$entry"
pin_model "$model" "$desc"
done
sleep "$CHECK_INTERVAL"
done
# Pin qwen3.5:35b-a3b (MoE, 38.7GB VRAM, voice pipeline default)
echo "[preload] Loading qwen3.5:35b-a3b with keep_alive=-1..."
curl -sf "$OLLAMA_URL/api/generate" \
-d '{"model":"qwen3.5:35b-a3b","prompt":"ready","stream":false,"keep_alive":-1,"options":{"num_ctx":512}}' \
> /dev/null 2>&1
echo "[preload] qwen3.5:35b-a3b pinned in memory"


@@ -0,0 +1,40 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.homeai.vtube-bridge</string>
<key>ProgramArguments</key>
<array>
<string>/Users/aodhan/homeai-visual-env/bin/python3</string>
<string>/Users/aodhan/gitea/homeai/homeai-visual/vtube-bridge.py</string>
<string>--port</string>
<string>8002</string>
<string>--character</string>
<string>/Users/aodhan/gitea/homeai/homeai-dashboard/characters/aria.json</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/tmp/homeai-vtube-bridge.log</string>
<key>StandardErrorPath</key>
<string>/tmp/homeai-vtube-bridge-error.log</string>
<key>ThrottleInterval</key>
<integer>10</integer>
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin</string>
</dict>
</dict>
</plist>


@@ -0,0 +1,170 @@
#!/usr/bin/env python3
"""
Test script for VTube Studio Expression Bridge.
Usage:
python3 test-expressions.py # test all expressions
python3 test-expressions.py --auth # run auth flow first
python3 test-expressions.py --lipsync # test lip sync parameter
python3 test-expressions.py --latency # measure round-trip latency
Requires the vtube-bridge to be running on port 8002.
"""
import argparse
import json
import sys
import time
import urllib.request
BRIDGE_URL = "http://localhost:8002"
EXPRESSIONS = ["idle", "listening", "thinking", "speaking", "happy", "sad", "surprised", "error"]
def _post(path: str, data: dict | None = None) -> dict:
body = json.dumps(data or {}).encode()
req = urllib.request.Request(
f"{BRIDGE_URL}{path}",
data=body,
headers={"Content-Type": "application/json"},
method="POST",
)
with urllib.request.urlopen(req, timeout=10) as resp:
return json.loads(resp.read())
def _get(path: str) -> dict:
req = urllib.request.Request(f"{BRIDGE_URL}{path}")
with urllib.request.urlopen(req, timeout=10) as resp:
return json.loads(resp.read())
def check_bridge():
"""Verify bridge is running and connected."""
try:
status = _get("/status")
print(f"Bridge status: connected={status['connected']}, authenticated={status['authenticated']}")
print(f" Expressions: {', '.join(status.get('expressions', []))}")
if not status["connected"]:
print("\n WARNING: Not connected to VTube Studio. Is it running?")
if not status["authenticated"]:
print(" WARNING: Not authenticated. Run with --auth to initiate auth flow.")
return status
except Exception as e:
print(f"ERROR: Cannot reach bridge at {BRIDGE_URL}: {e}")
print(" Is vtube-bridge.py running?")
sys.exit(1)
def run_auth():
"""Initiate auth flow — user must click Allow in VTube Studio."""
print("Requesting authentication token...")
print(" >>> Click 'Allow' in VTube Studio when prompted <<<")
result = _post("/auth")
print(f" Result: {json.dumps(result, indent=2)}")
return result
def test_expressions(delay: float = 2.0):
"""Cycle through all expressions with a pause between each."""
print(f"\nCycling through {len(EXPRESSIONS)} expressions ({delay}s each):\n")
for expr in EXPRESSIONS:
print(f"{expr}...", end=" ", flush=True)
t0 = time.monotonic()
result = _post("/expression", {"event": expr})
dt = (time.monotonic() - t0) * 1000
if result.get("ok"):
print(f"OK ({dt:.0f}ms)")
else:
print(f"FAILED: {result.get('error', 'unknown')}")
time.sleep(delay)
# Return to idle
_post("/expression", {"event": "idle"})
print("\n Returned to idle.")
def test_lipsync(duration: float = 3.0):
"""Simulate lip sync by sweeping MouthOpen 0→1→0."""
import math
print(f"\nTesting lip sync (MouthOpen sweep, {duration}s)...\n")
fps = 20
frames = int(duration * fps)
for i in range(frames):
t = i / frames
# Sine wave for smooth open/close
value = abs(math.sin(t * math.pi * 4))
value = round(value, 3)
_post("/parameter", {"name": "MouthOpen", "value": value})
print(f"\r MouthOpen = {value:.3f}", end="", flush=True)
time.sleep(1.0 / fps)
_post("/parameter", {"name": "MouthOpen", "value": 0.0})
print("\r MouthOpen = 0.000 (done) ")
def test_latency(iterations: int = 20):
"""Measure expression trigger round-trip latency."""
print(f"\nMeasuring latency ({iterations} iterations)...\n")
times = []
for i in range(iterations):
expr = "thinking" if i % 2 == 0 else "idle"
t0 = time.monotonic()
_post("/expression", {"event": expr})
dt = (time.monotonic() - t0) * 1000
times.append(dt)
print(f" {i+1:2d}. {expr:10s}{dt:.1f}ms")
avg = sum(times) / len(times)
mn = min(times)
mx = max(times)
print(f"\n Avg: {avg:.1f}ms Min: {mn:.1f}ms Max: {mx:.1f}ms")
if avg < 100:
print(" PASS: Average latency under 100ms target")
else:
print(" WARNING: Average latency exceeds 100ms target")
# Return to idle
_post("/expression", {"event": "idle"})
def main():
parser = argparse.ArgumentParser(description="VTube Studio Expression Bridge Tester")
parser.add_argument("--auth", action="store_true", help="Run auth flow")
parser.add_argument("--lipsync", action="store_true", help="Test lip sync parameter sweep")
parser.add_argument("--latency", action="store_true", help="Measure round-trip latency")
parser.add_argument("--delay", type=float, default=2.0, help="Delay between expressions (default: 2s)")
parser.add_argument("--all", action="store_true", help="Run all tests")
args = parser.parse_args()
print("VTube Studio Expression Bridge Tester")
print("=" * 42)
status = check_bridge()
if args.auth:
run_auth()
print()
status = check_bridge()
# Skip the tests whenever the bridge is unauthenticated, including after a failed --auth attempt
if not status.get("authenticated"):
print("\nNot authenticated — skipping expression tests.")
print("Run with --auth to authenticate, or start VTube Studio first.")
return
if args.all:
test_expressions(args.delay)
test_lipsync()
test_latency()
elif args.lipsync:
test_lipsync()
elif args.latency:
test_latency()
else:
test_expressions(args.delay)
if __name__ == "__main__":
main()


@@ -1,17 +1,16 @@
#!/usr/bin/env bash
# homeai-visual/setup.sh — P7: VTube Studio bridge + Live2D expressions
# homeai-visual/setup.sh — P7: VTube Studio Expression Bridge
#
# Components:
# - vtube_studio.py — WebSocket client skill for OpenClaw
# - lipsync.py — amplitude-based lip sync
# - auth.py — VTube Studio token management
# Sets up:
# - Python venv with websockets
# - vtube-bridge daemon (HTTP ↔ WebSocket bridge)
# - vtube-ctl CLI (symlinked to PATH)
# - launchd service
#
# Prerequisites:
# - P4 (homeai-agent) — OpenClaw running
# - P5 (homeai-character) — aria.json with live2d_expressions set
# - macOS: VTube Studio installed (Mac App Store)
# - Linux: N/A — VTube Studio is macOS/Windows/iOS only
# Linux dev can test the skill code but not the VTube Studio side
# - VTube Studio installed (Mac App Store) with WebSocket API enabled
set -euo pipefail
@@ -19,42 +18,61 @@ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)"
source "${REPO_DIR}/scripts/common.sh"
log_section "P7: VTube Studio Bridge"
detect_platform
VENV_DIR="$HOME/homeai-visual-env"
PLIST_SRC="${SCRIPT_DIR}/launchd/com.homeai.vtube-bridge.plist"
PLIST_DST="$HOME/Library/LaunchAgents/com.homeai.vtube-bridge.plist"
VTUBE_CTL_SRC="$HOME/.openclaw/skills/vtube-studio/scripts/vtube-ctl"
if [[ "$OS_TYPE" == "linux" ]]; then
log_warn "VTube Studio is not available on Linux."
log_warn "This sub-project requires macOS (Mac Mini)."
log_section "P7: VTube Studio Expression Bridge"
# ─── Python venv ──────────────────────────────────────────────────────────────
if [[ ! -d "$VENV_DIR" ]]; then
log_info "Creating Python venv at $VENV_DIR..."
python3 -m venv "$VENV_DIR"
fi
# ─── TODO: Implementation ──────────────────────────────────────────────────────
cat <<'EOF'
log_info "Installing dependencies..."
"$VENV_DIR/bin/pip" install --upgrade pip -q
"$VENV_DIR/bin/pip" install websockets -q
log_ok "Python venv ready ($("${VENV_DIR}/bin/python3" --version))"
┌─────────────────────────────────────────────────────────────────┐
│ P7: homeai-visual — NOT YET IMPLEMENTED │
│ │
│ macOS only (VTube Studio is macOS/iOS/Windows) │
│ │
│ Implementation steps: │
│ 1. Install VTube Studio from Mac App Store │
│ 2. Enable WebSocket API in VTube Studio (Settings → port 8001) │
│ 3. Source/purchase Live2D model │
│ 4. Create expression hotkeys for 8 states │
│ 5. Implement skills/vtube_studio.py (WebSocket client) │
│ 6. Implement skills/lipsync.py (amplitude → MouthOpen param) │
│ 7. Implement skills/auth.py (token request + persistence) │
│ 8. Register vtube_studio skill with OpenClaw │
│ 9. Update aria.json live2d_expressions with hotkey IDs │
│ 10. Test all 8 expression states │
│ │
│ On Linux: implement Python skills, test WebSocket protocol │
│ with a mock server before connecting to real VTube Studio. │
│ │
│ Interface contracts: │
│ VTUBE_WS_URL=ws://localhost:8001 │
└─────────────────────────────────────────────────────────────────┘
# ─── vtube-ctl symlink ───────────────────────────────────────────────────────
EOF
if [[ -f "$VTUBE_CTL_SRC" ]]; then
chmod +x "$VTUBE_CTL_SRC"
ln -sf "$VTUBE_CTL_SRC" /opt/homebrew/bin/vtube-ctl
log_ok "vtube-ctl symlinked to /opt/homebrew/bin/vtube-ctl"
else
log_warn "vtube-ctl not found at $VTUBE_CTL_SRC — skipping symlink"
fi
log_info "P7 is not yet implemented. See homeai-visual/PLAN.md for details."
exit 0
# ─── launchd service ─────────────────────────────────────────────────────────
if [[ -f "$PLIST_SRC" ]]; then
# Unload if already loaded
launchctl bootout "gui/$(id -u)/com.homeai.vtube-bridge" 2>/dev/null || true
cp "$PLIST_SRC" "$PLIST_DST"
launchctl bootstrap "gui/$(id -u)" "$PLIST_DST"
log_ok "launchd service loaded: com.homeai.vtube-bridge"
else
log_warn "Plist not found at $PLIST_SRC — skipping launchd setup"
fi
# ─── Status ──────────────────────────────────────────────────────────────────
echo ""
log_info "VTube Bridge setup complete."
log_info ""
log_info "Next steps:"
log_info " 1. Install VTube Studio from Mac App Store"
log_info " 2. Enable WebSocket API: Settings > WebSocket API > port 8001"
log_info " 3. Load a Live2D model"
log_info " 4. Create expression hotkeys (idle, listening, thinking, speaking, happy, sad, surprised, error)"
log_info " 5. Run: vtube-ctl auth (click Allow in VTube Studio)"
log_info " 6. Run: python3 ${SCRIPT_DIR}/scripts/test-expressions.py --all"
log_info " 7. Update aria.json with real hotkey UUIDs"
log_info ""
log_info "Logs: /tmp/homeai-vtube-bridge.log"
log_info "Bridge: http://localhost:8002/status"


@@ -0,0 +1,454 @@
#!/usr/bin/env python3
"""
VTube Studio Expression Bridge — persistent WebSocket ↔ HTTP bridge.
Maintains a long-lived WebSocket connection to VTube Studio and exposes
a simple HTTP API so other HomeAI components can trigger expressions and
inject parameters (lip sync) without managing their own WS connections.
HTTP API (port 8002):
POST /expression {"event": "thinking"} → trigger hotkey
POST /parameter {"name": "MouthOpen", "value": 0.5} → inject param
POST /parameters [{"name": "MouthOpen", "value": 0.5}, ...]
POST /auth {} → request new token
GET /status → connection info
GET /expressions → list available expressions
Requires: pip install websockets
"""
import argparse
import asyncio
import json
import logging
import signal
import sys
import time
from http import HTTPStatus
from pathlib import Path
try:
import websockets
from websockets.exceptions import ConnectionClosed
except ImportError:
print("ERROR: 'websockets' package required. Install with: pip install websockets", file=sys.stderr)
sys.exit(1)
# ---------------------------------------------------------------------------
# Config
# ---------------------------------------------------------------------------
DEFAULT_VTUBE_WS_URL = "ws://localhost:8001"
DEFAULT_HTTP_PORT = 8002
TOKEN_PATH = Path.home() / ".openclaw" / "vtube_token.json"
DEFAULT_CHARACTER_PATH = (
Path.home() / "gitea" / "homeai" / "homeai-dashboard" / "characters" / "aria.json"
)
logger = logging.getLogger("vtube-bridge")
# ---------------------------------------------------------------------------
# VTube Studio WebSocket Client
# ---------------------------------------------------------------------------
class VTubeClient:
"""Persistent async WebSocket client for VTube Studio API."""
def __init__(self, ws_url: str, character_path: Path):
self.ws_url = ws_url
self.character_path = character_path
self._ws = None
self._token: str | None = None
self._authenticated = False
self._current_expression: str | None = None
self._connected = False
self._request_id = 0
self._lock = asyncio.Lock()
self._load_token()
self._load_character()
# ── Character config ──────────────────────────────────────────────
def _load_character(self):
"""Load expression mappings from character JSON."""
self.expression_map: dict[str, str] = {}
self.ws_triggers: dict = {}
try:
if self.character_path.exists():
cfg = json.loads(self.character_path.read_text())
self.expression_map = cfg.get("live2d_expressions", {})
self.ws_triggers = cfg.get("vtube_ws_triggers", {})
logger.info("Loaded %d expressions from %s", len(self.expression_map), self.character_path.name)
else:
logger.warning("Character file not found: %s", self.character_path)
except Exception as e:
logger.error("Failed to load character config: %s", e)
def reload_character(self):
"""Hot-reload character config without restarting."""
self._load_character()
return {"expressions": self.expression_map, "triggers": self.ws_triggers}
# ── Token persistence ─────────────────────────────────────────────
def _load_token(self):
try:
if TOKEN_PATH.exists():
data = json.loads(TOKEN_PATH.read_text())
self._token = data.get("token")
logger.info("Loaded auth token from %s", TOKEN_PATH)
except Exception as e:
logger.warning("Could not load token: %s", e)
def _save_token(self, token: str):
TOKEN_PATH.parent.mkdir(parents=True, exist_ok=True)
TOKEN_PATH.write_text(json.dumps({"token": token}, indent=2))
self._token = token
logger.info("Saved auth token to %s", TOKEN_PATH)
# ── WebSocket comms ───────────────────────────────────────────────
def _next_id(self) -> str:
self._request_id += 1
return f"homeai-{self._request_id}"
async def _send(self, message_type: str, data: dict | None = None) -> dict:
"""Send a VTube Studio API message and return the response."""
payload = {
"apiName": "VTubeStudioPublicAPI",
"apiVersion": "1.0",
"requestID": self._next_id(),
"messageType": message_type,
"data": data or {},
}
await self._ws.send(json.dumps(payload))
resp = json.loads(await asyncio.wait_for(self._ws.recv(), timeout=10))
return resp
# ── Connection lifecycle ──────────────────────────────────────────
async def connect(self):
"""Connect and authenticate to VTube Studio."""
try:
self._ws = await websockets.connect(self.ws_url, ping_interval=20, ping_timeout=10)
self._connected = True
logger.info("Connected to VTube Studio at %s", self.ws_url)
if self._token:
await self._authenticate()
else:
logger.warning("No auth token — call POST /auth to initiate authentication")
except Exception as e:
self._connected = False
self._authenticated = False
logger.error("Connection failed: %s", e)
raise
async def _authenticate(self):
"""Authenticate with an existing token."""
resp = await self._send("AuthenticationRequest", {
"pluginName": "HomeAI",
"pluginDeveloper": "HomeAI",
"authenticationToken": self._token,
})
self._authenticated = resp.get("data", {}).get("authenticated", False)
if self._authenticated:
logger.info("Authenticated successfully")
else:
logger.warning("Token rejected — request a new one via POST /auth")
self._authenticated = False
async def request_new_token(self) -> dict:
"""Request a new auth token. User must click Allow in VTube Studio."""
if not self._connected:
return {"error": "Not connected to VTube Studio"}
resp = await self._send("AuthenticationTokenRequest", {
"pluginName": "HomeAI",
"pluginDeveloper": "HomeAI",
"pluginIcon": None,
})
token = resp.get("data", {}).get("authenticationToken")
if token:
self._save_token(token)
await self._authenticate()
return {"authenticated": self._authenticated, "token_saved": True}
return {"error": "No token received", "response": resp}
async def disconnect(self):
if self._ws:
await self._ws.close()
self._connected = False
self._authenticated = False
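async def ensure_connected(self):
"""Reconnect if the connection dropped."""
# Check the connection state instead of `.closed`, which the newer asyncio
# client in recent `websockets` releases does not provide.
state = getattr(self._ws, "state", None)
if not self._connected or self._ws is None or getattr(state, "name", None) != "OPEN":
logger.info("Reconnecting...")
await self.connect()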
# ── Expression & parameter API ────────────────────────────────────
async def trigger_expression(self, event: str) -> dict:
"""Trigger a named expression from the character config."""
async with self._lock:
await self.ensure_connected()
if not self._authenticated:
return {"error": "Not authenticated"}
hotkey_id = self.expression_map.get(event)
if not hotkey_id:
return {"error": f"Unknown expression: {event}", "available": list(self.expression_map.keys())}
resp = await self._send("HotkeyTriggerRequest", {"hotkeyID": hotkey_id})
self._current_expression = event
return {"ok": True, "expression": event, "hotkey_id": hotkey_id}
async def set_parameter(self, name: str, value: float, weight: float = 1.0) -> dict:
"""Inject a single VTube Studio parameter value."""
async with self._lock:
await self.ensure_connected()
if not self._authenticated:
return {"error": "Not authenticated"}
resp = await self._send("InjectParameterDataRequest", {
"parameterValues": [{"id": name, "value": value, "weight": weight}],
})
return {"ok": True, "name": name, "value": value}
async def set_parameters(self, params: list[dict]) -> dict:
"""Inject multiple VTube Studio parameters at once."""
async with self._lock:
await self.ensure_connected()
if not self._authenticated:
return {"error": "Not authenticated"}
param_values = [
{"id": p["name"], "value": p["value"], "weight": p.get("weight", 1.0)}
for p in params
]
resp = await self._send("InjectParameterDataRequest", {
"parameterValues": param_values,
})
return {"ok": True, "count": len(param_values)}
async def list_hotkeys(self) -> dict:
"""List all hotkeys available in the current model."""
async with self._lock:
await self.ensure_connected()
if not self._authenticated:
return {"error": "Not authenticated"}
resp = await self._send("HotkeysInCurrentModelRequest", {})
return resp.get("data", {})
async def list_parameters(self) -> dict:
"""List all input parameters for the current model."""
async with self._lock:
await self.ensure_connected()
if not self._authenticated:
return {"error": "Not authenticated"}
resp = await self._send("InputParameterListRequest", {})
return resp.get("data", {})
def status(self) -> dict:
return {
"connected": self._connected,
"authenticated": self._authenticated,
"ws_url": self.ws_url,
"current_expression": self._current_expression,
"expression_count": len(self.expression_map),
"expressions": list(self.expression_map.keys()),
}
# ---------------------------------------------------------------------------
# HTTP Server (asyncio-based, no external deps)
# ---------------------------------------------------------------------------
class BridgeHTTPHandler:
"""Simple async HTTP request handler for the bridge API."""
def __init__(self, client: VTubeClient):
self.client = client
async def handle(self, reader: asyncio.StreamReader, writer: asyncio.StreamWriter):
try:
request_line = await asyncio.wait_for(reader.readline(), timeout=5)
if not request_line:
writer.close()
return
method, path, _ = request_line.decode().strip().split(" ", 2)
path = path.split("?")[0] # strip query params
# Read headers
content_length = 0
while True:
line = await reader.readline()
if line == b"\r\n" or not line:
break
if line.lower().startswith(b"content-length:"):
content_length = int(line.split(b":")[1].strip())
# Read body
body = None
if content_length > 0:
body = await reader.readexactly(content_length) # read() may return fewer bytes than requested
# Route
try:
result = await self._route(method, path, body)
await self._respond(writer, 200, result)
except Exception as e:
logger.error("Handler error: %s", e, exc_info=True)
await self._respond(writer, 500, {"error": str(e)})
except asyncio.TimeoutError:
writer.close()
except Exception as e:
logger.error("Connection error: %s", e)
try:
writer.close()
except Exception:
pass
async def _route(self, method: str, path: str, body: bytes | None) -> dict:
data = {}
if body:
try:
data = json.loads(body)
except json.JSONDecodeError:
return {"error": "Invalid JSON"}
if method == "GET" and path == "/status":
return self.client.status()
if method == "GET" and path == "/expressions":
return {
"expressions": self.client.expression_map,
"triggers": self.client.ws_triggers,
}
if method == "GET" and path == "/hotkeys":
return await self.client.list_hotkeys()
if method == "GET" and path == "/parameters":
return await self.client.list_parameters()
if method == "POST" and path == "/expression":
event = data.get("event")
if not event:
return {"error": "Missing 'event' field"}
return await self.client.trigger_expression(event)
if method == "POST" and path == "/parameter":
name = data.get("name")
value = data.get("value")
if name is None or value is None:
return {"error": "Missing 'name' or 'value' field"}
return await self.client.set_parameter(name, float(value), float(data.get("weight", 1.0)))
if method == "POST" and path == "/parameters":
if not isinstance(data, list):
return {"error": "Expected JSON array of {name, value} objects"}
return await self.client.set_parameters(data)
if method == "POST" and path == "/auth":
return await self.client.request_new_token()
if method == "POST" and path == "/reload":
return self.client.reload_character()
return {"error": f"Unknown route: {method} {path}"}
async def _respond(self, writer: asyncio.StreamWriter, status: int, data: dict):
body = json.dumps(data, indent=2).encode()
status_text = HTTPStatus(status).phrase
header = (
f"HTTP/1.1 {status} {status_text}\r\n"
f"Content-Type: application/json\r\n"
f"Content-Length: {len(body)}\r\n"
f"Access-Control-Allow-Origin: *\r\n"
f"Access-Control-Allow-Methods: GET, POST, OPTIONS\r\n"
f"Access-Control-Allow-Headers: Content-Type\r\n"
f"\r\n"
)
writer.write(header.encode() + body)
await writer.drain()
writer.close()
# ---------------------------------------------------------------------------
# Auto-reconnect loop
# ---------------------------------------------------------------------------
async def reconnect_loop(client: VTubeClient, interval: float = 5.0):
"""Background task that keeps the VTube Studio connection alive."""
while True:
try:
if not client._connected or client._ws is None or client._ws.closed:
logger.info("Connection lost — attempting reconnect...")
await client.connect()
except Exception as e:
logger.debug("Reconnect failed: %s (retrying in %.0fs)", e, interval)
await asyncio.sleep(interval)
# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------
async def main(args):
logging.basicConfig(
level=logging.DEBUG if args.verbose else logging.INFO,
format="%(asctime)s [%(name)s] %(levelname)s: %(message)s",
datefmt="%H:%M:%S",
)
character_path = Path(args.character)
client = VTubeClient(args.vtube_url, character_path)
# Try initial connection (don't fail if VTube Studio isn't running yet)
try:
await client.connect()
except Exception as e:
logger.warning("Initial connection failed: %s (will keep retrying)", e)
# Start reconnect loop
reconnect_task = asyncio.create_task(reconnect_loop(client, interval=5.0))
# Start HTTP server
handler = BridgeHTTPHandler(client)
server = await asyncio.start_server(handler.handle, "0.0.0.0", args.port)
logger.info("HTTP API listening on http://0.0.0.0:%d", args.port)
logger.info("Endpoints: /status /expressions /expression /parameter /parameters /auth /reload /hotkeys")
# Graceful shutdown
stop = asyncio.Event()
def _signal_handler():
logger.info("Shutting down...")
stop.set()
loop = asyncio.get_running_loop()
for sig in (signal.SIGINT, signal.SIGTERM):
loop.add_signal_handler(sig, _signal_handler)
async with server:
await stop.wait()
reconnect_task.cancel()
await client.disconnect()
logger.info("Goodbye.")
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="VTube Studio Expression Bridge")
parser.add_argument("--port", type=int, default=DEFAULT_HTTP_PORT, help="HTTP API port (default: 8002)")
parser.add_argument("--vtube-url", default=DEFAULT_VTUBE_WS_URL, help="VTube Studio WebSocket URL")
parser.add_argument("--character", default=str(DEFAULT_CHARACTER_PATH), help="Path to character JSON")
parser.add_argument("--verbose", "-v", action="store_true", help="Debug logging")
args = parser.parse_args()
asyncio.run(main(args))
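
As a rough illustration of how a caller might drive the HTTP API defined above, the sketch below builds the POST requests for `/expression` and `/parameter`. The base URL assumes the bridge runs on its default port 8002 on the local machine; the helper names `build_request`, `expression_request`, `parameter_request`, and `send` are hypothetical and not part of this commit.

```python
import json
import urllib.request

BRIDGE_URL = "http://localhost:8002"  # assumption: bridge on its default port


def build_request(path: str, payload) -> urllib.request.Request:
    """Build a JSON POST request for the bridge (hypothetical helper)."""
    return urllib.request.Request(
        f"{BRIDGE_URL}{path}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def expression_request(event: str) -> urllib.request.Request:
    # POST /expression expects {"event": <name from the expression map>}
    return build_request("/expression", {"event": event})


def parameter_request(name: str, value: float, weight: float = 1.0) -> urllib.request.Request:
    # POST /parameter expects "name" and "value", plus an optional "weight"
    return build_request("/parameter", {"name": name, "value": value, "weight": weight})


def send(req: urllib.request.Request, timeout: float = 2.0) -> dict:
    """Send a request and decode the JSON reply; requires a running bridge."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())
```

Because the handler answers with HTTP 200 even for application-level failures such as an unknown expression, a caller would likely want to check the decoded reply for an `"error"` key rather than rely on the status code alone.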


@@ -18,6 +18,12 @@
<string>1.0</string>
</array>
<key>EnvironmentVariables</key>
<dict>
<key>ELEVENLABS_API_KEY</key>
<string>REDACTED</string>
</dict>
<key>RunAtLoad</key>
<true/>


@@ -7,8 +7,10 @@ Usage:
import argparse
import asyncio
import json
import logging
import os
import urllib.request
import numpy as np
@@ -20,10 +22,76 @@ from wyoming.tts import Synthesize
_LOGGER = logging.getLogger(__name__)
ACTIVE_TTS_VOICE_PATH = os.path.expanduser("~/homeai-data/active-tts-voice.json")
SAMPLE_RATE = 24000
SAMPLE_WIDTH = 2 # int16
CHANNELS = 1
CHUNK_SECONDS = 1 # stream in 1-second chunks
VTUBE_BRIDGE_URL = "http://localhost:8002"
LIPSYNC_ENABLED = True
LIPSYNC_FRAME_SAMPLES = 1200 # 50ms frames at 24kHz → 20 updates/sec
LIPSYNC_SCALE = 10.0 # amplitude multiplier (tuned for Kokoro output levels)
def _send_lipsync(value: float):
"""Fire-and-forget POST to vtube-bridge with mouth open value."""
try:
body = json.dumps({"name": "MouthOpen", "value": value}).encode()
req = urllib.request.Request(
f"{VTUBE_BRIDGE_URL}/parameter",
data=body,
headers={"Content-Type": "application/json"},
method="POST",
)
urllib.request.urlopen(req, timeout=0.5)
except Exception:
pass # bridge may not be running
def _compute_lipsync_frames(samples_int16: np.ndarray) -> list[float]:
"""Compute per-frame RMS amplitude scaled to 0–1 for lip sync."""
frames = []
for i in range(0, len(samples_int16), LIPSYNC_FRAME_SAMPLES):
frame = samples_int16[i : i + LIPSYNC_FRAME_SAMPLES].astype(np.float32)
rms = np.sqrt(np.mean(frame ** 2)) / 32768.0
mouth = min(rms * LIPSYNC_SCALE, 1.0)
frames.append(round(mouth, 3))
return frames
def _get_active_tts_config() -> dict | None:
"""Read the active TTS config set by the OpenClaw bridge."""
try:
with open(ACTIVE_TTS_VOICE_PATH) as f:
return json.load(f)
except Exception:
return None
def _synthesize_elevenlabs(text: str, voice_id: str, model: str = "eleven_multilingual_v2") -> bytes:
"""Call ElevenLabs TTS API and return raw PCM audio bytes (24kHz 16-bit mono)."""
api_key = os.environ.get("ELEVENLABS_API_KEY", "")
if not api_key:
raise RuntimeError("ELEVENLABS_API_KEY not set")
url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}?output_format=pcm_24000"
payload = json.dumps({
"text": text,
"model_id": model,
"voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
}).encode()
req = urllib.request.Request(
url,
data=payload,
headers={
"Content-Type": "application/json",
"xi-api-key": api_key,
},
method="POST",
)
with urllib.request.urlopen(req, timeout=30) as resp:
return resp.read()
def _load_kokoro():
@@ -76,26 +144,53 @@ class KokoroEventHandler(AsyncEventHandler):
synthesize = Synthesize.from_event(event)
text = synthesize.text
voice = self._default_voice
use_elevenlabs = False
if synthesize.voice and synthesize.voice.name:
# Bridge state file takes priority (set per-request by OpenClaw bridge)
tts_config = _get_active_tts_config()
if tts_config and tts_config.get("engine") == "elevenlabs":
use_elevenlabs = True
voice = tts_config.get("elevenlabs_voice_id", "")
_LOGGER.debug("Synthesizing %r with ElevenLabs voice=%s", text, voice)
elif tts_config and tts_config.get("kokoro_voice"):
voice = tts_config["kokoro_voice"]
elif synthesize.voice and synthesize.voice.name:
voice = synthesize.voice.name
_LOGGER.debug("Synthesizing %r with voice=%s speed=%.1f", text, voice, self._speed)
try:
loop = asyncio.get_event_loop()
samples, sample_rate = await loop.run_in_executor(
None, lambda: self._tts.create(text, voice=voice, speed=self._speed)
)
samples_int16 = (np.clip(samples, -1.0, 1.0) * 32767).astype(np.int16)
audio_bytes = samples_int16.tobytes()
if use_elevenlabs and voice:
# ElevenLabs returns PCM 24kHz 16-bit mono
model = tts_config.get("elevenlabs_model", "eleven_multilingual_v2")
_LOGGER.info("Using ElevenLabs TTS (model=%s, voice=%s)", model, voice)
pcm_bytes = await loop.run_in_executor(
None, lambda: _synthesize_elevenlabs(text, voice, model)
)
samples_int16 = np.frombuffer(pcm_bytes, dtype=np.int16)
audio_bytes = pcm_bytes
else:
_LOGGER.debug("Synthesizing %r with Kokoro voice=%s speed=%.1f", text, voice, self._speed)
samples, sample_rate = await loop.run_in_executor(
None, lambda: self._tts.create(text, voice=voice, speed=self._speed)
)
samples_int16 = (np.clip(samples, -1.0, 1.0) * 32767).astype(np.int16)
audio_bytes = samples_int16.tobytes()
# Pre-compute lip sync frames for the entire utterance
lipsync_frames = []
if LIPSYNC_ENABLED:
lipsync_frames = _compute_lipsync_frames(samples_int16)
await self.write_event(
AudioStart(rate=SAMPLE_RATE, width=SAMPLE_WIDTH, channels=CHANNELS).event()
)
chunk_size = SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS * CHUNK_SECONDS
lipsync_idx = 0
samples_per_chunk = SAMPLE_RATE * CHUNK_SECONDS
frames_per_chunk = samples_per_chunk // LIPSYNC_FRAME_SAMPLES
for i in range(0, len(audio_bytes), chunk_size):
await self.write_event(
AudioChunk(
@@ -106,8 +201,22 @@ class KokoroEventHandler(AsyncEventHandler):
).event()
)
# Send lip sync frames for this audio chunk
if LIPSYNC_ENABLED and lipsync_frames:
chunk_frames = lipsync_frames[lipsync_idx : lipsync_idx + frames_per_chunk]
for mouth_val in chunk_frames:
await asyncio.get_event_loop().run_in_executor(
None, _send_lipsync, mouth_val
)
lipsync_idx += frames_per_chunk
# Close mouth after speech
if LIPSYNC_ENABLED:
await asyncio.get_event_loop().run_in_executor(None, _send_lipsync, 0.0)
await self.write_event(AudioStop().event())
_LOGGER.info("Synthesized %.1fs of audio", len(samples) / sample_rate)
duration = len(samples_int16) / SAMPLE_RATE
_LOGGER.info("Synthesized %.1fs of audio (%d lipsync frames)", duration, len(lipsync_frames))
except Exception:
_LOGGER.exception("Synthesis error")
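
To make the lip-sync arithmetic in this file concrete, here is a minimal pure-Python sketch of the same per-frame RMS calculation, without NumPy. The constants mirror `SAMPLE_RATE`, `LIPSYNC_FRAME_SAMPLES`, and `LIPSYNC_SCALE` from the diff; the function name `lipsync_frames` and the plain-list input are assumptions made for illustration only.

```python
import math

SAMPLE_RATE = 24000      # Hz, as in the TTS server
FRAME_SAMPLES = 1200     # 50 ms per frame at 24 kHz
SCALE = 10.0             # amplitude multiplier from the diff


def lipsync_frames(samples: list[int]) -> list[float]:
    """Return one mouth-open value in [0, 1] per 50 ms frame of int16 samples."""
    frames = []
    for i in range(0, len(samples), FRAME_SAMPLES):
        frame = samples[i : i + FRAME_SAMPLES]  # the last frame may be shorter
        rms = math.sqrt(sum(s * s for s in frame) / len(frame)) / 32768.0
        frames.append(round(min(rms * SCALE, 1.0), 3))
    return frames


# Chunking arithmetic used when streaming audio:
chunk_bytes = SAMPLE_RATE * 2 * 1 * 1             # rate * width * channels * seconds
frames_per_chunk = SAMPLE_RATE // FRAME_SAMPLES   # lip-sync frames per 1 s chunk
```

With these constants each one-second audio chunk is 48000 bytes and carries 20 lip-sync frames, so a signal at roughly 5% of full scale already maps to a half-open mouth, and anything at or above 10% saturates at 1.0.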